Option 5 seems to benefit both of the worlds, fast and customizable 

But I'm fine with option 1 which 95% of the users will use anyway

Sent from my iPhone

> On 17 Oct 2016, at 17:24, Shad Storhaug <s...@shadstorhaug.com> wrote:
> I have completed implementing all but one of the missing Lucene.Net.Core 
> tests, and have started getting some of the new failing tests to pass (was 
> originally over 60, now is more like 35). But one issue I discovered along 
> the way is that some of the test doubles that depended on the 
> OLD_FORMAT_IMPERSONATION_IS_ACTIVE setting being static are not able to 
> function.
> After thinking this through, the reason why that variable was made non-static 
> is because if we run multiple tests in parallel using xUnit they cannot use a 
> static setting that changes for various tests. So we need another solution 
> than going back to the original Java implementation that was static (that is, 
> if we plan to go forward with xUnit...).
> Also, as I previously mentioned 
> (http://apache.markmail.org/search/?q=remaining+list%3Aorg.apache.incubator.lucene-net-dev+order%3Adate-backward+date%3A200603-201610#query:remaining%20list%3Aorg.apache.incubator.lucene-net-dev%20order%3Adate-backward%20date%3A200603-201610+page:1+mid:so67bfksvrebrhru+state:results),
>  the fact that the SPIClassIterator just loads the types from the current 
> AppDomain in random order is an issue for anyone who needs to extend it (as 
> well as a problem for testing).
> The Problem
> Many of the RW codec files (test mocks) require the ability to read the 
> OLD_FORMAT_IMPERSONATION_IS_ACTIVE variable from LuceneTestCase. In Lucene, 
> they did so by reading the setting the variable directly because it was 
> static, so they would have access to the setting in real time.
> Connie solved part of the issue by adding a constructor to the codec test 
> mocks, which allows you to inject this setting into the IndexWriterConfig (an 
> example here: 
> https://github.com/apache/lucenenet/blob/686b75113be7c4bae2daee20afb5f7a70d79a1d5/src/Lucene.Net.Tests/core/Index/TestNumericDocValuesUpdates.cs#L1067).
>  That works for the IndexWriter, however, when using the IndexReader the 
> codec names are read from the index file and loaded directly by 
> NamedSPILoader without any way to intervene or pass anything to the 
> constructor.
> In addition, the ORDER of the classes that are currently loaded by 
> SPIClassIterator is important for the tests. They require that the test 
> codecs are loaded before the production codecs. This is also important if 
> someone wanted to extend the codec for some reason - they will also need 
> their customizations to load first before the internal Lucene classes.
> Proposed Solution
> We could add a correlation to the Java ClassLoader to Lucene.Net, called 
> ITypeLoader. The constructor and method overloads from Lucene that accept 
> ClassLoader were left out of Lucene.Net, which is one reason we are missing 
> this functionality. The main purpose of ITypeLoader would be to load and 
> ORDER the types that are iterated by the SPIClassIterator. ITypeLoader would 
> be set statically at application startup 
> (Lucene.Net.Support.TypeLoader.SetTypeLoader(ITypeLoader)) and will also have 
> a default value (DefaultTypeLoader) if no customization is needed.
> The existing SPIClassIterator would be refactored to only iterate and not 
> cache the types. It will also be passed a parameter (as it was in Lucene) to 
> filter for types that directly inherit the passed in type. The caching of 
> types could be moved to a decorator ITypeLoader instead, but we should remove 
> the constraints that we currently have on constructor signatures.
> In addition, we could have an IInstanceFactory (abstract factory) that could 
> be implemented in order to provide additional dependencies/configuration to 
> types that are customized, similar to what is specified here: 
> http://blog.ploeh.dk/2014/05/19/di-friendly-framework/.  Its purpose would be 
> to replace the current Activator.CreateInstance() calls that only accept a 
> default constructor (so we can pass in test variables that apply to the 
> instance). We could also statically specify IInstanceFactory at application 
> startup (Lucene.Net.Support.TypeLoader.SetInstanceFactory(IInstanceFactory)). 
> One major benefit of this is that if one were so inclined, they could use a 
> 3rd party dependency injection container to supply these instances to the 
> NamedSPILoader (which is where they are cached until they are used).
> The behavior of the default value of IInstanceFactory 
> (DefaultInstanceFactory) is pretty clear - just wrap the same old call to 
> Activator.CreateInstance(). We can then easily override this behavior in the 
> test environment to inject whatever we need (I am thinking that we could just 
> pass a Func<T> to the constructor so it can access the real-time value of the 
> OLD_FORMAT_IMPERSONATION_IS_ACTIVE variable, which would most likely just be 
> local to each test). We could even use this in other places in Lucene.Net to 
> allow for injection of custom constructor parameters where we currently have 
> fixed parameter signatures.
> However, the behavior of the default value of ITypeLoader (DefaultTypeLoader) 
> is less clear. Do we really need to load all of the types from the current 
> AppDomain? If so, how do we control the order of the types and/or assemblies?
> Option 1
> A possible solution would be to just hard code - pure DI - the Lucene.Net 
> assemblies (specified by type string) in DefaultTypeLoader (or possibly in a 
> custom IComparer<Assembly> dependency?). We could easily control the order 
> this way and if people don't need customization this would perform much 
> better. If they do need customization, they could simply inherit 
> DefaultTypeLoader, override its GetTypes() method, and call base.GetTypes() 
> after specifying the customization types. The only downside to this is that 
> new types are not automatically discovered.
> Option 2
> Another solution to this would be to load everything from the AppDomain as we 
> do now, but ensure all of the built-in Lucene.Net assemblies are skipped 
> until the very end of the process, which means all custom types are 
> automatically loaded before all built-in types.
> Option 3
> Same as Option 1, but also have a configuration file of some sort (possibly 
> JSON format...) that could, if supplied, be used to provide the names and 
> order of the assemblies to load. This is actually the closest to how it was 
> done in Java - there is a .classpath file that determines the order in which 
> the packages are loaded, which made it configurable. However, the .classpath 
> is apparently a closer analog to a .csproj file as this order is added to the 
> compiled .jar package. It doesn't seem very reasonable to require someone to 
> hand edit the .csproj file in order to control the dependency order (and I am 
> not sure if the assembly/project reference order is preserved for a .NET 
> compiled assembly anyway).
> Option 4
> Same as Option 1, but add a configuration setting "AutoDiscoverAssemblies" to 
> plug in the default behavior from Option 2. Basically, default to best 
> performance but allow automatic discovery of new types if the option is 
> enabled.
> Option 5
> Same as Option 1, but have an alternative AssemblyTypeLoader ITypeLoader 
> implementation with a constructor parameter that accepts an 
> IEnumerable<Assembly> where the user can supply any assemblies to scan for 
> Lucene.Net types (in the proper order). If the DefaultTypeLoader is used, no 
> scanning will happen. But if the user sets the ITypeLoader to 
> AssemblyTypeLoader, the constructor-specified assemblies will be scanned. The 
> advantage here is that we don't need to mess with loading and then ignoring 
> unwanted assemblies from the AppDomain, but we can still scan for types.
> So, what would be the most logical default behavior for 
> DefaultInstanceLoader? Is there an alternative way of ordering the assemblies 
> and/or types that we should consider?
> Shad Storhaug (NightOwl888)

Reply via email to