I have completed implementing all but one of the missing Lucene.Net.Core tests, 
and have started getting some of the new failing tests to pass (was originally 
over 60, now is more like 35). But one issue I discovered along the way is that 
some of the test doubles that depended on the 
OLD_FORMAT_IMPERSONATION_IS_ACTIVE setting being static are not able to 

After thinking this through, the reason why that variable was made non-static 
is because if we run multiple tests in parallel using xUnit they cannot use a 
static setting that changes for various tests. So we need another solution than 
going back to the original Java implementation that was static (that is, if we 
plan to go forward with xUnit...).

Also, as I previously mentioned 
 the fact that the SPIClassIterator just loads the types from the current 
AppDomain in random order is an issue for anyone who needs to extend it (as 
well as a problem for testing).

The Problem

Many of the RW codec files (test mocks) require the ability to read the 
OLD_FORMAT_IMPERSONATION_IS_ACTIVE variable from LuceneTestCase. In Lucene, 
they did so by reading the setting the variable directly because it was static, 
so they would have access to the setting in real time.

Connie solved part of the issue by adding a constructor to the codec test 
mocks, which allows you to inject this setting into the IndexWriterConfig (an 
example here: 
 That works for the IndexWriter, however, when using the IndexReader the codec 
names are read from the index file and loaded directly by NamedSPILoader 
without any way to intervene or pass anything to the constructor.

In addition, the ORDER of the classes that are currently loaded by 
SPIClassIterator is important for the tests. They require that the test codecs 
are loaded before the production codecs. This is also important if someone 
wanted to extend the codec for some reason - they will also need their 
customizations to load first before the internal Lucene classes.

Proposed Solution

We could add a correlation to the Java ClassLoader to Lucene.Net, called 
ITypeLoader. The constructor and method overloads from Lucene that accept 
ClassLoader were left out of Lucene.Net, which is one reason we are missing 
this functionality. The main purpose of ITypeLoader would be to load and ORDER 
the types that are iterated by the SPIClassIterator. ITypeLoader would be set 
statically at application startup 
(Lucene.Net.Support.TypeLoader.SetTypeLoader(ITypeLoader)) and will also have a 
default value (DefaultTypeLoader) if no customization is needed.

The existing SPIClassIterator would be refactored to only iterate and not cache 
the types. It will also be passed a parameter (as it was in Lucene) to filter 
for types that directly inherit the passed in type. The caching of types could 
be moved to a decorator ITypeLoader instead, but we should remove the 
constraints that we currently have on constructor signatures.

In addition, we could have an IInstanceFactory (abstract factory) that could be 
implemented in order to provide additional dependencies/configuration to types 
that are customized, similar to what is specified here: 
http://blog.ploeh.dk/2014/05/19/di-friendly-framework/.  Its purpose would be 
to replace the current Activator.CreateInstance() calls that only accept a 
default constructor (so we can pass in test variables that apply to the 
instance). We could also statically specify IInstanceFactory at application 
startup (Lucene.Net.Support.TypeLoader.SetInstanceFactory(IInstanceFactory)). 
One major benefit of this is that if one were so inclined, they could use a 3rd 
party dependency injection container to supply these instances to the 
NamedSPILoader (which is where they are cached until they are used).

The behavior of the default value of IInstanceFactory (DefaultInstanceFactory) 
is pretty clear - just wrap the same old call to Activator.CreateInstance(). We 
can then easily override this behavior in the test environment to inject 
whatever we need (I am thinking that we could just pass a Func<T> to the 
constructor so it can access the real-time value of the 
OLD_FORMAT_IMPERSONATION_IS_ACTIVE variable, which would most likely just be 
local to each test). We could even use this in other places in Lucene.Net to 
allow for injection of custom constructor parameters where we currently have 
fixed parameter signatures.

However, the behavior of the default value of ITypeLoader (DefaultTypeLoader) 
is less clear. Do we really need to load all of the types from the current 
AppDomain? If so, how do we control the order of the types and/or assemblies?

Option 1

A possible solution would be to just hard code - pure DI - the Lucene.Net 
assemblies (specified by type string) in DefaultTypeLoader (or possibly in a 
custom IComparer<Assembly> dependency?). We could easily control the order this 
way and if people don't need customization this would perform much better. If 
they do need customization, they could simply inherit DefaultTypeLoader, 
override its GetTypes() method, and call base.GetTypes() after specifying the 
customization types. The only downside to this is that new types are not 
automatically discovered.

Option 2

Another solution to this would be to load everything from the AppDomain as we 
do now, but ensure all of the built-in Lucene.Net assemblies are skipped until 
the very end of the process, which means all custom types are automatically 
loaded before all built-in types.

Option 3

Same as Option 1, but also have a configuration file of some sort (possibly 
JSON format...) that could, if supplied, be used to provide the names and order 
of the assemblies to load. This is actually the closest to how it was done in 
Java - there is a .classpath file that determines the order in which the 
packages are loaded, which made it configurable. However, the .classpath is 
apparently a closer analog to a .csproj file as this order is added to the 
compiled .jar package. It doesn't seem very reasonable to require someone to 
hand edit the .csproj file in order to control the dependency order (and I am 
not sure if the assembly/project reference order is preserved for a .NET 
compiled assembly anyway).

Option 4

Same as Option 1, but add a configuration setting "AutoDiscoverAssemblies" to 
plug in the default behavior from Option 2. Basically, default to best 
performance but allow automatic discovery of new types if the option is enabled.

Option 5

Same as Option 1, but have an alternative AssemblyTypeLoader ITypeLoader 
implementation with a constructor parameter that accepts an 
IEnumerable<Assembly> where the user can supply any assemblies to scan for 
Lucene.Net types (in the proper order). If the DefaultTypeLoader is used, no 
scanning will happen. But if the user sets the ITypeLoader to 
AssemblyTypeLoader, the constructor-specified assemblies will be scanned. The 
advantage here is that we don't need to mess with loading and then ignoring 
unwanted assemblies from the AppDomain, but we can still scan for types.

So, what would be the most logical default behavior for DefaultInstanceLoader? 
Is there an alternative way of ordering the assemblies and/or types that we 
should consider?

Shad Storhaug (NightOwl888)

Reply via email to