Hi, I reviewed the index configuration section in the ref guide and then SolrIndexConfig, and I'm not sure whether the current behavior is intentional or, again, a relic from older days when configuration was simpler. Either way, I think the useCompoundFile parameter needs some clarification:
IndexWriterConfig.useCompoundFile controls whether newly flushed segments are packed into a compound file (CFS). This matters if, for example, you do batch indexing and intend to finish with one big merge; in that case the extra CFS packing on every flush is redundant.
MergePolicy.noCFSRatio determines which merged segments are packed into a CFS (there is also maxCFSSegmentSizeMB). This lets you avoid the extra packing for very large segments, where the packing itself is expensive during indexing but does not buy you much during searching.
In SolrIndexConfig I see that if the user defines a top-level <useCompoundFile> element (i.e. outside the MP setting), it controls both IWC and MP (it sets noCFSRatio=1.0). The code does the right thing, though, in that if you also specify noCFSRatio and maxCFSSegmentSizeMB, they are applied correctly later on.
I understand that this might seem like a simplification for users, who set this value once and have it control both places, but I think it's bad. First, if you set <useCompoundFile> to true, you basically *always* end up with CFS, even if you intended it to apply only to newly flushed segments. To get the default settings for merged segments, you have to explicitly include them in the <mergePolicy> element. I think this is trappy and looks odd. Second, I think it's fine to expect our users to understand the implications of setting either value. The defaults are fine as they are, and if users really want to tune this, it's OK to ask them to read the docs and understand which parameter they are setting and for what purpose.
Beyond that, SolrIndexConfig in trunk contains deprecated code around this parameter and somewhat hacks around older schemas that defined useCFS inside the MP element. Are we still required to support that back-compat in trunk as well?
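
To make the trap with <useCompoundFile> concrete, here is a rough sketch of the config a user currently needs if they want CFS only for newly flushed segments. The <indexConfig> wrapper, the TieredMergePolicy class, the <double name="..."> syntax and the 0.1 value (which I believe is the TieredMergePolicy default) are my assumptions for illustration, so check them against the version you run:

```xml
<indexConfig>
  <!-- Meant to affect only newly flushed segments, but as described above
       it also sets noCFSRatio=1.0 on the merge policy. -->
  <useCompoundFile>true</useCompoundFile>

  <!-- Restate the merge policy default explicitly so that merged segments
       are not all packed into CFS. The value 0.1 is an assumption. -->
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <double name="noCFSRatio">0.1</double>
  </mergePolicy>
</indexConfig>
```

Without the explicit noCFSRatio entry, the top-level element alone would leave merged segments packed into CFS as well, which is the part I find surprising.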
These two issues could be handled separately, but if others agree that we should use explicit settings for this, I don't mind tackling both (explicit settings and removing the deprecated code in trunk) under one issue.
Shai
