NightOwl888 commented on issue #403: URL: https://github.com/apache/lucenenet/issues/403#issuecomment-765444859
@rclabo There is one "default" codec, which defaults to `"Lucene46"` (the default changes with each Lucene version) and can be set or retrieved through the `Codec.Default` property. If no codec is registered under the name `"Lucene46"` and the `Codec.Default` property is not explicitly set, opening the `IndexWriter` throws a `NullReferenceException` (this should probably be changed to `InvalidOperationException` for .NET compatibility).

This means the codec does not have to be specified in `IndexWriterConfig` each time you open an index unless it differs from the default. It can be set once at application startup.

```c#
Codec.Default = new Lucene46HighCompressionCodec();
```

However, in `IndexWriter` the codec that is set (or defaulted) applies only to writing *new segments* to the index. Each segment can technically have a different codec, which is specified through the `SegmentInfo.Codec` property, but by default all segments are initialized with the codec passed through `IndexWriterConfig.Codec` (which can be overridden).

As you have correctly pointed out, when an index is opened for reading (even with NRT), the codec specified in the index header is used rather than the one set on the `IndexWriter`.

> Is there an existing API to get this codec name from the header?

There is, but it is not technically meant for end users. It requires you to know the name of the segment file in the index as well as the zero-based index of the segment within that file.

```c#
var sis = new SegmentInfos();
sis.Read(directory, segmentFileName);
string codecName = sis.Segments[segmentIndex].Info.Codec.Name;
```

Do note, however, that this internally calls `Codec.ForName()` to instantiate the codec, so the codec must be registered with Lucene.NET before its name can be read this way. The actual `Read()` method contains quite a bit of version-specific branching logic, so deconstructing it so that it always gives you a name without ever calling `Codec.ForName()` is a bit more involved.
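If you do not know the segment file name or the segment position up front, a minimal sketch along the following lines may help. It assumes the `SegmentInfos.Read(Directory)` overload, which locates the most recent commit on its own, and that `SegmentCommitInfo.Info.Name` is available, as in the Java API this was ported from. The class name `SegmentCodecLister`, the method `PrintSegmentCodecs`, and the `indexPath` parameter are hypothetical names used only for illustration. The same caveat applies: every codec used by the index must already be registered, because `Codec.ForName()` is still called internally.

```c#
using System;
using Lucene.Net.Index;
using Lucene.Net.Store;
// Alias avoids a clash with System.IO.Directory in projects that import System.IO.
using LuceneDirectory = Lucene.Net.Store.Directory;

// Hypothetical helper, not part of Lucene.NET.
public static class SegmentCodecLister
{
    // Writes the name of each segment in the latest commit together with
    // the name of the codec recorded for that segment.
    public static void PrintSegmentCodecs(string indexPath)
    {
        using (LuceneDirectory directory = FSDirectory.Open(indexPath))
        {
            var sis = new SegmentInfos();

            // Assumption: this overload finds the current segments file itself,
            // so no segment file name has to be supplied.
            sis.Read(directory);

            foreach (SegmentCommitInfo commitInfo in sis.Segments)
            {
                SegmentInfo info = commitInfo.Info;
                Console.WriteLine($"{info.Name}: {info.Codec.Name}");
            }
        }
    }
}
```

Calling `SegmentCodecLister.PrintSegmentCodecs(pathToIndex)` would list one line per segment, which makes it easy to spot an index whose segments were written with different codecs.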
