What's the concern? A read-once file that we slurp in hurts nothing as far as open file limits etc. go. I don't think we should be so damn crazy about the number of files anyway: we have CFS (compound files) as a solution for that (unrelated).

On Jun 26, 2012 6:13 PM, "Andrzej Bialecki" <[email protected]> wrote:
> On 26/06/2012 23:13, Michael McCandless wrote:
>
>> +1, if we can find some clean way of doing it that doesn't rely on
>> file length on read (i.e., to seek backwards to the header).
>
> I don't like the additional file idea, we already create too many files
> ... maybe record this in a segmentInfo attribute?
>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Tue, Jun 26, 2012 at 11:32 AM, Robert Muir <[email protected]> wrote:
>>
>>> Just looking at the previous thread, I wonder if we should consider
>>> removing AppendingCodec and just removing this seek stuff.
>>>
>>> Currently this is essentially metadata stuff in terms dict/index (e.g.
>>> terms dict field summary section and offsets for each field in terms
>>> index:
>>> https://builds.apache.org/job/Lucene-trunk/javadoc/core/org/apache/lucene/codecs/lucene40/Lucene40PostingsFormat.html
>>> )
>>>
>>> I know the typical argument for keeping this stuff is that we would
>>> need to rely upon additional file operations (e.g. length), and we
>>> want to limit that, but this isn't the only possible solution, e.g. we
>>> could write a read-once file with this metadata that's just slurped in.
>>>
>>> And really relying upon seek at write could be viewed as just as bad
>>> as relying upon length, obviously we know some filesystems don't
>>> support it.
>>>
>>> --
>>> lucidimagination.com
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>
> --
> Best regards,
> Andrzej Bialecki
> http://www.sigram.com, blog http://www.sigram.com/blog
> Information Retrieval, System Integration
> Contact: info at sigram dot com
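
For anyone skimming the thread, here is a rough sketch of the three layouts being compared: patching a header by seeking back at write time, appending a footer that the reader locates via the file length, and writing a separate read-once metadata file. It is plain Python rather than Java and uses no Lucene API; BODY, SUMMARY, the file names, and every helper function are made up for illustration and are not how any codec is actually implemented.

```python
import os
import struct
import tempfile

BODY = b"term-block-0term-block-1"   # hypothetical stand-in for the bulk terms data
SUMMARY = b"field-summary"           # hypothetical stand-in for the per-field metadata


def write_seek_back(path):
    """Pointer lives in the header and is patched later: needs seek() on write."""
    with open(path, "wb") as f:
        f.write(struct.pack(">q", 0))      # placeholder for the summary offset
        f.write(BODY)
        offset = f.tell()
        f.write(SUMMARY)
        f.seek(0)                          # impossible on append-only storage
        f.write(struct.pack(">q", offset))


def read_seek_back(path):
    with open(path, "rb") as f:
        (offset,) = struct.unpack(">q", f.read(8))
        f.seek(offset)
        return f.read()


def write_footer(path):
    """Append-only writer: the pointer is written last."""
    with open(path, "wb") as f:
        f.write(BODY)
        offset = f.tell()
        f.write(SUMMARY)
        f.write(struct.pack(">q", offset))


def read_footer(path):
    length = os.path.getsize(path)         # the extra "length" file operation
    with open(path, "rb") as f:
        f.seek(length - 8)
        (offset,) = struct.unpack(">q", f.read(8))
        f.seek(offset)
        return f.read(length - 8 - offset)


def write_side_file(path, meta_path):
    """Append-only writer plus a small metadata file written once at the end."""
    with open(path, "wb") as f:
        f.write(BODY)
    with open(meta_path, "wb") as m:
        m.write(SUMMARY)


def read_side_file(meta_path):
    with open(meta_path, "rb") as m:       # opened, slurped whole, closed
        return m.read()


def demo():
    """Write all three layouts to a temp dir and read the summary back from each."""
    with tempfile.TemporaryDirectory() as d:
        a, b, c, meta = (os.path.join(d, n)
                         for n in ("a.bin", "b.bin", "c.bin", "c.meta"))
        write_seek_back(a)
        write_footer(b)
        write_side_file(c, meta)
        return read_seek_back(a), read_footer(b), read_side_file(meta)
```

Calling demo() returns the same summary bytes from all three layouts. Only the third needs neither a seek at write time nor the file length at read time, at the cost of one extra small file that is held open only while it is being read.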
