On Wednesday 16 August 2006 14:51, Grant Ingersoll wrote:
> On Aug 16, 2006, at 8:32 AM, Nicolas Lalevée wrote:
> > Hi,
> >
> > In the issue, you wrote that "This way the indexing level just
> > stores opaque binary fields, and then Document handles
> > compress/uncompressing as needed."
> >
> > I have looked into the Lucene code, and it seems to me that it is
> > Field that should take care of compress/uncompress, and it is the
> > FieldsReader and FieldsWriter that should only view binary data.
> > Or do you mean that compression should be completely external to
> > Lucene?
>
> I believe the consensus is it should be done externally.
>
> > In fact, at the end of the other thread "Flexible index format /
> > Payloads Cont'd", I was discussing how to customize the way data
> > are stored. So I have looked deeper into the code and I think I
> > have found a way to do so. And since you can change the way data
> > are stored, you can also define the compression level, or handle
> > your own compression algorithm. I will show you a patch, but I
> > have modified so much code over my several attempts that I first
> > need to remove the unnecessary changes. To describe it briefly:
> > - I have added a way to provide your own FieldsReader and
> >   FieldsWriter (via a factory). To create an IndexReader, you have
> >   to provide that factory; the current API just uses a default
> >   factory.
> > - I have moved the code of FieldsReader and FieldsWriter that does
> >   the field data reading and writing to a new class, FieldData.
> >   The FieldsReader instantiates a FieldData, does a
> >   fielddata.read(input), and does a new Field(fielddata,...). The
> >   FieldsWriter does a field.getFieldData().write(output);
> > - so by extending FieldsReader, you can provide your own
> >   implementation of FieldData, and thereby control how data are
> >   stored and read.
> >
> > The tests pass successfully, but I have an issue with that design:
> > one thing that I think is important is that in the current design,
> > we can read an index in an old format, and just do a
> > writer.addIndexes() into a new format. With the new design, you
> > cannot, because the writer will use the FieldData.write provided
> > by the reader.
> > To be continued...
>
> I would love to see this patch. I think one could make a pretty good
> argument for this kind of implementation being done "cleanly", that
> is, it shouldn't necessarily involve reworking the internals, but
> instead could represent the foundation for a new, codec based
> indexing mechanism (with an implementation that can read/write the
> existing file format.)

Here it is: https://issues.apache.org/jira/browse/LUCENE-662

Enjoy!

Nicolas

-- 
Nicolas LALEVÉE
Solutions & Technologies
ANYWARE TECHNOLOGIES
Tel : +33 (0)5 61 00 52 90
Fax : +33 (0)5 61 00 51 46
http://www.anyware-tech.com
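
For readers who want a concrete picture of the FieldData idea described in the quoted message, here is a minimal, self-contained sketch. It is not the code from the LUCENE-662 patch and it does not use any Lucene class: FieldData, PlainFieldData, DeflatedFieldData and FieldDataFactory are hypothetical names chosen for illustration, and java.io.DataInput/DataOutput stand in for Lucene's index input and output. It only shows the shape of the design, assuming each field value knows how to read and write itself and a factory decides which implementation the reading side instantiates.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class FieldDataSketch {

    /** Hypothetical stand-in for the FieldData class described in the message. */
    interface FieldData {
        void read(DataInput in) throws IOException;
        void write(DataOutput out) throws IOException;
        byte[] value();
    }

    /** Default behaviour: store the bytes as-is, prefixed by their length. */
    static class PlainFieldData implements FieldData {
        protected byte[] bytes = new byte[0];
        PlainFieldData() {}
        PlainFieldData(byte[] bytes) { this.bytes = bytes; }
        public void read(DataInput in) throws IOException {
            bytes = new byte[in.readInt()];
            in.readFully(bytes);
        }
        public void write(DataOutput out) throws IOException {
            out.writeInt(bytes.length);
            out.write(bytes);
        }
        public byte[] value() { return bytes; }
    }

    /** Custom behaviour: deflate on write, inflate on read (java.util.zip). */
    static class DeflatedFieldData extends PlainFieldData {
        DeflatedFieldData() {}
        DeflatedFieldData(byte[] bytes) { super(bytes); }
        @Override public void read(DataInput in) throws IOException {
            byte[] packed = new byte[in.readInt()];
            in.readFully(packed);
            try (InputStream is = new InflaterInputStream(new ByteArrayInputStream(packed))) {
                bytes = is.readAllBytes();
            }
        }
        @Override public void write(DataOutput out) throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DeflaterOutputStream dos = new DeflaterOutputStream(buf)) {
                dos.write(bytes);
            }
            byte[] packed = buf.toByteArray();
            out.writeInt(packed.length);
            out.write(packed);
        }
    }

    /** Hypothetical factory: chooses which FieldData the reader side creates. */
    interface FieldDataFactory {
        FieldData newFieldData();
    }

    /** Writes a value with its own write(), then reads it back via the factory. */
    static byte[] roundTrip(FieldData toWrite, FieldDataFactory factory) throws IOException {
        ByteArrayOutputStream store = new ByteArrayOutputStream();
        toWrite.write(new DataOutputStream(store));
        FieldData loaded = factory.newFieldData();
        loaded.read(new DataInputStream(new ByteArrayInputStream(store.toByteArray())));
        return loaded.value();
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "stored field value".getBytes(StandardCharsets.UTF_8);
        byte[] plain = roundTrip(new PlainFieldData(text), PlainFieldData::new);
        byte[] packed = roundTrip(new DeflatedFieldData(text), DeflatedFieldData::new);
        System.out.println(new String(plain, StandardCharsets.UTF_8));
        System.out.println(new String(packed, StandardCharsets.UTF_8));
    }
}
```

In this sketch the stored format is tied to whichever FieldData object does the writing, which mirrors the addIndexes() concern raised in the message: a value loaded through one implementation would be written back in that same implementation's format unless the writer converts it first.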