On Monday 14 August 2006 20:44, Michael McCandless wrote:
> >> If you make the compression external this is already done. In order
> >> to do what the poster requires, you still need to read and update
> >> fields without reading the entire document. You just do this at a
> >> binary field level, and do all of the compression/decompression
> >> externally.
> >>
> >> I think putting the compression into Lucene needlessly complicates
> >> matters. All that is required is in-place field updating, and binary
> >> field support.
> >
> > I agree with you.
> > The API should be kept compatible between versions, but what about
> > breaking the compatibility in trunk? Would it be a problem if the
> > function Fieldable.isCompressed() were removed?
>
> OK I think this makes total sense. I've opened an issue to track this:
>
> http://issues.apache.org/jira/browse/LUCENE-652

Hi,

In the issue, you wrote that "This way the indexing level just stores
opaque binary fields, and then Document handles compress/uncompressing
as needed."

I have looked into the Lucene code, and it seems to me that it is Field
that should take care of compressing and uncompressing, and that
FieldsReader and FieldsWriter should only see binary data. Or do you
mean that compression should be completely external to Lucene?

In fact, at the end of the other thread "Flexible index format /
Payloads Cont'd", I was discussing how to customize the way data is
stored. So I have looked deeper into the code, and I think I have found
a way to do so. And since you can change the way data is stored, you can
also define the compression level, or handle your own compression
algorithm. I will show you a patch, but I have modified so much code
during my several attempts that I first need to remove the unnecessary
changes.

To describe it briefly:
- I have added a way to supply your own FieldsReader and FieldsWriter
  (via a factory). To create an IndexReader, you have to provide that
  factory; the current API just uses a default factory.
- I have moved the code of FieldsReader and FieldsWriter that reads and
  writes the field data to a new class, FieldData. The FieldsReader
  instantiates a FieldData, calls fielddata.read(input), and creates a
  new Field(fielddata,...). The FieldsWriter calls
  field.getFieldData().write(output).
- So by extending FieldsReader, you can provide your own implementation
  of FieldData, and thereby control how the data is stored and read.

The tests pass, but I have an issue with that design. One important
property of the current design, I think, is that we can read an index in
an old format and just do a writer.addIndexes() to convert it into a new
format. With the new design you cannot, because the writer will use the
FieldData.write provided by the reader.

To be continued...
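
To make the description above a bit more concrete, here is a minimal,
self-contained sketch of the idea. It is not the patch itself and does
not use any Lucene class: FieldData is reduced to an interface, and
FieldDataFactory, RawFieldData, DeflatedFieldData and FieldDataDemo are
hypothetical names chosen only for this illustration. It also assumes
java.io.DataInput/DataOutput as stand-ins for Lucene's index input and
output streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical stand-in for the proposed FieldData class: it alone
// decides how one stored field value is encoded.
interface FieldData {
    void read(DataInput in) throws IOException;
    void write(DataOutput out) throws IOException;
    byte[] value();
}

// Hypothetical factory: a custom reader would supply its own FieldData.
interface FieldDataFactory {
    FieldData newFieldData();
}

// Default encoding: a length prefix followed by the raw bytes.
class RawFieldData implements FieldData {
    private byte[] bytes = new byte[0];
    RawFieldData() {}
    RawFieldData(byte[] bytes) { this.bytes = bytes.clone(); }
    public void read(DataInput in) throws IOException {
        bytes = new byte[in.readInt()];
        in.readFully(bytes);
    }
    public void write(DataOutput out) throws IOException {
        out.writeInt(bytes.length);
        out.write(bytes);
    }
    public byte[] value() { return bytes.clone(); }
}

// Custom encoding: deflate with a caller-chosen compression level.
class DeflatedFieldData implements FieldData {
    private final int level;
    private byte[] bytes = new byte[0];
    DeflatedFieldData(int level) { this.level = level; }
    DeflatedFieldData(byte[] bytes, int level) { this(level); this.bytes = bytes.clone(); }
    public void read(DataInput in) throws IOException {
        int rawLength = in.readInt();
        byte[] compressed = new byte[in.readInt()];
        in.readFully(compressed);
        bytes = new byte[rawLength];
        try (DataInputStream inflated = new DataInputStream(
                new InflaterInputStream(new ByteArrayInputStream(compressed)))) {
            inflated.readFully(bytes);
        }
    }
    public void write(DataOutput out) throws IOException {
        Deflater deflater = new Deflater(level);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(buffer, deflater)) {
            dos.write(bytes);
        } finally {
            deflater.end(); // a caller-supplied Deflater must be released explicitly
        }
        byte[] compressed = buffer.toByteArray();
        out.writeInt(bytes.length);      // uncompressed length
        out.writeInt(compressed.length); // compressed length
        out.write(compressed);
    }
    public byte[] value() { return bytes.clone(); }
}

class FieldDataDemo {
    // Plays the role of the writer: it only delegates to FieldData.write.
    static byte[] store(FieldData data) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        data.write(new DataOutputStream(buffer));
        return buffer.toByteArray();
    }
    // Plays the role of the reader: the factory picks the on-disk format.
    static FieldData load(byte[] stored, FieldDataFactory factory) throws IOException {
        FieldData data = factory.newFieldData();
        data.read(new DataInputStream(new ByteArrayInputStream(stored)));
        return data;
    }
    public static void main(String[] args) throws IOException {
        byte[] text = "lucene lucene lucene lucene".getBytes(StandardCharsets.UTF_8);
        byte[] stored = store(new DeflatedFieldData(text, Deflater.BEST_COMPRESSION));
        FieldData back = load(stored, () -> new DeflatedFieldData(Deflater.BEST_COMPRESSION));
        System.out.println(new String(back.value(), StandardCharsets.UTF_8));
    }
}
```

Note that store() writes whatever format the FieldData object carries,
which is exactly the addIndexes() concern: an object created by an old
reader keeps writing the old format. In this sketch, one would have to
re-wrap the value explicitly, for example
store(new RawFieldData(back.value())), to change formats.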

cheers,
Nicolas