[ http://issues.apache.org/jira/browse/LUCENE-662?page=all ]
Nicolas Lalevée updated LUCENE-662:
-----------------------------------
Attachment: generic-fieldIO-3.patch
Here is an update of the patch:
- merged with the latest commit in trunk
- I have fixed the issue with stream cloning (by reusing the same cloning
approach as the current trunk)
- the FieldData is back, and so is the Fieldable. The worry I had about
exposing an internal function as public is gone.
- all tests pass.
- I have moved the classes that implement the FieldReader/FieldWriter
in an RDF way into the tests, so there are now some tests of this extension
mechanism.
> Extendable writer and reader of field data
> ------------------------------------------
>
> Key: LUCENE-662
> URL: http://issues.apache.org/jira/browse/LUCENE-662
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Store
> Reporter: Nicolas Lalevée
> Priority: Minor
> Attachments: generic-fieldIO-2.patch, generic-fieldIO-3.patch,
> generic-fieldIO.patch
>
>
> As discussed on the dev mailing list, I have modified Lucene to make it
> possible to define how the data of a field is written to and read from the
> index.
> Basically, I have introduced the notion of IndexFormat. It is in fact a
> factory of FieldsWriter and FieldsReader. So the IndexReader, the IndexWriter
> and the SegmentMerger use this factory instead of doing a "new
> FieldsReader/Writer()".
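
A minimal sketch of the factory idea described above, not the code from the
patch. The names IndexFormat, FieldsWriter, FieldsReader and
DefaultIndexFormat come from the description; every method signature
(createFieldsWriter, createFieldsReader, writeField, readField) and the
on-disk layout are assumptions made for illustration, and the types are
simplified stand-ins rather than the real Lucene classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class IndexFormatSketch {
    // Simplified stand-ins: the real FieldsWriter/FieldsReader do much more.
    interface FieldsWriter {
        void writeField(String name, String value) throws IOException;
    }

    interface FieldsReader {
        String[] readField() throws IOException; // returns {name, value}
    }

    // The factory: callers ask the format for a writer or reader
    // instead of calling "new FieldsWriter()" directly.
    interface IndexFormat {
        FieldsWriter createFieldsWriter(DataOutputStream out);
        FieldsReader createFieldsReader(DataInputStream in);
    }

    // Hypothetical default: stores name and value as two UTF strings.
    static class DefaultIndexFormat implements IndexFormat {
        public FieldsWriter createFieldsWriter(DataOutputStream out) {
            return (name, value) -> {
                out.writeUTF(name);
                out.writeUTF(value);
            };
        }

        public FieldsReader createFieldsReader(DataInputStream in) {
            return () -> new String[] { in.readUTF(), in.readUTF() };
        }
    }

    // Writes one field and reads it back, going through the factory only.
    static String[] roundTrip(IndexFormat format, String name, String value)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            format.createFieldsWriter(out).writeField(name, value);
        }
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        return format.createFieldsReader(in).readField();
    }

    public static void main(String[] args) throws IOException {
        String[] field = roundTrip(new DefaultIndexFormat(), "title", "Lucene");
        System.out.println(field[0] + " = " + field[1]);
    }
}
```

Because the caller only sees IndexFormat, swapping in another format would
mean passing a different factory, without touching the calling code.
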
> I have also introduced the notion of FieldData. It holds all the data of a
> field, and also handles writing it to and reading it from a stream. I have
> done it this way because in the current design of Lucene, Fieldable is an
> interface, so methods with protected or package visibility cannot be
> defined on it.
> A FieldsWriter just writes data into a stream via the FieldData of the field.
> A FieldsReader instantiates a FieldData depending on the field name. Then it
> uses the field data to read the stream. And finally it instantiates a Field
> with the field data.
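
The write and read flow above could look roughly like the following sketch.
Only the name FieldData and the idea of choosing it from the field name come
from the description. TextFieldData, IntFieldData, the "_int" naming rule and
all method signatures are hypothetical, and the final step of wrapping the
data in a Field is left out to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class FieldDataSketch {
    // Holds the data of one field and knows how to (de)serialize itself.
    static abstract class FieldData {
        abstract void write(DataOutput out) throws IOException;
        abstract void read(DataInput in) throws IOException;
        abstract String stringValue();
    }

    // Hypothetical implementation storing a UTF string.
    static class TextFieldData extends FieldData {
        String text = "";
        void write(DataOutput out) throws IOException { out.writeUTF(text); }
        void read(DataInput in) throws IOException { text = in.readUTF(); }
        String stringValue() { return text; }
    }

    // Hypothetical implementation storing a 4-byte integer.
    static class IntFieldData extends FieldData {
        int number;
        void write(DataOutput out) throws IOException { out.writeInt(number); }
        void read(DataInput in) throws IOException { number = in.readInt(); }
        String stringValue() { return Integer.toString(number); }
    }

    // Assumed rule: the field name decides which FieldData to instantiate.
    static FieldData createFieldData(String fieldName) {
        return fieldName.endsWith("_int") ? new IntFieldData() : new TextFieldData();
    }

    // Writer side: the field name, then whatever the FieldData writes.
    static void writeField(DataOutput out, String name, FieldData data)
            throws IOException {
        out.writeUTF(name);
        data.write(out);
    }

    // Reader side: pick a FieldData from the name, let it read the stream.
    static FieldData readField(DataInput in) throws IOException {
        String name = in.readUTF();
        FieldData data = createFieldData(name);
        data.read(in);
        return data;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        IntFieldData pages = new IntFieldData();
        pages.number = 42;
        writeField(out, "pages_int", pages);
        out.flush();

        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        System.out.println(readField(in).stringValue());
    }
}
```
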
> About compatibility, I think it is kept, as I have written a
> DefaultIndexFormat that provides a DefaultFieldsWriter and a
> DefaultFieldsReader. These implementations do exactly the job that is done
> today.
> To achieve this modification, some classes and methods had to be changed
> from private and/or final to public or protected.
> About the lazy fields, I have implemented them in a more general way in the
> implementation of the abstract class FieldData, so it will be totally
> transparent for a Lucene user who extends FieldData. The stream is
> kept in the FieldData and used as soon as stringValue() (or another
> accessor) is called. Implementing it this way allowed me to handle the
> recently introduced LOAD_FOR_MERGE; it is just a lazy field data, and when
> read() is called on this lazy field data, the saved input stream is copied
> directly to the output stream.
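
A rough sketch of the lazy behaviour described above, under simplifying
assumptions: the saved stream is modelled as a byte array slice instead of a
cloned index input, the value is assumed to be UTF-8 text, and the class and
method names (LazyFieldData, isLoaded, write) are hypothetical. It only
illustrates the two paths: decoding on first access, and copying the raw
bytes for a merge without decoding them.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class LazyFieldDataSketch {
    static class LazyFieldData {
        private final byte[] source; // stands in for the saved input stream
        private final int offset;
        private final int length;
        private String value;        // stays null until first access

        LazyFieldData(byte[] source, int offset, int length) {
            this.source = source;
            this.offset = offset;
            this.length = length;
        }

        // Decodes the stored bytes only when the value is actually needed.
        String stringValue() {
            if (value == null) {
                value = new String(source, offset, length, StandardCharsets.UTF_8);
            }
            return value;
        }

        boolean isLoaded() {
            return value != null;
        }

        // Merge path: copy the raw bytes to the output without decoding.
        void write(OutputStream out) throws IOException {
            out.write(source, offset, length);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] stored = "hello lazy field".getBytes(StandardCharsets.UTF_8);
        LazyFieldData data = new LazyFieldData(stored, 6, 4);

        ByteArrayOutputStream merged = new ByteArrayOutputStream();
        data.write(merged);                   // raw copy, nothing decoded
        System.out.println(data.isLoaded());
        System.out.println(data.stringValue());
        System.out.println(data.isLoaded());
    }
}
```
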
> I have one remaining issue with this patch. The current design allows
> reading an index in an old format and simply doing a writer.addIndexes()
> into a new format. With the new design you cannot, because the writer will
> use the FieldData.write provided by the reader.
> Enjoy!