[ 
https://issues.apache.org/jira/browse/LUCENE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5216:
-------------------------------

    Attachment: LUCENE-5216.patch

Patch for trunk:

* Remove SegmentInfo.attributes() and related methods
* New Lucene46SegmentInfoFormat with reader/writer
* Modify Lucene40SegmentInfoFormat to write an empty map (for tests) and to 
read and ignore any existing map
* Make the relevant changes to Lucene4XRWCodecs

This cannot be merged to 4x as-is, because there we only deprecate 
SI.attributes() rather than remove it. Most of the changes can be backported, 
though (i.e. the changes to the RW codecs and the new format). I'll take care 
of it when we get there.

All core tests pass.
                
> Fix SegmentInfo.attributes when updates are involved
> ----------------------------------------------------
>
>                 Key: LUCENE-5216
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5216
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shai Erera
>         Attachments: LUCENE-5216.patch
>
>
> Today, SegmentInfo.attributes are write-once. However, in the presence of 
> field updates (see LUCENE-5189 and LUCENE-5215) this creates a problem: if a 
> Codec alters the attributes when updates are applied, the changes are 
> silently discarded. This is a corner case, but one that should be addressed.
> There are two ways to address this:
> # Record SI.attributes in SegmentInfos, so they are written per-commit, 
> instead of in the .si file.
> # Remove them altogether, as they don't seem to be used anywhere in Lucene 
> code today.
> If we remove them, we don't take away any special capability from Codecs, 
> because they can still write the attributes to a separate file, or even to 
> the file in which they record their other data. This will work even with 
> updates, as long as Codecs respect the given segmentSuffix.
> If we keep them, I think the simplest solution is to have SegmentInfos read 
> and write them. But if we don't see a good use case, I suggest we remove 
> them, as it's just extra code to maintain. I think we can even risk a 
> backwards break and remove them completely from 4x, though if that's a 
> problem, we can deprecate them instead.
> If anyone sees a good use for them or, better, already uses them, please 
> speak up, so we can make the proper decision.
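
The "separate file" alternative mentioned in the description can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain java.nio and java.util.Properties rather than Lucene's Directory API, and the class SidecarAttributes, the ".attrs" extension and the file-naming scheme are all hypothetical. It also assumes that writes made on behalf of an update are given a segmentSuffix different from the one used for the original segment files.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical sidecar store for codec attributes; not part of Lucene. */
final class SidecarAttributes {

  /**
   * Builds a file name from the segment name and the segment suffix.
   * Including the suffix means a write for an update does not overwrite
   * the file written for the original segment (assuming suffixes differ).
   */
  static String fileName(String segmentName, String segmentSuffix) {
    String suffix = segmentSuffix.isEmpty() ? "" : "_" + segmentSuffix;
    return segmentName + suffix + ".attrs"; // ".attrs" is a made-up extension
  }

  /** Writes the attributes as key=value lines into the sidecar file. */
  static void write(Path dir, String segmentName, String segmentSuffix,
                    Map<String, String> attributes) throws IOException {
    Properties props = new Properties();
    props.putAll(attributes);
    Path file = dir.resolve(fileName(segmentName, segmentSuffix));
    try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
      props.store(out, null);
    }
  }

  /** Reads the attributes back from the sidecar file for the given suffix. */
  static Map<String, String> read(Path dir, String segmentName,
                                  String segmentSuffix) throws IOException {
    Properties props = new Properties();
    Path file = dir.resolve(fileName(segmentName, segmentSuffix));
    try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      props.load(in);
    }
    Map<String, String> attributes = new TreeMap<>();
    for (String key : props.stringPropertyNames()) {
      attributes.put(key, props.getProperty(key));
    }
    return attributes;
  }
}
```

In this sketch the attributes written with an empty suffix and those written with a non-empty suffix live in different files, so altering them during an update leaves the earlier values readable and nothing is silently dropped.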
