[ https://issues.apache.org/jira/browse/LUCENE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865487#action_12865487 ]
Shai Erera commented on LUCENE-1585:
------------------------------------

I've been thinking about the multi-threading issue, and as far as I understand, it only concerns the local segment merging? PPP works w/ Directory+Term because the format of the payloads is per term for the entire Directory (not per segment). Therefore, I don't think there are any multi-threading issues with the external Directories (the ones passed to addIndexes*)?

For the local segments, I see what you mean - it is possible that several threads will ask a PPP for a PP for the same Dir+Term. PPP implementations can still work well in such a scenario (if they wish to process payloads of the local Dir as well) by holding a ThreadLocal PP for each Dir+Term combination? I think proper documentation should be enough in this case.

The whole point of this issue is to allow better control when addIndexes* is used. Affecting local payloads is a nice bonus, and I think we should wait for a real scenario that takes advantage of it. If the threading warnings in the documentation don't help, we can discuss then how to solve it?

> Allow to control how payloads are merged
> ----------------------------------------
>
>                 Key: LUCENE-1585
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1585
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-1585_3x.patch, LUCENE-1585_3x.patch,
> LUCENE-1585_trunk.patch
>
>
> Lucene handles backwards-compatibility of its data structures by
> converting them from the old into the new formats during segment
> merging.
> Payloads are simply byte arrays in which users can store arbitrary
> data. Applications that use payloads might want to convert the format
> of their payloads in a similar fashion. Otherwise it's not easily
> possible to ever change the encoding of a payload without reindexing.
> So I propose to introduce a PayloadMerger class that the SegmentMerger
> invokes to merge the payloads from multiple segments. Users can then
> implement their own PayloadMerger to convert payloads from an old into
> a new format.
> In the future we need this kind of flexibility also for column-stride
> fields (LUCENE-1231) and flexible indexing codecs.
> In addition to that it would be nice if users could store version
> information in the segments file. E.g. they could store "in segment _2
> the term a:b uses payloads of format x.y".

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
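
For illustration, the ThreadLocal idea from the comment above could look roughly like the sketch below. It is not the code in the attached patches and does not use the Lucene API: PayloadProcessor, ThreadLocalProcessors, and the String keys standing in for Directory and Term are all hypothetical names, chosen only to show one processor instance being kept per thread for each Dir+Term combination.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical stand-in for a payload processor; this is NOT the Lucene class.
interface PayloadProcessor {
    byte[] process(byte[] payload);
}

// Hands every thread its own processor for each (directory, term) pair,
// so a stateful processor is never shared between merge threads.
final class ThreadLocalProcessors {
    // Each thread lazily gets a private map; no locking is needed on it.
    private final ThreadLocal<Map<String, PayloadProcessor>> perThread =
            ThreadLocal.withInitial(HashMap::new);

    // Assumed factory that builds a fresh processor for a directory name and term.
    private final BiFunction<String, String, PayloadProcessor> factory;

    ThreadLocalProcessors(BiFunction<String, String, PayloadProcessor> factory) {
        this.factory = factory;
    }

    PayloadProcessor get(String dirName, String term) {
        // A NUL separator keeps ("a", "bc") and ("ab", "c") from colliding.
        String key = dirName + '\0' + term;
        return perThread.get().computeIfAbsent(key, k -> factory.apply(dirName, term));
    }
}
```

Within one thread, repeated calls to get() for the same Dir+Term return the same instance, while a second thread receives a separate one. The per-thread maps are never cleared in this sketch, so a real implementation would also need to decide when to release them.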