[jira] [Updated] (LUCENE-5513) Binary DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-5513:
---
Attachment: LUCENE-5513.patch

Thanks Mike. In an edge case where there are field updates but also deletes, such that all of the updated documents were deleted, I created the DVFUpdates instances prematurely, leading to the NPE. The patch fixes this and also integrates BDV updates into TestIWExceptions.testNoLostDeletesOrUpdates.

> Binary DocValues Updates
>
> Key: LUCENE-5513
> URL: https://issues.apache.org/jira/browse/LUCENE-5513
> Project: Lucene - Core
> Issue Type: Wish
> Components: core/index
> Reporter: Mikhail Khludnev
> Priority: Minor
> Attachments: LUCENE-5513.patch, LUCENE-5513.patch, LUCENE-5513.patch, LUCENE-5513.patch
>
> LUCENE-5189 was a great move forward. I wish to continue. The reason for
> having this feature is to have a "join-index": to write children docnums into
> the parent's binaryDV. I can try to proceed with the implementation, but I'm not so
> experienced in such deep Lucene internals. [~shaie], any hint to begin with
> is much appreciated.

--
This message was sent by Atlassian JIRA (v6.2#6252)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5513) Binary DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-5513:
---
Attachment: LUCENE-5513.patch

Fixed stupid bug in BinaryDocValuesFieldUpdates.merge().
[jira] [Updated] (LUCENE-5513) Binary DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-5513:
---
Attachment: LUCENE-5513.patch

Patch makes the following refactoring changes (all internal API):

* DocValuesUpdate abstract class w/ common implementation for NumericDocValuesUpdate and BinaryDocValuesUpdate.
* DocValuesFieldUpdates holds the doc+updates for a single field. It mostly defines the API for the Numeric* and Binary* implementations.
* DocValuesFieldUpdates.Container holds numeric+binary updates for a set of fields. As its name says, it is a container of updates used by ReaderAndUpdates.
** It avoids bloating the API w/ more maps being passed around, and it simplifies BufferedUpdatesStream and IndexWriter.commitMergedDeletes.
** It also serves as a factory method based on the updates Type.
* Finished TestBinaryDVUpdates.
* Added TestMixedDVUpdates, which ports some of the 'big' tests from both TestNDV/BDVUpdates and mixes some NDV and BDV updates.
** I'll beast it some to make sure all edge cases are covered.

I may take a crack at simplifying IW.commitMergedDeletes even more by pulling a lot of duplicate code into a method. This is impossible now because those sections modify more than one state variable, but I'll try to stuff these variables in a container to make this method more sane to read.

Otherwise, I think it's ready.
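For readers unfamiliar with the internals, the container idea described in this message can be pictured with a small standalone sketch. It is only an illustration under stated assumptions: every name below (UpdatesContainerSketch, FieldUpdates, getUpdates, newUpdates, any) is hypothetical and is not taken from the patch, and the per-field storage is simplified to plain maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: names and storage are illustrative, not the patch's classes.
class UpdatesContainerSketch {

    enum Type { NUMERIC, BINARY }

    // Holds the doc -> value updates for a single field.
    abstract static class FieldUpdates {
        final String field;
        final Type type;

        FieldUpdates(String field, Type type) {
            this.field = field;
            this.type = type;
        }

        abstract void add(int doc, Object value);

        abstract int size();
    }

    static final class NumericFieldUpdates extends FieldUpdates {
        private final Map<Integer, Long> values = new HashMap<>();

        NumericFieldUpdates(String field) {
            super(field, Type.NUMERIC);
        }

        @Override
        void add(int doc, Object value) {
            values.put(doc, (Long) value);
        }

        @Override
        int size() {
            return values.size();
        }
    }

    static final class BinaryFieldUpdates extends FieldUpdates {
        private final Map<Integer, byte[]> values = new HashMap<>();

        BinaryFieldUpdates(String field) {
            super(field, Type.BINARY);
        }

        @Override
        void add(int doc, Object value) {
            values.put(doc, (byte[]) value);
        }

        @Override
        int size() {
            return values.size();
        }
    }

    // One object carrying numeric and binary updates for a set of fields,
    // so callers pass a single container instead of several maps.
    static final class Container {
        private final Map<String, NumericFieldUpdates> numeric = new HashMap<>();
        private final Map<String, BinaryFieldUpdates> binary = new HashMap<>();

        boolean any() {
            return !numeric.isEmpty() || !binary.isEmpty();
        }

        // Returns the existing updates for the field and type, or null if there are none.
        FieldUpdates getUpdates(String field, Type type) {
            switch (type) {
                case NUMERIC: return numeric.get(field);
                case BINARY: return binary.get(field);
                default: throw new IllegalArgumentException("unsupported type: " + type);
            }
        }

        // Factory keyed on the update type; a field may be numeric or binary, never both.
        FieldUpdates newUpdates(String field, Type type) {
            if (numeric.containsKey(field) || binary.containsKey(field)) {
                throw new IllegalStateException("updates already exist for field " + field);
            }
            switch (type) {
                case NUMERIC: {
                    NumericFieldUpdates updates = new NumericFieldUpdates(field);
                    numeric.put(field, updates);
                    return updates;
                }
                case BINARY: {
                    BinaryFieldUpdates updates = new BinaryFieldUpdates(field);
                    binary.put(field, updates);
                    return updates;
                }
                default: throw new IllegalArgumentException("unsupported type: " + type);
            }
        }
    }
}
```

In this sketch, creating the per-field holder only through newUpdates keeps the "one type per field" check in a single place, which is one plausible reading of why a factory method is attached to the container.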
[jira] [Updated] (LUCENE-5513) Binary DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-5513:
---
Attachment: LUCENE-5513.patch

Patch:

* Adds IW.updateBinaryDocValue
* Makes the necessary changes to DW, BufferedUpdates(Stream), ReaderAndUpdates
* Adds new BinaryUpdate and BinaryFieldUpdates
* Copies TestNumericDocValuesUpdates and changes it to add BDV fields:
** I still add numbers as it makes asserting easy, but I encode them as VLongs, so we get different lengths of byte[]
** There are some tests still disabled, see below

Patch still doesn't handle updates that came in while a merge was in flight. The reason is that the code in IW.commitMergedDeletes is hairy and adding BinaryDV updates will make it even worse. So I want to refactor how the updates are represented internally, such that there is a single class DocValuesUpdates which captures all DV updates. Since the DV fields cannot overlap (a DV field cannot be both numeric and binary), I think it will allow us to use a single UpdatesIterator from IW.commitMergedDeletes. I'll take a look at that next and re-enable the tests after this has been resolved.

There's one thing to note: binary DV updates are more expensive to apply than NDV updates, depending on the length of the BDV. I.e. when we rewrite the DV file, for NDV we know we write at most 8 bytes per document (compressed), but for BDV we may write a large number of bytes per document. I prefer to leave that as an optimization for later. One idea I have (which applies to NDV as well) is to do that in a sparse DV, or add "stacked" DV files. Either will make the producing code more complex, and therefore I prefer to explore that later.
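The test trick mentioned in this message (storing numbers as VLongs so that the byte[] values have different lengths) can be illustrated with a small standalone sketch. It does not use any Lucene classes: the class name VLongSketch and its methods are hypothetical, and the layout (7 data bits per byte, high bit set when more bytes follow) is assumed to match the usual variable-length long format.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of Lucene: variable-length encoding of a
// non-negative long, 7 data bits per byte, high bit = "more bytes follow".
class VLongSketch {

    static byte[] encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            // Emit the low 7 bits with the continuation bit set.
            out.write((int) ((value & 0x7FL) | 0x80L));
            value >>>= 7;
        }
        // Last byte: remaining bits, continuation bit clear.
        out.write((int) value);
        return out.toByteArray();
    }

    static long decode(byte[] bytes) {
        long value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return value;
    }

    public static void main(String[] args) {
        long[] samples = {5L, 300L, 1L << 40};
        for (long v : samples) {
            byte[] encoded = encode(v);
            System.out.println(v + " -> " + encoded.length + " byte(s), decoded " + decode(encoded));
        }
    }
}
```

Small values fit in one byte while larger ones need several, which is what gives a test binary values of varying length while keeping assertions on plain numbers.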
[jira] [Updated] (LUCENE-5513) Binary DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Khludnev updated LUCENE-5513:
-
Component/s: core/index
Description: LUCENE-5189 was a great move forward. I wish to continue. The reason for having this feature is to have a "join-index": to write children docnums into the parent's binaryDV. I can try to proceed with the implementation, but I'm not so experienced in such deep Lucene internals. [~shaie], any hint to begin with is much appreciated.