[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2015-01-30 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-4524:
--
Attachment: LUCENE-4524.patch

This is a better patch; the old one still had some of the Weight API changes 
from LUCENE-2878 in it.

Scorer extends PostingsEnum directly at the moment, which means that lots of 
Scorer implementations have to implement empty position, offset and payload 
methods.  It might be worth having it extend DocsEnum instead.
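
A minimal sketch of the trade-off described above, assuming simplified 
stand-in classes (every name ending in "Sketch" is hypothetical and is not the 
real Lucene API). The positional stubs live once in a DocsEnum-like base class, 
so a docs-only scorer only has to implement iteration, freq and score:

```java
// Simplified stand-ins for illustration only; these are not the real Lucene classes.
abstract class PostingsEnumSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    public abstract int nextDoc();
    public abstract int freq();
    public abstract int nextPosition();
    public abstract int startOffset();
    public abstract int endOffset();
    public abstract byte[] getPayload();
}

// Convenience base: stubs out positions, offsets and payloads in one place.
abstract class DocsEnumSketch extends PostingsEnumSketch {
    @Override public int nextPosition() { return -1; }
    @Override public int startOffset() { return -1; }
    @Override public int endOffset() { return -1; }
    @Override public byte[] getPayload() { return null; }
}

// A scorer that extends the docs-only base inherits the empty methods.
abstract class ScorerSketch extends DocsEnumSketch {
    public abstract float score();
}

// Hypothetical scorer over a fixed, sorted list of doc ids.
final class ConstantScorerSketch extends ScorerSketch {
    private final int[] docs;
    private final float score;
    private int idx = -1;

    ConstantScorerSketch(int[] docs, float score) {
        this.docs = docs;
        this.score = score;
    }

    @Override public int nextDoc() {
        if (idx < docs.length) idx++;
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    @Override public int freq() { return 1; }
    @Override public float score() { return score; }
}
```

If ScorerSketch extended PostingsEnumSketch directly, ConstantScorerSketch 
would have to repeat all four stub methods itself.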

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.9, Trunk

 Attachments: LUCENE-4524.patch, LUCENE-4524.patch, LUCENE-4524.patch, 
 LUCENE-4524.patch, LUCENE-4524.patch, LUCENE-4524.patch


 spinoff from http://www.gossamer-threads.com/lists/lucene/java-dev/172261
 {noformat}
 hey folks, 
 I have spent a hell of a lot of time on the positions branch to make 
 positions and offsets work on all queries if needed. The one thing 
 that bugged me the most is the distinction between DocsEnum and 
 DocsAndPositionsEnum. Really, when you look at it closer, DocsEnum is a 
 DocsAndFreqsEnum, and if we omit Freqs we should return a DocIdSetIter. 
 Same is true for 
 DocsAndPositionsAndPayloadsAndOffsets*YourFancyFeatureHere*Enum. I 
 don't really see the benefits of this. We should rather make the 
 interface simple and call it something like PostingsEnum, where you 
 have to specify flags on the TermsIterator, and if we can't provide the 
 sufficient enum we throw an exception? 
 I just want to bring up the idea here since it might simplify a lot 
 for users as well as for us when improving our positions / offset etc. 
 support. 
 thoughts? Ideas? 
 simon 
 {noformat}
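
 The proposal quoted above could be sketched roughly as follows. This is a toy 
 model under stated assumptions, not the Lucene API: PostingsFeature, 
 FlaggedPostings and TermsIteratorSketch are hypothetical names, and the 
 choice of IllegalStateException is an assumption, since the message only says 
 "throw an exception".

```java
import java.util.EnumSet;

// Hypothetical feature flags a caller can request.
enum PostingsFeature { FREQS, POSITIONS, OFFSETS, PAYLOADS }

// Single enum type; which features it exposes is decided by the flags.
final class FlaggedPostings {
    private final EnumSet<PostingsFeature> enabled;

    FlaggedPostings(EnumSet<PostingsFeature> enabled) {
        this.enabled = enabled;
    }

    boolean has(PostingsFeature feature) {
        return enabled.contains(feature);
    }
}

// Stand-in for the "TermsIterator" in the proposal.
final class TermsIteratorSketch {
    private final EnumSet<PostingsFeature> indexed;

    TermsIteratorSketch(EnumSet<PostingsFeature> indexed) {
        this.indexed = EnumSet.copyOf(indexed);
    }

    // Returns postings with exactly the requested features, or throws if
    // the field was not indexed with everything the caller asked for.
    FlaggedPostings postings(EnumSet<PostingsFeature> requested) {
        if (!indexed.containsAll(requested)) {
            EnumSet<PostingsFeature> missing = EnumSet.copyOf(requested);
            missing.removeAll(indexed);
            throw new IllegalStateException("field was not indexed with " + missing);
        }
        return new FlaggedPostings(EnumSet.copyOf(requested));
    }
}
```

 A caller that only needs frequencies passes EnumSet.of(PostingsFeature.FREQS); 
 a caller asking for OFFSETS on a field indexed without them gets the exception 
 instead of a differently typed enum.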



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2015-01-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-4524:
--
Attachment: LUCENE-4524.patch

Patch adding a basic re-use test to BasePostingsFormatTestCase.  The verifyEnum 
method already does a lot of randomized testing of reuse, so the new test just 
asserts that a TermsEnum is reused or not reused in a couple of cases.

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.9, Trunk

 Attachments: LUCENE-4524.patch, LUCENE-4524.patch, LUCENE-4524.patch, 
 LUCENE-4524.patch, LUCENE-4524.patch





[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2015-01-28 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-4524:
--
Attachment: LUCENE-4524.patch

This patch merges the old DocsEnum and DocsAndPositionsEnum into a new 
PostingsEnum class (which is basically the old DocsAndPositionsEnum class), 
with DocsEnum extending it as a convenience class returning empty values for 
positions, offsets and payloads.

TermsEnum.docs() methods are renamed to TermsEnum.postings().

The old docs() and docsAndPositions() methods can be added back to keep 
backwards compatibility.
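
One way the rename plus backwards compatibility could look is sketched below. 
This is an illustration only: the class names, the flag constants and the 
parameterless signatures are hypothetical simplifications and do not reflect 
the actual TermsEnum method signatures in the patch.

```java
// Hypothetical handle standing in for the merged postings class.
final class PostingsHandle {
    final int flags;

    PostingsHandle(int flags) {
        this.flags = flags;
    }
}

// Stand-in for TermsEnum: postings() is the single real entry point and
// the old method names are kept as thin delegating wrappers.
abstract class TermsEnumSketch {
    static final int FLAG_FREQS = 1;      // hypothetical flag values
    static final int FLAG_POSITIONS = 2;

    public abstract PostingsHandle postings(int flags);

    @Deprecated
    public final PostingsHandle docs() {
        return postings(FLAG_FREQS);
    }

    @Deprecated
    public final PostingsHandle docsAndPositions() {
        return postings(FLAG_FREQS | FLAG_POSITIONS);
    }
}

// Minimal implementation that records which flags it was asked for.
final class RecordingTermsEnum extends TermsEnumSketch {
    int lastFlags = -1;

    @Override
    public PostingsHandle postings(int flags) {
        lastFlags = flags;
        return new PostingsHandle(flags);
    }
}
```

Existing callers of docs() and docsAndPositions() keep compiling, while 
implementations only have to provide postings().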

Next up: some basic re-use tests.  I think we should be able to assert that 
things *aren't* reused when we have different postings requested for all 
postings formats, and check specific cases for those formats where re-use is 
actually implemented.
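
The kind of re-use assertion described above might be sketched like this. The 
classes are a toy model, not BasePostingsFormatTestCase or any real postings 
format: ReusablePostings and ReuseAwareTermsEnum are hypothetical, and the rule 
"reuse only when the requested flags match" is an assumption made for the 
example.

```java
// Hypothetical reusable postings object; tracks how often it was reset.
final class ReusablePostings {
    final int flags;
    int resets;

    ReusablePostings(int flags) {
        this.flags = flags;
    }

    ReusablePostings reset() {
        resets++;
        return this;
    }
}

// Toy terms enum: hands back the caller's object only if it was created
// for the same set of requested features.
final class ReuseAwareTermsEnum {
    ReusablePostings postings(ReusablePostings reuse, int flags) {
        if (reuse != null && reuse.flags == flags) {
            return reuse.reset();
        }
        return new ReusablePostings(flags);
    }
}

// The two checks a basic re-use test could make.
final class ReuseChecks {
    static boolean reusedWithSameFlags() {
        ReuseAwareTermsEnum te = new ReuseAwareTermsEnum();
        ReusablePostings first = te.postings(null, 1);
        return te.postings(first, 1) == first;
    }

    static boolean reusedWithDifferentFlags() {
        ReuseAwareTermsEnum te = new ReuseAwareTermsEnum();
        ReusablePostings first = te.postings(null, 1);
        return te.postings(first, 3) == first;
    }
}
```

A test would then assert that reusedWithDifferentFlags() is false for every 
format, and that reusedWithSameFlags() is true only for formats that actually 
implement re-use.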

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.9, Trunk

 Attachments: LUCENE-4524.patch, LUCENE-4524.patch, LUCENE-4524.patch, 
 LUCENE-4524.patch





[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2015-01-27 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-4524:
--
Attachment: LUCENE-4524.patch

Here's what I've got so far.  Warning: tests fail, due to some things returning 
null when they're not expected to.

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.9, Trunk

 Attachments: LUCENE-4524.patch, LUCENE-4524.patch, LUCENE-4524.patch





[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2014-03-15 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-4524:
-

Fix Version/s: (was: 4.7)
   4.8

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.8

 Attachments: LUCENE-4524.patch, LUCENE-4524.patch





[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2013-05-09 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4524:
--

Fix Version/s: (was: 4.3)
   4.4

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.4

 Attachments: LUCENE-4524.patch, LUCENE-4524.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2013-01-21 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-4524:


Attachment: LUCENE-4524.patch

Here is an initial patch that moves this over. I really just did some initial 
porting, and this patch still has some problems.

I removed DocsAndPositionsEnum entirely and changed how the DocsEnum flags work 
such that we only have TermsEnum#docs and a simple sugar method for docsAndPos, 
which should go away IMO. We need to figure out what kind of behavior those 
flags should trigger, i.e. if we have no freqs we still return an enum, while 
with no positions we return null.

Anyway, most of the patch is renames etc. All tests pass; comments welcome.
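
The asymmetry in flag behavior mentioned above (missing freqs still yields an 
enum, missing positions yields null) can be illustrated with a small sketch. 
All names here are hypothetical and the flag values are assumptions; this only 
models the behavior under discussion, not the patch itself.

```java
// Hypothetical result object: says whether real freqs are available.
final class GrantedPostings {
    final boolean freqsAvailable;

    GrantedPostings(boolean freqsAvailable) {
        this.freqsAvailable = freqsAvailable;
    }
}

// Toy model of a field that may or may not index freqs and positions.
final class FlagBehaviourSketch {
    static final int FLAG_FREQS = 1;      // hypothetical flag values
    static final int FLAG_POSITIONS = 2;

    private final boolean hasFreqs;
    private final boolean hasPositions;

    FlagBehaviourSketch(boolean hasFreqs, boolean hasPositions) {
        this.hasFreqs = hasFreqs;
        this.hasPositions = hasPositions;
    }

    GrantedPostings docs(int flags) {
        // Positions requested but not indexed: no enum at all.
        if ((flags & FLAG_POSITIONS) != 0 && !hasPositions) {
            return null;
        }
        // Freqs requested but not indexed: still return an enum, just
        // without real frequency information.
        boolean wantFreqs = (flags & FLAG_FREQS) != 0;
        return new GrantedPostings(wantFreqs && hasFreqs);
    }
}
```

Whether these two cases should really behave differently is exactly the open 
question raised in the comment.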

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4524.patch





[jira] [Updated] (LUCENE-4524) Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum

2013-01-21 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-4524:


Attachment: LUCENE-4524.patch

New patch bringing back TermsEnum#docsAndPositions(...). This makes the entire 
thing way simpler, and I think this is how it should be. All tests pass and I 
think this is pretty close already.

 Merge DocsEnum and DocsAndPositionsEnum into PostingsEnum
 -

 Key: LUCENE-4524
 URL: https://issues.apache.org/jira/browse/LUCENE-4524
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4524.patch, LUCENE-4524.patch

