[jira] Updated: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1791: - Attachment: LUCENE-1791.patch Patch showing what i have in mind. Current patch causes 14 failures in Te

[jira] Created: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-06 Thread Hoss Man (JIRA)
Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher --- Key: LUCENE-1791 URL: https://issues.apache.org/jira/browse/LUCENE-1791

Hudson build is back to normal: Lucene-trunk #912

2009-08-06 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/912/changes

[jira] Commented: (LUCENE-1790) Boosting Max Term Query

2009-08-06 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740380#action_12740380 ] Grant Ingersoll commented on LUCENE-1790: - Was actually just thinking we could hav

[jira] Updated: (LUCENE-1790) Boosting Max Term Query

2009-08-06 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1790: Attachment: LUCENE-1790.patch Will commit tomorrow or Saturday, as it is a pretty minor va

[jira] Commented: (LUCENE-1790) Boosting Max Term Query

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740378#action_12740378 ] Mark Miller commented on LUCENE-1790: - What about a common class with chooseable aggre

[jira] Created: (LUCENE-1790) Boosting Max Term Query

2009-08-06 Thread Grant Ingersoll (JIRA)
Boosting Max Term Query --- Key: LUCENE-1790 URL: https://issues.apache.org/jira/browse/LUCENE-1790 Project: Lucene - Java Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Grant Ingersol

[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740362#action_12740362 ] Mark Miller commented on LUCENE-1789: - Its basically what I did as a first attempt at

[jira] Resolved: (LUCENE-1788) Cleanup highlighter test class

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1788. - Resolution: Fixed Lucene Fields: [New, Patch Available] (was: [New]) > Cleanup highlight

[jira] Updated: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a caching Filter.

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1771: Attachment: LUCENE-1771.bc-tests.patch workarounds for the back compat test branch > Using explai

[jira] Commented: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a caching Filter.

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740343#action_12740343 ] Mark Miller commented on LUCENE-1771: - Thanks - BoostingNearQuery was just added, so i

[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-06 Thread Luis Alves (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740340#action_12740340 ] Luis Alves commented on LUCENE-1768: You could still do something similar by simply ov

[jira] Updated: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a caching Filter.

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1771: - Attachment: LUCENE-1771.patch FWIW: the last patch was giving me compile errors because BoostingNearQuer

[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740328#action_12740328 ] Hoss Man commented on LUCENE-1789: -- This idea originated in LUCENE-1749, see these comme

[jira] Created: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-06 Thread Hoss Man (JIRA)
getDocValues should provide a MultiReader DocValues abstraction --- Key: LUCENE-1789 URL: https://issues.apache.org/jira/browse/LUCENE-1789 Project: Lucene - Java Issue Type: Improv

Sorting cleanup and FieldCacheImpl.Entry confusion

2009-08-06 Thread Chris Hostetter
Hey everybody, over in LUCENE-1749 i'm trying to make sanity checking of the FieldCache possible, and i'm banging my head into a few walls, and hoping people can help me fill in the gaps about how sorting w/FieldCache is *supposed* to work. For starters: i was getting confused why some debugg

[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-06 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated LUCENE-1781: -- Attachment: LLRect.java Large distance fixer > Large distances in Spatial go beyond Prime MEridian >

[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-06 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740320#action_12740320 ] Bill Bell commented on LUCENE-1781: --- I did some additional testing, and here is the new

[jira] Commented: (LUCENE-1782) Rename OriginalQueryParserHelper

2009-08-06 Thread Luis Alves (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740319#action_12740319 ] Luis Alves commented on LUCENE-1782: I finally was able to apply the patch in eclipse.

[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1749: - Attachment: LUCENE-1749.patch bq. the interesting thing is that the CacheEntry.toString() doesn't show t

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740311#action_12740311 ] Hoss Man commented on LUCENE-1749: -- bq. I think that TestCustomScoreQuery, TestFieldScore

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740308#action_12740308 ] Mark Miller commented on LUCENE-1749: - Okay, sorry - I messed up when merging with tru

[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1749: - Attachment: LUCENE-1749.patch checkpoint: no functional change from mark's previous patch, just improved

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740278#action_12740278 ] Mark Miller commented on LUCENE-1749: - {quote}(Actually: that seems like a wroth while

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740275#action_12740275 ] Mark Miller commented on LUCENE-1749: - Here is the output - it appears to think String

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740272#action_12740272 ] Mark Miller commented on LUCENE-1749: - I think that TestCustomScoreQuery, TestFieldSco

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740265#action_12740265 ] Hoss Man commented on LUCENE-1749: -- H... actually mark, testing out your latest patc

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740256#action_12740256 ] Hoss Man commented on LUCENE-1749: -- Mark: I'll start working on improving the docs (and o

[jira] Updated: (LUCENE-1788) Cleanup highlighter test class

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1788: Attachment: LUCENE-1788.patch > Cleanup highlighter test class > -- >

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Grant Ingersoll
On Aug 6, 2009, at 5:06 PM, Shai Erera wrote: Only w/ ScoreDocs we reuse the same instance. So I guess we'd like to do the same here. Seems like providing a TopSpansCollector is what you want, only unlike TopFieldCollector which populates the fields post search, you'd like to do it durin

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Shai Erera
Only w/ ScoreDocs we reuse the same instance. So I guess we'd like to do the same here. Seems like providing a TopSpansCollector is what you want, only unlike TopFieldCollector which populates the fields post search, you'd like to do it during search. I've been typing and deleting suggestions for

[jira] Created: (LUCENE-1788) Cleanup highlighter test class

2009-08-06 Thread Mark Miller (JIRA)
Cleanup highlighter test class -- Key: LUCENE-1788 URL: https://issues.apache.org/jira/browse/LUCENE-1788 Project: Lucene - Java Issue Type: Task Components: contrib/highlighter Reporter: Mar

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Grant Ingersoll
On Aug 6, 2009, at 4:25 PM, Shai Erera wrote: But still you might collect spans for docs unnecessarily during processing. If a doc is added to the PQ and later removed, then the spans collection was just a waste of time (unless the collection comes in free during query processing). sure,

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Shai Erera
But still you might collect spans for docs unnecessarily during processing. If a doc is added to the PQ and later removed, then the spans collection was just a waste of time (unless the collection comes in free during query processing). Also, if you build a paging search UI, then as soon as the us

[jira] Commented: (LUCENE-1787) Standard Tokenizer doesn't recognise I.B.M as Acronym, it requires it ends with a dot i.e I.B.M.

2009-08-06 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740233#action_12740233 ] Shai Erera commented on LUCENE-1787: We should fix ACRONYM, not ACRONYM_DEP right? ACR

[jira] Commented: (LUCENE-1787) Standard Tokenizer doesn't recognise I.B.M as Acronym, it requires it ends with a dot i.e I.B.M.

2009-08-06 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740232#action_12740232 ] Yonik Seeley commented on LUCENE-1787: -- You would want it to be greedy such that it w

[jira] Created: (LUCENE-1787) Standard Tokenizer doesn't recognise I.B.M as Acronym, it requires it ends with a dot i.e I.B.M.

2009-08-06 Thread Paul taylor (JIRA)
Standard Tokenizer doesn't recognise I.B.M as Acronym, it requires it ends with a dot i.e I.B.M. Key: LUCENE-1787 URL: https://issues.apache.org/jira/browse/LUCENE-17

[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-06 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740222#action_12740222 ] Yonik Seeley commented on LUCENE-1768: -- bq. I think, this should be in 2.9. The stan

[jira] Assigned: (LUCENE-1768) NumericRange support for new query parser

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1768: -- Assignee: Uwe Schindler > NumericRange support for new query parser >

[jira] Commented: (LUCENE-1782) Rename OriginalQueryParserHelper

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740219#action_12740219 ] Michael McCandless commented on LUCENE-1782: I think if you run these commands

[jira] Commented: (LUCENE-1782) Rename OriginalQueryParserHelper

2009-08-06 Thread Luis Alves (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740213#action_12740213 ] Luis Alves commented on LUCENE-1782: I'm not able to apply your latest patch, all fil

[jira] Reopened: (LUCENE-1760) TokenStream API javadoc improvements

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-1760: Reopening so we don't forget Mark's last comment... > TokenStream API javadoc improve

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Grant Ingersoll
On Aug 6, 2009, at 2:31 PM, Paul Elschot wrote: With a single search one might end up collecting lots of span info that will be thrown away because the document score is too low. Presumably, you would only collect it if the result was actually put onto the PriorityQueue, in other words, aft

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Paul Elschot
With a single search one might end up collecting lots of span info that will be thrown away because the document score is too low. So I think the best way is to first collect the best hits in the usual way, and then get the spans of the query (effectively once more, but now without SpanScorer in b
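The two-pass idea in this message (collect the best hits first, then fetch spans only for those documents) can be sketched in plain Java. This is only an illustration under stated assumptions: the class TwoPassSpansSketch, its methods, and the map-based stand-ins for scores and spans are hypothetical and do not use the Lucene API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: scores and spans are plain maps keyed by doc id,
// standing in for what a real search and Spans enumeration would provide.
public class TwoPassSpansSketch {

    // Pass 1: keep only the n best-scoring doc ids using a bounded min-heap.
    static List<Integer> topDocs(Map<Integer, Float> scores, int n) {
        Comparator<Map.Entry<Integer, Float>> byScore =
            (a, b) -> Float.compare(a.getValue(), b.getValue());
        PriorityQueue<Map.Entry<Integer, Float>> pq = new PriorityQueue<>(byScore);
        for (Map.Entry<Integer, Float> e : scores.entrySet()) {
            pq.add(e);
            if (pq.size() > n) {
                pq.poll(); // evict the current lowest score
            }
        }
        List<Integer> docs = new ArrayList<>();
        while (!pq.isEmpty()) {
            docs.add(pq.poll().getKey()); // ascending by score
        }
        Collections.reverse(docs); // best hit first
        return docs;
    }

    // Pass 2: look up spans ({start, end} pairs) only for the surviving hits,
    // so no span data is gathered for documents that were evicted in pass 1.
    static Map<Integer, List<int[]>> spansFor(List<Integer> docs,
                                              Map<Integer, List<int[]>> allSpans) {
        Map<Integer, List<int[]>> out = new HashMap<>();
        for (Integer doc : docs) {
            List<int[]> spans = allSpans.get(doc);
            out.put(doc, spans != null ? spans : new ArrayList<int[]>());
        }
        return out;
    }
}
```

The trade-off discussed in the thread still applies: the second pass repeats some work for the top hits, in exchange for never collecting spans for low-scoring documents.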

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Mark Miller
>> besides the fact that Spans is an interface and it would break back compat, ugh! back compat is almost out the window for Spans and 2.9 - we already broke it with the payloads, so PayloadSpans had been merged to Spans. I don't know that we have time to squeeze anything in (2.9 is so close !

Re: SpanQuery and Spans optimizations

2009-08-06 Thread Grant Ingersoll
seek() seems somewhat doable, although inefficient because the underlying TermPositions supports seek, but that really would only allow us to go back to the beginning, I think (besides the fact that Spans is an interface and it would break back compat, ugh!). Collector route seems more pro

SpanQuery and Spans optimizations

2009-08-06 Thread Grant Ingersoll
I think it is fairly common use case (relative to the rather uncommon use case of using SpanQuery that is) to want to do something like: ... SpanQuery sq = ... topDocs = searcher.search(tq, 10); Spans spans = sq.getSpans(searcher.getIndexReader()); for (int i = 0; i < topDocs.scoreDocs.length;

[jira] Commented: (LUCENE-1760) TokenStream API javadoc improvements

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740165#action_12740165 ] Mark Miller commented on LUCENE-1760: - tokenstream still says token is deprecated > T

[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740155#action_12740155 ] Mark Miller commented on LUCENE-1749: - P.S. I'm not sure we want to go with the way I

RE: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Uwe Schindler
Thanks, we are always here to help :-) > Test passes with this patch - thanks a lot Robert ! I was going to ask > you to create a solr issue, but I see you already have, thanks! > > No need to create a test I think - put in the new Lucene jars and it > fails, so likely thats good enough. Though

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Robert Muir
Mark, I agree it could use some more tests in the future, like many things :) On Thu, Aug 6, 2009 at 11:52 AM, Mark Miller wrote: > Test passes with this patch - thanks a lot Robert ! I was going to ask you > to create a solr issue, but I see you already have, thanks! > > No need to create a test

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Mark Miller
Test passes with this patch - thanks a lot Robert ! I was going to ask you to create a solr issue, but I see you already have, thanks! No need to create a test I think - put in the new Lucene jars and it fails, so likely thats good enough. Though it is spooky that the test passed without the n

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Mark Miller
Thanks a lot guys. Uwe: thats why I was asking ;) I had no proof it was the TokenStream API, that just seemed a likely candidate - I'm not familiar with that filter, but it worked with a version of Lucene right before the TokenStream improvements patch, and then started failing after. When I

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Robert Muir
Mark, I looked at this and think it might be unrelated to tokenstreams. I think the length argument being provided to processWord(char[] buffer, int offset, int length, int wordCount) in that filter might be incorrectly calculated. This is the method that checks the keep list. (There is trailing

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Robert Muir
that makes perfect sense On Thu, Aug 6, 2009 at 11:31 AM, Uwe Schindler wrote: >> I have seen ur mail, but this bug should not be related to the new Token >> API, it should occur with old API, too. > > Maybe the problem is an unrelated change: > https://issues.apache.org/jira/browse/LUCENE-1762 >

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Robert Muir
the bug does occur with the old api (some of the evaluations have incorrect length, but they are not keep words). its just doesnt happen to make any tests fail (i guess termBufferLength() happens to == termBuffer.length() for all the tested keep words) with the old jar file... On Thu, Aug 6, 2009

RE: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Uwe Schindler
> I have seen ur mail, but this bug should not be related to the new Token > API, it should occur with old API, too. Maybe the problem is an unrelated change: https://issues.apache.org/jira/browse/LUCENE-1762 This issue changed the default length of the termBuffer in Token/TermAttributeImpl. Beca

RE: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Uwe Schindler
I have seen ur mail, but this bug should not be related to the new Token API, it should occur with old API, too. I did not look very close into the implementations, I only checked who changes what in which way. And I see that there is only one Token instance with a termBuffer that is changed. No p

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Robert Muir
uwe look at the patch i pasted in haste (i have a delivery guy here, sorry). the filter had a bug all along (it was using termBuffer.length for some length calculations). On Thu, Aug 6, 2009 at 11:17 AM, Uwe Schindler wrote: > I looked into the code of this Filter. It is very simple and should wo
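The kind of bug described here (using the backing array's length where the valid term length is meant) can be illustrated with a small sketch. It assumes nothing about the actual filter code: the class TermBufferLengthSketch, the KEEP set, and both methods are hypothetical, and a plain HashSet of strings stands in for the filter's keep-word set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of confusing a buffer's capacity with the
// number of valid characters stored in it.
public class TermBufferLengthSketch {

    // Stand-in for the keep-word list.
    static final Set<String> KEEP = new HashSet<>(Arrays.asList("and", "the"));

    // Buggy variant: treats the whole backing array as the term, so any
    // unused trailing slots become part of the looked-up word.
    static boolean isKeepWordBuggy(char[] buffer) {
        return KEEP.contains(new String(buffer, 0, buffer.length));
    }

    // Fixed variant: only the first termLength characters are valid.
    static boolean isKeepWord(char[] buffer, int termLength) {
        return KEEP.contains(new String(buffer, 0, termLength));
    }

    public static void main(String[] args) {
        char[] exact = {'t', 'h', 'e'};   // capacity equals term length
        char[] padded = new char[16];     // larger buffer, 3 valid chars
        "the".getChars(0, 3, padded, 0);

        System.out.println(isKeepWordBuggy(exact));   // true: bug is hidden
        System.out.println(isKeepWordBuggy(padded));  // false: bug shows up
        System.out.println(isKeepWord(padded, 3));    // true
    }
}
```

When the buffer happens to be exactly as long as the term, both variants agree, which would be consistent with the observation elsewhere in this thread that the tests only started failing after the default buffer size changed.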

RE: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Uwe Schindler
I looked into the code of this Filter. It is very simple and should work out of the box. There is no cloning done. When the indexer calls incrementToken, the delegation to next(Token) does not clone at all. It just uses the encapsulated Token instance (inside the AttributeImpl TokenWrapper) as reus

[jira] Resolved: (LUCENE-1341) BoostingNearQuery class (prototype)

2009-08-06 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-1341. - Resolution: Fixed Lucene Fields: (was: [Patch Available]) Committed revision 80

Re: Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Robert Muir
Index: src/java/org/apache/solr/analysis/CapitalizationFilterFactory.java === --- src/java/org/apache/solr/analysis/CapitalizationFilterFactory.java (revision 778975) +++ src/java/org/apache/solr/analysis/CapitalizationFilterFactory.

Issue with Solr TokenFilter and the new TokenStream API

2009-08-06 Thread Mark Miller
I think there is an issue here, but I didn't follow the TokenStream improvements very closely. In Solr, CapitalizationFilterFactory has a CharArray set that it loads up with keep words - it then checks (with the old TokenStream API) each token (char array) to see if it should keep it. I think

[jira] Updated: (LUCENE-1768) NumericRange support for new query parser

2009-08-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1768: -- Fix Version/s: 2.9 I think, this should be in 2.9. Any Chance to do this. In my Opinion, it sh

[jira] Updated: (LUCENE-1784) Make BooleanWeight and DisjunctionMaxWeight protected

2009-08-06 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Smith updated LUCENE-1784: -- Attachment: LUCENE-1784.patch Patch that makes BooleanWeight and DisjunctionMaxWeight protected also

[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1749: Attachment: LUCENE-1749.patch I still haven't looked at this in the detail that I want to, but time

[jira] Commented: (LUCENE-1767) Add sizeof to OpenBitSet

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740034#action_12740034 ] Mark Miller commented on LUCENE-1767: - I'm about to push this to 3.1 unless someone sp

[jira] Updated: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1486: Fix Version/s: (was: 2.9) 3.1 3.0 > Wildcards, ORs etc i

[jira] Commented: (LUCENE-1785) Simple FieldCache merging

2009-08-06 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740033#action_12740033 ] Mark Miller commented on LUCENE-1785: - I think this might have to be 3.1 ... > Simple

Re: Attributes, DocConsumer, Flexible Indexing, etc.

2009-08-06 Thread Grant Ingersoll
On Aug 6, 2009, at 5:48 AM, Michael McCandless wrote: Agreed. Yes, the ability to do things like implement Okapi, Language Modeling or very sparse indexes (although we kind of have that already) would not fit in with this stuff. Of course, those couldn't be solved through the Attribute

[jira] Commented: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a caching Filter.

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739978#action_12739978 ] Michael McCandless commented on LUCENE-1771: Patch looks good Mark! This is a

[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1781: --- Fix Version/s: (was: 2.9) 3.1 > Large distances in Spatial go

[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739974#action_12739974 ] Michael McCandless commented on LUCENE-1781: So, here's one thing that worries

[jira] Created: (LUCENE-1786) improve performance of contrib/TestCompoundWordTokenFilter

2009-08-06 Thread Robert Muir (JIRA)
improve performance of contrib/TestCompoundWordTokenFilter -- Key: LUCENE-1786 URL: https://issues.apache.org/jira/browse/LUCENE-1786 Project: Lucene - Java Issue Type: Test C

Re: Attributes, DocConsumer, Flexible Indexing, etc.

2009-08-06 Thread Michael McCandless
Agreed. Grant's idea is something new and I think useful, ie offering some sort of pluggability of what's stored in payloads, sitting entirely outside (above) Lucene's core. Maybe we should call it 'Flexible Payloads', or something, to differentiate the two. Mike On Thu, Aug 6, 2009 at 5:10 AM,

[jira] Updated: (LUCENE-1782) Rename OriginalQueryParserHelper

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1782: --- Attachment: LUCENE-1782.patch OK new patch attached w/ the above renaming. I added

[jira] Commented: (LUCENE-1782) Rename OriginalQueryParserHelper

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739949#action_12739949 ] Michael McCandless commented on LUCENE-1782: {quote} My reason for this, is th

Re: Attributes, DocConsumer, Flexible Indexing, etc.

2009-08-06 Thread Earwin Burrfoot
I always thought flexible indexing is not only for storing your app-specific data next to terms/docs. Something more along the lines of efficient geo search, or ability to try out various index encoding schemes without patching lucene. In other words, this is something that can be a basis for easy

[jira] Commented: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a caching Filter.

2009-08-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739944#action_12739944 ] Michael McCandless commented on LUCENE-1771: bq. however that project has apac