[jira] Commented: (LUCENE-1488) multilingual analyzer based on icu

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851714#action_12851714 ] Robert Muir commented on LUCENE-1488: - bq. I have a possibly naive question on the big

[jira] Commented: (LUCENE-1488) multilingual analyzer based on icu

2010-03-30 Thread David Bowen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851713#action_12851713 ] David Bowen commented on LUCENE-1488: - I have a possibly naive question on the bigram

[jira] Commented: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851659#action_12851659 ] Steven Rowe commented on LUCENE-2358: - Sorry for cluttering this issue... {quote} I'm

[jira] Resolved: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2010-03-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch resolved LUCENE-2126. --- Resolution: Fixed Committed revision 929340. > Split up IndexInput and IndexOutput into Dat

[jira] Commented: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851656#action_12851656 ] Robert Muir commented on LUCENE-2358: - {quote} I needed to be able to mark cached toke

[jira] Commented: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851652#action_12851652 ] Steven Rowe commented on LUCENE-2358: - Hi Robert, I'm working on a change to ShingleF

[jira] Updated: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2358: Attachment: LUCENE-2358.patch attached is a patch (really svn move of KeywordMarkerTokenFilter and

[jira] Created: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Robert Muir (JIRA)
rename KeywordMarkerTokenFilter --- Key: LUCENE-2358 URL: https://issues.apache.org/jira/browse/LUCENE-2358 Project: Lucene - Java Issue Type: Task Components: Analysis Reporter: Robert Muir

[jira] Commented: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-30 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851596#action_12851596 ] Uwe Schindler commented on LUCENE-2302: --- Will add the javadocs and think about the C

[jira] Commented: (LUCENE-2354) Convert NumericUtils and NumericTokenStream to use BytesRef instead of Strings/char[]

2010-03-30 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851598#action_12851598 ] Uwe Schindler commented on LUCENE-2354: --- Will work here the next days and rewrite th

[jira] Commented: (LUCENE-2071) Allow updating of IndexWriter SegmentReaders

2010-03-30 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851528#action_12851528 ] Tim Smith commented on LUCENE-2071: --- found a couple of small issues with the patch attac

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851511#action_12851511 ] Michael McCandless commented on LUCENE-2111: {quote} The term dictionary shoul

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851508#action_12851508 ] Michael McCandless commented on LUCENE-2111: bq. Awesome work! What changes ma

Lucene Spatial

2010-03-30 Thread Guillermo Payet
Hello, I sent this query to the user list, with no luck. I'm hoping someone on dev might be able to help us. We've been using locallucene for years and years in our search engine of family farms: http://www.localharvest.org/ We'd like to upgrade Lucene to 3.0.1, which also means migrating from

Re: Query modifier

2010-03-30 Thread Jason Rutherglen
David, I totally agree with this idea. On Tue, Mar 30, 2010 at 9:58 AM, David Smiley (@MITRE.org) wrote: > > I observed this problem when I started using Lucene (ages ago) and it's a > shame this situation persists.  In summary, it would be tremendously useful > if Query objects were fully mutabl

Re: Query modifier

2010-03-30 Thread David Smiley (@MITRE.org)
I observed this problem when I started using Lucene (ages ago) and it's a shame this situation persists. In summary, it would be tremendously useful if Query objects were fully mutable and offered a visitor pattern to allow walking the query tree to facilitate rewriting. It would also be nice if

Re: Query modifier

2010-03-30 Thread David Smiley (@MITRE.org)
I observed this problem when I started using Lucene (ages ago) and it's a shame this situation persists. In summary, it would be tremendously useful if Query objects were fully mutable and offered a visitor pattern to allow walking the query tree to facilitate rewriting. I could open a JIRA issu

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851456#action_12851456 ] Robert Muir commented on LUCENE-2111: - {quote} There are certain specific wildcard cor

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851452#action_12851452 ] Michael Busch commented on LUCENE-2111: --- bq. Flex is generally faster. Awesome work

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2010-03-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851451#action_12851451 ] Michael Busch commented on LUCENE-2126: --- I'll try to commit tonight to flex, but it'

Landing the flex branch

2010-03-30 Thread Michael McCandless
I think the time has finally come! Pending one issue (LUCENE-2354 -- Uwe), I think flex is ready to land I think the other issues with Fix Version = Flex Branch can be moved to 3.1 after we land. We still use the pre-flex APIs in a number of places... I think this is actually good (so we cont

[jira] Updated: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2111: --- Attachment: LUCENE-2111.patch Small fixes for flex -- fixes SpanTermQuery to throw

Re: new facet parameter: facet.exists=true

2010-03-30 Thread Erik Hatcher
Faceting on a "facet_fields" field will most likely involve only a handful of values or fewer, so you'd be able to keep that particular faceting cached for quick use. I'm not sure how much memory it'd take up, but certainly not as much as actually faceting on the fields themselves. However, a

Re: new facet parameter: facet.exists=true

2010-03-30 Thread Gregor Kaczor
I am not sure if I got your approach right. If I did not, please explain where the advantages are in time and memory footprint. In my opinion, faceting on facet field names does not avoid counting facets. If my result set is huge, so will be the facet numbers on the field of facet names. It do

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851400#action_12851400 ] Robert Muir commented on LUCENE-2111: - bq. I think net/net we are good to land flex!

[jira] Commented: (LUCENE-2071) Allow updating of IndexWriter SegmentReaders

2010-03-30 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851388#action_12851388 ] Tim Smith commented on LUCENE-2071: --- +1 I have a special subclassed IndexSearcher that

Re: new facet parameter: facet.exists=true

2010-03-30 Thread Erik Hatcher
One trick to doing this is to index a field that lists the facet field names that each document possesses. Then you can facet on the field of field names (sounds confusing, sorry) and you'll know if there are any documents in a result set that have values in, say, a "category" field. The
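The trick described above can be sketched in a few lines. This is a minimal, library-free illustration, not Solr or Lucene code: the documents, the `FACETABLE` tuple, and the helper names `add_facet_fields` and `facet_on_field_names` are all hypothetical. The only name taken from the thread is the "facet_fields" field. In Solr itself the counting step would presumably be an ordinary facet.field request on that field.

```python
from collections import Counter

# Hypothetical documents; field names and values are made up for illustration.
docs = [
    {"id": 1, "category": "fruit", "color": "red"},
    {"id": 2, "category": "dairy"},
    {"id": 3, "color": "green"},
]

# Assumed list of fields we might want to facet on.
FACETABLE = ("category", "color")


def add_facet_fields(doc):
    """Return a copy of doc with a 'facet_fields' entry naming the facetable fields it has."""
    enriched = dict(doc)
    enriched["facet_fields"] = [name for name in FACETABLE if name in doc]
    return enriched


def facet_on_field_names(result_set):
    """Count, per facetable field name, how many documents in the result set carry it."""
    counts = Counter()
    for doc in result_set:
        counts.update(doc["facet_fields"])
    return counts


# "Index time": enrich every document once.
index = [add_facet_fields(d) for d in docs]

# "Query time": one small facet tells us which fields are worth faceting on.
counts = facet_on_field_names(index)
```

A field name with a nonzero count has at least one document in the result set, so only those fields need a full facet pass. Note that this sketch still counts every matching document, which is the concern raised in the reply; the saving is that the field of field names has very few distinct values.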

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851372#action_12851372 ] Michael McCandless commented on LUCENE-2111: Towards wrapping up flex, I ran a

new facet parameter: facet.exists=true

2010-03-30 Thread Gregor Kaczor
Faceting in indexes with document volumes exceeding twenty million documents is a time- and particularly memory-consuming search. In such huge indexes I am not interested in whether there are 4 or 5 million documents of a special type; I just want to know that there are some, and if I choose that facet will I

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851347#action_12851347 ] Michael McCandless commented on LUCENE-2126: Michael will this land on flex or

[jira] Commented: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851346#action_12851346 ] Michael McCandless commented on LUCENE-2302: Uwe is this issue done? > Replac

Re: Incremental Field Updates

2010-03-30 Thread Grant Ingersoll
On Mar 29, 2010, at 10:11 AM, mark harwood wrote: > >Of course, but what about the Lucene doc id doesn't provide that? > > The question being how you determine the correct doc id to use in the first > place (especially when they are known to be volatile) - the current answer is > to use a stabl

[jira] Resolved: (LUCENE-2351) optimize automatonquery

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2351. - Resolution: Fixed Committed revision 929065. > optimize automatonquery > --

[jira] Updated: (LUCENE-2351) optimize automatonquery

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2351: Attachment: LUCENE-2351.patch attached is the same patch as before, except it includes a random te