Re: WriteLineDocTask does not release resources

2009-04-10 Thread Shai Erera
In fact, the more I think of it, I think it can be generalized even further. Create a TestResource abstract class with an abstract method release(). Then allow to aggregate TestResources in PerfRunData, either in a map (Task, TestResource) or just as a list. We can then either: (1) create a Release
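
The proposal above could be sketched roughly as follows. Only the TestResource name and its abstract release() method come from the message; ResourceList is a hypothetical stand-in for whatever PerfRunData would hold, and the failure-collecting behaviour is an assumption rather than anything the thread settles on.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TestResourceSketch {

    /** Base class from the proposal: a single abstract release(). */
    public abstract static class TestResource {
        public abstract void release() throws IOException;
    }

    /** Hypothetical stand-in for the resource list that PerfRunData would hold. */
    public static class ResourceList {
        private final List<TestResource> resources = new ArrayList<TestResource>();

        public void add(TestResource resource) {
            resources.add(resource);
        }

        /**
         * Releases every registered resource, even if an earlier one fails,
         * and returns the failures so the caller can report them.
         */
        public List<IOException> releaseAll() {
            List<IOException> failures = new ArrayList<IOException>();
            for (TestResource resource : resources) {
                try {
                    resource.release();
                } catch (IOException e) {
                    failures.add(e);
                }
            }
            resources.clear(); // a second call releases nothing
            return failures;
        }
    }
}
```

A task that opens a writer would register a TestResource whose release() closes it; whoever ends the run would then call releaseAll() once.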

Re: Grouping Lucene search results and calculating frequency by category

2009-04-10 Thread J. Delgado
Have you looked at SOLR? http://lucene.apache.org/solr/ It pretty much has what you are looking for. -- Joaquin On Fri, Apr 10, 2009 at 9:39 PM, mitu2009 wrote: > > Am working on a store search API using Lucene. > > I need to show store search results for each City,State combination with > its

WriteLineDocTask does not release resources

2009-04-10 Thread Shai Erera
WriteLineDocTask instantiates a BufferedWriter, but never closes it. This causes some problems in LUCENE-1591 since I want to wrap CBZip2OutputStream, and the stream has to be closed in order for the archive to be valid (flush() is not enough). Unlike DocMaker, which has a resetInputs method, task
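
To illustrate why flush() alone does not yield a valid archive, here is a minimal sketch. It uses the JDK's GZIPOutputStream as a stand-in for CBZip2OutputStream (an assumption made so the example runs without extra libraries), and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CloseWriterSketch {

    /** Writes one line through a compressing stream; optionally skips close(). */
    public static byte[] writeLine(String line, boolean close) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            // GZIPOutputStream stands in for CBZip2OutputStream (assumption).
            BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                    new GZIPOutputStream(sink), StandardCharsets.UTF_8));
            writer.write(line);
            writer.newLine();
            writer.flush(); // pushes buffered chars down, but does not finish the archive
            if (close) {
                writer.close(); // closes the whole chain and writes the trailer
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Reads the first line back from a complete compressed archive. */
    public static String readFirstLine(byte[] compressed) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the closed variant round-trips through readFirstLine; the flushed-but-unclosed output is shorter because the compressed data and trailer were never written. Real task code would close in a finally block or with try-with-resources.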

Grouping Lucene search results and calculating frequency by category

2009-04-10 Thread mitu2009
I am working on a store search API using Lucene. I need to show store search results for each City, State combination with its frequency in brackets, for example: Los Angeles, CA (450) Atlanta, GA (212) Boston, MA (78) ... As of now, my search results return around 7000 Lucene documents on an aver
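
The grouping asked for here can be sketched in plain Java, assuming the city and state of each hit have already been read from stored fields into a {city, state} pair. This is not a Lucene or Solr API; the class and method names are hypothetical, and the ordering (most frequent first, ties alphabetical) is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CityStateCounts {

    /** Counts hits per "City, ST" key and formats them, most frequent first. */
    public static List<String> summarize(List<String[]> hits) {
        final Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String[] hit : hits) {
            String key = hit[0] + ", " + hit[1]; // hit = {city, state}
            Integer old = counts.get(key);
            counts.put(key, old == null ? 1 : old + 1);
        }
        List<String> keys = new ArrayList<String>(counts.keySet());
        Collections.sort(keys, new Comparator<String>() {
            public int compare(String a, String b) {
                int byCount = counts.get(b).compareTo(counts.get(a)); // descending
                return byCount != 0 ? byCount : a.compareTo(b);       // ties: alphabetical
            }
        });
        List<String> lines = new ArrayList<String>();
        for (String key : keys) {
            lines.add(key + " (" + counts.get(key) + ")");
        }
        return lines;
    }
}
```

Counting in application code means loading the fields of every hit on every query, which is the cost that Solr's faceting, suggested in the reply, is designed to avoid.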

Re: Problem using Lucene RangeQuery

2009-04-10 Thread mitu2009
Thanks for your message... yes, I was able to get this working! Danil ŢORIN wrote: > > Lucene stores and searches STRINGS > so range [0..2] may return 0, 1, 101, ..., 109, 11, 110, ..., 119, 12, ..., 2 > prefix and normalize your number, like: 001, 002, ..., 011, 012, ..., 113, etc, > if you'll have bigger number
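
The padding advice quoted above can be sketched as follows. The helper names are hypothetical, and inRange only mimics string-ordered range matching with String.compareTo; it does not call Lucene.

```java
import java.util.Locale;

public class PaddedNumbers {

    /** Left-pads a non-negative number with zeros to a fixed width. */
    public static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different scheme");
        }
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    /** True if term falls in [lower, upper] under plain string ordering. */
    public static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }
}
```

Unpadded, inRange("101", "0", "2") is true, which is the false hit described in the quote. With pad(101, 3) tested against pad(0, 3) and pad(2, 3) it is false, as intended. The width has to cover the largest value that will ever be indexed, and the same padding must be applied at index time and at query time.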

Re: Benchmark: EnwikiDocMaker does not use fileIn (BufferedReader)

2009-04-10 Thread Shai Erera
Thanks Uwe. Then I think we should at least wrap the IS with a Buffered IS in EnwikiDocMaker (that's what I wanted to achieve in the first place, reusing LDM's BufferedReader)? On Fri, Apr 10, 2009 at 10:22 AM, Uwe Schindler wrote: > Hi Shai, > > > > with XML parsers you should generally avoid

[jira] Updated: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-831: --- Attachment: LUCENE-831.patch Here is a fairly decent base to start from. Still needs a lot, but a surp

[jira] Commented: (LUCENE-1570) QueryParser.setAllowLeadingWildcard could provide finer granularity

2009-04-10 Thread Jonathan Watt (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698037#action_12698037 ] Jonathan Watt commented on LUCENE-1570: --- Okay. Thanks for your work on Deki guys. Lo

[jira] Commented: (LUCENE-1284) Set of Java classes that allow the Lucene search engine to use morphological information developed for the Apertium open-source machine translation platform (http://www

2009-04-10 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697952#action_12697952 ] Otis Gospodnetic commented on LUCENE-1284: -- Hi Felipe, OK, I looked at this some

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697942#action_12697942 ] Michael McCandless commented on LUCENE-1575: Patch looks good, and all tests p

[jira] Resolved: (LUCENE-1570) QueryParser.setAllowLeadingWildcard could provide finer granularity

2009-04-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1570. - Resolution: Won't Fix Yonik's solution is the right call for this rather than any changes I think

[jira] Assigned: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned LUCENE-831: -- Assignee: Mark Miller > Complete overhaul of FieldCache API/Implementation > --

[jira] Resolved: (LUCENE-1304) Memory Leak when using Custom Sort (i.e., DistanceSortSource) of LocalLucene with Lucene

2009-04-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1304. - Resolution: Won't Fix LUCENE-1483 and a new FieldComparator I saw going in the other day should

[jira] Commented: (LUCENE-1567) New flexible query parser

2009-04-10 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697929#action_12697929 ] Michael Busch commented on LUCENE-1567: --- {quote} Now we need the Software Grant. {q

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697919#action_12697919 ] Michael McCandless commented on LUCENE-1575: bq. I added to following tests:

[jira] Updated: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-10 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1575: --- Attachment: LUCENE-1575.9.patch bq. Did you add a test case verifying maxScore is correct (so that t

[jira] Updated: (LUCENE-1594) Use source code specialization to maximize search performance

2009-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1594: --- Attachment: FastSearchTask.java Example of what the specialized code looks like. Th

[jira] Updated: (LUCENE-1594) Use source code specialization to maximize search performance

2009-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1594: --- Attachment: LUCENE-1594.patch Initial patch. > Use source code specialization to ma

[jira] Created: (LUCENE-1594) Use source code specialization to maximize search performance

2009-04-10 Thread Michael McCandless (JIRA)
Use source code specialization to maximize search performance - Key: LUCENE-1594 URL: https://issues.apache.org/jira/browse/LUCENE-1594 Project: Lucene - Java Issue Type: New Featur

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697832#action_12697832 ] Mark Miller commented on LUCENE-831: Yes, good point. Okay, I think I have a much clear

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697830#action_12697830 ] Michael McCandless commented on LUCENE-831: --- bq. A way to say, use this builder

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697822#action_12697822 ] Mark Miller commented on LUCENE-831: {quote} bq. We have always been able to customize

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697820#action_12697820 ] Michael McCandless commented on LUCENE-831: --- {quote} If we are going to allow ran

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697812#action_12697812 ] Mark Miller commented on LUCENE-831: Some random thoughts: If we are going to allow ra

[jira] Commented: (LUCENE-1567) New flexible query parser

2009-04-10 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697808#action_12697808 ] Grant Ingersoll commented on LUCENE-1567: - OK, I see the CLA is registered. Now w

[jira] Assigned: (LUCENE-1567) New flexible query parser

2009-04-10 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned LUCENE-1567: --- Assignee: Grant Ingersoll (was: Michael Busch) > New flexible query parser > --

[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-04-10 Thread Ali Oral (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697810#action_12697810 ] Ali Oral commented on LUCENE-1486: -- This issue is very interesting. I see that you use qu

Re: MoreLikeThisQuery term frequency caching

2009-04-10 Thread Grant Ingersoll
What was your approach to handling stale cache entries? Did you flush it when you opened a new reader? On Apr 7, 2009, at 2:28 AM, Richard Marr wrote: Hi all, I've been exploring MoreLikeThisQuery as part of a recent project and something that came out of that might be useful to others here

Re: Modularization

2009-04-10 Thread Grant Ingersoll
I'm really ambivalent about Maven. Having just converted Mahout to it, am using it for some other projects and used it quite a bit in the past, I am still on the fence (although I am mostly happy w/ it for Mahout). I keep being lured in by the promise of it (dep. management, convention ov

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697791#action_12697791 ] Michael McCandless commented on LUCENE-1575: bq. BTW, I wonder if we can repla

RE: Benchmark: EnwikiDocMaker does not use fileIn (BufferedReader)

2009-04-10 Thread Uwe Schindler
Hi Shai, with XML parsers you should generally avoid using Readers, unless you know exactly that the underlying XML encoding is really the one given to the Reader. Readers as parameters should only be used for sources that are invariant of the encoding (like Java Strings containing XML, and wit
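
The point about Readers can be illustrated with a small sketch using the JDK's built-in XML parser. It uses a DOM DocumentBuilder purely for brevity, which is an assumption and not a statement about what EnwikiDocMaker uses; the class name and sample document are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class XmlStreamSketch {

    /** A small document whose XML declaration names its real encoding. */
    public static byte[] sampleBytes() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><title>caf\u00e9</title>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Hands raw (buffered) bytes to the parser, which reads the declared encoding itself. */
    public static String parseFromStream(byte[] bytes) {
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(bytes));
        return rootText(new InputSource(in));
    }

    /** Decodes with a guessed charset first; the parser then trusts the Reader. */
    public static String parseFromUtf8Reader(byte[] bytes) {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        return rootText(new InputSource(reader));
    }

    private static String rootText(InputSource source) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(source).getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

parseFromStream recovers the accented character because the parser honours the encoding declaration, while parseFromUtf8Reader does not, since the bytes were already decoded with the wrong charset before the parser saw them. Wrapping the stream in a BufferedInputStream keeps the buffering without forcing an encoding.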