Re: clustering results
So as Venu pointed out, sorting doesn't seem to help the problem. If we have to walk the result set, access docs and dedupe using brute force, we're better off w/ the standard order by relevance.

If you've got an example of this type of clustering done in a more efficient way, that'd be great. Any other ideas?

- Original Message -
From: Erik Hatcher [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Saturday, April 10, 2004 12:35 AM
Subject: Re: clustering results

On Apr 9, 2004, at 8:16 PM, Michael A. Schoen wrote:

> I have an index of urls, and need to display the top 10 results for a given query, but want to display only 1 result per domain. It seems that using either Hits or a HitCollector, I'll need to access the doc, grab the domain field (I'll have it parsed ahead of time) and only take/display documents that are unique. A significant percentage of the time I expect I may have to access thousands of results before I find 10 in unique domains. Is there a faster approach that won't require accessing thousands of documents?

I have examples of this that I can post when I have more time, but a quick pointer... check out the overloaded IndexSearcher.search() methods which accept a Sort. You can do really really interesting slicing and dicing, I think, using it. Try this one on for size:

    example.displayHits(allBooks,
        new Sort(new SortField[]{
            new SortField("category"),
            SortField.FIELD_SCORE,
            new SortField("pubmonth", SortField.INT, true)
        }));

Be clever indexing the piece you want to group on - I think you may find this the solution you're looking for.

Erik

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
clustering results
I have an index of urls, and need to display the top 10 results for a given query, but want to display only 1 result per domain. It seems that using either Hits or a HitCollector, I'll need to access the doc, grab the domain field (I'll have it parsed ahead of time) and only take/display documents that are unique. A significant percentage of the time I expect I may have to access thousands of results before I find 10 in unique domains. Is there a faster approach that won't require accessing thousands of documents?
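For reference, the brute-force approach described in the question can be sketched as follows. This is only an illustration in present-day Java, not Lucene code: the Hit class, the topPerDomain method and the sample URLs are all hypothetical, and the input list is assumed to be already ordered by relevance, the way Hits would return it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DomainDedupe {

    /** Hypothetical stand-in for one search hit; not a Lucene class. */
    public static final class Hit {
        public final String url;
        public final String domain; // assumed to be parsed at index time
        public final float score;

        public Hit(String url, String domain, float score) {
            this.url = url;
            this.domain = domain;
            this.score = score;
        }
    }

    /**
     * Walks hits in the order given (assumed: best score first) and keeps the
     * first hit seen for each domain, stopping once `limit` hits are kept.
     */
    public static List<Hit> topPerDomain(List<Hit> rankedHits, int limit) {
        Set<String> seenDomains = new HashSet<>();
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : rankedHits) {
            if (kept.size() >= limit) {
                break; // enough unique domains; stop reading further hits
            }
            // Set.add returns false when the domain was already seen
            if (seenDomains.add(hit.domain)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("http://a.example/1", "a.example", 0.9f));
        hits.add(new Hit("http://a.example/2", "a.example", 0.8f));
        hits.add(new Hit("http://b.example/1", "b.example", 0.7f));
        for (Hit hit : topPerDomain(hits, 10)) {
            System.out.println(hit.score + " " + hit.url);
        }
    }
}
```

The loop stops as soon as enough unique domains are found, but in the worst case it still reads every hit, which is the cost the question is asking how to avoid.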
Re: problem with SearchFiles demo
I am using 1.3-final. Specifically I'm using the jar files from lucene-1.3-final.zip. Any other ideas?

- Original Message -
From: Otis Gospodnetic [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, February 23, 2004 3:11 AM
Subject: Re: problem with SearchFiles demo

I remember somebody reporting a similar problem a few months ago. The problem has been fixed since then. You need 1.3-final version Lucene. You didn't say if that is what you are using.

Otis
Re: problem with SearchFiles demo
$ java -version
java version 1.3.0_02
$ uname -a
SunOS qadyn1i.looksmart.com 5.8 Generic_108528-20 sun4u sparc SUNW,Ultra-250

I entered a bug into BugZilla, but curiously it doesn't allow me to enter a bug against 1.3-final -- so I chose unspecified. Bug 27174. This is my first bug post, so please let me know if you need more detail. I've attached the source to this message as well.

thanks,
Michael

- Original Message -
From: Doug Cutting [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, February 23, 2004 9:31 AM
Subject: Re: problem with SearchFiles demo

Michael,

What JVM and OS are you using? Your attachment did not make it through. If you continue to have problems please submit a bug report and attach test code there.

Thanks,
Doug
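The JVM and OS details asked for in this exchange can also be read from inside a Java program, which is handy when attaching test code to a bug report. A minimal sketch, assuming nothing beyond the standard system properties; the EnvReport class and describe method are hypothetical names:

```java
public class EnvReport {

    /** Returns JVM and OS details read from standard system properties. */
    public static String describe() {
        String[] keys = {"java.version", "java.vendor", "os.name", "os.version", "os.arch"};
        StringBuilder report = new StringBuilder();
        for (String key : keys) {
            // One "key=value" line per property
            report.append(key).append('=').append(System.getProperty(key)).append('\n');
        }
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```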
Re: problem with SearchFiles demo
That was it -- using 1.3.1.6 solved the problem. Thanks Hui.

Doug: should I close the bug, or was this supposed to work properly on 1.3.0_02?

- Original Message -
From: hui [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Monday, February 23, 2004 11:44 AM
Subject: RE: problem with SearchFiles demo

I got same problem when upgrading from 1.2 to 1.3RC1. Upgrading to JDK1.3.1 up solved the problem. It may work for you too.

-Original Message-
From: Michael A. Schoen [mailto:[EMAIL PROTECTED]
Sent: Monday, February 23, 2004 1:47 PM
To: Lucene Users List
Subject: Re: problem with SearchFiles demo

$ java -version
java version 1.3.0_02
$ uname -a
SunOS qadyn1i.looksmart.com 5.8 Generic_108528-20 sun4u sparc SUNW,Ultra-250

I entered a bug into BugZilla, but curiously it doesn't allow me to enter a bug against 1.3-final -- so I chose unspecified. Bug 27174. This is my first bug post, so please let me know if you need more detail. I've attached the source to this message as well.

thanks,
Michael
problem with SearchFiles demo
I'm sure there's some obvious explanation for this that I'm missing -- I can't get the SearchFiles demo class to work. I can successfully use the IndexFiles class to index a directory, but searching doesn't work; I just get a NullPointerException.

So I wrote my own Search class, which is basically just a slightly tweaked version of SearchFiles. And I get a NullPointerException there as well. I added a stack trace, which shows the exception coming from IndexSearcher.explain().

Any ideas? I've attached the source for Search.java, and below is the stack trace.

thanks,
Michael

$ java Search
Query: casino
Searching for: casino
 caught a class java.lang.NullPointerException with message: null
java.lang.NullPointerException
        at org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:196)
        at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:93)
        at org.apache.lucene.search.Hits.<init>(Hits.java:80)
        at org.apache.lucene.search.Searcher.search(Searcher.java:71)
        at org.apache.lucene.search.Searcher.search(Searcher.java:65)
        at Search.main(Search.java:35)
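Since the attached Search.java did not reach the list, here is a rough sketch of the kind of catch-block reporting that produces output like the above: the exception class, its message, then the full stack trace. It is not the demo's actual source; the ErrorReport class and describe method are hypothetical, and the exception is thrown by hand in place of a failing search call.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class ErrorReport {

    /** Formats the class, message and full stack trace of a throwable. */
    public static String describe(Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        // getClass() renders as "class java.lang.NullPointerException"
        out.println(" caught a " + t.getClass() + " with message: " + t.getMessage());
        t.printStackTrace(out); // one "at ..." line per stack frame
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        try {
            // Stand-in for the failing search call
            throw new NullPointerException();
        } catch (Exception e) {
            System.out.print(describe(e));
        }
    }
}
```

Printing the stack trace, rather than only the class and message, is what shows which frame the exception came from.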