Re: clustering results

2004-04-10 Thread Michael A. Schoen
So as Venu pointed out, sorting doesn't seem to help the problem. If we have
to walk the result set, access docs and dedupe using brute force, we're
better off w/ the standard order by relevance.

If you've got an example of this type of clustering done in a more efficient
way, that'd be great.

Any other ideas?
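
For what it's worth, here is a rough sketch of why a domain-first sort may not save the walk. It assumes the hits arrive sorted by domain and then by descending score within each domain. Hit, SortedCollapse and bestPerDomain are made-up names for illustration, not Lucene classes. Every hit is still visited once, and the survivors have to be re-ranked by score before the top 10 can be taken.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Lucene API.
final class SortedCollapse {

    // Stand-in for a scored document whose domain was parsed at index time.
    static final class Hit {
        final String url;
        final String domain;
        final float score;

        Hit(String url, String domain, float score) {
            this.url = url;
            this.domain = domain;
            this.score = score;
        }
    }

    // Input is assumed sorted by domain, then by descending score.
    // 'limit' is assumed to be non-negative.
    static List<Hit> bestPerDomain(List<Hit> sortedByDomain, int limit) {
        List<Hit> best = new ArrayList<Hit>();
        String previousDomain = null;
        for (Hit hit : sortedByDomain) {
            // The first hit of each run is that domain's highest score.
            if (!hit.domain.equals(previousDomain)) {
                best.add(hit);
                previousDomain = hit.domain;
            }
        }
        // Domain order is not relevance order, so re-rank before truncating.
        Collections.sort(best, new Comparator<Hit>() {
            public int compare(Hit a, Hit b) {
                return Float.compare(b.score, a.score);
            }
        });
        return new ArrayList<Hit>(best.subList(0, Math.min(limit, best.size())));
    }

    public static void main(String[] args) {
        List<Hit> sorted = Arrays.asList(
            new Hit("http://a.example/1", "a.example", 0.4f),
            new Hit("http://a.example/2", "a.example", 0.2f),
            new Hit("http://b.example/1", "b.example", 0.9f));
        for (Hit hit : bestPerDomain(sorted, 10)) {
            System.out.println(hit.url + " " + hit.score);
        }
    }
}
```

The loop touches the whole result set regardless of how few domains are wanted, which is the cost the sort was supposed to avoid.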


----- Original Message -----
From: Erik Hatcher [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Saturday, April 10, 2004 12:35 AM
Subject: Re: clustering results


 On Apr 9, 2004, at 8:16 PM, Michael A. Schoen wrote:
  I have an index of urls, and need to display the top 10 results for a
  given query, but want to display only 1 result per domain. It seems
  that using either Hits or a HitCollector, I'll need to access the doc,
  grab the domain field (I'll have it parsed ahead of time) and only
  take/display documents that are unique.
 
  A significant percentage of the time I expect I may have to access
  thousands of results before I find 10 in unique domains. Is there a
  faster approach that won't require accessing thousands of documents?

 I have examples of this that I can post when I have more time, but a
 quick pointer... check out the overloaded IndexSearcher.search()
 methods which accept a Sort.  You can do really really interesting
 slicing and dicing, I think, using it.  Try this one on for size:

  example.displayHits(allBooks,
      new Sort(new SortField[]{
          new SortField("category"),
          SortField.FIELD_SCORE,
          new SortField("pubmonth", SortField.INT, true)
      }));

 Be clever indexing the piece you want to group on - I think you may
 find this the solution you're looking for.

 Erik


 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]



clustering results

2004-04-09 Thread Michael A. Schoen
I have an index of URLs, and need to display the top 10 results for a given query, but
want to display only 1 result per domain. It seems that using either Hits or a
HitCollector, I'll need to access the doc, grab the domain field (I'll have it parsed
ahead of time) and only take/display documents that are unique.

A significant percentage of the time I expect I may have to access thousands of
results before I find 10 in unique domains. Is there a faster approach that won't
require accessing thousands of documents?
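
A minimal sketch of the walk-and-dedupe approach described above, in plain Java with no Lucene calls. Hit, DomainDedupe and topPerDomain are hypothetical names; in a real searcher the domain would presumably come from the stored field on each document. The hits are assumed to be in relevance order already, so the walk can stop as soon as enough distinct domains are found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Lucene API.
final class DomainDedupe {

    // Stand-in for a scored document whose domain was parsed at index time.
    static final class Hit {
        final String url;
        final String domain;
        final float score;

        Hit(String url, String domain, float score) {
            this.url = url;
            this.domain = domain;
            this.score = score;
        }
    }

    // Walks hits in relevance order, keeps the first hit seen for each
    // domain, and stops once 'limit' distinct domains have been collected.
    static List<Hit> topPerDomain(List<Hit> rankedHits, int limit) {
        List<Hit> kept = new ArrayList<Hit>();
        Set<String> seen = new HashSet<String>();
        for (Hit hit : rankedHits) {
            if (kept.size() >= limit) {
                break;
            }
            // Set.add returns false when the domain was already present.
            if (seen.add(hit.domain)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> ranked = Arrays.asList(
            new Hit("http://a.example/1", "a.example", 0.9f),
            new Hit("http://a.example/2", "a.example", 0.8f),
            new Hit("http://b.example/1", "b.example", 0.7f));
        for (Hit hit : topPerDomain(ranked, 10)) {
            System.out.println(hit.url + " " + hit.score);
        }
    }
}
```

The worst case is unchanged: if one domain dominates the ranking, the loop still visits thousands of hits before it finds ten distinct domains.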

Re: problem with SearchFiles demo

2004-02-23 Thread Michael A. Schoen
I am using 1.3-final. Specifically I'm using the jar files from
lucene-1.3-final.zip.

Any other ideas?

----- Original Message -----
From: Otis Gospodnetic [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, February 23, 2004 3:11 AM
Subject: Re: problem with SearchFiles demo


 I remember somebody reporting a similar problem a few months ago.  The
 problem has been fixed since then.  You need 1.3-final version Lucene.
 You didn't say if that is what you are using.

 Otis

 --- Michael A. Schoen [EMAIL PROTECTED] wrote:
  I'm sure there's some obvious explanation for this that I'm missing -- I
  can't get the SearchFiles demo class to work. I can successfully use the
  IndexFiles class to index a directory, but searching doesn't work; I just
  get a NullPointerException.

  So I wrote my own Search class, which is basically just a slightly tweaked
  version of SearchFiles. And I get a NullPointerException there as well. I
  added a stack trace, which shows the exception coming from
  IndexSearcher.explain().

  Any ideas?

  I've attached the source for Search.java, and below is the stack trace.

  thanks,
  Michael


   $ java Search
   Query: casino
   Searching for: casino
    caught a class java.lang.NullPointerException
    with message: null
   java.lang.NullPointerException
       at org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:196)
       at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:93)
       at org.apache.lucene.search.Hits.<init>(Hits.java:80)
       at org.apache.lucene.search.Searcher.search(Searcher.java:71)
       at org.apache.lucene.search.Searcher.search(Searcher.java:65)
       at Search.main(Search.java:35)



Re: problem with SearchFiles demo

2004-02-23 Thread Michael A. Schoen

$ java -version
java version "1.3.0_02"

$ uname -a
SunOS qadyn1i.looksmart.com 5.8 Generic_108528-20 sun4u sparc SUNW,Ultra-250

I entered a bug into BugZilla, but curiously it doesn't allow me to enter a
bug against 1.3-final -- so I chose unspecified. Bug 27174. This is my
first bug post, so please let me know if you need more detail.

I've attached the source to this message as well.

thanks,
Michael

----- Original Message -----
From: Doug Cutting [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, February 23, 2004 9:31 AM
Subject: Re: problem with SearchFiles demo


 Michael,

 What JVM and OS are you using?

 Your attachment did not make it through.  If you continue to have
 problems please submit a bug report and attach test code there.

 Thanks,

 Doug


Re: problem with SearchFiles demo

2004-02-23 Thread Michael A. Schoen
That was it -- using 1.3.1.6 solved the problem. Thanks Hui.

Doug: should I close the bug, or was this supposed to work properly on
1.3.0_02?

----- Original Message -----
From: hui [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Monday, February 23, 2004 11:44 AM
Subject: RE: problem with SearchFiles demo


 I got the same problem when upgrading from 1.2 to 1.3RC1. Upgrading to
 JDK 1.3.1 or later solved the problem. It may work for you too.




problem with SearchFiles demo

2004-02-22 Thread Michael A. Schoen
I'm sure there's some obvious explanation for this that I'm missing -- I
can't get the SearchFiles demo class to work. I can successfully use the
IndexFiles class to index a directory, but searching doesn't work; I just
get a NullPointerException.

So I wrote my own Search class, which is basically just a slightly tweaked
version of SearchFiles. And I get a NullPointerException there as well. I
added a stack trace, which shows the exception coming from
IndexSearcher.explain().

Any ideas?

I've attached the source for Search.java, and below is the stack trace.

thanks,
Michael


 $ java Search
 Query: casino
 Searching for: casino
  caught a class java.lang.NullPointerException
  with message: null
 java.lang.NullPointerException
     at org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:196)
     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:93)
     at org.apache.lucene.search.Hits.<init>(Hits.java:80)
     at org.apache.lucene.search.Searcher.search(Searcher.java:71)
     at org.apache.lucene.search.Searcher.search(Searcher.java:65)
     at Search.main(Search.java:35)


