Ken, Grant,

For the moment I would prefer not to take this off list.

I did some more digging and found that the (or a) Reuters
collection is available via contrib/benchmark in Lucene;
that would be a good start for test data.

Also, as reported for Swap-1, only boolean queries were used and
no ranking was done. Adding lexical affinity and ranking makes
the query search space even bigger, so that will need
attention.

I like Swap-1 because its results can be expressed in queries,
which are normally easily understandable.
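
To illustrate how a rule set of this kind reads as a boolean query,
here is a minimal Python sketch. It assumes rules in disjunctive
normal form (an OR of AND-ed terms). The rule, the terms, the documents
and the function names are all made up for illustration; none of them
are taken from Swap-1 or from Lucene.

```python
# Hypothetical sketch: a DNF rule set rendered as a boolean query string
# and evaluated against documents represented as sets of terms.

def rule_to_query(rule):
    """Render a DNF rule (a list of term conjunctions) as a query string."""
    clauses = []
    for conjunction in rule:
        # Each term in a conjunction is required, hence the leading '+'.
        clauses.append("(" + " ".join("+" + term for term in conjunction) + ")")
    return " OR ".join(clauses)

def matches(rule, doc_terms):
    """True if at least one conjunction has all of its terms in the document."""
    return any(all(term in doc_terms for term in conj) for conj in rule)

# Made-up rule for an illustrative "wheat" category.
wheat_rule = [["wheat", "tonnes"], ["grain", "export"]]

# Made-up documents, each reduced to its set of terms.
docs = {
    "d1": {"wheat", "tonnes", "shipment"},
    "d2": {"grain", "price"},
    "d3": {"grain", "export", "quota"},
}

query = rule_to_query(wheat_rule)
hits = sorted(name for name, terms in docs.items() if matches(wheat_rule, terms))
```

Here the rule can be read directly off the query string, which is the
kind of understandability I mean. No ranking is involved: a document
either matches or it does not.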

Understandability and ranking are useful in interactive classification,
which is still a distant target for me.

Regards,
Paul Elschot


On Sunday 10 February 2008 21:56:23, Grant Ingersoll wrote:
> If it makes sense at all for this to live in Mahout (which I assume
> it does, since it was brought up here), please coordinate online;
> that way others who might be interested can chip in and we will all
> benefit from the discussion.
> 
> Cheers,
> Grant
> 
> On Feb 10, 2008, at 12:54 AM, Ken Montanez wrote:
> 
> > I would be interested in looking at this. Our schedules sound
> > similar, and the project concerns subject matter that I am very
> > curious about. Please let me know if you want to coordinate offline.
> >
> > Ken
> >
> > On Feb 5, 2008 12:01 AM, Paul Elschot <[EMAIL PROTECTED]> wrote:
> >
> >> Dear readers,
> >>
> >> Is there anyone else interested in reimplementing Swap-1 on top of
> >> Lucene? Is there perhaps an existing implementation available
> >> somewhere that I'm not aware of?
> >>
> >>
> >> http://scholar.google.nl/scholar?q=swap+1+text+reuters&hl=nl&lr=&ie=ISO-8859-1&btnG=Zoeken
> >>
> >> I'd like to extend it with lexical affinity queries and just see what
> >> grows out of it.
> >> The pace will be slow; I have only a little time available.
> >>
> >> Regards,
> >> Paul Elschot
> >>
> >
> >
> >
> > -- 
> > Ken Montanez | 510.681.5576
> 
> 
> 
> 

