On Thu, Nov 19, 2009 at 14:15, Grant Ingersoll <[email protected]> wrote:
> Probably a discussion better suited for the Open Relevance Project (
> http://lucene.apache.org/openrelevance).

I'll check that.

> That being said, the primary problem we have is one of redistribution. If
> you can give us a pointer to it and we can know that it isn't going to
> change, that is probably the best thing. Personally, I'd love to see/use
> it.
>
> -Grant

Well, it was there: http://index.isc.org/ , but it's not any more... I have a
copy on disk, but it's quite large (the smallest crawl is a 4.5 GB tar.bz2 and
the smallest index is a 990 MB tar.bz2), so it's not easy to send by mail.

> On Nov 19, 2009, at 7:40 AM, Gérard Dupont wrote:
>
> > Hi,
> >
> > I'm a bit out of the discussion and don't know what is the exact scope of
> > the test needed, however, I still have the IOI crawl pages and Lucene
> > indexes which have been offered after the end of the search wikia project.
> > It's totally unclassified data but quite large (I have something like 30M
> > pages in mind). Do you have any use of such data? Again it's raw crawl, no
> > classification has been applied.
> >
> > cheers
