RE: map/reduce and Lucene integration question

2007-12-14 Thread Butler, Mark (Labs)
…client API as well? And perhaps how it could be used with MapReduce? kind regards, Mark -Original Message- From: Enis Soztutar [mailto:[EMAIL PROTECTED] Sent: 13 December 2007 09:37 To: hadoop-user@lucene.apache.org Subject: Re: map/reduce and Lucene integration question Hi, nutch…

Re: map/reduce and Lucene integration question

2007-12-13 Thread Ted Dunning
Yes. On 12/13/07 12:22 PM, "Eugeny N Dzhurinsky" <[EMAIL PROTECTED]> wrote: > On Thu, Dec 13, 2007 at 11:31:49AM -0800, Ted Dunning wrote: >> After indexing, indexes are moved to multiple query servers. ... (how nutch >> works) With this architecture, you get good scaling in both queries per…

Re: map/reduce and Lucene integration question

2007-12-13 Thread Eugeny N Dzhurinsky
On Thu, Dec 13, 2007 at 11:31:49AM -0800, Ted Dunning wrote: > After indexing, indexes are moved to multiple query servers. The indexes on > the local query servers are all on local disk. > > There are two dimensions to scaling search. The first dimension is query > rate. To get that scaling, you…

Re: map/reduce and Lucene integration question

2007-12-13 Thread Ted Dunning
After indexing, indexes are moved to multiple query servers. The indexes on the local query servers are all on local disk. There are two dimensions to scaling search. The first dimension is query rate. To get that scaling, you simply replicate your basic search operator and balance using a si…
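
A rough sketch of the fan-out-and-merge step behind that description, assuming one local Lucene IndexSearcher per index shard and the Lucene 2.x-era search(Query, Filter, int) call; the class and method names below are illustrative, not Nutch's actual distributed-search code:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

public class ShardedSearchSketch {

  // Fans one query out to every shard searcher in parallel and merges the
  // per-shard top hits by score. Each IndexSearcher is assumed to be open
  // on an index copy sitting on that query server's local disk.
  public static List<ScoreDoc> search(List<IndexSearcher> shards, final Query q,
                                      final int topN) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(shards.size());
    List<Future<TopDocs>> futures = new ArrayList<Future<TopDocs>>();
    for (final IndexSearcher shard : shards) {
      futures.add(pool.submit(new Callable<TopDocs>() {
        public TopDocs call() throws Exception {
          return shard.search(q, null, topN);   // local top N from this shard
        }
      }));
    }

    // Merge by score; a real frontend would also dedupe and remember which
    // shard each hit came from so the document can be fetched later.
    List<ScoreDoc> merged = new ArrayList<ScoreDoc>();
    for (Future<TopDocs> f : futures) {
      for (ScoreDoc d : f.get().scoreDocs) {
        merged.add(d);
      }
    }
    Collections.sort(merged, new Comparator<ScoreDoc>() {
      public int compare(ScoreDoc a, ScoreDoc b) {
        return Float.compare(b.score, a.score);
      }
    });
    pool.shutdown();
    return merged.subList(0, Math.min(topN, merged.size()));
  }
}

Replicating the shard searchers behind a load balancer adds query-rate capacity; adding shards grows the total index that can be served.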

Re: map/reduce and Lucene integration question

2007-12-13 Thread Andrzej Bialecki
Ted Dunning wrote: I don't think so (but I don't run nutch) To actually run searches, the search engines copy the index to local storage. Having them in HDFS is very nice, however, as a way to move them to the right place. Nutch can search in Lucene indexes on HDFS (see org.apache.nutch.indexer…

Re: map/reduce and Lucene integration question

2007-12-13 Thread Eugeny N Dzhurinsky
On Thu, Dec 13, 2007 at 11:03:50AM -0800, Ted Dunning wrote: > > I don't think so (but I don't run nutch) > > To actually run searches, the search engines copy the index to local > storage. Having them in HDFS is very nice, however, as a way to move them > to the right place. Even in case if th…

Re: map/reduce and Lucene integration question

2007-12-13 Thread Ted Dunning
I don't think so (but I don't run nutch) To actually run searches, the search engines copy the index to local storage. Having them in HDFS is very nice, however, as a way to move them to the right place. On 12/13/07 10:59 AM, "Eugeny N Dzhurinsky" <[EMAIL PROTECTED]> wrote: > On Thu, Dec 13,…
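
As a rough illustration of that "copy to local storage, then search" step: a standalone sketch using FileSystem.copyToLocalFile() and the Lucene 2.x-era Hits API. The paths and the "content" field name are made up for the example; this is not what Nutch's search servers actually do internally.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class CopyAndSearchSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path dfsIndex = new Path("/indexes/part-00000");       // index produced by the MR job (illustrative path)
    Path localIndex = new Path("/data/search/part-00000"); // local disk on the query server (illustrative path)

    // Pull the finished index out of HDFS onto the query server's local disk.
    FileSystem.get(conf).copyToLocalFile(dfsIndex, localIndex);

    // Search the local copy with plain Lucene.
    IndexSearcher searcher = new IndexSearcher(localIndex.toString());
    Query q = new QueryParser("content", new StandardAnalyzer()).parse("hadoop");
    Hits hits = searcher.search(q);
    System.out.println("matches: " + hits.length());
    searcher.close();
  }
}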

Re: map/reduce and Lucene integration question

2007-12-13 Thread Eugeny N Dzhurinsky
On Thu, Dec 13, 2007 at 11:36:31AM +0200, Enis Soztutar wrote: > Hi, > > nutch indexes the documents in the org.apache.nutch.indexer.Indexer class. > In the reduce phase, the documents are output wrapped in ObjectWritable. > The OutputFormat opens a local IndexWriter (FileSystem.startLocalOutput(…

Re: map/reduce and Lucene integration question

2007-12-13 Thread Enis Soztutar
Hi, nutch indexes the documents in the org.apache.nutch.indexer.Indexer class. In the reduce phase, the documents are output wrapped in ObjectWritable. The OutputFormat opens a local IndexWriter (FileSystem.startLocalOutput()), and adds all the documents that are collected. Then it puts the index…
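
A minimal sketch of the pattern Enis describes, assuming the Lucene 2.x-era IndexWriter API: build the index on local disk via startLocalOutput(), then promote it to the DFS with completeLocalOutput(). This is not the actual org.apache.nutch.indexer code; the class name and scratch path are illustrative.

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

public class LocalIndexSketch {

  // Builds a Lucene index on local disk from the documents collected in the
  // reduce phase, then hands the finished index back to the DFS.
  public static void writeIndex(Configuration conf, Path dfsIndexDir,
                                Iterator<Document> docs) throws IOException {
    FileSystem fs = dfsIndexDir.getFileSystem(conf);

    // Local scratch directory for the index (illustrative path).
    Path localTmp = new Path("/tmp/index-" + System.currentTimeMillis());
    Path local = fs.startLocalOutput(dfsIndexDir, localTmp);

    // Lucene 2.x-style writer on the local directory.
    IndexWriter writer = new IndexWriter(local.toString(), new StandardAnalyzer(), true);
    while (docs.hasNext()) {
      writer.addDocument(docs.next());
    }
    writer.optimize();   // merge segments before shipping the index
    writer.close();

    // Promote the finished local index to the DFS output path.
    fs.completeLocalOutput(dfsIndexDir, localTmp);
  }
}

In Nutch the OutputFormat itself does this work, so each reduce task ends up producing its own index partition, which is what later gets distributed to the query servers.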