Re: Distributed Lucene Questions

2009-06-02 Thread Tarandeep Singh
ng and consulting http://www.scaleunlimited.com http://www.101tec.com On Jun 1, 2009, at 9:54 AM, Tarandeep Singh wrote: Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed

RE: Distributed Lucene Questions

2009-06-01 Thread Angel, Eric
er_li...@transpac.com] Sent: Monday, June 01, 2009 11:05 AM To: java-user@lucene.apache.org Subject: Re: Distributed Lucene Questions Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wi

Re: Distributed Lucene Questions

2009-06-01 Thread Ken Krugler
Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wiki.apache.org/hadoop/DistributedLucene https://issues.apache.org/jira/browse/HADOOP-3394 and have a couple of questions. It will be really helpful if

Distributed Lucene Questions

2009-06-01 Thread Tarandeep Singh
Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wiki.apache.org/hadoop/DistributedLucene https://issues.apache.org/jira/browse/HADOOP-3394 and have a couple of questions. It will be really helpful if

Re: distributed lucene progress

2008-06-02 Thread Lukas Vlcek
FYI: Ning's code seems to be part of the Hadoop contrib package now. On Sat, May 31, 2008 at 5:35 AM, Matt Ronge <[EMAIL PROTECTED]> wrote: On May 21, 2008, at 3:19 PM, Otis Gospodnetic wrote: No, that's a separate project on SF, IIRC. I

Re: distributed lucene progress

2008-05-30 Thread Matt Ronge
On May 21, 2008, at 3:19 PM, Otis Gospodnetic wrote: No, that's a separate project on SF, IIRC. I am also interested in distributed lucene. I took a look on Hadoop's wiki and found this: http://wiki.apache.org/hadoop/DistributedLucene?highlight=%28distributed%29 which lea

Re: distributed lucene progress

2008-05-21 Thread Otis Gospodnetic
No, that's a separate project on SF, IIRC. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: John Wang <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Wednesday, May 21, 2008 11:46:02 AM > Subject:

Re: distributed lucene progress

2008-05-21 Thread John Wang
- Lucene - Solr - Nutch > > > - Original Message > > From: Chris Hostetter <[EMAIL PROTECTED]> > > To: java-user@lucene.apache.org > > Sent: Tuesday, May 20, 2008 5:58:08 PM > > Subject: Re: distributed lucene progress > > > > :What is

Re: distributed lucene progress

2008-05-20 Thread Otis Gospodnetic
t: Tuesday, May 20, 2008 5:58:08 PM > Subject: Re: distributed lucene progress > > :What is the current status on the distributed lucene project proposed at: > : > : http://www.mail-archive.com/[EMAIL PROTECTED]/msg00338.html > > I don't think it ever got past the init

Re: distributed lucene progress

2008-05-20 Thread Chris Hostetter
:What is the current status on the distributed lucene project proposed at: : : http://www.mail-archive.com/[EMAIL PROTECTED]/msg00338.html I don't think it ever got past the initial idea stage ... or, if it did: I haven't heard about i

distributed lucene progress

2008-05-14 Thread John Wang
Hi: What is the current status on the distributed lucene project proposed at: http://www.mail-archive.com/[EMAIL PROTECTED]/msg00338.html Thanks -John

Re: Distributed Lucene Directory

2008-02-01 Thread Cedric Ho
On Feb 1, 2008 9:47 AM, Mark Miller <[EMAIL PROTECTED]> wrote: Cedric Ho wrote: But managing such a set of indexes is not trivial. Especially when you need to add redundancy for reliability and update frequently. Agreed. Apparently the Solr guys are working on this now. Certainl

Re: Distributed Lucene Directory

2008-01-31 Thread Mark Miller
Cedric Ho wrote: But managing such a set of indexes is not trivial. Especially when you need to add redundancy for reliability and update frequently. Agreed. Apparently the Solr guys are working on this now. Certainly not trivial to do right. You might want to check out that work. I want to

Re: Distributed Lucene Directory

2008-01-31 Thread Cedric Ho
Yes, I am aware of the RemoteSearchable and ParallelSearcher. And I am doing something similar now, i.e. splitting the index across multiple machines. But managing such a set of indexes is not trivial. Especially when you need to add redundancy for reliability and update frequently. I bumped into this a w

Re: Distributed Lucene Directory

2008-01-31 Thread Karl Wettin
On 31 Jan 2008, at 09:42, Cedric Ho wrote: I am wondering if there exists any implementation of org.apache.lucene.store.Directory which can be distributed across multiple machines with comparable performance to a local FSDirectory index, or is such an idea feasible in the first place. By comparable p

Distributed Lucene Directory

2008-01-31 Thread Cedric Ho
Hi all, I am wondering if there exists any implementation of org.apache.lucene.store.Directory which can be distributed across multiple machines with comparable performance to a local FSDirectory index, or is such an idea feasible in the first place. By comparable performance I mean a 100G index di

RE: Distributed Lucene.. - clustering as a requirement

2006-04-11 Thread Dmitry Goldenberg
I guess Compass is probably the way to go - http://www.opensymphony.com/compass/ From: Prasenjit Mukherjee [mailto:[EMAIL PROTECTED] Sent: Tue 4/11/2006 2:45 AM To: java-user@lucene.apache.org Subject: Re: Distributed Lucene.. - clustering as a requirement

Re: Distributed Lucene.. - clustering as a requirement

2006-04-10 Thread Prasenjit Mukherjee
Agreed, an inverted index cannot be efficiently maintained in a B-tree (hence RDBMS). But I think we can (or should) have the option of B-tree based storage for unindexed fields, whereas for indexed fields we can use Lucene's existing architecture. prasen [EMAIL PROTECTED] wrote: Dmi

Re: Distributed Lucene.. - clustering as a requirement

2006-04-10 Thread Doug Cutting
Dmitry Goldenberg wrote: For an enterprise-level application, Lucene appears too file-system and too byte-sequence-centric a technology. Just my opinion. The Directory API is just too low-level. There are good reasons why Lucene is not built on top of a RDBMS. An inverted index is not effi

RE: Distributed Lucene.. - clustering as a requirement

2006-04-06 Thread Dmitry Goldenberg
ht [mailto:[EMAIL PROTECTED] Sent: Thu 4/6/2006 3:55 PM To: java-user@lucene.apache.org Subject: Re: Distributed Lucene.. - clustering as a requirement What about using lucene just for searching (i.e., no stored fields except maybe one "ID" primary key field), and using an RDBMS fo
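
The split suggested in this message (the search index holds only an "ID" key, and an RDBMS holds the stored content) can be sketched roughly as follows. This is a minimal illustration and not code from the thread: `IdOnlyIndex` is a hypothetical in-memory stand-in for a Lucene index with no stored fields, SQLite stands in for the RDBMS, and the table and function names are assumptions.

```python
import sqlite3
from collections import defaultdict


class IdOnlyIndex:
    """Hypothetical stand-in for a search index that keeps only document IDs.

    A real deployment would use Lucene here; this toy maps term -> set of IDs.
    """

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted IDs of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        ids = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(ids)


def build(docs):
    """Store full bodies in the database and only IDs/terms in the index."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    index = IdOnlyIndex()
    for doc_id, body in docs:
        db.execute("INSERT INTO docs (id, body) VALUES (?, ?)", (doc_id, body))
        index.add(doc_id, body)
    db.commit()
    return db, index


def search_and_fetch(db, index, query):
    """Search the index for IDs, then load the matching rows from the database."""
    ids = index.search(query)
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    sql = f"SELECT id, body FROM docs WHERE id IN ({placeholders}) ORDER BY id"
    return db.execute(sql, ids).fetchall()
```

With this layout the index stays small and the database remains the single source of truth for document content, at the cost of a second lookup per query.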

Re: Distributed Lucene.. - clustering as a requirement

2006-04-06 Thread Chris Lamprecht
much trouble to deal with for an application integrator like myself. - Dmitry From: Samuru Jackson [mailto:[EMAIL PROTECTED] Sent: Mon 3/6/2006 10:05 AM To: java-user@lucene.apache.org Subject: Re: Distributed Lucene..

RE: Distributed Lucene.. - clustering as a requirement

2006-04-06 Thread Dmitry Goldenberg
xing structures at a single byte level are just way too much trouble to deal with for an application integrator like myself. - Dmitry From: Samuru Jackson [mailto:[EMAIL PROTECTED] Sent: Mon 3/6/2006 10:05 AM To: java-user@lucene.apache.org Subject: Re: Distributed

RE: Distributed Lucene..

2006-03-07 Thread Andrew Schetinin
h Engine" SIIA Codie Award "Trend Setting Product" KMWorld Magazine -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 07, 2006 8:55 PM To: java-user@lucene.apache.org Subject: Re: Distributed Lucene.. Hi, Just curious about this: >

Re: Distributed Lucene..

2006-03-07 Thread Otis Gospodnetic
Hi, Just curious about this: > We hacked :-) IndexWriter of Lucene to start all segment names with a > prefix unique for each small index part. > Then, when adding it to the actual index, we simply copy the new segment > to the folder with the other segments, and add it in such a way so that > the

Re: Distributed Lucene..

2006-03-07 Thread Andrzej Bialecki
Prasenjit Mukherjee wrote: I think Nutch has a distributed Lucene implementation. I could have used Nutch straightaway, but I have a different crawler, and also don't want to use NDFS (which is used by Nutch). What I have proposed earlier is basically based on the MapReduce paradigm, which is used

Re: Distributed Lucene..

2006-03-06 Thread Prasenjit Mukherjee
I think Nutch has a distributed Lucene implementation. I could have used Nutch straightaway, but I have a different crawler, and also don't want to use NDFS (which is used by Nutch). What I have proposed earlier is basically based on the MapReduce paradigm, which is used by Nutch as well. It would

RE: Distributed Lucene..

2006-03-06 Thread Andrew Schetinin
: java-user@lucene.apache.org Subject: Re: Distributed Lucene.. Do you plan to release some kind of a commercial product including an API? I ask because I'm evaluating different technologies for a prototype which is part of my diploma thesis. The problem is that I have to deal with really huge

Re: Distributed Lucene..

2006-03-06 Thread Samuru Jackson
Do you plan to release some kind of a commercial product including an API? I ask because I'm evaluating different technologies for a prototype which is part of my diploma thesis. The problem is that I have to deal with really huge amounts of data and one machine is simply not enough to handle those am

RE: Distributed Lucene..

2006-03-06 Thread Andrew Schetinin
Hello, We are implementing a distributed searcher and indexer based on Lucene. I cannot share its code but I can provide hints based on our experience. What we basically did was have several machines index documents and create small Lucene indexes. We hacked :-) IndexWriter of Lucene to s

Re: Distributed Lucene..

2006-03-06 Thread Samuru Jackson
> Does it make any sense? Also would like to know if there are other ways > to distribute Lucene's indexing/searching? I'm interested in such a distributed architecture too. What I have in mind is some kind of Lucene index cluster where you have several machines holding subindexes in me
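
The cluster idea in this message (several machines each holding a sub-index, with one query fanned out to all of them) can be sketched as a scatter-gather search. This is only an illustration under stated assumptions: `SubIndex` is a hypothetical in-memory stand-in for one machine's Lucene sub-index, scoring is a naive term count, and threads stand in for remote calls.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


class SubIndex:
    """Hypothetical shard; stands in for one machine's sub-index."""

    def __init__(self, docs):
        self.docs = dict(docs)  # doc_id -> text

    def search(self, query, k):
        """Return up to k (score, doc_id) hits, best first."""
        terms = query.lower().split()
        hits = []
        for doc_id, text in self.docs.items():
            words = text.lower().split()
            score = sum(words.count(t) for t in terms)  # naive term count
            if score > 0:
                hits.append((score, doc_id))
        # Highest score first; ties broken by doc_id for determinism.
        return heapq.nsmallest(k, hits, key=lambda h: (-h[0], h[1]))


def distributed_search(shards, query, k):
    """Query every shard in parallel and merge the partial top-k lists.

    Assumes `shards` is non-empty.
    """
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: s.search(query, k), shards))
    merged = [hit for part in partials for hit in part]
    return heapq.nsmallest(k, merged, key=lambda h: (-h[0], h[1]))
```

Asking each shard for its own top k is enough to guarantee the global top k after merging. One caveat: raw term counts are directly comparable across shards, whereas scores that depend on per-index statistics generally are not, so a real deployment would need to account for that when merging.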

Distributed Lucene..

2006-03-05 Thread Prasenjit Mukherjee
I already have an implementation of a distributed crawler farm, where crawler instances are running on different boxes. I want to come up with a distributed indexing scheme using Lucene and take advantage of the distributed nature of my crawlers. Here is what I am thinking.