Re: Solr with Auto-suggest

2008-04-29 Thread Rantjil Bould
Thanks a lot for your advice/suggestion. I have made good progress and was able to extract all facets based on a facet.prefix query. The auto-suggest works fine for single-word suggestions. I was wondering how to extract all the nearest tokens for any token selected by the user in auto-suggest mode. Example:
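For readers following the facet.prefix approach mentioned above, here is a minimal sketch of how such a suggestion request might be built and its response read. The base URL, the field name `title_suggest`, and the helper names are assumptions for illustration only; they do not come from the original message. The parsing step assumes Solr's default JSON facet layout, a flat list alternating term and count.

```python
from urllib.parse import urlencode


def build_suggest_url(base_url, field, prefix, limit=10):
    """Build a facet.prefix request URL (hypothetical helper, not part of Solr)."""
    params = [
        ("q", "*:*"),              # match everything; only the facet counts matter
        ("rows", "0"),             # no documents needed, just facets
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),  # only terms starting with what the user typed
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),   # skip terms that match no documents
        ("wt", "json"),
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def parse_suggestions(response, field):
    """Turn the flat [term, count, term, count, ...] list into (term, count) pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))
```

This covers only the single-word case described in the message; suggesting neighbouring tokens would need a different field design (for example, indexing word pairs), which is not shown here.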

Re: unique values from a field in a result

2008-04-29 Thread Thijs
It must be my English. When I read your comment, I think you could compare it to the category example... Maybe I can explain my situation better with an example: the documents in the index contain variations of different products. Say, for example, I have 10 different products. Every product is

Index splitting

2008-04-29 Thread Nico Heid
Hi, Let me first roughly describe the scenario :-) We're trying to index data stored online for some thousand users. The schema.xml has a custom identifier for the user, so FQ can be applied and further filtering is done only for that user (more importantly, the user doesn't get to see results from

Solr replication by solr (for windows)

2008-04-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi, the current replication strategy in Solr involves shell scripts. The drawbacks are:
* It does not work with Windows
* Replication works as a separate piece, not integrated with Solr
* Replication cannot be controlled from Solr admin/JMX
* Each operation requires manual telnet to the

Re: Solr replication by solr (for windows)

2008-04-29 Thread Ian Holsman
The current scripts use rsync to minimize the amount of data actually being copied. I've had a brief look and found only one implementation, which is GPL and abandoned: http://sourceforge.net/projects/jarsync. Personally I still think the size of the transfer is important (as for most use cases

Re: unique values from a field in a result

2008-04-29 Thread Ian Holsman
Hi Thijs. If you are not concerned with an *EXACT* number, there is a paper published in 1990 that discusses this problem: http://dblab.kaist.ac.kr/Publication/pdf/ACM90_TODS_v15n2.pdf From the paper (if I understand it correctly): for 120,000,000 records you can sample 10,112,529

Re: Solr replication by solr (for windows)

2008-04-29 Thread Shalin Shekhar Mangar
Hi Ian, I assume that a sizeable number of people do replication after an optimize, which causes almost the whole index to be transferred by rsync. We can do a checksum-based modification check on individual segment files and pull only those from the master. Although that's not a true diff copy,
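The checksum-based check described above could look roughly like the sketch below. It is only an illustration of the idea, not the proposed Solr implementation: the choice of SHA-1, the manifest format (file name mapped to hex digest), and all function names are assumptions.

```python
import hashlib
import os


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_manifest(index_dir):
    """Map each regular file in an index directory to its checksum."""
    manifest = {}
    for name in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            manifest[name] = file_checksum(path)
    return manifest


def files_to_pull(master_manifest, slave_manifest):
    """Files the slave must fetch: missing locally or with a different checksum."""
    return sorted(
        name
        for name, checksum in master_manifest.items()
        if slave_manifest.get(name) != checksum
    )


def files_to_delete(master_manifest, slave_manifest):
    """Files present on the slave that the master no longer has."""
    return sorted(set(slave_manifest) - set(master_manifest))
```

As the message notes, this is a whole-file comparison rather than a true diff copy, so after an optimize nearly every file would still be pulled.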

Re: Solr replication by solr (for windows)

2008-04-29 Thread Ryan McKinley
In the future, don't post the same idea in solr-user and solr-dev... most people on solr-dev read solr-user and the cross posting splits where discussion ends up. On Apr 29, 2008, at 5:01 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: hi , The current replication strategy in solr involves shell

Re: Solr replication by solr (for windows)

2008-04-29 Thread Ryan McKinley
We are not doing away with the current replication strategy. It's just that we're proposing an alternative. I'm all for adding a replication strategy that works on windows and is controlled/managed from the webapp. The existing hardlink rsync methods may have better performance...

Re: GSA - Solr

2008-04-29 Thread Jon Baer
I don't think the KeywordMatch and the elevate.xml are the same thing. I tried this out today, but there is no way for an element to at least mark that it was elevated to the top. An example of what I'm trying to do: if, say, Solr is entered into a search, return a block of text and/or

Upgrade Lucene to 2.3

2008-04-29 Thread Yongjun Rong
Hi, It seems the latest Lucene 2.3 has some performance improvements. I'm wondering whether we can easily upgrade Solr's Lucene from 2.1 to 2.3. Is there anything special we need to know beyond replacing the Lucene jars in the lib directory? Thank you very much. Yongjun Rong

Re: Upgrade Lucene to 2.3

2008-04-29 Thread Otis Gospodnetic
I think you may be okay with the Lucene 2.3 (I tried it with Solr from a few months ago). The development version of Solr already uses Lucene 2.3.*. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Yongjun Rong [EMAIL PROTECTED] To:

Re: Index splitting

2008-04-29 Thread Otis Gospodnetic
Hi Nico, I don't think there is a tool to split an existing Lucene index, though I imagine one could write such a tool using http://lucene.apache.org/java/2_3_1/fileformats.html as a guide. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From:

Re: Index splitting

2008-04-29 Thread Grant Ingersoll
I seem to recall Doug C. commenting on this: http://lucene.markmail.org/search/?q=FilterIndexReader#query:FilterIndexReader%20from%3A%22Doug%20Cutting%22+page:1+mid:y673avueo43ufwhm+state:results Not sure if that is exactly what you are looking for, but sounds similar. -Grant On Apr 29,

Sorting results

2008-04-29 Thread Esteve Camps Chust
Hi, I want to know if the following kind of sorting is possible: I perform a search like Matahari. The returned results may include A big life: Matahari, War and Matahari, Matahari (in that order). How can I sort the results so that those matching the beginning of the string come first? I want
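One way to picture the ordering asked for above is a client-side re-rank of the titles that come back. This sketch is an assumption about what is wanted, not a Solr feature, and it only reorders whatever page of results has already been fetched; the function name is hypothetical.

```python
def prefix_first(titles, query):
    """Move titles that start with the query to the front.

    Python's sort is stable, so within each group the original
    (relevance) order returned by the search is preserved.
    """
    needle = query.lower()
    # False sorts before True, so prefix matches (key False) come first.
    return sorted(titles, key=lambda title: not title.lower().startswith(needle))


results = ["A big life: Matahari", "War and Matahari", "Matahari"]
print(prefix_first(results, "Matahari"))
# prints ['Matahari', 'A big life: Matahari', 'War and Matahari']
```

Doing the same thing inside the index would need a field that keeps the whole title as a single token so that a prefix query can be boosted, which is a schema decision outside this sketch.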

Master / slave setup with multicore

2008-04-29 Thread James Brady
Hi all, I'm aiming to use the new multicore features in development versions of Solr. My ideal setup would be to have master / slave servers on the same machine, snapshotting across from the 'write' to the 'read' server at intervals. This was all fine with Solr 1.2, but the rsync

Re: Upgrade Lucene to 2.3

2008-04-29 Thread Funtick
Special things:
- 2.3.1 fixes bugs with 'autocommit' in version 2.3.0
- I am constantly getting OutOfMemoryError, and I can't yet work out where the problem is... I didn't have it with the default Solr 1.2 installation. It's not memory-cache related; most probably it is a bug somewhere...
Yongjun

Re: Queuing adds and commits

2008-04-29 Thread James Brady
Depending on your application, it might be useful to take control of the queueing yourself: it was for me! I needed quick turnarounds for submitting a document to be indexed, which Solr can't guarantee right now. To address it, I wrote a persistent queueing server, accessed by XML-RPC,

Re: Master / slave setup with multicore

2008-04-29 Thread Ryan McKinley
On Apr 29, 2008, at 3:09 PM, James Brady wrote: Hi all, I'm aiming to use the new multicore features in development versions of Solr. My ideal setup would be to have master / slave servers on the same machine, snapshotting across from the 'write' to the 'read' server at intervals. This

Re: unique values from a field in a result

2008-04-29 Thread Chris Hostetter
: My example is just simple; in real life the numbers are a lot bigger. However,
: the number of unique products vs variations is such that it seems a lot of
: work to iterate over all variations in a DocSet just to get the few unique
: products.
: But what I understand from your answer is that

Re: Index splitting

2008-04-29 Thread Norberto Meijome
On Tue, 29 Apr 2008 10:10:09 +0200 Nico Heid [EMAIL PROTECTED] wrote: So now the question: Is there a way to split an index that is too big into smaller ones? Do I have to create more instances at the beginning, so that I will not run out of power and space? (which will add quite a bit of redundancy of