Re: Changing dataDir without restarting server

2008-09-30 Thread Walter Underwood
Solr index distribution already does this with a slightly different mechanism. It moves the files instead of the directory. I recommend understanding and using the standard scripts for index distribution. http://wiki.apache.org/solr/CollectionDistribution wunder On 9/29/08 9:55 PM, Otis

Re: How to select one entity at a time?

2008-09-30 Thread con
Thanks, everybody. I have gone through the wiki and some other docs. Actually, I have a tight schedule and I have to look into various other things along with this. Currently I am looking into rebuilding Solr by writing a wrapper class. I will update you with more meaningful questions soon.

Re: Question about facet.prefix usage

2008-09-30 Thread Erik Hatcher
If I'm not mistaken, doesn't facet.query accomplish what you want? Erik On Sep 29, 2008, at 5:43 PM, Simon Hu wrote: I also need the exact same feature. I was not able to find an easy solution and ended up modifying class SimpleFacets to make it accept an array of facet

Re: Running Solr1.3 with multicore support

2008-09-30 Thread RaghavPrabhu
Hi Saurabh Bhutyani, Does your Solr home page show the two core links, like Admin core0 and Admin core1? If not, the problem is that you are upgrading Solr from 1.2 to 1.3. It is better to stop the server, delete all the folders in the %Tomcat_Home%\work\Catalina\localhost location, and restart it.

Howto concatenate tokens at index time (without spaces)

2008-09-30 Thread Batzenmann
Hi, I'm looking for a way to create a fieldtype which will, apart from the whitespace-tokenized tokens, also store concatenated versions of the tokens. The ShingleFilter does something very similar but keeps spaces in between words. In German, a shoe (Schuh) you wear in your 'spare time' (Freizeit) is
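The output being asked for can be sketched in plain Java. This is only an illustration of the desired token stream, not a Lucene TokenFilter: the class name `ConcatShingles` and the method `expand` are hypothetical, and the sketch assumes that only adjacent pairs of whitespace-separated tokens need to be joined. A real solution would still have to be written as a custom filter inside the analyzer chain.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or Lucene.
final class ConcatShingles {

    // Returns every whitespace-separated token, each followed by the
    // space-free concatenation of that token with its right neighbour.
    static List<String> expand(String text) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out; // nothing to tokenize
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            out.add(tokens[i]);
            if (i + 1 < tokens.length) {
                // Pair of adjacent tokens joined without a space.
                out.add(tokens[i] + tokens[i + 1]);
            }
        }
        return out;
    }
}
```

For the input `freizeit schuh`, `expand` yields the two original tokens plus the joined form `freizeitschuh`, which is the variant a ShingleFilter would emit with a space in the middle.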

Re: spellcheck: buildOnOptimize?

2008-09-30 Thread Jason Rennie
On Fri, Sep 26, 2008 at 9:33 AM, Shalin Shekhar Mangar [EMAIL PROTECTED] wrote: Jason, can you please open a jira issue to add this feature? Done. https://issues.apache.org/jira/browse/SOLR-795 Jason

Re: Indexing Multiple Fields with the Same Name

2008-09-30 Thread KyleMorrison
That was indeed the error, I apologize for wasting your time. Thank you very much for the help. Kyle Shalin Shekhar Mangar wrote: Is that a mis-spelling? mulitValued=true On Thu, Sep 25, 2008 at 12:12 AM, KyleMorrison [EMAIL PROTECTED] wrote: I'm trying to index fields as such:

Re: Howto concatenate tokens at index time (without spaces)

2008-09-30 Thread Otis Gospodnetic
I haven't used the German analyzer (either Snowball or the one we have in Lucene's contrib), but have you checked if that does the trick of keeping words together? Or maybe the compound tokenizer has this option? (check Lucene JIRA, not sure now where the compound tokenizer went) Otis --

French synonyms Online synonyms

2008-09-30 Thread Pierre Auslaender
Hello, I'm sure these questions have been raised a million times, I'll try one more: 1/ Is there any general-purpose, free, French synonyms file out there? 2/ Is there a Solr or Lucene analyser class that could tap an on-line resource for synonyms at index-time? And by the same token,

Re: French synonyms Online synonyms

2008-09-30 Thread Otis Gospodnetic
Pierre, 1) I don't know, but a good place to check and see what previous answers to this question were is markmail.org 2) I don't think there is such a thing, but I also don't think there are sites that make this data freely available (answer to 1?) Otis -- Sematext -- http://sematext.com/

Re: French synonyms Online synonyms

2008-09-30 Thread Walter Underwood
Synonyms are domain-specific, so general-purpose lists are not very useful. Ultraseek shipped a British-American synonym list as an example, but even that wasn't very general. One of our customers was a chemical company and was very surprised when the search rocket fuel suggested arugula, even

Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread KyleMorrison
I apologize for spamming this mailing list with my problems, but I'm at my wits end. I'll get right to the point. I have an xml file which is ~1GB which I wish to index. If that is successful, I will move to a larger file of closer to 20GB. However, when I run my data-config(let's call it

Re: commit not fired

2008-09-30 Thread Chris Hostetter
: When I check my commit.log nothings is runned commit.log is only updated by the bin/commit script ... not by Solr itself. you'll see Solr log commits in whatever logs are kept by your servlet container. : My snapshooter too: but no log in snapshooter.log : !-- A postCommit event is

Re: French synonyms Online synonyms

2008-09-30 Thread Pierre Auslaender
True, synonyms can be grouped in cliques based on the strength of their resemblance given a specific context. But what I'm indexing is the text content of TV programs produced by a public television, so the context is very large and non-specific. What I want is to find automobile for car,

Calculated Unique Key Field

2008-09-30 Thread Jim Murphy
My unique key field is an MD5 hash of several other fields that represent identity of documents in my index. We've been calculating this externally and setting the key value in documents but have found recurring bugs as the number and variety of inserting consumers has grown... So I wanted to
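As a rough illustration of the key described above, here is a minimal sketch of deriving an MD5 hex key from a set of identity fields using only the JDK. The class name `DocumentKey`, the method `md5Key`, and the choice of a separator byte between fields are assumptions made for this example; the original post does not say which fields are hashed or how they are combined, and this is not Solr's own mechanism.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Solr.
final class DocumentKey {

    // Hashes the identity fields in order and returns a 32-character hex string.
    static String md5Key(String... identityFields) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (String field : identityFields) {
                md5.update(field.getBytes(StandardCharsets.UTF_8));
                // Assumed separator byte, so that ("ab", "c") and ("a", "bc")
                // do not collapse into the same key.
                md5.update((byte) 0x1F);
            }
            StringBuilder hex = new StringBuilder(32);
            for (byte b : md5.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Computing the key in one shared place like this, rather than in each inserting client, is the kind of centralization the post is after; where that code would hook into the update path is left open here.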

Discarding undefined fields in query

2008-09-30 Thread Jérôme Etévé
Hi All, I wrote a customized query parser which discards non-schema fields from the query (I'm using the schema field names from req.getSchema().getFields().keySet() ) . This parser works fine in unit tests. But still I have an error from the webapp when I try to query my schema with non

Re: Calculated Unique Key Field

2008-09-30 Thread Jim Murphy
It may not be all that relevant but our Update handler extends from DirectUpdateHandler2. -- View this message in context: http://www.nabble.com/Calculated-Unique-Key-Field-tp19747955p19748032.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Dismax , query phrases

2008-09-30 Thread Chris Hostetter
: That's why I was wondering how Dismax breaks it all apart. It makes sense...I : suppose what I'd like to have is a way to tell dismax which fields NOT to : tokenize the input for. For these fields, it would pass the full q instead of : each part of it. Does this make sense? would it be useful

Re: Applying Stop words for Field Type String

2008-09-30 Thread Chris Hostetter
: Question : Is it possible to do the same for String type or not, since the StrField doesn't support an analyzer like TextField does, but if you define string to be a TextField using KeywordTokenizer it will preserve the whole value as a single token and you can then use the

Re: Monitoring solr stats with munin?

2008-09-30 Thread Chris Hostetter
: has anyone had the need and maybe already written a munin plugin to graph : some informations from e.g. admin/stats.jsp ? : Something like that, though I haven't seen anything available publicly yet. Its Anything exposed via stats.jsp should also be available via JMX (if you enable JMX) ...

Re: Searching Question

2008-09-30 Thread Jake Conk
How would I write a custom Similarity factor that overrides the TF function? Is there some documentation on that somewhere? On Sat, Sep 27, 2008 at 5:14 AM, Grant Ingersoll [EMAIL PROTECTED] wrote: On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote: It might be easiest to store the thread ID

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread Shalin Shekhar Mangar
Hmm, strange. This is Solr 1.3.0, right? Do you have any transformers applied to these multi-valued fields? Do you have stream=true in the entity? On Tue, Sep 30, 2008 at 11:01 PM, KyleMorrison [EMAIL PROTECTED] wrote: I apologize for spamming this mailing list with my problems, but I'm at my

Re: Integrating external stemmer in Solr and pre-processing text

2008-09-30 Thread Jaco
Hi, The suggested approach with a TokenFilter extending the BufferedTokenStream class works fine, performance is OK - the external stemmer is now invoked only once for the complete search text. Also, from a functional point of view, the approach is useful, because it allows for other filtering

Re: Searching Question

2008-09-30 Thread Otis Gospodnetic
The easiest thing is to look at Lucene javadoc and look for Similarity and DefaultSimilarity classes. Then have a peek at Lucene contrib to get some other examples of custom Similarity. You'll just need to override one method, for example: -- Sematext -- http://sematext.com/ -- Lucene -

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread KyleMorrison
As a follow up: I continued tweaking the data-config.xml, and have been able to make the commit fail with as little as 3 fields in the sdc.xml, with only one multivalued field. Even more strange, some fields work and some do not. For instance, in my dc.xml: field column=Taxon

Re: Searching Question

2008-09-30 Thread Otis Gospodnetic
I hit ctrl-S by mistake. This is the method you are after: http://lucene.apache.org/java/2_3_2/api/core/org/apache/lucene/search/DefaultSimilarity.html#tf(float) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Otis Gospodnetic [EMAIL
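To make the pointer to `tf(float)` concrete, here is a small standalone sketch of what overriding it amounts to. Lucene's `DefaultSimilarity.tf(float freq)` returns the square root of the term frequency; the override below flattens that so repeated terms do not raise the score. The classes `SqrtTf` and `FlatTf` are hypothetical stand-ins written so the example compiles without Lucene on the classpath; in an actual setup the subclass would extend `org.apache.lucene.search.DefaultSimilarity` instead.

```java
// Stand-in for the default behaviour: tf grows with the square root of freq.
class SqrtTf {
    float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}

// Hypothetical custom variant: a term either occurs or it does not,
// so additional occurrences add nothing to the score.
class FlatTf extends SqrtTf {
    @Override
    float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}
```

Whether a flat tf is the right choice for the thread-scoring problem discussed earlier is a judgment call; the point is only that a single overridden method changes how term frequency feeds into the score.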

Re: Discarding undefined fields in query

2008-09-30 Thread Yonik Seeley
On Tue, Sep 30, 2008 at 2:42 PM, Jérôme Etévé [EMAIL PROTECTED] wrote: But still I have an error from the webapp when I try to query my schema with non existing fields in my query ( like foo:bar ). I'm wondering if the query q is parsed in a very simple way somewhere else (and independently

Re: Are facet searches slower on large indexes?

2008-09-30 Thread Chris Hostetter
The time factor has more to do with the number of distinct values in the field being faceted on than it does with the number of documents. With 1 million documents there are probably a lot more indexed terms in the contents field than there are with only 1000 documents. As an inverted index,

Re: Question about facet.prefix usage

2008-09-30 Thread Simon Hu
not really. facet.query filters the result set. Here we need to filter the facet counts by multiple facet prefixes. facet.query would work only if the faceted field is not a multi-value field. Erik Hatcher wrote: If I'm not mistaken, doesn't facet.query accomplish what you want?

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess it is a threading problem. I can give you a patch. you can raise a bug --Noble On Wed, Oct 1, 2008 at 2:11 AM, KyleMorrison [EMAIL PROTECTED] wrote: As a follow up: I continued tweaking the data-config.xml, and have been able to make the commit fail with as little as 3 fields in the

Re: Indexing Large Files with Large DataImport: Problems

2008-09-30 Thread Noble Paul നോബിള്‍ नोब्ळ्
this patch is created from 1.3 (may apply on trunk also) --Noble On Wed, Oct 1, 2008 at 9:56 AM, Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED] wrote: I guess it is a threading problem. I can give you a patch. you can raise a bug --Noble On Wed, Oct 1, 2008 at 2:11 AM, KyleMorrison [EMAIL

Re: How to select one entity at a time?

2008-09-30 Thread con
Hi guys, In the URL, http://localhost:8983/solr/select/?q= :bob&version=2.2&start=0&rows=10&indent=on&wt=json q=: applies to a field and not to an entity. So if I have 3 entities like: dataConfig dataSource **/ document entity