Re: solr tuple/tag store

2007-10-10 Thread Pieter Berkel
On 10/10/2007, Ryan McKinley [EMAIL PROTECTED] wrote: Without seeing the actual queries that are slow, it's difficult to determine what the problem is. Have you tried using EXPLAIN (http://dev.mysql.com/doc/refman/5.0/en/explain.html) to check if your query is using the table indexes

RE: problems with arabic search

2007-10-10 Thread Heba Farouk
I'm developing a java application using solr, this application is working with English search Yes, I have tried querying solr directly for Arabic and it's working Any suggestions ?? -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 10, 2007

Re: extending StandardRequestHandler gives ClassCastException

2007-10-10 Thread Britske
Thanks, that was the problem! I mistakenly thought the lib-folder containing the jetty.jar etc. was the folder to put the plugins into. After adding a lib-folder to solr-home everything is resolved. Geert-Jan hossman wrote: : SEVERE: java.lang.ClassCastException: :

Problems with mySolr Wiki

2007-10-10 Thread Christian Klinger
Hi Solr-Users, I am trying to follow the instructions [1] from the Solr wiki to build my custom Solr server. First I created the directory structure: mySolr --solr --conf --schema.xml --solrconfig.xml --solr.xml -- Where can I find this file? --build.xml -- copy paste from the wiki

unlockOnStartup does not work in embedded solr?

2007-10-10 Thread Alexey Shakov
Hi *, I use Solr as an embedded solution. I have set unlockOnStartup to true in my solrconfig.xml, but it seems that this option is ignored by embedded Solr. Any ideas? Thanks in advance, Alexey

Manage multiple indexes with Solr

2007-10-10 Thread ycrux
Hi guys ! Is it possible to configure Solr to manage different indexes depending on the added documents ? For example: * document 1, with uniq ID ui1 will be indexed in the indexA * document 2, with uniq ID ui2 will be indexed in the indexB * document 3, with uniq ID ui1 will be indexed in the

Re: Manage multiple indexes with Solr

2007-10-10 Thread Venkatraman S
i would be interested to know in both the cases : Case 1 : * document 1, with uniq ID ui1 will be indexed in the indexA * document 2, with uniq ID ui2 will be indexed in the indexB * document 3, with uniq ID ui3 will be indexed in the indexA Case 2 : * document 1, with uniq ID ui1 will be

Different search results for (german) singular/plural searches - looking for a solution

2007-10-10 Thread Martin Grotzke
Hello, with our application we have the issue that we get different results for singular and plural searches (German language). E.g. for hose we get 1.000 documents back, but for hosen we get 10.000 docs. The same applies to t-shirt or t-shirts, or e.g. hut and hüte - lots of cases :) This is

Re: Different search results for (german) singular/plural searches - looking for a solution

2007-10-10 Thread Thomas Traeger
in short: use stemming Try the SnowballPorterFilterFactory with German2 as language attribute first and use synonyms for combined words i.e. Herrenhose = Herren, Hose. By using stemming you will maybe have some interesting results, but it is much better living with them than having no or

RE: Solr and KStem

2007-10-10 Thread Wagner,Harry
Hi Piete, Good idea. Thanks. One other change that should probably be made is to change the package statement from org.oclc.solr.analysis to org.apache.solr.analysis. Thanks again. Cheers! harry -Original Message- From: Pieter Berkel [mailto:[EMAIL PROTECTED] Sent: Tuesday, October

Re: problems with arabic search

2007-10-10 Thread Grant Ingersoll
Can you give more detail about what you have done? What character encoding do you have your browser set to? In Firefox, do View - Character Encoding to see what it is set to when you are on the input page? Internet Explorer and other browsers have other options. Are you sending the

RE: problems with arabic search

2007-10-10 Thread Heba Farouk
In Firefox, character encoding is set to UTF-8. Yes, I'm sending the query directly to Solr using Apache HttpClient, and I set the HTTP request header content type to: Content-Type=text/html; charset=UTF-8 Any suggestions? Thanks in advance. -Original Message- From: Grant Ingersoll
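
One thing that may be worth checking in a setup like this is how the query string itself is encoded, since a GET query travels in the URL rather than in the request body. The following is a minimal Python sketch, not the poster's Java code; the host, port, path and the sample Arabic term are assumptions for illustration only.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical Solr location, used only for illustration.
BASE_URL = "http://localhost:8983/solr/select"


def build_select_url(base_url, query):
    """Return a select URL whose q parameter is percent-encoded as UTF-8."""
    # urlencode converts the text to UTF-8 bytes, then percent-escapes them.
    return base_url + "?" + urlencode({"q": query}, encoding="utf-8")


url = build_select_url(BASE_URL, "كتاب")
print(url)
```

Whether this matters here depends on details the thread does not show; the servlet container also has to decode the URL as UTF-8 for the round trip to work.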

Re: problems with arabic search

2007-10-10 Thread Grant Ingersoll
Hmmm, by the looks of your query, it doesn't seem like it is a Solr query, but I admit I don't have all the parameters memorized. What request handler, etc. are you using? Have you tried debugging? And you say you have tried a query with the Solr Admin query page, right? And that works?

Re: Availability Issues

2007-10-10 Thread Otis Gospodnetic
Hi, - Original Message From: David Whalen [EMAIL PROTECTED] On that note -- I've read that Jetty isn't the best servlet container to use in these situations, is that your experience? OG: In which situations? Jetty is great, actually! (the pretty high traffic site in my sig runs

getting number of stored documents via rest api

2007-10-10 Thread Stefan Rinner
Hi for some tests I need to know how many documents are stored in the index - is there a fast easy way to retrieve this number (instead of searching for *:* and counting the results)? I already took a look at the stats.jsp code - but there the number of documents is retrieved via an api

Re: Problems with mySolr Wiki

2007-10-10 Thread Chris Hostetter
i'm not very familiar with that wiki, but note the line in the example ant script... <!-- SOLR_HOME must be set as an environment variable --> ... : --solr.xml -- Where i can find this file? according to the wiki page... First we will setup a basic directory structure (assuming we only

Re: getting number of stored documents via rest api

2007-10-10 Thread Chris Hostetter
: there a fast easy way to retrieve this number (instead of searching for : *:* and counting the results)? NOTE: you don't have to count the results to know the total number of docs matching any query ... just use the numFound attribute of the results/ block. : I already took a look at the
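
As a rough illustration of the numFound point, the sketch below reads the attribute from a Solr XML response in Python. The sample response is hand-written for this example (the count 1234 is made up), and the helper name num_found is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample shaped like a Solr XML response; values are invented.
SAMPLE_RESPONSE = """
<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="1234" start="0"/>
</response>
"""


def num_found(xml_text):
    """Return the total number of matching docs reported by Solr."""
    root = ET.fromstring(xml_text.strip())
    result = root.find("result")
    if result is None:
        raise ValueError("no <result> element in response")
    # numFound is the total match count, independent of how many rows
    # were actually returned in the response.
    return int(result.get("numFound"))


print(num_found(SAMPLE_RESPONSE))
```

Because numFound does not depend on the number of rows returned, a request with rows=0 is enough to get the count.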

Re: getting number of stored documents via rest api

2007-10-10 Thread Chris Hostetter
: I think search for *:* is the optimal code to do it. I don't think you can : do anything faster. FYI: getting the data from the xml returned by stats.jsp is definitely faster in the case where you really want all docs. if you want the total number from some other query however, don't count

RE: Facets and running out of Heap Space

2007-10-10 Thread David Whalen
It looks now like I can't use facets the way I was hoping to because the memory requirements are impractical. So, as an alternative I was thinking I could get counts by doing rows=0 and using filter queries. Is there a reason to think that this might perform better? Or, am I simply moving the
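
The rows=0 plus filter query idea could look roughly like the Python sketch below, which only builds the request URLs (one per value to count). The base URL, the field name media_type and the sample values are assumptions for illustration, and the sketch says nothing about whether this performs better than faceting.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical Solr location, used only for illustration.
BASE_URL = "http://localhost:8983/solr/select"


def count_url(base_url, field, value, query="*:*"):
    """Build a URL that asks only for the match count of field:value."""
    # Escape backslashes and quotes so the value stays one quoted phrase.
    safe_value = value.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", query),
        ("rows", "0"),  # no documents returned, only the count
        ("fq", f'{field}:"{safe_value}"'),
    ]
    return base_url + "?" + urlencode(params)


# One request per value; each response's numFound is the count.
urls = [count_url(BASE_URL, "media_type", v) for v in ("blog", "news")]
for u in urls:
    print(u)
```

This costs one request per value, so it is only practical when the set of values is small and known in advance.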

Re: start tag not allowed in epilog

2007-10-10 Thread Chris Hostetter
: Does anyone know how to correct this? Is it not possible to have multiple : different top-level tags in the same update xml file? It seems to me like it : should work, but perhaps there's something inherently bad about this from : the XMLStreamReader's point of view. it's inherently bad from

Re: start tag not allowed in epilog

2007-10-10 Thread BrendanD
We simply process a queue of updates from a database table. Some of the updates are deletes, some are adds. Sometimes you can have many deletes in a row, sometimes many adds in a row, and sometimes a mixture of deletes and adds. We're trying to batch our updates and were hoping to avoid having to

Re: WebException (ServerProtocolViolation) with SolrSharp

2007-10-10 Thread Jeff Rodenburg
Hi Felipe - The issue you're encountering is a problem with the data format being passed to the solr server. If you follow the stack trace that you posted, you'll notice that the solr field is looking for a value that's a float, but the passed value is 1,234. I'm guessing this is caused by one

quick allowDups questions

2007-10-10 Thread Charlie Jackson
Normally this is the type of thing I'd just scour through the online docs or the source code for, but I'm under the gun a bit. Anyway, I need to update some docs in my index because my client program wasn't accurately putting these docs in (values for one of the fields was missing). I'm

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 12:19 PM, David Whalen wrote: It looks now like I can't use facets the way I was hoping to because the memory requirements are impractical. I can't remember if this has been mentioned, but upping the HashDocSet size is one way to reduce memory consumption. Whether this

Re: quick allowDups questions

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 1:11 PM, Charlie Jackson wrote: Anyway, I need to update some docs in my index because my client program wasn't accurately putting these docs in (values for one of the fields was missing). I'm hoping I won't have to write additional code to go through and delete each existing

Re: start tag not allowed in epilog

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 12:49 PM, BrendanD wrote: We simply process a queue of updates from a database table. Some of the updates are deletes, some are adds. Sometimes you can have many deletes in a row, sometimes many adds in a row, and sometimes a mixture of deletes and adds. We're trying to

RE: quick allowDups questions

2007-10-10 Thread Charlie Jackson
Thanks for the response, Mike. A quick test using the example app confirms your statement. As for Solrj, you're probably right, but I'm not going to take any chances for the time being. The server.add method has an optional Boolean flag named overwrite that defaults to true. Without knowing for

RE: Facets and running out of Heap Space

2007-10-10 Thread David Whalen
According to Yonik I can't use minDf because I'm faceting on a string field. I'm thinking of changing it to a tokenized type so that I can utilize this setting, but then I'll have to rebuild my entire index. Unless there's some way around that? -Original Message- From: Mike

Re: Different search results for (german) singular/plural searches - looking for a solution

2007-10-10 Thread Daniel Naber
On Wednesday 10 October 2007 12:00, Martin Grotzke wrote: Basically I see two options: stemming and the usage of synonyms. Are there others? A large list of German words and their forms is available from a Windows software called Morphy

Re: start tag not allowed in epilog

2007-10-10 Thread BrendanD
I've re-written the code to generate separate files. One for adds and one for deletes. And this is working well for us now. Thanks. Mike Klaas wrote: This would be very complicated from a standpoint of returning errors to the client. Keep in mind the deletes can never be batched,
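
A minimal Python sketch of the "separate adds from deletes" approach described above, assuming the update queue is a list of (operation, payload) pairs. That queue shape and the function name are hypothetical; the poster's actual code is not shown in the thread.

```python
from xml.sax.saxutils import escape, quoteattr


def build_update_messages(updates):
    """Split a mixed queue into one <add> message and per-id <delete> messages.

    updates: iterable of ("add", {field: value}) or ("delete", doc_id).
    Returns (add_message_or_None, list_of_delete_messages).
    """
    docs, deletes = [], []
    for op, payload in updates:
        if op == "add":
            fields = "".join(
                f"<field name={quoteattr(name)}>{escape(str(value))}</field>"
                for name, value in payload.items()
            )
            docs.append(f"<doc>{fields}</doc>")
        elif op == "delete":
            # One message per id, so each document has a single top-level tag.
            deletes.append(f"<delete><id>{escape(str(payload))}</id></delete>")
        else:
            raise ValueError(f"unknown operation: {op!r}")
    add_message = "<add>" + "".join(docs) + "</add>" if docs else None
    return add_message, deletes


queue = [
    ("add", {"id": "1", "title": "a & b"}),
    ("delete", "7"),
    ("add", {"id": "2"}),
]
add_msg, delete_msgs = build_update_messages(queue)
print(add_msg)
print(delete_msgs)
```

Splitting the queue this way loses the relative order of adds and deletes, so a queue that adds and deletes the same id would need extra handling.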

Internal Server Error and waitSearcher=false for commit/optimize

2007-10-10 Thread Jason Rennie
Hello, We're using solr 1.2 and a nightly build of the solrj client code. We very occasionally see things like this: org.apache.solr.client.solrj.SolrServerException: Error executing query at org.apache.solr.client.solrj.request.QueryRequest.process( QueryRequest.java:86) at

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 2:40 PM, David Whalen wrote: According to Yonik I can't use minDf because I'm faceting on a string field. I'm thinking of changing it to a tokenized type so that I can utilize this setting, but then I'll have to rebuild my entire index. Unless there's some way around that?

Re: quick allowDups questions

2007-10-10 Thread Ryan McKinley
the default solrj implementation should do what you need. As for Solrj, you're probably right, but I'm not going to take any chances for the time being. The server.add method has an optional Boolean flag named overwrite that defaults to true. Without knowing for sure what it does, I'm not

RE: Facets and running out of Heap Space

2007-10-10 Thread David Whalen
I'll see what I can do about that. Truthfully, the most important facet we need is the one on media_type, which has only 4 unique values. The second most important one to us is location, which has about 30 unique values. So, it would seem like we actually need a counter-intuitive solution.

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 3:46 PM, David Whalen wrote: I'll see what I can do about that. Truthfully, the most important facet we need is the one on media_type, which has only 4 unique values. The second most important one to us is location, which has about 30 unique values. So, it would seem like we

Re: [ADMIN] - Spam problems?

2007-10-10 Thread Chris Hostetter
: Around Sept. 20 I started getting Japanese spam to this account. This is : a special account I only use for the Solr and Lucene user mailing : lists. Did anybody else get these, starting around 9/20? Note that many mailing list archives leave the sender emails in plain text (which results in

Syntax for newSearcher query

2007-10-10 Thread BrendanD
Hi, The examples that I've found in the solrconfig.xml file and on this site are fairly basic for pre-warming specific queries. I have some rather complex looking queries that I'm not quite sure how to specify in my solrconfig.xml file in the newSearcher section. Here's an example of 3 queries

Re: Syntax for newSearcher query

2007-10-10 Thread Chris Hostetter
: looking queries that I'm not quite sure how to specify in my solrconfig.xml : file in the newSearcher section. :

Re: Syntax for newSearcher query

2007-10-10 Thread BrendanD
Awesome! Thanks! hossman wrote: : looking queries that I'm not quite sure how to specify in my solrconfig.xml : file in the newSearcher section. :

Re: Facets and running out of Heap Space

2007-10-10 Thread Yonik Seeley
On 10/10/07, Mike Klaas [EMAIL PROTECTED] wrote: Have you tried setting multivalued=true without reindexing? I'm not sure, but I think it will work. Yes, that will work fine. One thing that will change is the response format for stored fields <arr name="foo"><str>val1</str></arr> instead of <str