RE: Problem with pdf, upgrading Cell

2010-05-03 Thread Sandhya Agarwal
Hello, Please let me know if anybody figured out a way out of this issue. Thanks, Sandhya -Original Message- From: Praveen Agrawal [mailto:pkal...@gmail.com] Sent: Friday, April 30, 2010 11:14 PM To: solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell Grant, You

Re: phrase search - problem

2010-05-03 Thread Ahmet Arslan
I wanted to do phrase search. Which analyzers are best suited for phrase search? I tried textgen, but it did not yield the expected results. I wanted to index "my dear friend". If I search for "dear friend", I should get the result, and if I search for "friend dear" I should

Re: synonym filter problem for string or phrase

2010-05-03 Thread Marco Martinez
Hi Ranveer, I don't see any stemming analyzer in your configuration of the field 'text_sync'. Also, you have <filter class="solr.TrimFilterFactory"/> at query time and not at index time; maybe that is your problem. Regards, Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de

Re: Skipping duplicates in DataImportHandler based on uniqueKey

2010-05-03 Thread Marc Sturlese
You can use deduplication to do that. Create the signature based on the unique field or any field you want. -- View this message in context: http://lucene.472066.n3.nabble.com/Skipping-duplicates-in-DataImportHandler-based-on-uniqueKey-tp771559p772768.html Sent from the Solr - User mailing list
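The idea behind signature-based deduplication can be sketched in plain Python. This is only an illustration of the principle, not Solr's implementation (in Solr, deduplication is configured as an update processor in solrconfig.xml); the function names and the sample field names are hypothetical.

```python
import hashlib


def signature(doc, fields):
    """Return an MD5 hex digest built from the chosen fields of a document."""
    h = hashlib.md5()
    for name in fields:
        # A separator byte keeps ("ab", "c") distinct from ("a", "bc").
        h.update(name.encode("utf-8"))
        h.update(b"\x00")
        h.update(str(doc.get(name, "")).encode("utf-8"))
        h.update(b"\x00")
    return h.hexdigest()


def skip_duplicates(docs, fields):
    """Keep only the first document seen for each signature."""
    seen = set()
    unique = []
    for doc in docs:
        sig = signature(doc, fields)
        if sig in seen:
            continue  # same signature as an earlier document: skip it
        seen.add(sig)
        unique.append(doc)
    return unique
```

Calling skip_duplicates(docs, ["id"]) keeps the first document per id; passing other field names bases the signature on those fields instead.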

RE: Problem with pdf, upgrading Cell

2010-05-03 Thread Marc Ghorayeb
Hi, Grant, i confirm what Praveen has said, any PDF i try does not work with the new Tika and SVN versions. :( Marc From: sagar...@opentext.com To: solr-user@lucene.apache.org Date: Mon, 3 May 2010 13:05:24 +0530 Subject: RE: Problem with pdf, upgrading Cell Hello, Please let me know

RE: run on reboot on windows

2010-05-03 Thread Frederico Azeiteiro
Hi Ahmed, I need to achieve that also. Did you manage to install it as a service and start Solr with Jetty? After installing and starting Jetty as a service, how do you start Solr? Thanks, Frederico -Original Message- From: S Ahmed [mailto:sahmed1...@gmail.com] Sent: Monday, 3 May

Re: synonym filter problem for string or phrase

2010-05-03 Thread MitchK
Just to be clear on terminology: you mean field, not fieldType. A fieldType is the definition of tokenizers, filters, etc. You apply a fieldType to a field. And you query against a field, not against a whole fieldType. :-) Kind regards - Mitch Marco Martinez-2 wrote: Hi Ranveer, If you don't

Retrieving indexed field data

2010-05-03 Thread Licinio Fernández Maurelo
Hi folks, I'm wondering if there is a way to retrieve the indexed data. The reason is that I'm working on a solrj-based tool that copies one index's data into another (allowing you to make changes to docs). I know I can't make any change to an indexed field; I just want to copy the chunk of

Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-03 Thread Markus Fischer
Hi, we recently began having trouble with our Solr 1.4 instance. We've about 850k documents in the index which is about 1.2GB in size; the JVM which runs tomcat/solr (no other apps are deployed) has been given 2GB. We've a forum and run a process every minute which indexes the new messages.

Re: OutOfMemoryError when using query with sort

2010-05-03 Thread Erick Erickson
How many unique terms are in your sort field? On Sun, May 2, 2010 at 11:48 PM, Hamid Vahedi hvb...@yahoo.com wrote: I installed 64-bit Windows and my problem was solved. Also, I am using shard mode (100 M docs per machine with one Solr instance). Is there a better solution? Because I insert at least 5M doc

SpellChecking

2010-05-03 Thread Jan Kammer
Hi there, I want to enable spellchecking, but I have many fields. I tried using copyField to copy everything with * into one field, but that didn't work. My next try was to copy some fields, each specified by name, into one field named spell, but that worked only for 2 or 3 fields, not for 10 or

Re: Skipping duplicates in DataImportHandler based on uniqueKey

2010-05-03 Thread Andrew Clegg
Marc Sturlese wrote: You can use deduplication to do that. Create the signature based on the unique field or any field you want. Cool, thanks, I hadn't thought of that. -- View this message in context:

Re: run on reboot on windows

2010-05-03 Thread Dave Searle
I don't think jetty can be installed as a service. You'd need to create a bat file and put that in the win startup registry. Sent from my iPhone On 3 May 2010, at 11:26, Frederico Azeiteiro frederico.azeite...@cision.com wrote: Hi Ahmed, I need to achieve that also. Do you manage to

Re: SpellChecking

2010-05-03 Thread Erick Erickson
It would help a lot to see your actual config file, and if you provided a bit more detail about what failure looks like. Best, Erick On Mon, May 3, 2010 at 9:43 AM, Jan Kammer jan.kam...@mni.fh-giessen.de wrote: Hi there, I want to enable spellchecking, but I have many fields. I tried

Re: SpellChecking

2010-05-03 Thread Jan Kammer
Hi, if I define one of my normal fields from schema.xml in solrconfig.xml for spellchecking, all works fine: <lst name="spellchecker"> ... <str name="field">content</str> ... </lst> But I want to use spellcheck for many more fields. So I tried to define in schema.xml a copyField like this: copyField

RE: SpellChecking

2010-05-03 Thread Villemos, Gert
We are using copy fields for 40+ fields to do spelling, and it works fine. Are you sure that you actually build the spell index before you try to do spelling? You need to either configure Solr to build the spell index on commit, or manually issue a spell index build request. Regards, Gert.
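A manual spell index build is usually triggered by adding spellcheck.build=true to a request that has the spellcheck component enabled. The sketch below only assembles such a URL; the base URL and the /spell handler name are assumptions (they match Solr's example setup, but your handler may be named differently).

```python
from urllib.parse import urlencode


def spellcheck_build_url(base, handler="/spell", query="*:*"):
    """Build a request URL that asks Solr to (re)build the spell index.

    `base` and `handler` are assumptions about the local installation.
    """
    params = {
        "q": query,
        "spellcheck": "true",        # enable the spellcheck component
        "spellcheck.build": "true",  # build the spell index on this request
    }
    return f"{base}{handler}?{urlencode(params)}"


build_url = spellcheck_build_url("http://localhost:8983/solr")
```

Requesting build_url once after indexing should be enough; the parameter does not need to be sent with every query.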

Re: SpellChecking

2010-05-03 Thread Michael Kuhlmann
On 03.05.2010 16:43, Jan Kammer wrote: Hi, It worked fine with a normal field. There must be something wrong with copyField, or why does DataImportHandler add/update no more documents? Did you define your destination field as multivalued? -Michael

Auto-commit does not work

2010-05-03 Thread Andreas Jung
Running Solr 1.4 with <updateHandler class="solr.DirectUpdateHandler2"> ? <!-- A prefix of "solr." for class names is an alias that causes solr to search appropriate packages, including

Re: Auto-commit does not work

2010-05-03 Thread Andreas Jung
Ahmet Arslan wrote: I inserted 10k documents through a Python script (w/ solrpy bindings) without explicit commit. However I do not see that the numDocs increased meanwhile...is there any way to hunt this down? What does

cores and SWAP

2010-05-03 Thread Tim Heckman
Hi, I'm trying to figure out whether I need to reload a core (or both cores?) after performing a swap. When I perform a swap in my sandbox (non-production) environment, I am seeing that one of the cores needs to be reloaded following a swap and the other does not, but I haven't been able to find

Re: Auto-commit does not work

2010-05-03 Thread Ahmet Arslan
commits : 135 autocommits : 0 optimizes : 0 rollbacks : 0 expungeDeletes : 0 docsPending : 8842 adds : 8842 deletesById : 0 deletesByQuery : 0 errors : 0 cumulative_adds : 8842 cumulative_deletesById : 20390 cumulative_deletesByQuery : 0 cumulative_errors : 0 I just realized that

Re: Auto-commit does not work

2010-05-03 Thread Andreas Jung
Ahmet Arslan wrote: I just realized that there is a typo in your autoCommit definition. The letter C should be capital. <autoCommit> <maxDocs>1</maxDocs> <maxTime>1000</maxTime> </autoCommit> right - and Solr should not swallow
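Since element names in the config are case-sensitive, a miscased element such as autocommit is silently treated as unknown. A small check like the following could catch this before deploying a config. It is a sketch, not part of Solr; the KNOWN set is an assumed, deliberately tiny list of element names, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed subset of element names to check; extend as needed.
KNOWN = {"autoCommit", "maxDocs", "maxTime"}


def miscased_elements(xml_text, known=KNOWN):
    """Return (found, expected) pairs for elements that differ from a known name only by case."""
    lower = {k.lower(): k for k in known}
    found = []
    for el in ET.fromstring(xml_text).iter():
        expected = lower.get(el.tag.lower())
        if expected is not None and el.tag != expected:
            found.append((el.tag, expected))
    return found
```

Run against the relevant fragment of solrconfig.xml, an empty result means none of the listed names is miscased; it says nothing about names outside the list.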

Re: Auto-commit does not work

2010-05-03 Thread Chris Hostetter
: right - and Solr should not swallow errors in the configuration :-) If you have an error in a *known* config declaration, solr will complain about it -- but solr can't complain just because you declare extra stuff in your config files that it doesn't know anything about -- some other plugin

Re: Auto-commit does not work

2010-05-03 Thread Andreas Jung
Chris Hostetter wrote: : right - and Solr should not swallow errors in the configuration :-) If you have an error in a *known* config declaration, solr will complain about it -- but solr can't complain just because you declare extra stuff in

Re: Overlapping onDeckSearchers=2

2010-05-03 Thread Shalin Shekhar Mangar
On Mon, May 3, 2010 at 11:24 AM, revas revas...@gmail.com wrote: Hello, We have a server with many Solr instances running (around 40-50). We commit documents, sometimes one and sometimes around 200 at a time, to only one instance at a time. When I run 2-3 commits

Re: Overlapping onDeckSearchers=2

2010-05-03 Thread Chris Hostetter
: When I run 2-3 commits in parallel to different instances or the same instance I get : this error: : PERFORMANCE WARNING: Overlapping onDeckSearchers=2 : : What is the best approach to solve this

Re: cores and SWAP

2010-05-03 Thread Tim Heckman
I have 2 cores: core1 and core2. Load the same data set into each and commit. Verify that searches return the same for each core. Delete a document (call it docA) from core2 but not from core1. Commit and verify search results (docA disappears from core2's search results. core1 continues to
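The swap and reload steps in a test like this go through the CoreAdmin handler. The sketch below only builds the request URLs for the SWAP and RELOAD actions; the base URL is an assumption about the local setup, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def core_admin_url(base, action, **params):
    """Build a CoreAdmin request URL, e.g. for the SWAP or RELOAD action."""
    query = urlencode({"action": action, **params})
    return f"{base}/admin/cores?{query}"


BASE = "http://localhost:8983/solr"  # assumed location of the Solr webapp

swap_url = core_admin_url(BASE, "SWAP", core="core1", other="core2")
reload_url = core_admin_url(BASE, "RELOAD", core="core1")
```

Issuing swap_url and then querying both cores, with and without a request to reload_url in between, makes it easy to repeat the comparison described above.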

Re: Score cutoff

2010-05-03 Thread Satish Kumar
Hi, Can someone give clues on how to implement this feature? This is a very important requirement for us, so any help is greatly appreciated. thanks! On Tue, Apr 27, 2010 at 5:54 PM, Satish Kumar satish.kumar.just.d...@gmail.com wrote: Hi, For some of our queries, the top xx (five or so)

Re: Problem with pdf, upgrading Cell

2010-05-03 Thread Grant Ingersoll
I'm investigating. On May 3, 2010, at 5:17 AM, Marc Ghorayeb wrote: Hi, Grant, i confirm what Praveen has said, any PDF i try does not work with the new Tika and SVN versions. :( Marc From: sagar...@opentext.com To: solr-user@lucene.apache.org Date: Mon, 3 May 2010 13:05:24 +0530

Re: Problem with pdf, upgrading Cell

2010-05-03 Thread Grant Ingersoll
I've opened https://issues.apache.org/jira/browse/SOLR-1902 to track this. It is indeed a bug somewhere (still investigating). It seems that Tika is now picking an EmptyParser implementation when trying to determine which parser to use, despite the fact that it properly identifies the MIME

Re: Problem with pdf, upgrading Cell

2010-05-03 Thread Grant Ingersoll
Little more info... Seems to be a classloading issue. The tests pass, but they aren't loading the Tika libraries via the Solr ResourceLoader, whereas the example is. Marc, one thing to try is to unjar the Solr WAR file and put the Tika libs in there, as I bet it will then work. Note,

Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-03 Thread Mark Miller
On 5/3/10 9:06 AM, Markus Fischer wrote: Hi, we recently began having trouble with our Solr 1.4 instance. We've about 850k documents in the index which is about 1.2GB in size; the JVM which runs tomcat/solr (no other apps are deployed) has been given 2GB. We've a forum and run a process every

Re: Auto-commit does not work

2010-05-03 Thread Darren Govoni
I think his point was, _what_ determines if it's a misconfiguration? It can't be Solr because, like he said, a plugin may require it. If there is no such plugin, then what should handle it properly? Nothing; ergo, it's ignored. On Mon, 2010-05-03 at 19:34 +0200, Andreas Jung wrote:

Re: Solr commit issue

2010-05-03 Thread Lance Norskog
This could be caused by HTTP caching. Solr's example solrconfig.xml comes with HTTP caching turned on, and this causes lots of beginners to have problems. The code to turn it off is commented in solrconfig.xml. Notice that the default is to have caching on, so to turn it off you have to have the

Facets vs TermV's

2010-05-03 Thread Darren Govoni
Hi, I spent a lot of time on the Wiki and am working with facets and tv's, but I'm still confused about something. Basically, what is the difference between issuing a facet field query that returns facets with counts, and a query with term vectors that also returns document frequency counts for

Re: run on reboot on windows

2010-05-03 Thread Lance Norskog
There are a few programs that wrap any java app as a service. http://en.wikipedia.org/wiki/Service_wrapper On Mon, May 3, 2010 at 6:58 AM, Dave Searle dave.sea...@magicalia.com wrote: I don't think jetty can be installed as a service. You'd need to create a bat file and put that in the win

Re: Solr Dismax query - prefix matching

2010-05-03 Thread Chris Hostetter
: example: If I have a field called 'booktitle' with the actual values as : 'Code Complete', 'Coding standard 101', then I'd like to search for the : query string 'cod' and have the dismax match against both the book : titles since 'cod' is a prefix match for 'code' and 'coding'. it doesn't
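The matching behavior asked for in the question can be stated precisely with a few lines of Python. This only illustrates the desired result (case-insensitive prefix match on whitespace-separated words); it says nothing about how dismax analyzes or matches terms, and the function name is hypothetical.

```python
def prefix_matches(prefix, titles):
    """Return the titles in which at least one word starts with `prefix`, ignoring case."""
    p = prefix.lower()
    return [
        title
        for title in titles
        if any(word.lower().startswith(p) for word in title.split())
    ]


titles = ["Code Complete", "Coding standard 101", "Refactoring"]
matches = prefix_matches("cod", titles)
```

Here matches contains the first two titles, since 'cod' is a prefix of both 'code' and 'coding'.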