Re: Solr memory use, jmap and TermInfos/tii

2010-09-12 Thread Simon Willnauer
On Sun, Sep 12, 2010 at 1:51 AM, Michael McCandless luc...@mikemccandless.com wrote: On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom tburt...@umich.edu wrote:  Is there an example of how to set up the divisor parameter in solrconfig.xml somewhere? Alas I don't know how to configure terms

RE: multivalued fields in result

2010-09-12 Thread Jason Chaffee
But it doesn't seem to be returning multivalued fields that are stored. It is returning all of the single-valued fields though. -Original Message- From: Markus Jelsma [mailto:markus.jel...@buyways.nl] Sent: Sat 9/11/2010 4:19 AM To: solr-user@lucene.apache.org Subject: RE: multivalued

Re: Solr memory use, jmap and TermInfos/tii

2010-09-12 Thread Michael McCandless
One thing that the Codec API makes possible (in theory, anyway)... is a variable-gap terms index. I.e., Lucene today makes an indexed term at regular (every N -- 128 in 3.x, 32 in 4.0) intervals. But this is rather silly. Imagine the terms you are going through are all singletons (happen only in

RE: Delta Import with something other than Date

2010-09-12 Thread Ephraim Ofir
Alternatively, you could use the deltaQuery to retrieve the last indexed id from the DB (you'd have to save it there on your previous import). Your entity would look something like: entity name=my_entity deltaQuery=SELECT MAX(id) AS last_id_value FROM last_id_table

Re: Solr memory use, jmap and TermInfos/tii

2010-09-12 Thread Robert Muir
On Sat, Sep 11, 2010 at 7:51 PM, Michael McCandless luc...@mikemccandless.com wrote: On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom tburt...@umich.edu wrote: Is there an example of how to set up the divisor parameter in solrconfig.xml somewhere? Alas I don't know how to configure

Invalid version or the data in not in 'javabin' format

2010-09-12 Thread h00kpub...@gmail.com
hi... currently i am integrating nutch (release 1.2) into solr (trunk). when i index into solr with nutch i get the exception: java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format at

Re: Invalid version or the data in not in 'javabin' format

2010-09-12 Thread Peter Sturge
Could be a solrj .jar version compat issue. Check that the client and server's solrj version jars match up. Peter On Sun, Sep 12, 2010 at 1:16 PM, h00kpub...@gmail.com h00kpub...@googlemail.com wrote:  hi... currently i am integrating nutch (release 1.2) into solr (trunk). if i indexing to

Re: Solr memory use, jmap and TermInfos/tii

2010-09-12 Thread Simon Willnauer
On Sun, Sep 12, 2010 at 12:42 PM, Robert Muir rcm...@gmail.com wrote: On Sat, Sep 11, 2010 at 7:51 PM, Michael McCandless luc...@mikemccandless.com wrote: On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom tburt...@umich.edu wrote:  Is there an example of how to set up the divisor

Re: mm=0?

2010-09-12 Thread Erick Erickson
Could you explain the use-case a bit? Because the very first response I would have is "why in the world did product management make this a requirement?" and try to get the requirement changed. As a user, I'm having a hard time imagining being well served by getting a document in response to a

Re: Invalid version or the data in not in 'javabin' format

2010-09-12 Thread h00kpub...@gmail.com
that was the solution!! i packaged the current lucene and solrj repositories (dev 4.0) and copied the necessary jars to nutch-libs (after removing the old ones), built nutch and ran it - it works!! thank you peter :) marcel On 09/12/2010 03:40 PM, Peter Sturge wrote: Could be a solrj .jar

Re: multivalued fields in result

2010-09-12 Thread Erick Erickson
Can we see your schema file? Because it sounds like you didn't really declare your field multivalued=true on the face of things. But if it is multivalued AND you changed it, did you reindex after you changed the schema? Best Erick On Sun, Sep 12, 2010 at 4:21 AM, Jason Chaffee

Re: Solr memory use, jmap and TermInfos/tii

2010-09-12 Thread Robert Muir
On Sun, Sep 12, 2010 at 9:57 AM, Simon Willnauer simon.willna...@googlemail.com wrote: To change the divisor in your solrconfig, for example to 4, it looks like you need to do this. indexReaderFactory name=IndexReaderFactory class=org.apache.solr.core.StandardIndexReaderFactory
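
For reference, a complete version of the element quoted above might look like the following in solrconfig.xml. This is only a sketch, not the text of the original message (which is cut off here): the int parameter name setTermIndexDivisor is an assumption based on how the Solr 1.4-era IndexReaderFactory read its init arguments, so check it against the source of the version you run.

```xml
<!-- solrconfig.xml: sketch only; the parameter name is an assumption, verify it for your Solr version -->
<indexReaderFactory name="IndexReaderFactory"
                    class="org.apache.solr.core.StandardIndexReaderFactory">
  <!-- Divisor of 4: load only every 4th indexed term into RAM -->
  <int name="setTermIndexDivisor">4</int>
</indexReaderFactory>
```

A larger divisor shrinks the in-memory terms index roughly in proportion, at the cost of more scanning per term lookup.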

Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Sturge
Hi, Below are some notes regarding Solr cache tuning that should prove useful for anyone who uses Solr with frequent commits (e.g. 5min). Environment: Solr 1.4.1 or branch_3x trunk. Note the 4.x trunk has lots of neat new features, so the notes here are likely less relevant to the 4.x

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Erick Erickson
Peter: This kind of information is extremely useful to document, thanks! Do you have the time/energy to put it up on the Wiki? Anyone can edit it by creating a logon. If you don't, would it be OK if someone else did it (with attribution, of course)? I guess that by bringing it up I'm volunteering

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Dennis Gearon
Wow! Thanks for that. This email is DEFINITELY being filed. Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Sun, 9/12/10, Peter Sturge peter.stu...@gmail.com wrote:

Re: Solr and jvm Garbage Collection tuning

2010-09-12 Thread Grant Ingersoll
On Sep 10, 2010, at 7:01 PM, Burton-West, Tom wrote: We have noticed that when the first query hits Solr after starting it up, memory use increases significantly, from about 1GB to about 16GB, and then as queries are received it goes up to about 19GB at which point there is a Full Garbage

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Karich
Peter, thanks a lot for your in-depth explanations! Your findings will definitely be helpful for my next performance improvement tests :-) Two questions: 1. How would I do that: or a local read-only instance that reads the same core as the indexing instance (for the latter, you'll need

Saravanan Chinnadurai/Actionimages is out of the office.

2010-09-12 Thread Saravanan . Chinnadurai
I will be out of the office starting 12/09/2010 and will not return until 14/09/2010. Please email itsta...@actionimages.com about any urgent issues.

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Jason Rutherglen
Peter, Are you using per-segment faceting, eg, SOLR-1617? That could help your situation. On Sun, Sep 12, 2010 at 12:26 PM, Peter Sturge peter.stu...@gmail.com wrote: Hi, Below are some notes regarding Solr cache tuning that should prove useful for anyone who uses Solr with frequent commits

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Sturge
Hi Jason, I've tried some limited testing with the 4.x trunk using fcs, and I must say, I really like the idea of per-segment faceting. I was hoping to see it in 3.x, but I don't see this option in the branch_3x trunk. Is your SOLR-1606 patch referred to in SOLR-1617 the one to use with 3.1?

Re: No more trunk support for 2.9 indexes

2010-09-12 Thread Ryan McKinley
I suppose an index 'remaker' might be something like a DIH reader for a Solr index - streams everything out of the existing index, writing it into the new one? This works fine if all fields are stored (and copy field does not go to a stored field), otherwise you would need/want to start with

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Lance Norskog
Bravo! Other tricks: here is a policy for deciding when to merge segments that attempts to balance merging with performance. It was contributed by LinkedIn; they also run indexing and search in the same instance (not Solr, a different Lucene app).

Re: multivalued fields in result

2010-09-12 Thread Lance Norskog
Also, the 'v' is capitalized: multiValued. (This is one reason why posting your schema helps.) Erick Erickson wrote: Can we see your schema file? Because it sounds like you didn't really declare your field multivalued=true on the face of things. But if it is multivalued AND you changed
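
As a concrete illustration of the capitalization point, a stored multi-valued field declaration in schema.xml would look roughly like this. The field name and type below are hypothetical, chosen only for the example; the attribute spelling multiValued is the part that matters.

```xml
<!-- schema.xml: hypothetical field name and type, for illustration only -->
<!-- Note the capital V in multiValued -->
<field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
```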

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Chris Haggstrom
Thanks, Peter. This is really great info. One setting I've found to be very useful for the problem of overlapping onDeckSearchers is to reduce the value of maxWarmingSearchers in solrconfig.xml. I've reduced this to 1, so if a slave is already busy doing pre-warming, it won't try to also
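
A minimal sketch of the setting described above, assuming the stock solrconfig.xml layout in which maxWarmingSearchers sits inside the query section:

```xml
<!-- solrconfig.xml, inside the <query> section -->
<!-- Allow at most one searcher to be warming in the background at a time -->
<maxWarmingSearchers>1</maxWarmingSearchers>
```

With a limit of 1, a commit that arrives while another searcher is still warming may be rejected with an error rather than queued, so clients that commit frequently should be prepared to handle that.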

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Jason Rutherglen
Yeah there's no patch... I think Yonik can write it. :-) Yah... The Lucene version shouldn't matter. The distributed faceting theoretically can easily be applied to multiple segments, however the way it's written for me is a challenge to untangle and apply successfully to a working patch. Also

RE: multivalued fields in result

2010-09-12 Thread Jason Chaffee
My schema.xml was fine. The problem was that my test queries weren't returning top 10 documents that had data in the fields. Once I increased the rows, I saw the results. Definitely user error. :) Thanks for help though. Jason -Original Message- From: Lance Norskog