Why does Solr (1.4.1) keep so many Tokenizer objects?

2012-09-08 Thread T. Kuro Kurosaka
While investigating a bug, I found that Solr keeps many Tokenizer objects. This experimental 80-core Solr 1.4.1 system runs on Tomcat. It was continuously sent indexing requests in parallel, and it eventually died due to OutOfMemory. The heap dump that was taken by the JVM shows there were

SolrCloud vs SolrReplication

2012-09-08 Thread thaihai
Hi all, I'm a little bit confused about the new cloud functionalities. Some questions: 1) Is it possible to use the old-style Solr replication in Solr 4 (that is, not using SolrCloud, not starting with ZK params)? 2) In our production environment we use Solr 3.6 with Solr replication. We have 1 index

ConcurrentModificationException - SolrCmdDistributor

2012-09-08 Thread Balaji Gandhi
Hi, I am trying to implement a multi-threaded version of DIH (SOLR 4) and am able to successfully run this with a single SOLR node. Getting ConcurrentModificationException at SolrCmdDistributor.java:223 when doing it with more than one node. Has anyone faced this issue? Please let me know.

Re: Solr 4: Private master, public slave?

2012-09-08 Thread Erick Erickson
These are really unrelated. Presumably you have some program that accesses your system of record, that you want to keep private. No problem, that program (SolrJ?) is accessing your private data and sending the SolrInputDocuments to the cloud-based Solr program for searching. Or I don't understand

Re: Re: Schema model to store additional field metadata

2012-09-08 Thread Erick Erickson
You might be confusing indexing and storing. When you specify indexed=true in your field definition, the input is tokenized, transformed, etc., and the result of this (see the admin/analysis page) is what is searched. But when you specify stored=true, a literal, verbatim copy of the text is put in a

Re: Solr search not working after copying a new field to an existing Indexed Field

2012-09-08 Thread Erick Erickson
Solr does a complete delete and re-add; there's no way to do a partial update. When you add a doc with the same unique key as an old doc, the data associated with the first version of the doc is entirely thrown away and it's as though you'd never indexed it at all; the second version completely

Re: SolrCloud vs SolrReplication

2012-09-08 Thread Erick Erickson
See inline. On Sat, Sep 8, 2012 at 1:09 AM, thaihai thai...@live.de wrote: Hi all, I'm a little bit confused about the new cloud functionalities. Some questions: 1) Is it possible to use the old-style Solr replication in Solr 4 (that is, not using SolrCloud, not starting with ZK params)? Yes.

Re: JRockit with SOLR3.4/3.5

2012-09-08 Thread Snehal Chennuru
I am running into a similar issue with Lucene 3.6 which I believe is used in Solr 3.4. Following is the exception stack trace: 2012-09-08 18:08:56,341 WARN [STDERR] (Load thread:) Exception in thread Load thread java.lang.OutOfMemoryError: classblock allocation, 2814576 loaded, 2816K footprint,

Re: Fail to huge collection extraction

2012-09-08 Thread neosky
I am sorry, but I don't get your point. Could you explain a little more? I am still struggling with this problem. It sometimes seems to crash for no reason. I even reduced it to 5000 records each time, but sometimes it works well with 1 per page.

Re: Fail to huge collection extraction

2012-09-08 Thread Alexandre Rafalovitch
I think the point here is the question about your use of the data. If you want to show it to the client, then you are unlikely to need details of more than one screenful of records (e.g. 10). When the user goes to another screen, you rerun the query and specify values 11-20, etc. SOLR does not have a
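A minimal sketch of the paging idea described in this reply, assuming the standard Solr `start` and `rows` request parameters. The host, port, and handler path are hypothetical placeholders, and `page_query` is an illustrative helper, not part of any Solr client. Note that `start` is a zero-based offset, so records 11-20 correspond to `start=10` with `rows=10`.

```python
from urllib.parse import urlencode


def page_query(base_url, query, page, page_size=10):
    """Build a Solr select URL for a 1-based page number.

    base_url is an assumption (for example a local /solr/select handler);
    only the q, start, and rows parameters are set here.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    params = {
        "q": query,
        # start is a zero-based offset into the full result set
        "start": (page - 1) * page_size,
        # rows is the number of documents returned for this request
        "rows": page_size,
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local instance; adjust to your own deployment.
    base = "http://localhost:8983/solr/select"
    for page in (1, 2, 3):
        print(page_query(base, "*:*", page))
```

Each new screen reruns the same query with a different offset, so the client only ever holds one page of records at a time.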

zkcli command line util

2012-09-08 Thread JesseBuesking
I was trying to use the command line util to update my zookeeper instance with config files from a machine running solr, but I'm getting the following error: Could not find or load main class org.apache.solr.cloud.ZkCLI The command I'm trying to execute is java -classpath

Cloud terminology clarification

2012-09-08 Thread JesseBuesking
It's been a while since the terminology at http://wiki.apache.org/solr/SolrTerminology has been updated, so I'm wondering how these terms apply to solr cloud setups. My take on what the terms mean: Collection: Basically the highest level container that bundles together the other pieces for