Solr replicating at 5 mb/sec

2014-02-22 Thread Cool Techi
Hi, I am running Solr replication between two machines which are connected by a 1 Gb network. The best speed I am getting for replication is 5 mb/sec; how can this be increased? The replication keeps failing, and this is the first-time replication of an index over 300 GB in size. We are using

Re: Fwd: help on edismax_dynamic fields

2014-02-22 Thread Jack Krupansky
Dynamic fields in queries and qf - yes. Wildcard field names in queries and qf - no. But you can use a wildcard in a copyField in your schema to copy any number of fields to some common field to search. That said, I think wildcard in qf is a reasonable request. I don't recall if there was
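Since explicit dynamic field names are accepted in qf but wildcards are not, one client-side workaround (an alternative to the copyField approach, not a replacement for it) is to expand the pattern before sending the request. The sketch below is only an illustration: the field names and the helper `build_qf` are hypothetical, and it assumes the client already knows the list of concrete field names.

```python
from fnmatch import fnmatchcase


def build_qf(field_names, pattern):
    """Expand a wildcard pattern into an explicit, space-separated qf value.

    field_names: concrete field names known to the client (hypothetical here).
    pattern: shell-style wildcard such as "*_txt".
    """
    matched = [name for name in field_names if fnmatchcase(name, pattern)]
    return " ".join(matched)


# Hypothetical field names, for illustration only.
known_fields = ["title_txt", "body_txt", "price_i"]
print(build_qf(known_fields, "*_txt"))
```

The resulting string can be passed as the qf parameter of an edismax request in place of the wildcard.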

Fwd: No suggestions when I set spellcheck.q

2014-02-22 Thread Hakim Benoudjit
Hi guys, Suppose that a user is browsing a webpage whose articles he has already filtered. I want to get suggestions only from the filtered content (i.e. the current category). To achieve this I have set `spellcheck.q` to the current query or category, but by doing this the query no longer returns

please add me to the solr lucene contributors group

2014-02-22 Thread Jayaram Iyer
My username is Jay. Please add me to the Solr Lucene contributors group. I would like to contribute an article on sharding via the implicit router.

Re: Solr replicating at 5 mb/sec

2014-02-22 Thread Brendan Grainger
The first thing I'd check is just transferring a large file (created with dd or something) over the network, to make sure it's Solr that is the issue. On Sat, Feb 22, 2014 at 8:45 AM, Cool Techi cooltec...@outlook.com wrote: Hi, I am running Solr replication between two machines which are
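A baseline file transfer is easier to interpret with the link's theoretical ceiling in hand. The rough arithmetic below assumes that "5 mb/sec" means 5 megabytes per second, that the link is 1 gigabit per second, and that decimal units apply; protocol overhead is ignored, so real throughput will be somewhat lower than the ceiling.

```python
def link_capacity_mb_per_s(gigabits_per_s):
    """Theoretical ceiling of a link in megabytes per second (8 bits per byte)."""
    return gigabits_per_s * 1000 / 8


def transfer_hours(size_gb, rate_mb_per_s):
    """Hours needed to move size_gb gigabytes at rate_mb_per_s megabytes/second."""
    return size_gb * 1000 / rate_mb_per_s / 3600


ceiling = link_capacity_mb_per_s(1.0)   # assumed 1 Gb/s link
observed = 5.0                          # assumed to mean 5 MB/s

print(f"link ceiling: {ceiling:.0f} MB/s")
print(f"300 GB at observed rate: {transfer_hours(300, observed):.1f} h")
print(f"300 GB at link ceiling:  {transfer_hours(300, ceiling):.2f} h")
```

If the dd-generated file moves at close to the ceiling, the network is unlikely to be the bottleneck.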

Re: please add me to the solr lucene contributors group

2014-02-22 Thread Steve Rowe
Hi Jayaram, I've added your Jay username to the Solr wiki contributors group, so you should now be able to create/edit pages. Steve My username is Jay. Please add me to the Solr Lucene contributors group. I would like to contribute an article on sharding via the implicit router.

Re: Tweaking Solr Query Result Cache

2014-02-22 Thread KNitin
Thanks, Erick. Turning off the query cache and sharding more aggressively helped bring down the latencies. On Thu, Feb 20, 2014 at 5:07 PM, Erick Erickson erickerick...@gmail.com wrote: What you _do_ want to do is add replicas so you distribute the CPU load across a bunch of machines. The

Solr Segments, Segment Merges,Optimize

2014-02-22 Thread KNitin
Hi, I have the following questions: 1. I have a job that runs for 3-4 hours, continuously committing data to a collection with an auto commit of 30 seconds. Does it mean that every 30 seconds I would get a new Solr segment? 2. My current segment merge policy is set to 10. Will merger

solrdispatchfilter error

2014-02-22 Thread Pradeep Pujari
Hi, I imported the Solr 4.6 source code into Eclipse under Windows 8.1 Enterprise. I ran ant eclipse, imported it as a Java project, and then converted the Java project to a Dynamic Web Project. I copied the webcontent folder from the Solr 4.6 war into my project. I am getting the exception below. I tried some the

Wikipedia Data Cleaning at Solr

2014-02-22 Thread Furkan KAMACI
Hi; I want to run an NLP algorithm on Wikipedia data. I used the data import handler to load the dump data, and everything is OK. However, there are some texts like: == Altyapı bilgileri == Köyde, [[ilköğretim]] okulu yoktur fakat taşımalı eğitimden yararlanılmaktadır. I think that it should be like that:

Re: Solr Segments, Segment Merges,Optimize

2014-02-22 Thread Erick Erickson
1. It depends. Soft commits will not add a new segment. Hard commits with openSearcher=true or false _will_ create a new segment. 2. There are, but you'll have to dig. 3. Well, I'd ask a counter-question. Are you seeing unacceptable performance? If not, why worry? :) A better answer is that 24-28
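For a sense of scale on the first point, the sketch below estimates an upper bound on the number of hard commits during the job described in the question. It assumes the 30-second auto commit is a hard commit and that every interval has pending documents; segment merging is ignored, so the number of segments actually on disk at any moment will be much lower.

```python
def max_hard_commits(job_hours, auto_commit_seconds):
    """Upper bound on hard commits (and so newly flushed segments) during a job."""
    return int(job_hours * 3600 // auto_commit_seconds)


# Job length of 3-4 hours with a 30-second auto commit, as in the question.
for hours in (3, 4):
    print(f"{hours} h job: at most {max_hard_commits(hours, 30)} hard commits")
```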

Re: solrdispatchfilter error

2014-02-22 Thread Erick Erickson
First, please don't cross-post to multiple lists. Take the time to figure out which list is most relevant. This post is much more appropriate to the user's list. Second, back up and tell us what you are trying to accomplish. Why are you copying things around? Do you wish to debug in Eclipse? If

Re: Wikipedia Data Cleaning at Solr

2014-02-22 Thread Ahmet Arslan
Hi Furkan, There is org.apache.lucene.analysis.wikipedia.WikipediaTokenizer Ahmet On Sunday, February 23, 2014 2:22 AM, Furkan KAMACI furkankam...@gmail.com wrote: Hi; I want to run an NLP algorithm on Wikipedia data. I used the data import handler to load the dump data, and everything is OK. However

Re: Solr Segments, Segment Merges,Optimize

2014-02-22 Thread KNitin
Thanks, Erick. *2. There are, but you'll have to dig.* Any pointers on where to get started? *3. Well, I'd ask a counter-question. Are you seeing unacceptable performance? If not, why worry? :)* By %, do you mean deleted_docs/NumDocs or deleted_docs/Max_docs? To answer
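The two ratios in the question differ only in the denominator. In Lucene, maxDoc counts deleted documents that have not yet been merged away, while numDocs counts only live documents, so deleted docs = maxDoc - numDocs. The sketch below computes both ratios; the sample numbers are hypothetical, and it does not settle which ratio the earlier reply had in mind.

```python
def deleted_ratios(num_docs, max_doc):
    """Return (deleted/numDocs, deleted/maxDoc) for an index.

    num_docs: live documents; max_doc: live plus not-yet-merged deleted documents.
    """
    if num_docs <= 0 or max_doc < num_docs:
        raise ValueError("expected 0 < num_docs <= max_doc")
    deleted = max_doc - num_docs
    return deleted / num_docs, deleted / max_doc


# Hypothetical index: 800 live documents out of a maxDoc of 1000.
per_live, per_total = deleted_ratios(800, 1000)
print(f"deleted/numDocs = {per_live:.0%}, deleted/maxDoc = {per_total:.0%}")
```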

Re: ZK connection problems

2014-02-22 Thread Mark Miller
On Feb 21, 2014, at 12:23 PM, Jeff Wartes jwar...@whitepages.com wrote: I’ve been experimenting with SolrCloud configurations in AWS. One issue I’ve been plagued with is that during indexing, occasionally a node decides it can’t talk to ZK, and this disables updates in the pool. The node

Re: SolrCloud can't correctly create collection after zookeeper ensemble recovery

2014-02-22 Thread Mark Miller
I think this is a regression. There was code that removed the state from zk for a core that could not be created. There was a bug in that, in that you only want to do that for new cores and not existing cores (think cores that existed on startup). Someone commented out that code while working

Re: Solr Segments, Segment Merges,Optimize

2014-02-22 Thread Erick Erickson
Well, it's always possible. I wouldn't expect the search time/CPU utilization to increase with # segments, within reasonable limits. At some point, the important parts of the index get read into memory and the number of segments is pretty irrelevant. You do mention that you have a heavy ingestion