Solr cloud clusterstate.json update query ?

2015-05-05 Thread Sai Sreenivas K
Could you clarify the following questions: 1. Is there a way to avoid all the nodes simultaneously going into recovery state when bulk indexing happens? Is there an API to disable replication on one node for a while? 2. We recently changed the host name on nodes in solr.xml. But the old

Alternate ways to facet spatial data

2015-05-05 Thread James Sewell
Hello all, I've just started using SOLR for spatial queries and it looks great so far. I've mostly been investigating importing a large amount of point data, indexing and searching it. I've discovered the facet.heatmap functionality, which is great - but I would like to ask if it is possible to

Re: Solr 5.1.0 Cloud and Zookeeper

2015-05-05 Thread shacky
Thank you very much for your answer. I installed ZooKeeper 3.4.6 on my Debian (Wheezy) system, and it's working well. The only problem I have is that I'm looking for some init script but I cannot find anything. I'm also trying to adapt the script in Debian's zookeeperd package, but I have some

SolrCloud indexing

2015-05-05 Thread Vincenzo D'Amore
Hi all, I have 3 nodes and there are 3 shards but looking at solrcloud admin I see that all the leaders are on the same node. If I understood well looking at solr documentation https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud : When a document is sent to

Solr Wordpress - one server or two?

2015-05-05 Thread Robg50
I'm thinking of taking SOLR for a test drive and will probably keep it if it works as I'm hoping so I'd like to get it as right as possible the first time out. I'm running Wordpress on Ubuntu with php and Mariadb 10. The server is a 7 core, 4gb, Azure VM. The database is 4gb. The data itself is

Solr 5.1.0 Cloud and Zookeeper

2015-05-05 Thread shacky
Hi. I read on https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble that Solr needs to use the same ZooKeeper version it owns (at the moment 3.4.6). Debian Jessie has ZooKeeper 3.4.5 (https://packages.debian.org/jessie/zookeeper). Are you sure that this

Re: Solr 5.1.0 Cloud and Zookeeper

2015-05-05 Thread Mark Miller
A bug fix version difference probably won't matter. It's best to use the same version everyone else uses and the one our tests use, but it's very likely 3.4.5 will work without a hitch. - Mark On Tue, May 5, 2015 at 9:09 AM shacky shack...@gmail.com wrote: Hi. I read on

Re: Multiple index.timestamp directories using up disk space

2015-05-05 Thread Rishi Easwaran
Worried about data loss makes sense. If I get the way solr behaves, the new directory should only have missing/changed segments. I guess since our application is extremely write heavy, with lot of inserts and deletes, almost every segment is touched even during a short window, so it appears

Re: Finding out optimal hash ranges for shard split

2015-05-05 Thread anand.mahajan
Looks like it's not possible to find out the optimal hash ranges for a split before you actually split it. So the only way out is to keep splitting out the large subshards?

Re: SolrCloud indexing

2015-05-05 Thread Erick Erickson
bq: Does it mean that all the indexing is done by the leaders in one node? No. The raw document is forwarded from the leader to the replica and it's indexed on all the nodes. The leader has a little bit of extra work to do routing the docs, but that's it. Shouldn't be a problem with 3 shards.

Re: Solr 5.0 - uniqueKey case insensitive ?

2015-05-05 Thread Erick Erickson
Well, working fine may be a bit of an overstatement. That has never been officially supported, so it just happened to work in 3.6. As Chris points out, if you're using SolrCloud then this will _not_ work as routing happens early in the process, i.e. before the analysis chain gets the token so

Re: Solr cloud clusterstate.json update query ?

2015-05-05 Thread Erick Erickson
about 1. This shouldn't be happening, so I wouldn't concentrate there first. The most common reason is that you have a short Zookeeper timeout and the replicas go into a stop-the-world garbage collection that exceeds the timeout. So the first thing to do is to see if that's happening. Here are a

Re: SolrCloud collection properties

2015-05-05 Thread Erick Erickson
_What_ properties? Details matter. And how do you do this now? Assuming you do this with separate conf directories, these are then just configsets in Zookeeper and you can have as many of them as you want. Problem here is that each one of them is a complete set of schema and config files,

Re: Solr cloud clusterstate.json update query ?

2015-05-05 Thread Gopal Jee
About 2: live_nodes under ZooKeeper is an ephemeral node (please see ZooKeeper ephemeral nodes). So, once the connection from the Solr zkClient to ZooKeeper is lost, these nodes will disappear automatically. AFAIK, clusterstate.json is updated by the overseer based on messages published to a queue in ZooKeeper

Re: Solr Wordpress - one server or two?

2015-05-05 Thread Shawn Heisey
On 5/5/2015 6:11 AM, Robg50 wrote: I'm thinking of taking SOLR for a test drive and will probably keep it if it works as I'm hoping so I'd like to get it as right as possible the first time out. I'm running Wordpress on Ubuntu with php and Mariadb 10. The server is a 7 core, 4gb, Azure VM.

SolrCloud collection properties

2015-05-05 Thread Markus Heiden
Hi, we are trying to migrate from Solr 4.10 to SolrCloud 4.10. I understood that SolrCloud uses collections as abstraction from the cores. What I am missing is a possibility to store collection-specific properties in Zookeeper. Using property.foo=bar in CREATE-URLs just sets core-specific

Re: Solr Exception The remote server returned an error: (400) Bad Request.

2015-05-05 Thread marotosg
Thanks for the answer but I don't think that's going to solve my problem. For instance, if I copy this query into the Chrome browser, http://localhost:8080/solr48/person/select?q=CoreD:25 I get this error: 400 1 CoreD:25 undefined field CoreD 400. If I use wget from Linux wget

Re: Multiple index.timestamp directories using up disk space

2015-05-05 Thread Shawn Heisey
On 5/5/2015 7:29 AM, Rishi Easwaran wrote: Worried about data loss makes sense. If I get the way solr behaves, the new directory should only have missing/changed segments. I guess since our application is extremely write heavy, with lot of inserts and deletes, almost every segment is

Solr/ Solr Cloud meetup at Aol

2015-05-05 Thread Rishi Easwaran
Hi All, Aol is hosting a meetup in Dulles VA. The topic this time is Solr/ Solr Cloud. http://www.meetup.com/Code-Brew/events/53217/ Thanks, Rishi.

Re: Multiple index.timestamp directories using up disk space

2015-05-05 Thread Shawn Heisey
On 5/5/2015 1:15 PM, Rishi Easwaran wrote: Thanks for clarifying lucene segment behaviour. We don't trigger optimize externally, could it be internal solr optimize? Is there a setting/ knob to control when optimize occurs. Optimize never happens automatically, but *merging* does. An

Re: Limit Results By Score?

2015-05-05 Thread Chris Hostetter
: We have implemented a custom scoring function and also need to limit the : results by score. How could we go about that? Alternatively, can we : suppress the results early using some kind of custom filter? in general, limiting by score is a bad idea for all of the reasons outlined here...

Re: Schema API: add-field-type

2015-05-05 Thread Steve Rowe
Hi Steve, responses inline below: On Apr 29, 2015, at 6:50 PM, Steven White swhite4...@gmail.com wrote: Hi Everyone, When I pass the following: http://localhost:8983/solr/db/schema/fieldtypes?wt=xml I see this (as one example): lst str name=namedate/str str

Re: Editing the Solr Wiki

2015-05-05 Thread Chris Hostetter
you should be good to go, thanks (in advance) for helping out with your edits. : http://www.manning.com/turnbull/. I have already set up an account with : the username NicoleButterfield.   Many thanks in advance for your help -Hoss http://www.lucidworks.com/

Re: Multiple index.timestamp directories using up disk space

2015-05-05 Thread Rishi Easwaran
Hi Shawn, Thanks for clarifying lucene segment behaviour. We don't trigger optimize externally, could it be internal solr optimize? Is there a setting/ knob to control when optimize occurs. Thanks for pointing it out, will monitor memory closely. Though doubt memory is an issue, these are

Re: Multiple index.timestamp directories using up disk space

2015-05-05 Thread Ramkumar R. Aiyengar
Yes, data loss is the concern. If the recovering replica is not able to retrieve the files from the leader, it at least has an older copy. Also, the entire index is not fetched from the leader, only the segments which have changed. The replica initially gets the file list from the leader, checks

Re: Slow highlighting on Solr 5.0.0

2015-05-05 Thread Ere Maijala
I'm seeing the same with Solr 5.1.0 after upgrading from 4.10.2. Here are my timings: 4.10.2: process: 1432.0 highlight: 723.0 5.1.0: process: 9570.0 highlight: 8790.0 schema.xml and solrconfig.xml are available at https://github.com/NatLibFi/NDL-VuFind-Solr/tree/master/vufind/biblio/conf.

Proximity searching in percentage

2015-05-05 Thread Zheng Lin Edwin Yeo
Hi, Would like to check, how do we implement character proximity searching that's in terms of percentage with regards to the length of the word, instead of a fixed number of edit distance (characters)? For example, if we have a proximity of 20%, a word with 5 characters will have an edit
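One way to approximate the percentage-based proximity asked about here is to turn the percentage into an edit count on the client before building the query. The sketch below is only an illustration under stated assumptions: the helper name fuzzy_term is hypothetical, rounding down is an arbitrary choice, and the result is capped at 2 because Lucene fuzzy queries do not support more than 2 edits.

```python
MAX_SOLR_EDITS = 2  # Lucene/Solr fuzzy queries support at most 2 edits


def fuzzy_term(word, percent):
    """Return a fuzzy query term whose edit distance scales with word length.

    Hypothetical helper: `percent` is an integer percentage of the word
    length, rounded down (an assumption) and capped at MAX_SOLR_EDITS.
    """
    edits = min(len(word) * percent // 100, MAX_SOLR_EDITS)
    # With zero allowed edits, fall back to an exact term.
    return f"{word}~{edits}" if edits > 0 else word


print(fuzzy_term("apple", 20))  # 5 characters at 20% allows 1 edit
```

Because of the cap, words longer than about 10 characters at 20% all get the same distance of 2, so this only approximates a true percentage for short and medium-length words.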

Solr Exception The remote server returned an error: (400) Bad Request.

2015-05-05 Thread marotosg
Hi, I am having some difficulty finding out which exception my client is hitting for some queries. Malformed queries always come back to my SolrNet client as "The remote server returned an error: (400) Bad Request." Internally, Solr is actually logging issues like

Limit Results By Score?

2015-05-05 Thread Johannes Ruscheinski
Hi, We have implemented a custom scoring function and also need to limit the results by score. How could we go about that? Alternatively, can we suppress the results early using some kind of custom filter? --Johannes -- Dr. Johannes Ruscheinski Universitätsbibliothek Tübingen - IT-Abteilung

Re: Solr Exception The remote server returned an error: (400) Bad Request.

2015-05-05 Thread Tomasz Borek
Take a look at the query parameters and use debug and/or explain. https://wiki.apache.org/solr/CommonQueryParameters Also, perhaps change the parser from the default one to the less stringent dismax. Hard to say what fits your case as I don't know it, but those two are the best starting points I know of.
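The suggestion above can be tried by adding debugQuery=true (and optionally defType=dismax) to the request. The sketch below only builds such a URL; the host, core path and query are the ones quoted elsewhere in this thread, and the helper name build_debug_url is hypothetical.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, query, parser=None):
    """Build a Solr select URL with debug output enabled.

    Hypothetical helper: `parser`, if given, is passed as defType
    (for example "dismax") to switch away from the default parser.
    """
    params = {"q": query, "debugQuery": "true", "wt": "json"}
    if parser:
        params["defType"] = parser
    # urlencode percent-escapes reserved characters such as ":" in the query.
    return base_url + "?" + urlencode(params)


url = build_debug_url(
    "http://localhost:8080/solr48/person/select", "CoreD:25", parser="dismax"
)
print(url)
```

Opening the printed URL in a browser, or fetching it with wget, should return the normal response plus a debug section when the query is valid; for a rejected query, the error message in the response body is the place to look.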