Re: Solr vs Lucene

2015-10-02 Thread Mark Fenbers
Thanks for the suggestion, but I've looked at aspell and hunspell and neither provides a native Java API. Further, I already use Solr for a search engine, so why not stick with this infrastructure for spelling, too? I think it will work well for me once I figure out the right

Re: Zk and Solr Cloud

2015-10-02 Thread Rallavagu
Thanks Shawn. Right. That is a great insight into the issue. We ended up clearing the overseer queue and then the cloud returned to normal. We were running a Solr indexing process and are wondering if that caused the queue to grow. Will Solr (leader) add a work entry to zookeeper for every update if not

Re: Reverse query?

2015-10-02 Thread Andrea Roggerone
Hi Remy, The question is not really clear; could you explain a little better what you need? Reading your email I understand that you want to get documents containing all the search terms typed. For instance if you search for "Mad Max", you want to get documents containing both Mad and Max. If

Reverse query?

2015-10-02 Thread remi tassing
Hi, I have medium-low experience with Solr and I have a question I couldn't quite solve yet. Typically we have quite short query strings (a couple of words) and the search is done through a set of bigger documents. What if the logic is turned around a little? I have a document and I need to

RE: Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd

2015-10-02 Thread Adrian Liew
Hi Edwin, I have followed the standards recommended by the Zookeeper article. It seems to be working. Incidentally, I am facing intermittent issues whereby I am unable to connect to the Zookeeper service via Solr's zkCli.bat command, even after having set up automatic startup of my ZooKeeper

Re: Facet queries blow out the filterCache

2015-10-02 Thread Charlie Hull
On 01/10/2015 23:31, Jeff Wartes wrote: It still inserts if I address the core directly and use distrib=false. I’ve got a few collections sharing the same config, so it’s surprisingly annoying to change solrconfig.xml right now, but it seemed pretty clear the query is the thing being cached,

Re: Facet queries blow out the filterCache

2015-10-02 Thread Toke Eskildsen
On Thu, 2015-10-01 at 22:31 +, Jeff Wartes wrote: > It still inserts if I address the core directly and use distrib=false. It is quite strange that it is triggered with the direct access. If that can be reproduced in test, it looks like a performance optimization to be done. Anyway,

Re: Solr 4.7.2 Vs 5.3.0 Docs different for same query

2015-10-02 Thread Ravi Solr
Mr. Uchida, Thank you for responding. It was my fault. I had an update processor which takes specific text and string fields and concatenates them into a single field, and I search on that single field. Recently I used Atomic update to fix a specific field's value and forgot to disable the

Re: Zk and Solr Cloud

2015-10-02 Thread Ravi Solr
Awesome nugget Shawn, I also faced a similar issue a while ago while I was doing a full re-index. It would be great if such tips were added to FAQ-type documentation on cwiki. I love the Solr forum; every day I learn something new :-) Thanks Ravi Kiran Bhaskar On Fri, Oct 2, 2015 at 1:58 AM, Shawn

Re: Reverse query?

2015-10-02 Thread Ravi Solr
Hello Remi, I am assuming the field where you store the data is analyzed. The field definition might help us answer your question better. If you are using the edismax handler for your search requests, I believe you can achieve your goal by setting your "mm" to 100%, phrase slop "ps" and

NullPointerException

2015-10-02 Thread Mark Fenbers
Greetings! Attached is a snippet from solrconfig.xml pertaining to my spellcheck efforts. When I use the Admin UI (v5.3.0), and check the spellcheck.build box, I get a NullPointerException stacktrace. The actual stacktrace is at the bottom of the attachment. The

Re: Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd

2015-10-02 Thread Erick Erickson
Hmmm, there are usually a couple of ports that each ZK instance needs, is it possible that you've got more than one process using one of those ports? By default (I think), zookeeper uses "peer port + 1000" for its leader election process, see:

Re: Zk and Solr Cloud

2015-10-02 Thread Erick Erickson
Rallavagu: Absent nodes going up and down or otherwise changing state, Zookeeper isn't involved in the normal operations of Solr (adding docs, querying, all that). That said, things that change the state of the Solr nodes _do_ involve Zookeeper and the Overseer. The Overseer is used to serialize

Re: Solr 4.7.2 Vs 5.3.0 Docs different for same query

2015-10-02 Thread Erick Erickson
do we have to "reload" the collections on all the nodes to see the updated config ?? YES Is there a single call which can update all nodes connected to the ensemble ?? NO. I'll be a little pedantic here. When you say "ensemble", I'm not quite sure what that means and am interpreting it as "all

Re: Reverse query?

2015-10-02 Thread Erick Erickson
The admin/analysis page is your friend here, find it and use it ;) Note you have to select a core on the admin UI screen before you can see the choice. Because apart from the other comments, KeywordTokenizer is a red flag. It does NOT break anything up into tokens, so if your doc contains: Mad

Re: Reverse query?

2015-10-02 Thread Roman Chyla
I'd like to offer another option: you say you want to match a long query against a document - but maybe you won't know whether to pick "Mad Max" or "Max is" (not to mention the performance hit of "*mad max*" search - or is it not the case anymore?). Take a look at the NGram tokenizer (say size of 2;

Empty string in field used for grouping causes NPE in 4.x

2015-10-02 Thread Shawn Heisey
Let's say I'm using group.field=ip in a query. If the index contains documents where the ip field is the empty string, grouping fails with NullPointerException in the response writer. From our perspective, this is a bad document that we have to fix, but I don't think it should have failed the

Drill down facet for multi valued groups of fields

2015-10-02 Thread Douglas McGilvray
Hi everyone, my first post to the list! I tried and failed to explain this on IRC, I hope I can do a better job here. My document has a group of text fields: company, location, year. The group can have multiple values and I would like to facet (drill down) beginning with any of the three

Re: Solr 4.7.2 Vs 5.3.0 Docs different for same query

2015-10-02 Thread Tomoko Uchida
Hi Ravi, And for minor additional information, you may want to look through Collections API reference guide to handle collections properly in SolrCloud environment. (I bookmark this page.) https://cwiki.apache.org/confluence/display/solr/Collections+API

Re: Zk and Solr Cloud

2015-10-02 Thread Rallavagu
Thanks for the insight into this Erick. Thanks. On 10/2/15 8:58 AM, Erick Erickson wrote: Rallavagu: Absent nodes going up and down or otherwise changing state, Zookeeper isn't involved in the normal operations of Solr (adding docs, querying, all that). That said, things that change the state

Re: Solr vs Lucene

2015-10-02 Thread Jack Krupansky
Did you have a specific reason why you didn't want to send an HTTP request to Solr to perform the spellcheck operation? I mean, that is probably easier than diving into raw Lucene code. Also, Solr lets you do a spellcheck from a remote client whereas the Lucene spellcheck needs to be on the same

Re: Solr 4.7.2 Vs 5.3.0 Docs different for same query

2015-10-02 Thread Ravi Solr
Thank you very much Erick and Uchida. I will take a look at the URL you gave Erick. Thanks Ravi Kiran Bhaskar On Fri, Oct 2, 2015 at 12:41 PM, Tomoko Uchida wrote: > Hi Ravi, > > And for minor additional information, > you may want to look through Collections API

are there any SolrCloud supervisors?

2015-10-02 Thread r b
I've been working on something that just monitors ZooKeeper to add and remove nodes from collections. The use case is that I put SolrCloud in an autoscaling group on EC2 and, as instances go up and down, I need them added to the collection. It's something I've built for work and could clean up to

Re: Reverse query?

2015-10-02 Thread Andrea Roggerone
Hi, the phrase query format would be: "Mad Max"~2 The * has been added by the mail aggregator around the characters in bold for some reason. That wasn't a wildcard. On Friday, October 2, 2015, Roman Chyla wrote: > I'd like to offer another option: > > you say you want to match

Re: Facet queries blow out the filterCache

2015-10-02 Thread Jeff Wartes
I backed up a bit. I took the stock solr download and did this: solr-5.3.1>$ bin/solr -e techproducts So, no SolrCloud, default example config, about as basic as you get. I didn’t even bother indexing any docs. Then I issued this query:

Recovery Thread Blocked

2015-10-02 Thread Rallavagu
Solr 4.6.1 on Tomcat 7, single shard 4 node cloud with 3 node zookeeper. During updates, some nodes go to very high CPU and become unavailable. The thread dump shows the following thread is blocked 870 threads which explains high CPU. Any clues on where to look? "Thread-56848" id=79207

Re: Recovery Thread Blocked

2015-10-02 Thread Rallavagu
Here is the stack trace of the thread that is holding the lock. "Thread-55266" id=77142 idx=0xc18 tid=992 prio=5 alive, waiting, native_blocked, daemon -- Waiting for notification on: org/apache/solr/cloud/RecoveryStrategy@0x3f34e8480[fat lock] at

Re: Zk and Solr Cloud

2015-10-02 Thread Upayavira
Very interesting, Shawn. What I'd say is paste more of the stacktraces, so we can see the context in which the exception happened. It could be that you are flooding the overseer, or it could be that you have a synonyms file (or such) that is too large. I'd like to think the rest of the stacktrace

Re: highlighting

2015-10-02 Thread Upayavira
In the end, in most open source projects, people implement that which they need themselves, and offer it back to the community in the hope that it will help others too. If you need this, then I'd encourage you to look at the source highlighting component and see if you can see how it might be