how to query this?

2014-10-27 Thread rulinma
I have a query on courses: 1. if a course has not begun yet, sort those by beginTime asc. 2. if a course has ended, sort those by beginTime desc. How do I query this using Solr? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-query-this-tp4165998.html Sent from the Solr -

AW: AW: AW: (auto)suggestions, but only from a filtered set of documents

2014-10-27 Thread Clemens Wyss DEV
Thanks Mike. At the moment I do not see what advantage feeding a single suggester from multiple fields has over using copyFields to feed the suggester field. Also, coming back to my main issue, how does your approach allow me to filter the documents to be taken into

Re: how to query this?

2014-10-27 Thread Mikhail Khludnev
Check if() and map(): https://cwiki.apache.org/confluence/display/solr/Function+Queries On Mon, Oct 27, 2014 at 9:05 AM, rulinma ruli...@gmail.com wrote: I have a query on course: 1. if course will begin, then sort those by beginTime asc. 2. if course ended, then sort those by begin desc. how
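To make the requested ordering concrete, here is a minimal sketch in plain Python, outside Solr. It assumes that "ended" simply means beginTime is earlier than now, which the thread does not confirm, and all names in it are hypothetical. Upcoming courses form the first group and ended courses the second; within both groups the course closest to now comes first, which is the same as beginTime ascending for upcoming courses and descending for ended ones. In Solr these two keys could presumably be written as function queries in the sort parameter, for example a map() over ms(beginTime,NOW) for the group and abs() of the same difference for the distance, but that translation is untested here.

```python
from datetime import datetime, timezone


def course_sort_key(begin_time, now):
    """Return (group, distance): upcoming courses first, then ended ones,
    each group ordered by how close beginTime is to now."""
    delta = (begin_time - now).total_seconds()
    ended = 1 if delta < 0 else 0  # assumption: "ended" means beginTime < now
    return (ended, abs(delta))


# Hypothetical sample data, for illustration only.
NOW = datetime(2014, 10, 27, tzinfo=timezone.utc)
COURSES = {
    "starts_dec": datetime(2014, 12, 1, tzinfo=timezone.utc),
    "began_sep": datetime(2014, 9, 1, tzinfo=timezone.utc),
    "starts_nov": datetime(2014, 11, 1, tzinfo=timezone.utc),
    "began_oct": datetime(2014, 10, 1, tzinfo=timezone.utc),
}

ORDERED = sorted(COURSES, key=lambda name: course_sort_key(COURSES[name], NOW))
print(ORDERED)
```

The group key comes first in the tuple so that no ended course can ever sort ahead of an upcoming one, however close its beginTime is to now.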

RE: suggestion for new custom atomic update

2014-10-27 Thread Elran Dvir
I will explain with an example. Let's say field_a is sent in the update with the value of 5. field_a is already stored in the document with the value 8. After the update field_a should have the value 13 (sum). The value of field_b will be based on the value of 13 and not 5. Is there a way in URP

Re: how to query this?

2014-10-27 Thread rulinma
That does not solve my question.

Re: suggestion for new custom atomic update

2014-10-27 Thread Matthew Nigl
You can get the summed value, 13, if you add a processor after DistributedUpdateProcessorFactory in the URP chain. Then one possibility would be to clone this value to another field, such as field_b, and run other processors on that field. Or for something more customized, you can use the

Map-Reduce Solr Kerberos Authentication

2014-10-27 Thread Adam Higginson
Hi all, As a bit of background, we're trying to run a map-reduce job on a Hadoop cluster (CDH version 4.5.0) which involves reading/writing from Solr during both the Map and Reduce phases. To accomplish this, we are using the Solrj library with version 4.4.0-search-1.3.0. In a separate

Re: AW: AW: AW: (auto)suggestions, but only from a filtered set of documents

2014-10-27 Thread Michael Sokolov
On 10/27/14 2:23 AM, Clemens Wyss DEV wrote: Thx Mike, at the moment I do not see/understand what the advantage of feeding a single suggester from multiple fields compared to using copyfields to feed the suggester field is? The advantage is mainly that you can apply different analysis and

Re: solr highlighting query

2014-10-27 Thread john eipe
I have this line highlighted: <em>Jobs</em> was <em>born</em> in San Francisco, California on February 24 1955. for the query "Jobs born"~15 but not for "born Jobs"~15. I want the same result irrespective of the order of search keywords. Regards, John Eipe “The Roots of Violence: Wealth without work, Pleasure

phrase query in solr 4

2014-10-27 Thread Robust Links
Hi, We are trying to upgrade our index from 3.6.1 to 4.9.1 and I wanted to make sure our existing indexing strategy is still valid. The statistics of the raw corpus are: - 4.8 billion total number of tokens in the entire corpus. - 13MM documents We have 3 requirements 1) we want to

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Erick Erickson
OK, clarify a bit more what you're doing with Hadoop. Are you using the MapReduceIndexerTool? Or are your Hadoop jobs writing directly to SolrCloud? How are you measuring out of sync? Are you sure that you've committed? Does out of synch mean reporting different result counts? Different order?

Re: solr highlighting query

2014-10-27 Thread Erick Erickson
Well, maybe you can work with the ComplexPhraseQueryParser, that's been around for a while, see: http://lucene.apache.org/core/4_10_1/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html Or you can just live with the inherent slop in the ~ operator. You haven't

Stopwords in shingles suggester

2014-10-27 Thread O. Klein
Is there a way in Solr to filter out stopwords in shingles like ES does? http://www.elasticsearch.org/blog/searching-with-shingles/

Re: phrase query in solr 4

2014-10-27 Thread Shawn Heisey
On 10/27/2014 6:20 AM, Robust Links wrote: 1) we want to index and search all tokens in a document (i.e. we do not rely on external stores) 2) we need search time to be fast and willing to pay larger indexing time and index size, 3) be able to search as fast as possible ngrams of 3

RE: Stopwords in shingles suggester

2014-10-27 Thread Markus Jelsma
You do not want stopwords in your shingles? Then put the stopword filter on top of the shingle filter. Markus -Original message- From:O. Klein kl...@octoweb.nl Sent: Monday 27th October 2014 13:56 To: solr-user@lucene.apache.org Subject: Stopwords in shingles suggester Is there a

Re: Stopwords in shingles suggester

2014-10-27 Thread Dikshant Shahi
Configure a fieldType in schema.xml as below: <fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="0"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> .. .. <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />

Re: solr highlighting query

2014-10-27 Thread david.w.smi...@gmail.com
John, I’m not seeing this problem. Presumably we’re talking about the default highlighter (the most accurate one) but I figure the others would match it too. To test, I added the following to HighlightTest.java in Solr and this test passed: @Test public void testSpan() { final String

Re: Stopwords in shingles suggester

2014-10-27 Thread Shawn Heisey
On 10/27/2014 6:56 AM, O. Klein wrote: Is there a way in Solr to filter out stopwords in shingles like ES does? http://www.elasticsearch.org/blog/searching-with-shingles/ If I read that correctly, ES isn't doing anything differently than Solr does. They use the same filters that Solr does.

Re: solr highlighting query

2014-10-27 Thread john eipe
Yes. It seems to work for Default Highlighting. I'm using Fast Vector Highlighter. Let me also explain why I went for Fast Vector Highlighter. I wanted the highlighted content to be complete and not broken words and for that I need to use breakIterator which works only for Fast vector

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Otis Gospodnetic
Hi, You may simply be overwhelming your cluster nodes. Have you checked various metrics to see if that is the case? Otis On Oct 26, 2014, at 9:59 PM, S.L

Re: solr highlighting query

2014-10-27 Thread david.w.smi...@gmail.com
Ah. Currently the most accurate highlighter is the default one, so if accuracy is more important than BreakIterator then you’ll have to switch back. Keep an eye out for some efficiency enhancements to this highlighter “real soon now”. Separately from that efficiency, later this year I may have

set solr to return only doc ids and highlighting

2014-10-27 Thread john eipe
Hi, My Solr searches with highlighting return documents (with all fields) that contain the search words, plus the highlighting. Is there a way to restrict the response so that I get only the id field + highlighting? <result name="response" numFound="1" start="0"> <doc> <str name="id">1253</str> </doc> </result> <lst name="highlighting">

Re: solr highlighting query

2014-10-27 Thread john eipe
Thanks David. So I guess I will have to go with the default highlighter (with a higher fragsize) and then take care of boundary scanning myself.

Re: Stopwords in shingles suggester

2014-10-27 Thread Vikas Agarwal
Is this https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StopFilterFactory what you are looking for? Basically, you can use analyzers for this purpose. You can even write your own analyzer. On Mon, Oct 27, 2014 at 6:26 PM, O. Klein kl...@octoweb.nl wrote: Is there a way in Solr

Re: Solr + HDFS settings

2014-10-27 Thread Michael Della Bitta
This doesn't answer your question, but unless something is changed, you're going to want to set this to false. It causes index corruption at the moment. On 10/25/14 03:42, Norgorn wrote: <bool name="solr.hdfs.blockcache.write.enabled">true</bool>

New Meetup in London - Lucene/Solr User Group

2014-10-27 Thread Charlie Hull
Hi all, We noticed that there isn't a Lucene/Solr user group in London (although there is an Elasticsearch user group) - so we decided to start one! http://www.meetup.com/Apache-Lucene-Solr-London-User-Group Please join if you're interested and do pass the word. Our first meeting will be

Re: set solr to return only doc ids and highlighting

2014-10-27 Thread Alexandre Rafalovitch
Have you looked at the 'fl' parameter? You can experiment with that in the Admin UI. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
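As a rough illustration of the 'fl' suggestion, the sketch below builds a select URL that asks for only the id field while keeping highlighting on. The host, core name (collection1), query and highlighted field (content) are assumptions made up for the example; q, fl, hl, hl.fl and wt are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, highlight_field):
    """Build a /select URL that returns only the id field plus highlighting."""
    params = [
        ("q", query),
        ("fl", "id"),                # restrict returned stored fields to id
        ("hl", "true"),              # keep highlighting enabled
        ("hl.fl", highlight_field),  # field to highlight (hypothetical name)
        ("wt", "json"),
    ]
    return base_url + "/select?" + urlencode(params)


# Hypothetical host, core and field names.
URL = build_select_url(
    "http://localhost:8983/solr/collection1", "content:jobs", "content"
)
print(URL)
```

The highlighting section of the response is keyed by document id, so restricting fl to id should still leave enough to match snippets to documents.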

Re: Solr + HDFS settings

2014-10-27 Thread Norgorn
Already tried, with the same result (the message changed accordingly).

Re: New Meetup in London - Lucene/Solr User Group

2014-10-27 Thread Alexandre Rafalovitch
Awesome. And whatever lessons you learn, please share them on the popularizers LinkedIn group. That's what it's there for. Also, feel free to announce it there and ask for feedback. Regards, Alex.

SolrCloud config question and zookeeper

2014-10-27 Thread Bernd Fehling
While getting started with SolrCloud I tried to understand the purpose of an external ZooKeeper. Let's assume I want to split 1 huge collection across 4 servers. My straightforward idea is to set up a cloud with 4 shards (one on each server) and also have a replica of each shard on another server.

Re: SolrCloud config question and zookeeper

2014-10-27 Thread Michael Della Bitta
You want external ZooKeepers. Partially because you don't want your Solr garbage collections holding up ZooKeeper availability, but also because you don't want your ZooKeepers going offline if you have to restart Solr for some reason. Also, you want 3 or 5 ZooKeepers, not 4 or 8. On
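The "3 or 5, not 4 or 8" advice follows from majority quorums: a ZooKeeper ensemble stays available only while more than half of its members are up. A small sketch of the arithmetic (function names are made up for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many members can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


TOLERANCE = {n: tolerated_failures(n) for n in (3, 4, 5, 7, 8)}
for size, failures in TOLERANCE.items():
    print(size, "nodes tolerate", failures, "failure(s)")
```

An even-sized ensemble tolerates no more failures than the odd size just below it (4 behaves like 3, 8 like 7), so the extra node adds cost without adding fault tolerance.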

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread S.L
Thanks Otis, I have checked the logs, in my case the default catalina.out, and I don't see any OOMs or any other exceptions. What other metrics do you suggest? On Mon, Oct 27, 2014 at 9:26 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, You may simply be overwhelming your

RE: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Markus Jelsma
It is an ancient issue. One of the major contributors to the issue was resolved some versions ago, but we still see it sometimes too, and there is nothing to see in the logs. We ignore it and just reindex. -Original message- From:S.L simpleliving...@gmail.com Sent: Monday 27th

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Michael Della Bitta
I'm curious, could you elaborate on the issue and the partial fix? Thanks! On 10/27/14 11:31, Markus Jelsma wrote: It is an ancient issue. One of the major contributors to the issue was resolved some versions ago but we are still seeing it sometimes too, there is nothing to see in the logs.

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread S.L
Markus, I would like to ignore it too, but what's happening is that there is a lot of discrepancy between the replicas; queries like q=*:*&fq=(id:220a8dce-3b31-4d46-8386-da8405595c47) fail depending on which replica the request goes to, because of the huge amount of discrepancy between the

RE: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Markus Jelsma
https://issues.apache.org/jira/browse/SOLR-4260 (resolved) https://issues.apache.org/jira/browse/SOLR-4924 (open) -Original message- From:Michael Della Bitta michael.della.bi...@appinions.com Sent: Monday 27th October 2014 16:40 To: solr-user@lucene.apache.org Subject: Re: Heavy

Re: set solr to return only doc ids and highlighting

2014-10-27 Thread john eipe
Perfect. Thanks. Regards, John Eipe “The Roots of Violence: Wealth without work, Pleasure without conscience, Knowledge without character, Commerce without morality, Science without humanity, Worship without sacrifice, Politics without principles” - Mahatma Gandhi

RE: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Markus Jelsma
Hi - if there is a very large discrepancy, you could consider purging the smallest replica; it will then resync from the leader. -Original message- From:S.L simpleliving...@gmail.com Sent: Monday 27th October 2014 16:41 To: solr-user@lucene.apache.org Subject: Re: Heavy

RE: suggestion for new custom atomic update

2014-10-27 Thread Elran Dvir
Thank you very much for your suggestion. I created an update processor factory with my logic. I changed the update processor chain to be: <processor class="solr.LogUpdateProcessorFactory" /> <processor class="solr.RunUpdateProcessorFactory" /> <processor

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread S.L
One is not smaller than the other, because numDocs is the same for both replicas, and essentially they seem to be disjoint sets. Also, manually purging the replicas is not an option, because this index is updated frequently and we need everything to be automated. What other options do I have now?

Re: suggestion for new custom atomic update

2014-10-27 Thread Shalin Shekhar Mangar
Hi Elran, You need to explicitly specify the DistributedUpdateProcessorFactory in the chain and then add your custom processor after it. On Mon, Oct 27, 2014 at 9:26 PM, Elran Dvir elr...@checkpoint.com wrote: Thank you very much for your suggestion. I created an update processor factory

[ANN] Heliosearch 0.08 released

2014-10-27 Thread Yonik Seeley
http://heliosearch.org/download Heliosearch v0.08 Features: o Heliosearch v0.08 is based on (and contains all features of) Lucene/Solr 4.10.2 o Streaming Aggregations over search results API: http://heliosearch.org/streaming-aggregation-for-solrcloud/ o Optimized request logging, and

Re: suggestion for new custom atomic update

2014-10-27 Thread Matthew Nigl
No problem Elran. As Shalin mentioned, you will need to do it like this: <processor class="solr.DistributedUpdateProcessorFactory"/> <processor class="mycode.solr_plugins.FieldManipulationProcessorFactory" /> <processor class="solr.LogUpdateProcessorFactory" /> <processor class="solr.RunUpdateProcessorFactory"
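Why the position in the chain matters can be shown with a toy model. This is not Solr's API: the two functions below are hypothetical stand-ins for the distributed/atomic-update stage and for a custom processor, and the standard "inc" modifier is used in place of the custom sum operation discussed in the thread. The rule that derives field_b is invented for the example. The point is only that a stage placed after the merge sees 13 (stored 8 plus incoming 5), not 5.

```python
def distributed_update(incoming, stored):
    """Toy stand-in for the stage that resolves atomic updates against
    the stored document, e.g. {"inc": 5} applied to a stored value of 8."""
    merged = dict(stored)
    for field, value in incoming.items():
        if isinstance(value, dict) and "inc" in value:
            merged[field] = stored.get(field, 0) + value["inc"]
        else:
            merged[field] = value
    return merged


def derive_field_b(doc):
    """Toy custom processor: compute field_b from the final field_a.
    The threshold rule is hypothetical."""
    out = dict(doc)
    out["field_b"] = "high" if out["field_a"] > 10 else "low"
    return out


STORED = {"id": "1", "field_a": 8}
UPDATE = {"id": "1", "field_a": {"inc": 5}}

# Custom stage runs after the merge, so it sees field_a == 13.
RESULT = derive_field_b(distributed_update(UPDATE, STORED))
print(RESULT)
```

If the custom stage ran before the merge it would only see the raw update document, which is why the processor is listed after DistributedUpdateProcessorFactory.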

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread S.L
Please find the clusterstate.json attached. Also, in this case at least the Shard1 replicas are out of sync, as can be seen below. Shard 1 replica 1 does not return a result with distrib=false. Query:

Re: Stopwords in shingles suggester

2014-10-27 Thread O. Klein
Thank you all for your input. The stopword is being replaced by the fillerToken as shown in the article. Changing positionIncrementGap makes no difference and as of Solr 4.4, the enablePositionIncrements argument is no longer supported in the StopFilterFactory. So how do I get this working in

Re: Stopwords in shingles suggester

2014-10-27 Thread Ahmet Arslan
Hi, I think you can set the fillerToken value? Ahmet On Monday, October 27, 2014 8:03 PM, O. Klein kl...@octoweb.nl wrote: Thank you all for your input. The stopword is being replaced by the fillerToken as shown in the article. Changing positionIncrementGap makes no difference and as of Solr

Re: Stopwords in shingles suggester

2014-10-27 Thread O. Klein
I changed luceneMatchVersion to 4.3 and got the behavior I was looking for.
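The difference discussed in this thread can be pictured with a toy model in plain Python. It is not Lucene's implementation, and the link to luceneMatchVersion is an assumption: presumably the older match version lets the stop filter drop the position gap that the shingle filter would otherwise fill with the fillerToken. All function names and the sample tokens are made up.

```python
def stop_filter(tokens, stopwords, keep_position_gaps):
    """Remove stopwords; optionally leave a gap (None) where each one was."""
    kept = []
    for token in tokens:
        if token.lower() in stopwords:
            if keep_position_gaps:
                kept.append(None)  # gap left behind by the removed stopword
        else:
            kept.append(token)
    return kept


def bigram_shingles(tokens, filler="_"):
    """Build two-word shingles, writing the filler token into any gap."""
    filled = [filler if token is None else token for token in tokens]
    return [" ".join(pair) for pair in zip(filled, filled[1:])]


TOKENS = ["jobs", "of", "apple"]
WITH_GAPS = bigram_shingles(stop_filter(TOKENS, {"of"}, True))
WITHOUT_GAPS = bigram_shingles(stop_filter(TOKENS, {"of"}, False))
print(WITH_GAPS)
print(WITHOUT_GAPS)
```

With gaps kept, the shingles carry the filler in place of the stopword; without gaps, the neighbouring words are joined directly, which appears to be the behavior the original question was after.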

RE: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Will Martin
2 naïve comments, of course. - Queuing theory - Zookeeper logs. From: S.L [mailto:simpleliving...@gmail.com] Sent: Monday, October 27, 2014 1:42 PM To: solr-user@lucene.apache.org Subject: Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread S.L
Good point about ZK logs , I do see the following exceptions intermittently in the ZK log. 2014-10-27 06:54:14,621 [myid:1] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /xxx.xxx.xxx.xxx:56877 which had sessionid 0x34949dbad580029

RE: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Will Martin
Erick Erickson has a comment on a thread out there that says there's a lot of pinging between SolrCloud and ZK. AND if a timeout occurs (which could be fallback behavior on that exception) ZK will mark the node down AND SolrCloud won't use it until ZK gets back inline/online. Fwiw.