Re: Advice on Stemming in Solr

2017-11-02 Thread Zheng Lin Edwin Yeo
Hi Emir, We are looking to change to HunspellStemFilterFactory. This has a dictionary file containing words and applicable flags, and an affix file that specifies how these flags will control spell checking. Probably we can control it from those files in HunspellStemFilterFactory? Regards, Edwin

Trouble using Jython script as ScriptTransformer

2017-11-02 Thread Kevin Grimes
Hey all, I’m running v6.3.0. I’ve been trying to configure a Jython ScriptTransformer in my data-config.xml (pulls from JdbcDataSource). But when I run the full import, it tries to interpret the script as JavaScript, even though I added the language=Jython attribute to the

Configuring HDFS Keyprovider for Solr

2017-11-02 Thread q4
I'm trying to create a Solr collection and store it in an HDFS encryption zone. Getting errors below: org.apache.solr.common.SolrException: Error CREATEing SolrCore 'person4_shard1_replica_n1': Unable to create core [person4_shard1_replica_n1] Caused by: No KeyProvider is configured, cannot

Re: Anyone have any comments on current solr monitoring favorites?

2017-11-02 Thread Emir Arnautović
Hi Robi, Did you try Sematext’s SPM? It provides host, JVM and Solr metrics and more. We use it for monitoring our Solr instances and for consulting. Disclaimer - see signature :) Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training

RE: adding documents to a secured solr server.

2017-11-02 Thread Phil Scadden
Yes, that worked. -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Thursday, 2 November 2017 6:14 p.m. To: solr-user@lucene.apache.org Subject: Re: adding documents to a secured solr server. On 11/1/2017 10:04 PM, Phil Scadden wrote: > For testing, I changed to

Re: Solr streaming innerJoin doesn't return rows

2017-11-02 Thread Webster Homer
Thank you, that helps a lot. I suspect that we won't use joins, but getting them to work at all is a plus. However, that does work once I add the /export to both searches. It doesn't perform all that badly considering that I am running it on a small SolrCloud on an underpowered developer's VM.

Re: Solr streaming innerJoin doesn't return rows

2017-11-02 Thread Joel Bernstein
The joins are MapReduce joins which require shuffling of entire result sets. This means you need to use the /export handler to make them work. The joins in general are designed to be done in parallel on large clusters. You won't be able to get good performance with large joins on a single node or
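
As a rough illustration of the /export point above, the sketch below builds an innerJoin streaming expression in which both inner searches set qt="/export" and are sorted on the join key. The collection names, field names, and the two helper functions are hypothetical, chosen only for this example; the code only assembles the expression string and does not contact Solr.

```python
def export_search(collection, fields, sort_field, query="*:*"):
    """Build a search() expression that reads through the /export handler."""
    fl = ",".join(fields)
    return (
        f'search({collection}, q="{query}", fl="{fl}", '
        f'sort="{sort_field} asc", qt="/export")'
    )


def inner_join(left, right, on):
    """Wrap two stream expressions in an innerJoin on a shared field."""
    return f'innerJoin({left}, {right}, on="{on}")'


# Hypothetical collections and fields; both sides are sorted on the join key.
expr = inner_join(
    export_search("products", ["sku", "name"], "sku"),
    export_search("prices", ["sku", "price"], "sku"),
    on="sku",
)
print(expr)
```

The resulting string could be pasted into the streaming expression screen of the Admin Console. Note that the /export handler expects the requested and sorted fields to have docValues.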

From Zero to Learning to Rank in Apache Solr

2017-11-02 Thread Michael Alcorn
Here's a tutorial I wrote that some of you all might find useful: https://github.com/airalcorn2/Solr-LTR. Feedback is welcome. Thanks, Michael A. Alcorn

Re: Anyone have any comments on current solr monitoring favorites?

2017-11-02 Thread Walter Underwood
We use New Relic for JVM, CPU, and disk monitoring. I tried the built-in metrics support in 6.4, but it just didn’t do what we want. We want rates and percentiles for each request handler. That gives us 95th percentile for textbooks suggest or for homework search results page, etc. The Solr

Solr streaming innerJoin doesn't return rows

2017-11-02 Thread Webster Homer
I'm using Solr 6.2.0. I am trying to understand how the streaming API works. In 6.2, simple expressions seem to behave well. I am having a problem making the joins work. I don't see errors, but I don't see data either. Using the Solr Admin Console for testing, this query works:

Re: Solr streaming questions

2017-11-02 Thread Webster Homer
This is a new project, and its requirements are not yet completely defined. The system we are looking at building is an automated B2B system where a customer's system calls in with queries and we return products, SKUs, pricing and availability to the caller. As it turns out relevancy will not be

Anyone have any comments on current solr monitoring favorites?

2017-11-02 Thread Petersen, Robert (Contr)
OK I'm probably going to open a can of worms here... lol In the old old days I used PSI probe to monitor solr running on tomcat which worked ok on a machine by machine basis. Later I had a grafana dashboard on top of graphite monitoring which was really nice looking but kind of complicated

Re: Upgrade path from 5.4.1

2017-11-02 Thread Petersen, Robert (Contr)
Thanks guys! I kind of suspected this would be the best route and I'll move forward with a fresh start on 7.x as soon as I can get ops to give me the needed machines!  Best Robi From: Erick Erickson Sent: Thursday, November 2, 2017

Re: Upgrade path from 5.4.1

2017-11-02 Thread Erick Erickson
Yonik: Yeah, I was just parroting what had been reported; I have no data to back it up personally. I just saw the JIRA that Simon indicated and it looks like the statement "which are faster on all fronts and use less memory" is just flat wrong when it comes to looking up individual values. Ya

Re: how to ensure that one shard does not get overloaded when we use routing

2017-11-02 Thread Erick Erickson
Well, you have to monitor. That's the downside to using this type of routing: you're effectively saying "I know enough about my usage to predict". What do you think you're gaining by using this? Putting all docs from a single org on a subset of your servers reduces some part of the parallelism

ANNOUNCE: Solr Reference Guide for Solr 7.1 released

2017-11-02 Thread Cassandra Targett
The Lucene PMC is pleased to announce that the Solr Reference Guide for 7.1 is now available. This 1,077-page PDF is the definitive guide to using Apache Solr, the search server built on Lucene. The PDF Guide can be downloaded from:

Re: SynonymGraphFilterFactory with edismax

2017-11-02 Thread Amar Raja
Thanks Steve, We have a smoking gun! I am on 6.5.1, and have tested in 7.1 and I don't see the same issue. I can't upgrade just yet, however I have found setting mm=1 sorts this out in my case, giving me the following: (+(+DisjunctionMaxQueryweb_name:metal (+web_name:rose

how to ensure that one shard does not get overloaded when we use routing

2017-11-02 Thread Ketan Thanki
Hi, I have 4 shards and 4 replicas, and I use composite document routing on my unique field 'Id', as described below. E.g., the tenant bits are used as a projectId/2! prefix on the Id. How can we ensure that one shard does not get overloaded when we use routing? Regards, Ketan.
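
To make the ID shape in this question concrete, here is a minimal sketch that builds IDs with the projectId/2! prefix mentioned above and counts documents per project prefix as one crude way to spot a tenant that dominates. The function names and sample project IDs are hypothetical, and the code works on a plain list of IDs rather than querying Solr.

```python
from collections import Counter


def composite_id(project_id, doc_id, bits=2):
    """Build an ID of the form projectId/bits!docId."""
    return f"{project_id}/{bits}!{doc_id}"


def docs_per_project(ids):
    """Count documents per routing prefix (the part before '/' or '!')."""
    return Counter(i.split("!", 1)[0].split("/", 1)[0] for i in ids)


# Hypothetical sample: projA has five documents, projB has one.
ids = [composite_id("projA", n) for n in range(5)] + [composite_id("projB", 1)]
counts = docs_per_project(ids)
print(counts.most_common())
```

A heavily skewed count for one prefix is a hint that the shards serving that prefix may carry more than their share.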

Re: SynonymGraphFilterFactory with edismax

2017-11-02 Thread Steve Rowe
Hi Amar, What version of Solr are you using? This looks like a bug that was fixed in Solr 6.6.1: . -- Steve www.lucidworks.com > On Nov 2, 2017, at 8:31 AM, Amar Raja > wrote: > > Hello, > > I have

SynonymGraphFilterFactory with edismax

2017-11-02 Thread Amar Raja
Hello, I have the following field definition: And the following two synonym definitions: kids => boys,girls metallic => rose gold,metallic The intent being a user searching for "kids" should get girls or boys results, but searching for "boys" will

Re: Upgrade path from 5.4.1

2017-11-02 Thread simon
Though see SOLR-11078, which is reporting significant query slowdowns after converting *Trie to *Point fields in 7.1, compared with 6.4.2 On Wed, Nov 1, 2017 at 9:06 PM, Yonik Seeley wrote: > On Wed, Nov 1, 2017 at 2:36 PM, Erick Erickson > wrote:

Re: SOLR-11504: Provide a config to restrict number of indexing threads

2017-11-02 Thread Michael McCandless
Actually, it's one Lucene segment per *concurrent* indexing thread. So if you have 10 indexing threads in Lucene at once, then 10 in-memory segments will be created and will have to be written on refresh/commit. Elasticsearch uses a bounded thread pool to service all indexing requests, which I

Re: Advice on Stemming in Solr

2017-11-02 Thread Emir Arnautović
Hi Edwin, It seems that it would be best if you do not apply the *ing stemming rule at all. The first idea is to trick the stemmer by replacing the ending of any word that ends with "ing" with some nonexistent character combination, e.g. ‘wqx’. You can use solr.PatternReplaceFilterFactory to do that. You can switch it back
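
The replace-then-restore trick described above can be sketched in plain Python regular expressions. This is only an illustration of the idea, not Solr configuration: the function names are hypothetical, and in Solr the same substitutions would be expressed as filters placed before and after the stemmer in the analysis chain.

```python
import re

PLACEHOLDER = "wqx"  # character combination assumed not to occur in real words


def protect_ing(token):
    """Hide a trailing 'ing' so a stemmer would not recognise the suffix."""
    return re.sub(r"ing$", PLACEHOLDER, token)


def restore_ing(token):
    """Switch the placeholder back to 'ing' after stemming."""
    return re.sub(PLACEHOLDER + r"$", "ing", token)


for word in ["running", "cat"]:
    hidden = protect_ing(word)
    print(word, "->", hidden, "->", restore_ing(hidden))
```

Words that do not end in "ing" pass through both steps unchanged.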

Re: SOLR-11504: Provide a config to restrict number of indexing threads

2017-11-02 Thread Emir Arnautović
Hi Nawab, > One indexing thread in lucene corresponds to one segment being written. I > need a fine control on the number of segments. I didn’t check the code, but I would be surprised if that is how things work. It can appear that it is working like that if each client thread is doing

Re: Automatic creation of indexes

2017-11-02 Thread Jokin C
Nice presentation, the concepts in it are the reason I was searching for this feature. Thanks! On Wed, Nov 1, 2017 at 5:12 PM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > >Emir, your message did not actually include anything related to the > presentation you mentioned. >

Re: Automatic creation of indexes

2017-11-02 Thread Jokin C
Oh, nice, this was just what I was looking for, I will follow the issue. Thanks! On Wed, Nov 1, 2017 at 3:03 PM, Shawn Heisey wrote: > On 10/31/2017 5:32 AM, Jokin Cuadrado wrote: > >> Hi, I'm using solr to store time series data, log events etc. Right now I >> use a solr

Re: LatLonPointSpatialField, sorting : sort param could not be parsed as a query, and is not a field that exists in the index

2017-11-02 Thread Clemens Wyss DEV
Sorry for "re-asking". Is anybody else facing this issue (bug?), or can anybody provide advice on "where to look"? Thx Clemens -Original Message- From: Clemens Wyss DEV [mailto:clemens...@mysign.ch] Sent: Wednesday, 1 November 2017 11:06 To: 'solr-user@lucene.apache.org'