Dynamic Stopwords

2020-05-14 Thread A Adel
Hi - Is there a way to configure stop words to be dynamic for each document based on the language detected of a multilingual text field? Combining all languages stop words in one set is a possibility however it introduces false positives for some language combinations, such as German and English.
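
The false-positive concern can be illustrated outside Solr with a minimal sketch. It assumes the language of each document has already been detected upstream. The tiny stop word sets and the `remove_stopwords` helper are hypothetical and are not part of any Solr API.

```python
# Hypothetical, heavily shortened stop word lists for illustration only.
STOPWORDS = {
    "en": {"the", "and", "is", "of"},
    "de": {"der", "die", "das", "und", "ist"},
}
# The "one combined set" alternative mentioned in the question.
COMBINED = set().union(*STOPWORDS.values())


def remove_stopwords(tokens, lang=None):
    """Drop stop words using the set for `lang`, or the combined set if unknown."""
    stops = STOPWORDS.get(lang, COMBINED)
    return [t for t in tokens if t.lower() not in stops]


english = "the engine will die".split()
per_language = remove_stopwords(english, "en")  # keeps the English verb "die"
combined = remove_stopwords(english)            # drops "die" (German article)
```

With the per-language set the English word "die" survives, while the combined set removes it because it is also a German article.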

Re: nested entities and DIH indexing time

2020-05-14 Thread Shawn Heisey
On 5/14/2020 3:14 PM, matthew sporleder wrote: > Can a non-nested entity write into existing docs, or do they always > have to produce document-per-entity? This is the only thing I found on this topic, and it is on a third-party website, so I can't say much about how accurate it is:

Re: Terraform and EC2

2020-05-14 Thread Ganesh Sethuraman
We use Terraform on EC2 to create infrastructure as code for SolrCloud and the ZooKeeper quorum (using a 3-node auto scale target group Terraform module), and for Solr as well with an n-node auto scale group module. The auto scale target group is just to make it easy to create the cluster infrastructure. We need

Terraform and EC2

2020-05-14 Thread Walter Underwood
Anybody building sharded clusters with Terraform on EC2? I’d love some hints. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog)

Re: using aliases in topic stream

2020-05-14 Thread Joel Bernstein
This is where the alias work was done: https://issues.apache.org/jira/browse/SOLR-9077 It could be though that there is a bug here. I'll see if I can reproduce it locally. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, May 14, 2020 at 6:24 PM Nightingale, Jonathan A (US) <

RE: using aliases in topic stream

2020-05-14 Thread Nightingale, Jonathan A (US)
I'm looking at master on GitHub; the solrj tests appear to assume aliases are never used. Just as an example (that's all over the place in the tests): https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/test/org/apache/solr/client/solrj/io/stream/StreamDecoratorTest.java @Test public void

RE: using aliases in topic stream

2020-05-14 Thread Nightingale, Jonathan A (US)
Currently playing with 8.1 but 7.4 is what's in our production environment. -Original Message- From: Joel Bernstein Sent: Wednesday, May 13, 2020 1:11 PM To: solr-user@lucene.apache.org Subject: Re: using aliases in topic stream

Re: nested entities and DIH indexing time

2020-05-14 Thread matthew sporleder
On Thu, May 14, 2020 at 4:46 PM Shawn Heisey wrote: > > On 5/14/2020 9:36 AM, matthew sporleder wrote: > > It appears that adding entities to my entities in my data import > > config is slowing down my import process by a lot. Is there a good > > way to speed this up? I see the ID's are

Re: 404 response from Schema API

2020-05-14 Thread Shawn Heisey
On 5/14/2020 1:13 PM, Mark H. Wood wrote: > On Fri, Apr 17, 2020 at 10:11:40AM -0600, Shawn Heisey wrote: > > On 4/16/2020 10:07 AM, Mark H. Wood wrote: > > > I need to ask Solr 4.10 for the name of the unique key field of a > > > schema. So far, no matter what I've done, Solr is returning a 404. The Luke

Re: nested entities and DIH indexing time

2020-05-14 Thread Shawn Heisey
On 5/14/2020 9:36 AM, matthew sporleder wrote: > It appears that adding entities to my entities in my data import > config is slowing down my import process by a lot. Is there a good > way to speed this up? I see the ID's are individually queried instead > of using IN() or similar normal techniques to

Re: 404 response from Schema API

2020-05-14 Thread Mark H. Wood
On Thu, May 14, 2020 at 03:13:07PM -0400, Mark H. Wood wrote: > Anyway, I'll be reading up on how to upgrade to 5. (Hopefully not > farther, just yet -- changes between, I think, 5 and 6 mean I'd have > to spend a week reloading 10 years worth of data. For now I don't > want to go any farther

Re: 404 response from Schema API

2020-05-14 Thread Mark H. Wood
On Fri, Apr 17, 2020 at 10:11:40AM -0600, Shawn Heisey wrote: > On 4/16/2020 10:07 AM, Mark H. Wood wrote: > > I need to ask Solr 4.10 for the name of the unique key field of a > > schema. So far, no matter what I've done, Solr is returning a 404. > > > > This works: > > > >curl

nested entities and DIH indexing time

2020-05-14 Thread matthew sporleder
It appears that adding entities to my entities in my data import config is slowing down my import process by a lot. Is there a good way to speed this up? I see the ID's are individually queried instead of using IN() or similar normal techniques to make things faster. Just looking for some tips.
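
The difference between querying each ID individually and batching them with IN() can be sketched as follows. This is a generic illustration using Python's built-in sqlite3 module, not DIH configuration. The `child` table, its columns, and the `fetch_children_batched` helper are made up for the example.

```python
import sqlite3


def fetch_children_batched(conn, parent_ids, batch_size=500):
    """Fetch child values for many parents with one IN() query per batch."""
    rows_by_parent = {pid: [] for pid in parent_ids}
    for start in range(0, len(parent_ids), batch_size):
        batch = parent_ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        query = (
            "SELECT parent_id, value FROM child "
            f"WHERE parent_id IN ({placeholders}) "
            "ORDER BY parent_id, value"
        )
        for parent_id, value in conn.execute(query, batch):
            rows_by_parent[parent_id].append(value)
    return rows_by_parent


# Hypothetical schema and data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (parent_id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO child VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (3, "d")],
)
result = fetch_children_batched(conn, [1, 2, 3], batch_size=2)
```

Here three parent IDs are resolved with two queries instead of three; with larger batches the number of round trips to the database drops accordingly.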

Re: DIH nested entity repeating query in verbose output

2020-05-14 Thread matthew sporleder
I think this is just an issue in the verbose/debug output. tcpdump does not show the same issue. On Wed, May 13, 2020 at 7:39 PM matthew sporleder wrote: > > I am attempting to use nested entities to populate documents from > different tables and verbose/debug output is showing repeated queries

Performance issue in Query execution in Solr 8.3.0 and 8.5.1

2020-05-14 Thread vishal patel
I am upgrading from Solr 6.1.0 to Solr 8.3.0 or Solr 8.5.1. I see a performance issue in query execution on Solr 8.3.0 and Solr 8.5.1 when one field has a large number of values in the query and a group field is applied. My Solr URL : https://drive.google.com/file/d/1UqFE8I6M451Z1wWAu5_C1dzqYEOGjuH2/view My Solr

Re: How to determine why solr stops running?

2020-05-14 Thread James Greene
Check the log for an OOM crash. Fatal exceptions will be in the main solr log and out of memory errors will be in their own -oom log. I've encountered quite a few solr crashes and usually it's when there's a threshold of concurrent users and/or indexing happening. On Thu, May 14, 2020,
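
A log check of this kind could be scripted roughly as below. This is a minimal sketch: the sample log text is invented for the example, and the `find_oom_lines` helper is hypothetical. Log file names and locations vary by installation, so the function takes any iterable of lines instead of assuming a path.

```python
import io


def find_oom_lines(log_lines, marker="OutOfMemoryError"):
    """Return (line_number, text) pairs for lines that mention the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if marker in line
    ]


# Invented sample log content; a real run would pass an open log file instead.
sample_log = io.StringIO(
    "INFO  request handled\n"
    "ERROR java.lang.OutOfMemoryError: Java heap space\n"
    "INFO  shutting down\n"
)
hits = find_oom_lines(sample_log)
```

An open file object can be passed in place of `sample_log`, since iterating over a file yields its lines.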

How to determine why solr stops running?

2020-05-14 Thread Ryan W
Hi all, I manage a site where solr has stopped running a couple times in the past week. The server hasn't been rebooted, so that's not the reason. What else causes solr to stop running? How can I investigate why this is happening? Thank you, Ryan

Re: Filtering large amount of values

2020-05-14 Thread Mikhail Khludnev
Hi, Artur. Please don't tell me that you obtain docValues for every doc? It's deadly slow; see https://issues.apache.org/jira/browse/LUCENE-9328 for a related problem. Make sure you obtain them once per segment, when the leaf reader is injected. Recently there are some new method(s) for {!terms} I'm

Filtering large amount of values

2020-05-14 Thread Rudenko, Artur
Hi, We have a requirement to implement a boolean filter with up to 500k values. We took the post filter approach. Our environment has 7 servers, each with 128 GB of RAM and 64 CPUs. We have 20-40m very large documents. Each solr instance has 64 shards with 2 replicas and JVM memory xms

Re: Secure communication between Solr and Zookeeper

2020-05-14 Thread Jan Høydahl
I’m sorry, I don’t have the possibility of completing that now. As I said, you have some pointers in https://issues.apache.org/jira/browse/SOLR-7893 but it is not completed, so this is currently an undocumented (and unsupported) feature. That

Re: Secure communication between Solr and Zookeeper

2020-05-14 Thread ChienHuaWang
Hi Jan, Could you provide more detail on the steps to set this up between ZooKeeper & Solr? -- Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Solr TLS for CDCR

2020-05-14 Thread ChienHuaWang
Does anyone have experience setting up TLS for Solr CDCR? I read the documentation: https://lucene.apache.org/solr/guide/7_6/enabling-ssl.html Would this apply to CDCR once enabled, or do we need additional configuration for CDCR? Appreciate any feedback

Solr 7.4 - LTR reranker not adhering by Elevate Plugin

2020-05-14 Thread Ashwin Ramesh
Hi everybody, We are running a query with both elevateIds=1,2,3 & a reranker phase using LTR plugin. We noticed that the results do not return in the expected order - per the elevateIds param. Example LTR rq param {!ltr.model=foo reRankDocs=250 efi.query=$q} When I used the standard reranker

Re: Slow Query in Solr 8.3.0

2020-05-14 Thread vishal patel
Thanks for the reply. Yes, the query is large, but our functionality requires it. Query length does not matter, because the same query works fine in Solr 6.1.0. Multi-valued return fields are not an issue in my case: if I pass a single return field (fl=id), it also takes time (34 seconds). But if I