Re: NRT Real time Get with documentCache

2020-02-03 Thread Karl Stoney
Great stuff thank you Erick On 04/02/2020, 00:17, "Erick Erickson" wrote: The documentCache shouldn’t matter at all. RTG should return the latest doc by maintaining a pointer into the tlogs and returning that version. > On Feb 3, 2020, at 6:43 PM, Karl Stoney wrote: > > Hi,

RE: Solr 8.4.1 error

2020-02-03 Thread Srinivas Kashyap
Sorry for the interruption. This error was due to a wrong context path in solr-jetty-context.xml, while jetty.xml was referring to /solr, so the index was locked. Thanks, Srinivas -Original Message- From: Srinivas Kashyap Sent: 04 February 2020 11:04 To:

Re: how splitting more shards impact performance

2020-02-03 Thread Shawn Heisey
On 2/3/2020 5:17 PM, ChienHua wrote: How should we expect query performance to be affected by splitting one collection into more shards? We expect query performance to degrade with more shards because of the overhead of merging results from several shards. However, the test result

RE: Solr 8.4.1 error

2020-02-03 Thread Srinivas Kashyap
Hi Shawn, I did delete the data folder of the core and also did in windows command: solr stop -all. I see only one solr server is running in this machine which gets started and stopped when I do so. To confirm, I even copied my folders to another system and tried there but facing same issue.

how splitting more shards impact performance

2020-02-03 Thread ChienHua
How should we expect query performance to be affected by splitting one collection into more shards? We expect query performance to degrade with more shards because of the overhead of merging results from several shards. However, the test result does not seem to match what we expect. Any idea or

Query Elevation Component

2020-02-03 Thread Sidharth Negi
Hi, I want to use the Solr query elevation component. Let's say I want to elevate "doc_id" when a user inputs the query "qwerty". I am able to get a prototype to work by filling these values in elevate.xml and hitting the Solr API with q="qwerty". However, in our service, where I want to plug

Re: NRT Real time Get with documentCache

2020-02-03 Thread Erick Erickson
The documentCache shouldn’t matter at all. RTG should return the latest doc by maintaining a pointer into the tlogs and returning that version. > On Feb 3, 2020, at 6:43 PM, Karl Stoney > wrote: > > Hi, > Could anyone let me know if a real time get would return a cached, up to date > version

NRT Real time Get with documentCache

2020-02-03 Thread Karl Stoney
Hi, Could anyone let me know if a real time get would return a cached, up to date version of a document if we enabled documentCache? Thanks Karl

Blocking certain queries

2020-02-03 Thread John Davis
Hello, Is there a way to block certain queries in Solr? For example, a delete for *:*, or a known query that causes problems: can these be blocked at the Solr server layer?

Graph Query Bug ?

2020-02-03 Thread sambasivarao giddaluri
Hi All , Solr 8.2 Database structure . Parent -> Children Each child has parent referenceId Query: Get Parent doc based on child query Method 1: {!graph from=parentId to=parentId traversalFilter='docType:parent' returnRoot=false}child.name:foo AND child.type:name Result : 1 Debug:

Connection spike when slight solr latency spike

2020-02-03 Thread Karl Stoney
Hey all, When our searcher refreshes on a soft-commit, we get a slight latency spike (p99 response times can jump to about 200ms from 100ms). However, what we see in the upstream clients using org.apache.solr.client.solrj SolrClient is a big spike in outbound connections (70-80 per client,

Re: Performance comparison for wildcard searches

2020-02-03 Thread Shawn Heisey
On 2/3/2020 12:06 PM, Rahul Goswami wrote: I am working with Solr 7.2.1 and had a question regarding the performance of wildcard searches. q=*:* vs q=id:* vs q=id:[* TO *] Can someone please rank them in the order of performance with the underlying reason? The only one of those that is an

Re: Solr 8.4.1 error

2020-02-03 Thread Shawn Heisey
On 2/3/2020 5:16 AM, Srinivas Kashyap wrote: I'm trying to upgrade to solr 8.4.1 and facing below error while start up and my cores are not being listed in solr admin screen. I need your help. Caused by: java.nio.channels.OverlappingFileLockException at

Performance comparison for wildcard searches

2020-02-03 Thread Rahul Goswami
Hello, I am working with Solr 7.2.1 and had a question regarding the performance of wildcard searches. q=*:* vs q=id:* vs q=id:[* TO *] Can someone please rank them in the order of performance with the underlying reason? Thanks, Rahul

Re: KeeperErrorCode= BadVersion

2020-02-03 Thread Rajeswari Natarajan
Any thoughts on this? We are continuously publishing and have disabled schemaless mode. Thanks, Rajeswari On Wed, Jan 29, 2020 at 9:18 AM Rajeswari Natarajan wrote: > Hi, > > Getting below exception. We have solrcloud 7.6 installed and have > commented off the below in solrconfig.xml > > > >

Reading authenticated user value inside custom DocTransformer

2020-02-03 Thread mosheB
We are using Solr's kerberos authentication plugin and we are trying to implement field-level filtering based on the authenticated user and DocTransformer class: public class FieldAclTransformerFactory extends TransformerFactory { @Override public DocTransformer create(String

Getting authenticated user inside DocTransformer plugin

2020-02-03 Thread mosheB
We are using Solr's kerberos authentication plugin and we are trying to implement field-level filtering based on the authenticated user and DocTransformer class: public class FieldAclTransformerFactory extends TransformerFactory { @Override public DocTransformer create(String

Re: Replica type affinity

2020-02-03 Thread Jason Gerlowski
This is a bit of a guess - I haven't used this functionality before. But to a novice the "tag" Rule Condition for "Rule Based Replica Placement" sounds similar to the requirements you mentioned above. https://lucene.apache.org/solr/guide/8_3/rule-based-replica-placement.html#rule-conditions Good

Re: How to compute index size

2020-02-03 Thread David Hastings
Yup, I find the right calculation to be as much ram as the server can take, and as much SSD space as it will hold, when you run out, buy another server and repeat. machines/ram/SSD's are cheap. just get as much as you can. On Mon, Feb 3, 2020 at 11:59 AM Walter Underwood wrote: > What he

Re: How to compute index size

2020-02-03 Thread Walter Underwood
What he said. But if you must have a number, assume that the index will be as big as your (text) data. It might be 2X bigger or 2X smaller. Or 3X or 4X, but that is a starting point. Once you start updating, the index might get as much as 2X bigger before merges. Do NOT try to get by with the

Alternative of ChildDocTransformerFactory

2020-02-03 Thread kumar gaurav
Hi Mikhail/All, Do we have any alternative to ChildDocTransformerFactory, i.e. fl=id,[child parentFilter=doc_type:book childFilter=doc_type:chapter limit=100]? I am seeing a high performance impact because of this. Any suggestions? Thanks, Regards, Kumar Gaurav

Auto-Suggest within Tier Architecture

2020-02-03 Thread Moyer, Brett
Hello, Looking to see how others accomplished this goal. We have a 3 Tier architecture, Solr is down deep in T3 far from the end user. How do you make Auto-Suggest calls from the Internet Browser through the Tiers down to Solr in T3? We essentially created steps down each tier, but I'm

Re: Importing Large CSV File into Solr Cloud Fails with 400 Bad Request

2020-02-03 Thread Erick Erickson
I don’t quite know how TolerantUpdateProcessor works with importing CSV files, see: https://issues.apache.org/jira/browse/SOLR-445. That is about sending batches of docs to Solr and frankly I don’t know what path your process will take. It’s worth a try though. Otherwise, I typically go with

Re: Importing Large CSV File into Solr Cloud Fails with 400 Bad Request

2020-02-03 Thread Joseph Lorenzini
Hi Shawn/Erick, This information has been very helpful. Thank you. So I did some more investigation into our ETL process and I verified that with the exception of the text I sent above they are all obviously invalid dates. For example, one field value had 00 for a day so would guess that field

Re: How to compute index size

2020-02-03 Thread Erick Erickson
I’ve always had trouble with that advice, that RAM size should be JVM + index size. I’ve seen 300G indexes (as measured by the size of the data/index directory) run in 128G of memory. Here’s the long form:

Solr 8.4.1 error

2020-02-03 Thread Srinivas Kashyap
Hello, I'm trying to upgrade to solr 8.4.1 and facing below error while start up and my cores are not being listed in solr admin screen. I need your help. 2020-02-03 12:12:35.622 ERROR (coreContainerWorkExecutor-2-thread-1) [ ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on

RE: Solr fact response strange behaviour

2020-02-03 Thread Kaminski, Adi
Hi Mikhail, Here is the code, where basically we are trying to retrieve the value of facet counts, which is sometimes returned as Integer and sometimes as Long, and where we got the ClassCast exception until the W/A of Numeric casting was applied. if (resList != null) { List terms =

How to compute index size

2020-02-03 Thread Mohammed Farhan Ejaz
Hello All, I want to size the RAM for my Solr cloud instance. The rule of thumb is that your total RAM size should be = (JVM size + index size). Now I have a simple question: how do I know my index size? Is there a simple method, perhaps from the Solr cloud admin UI or an API? My assumption so far is the total

SOLR Data Import Handler : A command is still running...

2020-02-03 Thread Doss
We are doing an hourly data import into our index, and one or two requests per day fail with the message "A command is still running...". 1. Does it mean the data import did not happen for the last hour? 2. If you look at the "Full Dump Started" time has an older data, in the below log all