Re: Realtime get not always returning existing data

2019-06-05 Thread damienk
I'm using Solr 7.7.1, 12 shards, router:{"field":"route", "name":"compositeId"}, and find the realtime get only returns results if I specify the leader core-url. Most of the time I see no results. On Thu, 11 Oct 2018 at 23:41, Chris Ulicny wrote: > We are relatively far behind with this one. The

Re: Realtime get not always returning existing data

2018-10-11 Thread Chris Ulicny
We are relatively far behind with this one. The collections that we experience the problem on are currently running on 6.3.0. If it's easy enough for you to upgrade, it might be worth a try, but I didn't see any changes to the RealTimeGet in either of the 7.4/5 change logs after a cursory glance.

Re: Realtime get not always returning existing data

2018-10-11 Thread sgaron cse
Hey Chris, Which version of SOLR are you running? I was thinking of maybe trying another version to see if it fixes the issue. On Thu, Oct 11, 2018 at 8:11 AM Chris Ulicny wrote: > We've also run into that issue of not being able to reproduce it outside of > running production loads. > > Howeve

Re: Realtime get not always returning existing data

2018-10-11 Thread Chris Ulicny
We've also run into that issue of not being able to reproduce it outside of running production loads. However, we haven't been encountering the problem in live production quite as much as we used to, and I think that might be from the /get requests being spread out a little more evenly over the ru

Re: Realtime get not always returning existing data

2018-10-10 Thread sgaron cse
I haven't found a way to reproduce the problem other than running our entire set of code. I've also been trying different things to make sure the problem is not from my end and so far I haven't managed to fix it by changing my code. It has to be a race condition somewhere but I just can't put my fin

Re: Realtime get not always returning existing data

2018-10-10 Thread Erick Erickson
Well assigning a bogus version that generates a 409 error then immediately doing an RTG on the doc doesn't fail for me either 18 million tries later. So I'm afraid I haven't a clue where to go from here. Unless we can somehow find a way to generate this failure I'm going to drop it for the foreseea
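
A minimal sketch of the kind of check described above: send an update carrying a bogus _version_ (expected to be rejected with HTTP 409), then immediately issue a realtime get for the same id. The base URL, collection name, and function names are assumptions for illustration; this is not Erick's actual test program.

```python
import json
import random
import urllib.error
import urllib.parse
import urllib.request


def conflicting_update_body(doc_id, rng=random):
    """Build a JSON update body whose _version_ is a random value > 1.

    For _version_ > 1 Solr requires an exact match with the stored version,
    so a random value is expected to be rejected with HTTP 409.
    """
    bogus_version = rng.randint(2, 2**62)
    return json.dumps([{"id": doc_id, "_version_": bogus_version}]).encode("utf-8")


def update_then_get(base_url, collection, doc_id, timeout=5.0):
    """Send the conflicting update, then immediately RTG the same id.

    base_url and collection are hypothetical, e.g. "http://localhost:8983/solr"
    and "test". Returns (update_status, doc), where doc is None for a null doc.
    """
    update = urllib.request.Request(
        f"{base_url}/{collection}/update",
        data=conflicting_update_body(doc_id),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(update, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # 409 is the expected outcome here

    get_url = f"{base_url}/{collection}/get?id={urllib.parse.quote(doc_id)}&wt=json"
    with urllib.request.urlopen(get_url, timeout=timeout) as resp:
        doc = json.loads(resp.read().decode("utf-8")).get("doc")
    return status, doc
```

Looping update_then_get over known ids and flagging any (409, None) result would reproduce the shape of the experiment, assuming the document already exists in the collection.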

Re: Realtime get not always returning existing data

2018-10-09 Thread Erick Erickson
H. I wonder if a version conflict or perhaps other failure can somehow cause this. It shouldn't be very hard to add that to my test setup, just randomly add a _version_ field value. Erick On Mon, Oct 1, 2018 at 8:20 AM Erick Erickson wrote: > > Thanks. I'll be away for the rest of the week, s

Re: Realtime get not always returning existing data

2018-10-01 Thread Erick Erickson
Thanks. I'll be away for the rest of the week, so won't be able to try anything more On Mon, Oct 1, 2018 at 5:10 AM Chris Ulicny wrote: > > In our case, we are heavily indexing in the collection while the /get > requests are happening which is what we assumed was causing this very rare > behav

Re: Realtime get not always returning existing data

2018-10-01 Thread Chris Ulicny
In our case, we are heavily indexing in the collection while the /get requests are happening which is what we assumed was causing this very rare behavior. However, we have experienced the problem for a collection where the following happens in sequence with minutes in between them. 1. Document id=

Re: Realtime get not always returning existing data

2018-09-30 Thread Erick Erickson
57 million queries later, with constant indexing going on and 9 dummy collections in the mix and the main collection I'm querying having 2 shards, 2 replicas each, I have no errors. So unless the code doesn't look like it exercises any similar path, I'm not sure what more I can test. "It works on

Re: Realtime get not always returning existing data

2018-09-29 Thread Erick Erickson
Steve: bq. Basically, one core had data in it that should belong to another core. Here's my question about this: Is it possible that two requests to the /get API coming in at the same time would get confused and either both get the same result or the results get inverted? Well, that shouldn't be happe

Re: Realtime get not always returning existing data

2018-09-29 Thread Shawn Heisey
On 9/28/2018 8:11 PM, sgaron cse wrote: @Shawn We're running two instances on one machine for two reasons: 1. The box has plenty of resources (48 cores / 256GB ram) and since I was reading that it's not recommended to use more than 31GB of heap in SOLR we figured 96 GB for keeping index data in OS

Re: Realtime get not always returning existing data

2018-09-28 Thread sgaron cse
@Shawn We're running two instances on one machine for two reasons: 1. The box has plenty of resources (48 cores / 256GB ram) and since I was reading that it's not recommended to use more than 31GB of heap in SOLR we figured 96 GB for keeping index data in OS cache + 31 GB of heap per instance was a g

Re: Realtime get not always returning existing data

2018-09-28 Thread Erick Erickson
Well, I flipped indexing on and after another 7 million queries, no fails. No reason to stop just yet, but not encouraging so far... On Fri, Sep 28, 2018, 10:58 Erick Erickson wrote: > I've set up a test program on a local machine, we'll see if I can reproduce > here's the setup: > > 1> created

Re: Realtime get not always returning existing data

2018-09-28 Thread Erick Erickson
I've set up a test program on a local machine, we'll see if I can reproduce. Here's the setup: 1> created a 2-shard, leader(primary) only collection 2> added 1M simple docs to it (ids 0-999,999) and some text 3> re-added 100_000 docs with a random id between 0 - 999,999 (inclusive) to ensure t
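
The document-generation part of the setup above (steps 2 and 3) could be sketched roughly as follows. The field name "text", the text contents, and the function names are assumptions; the original test program is not shown in the thread.

```python
import random


def initial_docs(count=1_000_000):
    """Yield simple docs with sequential ids 0..count-1 and some text."""
    for i in range(count):
        yield {"id": str(i), "text": f"some text for doc {i}"}


def readd_docs(count=100_000, max_id=999_999, rng=random):
    """Yield docs whose ids are drawn at random from 0..max_id (inclusive).

    Indexing these overwrites existing documents that share the same id.
    """
    for _ in range(count):
        i = rng.randint(0, max_id)  # randint includes both endpoints
        yield {"id": str(i), "text": f"re-added text for doc {i}"}
```

Both generators are lazy, so the documents can be batched and posted to the update handler without holding a million dicts in memory.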

Re: Realtime get not always returning existing data

2018-09-28 Thread Shawn Heisey
On 9/28/2018 6:09 AM, sgaron cse wrote: because this is a test deployment replica is set to 1 so as far as I understand, data will not be replicated for this core. Basically we have two SOLR instances running on the same box. One on port 8983, the other on port 8984. We have 9 cores on this SOLR

Re: Realtime get not always returning existing data

2018-09-28 Thread sgaron cse
Hey Shawn, because this is a test deployment replica is set to 1 so as far as I understand, data will not be replicated for this core. Basically we have two SOLR instances running on the same box. One on port 8983, the other on port 8984. We have 9 cores on this SOLR cloud deployment, 5 of which o

Re: Realtime get not always returning existing data

2018-09-27 Thread Chris Ulicny
I don't think I've much to add that Steve hasn't already covered, but we've also seen this "null doc" problem in one of our setups. In one of our Solr Cloud instances in production where the /get handler is hit very hard in bursts, the /get request will occasionally return "null" for a document th

Re: Realtime get not always returning existing data

2018-09-27 Thread Shawn Heisey
On 9/27/2018 11:48 AM, sgaron cse wrote: So this is a SOLR core where we keep configuration data so it is almost never written to. The statistics for the core say it's been last modified 4 hours ago, yet I got doc:null from the API an hour ago. And also you don't have to have a lot of data in th

Re: Realtime get not always returning existing data

2018-09-27 Thread sgaron cse
So this is a SOLR core where we keep configuration data so it is almost never written to. The statistics for the core say it's been last modified 4 hours ago, yet I got doc:null from the API an hour ago. And also you don't have to have a lot of data in the core. For example, this core has only 11

Re: Realtime get not always returning existing data

2018-09-27 Thread Erick Erickson
Steve: Thanks. So theoretically I should be able to set up a cluster, index a bunch of docs to it and then just hammer RTG calls against those IDs and sometimes see a failure? Hmmm, I guess a follow-up question is whether there's any indexing going on at all when this happens. Or, more specifically

Re: Realtime get not always returning existing data

2018-09-27 Thread sgaron cse
Hey Erick, We're using SOLR 7.3.1, which is not the latest but still not too far back. No, the document has not been recently indexed; in fact, I can use the /search API endpoint to find the document. But I need a fast way to find documents that have not necessarily been indexed yet so /search is o

Re: Realtime get not always returning existing data

2018-09-27 Thread Erick Erickson
What version of Solr are you running? Mostly that's for curiosity. Is the doc that's not returned something you've recently indexed? Here's a possible scenario: You send the doc out to be indexed. The primary forwards the doc to the followers. Before the follower has a chance to process (but not c

Realtime get not always returning existing data

2018-09-26 Thread sgaron cse
Hey all, We're trying to use SOLR for our document store and are facing some issues with the Realtime Get API. Basically, we're doing an API call from multiple endpoints to retrieve configuration data. The document that we are retrieving does not change at all but sometimes the API returns a null d
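
For reference, the Realtime Get call discussed in this thread can be sketched as below. The base URL and the collection name are assumptions; the /get handler and the "doc": null response shape are the ones reported in the thread.

```python
import json
import urllib.parse
import urllib.request


def rtg_url(base_url, collection, doc_id):
    """Build a Realtime Get URL for a single document id."""
    query = urllib.parse.urlencode({"id": doc_id, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/get?{query}"


def is_null_doc(response_body):
    """Return True when the /get response carries "doc": null."""
    return json.loads(response_body).get("doc") is None


def realtime_get(base_url, collection, doc_id, timeout=5.0):
    """Fetch one document; returns the doc dict, or None for a null doc.

    base_url and collection are hypothetical, e.g.
    "http://localhost:8983/solr" and "config".
    """
    url = rtg_url(base_url, collection, doc_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8")).get("doc")
```

A caller that knows the document exists can log every None returned by realtime_get along with the node it queried, which helps narrow down whether the null docs come from one instance or all of them.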