Re: 0.90.2 _update or _bulk update causing NullPointerException in logs and I start losing shards

Boaz Leskes Sun, 22 Jun 2014 13:12:07 -0700

If you restart the node it's on, it doesn't come back?


On Sun, Jun 22, 2014 at 10:01 PM, Rohit Jaiswal <[email protected]>
wrote:

> Hi Boaz,
>                Thanks for replying. After we get this error, the cluster
> health changes to Yellow with a replica shard in Unassigned state. Is there
> a specific way to recover that shard? We don't want to lose other data on
> that shard.
>
> Thanks,
> Rohit
>
>
> On Sun, Jun 22, 2014 at 12:50 PM, Boaz Leskes <[email protected]> wrote:
>
>> Hi Rohit,
>>
>> With this bug the update fails anyway (because of the version conflict),
>> but the error also breaks the entire request. You should indeed set the
>> retry_on_conflict option so that the update request succeeds. PS: you
>> should really upgrade; a lot has happened and been fixed since 0.90.2.
>>
>> Cheers,
>> Boaz
>>
>>
>> On Monday, June 16, 2014 10:26:06 PM UTC+2, Rohit Jaiswal wrote:
>>>
>>> Hi Boaz,
>>> We are using 0.90.2 and have run into this issue. As I understand it,
>>> one option is to upgrade to 0.90.3. If we stay on 0.90.2 and set (or
>>> increase) retry_on_conflict, will we stop seeing the problem? Please
>>> clarify.
>>>
>>> Thanks,
>>> Rohit
>>> On Wednesday, August 7, 2013 9:39:56 AM UTC-7, Boaz Leskes wrote:
>>>
>>>> Hi Eric,
>>>>
>>>> OK. Based on the gist you sent, I tracked down a problem and fixed it:
>>>> https://github.com/elasticsearch/elasticsearch/issues/3448 . Thanks!!
>>>> The fix is part of 0.90.3, so I'd recommend upgrading. This is a secondary
>>>> problem which occurs when two requests try to update the same document at
>>>> exactly the same time. One of them succeeds and the other fails with a
>>>> version conflict (that error was masked by the error you were seeing). You
>>>> can use (or increase) the retry_on_conflict parameter to make the failing
>>>> request try again.
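
A minimal sketch of the retry behaviour described above, assuming an in-memory stand-in for a versioned document store. `Store`, `VersionConflictError`, `updateWithRetry`, and the `beforeWrite` hook are hypothetical names for illustration only; this is not the Elasticsearch client API or its internal implementation. It only shows why a second writer loses with a version conflict and how a retry budget (the role `retry_on_conflict` plays) lets the losing request re-read the document and try again.

```javascript
// Hypothetical error type standing in for a version conflict.
class VersionConflictError extends Error {}

// Hypothetical in-memory store: each document carries a version number.
class Store {
  constructor() { this.docs = new Map(); }
  get(id) {
    const d = this.docs.get(id);
    return d ? { version: d.version, source: { ...d.source } }
             : { version: 0, source: {} };
  }
  // A write succeeds only if the caller read the current version.
  put(id, source, expectedVersion) {
    const current = this.docs.has(id) ? this.docs.get(id).version : 0;
    if (current !== expectedVersion) {
      throw new VersionConflictError(`expected ${expectedVersion}, found ${current}`);
    }
    this.docs.set(id, { version: current + 1, source });
  }
}

// Read, modify, write; on a conflict, re-read and retry up to retryOnConflict times.
// beforeWrite is a hook used here only to simulate a concurrent writer.
function updateWithRetry(store, id, mutate, retryOnConflict, beforeWrite = () => {}) {
  for (let attempt = 0; ; attempt++) {
    const { version, source } = store.get(id);
    mutate(source);
    beforeWrite(attempt);
    try {
      store.put(id, source, version);
      return attempt; // number of retries that were needed
    } catch (err) {
      if (!(err instanceof VersionConflictError) || attempt >= retryOnConflict) throw err;
    }
  }
}

const store = new Store();
store.put("doc-1", { count: 1 }, 0);

// A second client updates the same document during our first attempt only.
const rival = (attempt) => {
  if (attempt === 0) {
    const seen = store.get("doc-1");
    store.put("doc-1", { ...seen.source, count: seen.source.count + 10 }, seen.version);
  }
};

const retries = updateWithRetry(store, "doc-1", (src) => { src.count += 1; }, 1, rival);
console.log(retries, store.get("doc-1"));
```

With a retry budget of 0 the same race surfaces as a conflict error to the caller; with a budget of 1 the second attempt works from the rival's result, so neither increment is lost.
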
>>>>
>>>> I'm still curious about your report of losing replicas. Can you
>>>> elaborate on what happens? Do you see anything in the logs?
>>>>
>>>> Cheers,
>>>> Boaz
>>>>
>>>> On Tuesday, August 6, 2013 5:09:26 AM UTC+2, Eric Sites wrote:
>>>>>
>>>>> Boaz,
>>>>>
>>>>> Sorry, but I no longer have those logs; I upgraded to 0.90.2 from
>>>>> 0.90.0 and wiped the logs when I did.
>>>>> I did the upgrade so I could use the _bulk API for my updates.
>>>>>
>>>>> Basically the "lang", "js" was not the issue.
>>>>>
>>>>> I was using different scripts with the same set of params and an
>>>>> upsert. The fix was to use a different param name for each script,
>>>>> about 10 unique scripts in total.
>>>>>
>>>>> I was losing replicated shards about every 10,000 to 30,000 updates,
>>>>> never the primary shard.
>>>>>
>>>>> I have 185 million+ large JSON documents, with 100 shards in 1 index
>>>>> and 1 replica, so 200 shards total over 6 servers. Each shard is about
>>>>> 10.4 GB in size.
>>>>> About 2 TB of data: 1 TB primary, 1 TB replicated.
>>>>>
>>>>> Cheers,
>>>>> Eric Sites
>>>>>
>>>>> From: Boaz Leskes <[email protected]>
>>>>> Reply-To: <[email protected]>
>>>>> Date: Monday, August 5, 2013 5:38 PM
>>>>> To: <[email protected]>
>>>>> Subject: Re: 0.90.2 _update or _bulk update causing
>>>>> NullPointerException in logs and I start losing shards
>>>>>
>>>>> Hi Eric,
>>>>>
>>>>> Glad to hear you solved it. It would be great if you can share the
>>>>> failed logs from the _update (non bulk call). A failed script shouldn't
>>>>> cause shards to drop so I would like to research it some more.
>>>>>
>>>>> Cheers,
>>>>> Boaz
>>>>>
>>>>>
>>>>> On Mon, Aug 5, 2013 at 6:40 PM, Eric Sites <[email protected]> wrote:
>>>>>
>>>>>> Boaz,
>>>>>>
>>>>>> I found and fixed the problem.
>>>>>>
>>>>>> I added "lang": "js" to the update JSON; that was not needed
>>>>>> before in ES 0.90.0.
>>>>>> I also changed the name of new_tracking to match the name of the
>>>>>> action in the params section.
>>>>>> So for example the script now looks like this:
>>>>>>
>>>>>> if (ctx._source['tracking'] != null) {
>>>>>>     if (ctx._source.tracking['some_action'] != null) {
>>>>>>         ctx._source.tracking.some_action += param1;
>>>>>>     } else {
>>>>>>         ctx._source.tracking['some_action'] = 1;
>>>>>>     }
>>>>>> } else {
>>>>>>     ctx._source.tracking = new_some_action;
>>>>>> }
>>>>>>
>>>>>> "params" : { "param1" : 1, "new_some_action" : { "some_action" : 1 } }
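
To make the three branches of the script above concrete, here is a small sketch that runs the same logic outside Elasticsearch. The wrapper function `applyTrackingScript` and the hand-built `ctx` objects are assumptions for illustration; in a real update request, `ctx` and the params are supplied by the update API rather than passed in like this.

```javascript
// The script body from the message above, wrapped in a plain function.
// applyTrackingScript is a hypothetical helper; ctx is built by hand here.
function applyTrackingScript(ctx, params) {
  const { param1, new_some_action } = params;
  if (ctx._source['tracking'] != null) {
    if (ctx._source.tracking['some_action'] != null) {
      // Counter already exists: increment it.
      ctx._source.tracking.some_action += param1;
    } else {
      // Tracking object exists but has no counter for this action yet.
      ctx._source.tracking['some_action'] = 1;
    }
  } else {
    // No tracking object at all: install the one passed in the params.
    ctx._source.tracking = new_some_action;
  }
  return ctx._source;
}

// Fresh params per call so the documents do not share one object.
const makeParams = () => ({ param1: 1, new_some_action: { some_action: 1 } });

const noTracking = applyTrackingScript({ _source: { title: "a" } }, makeParams());
const emptyTracking = applyTrackingScript({ _source: { tracking: {} } }, makeParams());
const existingCounter = applyTrackingScript(
  { _source: { tracking: { some_action: 4 } } }, makeParams());

console.log(noTracking, emptyTracking, existingCounter);
```

Note that the param holding the initial object (`new_some_action`) is named after the action, which matches the fix described in the message: each script uses its own param name.
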
>>>>>>
>>>>>> Cheers,
>>>>>> Eric Sites
>>>>>>
>>>>>> From: Boaz Leskes <[email protected]>
>>>>>> Reply-To: <[email protected]>
>>>>>> Date: Monday, August 5, 2013 10:35 AM
>>>>>> To: <[email protected]>
>>>>>> Subject: Re: 0.90.2 _update or _bulk update causing
>>>>>> NullPointerException in logs and I start losing shards
>>>>>>
>>>>>> Hi Eric,
>>>>>>
>>>>>> This is interesting. The log stack trace from the gist comes from the
>>>>>> bulk calls. Can you also post one from a failed _update? Cross-checking
>>>>>> them might help pinpoint the issue.
>>>>>>
>>>>>> Cheers,
>>>>>> Boaz
>>>>>>
>>>>>> On Monday, August 5, 2013 1:34:16 AM UTC+2, [email protected] wrote:
>>>>>>>
>>>>>>> I am getting a java.lang.NullPointerException in my
>>>>>>> ElasticSearch cluster logs when I do a _bulk update or just an
>>>>>>> _update.
>>>>>>> I am sending a lot of data to my clusters. After I get this error I
>>>>>>> lose a shard and it has to be recreated.
>>>>>>>
>>>>>>> version 0.90.2
>>>>>>>
>>>>>>> gist: https://gist.github.com/EricSites/6152468
>>>>>>>
>>>>>>> I get this using the _bulk api or just normal _update api.
>>>>>>>
>>>>>>> My update script is a little complicated.
>>>>>>> I am adding a tracking object to my document if it does not exist.
>>>>>>> There should only be one of these, and it should not be an array.
>>>>>>> If the object does exist, I am trying to add a new field to the
>>>>>>> tracking object to keep track of counts.
>>>>>>> So if the field does not exist I create it, else I just += to it.
>>>>>>>
>>>>>>> if (ctx._source['tracking'] != null) {
>>>>>>>     if (ctx._source.tracking['some_action'] != null) {
>>>>>>>         ctx._source.tracking.some_action += param1;
>>>>>>>     } else {
>>>>>>>         ctx._source.tracking['some_action'] = 1;
>>>>>>>     }
>>>>>>> } else {
>>>>>>>     ctx._source.tracking = new_tracking;
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Here is my mapping for this:
>>>>>>> {
>>>>>>>    "sample" : {
>>>>>>>       "index_options" : "docs",
>>>>>>>       "properties" : {
>>>>>>>          "tracking" : {
>>>>>>>              "type" : "object",
>>>>>>>              "dynamic" : true
>>>>>>>          }
>>>>>>>       }
>>>>>>>    }
>>>>>>> }
>>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "elasticsearch" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
