Hi Boaz,

How can we fix this issue
(https://github.com/elasticsearch/elasticsearch/issues/4502)?

Will this work?

1. Take a backup of the data and local gateway directories of each ES
   node prior to the node restart.
2. Disable routing allocation on each node.
3. Restart the node.
4. Copy data and gateway from the backup to the node's data and gateway
   directories.
5. Enable routing allocation.
6. Based on the recovery settings, after gateway.recover_after_time has
   elapsed, index recovery will start from the gateway.
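
For steps 2 and 5, a minimal JavaScript sketch of the request bodies might look like the following. It assumes the 0.90.x cluster-level setting cluster.routing.allocation.disable_allocation, sent through the cluster update settings API (PUT /_cluster/settings). The function names and the node name are hypothetical, and nothing is sent to a cluster. Note that this setting applies to the whole cluster rather than to a single node.

```javascript
// Sketch only: describes the allocation toggles from steps 2 and 5.
// Assumption: the 0.90.x setting cluster.routing.allocation.disable_allocation,
// applied through the cluster update settings API. No request is sent here.

// Build a description of the settings request (hypothetical helper name).
function allocationSettingsRequest(disable) {
  return {
    method: "PUT",
    path: "/_cluster/settings",
    body: {
      // "transient" settings do not survive a full cluster restart
      transient: {
        "cluster.routing.allocation.disable_allocation": disable
      }
    }
  };
}

// Ordered plan for one node restart; file copies and the restart itself
// are left as plain labels because they depend on the local setup.
function restartPlan(nodeName) {
  return [
    { step: 1, node: nodeName, action: "back up data and gateway directories" },
    { step: 2, node: nodeName, action: "disable allocation", request: allocationSettingsRequest(true) },
    { step: 3, node: nodeName, action: "restart node" },
    { step: 4, node: nodeName, action: "restore data and gateway directories from backup" },
    { step: 5, node: nodeName, action: "enable allocation", request: allocationSettingsRequest(false) }
  ];
}

// Print the plan for a hypothetical node called "node-1".
restartPlan("node-1").forEach(function (s) {
  var line = s.step + ". " + s.action;
  if (s.request) {
    line += ": " + s.request.method + " " + s.request.path + " " + JSON.stringify(s.request.body);
  }
  console.log(line);
});
```

Keeping the toggle as data makes it easy to review the exact payload before sending it with whatever HTTP client is in use.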

Thanks,
Rohit


On Sun, Jun 22, 2014 at 1:30 PM, Boaz Leskes <[email protected]> wrote:

> Not that I know of. But there is a known, very rare bug (fixed in
> 0.90.8) which can cause data loss upon a node restart:
> https://github.com/elasticsearch/elasticsearch/issues/4502
>
> Maybe you ran into that?
>
>
> On Sun, Jun 22, 2014 at 10:18 PM, Rohit Jaiswal <[email protected]>
> wrote:
>
>> Yes, it did when we restarted the node while trying to reproduce this
>> problem. We were also able to access the data using the scan search API
>> after restarting the node.
>>
>> However, we have seen quite a few of the bulk update errors in our
>> 20-node production cluster and have suffered data loss on other aliases
>> (the alias filter being the user-id) as well. We think the data loss is
>> caused by this bulk update error.
>>
>> Is there a chance of losing data on shards when enough of these bulk
>> updates happen concurrently on multiple aliases (users)?
>>
>> Thanks
>>
>>
>> On Sun, Jun 22, 2014 at 1:10 PM, Boaz Leskes <[email protected]> wrote:
>>
>>> If you restart the node it's on, it doesn't come back?
>>>
>>>
>>> On Sun, Jun 22, 2014 at 10:01 PM, Rohit Jaiswal <[email protected]>
>>> wrote:
>>>
>>>> Hi Boaz,
>>>>
>>>> Thanks for replying. After we get this error, the cluster health
>>>> changes to yellow with a replica shard in the unassigned state. Is
>>>> there a specific way to recover that shard? We don't want to lose
>>>> other data on that shard.
>>>>
>>>> Thanks,
>>>> Rohit
>>>>
>>>>
>>>> On Sun, Jun 22, 2014 at 12:50 PM, Boaz Leskes <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Rohit,
>>>>>
>>>>> This issue means the update fails anyway, but it also breaks the
>>>>> entire request. You should indeed set the retry_on_conflict option to
>>>>> make the update request succeed. PS: you should really upgrade; a lot
>>>>> has happened and been fixed since 0.90.2 ...
>>>>>
>>>>> Cheers,
>>>>> Boaz
>>>>>
>>>>>
>>>>> On Monday, June 16, 2014 10:26:06 PM UTC+2, Rohit Jaiswal wrote:
>>>>>>
>>>>>> Hi Boaz,
>>>>>>
>>>>>> We are using 0.90.2 and have run into this issue. As I understand it,
>>>>>> one option is to upgrade to 0.90.3. If we continue using 0.90.2 and
>>>>>> use (or increase) retry_on_conflict, will we stop seeing the problem?
>>>>>> Please clarify.
>>>>>>
>>>>>> Thanks,
>>>>>> Rohit
>>>>>> On Wednesday, August 7, 2013 9:39:56 AM UTC-7, Boaz Leskes wrote:
>>>>>>
>>>>>>> Hi Eric,
>>>>>>>
>>>>>>> OK. Based on the gist you sent, I tracked down a problem and fixed
>>>>>>> it: https://github.com/elasticsearch/elasticsearch/issues/3448 .
>>>>>>> Thanks!! The fix is part of 0.90.3, so I'd recommend upgrading. This
>>>>>>> is a secondary problem which occurs when two requests try to update
>>>>>>> the same document at exactly the same time. One of them succeeds and
>>>>>>> the other fails with a version conflict (that error was masked by the
>>>>>>> error you were seeing). You can use (or increase) the
>>>>>>> retry_on_conflict parameter to make the failing request try again.
>>>>>>>
>>>>>>> I'm still curious about your report of losing replicas. Can you
>>>>>>> elaborate on what happens? Do you see anything in the logs?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Boaz
>>>>>>>
>>>>>>> On Tuesday, August 6, 2013 5:09:26 AM UTC+2, Eric Sites wrote:
>>>>>>>>
>>>>>>>> Boaz,
>>>>>>>>
>>>>>>>> Sorry, but I no longer have those logs. I upgraded to 0.90.2 from
>>>>>>>> 0.90.0 and wiped the logs when I did.
>>>>>>>> I did the upgrade to use the _bulk API for my updates.
>>>>>>>>
>>>>>>>> Basically the "lang", "js" was not the issue.
>>>>>>>>
>>>>>>>> I was using different scripts with the same set of params and an
>>>>>>>> upsert. The fix was to use a different param name for different
>>>>>>>> scripts, about 10 unique scripts in total.
>>>>>>>>
>>>>>>>> I was losing replica shards about every 10,000 to 30,000
>>>>>>>> updates, never the primary shard.
>>>>>>>>
>>>>>>>> I have 185 million+ large JSON documents, with 100 shards in 1
>>>>>>>> index with 1 replica, so 200 shards total over 6 servers. Each
>>>>>>>> shard is about 10.4 GB in size.
>>>>>>>> About 2 TB of data: 1 TB primary, 1 TB replicated.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Eric Sites
>>>>>>>>
>>>>>>>> From: Boaz Leskes <[email protected]>
>>>>>>>> Reply-To: <[email protected]>
>>>>>>>> Date: Monday, August 5, 2013 5:38 PM
>>>>>>>> To: <[email protected]>
>>>>>>>> Subject: Re: 0.90.2 _update or _bulk update causing
>>>>>>>> NullPointerException in logs and I start losing shards
>>>>>>>>
>>>>>>>> Hi Eric,
>>>>>>>>
>>>>>>>> Glad to hear you solved it. It would be great if you can share the
>>>>>>>> failed logs from the _update (non bulk call). A failed script shouldn't
>>>>>>>> cause shards to drop so I would like to research it some more.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Boaz
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Aug 5, 2013 at 6:40 PM, Eric Sites <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Boaz,
>>>>>>>>>
>>>>>>>>> I found and fixed the problem.
>>>>>>>>>
>>>>>>>>> I added the "lang", "js" to the update JSON, which was not needed
>>>>>>>>> before in ES 0.90.0.
>>>>>>>>> I also changed the name of new_tracking to match the name of the
>>>>>>>>> action in the params section.
>>>>>>>>> So for example the script now looks like this:
>>>>>>>>>
>>>>>>>>> if (ctx._source['tracking'] != null) {
>>>>>>>>>     if (ctx._source.tracking['some_action'] != null) {
>>>>>>>>>         ctx._source.tracking.some_action += param1;
>>>>>>>>>     } else {
>>>>>>>>>         ctx._source.tracking['some_action'] = 1;
>>>>>>>>>     }
>>>>>>>>> } else {
>>>>>>>>>     ctx._source.tracking = new_some_action;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> "params" : { "param1" : 1, "new_some_action" : { "some_action" : 1 } }
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Eric Sites
>>>>>>>>>
>>>>>>>>> From: Boaz Leskes <[email protected]>
>>>>>>>>> Reply-To: <[email protected]>
>>>>>>>>> Date: Monday, August 5, 2013 10:35 AM
>>>>>>>>> To: <[email protected]>
>>>>>>>>> Subject: Re: 0.90.2 _update or _bulk update causing
>>>>>>>>> NullPointerException in logs and I start losing shards
>>>>>>>>>
>>>>>>>>> Hi Eric,
>>>>>>>>>
>>>>>>>>> This is interesting. The log stack trace from the gist comes from
>>>>>>>>> the bulk calls. Can you also post one from a failed _update?
>>>>>>>>> Cross-checking them might help pinpoint the issue.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Boaz
>>>>>>>>>
>>>>>>>>> On Monday, August 5, 2013 1:34:16 AM UTC+2, [email protected]
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> I am getting a java.lang.NullPointerException in my
>>>>>>>>>> ElasticSearch cluster logs when I am doing a _bulk update or just an
>>>>>>>>>> _update.
>>>>>>>>>> I am sending a lot of data to my clusters. After I get this error
>>>>>>>>>> I lose a shard and it has to be recreated.
>>>>>>>>>>
>>>>>>>>>> version 0.90.2
>>>>>>>>>>
>>>>>>>>>> gist: https://gist.github.com/EricSites/6152468
>>>>>>>>>>
>>>>>>>>>> I get this using the _bulk api or just normal _update api.
>>>>>>>>>>
>>>>>>>>>> My update script is a little complicated.
>>>>>>>>>> I am adding a tracking object to my document if it does not
>>>>>>>>>> exist. There should only be one of these and it should not be an
>>>>>>>>>> array of these.
>>>>>>>>>> If the object does exist, I am trying to add a new field to the
>>>>>>>>>> tracking object to keep track of counts.
>>>>>>>>>> So if the field does not exist I create it, else I just += to it.
>>>>>>>>>>
>>>>>>>>>> if (ctx._source['tracking'] != null) {
>>>>>>>>>>     if (ctx._source.tracking['some_action'] != null) {
>>>>>>>>>>         ctx._source.tracking.some_action += param1;
>>>>>>>>>>     } else {
>>>>>>>>>>         ctx._source.tracking['some_action'] = 1;
>>>>>>>>>>     }
>>>>>>>>>> } else {
>>>>>>>>>>     ctx._source.tracking = new_tracking;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here is my mapping for this:
>>>>>>>>>> {
>>>>>>>>>>    "sample" : {
>>>>>>>>>>       "index_options" : "docs",
>>>>>>>>>>       "properties" : {
>>>>>>>>>>          "tracking" : {
>>>>>>>>>>              "type" : "object",
>>>>>>>>>>              "dynamic" : true
>>>>>>>>>>          }
>>>>>>>>>>       }
>>>>>>>>>>    }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "elasticsearch" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
