Just to be sure I understand -- are you suggesting the following syntax:
curl -XPOST 'http://localhost:9200/test/type1/_bulk?replication=async&timeout=5m' -d '
{ "index" : { "_id" : "i1", "version": 3, "version_type": "external" } }
{ "fields": "values etc." }
'
For my use case, version_type is always "external" for all documents in the
request. But I get the motivation for specifying it per-doc.
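In case it helps, here is a rough Python sketch of how I would generate such a request on the client side, assuming the URL-level "replication" and "timeout" parameters behave as described. The helper name build_bulk_request is just something I made up for illustration. It only builds the URL and the body; it does not send anything.

```python
import json
from urllib.parse import urlencode


def build_bulk_request(base_url, docs, replication="async", timeout="5m"):
    """Return (url, body) for a _bulk call (hypothetical helper).

    replication/timeout go on the URL (request level), while version and
    version_type stay on each action line (per doc).
    """
    query = urlencode({"replication": replication, "timeout": timeout})
    url = base_url + "/_bulk?" + query
    lines = []
    for doc_id, version, source in docs:
        action = {"index": {"_id": doc_id, "version": version,
                            "version_type": "external"}}
        # Each action and each source document must sit on its own line.
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk body is newline-delimited JSON and ends with a newline.
    body = "\n".join(lines) + "\n"
    return url, body


url, body = build_bulk_request(
    "http://localhost:9200/test/type1",
    [("i1", 3, {"fields": "values etc."}),
     ("i2", 1, {"fields": "values etc."})],
)
print(url)
print(body, end="")
```

The body could then be written to a file and posted with curl using --data-binary, so that the newlines are preserved.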
You said that the per-doc timeout is ignored in bulk mode. In that case the
Elasticsearch default timeout of [1m] should have been applied to my
original requests.
Do you know why a [0s] timeout was applied instead? This is from one of the
responses:
{"index":"test","_type":"type1","_id":"123","status":503,"error":"UnavailableShardsException[[test][98] [3] shardIt, [3] active : Timeout waiting for [0s], request: org.elasticsearch.action.bulk.BulkShardRequest@36d185a1]"}
On Tuesday, July 29, 2014 12:21:21 AM UTC-7, Jörg Prante wrote:
>
> You can use "version" and "version_type" per doc, of course.
>
> The parameters "replication" and "timeout" per doc are ignored when using
> bulk mode. They must be set at bulk request level.
>
> Each bulk request is split and forwarded to the relevant shards. This
> splitting is very fast: it searches for delimiters in the request chunk,
> sorts the actions that belong to one shard, and forwards them as new
> packets. For these packets, the bulk request level parameters "replication"
> and "timeout" should work.
>
> Although the request format looks heavy, it is most appropriate for
> distributed processing.
>
> Jörg
>
>
> On Tue, Jul 29, 2014 at 1:02 AM, Ashish Mishra <[email protected]> wrote:
>
>> I'm uploading documents using syntax like the following.
>>
>> curl -XPOST 'http://localhost:9200/test/type1/_bulk' -d '
>> { "index" : { "_id" : "i1", "version": 3, "version_type": "external", "replication": "async", "timeout": "5m" } }
>> { "fields": "values etc." }
>> { "index" : { "_id" : "i2", "version": 1, "version_type": "external", "replication": "async", "timeout": "5m" } }
>> { "fields": "values etc." }
>> '
>>
>> A couple of questions: First, there's a fair bit of redundancy in the
>> action line. It feels wasteful when sending tens of MB / thousands of
>> requests per API call.
>> Can I roll default version_type / replication / timeout parameters into
>> the top-level _bulk URL? I've seen a few resolved issues suggesting this,
>> but it's not mentioned in the documentation at
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html
>>
>> Second, in the response I occasionally see errors like
>> {"index":"test","_type":"type1","_id":"123","status":503,"error":"UnavailableShardsException[[test][98] [3] shardIt, [3] active : Timeout waiting for [0s], request: org.elasticsearch.action.bulk.BulkShardRequest@36d185a1]"}
>>
>> The "[0s]" part is surprising. The available-shard-timeout is 1m by
>> default, and I explicitly requested 5m. Does this get overridden somewhere?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/ba4bde17-2668-42c4-9d14-0923571044d5%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>