Just to be sure I understand -- are you suggesting the following syntax:

curl -XPOST 'http://localhost:9200/test/type1/_bulk?replication=async&timeout=5m' -d '
{ "index" : { "_id" : "i1", "version": 3, "version_type": "external" } }
{ "fields": "values etc." }
'

For my use case, version_type is always "external" for all documents in the 
request.  But I get the motivation for specifying it per-doc.
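
To make sure we mean the same thing, here is a rough Python sketch of how I would assemble that request: "replication" and "timeout" go into the URL once, while "version" and "version_type" stay on each action line. The helper name build_bulk_request and the (id, version, source) tuple layout are my own inventions for illustration; it only builds the URL and body and does not send anything.

```python
import json
from urllib.parse import urlencode


def build_bulk_request(docs, base_url="http://localhost:9200", index="test",
                       doc_type="type1", replication="async", timeout="5m"):
    """Return (url, body) for a _bulk call.

    docs is an iterable of (doc_id, version, source) tuples; this layout is
    an assumption made for the example, not part of any Elasticsearch API.
    """
    # Request-level parameters are set once, in the query string.
    query = urlencode({"replication": replication, "timeout": timeout})
    url = f"{base_url}/{index}/{doc_type}/_bulk?{query}"

    lines = []
    for doc_id, version, source in docs:
        # Per-document parameters stay on the action line.
        action = {"index": {"_id": doc_id, "version": version,
                            "version_type": "external"}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))

    # Each action and source must sit on its own line, ending with a newline.
    body = "\n".join(lines) + "\n"
    return url, body


url, body = build_bulk_request([("i1", 3, {"fields": "values etc."})])
print(url)
print(body, end="")
```

The body can then be posted with curl or any HTTP client, exactly as in the command above.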

You said that the per-doc timeout is ignored in bulk mode, so the 
Elasticsearch default timeout of [1m] should have been applied to my 
original requests.
Do you know why a [0s] timeout was applied instead?  This is from one of 
the responses:

{"index":"test","_type":"type1","_id":"123","status":503,"error":"UnavailableShardsException[[test][98] [3] shardIt, [3] active : Timeout waiting for [0s], request: org.elasticsearch.action.bulk.BulkShardRequest@36d185a1]"}
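
In case it helps with reproducing this, below is a small Python sketch of how I pick these failures out of the per-item results and read the timeout value from the error text. It assumes each item is a dict shaped like the one quoted above; the function name failed_items and the regular expression are my own, and the second sample item is made up just to show a non-failing entry.

```python
import re

# Matches the "Timeout waiting for [0s]" fragment seen in the error above.
TIMEOUT_RE = re.compile(r"Timeout waiting for \[([^\]]+)\]")


def failed_items(items):
    """Return (id, status, timeout) for every item with an error status."""
    failures = []
    for item in items:
        if item.get("status", 200) >= 400:
            match = TIMEOUT_RE.search(item.get("error", ""))
            timeout = match.group(1) if match else None
            failures.append((item.get("_id"), item["status"], timeout))
    return failures


sample = [
    {"index": "test", "_type": "type1", "_id": "123", "status": 503,
     "error": "UnavailableShardsException[[test][98] [3] shardIt, "
              "[3] active : Timeout waiting for [0s], request: "
              "org.elasticsearch.action.bulk.BulkShardRequest@36d185a1]"},
    # Hypothetical successful item, included only for contrast.
    {"index": "test", "_type": "type1", "_id": "124", "status": 201},
]

print(failed_items(sample))  # [('123', 503, '0s')]
```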


On Tuesday, July 29, 2014 12:21:21 AM UTC-7, Jörg Prante wrote:
>
> You can use "version" and "version_type" per doc, of course.
>
> The parameters "replication" and "timeout" per doc are ignored when using 
> bulk mode. They must be set at bulk request level. 
>
> Each bulk request is split and forwarded to the relevant shards. This 
> splitting is very fast: it searches for delimiters in the request chunk, 
> sorts the actions that belong to one shard, and forwards them as new 
> packets. For these packets, the bulk request level parameters "replication" 
> and "timeout" should work. 
>
> Although the request format looks heavy, it is most appropriate for 
> distributed processing.
>
> Jörg
>
>
> On Tue, Jul 29, 2014 at 1:02 AM, Ashish Mishra wrote:
>
>> I'm uploading documents using syntax like the following.
>>
>> curl -XPOST 'http://localhost:9200/test/type1/_bulk' -d '
>> { "index" : { "_id" : "i1", "version": 3, "version_type": "external", "replication": "async", "timeout": "5m" } }
>> { "fields": "values etc." }
>> { "index" : { "_id" : "i2", "version": 1, "version_type": "external", "replication": "async", "timeout": "5m" } }
>> { "fields": "values etc." }
>> '
>>
>> A couple of questions:  First, there's a fair bit of redundancy in the 
>> action line.  It feels wasteful when sending tens of MB / thousands of 
>> requests per API call.
>> Can I roll default version_type / replication / timeout parameters into 
>> the top-level _bulk url?  I've seen a few resolved issues suggesting this. 
>>  But it's not mentioned in the documentation at 
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html
>>
>>
>> Second, in the response I occasionally see errors like
>> {"index":"test","_type":"type1","_id":"123","status":503,"error":"UnavailableShardsException[[test][98] [3] shardIt, [3] active : Timeout waiting for [0s], request: org.elasticsearch.action.bulk.BulkShardRequest@36d185a1]"}
>>
>> The "[0s]" part is surprising.  The available-shard-timeout is 1m by 
>> default, and I explicitly requested 5m.  Does this get overridden somewhere?
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/bc6294c4-cca4-4eec-961a-491e6c6c007b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
