Re: ElasticsearchIO Write Batching Problems

Tim Thu, 06 Dec 2018 22:37:45 -0800

Hi Evan

That is definitely not the expected behaviour, and I believe batching is 
covered in tests which use the DirectRunner. Are you able to share your 
pipeline code, or describe how you source your records, please? It could be 
that something else is causing EsIO to see bundles of only one record.
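
To illustrate why single-record bundles would give the 1:1 ratio described 
below, here is a rough sketch in plain Java. It is a toy model, not the 
ElasticsearchIO source: BulkBuffer, finishBundle and countBulkRequests are 
hypothetical names, the document counts are illustrative numbers, and the 
only behaviour assumed is that the writer flushes when its buffer reaches a 
maximum batch size and again whenever the runner ends a bundle.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a per-bundle bulk writer. Not the ElasticsearchIO implementation. */
class BundleBatchingSketch {

    /** Hypothetical stand-in for a bulk writer; it only counts simulated _bulk requests. */
    static final class BulkBuffer {
        private final int maxBatchSize;
        private final List<String> pending = new ArrayList<>();
        private int bulkRequests = 0;

        BulkBuffer(int maxBatchSize) {
            this.maxBatchSize = maxBatchSize;
        }

        /** Buffers one document and flushes once the batch is full. */
        void add(String doc) {
            pending.add(doc);
            if (pending.size() >= maxBatchSize) {
                flush();
            }
        }

        /** Assumed to be called by the runner at the end of every bundle. */
        void finishBundle() {
            flush();
        }

        /** Sends whatever is buffered as one simulated bulk request. */
        private void flush() {
            if (pending.isEmpty()) {
                return;
            }
            bulkRequests++;
            pending.clear();
        }

        int bulkRequests() {
            return bulkRequests;
        }
    }

    /** Feeds totalDocs documents in bundles of bundleSize; returns the number of simulated requests. */
    static int countBulkRequests(int totalDocs, int bundleSize, int maxBatchSize) {
        BulkBuffer buffer = new BulkBuffer(maxBatchSize);
        int inBundle = 0;
        for (int i = 0; i < totalDocs; i++) {
            buffer.add("doc-" + i);
            inBundle++;
            if (inBundle == bundleSize) {
                buffer.finishBundle();
                inBundle = 0;
            }
        }
        buffer.finishBundle(); // close a trailing partial bundle, if any
        return buffer.bulkRequests();
    }

    public static void main(String[] args) {
        // Bundles of one record: one request per document.
        System.out.println(countBulkRequests(1000, 1, 1000));
        // One large bundle: a single request.
        System.out.println(countBulkRequests(1000, 1000, 1000));
    }
}
```

In this model the maximum batch size is only an upper bound. The bundle 
size chosen by the runner decides how many documents actually reach each 
request, so a debugger attached to the flush path should show whether the 
end-of-bundle flush or the batch-size flush is the one firing.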


I’ll verify the EsIO behaviour when I get to a computer too.

Tim (on phone)

> On 6 Dec 2018, at 22:00, [email protected] <[email protected]> wrote:
> 
> Hi all,
> 
> I’m having a bit of trouble with ElasticsearchIO Write transform. I’m able to 
> successfully index documents into my elasticsearch cluster, but batching does 
> not seem to work. There ends up being a 1:1 ratio between HTTP requests sent 
> to `/my-index/_doc/_bulk` and the number of documents in my PCollection to 
> which I apply the ElasticsearchIO PTransform. I’ve noticed this specifically 
> under the DirectRunner by utilizing a debugger.
> 
> Am I missing something? Is this possibly a difference between execution 
> environments (e.g. DirectRunner vs. DataflowRunner)? How can I make sure my 
> program is taking advantage of batching/bulk indexing?
> 
> Thanks,
> Evan
