Hi Evan,

That is definitely not the expected behaviour, and I believe it is covered by tests that use the DirectRunner. Are you able to share your pipeline code, or describe how you source your records, please? It could be that something else is causing EsIO to see bundles of only one record each.
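To illustrate why one-record bundles would produce a 1:1 ratio of requests to documents, here is a toy Python model. It is not Beam or ElasticsearchIO code: `count_bulk_requests` and its parameters are made up for illustration, and it assumes the writer buffers documents within a bundle and flushes either when the buffer reaches a maximum batch size or when the bundle finishes.

```python
from typing import Iterable, List


def count_bulk_requests(bundles: Iterable[List[str]], max_batch_size: int = 1000) -> int:
    """Count simulated bulk requests for the given bundles.

    Hypothetical model: a batch never spans two bundles, so whatever is
    buffered when a bundle ends is sent as its own request.
    """
    requests = 0
    for bundle in bundles:
        batch: List[str] = []  # buffer is reset at the start of every bundle
        for doc in bundle:
            batch.append(doc)
            if len(batch) >= max_batch_size:
                requests += 1  # flush: batch is full
                batch = []
        if batch:
            requests += 1  # flush: bundle finished with documents still buffered
    return requests


docs = [f"doc-{i}" for i in range(10)]

# Every document arrives in its own bundle: one request per document.
print(count_bulk_requests([[d] for d in docs], max_batch_size=5))  # 10

# All documents arrive in a single bundle: two full batches of five.
print(count_bulk_requests([docs], max_batch_size=5))  # 2
```

Under that assumption, the batch size setting only matters once bundles contain more than one record, which is why how the records are sourced is the first thing to check.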
I’ll verify ES IO behaviour when I get to a computer too.

Tim (on phone)

> On 6 Dec 2018, at 22:00, [email protected] <[email protected]> wrote:
>
> Hi all,
>
> I’m having a bit of trouble with ElasticsearchIO Write transform. I’m able to
> successfully index documents into my elasticsearch cluster, but batching does
> not seem to work. There ends up being a 1:1 ratio between HTTP requests sent
> to `/my-index/_doc/_bulk` and the number of documents in my PCollection to
> which I apply the ElasticsearchIO PTransform. I’ve noticed this specifically
> under the DirectRunner by utilizing a debugger.
>
> Am I missing something? Is this possibly a difference between execution
> environments (Ex. DirectRunner Vs. DataflowRunner)? How can I make sure my
> program is taking advantage of batching/bulk indexing?
>
> Thanks,
> Evan
