How can I throttle Spark as it writes to Elasticsearch? I have already
repartitioned down to a single partition in an effort to slow the writes, but
ES still indicates it is being overloaded, and I don't know how to slow things
down further. This is all on a single r4.xlarge EC2 node that runs both Spark
(given 25GB of RAM) and ES.

The script:
https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch04/pyspark_to_elasticsearch.py
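
If it helps, here is a minimal sketch of the kind of write I'm doing and the
es-hadoop batch settings I'm wondering about as throttling knobs
(es.batch.size.entries, es.batch.size.bytes, es.batch.write.retry.count,
es.batch.write.retry.wait). The index/type, node address, and sample documents
below are placeholders, and I haven't verified that these settings actually
fix the overload:

# Sketch of the write plus es-hadoop throttling settings (not the exact script).
# The es.batch.* keys come from the elasticsearch-hadoop configuration docs;
# index/type, node address, and documents are placeholders.
# Requires the elasticsearch-hadoop jar on the Spark classpath (e.g. --jars).
from pyspark import SparkContext

sc = SparkContext(appName="pyspark_to_elasticsearch")

# (key, document) pairs; the key is ignored by EsOutputFormat
docs = sc.parallelize([
    ('ignored_key', {'message': 'example document 1'}),
    ('ignored_key', {'message': 'example document 2'}),
]).repartition(1)  # one partition = one writer task, as described above

es_write_conf = {
    'es.resource': 'agile_data_science/example',  # placeholder index/type
    'es.nodes': 'localhost',
    'es.port': '9200',
    # Throttling knobs: send smaller bulk requests and back off longer
    'es.batch.size.entries': '100',       # docs per bulk request (default 1000)
    'es.batch.size.bytes': '500kb',       # bytes per bulk request (default 1mb)
    'es.batch.write.retry.count': '10',   # retries for rejected docs (default 3)
    'es.batch.write.retry.wait': '60s',   # wait between retries (default 10s)
}

docs.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass='org.elasticsearch.hadoop.mr.EsOutputFormat',
    keyClass='org.apache.hadoop.io.NullWritable',
    valueClass='org.elasticsearch.hadoop.mr.LinkedMapWritable',
    conf=es_write_conf,
)

Is lowering es.batch.size.entries / es.batch.size.bytes and raising the retry
wait the right way to throttle this, or is there a better knob on the Spark
side?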

The error: https://gist.github.com/rjurney/ec0d6b1ef050e3fbead2314255f4b6fa

I asked this question on the Elasticsearch forums as well, but I thought
someone here might know:
https://discuss.elastic.co/t/spark-elasticsearch-exception-maybe-es-was-overloaded/71932

Thanks!
-- 
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com relato.io
