How can I throttle Spark as it writes to Elasticsearch? I have already repartitioned down to one partition in an effort to slow the writes, but ES still indicates it is being overloaded, and I don't know how to slow things down further. This is all on a single r4.xlarge EC2 node that runs both Spark (with 25GB of RAM) and ES.
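For reference, a minimal sketch of the kind of write configuration involved, assuming the standard elasticsearch-hadoop connector (the option names are real es-hadoop settings, but the values and the index name here are illustrative guesses, not tested tunings):

```python
# Hypothetical sketch: elasticsearch-hadoop batch settings that control how
# aggressively Spark pushes documents to ES. Smaller batches and longer retry
# waits are the knobs I'd expect to throttle the write rate with.
es_write_conf = {
    "es.resource": "my_index/my_type",    # hypothetical index/type target
    "es.batch.size.entries": "100",       # docs per bulk request (default 1000)
    "es.batch.size.bytes": "100kb",       # bytes per bulk request (default 1mb)
    "es.batch.write.retry.count": "10",   # retries before failing a bulk request
    "es.batch.write.retry.wait": "30s",   # pause between retries, lets ES recover
}

# In PySpark this dict would be passed to the Hadoop OutputFormat, e.g.:
# rdd.saveAsNewAPIHadoopFile(
#     path="-",
#     outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
#     keyClass="org.apache.hadoop.io.NullWritable",
#     valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
#     conf=es_write_conf,
# )
```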
The script: https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch04/pyspark_to_elasticsearch.py

The error: https://gist.github.com/rjurney/ec0d6b1ef050e3fbead2314255f4b6fa

I asked the question on the Elasticsearch forums, and I thought someone here might know: https://discuss.elastic.co/t/spark-elasticsearch-exception-maybe-es-was-overloaded/71932

Thanks!
--
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com relato.io