I'm trying to write a JavaPairRDD to a downstream database using saveAsNewAPIHadoopFile with a custom OutputFormat, and the save is pretty slow.
Is there a way to increase the concurrency of the save? For example, something like splitting the RDD into multiple smaller RDDs and using Java threads to write the data out? That seems foreign to the way Spark works, so I'm not sure whether there's a better way.
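
To make the threading idea concrete, here is a rough sketch of what I have in mind. It uses a plain Java list as a stand-in for the RDD so that it runs without Spark, and writeChunk is a hypothetical placeholder for the real database write, not my actual OutputFormat code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadedSaveSketch {

    // Hypothetical stand-in for the real database write.
    // Returns the number of records it was given.
    static int writeChunk(List<String> chunk) {
        return chunk.size();
    }

    // Splits the records into roughly equal chunks and submits each chunk
    // to a fixed-size thread pool. Returns the total number of records written.
    static int saveConcurrently(List<String> records, int numChunks) {
        ExecutorService pool = Executors.newFixedThreadPool(numChunks);
        try {
            // Ceiling division so every record lands in some chunk.
            int chunkSize = Math.max(1, (records.size() + numChunks - 1) / numChunks);
            List<Future<Integer>> pending = new ArrayList<>();
            for (int start = 0; start < records.size(); start += chunkSize) {
                int end = Math.min(start + chunkSize, records.size());
                List<String> chunk = records.subList(start, end);
                Callable<Integer> task = () -> writeChunk(chunk);
                pending.add(pool.submit(task));
            }
            // Wait for every write to finish and add up the counts.
            int written = 0;
            for (Future<Integer> result : pending) {
                written += result.get();
            }
            return written;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In the real job each chunk would be a smaller RDD and writeChunk would be a call to saveAsNewAPIHadoopFile, which is the part that feels like it is working against Spark's own partition-level parallelism.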