I agree. We looked at using EMR and found that a custom Terraform + Docker
setup suited us much better. EMR as AWS defines it requires either refactoring
PIO or running it in YARN’s cluster mode. EMR is not meant to host any
application code except what is sent into Spark in serialized form.
Hi Malik,
This is a topic I've been investigating as well.
Given how EMR manages its clusters & their runtime, I don't think hacking
configs to make the PredictionIO host act like a cluster member will be a
simple or sustainable approach.
PredictionIO already operates Spark by building