Re: pio train on Amazon EMR

2018-02-05 Thread Pat Ferrel
I agree. We looked at using EMR and found that we liked custom Terraform + Docker much better. EMR as defined by AWS requires either refactoring PIO or running it in YARN's cluster mode. EMR is not meant to host any application code except what is sent into Spark in serialized form.

Re: pio train on Amazon EMR

2018-02-05 Thread Mars Hall
Hi Malik,

This is a topic I've been investigating as well. Given how EMR manages its clusters & their runtime, I don't think hacking configs to make the PredictionIO host act like a cluster member will be a simple or sustainable approach. PredictionIO already operates Spark by building