Thank you for sharing, Kevin! Great tips, especially how to set up S3 storage on EMR.
-- Alex

On Wed, May 18, 2016 at 6:04 PM, Kevin (Sangwoo) Kim <[email protected]> wrote:

> Hi Ahyoung,
>
> I just added #6 while writing this mail, after realizing I had left the
> cluster running after the presentation. (Haha)
>
> I'm attaching the slides.
> (Sorry to non-Korean readers, but most of the slides are screenshots, so I
> hope they help!)
>
> - Kevin
>
> On Wed, May 18, 2016 at 4:55 PM, Hyung Sung Shim <[email protected]> wrote:
>
>> Thank you for sharing great information!
>>
>> 2016-05-18 16:49 GMT+09:00 Ahyoung Ryu <[email protected]>:
>>
>>> Hi Kevin,
>>>
>>> Thanks for sharing. It's really helpful, not only to me but also
>>> to many others.
>>> I think *6. Don't forget to terminate the cluster when you're done with
>>> your job* is the most important thing :)
>>> Is there any way I can see your slides? If so, it would be really
>>> appreciated.
>>>
>>> Best regards,
>>> Ahyoung
>>>
>>> On Wed, May 18, 2016 at 3:19 PM, Kevin (Sangwoo) Kim <[email protected]> wrote:
>>>
>>>> Hi Zeppelin users,
>>>>
>>>> I presented a demo on "Spark+Zeppelin on AWS EMR" at AWS
>>>> Summit Seoul yesterday. Unfortunately the slides are written in Korean, so
>>>> they're hard to share, but I'd like to share some essentials.
>>>>
>>>> 1. Running Zeppelin on EMR is super easy. (The EMR team did a really good
>>>> job. You can do it with only a few clicks; it took 8 min to launch.)
>>>>
>>>> 2. You can launch EMR with spot instances, which will save you money.
>>>>
>>>> 3. You can provide some configs when you launch an EMR cluster, so you may
>>>> want to save your notebooks on S3. The proper config is as follows.
>>>>
>>>> [
>>>>   {
>>>>     "Classification": "zeppelin-env",
>>>>     "Properties": {},
>>>>     "Configurations": [
>>>>       {
>>>>         "Classification": "export",
>>>>         "Properties": {
>>>>           "ZEPPELIN_NOTEBOOK_STORAGE": "org.apache.zeppelin.notebook.repo.S3NotebookRepo",
>>>>           "ZEPPELIN_NOTEBOOK_S3_BUCKET": "BUCKET_NAME",
>>>>           "ZEPPELIN_NOTEBOOK_S3_USER": "SOME_USER_NAME"
>>>>         },
>>>>         "Configurations": []
>>>>       }
>>>>     ]
>>>>   }
>>>> ]
>>>>
>>>> 4. You need to set a proper spark.executor.memory in the Zeppelin
>>>> interpreter settings.
>>>>
>>>> 5. You can increase or decrease the cluster size on the cluster detail page.
>>>>
>>>> 6. Don't forget to terminate the cluster when you're done with your job :)
>>>>
>>>> That's all!
>>>>
>>>> If you have more tips, please add them to this mail thread. Thanks!
>>>>
>>>> - Kevin
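
For tip 3, here is a minimal Python sketch that generates the configuration JSON instead of typing it by hand, which makes it harder to end up with a stray quote or a missing brace. The helper names zeppelin_s3_config and to_json are hypothetical (they are not part of EMR or Zeppelin), and the bucket and user values are the same placeholders used in Kevin's mail. It only builds and prints the JSON; supplying it at cluster launch is left to whatever method you already use.

```python
import json


def zeppelin_s3_config(bucket, user):
    """Build the configuration list from tip 3 (hypothetical helper).

    bucket and user are placeholders; replace them with your own values.
    """
    return [
        {
            "Classification": "zeppelin-env",
            "Properties": {},
            "Configurations": [
                {
                    "Classification": "export",
                    "Properties": {
                        "ZEPPELIN_NOTEBOOK_STORAGE": "org.apache.zeppelin.notebook.repo.S3NotebookRepo",
                        "ZEPPELIN_NOTEBOOK_S3_BUCKET": bucket,
                        "ZEPPELIN_NOTEBOOK_S3_USER": user,
                    },
                    "Configurations": [],
                }
            ],
        }
    ]


def to_json(config):
    """Serialize the config and parse it back as a basic validity check."""
    text = json.dumps(config, indent=2)
    json.loads(text)  # raises ValueError if the text is not valid JSON
    return text


if __name__ == "__main__":
    print(to_json(zeppelin_s3_config("BUCKET_NAME", "SOME_USER_NAME")))
```

Running the script prints JSON with the same structure as the config quoted above, with your bucket and user name filled in.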
