Hi Jonathan,

Thanks! This will be super helpful going forward — one less thing to
remember :-)

Best,
Asif

PS: I would love to see some of the best practices (number of executors,
memory, etc.) from the AWS slides on the official AWS Spark help page:
http://www.slideshare.net/AmazonWebServices/bdt309-data-science-best-practices-for-apache-spark-on-amazon-emr

On Sun, Nov 22, 2015 at 10:20 AM, Jonathan Kelly <jonathaka...@gmail.com>
wrote:

> Asif,
>
> This was also brought up by somebody else at Amazon on Friday, and I
> came up with the following EMR Step that can be used to work around the
> issue:
>
> aws emr create-cluster ... --applications Zeppelin-Sandbox --steps
> Name=CreateZeppelinLocalRepo,Jar=command-runner.jar,Args=[bash,-c,"sudo -u
> zeppelin mkdir /var/lib/zeppelin/local-repo; sudo ln -s
> /var/lib/zeppelin/local-repo /usr/lib/zeppelin"]
>
> Moon's suggestion will also work, but it is a little better to keep
> local-repo underneath /var (which, as of emr-4.2.0, is symlinked to
> /mnt/var). That puts it on the first ephemeral disk rather than on the
> root partition, so your master instance's root partition does not fill up.
>
> Thanks for bringing up this issue, and we will be sure to fix it in the
> next release of EMR so that you won't need this workaround anymore.
>
> ~ Jonathan
>
> On Fri, Nov 20, 2015 at 9:51 PM, Asif Imran <covariantmon...@gmail.com>
> wrote:
>
>> Hi Moon,
>>
>> Thanks so much. This worked perfectly.
>>
>> Asif
>>
>>
>> On Fri, Nov 20, 2015 at 6:06 PM, moon soo Lee <m...@apache.org> wrote:
>>
>>> Hi Asif,
>>>
>>> Thanks for sharing the problem.
>>> I've found that running
>>>
>>> sudo mkdir /usr/lib/zeppelin/local-repo
>>> sudo chown zeppelin /usr/lib/zeppelin/local-repo
>>>
>>> on your EMR master node makes %dep work correctly.
>>>
>>> Hope this helps.
>>>
>>> Best,
>>> moon
>>>
>>> On Sat, Nov 21, 2015 at 7:34 AM Asif Imran <covariantmon...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am running Spark on AWS EMR with default options. In the notebook, I
>>>> am having trouble getting this to work:
>>>>
>>>> %dep
>>>> z.reset()
>>>> z.load("com.databricks:spark-csv_2.10:1.2.0")
>>>>
>>>> Digging through the user list, people in the past had similar issues
>>>> caused by path permissions, proxies, or YARN incompatibility.
>>>>
>>>> Is there a standard way to debug this? A more general question: does
>>>> the dep loader even work for this particular setup (namely, on AWS EMR)?
>>>>
>>>>
>>>> Thanks
>>>> Asif
>>>>
>>>> Love Zeppelin btw :-)
>>>>
>>>>
>>>>
>>>>
>>>> Log
>>>> ————————-
>>>>
>>>> java.lang.NullPointerException
>>>>   at org.sonatype.aether.impl.internal.DefaultRepositorySystem.resolveDependencies(DefaultRepositorySystem.java:352)
>>>>   at org.apache.zeppelin.spark.dep.DependencyContext.fetchArtifactWithDep(DependencyContext.java:141)
>>>>   at org.apache.zeppelin.spark.dep.DependencyContext.fetch(DependencyContext.java:98)
>>>>   at org.apache.zeppelin.spark.DepInterpreter.interpret(DepInterpreter.java:189)
>>>>   at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
>>>>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>>>>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>   at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>   at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>   at java.lang.Thread.run(Thread.java:745)
>>>>
>>
>
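
For anyone finding this thread later, the two workarounds above (create a
local-repo directory, and keep the real directory under /var with a symlink
in the Zeppelin home) can be combined into a small script that is safe to run
more than once. This is only a sketch: the function name ensure_local_repo is
hypothetical and not part of EMR or Zeppelin, the paths are the ones quoted in
the thread, and ownership of the directory (the zeppelin user in the messages
above) is left to the caller.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of EMR or Zeppelin): create the real
# local-repo directory and link it into the Zeppelin home, skipping any
# step that has already been done.
ensure_local_repo() {
  local repo_dir="$1"       # real directory, e.g. /var/lib/zeppelin/local-repo
  local zeppelin_home="$2"  # Zeppelin home, e.g. /usr/lib/zeppelin
  local link="$zeppelin_home/local-repo"

  # mkdir -p succeeds even if the directory already exists.
  mkdir -p "$repo_dir"

  # Only create the symlink if nothing (file, directory, or link) is at
  # that path yet, so an existing local-repo is left untouched.
  if [ ! -e "$link" ] && [ ! -L "$link" ]; then
    ln -s "$repo_dir" "$link"
  fi
}

# Example invocation on the master node, using the paths from the thread
# (assumes a user that may write to both locations):
#   ensure_local_repo /var/lib/zeppelin/local-repo /usr/lib/zeppelin
```

If Moon's fix has already been applied, /usr/lib/zeppelin/local-repo is a real
directory and the sketch leaves it alone rather than replacing it with a link.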
