Asif,

This was also brought up by someone else at Amazon on Friday, and I came up with the following EMR step that can be used to work around the issue:
aws emr create-cluster ... --applications Zeppelin-Sandbox --steps Name=CreateZeppelinLocalRepo,Jar=command-runner.jar,Args=[bash,-c,"sudo -u zeppelin mkdir /var/lib/zeppelin/local-repo; sudo ln -s /var/lib/zeppelin/local-repo /usr/lib/zeppelin"]

Moon's suggestion will also work, but it is a little better to have the local-repo underneath /var (which, as of emr-4.2.0, is symlinked to /mnt/var) so that it is on the first ephemeral disk rather than on the root partition, preventing your master instance's root partition from filling up.

Thanks for bringing up this issue; we will be sure to fix it in the next release of EMR so that you won't need this workaround anymore.

~ Jonathan

On Fri, Nov 20, 2015 at 9:51 PM, Asif Imran <covariantmon...@gmail.com> wrote:

> Hi Moon,
>
> Thanks so much. This worked perfectly.
>
> Asif
>
> On Fri, Nov 20, 2015 at 6:06 PM, moon soo Lee <m...@apache.org> wrote:
>
>> Hi Asif,
>>
>> Thanks for sharing the problem.
>> I've found that running
>>
>> sudo mkdir /usr/lib/zeppelin/local-repo
>> sudo chown zeppelin /usr/lib/zeppelin/local-repo
>>
>> on your EMR master node makes %dep work correctly.
>>
>> Hope this helps.
>>
>> Best,
>> moon
>>
>> On Sat, Nov 21, 2015 at 7:34 AM Asif Imran <covariantmon...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am running Spark on AWS EMR with default options. On the notebook, I am having trouble getting this off the ground:
>>>
>>> %dep
>>> z.reset()
>>> z.load("com.databricks:spark-csv_2.10:1.2.0")
>>>
>>> Digging through the user list, people in the past have had similar issues ranging from path permissions to proxy or YARN incompatibility.
>>>
>>> Is there a standard way to debug this?
>>> A more general question: does the dep loader even work for this particular setup (namely, on AWS EMR)?
>>>
>>> Thanks,
>>> Asif
>>>
>>> Love Zeppelin btw :-)
>>>
>>> Log
>>> -----------------
>>>
>>> java.lang.NullPointerException
>>>     at org.sonatype.aether.impl.internal.DefaultRepositorySystem.resolveDependencies(DefaultRepositorySystem.java:352)
>>>     at org.apache.zeppelin.spark.dep.DependencyContext.fetchArtifactWithDep(DependencyContext.java:141)
>>>     at org.apache.zeppelin.spark.dep.DependencyContext.fetch(DependencyContext.java:98)
>>>     at org.apache.zeppelin.spark.DepInterpreter.interpret(DepInterpreter.java:189)
>>>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>     at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
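[Editor's note] The workaround in Jonathan's EMR step boils down to creating the local-repo under /var/lib/zeppelin and symlinking it into /usr/lib/zeppelin, where %dep resolves artifacts. Below is a minimal sketch of those two filesystem operations, run under a temporary sandbox root (an assumption for illustration; on a real EMR master the paths are the absolute ones shown in the step, run via sudo as the zeppelin user):

```shell
#!/bin/sh
set -e

# Sandbox root standing in for the real filesystem of an EMR master node.
SANDBOX=$(mktemp -d)
mkdir -p "$SANDBOX/var/lib/zeppelin" "$SANDBOX/usr/lib/zeppelin"

# Step 1: create local-repo under /var/lib/zeppelin so downloaded
# dependencies land on the first ephemeral disk, not the root partition.
mkdir "$SANDBOX/var/lib/zeppelin/local-repo"

# Step 2: symlink it into /usr/lib/zeppelin, the location where the
# %dep interpreter expects to find its local repository.
ln -s "$SANDBOX/var/lib/zeppelin/local-repo" "$SANDBOX/usr/lib/zeppelin/local-repo"

# Verify: the path Zeppelin would resolve is a usable directory.
test -d "$SANDBOX/usr/lib/zeppelin/local-repo" && echo "local-repo OK"
```

Moon's simpler fix (mkdir + chown directly under /usr/lib/zeppelin) achieves the same writable directory, just on the root partition instead of the ephemeral disk.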