Re: Equivalent of Jupyter %run
There are no official examples in the Zeppelin project, but you can try the following:

%python.ipython
%run your_python_script_file

And what do you mean by error prone? What kind of error are you hitting?

Eric Pugh wrote on Mon, Jan 13, 2020 at 8:23 PM:
> I ended up having a notebook called “shared_code”, and would just always
> run it before I did any work. That approach worked, but was error prone
> and made refactoring any complex logic much harder. I did play around a
> bit with using %sh to edit .py files, and then load them.
>
> Jeff, are there any examples of what you meant by using the python
> interpreter?

--
Best Regards

Jeff Zhang
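[Editorial note: Zeppelin aside, the effect of IPython's %run — executing a script file and making its top-level names available in the session — can be sketched with Python's standard-library runpy module. The file name and function below are made up for illustration:]

```python
import pathlib
import runpy
import tempfile

# Stand-in for the shared script you would target with %run.
script = pathlib.Path(tempfile.mkdtemp()) / "elastic_utils.py"
script.write_text(
    "def build_query(term):\n"
    "    return {'query': {'match': {'_all': term}}}\n"
)

# runpy.run_path executes the file and returns its global namespace,
# which is roughly what IPython's %run injects into the session.
ns = runpy.run_path(str(script))
build_query = ns["build_query"]

print(build_query("zeppelin"))  # {'query': {'match': {'_all': 'zeppelin'}}}
```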
Re: Equivalent of Jupyter %run
These errors don't give many clues; I suspect there are other errors in the log. Could you check again?

Dave Boyd wrote on Sun, Jan 12, 2020 at 12:35 AM:
> Jeff: Thanks.
>
> I tried the following:
>
> %ipython
> %run /Dave_Folder/ElasticUtils
>
> I get the following error:
>
> java.util.concurrent.RejectedExecutionException: Task
> io.grpc.internal.SerializingExecutor@7068c569 rejected from
> java.util.concurrent.ThreadPoolExecutor@789e11b[Terminated, pool size = 0,
> active threads = 0, queued tasks = 0, completed tasks = 45]
> ...
>
> I am running 0.8.2 in a docker container.
Re: Equivalent of Jupyter %run
I ended up having a notebook called “shared_code”, and would just always run it before I did any work. That approach worked, but was error prone and made refactoring any complex logic much harder. I did play around a bit with using %sh to edit .py files, and then load them.

Jeff, are there any examples of what you meant by using the python interpreter?

On Jan 10, 2020, at 6:31 PM, Jeff Zhang wrote:
> You can do it via the ipython interpreter, which supports all of the Jupyter magics:
>
> http://zeppelin.apache.org/docs/0.8.2/interpreter/python.html#ipython-support

___
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>

This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.
Re: Equivalent of Jupyter %run
Jeff: Thanks.

I tried the following:

%ipython
%run /Dave_Folder/ElasticUtils

I get the following error:

java.util.concurrent.RejectedExecutionException: Task io.grpc.internal.SerializingExecutor@7068c569 rejected from java.util.concurrent.ThreadPoolExecutor@789e11b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 45]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
    at io.grpc.internal.SerializingExecutor.schedule(SerializingExecutor.java:93)
    at io.grpc.internal.SerializingExecutor.execute(SerializingExecutor.java:86)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.closed(ClientCallImpl.java:588)
    at io.grpc.internal.FailingClientStream.start(FailingClientStream.java:54)
    at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:273)
    at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1.start(CensusTracingModule.java:398)
    at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1.start(CensusStatsModule.java:673)
    at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:308)
    at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:280)
    at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:265)
    at io.grpc.stub.ClientCalls.asyncServerStreamingCall(ClientCalls.java:73)
    at org.apache.zeppelin.python.proto.IPythonGrpc$IPythonStub.execute(IPythonGrpc.java:240)
    at org.apache.zeppelin.python.IPythonClient.stream_execute(IPythonClient.java:89)
    at org.apache.zeppelin.python.IPythonInterpreter.interpret(IPythonInterpreter.java:350)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

I am running 0.8.2 in a docker container.

On 1/10/2020 6:31 PM, Jeff Zhang wrote:
> You can do it via the ipython interpreter, which supports all of the Jupyter magics:
>
> http://zeppelin.apache.org/docs/0.8.2/interpreter/python.html#ipython-support
Re: Equivalent of Jupyter %run
You can do it via the ipython interpreter, which supports all of the Jupyter magics:

http://zeppelin.apache.org/docs/0.8.2/interpreter/python.html#ipython-support

Partridge, Lucas (GE Aviation) wrote on Fri, Jan 10, 2020 at 5:13 PM:
> But for Zeppelin you can put your python files on the local file system of
> your Spark driver node, or more commonly in HDFS, and then use
> sc.addPyFile() [1] to make each file available in the SparkContext. Then
> you can import your python packages as normal.
> ...

--
Best Regards

Jeff Zhang
RE: Equivalent of Jupyter %run
I've hardly used Jupyter so can't comment on an equivalent for %run.

But for Zeppelin you can put your python files on the local file system of your Spark driver node, or more commonly in HDFS, and then use sc.addPyFile() [1] to make each file available in the SparkContext. Then you can import your python packages as normal. The slightly annoying thing is that if you change your code you'll need to restart your Spark application to pick up the changes, as there's no reliable way to reimport the updated modules in a running application. But you could put your importing of common files in a shared notebook so everyone can run it easily.

Once you're happy with your code and it's fairly stable, you can package it with a setup.py and install the packages on all the nodes of your cluster like any other python package. Then you can skip the sc.addPyFile() step.

Databricks have a great facility for allowing users to upload their own Python packages/libraries. It would be great if Zeppelin provided this feature as well (although maybe they do now, as I'm on an older version...).

Lucas.

[1] https://spark.apache.org/docs/latest/api/python/pyspark.html?highlight=addpyfile#pyspark.SparkContext.addPyFile
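[Editorial note: Lucas's point that there is "no reliable way to reimport the updated modules" can be seen in plain Python, independent of Spark. importlib.reload re-executes the module, but any name bound earlier with a from-import keeps pointing at the old function object. The module and function names below are hypothetical:]

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale .pyc caching skewing the demo

# Create a throwaway "shared" module and make it importable.
shared_dir = pathlib.Path(tempfile.mkdtemp())
module_file = shared_dir / "shared_utils.py"
module_file.write_text("def greet():\n    return 'v1'\n")
sys.path.insert(0, str(shared_dir))

import shared_utils
from shared_utils import greet  # the common notebook-style import

# "Edit" the shared file, as you would between notebook runs, then reload.
module_file.write_text("def greet():\n    return 'v2, edited'\n")
importlib.reload(shared_utils)

after_reload = shared_utils.greet()  # 'v2, edited' -- module object re-executed
stale = greet()                      # 'v1' -- the from-imported name is still the old function
print(after_reload, stale)
```

This is why restarting the interpreter (or the Spark application) is the only safe way to pick up changed shared code.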
Equivalent of Jupyter %run
I have googled this but don't see a solution.

We are working on a project where we want to have some common python functions shared between notes.

In Jupyter we would just do a %run. Is there an equivalent in Zeppelin? Is there a way to store files as .py files that zeppelin can find for import to work?

Looking to see how folks may have solved this need.

--
=============================================
mailto:db...@incadencecorp.com
David W. Boyd
VP, Data Solutions
10432 Balls Ford, Suite 240
Manassas, VA 20109
office: +1-703-552-2862
cell: +1-703-402-7908
=============================================
http://www.incadencecorp.com/
ISO/IEC JTC1 SC42/WG2, editor ISO/IEC 20546, ISO/IEC 20547-1
Chair ANSI/INCITS TG Big Data
Co-chair NIST Big Data Public Working Group Reference Architecture
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
Board Member - USSTEM Foundation - www.usstem.org

The information contained in this message may be privileged and/or confidential and protected from disclosure. If the reader of this message is not the intended recipient or an employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by replying to this message and deleting the material from any computer.
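[Editorial note: the import-based approach suggested in the replies above — keep the common functions in a real .py file and make its directory importable — can be sketched in plain Python. The directory and module names here are illustrative; in Zeppelin the directory would be a real path on the interpreter host:]

```python
import pathlib
import sys
import tempfile

# Hypothetical location of the shared code (a temp dir for this sketch).
shared_dir = pathlib.Path(tempfile.mkdtemp())
(shared_dir / "common_funcs.py").write_text(
    "def normalize(s):\n"
    "    return s.strip().lower()\n"
)

# Once the directory is on sys.path, a plain import works from any note.
sys.path.insert(0, str(shared_dir))
import common_funcs

print(common_funcs.normalize("  Zeppelin  "))  # zeppelin
```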