Basically, if SPARK_HOME/bin/spark-shell works, then exporting SPARK_HOME in
conf/zeppelin-env.sh and setting the 'master' property in the Interpreter menu
of the Zeppelin GUI should be enough to connect successfully to a Spark
standalone cluster.
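
For example, a minimal conf/zeppelin-env.sh for your setup could look like the
following (just a sketch; the path and master URL are taken from your earlier
mails, so adjust them to your cluster):

export SPARK_HOME=/usr/spark
export MASTER=spark://192.168.58.10:7077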

Do you see any new exception in your log file when you set the 'master'
property in the Interpreter menu on the Zeppelin GUI and then see the
'Scheduler already terminated' error? If you can share it, that would be
helpful.

Zeppelin does not use HiveThriftServer2 and, once it has been built, needs
nothing but a JVM to run.
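
For example, once the build has finished, starting it from the Zeppelin
directory is typically just:

bin/zeppelin-daemon.sh start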


Thanks,
moon

On Tue, Nov 24, 2015 at 11:37 PM Timur Shenkao <t...@timshenkao.su> wrote:

> One more question. What should be installed on the server? What are the
> dependencies of Zeppelin?
> Node.js, npm, bower? Scala?
>
> On Tue, Nov 24, 2015 at 5:34 PM, Timur Shenkao <t...@timshenkao.su> wrote:
>
> > I also checked the Spark workers. There are no traces, folders, or logs
> > related to Zeppelin on them.
> > There are Zeppelin logs only on the Spark Master server, where Zeppelin is
> > launched.
> >
> > For example, H2O creates logs on every worker in the folders
> > /usr/spark/work/app-.....-... Is this correct?
> >
> > I also launched the Thrift server via /usr/spark/sbin/start-thriftserver.sh
> > on the Spark Master. Does Zeppelin use
> > org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 ?
> >
> > For the terminated scheduler, I got:
> > INFO [2015-11-24 16:26:16,610] ({pool-1-thread-2} SchedulerFactory.java[jobFinished]:138) - Job paragraph_1448346$
> > ERROR [2015-11-24 16:26:17,658] ({Thread-34} JobProgressPoller.java[run]:57) - Can not get or update progress
> > org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
> >         at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:302)
> >         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:110)
> >         at org.apache.zeppelin.notebook.Paragraph.progress(Paragraph.java:174)
> >         at org.apache.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:54)
> > Caused by: org.apache.thrift.transport.TTransportException
> >         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> >         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
> >         at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
> >         at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
> >         at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
> >         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> >         at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_getProgress(RemoteInterpret$
> >         at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.getProgress(RemoteInterpreterSer$
> > INFO [2015-11-24 16:26:52,617] ({qtp982007015-52} InterpreterRestApi.java[updateSetting]:104) - Update interprete$
> >  INFO [2015-11-24 16:27:56,319] ({qtp982007015-48} InterpreterRestApi.java[restartSetting]:143) - Restart interpre$
> > ERROR [2015-11-24 16:28:09,603] ({qtp982007015-48} NotebookServer.java[runParagraph]:661) - Exception from run
> > java.lang.RuntimeException: Scheduler already terminated
> >         at org.apache.zeppelin.scheduler.RemoteScheduler.submit(RemoteScheduler.java:124)
> >         at org.apache.zeppelin.notebook.Note.run(Note.java:326)
> >         at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:659)
> >         at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:126)
> >         at org.apache.zeppelin.socket.NotebookSocket.onMessage(NotebookSocket.java:56)
> >         at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455$WSFrameHandler.onFrame(WebSocketConnectionRFC645$
> >         at org.eclipse.jetty.websocket.WebSocketParserRFC6455.parseNext(WebSocketParserRFC6455.java:349)
> >         at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455.handle(WebSocketConnectionRFC6455.java:225)
> >         at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
> >         at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
> >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> >         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> >         at java.lang.Thread.run(Thread.java:745)
> > ERROR [2015-11-24 16:28:36,906] ({qtp982007015-50} NotebookServer.java[runParagraph]:661) - Exception from run
> > java.lang.RuntimeException: Scheduler already terminated
> >         at org.apache.zeppelin.scheduler.RemoteScheduler.submit(RemoteScheduler.java:124)
> >         at org.apache.zeppelin.notebook.Note.run(Note.java:326)
> >         at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:659)
> >         at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:126)
> >         at org.apache.zeppelin.socket.NotebookSocket.onMessage(NotebookSocket.java:56)
> >
> >
> >
> >
> > On Tue, Nov 24, 2015 at 4:50 PM, Timur Shenkao <t...@timshenkao.su> wrote:
> >
> >> Hello!
> >>
> >> There is no Kerberos and no other security in my cluster; it's on an
> >> internal network.
> >>
> >> The %hive and %sh interpreters work: I can create and drop tables, run
> >> pwd, etc. So the problem is in the integration with Spark.
> >>
> >> In /usr/spark/conf/spark-env.sh on the master node I set / unset, in turn,
> >> MASTER=spark://localhost:7077, MASTER=spark://192.168.58.10:7077, and
> >> MASTER=spark://127.0.0.1:7077. On the slaves I set / unset
> >> MASTER=spark://192.168.58.10:7077 in different combinations.
> >>
> >> Zeppelin is installed on the same machine as the Spark Master. So, in
> >> zeppelin-env.sh I set / unset MASTER=spark://localhost:7077,
> >> MASTER=spark://192.168.58.10:7077, and MASTER=spark://127.0.0.1:7077.
> >> Yes, I can connect to 192.168.58 and see the URL spark://192.168.58:7077
> >> and the REST URL spark://192.168.58:6066 (cluster mode).
> >>
> >> Does the TCP socket type matter? On my laptop, in pseudo-distributed mode,
> >> all connections are IPv4 (tcp), and /etc/hosts contains only IPv4 lines.
> >> In the cluster, for unknown reasons, Spark automatically uses IPv6 (tcp6),
> >> and there are IPv6 lines in /etc/hosts.
> >> Right now I am trying to make Spark use IPv4.
> >>
> >> I switched Spark to IPv4 via -Djava.net.preferIPv4Stack=true
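> >>
> >> One way to pass that flag (just a sketch; I assume the standalone daemons
> >> as well as the driver/executors need it) is, in spark-env.sh:
> >>
> >>    # JVM options for the standalone master/worker daemons
> >>    export SPARK_DAEMON_JAVA_OPTS="-Djava.net.preferIPv4Stack=true"
> >>
> >> and, in spark-defaults.conf:
> >>
> >>    spark.driver.extraJavaOptions    -Djava.net.preferIPv4Stack=true
> >>    spark.executor.extraJavaOptions  -Djava.net.preferIPv4Stack=true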
> >>
> >> It seems that Zeppelin uses / answers on the following ports on the Master
> >> server (found via ps axu | grep zeppelin, then netstat -natp | grep ... for
> >> each PID; a combined one-liner is sketched after the port list):
> >> 41303
> >> 46971
> >> 59007
> >> 35781
> >> 53637
> >> 34860
> >> 59793
> >> 46971
> >> 50676
> >> 50677
> >>
> >> 44341
> >> 50805
> >> 50803
> >> 50802
> >>
> >> 60886
> >> 43345
> >> 48415
> >> 48417
> >> 10000
> >> 48416
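> >>
> >> (The combined one-liner mentioned above, assuming pgrep is available:
> >>    for pid in $(pgrep -f zeppelin); do netstat -natp 2>/dev/null | grep "$pid/"; done
> >> )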
> >>
> >> Best regards
> >>
> >> P.S. I put the exact address from the Spark page,
> >> MASTER=spark://192.168.58.10:7077, into zeppelin-env.sh and into the spark
> >> interpreter configuration in the web UI.
> >> Earlier I got a Java error stack trace in the web UI; now I BEGAN to
> >> receive "Scheduler already terminated".
> >>
> >> On Tue, Nov 24, 2015 at 12:56 PM, moon soo Lee <m...@apache.org> wrote:
> >>
> >>> Thanks for sharing the problem.
> >>>
> >>> Based on your log file, it looks like your Spark master address is
> >>> somehow not configured correctly.
> >>>
> >>> Can you confirm that you have also set the 'master' property in the
> >>> Interpreter menu of the GUI, in the spark section?
> >>>
> >>> If not, you can open the Spark Master UI in your web browser and look at
> >>> the first line, "Spark Master at spark://....". That value should go into
> >>> the 'master' property in the Interpreter menu of the GUI, in the spark
> >>> section.
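> >>>
> >>> For example, if that first line reads "Spark Master at
> >>> spark://192.168.58.10:7077", then the 'master' property should be set to
> >>> exactly that value:
> >>>
> >>>    master = spark://192.168.58.10:7077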
> >>>
> >>> Hope this helps
> >>>
> >>> Best,
> >>> moon
> >>>
> >>> On Tue, Nov 24, 2015 at 3:07 AM Timur Shenkao <t...@timshenkao.su> wrote:
> >>>
> >>>> Hi!
> >>>>
> >>>> A new error has appeared: TTransportException.
> >>>> I use CentOS 6.7 + Spark 1.5.2 Standalone + Cloudera Hadoop 5.4.8 on
> >>>> the same cluster. I can't use Mesos or Spark on YARN.
> >>>> I built Zeppelin 0.6.0 as follows:
> >>>> mvn clean package -DskipTests -Pspark-1.5 -Phadoop-2.6 -Pyarn -Ppyspark -Pbuild-distr
> >>>>
> >>>> I constantly get errors like
> >>>> ERROR [2015-11-23 18:14:33,404] ({pool-1-thread-4} Job.java[run]:183) - Job failed
> >>>> org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
> >>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:237)
> >>>>
> >>>>
> >>>> or
> >>>>
> >>>> ERROR [2015-11-23 18:07:26,535] ({Thread-11} RemoteInterpreterEventPoller.java[run]:72) - Can't get RemoteInterpreterEvent
> >>>> org.apache.thrift.transport.TTransportException
> >>>>
> >>>> I changed several parameters in zeppelin-env.sh and in the Spark configs.
> >>>> Whatever I do, these errors appear. At the same time, when I use Zeppelin
> >>>> locally with Hadoop in pseudo-distributed mode + Spark Standalone (Master
> >>>> + workers on the same machine), everything works.
> >>>>
> >>>> What configuration (memory, network, CPU cores) is required for Zeppelin
> >>>> to work?
> >>>>
> >>>> I launch H2O on this cluster, and it works.
> >>>> Spark Master config:
> >>>> SPARK_MASTER_WEBUI_PORT=18080
> >>>> HADOOP_CONF_DIR=/etc/hadoop/conf
> >>>> SPARK_HOME=/usr/spark
> >>>>
> >>>> Spark Worker config:
> >>>>    export HADOOP_CONF_DIR=/etc/hadoop/conf
> >>>>    export MASTER=spark://192.168.58.10:7077
> >>>>    export SPARK_HOME=/usr/spark
> >>>>
> >>>>    SPARK_WORKER_INSTANCES=1
> >>>>    SPARK_WORKER_CORES=4
> >>>>    SPARK_WORKER_MEMORY=32G
> >>>>
> >>>>
> >>>> I attach the Spark configs, plus the Zeppelin configs & logs for local
> >>>> mode, and the Zeppelin configs & logs from when I defined the IP address
> >>>> of the Spark Master explicitly.
> >>>> Thank you.
> >>>>
> >>>
> >>
> >
>
