One more question: what should be installed on the server? What are Zeppelin's dependencies? Node.js, npm, bower? Scala?
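For what it's worth, a minimal sketch for checking the build-machine prerequisites, assuming Zeppelin is built from source as described later in this thread. Treat the comments as assumptions rather than official documentation: a JDK and Maven are needed to build, while npm/bower are fetched by the Maven build itself and Scala comes in through the Spark dependency, so neither needs a separate system-wide install.

```shell
# Hedged sketch: verify that the assumed build-time prerequisites are on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: MISSING"
  fi
}

check_cmd java
check_cmd mvn
```

At runtime, only a JDK (plus a reachable Spark installation if the Spark interpreter is used) should be needed on the server; again, this is an assumption based on the from-source build, not a statement from the Zeppelin docs.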
On Tue, Nov 24, 2015 at 5:34 PM, Timur Shenkao <t...@timshenkao.su> wrote:

> I also checked the Spark workers. There are no traces, folders, or logs about Zeppelin on them.
> There are Zeppelin logs only on the Spark Master server, where Zeppelin is launched.
>
> For example, H2O creates logs on every worker in folders /usr/spark/work/app-.....-... Is this correct?
>
> I also launched the Thrift server via /usr/spark/sbin/start-thriftserver.sh on the Spark Master. Does Zeppelin use org.apache.spark.sql.hive.thriftserver.HiveThriftServer2?
>
> For the terminated scheduler, I got:
> INFO [2015-11-24 16:26:16,610] ({pool-1-thread-2} SchedulerFactory.java[jobFinished]:138) - Job paragraph_1448346$
> ERROR [2015-11-24 16:26:17,658] ({Thread-34} JobProgressPoller.java[run]:57) - Can not get or update progress
> org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:302)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:110)
>     at org.apache.zeppelin.notebook.Paragraph.progress(Paragraph.java:174)
>     at org.apache.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:54)
> Caused by: org.apache.thrift.transport.TTransportException
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_getProgress(RemoteInterpret$
>     at
> org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.getProgress(RemoteInterpreterSer$
> INFO [2015-11-24 16:26:52,617] ({qtp982007015-52} InterpreterRestApi.java[updateSetting]:104) - Update interprete$
> INFO [2015-11-24 16:27:56,319] ({qtp982007015-48} InterpreterRestApi.java[restartSetting]:143) - Restart interpre$
> ERROR [2015-11-24 16:28:09,603] ({qtp982007015-48} NotebookServer.java[runParagraph]:661) - Exception from run
> java.lang.RuntimeException: Scheduler already terminated
>     at org.apache.zeppelin.scheduler.RemoteScheduler.submit(RemoteScheduler.java:124)
>     at org.apache.zeppelin.notebook.Note.run(Note.java:326)
>     at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:659)
>     at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:126)
>     at org.apache.zeppelin.socket.NotebookSocket.onMessage(NotebookSocket.java:56)
>     at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455$WSFrameHandler.onFrame(WebSocketConnectionRFC645$
>     at org.eclipse.jetty.websocket.WebSocketParserRFC6455.parseNext(WebSocketParserRFC6455.java:349)
>     at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455.handle(WebSocketConnectionRFC6455.java:225)
>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>     at java.lang.Thread.run(Thread.java:745)
> ERROR [2015-11-24 16:28:36,906] ({qtp982007015-50} NotebookServer.java[runParagraph]:661) - Exception from run
> java.lang.RuntimeException: Scheduler already terminated
>     at org.apache.zeppelin.scheduler.RemoteScheduler.submit(RemoteScheduler.java:124)
>     at org.apache.zeppelin.notebook.Note.run(Note.java:326)
>     at
> org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:659)
>     at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:126)
>     at org.apache.zeppelin.socket.NotebookSocket.onMessage(NotebookSocket.java:56)
>
> On Tue, Nov 24, 2015 at 4:50 PM, Timur Shenkao <t...@timshenkao.su> wrote:
>
>> Hello!
>>
>> There is no Kerberos and no other security in my cluster. It is on an internal network.
>>
>> The %hive and %sh interpreters work: I can create and drop tables, run pwd, etc. So the problem is in the integration with Spark.
>>
>> In /usr/spark/conf/spark-env.sh on the master node I set / unset, in turn, MASTER=spark://localhost:7077, MASTER=spark://192.168.58.10:7077, and MASTER=spark://127.0.0.1:7077. On the slaves I set / unset MASTER=spark://192.168.58.10:7077 in different combinations.
>>
>> Zeppelin is installed on the same machine as the Spark Master, so in zeppelin-env.sh I set / unset MASTER=spark://localhost:7077, MASTER=spark://192.168.58.10:7077, and MASTER=spark://127.0.0.1:7077.
>> Yes, I can connect to 192.168.58.10 and see
>> URL: spark://192.168.58.10:7077
>> REST URL: spark://192.168.58.10:6066 (cluster mode)
>>
>> Does the TCP socket type matter? On my laptop, in pseudo-distributed mode, all connections are IPv4 (tcp), and there are only IPv4 lines in /etc/hosts.
>> In the cluster, Spark automatically, for unknown reasons, uses IPv6 (tcp6), and there are IPv6 lines in /etc/hosts.
>> Right now I am trying to make Spark use IPv4.
>>
>> I switched Spark to IPv4 via -Djava.net.preferIPv4Stack=true.
>>
>> It seems that Zeppelin uses / answers on the following ports on the Master server (found via ps axu | grep zeppelin, then, for each PID, netstat -natp | grep ...):
>> 41303
>> 46971
>> 59007
>> 35781
>> 53637
>> 34860
>> 59793
>> 46971
>> 50676
>> 50677
>>
>> 44341
>> 50805
>> 50803
>> 50802
>>
>> 60886
>> 43345
>> 48415
>> 48417
>> 10000
>> 48416
>>
>> Best regards
>>
>> P.S.
>> I inserted into zeppelin-env.sh and into the Spark interpreter configuration in the web UI the exact address from the Spark page: MASTER=spark://192.168.58.10:7077.
>> Earlier, I got a Java error stack trace in the web UI; after that, I began to receive "Scheduler already terminated".
>>
>> On Tue, Nov 24, 2015 at 12:56 PM, moon soo Lee <m...@apache.org> wrote:
>>
>>> Thanks for sharing the problem.
>>>
>>> Based on your log file, it looks like your Spark master address is somehow not configured correctly.
>>>
>>> Can you confirm that you have also set the 'master' property in the Interpreter menu in the GUI, in the spark section?
>>>
>>> If not, you can open the Spark Master UI in your web browser and look at the first line, "Spark Master at spark://....". That value should go into the 'master' property in the Interpreter menu in the GUI, in the spark section.
>>>
>>> Hope this helps.
>>>
>>> Best,
>>> moon
>>>
>>> On Tue, Nov 24, 2015 at 3:07 AM Timur Shenkao <t...@timshenkao.su> wrote:
>>>
>>>> Hi!
>>>>
>>>> A new error has appeared: TTransportException.
>>>> I use CentOS 6.7 + Spark 1.5.2 Standalone + Cloudera Hadoop 5.4.8 on the same cluster. I can't use Mesos or Spark on YARN.
>>>> I built Zeppelin 0.6.0 like this:
>>>> mvn clean package -DskipTests -Pspark-1.5 -Phadoop-2.6 -Pyarn -Ppyspark -Pbuild-distr
>>>>
>>>> I constantly get errors like
>>>> ERROR [2015-11-23 18:14:33,404] ({pool-1-thread-4} Job.java[run]:183) - Job failed
>>>> org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
>>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:237)
>>>>
>>>> or
>>>>
>>>> ERROR [2015-11-23 18:07:26,535] ({Thread-11} RemoteInterpreterEventPoller.java[run]:72) - Can't get RemoteInterpreterEvent
>>>> org.apache.thrift.transport.TTransportException
>>>>
>>>> I changed several parameters in zeppelin-env.sh and in the Spark configs. Whatever I do, these errors come back.
>>>> At the same time, when I use a local Zeppelin with Hadoop in pseudo-distributed mode + Spark Standalone (Master and workers on the same machine), everything works.
>>>>
>>>> What configuration (memory, network, CPU cores) is required for Zeppelin to work?
>>>>
>>>> I launch H2O on this cluster, and it works.
>>>> Spark Master config:
>>>> SPARK_MASTER_WEBUI_PORT=18080
>>>> HADOOP_CONF_DIR=/etc/hadoop/conf
>>>> SPARK_HOME=/usr/spark
>>>>
>>>> Spark Worker config:
>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>> export MASTER=spark://192.168.58.10:7077
>>>> export SPARK_HOME=/usr/spark
>>>>
>>>> SPARK_WORKER_INSTANCES=1
>>>> SPARK_WORKER_CORES=4
>>>> SPARK_WORKER_MEMORY=32G
>>>>
>>>> I attach the Spark configs, the Zeppelin configs & logs for local mode, and the Zeppelin configs & logs from when I defined the IP address of the Spark Master explicitly.
>>>> Thank you.
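Following up on moon's suggestion in this thread (copy the exact "Spark Master at spark://..." value into the interpreter's 'master' property), a quick shape check on the pasted value can rule out copy/paste mistakes before restarting the interpreter. This is a hedged sketch: `is_spark_master_url` is a hypothetical helper, and `spark://host:port` is the Spark standalone master URL convention.

```shell
# Hypothetical helper: check that a pasted master address has the
# spark://host:port shape used by Spark standalone mode.
is_spark_master_url() {
  echo "$1" | grep -Eq '^spark://[^:/]+:[0-9]+$'
}

if is_spark_master_url "spark://192.168.58.10:7077"; then
  echo "shape ok"
fi
```

Note this only validates the shape, not reachability; a value that passes the check can still point at the wrong interface (for example, localhost vs. the cluster-visible IP), which is exactly the ambiguity discussed above.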