Hi Zhan,

I applied the patch you recommended, https://github.com/apache/spark/pull/3409, and it now works. It was failing with this:

Exception message: /hadoop/yarn/local/usercache/root/appcache/application_1425078697953_0020/container_1425078697953_0020_01_000002/launch_container.sh: line 14: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:$PWD/__app__.jar:$PWD/*: bad substitution

while spark-defaults.conf has these defined:

spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

Without the patch, ${hdp.version} was not being substituted. Thanks for pointing me to that patch, appreciate it.

-Todd
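P.S. For anyone who finds this thread later, here is a minimal sketch of the config that made the substitution work for me. The version string 2.2.0.0-2041 is specific to my HDP install; check /usr/hdp on one of your nodes for yours.

    # $SPARK_HOME/conf/spark-defaults.conf
    # Pass a concrete value for hdp.version so YARN's ${hdp.version}
    # placeholders in the container classpath can resolve
    spark.driver.extraJavaOptions    -Dhdp.version=2.2.0.0-2041
    spark.yarn.am.extraJavaOptions   -Dhdp.version=2.2.0.0-2041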
On Fri, Mar 6, 2015 at 1:12 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:

> Hi Todd,
>
> It looks like the thrift server can connect to the metastore, but something
> is wrong in the executors. You can try to get the log with "yarn logs
> -applicationId xxx" to check why it failed. If there is no log (the master or
> the executors were not started at all), you can go to the RM web page and
> click the application link to see why the shell failed in the first place.
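> For example, something like the following (the application ID is the one
> from your output below; substitute your own):
>
>     yarn logs -applicationId application_1425078697953_0018
>
> This should print the aggregated container logs, including the executor
> stderr, assuming log aggregation is enabled on the cluster.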
> Thanks.
>
> Zhan Zhang
>
> On Mar 6, 2015, at 9:59 AM, Todd Nist <tsind...@gmail.com> wrote:
>
> First, thanks to everyone for their assistance and recommendations.
>
> @Marcelo
>
> I applied the patch that you recommended and am now able to get into the
> shell. Thank you, it worked great once I realized that the pom was pointing
> to 1.3.0-SNAPSHOT for the parent; I needed to bump that down to 1.2.1.
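> For reference, this is the kind of change I made in the pom of the patched
> sources (a sketch only; the coordinates shown are the stock Spark 1.2.1
> parent):
>
>     <parent>
>       <groupId>org.apache.spark</groupId>
>       <artifactId>spark-parent</artifactId>
>       <version>1.2.1</version>  <!-- was 1.3.0-SNAPSHOT -->
>     </parent>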
> @Zhan
>
> I need to apply that patch next. I tried to start the Spark thrift server;
> it starts, then fails like this. I have the entries in my
> spark-defaults.conf, but not the patch applied.
>
> ./sbin/start-thriftserver.sh --master yarn --executor-memory 1024m --hiveconf hive.server2.thrift.port=10001
>
> 15/03/06 12:34:17 INFO ui.SparkUI: Started SparkUI at http://hadoopdev01.opsdatastore.com:4040
> 15/03/06 12:34:18 INFO impl.TimelineClientImpl: Timeline service address: http://hadoopdev02.opsdatastore.com:8188/ws/v1/timeline/
> 15/03/06 12:34:18 INFO client.RMProxy: Connecting to ResourceManager at hadoopdev02.opsdatastore.com/192.168.15.154:8050
> 15/03/06 12:34:18 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
> 15/03/06 12:34:18 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
> 15/03/06 12:34:18 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
> 15/03/06 12:34:18 INFO yarn.Client: Setting up container launch context for our AM
> 15/03/06 12:34:18 INFO yarn.Client: Preparing resources for our AM container
> 15/03/06 12:34:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 15/03/06 12:34:19 INFO yarn.Client: Uploading resource file:/root/spark-1.2.1-bin-hadoop2.6/lib/spark-assembly-1.2.1-hadoop2.6.0.jar -> hdfs://hadoopdev01.opsdatastore.com:8020/user/root/.sparkStaging/application_1425078697953_0018/spark-assembly-1.2.1-hadoop2.6.0.jar
> 15/03/06 12:34:21 INFO yarn.Client: Setting up the launch environment for our AM container
> 15/03/06 12:34:21 INFO spark.SecurityManager: Changing view acls to: root
> 15/03/06 12:34:21 INFO spark.SecurityManager: Changing modify acls to: root
> 15/03/06 12:34:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
> 15/03/06 12:34:21 INFO yarn.Client: Submitting application 18 to ResourceManager
> 15/03/06 12:34:21 INFO impl.YarnClientImpl: Submitted application application_1425078697953_0018
> 15/03/06 12:34:22 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:22 INFO yarn.Client:
>      client token: N/A
>      diagnostics: N/A
>      ApplicationMaster host: N/A
>      ApplicationMaster RPC port: -1
>      queue: default
>      start time: 1425663261755
>      final status: UNDEFINED
>      tracking URL: http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018/
>      user: root
> 15/03/06 12:34:23 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:24 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:25 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:26 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:40201/user/YarnAM#-557112763]
> 15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoopdev02.opsdatastore.com, PROXY_URI_BASES -> http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018), /proxy/application_1425078697953_0018
> 15/03/06 12:34:27 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 15/03/06 12:34:27 INFO yarn.Client: Application report for application_1425078697953_0018 (state: RUNNING)
> 15/03/06 12:34:27 INFO yarn.Client:
>      client token: N/A
>      diagnostics: N/A
>      ApplicationMaster host: hadoopdev08.opsdatastore.com
>      ApplicationMaster RPC port: 0
>      queue: default
>      start time: 1425663261755
>      final status: UNDEFINED
>      tracking URL: http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018/
>      user: root
> 15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: Application application_1425078697953_0018 has started running.
> 15/03/06 12:34:28 INFO netty.NettyBlockTransferService: Server created on 46124
> 15/03/06 12:34:28 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 15/03/06 12:34:28 INFO storage.BlockManagerMasterActor: Registering block manager hadoopdev01.opsdatastore.com:46124 with 265.4 MB RAM, BlockManagerId(<driver>, hadoopdev01.opsdatastore.com, 46124)
> 15/03/06 12:34:28 INFO storage.BlockManagerMaster: Registered BlockManager
> 15/03/06 12:34:47 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
> 15/03/06 12:34:48 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoopdev03.opsdatastore.com:9083
> 15/03/06 12:34:48 INFO hive.metastore: Connected to metastore.
> 15/03/06 12:34:49 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
> 15/03/06 12:34:49 INFO service.AbstractService: HiveServer2: Async execution pool size 100
> 15/03/06 12:34:49 INFO service.AbstractService: Service:OperationManager is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:SessionManager is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:CLIService is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:ThriftBinaryCLIService is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:HiveServer2 is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:OperationManager is started.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:SessionManager is started.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:CLIService is started.
> 15/03/06 12:34:49 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoopdev03.opsdatastore.com:9083
> 15/03/06 12:34:49 INFO hive.metastore: Connected to metastore.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:ThriftBinaryCLIService is started.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:HiveServer2 is started.
> 15/03/06 12:34:49 INFO thriftserver.HiveThriftServer2: HiveThriftServer2 started
> 15/03/06 12:34:49 INFO thrift.ThriftCLIService: ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10001
> 15/03/06 12:34:58 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:40201] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 15/03/06 12:35:02 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:53176/user/YarnAM#-1793579186]
> 15/03/06 12:35:02 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoopdev02.opsdatastore.com, PROXY_URI_BASES -> http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018), /proxy/application_1425078697953_0018
> 15/03/06 12:35:02 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 15/03/06 12:35:38 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:53176] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 15/03/06 12:35:39 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
> 15/03/06 12:35:39 INFO ui.SparkUI: Stopped Spark web UI at http://hadoopdev01.opsdatastore.com:4040
> 15/03/06 12:35:39 INFO scheduler.DAGScheduler: Stopping DAGScheduler
> 15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
> 15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
> 15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Stopped
> 15/03/06 12:35:40 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
> 15/03/06 12:35:40 INFO storage.MemoryStore: MemoryStore cleared
> 15/03/06 12:35:40 INFO storage.BlockManager: BlockManager stopped
> 15/03/06 12:35:40 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
> 15/03/06 12:35:40 INFO spark.SparkContext: Successfully stopped SparkContext
> 15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
> 15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
> 15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
>
> Thanks again for the help.
>
> -Todd
>
> On Thu, Mar 5, 2015 at 7:06 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>
>> In addition, you may need the following patch, if it is not already in
>> 1.2.1, to solve a system property issue when you use HDP 2.2:
>>
>> https://github.com/apache/spark/pull/3409
>>
>> You can follow this link to set hdp.version in the java options:
>>
>> http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/
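>> In short, the tutorial amounts to giving Spark a concrete hdp.version.
>> One way, as a sketch (the version string below is an example; use the one
>> from /usr/hdp on your cluster), is a java-opts file in the Spark conf
>> directory:
>>
>>     echo "-Dhdp.version=2.2.0.0-2041" > $SPARK_HOME/conf/java-opts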
>> Thanks.
>>
>> Zhan Zhang
>>
>> On Mar 5, 2015, at 11:09 AM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> It seems from the excerpt below that your cluster is set up to use the
>> YARN ATS, and the code is failing in that path. I think you'll need to
>> apply the following patch to your Spark sources if you want this to
>> work:
>>
>> https://github.com/apache/spark/pull/3938
>>
>> On Thu, Mar 5, 2015 at 10:04 AM, Todd Nist <tsind...@gmail.com> wrote:
>>
>> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:166)
>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>         at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:65)
>>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:348)
>>
>> --
>> Marcelo
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org