Hi Todd,

Looks like the thrift server can connect to the metastore, but something is going wrong in the executors. You can try to get the logs with "yarn logs -applicationId xxx" to check why it failed. If there is no log (the master or executor was not started at all), you can go to the RM web page and click the application link to see why the shell failed in the first place.
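For example (substituting the application id from the output below, or your own):

```shell
# Fetch the aggregated container logs for the failed application from YARN.
# The id here is the one that appears in Todd's log output; replace it with
# the id of the run you are debugging.
yarn logs -applicationId application_1425078697953_0018
```

Note that log aggregation must be enabled on the cluster for this to return anything; otherwise the logs stay on the individual NodeManager hosts.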
Thanks.

Zhan Zhang

On Mar 6, 2015, at 9:59 AM, Todd Nist <tsind...@gmail.com> wrote:

First, thanks to everyone for their assistance and recommendations.

@Marcelo I applied the patch that you recommended and am now able to get into the shell. Thank you, it worked great once I realized that the pom was pointing to the 1.3.0-SNAPSHOT parent; I needed to bump that down to 1.2.1.

@Zhan I still need to apply that patch. I tried to start the spark-thriftserver; it starts and then fails like this (I have the entries in my spark-defaults.conf, but the patch is not applied yet):

./sbin/start-thriftserver.sh --master yarn --executor-memory 1024m --hiveconf hive.server2.thrift.port=10001

15/03/06 12:34:17 INFO ui.SparkUI: Started SparkUI at http://hadoopdev01.opsdatastore.com:4040
15/03/06 12:34:18 INFO impl.TimelineClientImpl: Timeline service address: http://hadoopdev02.opsdatastore.com:8188/ws/v1/timeline/
15/03/06 12:34:18 INFO client.RMProxy: Connecting to ResourceManager at hadoopdev02.opsdatastore.com/192.168.15.154:8050
15/03/06 12:34:18 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
15/03/06 12:34:18 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/03/06 12:34:18 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/03/06 12:34:18 INFO yarn.Client: Setting up container launch context for our AM
15/03/06 12:34:18 INFO yarn.Client: Preparing resources for our AM container
15/03/06 12:34:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/03/06 12:34:19 INFO yarn.Client: Uploading resource file:/root/spark-1.2.1-bin-hadoop2.6/lib/spark-assembly-1.2.1-hadoop2.6.0.jar -> hdfs://hadoopdev01.opsdatastore.com:8020/user/root/.sparkStaging/application_1425078697953_0018/spark-assembly-1.2.1-hadoop2.6.0.jar
15/03/06 12:34:21 INFO yarn.Client: Setting up the launch environment for our AM container
15/03/06 12:34:21 INFO spark.SecurityManager: Changing view acls to: root
15/03/06 12:34:21 INFO spark.SecurityManager: Changing modify acls to: root
15/03/06 12:34:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/03/06 12:34:21 INFO yarn.Client: Submitting application 18 to ResourceManager
15/03/06 12:34:21 INFO impl.YarnClientImpl: Submitted application application_1425078697953_0018
15/03/06 12:34:22 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
15/03/06 12:34:22 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1425663261755
     final status: UNDEFINED
     tracking URL: http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018/
     user: root
15/03/06 12:34:23 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
15/03/06 12:34:24 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
15/03/06 12:34:25 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
15/03/06 12:34:26 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:40201/user/YarnAM#-557112763]
15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoopdev02.opsdatastore.com, PROXY_URI_BASES -> http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018), /proxy/application_1425078697953_0018
15/03/06 12:34:27 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/03/06 12:34:27 INFO yarn.Client: Application report for application_1425078697953_0018 (state: RUNNING)
15/03/06 12:34:27 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: hadoopdev08.opsdatastore.com
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1425663261755
     final status: UNDEFINED
     tracking URL: http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018/
     user: root
15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: Application application_1425078697953_0018 has started running.
15/03/06 12:34:28 INFO netty.NettyBlockTransferService: Server created on 46124
15/03/06 12:34:28 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/03/06 12:34:28 INFO storage.BlockManagerMasterActor: Registering block manager hadoopdev01.opsdatastore.com:46124 with 265.4 MB RAM, BlockManagerId(<driver>, hadoopdev01.opsdatastore.com, 46124)
15/03/06 12:34:28 INFO storage.BlockManagerMaster: Registered BlockManager
15/03/06 12:34:47 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
15/03/06 12:34:48 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoopdev03.opsdatastore.com:9083
15/03/06 12:34:48 INFO hive.metastore: Connected to metastore.
15/03/06 12:34:49 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
15/03/06 12:34:49 INFO service.AbstractService: HiveServer2: Async execution pool size 100
15/03/06 12:34:49 INFO service.AbstractService: Service:OperationManager is inited.
15/03/06 12:34:49 INFO service.AbstractService: Service:SessionManager is inited.
15/03/06 12:34:49 INFO service.AbstractService: Service:CLIService is inited.
15/03/06 12:34:49 INFO service.AbstractService: Service:ThriftBinaryCLIService is inited.
15/03/06 12:34:49 INFO service.AbstractService: Service:HiveServer2 is inited.
15/03/06 12:34:49 INFO service.AbstractService: Service:OperationManager is started.
15/03/06 12:34:49 INFO service.AbstractService: Service:SessionManager is started.
15/03/06 12:34:49 INFO service.AbstractService: Service:CLIService is started.
15/03/06 12:34:49 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoopdev03.opsdatastore.com:9083
15/03/06 12:34:49 INFO hive.metastore: Connected to metastore.
15/03/06 12:34:49 INFO service.AbstractService: Service:ThriftBinaryCLIService is started.
15/03/06 12:34:49 INFO service.AbstractService: Service:HiveServer2 is started.
15/03/06 12:34:49 INFO thriftserver.HiveThriftServer2: HiveThriftServer2 started
15/03/06 12:34:49 INFO thrift.ThriftCLIService: ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10001
15/03/06 12:34:58 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:40201] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/03/06 12:35:02 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:53176/user/YarnAM#-1793579186]
15/03/06 12:35:02 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoopdev02.opsdatastore.com, PROXY_URI_BASES -> http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018), /proxy/application_1425078697953_0018
15/03/06 12:35:02 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/03/06 12:35:38 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkyar...@hadoopdev08.opsdatastore.com:53176] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/03/06 12:35:39 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/json,null}
15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
15/03/06 12:35:39 INFO ui.SparkUI: Stopped Spark web UI at http://hadoopdev01.opsdatastore.com:4040
15/03/06 12:35:39 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Stopped
15/03/06 12:35:40 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/03/06 12:35:40 INFO storage.MemoryStore: MemoryStore cleared
15/03/06 12:35:40 INFO storage.BlockManager: BlockManager stopped
15/03/06 12:35:40 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/03/06 12:35:40 INFO spark.SparkContext: Successfully stopped SparkContext
15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

Thanks again for the help.

-Todd

On Thu, Mar 5, 2015 at 7:06 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:

In addition, if you use HDP 2.2 you may need the following patch, if it is not already in 1.2.1, to solve a system property issue: https://github.com/apache/spark/pull/3409

You can follow this link to set hdp.version in the java options: http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/

Thanks.

Zhan Zhang

On Mar 5, 2015, at 11:09 AM, Marcelo Vanzin <van...@cloudera.com> wrote:

It seems from the excerpt below that your cluster is set up to use the Yarn ATS, and the code is failing in that path.
I think you'll need to apply the following patch to your Spark sources if you want this to work: https://github.com/apache/spark/pull/3938

On Thu, Mar 5, 2015 at 10:04 AM, Todd Nist <tsind...@gmail.com> wrote:

org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:166)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:65)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:348)

--
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
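[For reference, the hdp.version workaround from the Hortonworks tutorial Zhan links above typically amounts to two entries in spark-defaults.conf, along the lines of the following sketch. The build number 2.2.0.0-2041 is only an example; substitute the actual HDP build on your cluster (e.g. as reported by `hdp-select status hadoop-client`):]

```
spark.driver.extraJavaOptions    -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions   -Dhdp.version=2.2.0.0-2041
```

Without this, YARN container launch commands on HDP 2.2 can fail because the classpath templates contain an unexpanded ${hdp.version} placeholder.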