16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase
Since baseZNode didn't match what you set in hbase-site.xml, the likely cause is that hbase-site.xml was not accessible to your Spark job. Please add it to your classpath.
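A quick way to check from the same spark-shell session (just a sketch, assuming the HBase client jars from your --jars list are already on the driver classpath; the property names are the standard HBase ones, and /etc/hbase/conf below is only the usual HDP location):

import org.apache.hadoop.hbase.HBaseConfiguration

// HBaseConfiguration.create() loads hbase-site.xml from the classpath when it can find it;
// otherwise it falls back to the bundled defaults, which are exactly what your log shows
// (quorum "localhost", parent znode "/hbase").
val hbaseConf = HBaseConfiguration.create()
println(hbaseConf.get("hbase.zookeeper.quorum"))
println(hbaseConf.get("zookeeper.znode.parent"))

If both lines print the defaults, adding the HBase conf directory (typically /etc/hbase/conf on HDP) to --driver-class-path, or shipping hbase-site.xml with --files, should make it visible to the job.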
On Mon, Feb 29, 2016 at 8:42 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
>
> Is your cluster a secure cluster?
>
> bq. Trace :
>
> Was there any output after 'Trace :'?
>
> Was hbase-site.xml accessible to your Spark job?
>
> Thanks
>
> On Mon, Feb 29, 2016 at 8:27 PM, Divya Gehlot <divya.htco...@gmail.com> wrote:
>
>> Hi,
>> I am getting an error when I try to query a Hive table (created through HBaseIntegration) from Spark.
>>
>> Steps I followed:
>>
>> *Hive table creation code:*
>> CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING, AGE INT)
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
>> TBLPROPERTIES ("hbase.table.name" = "TEST",
>> "hbase.mapred.output.outputtable" = "TEST");
>>
>> *DESCRIBE TEST;*
>> col_name  data_type  comment
>> name      string     from deserializer
>> age       int        from deserializer
>>
>> *Spark code:*
>> import org.apache.spark._
>> import org.apache.spark.sql._
>>
>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> hiveContext.sql("from TEST SELECT NAME").collect.foreach(println)
>>
>> *Starting the Spark shell:*
>> spark-shell --jars /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar --driver-class-path /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar --packages com.databricks:spark-csv_2.10:1.3.0 --master yarn-client -i /TestDivya/Spark/InstrumentCopyToHDFSHive.scala
>>
>> *Stack trace:*
>>
>>> SQL context available as sqlContext.
>>> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
>>> import org.apache.spark._
>>> import org.apache.spark.sql._
>>> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive, version 1.2.1
>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is /user/hive/warehouse
>>> 16/02/29 23:09:29 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore with URI thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
>>> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
>>> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
>>> 16/02/29 23:09:31 INFO SessionState: Created local directory: /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory: /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>>> 16/02/29 23:09:31 INFO SessionState: Created local directory: /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory: /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
>>> hiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@10b14f32
>>> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST SELECT NAME
>>> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
>>> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352) called with curMem=0, maxMem=556038881
>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 457.4 KB, free 529.8 MB)
>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called with curMem=468352, maxMem=556038881
>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 48.3 KB, free 529.8 MB)
>>> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
>>> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from collect at <console>:30
>>> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job properties
>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process identifier=hconnection-0x26fa89a2 connecting to ZooKeeper ensemble=localhost:2181
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015 02:35 GMT
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:host.name=ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.version=1.7.0_67
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.vendor=Oracle Corporation
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.io.tmpdir=/tmp
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.compiler=<NA>
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.name=Linux
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.arch=amd64
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.version=3.10.0-229.el7.x86_64
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.name=hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.home=/home/hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.dir=/home/hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase
>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20034, negotiated timeout = 40000
>>> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an HTable instance that relies on an HBase-managed Connection. This is usually due to directly creating an HTable, which is deprecated. Instead, you should create a Connection object and then request a Table instance from it. If you don't need the Table instance for your own use, you should instead use the TableInputFormatBase.initalizeTable method directly.
>>> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an additional unmanaged connection because user provided one can't be used for administrative actions. We'll close it when we close out the table.
>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process identifier=hconnection-0x6fd74d35 connecting to ZooKeeper ensemble=localhost:2181
>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181, baseZNode=/hbase
>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035, negotiated timeout = 40000
>>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region sizes for table "TEST".
>>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception, tries=10, retries=35, started=48318 ms ago, cancelled=false, msg=
>>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception, tries=11, retries=35, started=68524 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception, tries=12, retries=35, started=88617 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception, tries=13, retries=35, started=108676 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception, tries=14, retries=35, started=128747 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception, tries=15, retries=35, started=148938 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception, tries=16, retries=35, started=168942 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17, retries=35, started=188975 ms ago, cancelled=false, msg=
>>> Trace :
>>
>> Could somebody help me resolve this error?
>> I would really appreciate the help.
>>
>> Thanks,
>> Divya