Forwarding you these mails; hope they can help. You can take a look
at this post:
http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html

2016-03-04 3:30 GMT+01:00 Divya Gehlot <[email protected]>:
> Hi Teng,
>
> Thanks for the link you shared; it helped me figure out the missing
> dependency.
> I was missing hbase-hadoop-compat.jar.
>
>
>
>
>
> Thanks a lot,
>
> Divya
>
> On 2 March 2016 at 17:05, Teng Qiu <[email protected]> wrote:
>>
>> Hi, maybe the dependencies described in
>> http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html
>> can help; add the hive-hbase-handler jar as well for Hive integration in
>> Spark.
>>
>> 2016-03-02 2:19 GMT+01:00 Divya Gehlot <[email protected]>:
>> > Hello Teng,
>> > As you can see in the email chain, I am facing lots of issues while
>> > trying to connect to an HBase-backed Hive table.
>> > Could you please help me with the list of jars which need to be placed
>> > in the Spark classpath?
>> > I would be very grateful if you could send me the steps to follow.
>> > Would really appreciate the help.
>> > Thanks,
>> > Divya
>> >
>> > On Mar 2, 2016 4:50 AM, "Teng Qiu" <[email protected]> wrote:
>> >>
>> >> And also make sure that hbase-site.xml is on your classpath on all
>> >> nodes, both master and workers, and also on the client.
>> >>
>> >> Normally I put it into $SPARK_HOME/conf/; then the Spark cluster will
>> >> be started with this conf file.
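The distribution step above can be sketched as a small shell snippet. Both paths below are assumptions for an HDP-style layout, not from the thread; adjust them for your cluster:

```shell
# Sketch only: hbase-site.xml must be readable by the driver and every
# executor, so drop it into Spark's conf dir on every node (master, workers,
# and the client you launch spark-shell from).
HBASE_SITE="/etc/hbase/conf/hbase-site.xml"
SPARK_CONF_DIR="/usr/hdp/current/spark-client/conf"
# Print the command rather than running it, since the paths are cluster-specific:
echo "cp $HBASE_SITE $SPARK_CONF_DIR/"
```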
>> >>
>> >> BTW @Ted, did you try inserting into an HBase table with Spark's
>> >> HiveContext? I got this issue:
>> >> https://issues.apache.org/jira/browse/SPARK-6628
>> >>
>> >> and there is a patch available:
>> >> https://issues.apache.org/jira/browse/HIVE-11166
>> >>
>> >>
>> >> 2016-03-01 15:16 GMT+01:00 Ted Yu <[email protected]>:
>> >> > 16/03/01 01:36:31 WARN TaskSetManager: Lost task 0.0 in stage 0.0
>> >> > (TID
>> >> > 0,
>> >> > ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal):
>> >> > java.lang.RuntimeException: hbase-default.xml file seems to be for an
>> >> > older
>> >> > version of HBase (null), this version is 1.1.2.2.3.4.0-3485
>> >> >
>> >> > The above was likely caused by some component being built against a
>> >> > different release of HBase.
>> >> >
>> >> > Try setting "hbase.defaults.for.version.skip" to true.
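One hedged way to apply the suggestion above is through the hbase-site.xml that the Spark job reads. The property name comes from the error message; the snippet below only writes an illustrative fragment to /tmp, which you would merge into your real hbase-site.xml:

```shell
# Sketch: write an illustrative hbase-site.xml fragment setting the version
# skip flag. Merge this <property> into the hbase-site.xml on your Spark
# classpath instead of using the /tmp file directly.
cat > /tmp/hbase-version-skip.xml <<'EOF'
<property>
  <name>hbase.defaults.for.version.skip</name>
  <value>true</value>
</property>
EOF
```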
>> >> >
>> >> > Cheers
>> >> >
>> >> >
>> >> > On Mon, Feb 29, 2016 at 9:12 PM, Ted Yu <[email protected]> wrote:
>> >> >>
>> >> >> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> >> >> connectString=localhost:2181 sessionTimeout=90000
>> >> >> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181,
>> >> >> baseZNode=/hbase
>> >> >>
>> >> >> Since baseZNode didn't match what you set in hbase-site.xml, the
>> >> >> cause was likely that hbase-site.xml was inaccessible to your Spark
>> >> >> job.
>> >> >>
>> >> >> Please add it to your classpath.
>> >> >>
>> >> >> On Mon, Feb 29, 2016 at 8:42 PM, Ted Yu <[email protected]> wrote:
>> >> >>>
>> >> >>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to
>> >> >>> server
>> >> >>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate
>> >> >>> using
>> >> >>> SASL
>> >> >>> (unknown error)
>> >> >>>
>> >> >>> Is your cluster a secure (Kerberized) cluster?
>> >> >>>
>> >> >>> bq. Trace :
>> >> >>>
>> >> >>> Was there any output after 'Trace :' ?
>> >> >>>
>> >> >>> Was hbase-site.xml accessible to your Spark job?
>> >> >>>
>> >> >>> Thanks
>> >> >>>
>> >> >>> On Mon, Feb 29, 2016 at 8:27 PM, Divya Gehlot
>> >> >>> <[email protected]>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> Hi,
>> >> >>>> I am getting an error when I try to connect to a Hive table
>> >> >>>> (which was created through HBaseIntegration) from Spark.
>> >> >>>>
>> >> >>>> Steps I followed :
>> >> >>>> Hive Table creation code  :
>> >> >>>> CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING,AGE INT)
>> >> >>>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> >> >>>> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
>> >> >>>> TBLPROPERTIES ("hbase.table.name" = "TEST",
>> >> >>>> "hbase.mapred.output.outputtable" = "TEST");
>> >> >>>>
>> >> >>>>
>> >> >>>> DESCRIBE TEST ;
>> >> >>>> col_name    data_type    comment
>> >> >>>> name            string         from deserializer
>> >> >>>> age               int             from deserializer
>> >> >>>>
>> >> >>>>
>> >> >>>> Spark Code :
>> >> >>>> import org.apache.spark._
>> >> >>>> import org.apache.spark.sql._
>> >> >>>>
>> >> >>>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> >> >>>> hiveContext.sql("from TEST SELECT  NAME").collect.foreach(println)
>> >> >>>>
>> >> >>>>
>> >> >>>> Starting Spark shell
>> >> >>>> spark-shell --jars
>> >> >>>>
>> >> >>>>
>> >> >>>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> >> >>>> --driver-class-path
>> >> >>>>
>> >> >>>>
>> >> >>>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> >> >>>> --packages com.databricks:spark-csv_2.10:1.3.0  --master
>> >> >>>> yarn-client
>> >> >>>> -i
>> >> >>>> /TestDivya/Spark/InstrumentCopyToHDFSHive.scala
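The long hand-maintained --jars list above is easy to get wrong (it even repeats hive-hbase-handler.jar and hbase-server.jar). A hedged shell sketch for building it instead; the HDP root and jar names are taken from the command above, plus hbase-hadoop-compat.jar, the dependency reported missing elsewhere in this thread:

```shell
# Sketch: assemble the comma-separated --jars value from a list of relative
# jar paths under the HDP root, instead of maintaining one long line by hand.
HDP=/usr/hdp/2.3.4.0-3485
JARS=""
for j in hive/lib/hive-hbase-handler.jar \
         hbase/lib/hbase-client.jar \
         hbase/lib/hbase-common.jar \
         hbase/lib/hbase-protocol.jar \
         hbase/lib/hbase-server.jar \
         hbase/lib/hbase-hadoop-compat.jar; do
  JARS="${JARS:+$JARS,}$HDP/$j"
done
echo "$JARS"
# Then launch, e.g.: spark-shell --jars "$JARS" --master yarn-client ...
```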
>> >> >>>>
>> >> >>>> Stack Trace :
>> >> >>>>
>> >> >>>>> Stack SQL context available as sqlContext.
>> >> >>>>> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
>> >> >>>>> import org.apache.spark._
>> >> >>>>> import org.apache.spark.sql._
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive,
>> >> >>>>> version 1.2.1
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>> >> >>>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is
>> >> >>>>> /user/hive/warehouse
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: Initializing
>> >> >>>>> HiveMetastoreConnection version 1.2.1 using Spark classes.
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>> >> >>>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load
>> >> >>>>> native-hadoop
>> >> >>>>> library for your platform... using builtin-java classes where
>> >> >>>>> applicable
>> >> >>>>> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore
>> >> >>>>> with
>> >> >>>>> URI
>> >> >>>>> thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
>> >> >>>>> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
>> >> >>>>> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit
>> >> >>>>> local
>> >> >>>>> reads feature cannot be used because libhadoop cannot be loaded.
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>> >> >>>>> /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>> >> >>>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>> >> >>>>> /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>> >> >>>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
>> >> >>>>> hiveContext: org.apache.spark.sql.hive.HiveContext =
>> >> >>>>> org.apache.spark.sql.hive.HiveContext@10b14f32
>> >> >>>>> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST
>> >> >>>>> SELECT
>> >> >>>>> NAME
>> >> >>>>> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
>> >> >>>>> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is
>> >> >>>>> deprecated.
>> >> >>>>> Instead, use mapreduce.job.maps
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352)
>> >> >>>>> called
>> >> >>>>> with
>> >> >>>>> curMem=0, maxMem=556038881
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as
>> >> >>>>> values
>> >> >>>>> in memory (estimated size 457.4 KB, free 529.8 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called
>> >> >>>>> with
>> >> >>>>> curMem=468352, maxMem=556038881
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0
>> >> >>>>> stored
>> >> >>>>> as
>> >> >>>>> bytes in memory (estimated size 48.3 KB, free 529.8 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0
>> >> >>>>> in
>> >> >>>>> memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from
>> >> >>>>> collect
>> >> >>>>> at <console>:30
>> >> >>>>> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job
>> >> >>>>> properties
>> >> >>>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>> >> >>>>> identifier=hconnection-0x26fa89a2 connecting to ZooKeeper
>> >> >>>>> ensemble=localhost:2181
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015
>> >> >>>>> 02:35 GMT
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> environment:host.name=ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.version=1.7.0_67
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.vendor=Oracle
>> >> >>>>> Corporation
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.io.tmpdir=/tmp
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.compiler=<NA>
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:os.name=Linux
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:os.arch=amd64
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:os.version=3.10.0-229.el7.x86_64
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:user.name=hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:user.home=/home/hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:user.dir=/home/hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> >> >>>>> connectString=localhost:2181 sessionTimeout=90000
>> >> >>>>> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181,
>> >> >>>>> baseZNode=/hbase
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to
>> >> >>>>> server
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate
>> >> >>>>> using SASL
>> >> >>>>> (unknown error)
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established
>> >> >>>>> to
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete
>> >> >>>>> on
>> >> >>>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid =
>> >> >>>>> 0x3532fb70ba20034,
>> >> >>>>> negotiated timeout = 40000
>> >> >>>>> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an
>> >> >>>>> HTable
>> >> >>>>> instance that relies on an HBase-managed Connection. This is
>> >> >>>>> usually
>> >> >>>>> due to
>> >> >>>>> directly creating an HTable, which is deprecated. Instead, you
>> >> >>>>> should create
>> >> >>>>> a Connection object and then request a Table instance from it. If
>> >> >>>>> you don't
>> >> >>>>> need the Table instance for your own use, you should instead use
>> >> >>>>> the
>> >> >>>>> TableInputFormatBase.initalizeTable method directly.
>> >> >>>>> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an
>> >> >>>>> additional
>> >> >>>>> unmanaged connection because user provided one can't be used for
>> >> >>>>> administrative actions. We'll close it when we close out the
>> >> >>>>> table.
>> >> >>>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>> >> >>>>> identifier=hconnection-0x6fd74d35 connecting to ZooKeeper
>> >> >>>>> ensemble=localhost:2181
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> >> >>>>> connectString=localhost:2181 sessionTimeout=90000
>> >> >>>>> watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181,
>> >> >>>>> baseZNode=/hbase
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to
>> >> >>>>> server
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate
>> >> >>>>> using SASL
>> >> >>>>> (unknown error)
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established
>> >> >>>>> to
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete
>> >> >>>>> on
>> >> >>>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid =
>> >> >>>>> 0x3532fb70ba20035,
>> >> >>>>> negotiated timeout = 40000
>> >> >>>>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region
>> >> >>>>> sizes
>> >> >>>>> for table "TEST".
>> >> >>>>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=10,
>> >> >>>>> retries=35, started=48318 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=11,
>> >> >>>>> retries=35, started=68524 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=12,
>> >> >>>>> retries=35, started=88617 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=13,
>> >> >>>>> retries=35, started=108676 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=14,
>> >> >>>>> retries=35, started=128747 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=15,
>> >> >>>>> retries=35, started=148938 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=16,
>> >> >>>>> retries=35, started=168942 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=17,
>> >> >>>>> retries=35, started=188975 ms ago, cancelled=false, msg=
>> >> >>>>> Trace :
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> Could somebody help me resolve this error?
>> >> >>>> I would really appreciate the help.
>> >> >>>>
>> >> >>>>
>> >> >>>> Thanks,
>> >> >>>> Divya
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>
>> >> >>
>> >> >
>
>

2016-04-08 8:45 GMT+02:00 Wojciech Indyk <[email protected]>:
> Hello Divya!
> Have you solved the problem?
> I suppose the log comes from the driver. You should also look at the logs
> on the worker JVMs; there may be an exception or something there.
> Do you have Kerberos on your cluster? It could be similar to this problem:
> http://issues.apache.org/jira/browse/SPARK-14115
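Since this ran with --master yarn-client, the worker-side logs mentioned above can be pulled with YARN's log aggregation. A sketch; the application id below is a placeholder, taken from the Spark UI or `yarn application -list` in practice:

```shell
# Sketch: fetch aggregated executor logs for a finished YARN application.
# APP_ID is a placeholder, not a value from this thread.
APP_ID="application_0000000000000_0001"
echo "yarn logs -applicationId $APP_ID"
```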
>
> Based on your logs:
>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>> (unknown error)
>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
>
> Maybe there is a problem with the RPC calls to the regions going over IPv6
> (but that is just a guess).
>
> --
> Kind regards/ Pozdrawiam,
> Wojciech Indyk
> http://datacentric.pl
>
>
> 2016-03-01 5:27 GMT+01:00 Divya Gehlot <[email protected]>:
>> Hi,
>> I am getting an error when I try to connect to a Hive table (which was
>> created through HBaseIntegration) from Spark.
>>
>> Steps I followed :
>> *Hive Table creation code  *:
>> CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING,AGE INT)
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
>> TBLPROPERTIES ("hbase.table.name" = "TEST",
>> "hbase.mapred.output.outputtable" = "TEST");
>>
>>
>> *DESCRIBE TEST ;*
>> col_name    data_type    comment
>> name            string         from deserializer
>> age               int             from deserializer
>>
>>
>> *Spark Code :*
>> import org.apache.spark._
>> import org.apache.spark.sql._
>>
>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> hiveContext.sql("from TEST SELECT  NAME").collect.foreach(println)
>>
>>
>> *Starting Spark shell*
>> spark-shell --jars
>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> --driver-class-path
>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> --packages com.databricks:spark-csv_2.10:1.3.0  --master yarn-client -i
>> /TestDivya/Spark/InstrumentCopyToHDFSHive.scala
>>
>> *Stack Trace* :
>>
>> Stack SQL context available as sqlContext.
>>> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
>>> import org.apache.spark._
>>> import org.apache.spark.sql._
>>> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive, version
>>> 1.2.1
>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is
>>> /user/hive/warehouse
>>> 16/02/29 23:09:29 INFO HiveContext: Initializing HiveMetastoreConnection
>>> version 1.2.1 using Spark classes.
>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load native-hadoop
>>> library for your platform... using builtin-java classes where applicable
>>> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore with URI
>>> thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
>>> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
>>> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit local reads
>>> feature cannot be used because libhadoop cannot be loaded.
>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>>> /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>>> /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
>>> hiveContext: org.apache.spark.sql.hive.HiveContext =
>>> org.apache.spark.sql.hive.HiveContext@10b14f32
>>> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST SELECT  NAME
>>> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
>>> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is deprecated.
>>> Instead, use mapreduce.job.maps
>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352) called with
>>> curMem=0, maxMem=556038881
>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as values in
>>> memory (estimated size 457.4 KB, free 529.8 MB)
>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called with
>>> curMem=468352, maxMem=556038881
>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0 stored as
>>> bytes in memory (estimated size 48.3 KB, free 529.8 MB)
>>> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0 in
>>> memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
>>> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from collect at
>>> <console>:30
>>> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job
>>> properties
>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>>> identifier=hconnection-0x26fa89a2 connecting to ZooKeeper
>>> ensemble=localhost:2181
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015 02:35 GMT
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:host.name
>>> =ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.version=1.7.0_67
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.vendor=Oracle
>>> Corporation
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.io.tmpdir=/tmp
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.compiler=<NA>
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.name=Linux
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.arch=amd64
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:os.version=3.10.0-229.el7.x86_64
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.name=hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.home=/home/hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.dir=/home/hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>>> connectString=localhost:2181 sessionTimeout=90000
>>> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase
>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>>> (unknown error)
>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20034,
>>> negotiated timeout = 40000
>>> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an HTable
>>> instance that relies on an HBase-managed Connection. This is usually due to
>>> directly creating an HTable, which is deprecated. Instead, you should
>>> create a Connection object and then request a Table instance from it. If
>>> you don't need the Table instance for your own use, you should instead use
>>> the TableInputFormatBase.initalizeTable method directly.
>>> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an additional
>>> unmanaged connection because user provided one can't be used for
>>> administrative actions. We'll close it when we close out the table.
>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>>> identifier=hconnection-0x6fd74d35 connecting to ZooKeeper
>>> ensemble=localhost:2181
>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>>> connectString=localhost:2181 sessionTimeout=90000
>>> watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181, baseZNode=/hbase
>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>>> (unknown error)
>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
>>> negotiated timeout = 40000
>>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region sizes for
>>> table "TEST".
>>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception, tries=10,
>>> retries=35, started=48318 ms ago, cancelled=false, msg=
>>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception, tries=11,
>>> retries=35, started=68524 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception, tries=12,
>>> retries=35, started=88617 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception, tries=13,
>>> retries=35, started=108676 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception, tries=14,
>>> retries=35, started=128747 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception, tries=15,
>>> retries=35, started=148938 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception, tries=16,
>>> retries=35, started=168942 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17,
>>> retries=35, started=188975 ms ago, cancelled=false, msg=
>>> Trace :
>>
>>
>>
>> Could somebody help me resolve this error?
>> I would really appreciate the help.
>>
>>
>> Thanks,
>> Divya
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
