Forwarding you these mails, hope they can help you. You can also take a look at this post:
http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html
2016-03-04 3:30 GMT+01:00 Divya Gehlot <[email protected]>:
> Hi Teng,
>
> Thanks for the link you shared, it helped me figure out the missing
> dependency. I was missing hbase-hadoop-compat.jar.
>
> Thanks a lot,
> Divya
>
> On 2 March 2016 at 17:05, Teng Qiu <[email protected]> wrote:
>>
>> Hi, maybe the dependencies described in
>> http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html
>> can help. Add the hive-hbase-handler jar as well, for the Hive
>> integration in Spark.
>>
>> 2016-03-02 2:19 GMT+01:00 Divya Gehlot <[email protected]>:
>> > Hello Teng,
>> > As you can see in the email chain below, I am facing lots of issues
>> > while trying to connect to an HBase-backed Hive table.
>> > Could you please help me with the list of jars which need to be
>> > placed in the Spark classpath, and the steps to follow?
>> > I would really appreciate the help.
>> > Thanks,
>> > Divya
>> >
>> > On Mar 2, 2016 4:50 AM, "Teng Qiu" <[email protected]> wrote:
>> >>
>> >> Also make sure that hbase-site.xml is on your classpath on all
>> >> nodes, both master and workers, and also on the client. Normally I
>> >> put it into $SPARK_HOME/conf/, then the Spark cluster will be
>> >> started with this conf file.
>> >>
>> >> BTW @Ted, did you try inserting into an HBase table with Spark's
>> >> HiveContext? I got this issue:
>> >> https://issues.apache.org/jira/browse/SPARK-6628
>> >>
>> >> and there is a patch available:
>> >> https://issues.apache.org/jira/browse/HIVE-11166
>> >>
>> >> 2016-03-01 15:16 GMT+01:00 Ted Yu <[email protected]>:
>> >> > 16/03/01 01:36:31 WARN TaskSetManager: Lost task 0.0 in stage 0.0
>> >> > (TID 0, ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal):
>> >> > java.lang.RuntimeException: hbase-default.xml file seems to be
>> >> > for an older version of HBase (null), this version is
>> >> > 1.1.2.2.3.4.0-3485
>> >> >
>> >> > The above was likely caused by some component being built against
>> >> > a different release of HBase.
>> >> >
>> >> > Try setting "hbase.defaults.for.version.skip" to true.
>> >> >
>> >> > Cheers
>> >> >
>> >> > On Mon, Feb 29, 2016 at 9:12 PM, Ted Yu <[email protected]> wrote:
>> >> >>
>> >> >> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> >> >> connectString=localhost:2181 sessionTimeout=90000
>> >> >> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181,
>> >> >> baseZNode=/hbase
>> >> >>
>> >> >> Since the baseZNode didn't match what you set in hbase-site.xml,
>> >> >> the cause was likely that hbase-site.xml was inaccessible to
>> >> >> your Spark job.
>> >> >>
>> >> >> Please add it to your classpath.
>> >> >>
>> >> >> On Mon, Feb 29, 2016 at 8:42 PM, Ted Yu <[email protected]> wrote:
>> >> >>>
>> >> >>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to
>> >> >>> server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to
>> >> >>> authenticate using SASL (unknown error)
>> >> >>>
>> >> >>> Is your cluster a secure cluster?
>> >> >>>
>> >> >>> bq. Trace :
>> >> >>>
>> >> >>> Was there any output after 'Trace :'?
>> >> >>>
>> >> >>> Was hbase-site.xml accessible to your Spark job?
>> >> >>>
>> >> >>> Thanks
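Both of Ted's points can be verified from inside spark-shell. A minimal
sketch (if the resource lookup prints null, or the quorum prints localhost,
then hbase-site.xml is not on the driver classpath):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.hbase.HBaseConfiguration

    // is hbase-site.xml visible on the driver classpath at all?
    println(getClass.getResource("/hbase-site.xml"))

    // Ted's workaround for the hbase-default.xml version mismatch, set
    // before the HBase resources are loaded; putting the property into
    // hbase-site.xml itself is the more permanent fix
    val conf = new Configuration()
    conf.setBoolean("hbase.defaults.for.version.skip", true)
    HBaseConfiguration.addHbaseResources(conf)

    // "localhost" here means hbase-site.xml was not picked up
    println(conf.get("hbase.zookeeper.quorum"))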
>> >> >>> On Mon, Feb 29, 2016 at 8:27 PM, Divya Gehlot
>> >> >>> <[email protected]> wrote:
>> >> >>>>
>> >> >>>> Hi,
>> >> >>>> I am getting an error when I try to connect to a Hive table
>> >> >>>> (created through HBaseIntegration) from Spark.
>> >> >>>>
>> >> >>>> Steps I followed:
>> >> >>>>
>> >> >>>> Hive table creation code:
>> >> >>>> CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING, AGE INT)
>> >> >>>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> >> >>>> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
>> >> >>>> TBLPROPERTIES ("hbase.table.name" = "TEST",
>> >> >>>> "hbase.mapred.output.outputtable" = "TEST");
>> >> >>>>
>> >> >>>> DESCRIBE TEST;
>> >> >>>> col_name    data_type    comment
>> >> >>>> name        string       from deserializer
>> >> >>>> age         int          from deserializer
>> >> >>>>
>> >> >>>> Spark code:
>> >> >>>> import org.apache.spark._
>> >> >>>> import org.apache.spark.sql._
>> >> >>>>
>> >> >>>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> >> >>>> hiveContext.sql("from TEST SELECT NAME").collect.foreach(println)
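Before digging into the HBase side, it is worth confirming from the same
shell that the metastore half works on its own. A minimal sketch (the
"Connected to metastore" line in the trace below suggests it does):

    // both of these go through the metastore only, so they should succeed
    // even while the HBase scan below hangs in its retry loop
    hiveContext.tables().show()
    hiveContext.sql("DESCRIBE TEST").collect().foreach(println)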
>> >> >>>> Starting the Spark shell:
>> >> >>>> spark-shell --jars
>> >> >>>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> >> >>>> --driver-class-path
>> >> >>>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> >> >>>> --packages com.databricks:spark-csv_2.10:1.3.0 --master yarn-client
>> >> >>>> -i /TestDivya/Spark/InstrumentCopyToHDFSHive.scala
>> >> >>>>
>> >> >>>> Stack trace:
>> >> >>>>
>> >> >>>>> SQL context available as sqlContext.
>> >> >>>>> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
>> >> >>>>> import org.apache.spark._
>> >> >>>>> import org.apache.spark.sql._
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive, version 1.2.1
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is /user/hive/warehouse
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> >> >>>>> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore with URI thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
>> >> >>>>> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
>> >> >>>>> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created local directory: /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory: /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created local directory: /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory: /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
>> >> >>>>> hiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@10b14f32
>> >> >>>>> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST SELECT NAME
>> >> >>>>> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
>> >> >>>>> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352) called with curMem=0, maxMem=556038881
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 457.4 KB, free 529.8 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called with curMem=468352, maxMem=556038881
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 48.3 KB, free 529.8 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from collect at <console>:30
>> >> >>>>> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job properties
>> >> >>>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process identifier=hconnection-0x26fa89a2 connecting to ZooKeeper ensemble=localhost:2181
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015 02:35 GMT
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:host.name=ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.version=1.7.0_67
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.vendor=Oracle Corporation
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.io.tmpdir=/tmp
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.compiler=<NA>
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.name=Linux
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.arch=amd64
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.version=3.10.0-229.el7.x86_64
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.name=hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.home=/home/hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.dir=/home/hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20034, negotiated timeout = 40000
>> >> >>>>> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an HTable instance that relies on an HBase-managed Connection. This is usually due to directly creating an HTable, which is deprecated. Instead, you should create a Connection object and then request a Table instance from it. If you don't need the Table instance for your own use, you should instead use the TableInputFormatBase.initalizeTable method directly.
>> >> >>>>> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an additional unmanaged connection because user provided one can't be used for administrative actions. We'll close it when we close out the table.
>> >> >>>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process identifier=hconnection-0x6fd74d35 connecting to ZooKeeper ensemble=localhost:2181
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181, baseZNode=/hbase
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035, negotiated timeout = 40000
>> >> >>>>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region sizes for table "TEST".
>> >> >>>>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception, tries=10, retries=35, started=48318 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception, tries=11, retries=35, started=68524 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception, tries=12, retries=35, started=88617 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception, tries=13, retries=35, started=108676 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception, tries=14, retries=35, started=128747 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception, tries=15, retries=35, started=148938 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception, tries=16, retries=35, started=168942 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17, retries=35, started=188975 ms ago, cancelled=false, msg=
>> >> >>>>> Trace :
>> >> >>>>
>> >> >>>> Could somebody help me in resolving this error?
>> >> >>>> I would really appreciate the help.
>> >> >>>>
>> >> >>>> Thanks,
>> >> >>>> Divya

2016-04-08 8:45 GMT+02:00 Wojciech Indyk <[email protected]>:
> Hello Divya!
> Have you solved the problem?
> I suppose the log comes from the driver. You also need to look at the
> logs on the worker JVMs; there may be an exception or something there.
> Do you have Kerberos on your cluster? It could be similar to this
> problem: http://issues.apache.org/jira/browse/SPARK-14115
>
> Based on your logs:
>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>> (unknown error)
>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
>
> Maybe there is a problem with the RPC calls to the regions going over
> IPv6 (but I am just guessing).
>
> --
> Kind regards / Pozdrawiam,
> Wojciech Indyk
> http://datacentric.pl
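Two of these checks can be run from the spark-shell session itself. A
minimal sketch (the parallelize trick runs the lookup on an executor,
since, as Teng noted above, hbase-site.xml has to be visible on the
workers too):

    // is Hadoop security (Kerberos) enabled at all?
    println(org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled)

    // the ZooKeeper quorum as seen from an executor JVM; "localhost" on a
    // worker means that node is also missing hbase-site.xml, which matches
    // the quorum=localhost:2181 lines in the trace above
    sc.parallelize(1 to 1).map { _ =>
      org.apache.hadoop.hbase.HBaseConfiguration.create().get("hbase.zookeeper.quorum")
    }.collect().foreach(println)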
> 2016-03-01 5:27 GMT+01:00 Divya Gehlot <[email protected]>:
>> [Divya's original message, with the table definition, spark-shell
>> command and full stack trace already quoted above, snipped]
