Great find. I agree with upgrading storm-hive to a newer Hive; it is probably overdue. It would be great if you could provide the PRs.

Thanks,
Roshan

On Tuesday, June 12, 2018, 3:47 AM, Abhishek Raj <[email protected]> wrote:

Circling back to this, I was able to figure out what the problem was. storm-hive's dependencies are compiled against Hive 0.14.0, which is very old. We are using Hive 2.3.2, so there are a lot of differences between the two versions. The fix was to override storm-hive's Hive dependencies in our pom with a newer version. Concretely, we copied the following into our pom:

    <dependency>
      <groupId>org.apache.hive.hcatalog</groupId>
      <artifactId>hive-hcatalog-streaming</artifactId>
      <version>2.3.2</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.calcite</groupId>
          <artifactId>calcite-core</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.calcite</groupId>
          <artifactId>calcite-avatica</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <dependency>
      <groupId>org.apache.hive.hcatalog</groupId>
      <artifactId>hive-hcatalog-core</artifactId>
      <version>2.3.2</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.calcite</groupId>
          <artifactId>calcite-avatica</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.calcite</groupId>
          <artifactId>calcite-core</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-cli</artifactId>
      <version>2.3.2</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.calcite</groupId>
          <artifactId>calcite-core</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.calcite</groupId>
          <artifactId>calcite-avatica</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

To go a little deeper into the problem, this is what changed between the two Hive versions to give us "Unexpected DataOperationType: UNSET". createLockRequest in 2.3.2 explicitly passes the operation type "INSERT" when acquiring a lock, but in 0.14.0 no operation type is passed. The operation type therefore defaults to UNSET, and the metastore throws an error. This is the commit where the change occurred.

Since there are several different threads with users facing the same issue, imho storm-hive's Hive dependencies should be updated to a newer Hive release, and there should be a way for users to explicitly specify which Hive release they want to use storm-hive with. The documentation for storm-hive should also be updated to reflect this requirement. Happy to provide PRs if that sounds like a good idea.

Thanks.

On Fri, Jun 8, 2018 at 3:21 PM, Abhishek Raj <[email protected]> wrote:

Hi. We faced a similar problem earlier when trying HiveBolt in Storm with Hive on EMR. We were seeing

java.lang.IllegalStateException: Unexpected DataOperationType: UNSET agentInfo=Unknown txnid:130551

in the Hive logs. Any help here would be appreciated.

On Fri, Jun 8, 2018 at 10:26 AM, Milind Vaidya <[email protected]> wrote:

Here are some details from the metastore logs:

2018-06-08T03:34:20,634 ERROR [pool-13-thread-197([])]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(204)) - java.lang.IllegalStateException: Unexpected DataOperationType: UNSET agentInfo=Unknown txnid:130551
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1000)
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:872)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:6366)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
    at com.sun.proxy.$Proxy32.lock(Unknown Source)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:14155)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:14139)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Here are some details about the environment.

Source:
Storm topology version : 1.1.1
storm-hive version : 1.1.1

The mvn dependency plugin shows the following dependencies.

Hive :
[INFO] org.apache.hive.shims:hive-shims-0.23:jar:0.14.0:runtime
[INFO] org.apache.hive:hive-ant:jar:0.14.0:compile
[INFO] org.apache.hive:hive-metastore:jar:0.14.0:compile
[INFO] org.apache.hive:hive-shims:jar:0.14.0:compile
[INFO] org.apache.hive:hive-cli:jar:0.14.0:compile
[INFO] org.apache.hive:hive-exec:jar:0.14.0:compile
[INFO] org.apache.hive.shims:hive-shims-common-secure:jar:0.14.0:compile
[INFO] org.apache.hive.shims:hive-shims-common:jar:0.14.0:compile
[INFO] org.apache.hive:hive-common:jar:0.14.0:compile
[INFO] org.apache.hive.shims:hive-shims-0.20S:jar:0.14.0:runtime
[INFO] org.apache.hive.shims:hive-shims-0.20:jar:0.14.0:runtime
[INFO] org.apache.hive.hcatalog:hive-hcatalog-streaming:jar:0.14.0:compile
[INFO] org.apache.hive:hive-serde:jar:0.14.0:compile
[INFO] org.apache.storm:storm-hive:jar:1.1.1:compile
[INFO] org.apache.hive:hive-service:jar:0.14.0:compile
[INFO] org.apache.hive.hcatalog:hive-hcatalog-core:jar:0.14.0:compile

Hadoop :
org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-yarn-common:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-common:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-yarn-api:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-client:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-auth:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-yarn-client:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-mapreduce-client-common:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-annotations:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-hdfs:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:2.6.1:compile
[INFO] org.apache.hadoop:hadoop-yarn-server-common:jar:2.6.1:compile

and HDFS :
[INFO] org.apache.hadoop:hadoop-hdfs:jar:2.6.1:compile

Sink: Hive on EMR

Hadoop version :
[hadoop@ip-10-0-6-16 ~]$ hadoop version
Hadoop 2.8.3-amzn-0

Hive version :
[hadoop@ip-10-0-6-16 ~]$ hive --version
Hive 2.3.2-amzn-2

Is there any inconsistency here that could lead to such an error?

On Thu, Jun 7, 2018 at 7:35 PM, Roshan Naik <[email protected]> wrote:

The lock issue seems to be happening on the metastore end and surfacing via the API. Partition creation is working, but the API is unable to acquire a TxnBatch from the metastore due to the lock issue. Check the Hive metastore logs and see why the locks are failing.

Roshan

On Thursday, June 7, 2018, 11:08 AM, Milind Vaidya <[email protected]> wrote:

Hi,

I am using Storm and storm-hive version 1.1.1 to store data directly to a Hive cluster. After using the mvn shade plugin and overcoming a few other errors, I am now stuck at this point. The strange thing I observed was that a few partitions were created but the data was not inserted.

dt=17688/platform=site/country=SG/entity_id=abcd
dt=17688/platform=site/country=SG/entity_id=asdlfa
dt=17688/platform=site/country=SG/entity_id=asdq13
dt=17688/platform=site/country=SG/entity_id=123124

What are my debugging options here?

(Some data from the log is removed intentionally.)

2018-06-07 16:35:22.459 h.metastore Thread-12-users-by-song-hive-bolt-executor[5 5] [INFO] Connected to metastore.
2018-06-07 16:35:22.545 o.a.s.h.b.HiveBolt Thread-12-users-by-song-hive-bolt-executor[5 5] [ERROR] Failed to create HiveWriter for endpoint: { }
org.apache.storm.hive.common.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='', database='', table='', partitionVals=[] }
    at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:80) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveUtils.makeHiveWriter(HiveUtils.java:50) ~[stormjar.jar:?]
    at org.apache.storm.hive.bolt.HiveBolt.getOrCreateWriter(HiveBolt.java:262) [stormjar.jar:?]
    at org.apache.storm.hive.bolt.HiveBolt.execute(HiveBolt.java:112) [stormjar.jar:?]
    at org.apache.storm.daemon.executor$fn__5030$tuple_action_fn__5032.invoke(executor.clj:729) [storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.executor$mk_task_receiver$fn__4951.invoke(executor.clj:461) [storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.disruptor$clojure_handler$reify__4465.onEvent(disruptor.clj:40) [storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:482) [storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:460) [storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.executor$fn__5030$fn__5043$fn__5096.invoke(executor.clj:848) [storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484) [storm-core-1.1.1.jar:1.1.1]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_131]
Caused by: org.apache.storm.hive.common.HiveWriter$TxnBatchFailure: Failed acquiring Transaction Batch from EndPoint: {metaStoreUri='', database='', table='', partitionVals=[, , , ] }
    at org.apache.storm.hive.common.HiveWriter.nextTxnBatch(HiveWriter.java:264) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:72) ~[stormjar.jar:?]
    ... 13 more
Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on { }
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:575) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:544) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter.nextTxnBatch(HiveWriter.java:259) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:72) ~[stormjar.jar:?]
    ... 13 more
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) ~[stormjar.jar:?]
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) ~[stormjar.jar:?]
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) ~[stormjar.jar:?]
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) ~[stormjar.jar:?]
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) ~[stormjar.jar:?]
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) ~[stormjar.jar:?]
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3781) ~[stormjar.jar:?]
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3768) ~[stormjar.jar:?]
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:1736) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:570) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:544) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter.nextTxnBatch(HiveWriter.java:259) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:72) ~[stormjar.jar:?]
    ... 13 more
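
Editor's illustration of the diagnosis given earlier in this thread: a client built against the old Hive sends a lock request with no operation type, the newer metastore sees the default UNSET, and it rejects the request. The sketch below is a simplified model under stated assumptions, not Hive's code. The class LockRequestSketch, its LockRequest and OperationType types, and the methods oldClientRequest, newClientRequest and enqueueLock are all hypothetical names made up for this example; only the values UNSET and INSERT and the exception message text are taken from the messages above.

```java
/**
 * Simplified model of the client/metastore mismatch described in the thread.
 * All types and method names here are hypothetical; this is NOT Hive's API.
 */
final class LockRequestSketch {

    /** Operation types named in the thread; UNSET is the default value. */
    enum OperationType { UNSET, INSERT }

    /** Hypothetical stand-in for a lock request sent by a streaming client. */
    static final class LockRequest {
        // null models "the client did not send this field at all"
        private final OperationType operationType;

        LockRequest(OperationType operationType) {
            this.operationType = operationType;
        }

        /** The server side sees the default when the field is absent. */
        OperationType operationTypeOrDefault() {
            return operationType == null ? OperationType.UNSET : operationType;
        }
    }

    /** Models the old client as described: no operation type is passed. */
    static LockRequest oldClientRequest() {
        return new LockRequest(null);
    }

    /** Models the newer client as described: INSERT is passed explicitly. */
    static LockRequest newClientRequest() {
        return new LockRequest(OperationType.INSERT);
    }

    /** Models a newer metastore that refuses lock requests left at UNSET. */
    static String enqueueLock(LockRequest request) {
        OperationType type = request.operationTypeOrDefault();
        if (type == OperationType.UNSET) {
            throw new IllegalStateException("Unexpected DataOperationType: " + type);
        }
        return "lock granted for " + type;
    }

    public static void main(String[] args) {
        // Newer client against newer metastore: the lock is granted.
        System.out.println(enqueueLock(newClientRequest()));

        // Old client against newer metastore: the request is rejected.
        try {
            enqueueLock(oldClientRequest());
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model the only difference between the two requests is whether the operation type is supplied, which is why aligning the client-side Hive version with the metastore, as done with the pom override earlier in the thread, makes the error go away.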
