Re: Storm hive bolt
You can download my project from this link: http://itzone.pl/tmp234der/StormSample.zip

It is a simple topology with a Kafka spout (it works), an HBase bolt (it works), and a Hive bolt (it doesn't work).

I've created the Hive table:

CREATE TABLE stock_prices(
  day DATE,
  open FLOAT,
  high FLOAT,
  low FLOAT,
  close FLOAT,
  volume INT,
  adj_close FLOAT
)
PARTITIONED BY (name STRING)
CLUSTERED BY (day) INTO 5 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

I've created the HBase table:

create 'stock_prices', 'cf'

I've created the Kafka topic:

/usr/hdf/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper hdf1.local:2181,hdf2.local:2181,hdf3.local:2181 --replication-factor 3 --partitions 3 --topic my-topic

I've deployed the app to Storm:

storm jar /root/StormSample-0.0.1-SNAPSHOT.jar mk.stormkafka.KafkaSpoutTestTopology MKjobarg1XXX

When I deploy it and publish a sample message to the Kafka topic, I cannot save data to the Hive table. I'm 100% sure I have Hive configured OK, because I can add something to this table manually outside Storm:

2017-03-30,11,12,13,14,15,16,Marcin2345

Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://hdp1.local:9083', database='default', table='stock_prices', partitionVals=[Marcin] }
        at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:575) ~[stormjar.jar:?]
        at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:544) ~[stormjar.jar:?]
        at org.apache.storm.hive.common.HiveWriter.nextTxnBatch(HiveWriter.java:259) ~[stormjar.jar:?]
        at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:72) ~[stormjar.jar:?]
        ... 13 more

I think it could be a pom dependencies problem. I have no idea how to fix it. Can you help me?
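For reference, the Hive streaming ingest API that the storm-hive bolt relies on only works when ACID transactions are enabled on the metastore side. A minimal hive-site.xml sketch using the standard property names from the Hive transactions documentation (illustrative only, not the poster's actual configuration):

```xml
<!-- hive-site.xml: minimal settings required by Hive streaming ingest (ACID) -->
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

Since the poster can insert into the transactional table manually, these are probably already set; they are listed here only to rule out the metastore side before digging into the client-side dependency mismatch.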
pozdrawiam
Marcin Kasiński
http://itzone.pl

On 31 March 2017 at 12:35, Igor Kuzmenko wrote:
> Check this example:
> https://github.com/hortonworks/storm-release/blob/HDP-2.5.0.0-tag/external/storm-hive/src/test/java/org/apache/storm/hive/bolt/HiveTopology.java
>
> If you can, please post your topology code. It's strange that you are using
> the org.apache.hadoop.hive package directly.
>
>
> On Fri, Mar 31, 2017 at 1:08 PM, Marcin Kasiński wrote:
>>
>> After changing it I have lots of errors in Eclipse:
>> "The import org.apache.hadoop.hive cannot be resolved
>> TestHiveBolt.java  /StormSample/src/mk/storm/hive  line 26  Java Problem"
>>
>> Do you have a hello world Storm Hive project (HDP 2.5 and HDF 2.1)?
>>
>> Can you send it to me and I will try it?
>>
>>
>> pozdrawiam
>> Marcin Kasiński
>> http://itzone.pl
>>
>>
>> On 31 March 2017 at 11:30, Igor Kuzmenko wrote:
>> > I'm using the hive streaming bolt with HDP 2.5.0.0.
>> > Try this:
>> >
>> > <repository>
>> >     <id>hortonworks</id>
>> >     <url>http://nexus-private.hortonworks.com/nexus/content/groups/public/</url>
>> > </repository>
>> >
>> > <dependency>
>> >     <groupId>org.apache.storm</groupId>
>> >     <artifactId>storm-hive</artifactId>
>> >     <version>1.0.1.2.5.0.0-1245</version>
>> > </dependency>
>> >
>> > On Fri, Mar 31, 2017 at 11:32 AM, Marcin Kasiński wrote:
>> >>
>> >> Hi Eugene.
>> >>
>> >> Below you have my pom file.
>> >>
>> >> Can you check it and fix it to use repositories in the proper way, please?
>> >>
>> >> I've been working on this problem for over 2 weeks and I'm losing hope.
>> >> <project xmlns="http://maven.apache.org/POM/4.0.0"
>> >>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> >>     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>> >>     http://maven.apache.org/xsd/maven-4.0.0.xsd">
>> >>   <modelVersion>4.0.0</modelVersion>
>> >>   <groupId>StormSample</groupId>
>> >>   <artifactId>StormSample</artifactId>
>> >>   <version>0.0.1-SNAPSHOT</version>
>> >>
>> >>   <properties>
>> >>     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
>> >>     <maven.compiler.source>1.7</maven.compiler.source>
>> >>     <maven.compiler.target>1.7</maven.compiler.target>
>> >>     <!-- further version properties: 1.0.1, 0.3.0, 0.8.2.2.3.0.0-2557, 1.7.7, 4.11 -->
>> >>   </properties>
>> >>
>> >>   <build>
>> >>     <sourceDirectory>src</sourceDirectory>
>> >>     <plugins>
>> >>       <plugin>
>> >>         <artifactId>maven-compiler-plugin</artifactId>
>> >>         <version>3.3</version>
>> >>         <configuration>
>> >>           <source>1.8</source>
>> >>           <target>1.8</target>
>> >>         </configuration>
>> >>       </plugin>
>> >>       <plugin>
>> >>         <groupId>org.apache.maven.plugins</groupId>
>> >>         <artifactId>maven-jar-plugin</artifactId>
>> >>         <configuration>
>> >>           <archive>
>> >>             <manifest>
>> >>               <addClasspath>true</addClasspath>
>> >>               <classpathPrefix>lib/</classpathPrefix>
>> >>               <mainClass>mk.StormSample</mainClass>
>> >>             </manifest>
>> >>           </archive>
>> >>         </configuration>
>> >>       </plugin>
>> >>       <plugin>
>> >>         <groupId>org.apache.maven.plugins</groupId>
>> >>         <artifactId>maven-shade-plugin</artifactId>
>> >>         <version>1.4</version>
>> >>         <configuration>
>> >>           <createDependencyReducedPom>true</createDependencyReducedPom>
>> >>         </configuration>
>> >>         <executions>
>> >>           <execution>
>> >>             <phase>package</phase>
>> >>             <goals>
>> >>               <goal>shade</goal>
>> >>             </goals>
Re: ORC split
Please find answers inline.

Thanks
Prasanth

From: Alberto Ramón
Sent: Friday, March 31, 2017 9:32 AM
Subject: ORC split

Some doubts about ORC:

1- Is hive.exec.orc.default.buffer.size used for read or write?

Configurable only during write. The writer writes this buffer size into the footer, which readers use during decompression.

2- Is orc.stripe.size compressed or uncompressed?

Both. The stripe size is essentially the sum of all buffers of all columns (plus the dictionary size) held in memory.

3- Must orc.stripe.size be a multiple of the HDFS block size?

It is optimal to have it as a multiple of the HDFS block size; otherwise the writer will adjust the last stripe size within a block so as not to straddle the HDFS block boundary, or pad the remaining space if it is less than 5% of the block size. Note that the HDFS block size is configurable via orc.block.size and is independent of the cluster-wide block size. The default stripe size is 64 MB and the default block size is 256 MB.

4- When reading an ORC file, does the number of mappers depend on HDFS blocks or on the number of stripes?

It depends. If predicate pushdown is enabled, each split could have one or more stripes. If predicate pushdown is disabled, adjacent stripes are grouped together up to the HDFS block size to form a single split. Let's say we have 3 stripes: with predicate pushdown enabled, if the 2nd stripe does not satisfy the predicate, then the 1st and 3rd stripes will become 2 separate splits and the 2nd stripe will be ignored. If predicate pushdown is disabled, all 3 stripes will together form a single split, as they fit within the block boundary. The number of splits will also vary based on the input format and execution engine.

5- Is hive.exec.orc.split.strategy used for read?

Yes.
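To make answer 4 concrete, here is a small, self-contained Java sketch of the grouping rule described above (this is an illustration of the rule, not ORC's actual split-generation code; method and class names are invented for the example). With pushdown disabled, adjacent stripes are merged into one split up to the block size; with pushdown enabled, a filtered-out stripe is dropped and breaks adjacency, so its neighbours land in separate splits:

```java
import java.util.ArrayList;
import java.util.List;

public class OrcSplitSketch {

    // Pushdown disabled: greedily merge adjacent stripes until adding the
    // next stripe would exceed the block size; each group becomes one split.
    static List<List<Integer>> splitsNoPushdown(long[] stripeSizes, long blockSize) {
        List<List<Integer>> splits = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentSize = 0;
        for (int i = 0; i < stripeSizes.length; i++) {
            if (!current.isEmpty() && currentSize + stripeSizes[i] > blockSize) {
                splits.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(i);
            currentSize += stripeSizes[i];
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    // Pushdown enabled: a stripe eliminated by the predicate is dropped and
    // also breaks adjacency, so surviving neighbours form separate splits.
    static List<List<Integer>> splitsWithPushdown(long[] stripeSizes, boolean[] survives, long blockSize) {
        List<List<Integer>> splits = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentSize = 0;
        for (int i = 0; i < stripeSizes.length; i++) {
            if (!survives[i]) {
                if (!current.isEmpty()) {
                    splits.add(current);
                    current = new ArrayList<>();
                    currentSize = 0;
                }
                continue;
            }
            if (!current.isEmpty() && currentSize + stripeSizes[i] > blockSize) {
                splits.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(i);
            currentSize += stripeSizes[i];
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long MB = 1024L * 1024L;
        long[] stripes = {64 * MB, 64 * MB, 64 * MB};  // three default-sized stripes
        long block = 256 * MB;                         // default orc.block.size

        // Pushdown disabled: 192 MB of stripes fit in one 256 MB block -> 1 split.
        System.out.println(splitsNoPushdown(stripes, block));

        // Pushdown enabled, 2nd stripe filtered out: stripes 1 and 3 are no
        // longer adjacent, so they become 2 separate splits.
        boolean[] survives = {true, false, true};
        System.out.println(splitsWithPushdown(stripes, survives, block));
    }
}
```

Running the example reproduces the 3-stripe scenario from the answer: one split when pushdown is disabled, two single-stripe splits when the middle stripe is eliminated.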
ORC split
Some doubts about ORC:

1- Is hive.exec.orc.default.buffer.size used for read or write?
2- Is orc.stripe.size compressed or uncompressed?
3- Must orc.stripe.size be a multiple of the HDFS block size?
4- When reading an ORC file, does the number of mappers depend on HDFS blocks or on the number of stripes?
5- Is hive.exec.orc.split.strategy used for read?
Re: Storm hive bolt
Hi Eugene.

Below you have my pom file.

Can you check it and fix it to use repositories in the proper way, please?

I've been working on this problem for over 2 weeks and I'm losing hope.

<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
    http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>StormSample</groupId>
  <artifactId>StormSample</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <!-- further version properties: 1.0.1, 0.3.0, 0.8.2.2.3.0.0-2557, 1.7.7, 4.11 -->
  </properties>

  <build>
    <sourceDirectory>src</sourceDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.3</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <addClasspath>true</addClasspath>
              <classpathPrefix>lib/</classpathPrefix>
              <mainClass>mk.StormSample</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>1.4</version>
        <configuration>
          <createDependencyReducedPom>true</createDependencyReducedPom>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <filters>
                <filter>
                  <artifact>*:*</artifact>
                  <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                    <exclude>defaults.yaml</exclude>
                  </excludes>
                </filter>
              </filters>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <dependencies>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-hive</artifactId>
      <version>1.0.2</version>
      <exclusions>
        <exclusion>
          <groupId>jline</groupId>
          <artifactId>jline</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-hbase</artifactId>
      <version>1.0.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-core</artifactId>
      <version>1.0.3</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>log4j-over-slf4j</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.10.0.0</version>
      <exclusions>
        <exclusion>
          <groupId>org.apache.zookeeper</groupId>
          <artifactId>zookeeper</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>log4j-over-slf4j</artifactId>
      <version>1.7.21</version>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-kafka</artifactId>
      <version>1.0.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-jdbc</artifactId>
      <version>1.0.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>2.6.0</version>
      <exclusions>
        <exclusion>
          <groupId>ch.qos.logback</groupId>
          <artifactId>logback-classic</artifactId>
        </exclusion>
        <exclusion>
          <groupId>javax.servlet</groupId>
          <artifactId>servlet-api</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>com.googlecode.json-simple</groupId>
      <artifactId>json-simple</artifactId>
      <version>1.1</version>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
  </dependencies>

  <repositories>
    <repository>
      <id>hortonworks</id>
      <url>http://repo.hortonworks.com/content/groups/public/org/apache/storm/storm-hive/1.0.1.2.0.1.0-12/</url>
    </repository>
  </repositories>
</project>

pozdrawiam
Marcin Kasiński
http://itzone.pl

On 30 March 2017 at 17:51, Eugene Koifman wrote:
> It may be because you are mixing artifacts from HDP/HDF and Apache when
> compiling the topology.
> Can you try using
> http://repo.hortonworks.com/content/groups/public/org/apache/storm/storm-hive/1.0.1.2.0.1.0-12/
> rather than
>
> <dependency>
>     <groupId>org.apache.storm</groupId>
>     <artifactId>storm-hive</artifactId>
>     <version>1.0.3</version>
> </dependency>
>
> Eugene
>
> On 3/29/17, 9:47 AM, "Marcin Kasiński" wrote:
>
> I've upgraded my environment.
> I have Hive on HDP 2.5 (environment 1) and Storm on HDF 2.1
> (environment 2).
>
> I have the same error:
>
> On Storm (HDF 2.1):
>
> Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable
> to acquire lock on {metaStoreUri='thrift://hdp1.local:9083',
> database='default', table='stock_prices', partitionVals=[Marcin] }
>         at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:575) ~[stormjar.jar:?]
>
> On the Hive metastore (HDP 2.5):
>
> 2017-03-29 11:56:29,926 ERROR [pool-5-thread-17]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
> java.lang.IllegalStateException: Unexpected DataOperationType: UNSET agentInfo=Unknown txnid:54
>         at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:938)
>         at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:814)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaSt
>
> pozdrawiam
> Marcin Kasiński
> http://itzone.pl
>
> On 27 March 2017 at 22:01, Marcin Kasiński wrote:
> > Hello.
> >
> > Thank you for your reply.
> >
> > I really do want to solve it.
> >
> > I'm sure I compiled the sources again with the new jars.
> >
> > I've changed the source from
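Putting the advice in this thread together: the repository <url> should point at the Hortonworks repository root, not at a specific storm-hive artifact directory as in the pom above, and the storm-hive version should be the HDP build that matches the cluster. A sketch of the two fragments, using the repository root implied by Eugene's URL and the HDP 2.5 version string Igor posted (verify both against your own cluster version):

```xml
<repositories>
  <repository>
    <id>hortonworks</id>
    <!-- repository root, not .../org/apache/storm/storm-hive/... -->
    <url>http://repo.hortonworks.com/content/groups/public/</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-hive</artifactId>
    <!-- HDP build matching the HDP 2.5.0.0 metastore, per Igor's message -->
    <version>1.0.1.2.5.0.0-1245</version>
  </dependency>
</dependencies>
```

This matters here because the metastore-side error "Unexpected DataOperationType: UNSET" is characteristic of a streaming client built against different Hive artifacts than the metastore it talks to, which is exactly the Apache-vs-HDP mixing Eugene describes.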