[ https://issues.apache.org/jira/browse/FLUME-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ling Jinjiang updated FLUME-3059:
---------------------------------
    Description:

I wanted to use the Hive sink to load data into Hive 2.1.0, so I configured it following the example in the Flume user guide. However, delivery failed with this error:

2017-02-23 18:49:09,079 ERROR org.apache.flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://127.0.0.1:9083', database='logsdb', table='weblogs', partitionVals=[asia, test, 17-02-23-18-40] }
        at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:268)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://127.0.0.1:9083', database='logsdb', table='weblogs', partitionVals=[asia, test, 17-02-23-18-40] }
        at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:99)
        at org.apache.flume.sink.hive.HiveSink.getOrCreateWriter(HiveSink.java:344)
        at org.apache.flume.sink.hive.HiveSink.drainOneBatch(HiveSink.java:296)
        at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:254)
        ... 3 more
Caused by: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://127.0.0.1:9083', database='logsdb', table='weblogs', partitionVals=[asia, test, 17-02-23-18-40] }
        at org.apache.flume.sink.hive.HiveWriter.newConnection(HiveWriter.java:380)
        at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:86)
        ... 6 more
Caused by: org.apache.hive.hcatalog.streaming.InvalidTable: Invalid table db:logsdb, table:weblogs: 'transactional' property is not set on Table
        at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.checkEndPoint(HiveEndPoint.java:340)
        at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:312)
        at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:278)
        at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:215)
        at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:192)
        at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:105)
        at org.apache.flume.sink.hive.HiveWriter$8.call(HiveWriter.java:376)
        at org.apache.flume.sink.hive.HiveWriter$8.call(HiveWriter.java:373)
        at org.apache.flume.sink.hive.HiveWriter$11.call(HiveWriter.java:425)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

The cause is explained at https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-TableProperties and in https://issues.apache.org/jira/browse/HIVE-11716: since HIVE-11716, the "transactional" table property must be set to "true" when the table is created, for example:

create table weblogs (id int, msg string)
    partitioned by (continent string, country string, time string)
    clustered by (id) into 5 buckets
    stored as orc
    TBLPROPERTIES ("transactional"="true");

In addition, Hive itself must be configured to support transactions. The Hive sink example in the Flume user guide mentions neither requirement, so I think the guide should document the Hive transaction configuration that the sink depends on.

> Hive sink failed in hive 2.1.0
> ------------------------------
>
>                 Key: FLUME-3059
>                 URL: https://issues.apache.org/jira/browse/FLUME-3059
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.7.0
>            Reporter: Ling Jinjiang
>            Priority: Minor
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
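For reference, the Hive-side transaction settings that the description alludes to are listed on the Hive Transactions wiki page linked above. A minimal hive-site.xml sketch is below; the property names come from that page, while the values (in particular the single compactor worker thread) are illustrative and should be tuned for a real deployment:

```xml
<!-- Minimal hive-site.xml fragment to enable Hive ACID/streaming ingest.
     Property names per the Hive Transactions wiki; values are illustrative. -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

With these set on the metastore/HiveServer2 side and the table created with "transactional"="true" (bucketed, stored as ORC), the streaming endpoint check that raised InvalidTable should pass.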