I realize this is a bit late, but I ran across your emails when I encountered the same issue on my installation of NiFi. I don't know what the "too many users" message is about, but I do know why NiFi gets a NullPointerException.

The PutHiveStreaming processor uses the HiveEndPoint class to communicate with Hive, and when you tell it to auto-create partitions it creates a new instance of CliSessionState, an extension of SessionState. This can be seen on line 458 of https://github.com/apache/hive/blob/master/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java. Unfortunately, CliSessionState doesn't set the IP address. If you look at the AuthorizationMetaStoreFilterHook line mentioned in the stack trace, you'll see it calls getUserIpAddress on a SessionState instance. Since the partition is actually created, the error must be occurring after that point. I'm assuming this is a bug, but none of this code has changed in a really long time. I'm going to try building the Hive Streaming example outside of NiFi and see whether it works at all; the P.S. below sketches both the failure path and that test.

Thanks,
Shawn
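P.S. To make the failure path concrete, here is a minimal sketch of what I think is happening. The class and method names are the real ones from the stack trace, but the wiring is illustrative (it assumes a default HiveConf and collapses several layers), not an exact reproduction of HiveEndPoint's internals:

    import org.apache.hadoop.hive.cli.CliSessionState;
    import org.apache.hadoop.hive.conf.HiveConf;
    import org.apache.hadoop.hive.ql.session.SessionState;

    public class SessionIpSketch {
        public static void main(String[] args) {
            HiveConf conf = new HiveConf();

            // HiveEndPoint.createPartitionIfNotExists() starts a fresh
            // CliSessionState when auto-create partitions is enabled.
            // CliSessionState never calls setUserIpAddress(...), so the
            // session's user IP address stays null.
            SessionState.start(new CliSessionState(conf));

            // AuthorizationMetaStoreFilterHook.getFilteredObjects() reads
            // this value off the thread-local session while building its
            // authorization context; a null here (or no session at all on
            // the calling thread) is what surfaces as the
            // java.lang.NullPointerException in your log.
            String ip = SessionState.get().getUserIpAddress();
            System.out.println("user IP = " + ip); // prints: user IP = null
        }
    }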
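And here is roughly where I'll start for the standalone test: the stock hcatalog-streaming usage, outside of NiFi. The metastore URI, database, table, and partition value below are copied from your logs as placeholders (substitute your own), and the JSON record is made up; newConnection(true) is the auto-create-partitions path that constructs the CliSessionState discussed above:

    import java.util.Arrays;
    import org.apache.hive.hcatalog.streaming.HiveEndPoint;
    import org.apache.hive.hcatalog.streaming.StreamingConnection;
    import org.apache.hive.hcatalog.streaming.StrictJsonWriter;
    import org.apache.hive.hcatalog.streaming.TransactionBatch;

    public class StreamingSmokeTest {
        public static void main(String[] args) throws Exception {
            // Endpoint values taken from the log output above.
            HiveEndPoint endPoint = new HiveEndPoint(
                    "thrift://dc1-hadoop-m2.domain.local:9083",
                    "monitors", "hivetable_oneminute",
                    Arrays.asList("1497913200"));

            // true => auto-create partitions, the code path that creates
            // the CliSessionState described above.
            StreamingConnection conn = endPoint.newConnection(true);
            StrictJsonWriter writer = new StrictJsonWriter(endPoint);

            TransactionBatch txnBatch = conn.fetchTransactionBatch(10, writer);
            try {
                txnBatch.beginNextTransaction();
                // Field names must match the target table's columns;
                // this record is only an example.
                txnBatch.write("{\"col1\": \"value\"}".getBytes());
                txnBatch.commit();
            } finally {
                txnBatch.close();
                conn.close();
            }
        }
    }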
On 2018/02/15 17:44:10, Michal Tomaszewski <[email protected]> wrote:
> Hi All,
>
> PutHiveStreaming throws lots of errors like the one below:
>
> ERROR [Timer-Driven Process Thread-11] hive.log Got exception: java.lang.NullPointerException null
> java.lang.NullPointerException: null
>
> The errors appear while PutHiveStreaming is working, anywhere from once every few seconds to a few times a second.
>
> Hive streaming does write data to the database, but extremely slowly (roughly 2 MB per 5 minutes per 6-core/24 GB NiFi node).
> Tested on NiFi 1.4, 1.5 and the current 1.6 snapshot.
>
> The Hadoop/Hive cluster is a Hortonworks HDP 2.6.4 installation in HA mode without any security.
> NiFi runs as a 3-server cluster without security, compiled with -Phortonworks and the proper library definitions.
> The nifi conf directory contains the actual Hadoop configuration files: core-site.xml, hbase-site.xml, hdfs-site.xml, hive-site.xml, yarn-site.xml.
> HiveQL queries in NiFi (both SelectHiveQL and PutHiveQL) work properly and fast.
>
> The same error occurs with the original Hortonworks NiFi 1.5 build installed automatically by Ambari from the HDF 3.1 pack.
>
> nifi-app.log:
>
> 2018-02-15 17:42:29,889 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.hooks.ATSHook Created ATS Hook
> 2018-02-15 17:42:29,889 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=PostHook.org.apache.hadoop.hive.ql.hooks.ATSHook from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=PostHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1518712949889 end=1518712949890 duration=1 from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.Driver Resetting the caller context to
> 2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=Driver.execute start=1518712949845 end=1518712949890 duration=45 from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.Driver OK
> 2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.lockmgr.DbTxnManager Stopped heartbeat for query: nifi_20180215174229_58046dbe-0af7-41ae-94c4-7bb692053d67
> 2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.lockmgr.DbLockManager releaseLocks: [lockid:5010336 queryId=nifi_20180215174229_58046dbe-0af7-41ae-94c4-7bb692053d67 txnid:0]
> 2018-02-15 17:42:29,894 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=releaseLocks start=1518712949890 end=1518712949894 duration=4 from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:29,894 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=Driver.run start=1518712949819 end=1518712949894 duration=75 from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:29,898 WARN [Timer-Driven Process Thread-11] hive.metastore Unexpected increment of user count beyond one: 2 HCatClient: thread: 132 users=2 expired=false closed=false
> 2018-02-15 17:42:29,901 ERROR [Timer-Driven Process Thread-11] hive.log Got exception: java.lang.NullPointerException null
> java.lang.NullPointerException: null
>     at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.getFilteredObjects(AuthorizationMetaStoreFilterHook.java:77)
>     at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.filterDatabases(AuthorizationMetaStoreFilterHook.java:54)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:1116)
>     at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:469)
>     at sun.reflect.GeneratedMethodAccessor111.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:174)
>     at com.sun.proxy.$Proxy341.isOpen(Unknown Source)
>     at org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:269)
>     at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558)
>     at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.<init>(AbstractRecordWriter.java:94)
>     at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:82)
>     at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:60)
>     at org.apache.nifi.util.hive.HiveWriter.getRecordWriter(HiveWriter.java:85)
>     at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:72)
>     at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
>     at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:1036)
>     at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:947)
>     at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$8(PutHiveStreaming.java:743)
>     at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
>     at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$12(PutHiveStreaming.java:740)
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2145)
>     at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:694)
>     at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:572)
>     at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
>     at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
>     at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:572)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1122)
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2018-02-15 17:42:29,901 ERROR [Timer-Driven Process Thread-11] hive.log Converting exception to MetaException
> 2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore Evicted client has non-zero user count: 2
> 2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore Non-zero user count preventing client tear down: users=2 expired=true
> 2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore Non-zero user count preventing client tear down: users=1 expired=true
> 2018-02-15 17:42:29,902 INFO [Timer-Driven Process Thread-11] hive.metastore Trying to connect to metastore with URI thrift://dc1-hadoop-m2.domain.local:9083
> 2018-02-15 17:42:29,903 INFO [Timer-Driven Process Thread-11] hive.metastore Connected to metastore.
> 2018-02-15 17:42:30,270 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen compression buffer size: 32768
> 2018-02-15 17:42:30,270 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path: hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00000 with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize: 32768
> 2018-02-15 17:42:30,274 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen compression buffer size: 32768
> 2018-02-15 17:42:30,274 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path: hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00001 with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize: 32768
> 2018-02-15 17:42:30,278 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen compression buffer size: 32768
> 2018-02-15 17:42:30,278 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path: hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00002 with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize: 32768
> 2018-02-15 17:42:30,773 INFO [Timer-Driven Process Thread-11] org.apache.nifi.util.hive.HiveWriter Closing connection to end point : {metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083', database='monitors', table='hivetable_oneminute', partitionVals=[1497999600] }
> 2018-02-15 17:42:30,773 WARN [put-hive-streaming-0] hive.metastore Non-zero user count preventing client tear down: users=3 expired=false
> 2018-02-15 17:42:30,773 WARN [put-hive-streaming-0] hive.metastore Non-zero user count preventing client tear down: users=2 expired=false
> 2018-02-15 17:42:30,773 INFO [Timer-Driven Process Thread-11] org.apache.nifi.util.hive.HiveUtils cached writers { endPoint = {metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083', database='monitors', table='hivetable_oneminute', partitionVals=[1497913200] }, TransactionBatch = TxnIds=[216052674...216052773] on endPoint = {metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083', database='monitors', table='hivetable_oneminute', partitionVals=[1497913200] } }
> 2018-02-15 17:42:30,774 WARN [put-hive-streaming-0] hive.metastore Unexpected increment of user count beyond one: 3 HCatClient: thread: 215 users=3 expired=false closed=false
> 2018-02-15 17:42:30,777 WARN [put-hive-streaming-0] hive.metastore Unexpected increment of user count beyond one: 4 HCatClient: thread: 215 users=4 expired=false closed=false
> 2018-02-15 17:42:30,820 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created local directory: /tmp/de079f5c-6ff6-4985-86b4-aa335d1d3746_resources
> 2018-02-15 17:42:30,824 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created HDFS directory: /tmp/hive/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746
> 2018-02-15 17:42:30,844 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created local directory: /tmp/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746
> 2018-02-15 17:42:30,848 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created HDFS directory: /tmp/hive/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746/_tmp_space.db
> 2018-02-15 17:42:30,865 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:30,865 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.Driver We are setting the hadoop caller context from  to nifi_20180215174230_892f37ef-f87f-4058-bfaf-a2700ef12f07
> 2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
> 2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] hive.ql.parse.ParseDriver Parsing command: use monitors
> 2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] hive.ql.parse.ParseDriver Parse Completed
>
> I'm not sure if it matters, but when NiFi is run in debug mode there is plenty of DEBUG information like this:
>
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@18b55297' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@60201e05' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@b23f4aa' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@41e9a01b' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@21b39331' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@14931248' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@4d01ae2e' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@2955b240' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@543a2e0a' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@28be35e1' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@35d14e25' has been cancelled.
> 2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@488ec6c5' has been cancelled.
> 2018-02-15 1 [message truncated...]
