Hi all, PutHiveStreaming throws lots of errors like the one below:
ERROR [Timer-Driven Process Thread-11] hive.log Got exception: java.lang.NullPointerException null java.lang.NullPointerException: null

The errors appear while PutHiveStreaming is running, anywhere from once every few seconds to a few times a second. Hive streaming does write data to the database, but it is extremely slow (about 2 MB per 5 minutes on a 6-core/24 GB NiFi node). Tested on NiFi 1.4, 1.5 and the current 1.6 snapshot.

The Hadoop/Hive cluster is a Hortonworks HDP 2.6.4 installation in HA mode without any security. NiFi runs as a 3-node cluster without security, compiled with -Phortonworks and the proper library definitions. The NiFi conf directory contains the actual Hadoop configuration files: core-site.xml, hbase-site.xml, hdfs-site.xml, hive-site.xml, yarn-site.xml. HiveQL queries in NiFi (both SelectHiveQL and PutHiveQL) work properly and fast. The same error occurs with the original Hortonworks NiFi 1.5 build installed automatically by Ambari from the HDF 3.1 pack.

nifi-app.log:

2018-02-15 17:42:29,889 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.hooks.ATSHook Created ATS Hook
2018-02-15 17:42:29,889 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=PostHook.org.apache.hadoop.hive.ql.hooks.ATSHook from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=PostHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1518712949889 end=1518712949890 duration=1 from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.Driver Resetting the caller context to
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=Driver.execute start=1518712949845 end=1518712949890 duration=45 from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.Driver OK
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.lockmgr.DbTxnManager Stopped heartbeat for query: nifi_20180215174229_58046dbe-0af7-41ae-94c4-7bb692053d67
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.lockmgr.DbLockManager releaseLocks: [lockid:5010336 queryId=nifi_20180215174229_58046dbe-0af7-41ae-94c4-7bb692053d67 txnid:0]
2018-02-15 17:42:29,894 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=releaseLocks start=1518712949890 end=1518712949894 duration=4 from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,894 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=Driver.run start=1518712949819 end=1518712949894 duration=75 from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,898 WARN [Timer-Driven Process Thread-11] hive.metastore Unexpected increment of user count beyond one: 2 HCatClient: thread: 132 users=2 expired=false closed=false
2018-02-15 17:42:29,901 ERROR [Timer-Driven Process Thread-11] hive.log Got exception: java.lang.NullPointerException null java.lang.NullPointerException: null
    at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.getFilteredObjects(AuthorizationMetaStoreFilterHook.java:77)
    at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.filterDatabases(AuthorizationMetaStoreFilterHook.java:54)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:1116)
    at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:469)
    at sun.reflect.GeneratedMethodAccessor111.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:174)
    at com.sun.proxy.$Proxy341.isOpen(Unknown Source)
    at org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:269)
    at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558)
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.<init>(AbstractRecordWriter.java:94)
    at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:82)
    at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:60)
    at org.apache.nifi.util.hive.HiveWriter.getRecordWriter(HiveWriter.java:85)
    at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:72)
    at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
    at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:1036)
    at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:947)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$8(PutHiveStreaming.java:743)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$12(PutHiveStreaming.java:740)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2145)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:694)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:572)
    at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
    at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:572)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1122)
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2018-02-15 17:42:29,901 ERROR [Timer-Driven Process Thread-11] hive.log Converting exception to MetaException
2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore Evicted client has non-zero user count: 2
2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore Non-zero user count preventing client tear down: users=2 expired=true
2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore Non-zero user count preventing client tear down: users=1 expired=true
2018-02-15 17:42:29,902 INFO [Timer-Driven Process Thread-11] hive.metastore Trying to connect to metastore with URI thrift://dc1-hadoop-m2.domain.local:9083
2018-02-15 17:42:29,903 INFO [Timer-Driven Process Thread-11] hive.metastore Connected to metastore.
2018-02-15 17:42:30,270 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen compression buffer size: 32768
2018-02-15 17:42:30,270 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path: hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00000 with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize: 32768
2018-02-15 17:42:30,274 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen compression buffer size: 32768
2018-02-15 17:42:30,274 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path: hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00001 with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize: 32768
2018-02-15 17:42:30,278 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen compression buffer size: 32768
2018-02-15 17:42:30,278 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path: hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00002 with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize: 32768
2018-02-15 17:42:30,773 INFO [Timer-Driven Process Thread-11] org.apache.nifi.util.hive.HiveWriter Closing connection to end point : {metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083', database='monitors', table='hivetable_oneminute', partitionVals=[1497999600] }
2018-02-15 17:42:30,773 WARN [put-hive-streaming-0] hive.metastore Non-zero user count preventing client tear down: users=3 expired=false
2018-02-15 17:42:30,773 WARN [put-hive-streaming-0] hive.metastore Non-zero user count preventing client tear down: users=2 expired=false
2018-02-15 17:42:30,773 INFO [Timer-Driven Process Thread-11] org.apache.nifi.util.hive.HiveUtils cached writers { endPoint = {metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083', database='monitors', table='hivetable_oneminute', partitionVals=[1497913200] }, TransactionBatch = TxnIds=[216052674...216052773] on endPoint = {metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083', database='monitors', table='hivetable_oneminute', partitionVals=[1497913200] } }
2018-02-15 17:42:30,774 WARN [put-hive-streaming-0] hive.metastore Unexpected increment of user count beyond one: 3 HCatClient: thread: 215 users=3 expired=false closed=false
2018-02-15 17:42:30,777 WARN [put-hive-streaming-0] hive.metastore Unexpected increment of user count beyond one: 4 HCatClient: thread: 215 users=4 expired=false closed=false
2018-02-15 17:42:30,820 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created local directory: /tmp/de079f5c-6ff6-4985-86b4-aa335d1d3746_resources
2018-02-15 17:42:30,824 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created HDFS directory: /tmp/hive/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746
2018-02-15 17:42:30,844 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created local directory: /tmp/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746
2018-02-15 17:42:30,848 INFO [put-hive-streaming-0] o.a.hadoop.hive.ql.session.SessionState Created HDFS directory: /tmp/hive/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746/_tmp_space.db
2018-02-15 17:42:30,865 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:30,865 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.Driver We are setting the hadoop caller context from to nifi_20180215174230_892f37ef-f87f-4058-bfaf-a2700ef12f07
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] hive.ql.parse.ParseDriver Parsing command: use monitors
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] hive.ql.parse.ParseDriver Parse Completed

I'm not sure if it matters, but when NiFi is run in debug mode there is plenty of DEBUG information like this:

2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@18b55297' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@60201e05' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@b23f4aa' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@41e9a01b' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@21b39331' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@14931248' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@4d01ae2e' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@2955b240' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@543a2e0a' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@28be35e1' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@35d14e25' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@488ec6c5' has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11] org.apache.nifi.engine.FlowEngine A flow controller execution task 'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@6165d244' has been cancelled.
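As an aside, for anyone trying to reproduce this setup: the conf-directory requirement described at the top (the Hadoop/Hive client XMLs present in NiFi's conf directory) can be verified with a small sanity script. The NIFI_HOME default below is an assumption; adjust it to your installation.

```shell
# Sanity check for the setup described above: verify that the Hadoop/Hive
# client config files PutHiveStreaming needs are present in NiFi's conf
# directory. NIFI_HOME is a hypothetical default -- adjust to your install.
NIFI_CONF="${NIFI_HOME:-/opt/nifi}/conf"

# The config files listed in the problem description.
required="core-site.xml hbase-site.xml hdfs-site.xml hive-site.xml yarn-site.xml"

missing=""
for f in $required; do
  [ -f "$NIFI_CONF/$f" ] || missing="$missing $f"
done

if [ -n "$missing" ]; then
  echo "Missing from $NIFI_CONF:$missing"
else
  echo "All Hadoop/Hive client configs present in $NIFI_CONF"
fi
```

This only confirms the files exist; it does not validate their contents against the cluster.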
A similar problem was reported about half a year ago on the Hortonworks community forum, against an older HDP/HDF version: https://community.hortonworks.com/questions/132752/hdf-hivestreaming-nullpointerexceptiontoo-many-cli.html but there is no solution there. I would appreciate any help or suggestions on how to get rid of this error and make PutHiveStreaming work properly.

Regards,
Mike