Hi All,
PutHiveStreaming is throwing many errors like the one below:
ERROR [Timer-Driven Process Thread-11] hive.log Got exception:
java.lang.NullPointerException null
java.lang.NullPointerException: null
The errors appear while PutHiveStreaming is running, at a rate ranging from
once every few seconds to several times a second.
Hive streaming does write data to the database, but it is extremely slow
(roughly 2 MB per 5 minutes per 6-core/24 GB NiFi node).
Tested on NiFi 1.4, 1.5 and current 1.6 snapshot.
The Hadoop/Hive cluster is a Hortonworks HDP 2.6.4 installation running in HA
mode without any security. NiFi runs as a 3-node cluster, also without
security, and was compiled with the -Phortonworks profile and the proper
library definitions.
The NiFi conf directory contains the current Hadoop configuration files:
core-site.xml, hbase-site.xml, hdfs-site.xml, hive-site.xml and yarn-site.xml.
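For reference, a sketch of the transaction-related hive-site.xml settings that
Hive streaming depends on. The property names are the standard Hive ones and
the metastore URIs are the ones visible in the logs below; the rest is only
illustrative, not a copy of my actual configuration:

```xml
<!-- hive-site.xml: settings Hive streaming depends on (illustrative sketch) -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083</value>
  </property>
  <!-- Hive streaming requires ACID transactions to be enabled -->
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <!-- compaction must run so delta files get merged -->
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
</configuration>
```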
HiveQL queries in NiFi (both SelectHiveQL and PutHiveQL) work properly and
fast.
The same error occurs with Hortonworks' original NiFi 1.5 build installed
automatically by Ambari from the HDF 3.1 pack.
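As far as I know, Hive streaming requires the target table to be
transactional, bucketed and stored as ORC. A hypothetical DDL sketch of such a
table, with the database/table/partition names taken from the logs below and
the bucket count matching the three bucket files visible there (the columns
are invented for illustration):

```sql
-- Hypothetical DDL meeting the Hive streaming requirements:
-- ORC storage, bucketed, and marked transactional.
CREATE TABLE monitors.hivetable_oneminute (
  monitor_id BIGINT,
  value      DOUBLE
)
PARTITIONED BY (dzien INT)
CLUSTERED BY (monitor_id) INTO 3 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');
```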
nifi-app.log:
2018-02-15 17:42:29,889 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.hooks.ATSHook Created ATS Hook
2018-02-15 17:42:29,889 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG
method=PostHook.org.apache.hadoop.hive.ql.hooks.ATSHook
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG
method=PostHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1518712949889
end=1518712949890 duration=1 from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.Driver Resetting the caller context to
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=Driver.execute
start=1518712949845 end=1518712949890 duration=45
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.Driver OK
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=releaseLocks
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.lockmgr.DbTxnManager Stopped heartbeat for query:
nifi_20180215174229_58046dbe-0af7-41ae-94c4-7bb692053d67
2018-02-15 17:42:29,890 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.lockmgr.DbLockManager releaseLocks: [lockid:5010336
queryId=nifi_20180215174229_58046dbe-0af7-41ae-94c4-7bb692053d67 txnid:0]
2018-02-15 17:42:29,894 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=releaseLocks
start=1518712949890 end=1518712949894 duration=4
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,894 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger </PERFLOG method=Driver.run
start=1518712949819 end=1518712949894 duration=75
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:29,898 WARN [Timer-Driven Process Thread-11] hive.metastore
Unexpected increment of user count beyond one: 2 HCatClient: thread: 132
users=2 expired=false closed=false
2018-02-15 17:42:29,901 ERROR [Timer-Driven Process Thread-11] hive.log Got
exception: java.lang.NullPointerException null
java.lang.NullPointerException: null
at
org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.getFilteredObjects(AuthorizationMetaStoreFilterHook.java:77)
at
org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.filterDatabases(AuthorizationMetaStoreFilterHook.java:54)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:1116)
at
org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:469)
at sun.reflect.GeneratedMethodAccessor111.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:174)
at com.sun.proxy.$Proxy341.isOpen(Unknown Source)
at
org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:269)
at
org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558)
at
org.apache.hive.hcatalog.streaming.AbstractRecordWriter.<init>(AbstractRecordWriter.java:94)
at
org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:82)
at
org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:60)
at
org.apache.nifi.util.hive.HiveWriter.getRecordWriter(HiveWriter.java:85)
at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:72)
at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
at
org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:1036)
at
org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:947)
at
org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$8(PutHiveStreaming.java:743)
at
org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
at
org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$12(PutHiveStreaming.java:740)
at
org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
at
org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2145)
at
org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:694)
at
org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:572)
at
org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
at
org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
at
org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:572)
at
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1122)
at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
at
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2018-02-15 17:42:29,901 ERROR [Timer-Driven Process Thread-11] hive.log
Converting exception to MetaException
2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore
Evicted client has non-zero user count: 2
2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore
Non-zero user count preventing client tear down: users=2 expired=true
2018-02-15 17:42:29,902 WARN [Timer-Driven Process Thread-11] hive.metastore
Non-zero user count preventing client tear down: users=1 expired=true
2018-02-15 17:42:29,902 INFO [Timer-Driven Process Thread-11] hive.metastore
Trying to connect to metastore with URI thrift://dc1-hadoop-m2.domain.local:9083
2018-02-15 17:42:29,903 INFO [Timer-Driven Process Thread-11] hive.metastore
Connected to metastore.
2018-02-15 17:42:30,270 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen
compression buffer size: 32768
2018-02-15 17:42:30,270 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path:
hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00000
with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize:
32768
2018-02-15 17:42:30,274 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen
compression buffer size: 32768
2018-02-15 17:42:30,274 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path:
hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00001
with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize:
32768
2018-02-15 17:42:30,278 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.io.orc.WriterImpl WIDE TABLE - Number of columns: 16 Chosen
compression buffer size: 32768
2018-02-15 17:42:30,278 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.io.orc.WriterImpl ORC writer created for path:
hdfs://hdfscluster/apps/hive/warehouse/monitors.db/hivetable_oneminute/dzien=1497913200/delta_216052674_216052773/bucket_00002
with stripeSize: 8388608 blockSize: 268435456 compression: ZLIB bufferSize:
32768
2018-02-15 17:42:30,773 INFO [Timer-Driven Process Thread-11]
org.apache.nifi.util.hive.HiveWriter Closing connection to end point :
{metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083',
database='monitors', table='hivetable_oneminute', partitionVals=[1497999600] }
2018-02-15 17:42:30,773 WARN [put-hive-streaming-0] hive.metastore Non-zero
user count preventing client tear down: users=3 expired=false
2018-02-15 17:42:30,773 WARN [put-hive-streaming-0] hive.metastore Non-zero
user count preventing client tear down: users=2 expired=false
2018-02-15 17:42:30,773 INFO [Timer-Driven Process Thread-11]
org.apache.nifi.util.hive.HiveUtils cached writers { endPoint =
{metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083',
database='monitors', table='hivetable_oneminute', partitionVals=[1497913200]
}, TransactionBatch = TxnIds=[216052674...216052773] on endPoint =
{metaStoreUri='thrift://dc1-hadoop-m2.domain.local:9083,thrift://dc1-hadoop-m3.domain.local:9083',
database='monitors', table='hivetable_oneminute', partitionVals=[1497913200] }
}
2018-02-15 17:42:30,774 WARN [put-hive-streaming-0] hive.metastore Unexpected
increment of user count beyond one: 3 HCatClient: thread: 215 users=3
expired=false closed=false
2018-02-15 17:42:30,777 WARN [put-hive-streaming-0] hive.metastore Unexpected
increment of user count beyond one: 4 HCatClient: thread: 215 users=4
expired=false closed=false
2018-02-15 17:42:30,820 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.session.SessionState Created local directory:
/tmp/de079f5c-6ff6-4985-86b4-aa335d1d3746_resources
2018-02-15 17:42:30,824 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.session.SessionState Created HDFS directory:
/tmp/hive/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746
2018-02-15 17:42:30,844 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.session.SessionState Created local directory:
/tmp/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746
2018-02-15 17:42:30,848 INFO [put-hive-streaming-0]
o.a.hadoop.hive.ql.session.SessionState Created HDFS directory:
/tmp/hive/nifi/de079f5c-6ff6-4985-86b4-aa335d1d3746/_tmp_space.db
2018-02-15 17:42:30,865 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=Driver.run
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:30,865 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=compile
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.Driver We are setting the hadoop caller context from
to nifi_20180215174230_892f37ef-f87f-4058-bfaf-a2700ef12f07
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0]
org.apache.hadoop.hive.ql.log.PerfLogger <PERFLOG method=parse
from=org.apache.hadoop.hive.ql.Driver>
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] hive.ql.parse.ParseDriver
Parsing command: use monitors
2018-02-15 17:42:30,866 INFO [put-hive-streaming-0] hive.ql.parse.ParseDriver
Parse Completed
I'm not sure if it matters, but when NiFi is run in debug mode there is plenty
of DEBUG output like this:
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11]
org.apache.nifi.engine.FlowEngine A flow controller execution task
'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@18b55297'
has been cancelled.
2018-02-15 12:20:43,187 DEBUG [Timer-Driven Process Thread-11]
org.apache.nifi.engine.FlowEngine A flow controller execution task
'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@60201e05'
has been cancelled.
[... the same message repeats many more times, differing only in the task identifier ...]
A similar problem was reported about half a year ago on the Hortonworks
community forum for an older HDP/HDF version:
https://community.hortonworks.com/questions/132752/hdf-hivestreaming-nullpointerexceptiontoo-many-cli.html
but no solution was posted there.
I would appreciate any help or suggestions on how to get rid of this error and
make PutHiveStreaming work properly.
Regards,
Mike
________________________________
Note: The information contained in this message may be privileged and
confidential and protected from disclosure. If the reader of this message is
not the intended recipient, or an employee or agent responsible for delivering
this message to the intended recipient, you are hereby notified that any
dissemination, distribution or copying of this communication is strictly
prohibited. If you have received this communication in error, please notify the
sender immediately by replying to the message and deleting it from your
computer. Thank you. ________________________________