[
https://issues.apache.org/jira/browse/NIFI-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942751#comment-15942751
]
Ryan Persaud edited comment on NIFI-3625 at 3/27/17 2:28 PM:
-------------------------------------------------------------
I was experimenting with streaming some JSON data into a partitioned table in an
HDP 2.5 sandbox tonight, and I encountered an exception (below). I built from
master (552148e9e7d45be4d298ee48afd7471405a5bfad) and tested with the 'old'
PutHiveStreaming processor, and I got the same error. From what I can tell,
the error occurs whenever partition columns are specified in the
PutHiveStreaming processor.
On a hunch I reverted HiveUtils and HiveWriter to their versions from
8/4/2016 (3943d72e95ff7b18c32d12020d34f134f4e86125), hacked them up a bit
to work with the newer versions of PutHiveStreaming and TestPutHiveStreaming,
and was then able to stream into a table successfully.
Has anyone else encountered these issues since NIFI-3574 and NIFI-3530 were
resolved? Any thoughts on how to proceed?
Here are the PutHiveStreaming properties:
<properties>
  <entry>
    <key>hive-stream-metastore-uri</key>
    <value>thrift://sandbox.hortonworks.com:9083</value>
  </entry>
  <entry>
    <key>hive-config-resources</key>
    <value>/home/rpersaud/shared/hive-site.xml</value>
  </entry>
  <entry>
    <key>hive-stream-database-name</key>
    <value>default</value>
  </entry>
  <entry>
    <key>hive-stream-table-name</key>
    <value>test_err</value>
  </entry>
  <entry>
    <key>hive-stream-partition-cols</key>
    <value>src</value>
  </entry>
  <entry>
    <key>hive-stream-autocreate-partition</key>
    <value>true</value>
  </entry>
  <entry>
    <key>hive-stream-max-open-connections</key>
    <value>8</value>
  </entry>
  <entry>
    <key>hive-stream-heartbeat-interval</key>
    <value>60</value>
  </entry>
  <entry>
    <key>hive-stream-transactions-per-batch</key>
    <value>100</value>
  </entry>
  <entry>
    <key>hive-stream-records-per-transaction</key>
    <value>10000</value>
  </entry>
  <entry>
    <key>Kerberos Principal</key>
  </entry>
  <entry>
    <key>Kerberos Keytab</key>
  </entry>
</properties>
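To help narrow down whether this is specific to the NiFi code path, here is a minimal standalone sketch (not NiFi code; the partition value and record columns are placeholders, only the URI, database, table and partition column mirror the properties above) that drives the same Hive Streaming calls the stack trace shows HiveWriter making. If this also throws the NPE, the problem is likely on the Hive/configuration side rather than in the processor:

// Hypothetical repro sketch using the Hive Streaming API directly;
// values are illustrative only.
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.StrictJsonWriter;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class PartitionedStreamingRepro {
    public static void main(String[] args) throws Exception {
        // One value for the single partition column 'src' (placeholder value)
        HiveEndPoint endPoint = new HiveEndPoint("thrift://sandbox.hortonworks.com:9083",
                "default", "test_err", Collections.singletonList("nifi"));
        // true = auto-create the partition, matching hive-stream-autocreate-partition
        StreamingConnection conn = endPoint.newConnection(true);
        try {
            // Same writer the stack trace shows HiveWriter constructing
            StrictJsonWriter writer = new StrictJsonWriter(endPoint);
            TransactionBatch batch = conn.fetchTransactionBatch(100, writer);
            try {
                batch.beginNextTransaction();
                // Hypothetical non-partition columns; adjust to the real table schema
                batch.write("{\"id\": 1, \"msg\": \"test\"}".getBytes(StandardCharsets.UTF_8));
                batch.commit();
            } finally {
                batch.close();
            }
        } finally {
            conn.close();
        }
    }
}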
Here's the exception I got from the master build
(552148e9e7d45be4d298ee48afd7471405a5bfad). I see the same exception when I
build from my branch (322f36dd82507633a7d7e2c23122eb59530c8967), but the line
numbers in PutHiveStreaming differ since the code has changed:
2017-03-27 08:18:58,730 ERROR [Timer-Driven Process Thread-5] hive.log Got exception: java.lang.NullPointerException null
java.lang.NullPointerException: null
    at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.getFilteredObjects(AuthorizationMetaStoreFilterHook.java:77) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.filterDatabases(AuthorizationMetaStoreFilterHook.java:54) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:1046) ~[hive-metastore-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:367) [hive-hcatalog-core-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:155) [hive-metastore-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at com.sun.proxy.$Proxy127.isOpen(Unknown Source) [na:na]
    at org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:205) [hive-hcatalog-core-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558) [hive-hcatalog-core-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.<init>(AbstractRecordWriter.java:94) [hive-hcatalog-streaming-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:82) [hive-hcatalog-streaming-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:60) [hive-hcatalog-streaming-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
    at org.apache.nifi.util.hive.HiveWriter.getRecordWriter(HiveWriter.java:84) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:71) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:846) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:757) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:480) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2120) ~[na:na]
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2090) ~[na:na]
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:407) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) ~[na:na]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:144) ~[na:na]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) ~[na:na]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_121]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[na:1.8.0_121]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_121]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[na:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> Add JSON support to PutHiveStreaming
> ------------------------------------
>
> Key: NIFI-3625
> URL: https://issues.apache.org/jira/browse/NIFI-3625
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Affects Versions: 1.2.0
> Reporter: Ryan Persaud
> Fix For: 1.2.0
>
>
> As noted in a Hortonworks Community Connection post
> (https://community.hortonworks.com/questions/88424/nifi-puthivestreaming-requires-avro.html),
> PutHiveStreaming does not currently support JSON Flow File content. I've
> completed the code to allow JSON flow files to be streamed into hive, and I'm
> currently working on test cases and updated documentation. I should have a
> PR to submit this week.
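> For reference, at the Hive Streaming API level this roughly amounts to handing the
> flow file's JSON records straight to StrictJsonWriter instead of requiring an Avro
> conversion first. A minimal sketch (illustrative only, not the actual PR code;
> 'endPoint', 'connection' and 'flowFileRecords' are assumed to be set up as in the
> repro sketch earlier in this thread):
>
>     // Hypothetical: write each JSON record from the flow file as-is.
>     StrictJsonWriter writer = new StrictJsonWriter(endPoint);
>     TransactionBatch batch = connection.fetchTransactionBatch(100, writer);
>     batch.beginNextTransaction();
>     for (byte[] jsonRecord : flowFileRecords) {   // e.g. {"id": 1, "msg": "test"}
>         batch.write(jsonRecord);
>     }
>     batch.commit();
>     batch.close();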
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)