I took some time to look into this again and just opened PR #147 [1] with a fix that pulls in the dependencies needed to talk with the metastore and Kite's Hive module.

Maybe we can get that in for the 0.5.0 release?

rb


[1]: https://github.com/apache/nifi/pull/147

On 12/22/2015 03:27 PM, Ryan Blue wrote:
Thanks for pointing this out, Joe.

The Kite dataset processor doesn't currently support writing to Hive.
It's not that it can't, it is just that the set of dependencies required
to write directly to Hive is annoyingly large, so they aren't included
in the NiFi bundle. It works on the command-line because Kite knows
where to find the Hive jars.
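
To make that concrete, here's a rough sketch of the kind of Kite call the
processor makes under the hood (the table name "default/events" is made up
for the example). Loading a dataset:hive: URI goes through a Hive metastore
client, which is exactly what drags in all of those extra jars:

    // Sketch only: "default/events" is a hypothetical table name.
    import org.apache.avro.generic.GenericRecord;
    import org.kitesdk.data.Dataset;
    import org.kitesdk.data.DatasetWriter;
    import org.kitesdk.data.Datasets;

    public class HiveWriteSketch {
      public static void main(String[] args) {
        // Resolving the hive URI opens a metastore connection, so
        // hive-metastore, libthrift, and friends must be on the classpath.
        Dataset<GenericRecord> events =
            Datasets.load("dataset:hive:default/events", GenericRecord.class);

        DatasetWriter<GenericRecord> writer = events.newWriter();
        try {
          // writer.write(record);  // records would come from incoming flow files
        } finally {
          writer.close();
        }
      }
    }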

If you were to add kite-data-hive and its dependencies to the classpath,
then Kite works fine writing to Hive. The annoying problem is getting
all of those dependencies together without package management help (like
maven). If you want to try, I'll paste the list of jars I used to get it
working in 0.2.1 below. I added those jars to


$NIFI_ROOT/work/nar/extensions/nifi-kite-nar-0.2.1.nar-unpacked/META-INF/bundled-dependencies/


Adding jars into NiFi's working folders probably isn't a good idea. If
there is a recommended way to make optional jars available to NAR
bundles, that would be nice.

The real solution to this is to find out how to allow Kite to talk with
the Hive Metastore with minimal dependencies. I attempted a fix a little
bit ago, but didn't do a great job on it and need to push it to
completion. That's here:

   https://github.com/apache/nifi/pull/128

Hopefully that helps,

rb


Jars for Kite/NiFi/Hive:

commons-codec-1.4.jar
commons-compress-1.4.1.jar
commons-io-2.1.jar
commons-lang-2.5.jar
commons-logging-1.1.1.jar
datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.10.jar
datanucleus-rdbms-3.2.9.jar
hadoop-mapreduce-client-common-2.6.0-cdh5.4.2.jar
hadoop-mapreduce-client-core-2.6.0-cdh5.4.2.jar
hive-exec-1.1.0-cdh5.4.2.jar
hive-metastore-1.1.0-cdh5.4.2.jar
hive-shims-1.1.0-cdh5.4.2.jar
jdo-api-3.0.1.jar
jta-1.1.jar
kite-data-hive-1.1.0.jar
kite-hadoop-compatibility-1.1.0.jar
libfb303-0.9.2.jar
libthrift-0.9.2.jar

On 12/22/2015 02:41 PM, Joe Witt wrote:
Ryan Blue - any chance you are available to weigh in on this thread?

Thanks
Joe

On Tue, Dec 22, 2015 at 8:13 AM, Joe Percivall
<joeperciv...@yahoo.com> wrote:
Are you saying the Hive dataset is not yet supported in Kite?

Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com



On Tuesday, December 22, 2015 7:41 AM, panfei <cnwe...@gmail.com> wrote:


The Hive dataset is not supported yet.


On Tuesday, December 22, 2015, Chandu Koripella <ckori...@starbucks.com> wrote:
Hi Joe,



I am still facing this issue. Any help is appreciated.

I am not sure why dataset:hive:namespace/dataset_name does not validate in
NiFi. The same URI works when I use the kite-dataset CLI.



Thanks,

Chandra





From: Joe Percivall [mailto:joeperciv...@yahoo.com]
Sent: Monday, December 21, 2015 6:38 AM
To: users@nifi.apache.org
Subject: Re: Nifi Kite-dataset URI issue



Hello Chandra,



Are you still running into issues?



If so, can someone with more experience with Kite and Hadoop chime in?



Joe

- - - - - -

Joseph Percivall

linkedin.com/in/Percivall

e: joeperciv...@yahoo.com





On Wednesday, December 16, 2015 2:08 PM, Chandu Koripella
<ckori...@starbucks.com> wrote:



Hi,



I am running into another issue and would really appreciate your help. My
apologies if I am asking a very basic question; I am a first-time user of
NiFi, Kite datasets, and Hadoop.



My NiFi service is running as the HDFS user.



·        I created the dataset in HDFS and was able to import into it just fine.



[screenshot attachment]




·        Permissions:



[screenshot attachment]






·        The URI validates just fine in NiFi:

[screenshot attachment]




[screenshot attachment]




·        Hadoop Configuration files:
/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/hdfs-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/core-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-site.xml




Any help is appreciated.



Thanks,

Chandra



From: Chandu Koripella [mailto:ckori...@starbucks.com]
Sent: Monday, December 14, 2015 11:49 AM
To: users@nifi.apache.org
Subject: RE: Nifi Kite-dataset URI issue



You are right. It works just fine after changing the keyword to hdfs. The
kite-dataset CLI accepts both hdfs and hive.





[screenshot attachment]




Thanks,

Chandu





From: Joe Witt [mailto:joe.w...@gmail.com]
Sent: Monday, December 14, 2015 11:37 AM
To: users@nifi.apache.org
Subject: Re: Nifi Kite-dataset URI issue



The property to focus on is 'Target Dataset URI'.



I'm no expert on Kite, but a quick check of their docs makes me think the
current URI being used really is not valid.



Try dataset:hdfs:/default/ctest1
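
For reference (and with the caveat that I'm just reading the Kite URI docs
here), the two schemes look roughly like this, reusing the ctest1 name from
your setup:

    dataset:hdfs:/default/ctest1    (data stored directly under an HDFS path)
    dataset:hive:default/ctest1     (table registered in the Hive metastore)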





On Mon, Dec 14, 2015 at 2:33 PM, Chandu Koripella
<ckori...@starbucks.com>
wrote:

Hi Alan,



Thanks for responding. The Hive dataset has 777 permissions, and I am not
sure why NiFi is still unable to validate it. I can load the data just fine
if I execute the kite-dataset command on the server. Can you please take a
look?



[screenshot attachment]





/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/hdfs-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/core-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-exec-log4j.properties,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-exec-log4j.properties,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-log4j.properties




Here are the last few lines from the NiFi logs.



2015-12-14 11:07:26,671 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 27 milliseconds
2015-12-14 11:09:16,643 INFO [Flow Service Tasks Thread-2] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
2015-12-14 11:09:26,672 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2015-12-14 11:09:26,700 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 28 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
2015-12-14 11:09:26,700 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 28 milliseconds
2015-12-14 11:09:42,649 INFO [Flow Service Tasks Thread-2] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
2015-12-14 11:10:33,658 INFO [Flow Service Tasks Thread-2] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
2015-12-14 11:11:26,700 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2015-12-14 11:11:26,740 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 39 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 24 millis), max Transaction ID -1
2015-12-14 11:11:26,740 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 39 milliseconds
2015-12-14 11:13:26,740 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2015-12-14 11:13:26,769 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 28 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
2015-12-14 11:13:26,769 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 28 milliseconds
2015-12-14 11:15:21,314 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
2015-12-14 11:15:26,769 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2015-12-14 11:15:26,796 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 27 milliseconds (Stop-the-world time = 11 milliseconds, Clear Edit Logs time = 12 millis), max Transaction ID -1
2015-12-14 11:15:26,796 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 27 milliseconds
2015-12-14 11:15:47,820 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
2015-12-14 11:16:28,328 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
2015-12-14 11:17:26,797 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2015-12-14 11:17:26,824 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 27 milliseconds (Stop-the-world time = 10 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
2015-12-14 11:17:26,825 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 27 milliseconds
2015-12-14 11:19:26,825 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2015-12-14 11:19:26,900 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 74 milliseconds (Stop-the-world time = 59 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
2015-12-14 11:19:26,900 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 75 milliseconds
2015-12-14 11:21:26,900 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2015-12-14 11:21:26,928 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 27 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
2015-12-14 11:21:26,928 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 record









From: Alan Jackoway [mailto:al...@cloudera.com]
Sent: Monday, December 14, 2015 11:05 AM

To: users@nifi.apache.org
Subject: Re: Nifi Kite-dataset URI issue



Try adding the path to hive-site.xml.
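
Based on the paths you sent earlier, the Hadoop Configuration Files property
would then look something like the following (adjust the hive-site.xml path
to wherever it actually lives on your nodes):

/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/hdfs-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/core-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-site.xml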



Did it give you any error message beyond just "cannot validate url"? Is
there anything interesting in the nifi-app.log?



On Mon, Dec 14, 2015 at 12:54 PM, Chandu Koripella
<ckori...@starbucks.com> wrote:

It would be a great help if someone could take a look at this.



Thanks,

Chandu



From: Chandu Koripella
Sent: Friday, December 11, 2015 11:44 AM
To: 'users@nifi.apache.org'
Subject: RE: Nifi Kite-dataset URI issue



Hi,



This is the final step in implementing my first NiFi job. I would really
appreciate any quick tips to resolve this.



Thanks,

Chandu





From: Chandu Koripella
Sent: Friday, December 11, 2015 7:03 AM
To: users@nifi.apache.org
Subject: RE: Nifi Kite-dataset URI issue



Hi Alan,



Thanks for helping me with this issue. Here is the full value of the Hadoop
Configuration Files property.




/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/hdfs-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/core-site.xml




[screenshot attachment]




This works just fine with the PutHDFS processor.



[screenshot attachment]




From: Alan Jackoway [mailto:al...@cloudera.com]
Sent: Friday, December 11, 2015 5:42 AM
To: users@nifi.apache.org
Subject: Re: Nifi Kite-dataset URI issue



Can you give the full value of that Hadoop Configuration Files property?

On Dec 10, 2015 11:23 PM, "Chandu Koripella" <ckori...@starbucks.com>
wrote:

Hi,



I am trying to load data into a Hive external table. NiFi is unable to
validate the URI. It works just fine when I use the kite-dataset CLI.

Can you please advise if I am missing any parameters here?



[screenshot attachment]




[screenshot attachment]




Thanks,

Chandu







--
If you don't learn, you don't know.







--
Ryan Blue
Software Engineer
Cloudera, Inc.
