Thanks Ryan. I see the fix version is set to 0.5.0, so yes, let's definitely try it.

Thanks
Joe
On Wed, Dec 23, 2015 at 2:09 PM, Ryan Blue <b...@cloudera.com> wrote:
> I took some time to look into this again and just opened PR #147 [1] with a
> fix that pulls in the dependencies needed to talk with the metastore and
> Kite's Hive module.
>
> Maybe we can get that in for the 0.5.0 release?
>
> rb
>
> [1]: https://github.com/apache/nifi/pull/147
>
> On 12/22/2015 03:27 PM, Ryan Blue wrote:
>> Thanks for pointing this out, Joe.
>>
>> The Kite dataset processor doesn't currently support writing to Hive.
>> It's not that it can't; it's just that the set of dependencies required
>> to write directly to Hive is annoyingly large, so they aren't included
>> in the NiFi bundle. It works on the command line because Kite knows
>> where to find the Hive jars.
>>
>> If you add kite-data-hive and its dependencies to the classpath,
>> then Kite works fine writing to Hive. The annoying problem is getting
>> all of those dependencies together without package-management help (like
>> Maven). If you want to try, I'll paste the list of jars I used to get it
>> working in 0.2.1 below. I added those jars to
>>
>> $NIFI_ROOT/work/nar/extensions/nifi-kite-nar-0.2.1.nar-unpacked/META-INF/bundled-dependencies/
>>
>> Adding jars into NiFi's working folders probably isn't a good idea. If
>> there is a recommended way to make optional jars available to NAR
>> bundles, that would be nice.
>>
>> The real solution is to find a way to let Kite talk to the Hive
>> metastore with minimal dependencies. I attempted a fix a little
>> while ago, but didn't do a great job on it and need to push it to
>> completion.
>> That's here:
>>
>> https://github.com/apache/nifi/pull/128
>>
>> Hopefully that helps,
>>
>> rb
>>
>> Jars for Kite/NiFi/Hive:
>>
>> commons-codec-1.4.jar
>> commons-compress-1.4.1.jar
>> commons-io-2.1.jar
>> commons-lang-2.5.jar
>> commons-logging-1.1.1.jar
>> datanucleus-api-jdo-3.2.6.jar
>> datanucleus-core-3.2.10.jar
>> datanucleus-rdbms-3.2.9.jar
>> hadoop-mapreduce-client-common-2.6.0-cdh5.4.2.jar
>> hadoop-mapreduce-client-core-2.6.0-cdh5.4.2.jar
>> hive-exec-1.1.0-cdh5.4.2.jar
>> hive-metastore-1.1.0-cdh5.4.2.jar
>> hive-shims-1.1.0-cdh5.4.2.jar
>> jdo-api-3.0.1.jar
>> jta-1.1.jar
>> kite-data-hive-1.1.0.jar
>> kite-hadoop-compatibility-1.1.0.jar
>> libfb303-0.9.2.jar
>> libthrift-0.9.2.jar
>>
>> On 12/22/2015 02:41 PM, Joe Witt wrote:
>>> Ryan Blue - any chance you are available to weigh in on this thread?
>>>
>>> Thanks
>>> Joe
>>>
>>> On Tue, Dec 22, 2015 at 8:13 AM, Joe Percivall <joeperciv...@yahoo.com> wrote:
>>>> Are you saying the Hive dataset is not yet supported in Kite?
>>>>
>>>> Joe
>>>> - - - - - -
>>>> Joseph Percivall
>>>> linkedin.com/in/Percivall
>>>> e: joeperciv...@yahoo.com
>>>>
>>>> On Tuesday, December 22, 2015 7:41 AM, panfei <cnwe...@gmail.com> wrote:
>>>>
>>>> The Hive dataset is not supported yet.
>>>>
>>>> On Tuesday, December 22, 2015, Chandu Koripella <ckori...@starbucks.com> wrote:
>>>>> Hi Joe,
>>>>>
>>>>> I am still facing this issue; any help is appreciated.
>>>>> I am not sure why dataset:hive:namespace/dataset_name doesn't get validated
>>>>> in NiFi. The same URI works when I use the kite-dataset CLI.
>>>>>
>>>>> Thanks,
>>>>> Chandra
>>>>>
>>>>> From: Joe Percivall [mailto:joeperciv...@yahoo.com]
>>>>> Sent: Monday, December 21, 2015 6:38 AM
>>>>> To: users@nifi.apache.org
>>>>> Subject: Re: Nifi Kite-dataset URI issue
>>>>>
>>>>> Hello Chandra,
>>>>>
>>>>> Are you still running into issues?
>>>>> If so, can someone with more experience with Kite and Hadoop chime in?
>>>>>
>>>>> Joe
>>>>> - - - - - -
>>>>> Joseph Percivall
>>>>> linkedin.com/in/Percivall
>>>>> e: joeperciv...@yahoo.com
>>>>>
>>>>> On Wednesday, December 16, 2015 2:08 PM, Chandu Koripella <ckori...@starbucks.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am running into another issue, and I really appreciate your help. My
>>>>> apologies if I am asking a very basic question; I am a first-time user of
>>>>> NiFi, Kite datasets, and Hadoop.
>>>>>
>>>>> My NiFi service is running as the hdfs user.
>>>>>
>>>>> · I created a dataset as hdfs and was able to import just fine as hdfs:
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> · Permissions:
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> · The URI validates just fine in NiFi:
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> · Hadoop Configuration files:
>>>>> /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/hdfs-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/core-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-site.xml
>>>>>
>>>>> Any help is appreciated.
>>>>> Thanks,
>>>>> Chandra
>>>>>
>>>>> From: Chandu Koripella [mailto:ckori...@starbucks.com]
>>>>> Sent: Monday, December 14, 2015 11:49 AM
>>>>> To: users@nifi.apache.org
>>>>> Subject: RE: Nifi Kite-dataset URI issue
>>>>>
>>>>> You are right. It works just fine after changing the keyword to hdfs.
>>>>> kite-dataset accepts both hdfs and hive.
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> Thanks,
>>>>> Chandu
>>>>>
>>>>> From: Joe Witt [mailto:joe.w...@gmail.com]
>>>>> Sent: Monday, December 14, 2015 11:37 AM
>>>>> To: users@nifi.apache.org
>>>>> Subject: Re: Nifi Kite-dataset URI issue
>>>>>
>>>>> The property to focus on is 'Target Dataset URI'.
>>>>>
>>>>> I'm no expert on Kite, but a quick check of their docs makes me think the
>>>>> current URI being used really is not valid.
>>>>>
>>>>> Try dataset:hdfs:/default/ctest1
>>>>>
>>>>> On Mon, Dec 14, 2015 at 2:33 PM, Chandu Koripella <ckori...@starbucks.com> wrote:
>>>>>
>>>>> Hi Alan,
>>>>>
>>>>> Thanks for responding. The Hive dataset has 777 permissions, and I am not
>>>>> sure why NiFi is still unable to validate it. I can load the data just fine
>>>>> if I execute the kite-dataset command on the server. Can you please take a look?
>>>>> [inline screenshot]
>>>>>
>>>>> /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/hdfs-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/core-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-exec-log4j.properties,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-exec-log4j.properties,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/etc/hive/conf.dist/hive-log4j.properties
>>>>>
>>>>> Here are the last few lines from the NiFi logs:
>>>>>
>>>>> 2015-12-14 11:07:26,671 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 27 milliseconds
>>>>> 2015-12-14 11:09:16,643 INFO [Flow Service Tasks Thread-2] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
>>>>> 2015-12-14 11:09:26,672 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
>>>>> 2015-12-14 11:09:26,700 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 28 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
>>>>> 2015-12-14 11:09:26,700 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 28 milliseconds
>>>>> 2015-12-14 11:09:42,649 INFO [Flow Service Tasks Thread-2] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
>>>>> 2015-12-14 11:10:33,658 INFO [Flow Service Tasks Thread-2] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
>>>>> 2015-12-14 11:11:26,700 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
>>>>> 2015-12-14 11:11:26,740 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 39 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 24 millis), max Transaction ID -1
>>>>> 2015-12-14 11:11:26,740 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 39 milliseconds
>>>>> 2015-12-14 11:13:26,740 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
>>>>> 2015-12-14 11:13:26,769 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 28 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
>>>>> 2015-12-14 11:13:26,769 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 28 milliseconds
>>>>> 2015-12-14 11:15:21,314 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
>>>>> 2015-12-14 11:15:26,769 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
>>>>> 2015-12-14 11:15:26,796 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 27 milliseconds (Stop-the-world time = 11 milliseconds, Clear Edit Logs time = 12 millis), max Transaction ID -1
>>>>> 2015-12-14 11:15:26,796 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 27 milliseconds
>>>>> 2015-12-14 11:15:47,820 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
>>>>> 2015-12-14 11:16:28,328 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@66f2383d // Another save pending = false
>>>>> 2015-12-14 11:17:26,797 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
>>>>> 2015-12-14 11:17:26,824 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 27 milliseconds (Stop-the-world time = 10 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
>>>>> 2015-12-14 11:17:26,825 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 27 milliseconds
>>>>> 2015-12-14 11:19:26,825 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
>>>>> 2015-12-14 11:19:26,900 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 74 milliseconds (Stop-the-world time = 59 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
>>>>> 2015-12-14 11:19:26,900 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 75 milliseconds
>>>>> 2015-12-14 11:21:26,900 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
>>>>> 2015-12-14 11:21:26,928 INFO [pool-18-thread-1] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@75a9cc24 checkpointed with 0 Records and 0 Swap Files in 27 milliseconds (Stop-the-world time = 12 milliseconds, Clear Edit Logs time = 13 millis), max Transaction ID -1
>>>>> 2015-12-14 11:21:26,928 INFO [pool-18-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 record
>>>>>
>>>>> From: Alan Jackoway [mailto:al...@cloudera.com]
>>>>> Sent: Monday, December 14, 2015 11:05 AM
>>>>> To: users@nifi.apache.org
>>>>> Subject: Re: Nifi Kite-dataset URI issue
>>>>>
>>>>> Try adding the path to hive-site.xml.
>>>>>
>>>>> Did it give you any error message beyond just "cannot validate url"? Is
>>>>> there anything interesting in the nifi-app.log?
>>>>>
>>>>> On Mon, Dec 14, 2015 at 12:54 PM, Chandu Koripella <ckori...@starbucks.com> wrote:
>>>>>
>>>>> It would be a great help if someone could take a look at this.
>>>>> Thanks,
>>>>> Chandu
>>>>>
>>>>> From: Chandu Koripella
>>>>> Sent: Friday, December 11, 2015 11:44 AM
>>>>> To: 'users@nifi.apache.org'
>>>>> Subject: RE: Nifi Kite-dataset URI issue
>>>>>
>>>>> Hi,
>>>>>
>>>>> This is the final step in implementing my first NiFi job. I would really
>>>>> appreciate any quick tips to resolve this.
>>>>>
>>>>> Thanks,
>>>>> Chandu
>>>>>
>>>>> From: Chandu Koripella
>>>>> Sent: Friday, December 11, 2015 7:03 AM
>>>>> To: users@nifi.apache.org
>>>>> Subject: RE: Nifi Kite-dataset URI issue
>>>>>
>>>>> Hi Alan,
>>>>>
>>>>> Thanks for helping me with this issue. Here is the full value of the Hadoop
>>>>> Configuration Files property:
>>>>>
>>>>> /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/hdfs-site.xml,/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/etc/hadoop/core-site.xml
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> This works just fine with the PutHDFS processor:
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> From: Alan Jackoway [mailto:al...@cloudera.com]
>>>>> Sent: Friday, December 11, 2015 5:42 AM
>>>>> To: users@nifi.apache.org
>>>>> Subject: Re: Nifi Kite-dataset URI issue
>>>>>
>>>>> Can you give the full value of that Hadoop Configuration Files property?
>>>>>
>>>>> On Dec 10, 2015 11:23 PM, "Chandu Koripella" <ckori...@starbucks.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to load data into a Hive external table. NiFi is unable to
>>>>> validate the URI, but it works just fine when I use kite-dataset.
>>>>> Can you please advise if I am missing any parameters here?
>>>>> [inline screenshot]
>>>>>
>>>>> [inline screenshot]
>>>>>
>>>>> Thanks,
>>>>> Chandu
>>>>
>>>> --
>>>> If you don't learn, you don't know.
>
> --
> Ryan Blue
> Software Engineer
> Cloudera, Inc.
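For readers who want to try Ryan's jar workaround from earlier in the thread, the copy step can be sketched in shell. The NAR directory layout and jar names come from his message; the `NIFI_ROOT` and `JAR_SRC` locations are illustrative assumptions, and on a real node you would copy all nineteen jars from his list:

```shell
#!/bin/sh
# Sketch of Ryan's workaround: copy the extra Hive jars into the unpacked
# Kite NAR's bundled-dependencies directory so NiFi can load them.
# NIFI_ROOT and JAR_SRC are placeholders; point them at your real NiFi
# install and at wherever you gathered the jars from the list above.
NIFI_ROOT=${NIFI_ROOT:-/tmp/nifi-demo}
JAR_SRC=${JAR_SRC:-/tmp/hive-jars-demo}
NAR_DEPS="$NIFI_ROOT/work/nar/extensions/nifi-kite-nar-0.2.1.nar-unpacked/META-INF/bundled-dependencies"

# Demo scaffolding so this snippet runs standalone; on a real node both
# directories already exist and hold the real jars, so drop these lines.
mkdir -p "$NAR_DEPS" "$JAR_SRC"
touch "$JAR_SRC/kite-data-hive-1.1.0.jar" "$JAR_SRC/hive-metastore-1.1.0-cdh5.4.2.jar"

# The actual workaround: drop the jars into bundled-dependencies.
cp "$JAR_SRC"/*.jar "$NAR_DEPS"/
ls "$NAR_DEPS"
```

As Ryan notes, editing NiFi's work directory is fragile — the unpacked NAR is rebuilt from the packaged .nar file, so copied jars can vanish on upgrade; the durable fix is the dependency change in PR #147.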
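The two Kite dataset URI forms discussed above can be summarized with a rough shell sketch. The `kite_scheme_ok` helper is hypothetical, written for this note only — it checks nothing but the scheme prefix, whereas Kite's real validation also resolves the repository and dataset:

```shell
# URI forms from the thread:
#   dataset:hdfs:/default/ctest1          - HDFS-backed; validated fine in NiFi
#   dataset:hive:namespace/dataset_name   - Hive-backed; works from the
#       kite-dataset CLI but needs the extra Hive jars before the NiFi Kite
#       processor can validate it.

# Hypothetical helper, not part of Kite: accept only the two schemes above.
kite_scheme_ok() {
  case "$1" in
    dataset:hdfs:*|dataset:hive:*) return 0 ;;
    *) return 1 ;;
  esac
}

kite_scheme_ok "dataset:hdfs:/default/ctest1" && echo "hdfs URI: scheme ok"
kite_scheme_ok "dataset:hive:default/ctest1" && echo "hive URI: scheme ok"
```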