Re: Determine which Nifi processor eating CPU usage

2015-10-26 Thread Oleg Zhurakousky
Unfortunately I can’t see how NiFi could tell you that, since your custom 
Processors run within the same JVM as NiFi itself.
Having said that, the 800% tells me that you probably have some processor with 
a custom thread pool where each thread is spinning in a loop, mostly missing 
the work it expects to perform.
For example:
while (true) {
    if (someCondition) {
        // do something
    }
    // no sleep or blocking wait here, so when 'someCondition' rarely
    // happens the loop busy-spins and pins a core at 100%
}
The above will definitely eat 100% of a core if ’someCondition’ never happens, 
and if you have something like this running in multiple threads on 8 cores, 
there is your 800%.
That could be your code or some library you are using.

There is also a slight chance that the code executed by multiple threads 
is actually doing something very CPU intensive.
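
For what it's worth, one way to narrow this down (a sketch, not a NiFi feature) 
is to dump per-thread CPU time via ThreadMXBean and cross-reference the busiest 
names with a jstack dump, since NiFi's scheduler threads carry recognizable 
names like 'Timer-Driven Process Thread-2':

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Prints per-thread CPU time for the JVM it runs in; attach the same idea to
// the NiFi JVM (e.g., over JMX or via an agent) to see which named thread is
// burning the cores.
public class HotThreads {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if disabled or thread died
            if (info != null && cpuNanos > 0) {
                System.out.printf("%,15d ns  %s%n", cpuNanos, info.getThreadName());
            }
        }
    }
}

The busiest named thread usually points straight at the spinning processor.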

Hope that helps

Oleg


On Oct 26, 2015, at 10:57 AM, Elli Schwarz wrote:

Hello,

We have a NiFi flow with many custom processors (and many process groups). We 
suspect that one or more processors are eating up CPU, so we're wondering 
if there's an easy way to tell which processor has a heavy load on the CPU. 
There are tables to see processors ordered by number of flow files or bytes 
in/out, etc., but not by CPU usage. In fact, I can't find a way to see a 
table of all processors that have active threads. All we know is that the 
top command shows nifi running at 800%, and we're doing trial and error by 
turning off processors until we hit the one that makes CPU utilization go down.

We did see an earlier post about polling processors eating up CPU cycles, 
but that doesn't seem to be the case here. Once in the past we had a custom 
processor with a bug that caused it to eat CPU cycles, but we discovered the 
issue not through NiFi but because we happened to be examining the code.

Thank you!

-Elli




Re: Replicate flow files to multiple processors

2015-11-10 Thread Oleg Zhurakousky
Multiple processors or multiple instances of the same processor?
Could you also elaborate on your use case a bit more? There may be several 
ways of accomplishing your goal, and understanding the underlying problem 
would help in picking the best one.

Thanks
Oleg


On Nov 10, 2015, at 14:39, Chakrader Dewaragatla wrote:

Hi - Do we have any built-in processor that replicates flow files to multiple 
processors in parallel (in memory, not staging on disk)?
I was looking at the DistributeLoad processor, which distributes load using 
weighted or round-robin techniques. I am looking for something that replicates 
the flow files.

Thanks,
-Chakri




Re: Replicate flow files to multiple processors

2015-11-11 Thread Oleg Zhurakousky
I am still a bit confused about the problem that is being solved here.
“Replication” implies some type of redundancy, allowing processing that failed 
“here” to be resumed “there”.

What I am reading here is more about “content-based routing” (route to their 
respective workflows based on their attributes).

Am I missing something?

Cheers
Oleg
On Nov 10, 2015, at 5:35 PM, Andrew Grande wrote:

As mentioned, simply keep connecting things together (e.g. multiple 'success' 
relationship links). For better organization, consider putting a Funnel in the 
flow and connecting to it instead of a processor.

Andrew

From: Chakrader Dewaragatla
Reply-To: "users@nifi.apache.org"
Date: Tuesday, November 10, 2015 at 3:01 PM
To: "users@nifi.apache.org"
Subject: RE: Replicate flow files to multiple processors

Thanks Mark. This should help.

Our use case is to route traffic (flow files) to multiple independent 
processors that in turn route to their respective workflows based on their 
attributes.


From: Mark Payne [marka...@hotmail.com]
Sent: Tuesday, November 10, 2015 11:45 AM
To: users@nifi.apache.org
Subject: Re: Replicate flow files to multiple processors

Chakri,

This can be done with any Processor. You can simply drag multiple connections 
that have the same Relationship.

For example, you can create a GetSFTP processor and draw a connection from 
GetSFTP to UpdateAttribute with the 'success' relationship.
and then also draw a connection from GetSFTP to PutHDFS with the 'success' 
relationship.

This will result in each FlowFile that is routed to 'success' going to both 
Processors.

NiFi does this without copying the data or anything, simply by creating a new 
FlowFile that points to the same content on disk, so
it is able to do this extremely efficiently.
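
As a toy illustration of that claim-based model (a sketch, NOT NiFi's real API; 
Java 16+ for records): a clone is a new record with its own identity and copied 
attributes that points at the same immutable content claim, so no bytes move.

import java.util.Map;

// Toy model of claim-based cloning: the "clone" references the same claim.
record ContentClaim(String container, long offset, long length) {}

record FlowFileRecord(String uuid, Map<String, String> attributes, ContentClaim claim) {
    FlowFileRecord cloneAs(String newUuid) {
        // new identity, copied attributes, same claim reference - zero data copy
        return new FlowFileRecord(newUuid, Map.copyOf(attributes), claim);
    }
}

public class CloneDemo {
    public static void main(String[] args) {
        ContentClaim claim = new ContentClaim("content-repo-partition-1", 0L, 1024L);
        FlowFileRecord original = new FlowFileRecord("uuid-a", Map.of("path", "/in"), claim);
        FlowFileRecord clone = original.cloneAs("uuid-b");
        System.out.println(original.claim() == clone.claim()); // true: shared content
    }
}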

Thanks
-Mark




On Nov 10, 2015, at 2:39 PM, Chakrader Dewaragatla wrote:

Hi - Do we have any built-in processor that replicates flow files to multiple 
processors in parallel (in memory, not staging on disk)?
I was looking at the DistributeLoad processor, which distributes load using 
weighted or round-robin techniques. I am looking for something that replicates 
the flow files.

Thanks,
-Chakri




Re: Provenance doesn't work with FetchS3Object

2015-10-15 Thread Oleg Zhurakousky
Ben

I don’t think it needs an incoming FlowFile. It is a scheduled component and 
will retrieve content based on how you configure its scheduling.
Have you tried it without incoming FlowFiles?

Cheers
Oleg

On Oct 15, 2015, at 3:38 PM, Ben Meng wrote:

I understand that the FetchS3Object processor requires an incoming FlowFile to 
trigger it. The problem is that FetchS3Object emits a RECEIVE provenance event 
for the existing FlowFile. That event causes the following error when I try to 
open the lineage chart for a simple flow: GenerateFlowFile -> FetchS3Object.

"Found cycle in graph. This indicates that multiple events were registered 
claiming to have generated the same FlowFile (UUID = 
40f58407-ea10-4843-b8d1-be0e24f685aa)"

Should FetchS3Object create a new FlowFile for each fetched object? If so, does 
it really require an incoming FlowFile?

Regards,
Ben
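
For what it's worth, one way to keep the incoming FlowFile and still avoid the 
cycle (a sketch, assuming a NiFi version that exposes ProvenanceReporter.fetch(...); 
this is not the actual FetchS3Object source, and the S3 URI is made up) is to 
overwrite the incoming FlowFile's content and report a FETCH event instead of 
RECEIVE:

import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

public class FetchSketch extends AbstractProcessor {
    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success").build();

    @Override
    public Set<Relationship> getRelationships() {
        return Set.of(REL_SUCCESS);
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }
        // overwrite the incoming FlowFile's content with the fetched bytes
        flowFile = session.write(flowFile, out -> out.write("fetched object".getBytes()));
        // FETCH = "content pulled from a remote system into an existing FlowFile",
        // so no second event claims to have generated the same FlowFile
        session.getProvenanceReporter().fetch(flowFile, "s3://bucket/key");
        session.transfer(flowFile, REL_SUCCESS);
    }
}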





Re: StoreInKiteDataset help

2015-10-16 Thread Oleg Zhurakousky
Chris

Could you elaborate on your use case a bit more? Specifically, where is the 
source of the data you want to pump into Hive (e.g., streaming, bulk file load, 
etc.)?

Cheers
Oleg

On Oct 16, 2015, at 8:56 AM, Christopher Wilson wrote:

Joe, it was an HDP issue. I didn't want to leap to NiFi if the examples didn't 
work. Thanks again.

Also, if there's a better way to pump data into Hive I'm all ears.

-Chris

On Fri, Oct 16, 2015 at 8:53 AM, Christopher Wilson wrote:
Joe, the first hurdle is to get ojdbc6.jar downloaded and installed in 
/usr/share/java. There's a link created in /usr/hdp/2.3.0.0-2557/hive/lib/ but 
it points to nothing.

Here's the hurdle I can't get past. If you install kite-dataset from the web 
site and run through the example with debug and verbose turned on (below), you 
get the output below. It thinks mapreduce.tar.gz doesn't exist, but it does 
(see way down below). I've run this as users root and hdfs with no joy. Thanks 
for looking.

debug=true ./kite-dataset -v csv-import sandwiches.csv sandwiches

WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.3.0.0-2557/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.3.0.0-2557/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
1 job failure(s) occurred:
org.kitesdk.tools.CopyTask: 
Kite(dataset:file:/tmp/0c1454eb-7831-4d6b-85a2-63a6cc8c51... ID=1 (1/1)(1): 
java.io.FileNotFoundException: File 
file:/hdp/apps/2.3.0.0-2557/mapreduce/mapreduce.tar.gz does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
at 
org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:110)
at 
org.apache.hadoop.fs.AbstractFileSystem.resolvePath(AbstractFileSystem.java:467)
at org.apache.hadoop.fs.FilterFs.resolvePath(FilterFs.java:157)
at org.apache.hadoop.fs.FileContext$25.next(FileContext.java:2193)
at org.apache.hadoop.fs.FileContext$25.next(FileContext.java:2189)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.resolve(FileContext.java:2189)
at org.apache.hadoop.fs.FileContext.resolvePath(FileContext.java:601)
at 
org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:457)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at 
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:329)
at 
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:204)
at 
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:238)
at 
org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:112)
at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:55)
at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:83)
at java.lang.Thread.run(Thread.java:745)

[hdfs@sandbox ~]$ hdfs dfs -ls /hdp/apps/2.3.0.0-2557/mapreduce
Found 2 items
-r--r--r--   1 hdfs hadoop 105893 2015-08-20 08:36 
/hdp/apps/2.3.0.0-2557/mapreduce/hadoop-streaming.jar
-r--r--r--   1 hdfs hadoop  207888607 2015-08-20 08:33 
/hdp/apps/2.3.0.0-2557/mapreduce/mapreduce.tar.gz


On Thu, Oct 15, 2015 at 3:22 PM, Joe Witt wrote:
Chris,

Are you seeing errors in NiFi or in HDP?  If you're seeing errors in
NiFi can you please send us the logs?

Thanks
Joe

On Thu, Oct 15, 2015 at 3:02 PM, Christopher Wilson wrote:
> Has anyone gotten Kite to work on HDP?  I'd wanted to do this very thing but
> am running into all kinds of issues with having .jar files not in the
> distributed cache (basically in /apps/hdp).
>
> Any feedback appreciated.
>
> -Chris
>
> On Sat, Sep 19, 2015 at 11:04 AM, Tyler Hawkes wrote: […]

Re: Custom classpath (using MapR-FS in lieu of HDFS)

2015-10-12 Thread Oleg Zhurakousky
Maybe I am missing certain details, but are you asking about a custom bundle 
(NAR), or about using existing NiFi bundles which would somehow have to delegate 
to some custom JARs?
I am thinking that if you are using a custom client, you may need a custom bundle.

Oleg

> On Oct 12, 2015, at 9:29 AM, Andre F de Miranda  wrote:
> 
> Hi there,
> 
> First post in here so please... you know the drill.
> 
> I was wondering: What is the recommended approach to add custom
> classes to NiFi on load time?
> 
> Reason I ask is simple: Our HDFS store is in fact a MapR-FS store and
> as such I need to use the proprietary MapR client and Hadoop JARs on
> the client software.
> 
> Usually this is achieved via add-on shims, rebuild (using maven
> artifacts) and in some rare cases, fully supported out of the box! :-)
> 
> I am finding it a bit difficult to set up the NiFi <-> MapR linkage
> without tainting the NiFi distribution folder structure, and was
> wondering what the recommended approach is.
> 
> I thank you in advance
> 



Re: NoSuchElementException

2015-12-01 Thread Oleg Zhurakousky
Douglas

Just looked at the code, and I would like to ask if you could try it with our 
latest snapshot. This was a bug that has been fixed.

Thanks
Oleg
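
For the curious, the exception comes from calling next() on Avro's 
DataFileStream without checking hasNext() first. The guard pattern (a sketch 
of the general fix, not the actual NiFi patch; "records.avro" is a made-up 
file name) looks like this:

import java.io.File;
import java.io.IOException;

import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

// DataFileStream/DataFileReader throw NoSuchElementException if next() is
// called with no records left, so always guard with hasNext().
public class AvroReadSketch {
    public static void main(String[] args) throws IOException {
        try (DataFileReader<GenericRecord> reader = new DataFileReader<>(
                new File("records.avro"), new GenericDatumReader<>())) {
            while (reader.hasNext()) {
                GenericRecord record = reader.next();
                System.out.println(record);
            }
        }
    }
}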

On Dec 1, 2015, at 11:23 AM, Douglas Doughty wrote:

Hi All,
I’ve enjoyed working with NiFi, and I am about to deploy it into production, 
but I keep running into one problem:

I get an error on a ConvertAvroToJSON processor that is in a success path after 
an ExecuteSQL processor. The messages stay queued between the two processors. 
When looking at the provenance for the ExecuteSQL processor, I do see hex data.

The error is:

2015-12-01 09:16:19,349 ERROR [Timer-Driven Process Thread-2] 
o.a.n.processors.avro.ConvertAvroToJSON 
ConvertAvroToJSON[id=dd7826cf-31d0-437a-9d69-54d91d7c0838] 
ConvertAvroToJSON[id=dd7826cf-31d0-437a-9d69-54d91d7c0838] failed to process 
due to java.util.NoSuchElementException; rolling back session: 
java.util.NoSuchElementException
2015-12-01 09:16:19,350 ERROR [Timer-Driven Process Thread-2] 
o.a.n.processors.avro.ConvertAvroToJSON 
ConvertAvroToJSON[id=dd7826cf-31d0-437a-9d69-54d91d7c0838] 
ConvertAvroToJSON[id=dd7826cf-31d0-437a-9d69-54d91d7c0838] failed to process 
session due to java.util.NoSuchElementException: 
java.util.NoSuchElementException
2015-12-01 09:16:19,350 WARN [Timer-Driven Process Thread-2] 
o.a.n.processors.avro.ConvertAvroToJSON 
ConvertAvroToJSON[id=dd7826cf-31d0-437a-9d69-54d91d7c0838] Processor 
Administratively Yielded for 1 sec due to processing failure
2015-12-01 09:16:19,350 WARN [Timer-Driven Process Thread-2] 
o.a.n.c.t.ContinuallyRunProcessorTask Administratively Yielding 
ConvertAvroToJSON[id=dd7826cf-31d0-437a-9d69-54d91d7c0838] due to uncaught 
Exception: java.util.NoSuchElementException
2015-12-01 09:16:19,353 WARN [Timer-Driven Process Thread-2] 
o.a.n.c.t.ContinuallyRunProcessorTask
java.util.NoSuchElementException: null
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:232) 
~[na:na]
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220) 
~[na:na]
at 
org.apache.nifi.processors.avro.ConvertAvroToJSON$1.process(ConvertAvroToJSON.java:89)
 ~[na:na]
at 
org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2155)
 ~[nifi-framework-core-0.3.0.jar:0.3.0]
at 
org.apache.nifi.processors.avro.ConvertAvroToJSON.onTrigger(ConvertAvroToJSON.java:81)
 ~[na:na]
at 
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
 ~[nifi-api-0.3.0.jar:0.3.0]
at 
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1077)
 ~[nifi-framework-core-0.3.0.jar:0.3.0]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:127)
 [nifi-framework-core-0.3.0.jar:0.3.0]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:49)
 [nifi-framework-core-0.3.0.jar:0.3.0]
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:119)
 [nifi-framework-core-0.3.0.jar:0.3.0]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_66]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[na:1.8.0_66]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_66]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]




Re: NIFI connecting to Activemq

2016-01-09 Thread Oleg Zhurakousky
Chris

Are you sure you are providing the correct logs? I can’t see a single mention 
of JMS, nor any stack traces, which would definitely be there given what you 
see in the UI.
Also, the fact that you see an NPE is definitely a bug that we have to fix 
(users should never see an NPE), so that can be filed. What I am trying to 
figure out is the condition that triggers it. Maybe if you shut down NiFi, 
delete all the logs, and restart, you can get fresh data. . .

Cheers
Oleg

On Jan 9, 2016, at 1:42 AM, Christopher Hamm wrote:

Here are the logs

On Wed, Jan 6, 2016 at 11:32 PM, Joe Witt wrote:
Chris,

Ok.  Can you take a look in the /logs/nifi-app.log and see if there is a 
stack trace for the NullPointerException included?

If not, please add the following to your logback.xml, and after 30 seconds or 
so it should start giving you the stack traces. A stacktrace for the NPE would 
be really useful in pinpointing the likely issue. Also please share the config 
details of the GetJMSQueue processor that you can.

[logback.xml snippet not preserved in the archive]
Thanks
Joe

On Wed, Jan 6, 2016 at 11:16 PM, Christopher Hamm wrote:
Couldn't copy the text but here it is.

[screenshot not preserved in the archive]

On Wed, Jan 6, 2016 at 5:28 PM, Joe Witt wrote:
Christopher,

Is there any error/feedback showing up in the UI or in the logs?

Thanks
Joe

On Wed, Jan 6, 2016 at 5:19 PM, Christopher Hamm wrote:
> What am I doing wrong with hooking up my ActiveMQ JMS get template? I put
> stuff into ActiveMQ and NiFi won't get it. Using 0.4.1.
>
> --
> Sincerely,
> Chris Hamm
> (E) ceham...@gmail.com
> (Twitter) http://twitter.com/webhamm
> (Linkedin) http://www.linkedin.com/in/chrishamm



--
Sincerely,
Chris Hamm
(E) ceham...@gmail.com
(Twitter) http://twitter.com/webhamm
(Linkedin) http://www.linkedin.com/in/chrishamm




--
Sincerely,
Chris Hamm
(E) ceham...@gmail.com
(Twitter) http://twitter.com/webhamm
(Linkedin) http://www.linkedin.com/in/chrishamm




Re: Testing a nifi flow via junit

2016-01-09 Thread Oleg Zhurakousky
This is definitely possible and has been done. What makes it difficult at times 
is having all the required NiFi dependencies in the process space of a given 
test. I've actually proposed a separate module for these types of 'headless' 
flow tests. It actually helped me discover some bugs as well as learn some of 
the NiFi internals.

Anyway, not near the computer at the moment, but I will follow up with more 
next week.

Oleg 

Sent from my iPhone

> On Jan 4, 2016, at 12:38, Vincent Russell  wrote:
> 
> All,
> 
> I see that there is a way to test a single processor with the TestRunner 
> (StandardProcessorTestRunner) class, but is there a way to set up an 
> integration test to test a complete flow or a subset of a flow?
> 
> Thank you,
> Vincent
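
For reference, a single-processor test with the TestRunner that Vincent 
mentions looks roughly like this (a sketch; PassThrough is a stand-in for the 
processor under test, and it assumes the nifi-mock and JUnit jars are on the 
test classpath):

import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.util.TestRunner;
import org.apache.nifi.util.TestRunners;
import org.junit.Test;

public class PassThroughTest {

    // trivial processor that routes every incoming FlowFile to 'success'
    static class PassThrough extends AbstractProcessor {
        static final Relationship REL_SUCCESS = new Relationship.Builder()
                .name("success").build();

        @Override
        public Set<Relationship> getRelationships() {
            return Set.of(REL_SUCCESS);
        }

        @Override
        public void onTrigger(ProcessContext context, ProcessSession session) {
            FlowFile flowFile = session.get();
            if (flowFile != null) {
                session.transfer(flowFile, REL_SUCCESS);
            }
        }
    }

    @Test
    public void routesToSuccess() {
        TestRunner runner = TestRunners.newTestRunner(new PassThrough());
        runner.enqueue("hello".getBytes());   // queue one input FlowFile
        runner.run();                         // single onTrigger invocation
        runner.assertAllFlowFilesTransferred(PassThrough.REL_SUCCESS, 1);
    }
}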


Re: High CPU usage in FileSystemRepository.java

2015-11-19 Thread Oleg Zhurakousky
Adam, thanks for doing all the research and pointing out where the problem is. 
It is definitely a bug.

Joe, I’ve raised https://issues.apache.org/jira/browse/NIFI-1200

Cheers
Oleg

On Nov 19, 2015, at 1:18 PM, Joe Percivall wrote:

Hello Adam,


Are you still seeing high CPU usage?

Sorry no one has gotten back to you sooner; we are all working very hard to 
get 0.4.0 out.

Joe

- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com




On Friday, November 13, 2015 4:10 PM, Adam Lamar  wrote:



Mark,

For this development system, I'm running the packaged OpenJDK from
Ubuntu 14.04:

$ java -version
java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)

Interestingly, I tried another system running the Oracle JDK (same version) 
and didn't see the same issue, though it does seem to benefit from the 
additional sleep in that loop, but just barely (maybe a 1% CPU difference - 
I could be making that up).

I hadn't uncommented those values, but I tried with no noticeable
difference on the OpenJDK system.

Hope that helps,
Adam


On 11/13/15 5:28 AM, Mark Payne wrote:
Adam,

What version of Java are you running?

Do you have the following lines from conf/bootstrap.conf uncommented, or are 
they still commented out?

java.arg.7=-XX:ReservedCodeCacheSize=256m
java.arg.8=-XX:CodeCacheFlushingMinimumFreeSpace=10m
java.arg.9=-XX:+UseCodeCacheFlushing
java.arg.11=-XX:PermSize=128M
java.arg.12=-XX:MaxPermSize=128M

Thanks
-Mark


On Nov 13, 2015, at 12:28 AM, Joe Witt  wrote:

sorry - i see now :-)

Thanks for the analysis.  Will dig in.

Joe

On Fri, Nov 13, 2015 at 12:28 AM, Joe Witt  wrote:
Adam,

Are you on a recent master build?

Thanks
Joe

On Fri, Nov 13, 2015 at 12:27 AM, Adam Lamar  wrote:
Hi everybody!

I'm following up from my previous thread about high CPU usage in GetSQS. I
ran into high CPU usage while developing a patch for that processor, and
while investigating with "top", I noticed one NiFi thread in particular
showed high CPU usage, even after turning off all processors and restarting
NiFi.

A jstack showed this thread was busy at FileSystemRepository.java line 1287
[1]. Since that is a continue statement, it suggests that the thread was
churning in the surrounding for loop.

I didn't debug any further, but I did add a sleep statement just before the
continue, and CPU usage dropped wildly, settling around 2-4%.

I hope this is useful information, and I'm happy to debug further if needed.

Cheers,
Adam

[1]
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/FileSystemRepository.java#L1287
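
For reference, the shape of the mitigation Adam describes (a sketch of the 
general pattern, not the actual NiFi patch) is simply to back off when the 
loop has nothing to do:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A polling loop whose 'continue' branch fires on most iterations should
// back off briefly instead of spinning.
public class PollingLoop {
    private static final BlockingQueue<String> work = new LinkedBlockingQueue<>();

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            String item = work.poll();   // returns null when nothing to do
            if (item == null) {
                Thread.sleep(10L);       // tiny backoff: ~100% CPU becomes ~0%
                continue;
            }
            System.out.println("processing " + item);
        }
    }
}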




Re: NiFi processor for Redis

2016-01-13 Thread Oleg Zhurakousky
Sudeep

Also, when you say push/retrieve, do you mean publish/subscribe or put/get? 
Redis is both a storage and a messaging system. As Joe mentioned, we don’t 
have any at the moment, but it would be nice to start prioritizing.

Cheers
Oleg

> On Jan 13, 2016, at 12:41 AM, Joe Witt  wrote:
> 
> Sudeep,
> 
> Hello.  At this time there are no apache nifi redis processors to
> push/pull data with Redis that I am aware of.  Something you might be
> interested in contributing or contributing to?
> 
> Thanks
> Joe
> 
> On Wed, Jan 13, 2016 at 12:13 AM, sudeep mishra
>  wrote:
>> Hi,
>> 
>> Do we have any processor to push and retrieve data from Redis?
>> 
>> 
>> Thanks & Regards,
>> 
>> Sudeep Shekhar Mishra
>> 
> 
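
To make the put/get vs publish/subscribe distinction concrete, here is a 
sketch using the Jedis client (host, port, keys, and channel names are made 
up; there is no NiFi processor behind this yet):

import redis.clients.jedis.Jedis;

// Redis as storage (key/value put and get) vs Redis as messaging (pub/sub).
public class RedisModes {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // storage: put a value under a key and read it back
            jedis.set("flowfile:123", "payload");
            System.out.println(jedis.get("flowfile:123"));

            // messaging: fire-and-forget publish to a channel; only clients
            // subscribed at this moment will see it
            jedis.publish("nifi-events", "payload");
        }
    }
}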



Re: Custom processor is failing for concurrency

2016-06-03 Thread Oleg Zhurakousky
Kumiko

It appears that the current state of the source you linked is not in sync 
with what is in the stack trace. Perhaps you have made some code modifications 
(e.g., line 218 is an empty line in the code, while it has a pointer in the 
stack trace).
In any event, from what I can see the error is coming from the Azure libraries 
(not NiFi). Specifically, ‘com.microsoft.rest.AutoRestBaseUrl.url(..)’ seems to 
be doing some iteration where I presume the remove is called. Perhaps it is not 
a thread-safe class after all. What does the Microsoft documentation say? Have 
you looked at the source to see what’s going on there? If it's open, please 
link it and we can take a look.

Cheers
Oleg

On Jun 3, 2016, at 4:58 PM, Kumiko Yada wrote:

Here is the code, https://github.com/kyada1/PutFileAzureDLStore.

Thanks
Kumiko

From: Bryan Bende [mailto:bbe...@gmail.com]
Sent: Friday, June 3, 2016 12:57 PM
To: users@nifi.apache.org
Subject: Re: Custom processor is failing for the custom processor

Hello,

Would you be able to share your code for PutFileAzureDLStore so we can help 
identify if there is a concurrency problem?

-Bryan

On Fri, Jun 3, 2016 at 3:39 PM, Kumiko Yada wrote:
Hello,

I wrote the following custom controller service and processor. When the custom 
processor is running concurrently, it often fails with several different 
errors. Is there any special handling for concurrency that I need to add in 
the custom processor? I wrote a sample Java program which does the same thing 
as the custom processor (authenticate every time the file is created/create a 
file, create 2 threads and run concurrently), and it works fine. The custom 
processor is also fine when it is not running concurrently.

Custom controller service – sets the properties for the Microsoft Azure 
Datalake Store
Custom processor – authenticates, then creates a file in the Microsoft Azure 
Datalake Store

Error1:
2016-06-03 12:29:31,942 INFO [pool-2815-thread-1] 
c.m.aad.adal4j.AuthenticationAuthority [Correlation ID: 
64c89876-2f7b-4fbb-8e6f-fb395a47aa87] Instance discovery was successful
2016-06-03 12:29:31,946 ERROR [Timer-Driven Process Thread-10] 
n.a.d.processors.PutFileAzureDLStore
java.util.ConcurrentModificationException: null
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429) 
~[na:1.8.0_77]
at java.util.HashMap$EntryIterator.next(HashMap.java:1463) 
~[na:1.8.0_77]
at java.util.HashMap$EntryIterator.next(HashMap.java:1461) 
~[na:1.8.0_77]
at 
com.microsoft.rest.AutoRestBaseUrl.url(AutoRestBaseUrl.java:28) ~[na:na]
at retrofit2.RequestFactory.create(RequestFactory.java:50) 
~[na:na]
at retrofit2.OkHttpCall.createRawCall(OkHttpCall.java:181) 
~[na:na]
at retrofit2.OkHttpCall.execute(OkHttpCall.java:165) ~[na:na]
at 
com.microsoft.azure.management.datalake.store.FileSystemOperationsImpl.create(FileSystemOperationsImpl.java:1432)
 ~[na:na]
at 
nifi.azure.dlstore.processors.PutFileAzureDLStore.CreateFile(PutFileAzureDLStore.java:252)
 ~[na:na]
at 
nifi.azure.dlstore.processors.PutFileAzureDLStore.onTrigger(PutFileAzureDLStore.java:218)
 ~[na:na]
at 
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
 ~[nifi-api-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1057)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:123)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_77]
at 
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_77]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_77]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_77]

Re: Excessive logging

2016-06-05 Thread Oleg Zhurakousky
Huagen, it appears you have DEBUG level logging enabled.

Oleg
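
If that wasn't intentional, pinning the noisy logger back to INFO in 
conf/logback.xml should stop it. A sketch (the logger name is inferred from 
your log lines below; adjust if yours differs):

<!-- raise the chatty logger back to INFO so the DEBUG lines stop -->
<logger name="org.apache.nifi.engine.FlowEngine" level="INFO"/>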

On Jun 4, 2016, at 21:27, Huagen peng wrote:

Hi,

I am getting excessive logging from my NiFi instance. I suddenly see logging 
like the following going into nifi-bootstrap.log very fast, around 10 GB/hour, 
filling up my disk quickly. What is the cause of this? How do I stop it from 
going into the log? I tried looking into the logback.xml file and even 
commented out the logger for org.apache.nifi.StdOut, but this did not help at 
all.

Thanks,

Huagen

2016-06-04 12:56:09,416 INFO [NiFi logging handler] org.apache.nifi.StdOut 
12:56:08.696 [Timer-Driven Process Thread-4] DEBUG 
org.apache.nifi.engine.FlowEngine - A flow controller execution task 
'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7753bd5e' 
has been cancelled.

2016-06-04 05:22:58,034 INFO [NiFi logging handler] org.apache.nifi.StdOut 
05:22:58.034 [Timer-Driven Process Thread-3] DEBUG 
org.apache.nifi.engine.FlowEngine - A flow controller execution task 
'java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@546d70b0' 
has been cancelled.






Re: NiFi GetKafka Processor is illegal, contains a character other than ASCII alphanumerics, '.', '_' and '-'

2016-06-08 Thread Oleg Zhurakousky
I am not sure I understand the question.

Sent from my iPhone

> On Jun 8, 2016, at 18:23, Igor Kravzov  wrote:
> 
> Guys, what can be wrong?
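
For context: the error in the subject line is Kafka's topic-name validation. 
Based only on the message text, a legal name may contain only ASCII 
alphanumerics, '.', '_' and '-'. A quick check of a candidate name (a sketch, 
not Kafka's actual validator code):

import java.util.regex.Pattern;

// Validates a topic name against the character set named in the error message.
public class TopicNameCheck {
    private static final Pattern LEGAL = Pattern.compile("[a-zA-Z0-9._-]+");

    public static void main(String[] args) {
        System.out.println(LEGAL.matcher("my_topic-1.raw").matches()); // true
        System.out.println(LEGAL.matcher("my topic").matches());       // false: space is illegal
    }
}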


Re: Custom processor is failing for concurrency

2016-06-09 Thread Oleg Zhurakousky
…in the loop it calls mappings.clear(), so if the RequestFactory is shared by 
multiple threads (which I think it is), then one thread could be iterating 
over the set while another calls mappings.clear().
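
A minimal reproduction of that failure mode (a sketch, independent of the 
Azure client): one thread iterates a shared HashMap while another mutates it, 
which typically throws the same ConcurrentModificationException seen in 
Kumiko's stack trace.

import java.util.HashMap;
import java.util.Map;

public class CmeDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<String, String> mappings = new HashMap<>();
        for (int i = 0; i < 100_000; i++) {
            mappings.put("k" + i, "v");
        }

        Thread mutator = new Thread(mappings::clear); // structural modification
        Thread reader = new Thread(() -> {
            for (Map.Entry<String, String> e : mappings.entrySet()) {
                e.getKey(); // the fail-fast iterator usually blows up here
            }
        });
        reader.start();
        mutator.start();
        reader.join();
        mutator.join();
    }
}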


On Fri, Jun 3, 2016 at 5:09 PM, Oleg Zhurakousky
<ozhurakou...@hortonworks.com> wrote:
Kumiko

It appears that the current state of the source you linked is not in sync 
with what is in the stack trace. Perhaps you have made some code modifications 
(e.g., line 218 is an empty line in the code, while it has a pointer in the 
stack trace).
In any event, from what I can see the error is coming from the Azure libraries 
(not NiFi). Specifically, ‘com.microsoft.rest.AutoRestBaseUrl.url(..)’ seems to 
be doing some iteration where I presume the remove is called. Perhaps it is not 
a thread-safe class after all. What does the Microsoft documentation say? Have 
you looked at the source to see what’s going on there? If it's open, please 
link it and we can take a look.

Cheers
Oleg

On Jun 3, 2016, at 4:58 PM, Kumiko Yada
<kumiko.y...@ds-iq.com> wrote:

Here is the code, https://github.com/kyada1/PutFileAzureDLStore.

Thanks
Kumiko

From: Bryan Bende [mailto:bbe...@gmail.com]
Sent: Friday, June 3, 2016 12:57 PM
To: users@nifi.apache.org
Subject: Re: Custom processor is failing for the custom processor

Hello,

Would you be able to share your code for PutFileAzureDLStore so we can help 
identify if there is a concurrency problem?

-Bryan

On Fri, Jun 3, 2016 at 3:39 PM, Kumiko Yada
<kumiko.y...@ds-iq.com> wrote:
Hello,

I wrote the following custom controller service and processor. When the custom 
processor is running concurrently, it often fails with several different 
errors. Is there any special handling for concurrency that I need to add in 
the custom processor? I wrote a sample Java program which does the same thing 
as the custom processor (authenticate every time the file is created/create a 
file, create 2 threads and run concurrently), and it works fine. The custom 
processor is also fine when it is not running concurrently.

Custom controller service – sets the properties for the Microsoft Azure 
Datalake Store
Custom processor – authenticates, then creates a file in the Microsoft Azure 
Datalake Store

Error1:
2016-06-03 12:29:31,942 INFO [pool-2815-thread-1] 
c.m.aad.adal4j.AuthenticationAuthority [Correlation ID: 
64c89876-2f7b-4fbb-8e6f-fb395a47aa87] Instance discovery was successful
2016-06-03 12:29:31,946 ERROR [Timer-Driven Process Thread-10] 
n.a.d.processors.PutFileAzureDLStore
java.util.ConcurrentModificationException: null
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429) 
~[na:1.8.0_77]
at java.util.HashMap$EntryIterator.next(HashMap.java:1463) 
~[na:1.8.0_77]
at java.util.HashMap$EntryIterator.next(HashMap.java:1461) 
~[na:1.8.0_77]
at 
com.microsoft.rest.AutoRestBaseUrl.url(AutoRestBaseUrl.java:28) ~[na:na]
at retrofit2.RequestFactory.create(RequestFactory.java:50) 
~[na:na]
at retrofit2.OkHttpCall.createRawCall(OkHttpCall.java:181) 
~[na:na]
at retrofit2.OkHttpCall.execute(OkHttpCall.java:165) ~[na:na]
at 
com.microsoft.azure.management.datalake.store.FileSystemOperationsImpl.create(FileSystemOperationsImpl.java:1432)
 ~[na:na]
at 
nifi.azure.dlstore.processors.PutFileAzureDLStore.CreateFile(PutFileAzureDLStore.java:252)
 ~[na:na]
at 
nifi.azure.dlstore.processors.PutFileAzureDLStore.onTrigger(PutFileAzureDLStore.java:218)
 ~[na:na]
at 
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
 ~[nifi-api-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1057)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:123)
 [nifi-framework-core-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_77]
at 
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_77]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_77]
at 
java.util.concurrent.Sch

Re: Merge multiple flowfiles

2016-06-03 Thread Oleg Zhurakousky
Huge

Just to close the loop on this one, I also wanted to point out JIRA 
https://issues.apache.org/jira/browse/NIFI-1926 for a general-purpose 
aggregation processor, which would indeed support multiple connections and 
configurable aggregation, release, and correlation strategies.
It would be nice if you could describe your use case in that JIRA, so we can 
start gathering these use cases.

Cheers
Oleg

On Jun 3, 2016, at 2:33 AM, Huagen peng wrote:

Thanks for the reply, Andy.

I ended up abandoning my previous approach and using ExecuteStreamCommand to 
output (with the zcat command on GZ files) all the files I want to concatenate, 
then performing some data manipulation and saving the file.

Huagen

On Jun 3, 2016, at 12:29 AM, Andy LoPresto wrote:

Huagen,

Sorry, I am a little confused. My understanding is that you want to combine n 
individual logs (each with a respective flowfile) from a specific hour into a 
single file. What is confusing is when you say “Even with that [a 5* 
confirmation loop], I occasionally still get more than one merged flowfile.” Do 
you mean that what you expected to be combined into a single flowfile is output 
as two distinct and incomplete flowfiles?

Without seeing a template of your work flow, I can make a couple of suggestions.

First, as mentioned last night by James Wing, I would encourage you to look at 
the MergeContent [1] processor properties to provide a high threshold for 
merging flowfiles. If you know the number of log files per hour a priori, you 
can set that as the “Minimum Number of Entries” and ensure that output will 
wait until that many flowfiles have been accumulated.

Also, given that you have described a “loop”, I would imagine you may have 
multiple connections feeding into MergeContent. MergeContent can have 
unexpected behavior with multiple incoming connections, and so I would 
recommend adding a Funnel to aggregate all incoming connections and provide a 
single incoming connection to MergeContent.
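
To make the first suggestion concrete, here is roughly that MergeContent 
configuration driven from a test (a sketch; the property names are taken from 
the MergeContent documentation, and it assumes the nifi-standard-processors and 
nifi-mock jars are on the classpath):

import org.apache.nifi.processors.standard.MergeContent;
import org.apache.nifi.util.TestRunner;
import org.apache.nifi.util.TestRunners;

// Hold a bin open until all 200 fragments sharing a correlation attribute arrive.
public class MergeContentConfigSketch {
    public static void main(String[] args) {
        TestRunner runner = TestRunners.newTestRunner(new MergeContent());
        runner.setProperty("Minimum Number of Entries", "200");
        runner.setProperty("Maximum Number of Entries", "200");
        runner.setProperty("Correlation Attribute Name", "segment.original.filename");
    }
}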

Please let us know if this helps, and if not, please share a template and some 
sample input if possible. Thanks.

[1] 
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/index.html


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On Jun 1, 2016, at 11:52 AM, Huagen peng wrote:

Hi,

In the data flow I am dealing with now, there are multiple (up to 200) logs 
associated with a given hour. I need to process these hourly log fragments and 
then concatenate them into a single file. The approach I am using now has an 
UpdateAttribute processor to set an arbitrary segment.original.filename 
attribute on all the flowfiles I want to merge.  Then I use a MergeContent 
processor, with an UpdateAttribute and RouteOnAttribute processor to form a 
loop to confirm five times that the merge is complete.  Even with that, I 
occasionally still get more than one merged flowfile.

Is there a better way to do this?  Or should I increase the loop count, say 10?

Thanks.

Huagen





Re: Merge multiple flowfiles

2016-06-03 Thread Oleg Zhurakousky
Huagen,
I also want to apologize for my spell-checker butchering your name ;)

Cheers
Oleg

On Jun 3, 2016, at 8:03 AM, Oleg Zhurakousky
<ozhurakou...@hortonworks.com> wrote:

Huge

Just to close the loop on this one, I also wanted to point out JIRA 
https://issues.apache.org/jira/browse/NIFI-1926 for a general-purpose 
aggregation processor, which would indeed support multiple connections and 
configurable aggregation, release, and correlation strategies.
It would be nice if you could describe your use case in that JIRA, so we can 
start gathering these use cases.

Cheers
Oleg

On Jun 3, 2016, at 2:33 AM, Huagen peng
<huagen.p...@gmail.com> wrote:

Thanks for the reply, Andy.

I ended up abandoning my previous approach and using ExecuteStreamCommand to 
output (with the zcat command on GZ files) all the files I want to concatenate, 
then performing some data manipulation and saving the file.

Huagen

On Jun 3, 2016, at 12:29 AM, Andy LoPresto
<alopre...@apache.org> wrote:

Huagen,

Sorry, I am a little confused. My understanding is that you want to combine n 
individual logs (each with a respective flowfile) from a specific hour into a 
single file. What is confusing is when you say “Even with that [a 5* 
confirmation loop], I occasionally still get more than one merged flowfile.” Do 
you mean that what you expected to be combined into a single flowfile is output 
as two distinct and incomplete flowfiles?

Without seeing a template of your work flow, I can make a couple of suggestions.

First, as mentioned last night by James Wing, I would encourage you to look at 
the MergeContent [1] processor properties to provide a high threshold for 
merging flowfiles. If you know the number of log files per hour a priori, you 
can set that as the “Minimum Number of Entries” and ensure that output will 
wait until that many flowfiles have been accumulated.

Also, given that you have described a “loop”, I would imagine you may have 
multiple connections feeding into MergeContent. MergeContent can have 
unexpected behavior with multiple incoming connections, and so I would 
recommend adding a Funnel to aggregate all incoming connections and provide a 
single incoming connection to MergeContent.

Please let us know if this helps, and if not, please share a template and some 
sample input if possible. Thanks.

[1] 
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/index.html


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On Jun 1, 2016, at 11:52 AM, Huagen peng
<huagen.p...@gmail.com> wrote:

Hi,

In the data flow I am dealing with now, there are multiple (up to 200) logs 
associated with a given hour. I need to process these hourly log fragments and 
then concatenate them into a single file. The approach I am using now has an 
UpdateAttribute processor to set an arbitrary segment.original.filename 
attribute on all the flowfiles I want to merge.  Then I use a MergeContent 
processor, with an UpdateAttribute and RouteOnAttribute processor to form a 
loop to confirm five times that the merge is complete.  Even with that, I 
occasionally still get more than one merged flowfile.

Is there a better way to do this?  Or should I increase the loop count, say 10?

Thanks.

Huagen






Re: PutKafka won't Put! "Failed while waiting for acks from Kafka"

2016-06-11 Thread Oleg Zhurakousky
Also, I just looked at the logs, and what I see is that you have some type of 
connection problem to Kafka.

2016-06-11 15:23:06,193 WARN [kafka-producer-network-thread | SecondPutKafka2k] 
org.apache.kafka.common.network.Selector Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[kafka-clients-0.8.2.2.jar:na]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[kafka-clients-0.8.2.2.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[kafka-clients-0.8.2.2.jar:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
[kafka-clients-0.8.2.2.jar:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
[kafka-clients-0.8.2.2.jar:na]

The above is coming from Kafka (not NiFi). Are you sure that you have the 
correct IP:PORT in your producer config? One quick diagnostic you can do is to 
try to send a message to Kafka from the command line on the same machine as 
NiFi. Basically this:

$> bin/kafka-console-producer.sh --broker-list <IP>:<PORT> --topic <topic>

Let me know
Cheers
Oleg

On Jun 11, 2016, at 5:03 PM, Oleg Zhurakousky
<ozhurakou...@hortonworks.com> wrote:

Also, the NPE was a bug that was fixed a while back; the fix will be available 
in 0.7.
Any chance you can build from 0.x branch?

Cheers
Oleg

Sent from my iPhone

On Jun 11, 2016, at 16:58, Juan Sequeiros
<helloj...@gmail.com> wrote:

Hi Pat,

Looks to me that in your PutKafka you are pointing to your zookeeper instance 
on port 2181; you should point it to your kafka broker instead, port 9092 
(I think).
Not home to verify ...

On Sat, Jun 11, 2016 at 3:31 PM, Pat Trainor
<pat.trai...@gmail.com> wrote:
OK... Prep'ing kafka:

pat@wopr:/opt/kafka$ bin/kafka-server-stop.sh
pat@wopr:/opt/kafka$ bin/zookeeper-server-stop.sh
pat@wopr:/opt/kafka/logs$ rm log-cleaner.log controller.log server.log 
kafkaServer-gc.log zookeeper-gc.log

Prep'ing nifi:

pat@wopr:/opt/nifi$ bin/nifi.sh stop
pat@wopr:/opt/nifi$ rm logs/nifi-bootstrap.log logs/nifi-app.log 
logs/nifi-user.log

Starting kafka:

pat@wopr:/opt/kafka$ bin/zookeeper-server-start.sh config/zookeeper.properties &
pat@wopr:/opt/kafka$ bin/kafka-server-start.sh config/server.properties &

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic test
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic fast-messages
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic inNLP
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic outNLP

The above topics were preserved.

kafka.logs.tar.gz attached

Starting Nifi:

pat@wopr:/opt/nifi$ bin/nifi.sh start

nifi.logs.tar.gz attached.

Now to log onto Nifi GUI and start the 2 processes (GenerateFlowfile & PutKafka 
...to fast-messages topic):

I really hope this helps... I can't believe I'm going to spend my entire 
weekend on this... :-(

Thanks!

pat<http://about.me/PatTrainor>
( ͡° ͜ʖ ͡°)


"A wise man can learn more from a foolish question than a fool can learn from a 
wise answer". ~ Bruce Lee.

On Jun 11 2016, at 1:38 pm, Joe Witt
<joe.w...@gmail.com> wrote:

Pat

Could you please make sure we are seeing the full stacktrace from the app log? 
If we could get a full log that would be even better.

Thanks
Joe

On Jun 11, 2016 1:17 PM, "Pat Trainor" 
<pat.trai...@gmail.com<mailto:pat.trai...@gmail.com>> wrote:
Andrew,

Thanks for writing.

Nifi:
nifi-0.6.1
Kafka:
kafka_2.11-0.10.0.0

startup with no changes to configs:

#
Kafka:
bin/zookeeper-server-start.sh config/zookeeper.properties &
bin/kafka-server-start.sh config/server.properties &
#
# create a topic
#
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic test
[...]

Nifi's zookeeper.properties:


clientPort=2181
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30

#
# server.1=nifi-node1-hostname:2888:3888
# server.2=nifi-node2-hostname:2888:3888
# server.3=nifi-node3-hostname:2888:3888
#
server.1=

^^
Even though I have:

nifi.state.management.embedded.zookeeper.start=false

I filled in the above setting... changing it to:

server.1=localhost:2888:3888

...which made no difference, of course...

Any thoughts?

Thanks!

pat<http://about.me/PatTrainor>
( ͡° ͜ʖ ͡°)

On Jun 11 2016, at 12:17 pm, Andrew Grande
<apere...@gmail.com> wrote:
Pat, which NiFi version is that? Which Kafka broker version on the other end?

Andrew

On Sat, Jun 11, 2016 at 12:15 PM Pat T

Re: PutKafka won't Put! "Failed while waiting for acks from Kafka"

2016-06-11 Thread Oleg Zhurakousky
Also, the NPE was a bug that was fixed a while back; the fix will be available 
in 0.7.
Any chance you can build from 0.x branch?

Cheers
Oleg

Sent from my iPhone

On Jun 11, 2016, at 16:58, Juan Sequeiros wrote:

Hi Pat,

Looks to me that in your PutKafka you are pointing to your zookeeper instance 
on port 2181; you should point it to your kafka broker instead, port 9092 
(I think).
Not home to verify ...

On Sat, Jun 11, 2016 at 3:31 PM, Pat Trainor wrote:
OK... Prep'ing kafka:

pat@wopr:/opt/kafka$ bin/kafka-server-stop.sh
pat@wopr:/opt/kafka$ bin/zookeeper-server-stop.sh
pat@wopr:/opt/kafka/logs$ rm log-cleaner.log controller.log server.log 
kafkaServer-gc.log zookeeper-gc.log

Prep'ing nifi:

pat@wopr:/opt/nifi$ bin/nifi.sh stop
pat@wopr:/opt/nifi$ rm logs/nifi-bootstrap.log logs/nifi-app.log 
logs/nifi-user.log

Starting kafka:

pat@wopr:/opt/kafka$ bin/zookeeper-server-start.sh config/zookeeper.properties &
pat@wopr:/opt/kafka$ bin/kafka-server-start.sh config/server.properties &

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic test
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic fast-messages
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic inNLP
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic outNLP

The above topics were preserved.

kafka.logs.tar.gz attached

Starting Nifi:

pat@wopr:/opt/nifi$ bin/nifi.sh start

nifi.logs.tar.gz attached.

Now to log onto Nifi GUI and start the 2 processes (GenerateFlowfile & PutKafka 
...to fast-messages topic):

I really hope this helps... I can't believe I'm going to spend my entire 
weekend on this... :-(

Thanks!

pat
( ͡° ͜ʖ ͡°)


"A wise man can learn more from a foolish question than a fool can learn from a 
wise answer". ~ Bruce Lee.

On Jun 11 2016, at 1:38 pm, Joe Witt wrote:

Pat

Could you please make sure we are seeing the full stacktrace from the app log? 
If we could get a full log that would be even better.

Thanks
Joe

On Jun 11, 2016 1:17 PM, "Pat Trainor" 
> wrote:
Andrew,

Thanks for writing.

Nifi:
nifi-0.6.1
Kafka:
kafka_2.11-0.10.0.0

startup with no changes to configs:

#
Kafka:
bin/zookeeper-server-start.sh config/zookeeper.properties &
bin/kafka-server-start.sh config/server.properties &
#
# create a topic
#
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 
--partitions 1 --topic test
[...]

Nifi's zookeeper.properties:


clientPort=2181
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30

#
# server.1=nifi-node1-hostname:2888:3888
# server.2=nifi-node2-hostname:2888:3888
# server.3=nifi-node3-hostname:2888:3888
#
server.1=

^^
Even though I have:

nifi.state.management.embedded.zookeeper.start=false

I filled in the above setting... changing it to:

server.1=localhost:2888:3888

...which made no difference, of course...

Any thoughts?

Thanks!

pat
( ͡° ͜ʖ ͡°)

On Jun 11 2016, at 12:17 pm, Andrew Grande wrote:
Pat, which NiFi version is that? Which Kafka broker version on the other end?

Andrew

On Sat, Jun 11, 2016 at 12:15 PM Pat Trainor wrote:
Users,

Has anyone come across PutKafka simply not wanting to send to Kafka?

In this troubleshooting, I have replicated the symptoms in the simplest way I 
can - with a GenerateFlowFile and a PutKafka:

[https://community.hortonworks.com/storage/temp/4972-screenshot-from-2016-06-11-11-34-25.png]

With a known good Kafka running (other producers using the same topic)...

...yields:

Auto refresh started
11:17:19 EDT WARNING 2b76634c-6354-4c3c-b977-86f73d64efbc PutKafka[id=2b76634c-6354-4c3c-b977-86f73d64efbc] Processor Administratively Yielded for 1 sec due to processing failure
11:18:00 EDT ERROR 2b76634c-6354-4c3c-b977-86f73d64efbc PutKafka[id=2b76634c-6354-4c3c-b977-86f73d64efbc] Failed while waiting for acks from Kafka
11:18:00 EDT ERROR 2b76634c-6354-4c3c-b977-86f73d64efbc PutKafka[id=2b76634c-6354-4c3c-b977-86f73d64efbc] PutKafka[id=2b76634c-6354-4c3c-b977-86f73d64efbc] failed to process due to java.lang.NullPointerException; rolling back session: java.lang.NullPointerException
11:18:00 EDT ERROR 2b76634c-6354-4c3c-b977-86f73d64efbc PutKafka[id=2b76634c-6354-4c3c-b977-86f73d64efbc] PutKafka[id=2b76634c-6354-4c3c-b977-86f73d64efbc] failed to process session due to java.lang.NullPointerException: java.lang.NullPointerException

Everything is super-stock, out of the box, and not customized. In fact, I 
tested Kafka with 

Re: PutKafka warning - why?

2016-05-26 Thread Oleg Zhurakousky
That is because the processor is already in the process of being stopped and 
you probably tried to start it again.
That said, the fact that you see it is probably a bug since you should not be 
able to see ‘start’ action until processor is fully stopped.

Could you please share more details as to what are you doing (in terms of 
sequence of events)

Thanks
Oleg

> On May 26, 2016, at 4:12 PM, Igor Kravzov  wrote:
> 
> Why am I getting the warning below?
> [screenshot not preserved in the archive]



Re: PutKafka warning - why?

2016-05-26 Thread Oleg Zhurakousky
at java.util.concurrent.FutureTask.runAndReset(Unknown Source) [na:1.8.0_91]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown
 Source) [na:1.8.0_91]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown
 Source) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
[na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
[na:1.8.0_91]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_91]

On Thu, May 26, 2016 at 4:28 PM, Oleg Zhurakousky
<ozhurakou...@hortonworks.com> wrote:
That is because the processor is already in the process of being stopped, and 
you probably tried to start it again.
That said, the fact that you see it is probably a bug, since you should not be 
able to see the ‘start’ action until the processor is fully stopped.

Could you please share more details as to what you are doing (in terms of 
sequence of events)?

Thanks
Oleg

> On May 26, 2016, at 4:12 PM, Igor Kravzov
> <igork.ine...@gmail.com> wrote:
>
> Why am I getting the warning below?
> [screenshot not preserved in the archive]





Re: GetKafka exception

2016-05-26 Thread Oleg Zhurakousky
Igor, you may want to go to the Kafka mailing list. I've seen this several 
times, and all I know is that it has to do with the broker.
I'll look in more detail once back online.

Oleg

Sent from my iPhone

On May 26, 2016, at 17:20, Igor Kravzov wrote:

Why am I getting the exception below?


2016-05-26 17:13:36,776 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.consumer.ConsumerFetcherManager [ConsumerFetcherManager-1464297163222] 
Added fetcher for partitions ArrayBuffer()
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.utils.VerifiableProperties Verifying properties
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.utils.VerifiableProperties Property client.id is 
overridden to NiFi-d217bceb-d58b-4031-9143-26d8ddd15e40
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.utils.VerifiableProperties Property metadata.broker.list is overridden 
to 10.1.132.113:6667
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.utils.VerifiableProperties Property 
request.timeout.ms is overridden to 3
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.client.ClientUtils$ Fetching metadata from broker 
id:1001,host:10.1.132.113,port:6667 with correlation id 256 for 1 topic(s) 
Set(Test1)
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.producer.SyncProducer Connected to 
10.1.132.113:6667 for producing
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.producer.SyncProducer Disconnecting from 
10.1.132.113:6667
2016-05-26 17:13:36,981 WARN 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.client.ClientUtils$ Fetching topic metadata with correlation id 256 for 
topics [Set(Test1)] from broker [id:1001,host:10.1.132.113,port:6667] failed
java.nio.channels.ClosedChannelException: null
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100) 
~[kafka_2.10-0.8.2.2.jar:na]
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73) 
~[kafka_2.10-0.8.2.2.jar:na]
at 
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
 ~[kafka_2.10-0.8.2.2.jar:na]
at kafka.producer.SyncProducer.send(SyncProducer.scala:113) 
~[kafka_2.10-0.8.2.2.jar:na]
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58) 
[kafka_2.10-0.8.2.2.jar:na]
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:93) 
[kafka_2.10-0.8.2.2.jar:na]
at 
kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
 [kafka_2.10-0.8.2.2.jar:na]
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60) 
[kafka_2.10-0.8.2.2.jar:na]
2016-05-26 17:13:36,981 INFO 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 kafka.producer.SyncProducer Disconnecting from 
10.1.132.113:6667
2016-05-26 17:13:36,981 WARN 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread]
 k.c.ConsumerFetcherManager$LeaderFinderThread 
[d217bceb-d58b-4031-9143-26d8ddd15e40_BPIW01a-1464297163195-b78e95a5-leader-finder-thread],
 Failed to find leader for Set([Test1,0])
kafka.common.KafkaException: fetching topic metadata for topics [Set(Test1)] 
from broker [ArrayBuffer(id:1001,host:10.1.132.113,port:6667)] failed
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:72) 
~[kafka_2.10-0.8.2.2.jar:na]
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:93) 
~[kafka_2.10-0.8.2.2.jar:na]
at 
kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
 ~[kafka_2.10-0.8.2.2.jar:na]
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60) 
[kafka_2.10-0.8.2.2.jar:na]
Caused by: java.nio.channels.ClosedChannelException: null
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100) 
~[kafka_2.10-0.8.2.2.jar:na]
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73) 
~[kafka_2.10-0.8.2.2.jar:na]
at 
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
 ~[kafka_2.10-0.8.2.2.jar:na]
at kafka.producer.SyncProducer.send(SyncProducer.scala:113) 
~[kafka_2.10-0.8.2.2.jar:na]
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58) 
~[kafka_2.10-0.8.2.2.jar:na]
... 

Re: Nifi 0.50 and GetKafka Issues

2016-02-20 Thread Oleg Zhurakousky
Josh

Any chance to attach the app-log or relevant stack trace?

Thanks
Oleg

On Feb 20, 2016, at 3:30 PM, West, Joshua 
> wrote:

Hi folks,

I've upgraded from Nifi 0.4.1 to 0.5.0 and I am no longer able to use the 
GetKafka processor.  I'm seeing errors like so:

2016-02-20 20:10:14,953 WARN 
[ConsumerFetcherThread-NiFi-sldjflkdsjflksjf_**SCRUBBED**-1455999008728-5b8c7108-0-0]
 kafka.consumer.ConsumerFetcherThread 
[ConsumerFetcherThread-NiFi-sldjflkdsjflksjf_**SCRUBBED**-1455999008728-5b8c7108-0-0],
 Error in 
fetchkafka.consumer.ConsumerFetcherThread$FetchRequest@7b49a642.
 Possible cause: java.lang.IllegalArgumentException

^ Note the hostname of the server has been scrubbed.

My configuration is pretty generic, except that with Zookeeper we use a 
different root path, so our Zookeeper connect string looks like so:

zookeeper-node1:2181,zookeeper-node2:2181,zookeeper-node3:2181/kafka

Is anybody else experiencing issues?

Thanks.


--
Josh West >

Cloud Architect
Bose Corporation






Re: Nifi 0.50 and GetKafka Issues

2016-02-20 Thread Oleg Zhurakousky
Josh

The only change that went in and is relevant to your issue is that we’ve 
upgraded the client libraries to Kafka 0.9, and between 0.8 and 0.9 Kafka 
introduced wire protocol changes that break compatibility.
I am still digging so stay tuned.

Oleg

On Feb 20, 2016, at 4:10 PM, West, Joshua 
<josh_w...@bose.com> wrote:

Hi Oleg and Joe,

Kafka 0.8.2.1

Attached is the app log with hostnames scrubbed.

Thanks for your help.  Much appreciated.


--
Josh West <josh_w...@bose.com>
Bose Corporation



On Sat, 2016-02-20 at 15:46 -0500, Joe Witt wrote:

And also what version of Kafka are you using?



Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Oleg Zhurakousky
Josh

Also, keep in mind that there are incompatible property names in Kafka between 
the 0.7 and 0.8 releases. One of the changes that went in was replacing 
“zk.connectiontimeout.ms” with “zookeeper.connection.timeout.ms”. Not sure if 
it’s related, but while 0.4.1 was relying on this property its value was 
completely ignored by the 0.8 client libraries (you could actually see the WARN 
message to that effect), and now it is not ignored, so take a look and see if 
tinkering with its value changes anything.

Cheers
Oleg
On Feb 20, 2016, at 6:47 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:

Josh

The only change that ’s went and relevant to your issue is the fact that we’ve 
upgraded client libraries to Kafka 0.9 and between 0.8 and 0.9 Kafka introduced 
wire protocol changes that break compatibility.
I am still digging so stay tuned.

Oleg




Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Oleg Zhurakousky
The unfortunate part is that between 0.8 and 0.9 there are also breaking API 
changes on the Kafka side that would affect our code, so I say we probably need 
to start thinking more about versioning. In fact we are working on the concept 
of an extension registry, but what I am now suggesting is that versioning must 
come in isolation from the registry.


As far as Zookeeper, I simply pointed it out as one of the changes that were 
made, so I am glad it’s not affecting you. I guess that leaves the protocol 
incompatibilities.

Cheers
Oleg

On Feb 21, 2016, at 5:23 PM, Joe Witt 
<joe.w...@gmail.com> wrote:


Yeah the intent is to support 0.8 and 0.9.  Will figure something out.

Thanks
Joe




Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Oleg Zhurakousky
That is a good point Bryan, and it would allow Josh to use 0.5.0 while still 
using the older Kafka bundle.
On Feb 21, 2016, at 6:07 PM, Bryan Bende 
<bbe...@gmail.com> wrote:

I know this does not address the larger problem, but in this specific case, 
would the 0.4.1 Kafka NAR still work in 0.5.x?

If the NAR doesn't depend on any other NARs I would think it would still work, 
and could be a workaround for those that need to stay on Kafka 0.8.2.


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Oleg Zhurakousky
Jeff, what you are describing is in the works and actively being discussed:
https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
and
https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements

The last one may not directly speak to the “ExtensionRegistry”, but if you look 
through the comments there is a whole lot about it since it is dependent.
Feel free to participate, but I can say for now that it is slated for the 1.0 
release.

Cheers
Oleg

On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia 
> wrote:

Hi,

As my NiFi data flow becomes more and more serious, I need to put on Version 
Control. Since flow.xml.gz is generated automatically and it is saved in a 
compressed file, I am wondering what would be the best practice regarding 
version control?

Thanks,
Jeff

--
Data Bean - A Big Data Solution Provider in Australia.



Re: Running Processors Synchronously

2016-02-11 Thread Oleg Zhurakousky
Obaid

Thanks for reaching out!
Currently it is not possible to wire a flow the way you describe.
What you are asking for is a true Event Driven Consumer paradigm, which has been 
discussed internally a lot lately, so it would be very interesting to get your 
perspective as to why you believe it is important within your use case to have 
that. Please share, since it will help to bring this discussion to a resolution.

Having said that, in the current NiFi architecture you still get the serial 
processing that you are describing. The only difference is that every consumer 
(Processor) is independent from the internals of another, since there is a Queue 
separating each and every one of them. This allows each processor to be managed 
independently (e.g., start/stop) while letting the flow continue, knowing that 
messages will queue up if a processor is not available and will be processed as 
soon as the processor becomes available.

Anyway, your thoughts?

Cheers
Oleg

> On Feb 11, 2016, at 7:14 AM, obaidul karim  wrote:
> 
> Hi All,
> 
> Lets say, I have below processors:
> 
> 1.listfile > 2.fetchfile > 3.putHDFS > 4.ExecuteSQL
> 
> I want to run all above processors in sequence. 
> Processor 1 will wait for next run until all 2,3 & 4 completes. Similarly 2 
> will wait until 3 & 4 completes current run and so on.
> 
> In other words, can I schedule the entire workflow as single job and under 
> single schedule ?
> 
> 
> Thanks in advance.
> 
> -Obaid
> 
>  



Re: Nifi 0.50 and GetKafka Issues

2016-03-15 Thread Oleg Zhurakousky
Michael

Unfortunately I am relatively new to the ASF culture and so I am not sure if it 
would be appropriate/feasible for the ASF distribution of NiFi to support 
individual forks of any product/project (ASF gurus feel free to chime in). And 
with Kafka it became quite unfortunate, since there are many forks out there 
with varying levels of compatibility with the corresponding ASF version of 
Kafka. That said, the best we can do for Kafka is what’s described here 
https://issues.apache.org/jira/browse/NIFI-1629 and will go into the upcoming 
0.6 release. Hopefully that will bring some relief.

Cheers
Oleg

On Mar 15, 2016, at 1:21 PM, Michael Dyer 
<michael.d...@trapezoid.com> wrote:

Oleg,

It's part of the Cloudera distro.  Not sure of the lineage beyond that.  Here 
are a couple of links.

https://community.cloudera.com/t5/Data-Ingestion-Integration/New-Kafka-0-8-2-0-1-kafka1-3-1-p0-9-Parcel-What-are-the-changes/td-p/30506
http://archive.cloudera.com/kafka/parcels/1.3.1/

Michael


On Tue, Mar 15, 2016 at 12:01 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
Michael

What is KAFKA-0.8.2.0-1.kafka1.3.1.p0.9? I mean where can I get that build?
I guess based on the previous email we’ve tested our code with 3 versions of 
ASF distribution of Kafka and the above version tells me that it may be some 
kind of fork.

Also, we are considering downgrading the Kafka dependencies back to 0.8 and, as 
of 0.7, providing a new version of the Kafka processors that utilize the new 
Kafka 0.9 producer/consumer API.

Thanks
Oleg

On Mar 15, 2016, at 11:46 AM, Michael Dyer 
<michael.d...@trapezoid.com> wrote:

Joe,

I'm seeing a similar issue moving from 0.3.0 to 0.5.1 with 
KAFKA-0.8.2.0-1.kafka1.3.1.p0.9.

I can see the tasks/time counter increment on the processor but no flow data 
ever leaves the processor.  There are no errors shown in the bulletin board.  
The app log shows below (repeating).

The rename 0.4.1 nar to 0.5.1 nar trick (restart nifi) works, except that the 
'batch size' value does not seem to be honored.  I have my batch size set to 
1, but I'm seeing files written continually (every few seconds) with much 
smaller sizes.  I suspect this has to do with the `auto.offset.reset` value 
which defaults to `largest`.  From what I have read 'smallest' causes the 
client to start at the beginning which sounds like I would be retrieving 
duplicates.

Renaming 0.3.0 nar to 0.5.1 (restart nifi) restores the original behavior.

yBuffer([[netflow5,0], initOffset 297426 to broker 
BrokerEndPoint(176,n2.foo.bar.com,9092)] )
2016-03-15 07:45:17,237 WARN 
[ConsumerFetcherThread-b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-0-176]
 kafka.consumer.ConsumerFetcherThread 
[ConsumerFetcherThread-b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-0-176],
 Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest@11bd00f6. 
Possible cause: java.lang.IllegalArgumentException
2016-03-15 07:45:17,443 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-finder-thread]
 kafka.utils.VerifiableProperties Verifying properties
2016-03-15 07:45:17,443 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-finder-thread]
 kafka.utils.VerifiableProperties Property client.id is 
overridden to NiFi-b6c67ee3-aa9e-419d-8a57-84ab5e76c017
2016-03-15 07:45:17,443 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-finder-thread]
 kafka.utils.VerifiableProperties Property metadata.broker.list is overridden 
to n3.foo.bar.com:9092,n2.foo.bar.com:9092,n4.foo.bar.com:9092
2016-03-15 07:45:17,443 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-finder-thread]
 kafka.utils.VerifiableProperties Property 
request.timeout.ms is overridden to 3
2016-03-15 07:45:17,443 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-finder-thread]
 kafka.client.ClientUtils$ Fetching metadata from broker 
BrokerEndPoint(196,n4.foo.bar.com,9092) with 
correlation id 14 for 1 topic(s) Set(netflow5)
2016-03-15 07:45:17,443 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-finder-thread]
 kafka.producer.SyncProducer Connected to 
n4.foo.bar.com:9092 for producing
2016-03-15 07:45:17,444 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-finder-thread]
 kafka.producer.SyncProducer Disconnecting from 
n4.foo.bar.com:9092
2016-03-15 07:45:17,444 INFO 
[b6c67ee3-aa9e-419d-8a57-84ab5e76c017_nifi-1458053114390-d60f3cc0-leader-f

Re: Having on processor block while another one is running

2016-03-28 Thread Oleg Zhurakousky
Vincent

This sounds more like an architectural question, and even outside of NiFi, in 
order to achieve that, especially in a distributed environment, one would need 
some kind of coordination component. And while we can think of a variety of 
ways to accomplish that, I am not entirely convinced that this is the right direction.
Would you mind sharing a bit more about your use case and perhaps we can 
jointly come up with a better and hopefully simpler solution?

Cheers
Oleg

On Mar 28, 2016, at 6:45 PM, Vincent Russell 
> wrote:


I have two processors (that aren't  part of the same flow) that write to the 
same resource (a mongo collection) via a map reduce job.

I don't want both to run at the same time.

On Mar 28, 2016 6:28 PM, "Joe Witt" 
> wrote:
Vincent,

Not really and that would largely be by design.  Can you describe the
use case more so we can suggest alternatives or perhaps understand the
motivation better?

Thanks
Joe

On Mon, Mar 28, 2016 at 4:00 PM, Vincent Russell
> wrote:
>
> Is it possible to have one processor block while another specified processor
> is running (within the onTrigger method).
>
> I can do this on a non-clustered nifi with a synchronized block I guess, but
> i wanted to know if there was a more idiomatic way of doing this.
>
> Thanks,
> Vincent



Re: NiFi Supports for Kerberized Kafka Cluster

2016-04-07 Thread Oleg Zhurakousky
Not at the moment, unless you are using the HDP distribution of the Kafka 0.8 
broker, which has Kerberos support, while the ASF version of Kafka 0.8 doesn't. 
That said, we are working on a new pair of Kafka processors that will rely on 
the Kafka 0.9 API and broker, where such support is provided. It is slated for 
the 0.7 release, but it should be available in master some time next week if 
you'd like to try NIFI-1296.

Cheers 
Oleg 




Sent from my iPhone

> On Apr 7, 2016, at 11:39, indus well  wrote:
> 
> Hello All:
> 
> Does current version of NiFi support kerberized Kafka cluster? I am getting 
> time-out errors trying to use PutKafka and GetKafka processors. 
> 
> Please advise.
> 
> Thanks,
> 
> Indus


Re: NiFi Supports for Kerberized Kafka Cluster

2016-04-07 Thread Oleg Zhurakousky
Yep, just watch the JIRA for PR

Sent from my iPhone

On Apr 7, 2016, at 12:35, indus well 
<indusw...@gmail.com> wrote:

Thank you, Oleg. I'll be testing it out once it is available next week.

Regards,

Indus

On Thu, Apr 7, 2016 at 10:48 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
Not at the moment, unless you are using the HDP distribution of the Kafka 0.8 
broker, which has Kerberos support, while the ASF version of Kafka 0.8 doesn't.
That said, we are working on a new pair of Kafka processors that will rely on 
the Kafka 0.9 API and broker, where such support is provided. It is slated for 
the 0.7 release, but it should be available in master some time next week if 
you'd like to try NIFI-1296.

Cheers
Oleg




Sent from my iPhone

> On Apr 7, 2016, at 11:39, indus well 
> <indusw...@gmail.com> wrote:
>
> Hello All:
>
> Does current version of NiFi support kerberized Kafka cluster? I am getting 
> time-out errors trying to use PutKafka and GetKafka processors.
>
> Please advise.
>
> Thanks,
>
> Indus



Re: To help fulfill NIFI-513...

2016-03-23 Thread Oleg Zhurakousky
Russell

First, thank you for taking initiative!
Indeed we need to bring that JIRA to closure. Having said that, there is 
actually a link in our Contributor Guide on interactive debugging, 
https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-RunningNiFiinDebugmode,
which also links to a sample project that describes debugging in both Eclipse 
and IntelliJ in detail, https://github.com/olegz/nifi-ide-integration/, 
although it may need some TLC. Please go through it and see if you feel 
anything should be added.

Cheers
Oleg

On Mar 23, 2016, at 5:22 PM, Russell Bateman 
>
 wrote:

I needed to break down and debug a processor I'm working on. I found a JIRA 
issue a year old that's probably not been taken care of. I'd like to help.

https://issues.apache.org/jira/browse/NIFI-513

I wrote this quickie on how to set up IntelliJ or Eclipse to accomplish that. 
I'm using IntelliJ, but I'm a very old Eclipse guy and still active in the 
Eclipse forums. It's actually easier than I make it look here because I'm 
covering Eclipse as well as IntelliJ and I'm trying not to assume the person 
who follows will know too much about doing this or might never have done remote 
debugging before.

Hope this helps someone. Maybe someone in charge of the NiFi user doc could use 
this to resolve the JIRA issue.




Steps to debugging a NiFi processor you've written. See this article 
*
 to learn how remote debugging works in IntelliJ. The example uses IntelliJ to 
debug an application running in Tomcat, but it's a similar thing no matter if 
you're using Eclipse (or IntelliJ) to debug NiFi.

1. Edit /conf/bootstrap.conf/ and simply uncomment the line that starts
  out with java.arg.debug.
2. Create a Debug configuration.

  In IntelliJ IDEA, ...
   1. Do Run → Edit Configurations
   2. Click the green +.
   3. Choose Remote (because it's a remote debugging session you want
  to create).
   4. Give a Name:, something like "Local NiFi".
   5. Change the port to 8000 (this value must match the one in
  /conf/bootstrap.conf/).
   6. If you're only debugging a processor in a project with multiple
  modules, set the drop-down Search sources using module's
  classpath: to the module in which it lives.
  In Eclipse, do
   1. Do Run → Debug Configurations
   2. Choose Remote Java Application.
   3. Click on the New Launch Configuration icon (a tiny sheet of
  paper with a yellow plus sign in upper left of dialog).
   4. Give it a name like "Local NiFi".
   5. In Project:, type the name (or browse for it) of the project
  containing your processor code.
   6. Set the Port: to 8000 or whatever you established in
  /conf/bootstrap.conf/.
   7. Click Apply or, if you're ready, Debug.

3. Launch NiFi (or bounce it).
4. Set one or more breakpoints in your processor code.
5. Launch the debugger.

  In IntelliJ IDEA, do Run → Debug and choose "Local NiFi" (or
  whatever you called it) from the list presented. This brings up the
  debugger and displays, "Connected to the target VM, address
  'localhost:8000', transport: 'socket'".
  In Eclipse, do Run → Debug Configurations..., scroll down and choose
  "Local NiFi" or whatever you called it. What you see should give you
  warm fuzzies, something akin to what I'm reporting seeing in
  IntelliJ IDEA.

6. Prepare data to flow into your processor.
7. Start your processor; the IDE debugger should pop up stopped at the
  first breakpoint.
8. Debug away.

* 
http://blog.trifork.com/2014/07/14/how-to-remotely-debug-application-running-on-tomcat-from-within-intellij-idea/
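
For reference, the conf/bootstrap.conf entry from step 1 looks like the 
following (the exact agent flags may differ slightly between NiFi versions, so 
verify against your own copy rather than treating this as authoritative):

# shipped commented out:
#java.arg.debug=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000
# uncommented, so the JVM listens for a remote debugger on port 8000:
java.arg.debug=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000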



Re: Has anyone this error signature in PutKafka

2016-03-23 Thread Oleg Zhurakousky
Interesting question, and the only way I can answer it is that it shouldn’t, 
but I would need more context.

Oleg

> On Mar 23, 2016, at 6:27 PM, McDermott, Chris Kevin (MSDU - 
> STaTS/StorefrontRemote) <chris.mcderm...@hpe.com> wrote:
> 
> Thanks, Oleg.
> 
> FWIW the traceback seems to correspond to this line in PutKafka.java
> 
> final byte[] value = new byte[(int) flowFile.getSize()];
> 
> 
> Can the flow file size be negative?
> 
> 
> 
> On 3/23/16, 6:24 PM, "Oleg Zhurakousky" <ozhurakou...@hortonworks.com> wrote:
> 
>> Chris
>> 
>> Yes we have (can’t remember the details though). There was a lot of work 
>> around NiFi Kafka support lately and it somewhat culminated today. So. . .
>> 1. We’ve downgraded back to 0.8.2 
>> https://issues.apache.org/jira/browse/NIFI-1629. You can see details in JIRA
>> 2. We’ve done some refactoring in PutKafka to address this 
>> https://issues.apache.org/jira/browse/NIFI-1645 which was merged to the 
>> master few hours ago
>> 
>> The 0.6.0 release is imminent, so I’d suggest to either wait few days or if 
>> you can build from master that would be ideal and would help us tremendously 
>> with additional validation.
>> Let me know if you need any help with building.
>> 
>> Cheers
>> Oleg
>>> On Mar 23, 2016, at 6:07 PM, McDermott, Chris Kevin (MSDU - 
>>> STaTS/StorefrontRemote) <chris.mcderm...@hpe.com> wrote:
>>> 
>>> 2016-03-23 21:29:22,383 ERROR [Timer-Driven Process Thread-5] 
>>> o.apache.nifi.processors.kafka.PutKafka 
>>> PutKafka[id=f66e4092-946e-3338-93d6-7ea3a56bfd20] 
>>> PutKafka[id=f66e4092-946e-3338-93d6-7ea3a56bfd20] failed to process session 
>>> due to java.lang.NegativeArraySizeException: 
>>> java.lang.NegativeArraySizeException
>>> 2016-03-23 21:29:22,383 WARN [Timer-Driven Process Thread-5] 
>>> o.apache.nifi.processors.kafka.PutKafka 
>>> PutKafka[id=f66e4092-946e-3338-93d6-7ea3a56bfd20] Processor 
>>> Administratively Yielded for 1 sec due to processing failure
>>> 2016-03-23 21:29:22,383 WARN [Timer-Driven Process Thread-5] 
>>> o.a.n.c.t.ContinuallyRunProcessorTask Administratively Yielding 
>>> PutKafka[id=f66e4092-946e-3338-93d6-7ea3a56bfd20] due to uncaught 
>>> Exception: java.lang.NegativeArraySizeException
>>> 2016-03-23 21:29:22,392 WARN [Timer-Driven Process Thread-5] 
>>> o.a.n.c.t.ContinuallyRunProcessorTask
>>> java.lang.NegativeArraySizeException: null
>>>   at 
>>> org.apache.nifi.processors.kafka.PutKafka.onTrigger(PutKafka.java:451) 
>>> ~[na:na]
>>>   at 
>>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1146)
>>>  ~[nifi-framework-core-1.1.1.0-12.jar:1.1.1.0-12]
>>>   at 
>>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:139)
>>>  [nifi-framework-core-1.1.1.0-12.jar:1.1.1.0-12]
>>>   at 
>>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:49)
>>>  [nifi-framework-core-1.1.1.0-12.jar:1.1.1.0-12]
>>>   at 
>>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:119)
>>>  [nifi-framework-core-1.1.1.0-12.jar:1.1.1.0-12]
>>>   at 
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
>>> [na:1.8.0_45]
>>>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
>>> [na:1.8.0_45]
>>>   at 
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>>  [na:1.8.0_45]
>>>   at 
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>>  [na:1.8.0_45]
>>>   at 
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>  [na:1.8.0_45]
>>>   at 
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>  [na:1.8.0_45]
>>>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
>>> 
>>> I’m running HDF 1.1.1.0-12 (not sure which NiFi version that maps to) and 
>>> Kafka 0.9.0
>>> 
>>> Thanks,
>>> Chris
>> 
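
One way the line quoted above can throw: FlowFile.getSize() returns a long, and 
narrowing a value larger than Integer.MAX_VALUE to int wraps it negative, so the 
array allocation fails exactly as in the stack trace. A minimal standalone 
sketch (the size below is a made-up value, not taken from the report):

public class CastDemo {
    public static void main(String[] args) {
        long flowFileSize = 3_000_000_000L;  // hypothetical ~3 GB flow file
        int narrowed = (int) flowFileSize;   // narrowing cast wraps to -1294967296
        System.out.println(narrowed);
        byte[] value = new byte[narrowed];   // throws java.lang.NegativeArraySizeException
    }
}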



Re: Having on processor block while another one is running

2016-03-30 Thread Oleg Zhurakousky
Vincent

Sorry for the late reply, but here it is

Based on what you have described it appears you have a mix of two problems: 
Work Flow Orchestration and Data Flow.
The main issue is that at the surface it’s not always easy to tell the 
difference, but I’ll try.

Work Flow Orchestration allows one to orchestrate a single process by breaking 
it down into a set of individual components (primarily for simplicity and 
modularization) and then composing such components into one cohesive process.
Data Flow manages individual processes, their lifecycle, execution, input and 
output from a central command/control facility.

So with the above in mind I say you have a mixed problem, where you have a Data 
Flow consisting of simple and complex processors. And it is those complex 
processors - the ones that need to invoke the MR job and then act on its 
result - that fall into the category of Work Flow Orchestration, where 
individual components within such a process must work with awareness of the 
overall process they represent.  For 
example:

GetFile (NiFi)
PutHDFS (NiFi)
Process (NiFi Custom Processor) - where you execute the MR Job and react to its 
completion (success or failure) and possibly put something on the output queue 
in NiFi so the next element of Data Flow can kick in (see the sketch below).
. . .

So, the 3 elements of Data Flow above are the individual NiFi Processors, yet 
the 3rd one internally represents a complex and orchestrated process. Now, 
orchestration is just a term, and without relying on outside frameworks that 
specifically address orchestration it would be just a lot of custom code. 
Thankfully NiFi provides support for a Spring Application Context container, 
which allows you to implement your NiFi processor using work flow orchestration 
frameworks such as Spring Integration and/or Apache Camel.

I’d be more than willing to help you further with that if you’re interested, 
but wanted to see how you feel about the above architecture. I am also working 
on a blog post to describe exactly that and how Data Flow and Work Flow can 
complement one another.

Let me know
Cheers
Oleg

On Mar 29, 2016, at 9:26 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:

Vincent

I do have a suggestion for you but need a bit more time to craft my response. 
Give me till tonight EST.

Cheers
Oleg
On Mar 29, 2016, at 8:55 AM, Vincent Russell 
<vincent.russ...@gmail.com> wrote:

Thanks Oleg and Joe,

I am not currently convinced that nifi is the solution as well, but it is a 
nice way for us to manage actions based on the result of a mapreduce job.

Our use case is to have follow-on processors that perform actions based on the 
results of the map reduce jobs.  One processor kicks off the M/R process and 
then the results are sent down the flow.

The problem with our current scenario is that we have two separate flows that 
utilize the same location as the output for the M/R jobs.

One simple way might be to use mongo itself as a locking mechanism.
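
Following up on the flow outlined earlier in this thread, a hedged skeleton of 
the "Process" step is shown here. The processor name and the job-submission 
call are hypothetical placeholders; only the surrounding processor API is the 
standard NiFi one:

import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

public class RunMapReduceJob extends AbstractProcessor {

    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success").description("The MR job completed").build();
    static final Relationship REL_FAILURE = new Relationship.Builder()
            .name("failure").description("The MR job failed").build();

    @Override
    public Set<Relationship> getRelationships() {
        final Set<Relationship> rels = new HashSet<>();
        rels.add(REL_SUCCESS);
        rels.add(REL_FAILURE);
        return Collections.unmodifiableSet(rels);
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return; // nothing queued yet
        }
        // orchestrated part: submit the job and wait (or poll) for completion
        final boolean succeeded = runJobAndWait(flowFile);
        // react to completion so the next element of the Data Flow can kick in
        session.transfer(flowFile, succeeded ? REL_SUCCESS : REL_FAILURE);
    }

    private boolean runJobAndWait(FlowFile flowFile) {
        // hypothetical placeholder for the MR submission/monitoring logic
        return true;
    }
}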







Re: Default termination for relationships

2016-04-24 Thread Oleg Zhurakousky
Manish

That is an interesting suggestion. I personally don't see issues with it and 
generally do believe it would improve the user experience, but I am interested 
in what others think.

Cheers
Oleg

Sent from my iPhone

On Apr 24, 2016, at 08:46, Manish Gupta 8 
> wrote:

Hi,

Does it make sense to keep all the out-flowing relationships auto-terminated by 
default when a new processor is dragged in? When the user connects the processor 
and specifies a relationship, only the selected one becomes non-terminating.

I think this will be good from usability point of view.

Thanks,
Manish




Re: Help Wanted

2016-05-03 Thread Oleg Zhurakousky
Andrew

Regarding PR vs. Patch.

This has been an ongoing discussion and I’ll let others contribute to this. 
Basically we support both. That said, personally (and it appears to be embraced 
by the rest of the community) the PR is the preference, specifically due to the 
inline review/comment capabilities provided by GitHub.

Cheers
Oleg
 
> On May 3, 2016, at 11:18 AM, Andrew Psaltis <psaltis.and...@gmail.com> wrote:
> 
> Thank you Oleg!
> 
> Yeah, that page with the Code Review has a little refresh link, but it
> really just points to this JIRA query:
> https://issues.apache.org/jira/browse/NIFI-1837?filter=12331874
> 
> As a community is there a preference given to JIRA's with Patch or GH PR's
> or are they all treated with the same priority?
> 
> Thanks,
> Andrew
> 
> On Tue, May 3, 2016 at 11:12 AM, Oleg Zhurakousky <
> ozhurakou...@hortonworks.com> wrote:
> 
>> Andrew
>> 
>> Thank you so much for following up on this.
>> I am assuming you have GitHub account. If not please create one as most of
>> our contributions deal with pull requests (PR).
>> Then you can go to https://github.com/apache/nifi , click on “Pull
>> Requests” and review them by commenting in line (you can see plenty of
>> examples there of PRs that are already in review process).
>> 
>> I would also suggest to get familiar with Contributor’s guideline for NiFi
>> - https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide. But
>> it appears you have already done so, and I think there may be a small
>> discrepancy in the link you provided, or maybe it is not as dynamic.
>> In any event JIRA and GitHub are good resources to use.
>> 
>> As for the last question, the best case scenario is both (code review and
>> test). Having said that we do realize that your time and the time of every
>> contributor may be limited, so I say do whatever you can. Sometimes a quick
>> code scan can uncover the obvious that doesn’t need testing.
>> 
>> Thanks again
>> Cheers
>> Oleg
>> 
>> On May 3, 2016, at 11:07 AM, Andrew Psaltis <psaltis.and...@gmail.com>
>> wrote:
>> 
>> Oleg,
>> I would love to help -- couple of quick questions:
>> 
>> The GH PR's are ~60 as you indicated, but the How To Contribute guide (Code
>> review process --
>> 
>> https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-CodeReviewProcess
>> ) shows a JIRA list with patches available.
>> 
>> Which should be reviewed first? For the PR's on GH are you just looking for
>> code review or same process of apply local merge and test?
>> 
>> Thanks,
>> Andrew
>> 
>> 
>> 
>> --
>> Thanks,
>> Andrew
>> 
>> 
>> 
> 
> 
> -- 
> Thanks,
> Andrew
> 
> Subscribe to my book: Streaming Data <http://manning.com/psaltis>
> <https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
> twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>



New Kafka API support (0.9+)

2016-05-03 Thread Oleg Zhurakousky
As some of you know, we are in the process of providing a set of new Processors 
that use the new Kafka API (0.9+). This is a new NAR and will live for a while 
alongside the old Kafka NAR.
The new Processors are called PublishKafka and ConsumeKafka (specifically to 
emphasize the new consumer API provided by Kafka).

The current PR https://github.com/apache/nifi/pull/366 is in the review 
process, and even though we’ll do our best to merge it to the trunk as quickly 
as possible, I would encourage all Kafka aficionados to try it earlier by 
building NiFi from this branch.

The commit message provides all of the details included in the commit, and I 
have to warn you that aside from the new Kafka API support it addresses several 
more issues, including some of the rework of the old Kafka processors.

Feel free to comment on the PR or use mailing lists with questions/concerns

Given the popularity of Kafka I felt compelled to make all of you aware of this 
work as early as possible.

Cheers
Oleg
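
For those who have not yet looked at the new client, here is a minimal sketch 
of the 0.9+ consumer API that ConsumeKafka builds on (broker address, group id 
and topic are placeholders). Note there is no ZooKeeper connect string - the 
new API talks to the brokers directly:

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NewConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");  // brokers, not ZooKeeper
        props.put("group.id", "nifi-example");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Arrays.asList("my-topic"));
            while (true) {
                // poll returns whatever records arrived since the last call
                ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    System.out.printf("offset=%d value=%d bytes%n",
                            record.offset(), record.value().length);
                }
            }
        }
    }
}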



Re: Nifi 0.6.1 PutKafka error

2016-05-02 Thread Oleg Zhurakousky
Ralf

While we are indeed working on it at the moment, please do file an enhancement 
request with all the details. It’s always good for tracking purposes.

Cheers
Oleg

> On May 2, 2016, at 10:25 AM, Perko, Ralph J  wrote:
> 
> Thanks for the update.  Is there any discussion around decoupling incoming
> message delimiters from outgoing batching?  I believe this is how it
> worked up to 0.4.x.  Should I make an enhancement request?
> 
> Thanks,
> Ralph
> 
> 
> 
> On 4/29/16, 1:02 PM, "Joe Witt"  wrote:
> 
>> Ralph,
>> 
>> Possibly related to https://issues.apache.org/jira/browse/NIFI-1827.
>> Clearly something to get sorted out promptly.
>> 
>> Thanks
>> Joe
>> 
>> On Fri, Apr 29, 2016 at 2:45 PM, Perko, Ralph J 
>> wrote:
>>> Hi
>>> 
>>> We are using Kafka as our messaging backbone and have been using Nifi
>>> since 0.3.x - it has been a real game changer - thanks for the work!
>>> 
>>> In 0.6.1 When we set concurrent to anything greater than 1 and partition
>>> strategy to "round robin" we are getting an invalid partition error:
>>> 
>>> 2016-04-29 17:44:31,536 ERROR [Timer-Driven Process Thread-6]
>>> o.apache.nifi.processors.kafka.PutKafka
>>> PutKafka[id=67a1e471-1548-407c-bedc-a2a6212c2458]
>>> PutKafka[id=67a1e471-1548-407c-bedc-a2a6212c2458] failed to process
>>> session
>>> due to java.lang.IllegalArgumentException: Invalid partition given with
>>> record: 165 is not in the range [0...10].:
>>> java.lang.IllegalArgumentException: Invalid partition given with
>>> record: 165
>>> is not in the range [0...10].
>>> 
>>> Setting the partition strategy to “random” seems to clear up this issue.
>>> 
>>> It also seems batching is not working as it did in previous versions.
>>> Poking around in the code and comparing the PutKafka in 0.4.1 to 0.6.1
>>> it
>>> appears the incoming message delimiter is now tied to outgoing
>>> batching, but
>>> from what I can tell this is not the case in 0.4.1.
>>> 
>>> PutKafka.java:438
>>> 
>>> if (context.getProperty(MESSAGE_DELIMITER).isSet()) {
>>>properties.setProperty("batch.size",
>>> context.getProperty(BATCH_NUM_MESSAGES).getValue());
>>> } else {
>>>properties.setProperty("batch.size", "1");
>>> }
>>> 
>>> We have many single small messages coming in and do not need the message
>>> delimiter but still want to batch.
>>> 
>>> 
>>> It could very well be I am misunderstanding how this works.  Should I be
>>> using it differently?  Should I drop in the 0.4.1 kafka nar to get the
>>> batching behavior we are looking for?
>>> 
>>> Thanks,
>>> Ralph
>>> 
> 
> 



Help Wanted

2016-05-03 Thread Oleg Zhurakousky
Guys

I’d like to use this opportunity to address all members of the NiFi community 
hence this email is sent to both mailing lists (dev/users)

While somewhat skeptical when I started 6 months ago, I have to admit that now I 
am very excited to observe the growth and adoption of Apache NiFi, and say 
that in large part it’s because of the healthy community that we have here - 
committers and contributors alike, representing a variety of business domains.
This is absolutely great news for all of us and I am sure some if not all of 
you share this sentiment. 

That said and FWIW we need help!
While it’s great to wake up every morning to a set of new PRs and patches, we 
now have a bit of a backlog. In large part this is due to the fact that most of 
our efforts are spent in development as we all try to grow the NiFi feature 
base. However we need to remember that PRs and patches will remain as they are 
unless and until they are reviewed/agreed to be merged by this same community, 
and that is where we need help. While “merge" responsibilities are limited to 
“committers”, “review” is the responsibility of every member of this community, 
and I would like to ask you, if at all possible, to redirect some of your 
efforts to this process.
We currently have 61 outstanding PRs, and this particular development cycle is a 
bit more complex than the previous ones since it addresses the 0.7.0 and 1.0.0 
releases in parallel (so a different approach to breaking changes, if any, etc.).

Cheers
Oleg



Re: Force Stop a Processor

2016-07-22 Thread Oleg Zhurakousky
Also, if you care to share the code for your processor we’ll be happy to look 
and help you find the problem.
As Joe pointed out, we make every possible attempt to stop it, but if your 
custom code is blocking and does not react to interrupts then there is not much 
we can do, stopping short of Thread.stop (which is something we are reluctantly 
contemplating).

Cheers
Oleg

> On Jul 22, 2016, at 3:47 PM, Joe Witt  wrote:
> 
> Devin,
> 
> In the more recent versions of NiFi attempts to stop the processor
> which are not successful after some period of time (I think 30
> seconds) should result in the process being killed if possible and
> NiFi should then move on again.
> 
> What version are you on?
> 
> Thanks
> Joe
> 
> On Fri, Jul 22, 2016 at 3:32 PM, Devin Fisher
>  wrote:
>> Mostly a general questions.  The processor in question is a custom made one
>> that is obviously misbehaving. I'm looking into why the process doesn't stop
>> like like normal but I was wondering if there was a way to "force" stop a
>> process (kill the processor)? I know I can restart the nifi process to stop
>> the process but I was hoping for a way that was a little less drastic.
>> 
>> Devin
> 
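
A small generic-Java illustration of the distinction drawn above (this is not 
NiFi framework code): blocking code that propagates the interrupt can be 
stopped cleanly, while code that swallows it cannot.

import java.util.concurrent.BlockingQueue;

public class InterruptAwareWork {
    // Stoppable: take() blocks, but throws InterruptedException when the
    // framework interrupts the thread, so the loop exits promptly.
    static void drain(BlockingQueue<String> queue) {
        try {
            while (true) {
                String item = queue.take();
                process(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and return
        }
    }

    static void process(String item) {
        // placeholder for a bounded unit of work
    }
}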



Re: export from Teradata

2016-07-20 Thread Oleg Zhurakousky
Just to add to what Mark said, perhaps you can raise a JIRA (as a new feature) 
to add an optional property to the processor config, where such property would 
point to an additional class path (to support your case and many others).
We are doing it with a few processors already.

Cheers
Oleg

On Jul 20, 2016, at 8:37 AM, Mark Payne 
> wrote:

Hi Dima,

That will work, however it is very dangerous to put jar files into NiFi's lib/ 
directory. We generally recommend
only putting NAR's into the lib/ directory, as any jar that is there will be 
inherited by all NARs and all classloaders.
So you may end up getting some really funny results from other NARs if any of 
the classes in those jars conflicts
with the classes in one of the NARs.

The preferred approach is to create the uber jar, as described by Anuj, but NOT 
put it into the NiFi lib/ directory.
Instead, it should be placed outside of the lib/ directory and then referenced 
by the JAR URL property.

Thanks
-Mark


On Jul 20, 2016, at 4:10 AM, Dima Fadeyev 
> wrote:

Just in case someone has the same problem. Copying both teradata jars to nifi's 
lib directory and leaving the "jar url" blank solved the problem for us. I was 
able to start exporting data from Teradata.

Best regards,
Dima

On Thu, Jul 14, 2016 at 4:34 PM, Anuj Handa 
> wrote:
Hi Dima,

You will have to create an über jar from the JDBC drivers provided by Teradata 
and copy the uberjar into the lib folder of nifi.

As Matt pointed out the instructions are in the email thread. You can refer 
them.

Anuj

> On Jul 14, 2016, at 9:35 AM, Matt Burgess 
> > wrote:
>
> Dima,
>
> There was a discussion on how to get the SQL processors working with
> Teradata a little while ago:
>
> http://mail-archives.apache.org/mod_mbox/nifi-users/201605.mbox/%3CCAEXY4srXZkb2pMGiOFGs%3DrSc_mHCFx%2BvjW32RjPhz_K1pMr%2B%2Bg%40mail.gmail.com%3E
>
> Looks like it involves making a fat JAR to include the Teradata driver
> and all its dependencies, since the DBCPControllerService asks for a
> single JAR containing the driver class
>
> Regards,
> Matt
>
>> On Thu, Jul 14, 2016 at 9:26 AM, Dima Fadeyev 
>> > wrote:
>> Hello, everyone,
>>
>> I'm new to nifi and this mailing list, I'm evaluating if we could extract
>> data (tables) from Teradata to local fs or HDFS with nifi. Is that possible?
>>
>> What I've done so far was creating an "Execute SQL" processor with the query
>> and the a database connection pooling service with the following
>> configuration:
>>
>> connection url:  jdbc:teradata://teradata.host/database=mydb
>> class name: com.teradata.jdbc.TeraDriver
>> jar url: file:///root/nifi-0.7.0/lib/terajdbc4.jar
>> ...
>>
>> I'm seeing this error in the log file:
>>
>> org.apache.nifi.processor.exception.ProcessException:
>> org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of
>> class 'com.teradata.jdbc.TeraDriver' for connect URL '
>> jdbc:teradata://teradata.host/database=mydb'
>>at
>> org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:225)
>> ~[na:na]
>> ...
>> Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC
>> driver of class 'com.teradata.jdbc.TeraDriver' for connect URL '
>> jdbc:teradata://teradata.host/database=mydb'
>> ...
>> Caused by: java.sql.SQLException: No suitable driver
>>
>> It looks like nifi can't find or load the driver, even though the jar is
>> located in /root/nifi-0.7.0/lib/terajdbc4.jar
>>
>> Please, help me resolve this.
>> Thanks in advance.
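
Putting Mark's and Anuj's advice together, the resulting DBCPConnectionPool 
configuration would look roughly like this (the shaded-jar path is 
illustrative; the point is that the uber jar lives outside NiFi's lib/ 
directory and is referenced via the JAR URL property):

connection url: jdbc:teradata://teradata.host/database=mydb
class name: com.teradata.jdbc.TeraDriver
jar url: file:///opt/jdbc-drivers/teradata-uber.jar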





Re: How to send the success status to GetFile processor?

2017-02-02 Thread Oleg Zhurakousky
Prabhu

Not sure I fully understand.
While indeed GetFile does not allow for an incoming connection, it does allow 
for your use case to happen indirectly by monitoring a predefined directory. 
So, one PG finishes and produces a file that is being put into a directory 
monitored by another PG’s GetFile.

Am I missing something?

Cheers
Oleg

On Feb 2, 2017, at 5:48 AM, prabhu Mahendran 
> wrote:

Consider the below scenario:
ProcessGroupA->ProcessGroupB

My ProcessGroupA ends with an ExecuteProcess processor that runs a console 
application and saves the result into a directory. In ProcessGroupB, I will 
process each file in the saved directory using a GetFile processor.

Once ProcessGroupA is completed I want to run ProcessGroupB, which starts 
with the GetFile processor. Since the GetFile processor doesn't have an 
upstream connection, I couldn't run the flow here. How can I send the success 
status to the GetFile processor?

Note: Since I don't know the filename, the FetchFile processor is not suitable 
for my case.



Re: How to send the success status to GetFile processor?

2017-02-03 Thread Oleg Zhurakousky
Prabhu

I must be missing something. Why do you need to know the file name in advance? 
What I mean is you will know it eventually, as soon as GetFile picks it up 
from the pre-configured directory, and it’s going to be in the ‘filename’ 
attribute, among many other attributes that should give you plenty of 
information about it.

Keep in mind, I am basing my response on the use case you’ve described in 
your initial post, which is a very common business problem, and the solution we 
are proposing is also a very common solution; while we do indeed have a 
FetchFile processor I don’t yet see how it would be beneficial in such a use case.

Cheers
Oleg

On Feb 3, 2017, at 12:13 AM, prabhu Mahendran 
<prabhuu161...@gmail.com> wrote:

Oleg,

Thanks for your response.

Is it possible to use a directory in FetchFile or any other processor? I don't 
know the name of the file that is stored inside the directory.

${absolute.path}\${filename}

Note: the GetFile processor doesn't have upstream connections, but I need a 
processor which fetches a file inside a directory using the directory name, 
without giving the filename.

Many thanks

On Thu, Feb 2, 2017 at 7:37 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
Prabhu

Not sure I fully understand.
While indeed GetFile does not allow for an incoming connection, it does allow 
for your use case to happen indirectly by monitoring a predefined directory. 
So, one PG finishes and produces a file that is being put into a directory 
monitored by another PG’s GetFile.

Am I missing something?

Cheers
Oleg

On Feb 2, 2017, at 5:48 AM, prabhu Mahendran 
<prabhuu161...@gmail.com<mailto:prabhuu161...@gmail.com>> wrote:

Consider the below scenario:
ProcessGroupA->ProcessGroupB

My ProcessGroupA ends with an ExecuteProcess processor that runs a console 
application and saves the result into a directory. In ProcessGroupB, I will 
process each file in the saved directory using a GetFile processor.

Once ProcessGroupA is completed, I want to run ProcessGroupB, which starts 
with the GetFile processor. Since the GetFile processor doesn't accept an 
upstream connection, I couldn't chain the flows here. How can I send the 
success status to the GetFile processor?

Note: since I don't know the filename, the FetchFile processor is not suitable 
for my case.





Re: Dynamically set the RabbitMQ routing key

2017-01-25 Thread Oleg Zhurakousky
Brian

You are correct; at the moment it is somewhat static due to the fact that, for 
the routing key, ‘expressionLanguageSupported’ defaults to false. We can 
definitely make that change, which would allow you to use NiFi Expression 
Language to make it dynamic.
Please raise a JIRA ticket - https://issues.apache.org/jira/browse/NIFI - so we 
can take care of it.
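
For reference, the change being discussed is essentially a flag on the processor’s property descriptor plus an evaluation against the incoming flow file. A sketch, assuming the NiFi 1.x builder API (the names here are illustrative, not the actual PublishAMQP source):

static final PropertyDescriptor ROUTING_KEY = new PropertyDescriptor.Builder()
        .name("Routing Key")
        .required(true)
        .expressionLanguageSupported(true) // currently defaults to false, hence the static behavior
        .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
        .build();

// ...and at publish time, resolve it per flow file:
String routingKey = context.getProperty(ROUTING_KEY)
        .evaluateAttributeExpressions(flowFile).getValue();

With that in place, a value such as ${amqp.routing.key} would be evaluated for each flow file.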

Cheers
Oleg

On Jan 25, 2017, at 2:08 PM, BD International 
> wrote:

Hello,

I am currently using NiFi 1.0.0 to collect a number of different XML data types 
and publish them onto RabbitMQ.

What I would like to do is dynamically set the RabbitMQ routing key based on a 
field within the data, but at the moment I can only set the routing key at 
configuration time.

Can anyone suggest how I could go about doing this?

Ideally I would like to be able to use the NiFi Expression Language to set the 
routing key and exchange for the PublishAMQP processor.

Thanks

Brian



Re: Expression language and UTC millis

2017-02-22 Thread Oleg Zhurakousky
Adam,

I think if my math is correct, what you're seeing is expected, since your local 
offset is -07:00 and 2017-02-22T23:01:23 + 7 hours = 2017-02-23T06:01:23.

Am I missing something?
Cheers
Oleg
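
If the intent is to always interpret the incoming timestamp as UTC regardless of the server’s zone, one option (a sketch, assuming a NiFi version whose toDate() accepts the optional time-zone argument) is to pin the parse to GMT:

value: ${timestamp:toDate("yyyy-MM-dd'T'HH:mm:ss'Z'", "GMT"):toNumber()}

Since the 'Z' in the format string is a quoted literal rather than a zone designator, the single-argument form parses the date in the JVM’s default time zone, which matches the -07:00 discrepancy described below.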

On Feb 22, 2017, at 6:42 PM, Adam Lamar 
> wrote:

Hi,

I recently noticed some issues with time parsing in the expression language.

On my Linux server configured with UTC time, using UpdateAttribute to convert a 
timestamp value to millis works as expected.

Attribute name: timestamp
Sample value: 2017-02-22T23:01:23Z

UpdateAttribute is configured with the following property:

name: timestamp_millis
value: ${timestamp:toDate("yyyy-MM-dd'T'HH:mm:ss'Z'"):toNumber()}

For the above sample value, I get 1487804483000 as timestamp_millis.

epochconverter.com converts timestamp_millis back 
to the same timestamp provided above.

On my local OSX dev system with a UTC offset of -07:00, the same timestamp 
input yields timestamp_millis as 1487829683000, or the equivalent of 
2017-02-23T06:01:23Z.

Is my expression language incorrect, or is something else wrong?

Adam



Re: Need to migrate the NiFi 1.0.0 nar file to NiFi 1.1.1

2017-02-16 Thread Oleg Zhurakousky
Manojkumar

The short answer is “yes”, it is possible, since a NAR is by definition a 
self-sustaining (resources, class path, etc.) package.
However, I am a bit confused about the UI changes you are mentioning and can’t 
seem to find a correlation to NARs in general. Are you saying you have changed 
some UI code for NiFi in a NAR that contains such code?

Cheers
Oleg
> On Feb 16, 2017, at 4:00 AM, Manojkumar Ravichandran  
> wrote:
> 
> Hi Team,
> 
> I have made some UI changes in the nifi 1.0.0 framework nar file and am working on 
> it. Now I want to migrate my changes to nifi version 1.1.1.
> 
> Is it possible to reflect the changes on the NiFi 1.1.1 package by just replacing 
> the older nar file in it, or do I need to make all my changes again in my 
> new package?
> 
> 
> Regards,
> Manojkumar R 
> 
> 
> 



Re: How to print flow

2017-01-19 Thread Oleg Zhurakousky
Alessio

Outside of a screenshot I am not sure you have many options, at least at the 
moment.
Printing something like a flow is more complicated than it may seem at first, 
due to formatting issues: landscape or portrait, the paper size, etc., and what 
if the flow doesn’t fit? Should it get auto-resized or spread across multiple 
pages when you are trying to print a large flow?
On top of that, each flow in NiFi may, and often does, use components that are 
not visible unless specifically accessed. For example, a flow may contain local 
and/or remote process groups, ControllerServices, etc., which aren’t visible 
when looking at the flow (i.e., ControllerServices). The same goes for a 
process group, which is just a box, yet when you click on it, it opens up 
another flow, and so on.

Anyway, I know this is not much help, but as you can see it needs more thought 
put into it ;)

Cheers
Oleg

On Jan 19, 2017, at 3:27 AM, Alessio Palma 
> wrote:

Hello all,
has anybody found a way to print a workflow?

I.e., a tool to convert the flow into another format which is readable by other 
software with printing support.



Re: ExecuteStreamCommand Zombie

2017-03-02 Thread Oleg Zhurakousky
Olav

The fact that the processor does not detect the dead process that it initiated 
does appear to be a bug. Would you mind raising a JIRA - 
https://issues.apache.org/jira/browse/NIFI - so we can take a look?

Cheers
Oleg
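
For what it’s worth, a detection of this kind could look roughly like the following (a sketch, not the actual ExecuteStreamCommand code; waitFor(timeout, unit) and destroyForcibly() are standard Java 8 Process APIs, and timeoutSeconds is an assumed configuration value):

// Instead of blocking indefinitely on the child process, poll with a timeout
// so a killed or hung process can be detected and the task can fail cleanly.
if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
    process.destroyForcibly();
    throw new IOException("External command did not finish within " + timeoutSeconds + " seconds");
}
int exitCode = process.exitValue();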
On Mar 1, 2017, at 7:29 PM, Olav Jordens 
> 
wrote:


Hi Nifi lovers,

I have just seen this situation for the first time. I run a .sh script using 
ExecuteStreamCommand which usually takes a few minutes. Today, I noticed that 
the resulting flow had not produced anything for almost a day. Then I checked 
on my Linux box and saw that the .sh script had been running for almost a day. 
Something must have been wrong here, so I killed the process from the command 
line. However, the Nifi processor will not stop. I eventually copied and 
replaced the components of the flow and all is working again, but the zombie 
still won’t die in Nifi. I am sure that if I restart the Nifi process, all will 
be resolved, but thought that you may wish to see this screenshot, and consider 
whether ExecuteStreamCommand could detect the process being killed and then 
stop running. Here is an image of a portion of the flow – on the left the 
zombie, and on the right, the copy-pasted components working fine.



Thanks!
Olav



Olav Jordens
Senior ETL Developer
Two Degrees Mobile Limited
===
(M) 022 620 2429
(P) 09 919 7000
www.2degreesmobile.co.nz

Two Degrees Mobile Limited | 47-49 George Street | Newmarket | Auckland | New 
Zealand |
PO Box 8355 | Symonds Street | Auckland 1150 | New Zealand | Fax +64 9 919 7001


   







Re: Best Practice for backing up NiFi Flows

2016-09-14 Thread Oleg Zhurakousky
Just to add to what James has already stated, templates have also been polished 
quite a bit to be more deterministic and source-control friendly, so they are an 
ideal artifact to be versioned and kept in a source control repo.

Cheers
Oleg

On Sep 14, 2016, at 3:02 PM, James Wing 
> wrote:

Manish, you are absolutely right to back up your flow.xml.gz and conf files.  
But I would carefully distinguish between using these backups to recreate an 
equivalent new NiFi, versus attempting to reset the state of your existing 
NiFi.  The difference is the live data in your flow, in the provenance 
repository, in state variables, etc.  Restoring a flow definition that no 
longer matches your content and provenance data may have unexpected results for 
you, and for systems connecting with NiFi.  NiFi does try hard to handle these 
changes smoothly, but it isn't a magic time machine.

Deploying flow.xml.gz can work, especially when deployed with conf files that 
reference IDs in the flow (like authorizations.xml), or the 
nifi.sensitive.props.key setting, etc.  But if you overwrite a running flow, 
you still have the data migration problem.

Templates are the current recommended best practice for deployment.  As I 
understand it, templates provide:

1.) Concise packaging for deployment
2.) Separation between site-specific configuration like authorizations from the 
flow logic
3.) A workflow that allows, encourages, and even forces the administrator to 
address migration from the existing flow to incorporate the new template

Personally, I think it centers on acceptance or rejection of the 
command-and-control model, which is controversial and different from most other 
systems.  Templates fit within command-and-control, overwriting flow.xml.gz 
suggests a different model.

I know there are many other opinions on this.

Thanks,

James

On Tue, Sep 13, 2016 at 1:30 PM, Manish Gupta 8 
> wrote:
Hello Everyone,

Is there a best practice for keeping a backup of all the data flows we are 
developing in NiFi?

Currently we take a copy of flow.xml.gz every hour and keep it in backup folder 
(also in our source control). Also, we keep a copy of all Config files in 
source control.


• We are assuming that, using flow.xml.gz and the config files, we will be 
able to restore NiFi in case of any failure or if someone makes a 
mistake. Is this assumption correct? Is there a better way to deal with this?

• When we move to production (or some other environment), will it be as 
simple as dropping flow.xml.gz into a new NiFi installation on the NCM along with 
making some environment-related changes? Or should we use templates on Dev, 
and import them on Prod?

Thanks,
Manish





Re: Create NiFi Templates

2016-09-28 Thread Oleg Zhurakousky
Ashish

As Joe pointed out, while “possible”, at the moment we don’t expose a direct 
public API to accomplish that. When I say “possible” I am of course referring 
to the internal API that is used by NiFi (the NiFi UI, that is) to wire up flows, 
but at the moment it is neither public nor the preferred approach.
That said, I am still wondering “what” the issue is that you are experiencing. 
I mean, you are clearly describing what you want to do, but I am missing the 
“why” part.
Please don’t take it the wrong way. . . You may very well have legitimate 
reasons to do what you need to do. All we are trying to do is understand 
those reasons, so we in the NiFi community can determine if it is indeed a 
missing yet valuable feature we can/should put on the road map.

Cheers
Oleg
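
For the download-and-store part of this, the REST API Joe mentions below can be driven from any HTTP client. A sketch in plain Java (hypothetical host and template id; the endpoint path shown is the NiFi 1.x one, while on 0.7.x templates live under /nifi-api/controller/templates instead):

import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class TemplateDownload {
    public static void main(String[] args) throws Exception {
        String templateId = "your-template-id"; // hypothetical
        URL url = new URL("http://localhost:8080/nifi-api/templates/" + templateId + "/download");
        try (InputStream in = url.openStream()) {
            // Store the template XML locally, e.g. to check into source control
            Files.copy(in, Paths.get(templateId + ".xml"), StandardCopyOption.REPLACE_EXISTING);
        }
    }
}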


> On Sep 28, 2016, at 7:49 AM, Ashish Agarwal 10 <aagarwa...@sapient.com> wrote:
> 
> Hello,
> 
> I am using Nifi 0.7.0. 
> I want to create the template of the processor groups.
> Instead of creating it from UI. I want to design a flow that creates a 
> processor group template based on its id, download it and store it somewhere 
> locally. 
> Is it possible ? If yes, Is there any method/api available for the same?
> 
> Thanks You
> Ashish Agarwal
> 
> -Original Message-
> From: Joe Witt [mailto:joe.w...@gmail.com] 
> Sent: Tuesday, September 27, 2016 6:06 PM
> To: users@nifi.apache.org
> Subject: Re: Create NiFi Templates
> 
> Ashish,
> 
> If the question is 'at runtime what are the ways I could trigger the
> creation of a NiFI template?'
> 
>  You could call the REST API endpoint using some mechanism other than
> the NiFi UI or you could use the NiFi UI.
> 
> If the question is 'can I programmatically create a template during
> development time?'
> 
>  We don't have any direct public API's to accomplish this that I am
> aware of though it is an interesting idea.
> 
> Can you help direct us to where you're more interested right now so we
> can help most effectively.
> 
> Thanks
> Joe
> 
> On Tue, Sep 27, 2016 at 8:31 AM, Oleg Zhurakousky
> <ozhurakou...@hortonworks.com> wrote:
>> Ashish
>> 
>> I am not sure I fully understand the question. . .
>> Templates are represented as XML files, and therefore if you “type everything
>> correctly” you’ll get a working template, but it would be simpler to use the
>> NiFi UI as a design tool to do the same.
>> Could you please clarify more as to what exactly are you trying to
>> accomplish?
>> 
>> Cheers
>> Oleg
>> 
>> On Sep 27, 2016, at 3:06 AM, Ashish Agarwal 10 <aagarwa...@sapient.com>
>> wrote:
>> 
>> Hello,
>> 
>> Is there a way to create a template of a process group other than the button
>> present on UI ?
>> 
>> Regards,
>> Ashish Agarwal
>> 
>> 



Re: NPE MergeContent processor

2016-11-10 Thread Oleg Zhurakousky
Conrad

Any chance you can provide a bit more info about your flow?
I was able to find a condition where something like this can happen, but it 
would have to be with some legacy NiFi distribution, so it’s a bit puzzling, 
but I really want to see if we can close the loop on this.
In any event I think it is safe to raise a JIRA on this one.

Cheers
Oleg

On Nov 10, 2016, at 10:06 AM, Conrad Crampton 
> wrote:

Hi,
The processor continues to write (to HDFS – the next processor in flow) and 
doesn’t block any others coming into this processor (MergeContent), so not 
quite the same observed behaviour as NIFI-2015.
If there is anything else you would like me to do to help with this more than 
happy to help.
Regards
Conrad

From: Bryan Bende >
Reply-To: "users@nifi.apache.org" 
>
Date: Thursday, 10 November 2016 at 14:59
To: "users@nifi.apache.org" 
>
Subject: Re: NPE MergeContent processor

Conrad,

Thanks for reporting this. I wonder if this is also related to:

https://issues.apache.org/jira/browse/NIFI-2015

Seems like there is some case where the UUID is ending up as null.

-Bryan


On Wed, Nov 9, 2016 at 11:57 AM, Conrad Crampton 
> wrote:
Hi,
I saw this error after I upgraded to 1.0.0 but thought it was maybe due to the 
issues I had with that upgrade (entirely my fault it turns out!), but I have 
seen it a number of times since so I turned debugging on to get a better 
stacktrace. Relevant log section as below.
Nothing out of the ordinary, and I never saw this in v0.6.1 or below.
I would have raised a Jira issue, but after logging in to Jira it only let me 
create a service desk request (which didn’t sound right).
Regards
Conrad

2016-11-09 16:43:46,413 DEBUG [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent 
MergeContent[id=12c0bec7-68b7-3b60-a020-afcc7b4599e7] has chosen to yield its 
resources; will not be scheduled to run again for 1000 milliseconds
2016-11-09 16:43:46,414 DEBUG [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent 
MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Binned 42 FlowFiles
2016-11-09 16:43:46,418 INFO [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent 
MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Merged 
[StandardFlowFileRecord[uuid=5e846136-0a7a-46fb-be96-8200d5cdd33d,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=567158, 
length=2337],offset=0,name=17453303363322987,size=2337], 
StandardFlowFileRecord[uuid=a5f4bd55-82e3-40cb-9fa9-86b9e6816f67,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=573643, 
length=2279],offset=0,name=17453303351196175,size=2279], 
StandardFlowFileRecord[uuid=c1ca745b-660a-49cd-82e5-fa8b9a2f4165,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=583957, 
length=2223],offset=0,name=17453303531879367,size=2223], 
StandardFlowFileRecord[uuid=,claim=StandardContentClaim 
[resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=595617, 
length=2356],offset=0,name=,size=2356], 
StandardFlowFileRecord[uuid=,claim=StandardContentClaim 
[resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=705637, 
length=2317],offset=0,name=,size=2317], 
StandardFlowFileRecord[uuid=,claim=StandardContentClaim 
[resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=725376, 
length=2333],offset=0,name=,size=2333], 
StandardFlowFileRecord[uuid=,claim=StandardContentClaim 
[resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=728703, 
length=2377],offset=0,name=,size=2377]] into 
StandardFlowFileRecord[uuid=1ef3e5a0-f8db-49eb-935d-ed3c991fd631,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1478709819819-416, container=default, 
section=416], offset=982498, 
length=4576],offset=0,name=3649103647775837,size=4576]
2016-11-09 16:43:46,418 ERROR [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent 
MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] 
MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] failed to process session 
due to java.lang.NullPointerException: java.lang.NullPointerException
2016-11-09 16:43:46,422 ERROR [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent
java.lang.NullPointerException: null
at 

Re: NPE MergeContent processor

2016-11-11 Thread Oleg Zhurakousky
Conrad

Is it possible that you may be dealing with corrupted repositories (swap, flow 
file, etc.) due to your upgrades, or maybe even possible crashes?

Cheers
Oleg

On Nov 11, 2016, at 3:11 AM, Conrad Crampton 
<conrad.cramp...@secdata.com<mailto:conrad.cramp...@secdata.com>> wrote:

Hi,
This is the flow. The incoming flow is basically a syslog message which is 
parsed, enriched then saved to HDFS
1.   Parse (extracttext)
2.   Assign matching parts to attributes
3.   Enrich ip address location
4.   Assign attributes with geoenrichment
5.   Execute python script to parse useragent
6.   Create json from attributes
7.   Convert to avro (all strings)
8.   Convert to target avro schema (had to do 7 & 8 due to bug(?) where 
couldn’t go directly from json to avro with integers/longs)
9.   Merge into bins (see props below)
10.   Append ‘.avro’ to filenames (for reading in Spark subsequently)
11.   Save to HDFS

Does this help at all?
If you need anything else just shout.
Regards
Conrad





additional settings out of shot:
• compression level : 1
• Keep Path : false


From: Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>>
Reply-To: "users@nifi.apache.org<mailto:users@nifi.apache.org>" 
<users@nifi.apache.org<mailto:users@nifi.apache.org>>
Date: Thursday, 10 November 2016 at 18:40
To: "users@nifi.apache.org<mailto:users@nifi.apache.org>" 
<users@nifi.apache.org<mailto:users@nifi.apache.org>>
Subject: Re: NPE MergeContent processor

Conrad

Any chance you can provide a bit more info about your flow?
I was able to find a condition where something like this can happen, but it 
would have to be with some legacy NiFi distribution, so it’s a bit puzzling, 
but I really want to see if we can close the loop on this.
In any event I think it is safe to raise a JIRA on this one.

Cheers
Oleg

On Nov 10, 2016, at 10:06 AM, Conrad Crampton 
<conrad.cramp...@secdata.com<mailto:conrad.cramp...@secdata.com>> wrote:

Hi,
The processor continues to write (to HDFS – the next processor in flow) and 
doesn’t block any others coming into this processor (MergeContent), so not 
quite the same observed behaviour as NIFI-2015.
If there is anything else you would like me to do to help with this more than 
happy to help.
Regards
Conrad

From: Bryan Bende <bbe...@gmail.com<mailto:bbe...@gmail.com>>
Reply-To: "users@nifi.apache.org<mailto:users@nifi.apache.org>" 
<users@nifi.apache.org<mailto:users@nifi.apache.org>>
Date: Thursday, 10 November 2016 at 14:59
To: "users@nifi.apache.org<mailto:users@nifi.apache.org>" 
<users@nifi.apache.org<mailto:users@nifi.apache.org>>
Subject: Re: NPE MergeContent processor

Conrad,

Thanks for reporting this. I wonder if this is also related to:

https://issues.apache.org/jira/browse/NIFI-2015

Seems like there is some case where the UUID is ending up as null.

-Bryan


On Wed, Nov 9, 2016 at 11:57 AM, Conrad Crampton 
<conrad.cramp...@secdata.com<mailto:conrad.cramp...@secdata.com>> wrote:
Hi,
I saw this error after I upgraded to 1.0.0 but thought it was maybe due to the 
issues I had with that upgrade (entirely my fault it turns out!), but I have 
seen it a number of times since so I turned debugging on to get a better 
stacktrace. Relevant log section as below.
Nothing out of the ordinary, and I never saw this in v0.6.1 or below.
I would have raised a Jira issue, but after logging in to Jira it only let me 
create a service desk request (which didn’t sound right).
Regards
Conrad

2016-11-09 16:43:46,413 DEBUG [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent 
MergeContent[id=12c0bec7-68b7-3b60-a020-afcc7b4599e7] has chosen to yield its 
resources; will not be scheduled to run again for 1000 milliseconds
2016-11-09 16:43:46,414 DEBUG [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent 
MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Binned 42 FlowFiles
2016-11-09 16:43:46,418 INFO [Timer-Driven Process Thread-5] 
o.a.n.processors.standard.MergeContent 
MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Merged 
[StandardFlowFileRecord[uuid=5e846136-0a7a-46fb-be96-8200d5cdd33d,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=567158, 
length=2337],offset=0,name=17453303363322987,size=2337], 
StandardFlowFileRecord[uuid=a5f4bd55-82e3-40cb-9fa9-86b9e6816f67,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=393], offset=573643, 
length=2279],offset=0,name=17453303351196175,size=2279], 
StandardFlowFileRecord[uuid=c1ca745b-660a-49cd-82e5-fa8b9a2f4165,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1475059643340-275849, 
container=default, section=

Re: NPE MergeContent processor

2016-11-11 Thread Oleg Zhurakousky
Sorry, I should have been more clear.
I’ve spent a considerable amount of time slicing and dicing this thing, and 
while I am still validating a few possibilities, this is most likely due to a 
FlowFile being rehydrated from the corrupted repo with a missing UUID; when 
such a file’s ID ends up as a parent/child of a ProvenanceEventRecord, we get 
this issue. Basically, a FlowFile must never exist without a UUID, similar to a 
provenance event record, where the existence of the UUID is validated during 
the call to build(). We should definitely do the same in a builder for 
FlowFile; even though it will not eliminate the issue, it may help to pinpoint 
its origin.

I’ll raise the corresponding JIRA to improve FlowFile validation.
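
A minimal sketch of that kind of guard (hypothetical; the actual FlowFileRecord builder API may differ):

public FlowFileRecord build() {
    if (uuid == null) {
        // Fail fast at construction time instead of surfacing much later
        // as an NPE deep inside MergeContent or provenance handling.
        throw new IllegalStateException("FlowFile cannot be built without a UUID");
    }
    return new StandardFlowFileRecord(this);
}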

Cheers
Oleg

> On Nov 11, 2016, at 3:00 PM, Joe Witt <joe.w...@gmail.com> wrote:
> 
> that said even if it is due to crashes or even disk full cases we
> should figure out what happened and make it not possible.  We must
> always work to eliminate the possibility of corruption causing events
> and work to recover well in the face of corruption...
> 
> On Fri, Nov 11, 2016 at 2:57 PM, Oleg Zhurakousky
> <ozhurakou...@hortonworks.com> wrote:
>> Conrad
>> 
>> Is it possible that you may be dealing with corrupted repositories (swap,
>> flow file etc.) due to your upgrades or may be even possible crashes?
>> 
>> Cheers
>> Oleg
>> 
>> On Nov 11, 2016, at 3:11 AM, Conrad Crampton <conrad.cramp...@secdata.com>
>> wrote:
>> 
>> Hi,
>> This is the flow. The incoming flow is basically a syslog message which is
>> parsed, enriched then saved to HDFS
>> 1.   Parse (extracttext)
>> 2.   Assign matching parts to attributes
>> 3.   Enrich ip address location
>> 4.   Assign attributes with geoenrichment
>> 5.   Execute python script to parse useragent
>> 6.   Create json from attributes
>> 7.   Convert to avro (all strings)
>> 8.   Convert to target avro schema (had to do 7 & 8 due to bug(?) where
>> couldn’t go directly from json to avro with integers/longs)
>> 9.   Merge into bins (see props below)
>> 10.   Append ‘.avro’ to filenames (for reading in Spark subsequently)
>> 11.   Save to HDFS
>> 
>> Does this help at all?
>> If you need anything else just shout.
>> Regards
>> Conrad
>> 
>> 
>> 
>> 
>> 
>> additional out of shot
>> · compression level : 1
>> · Keep Path : false
>> 
>> 
>> From: Oleg Zhurakousky <ozhurakou...@hortonworks.com>
>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Date: Thursday, 10 November 2016 at 18:40
>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Subject: Re: NPE MergeContent processor
>> 
>> Conrad
>> 
>> Any chance you can provide a bit more info about your flow?
>> I was able to find a condition where something like this can happen, but it
>> would have to be with some legacy NiFi distribution, so it’s a bit puzzling,
>> but I really want to see if we can close the loop on this.
>> In any event I think it is safe to raise a JIRA on this one
>> 
>> Cheers
>> Oleg
>> 
>> 
>> On Nov 10, 2016, at 10:06 AM, Conrad Crampton <conrad.cramp...@secdata.com>
>> wrote:
>> 
>> Hi,
>> The processor continues to write (to HDFS – the next processor in flow) and
>> doesn’t block any others coming into this processor (MergeContent), so not
>> quite the same observed behaviour as NIFI-2015.
>> If there is anything else you would like me to do to help with this more
>> than happy to help.
>> Regards
>> Conrad
>> 
>> From: Bryan Bende <bbe...@gmail.com>
>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Date: Thursday, 10 November 2016 at 14:59
>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Subject: Re: NPE MergeContent processor
>> 
>> Conrad,
>> 
>> Thanks for reporting this. I wonder if this is also related to:
>> 
>> https://issues.apache.org/jira/browse/NIFI-2015
>> 
>> Seems like there is some case where the UUID is ending up as null.
>> 
>> -Bryan
>> 
>> 
>> On Wed, Nov 9, 2016 at 11:57 AM, Conrad Crampton
>> <conrad.cramp...@secdata.com> wrote:
>> 
>> Hi,
>> I saw this error after I upgraded to 1.0.0 but thought it was maybe due to
>> the issues I had with that upgrade (entirely my fault it turns out!), but I
>> have seen it a number of times since so I turned debugging on to get a
>>

Re: Nifi- PutEmail processor issue

2016-11-15 Thread Oleg Zhurakousky
Sravani

Would you be able to provide a full stack trace of the connection exception?
Also, while I assume you are providing the correct connection properties (i.e., 
host, port, etc.), I would still recommend checking that they are correct. In 
any event, the full stack trace would definitely help, and you can find it in 
the nifi app logs.

Cheers
Oleg

On Nov 15, 2016, at 4:07 AM, Gadiputi, Sravani 
> wrote:

Thank you for your reply.

My requirement is: I copy 3 different files from source to destination through 
Nifi, and these jobs run once a week.
So I want to know, through email, which file was successfully moved.
In this process, I have configured a PutEmail for each flow. There are hardly 3 
notifications in total.
Though the files have been moved to the destination, we could not receive the 
notifications properly, and we get the error below.

Please suggest.

Thanks,
Sravani


From: Jeff [mailto:jtsw...@gmail.com]
Sent: Tuesday, November 15, 2016 1:25 PM
To: users@nifi.apache.org
Subject: Re: Nifi- PutEmail processor issue

Hello Sravani,

Could it be possible that the SMTP server you're using is denying connections 
due to the volume of emails your flow might be sending?  How many emails are 
sent per flow file, and how many emails do you estimate are sent per minute?

If this is the case, you can modify your flow to aggregate flowfiles with a 
processor like MergeContent so that you can send emails that resemble a digest, 
rather than a separate email for each flowfile that moves through your flow.

On Mon, Nov 14, 2016 at 11:59 PM Gadiputi, Sravani 
> wrote:

Hi,

I have used the PutEmail processor in my project to send email notifications 
for the successful/failed copying of files.
Each file flow has a corresponding PutEmail to send an email notification to 
the respective recipients.

The issue is that sometimes the email notification is sent to the respective 
recipients successfully for a successful/failed job.
But sometimes, for one specific job, the email notification is not sent to the 
recipients even though the job is successful, due to the error below.

Error:

Could not connect to SMTP host
Java.net.ConnectException: Connection timed out

Could you please suggest me how we can overcome this error.


Thanks,
Sravani




Re: Delay Processor

2016-11-15 Thread Oleg Zhurakousky
I am +1 on this, as I’ve seen many cases in the field where it is applicable, 
and as you mention, exponential back-off is one of the common ones. That said, 
I am wondering if it has to be a processor at all? Actually, let me answer my 
own question: there are definitely cases where it has to be a processor. Those 
are true delay-with-intention requirements (i.e., Compute Andrew’s greetings -> 
Delay until his b-day -> Send Greetings).
The exponential back-off is a bit different, since it almost aligns with 
circuit breakers and retries. Currently we simply retry by resubmitting the 
flow file with a fixed yield. However, one may argue that if something failed 
the first time, it is very likely to fail again and again. It may also succeed, 
but one may argue that it has a higher chance of succeeding after a certain 
delay, which increases on subsequent failures, until we may choose to consider 
it a lost cause and stop resubmitting it.

So in summary I see a two-part requirement: a processor, and an enhancement to 
the core framework’s retry logic.
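
To make the back-off part concrete, here is a stateless sketch along the lines Andrew describes below, keeping the housekeeping on the flow file itself (the attribute and relationship names are hypothetical; it assumes a processor whose 'retry' relationship is looped back to its own input):

// On entry: if the FlowFile is not due yet, push it back to the incoming queue.
String retryAfter = flowFile.getAttribute("retry.after.millis");
if (retryAfter != null && System.currentTimeMillis() < Long.parseLong(retryAfter)) {
    session.transfer(flowFile); // no relationship: back to the input queue
    context.yield();
    return;
}

// On failure: double the delay (capped) and record when the next attempt is due.
String count = flowFile.getAttribute("retry.count");
int attempts = count == null ? 0 : Integer.parseInt(count);
long delay = Math.min(60_000L, 1_000L * (1L << Math.min(attempts, 16))); // 1s, 2s, 4s ... capped at 60s
flowFile = session.putAttribute(flowFile, "retry.count", String.valueOf(attempts + 1));
flowFile = session.putAttribute(flowFile, "retry.after.millis",
        String.valueOf(System.currentTimeMillis() + delay));
session.transfer(flowFile, REL_RETRY); // looped back to this processor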

Thoughts?
Cheers
Oleg


> On Nov 15, 2016, at 8:50 AM, Andrew Grande  wrote:
> 
> Hi,
> 
> I'd like to check where discussions are on this, or propose the new component 
> otherwise.
> 
> Use case: make delay strategies explicit and easier to use. E.g. think of a 
> failure retry loop.
> 
> Currently, ControlRate is somewhat related, but can be improved. E.g. 
> introduce delay strategies a la prioritizers on the connection?
> 
> Thinking out loud, something like an exponential backoff strategy could be 
> kept stateless by adding a number of housekeeping attributes to the FF, which 
> eliminates the need for any state in the processor itself.
> 
> I'll stop here to see if any ideas were captured prior and what the community 
> thinks of it.
> 
> Andrew



Re: Does NiFi ConsumeJMS Processor supports OpenJMS?

2016-11-01 Thread Oleg Zhurakousky
Nirmal

Did some digging, and the NPE is due to the fact that we invoke the default 
constructor of whatever connection factory class is provided (i.e., 
‘org.exolab.jms.client.JmsConnectionFactory’). That works for most major 
JMS providers; however, for OpenJMS, after looking at the code it is rather 
clear that the exposed default constructor was never meant to be called by 
developers directly, but rather exists for internal serialization use:

/**
 * Default constructor required for serialization
 */
public JmsConnectionFactory() {
}

Not sure why it was done this way . . . but the NPE is due to the fact that 
the server proxy class is null, so when it attempts 
Class.forName(proxyClassName) in the getProxy() method it fails with an NPE.

One other thing I noticed is that all OpenJMS examples are based on obtaining 
ConnectionFactory from JNDI for which we currently do not have support.
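
For completeness, this is roughly how the OpenJMS getting-started examples obtain the factory, i.e., through a JNDI lookup (a sketch; the provider URL and JNDI name follow the OpenJMS defaults):

import java.util.Hashtable;
import javax.jms.ConnectionFactory;
import javax.naming.Context;
import javax.naming.InitialContext;

Hashtable<String, String> env = new Hashtable<>();
env.put(Context.INITIAL_CONTEXT_FACTORY, "org.exolab.jms.jndi.InitialContextFactory");
env.put(Context.PROVIDER_URL, "tcp://localhost:3035/");
Context ctx = new InitialContext(env);
ConnectionFactory factory = (ConnectionFactory) ctx.lookup("ConnectionFactory");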

Once again, please raise a JIRA if you believe that it’s important to add 
such a feature.

Cheers
Oleg
On Nov 1, 2016, at 9:29 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:

Nirmal

While ConsumeJMS was developed and tested with the idea of supporting multiple 
providers, ‘openjms’ was not one of the ones it was tested with.
I will look at the error and will follow up, but the fact that it already shows 
an NPE means we have a bug somewhere, so please raise a JIRA 
(https://issues.apache.org/jira/browse/NIFI) if you don’t mind, or let me know 
and I’ll do it.

Thank you for reporting it.
Cheers
Oleg


On Nov 1, 2016, at 8:35 AM, Nirmal Kumar 
<nirmal.ku...@impetus.co.in<mailto:nirmal.ku...@impetus.co.in>> wrote:

Hi All,

I am trying to read messages from openjms-0.7.7 using the ConsumeJMS Processor 
but getting following exception:

2016-11-01 14:43:17,260 ERROR [Timer-Driven Process Thread-4] 
o.apache.nifi.jms.processors.ConsumeJMS ConsumeJMS - 
JMSConsumer[destination:queue1; pub-sub:false;] ConsumeJMS - 
JMSConsumer[destination:queue1; pub-sub:false;] failed to process session due 
to org.springframework.jms.UncategorizedJmsException: Uncategorized exception 
occured during JMS processing; nested exception is javax.jms.JMSException: 
Failed to create proxy: java.lang.NullPointerException: 
org.springframework.jms.UncategorizedJmsException: Uncategorized exception 
occured during JMS processing; nested exception is javax.jms.JMSException: 
Failed to create proxy: java.lang.NullPointerException
2016-11-01 14:43:17,265 ERROR [Timer-Driven Process Thread-4] 
o.apache.nifi.jms.processors.ConsumeJMS
org.springframework.jms.UncategorizedJmsException: Uncategorized exception 
occured during JMS processing; nested exception is javax.jms.JMSException: 
Failed to create proxy: java.lang.NullPointerException
at 
org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:316)
 ~[na:na]
at 
org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:169)
 ~[na:na]
at 
org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:497) ~[na:na]
at 
org.springframework.jms.core.JmsTemplate.receiveSelected(JmsTemplate.java:764) 
~[na:na]
at 
org.springframework.jms.core.JmsTemplate.receive(JmsTemplate.java:738) ~[na:na]
at 
org.springframework.jms.core.JmsTemplate.receive(JmsTemplate.java:727) ~[na:na]
at 
org.apache.nifi.jms.processors.JMSConsumer.consume(JMSConsumer.java:65) ~[na:na]
at 
org.apache.nifi.jms.processors.ConsumeJMS.rendezvousWithJms(ConsumeJMS.java:79) 
~[na:na]
at 
org.apache.nifi.jms.processors.AbstractJMSProcessor.onTrigger(AbstractJMSProcessor.java:136)
 ~[na:na]
at 
org.apache.nifi.jms.processors.ConsumeJMS.onTrigger(ConsumeJMS.java:50) ~[na:na]
at 
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
 ~[nifi-api-1.0.0.jar:1.0.0]
at 
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064)
 ~[nifi-framework-core-1.0.0.jar:1.0.0]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
 [nifi-framework-core-1.0.0.jar:1.0.0]
at 
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
 [nifi-framework-core-1.0.0.jar:1.0.0]
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
 [nifi-framework-core-1.0.0.jar:1.0.0]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_45]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[na:1.8.0_45]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_45]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThr

Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
James, I have not used AMQP in a while (although I am the one responsible for 
developing the functionality you’re having a problem with ;)), so I need to brush 
up a bit on some of the administrative actions you describe (idle to active). 
So I’ll get back to you, but as a joint debugging step, can you please try to 
make your queues explicitly active and see if the behavior changes?

Also, what version of NiFi are you using?
Cheers
Oleg

On Dec 8, 2016, at 7:37 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Good morning and thank you Oleg. No, no other stack traces or errors are 
indicated. I've got my Publish AMQP configuration set to DEBUG for Bulletin 
Level.

I do get an error at the UI when I try to Start the Publish AMQP step after I 
have Stopped it. It tells me "PublishAMQP[id=47f88.744] cannot be started 
because it is not stopped. Current state is STOPPING". I suspect this is a 
reflection of the WARN log message.

In your experience must I take some action at the RabbitMQ Management Console 
to change my Exchange and Queue from Idle to Active? I believe that would 
happen automatically when AMQP messages start hitting the exchange, but I 
thought it best to ask. I see nothing on the management console that alludes to 
enabling the exchange or the queue.

Jim

On Thu, Dec 8, 2016 at 7:24 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

Aside from “closing” failure message, do you see any other errors or stack 
traces in the logs?
Meanwhile I’ll try to dig more with what I have?

Oleg

On Dec 8, 2016, at 7:13 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Hello I am trying to use a Publish AMQP step in my NiFi workflow. I generate a 
flow file with random text content using Generate Flow File, and then send to a 
rabbitMQ Exchange named nifiAbcX using Publish AMQP. My exchange is bound to 
queue AbcQ by key *.Abc.*, and my virtual host is nifiVhost. I have established 
my exchange as a Topic type.



My NiFi instance and my RabbitMQ instance are co-located for development test 
purposes on the same server.



It appears that my Publish AMQP does indeed generate output which it sends down 
the "success" output flow path, but at the RabbitMQ Management Console I find 
that my Exchange and my Queue are Idle and show no messages. My nifi log shows 
this Warning message: Failure while closing target resource 
AMQPublisher:amqp://test@127.0.0.1:5673/nifiVhost, EXCHANGE:nifiAbcX, 
ROUTING_KEY:xyz.Abc.prod. What is this telling me? Why am I 
seeing no messages posted to my rabbitMQ queue through the exchange? Thank you 
in advance.





Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
James

Could you also see if you can upgrade to a newer version of NiFi? Current 
release is 1.1.0, so 0.6.1 is about 3 releases behind and there were quite a 
few bug fixes and improvements since then that could shed some more light into 
your issue.
Meanwhile I’ll see what I can dig up on my end.

Cheers
Oleg

On Dec 8, 2016, at 8:00 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Yes absolutely Oleg. I will try to figure out how to make my rabbitMQ queues 
explicitly active. I'll need to dig into that because it is not plainly obvious 
to me how / if that needs to be done. I'll dig for that.

I am using NiFi 0.6.1.

Cheers and thanks again,

Jim

On Thu, Dec 8, 2016 at 7:46 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James, I have not used AMQP in a while (although I am the one responsible for 
developing the functionality you’re having a problem with ;)), so I need to brush 
up a bit on some of the administrative actions you describe (idle to active). 
So I’ll get back to you, but as a joint debugging step, can you please try to 
make your queues explicitly active and see if the behavior changes?

Also, what version of NiFi are you using?
Cheers
Oleg

On Dec 8, 2016, at 7:37 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Good morning and thank you Oleg. No, no other stack traces or errors are 
indicated. I've got my Publish AMQP configuration set to DEBUG for Bulletin 
Level.

I do get an error at the UI when I try to Start the Publish AMQP step after I 
have Stopped it. It tells me "PublishAMQP[id=47f88.744] cannot be started 
because it is not stopped. Current state is STOPPING". I suspect this is a 
reflection of the WARN log message.

In your experience must I take some action at the RabbitMQ Management Console 
to change my Exchange and Queue from Idle to Active? I believe that would 
happen automatically when AMQP messages starting hitting the exchange, but I 
thought it best to ask. I see nothing on the management console that alludes to 
enabling the exchange or the queue.

Jim

On Thu, Dec 8, 2016 at 7:24 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

Aside from “closing” failure message, do you see any other errors or stack 
traces in the logs?
Meanwhile I’ll try to dig more with what I have?

Oleg

On Dec 8, 2016, at 7:13 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Hello I am trying to use a Publish AMQP step in my NiFi workflow. I generate a 
flow file with random text content using Generate Flow File, and then send to a 
rabbitMQ Exchange named nifiAbcX using Publish AMQP. My exchange is bound to 
queue AbcQ by key *.Abc.*, and my virtual host is nifiVhost. I have established 
my exchange as a Topic type.



My NiFi instance and my RabbitMQ instance are co-located for development test 
purposes on the same server.



It appears that my Publish AMQP does indeed generate output which it sends down 
the "success" output flow path, but at the RabbitMQ Management Console I find 
that my Exchange and my Queue are Idle and show no messages. My nifi log shows 
this Warning message: Failure while closing target resource 
AMQPublisher:amqp://test@127.0.0.1:5673/nifiVhost, EXCHANGE:nifiAbcX, 
ROUTING_KEY:xyz.Abc.prod. What is this telling me? Why am I 
seeing no messages posted to my rabbitMQ queue through the exchange? Thank you 
in advance.







Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
James

Aside from the “closing” failure message, do you see any other errors or stack 
traces in the logs?
Meanwhile I’ll try to dig more with what I have.

Oleg

On Dec 8, 2016, at 7:13 AM, James McMahon 
> wrote:

Hello I am trying to use a Publish AMQP step in my NiFi workflow. I generate a 
flow file with random text content using Generate Flow File, and then send to a 
rabbitMQ Exchange named nifiAbcX using Publish AMQP. My exchange is bound to 
queue AbcQ by key *.Abc.*, and my virtual host is nifiVhost. I have established 
my exchange as a Topic type.



My NiFi instance and my RabbitMQ instance are co-located for development test 
purposes on the same server.



It appears that my Publish AMQP does indeed generate output which it sends down 
the "success" output flow path, but at the RabbitMQ Management Console I find 
that my Exchange and my Queue are Idle and show no messages. My nifi log shows 
this Warning message: Failure while closing target resource 
AMQPublisher:amqp://test@127.0.0.1:5673/nifiVhost,
 EXCHANGE:nifiAbcX, ROUTING_KEY:xyz.Abc.prod  What is this telling me? Why am I 
seeing no messages posted to my rabbitMQ queue through the exchange? Thank you 
in advance.



Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
James

My apologies, but I am not sure I am clear on the second part of your 
explanation. It appears you are saying that when you bound the Exchange to a 
legit routing key all works as expected (in NiFi), but then what?
Could you please clarify?

Sorry about that.
Oleg

On Dec 8, 2016, at 11:41 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Two quick updates. I generated a test message in the RMQ exchange using the 
RabbitMQ Management Console. My goal was to see if the Idle and blocking were 
problems or expected, and whether those would change when I sent a test message 
to the queue. It's also a good validation that I've bound the exchange and 
queue with a legit routing key. This worked - the test message was delivered 
successfully.

I noticed that for each AMQP message nifi attempts to send to RMQ, RabbitMQ 
Management Console creates a Connection that is shown with State of blocking. 
and that is how it remains indefinitely. Nothing shows up in the exchange or in 
the queue.

Jim

On Thu, Dec 8, 2016 at 8:26 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
Fair enough

On Dec 8, 2016, at 8:16 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

I'm a captive audience in this respect. I do not have the latitude to change 
that, as it is dictated by a parent organization much broader than my level.

On Thu, Dec 8, 2016 at 8:10 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

Could you also see if you can upgrade to a newer version of NiFi? Current 
release is 1.1.0, so 0.6.1 is about 3 releases behind and there were quite a 
few bug fixes and improvements since then that could shed some more light into 
your issue.
Meanwhile I’ll see what I can dig up on my end.

Cheers
Oleg

On Dec 8, 2016, at 8:00 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Yes absolutely Oleg. I will try to figure out how to make my rabbitMQ queues 
explicitly active. I'll need to dig into that because it is not plainly obvious 
to me how / if that needs to be done. I'll dig for that.

I am using NiFi 0.6.1.

Cheers and thanks again,

Jim

On Thu, Dec 8, 2016 at 7:46 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James, I have not used AMQP in a while (although I am the one responsible for 
developing the functionality you’re having a problem with ;)), so I need to brush 
up a bit on some of the administrative actions you describe (idle to active). 
So I’ll get back to you, but as a joint debugging step, can you please try to 
make your queues explicitly active and see if the behavior changes?

Also, what version of NiFi are you using?
Cheers
Oleg

On Dec 8, 2016, at 7:37 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Good morning and thank you Oleg. No, no other stack traces or errors are 
indicated. I've got my Publish AMQP configuration set to DEBUG for Bulletin 
Level.

I do get an error at the UI when I try to Start the Publish AMQP step after I 
have Stopped it. It tells me "PublishAMQP[id=47f88.744] cannot be started 
because it is not stopped. Current state is STOPPING". I suspect this is a 
reflection of the WARN log message.

In your experience must I take some action at the RabbitMQ Management Console 
to change my Exchange and Queue from Idle to Active? I believe that would 
happen automatically when AMQP messages starting hitting the exchange, but I 
thought it best to ask. I see nothing on the management console that alludes to 
enabling the exchange or the queue.

Jim

On Thu, Dec 8, 2016 at 7:24 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

Aside from “closing” failure message, do you see any other errors or stack 
traces in the logs?
Meanwhile I’ll try to dig more with what I have?

Oleg

On Dec 8, 2016, at 7:13 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Hello I am trying to use a Publish AMQP step in my NiFi workflow. I generate a 
flow file with random text content using Generate Flow File, and then send to a 
rabbitMQ Exchange named nifiAbcX using Publish AMQP. My exchange is bound to 
queue AbcQ by key *.Abc.*, and my virtual host is nifiVhost. I have established 
my exchange as a Topic type.



My NiFi instance and my RabbitMQ instance are co-located for development test 
purposes on the same server.



It appears that my Publish AMQP does indeed generate output which it sends down 
the "success" output flow path, but at the RabbitMQ Management Console I find 
that my Exchange and my Queue are Idle and show no messages. My nifi log shows 
this Warning message: Failure while closing target resource 
AMQPublishe

Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
Got it, let me dig in, I’ve got time now.

On Dec 8, 2016, at 12:46 PM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

No problem Oleg. My fault for not being clear. The test works demonstrating the 
message delivery from RMQ exchange to RMQ queue. I was trying to be systematic, 
and eliminate points of possible failure, the binding of the exchange to the 
queue in RMQ being one.

It does not work, NiFi to RMQ. I see that a connection is created by RMQ but 
that connection is blocking. I do not believe the content ever actually gets 
handed to the RMQ exchange by NiFi.



On Thu, Dec 8, 2016 at 12:41 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

My apology , but I am not sure I am clear on the second part of your 
explanation. It appears you are saying that when you bound the Exchange to a 
legit routing key all works as expected (in NiFi), but then what?
Could you please clarify?

Sorry about that.
Oleg

On Dec 8, 2016, at 11:41 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Two quick updates. I generated a test message in the RMQ exchange using the 
RabbitMQ Management Console. My goal was to see if the Idle and blocking were 
problems or expected, and whether those would change when I sent a test message 
to the queue. It's also a good validation that I've bound the exchange and 
queue with a legit routing key. This worked - the test message was delivered 
successfully.

I noticed that for each AMQP message nifi attempts to send to RMQ, RabbitMQ 
Management Console creates a Connection that is shown with State of blocking. 
and that is how it remains indefinitely. Nothing shows up in the exchange or in 
the queue.

Jim

On Thu, Dec 8, 2016 at 8:26 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
Fair enough

On Dec 8, 2016, at 8:16 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

I'm a captive audience in this respect. I do not have the latitude to change 
that, as it is dictated by a parent organization much broader than my level.

On Thu, Dec 8, 2016 at 8:10 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

Could you also see if you can upgrade to a newer version of NiFi? Current 
release is 1.1.0, so 0.6.1 is about 3 releases behind and there were quite a 
few bug fixes and improvements since then that could shed some more light into 
your issue.
Meanwhile I’ll see what I can dig up on my end.

Cheers
Oleg

On Dec 8, 2016, at 8:00 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Yes absolutely Oleg. I will try to figure out how to make my rabbitMQ queues 
explicitly active. I'll need to dig into that because it is not plainly obvious 
to me how / if that needs to be done. I'll dig for that.

I am using NiFi 0.6.1.

Cheers and thanks again,

Jim

On Thu, Dec 8, 2016 at 7:46 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James, I have not used AMQP in a while (although I am the one responsible for 
developing the functionality you’re having a problem with ;)), so I need to brush 
up a bit on some of the administrative actions you describe (idle to active). 
So I’ll get back to you, but as a joint debugging step, can you please try to 
make your queues explicitly active and see if the behavior changes?

Also, what version of NiFi are you using?
Cheers
Oleg

On Dec 8, 2016, at 7:37 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Good morning and thank you Oleg. No, no other stack traces or errors are 
indicated. I've got my Publish AMQP configuration set to DEBUG for Bulletin 
Level.

I do get an error at the UI when I try to Start the Publish AMQP step after I 
have Stopped it. It tells me "PublishAMQP[id=47f88.744] cannot be started 
because it is not stopped. Current state is STOPPING". I suspect this is a 
reflection of the WARN log message.

In your experience must I take some action at the RabbitMQ Management Console 
to change my Exchange and Queue from Idle to Active? I believe that would 
happen automatically when AMQP messages starting hitting the exchange, but I 
thought it best to ask. I see nothing on the management console that alludes to 
enabling the exchange or the queue.

Jim

On Thu, Dec 8, 2016 at 7:24 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

Aside from “closing” failure message, do you see any other errors or stack 
traces in the logs?
Meanwhile I’ll try to dig more with what I have?

Oleg

On Dec 8, 2016, at 7:13 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Hello I am trying to use a Publish AMQP st

Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
James

So, first the good news: I was able to reproduce what you’re seeing. That is 
always good.
Basically I am sending messages to a bogus exchange with a bogus routing key. 
The first send results in success from a Java API standpoint (no exception is 
thrown); however, the second send fails due to the fact that the channel is 
closed, and then when I try to stop the processor it gives me the same error 
you see during the stop call.
That said, one thing that is clear is that your message does not reach Rabbit 
with any type of success, and that is most likely due to some misconfiguration 
between exchange, binding and routing key.
Let me investigate a bit further and I’ll be in touch.

Meanwhile, if you look in your app logs, can you confirm that you can find a 
stack trace that looks similar to what’s below?

13:37:30,449 ERROR Timer-Driven Process Thread-10 processors.PublishAMQP:268 -
java.lang.IllegalStateException: This instance of AMQPPublisher is invalid 
since its publishingChannel is closed
at org.apache.nifi.amqp.processors.AMQPPublisher.publish(AMQPPublisher.java:99)
at 
org.apache.nifi.amqp.processors.PublishAMQP.rendezvousWithAmqp(PublishAMQP.java:137)

Cheers
Oleg
On Dec 8, 2016, at 1:04 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:

Got it, let me dig in, I’ve got time now.

On Dec 8, 2016, at 12:46 PM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

No problem Oleg. My fault for not being clear. The test works demonstrating the 
message delivery from RMQ exchange to RMQ queue. I was trying to be systematic, 
and eliminate points of possible failure, the binding of the exchange to the 
queue in RMQ being one.

It does not work, NiFi to RMQ. I see that a connection is created by RMQ but 
that connection is blocking. I do not believe the content ever actually gets 
handed to the RMQ exchange by NiFi.



On Thu, Dec 8, 2016 at 12:41 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com<mailto:ozhurakou...@hortonworks.com>> wrote:
James

My apology , but I am not sure I am clear on the second part of your 
explanation. It appears you are saying that when you bound the Exchange to a 
legit routing key all works as expected (in NiFi), but then what?
Could you please clarify?

Sorry about that.
Oleg

On Dec 8, 2016, at 11:41 AM, James McMahon 
<jsmcmah...@gmail.com<mailto:jsmcmah...@gmail.com>> wrote:

Two quick updates. I generated a test message in the RMQ exchange using the 
RabbitMQ Management Console. My goal was to see if the Idle and blocking were 
problems or expected, and whether those would change when I sent a test message 
to the queue. It's also a good validation that I've bound the exchange and 
queue with a legit routing key. This worked - the test message was delivered 
successfully.

I noticed that for each AMQP message nifi attempts to send to RMQ, RabbitMQ 
Management Console creates a Connection that is shown with State of blocking. 
and that is how it remains indefinitely. Nothing shows up in the exchange or in 
the queue.

Jim

On Thu, Dec 8, 2016 at 8:26 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
Fair enough

On Dec 8, 2016, at 8:16 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:

I'm a captive audience in this respect. I do not have the latitude to change 
that, as it is dictated by a parent organization much broader than my level.

On Thu, Dec 8, 2016 at 8:10 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

Could you also see if you can upgrade to a newer version of NiFi? The current 
release is 1.1.0, so 0.6.1 is about 3 releases behind, and there have been quite 
a few bug fixes and improvements since then that could shed some more light on 
your issue.
Meanwhile I’ll see what I can dig up on my end.

Cheers
Oleg

On Dec 8, 2016, at 8:00 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Yes absolutely Oleg. I will try to figure out how to make my rabbitMQ queues 
explicitly active. I'll need to dig into that because it is not plainly obvious 
to me how / if that needs to be done. I'll dig for that.

I am using NiFi 0.6.1.

Cheers and thanks again,

Jim

On Thu, Dec 8, 2016 at 7:46 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James, I have not used AMQP in a while (although I am the one responsible for 
developing the functionality you’re having problems with ;)), so I need to brush 
up a bit on some of the administrative actions you describe (idle to active). 
So I’ll get back to you, but as a joint debug step, can you please try to make 
your queues explicitly active and see if the behavior changes?

Also, what version of NiFi are you using?
Cheers
Oleg

On Dec 8, 2016, at 7:37 AM

Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
James

I got to the bottom of the problem and will raise a few bugs against it, though 
those will be more about providing better information as to the cause. The 
problem itself is most definitely on your end, with regard to the exchange, 
routingKey and queue setup.

Yes, what you’re describing does make some sense and that is exactly what I’ll 
be filing for improvement. Here is the short version of it (see the sketch below):
- Sending to a non-existing exchange is fatal and the channel will be 
automatically closed, so the first improvement is to check the state of the 
channel right after basicPublish().
- If the exchange is valid but the routingKey is not, and the ‘mandatory’ flag 
is set to true (which is always true in our case), a ReturnListener can be 
registered with the channel that will let you know pretty much instantaneously 
that routing failed.
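To make those two improvements concrete, here is a minimal self-contained Java 
sketch against the RabbitMQ client API; the broker host, exchange and routing 
key below are placeholder values, not NiFi's actual code:

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class MandatoryPublishSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder broker host
        try (Connection connection = factory.newConnection()) {
            Channel channel = connection.createChannel();
            // Improvement 2: with 'mandatory' set, a ReturnListener reports
            // un-routable messages almost instantaneously.
            channel.addReturnListener((replyCode, replyText, exchange, routingKey, properties, body) ->
                    System.err.println("Returned " + replyCode + " " + replyText
                            + " exchange=" + exchange + " routingKey=" + routingKey));
            // 'true' is the mandatory flag (always true in PublishAMQP's case).
            channel.basicPublish("someExchange", "someRoutingKey", true, null,
                    "test".getBytes("UTF-8"));
            // Improvement 1: publishing to a non-existing exchange closes the
            // channel, so check its state right after basicPublish().
            // (Both the return and the close arrive asynchronously, so a real
            // test would allow a brief pause here.)
            if (!channel.isOpen()) {
                System.err.println("Channel closed: " + channel.getCloseReason());
            }
        }
    }
}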
So for your current case I am almost positive that, due to the old version of 
Rabbit and the AMQP client version used by us, there is some kind of 
incompatibility that results in either of those conditions being presented to 
the client (even though I believe you may have configured everything correctly, 
just on the older version of Rabbit).

So, anyway, I’ll file these two bugs while you try to convince the powerful men 
you’re working for to upgrade RabbitMQ ;)

I’ll follow up with JIRA numbers so you can track

Cheers
Oleg

On Dec 8, 2016, at 2:58 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Glad to hear that you also saw this behavior Oleg. Excellent work - thank you. 
I think I've proven that this is a problem caused by the version of rabbitMQ 
and Erlang I am running. The one that breaks: rMQ 3.15, Erlang R14B04.

To prove my theory a co-worker exposed for my test use a dev instance of rMQ 
that he happened to have handy (rMQ 3.65, Erlang 19.1) on another server and 
port. And my test messages started popping up in the same exchange / queue / 
binding key / routing key configuration over there like minions. I think I need 
to get approval to upgrade rMQ and E. While I won't be able to upgrade my NiFi 
baseline because of other organizational dependencies, I can certainly 
eliminate the problems due to the rMQ and Erlang versions.

Does this seem like a reasonable approach in your opinion?

On Thu, Dec 8, 2016 at 1:45 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

So, first good news. I was able to reproduce what you’re seeing. That is always 
good.
Basically I am sending messages to a bogus exchange with a bogus routing key. 
The first send results in success from a Java API standpoint (no exception is 
thrown); however, the second send fails due to the fact that the channel is 
closed, and then when I try to stop the processor it gives me the same error as 
you see during the stop call.
That said, one thing is clear: your message does not reach Rabbit with any kind 
of success, and that is most likely due to some misconfiguration between 
exchange, binding and routing key.
Let me investigate a bit further and I’ll be in touch

Meanwhile if you look in your app logs can you confirm that you can find the 
stack trace message that looks similar to what’s below?

13:37:30,449 ERROR Timer-Driven Process Thread-10 processors.PublishAMQP:268 -
java.lang.IllegalStateException: This instance of AMQPPublisher is invalid 
since its publishingChannel is closed
at org.apache.nifi.amqp.processors.AMQPPublisher.publish(AMQPPublisher.java:99)
at 
org.apache.nifi.amqp.processors.PublishAMQP.rendezvousWithAmqp(PublishAMQP.java:137)

Cheers
Oleg
On Dec 8, 2016, at 1:04 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:

Got it, let me dig in, I got time now

On Dec 8, 2016, at 12:46 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

No problem Oleg. My fault for not being clear. The test works, demonstrating 
message delivery from the RMQ exchange to the RMQ queue. I was trying to be 
systematic and eliminate points of possible failure, the binding of the 
exchange to the queue in RMQ being one.

It does not work, NiFi to RMQ. I see that a connection is created by RMQ but 
that connection is blocking. I do not believe the content ever actually gets 
handed to the RMQ exchange by NiFi.



On Thu, Dec 8, 2016 at 12:41 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

My apologies, but I am not sure I am clear on the second part of your 
explanation. It appears you are saying that when you bound the Exchange to a 
legit routing key all works as expected (in NiFi), but then what?
Could you please clarify?

Sorry about that.
Oleg

On Dec 8, 2016, at 11:41 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Two quick updates. I generated a test message in the

Re: Nothing in rabbitMQ queue from Publish AMQP processor

2016-12-08 Thread Oleg Zhurakousky
James

You're welcome! And here is the new ticket 
https://issues.apache.org/jira/browse/NIFI-3172

Cheers
Oleg

On Dec 8, 2016, at 3:13 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Outstanding work, Oleg. Thank you. I'll watch for those JIRA tickets and in the 
meanwhile will battle to bring our rMQ out of the dark ages. Thank you very 
much once again for your help investigating this behavior Oleg.

On Thu, Dec 8, 2016 at 3:08 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

I got to the bottom of the problem and will raise a few bugs against it, though 
those will be more about providing better information as to the cause. The 
problem itself is most definitely on your end, with regard to the exchange, 
routingKey and queue setup.

Yes, what you’re describing does make some sense and that is exactly what I’ll 
be filing for improvement. Here is the short version of it:
- Sending to a non-existing exchange is fatal and the channel will be 
automatically closed, so the first improvement is to check the state of the 
channel right after basicPublish().
- If the exchange is valid but the routingKey is not, and the ‘mandatory’ flag 
is set to true (which is always true in our case), a ReturnListener can be 
registered with the channel that will let you know pretty much instantaneously 
that routing failed.
So for your current case I am almost positive that, due to the old version of 
Rabbit and the AMQP client version used by us, there is some kind of 
incompatibility that results in either of those conditions being presented to 
the client (even though I believe you may have configured everything correctly, 
just on the older version of Rabbit).

So, anyway, I’ll file these two bugs while you try to convince the powerful men 
you’re working for to upgrade RabbitMQ ;)

I’ll follow up with JIRA numbers so you can track

Cheers
Oleg

On Dec 8, 2016, at 2:58 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Glad to hear that you also saw this behavior Oleg. Excellent work - thank you. 
I think I've proven that this is a problem caused by the version of rabbitMQ 
and Erlang I am running. The one that breaks: rMQ 3.15, Erlang R14B04.

To prove my theory a co-worker exposed for my test use a dev instance of rMQ 
that he happened to have handy (rMQ 3.65, Erlang 19.1) on another server and 
port. And my test messages started popping up in the same exchange / queue / 
binding key / routing key configuration over there like minions. I think I need 
to get approval to upgrade rMQ and E. While I won't be able to upgrade my NiFi 
baseline because of other organizational dependencies, I can certainly 
eliminate the problems due to the rMQ and Erlang versions.

Does this seem like a reasonable approach in your opinion?

On Thu, Dec 8, 2016 at 1:45 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

So, first good news. I was able to reproduce what you’re seeing. That is always 
good.
Basically I am sending messages to a bogus exchange with a bogus routing key. 
The first send results in success from a Java API standpoint (no exception is 
thrown); however, the second send fails due to the fact that the channel is 
closed, and then when I try to stop the processor it gives me the same error as 
you see during the stop call.
That said, one thing is clear: your message does not reach Rabbit with any kind 
of success, and that is most likely due to some misconfiguration between 
exchange, binding and routing key.
Let me investigate a bit further and I’ll be in touch

Meanwhile if you look in your app logs can you confirm that you can find the 
stack trace message that looks similar to what’s below?

13:37:30,449 ERROR Timer-Driven Process Thread-10 processors.PublishAMQP:268 -
java.lang.IllegalStateException: This instance of AMQPPublisher is invalid 
since its publishingChannel is closed
at org.apache.nifi.amqp.processors.AMQPPublisher.publish(AMQPPublisher.java:99)
at 
org.apache.nifi.amqp.processors.PublishAMQP.rendezvousWithAmqp(PublishAMQP.java:137)

Cheers
Oleg
On Dec 8, 2016, at 1:04 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:

Got it, let me dig in, I got time now

On Dec 8, 2016, at 12:46 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

No problem Oleg. My fault for not being clear. The test works, demonstrating 
message delivery from the RMQ exchange to the RMQ queue. I was trying to be 
systematic and eliminate points of possible failure, the binding of the 
exchange to the queue in RMQ being one.

It does not work, NiFi to RMQ. I see that a connection is created by RMQ but 
that connection is blocking. I do not believe the content ever actually gets 
handed to the RMQ exchange b

Re: How to integrate a custom protocol with Apache Nifi

2016-12-07 Thread Oleg Zhurakousky
Kant

So, yes, NiFi already provides basic integration with HTTP by virtue of the 
PostHTTP processor (posting the content of a FlowFile to an HTTP endpoint) and 
the same in reverse with GetHTTP. So for your case the integration will be as 
simple as posting to or getting from your microservice.

And what makes it even more interesting is that NiFi in a microservices 
architecture can play an even more significant role as coordinator, 
orchestrator and mediator of a heterogeneous environment of many microservices.

Hope that answers it.
Cheers
Oleg


On Dec 7, 2016, at 2:48 PM, kant kodali 
<kanth...@gmail.com> wrote:

Sorry, I should have been more clear. My question is even simpler and more 
naive. Say I am writing an HTTP-based microservice (I assume NiFi has 
integration with HTTP since it is a well-known protocol). Now, how would I 
integrate NiFi with my HTTP-based server?

On Wed, Dec 7, 2016 at 4:55 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
Kant

There are a couple of questions here, so let me try one at a time.
1. “why Nifi would act as a server?”. Well, NiFi is a runtime environment where 
things are happening. To be more exact: NiFi is a runtime environment where 
things are triggered. The big distinction here is trigger vs. happening. In 
other words you may choose (as most processors do) to have NiFi act as an 
execution container and run your code, or you may choose for NiFi to be a 
triggering container that triggers your code to run elsewhere. An example of the 
latter is the SpringContextProcessor which, while it still runs in the same JVM, 
delegates execution to a Spring application that runs in its own container 
(which could as well be a separate JVM or any other container, à la 
microservices). So in this case NiFi still manages the orchestration, mediation, 
security etc.

2. “...do I need to worry about how my service would talk to Nifi or do I have 
the freedom of just focusing on my application and using whatever protocol I 
want and In the end just plugin to Nifi which would take care?” That is a 
loaded question and, I must admit, not fully understood, but I’ll give it a try. 
When integrating with NiFi you use one of the integration points such as a 
Processor and/or a ControllerService. Those are NiFi-known strategies 
(interfaces), so your custom Processor would need to implement such a strategy 
and obviously be compliant with its contract. However, the other part of your 
question is about implementing something “independent” and just plugging it in, 
and whether that is possible. My answer is still YES, as long as you design it 
that way. As an example of such design you may want to look at the AMQP support, 
where there is a pair of processors (PublishAMQP/ConsumeAMQP); when you look at 
the code of these processors you’ll see that neither has any AMQP dependencies. 
Instead they depend on another class (let’s call it Worker), and that class is 
completely independent of NiFi. So in summary your protocol-specific Processor 
is independent of the protocol and your protocol-specific Worker is independent 
of NiFi, and the two delegate between one another through an object common to 
both (in this case byte[]). The Spring processor mentioned above implements the 
same pattern (a schematic sketch follows).
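As a schematic illustration of that separation (the class and method names 
below are made up for the example, not NiFi's actual code):

public class WorkerPatternSketch {
    // Protocol-specific worker: knows the wire protocol, knows nothing about NiFi.
    static class AmqpWorker {
        void publish(byte[] payload) {
            // AMQP-specific send logic would live here
        }
    }
    // NiFi-facing side: knows about FlowFiles and sessions, knows nothing about
    // the protocol. It only extracts the FlowFile content as byte[] and hands it over.
    static class PublishProcessorShell {
        private final AmqpWorker worker = new AmqpWorker();
        void onTrigger(byte[] flowFileContent) {
            worker.publish(flowFileContent); // byte[] is the only shared contract
        }
    }
    public static void main(String[] args) {
        new PublishProcessorShell().onTrigger("hello".getBytes());
    }
}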

And one more point about microservices. I am willing to go as far as saying 
that NiFi is a variation of a microservices container. I am basing it on the 
fact that NiFi components (i.e., processors) implement a fundamental 
microservices pattern - a certain independence from their runtime and 
persistence of their results - which allows each and every component to be 
managed (started/stopped/reconfigured/changed) independently of 
upstream/downstream components.

Ok, that is a load to process, so I’ll stop ;)

Look forward to more questions

Cheers
Oleg


On Dec 7, 2016, at 4:52 AM, kant kodali 
<kanth...@gmail.com> wrote:

I am also confused a little bit since I am new to NiFi. I wonder why NiFi would 
act as a server? Isn't NiFi a routing layer between systems? Because this 
brings up another question about NiFi in general.

When I write my applications/microservices, do I need to worry about how my 
service would talk to NiFi, or do I have the freedom of just focusing on my 
application and using whatever protocol I want and in the end just plug in to 
NiFi, which would take care of it? In other words, is NiFi a tight integration 
with applications such that I always have to import a NiFi library within my 
application/microservice? In other words, do I need to worry about NiFi at 
programming/development time of an application/microservice, or at deployment 
time?

Sorry if these are naive questions. But answers to those will help greatly and 
prevent me from asking more questions!

Thanks much!
kant




On Wed, Dec 7, 2016 at 1:23 AM, kant kodali 
<kanth...@gmail.com> wrote:
Hi K

Re: How to integrate a custom protocol with Apache Nifi

2016-12-07 Thread Oleg Zhurakousky
Ohh no, not at all. Just tell it the host/port combination and off you go, 
basically the second part of what you said

Oleg
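For illustration, a minimal sketch of such a standalone HTTP service that a 
PostHTTP processor could be pointed at, using only the JDK's built-in server; 
the port 8080 and the /ingest path are made-up values:

import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class MicroserviceSketch {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        // PostHTTP would be configured with the URL http://host:8080/ingest;
        // each FlowFile's content arrives as the POST body.
        server.createContext("/ingest", exchange -> {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (InputStream in = exchange.getRequestBody()) {
                byte[] chunk = new byte[8192];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n); // the FlowFile content
                }
                byte[] reply = ("received " + buf.size() + " bytes")
                        .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, reply.length);
                exchange.getResponseBody().write(reply);
            } finally {
                exchange.close();
            }
        });
        server.start();
    }
}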

On Dec 7, 2016, at 3:08 PM, kant kodali 
<kanth...@gmail.com> wrote:

So do I need to import the PostHTTP/GetHTTP processor libraries in my 
application code? Or can I just build an HTTP server and tell the NiFi JVM that 
"hey, here is my application (it runs on a specific IP and port) and it uses HTTP".

On Wed, Dec 7, 2016 at 12:00 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
Kant

So, yes, NiFi already provides basic integration with HTTP by virtue of the 
PostHTTP processor (posting the content of a FlowFile to an HTTP endpoint) and 
the same in reverse with GetHTTP. So for your case the integration will be as 
simple as posting to or getting from your microservice.

And what makes it even more interesting is that NiFi in a microservices 
architecture can play an even more significant role as coordinator, 
orchestrator and mediator of a heterogeneous environment of many microservices.

Hope that answers it.
Cheers
Oleg


On Dec 7, 2016, at 2:48 PM, kant kodali 
<kanth...@gmail.com> wrote:

Sorry, I should have been more clear. My question is even simpler and more 
naive. Say I am writing an HTTP-based microservice (I assume NiFi has 
integration with HTTP since it is a well-known protocol). Now, how would I 
integrate NiFi with my HTTP-based server?

On Wed, Dec 7, 2016 at 4:55 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
Kant

There are a couple of questions here, so let me try one at a time.
1. “why Nifi would act as a server?”. Well, NiFi is a runtime environment where 
things are happening. To be more exact: NiFi is a runtime environment where 
things are triggered. The big distinction here is trigger vs. happening. In 
other words you may choose (as most processors do) to have NiFi act as an 
execution container and run your code, or you may choose for NiFi to be a 
triggering container that triggers your code to run elsewhere. An example of the 
latter is the SpringContextProcessor which, while it still runs in the same JVM, 
delegates execution to a Spring application that runs in its own container 
(which could as well be a separate JVM or any other container, à la 
microservices). So in this case NiFi still manages the orchestration, mediation, 
security etc.

2. “...do I need to worry about how my service would talk to Nifi or do I have 
the freedom of just focusing on my application and using whatever protocol I 
want and In the end just plugin to Nifi which would take care?” That is a 
loaded question and, I must admit, not fully understood, but I’ll give it a try. 
When integrating with NiFi you use one of the integration points such as a 
Processor and/or a ControllerService. Those are NiFi-known strategies 
(interfaces), so your custom Processor would need to implement such a strategy 
and obviously be compliant with its contract. However, the other part of your 
question is about implementing something “independent” and just plugging it in, 
and whether that is possible. My answer is still YES, as long as you design it 
that way. As an example of such design you may want to look at the AMQP support, 
where there is a pair of processors (PublishAMQP/ConsumeAMQP); when you look at 
the code of these processors you’ll see that neither has any AMQP dependencies. 
Instead they depend on another class (let’s call it Worker), and that class is 
completely independent of NiFi. So in summary your protocol-specific Processor 
is independent of the protocol and your protocol-specific Worker is independent 
of NiFi, and the two delegate between one another through an object common to 
both (in this case byte[]). The Spring processor mentioned above implements the 
same pattern.

And one more point about microservices. I am willing to go as far as saying 
that NiFi is a variation of a microservices container. I am basing it on the 
fact that NiFi components (i.e., processors) implement a fundamental 
microservices pattern - a certain independence from their runtime and 
persistence of their results - which allows each and every component to be 
managed (started/stopped/reconfigured/changed) independently of 
upstream/downstream components.

Ok, that is a load to process, so I’ll stop ;)

Look forward to more questions

Cheers
Oleg


On Dec 7, 2016, at 4:52 AM, kant kodali 
<kanth...@gmail.com> wrote:

I am also confused a little bit since I am new to NiFi. I wonder why NiFi would 
act as a server? Isn't NiFi a routing layer between systems? Because this 
brings up another question about NiFi in general.

When I write my applications/microservices do I need to worry about how my 
service would talk to Nifi or do I have the freedom of just focusing on my 
application and using wh

Re: NiFi PlublishAMQP using cert CN as username

2016-12-10 Thread Oleg Zhurakousky
Brian

Thank you for the detailed explanation.
I don't believe you're doing anything wrong. We just need to add the feature 
you describe (pulling credentials from the certificate).

Would you mind creating a JIRA ticket and, if at all possible, attaching sample 
code that demonstrates exactly what you're trying to accomplish?

Cheers
Oleg
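For reference, at the plain RabbitMQ Java client level the feature described 
here corresponds to the EXTERNAL SASL mechanism; a rough sketch, where the 
keystore/truststore paths, passwords and host below are placeholders:

import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DefaultSaslConfig;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import java.io.FileInputStream;
import java.security.KeyStore;

public class ExternalAuthSketch {
    public static void main(String[] args) throws Exception {
        char[] pass = "changeit".toCharArray(); // placeholder password
        KeyStore ks = KeyStore.getInstance("PKCS12"); // client cert + key
        ks.load(new FileInputStream("client.p12"), pass);
        KeyManagerFactory kmf = KeyManagerFactory.getInstance("SunX509");
        kmf.init(ks, pass);
        KeyStore ts = KeyStore.getInstance("JKS"); // trusted CA cert
        ts.load(new FileInputStream("truststore.jks"), pass);
        TrustManagerFactory tmf = TrustManagerFactory.getInstance("SunX509");
        tmf.init(ts);
        SSLContext ctx = SSLContext.getInstance("TLSv1.2");
        ctx.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);

        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbit-host"); // placeholder
        factory.setPort(5671);
        factory.useSslProtocol(ctx);
        // EXTERNAL: no username/password is sent; the broker derives the user
        // from the client certificate (here, its common name).
        factory.setSaslConfig(DefaultSaslConfig.EXTERNAL);
        try (Connection conn = factory.newConnection()) {
            System.out.println("Authenticated via certificate CN");
        }
    }
}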


On Dec 10, 2016, at 03:52, Kiran wrote:

Hello,

I'm having a bit of trouble getting NiFi to talk to RabbitMQ using SSL. I've 
created some certificates using openssl and I have been successful in sending 
messages to RabbitMQ when I specify an SSL context and a username/password. In 
this scenario I can see a TLS 1.2 connection form between NiFi and RabbitMQ and 
the username and password are then used to authenticate successfully, so from 
this I know that the certs being used are valid.

What I'm trying to achieve is for the RabbitMQ username to be pulled out of the 
certificate COMMON_NAME so I don't need to provide a username and password. 
I've created a quick test application to confirm that I can connect 
successfully to RabbitMQ using the certs I created and just the certificate CN, 
and this worked, which means it must be something I've done wrong within my 
NiFi processor configuration, which is why I'm sending this email for help :)

The RabbitMQ configuration I'm using is:

  *   RabbitMQ 3.5.4
  *   Erlang 18.0
  *   rabbitmq_auth_mechanism_ssl plugin enabled
  *   Base OS is RHEL 6.5

My RabbitMQ.config contains the following:
[
  {rabbit, [
 {ssl_listeners, [5671]},
 {loopback_users, []},
 {auth_mechanisms, ['EXTERNAL', 'PLAIN']},
 {ssl_options, [{cacertfile,"/home/data/openssl/brian_testca/cacert.pem"},
{certfile,"/home/data/openssl/brian_server/cert.pem"},
{keyfile,"/home/data/openssl/brian_server/key.pem"},
{verify,verify_peer},
{versions, ['tlsv1.2']},
{password,  "MySecretPassword"},
{verify,verify_peer},
{ssl_cert_login_from, common_name},
{fail_if_no_peer_cert,true}]}
   ]}
].

The NiFi configuration I'm using is:

  *
NiFi 0.7.1 (We are in the process of updating to NiFi 1.1.0 but there are some 
dependencies on other projects so it will happen just not for a few months)
  *
2 Clusters each made up of 1 NCM and 3 Nodes
  *
In the PublishAMQP I've put the certificate CN name into the "username" field.

The client certificate I'm using to connect to RabbitMQ has a CN name of: 
"rabbitmq_client". There is an entry for it in the RabbitMQ users with NO 
PASSWORD set.

Error message in the rabbitmq log files:

=ERROR REPORT 7-Dec-2016::21:47:30 ===
closing AMQP connection <0.905.0> (192.168.137.1:54324 -> 192.168.137.128:5671):
{handshake_error,starting,0,
 {amqp_error,access_refused,
 "PLAIN login refused: user 'rabbitmq_client' - 
invalid credentials",
 'connection.start_ok'}}

Please can you tell me if there is something obvious that I've missed in my 
NiFi configuration?

I did have a very brief look at the code and I was thinking that, because the 
USERNAME and PASSWORD are mandatory fields and always used to establish the 
connection, it could be that RabbitMQ prioritises those fields before trying to 
pull out the CN and use that for authentication. The reason I was thinking this 
is that in the test app I created I didn't specify the username or password 
when setting up my ConnectionFactory, but the RabbitMQ documentation says that 
even if you don't specify them they default to guest/guest, so this could be a 
red herring.

Thanks in advance for the help,

Brian


Re: File causes unexpected stoppage in play

2016-12-14 Thread Oleg Zhurakousky
James

Could you also let us know what version of NiFi you are using? The issue with 
properly handling InvalidPathException was fixed in NiFi 0.7.0 as part of 
https://issues.apache.org/jira/browse/NIFI-920
It essentially has this catch block:
} catch (final ProcessException | InvalidPathException e) {
    logger.error("Unable to unpack {} due to {}; routing to failure", new Object[]{flowFile, e});
    session.transfer(flowFile, REL_FAILURE);
}

So I am wondering if you are on the previous release?
Cheers
Oleg

On Dec 14, 2016, at 10:49 AM, Joe Witt wrote:

James,

Can you please share the full log entry for that failure.  It is
possible the processor is not catching the exception properly and
routing the data to failure.  It might simply be letting the exception
loose, thus the framework detects the issue, rolls back the session
and yields the processor.

Likely an easy thing to fix in the processor but please do share the
full nifi-app.log entry for this.

Thanks
Joe

On Wed, Dec 14, 2016 at 10:47 AM, James McMahon wrote:
Hello. Am wondering if anyone knows how to overcome a challenge with
unmappable file characters? I have used a GetFile processor to bring a large
number of zip files into a NiFi flow. All read in successfully. I try to
then use an UnpackContent processor to unzip the files to their individual
file members. Most work just fine. However there appears to be a file that
throws this error in UnpackContent:

failed to process session due to java.nio.file.InvalidPathException:
Malformed input or input contains unmappable characters

My processing stalls. Nothing else flows. What is the proper way to
configure the UnpackContent processor step so that it shuttles such files off
to the side when it encounters them, and permits the other files waiting in
queue to process? I do now have a "failure" path established for my
UnpackContent processor, but for some reason it does not send these problem
files down that path. I suspect it may be because the zip file itself does
unpack successfully but the underlying file(s) within the zip cause processing
to choke.

How can I engineer a flow to overcome this challenge? Thanks in advance for
your help.




Re: How to integrate a custom protocol with Apache Nifi

2016-12-06 Thread Oleg Zhurakousky
Hi Kant

What you’re trying to accomplish is definitely possible, however more 
information may be needed from you.
For example, the way I understand your statement about “integration with many 
systems” is something like JMS, Kafka, TCP, FTP etc. If that is the case, such 
integration is definitely possible with your “custom system” by developing a 
custom Processor and/or ControllerService.
Processors and ControllerServices are the two main integration points within 
NiFi.
You can definitely find many examples by looking at some of the processors 
(i.e., PublishKafka or ConsumeKafka, PublishJMS or ConsumeJMS etc.).

Let us know if you need more help to guide you through the process.

Cheers
Oleg

> On Dec 6, 2016, at 7:46 PM, kant kodali wrote:
> 
> HI All,
> 
> I understand that Apache Nifi has integration with many systems but what If I 
> have an application that talks a custom protocol ? How do I integrate Apache 
> Nifi with the custom protocol?
> 
> Thanks,
> kant



Re: How to integrate a custom protocol with Apache Nifi

2016-12-07 Thread Oleg Zhurakousky
blem, We use NSQ a lot (although not my favorite) and
> want to be able integrate Nifi with NSQ to other systems such as Kafka,
> Spark, Cassandra, ElasticSearch, some micro services that uses HTTP2 and
> some micro service that uses Websockets
>
> (which brings in other question if Nifi has HTTP2 and Websocket processors?)
>
> Also below is NSQ protocol spec. They have Java client library as well.
>
> http://nsq.io/clients/tcp_protocol_spec.html
>
>
> On Tue, Dec 6, 2016 at 5:14 PM, Oleg Zhurakousky
> <ozhurakou...@hortonworks.com> wrote:
>>
>> Hi Kant
>>
>> What you’re trying to accomplish is definitely possible, however more
>> information may be needed from you.
>> For example, the way I understand your statement about “integration with
>> many systems” is something like JMS, Kafka, TCP, FTP etc…. If that is the
>> case such integration is definitely possible with your “custom system” by
>> developing custom Processor and/or ControllerService.
>> Processors and ControllerServices are the two main integration points
>> within NiFi
>> You can definitely find may examples by looking at some of the processors
>> (i.e., PublishKafka or ConsumeKafka, PublishJMS or ConsumeJMS etc.)
>>
>> Let us know if you need more help to guide you through the process.
>>
>> Cheers
>> Oleg
>>
>> > On Dec 6, 2016, at 7:46 PM, kant kodali 
>> > <kanth...@gmail.com> wrote:
>> >
>> > HI All,
>> >
>> > I understand that Apache Nifi has integration with many systems but what
>> > If I have an application that talks a custom protocol ? How do I integrate
>> > Apache Nifi with the custom protocol?
>> >
>> > Thanks,
>> > kant
>>
>





Re: File causes unexpected stoppage in play

2016-12-14 Thread Oleg Zhurakousky
James

This article should help you get started: 
https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions 
- especially since you only have to go through the first part, “Maven 
Processor Archetype” (I am assuming you are familiar with Maven).
Also, make sure that you pick the right NiFi version when executing the 'mvn 
archetype...' command. The current website example uses 1.0.0. Change it to 
0.6.1 (see below).

$> mvn archetype:generate -DarchetypeGroupId=org.apache.nifi 
-DarchetypeArtifactId=nifi-processor-bundle-archetype -DarchetypeVersion=0.6.1 
-DnifiVersion=0.6.1

After that you may want to copy UnpackContent into some package within the 
bundle, rename it, adjust the package/imports to ensure it compiles, and build. 
Make sure you add/modify your processor’s name in the 
src/main/resources/META-INF/services/org.apache.nifi.processor.Processor file.
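For example, assuming the renamed class were com.mycompany.processors.UnzipContent 
(a made-up name for illustration), that services file would contain the single line:

com.mycompany.processors.UnzipContent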

Let me know if you run into any issues.
Cheers
Oleg


On Dec 14, 2016, at 3:08 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

I've never tried to create and deploy my own custom bundle but would like to 
try #3 as you recommend Oleg. If you can offer guidance to that process let me 
try that. Thank you. -Jim

On Wed, Dec 14, 2016 at 2:57 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

Not sure what kind of liberties you have with your environment, but here is 
what you can do outside of a full upgrade.

1. You can download and drop the bundle (NAR) from the NiFi 0.7 release into the 
0.6 release. That should work (just replace it in the ‘lib’ directory of NiFi 
home). However, this particular component is located in ’nifi-standard-nar’, 
which means it contains the majority of the processors and controller services 
that come with NiFi. So while upgrading one component you will essentially 
upgrade almost all of them, and I am not sure what that may lead to, but you can 
definitely try.

2.  You can certainly check out the NiFi 0.6.1 source from the repo and apply 
the same patch to UnpackContent, build the NAR and replace it as described in 
the step above. Even though you’ll still be replacing the entire NAR, the 
changes are only going to be in that single component.

3.  You can create your own custom bundle (NAR), copy the UnpackContent code 
from the 0.7 branch, rename the actual class to something different (e.g., 
UnzipContent) and then deploy it as your own custom bundle and use it in your 
flow.

4.  You can also apply the patch by fixing the byte code in flight, but that 
would require you to write a javaagent and be familiar with byte code 
manipulation frameworks. Not advisable. . .

Honestly, aside from a full upgrade, the only viable option here is #3 since it 
gives you full control and complete isolation from every other component in 
NiFi.

Let me know what you think so we can guide you thru the process.

Cheers
Oleg

On Dec 14, 2016, at 2:18 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Oleg, I am bound to NiFi 0.6.1 by enterprise application dependencies. So this 
fix will not be in my baseline if I understand you correctly. Let me ask you 
this: is there any way I can build this into my code baseline - either through 
a code mod and rebuild or as a custom plugin feature specific to my flow? 
Thanks very much for your help.

On Wed, Dec 14, 2016 at 10:59 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

Could you also let us know what version of NiFi you are using? The issue with 
properly handling InvalidPathException was fixed in NiFi 0.7.0 as part of 
https://issues.apache.org/jira/browse/NIFI-920
It essentially has this catch block:
} catch (final ProcessException | InvalidPathException e) {
    logger.error("Unable to unpack {} due to {}; routing to failure", new Object[]{flowFile, e});
    session.transfer(flowFile, REL_FAILURE);
}

So I am wondering if you are on the previous release?
Cheers
Oleg

On Dec 14, 2016, at 10:49 AM, Joe Witt 
<joe.w...@gmail.com> wrote:

James,

Can you please share the full log entry for that failure.  It is
possible the processor is not catching the exception properly and
routing the data to failure.  It might simply be letting the exception
loose, thus the framework detects the issue, rolls back the session
and yields the processor.

Likely an easy thing to fix in the processor but please do share the
full nifi-app.log entry for this.

Thanks
Joe

On Wed, Dec 14, 2016 at 10:47 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:
Hello. Am wondering if anyone knows how to overcome a challenge with
unmappable file characters? I have used a GetFile processor to bring a large
number of zip files into a NiFi flow. All read in successfully. I try to
then use an UnpackContent processor to unzip the files to t

Re: File causes unexpected stoppage in play

2016-12-14 Thread Oleg Zhurakousky
James

Not sure what kind of liberties you have with your environment, but here is 
what you can do outside of a full upgrade.

1. You can download and drop the bundle (NAR) from the NiFi 0.7 release into the 
0.6 release. That should work (just replace it in the ‘lib’ directory of NiFi 
home). However, this particular component is located in ’nifi-standard-nar’, 
which means it contains the majority of the processors and controller services 
that come with NiFi. So while upgrading one component you will essentially 
upgrade almost all of them, and I am not sure what that may lead to, but you can 
definitely try.

2.  You can certainly check out the NiFi 0.6.1 source from the repo and apply 
the same patch to UnpackContent, build the NAR and replace it as described in 
the step above. Even though you’ll still be replacing the entire NAR, the 
changes are only going to be in that single component.

3.  You can create your own custom bundle (NAR), copy the UnpackContent code 
from the 0.7 branch, rename the actual class to something different (e.g., 
UnzipContent) and then deploy it as your own custom bundle and use it in your 
flow.

4.  You can also apply the patch by fixing the byte code in flight, but that 
would require you to write a javaagent and be familiar with byte code 
manipulation frameworks. Not advisable. . .

Honestly, aside from a full upgrade, the only viable option here is #3 since it 
gives you full control and complete isolation from every other component in 
NiFi.

Let me know what you think so we can guide you thru the process.

Cheers
Oleg

On Dec 14, 2016, at 2:18 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Oleg, I am bound to NiFi 0.6.1 by enterprise application dependencies. So this 
fix will not be in my baseline if I understand you correctly. Let me ask you 
this: is there any way I can build this into my code baseline - either through 
a code mod and rebuild or as a custom plugin feature specific to my flow? 
Thanks very much for your help.

On Wed, Dec 14, 2016 at 10:59 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

Could you also let us know what version of NiFi you are using? The issue with 
properly handling InvalidPathException was fixed in NiFi 0.7.0 as part of 
https://issues.apache.org/jira/browse/NIFI-920
It essentially has this catch block:
} catch (final ProcessException | InvalidPathException e) {
    logger.error("Unable to unpack {} due to {}; routing to failure", new Object[]{flowFile, e});
    session.transfer(flowFile, REL_FAILURE);
}

So I am wondering if you are on the previous release?
Cheers
Oleg

On Dec 14, 2016, at 10:49 AM, Joe Witt 
<joe.w...@gmail.com> wrote:

James,

Can you please share the full log entry for that failure.  It is
possible the processor is not catching the exception properly and
routing the data to failure.  It might simply be letting the exception
loose, thus the framework detects the issue, rolls back the session
and yields the processor.

Likely an easy thing to fix in the processor but please do share the
full nifi-app.log entry for this.

Thanks
Joe

On Wed, Dec 14, 2016 at 10:47 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:
Hello. Am wondering if anyone knows how to overcome a challenge with
unmappable file characters? I have used a GetFile processor to bring a large
number of zip files into a NiFi flow. All read in successfully. I try to
then use an UnpackContent processor to unzip the files to their individual
file members. Most work just fine. However there appears to be a file that
throws this error in UnpackContent:

failed to process session due to java.nio.file.InvalidPathException:
Malformed input or input contains unmappable characters

My processing stalls. Nothing else flows. What is the proper way to
configure the UnpackContent processor step so that it shuttles such files off
to the side when it encounters them, and permits the other files waiting in
queue to process? I do now have a "failure" path established for my
UnpackContent processor, but for some reason it does not send these problem
files down that path. I suspect it may be because the zip file itself does
unpack successfully but the underlying file(s) within the zip cause processing
to choke.

How can I engineer a flow to overcome this challenge? Thanks in advance for
your help.






Re: Migrating NiFi templates/canvas

2017-01-13 Thread Oleg Zhurakousky
Panos

What version of NiFi are you using?
The issue with NiFi template diffs has actually been addressed (see 
https://issues.apache.org/jira/browse/NIFI-826). So as of NiFi 1.0 you can 
export templates in a deterministic way and then execute diff on them to see 
only what was changed. It was addressed specifically for the SDLC issue you are 
describing. Also, when a processor is associated with a controller service and 
you create a template that includes such a processor, its dependent controller 
service will end up in the template and back in your environment once the 
template is imported.

Anyway, give it a shot and let us know.
Cheers
Oleg

On Jan 13, 2017, at 4:49 AM, Panos Geo wrote:


Hello all,

We have a Continuous Integration/Continuous Development pipeline that involves 
for each stage (development, testing, customer deployment) a dedicated virtual 
machine running a NiFi instance. As such, for each stage of the pipeline we 
have to migrate the changes of our NiFi canvas from one virtual machine to 
another.

For this we encounter two big problems :

  1.  As far as I know, there is no easy way to diff NiFi canvases (or 
templates) so that we can recognize the changes and don’t have to export and 
import the full canvas each time (as there are configuration changes in the 
target VM that we wouldn’t like to overwrite).
  2.  If we have to export the full canvas, is there a way to reassociate 
processors with the required services on the target virtual machine (e.g. 
DBCPConnectionPool or HTTPContextMap) without having to go through each 
processor explicitly and do the reassignment?


Many thanks in advance,
Panos



Re: SFTP and Data Integrity

2017-01-05 Thread Oleg Zhurakousky
Nicolas

I don’t believe there is. A possible workaround may be to use the ExecuteScript 
processor, but I’ll be honest, I have not tried. That said, what you are 
describing seems to be a rather valuable feature, and if the underlying API (or 
other libraries) allows for it then we can definitely make it happen in the 
future. Would you mind creating a feature request for it here: 
https://issues.apache.org/jira/browse/NIFI

Cheers
Oleg
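For reference, the integrity check being asked about boils down to something 
like the following plain-Java digest over the fetched bytes; the file path is 
a placeholder, and a script or custom processor would digest the incoming 
stream the same way:

import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;

public class ChecksumSketch {
    public static void main(String[] args) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = new FileInputStream("fetched-file.bin")) { // placeholder
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        // Compare this against the checksum published for the remote file.
        System.out.println("sha256=" + hex);
    }
}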

On Jan 5, 2017, at 4:23 AM, Provenzano Nicolas wrote:

Hi all,

Is there any way of configuring the FetchSFTP processor to calculate a remote 
file checksum in order to check data integrity?

If there isn’t with this processor, is there any other way of doing it?

Thanks in advance,

BR

Nicolas Provenzano



Re: Building nifi locally on my mac

2016-12-29 Thread Oleg Zhurakousky
Actually it may be related to a known JDK issue that has been reported 
several times.
Does it look something like this (see below)? If so, then please upgrade your 
JDK to something newer than 1.8.0_65.

Cheers
Oleg
. . .
testReadFlowFileContentAndStoreInFlowFileAttribute(org.apache.nifi.processors.script.TestInvokeJavascript)
  Time elapsed: 0.206 sec  <<< FAILURE!
java.lang.AssertionError: Processor has 1 validation failures:
'Validation' validated against 
'target/test/resources/javascript/test_reader.js' is invalid because An error 
occurred calling validate in the configured script Processor.

   at org.junit.Assert.fail(Assert.java:88)
   at 
org.apache.nifi.util.MockProcessContext.assertValid(MockProcessContext.java:251)
   at 
org.apache.nifi.util.StandardProcessorTestRunner.assertValid(StandardProcessorTestRunner.java:334)
   at 
org.apache.nifi.processors.script.TestInvokeJavascript.testReadFlowFileContentAndStoreInFlowFileAttribute(TestInvokeJavascript.java:59)

testScriptDefinedRelationship(org.apache.nifi.processors.script.TestInvokeJavascript)
  Time elapsed: 0.102 sec  <<< FAILURE!
java.lang.AssertionError: null
   at org.junit.Assert.fail(Assert.java:86)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at org.junit.Assert.assertTrue(Assert.java:52)
   at 
org.apache.nifi.processors.script.TestInvokeJavascript.testScriptDefinedRelationship(TestInvokeJavascript.java:128)

. . .

On Dec 29, 2016, at 10:11 AM, Joe Percivall wrote:

Hello,

I don't see any text between "... test error :" and "(on CentOS it works 
fine)". Could you try reformatting and resending?


Joe
On Thu, Dec 29, 2016 at 5:26 AM, ddewaele wrote:
When I try to create a local build of nifi on my mac I always get the
following test error : (on CentOS it works fine).

Any idea what is causing this and how this can be fixed ?





--
View this message in context: 
http://apache-nifi-users-list.2361937.n4.nabble.com/Building-nifi-locally-on-my-mac-tp542.html
Sent from the Apache NiFi Users List mailing list archive at 
Nabble.com.



--

- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: jperciv...@apache.com



Re: PublishAMQP had been working, now failing

2017-03-20 Thread Oleg Zhurakousky
James, while I am looking, a couple of questions.
1. Has your actual RabbitMQ version changed since you used NiFi 0.6?
2. Are you using the default guest/guest user/password to connect?

Cheers
Oleg
On Mar 20, 2017, at 11:04 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:

I can certainly share germane pieces from it, Oleg.

Background: NiFi version is 0.7.1. RabbitMQ is 3.6.6. Erlang is 17.5-1.

From nifi-app.log:
java.lang.IllegalStateException: Failed to establish connection with AMQP 
Broker: com.rabbitmq.client.ConnectionFactory@78..
.
. lots of stack trace lines
.
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Caused by: com.rabbitmq.client.AuthenticationFailureException: ACCESS REFUSED 
- Login was refused using authentication PLAIN. For details see the broker 
logfile.

From the rmq log file:
accepting AMQP connection <0.13753.9> (127.0.0.1:54318 -> 127.0.0.1:5673)

=ERROR REPORT 20-Mar-2017:10:46:30 ===
closing AMQP connection <0.20511.61> (127.0.0.1:48902 -> 127.0.0.1:5673):
{bad header,<< 22,3,3,0,195,1,0,0>>}

I had recently upgraded my NiFi from 0.6.x to 0.7.1. My broker log shows 
successful connections without the error report when I was still 0.6.x. Since I 
upgraded I have not been able to connect with success.

Thanks in advance for your help. I am very interested in why I seem to be 
getting this bad header message now. -Jim

On Mon, Mar 20, 2017 at 10:00 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James, any chance you can provide stack trace from the logs?

Cheers
Oleg

> On Mar 20, 2017, at 15:34, James McMahon 
<jsmcmah...@gmail.com> wrote:
>
> Good morning. I am using a NiFi 0.7 code base. I recently upgraded to that 
> from a 0.6.x version. I'm limited to that 0.7.x baseline now. I realize it is 
> old(er), but I have no choice in the matter.
>
> I had a PublishAMQP processor that had been working without issue to connect 
> to RabbitMQ but is now failing with this error:
>
> Failed to establish connection with AMQP Broker: 
> com.rabbitmq.client.ConnectionFactory@45cd2c16
>
> I cannot determine why this is now happening. RabbitMQ appears to be started and 
> running fine. I'm connecting to host localhost, and a specific port that 
> seems available. I get the same error if I (temporarily) turn off my firewall 
> entirely.
>
> I am attempting to employ a StandardSSLContextService that Enables without 
> error.
>
> How can I troubleshoot this error and figure out why I cannot connect to 
> rabbitMQ using PublishAMQP? Thanks in advance for any help you can offer. -Jim




Re: PublishAMQP had been working, now failing

2017-03-20 Thread Oleg Zhurakousky
James

Basically my next question would be whether, admin-wise, the new user had been 
given the correct permissions, as I believe you are experiencing the symptoms 
described in the following thread: 
http://stackoverflow.com/questions/26811924/spring-amqp-rabbitmq-3-3-5-access-refused-login-was-refused-using-authentica

Let me know
Oleg

On Mar 20, 2017, at 12:02 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Hi Oleg. The RabbitMQ version is unchanged. I created a user called test, with 
its own password. I endeavor to connect as that user from PublishAMQP.  Thanks 
very much for looking more closely at this. -Jim

On Mon, Mar 20, 2017 at 11:55 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James, while I am looking, a couple of questions.
1. Has your actual RabbitMQ version changed since you used NiFi 0.6?
2. Are you using the default guest/guest user/password to connect?

Cheers
Oleg

On Mar 20, 2017, at 11:04 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:

I can certainly share germane pieces from it, Oleg.

Background: NiFi version is 0.7.1. RabbitMQ is 3.6.6. Erlang is 17.5-1.

From nifi-app.log:
java.lang.IllegalStateException: Failed to establish connection with AMQP 
Broker: com.rabbitmq.client.ConnectionFactory@78..
.
. lots of stack trace lines
.
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Caused by: com.rabbitmq.client.AuthenticationFailureException: ACCESS REFUSED 
- Login was refused using authentication PLAIN. For details see the broker 
logfile.

From the rmq log file:
accepting AMQP connection <0.13753.9> (127.0.0.1:54318 -> 127.0.0.1:5673)

=ERROR REPORT 20-Mar-2017:10:46:30 ===
closing AMQP connection <0.20511.61> (127.0.0.1:48902 -> 127.0.0.1:5673):
{bad header,<< 22,3,3,0,195,1,0,0>>}

I had recently upgraded my NiFi from 0.6.x to 0.7.1. My broker log shows 
successful connections without the error report when I was still 0.6.x. Since I 
upgraded I have not been able to connect with success.

Thanks in advance for your help. I am very interested in why I seem to be 
getting this bad header message now. -Jim

On Mon, Mar 20, 2017 at 10:00 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James, any chance you can provide stack trace from the logs?

Cheers
Oleg

> On Mar 20, 2017, at 15:34, James McMahon 
<jsmcmah...@gmail.com> wrote:
>
> Good morning. I am using a NiFi 0.7 code base. I recently upgraded to that 
> from a 0.6.x version. I'm limited to that 0.7.x baseline now. I realize it is 
> old(er), but I have no choice in the matter.
>
> I had a PublishAMQP processor that had been working without issue to connect 
> to RabbitMQ but is now failing with this error:
>
> Failed to establish connection with AMQP Broker: 
> com.rabbitmq.client.ConnectionFactory@45cd2c16
>
> I cannot determine why this is now happening. RabbitMQ appears to be started and 
> running fine. I'm connecting to host localhost, and a specific port that 
> seems available. I get the same error if I (temporarily) turn off my firewall 
> entirely.
>
> I am attempting to employ a StandardSSLContextService that Enables without 
> error.
>
> How can I troubleshoot this error and figure out why I cannot connect to 
> rabbitMQ using PublishAMQP? Thanks in advance for any help you can offer. -Jim






Re: PublishAMQP had been working, now failing

2017-03-20 Thread Oleg Zhurakousky
James, any chance you can provide stack trace from the logs?

Cheers 
Oleg

> On Mar 20, 2017, at 15:34, James McMahon wrote:
> 
> Good morning. I am using a NiFi 0.7 code base. I recently upgraded to that 
> from a 0.6.x version. I'm limited to that 0.7.x baseline now. I realize it is 
> old(er), but I have no choice in the matter.
> 
> I had a PublishAMQP processor that had been working without issue to connect 
> to RabbitMQ but is now failing with this error:
> 
> Failed to establish connection with AMQP Broker: 
> com.rabbitmq.client.ConnectionFactory@45cd2c16
> 
> I cannot determine why this is now happening. RabbitMQ appears to be started and 
> running fine. I'm connecting to host localhost, and a specific port that 
> seems available. I get the same error if I (temporarily) turn off my firewall 
> entirely.
> 
> I am attempting to employ a StandardSSLContextService that Enables without 
> error.
> 
> How can I troubleshoot this error and figure out why I cannot connect to 
> rabbitMQ using PublishAMQP? Thanks in advance for any help you can offer. -Jim


Re: PublishAMQP had been working, now failing

2017-03-20 Thread Oleg Zhurakousky
James

My availability this week is somewhat intermittent, but I will try to dig 
deeper later on

Sent from my iPhone

On Mar 20, 2017, at 19:11, James McMahon 
<jsmcmah...@gmail.com> wrote:

My situation appears to be different from this one. There is no bad header 
error message exhibited in their case, and it's definitely there every time in 
mine. Additionally, it appears that I do indeed already have a non-Guest user 
with configure/write/read privileges.

That bad header notification may be key, but I cannot determine why that is 
being triggered with my upgraded version of NiFi.

On Mon, Mar 20, 2017 at 12:14 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James

Basically my next question would be whether, admin-wise, the new user had been 
given the correct permissions, as I believe you are experiencing the symptoms 
described in the following thread: 
http://stackoverflow.com/questions/26811924/spring-amqp-rabbitmq-3-3-5-access-refused-login-was-refused-using-authentica

Let me know
Oleg

On Mar 20, 2017, at 12:02 PM, James McMahon 
<jsmcmah...@gmail.com> wrote:

Hi Oleg. The RabbitMQ version is unchanged. I created a user called test, with 
its own password. I endeavor to connect as that user from PublishAMQP.  Thanks 
very much for looking more closely at this. -Jim

On Mon, Mar 20, 2017 at 11:55 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James, while I am looking, a couple of questions.
1. Has your actual RabbitMQ version changed since you used NiFi 0.6?
2. Are you using the default guest/guest user/password to connect?

Cheers
Oleg

On Mar 20, 2017, at 11:04 AM, James McMahon 
<jsmcmah...@gmail.com> wrote:

I can certainly share germane pieces from it, Oleg.

Background: NiFi version is 0.7.1. RabbitMQ is 3.6.6. Erlang is 17.5-1.

From nifi-app.log:
java.lang.IllegalStateException: Failed to establish connection with AMQP 
Broker: com.rabbitmq.client.ConnectionFactory@78..
.
. lots of stack trace lines
.
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Caused by: com.rabbitmq.client.AuthenticationFailureException: ACCESS REFUSED 
- Login was refused using authentication PLAIN. For details see the broker 
logfile.

From the rmq log file:
accepting AMQP connection <0.13753.9> (127.0.0.1:54318 -> 127.0.0.1:5673)

=ERROR REPORT 20-Mar-2017:10:46:30 ===
closing AMQP connection <0.20511.61> (127.0.0.1:48902 -> 127.0.0.1:5673):
{bad header,<< 22,3,3,0,195,1,0,0>>}

I had recently upgraded my NiFi from 0.6.x to 0.7.1. My broker log shows 
successful connections without the error report when I was still 0.6.x. Since I 
upgraded I have not been able to connect with success.

Thanks in advance for your help. I am very interested in why I seem to be 
getting this bad header message now. -Jim

On Mon, Mar 20, 2017 at 10:00 AM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
James, any chance you can provide stack trace from the logs?

Cheers
Oleg

> On Mar 20, 2017, at 15:34, James McMahon 
<jsmcmah...@gmail.com> wrote:
>
> Good morning. I am using a NiFi 0.7 code base. I recently upgraded to that 
> from a 0.6.x version. I'm limited to that 0.7.x baseline now. I realize it is 
> old(er), but I have no choice in the matter.
>
> I had a PublishAMQP processor that had been working without issue to connect 
> to RabbitMQ but is now failing with this error:
>
> Failed to establish connection with AMQP Broker: 
> com.rabbitmq.client.ConnectionFactory@45cd2c16
>
> I cannot determine why this is now happening. RabbitMQ appears to be started 
> and running fine. I'm connecting to host localhost, and a specific port that 
> seems available. I get the same error if I (temporarily) turn off my firewall 
> entirely.
>
> I am attempting to employ a StandardSSLContextService that Enables without 
> error.
>
> How can I troubleshoot this error and figure out why I cannot connect to 
> rabbitMQ using PublishAMQP? Thanks in advance for any help you can offer. -Jim







Re: Automating updates in process groups through templates

2017-03-09 Thread Oleg Zhurakousky
Jim

I see where you’re going with this, and I do believe it would be a valuable 
enterprise feature. Unfortunately, template XML files are not the bootstrap 
source of the flow the way the flow.gz file is. They are only used for 
save/move/migrate operations, which means any modifications to them are not 
going to be picked up automatically.
But as I said, I do personally believe that something like this would be 
extremely valuable and could potentially be put on the NiFi roadmap. Would you 
mind raising a Feature Request - https://issues.apache.org/jira/browse/NIFI - 
so we can at least track it and gauge the public interest?

Cheers
Oleg
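
Until something like that exists, one workaround is to script the 
re-deployment: upload the updated template once, then re-instantiate it into 
each process group through the NiFi REST API. A rough sketch follows, assuming 
a NiFi 1.x-style API; the /process-groups/{id}/template-instance endpoint and 
the JSON shape are assumptions to verify against your NiFi version (a 0.7.x 
baseline exposes different paths), and the base URL and IDs are illustrative:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ReinstantiateTemplate {
    // Assumptions: NiFi 1.x REST API, no authentication, IDs known in advance.
    static final String NIFI = "http://localhost:8080/nifi-api";

    public static void main(String[] args) throws Exception {
        String groupId = args[0];    // process group to refresh
        String templateId = args[1]; // the updated template's ID

        // Instantiate the template at the origin of the target process group.
        String body = String.format(
                "{\"templateId\":\"%s\",\"originX\":0.0,\"originY\":0.0}",
                templateId);
        HttpURLConnection conn = (HttpURLConnection) new URL(
                NIFI + "/process-groups/" + groupId + "/template-instance")
                .openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}

The missing half - deleting the previously instantiated copy of the template's 
components from each group, each with its current revision - still has to be 
scripted component by component, which is exactly why a first-class "update in 
place" feature would be valuable.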

> On Mar 9, 2017, at 6:46 AM, James McMahon  wrote:
> 
> Good morning. We are developing NiFi workflows with a few key strategic goals 
> in mind, one of which is this: change common workflows in one place only, and 
> have those changes ripple through to all process groups that employ that code.
> 
> More specifically: we have a complex but common workflow that we save as 
> template XYZ. That template is used within Process Group ABC, DEF, and XYZ. 
> We need to make improvements to the template, and want those improvements to 
> be picked up automatically by each of the process groups. How do we make that 
> happen?
> 
> Thank you in advance for your insights.
> 
> Jim



Re: Deadletter

2017-03-01 Thread Oleg Zhurakousky
Well, in my experience Dead Letter is more of a global storage mechanism for 
undeliverable units of “something” (i.e., messages). While I do see how 
“failed” Flow Files may resemble such undeliverable/unprocessable units of 
work, the nature of Data Flow makes it a bit different, since there is really 
no one-stop solution for it. Some may be satisfied with auto-termination, 
while others may try to reprocess or route to a different sub-flow, which may 
be the actual delegation to some storage where things can be reviewed. And for 
that case, as you’ve already noticed, NiFi provides all the moving parts. So 
PutS3 or PutFile followed by PutSlack or SendEmail are just different 
implementations of the "store and notify” pattern that you are essentially 
presenting here.

Also, keep in mind that once a Flow File has failed there is no one-stop 
re-processing solution. You may have certain Flow Files that can be 
automatically fixed and re-sent, while others may need manual intervention, 
and for that you really need to build and customize the Dead Letter component, 
which may include some internal splitting and routing of Flow Files that could 
be fixed vs. the ones that cannot be.

Cheers
Oleg
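
For the "split what can be fixed from what cannot" step described above, a 
custom processor is one way to do it. A minimal sketch, assuming a hypothetical 
failure.reason attribute stamped by an upstream processor (the attribute name 
and the retryable/dead classification are illustrative, not anything NiFi 
provides out of the box):

import java.util.HashSet;
import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

public class RouteDeadLetter extends AbstractProcessor {

    static final Relationship REL_RETRYABLE = new Relationship.Builder()
            .name("retryable")
            .description("Failures that can be re-sent automatically")
            .build();

    static final Relationship REL_DEAD = new Relationship.Builder()
            .name("dead")
            .description("Failures that need manual review")
            .build();

    @Override
    public Set<Relationship> getRelationships() {
        Set<Relationship> rels = new HashSet<>();
        rels.add(REL_RETRYABLE);
        rels.add(REL_DEAD);
        return rels;
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session)
            throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }
        // Assumption: an upstream processor stamped the failure cause here.
        String reason = flowFile.getAttribute("failure.reason");
        if ("timeout".equals(reason) || "connection".equals(reason)) {
            // Penalize before routing so automatic retries don't spin hot.
            session.transfer(session.penalize(flowFile), REL_RETRYABLE);
        } else {
            session.transfer(flowFile, REL_DEAD);
        }
    }
}

Penalizing the retryable path keeps retries from hammering the flow, which 
speaks to the observation below that the framework does not penalize failure 
relationships on its own.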

On Mar 1, 2017, at 3:38 PM, Nick Carenza 
<nick.care...@thecontrolgroup.com> wrote:

Sorry for the confusion, I meant to put emphasis on the _you_, as in 'you all' 
or other users of nifi. I am looking to get insight into solutions others have 
implemented to deal with failures.

- Nick

On Wed, Mar 1, 2017 at 12:29 PM, Oleg Zhurakousky 
<ozhurakou...@hortonworks.com> wrote:
Nick

Since you’ve already designed a Process Group (PG) that is specific to failed 
flow files, I am not sure I understand your last question: “. . .How do you 
manage failure relationships?. . .”
I am assuming that within your global flow all failure relationships are sent 
to this PG, which essentially is a Dead Letter Storage.

Are you asking about how to get more information from the failed Flow Files 
(i.e., failure location, reason, etc.)?

Cheers
Oleg

On Mar 1, 2017, at 3:21 PM, Nick Carenza 
<nick.care...@thecontrolgroup.com> wrote:

I have a lot of processors in my flow, all of which can, and do, route 
flowfiles to their failure relationships at some point.

In the first iteration of my flow, I routed every failure relationship to an 
inactive DebugFlow, but monitoring these was difficult: I wouldn't get 
notifications when something started to fail, and if the queue filled up it 
would apply backpressure and prevent new, good flowfiles from being processed.

Not only was that just not a good way to handle failures, but my flow was 
littered with all of these do-nothing processors and was an eyesore. So then I 
tried routing processor failure relationships into themselves, which tidied up 
my flow but caused nifi to go berserk when a failure occurred, because the 
failure relationship is not penalized (nor should it be) and most processors 
don't provide a 'Retry' relationship (InvokeHttp being a notable exception). 
But really, most processors wouldn't conceivably succeed if they were tried 
again. I mostly just wanted the flowfiles to sit there until I had a chance to 
check out why they failed and fix them manually.

This leads me to https://issues.apache.org/jira/browse/NIFI-3351. I think I 
need a way to store failed flowfiles, fix them and reprocess them. The process 
group I am currently considering implementing everywhere is:

Input Port [Failed Flowfile] --> PutS3 deadletter///${uuid} --> PutSlack
ListS3 deadletter/// --> FetchS3 --> Output Port [Fixed]

This gives me storage of failed messages, logically grouped and in a place that 
won't block up my flow since s3 never goes down, err... wait. Configurable 
process groups or templates like https://issues.apache.org/jira/browse/NIFI-1096 
would make this easier to reuse.

How do you manage failure relationships?

- Nick




