Re: Flow File Stuck for no reason

2016-11-30 Thread Manish G
Hi Joe,

Thanks for the quick reply. Yes, the processor keeps running on a single
thread (even after stopping), and the number remains there after stopping.
Today, it happened on my customized putHDFS processor. The only difference
in this processor is that I have added an attribute that tells whether
the processor created the directory while loading the file onto HDFS. I
don't think this should be the issue, though.

Regards,
Manish


On Wed, Nov 30, 2016 at 7:05 PM, Joe Witt  wrote:

> Manish
>
> When it is stuck, do you see a number in the top right corner of the
> processor?  When you stop it, does the number remain?  That number
> tells you how many threads are still executing.  Which processor are
> we talking about?  When it is in the stuck state, can you please run
> bin/nifi.sh dump?  If you can then share the nifi-bootstrap.log, that
> would help us narrow in on a possible cause.
>
> Thanks
> Joe
>
> On Wed, Nov 30, 2016 at 7:02 PM, Manish G  wrote:
> >
> > Hi,
> >
> > I have noticed that sometimes a flow file gets stuck on a processor for a
> > very long time for no reason, and then I cannot even stop the processor
> > to look at the flow file in the queue. If I click on stop, the processor
> > goes into a state where I cannot start or stop it.
> >
> > On restarting NiFi, the file gets processed successfully and is routed
> > to the success queue. I checked the app log, but everything seems to be
> > normal for the flow file. I don't see anything unusual in provenance
> > either (except that the queue time is in hours).
> >
> > Has anyone else faced a similar issue? What else should I check to
> > identify the root cause?
> >
> > Thanks,
> > Manish
>





Re: Flow File Stuck for no reason

2016-11-30 Thread Joe Witt
Manish

When it is stuck, do you see a number in the top right corner of the
processor?  When you stop it, does the number remain?  That number
tells you how many threads are still executing.  Which processor are
we talking about?  When it is in the stuck state, can you please run
bin/nifi.sh dump?  If you can then share the nifi-bootstrap.log, that
would help us narrow in on a possible cause.
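
Once bin/nifi.sh dump has written the thread dump, it can help to filter it
for the threads that are sitting inside the suspect processor. The following
is a minimal sketch only. It assumes each thread in the dump starts with a
header of the form "name" Id=N STATE followed by "at ..." frames; the real
layout in nifi-bootstrap.log may differ between versions. The sample dump and
the class name MyPutHDFS are hypothetical.

```python
import re
from collections import defaultdict

# Assumed header shape for each thread in the dump, for example:
#   "Timer-Driven Process Thread-3" Id=57 RUNNABLE
# This is an assumption; adjust the pattern to match your actual log.
THREAD_HEADER = re.compile(r'"(?P<name>[^"]+)"\s+Id=(?P<id>\d+)\s+(?P<state>[A-Z_]+)')


def threads_touching(dump_text, needle):
    """Return {thread name: [stack frames]} for threads whose stack mentions needle."""
    stacks = defaultdict(list)
    current = None
    for line in dump_text.splitlines():
        stripped = line.strip()
        header = THREAD_HEADER.search(stripped)
        if header:
            current = header.group("name")
            stacks[current]  # register the thread even if it has no frames
        elif current and stripped.startswith("at "):
            stacks[current].append(stripped[3:])
    return {
        name: frames
        for name, frames in stacks.items()
        if any(needle in frame for frame in frames)
    }


# Hypothetical sample; a real dump would be read from nifi-bootstrap.log.
SAMPLE_DUMP = '''
"Timer-Driven Process Thread-3" Id=57 RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at com.example.MyPutHDFS.onTrigger(MyPutHDFS.java:120)
"NiFi Web Server-21" Id=88 WAITING
    at sun.misc.Unsafe.park(Native Method)
'''

stuck = threads_touching(SAMPLE_DUMP, "MyPutHDFS")
for name, frames in stuck.items():
    print(name, "->", frames[0])
```

The top frame of a matching thread usually shows what the processor is
waiting on (a socket read, a lock, and so on), which is the detail worth
sharing on the list.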

Thanks
Joe

On Wed, Nov 30, 2016 at 7:02 PM, Manish G  wrote:
>
> Hi,
>
> I have noticed that sometimes a flow file gets stuck on a processor for a
> very long time for no reason, and then I cannot even stop the processor to
> look at the flow file in the queue. If I click on stop, the processor
> goes into a state where I cannot start or stop it.
>
> On restarting NiFi, the file gets processed successfully and is routed to
> the success queue. I checked the app log, but everything seems to be normal
> for the flow file. I don't see anything unusual in provenance either (except
> that the queue time is in hours).
>
> Has anyone else faced a similar issue? What else should I check to identify
> the root cause?
>
> Thanks,
> Manish


Flow File Stuck for no reason

2016-11-30 Thread Manish G
Hi,

I have noticed that sometimes a flow file gets stuck on a processor for a
very long time for no reason, and then I cannot even stop the processor to
look at the flow file in the queue. If I click on stop, the processor
goes into a state where I cannot start or stop it.

On restarting NiFi, the *file gets processed successfully* and is routed
to the success queue. I checked the app log, but everything seems to be
normal for the flow file. I don't see anything unusual in provenance either
(except that the queue time is in hours).

Has anyone else faced a similar issue? What else should I check to identify
the root cause?

Thanks,
Manish


Re: Error instantiating template on cluster: The specified observer identifier already exists.

2016-11-30 Thread Matt Gilman
Simon,

Sorry for the late response here. I believe we encountered and addressed
this issue [1] today. Unfortunately, it just missed being included in
Apache NiFi 1.1.0 by a couple of days. If you're able to build from source,
this should be fixed in the current master branch. Again, sorry for the
inconvenience.

Matt

[1] https://issues.apache.org/jira/browse/NIFI-3129


On Thu, Oct 27, 2016 at 8:43 AM, Simon Tack  wrote:

> Hello,
>
>
>
> I am running into a problem moving some fairly large flows (40-50
> processors) from a NiFi 1.0.0 standalone instance to a 3-node NiFi 1.0.0
> cluster.  I saved the flows in the standalone instance as templates and
> uploaded the templates to the cluster instance.  When I instantiate the
> template on the cluster instance, I get an error dialog box that says, The
> specified observer identifier already exists.  After a delay of about 30
> seconds, the flow I was trying to instantiate appears, but it has no
> connections.  Here is the only thing I could find in the logs (this was in
> the nifi-user log, ip-addresses removed):
>
>
>
> 2016-10-26 07:32:09,589 INFO [NiFi Web Server-209]
> org.apache.nifi.web.filter.RequestLogger Attempting request for
> (anonymous) POST http://xxx.xxx.xxx.xxx:/nifi-api/process-groups/
> bf104e19-0157-1000-96a7-b87c2ff614ca/template-instance (source ip:
> yyy.yyy.yyy.yyy)
>
> 2016-10-26 07:32:09,630 INFO [NiFi Web Server-209] 
> o.a.n.w.a.c.IllegalStateExceptionMapper
> java.lang.IllegalStateException: The specified observer identifier
> already exists.. Returning Conflict response.
>
> 2016-10-26 07:32:14,299 INFO [NiFi Web Server-194]
> org.apache.nifi.web.filter.RequestLogger Attempting request for
> (anonymous) GET http://xxx.xxx.xxx.xxx:/nifi-api/flow/process-groups/
> bf104e19-0157-1000-96a7-b87c2ff614ca (source ip: yyy.yyy.yyy.yyy)
>
>
>
> This does not happen with every flow I try to move from the standalone
> instance to the cluster instance.  But it does consistently happen or not
> happen with any given flow.  For example, I made a template on the
> standalone instance for a flow with 1 processor, saved it and uploaded it
> to the cluster instance with no problem.
>
>
>
> As a workaround, I have been able to do this:
>
>
>
> 1.   Disconnect one node from the cluster
>
> 2.   Instantiate my flows on the disconnected node from the templates
>
> 3.   Shut down NiFi on all of the nodes
>
> 4.   Copy the flow.xml.gz file from the updated node to the other 2
> nodes
>
> 5.   Restart NiFi on all of the nodes
>
>
>
> After this the flows are instantiated and work fine.
>
>
>
> Has anyone seen this before?  Any ideas on why I may be getting this error?
>
>
>
> Thank you,
>
>
>
> Simon
>
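
Step 4 of the workaround quoted above (copying flow.xml.gz from the updated
node to the other nodes) could be scripted roughly as follows. This is a
sketch under stated assumptions: it treats every node's conf directory as a
locally reachable path, whereas real cluster nodes normally live on separate
hosts and would need scp or a similar tool. The helper name copy_flow and the
directory layout are hypothetical, and NiFi must be shut down on all nodes
first, as in step 3.

```python
import shutil
import tempfile
from pathlib import Path


def copy_flow(source_conf, target_confs, name="flow.xml.gz"):
    """Copy the flow definition from one node's conf directory to the others."""
    source = Path(source_conf) / name
    copied = []
    for target in target_confs:
        destination = Path(target) / name
        shutil.copy2(source, destination)  # overwrites the stale copy
        copied.append(destination)
    return copied


# Demonstration with throwaway directories standing in for three nodes.
root = Path(tempfile.mkdtemp())
nodes = [root / f"node{i}" / "conf" for i in (1, 2, 3)]
for conf in nodes:
    conf.mkdir(parents=True)
    (conf / "flow.xml.gz").write_bytes(b"old flow")
# node1 plays the role of the node where the templates were instantiated.
(nodes[0] / "flow.xml.gz").write_bytes(b"updated flow")

copied = copy_flow(nodes[0], nodes[1:])
print([path.read_bytes() for path in copied])
```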


[ANNOUNCE] Apache NiFi 1.1.0 Release

2016-11-30 Thread Joe Witt
Hello

The Apache NiFi team would like to announce the release of Apache NiFi 1.1.0.

Apache NiFi is an easy to use, powerful, and reliable system to
process and distribute data.  Apache NiFi was made for dataflow.  It
supports highly configurable directed graphs of data routing,
transformation, and system mediation logic.

This release is the result of fantastic community contribution across
feature requests, documentation, bug reports, code contributions,
reviews, and release validation.

The release highlights:

https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.1.0

More details on Apache NiFi can be found here:
  http://nifi.apache.org/

The release artifacts can be downloaded from here:
  http://nifi.apache.org/download.html

Maven artifacts have been made available here:
  https://repository.apache.org/content/repositories/releases/org/apache/nifi/

Issues closed/resolved for this list can be found here:
  https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12337875

Thank you
The Apache NiFi team


Save the date: ApacheCon Miami, May 15-19, 2017

2016-11-30 Thread Rich Bowen
Dear Apache enthusiast,

ApacheCon and Apache Big Data will be held at the Intercontinental in
Miami, Florida, May 16-18, 2017. Submit your talks, and register, at
http://apachecon.com/  Talks aimed at the Big Data section of the event
should go to
http://events.linuxfoundation.org/events/apache-big-data-north-america/program/cfp
while other talks should go to
http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp


ApacheCon is the best place to meet the people that develop the software
that you use and rely on. It’s also a great opportunity to deepen your
involvement in the project, and perhaps make the leap to contributing.
And we find that user case studies, showcasing how you use Apache
projects to solve real world problems, are very popular at this event.
So, do consider whether you have a use case that might make a good
presentation.

ApacheCon will have many different ways that you can participate:

Technical Content: We’ll have three days of technical sessions covering
many of the projects at the ASF. We’ll be publishing a schedule of talks
on March 9th, so that you can plan what you’ll be attending.

BarCamp: The Apache BarCamp is a standard feature of ApacheCon - an
un-conference style event, where the schedule is determined on-site by
the attendees, and anything is fair game.

Lightning Talks: Even if you don’t give a full-length talk, the
Lightning Talks are five minute presentations on any topic related to
the ASF, and can be given by any attendee. If there’s something you’re
passionate about, consider giving a Lightning Talk.

Sponsor: It costs money to put on a conference, and this is a great
opportunity for companies involved in Apache projects, or who benefit
from Apache code - your employers - to get their name and products in
front of the community. Sponsors can start at any monetary level, and
can sponsor everything from the conference badge lanyard, through larger
items such as video recordings and evening events. For more information
on sponsoring ApacheCon, see http://apachecon.com/sponsor/

So, get your tickets today at http://apachecon.com/ and submit your
talks. ApacheCon Miami is going to be our best ApacheCon yet, and you,
and your project, can’t afford to miss it.

-- 
Rich Bowen - rbo...@apache.org
VP, Conferences
http://apachecon.com
@apachecon



Re: Processors and

2016-11-30 Thread Aldrin Piri
Hi Andreas,

1)  Nothing at the framework level provides this. However, a typical option
is to use an attribute written by an upstream processor to help categorize
and handle the data.  The attributes written vary from processor to
processor, or they can be explicitly set/updated using the UpdateAttribute
processor.
2)  This is also not something handled universally by the framework; it is
left to individual processors.  Some processors, such as InvokeHTTP and, I
believe, those for AWS, do set such attributes when a failure happens.  What
you are attempting to do, though, seems like a good enhancement to add to
the processor and, frankly, a reasonable thing to work toward providing more
universally across components in the application.  For the time being,
however, your UpdateAttribute approach is the best option.
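
The UpdateAttribute approach can be pictured with a small simulation. This is
only a sketch in plain Python, not NiFi API code: flowfile attributes are
modelled as a dict, the attribute name failure.relationship is hypothetical,
and the three relationship names are what I assume FetchSFTP exposes, so
check them against your version. The retry rule is likewise just an
illustration.

```python
# Hypothetical attribute name; any name agreed on within the flow would do.
FAILURE_ATTR = "failure.relationship"

# Failure relationships assumed for FetchSFTP; confirm against your version.
FETCH_SFTP_FAILURES = ("comms.failure", "not.found", "permission.denied")


def tag_failure(attributes, relationship):
    """Mimic one UpdateAttribute processor: return a copy with the tag added."""
    if relationship not in FETCH_SFTP_FAILURES:
        raise ValueError(f"unknown relationship: {relationship}")
    tagged = dict(attributes)
    tagged[FAILURE_ATTR] = relationship
    return tagged


def handle_failure(attributes):
    """Mimic the downstream failure processor: branch on the tag."""
    relationship = attributes.get(FAILURE_ATTR, "unknown")
    # Illustrative rule only: treat communication failures as retryable.
    retryable = relationship == "comms.failure"
    return {
        "file": attributes.get("filename"),
        "cause": relationship,
        "retry": retryable,
    }


flowfile = {"filename": "report.csv"}
result = handle_failure(tag_failure(flowfile, "not.found"))
print(result)
```

In a real flow, each of the three failure relationships would feed its own
UpdateAttribute processor setting the same attribute to a different value,
and all three would then connect to the single failure-handling processor.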

Would you mind opening up a JIRA issue so we can discuss this a bit more
and evaluate trying to extend such functionality in a standardized way?


On Wed, Nov 30, 2016 at 9:08 AM, Andreas Petter (External) <
andreas.petter.exter...@telefonica.com> wrote:

> Hello everybody,
>
>
>
> I have 2 questions:
>
> 1.   Is there some way to find out through which relationship/queue a
> FlowFile walked into a processor, in the onTrigger-Method?
>
> 2.   Is there a generic way how errors (e.g. Exceptions) are
> propagated with FlowFiles to subsequent processors?
>
>
>
> Background Story:
>
> I am writing a failure processor which handles failure events from
> FetchSFTP outgoing relationships, writing some flowfile attributes into a
> database and performing some further tasks to cope with the error. Now I
> would like to know through which of the three failure-reporting-relationships
> the FlowFile came along and get some generic failure information (e.g. the
> Exception). Right now I am adding 3 UpdateAttribute processors which each
> add an attribute identifying the relationship (and thereby the type of
> error). Maybe there is a better way to do this? I am using NiFi 1.0.
>
>
>
> Thank you very much for any help you might provide.
>
> Kind regards,
>
> Andreas Petter
>


Processors and

2016-11-30 Thread Andreas Petter (External)
Hello everybody,

I have 2 questions:

1.   Is there some way to find out through which relationship/queue a 
FlowFile walked into a processor, in the onTrigger-Method?

2.   Is there a generic way how errors (e.g. Exceptions) are propagated 
with FlowFiles to subsequent processors?

Background Story:
I am writing a failure processor which handles failure events from FetchSFTP 
outgoing relationships, writing some flowfile attributes into a database and 
performing some further tasks to cope with the error. Now I would like to know 
through which of the three failure-reporting-relationships the FlowFile came 
along and get some generic failure information (e.g. the Exception). Right now 
I am adding 3 UpdateAttribute processors which each add an attribute 
identifying the relationship (and thereby the type of error). Maybe there is a 
better way to do this? I am using NiFi 1.0.

Thank you very much for any help you might provide.
Kind regards,
Andreas Petter


Re: Merge Content : triggering merge according to a field

2016-11-30 Thread Joe Witt
Nicolas,

I don't believe MergeContent supports this.  It does support a similar
pattern for segmented data, in that it knows how to recombine the
segments: it handles out-of-order arrival and understands
start/end/indexes.  However, it requires that specific metadata be made
available.  You'd want to build or have something that supports your
particular case.
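
To make the "build something" option concrete, here is a minimal sketch of
the buffering logic such a custom component would need. It is plain Python,
not a NiFi processor: the class name SessionMerger is hypothetical, records
are assumed to be CSV lines with the session ID in the first field and the
state in the second, and it assumes the "ended" record always arrives last
for a session.

```python
from collections import defaultdict


class SessionMerger:
    """Buffer CSV records per session ID and release them when the session ends."""

    def __init__(self, end_state="ended"):
        self.end_state = end_state
        self.pending = defaultdict(list)  # session ID -> buffered records

    def offer(self, record):
        """Add one CSV line; return the merged text if it closes the session, else None."""
        fields = [field.strip() for field in record.split(",")]
        session_id, state = fields[0], fields[1]
        self.pending[session_id].append(record)
        if state != self.end_state:
            return None
        # The closing record arrived: emit everything held for this session.
        return "\n".join(self.pending.pop(session_id))


merger = SessionMerger()
print(merger.offer("100, started, 1, 2"))
print(merger.offer("100, inProgress, 2, 4"))
print(merger.offer("100, ended, 4,6"))
```

A production version would also need a timeout or size limit so that
sessions whose "ended" record never arrives do not stay buffered forever.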

Thanks
Joe

On Wed, Nov 30, 2016 at 4:07 AM, Provenzano Nicolas
 wrote:
> Hi all,
>
>
>
> Yet another question…
>
>
>
> I defined a flow to process session information coming from CSV files.
>
>
>
> Each record contains a session ID, a session state and some session
> counters.
>
>
>
> The merge content processor allows merging flowfiles according to an
> attribute (the correlation attribute name) so I was able to merge flowfiles
> according to the session ID.
>
>
>
> However, I would like to trigger the merging only when the session state
> reaches a specific value (for example, ended).
>
>
>
> Please note that session info can be distributed over several input flows
> but always end with state = ended.
>
>
>
> So for example, a first CSV file contains :
>
>
>
> 100, started, 1, 2
>
>
>
> A second CSV file contains
>
>
>
> 100, inProgress, 2, 4
>
>
>
> And a third CSV file contains
>
>
>
> 100, ended, 4,6
>
>
>
> The merge content processor should then produce the following flowfile :
>
>
>
> 100, started, 1, 2
>
> 100, inProgress, 2, 4
>
> 100, ended, 4,6
>
>
>
> Meaning that the first two records should be held until the “ended” one
> is received.
>
>
>
> Is there any way of doing this directly with this processor or with any
> others?
>
>
>
> Thanks
>
>
>
> Nicolas


Merge Content : triggering merge according to a field

2016-11-30 Thread Provenzano Nicolas
Hi all,

Yet another question...

I defined a flow to process session information coming from CSV files.

Each record contains a session ID, a session state and some session counters.

The merge content processor allows merging flowfiles according to an attribute 
(the correlation attribute name) so I was able to merge flowfiles according to 
the session ID.

However, I would like to trigger the merging only when the session state 
reaches a specific value (for example, ended).

Please note that session info can be distributed over several input flows but 
always end with state = ended.

So for example, a first CSV file contains :

100, started, 1, 2

A second CSV file contains

100, inProgress, 2, 4

And a third CSV file contains

100, ended, 4,6

The merge content processor should then produce the following flowfile :

100, started, 1, 2
100, inProgress, 2, 4
100, ended, 4,6

Meaning that the first two records should be held until the "ended" one is 
received.

Is there any way of doing this directly with this processor or with any others?

Thanks

Nicolas