[DRAFT][REPORT] Apache NiFi Board Report Jan 2018

2018-01-08 Thread Joe Witt
Team,

It is that time again to send in our board report.  I'll submit it
tomorrow but would appreciate it if you could take a quick look and point
out things to tweak, add, or remove.  Really proud of what we're
accomplishing!

Thanks
Joe

-=-=-=-=-=-=-

## Description:
 - Apache NiFi is an easy to use, powerful, and reliable system to
   process and distribute data.
 - Apache NiFi MiNiFi is an edge data collection agent built to seamlessly
   integrate with and leverage the command and control of NiFi. There are
   both Java and C++ implementations.
 - Apache NiFi Registry is a centralized registry for key configuration items
   including flow versions, assets, and extensions for Apache NiFi
   and Apache MiNiFi.
 - Apache NiFi Nar Maven Plugin is a release artifact used for supporting
   the NiFi classloader isolation model.

## Issues:
 - There are no issues requiring board attention at this time.

## Activity:
 - Conducted several releases including Apache NiFi Registry,
   Apache NiFi MiNiFi CPP, and Apache NiFi MiNiFi Java. The Apache NiFi Registry
   release is a very important step forward for the community as it provides a
   powerful software development lifecycle mechanism for Apache NiFi users and
   answers a long-standing, often-discussed request.
 - The Apache NiFi 1.5.0 release candidate is under vote at the time of this
   board report's writing.
 - Voted in a significant number of new committers and PMC members and are
   extremely fortunate to see a strong committer pipeline continue through newer
   contributors making significant contributions.

## Health report:
 - Health of the community remains strong. Mailing list and JIRA activity is
   consistent. ASF Hipchat is serving as an on-ramp for new users to our
   mailing list and JIRA systems. We continue to see new users and
   contributors.
 - As we called out in our previous report, we anticipated significant
   recognition of earned merit within the committer and PMC ranks. That
   held true, and the pipeline remains encouraging.

## PMC changes:

 - Currently 26 PMC members.
 - New PMC members:
   - Marc Parisi was added to the PMC on Wed Dec 13 2017
   - Jeff Storck was added to the PMC on Mon Dec 04 2017
   - Scott Aslan was added to the PMC on Fri Dec 01 2017
   - Michael W Moser was added to the PMC on Sun Nov 19 2017
 - Last PMC member added Wed Dec 13 2017.

## Committer base changes:

 - Currently 38 committers.
 - New committers:
   - Kevin Doran was added as a committer on Wed Jan 03 2018
   - Andrew Ian Christianson was added as a committer on Mon Nov 13 2017
   - Mike Hogue was added as a committer on Thu Nov 09 2017
   - Peter Wicks was added as a committer on Thu Nov 09 2017
 - Last committer added Wed Jan 03 2018.

## Releases:

 - Apache NiFi Registry 0.1.0 was released Jan 1 2018.
 - Apache NiFi MiNiFi Java 0.3.0 was released Dec 22 2017.
 - Apache NiFi MiNiFi CPP 0.3.0 was released Nov 30 2017.

## Mailing list activity:

 - Activity on the mailing lists remains high with a mixture
   of new users, contributors, and deeper more experienced users and
   contributors sparking discussion and questions and filing bugs or
   new features.

 - us...@nifi.apache.org:
   - 581 subscribers (up 25 in the last 3 months)
   - 783 emails sent to list (721 in previous quarter)

 - dev@nifi.apache.org:
   - 388 subscribers (up 5 in the last 3 months)
   - 679 emails sent to list (590 in previous quarter)

 - iss...@nifi.apache.org:
   - 46 subscribers (up 0 in the last 3 months)
   - 6950 emails sent to list (5388 in previous quarter)


## JIRA activity:

 - 356 JIRA tickets created in the last 3 months
 - 276 JIRA tickets closed/resolved in the last 3 months


Re: NiFi data HA in cluster mode

2018-01-08 Thread Joe Witt
That is a fair point, Brett - I wasn't thinking of that when I answered.
Then again, we should create those connections lazily, so if we don't
I'd call that a bug :)

Ben,

Yeah there is definitely intent to provide distributed data durability
across nodes.  This is especially important as it serves as a great
way to support elastic clustering behavior.

I'm not sure HDFS as the backing store is best, and we all have to keep
in mind that we must ensure distributed durability of flowfile, content,
and provenance data.  That might mean application-level replication similar
to what Apache Kafka does.  That might mean distributed durable block
storage and then deciding which node is responsible for processing a
given set of data at a time.  There are a lot of ways to slice this,
and they all offer different tradeoffs.

On Mon, Jan 8, 2018 at 11:37 PM, Brett Ryan  wrote:
> I had someone from Hortonworks suggest to me that I should also set any 
> PutSQL processors to only execute on primary. The reasoning was due to 
> flooding of the JDBC pool.
>
>> On 9 Jan 2018, at 17:25, Joe Witt  wrote:
>>
>> I'd avoid setting any processor to primary node only unless it is a
>> source processor (something that brings data into the system).
>>
>> But, yes, I believe your description is accurate as of now.
>>
>> Thanks
>>
>>> On Mon, Jan 8, 2018 at 11:21 PM, 尹文才  wrote:
>>> Thanks Joe, so you mean for example, if I set one processor to run only on
>>> primary node in the cluster and there're 100 FlowFiles in the incoming
>>> queue of the processor
>>> waiting to be processed by this processor, and the processor suddenly goes
>>> down and then another node is elected as the primary node, those 100
>>> FlowFiles will be kept locally
>>> in the node that went down and will continue to be processed by the node
>>> when it goes back online, these FlowFiles will not be available to the new
>>> primary node and other nodes,
>>> am I correct?
>>>
>>> Regards,
>>> Ben
>>>
>>>
>>> 2018-01-09 14:08 GMT+08:00 Joe Witt :
>>>
 Ben,

 Data already mid-flow within a node will be kept on the node and
 processed when the node is back on-line.  All other data coming into
 the cluster can fail-over to other nodes provided you're sourcing data
 with queuing semantics or automated load balancing or fail-over as-is
 present in the Apache NiFi Site to Site protocol.

 Thanks
 Joe

> On Mon, Jan 8, 2018 at 11:05 PM, 尹文才  wrote:
> Hi guys, I have a question about data HA when NiFi is run in clustered
> mode, if one node goes down, will the flowfiles owned by this node taken
> over and processed by another node?
> Or will the flowfiles be kept locally to that node and will only be
> processed when that node is back online? Thanks.
>
> Regards,
> Ben



Re: NiFi data HA in cluster mode

2018-01-08 Thread Brett Ryan
I had someone from Hortonworks suggest to me that I should also set any PutSQL 
processors to only execute on primary. The reasoning was due to flooding of the 
JDBC pool.

> On 9 Jan 2018, at 17:25, Joe Witt  wrote:
> 
> I'd avoid setting any processor to primary node only unless it is a
> source processor (something that brings data into the system).
> 
> But, yes, I believe your description is accurate as of now.
> 
> Thanks
> 
>> On Mon, Jan 8, 2018 at 11:21 PM, 尹文才  wrote:
>> Thanks Joe, so you mean for example, if I set one processor to run only on
>> primary node in the cluster and there're 100 FlowFiles in the incoming
>> queue of the processor
>> waiting to be processed by this processor, and the processor suddenly goes
>> down and then another node is elected as the primary node, those 100
>> FlowFiles will be kept locally
>> in the node that went down and will continue to be processed by the node
>> when it goes back online, these FlowFiles will not be available to the new
>> primary node and other nodes,
>> am I correct?
>> 
>> Regards,
>> Ben
>> 
>> 
>> 2018-01-09 14:08 GMT+08:00 Joe Witt :
>> 
>>> Ben,
>>> 
>>> Data already mid-flow within a node will be kept on the node and
>>> processed when the node is back on-line.  All other data coming into
>>> the cluster can fail-over to other nodes provided you're sourcing data
>>> with queuing semantics or automated load balancing or fail-over as-is
>>> present in the Apache NiFi Site to Site protocol.
>>> 
>>> Thanks
>>> Joe
>>> 
 On Mon, Jan 8, 2018 at 11:05 PM, 尹文才  wrote:
 Hi guys, I have a question about data HA when NiFi is run in clustered
 mode, if one node goes down, will the flowfiles owned by this node taken
 over and processed by another node?
 Or will the flowfiles be kept locally to that node and will only be
 processed when that node is back online? Thanks.
 
 Regards,
 Ben
>>> 


Re: NiFi data HA in cluster mode

2018-01-08 Thread 尹文才
Thanks Joe, I will try to avoid setting processors to primary node only. By the
way, I've seen that someone posted a suggestion about data HA on NiFi's
wiki (HDFSContentRepository); is there a plan for that feature to be
implemented and included in NiFi?

Regards,
Ben

2018-01-09 14:25 GMT+08:00 Joe Witt :

> I'd avoid setting any processor to primary node only unless it is a
> source processor (something that brings data into the system).
>
> But, yes, I believe your description is accurate as of now.
>
> Thanks
>
> On Mon, Jan 8, 2018 at 11:21 PM, 尹文才  wrote:
> > Thanks Joe, so you mean for example, if I set one processor to run only
> on
> > primary node in the cluster and there're 100 FlowFiles in the incoming
> > queue of the processor
> > waiting to be processed by this processor, and the processor suddenly
> goes
> > down and then another node is elected as the primary node, those 100
> > FlowFiles will be kept locally
> > in the node that went down and will continue to be processed by the node
> > when it goes back online, these FlowFiles will not be available to the
> new
> > primary node and other nodes,
> > am I correct?
> >
> > Regards,
> > Ben
> >
> >
> > 2018-01-09 14:08 GMT+08:00 Joe Witt :
> >
> >> Ben,
> >>
> >> Data already mid-flow within a node will be kept on the node and
> >> processed when the node is back on-line.  All other data coming into
> >> the cluster can fail-over to other nodes provided you're sourcing data
> >> with queuing semantics or automated load balancing or fail-over as-is
> >> present in the Apache NiFi Site to Site protocol.
> >>
> >> Thanks
> >> Joe
> >>
> >> On Mon, Jan 8, 2018 at 11:05 PM, 尹文才  wrote:
> >> > Hi guys, I have a question about data HA when NiFi is run in clustered
> >> > mode, if one node goes down, will the flowfiles owned by this node
> taken
> >> > over and processed by another node?
> >> > Or will the flowfiles be kept locally to that node and will only be
> >> > processed when that node is back online? Thanks.
> >> >
> >> > Regards,
> >> > Ben
> >>
>


Re: NiFi data HA in cluster mode

2018-01-08 Thread Joe Witt
I'd avoid setting any processor to primary node only unless it is a
source processor (something that brings data into the system).

But, yes, I believe your description is accurate as of now.

Thanks

On Mon, Jan 8, 2018 at 11:21 PM, 尹文才  wrote:
> Thanks Joe, so you mean for example, if I set one processor to run only on
> primary node in the cluster and there're 100 FlowFiles in the incoming
> queue of the processor
> waiting to be processed by this processor, and the processor suddenly goes
> down and then another node is elected as the primary node, those 100
> FlowFiles will be kept locally
> in the node that went down and will continue to be processed by the node
> when it goes back online, these FlowFiles will not be available to the new
> primary node and other nodes,
> am I correct?
>
> Regards,
> Ben
>
>
> 2018-01-09 14:08 GMT+08:00 Joe Witt :
>
>> Ben,
>>
>> Data already mid-flow within a node will be kept on the node and
>> processed when the node is back on-line.  All other data coming into
>> the cluster can fail-over to other nodes provided you're sourcing data
>> with queuing semantics or automated load balancing or fail-over as-is
>> present in the Apache NiFi Site to Site protocol.
>>
>> Thanks
>> Joe
>>
>> On Mon, Jan 8, 2018 at 11:05 PM, 尹文才  wrote:
>> > Hi guys, I have a question about data HA when NiFi is run in clustered
>> > mode, if one node goes down, will the flowfiles owned by this node taken
>> > over and processed by another node?
>> > Or will the flowfiles be kept locally to that node and will only be
>> > processed when that node is back online? Thanks.
>> >
>> > Regards,
>> > Ben
>>


Re: NiFi data HA in cluster mode

2018-01-08 Thread 尹文才
Thanks Joe, so you mean, for example, if I set one processor to run only on
the primary node in the cluster and there are 100 FlowFiles in the incoming
queue of that processor waiting to be processed, and the node suddenly goes
down and another node is elected as the primary node, those 100 FlowFiles
will be kept locally on the node that went down and will continue to be
processed by that node when it comes back online; these FlowFiles will not
be available to the new primary node or the other nodes.
Am I correct?

Regards,
Ben


2018-01-09 14:08 GMT+08:00 Joe Witt :

> Ben,
>
> Data already mid-flow within a node will be kept on the node and
> processed when the node is back on-line.  All other data coming into
> the cluster can fail-over to other nodes provided you're sourcing data
> with queuing semantics or automated load balancing or fail-over as-is
> present in the Apache NiFi Site to Site protocol.
>
> Thanks
> Joe
>
> On Mon, Jan 8, 2018 at 11:05 PM, 尹文才  wrote:
> > Hi guys, I have a question about data HA when NiFi is run in clustered
> > mode, if one node goes down, will the flowfiles owned by this node taken
> > over and processed by another node?
> > Or will the flowfiles be kept locally to that node and will only be
> > processed when that node is back online? Thanks.
> >
> > Regards,
> > Ben
>


Re: NiFi data HA in cluster mode

2018-01-08 Thread Joe Witt
Ben,

Data already mid-flow within a node will be kept on the node and
processed when the node is back online.  All other data coming into
the cluster can fail over to other nodes, provided you're sourcing data
with queuing semantics or with the automated load balancing or fail-over
present in the Apache NiFi Site-to-Site protocol.

Thanks
Joe

On Mon, Jan 8, 2018 at 11:05 PM, 尹文才  wrote:
> Hi guys, I have a question about data HA when NiFi is run in clustered
> mode, if one node goes down, will the flowfiles owned by this node taken
> over and processed by another node?
> Or will the flowfiles be kept locally to that node and will only be
> processed when that node is back online? Thanks.
>
> Regards,
> Ben


NiFi data HA in cluster mode

2018-01-08 Thread 尹文才
Hi guys, I have a question about data HA when NiFi is run in clustered
mode: if one node goes down, will the flowfiles owned by this node be taken
over and processed by another node?
Or will the flowfiles be kept local to that node and only be
processed when that node is back online? Thanks.

Regards,
Ben


Re: [DISCUSS] Apache NiFi 1.5.0 release candidate ready

2018-01-08 Thread Joe Witt
Team

Lots of great progress and things appear stable.  Will review this NPE
fix for QueryRecord and then I plan to kick off the RC generation and
hopefully have it out tonight/early tomorrow for a vote.

Thanks
Joe

On Fri, Jan 5, 2018 at 2:57 PM, Joe Witt  wrote:
> Mike,
>
> Just looked at each and I'll comment on each in their PR but i'll
> include any of these that are ready whenever the RC starts.
>
> Thanks
>
> On Fri, Jan 5, 2018 at 1:09 PM, Mike Thomsen  wrote:
>> I have 3 PRs for new processors that have been partially or completely
>> reviewed that might be candidates:
>>
>> FlattenJson: https://github.com/apache/nifi/pull/2307
>> DeleteHBaseRow: https://github.com/apache/nifi/pull/2294
>> RunMongoAggregation: https://github.com/apache/nifi/pull/2180
>>
>> I think Matt is mostly done with the reviews for FlattenJson and
>> RunMongoAggregation, but I don't think anyone with commit rights has
>> reviewed DeleteHBaseRow.
>>
>> Thanks,
>>
>> Mike
>>
>> On Fri, Jan 5, 2018 at 10:56 AM, Joe Witt  wrote:
>>
>>> cool.  will keep an eye out
>>>
>>> On Jan 5, 2018 10:50 AM, "Andrew Lim"  wrote:
>>>
>>> > I’ve been working on updating the documentation to include content
>>> related
>>> > to registry integration [1].  I think the changes should be included in
>>> > 1.5.0 (tagged the Jira accordingly) and expect to submit a PR soon.
>>> >
>>> > [1] https://issues.apache.org/jira/browse/NIFI-4679
>>> >
>>> > Thanks,
>>> > -Drew
>>> >
>>> > > On Jan 5, 2018, at 10:39 AM, Joe Witt  wrote:
>>> > >
>>> > > Team,
>>> > >
>>> > > Here is the current status of the tagged 1.5.0 JIRAs for NiFi [1]
>>> > >
>>> > > Looks like even the outstanding items all have patches ready with
>>> > > active reviews.  I just set fix version on the registry integration
>>> > > since those are clearly nearing completion.
>>> > >
>>> > > I'll plan to start the RC early next week so please advise if there is
>>> > > anything glaring/outstanding that isn't accounted for.  This release
>>> > > is loaded with a lot of goodness so this should be fun.
>>> > >
>>> > > [1] https://issues.apache.org/jira/projects/NIFI/versions/
>>> > 12341668#release-report-tab-body
>>> > >
>>> > > Thanks
>>> > > Joe
>>> >
>>> >
>>>


Re: Getting lost a bit

2018-01-08 Thread Brett Ryan
I should have been more thorough, but yes, of course one must be careful, which
is why I don't necessarily like GUI merge tools.

NiFi is pretty well structured, so merge problems like you suggest should be
less likely unless refactoring of the core happens a lot.

Git has a pretty smart merge strategy, with recursive being the default, which
allows for method renames.

If you get into a situation where two people have been working in the same block,
causing a conflict, there may be a problem with what is trying to be achieved.

> On 9 Jan 2018, at 05:00, Jeff  wrote:
> 
> Brett,
> 
> You don't want to necessarily delete the other side and just keep the
> section you've modified...  There may be changes that were made to that
> code that you'll want to add to your own changes.  A simple example would
> be that a new method call was added in the section that you've modified,
> which caused the conflict.  Deleting that other side of the diff would
> remove that method call, and undo someone else's work.  We need to be
> careful when resolving conflicts to make sure the final commit is a proper
> merging of the commits (yours and and any from the git repo) that were made
> since the commit hash from which you were making your changes.
> 
> - Jeff
> 
>> On Mon, Jan 8, 2018 at 1:58 AM Brett Ryan  wrote:
>> 
>> I must be an odd one, I’m more comfortable resolving conflicts from the
>> cli ;)
>> 
>> The simplest way is the CSV resolution strategy. Edit each file looking
>> for <<< —- >>> blocks where conflicts are presented. Keep the section you
>> want deleting the other side, then git add the file. Do for all conflicts
>> then git commit.
>> 
>> Fetch often to try and keep your base aligned in order to keep conflicts
>> as small as possible.
>> 
>>> On 8 Jan 2018, at 17:03, James Wing  wrote:
>>> 
>>> Mike,
>>> 
>>> I believe you will want to rebase your branch on the latest master before
>>> you submit your PR.  First, update your master branch if you have not
>>> already done so, with commands like the following:
>>> 
>>> git checkout master
>>> git fetch origin
>>> git merge origin/master
>>> 
>>> Second, rebase your feature branch so that your commits are relative to
>> the
>>> latest master:
>>> 
>>> git checkout NIFI-4731
>>> git rebase master
>>> 
>>> Given the recent changes in the GCP bundle, you may have some conflicts
>> to
>>> resolve.  A GUI tool like IntelliJ can be very handy for resolving git
>>> conflicts if you are not familiar with the command-line git resolution
>>> process (almost nobody is).
>>> 
>>> If that does not work for you, an alternative would be to create a new
>>> branch from the latest master, then manually copy/paste your changes to
>>> make a more concise commit.
>>> 
>>> Thanks,
>>> 
>>> James
>>> 
>>> On Sun, Jan 7, 2018 at 7:16 PM, Mikhail Sosonkin
>> >> 
 Hi Devs,
 
 I'm trying to create a PR for this branch in my fork
 https://github.com/nologic/nifi/tree/NIFI-4731. I see that master has
 moved
 on to 1.5.0 but I'd like to have the processor built for 1.4 and later.
 It's the version we're using. However, I don't see an origin/1.4.0
>> based on
 the instructions from the contributor guide. How do I go about this?
>> Please
 give simple instructions, I'm not exactly a git master :)
 
 Thanks,
 Mike.
 
 --
 This email may contain material that is confidential for the sole use of
 the intended recipient(s).  Any review, reliance or distribution or
 disclosure by others without express permission is strictly
>> prohibited.  If
 you are not the intended recipient, please contact the sender and delete
 all copies of this message.
>> 


Re: site-to-site timeout

2018-01-08 Thread Mark Bean
Thanks for confirming. As this question was mostly for 1.x, I'm happy to
hear (and read further into your statement) that site-to-site will
continue to function on surviving Nodes even if one or more Nodes go down.

Thanks,
Mark


On Mon, Jan 8, 2018 at 1:58 PM, Mark Payne  wrote:

> Mark,
>
> That does sound correct. The code would be probably be either in the
> nifi-site-to-site-client module
> or the in the StandardRemoteGroupPort, within the nifi-framework-core
> module. It's been a while since
> I've looked at 0.x, so it's hard to say for sure.
>
> In 1.x, this is a non-issue. The site-to-site client will iterate through
> all nodes in the remote cluster, attempting
> to connect to each of them to determine the cluster topology. If it fails
> to connect to any of them, it will then
> try again after a bit, until it is finally successful.
>
> Thanks
> -Mark
>
>
> > On Jan 8, 2018, at 1:29 PM, Mark Bean  wrote:
> >
> > It was observed in Apache NiFi 0.x that when the NCM goes down or becomes
> > unavailable, data being sent to the cluster via site-to-site continues to
> > flow to the Nodes for 24 hours. There is a state file
> > NIFI_HOME/conf/state/{RPG-UUID}.peers. When this file becomes > 24 hours
> > old, the site-to-site communication finally stops.
> >
> > First, can you confirm this is an accurate observation? Second, where in
> > the code can I find this? Lastly, and most importantly, does this
> behavior
> > apply to 1.x since there is no longer an NCM?
> >
> > Thanks,
> > Mark
>
>


Re: site-to-site timeout

2018-01-08 Thread Mark Payne
Mark,

That does sound correct. The code would probably be either in the
nifi-site-to-site-client module or in StandardRemoteGroupPort, within the
nifi-framework-core module. It's been a while since I've looked at 0.x,
so it's hard to say for sure.

In 1.x, this is a non-issue. The site-to-site client will iterate through all
nodes in the remote cluster, attempting to connect to each of them to determine
the cluster topology. If it fails to connect to any of them, it will then
try again after a bit, until it is finally successful.

Thanks
-Mark
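
A rough illustrative sketch of that retry behaviour, not the actual code in
nifi-site-to-site-client (the peer URLs and the TopologyFetcher type below are
hypothetical):

import java.util.List;

public class TopologyDiscoverySketch {

    // Hypothetical abstraction over "ask one node for the current cluster topology".
    interface TopologyFetcher {
        List<String> fetchTopology(String peerUrl) throws Exception;
    }

    static List<String> discover(List<String> configuredPeers, TopologyFetcher fetcher,
                                 long retryDelayMillis) throws InterruptedException {
        while (true) {
            for (String peer : configuredPeers) {
                try {
                    // Any reachable node can report the full cluster topology.
                    return fetcher.fetchTopology(peer);
                } catch (Exception e) {
                    // This peer is unreachable; fall through and try the next one.
                }
            }
            // No peer responded; wait a bit, then try the whole list again.
            Thread.sleep(retryDelayMillis);
        }
    }
}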


> On Jan 8, 2018, at 1:29 PM, Mark Bean  wrote:
> 
> It was observed in Apache NiFi 0.x that when the NCM goes down or becomes
> unavailable, data being sent to the cluster via site-to-site continues to
> flow to the Nodes for 24 hours. There is a state file
> NIFI_HOME/conf/state/{RPG-UUID}.peers. When this file becomes > 24 hours
> old, the site-to-site communication finally stops.
> 
> First, can you confirm this is an accurate observation? Second, where in
> the code can I find this? Lastly, and most importantly, does this behavior
> apply to 1.x since there is no longer an NCM?
> 
> Thanks,
> Mark



site-to-site timeout

2018-01-08 Thread Mark Bean
It was observed in Apache NiFi 0.x that when the NCM goes down or becomes
unavailable, data being sent to the cluster via site-to-site continues to
flow to the Nodes for 24 hours. There is a state file
NIFI_HOME/conf/state/{RPG-UUID}.peers. When this file becomes > 24 hours
old, the site-to-site communication finally stops.

First, can you confirm this is an accurate observation? Second, where in
the code can I find this? Lastly, and most importantly, does this behavior
apply to 1.x since there is no longer an NCM?

Thanks,
Mark


Re: Getting lost a bit

2018-01-08 Thread Jeff
Brett,

You don't necessarily want to delete the other side and just keep the
section you've modified...  There may be changes that were made to that
code that you'll want to add to your own changes.  A simple example would
be that a new method call was added in the section that you've modified,
which caused the conflict.  Deleting that other side of the diff would
remove that method call and undo someone else's work.  We need to be
careful when resolving conflicts to make sure the final commit is a proper
merging of the commits (yours and any from the git repo) that were made
since the commit hash from which you were making your changes.

- Jeff
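
An illustrative example (the file contents below are hypothetical): turning on
diff3-style conflict markers also shows the common ancestor, which makes it
easier to see that both sides added something and that the right resolution
keeps both changes rather than deleting either side.

    git config merge.conflictStyle diff3

A conflict then looks roughly like:

    <<<<<<< HEAD
        session.transfer(flowFile, REL_SUCCESS);
        session.getProvenanceReporter().send(flowFile, url);
    ||||||| merged common ancestor
        session.transfer(flowFile, REL_SUCCESS);
    =======
        session.transfer(flowFile, REL_SUCCESS);
        getLogger().debug("Transferred {}", new Object[]{flowFile});
    >>>>>>> master

Both branches added a call after the transfer, so the proper merge keeps the
provenance call and the debug log together.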

On Mon, Jan 8, 2018 at 1:58 AM Brett Ryan  wrote:

> I must be an odd one, I’m more comfortable resolving conflicts from the
> cli ;)
>
> The simplest way is the CSV resolution strategy. Edit each file looking
> for <<< —- >>> blocks where conflicts are presented. Keep the section you
> want deleting the other side, then git add the file. Do for all conflicts
> then git commit.
>
> Fetch often to try and keep your base aligned in order to keep conflicts
> as small as possible.
>
> > On 8 Jan 2018, at 17:03, James Wing  wrote:
> >
> > Mike,
> >
> > I believe you will want to rebase your branch on the latest master before
> > you submit your PR.  First, update your master branch if you have not
> > already done so, with commands like the following:
> >
> > git checkout master
> > git fetch origin
> > git merge origin/master
> >
> > Second, rebase your feature branch so that your commits are relative to
> the
> > latest master:
> >
> > git checkout NIFI-4731
> > git rebase master
> >
> > Given the recent changes in the GCP bundle, you may have some conflicts
> to
> > resolve.  A GUI tool like IntelliJ can be very handy for resolving git
> > conflicts if you are not familiar with the command-line git resolution
> > process (almost nobody is).
> >
> > If that does not work for you, an alternative would be to create a new
> > branch from the latest master, then manually copy/paste your changes to
> > make a more concise commit.
> >
> > Thanks,
> >
> > James
> >
> > On Sun, Jan 7, 2018 at 7:16 PM, Mikhail Sosonkin
>  >> wrote:
> >
> >> Hi Devs,
> >>
> >> I'm trying to create a PR for this branch in my fork
> >> https://github.com/nologic/nifi/tree/NIFI-4731. I see that master has
> >> moved
> >> on to 1.5.0 but I'd like to have the processor built for 1.4 and later.
> >> It's the version we're using. However, I don't see an origin/1.4.0
> based on
> >> the instructions from the contributor guide. How do I go about this?
> Please
> >> give simple instructions, I'm not exactly a git master :)
> >>
> >> Thanks,
> >> Mike.
> >>
> >> --
> >> This email may contain material that is confidential for the sole use of
> >> the intended recipient(s).  Any review, reliance or distribution or
> >> disclosure by others without express permission is strictly
> prohibited.  If
> >> you are not the intended recipient, please contact the sender and delete
> >> all copies of this message.
> >>
>


Re: Contributing to NiFi

2018-01-08 Thread Aldrin Piri
Hi Rishabh,

Entire processors are certainly welcomed.  I would suggest you open a JIRA
[1] with your proposed idea so that folks can provide some guidance on its
fit and/or considerations for its design and development and so that we
attempt to avoid any duplication of efforts.

The developer guide Brett highlighted is a good start to highlight some of
the common patterns and framework features.

Let us know if any additional questions arise throughout the process.

[1] https://issues.apache.org/jira/projects/NIFI

On Mon, Jan 8, 2018 at 6:43 AM, Brett Ryan  wrote:

> I am not a NiFi contributor myself, however; I have just started writing a
> plugin for publication.
>
> My advice would be to use the NiFi developer guide [1] and publish on
> github under the apache 2 licence. Having done this and you think this is a
> plugin that could be beneficial for the NiFi user community then announce
> it here.
>
> At least that's what I've done :)
>
>
>   [1]: https://nifi.apache.org/developer-guide.html
>
> > On 8 Jan 2018, at 18:04,  <
> rishabh.she...@open-insights.co.in> wrote:
> >
> > Hello, my name is Rishabh and I am building a custom NiFi processor. I
> would
> > like to make it Open source to everybody.
> >
> > I was going through the NiFi contributor guide and I was wondering if it
> was
> > possible for me to get in touch with someone who could guide me through
> the
> > process.
> >
> > Can I contribute an entire processor or can I only contribute to the
> > development of an existing processor?
> >
> > It would be really helpful.
> >
> >
> >
> > Regards,
> >
> > Rishabh
> >
>
>


Re: Contributing to NiFi

2018-01-08 Thread Brett Ryan
I am not a NiFi contributor myself; however, I have just started writing a
plugin for publication.

My advice would be to use the NiFi developer guide [1] and publish on GitHub
under the Apache 2 licence. Having done this, if you think the plugin could be
beneficial to the NiFi user community, then announce it here.

At least that's what I've done :)


  [1]: https://nifi.apache.org/developer-guide.html

> On 8 Jan 2018, at 18:04,  
>  wrote:
> 
> Hello, my name is Rishabh and I am building a custom NiFi processor. I would
> like to make it Open source to everybody.
> 
> I was going through the NiFi contributor guide and I was wondering if it was
> possible for me to get in touch with someone who could guide me through the
> process.
> 
> Can I contribute an entire processor or can I only contribute to the
> development of an existing processor?
> 
> It would be really helpful.
> 
> 
> 
> Regards,
> 
> Rishabh
> 





Re: Status from the Jar

2018-01-08 Thread Brett Ryan
Hi Ankit. From a processor you can control whatever your relationships are for
downstream processors. This can be achieved in a few steps.

Create static relationship references within your processor. For example, if
you want to handle success and failure routing behaviour:

static final Relationship REL_SUCCESS = new Relationship.Builder()
        .name("success")
        .description("Your process succeeds")
        .build();
static final Relationship REL_FAILURE = new Relationship.Builder()
        .name("failure")
        .description("Something within the process failed.")
        .build();

private static final Set<Relationship> RELATIONSHIPS;

static {
    // ..
    Set<Relationship> r = new HashSet<>();
    r.add(REL_SUCCESS);
    r.add(REL_FAILURE);
    RELATIONSHIPS = Collections.unmodifiableSet(r);
}

Now in your trigger method you can route to any of these relationships.

@Override
public void onTrigger(ProcessContext pc, ProcessSession ps) throws ProcessException {
    final FlowFile ff = ps.get();
    if (ff == null) {
        return;
    }

    try {
        // do something.
        ps.transfer(ff, REL_SUCCESS);
    } catch (Throwable t) {
        // if something fails
        ps.transfer(ff, REL_FAILURE);
    }
}

While this demonstrates failing from an exception, you can fail from anywhere 
you like.

You can also have as many relationships as you like - well, I'm not sure of the
practical and allowable limitations.
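
A minimal sketch, assuming the processor extends
org.apache.nifi.processor.AbstractProcessor, of how the set defined above would
typically be exposed so these relationships appear in the UI:

@Override
public Set<Relationship> getRelationships() {
    // Assumes AbstractProcessor: return the relationships defined above so the
    // framework and UI can discover them.
    return RELATIONSHIPS;
}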

> On 8 Jan 2018, at 20:40, Ankit Dhar  wrote:
> 
> Hi team,
>I have a jar which does one operation .
> Like it fetches the csv file and processes it.
> My problem is I want to know weather the jar worked or it didn't , So is
> there any option in nifi, So  that I can check weather the execution was
> successful or not
> 
> Thanks in Advance.
> 
> --
> ankit





Status from the Jar

2018-01-08 Thread Ankit Dhar
Hi team,
I have a jar which does one operation: it fetches a CSV file and processes it.
My problem is I want to know whether the jar worked or it didn't. So is
there any option in NiFi so that I can check whether the execution was
successful or not?

Thanks in advance.

-- 
ankit


Contributing to NiFi

2018-01-08 Thread rishabh.shetty
Hello, my name is Rishabh and I am building a custom NiFi processor. I would
like to make it open source for everybody.

I was going through the NiFi contributor guide and I was wondering if it was
possible for me to get in touch with someone who could guide me through the
process.

Can I contribute an entire processor or can I only contribute to the
development of an existing processor?

It would be really helpful.

 

Regards,

Rishabh