Apache NiFi - MiNiFi 0.1.0 RC1 Release Helper Guide

2016-11-30 Thread Aldrin Piri
Hello Apache NiFi community,

Please find the associated guidance to help those interested in 
validating/verifying the release so they can vote.

# Download latest KEYS file:
  https://dist.apache.org/repos/dist/dev/nifi/KEYS

# Import keys file:
  gpg --import KEYS

# [optional] Clear out local maven artifact repository

# Pull down minifi-0.1.0 source release artifacts for review:

  wget https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi/0.1.0/minifi-0.1.0-source-release.zip
  wget https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi/0.1.0/minifi-0.1.0-source-release.zip.asc
  wget https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi/0.1.0/minifi-0.1.0-source-release.zip.md5
  wget https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi/0.1.0/minifi-0.1.0-source-release.zip.sha1
  wget https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi/0.1.0/minifi-0.1.0-source-release.zip.sha256

# Verify the signature
  gpg --verify minifi-0.1.0-source-release.zip.asc

# Verify the hashes (md5, sha1, sha256) match the source and what was provided in the vote email thread
  md5sum minifi-0.1.0-source-release.zip
  sha1sum minifi-0.1.0-source-release.zip
  sha256sum minifi-0.1.0-source-release.zip
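
The hash check above can also be scripted. What follows is a minimal sketch, not part of the release tooling: it recomputes the MD5, SHA1, and SHA256 digests of a downloaded file with Python's standard hashlib module and compares them with values copied from the vote email. The helper names file_digests and verify are hypothetical and exist only for this illustration.

```python
import hashlib


def file_digests(path, algorithms=("md5", "sha1", "sha256"), chunk_size=65536):
    """Return {algorithm: hex digest} for the file at path, reading it in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify(path, expected):
    """Return the algorithms whose computed digest differs from the expected value.

    expected maps an algorithm name to the hex digest published in the vote email.
    Surrounding whitespace and letter case in the expected value are ignored.
    """
    actual = file_digests(path, tuple(expected))
    return [name for name, value in expected.items()
            if actual[name] != value.strip().lower()]
```

Calling verify("minifi-0.1.0-source-release.zip", {"sha256": "<value from the vote email>"}) returns an empty list when every digest matches, and the names of the mismatched algorithms otherwise.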

# Unzip minifi-0.1.0-source-release.zip

# Verify the build works including release audit tool (RAT) checks
  cd minifi-0.1.0
  mvn clean install -Pcontrib-check

# Verify the contents contain a good README, NOTICE, and LICENSE.

# Verify the git commit ID is correct

# Verify the RC was branched off the correct git commit ID


Two convenience binaries are generated as part of this process: the MiNiFi 
assembly and the MiNiFi Toolkit assembly.

For the MiNiFi assembly:

# Look at the resulting convenience binary as found in minifi-assembly/target

# Make sure the README, NOTICE, and LICENSE are present and correct

# Run the resulting convenience binary and make sure it works as expected


For the MiNiFi Toolkit assembly:

# Look at the resulting convenience binary as found in 
minifi-toolkit/minifi-toolkit-assembly/target

# Make sure the README, NOTICE, and LICENSE are present and correct

# Run the resulting convenience binary and make sure it works as expected



# Send a response to the vote thread indicating a +1, 0, or -1 based on your 
findings.


Thank you for your time and effort to validate the release!




[VOTE] Release Apache NiFi - MiNiFi 0.1.0 (RC1)

2016-11-30 Thread Aldrin Piri
Hello
I am pleased to be calling this vote for the source release of Apache NiFi - 
MiNiFi, minifi-0.1.0.

The source zip, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi/0.1.0/ 


The Git tag is minifi-0.1.0-RC1
The Git commit ID is 6e7f05d4bef3637a829c17435eb9eff83aa6b810
* 
https://git-wip-us.apache.org/repos/asf?p=nifi-minifi.git;a=commit;h=6e7f05d4bef3637a829c17435eb9eff83aa6b810
*  
https://github.com/apache/nifi-minifi/commit/6e7f05d4bef3637a829c17435eb9eff83aa6b810

Checksums of minifi-0.1.0-source-release.zip:
MD5: 9d44398bc1eec7d5596ad425bbb9257b
SHA1: d8043759eb53d815badccf9136a239103aa914df
SHA256: c740a2765c74b6bda7ab0ef3ee57353e8c26cb5009f8949a3349af7ded0be181

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/aldrin.asc

KEYS file available here:
https://dist.apache.org/repos/dist/release/nifi/KEYS

29 issues were closed/resolved for this release:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020=12335482

Release note highlights can be found here:
https://cwiki.apache.org/confluence/display/MINIFI/Release+Notes#ReleaseNotes-Version0.1.0

The vote will be open for 72 hours.

Please download the release candidate and evaluate the necessary items 
including checking hashes, signatures, build from source, and test.

Then please vote:

[ ] +1 Release this package as minifi-0.1.0
[ ] +0 no opinion
[ ] -1 Do not release this package because...




Re: Secure Cluster Mode Issues

2016-11-30 Thread Ricky Saltzer
Andy -

I'm using version 1.1.0 (official binary). I'm using Kerberos
authentication and am able to log in using my internal Kerberos principal. To
be clear, the only difference between blanking out *keyPasswd* and
*keystorePasswd* was that I was allowed to access the UI without manually
importing the certificate, by instead agreeing to proceed even though I
knew the certificate was untrusted.

[image: Inline image 1]
[image: Inline image 2]

Ricky

On Wed, Nov 30, 2016 at 7:43 PM, Andy LoPresto  wrote:

> Ricky,
>
> Removing the redundant key password property shouldn’t have an impact
> (although you may be running a legacy version before NIFI- [1] and
> NIFI-2466 [2] were fixed). Can you look at the top right of your NiFi UI
> and see what user is accessing the system? It should look like the
> screenshot I have attached. This, and the contents of logs/nifi-user.log,
> will indicate the authenticated user. That should help you figure out how
> the authentication is occurring (client certificate, LDAP, or Kerberos). If
> you still cannot determine it, you can update conf/logback.xml and change
> the logging level for the following loggers from INFO to DEBUG:
>
> [Three logger elements, each with additivity="false", were listed here; their
> names and contents were not preserved in the archive.]
>
>
> I only ask for this information because your results do not make sense and
> I fear that they will not be reproducible for the rest of your team when
> you try to deploy the system and let them access NiFi and I would hope we
> can provide the best experience from the beginning.
>
> [1] https://issues.apache.org/jira/browse/NIFI-
> [2] https://issues.apache.org/jira/browse/NIFI-2466
>
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com *
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Nov 30, 2016, at 11:12 AM, Ricky Saltzer  wrote:
>
> Hey Andy -
>
> I think I may have figured out the problem. Although the keystorePasswd and
> keyPasswd are the same, after completely removing the value for
> nifi.security.keyPasswd and restarting NiFi, I'm able to access the web UI
> without manually importing the certificate.
>
> On Tue, Nov 29, 2016 at 2:19 PM, Andy LoPresto 
> wrote:
>
> Ricky,
>
> When using HTTPS in non-cluster mode, NiFi still requires user
> authentication — this can be either client certificate (perhaps you already
> had one loaded?), LDAP, or Kerberos. If you are able to access the NiFi UI
> over HTTPS without presenting some authentication, something is seriously
> broken. The warning in the browser is because the CA certificate that
> signed the server certificate (the one being presented *to* the browser
> by the application) is not trusted in the browser’s pre-installed trust
> chain. If, for example, that CA cert had been imported to the browser ahead
> of time, or if it was signed by a publicly known entity like DigiCert,
> Verisign, Comodo, etc., you would not receive a warning.
>
> For small teams, client certificates can be manageable, but if you want to
> allow multiple users to connect with minimal identity management, I
> recommend setting up an LDAP server (OpenLDAP, Microsoft ActiveDirectory,
> Apache Directory Studio, etc.) and administering users there. Then the
> users will just enter a username and password into a login field on their
> first connection to NiFi and be authenticated.
>
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com *
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Nov 29, 2016, at 11:07 AM, Ricky Saltzer  wrote:
>
> Hey Andy -
>
> Thanks for the reply. I used the openssl command you provided, and indeed
> the return code was *OK*. Before proceeding with the recommendation of
> importing the key into my OSX keychain, I would like to understand why this
> is required. When using HTTPS in non-clustered mode, NiFi does not require
> clients to have a special key or cert imported to their machine. Instead,
> the client is given a warning in the browser, and it's up to them to
> proceed. This UI will serve as an endpoint to several users, and I would
> really like to avoid the cumbersomeness of having members of multiple teams
> follow instructions for importing keys just so they can access a web UI.
>
> On Tue, Nov 29, 2016 at 1:55 PM, Andy LoPresto 
> wrote:
>
> Hi Ricky,
>
> The ERR_CONNECTION_CLOSED is likely because you are not sending a client
> certificate on the HTTP request. By default, a secured cluster requires
> client certificate authentication unless LDAP or Kerberos are configured as
> identity providers [1]. The TLS Toolkit provides a quick way to generate a
> valid client certificate which you can load into your browser in order to
> access the site.
>
> First, verify the cluster is 

Re: Global property for custom processor

2016-11-30 Thread Andy LoPresto
The trigger should probably be a new Controller Service, because that makes 
more logical sense, but it’s not available today.

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Nov 30, 2016, at 5:19 PM, Andy LoPresto  wrote:
> 
> Hi Russ,
> 
> Could you use the cluster state manager [1] to hold this boolean trigger 
> value which each instance of your custom processor checks before execution, 
> and then use a simple ExecuteScript processor which simply toggles/explicitly 
> writes that value? In this way, the ExecuteScript processor is like the light 
> switch. You can manually start/stop that processor to trigger or stop all the 
> others, or use the REST API to do the same, or even make the ExecuteScript 
> processor read from a system/environment variable or the 
> absence/presence/value of a file on disk to get the desired state value.
> 
> The ExecuteScript processor might have to abuse the StandardStateManager by 
> first enumerating all instances of the desired “controllable” components 
> (this could be achieved by dynamically querying a containing process group 
> for processors by type or manually populating a list of component IDs in a 
> static list, which the ExecuteScript processor could then store in its own 
> StateManager) and then manually instantiating a StateManager containing the 
> local/cluster StateProvider [2] for each component ID and setting the state.
> 
> Not sure if I explained that well, but Mark Payne would be your guy for a 
> better explanation and possibly a cleaner solution.
> 
> [1] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#state_management
> [2] https://github.com/apache/nifi/blob/master/nifi-framework-api/src/main/java/org/apache/nifi/components/state/StateProvider.java#L66
> 
> Andy LoPresto
> alopre...@apache.org 
> alopresto.apa...@gmail.com 
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> 
>> On Nov 30, 2016, at 2:00 PM, Russell Bateman wrote:
>> 
>> Our usage, for ETL, is controlled very up-close and personal right now. Our 
>> ETL of medical documents is pretty involved, changes radically from customer 
>> to customer, and must be baby-sat closely. Anything we're able to do for our 
>> implementation folk to enable them to pin down waste of resources including 
>> and especially time to ingest horrendous quantities of information is going 
>> to serve us for a long time to come. User access to an easy processor like 
>> the one I've written, though what it does is pretty trivial, makes their 
>> life so much easier and they can talk back to us about where time (in 
>> particular) is being spent, in which processor (we have lots of custom 
>> processors that do very out-of-the-ordinary things), across which subflow, 
>> etc.
>> 
>> Except that we anticipate moving to a clustered implementation soon, I 
>> thought about merely looking for a system environment variable or even the 
>> presence of a file, then setting static state inside the processor to halt 
>> doing anything. Conversely, a change to that state might start the processor 
>> back up again (time-stamping, histogramming, etc.). I think this naïve 
>> control strategy falls apart as soon as we go to a cluster.
>> 
>> It's taking me a while to get into the NiFi culture, I think. However, I 
>> also think that NiFi folk use NiFi in wildly different ways so maybe how I'm 
>> looking to do something isn't always so un-NiFi, but that others just 
>> haven't tackled it yet.
>> 
>> Yeah, if NiFi gave us some kind of modifiable, global state, especially if 
>> less static than /conf/nifi.properties/, but even if requiring a bounce to 
>> engage it (so, /conf/flow.properties/ or /conf/flow.conf/), that would solve 
>> our problem pretty elegantly. However, I haven't thought about what problems 
>> it also creates for you or others.
>> 
>> Russ
>> 
>> 
>> On 11/30/2016 02:42 PM, Joe Witt wrote:
>>> Russ
>>> 
>>> I don't think we provide anything particularly helpful here to do this
>>> conveniently.  You could of course script this external to NiFi to
>>> make HTTP calls to shut off such items.  Spitballing ideas here but
>>> what about giving you the ability to tag components with some label
>>> and then be able to do global execution of some task
>>> (stop/start/disable/delete/etc..) against components that you're
>>> authorized to and which have those labels.
>>> 
>>> Do you think this would be a typical use case or do you feel this is
>>> useful because you're testing right now?  Does the 

Re: Global property for custom processor

2016-11-30 Thread Andy LoPresto
Hi Russ,

Could you use the cluster state manager [1] to hold this boolean trigger value 
which each instance of your custom processor checks before execution, and 
then use a simple ExecuteScript processor which simply toggles/explicitly 
writes that value? In this way, the ExecuteScript processor is like the light 
switch. You can manually start/stop that processor to trigger or stop all the 
others, or use the REST API to do the same, or even make the ExecuteScript 
processor read from a system/environment variable or the absence/presence/value 
of a file on disk to get the desired state value.

The ExecuteScript processor might have to abuse the StandardStateManager by 
first enumerating all instances of the desired “controllable” components (this 
could be achieved by dynamically querying a containing process group for 
processors by type or manually populating a list of component IDs in a static 
list, which the ExecuteScript processor could then store in its own 
StateManager) and then manually instantiating a StateManager containing the 
local/cluster StateProvider [2] for each component ID and setting the state.
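
As a rough illustration of the light-switch idea above, here is a toy model in Python. It is not NiFi's StateManager or StateProvider API: SharedState, ProfilingProcessor, toggle, and the key "profiling.enabled" are all hypothetical names, and the in-memory dictionary only stands in for state that a real cluster would share between nodes. Values are kept as strings because NiFi component state is stored as string key/value pairs.

```python
class SharedState:
    """Hypothetical stand-in for a cluster-wide key/value state store."""

    def __init__(self):
        self._values = {}

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value):
        self._values[key] = value


class ProfilingProcessor:
    """Toy processor that only does its profiling work when the switch is on."""

    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.records = []

    def on_trigger(self, flowfile):
        # Check the shared switch before doing any work.
        if self.state.get("profiling.enabled", "false") == "true":
            self.records.append(flowfile)
        # The flowfile is passed along whether or not profiling ran.
        return flowfile


def toggle(state, key="profiling.enabled"):
    """Flip the shared switch, as the single 'light switch' component would."""
    new_value = "false" if state.get(key, "false") == "true" else "true"
    state.set(key, new_value)
    return new_value
```

Every ProfilingProcessor instance that shares the same SharedState starts or stops recording after a single call to toggle, which is the behavior the ExecuteScript switch is meant to provide.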

Not sure if I explained that well, but Mark Payne would be your guy for a 
better explanation and possibly a cleaner solution.

[1] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#state_management
[2] https://github.com/apache/nifi/blob/master/nifi-framework-api/src/main/java/org/apache/nifi/components/state/StateProvider.java#L66

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Nov 30, 2016, at 2:00 PM, Russell Bateman  wrote:
> 
> Our usage, for ETL, is controlled very up-close and personal right now. Our 
> ETL of medical documents is pretty involved, changes radically from customer 
> to customer, and must be baby-sat closely. Anything we're able to do for our 
> implementation folk to enable them to pin down waste of resources including 
> and especially time to ingest horrendous quantities of information is going 
> to serve us for a long time to come. User access to an easy processor like 
> the one I've written, though what it does is pretty trivial, makes their life 
> so much easier and they can talk back to us about where time (in particular) 
> is being spent, in which processor (we have lots of custom processors that do 
> very out-of-the-ordinary things), across which subflow, etc.
> 
> Except that we anticipate moving to a clustered implementation soon, I 
> thought about merely looking for a system environment variable or even the 
> presence of a file, then setting static state inside the processor to halt 
> doing anything. Conversely, a change to that state might start the processor 
> back up again (time-stamping, histogramming, etc.). I think this naïve 
> control strategy falls apart as soon as we go to a cluster.
> 
> It's taking me a while to get into the NiFi culture, I think. However, I also 
> think that NiFi folk use NiFi in wildly different ways so maybe how I'm 
> looking to do something isn't always so un-NiFi, but that others just haven't 
> tackled it yet.
> 
> Yeah, if NiFi gave us some kind of modifiable, global state, especially if 
> less static than /conf/nifi.properties/, but even if requiring a bounce to 
> engage it (so, /conf/flow.properties/ or /conf/flow.conf/), that would solve 
> our problem pretty elegantly. However, I haven't thought about what problems 
> it also creates for you or others.
> 
> Russ
> 
> 
> On 11/30/2016 02:42 PM, Joe Witt wrote:
>> Russ
>> 
>> I don't think we provide anything particularly helpful here to do this
>> conveniently.  You could of course script this external to NiFi to
>> make HTTP calls to shut off such items.  Spitballing ideas here but
>> what about giving you the ability to tag components with some label
>> and then be able to do global execution of some task
>> (stop/start/disable/delete/etc..) against components that you're
>> authorized to and which have those labels.
>> 
>> Do you think this would be a typical use case or do you feel this is
>> useful because you're testing right now?  Does the above idea make
>> sense or do you have other suggestions?
>> 
>> Thanks
>> Joe
>> 
>> On Wed, Nov 30, 2016 at 3:14 PM, Russell Bateman  
>> wrote:
>>> I've written a custom processor for some trivial profiling, time-stamping,
>>> time-since, histogram-generating, etc., but would like the ability to turn
>>> all instances completely off without having to visit each instance in the
>>> UI. If it works out, I might consider even leaving some instances in
>>> production- or at least staging-environment flows.
>>> 
>>> 1. I know that the NiFi Expression Language has access to various system- or
>>> NiFi properties or settings, 

Re: Secure Cluster Mode Issues

2016-11-30 Thread Andy LoPresto
Ricky,

Removing the redundant key password property shouldn’t have an impact (although 
you may be running a legacy version before NIFI- [1] and NIFI-2466 [2] were 
fixed). Can you look at the top right of your NiFi UI and see what user is 
accessing the system? It should look like the screenshot I have attached. This, 
and the contents of logs/nifi-user.log, will indicate the authenticated user. 
That should help you figure out how the authentication is occurring (client 
certificate, LDAP, or Kerberos). If you still cannot determine it, you can 
update conf/logback.xml and change the logging level for the following loggers 
from INFO to DEBUG:

[The logger definitions that followed were not preserved in the archive.]

I only ask for this information because your results do not make sense and I 
fear that they will not be reproducible for the rest of your team when you try 
to deploy the system and let them access NiFi and I would hope we can provide 
the best experience from the beginning.

[1] https://issues.apache.org/jira/browse/NIFI-
[2] https://issues.apache.org/jira/browse/NIFI-2466


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Nov 30, 2016, at 11:12 AM, Ricky Saltzer  wrote:
> 
> Hey Andy -
> 
> I think I may have figured out the problem. Although the keystorePasswd and
> keyPasswd are the same, after completely removing the value for
> nifi.security.keyPasswd and restarting NiFi, I'm able to access the web UI
> without manually importing the certificate.
> 
> On Tue, Nov 29, 2016 at 2:19 PM, Andy LoPresto wrote:
> 
>> Ricky,
>> 
>> When using HTTPS in non-cluster mode, NiFi still requires user
>> authentication — this can be either client certificate (perhaps you already
>> had one loaded?), LDAP, or Kerberos. If you are able to access the NiFi UI
>> over HTTPS without presenting some authentication, something is seriously
>> broken. The warning in the browser is because the CA certificate that
>> signed the server certificate (the one being presented *to* the browser
>> by the application) is not trusted in the browser’s pre-installed trust
>> chain. If, for example, that CA cert had been imported to the browser ahead
>> of time, or if it was signed by a publicly known entity like DigiCert,
>> Verisign, Comodo, etc., you would not receive a warning.
>> 
>> For small teams, client certificates can be manageable, but if you want to
>> allow multiple users to connect with minimal identity management, I
>> recommend setting up an LDAP server (OpenLDAP, Microsoft ActiveDirectory,
>> Apache Directory Studio, etc.) and administering users there. Then the
>> users will just enter a username and password into a login field on their
>> first connection to NiFi and be authenticated.
>> 
>> 
>> Andy LoPresto
>> alopre...@apache.org
>> alopresto.apa...@gmail.com
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>> 
>> On Nov 29, 2016, at 11:07 AM, Ricky Saltzer wrote:
>> 
>> Hey Andy -
>> 
>> Thanks for the reply. I used the openssl command you provided, and indeed
>> the return code was *OK*. Before proceeding with the recommendation of
>> importing the key into my OSX keychain, I would like to understand why this
>> is required. When using HTTPS in non-clustered mode, NiFi does not require
>> clients to have a special key or cert imported to their machine. Instead,
>> the client is given a warning in the browser, and it's up to them to
>> proceed. This UI will serve as an endpoint to several users, and I would
>> really like to avoid the cumbersomeness of having members of multiple teams
>> follow instructions for importing keys just so they can access a web UI.
>> 
>> On Tue, Nov 29, 2016 at 1:55 PM, Andy LoPresto wrote:
>> 
>> Hi Ricky,
>> 
>> The ERR_CONNECTION_CLOSED is likely because you are not sending a client
>> certificate on the HTTP request. By default, a secured cluster requires
>> client certificate authentication unless LDAP or Kerberos are configured as
>> identity providers [1]. The TLS Toolkit provides a quick way to generate a
>> valid client certificate which you can load into your browser in order to
>> access the site.
>> 
>> First, verify the cluster is running and accepting incoming connections
>> (we’re going to cheat here just to be quick about it; disclaimer that this
>> is not the RIGHT way to do this):
>> 
>> In the directory where you ran the toolkit, you noted there was a
>> “nifi-cert.pem” and “nifi-cert.key” file. The pem file is the PEM-encoded
>> public certificate of the NiFi CA cert that was generated by the toolkit,
>> and the key file is the PEM-encoded private key. 

Re: NiFi upgrade version

2016-11-30 Thread Joe Witt
Hello

It is a bit tricky to respond to.  Apache NiFi 0.3.0 was released well
over a year ago and this is referencing a snapshot version.

Can you start from a fresh install perhaps?  Support for and
integration with Hive in NiFi has improved quite a lot in the past
couple of releases so I'd definitely recommend updating and working
the discussion from there.

Thanks
Joe

On Wed, Nov 30, 2016 at 3:31 PM, Mothi86  wrote:
> Hi. A gentle reminder for anyone to respond on this topic for an answer.
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/NiFi-upgrade-version-tp14035p14073.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: NiFi upgrade version

2016-11-30 Thread Mothi86
Hi. A gentle reminder for anyone to respond on this topic for an answer.



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/NiFi-upgrade-version-tp14035p14073.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: Global property for custom processor

2016-11-30 Thread Russell Bateman
Our usage, for ETL, is controlled very up-close and personal right now. 
Our ETL of medical documents is pretty involved, changes radically from 
customer to customer, and must be baby-sat closely. Anything we're able 
to do for our implementation folk to enable them to pin down waste of 
resources including and especially time to ingest horrendous quantities 
of information is going to serve us for a long time to come. User access 
to an easy processor like the one I've written, though what it does is 
pretty trivial, makes their life so much easier and they can talk back 
to us about where time (in particular) is being spent, in which 
processor (we have lots of custom processors that do very 
out-of-the-ordinary things), across which subflow, etc.


Except that we anticipate moving to a clustered implementation soon, I 
thought about merely looking for a system environment variable or even 
the presence of a file, then setting static state inside the processor 
to halt doing anything. Conversely, a change to that state might start 
the processor back up again (time-stamping, histogramming, etc.). I 
think this naïve control strategy falls apart as soon as we go to a cluster.


It's taking me a while to get into the NiFi culture, I think. However, I 
also think that NiFi folk use NiFi in wildly different ways so maybe how 
I'm looking to do something isn't always so un-NiFi, but that others 
just haven't tackled it yet.


Yeah, if NiFi gave us some kind of modifiable, global state, especially 
if less static than /conf/nifi.properties/, but even if requiring a 
bounce to engage it (so, /conf/flow.properties/ or /conf/flow.conf/), 
that would solve our problem pretty elegantly. However, I haven't 
thought about what problems it also creates for you or others.


Russ


On 11/30/2016 02:42 PM, Joe Witt wrote:

Russ

I don't think we provide anything particularly helpful here to do this
conveniently.  You could of course script this external to NiFi to
make HTTP calls to shut off such items.  Spitballing ideas here but
what about giving you the ability to tag components with some label
and then be able to do global execution of some task
(stop/start/disable/delete/etc..) against components that you're
authorized to and which have those labels.

Do you think this would be a typical use case or do you feel this is
useful because you're testing right now?  Does the above idea make
sense or do you have other suggestions?

Thanks
Joe

On Wed, Nov 30, 2016 at 3:14 PM, Russell Bateman  wrote:

I've written a custom processor for some trivial profiling, time-stamping,
time-since, histogram-generating, etc., but would like the ability to turn
all instances completely off without having to visit each instance in the
UI. If it works out, I might consider even leaving some instances in
production- or at least staging-environment flows.

1. I know that the NiFi Expression Language has access to various system- or
NiFi properties or settings, but what would someone suggest as best practice
for this? (Don't invade /conf/nifi.properties/, etc.)

2. I guess I'd add a property to configure in my processor and check whether
it evaluates true/false/etc. based on the source data (whatever that will
be--see previous paragraph)?

3. Last, if this processor is thereby reduced merely to

session.transfer( flowfile, SUCCESS );

there isn't any handling even more minimal or faster than that in the sense
of turning a processor off, right?

Thanks for any suggestions,

Russ




Re: Global property for custom processor

2016-11-30 Thread Joe Witt
Russ

I don't think we provide anything particularly helpful here to do this
conveniently.  You could of course script this external to NiFi to
make HTTP calls to shut off such items.  Spitballing ideas here but
what about giving you the ability to tag components with some label
and then be able to do global execution of some task
(stop/start/disable/delete/etc..) against components that you're
authorized to and which have those labels.
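
To make the external-scripting option concrete, here is a hedged sketch that only builds (and does not send) the HTTP request such a script might issue to stop one processor. It assumes the NiFi 1.x REST API accepts a PUT to /nifi-api/processors/{id} whose JSON body carries the current revision version and the desired component state; confirm the path and body against the REST API documentation for your version. The function name build_stop_request and the example host are hypothetical, and authentication (client certificate or token) is left out.

```python
import json
import urllib.request


def build_stop_request(base_url, processor_id, revision_version, state="STOPPED"):
    """Build a PUT request asking NiFi to move one processor to the given state.

    Assumption: the endpoint and body shape follow the NiFi 1.x REST API
    (a processor entity with 'revision' and 'component'); verify before use.
    """
    url = f"{base_url.rstrip('/')}/nifi-api/processors/{processor_id}"
    body = {
        # The revision version must match what the server currently reports.
        "revision": {"version": revision_version},
        "component": {"id": processor_id, "state": state},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

A script could loop over a list of processor IDs, fetch each processor's current revision first, and pass each request to urllib.request.urlopen with an SSL context suited to the secured instance.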

Do you think this would be a typical use case or do you feel this is
useful because you're testing right now?  Does the above idea make
sense or do you have other suggestions?

Thanks
Joe

On Wed, Nov 30, 2016 at 3:14 PM, Russell Bateman  wrote:
> I've written a custom processor for some trivial profiling, time-stamping,
> time-since, histogram-generating, etc., but would like the ability to turn
> all instances completely off without having to visit each instance in the
> UI. If it works out, I might consider even leaving some instances in
> production- or at least staging-environment flows.
>
> 1. I know that the NiFi Expression Language has access to various system- or
> NiFi properties or settings, but what would someone suggest as best practice
> for this? (Don't invade /conf/nifi.properties/, etc.)
>
> 2. I guess I'd add a property to configure in my processor and check whether
> it evaluates true/false/etc. based on the source data (whatever that will
> be--see previous paragraph)?
>
> 3. Last, if this processor is thereby reduced merely to
>
>     session.transfer( flowfile, SUCCESS );
>
> there isn't any handling even more minimal or faster than that in the sense
> of turning a processor off, right?
>
> Thanks for any suggestions,
>
> Russ


Re: How to clear the canvas?

2016-11-30 Thread srini
Great Andy, Thanks a lot!



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/Templates-flows-tp14064p14070.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: [VOTE] Release Apache NiFi MiNiFI C++ 0.1.0 (RC1)

2016-11-30 Thread Aldrin Piri
Hey Bryan, thanks for pointing this out.

It seems this was not updated in the docs during the transition to CMake, and
libleveldb was captured only in the build portion of the requirements, not the
run portion. I've taken note of this and will capture these doc items in a
JIRA, along with any others that may arise, at the conclusion of this voting
process. I would also like to add dependency install commands as needed
for the common platforms/package managers.

On Wed, Nov 30, 2016 at 3:49 PM, Bryan Rosander 
wrote:

> +1 (non-binding)
>
> Notes:
> I needed to install uuid-dev to build on Ubuntu (not part of README.md).
> I needed to install libleveldb-dev to build (part of README.md) as well as
> to run (not part of README.md) on Ubuntu.
>
> Validated signature, hashes, build, sample flow in Docker Ubuntu container.
>
> On Tue, Nov 29, 2016 at 9:31 PM, Jeremy Dyer  wrote:
>
> > +1 (non-binding) Release this package as nifi-minifi-cpp-0.1.0
> >
> > Validated signature, resulting assembly artifacts, and builds/packaging for
> > Ubuntu 16.10 and CentOS 7.2. I also validated a simple flow of tailing
> > local files and transferring data over site-to-site to NiFi.
> >
> > On Tue, Nov 29, 2016 at 11:54 AM, Aldrin Piri  wrote:
> >
> > > Hello Apache NiFi Community,
> > >
> > > I am pleased to be calling this vote for the source release of Apache NiFi
> > > MiNiFi C++, nifi-minifi-cpp-0.1.0.
> > >
> > > The source archive, signature, and digests can be located at:
> > >
> > > Source Archive:
> > >   https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz
> > > GPG armored signature:
> > >   https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.asc
> > > Source MD5:
> > >   https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.md5
> > > Source SHA1:
> > >   https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.sha1
> > > Source SHA256:
> > >   https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.sha256
> > >
> > > The Git tag is minifi-cpp-0.1.0-RC1
> > > The Git commit hash is bd963503586aeb9b24b4ad5a96da9a1a6818a186
> > > * https://git-wip-us.apache.org/repos/asf?p=nifi-minifi-cpp.git;a=commit;h=bd963503586aeb9b24b4ad5a96da9a1a6818a186
> > > * https://github.com/apache/nifi-minifi-cpp/commit/bd963503586aeb9b24b4ad5a96da9a1a6818a186
> > >
> > > Checksums of nifi-minifi-cpp-0.1.0-source.tar.gz:
> > > MD5: a7155f53d86ef93e37bf28d6e4a0299f
> > > SHA1: f3cb105584d79f70edbd6e5bc0908be3731263fd
> > > SHA256: 62441650684bc2d9631f683b29b3f5f12c3c55b8b1f336badf7f7f0061d47b66
> > >
> > >
> > > Release artifacts are signed with the following key:
> > > https://people.apache.org/keys/committer/aldrin
> > >
> > > KEYS file available here:
> > > https://dist.apache.org/repos/dist/release/nifi/KEYS
> > >
> > > 15 issues were closed/resolved for this release:
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12338046=12319921
> > >
> > > Release note highlights can be found here:
> > > https://cwiki.apache.org/confluence/display/MINIFI/Release+Notes#ReleaseNotes-Versioncpp-0.1.0
> > >
> > > The vote will be open for 72 hours.
> > > Please download the release candidate and evaluate the necessary items
> > > including checking hashes, signatures, build from source, and test.
> > > Then please vote:
> > >
> > > [ ] +1 Release this package as nifi-minifi-cpp-0.1.0
> > > [ ] +0 no opinion
> > > [ ] -1 Do not release this package because...
> > >
> >
>


[ANNOUNCE] Apache NiFi 1.1.0 Release

2016-11-30 Thread Joe Witt
Hello

The Apache NiFi team would like to announce the release of Apache NiFi 1.1.0.

Apache NiFi is an easy to use, powerful, and reliable system to
process and distribute data.  Apache NiFi was made for dataflow.  It
supports highly configurable directed graphs of data routing,
transformation, and system mediation logic.

This release is the result of fantastic community contribution across
feature requests, documentation, bug reports, code contributions,
reviews, and release validation.

The release highlights:

https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.1.0

More details on Apache NiFi can be found here:
  http://nifi.apache.org/

The release artifacts can be downloaded from here:
  http://nifi.apache.org/download.html

Maven artifacts have been made available here:
  https://repository.apache.org/content/repositories/releases/org/apache/nifi/

Issues closed/resolved for this list can be found here:
  
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020=12337875

Thank you
The Apache NiFi team


Re: [VOTE] Release Apache NiFi MiNiFI C++ 0.1.0 (RC1)

2016-11-30 Thread Bryan Rosander
+1 (non-binding)

Notes:
I needed to install uuid-dev to build on Ubuntu (not part of README.md).
I needed to install libleveldb-dev to build (part of README.md) as well as
to run (not part of README.md) on Ubuntu.

Validated signature, hashes, build, sample flow in Docker Ubuntu container.

On Tue, Nov 29, 2016 at 9:31 PM, Jeremy Dyer  wrote:

> +1 (non-binding) Release this package as nifi-minifi-cpp-0.1.0
>
> Validated signature, resulting assembly artifacts, and builds/packaging for
> Ubuntu 16.10 and CentOS 7.2. I also validated a simple flow of tailing
> local files and transferring data over site-to-site to NiFi.
>
> On Tue, Nov 29, 2016 at 11:54 AM, Aldrin Piri  wrote:
>
> > Hello Apache NiFi Community,
> >
> > I am pleased to be calling this vote for the source release of Apache
> > NiFi MiNiFi C++, nifi-minifi-cpp-0.1.0.
> >
> > The source archive, signature, and digests can be located at:
> >
> > Source Archive:
> > https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz
> > GPG armored signature:
> > https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.asc
> > Source MD5:
> > https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.md5
> > Source SHA1:
> > https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.sha1
> > Source SHA256:
> > https://dist.apache.org/repos/dist/dev/nifi/nifi-minifi-cpp/0.1.0/nifi-minifi-cpp-0.1.0-source.tar.gz.sha256
> >
> > The Git tag is minifi-cpp-0.1.0-RC1
> > The Git commit hash is bd963503586aeb9b24b4ad5a96da9a1a6818a186
> > * https://git-wip-us.apache.org/repos/asf?p=nifi-minifi-cpp.git;a=commit;h=bd963503586aeb9b24b4ad5a96da9a1a6818a186
> > * https://github.com/apache/nifi-minifi-cpp/commit/bd963503586aeb9b24b4ad5a96da9a1a6818a186
> >
> > Checksums of nifi-minifi-cpp-0.1.0-source.tar.gz:
> > MD5: a7155f53d86ef93e37bf28d6e4a0299f
> > SHA1: f3cb105584d79f70edbd6e5bc0908be3731263fd
> > SHA256: 62441650684bc2d9631f683b29b3f5f12c3c55b8b1f336badf7f7f0061d47b66
> >
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/aldrin
> >
> > KEYS file available here:
> > https://dist.apache.org/repos/dist/release/nifi/KEYS
> >
> > 15 issues were closed/resolved for this release:
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12338046=12319921
> >
> > Release note highlights can be found here:
> > https://cwiki.apache.org/confluence/display/MINIFI/Release+Notes#ReleaseNotes-Versioncpp-0.1.0
> >
> > The vote will be open for 72 hours.
> > Please download the release candidate and evaluate the necessary items
> > including checking hashes, signatures, build from source, and test. Then
> > please vote:
> >
> > [ ] +1 Release this package as nifi-minifi-cpp-0.1.0
> > [ ] +0 no opinion
> > [ ] -1 Do not release this package because...
> >
>


Re: From an external Java program, I want to send some XML to NiFi.

2016-11-30 Thread Andy LoPresto
StandardSSLContextService is necessary if you want the incoming HTTP 
connections to be over HTTPS (i.e. encrypted and secured). If you want this 
feature, you will need to configure an SSL controller service with your 
keystore and truststore files and the appropriate passwords.


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Nov 30, 2016, at 10:46 AM, srini  wrote:
> 
> Thanks Andy, Now StandardHttpContextMap is good.
> 
> I still have this error with HandleHttpRequest. StandardSSLContextService is
> invalid:
> 
> "'SSL Context Service' validated against
> 'c6b03cd2-69af-4a04-3ded-b3a1d39c07a6' is invalid because Invalid Controller
> Service: c6b03cd2-69af-4a04-3ded-b3a1d39c07a6 is not a valid Controller
> Service Identifier or does not reference the correct type of Controller
> Service"
> 
> 
> 
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/From-an-external-Java-program-I-want-to-send-some-XML-to-NiFi-tp14059p14063.html





Re: From an external Java program, I want to send some XML to NiFi.

2016-11-30 Thread srini
Thanks Andy, Now StandardHttpContextMap is good.

I still have this error with HandleHttpRequest. StandardSSLContextService is
invalid:

"'SSL Context Service' validated against
'c6b03cd2-69af-4a04-3ded-b3a1d39c07a6' is invalid because Invalid Controller
Service: c6b03cd2-69af-4a04-3ded-b3a1d39c07a6 is not a valid Controller
Service Identifier or does not reference the correct type of Controller
Service"



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/From-an-external-Java-program-I-want-to-send-some-XML-to-NiFi-tp14059p14063.html


How to clear the canvas?

2016-11-30 Thread srini
Hi,

1. Where are the created templates stored?
2. How to save a template after modification?

thanks
Srini



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/How-to-clear-the-canvas-tp14064.html


Global property for custom processor

2016-11-30 Thread Russell Bateman
I've written a custom processor for some trivial profiling, 
time-stamping, time-since, histogram-generating, etc., but would like 
the ability to turn all instances completely off without having to visit 
each instance in the UI. If it works out, I might consider even leaving 
some instances in production- or at least staging-environment flows.


1. I know that the NiFi Expression Language has access to various 
system- or NiFi properties or settings, but what would someone suggest 
as best practice for this? (Don't invade conf/nifi.properties, etc.)


2. I guess I'd add a property to configure in my processor and check 
whether it evaluates true/false/etc. based on the source data (whatever 
that will be--see previous paragraph)?


3. Last, if this processor is thereby reduced merely to

   session.transfer( flowfile, SUCCESS );

there isn't any handling even more minimal or faster than that in the 
sense of turning a processor off, right?
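
Points 2 and 3 can be pictured as a single JVM-wide flag that every instance consults before doing any work. The following is only a minimal sketch in plain Java, not NiFi API: the class name ProfilingSwitch and the system property name profiling.enabled are hypothetical and chosen purely for illustration, and how the flag gets set (system property, variable, or something else) is left open.

```java
import java.util.Locale;

public class ProfilingSwitch {
    // Hypothetical property name, chosen for this sketch only.
    public static final String PROPERTY = "profiling.enabled";

    /** Interprets a raw setting; only an explicit "off" value disables profiling. */
    public static boolean parseEnabled(String raw) {
        if (raw == null) {
            return true; // unset: default to enabled
        }
        String v = raw.trim().toLowerCase(Locale.ROOT);
        return !(v.equals("false") || v.equals("off") || v.equals("no") || v.equals("0"));
    }

    /** Reads the JVM-wide flag each time it is called. */
    public static boolean isEnabled() {
        return parseEnabled(System.getProperty(PROPERTY));
    }

    public static void main(String[] args) {
        System.setProperty(PROPERTY, "false");
        System.out.println("enabled = " + isEnabled());
    }
}
```

Under this assumption, a processor would check ProfilingSwitch.isEnabled() at the top of its trigger method and, when it returns false, go straight to the session.transfer( flowfile, SUCCESS ) call shown above.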


Thanks for any suggestions,

Russ


Re: Secure Cluster Mode Issues

2016-11-30 Thread Ricky Saltzer
Hey Andy -

I think I may have figured out the problem. Although the keystorePasswd and
keyPasswd are the same, after completely removing the value for
nifi.security.keyPasswd and restarting NiFi, I'm able to access the web UI
without manually importing the certificate.

On Tue, Nov 29, 2016 at 2:19 PM, Andy LoPresto  wrote:

> Ricky,
>
> When using HTTPS in non-cluster mode, NiFi still requires user
> authentication — this can be either client certificate (perhaps you already
> had one loaded?), LDAP, or Kerberos. If you are able to access the NiFi UI
> over HTTPS without presenting some authentication, something is seriously
> broken. The warning in the browser is because the CA certificate that
> signed the server certificate (the one being presented *to* the browser
> by the application) is not trusted in the browser’s pre-installed trust
> chain. If, for example, that CA cert had been imported to the browser ahead
> of time, or if it was signed by a publicly known entity like DigiCert,
> Verisign, Comodo, etc., you would not receive a warning.
>
> For small teams, client certificates can be manageable, but if you want to
> allow multiple users to connect with minimal identity management, I
> recommend setting up an LDAP server (OpenLDAP, Microsoft ActiveDirectory,
> Apache Directory Studio, etc.) and administering users there. Then the
> users will just enter a username and password into a login field on their
> first connection to NiFi and be authenticated.
>
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Nov 29, 2016, at 11:07 AM, Ricky Saltzer  wrote:
>
> Hey Andy -
>
> Thanks for the reply, I used the openssl command you provided and indeed
> the return code was *OK*. Before proceeding with the recommendation of
> importing the key into my OSX keychain, I would like to understand why this
> is required. When using HTTPS mode in non-clustered mode, it does not require
> clients to have a special key or cert imported to their machine. Instead,
> the client is given a warning in the browser, and it's up to them to
> proceed. This UI will serve as an endpoint to several users, and I would
> really like to avoid the cumbersomeness of having members of multiple teams
> follow instructions for importing keys just so they can access a web UI.
>
> On Tue, Nov 29, 2016 at 1:55 PM, Andy LoPresto 
> wrote:
>
> Hi Ricky,
>
> The ERR_CONNECTION_CLOSED is likely because you are not sending a client
> certificate on the HTTP request. By default, a secured cluster requires
> client certificate authentication unless LDAP or Kerberos are configured as
> identity providers [1]. The TLS Toolkit provides a quick way to generate a
> valid client certificate which you can load into your browser in order to
> access the site.
>
> First, verify the cluster is running and accepting incoming connections
> (we’re going to cheat here just to be quick about it; disclaimer that this
> is not the RIGHT way to do this):
>
> In the directory where you ran the toolkit, you noted there was a
> “nifi-cert.pem” and “nifi-key.key” file. The pem file is the PEM-encoded
> public certificate of the NiFi CA cert that was generated by the toolkit,
> and the key file is the PEM-encoded private key. Because this is the same
> certificate that signed the NiFi server key loaded in the keystore, it is
> also loaded into the truststore. That means the server will accept an
> incoming connection with any certificate signed by the CA cert.
> Coincidentally, the CA cert is self-signed, so…
>
> $ openssl s_client -connect <host:port> -debug -state -cert nifi-cert.pem -key nifi-key.key -CAfile nifi-cert.pem
>
> That command will attempt to negotiate a TLS connection to your server by
> presenting the CA cert and key as the client. Again, not semantically
> correct, but it will technically work. You’ll get a long output, but it
> should end in a section like this:
>
> ---
> New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-SHA384
> Server public key is 2048 bit
> Secure Renegotiation IS supported
> Compression: NONE
> Expansion: NONE
> No ALPN negotiated
> SSL-Session:
>     Protocol  : TLSv1.2
>     Cipher    : ECDHE-RSA-AES256-SHA384
>     Session-ID: 583DCD...9E828C
>     Session-ID-ctx:
>     Master-Key: 5477C0...A51E85
>     Key-Arg   : None
>     PSK identity: None
>     PSK identity hint: None
>     SRP username: None
>     Start Time: 1480445265
>     Timeout   : 300 (sec)
>     Verify return code: 0 (ok)
> ---
>
> The important part is the last line: you want the *Verify return code* to
> be 0 for success. Once you have verified this, run the TLS toolkit again to
> generate a valid client certificate:
>
> $ ./bin/tls-toolkit.sh standalone -C 'CN=Ricky Saltzer, OU=Apache NiFi' -B thisIsABadPassword
>
> This will generate a PKCS12 keystore (*.p12) containing your public
> 

Re: From an external Java program, I want to send some XML to NiFi.

2016-11-30 Thread Andy LoPresto
Hi Srini,

To move data from an external Java process to NiFi, there are a number of 
possible solutions. If the data can be serialized to disk, simply writing it to 
a file in the Java process and using the GetFile processor in NiFi is the 
easiest way.
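
A minimal sketch of the file-based route, under stated assumptions: the class name XmlFileDrop, the file name, and the temporary directory are hypothetical, and the directory stands in for whatever Input Directory GetFile is configured to watch. The XML is written under a dot-prefixed temporary name and then renamed, so a directory watcher does not pick up a half-written file. This relies on GetFile skipping hidden files, which I believe is its default but is worth confirming on your processor.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class XmlFileDrop {
    /**
     * Writes the XML under a temporary dot-prefixed name in the input directory,
     * then renames it to its final name within the same directory.
     */
    public static Path drop(Path inputDir, String fileName, String xml) throws IOException {
        Files.createDirectories(inputDir);
        Path tmp = inputDir.resolve("." + fileName + ".part");
        Path target = inputDir.resolve(fileName);
        Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
        return Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the directory GetFile watches (hypothetical).
        Path dir = Files.createTempDirectory("nifi-input");
        Path written = drop(dir, "order-1.xml", "<order id=\"1\"/>");
        System.out.println("wrote " + written);
    }
}
```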

If you want to use HandleHttpRequest, you will need to enable the 
StandardHttpContextMap controller service [1], as indicated by the error 
message you posted. It appears you have already created it, but you have not 
yet enabled it. After entering the Controller Services management dialog, there 
should be a lightning bolt icon on the right that you can click to enable this 
controller service.

[1] 
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Controller_Services
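
On the Java side, the HTTP route could look roughly like the sketch below. It uses only the JDK's HttpURLConnection; the class name XmlHttpSender, the port 8011, and the path /xml are assumptions for illustration and must match whatever the HandleHttpRequest processor is actually configured to listen on. Note that the flow normally also needs a HandleHttpResponse processor downstream, otherwise the caller waits until the request times out.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class XmlHttpSender {
    /** POSTs the XML body to the given URL and returns the HTTP status code. */
    public static int postXml(URL endpoint, String xml) throws IOException {
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/xml; charset=UTF-8");
            conn.setFixedLengthStreamingMode(body.length);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed address of the HandleHttpRequest listener (hypothetical port and path).
        URL endpoint = new URL(args.length > 0 ? args[0] : "http://localhost:8011/xml");
        System.out.println("HTTP status: " + postXml(endpoint, "<order id=\"1\"/>"));
    }
}
```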

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Nov 30, 2016, at 8:15 AM, srini  wrote:
> 
> Here is my requirement:
> From an external Java program, I want to send some XML to NiFi. Whenever
> NiFi gets that XML, it does something.
> 
> 1) Which NiFi processor do I need to use to receive this request from Java?
> 
> So far, I learned how to start/stop the flow and process from Java. I am
> using NiFi 1.0.
> 
> 2) I have tried to use HandleHttpRequest, but it is showing this Error:
> 
> 'HTTP Context Map' validated against 'StandardHttpContextMap' is invalid
> because Controller Service
> StandardHttpContextMap[id=2c5ddba9-ed6a-4c5f-d911-bfd6468bf19b] is disabled.
> 
> Thanks
> Srini
> 
> 
> 
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/From-an-external-Java-program-I-want-to-send-some-XML-to-NiFi-tp14059.html





From an external Java program, I want to send some XML to NiFi.

2016-11-30 Thread srini
Here is my requirement:
From an external Java program, I want to send some XML to NiFi. Whenever
NiFi gets that XML, it does something.

1) Which NiFi processor do I need to use to receive this request from Java?

So far, I learned how to start/stop the flow and process from Java. I am
using NiFi 1.0.

2) I have tried to use HandleHttpRequest, but it is showing this Error:

'HTTP Context Map' validated against 'StandardHttpContextMap' is invalid
because Controller Service
StandardHttpContextMap[id=2c5ddba9-ed6a-4c5f-d911-bfd6468bf19b] is disabled.

Thanks
Srini



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/From-an-external-Java-program-I-want-to-send-some-XML-to-NiFi-tp14059.html


Re: MiNiFi C++ Data Provenance and Related Issues

2016-11-30 Thread Daniel Cave
I will not be continuing this discussion.  I will leave it to others to pick
it up if they feel it's needed.



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/MiNiFi-C-Data-Provenance-and-Related-Issues-tp14024p14058.html


Re: MiNiFi C++ Data Provenance and Related Issues

2016-11-30 Thread Joe Witt
Regarding the scenario I am highlighting to show the problem of
in-band or in-line provenance exfil, what I was pointing out is not:

MiNiFi -> SystemA -> SystemB -> ... -> NiFi

but rather it is

MiNiFi -> SystemA
MiNiFi -> SystemB

Here the data is sent to A and B in parallel (not in series) and is,
for instance, actually the same piece of data.  This would look like a
"fan out" graph.

The current model that we've followed supports generation and
transmission of the provenance graph regardless of the nature of the
graph of how data flows within the system.  The current approach we
have for exfil of events is to leverage reporting tasks and this too
has worked well.  We can filter events in such tasks, we can manage
bandwidth used, etc. Can we rebase the discussion on the problems we're
trying to solve?  That will help us better discuss solutions to those
problems.  If I look at the original thread I see "#4 and #5" being
used to articulate what I think became the s2s alteration proposal.
But I don't quite follow what #4 or #5 mean, so can we restate/rephrase
the core problem?

Regarding ETL patterns and fundamental disagreement: It wasn't clear
to me what part of the discussion that was referring to and I'm not
familiar with the public papers you've released.  Would be happy to
read through to better understand your perspective. Can you share the
links here?

Regarding contributions and branching: I don't believe anyone has
pushed back on your idea to provide an alternative implementation of
the repositories.  Please do feel free to contribute your alternative
implementation.  It would be great to be able to have both available
and run side by side.  This sort of pluggability also promotes good
interface design to the repositories so it will be healthy regardless
of what the outcome is.

Regarding issues getting contributions into NiFi: Is there a specific
engagement you've found has been left hanging?  I see a couple of
JIRAs and contribs you were involved in that culminated in merged
commits and one that appears to have hit some snags and has not
progressed.  Is that the one you're talking about or are there other
challenges?  Let's take these cases and work through them.

Thanks
Joe


On Tue, Nov 29, 2016 at 10:35 AM, Daniel Cave  wrote:
> "Yes but there can be other hubs too and in parallel."
> [Daniel] For MiNiFi C++ -> SystemA -> SystemB -> ... -> NiFi, if you don't
> want provenance to travel then I don't see it as an issue since the outgoing
> message would be identical to what you have now.  If you feel it's going to
> be extremely confusing then I could make it a new clone of the S2S MiNiFi
> C++ processor, but I don't see a point to just hide a toggle.  On the NiFi
> side for this case you would use the normal S2S intake methods you use now.
> No change.  Also, if you're going from MiNiFi C++ -> SystemA there is no
> change.
> For MiNiFi C++ -> MiNiFi C++ ->-> NiFi, if you want provenance to travel
> then yes you are locked into using n*(MiNiFi C++) -> NiFi with the
> provenance toggled on and using the new S2S receiving processors in MiNiFi
> C++/NiFi (it has to be a new one to avoid backwards compatibility issues)
> that can handle provenance.  Again, I don't see this as an issue either
> since you are clearly wanting this functionality if you're doing this.
> Am I missing something in my logic flow that you are seeing that I need to
> account for?
>
> "You've mentioned this a couple times now. "
> [Daniel] Agreed and this is how this discussion is meant to be taken.
>
> "I'm not quite sure I understand so please elaborate if my
> comments don't apply."
> [Daniel] It has to do with when and how it's consumed.  On the current path Atlas
> won't answer the issues, but as you said there are others and I have my own
> in progress as well.  I fundamentally disagree with the current
> sink-retrieve-sink ETL paradigm (as you've seen from my public papers, there
> are others not public yet as well) as it is a complete waste of time and
> resources at this point.  In all my work, data is handled as available (near
> real-time) rather than waiting for some ETL processes to run at some
> arbitrary point in the future.  By doing this you avoid unnecessary traffic,
> storage, processing, maintenance, and design all while improving data
> availability.  More specifically to this discussion, the issue comes down to
> access from the point of origin.  In an embedded or background instance of
> MiNiFi C++, bidirectional followup calls for provenance only are not always
> going to be available.  Additionally, where they are available they are not
> going to be current and hence are fairly useless for security applications.
> Think of trying this on your laptop, IoT devices, or on financial
> transactions.  If I find out 12-36hrs later when you reconnect or I can send
> someone to the field to retrieve it or the ETL processes run that there was
> an issue, it doesn't do me any good.