Execute script - Python example

2016-02-17 Thread Madhukar Thota
Hi

I am looking for a Python example that creates a new field based on an
attribute value.

Say syslog.facility holds the value 23; based on that value I want to create a
new field with a text value like syslog.facility_label=LOCAL7.

If this transformation is possible with existing processors, please provide an
example or direct me to the right processor.

Thanks in Advance,
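
A minimal ExecuteScript (Jython) sketch of that mapping follows. The session,
REL_SUCCESS, and REL_FAILURE names are bindings ExecuteScript provides; the
facility table is the usual syslog numbering, and UpdateAttribute's advanced
rules could likely express the same lookup without scripting:

    # Look up the facility label and write it to a new attribute.
    # "session", REL_SUCCESS, and REL_FAILURE are bound by ExecuteScript.
    FACILITY_LABELS = {
        '0': 'KERN',    '1': 'USER',    '2': 'MAIL',      '3': 'DAEMON',
        '4': 'AUTH',    '5': 'SYSLOG',  '6': 'LPR',       '7': 'NEWS',
        '8': 'UUCP',    '9': 'CRON',    '10': 'AUTHPRIV', '11': 'FTP',
        '16': 'LOCAL0', '17': 'LOCAL1', '18': 'LOCAL2',   '19': 'LOCAL3',
        '20': 'LOCAL4', '21': 'LOCAL5', '22': 'LOCAL6',   '23': 'LOCAL7',
    }

    flowFile = session.get()
    if flowFile is not None:
        facility = flowFile.getAttribute('syslog.facility')
        label = FACILITY_LABELS.get(facility)
        if label is not None:
            flowFile = session.putAttribute(flowFile,
                                            'syslog.facility_label', label)
            session.transfer(flowFile, REL_SUCCESS)
        else:
            # Unknown or missing facility number
            session.transfer(flowFile, REL_FAILURE)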


Re: Maximum attribute size

2016-02-17 Thread Lars Francke
Thanks a lot for confirming my suspicions.

One last clarification: the WAL is different from the swapping concept,
correct? I guess it's way faster to swap in from a dedicated "dump" than to
replay a WAL.

On Wed, Feb 17, 2016 at 7:53 PM, Joe Witt  wrote:

> Lars,
>
> You are right about the thought process.  We've never provided solid
> guidance here but we should.  It is definitely the case that flow file
> content is streamed to and from the underlying repository, and the only
> way to access it is through that API.  Thus well-behaved extensions
> and the framework itself can handle data essentially as large as the
> underlying repository has space for.  The flow file attributes,
> though, are held in memory in a map on each flowfile object, so it is
> important to avoid vast (undefined) quantities of attributes or
> attributes with really large (undefined) values.
>
> There are things we can and should do to make even this relatively
> transparent to users; it is actually why we support swapping flowfiles
> to disk when queues get large, because even those in-memory attributes
> can really add up.
>
> Thanks
> Joe
>
> On Wed, Feb 17, 2016 at 11:06 AM, Lars Francke 
> wrote:
> > Hi and sorry for all these questions.
> >
> > I know that FlowFile content is persisted to the content_repository and
> > can handle reasonably large amounts of data. Is the same true for
> > attributes?
> >
> > I download JSON files (up to 200kb I'd say) and I want to insert them as
> > they are into a PostgreSQL JSONB column. I'd love to use the PutSQL
> > processor for that but it requires parameters in attributes.
> >
> > I have a feeling that putting large objects in attributes is a bad idea?
>


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Joe Witt
Vincent,

Yeah, you're hitting the nail on the head, from what we're hearing more
and more.  We have a couple of really nice roadmap items to make this
work more like what you're doing now.

Thanks
Joe

On Wed, Feb 17, 2016 at 5:27 PM, Vincent Russell
 wrote:
> My team has played around with version control with NiFi in the
> following way (we have yet to use this for deployments, though):
>
> We version control the flow.xml file and all of the config files that need
> to be changed
> We build a distribution of nifi, gzipping the flow.xml and string-replacing
> properties in the config files with maven
> We then can install this "version" of our nifi app.
>
> We want to be able to use this to test our flows and processes on our test
> system before making it live in production.  But like I said, we have yet to
> actually use this for production deployments.
>
> On Wed, Feb 17, 2016 at 7:21 PM, Jeff - Data Bean Australia
>  wrote:
>>
>> Thanks Matt for describing the feature in such an intuitive way, and
>> pointing out the location for the archive.
>>
>> This looks good. Just wondering whether we also want to archive the
>> templates along with flow.xml.gz.
>>
>> Thanks,
>> Jeff
>>
>> On Thu, Feb 18, 2016 at 11:08 AM, Matthew Clarke
>>  wrote:
>>>
>>> Jeff,
>>>   NiFi gives users the ability to create snapshot backups of their
>>> flow.xml through the "back-up flow" link found under the "controller
>>> settings" (the icon looks like a wrench and screwdriver in the upper right
>>> corner). The default nifi.properties configuration will write these
>>> back-ups to a directory called archive inside the /conf directory, but
>>> you can of course change where they are written.
>>>
>>> Matt
>>>
>>> On Wed, Feb 17, 2016 at 4:52 PM, Jeff - Data Bean Australia
>>>  wrote:

 Thanks Oleg for sharing this. They are definitely useful.

 But my question focused more on keeping the data flow definition files'
 versions, so that Data Flow Developers, or NiFi Cluster Managers in NiFi's
 terms, can keep track of our work.

 Currently I am using the following command line to generate a formatted
 XML to put it into our Git repository:

 cat conf/flow.xml.gz | gzip -dc | xmllint --format -




 On Thu, Feb 18, 2016 at 10:01 AM, Oleg Zhurakousky
  wrote:
>
> Jeff, what you are describing is in the works and actively discussed:
> https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
> and
>
> https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements
>
> The last one may not speak directly to the “ExtensionRegistry”, but if
> you look through the comments there is a whole lot about it, since it is
> dependent.
> Feel free to participate, but I can say for now that it is slated for
> the 1.0 release.
>
> Cheers
> Oleg
>
> On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia
>  wrote:
>
> Hi,
>
> As my NiFi data flow becomes more and more serious, I need to put it
> under version control. Since flow.xml.gz is generated automatically and is
> saved as a compressed file, I am wondering what would be the best practice
> regarding version control?
>
> Thanks,
> Jeff
>
> --
> Data Bean - A Big Data Solution Provider in Australia.
>
>



 --
 Data Bean - A Big Data Solution Provider in Australia.
>>>
>>>
>>
>>
>>
>> --
>> Data Bean - A Big Data Solution Provider in Australia.
>
>


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Vincent Russell
My team has played around with version control with NiFi in the
following way (we have yet to use this for deployments, though):


   - We version control the flow.xml file and all of the config files that
   need to be changed
   - We build a distribution of nifi, gzipping the flow.xml and
   string-replacing properties in the config files with maven
   - We then can install this "version" of our nifi app.

We want to be able to use this to test our flows and processes on our test
system before making it live in production.  But like I said, we have yet to
actually use this for production deployments.
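
A rough sketch of the string-replacing step, assuming standard Maven resource
filtering (the directory name is hypothetical, not our actual layout):

    <!-- pom.xml: expand ${...} placeholders in the checked-in config files -->
    <build>
      <resources>
        <resource>
          <directory>src/main/nifi-conf</directory>
          <filtering>true</filtering>
        </resource>
      </resources>
    </build>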

On Wed, Feb 17, 2016 at 7:21 PM, Jeff - Data Bean Australia <
databean...@gmail.com> wrote:

> Thanks Matt for describing the feature in such an intuitive way, and
> pointing out the location for the archive.
>
> This looks good. Just wondering whether we also want to archive the
> templates along with flow.xml.gz.
>
> Thanks,
> Jeff
>
> On Thu, Feb 18, 2016 at 11:08 AM, Matthew Clarke <
> matt.clarke@gmail.com> wrote:
>
>> Jeff,
>>   NiFi gives users the ability to create snapshot backups of their
>> flow.xml through the "back-up flow" link found under the "controller
>> settings" (the icon looks like a wrench and screwdriver in the upper right
>> corner). The default nifi.properties configuration will write these
>> back-ups to a directory called archive inside the /conf directory, but
>> you can of course change where they are written.
>>
>> Matt
>>
>> On Wed, Feb 17, 2016 at 4:52 PM, Jeff - Data Bean Australia <
>> databean...@gmail.com> wrote:
>>
>>> Thanks Oleg for sharing this. They are definitely useful.
>>>
>>> But my question focused more on keeping the data flow definition files'
>>> versions, so that Data Flow Developers, or NiFi Cluster Managers in NiFi's
>>> terms, can keep track of our work.
>>>
>>> Currently I am using the following command line to generate a formatted
>>> XML to put it into our Git repository:
>>>
>>> cat conf/flow.xml.gz | gzip -dc | xmllint --format -
>>>
>>>
>>>
>>>
>>> On Thu, Feb 18, 2016 at 10:01 AM, Oleg Zhurakousky <
>>> ozhurakou...@hortonworks.com> wrote:
>>>
 Jeff, what you are describing is in the works and actively discussed:
 https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
 and

 https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements

 The last one may not speak directly to the “ExtensionRegistry”, but if
 you look through the comments there is a whole lot about it, since it is
 dependent.
 Feel free to participate, but I can say for now that it is slated for
 the 1.0 release.

 Cheers
 Oleg

 On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia <
 databean...@gmail.com> wrote:

 Hi,

 As my NiFi data flow becomes more and more serious, I need to put it
 under version control. Since flow.xml.gz is generated automatically and is
 saved as a compressed file, I am wondering what would be the best practice
 regarding version control?

 Thanks,
 Jeff

 --
 Data Bean - A Big Data Solution Provider in Australia.



>>>
>>>
>>> --
>>> Data Bean - A Big Data Solution Provider in Australia.
>>>
>>
>>
>
>
> --
> Data Bean - A Big Data Solution Provider in Australia.
>


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Joe Witt
Jeff,

"do we have some tool to compare two flow.xml.gz for some subtle changes?"

Unfortunately, no.  That is what Oleg was referring to.  We're finding
an increasing number of people who are interested in this sort of
Git/diff capability, so we definitely need to get some momentum on it.

Making ordering deterministic for the flow and templates should be
pretty doable.  We already have a feature proposal/JIRA to go after
this.

Thanks
Joe
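
In the meantime, a rough Python sketch of one workaround: canonicalize the XML
before diffing by sorting sibling elements on a stable key. This assumes the
components carry id/name child elements, and it is only an illustration, not
an official tool:

    import gzip
    import xml.etree.ElementTree as ET

    def sort_tree(elem):
        # Recursively order children by tag plus id/name so two dumps of the
        # same flow serialize identically regardless of in-memory ordering.
        children = list(elem)
        for child in children:
            sort_tree(child)
            elem.remove(child)
        children.sort(key=lambda e: (e.tag,
                                     e.findtext('id') or '',
                                     e.findtext('name') or ''))
        elem.extend(children)

    with gzip.open('conf/flow.xml.gz', 'rb') as f:
        tree = ET.parse(f)
    sort_tree(tree.getroot())
    tree.write('flow.canonical.xml', encoding='utf-8')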

On Wed, Feb 17, 2016 at 5:21 PM, Jeff - Data Bean Australia
 wrote:
> Thanks Matt for describing the feature in such an intuitive way, and
> pointing out the location for the archive.
>
> This looks good. Just wondering whether we also want to archive the
> templates along with flow.xml.gz.
>
> Thanks,
> Jeff
>
> On Thu, Feb 18, 2016 at 11:08 AM, Matthew Clarke 
> wrote:
>>
>> Jeff,
>>   NiFi gives users the ability to create snapshot backups of their
>> flow.xml through the "back-up flow" link found under the "controller
>> settings" (the icon looks like a wrench and screwdriver in the upper right
>> corner). The default nifi.properties configuration will write these
>> back-ups to a directory called archive inside the /conf directory, but
>> you can of course change where they are written.
>>
>> Matt
>>
>> On Wed, Feb 17, 2016 at 4:52 PM, Jeff - Data Bean Australia
>>  wrote:
>>>
>>> Thanks Oleg for sharing this. They are definitely useful.
>>>
>>> But my question focused more on keeping the data flow definition files'
>>> versions, so that Data Flow Developers, or NiFi Cluster Managers in NiFi's
>>> terms, can keep track of our work.
>>>
>>> Currently I am using the following command line to generate a formatted
>>> XML to put it into our Git repository:
>>>
>>> cat conf/flow.xml.gz | gzip -dc | xmllint --format -
>>>
>>>
>>>
>>>
>>> On Thu, Feb 18, 2016 at 10:01 AM, Oleg Zhurakousky
>>>  wrote:

 Jeff, what you are describing is in the works and actively discussed:
 https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
 and

 https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements

 The last one may not speak directly to the “ExtensionRegistry”, but if
 you look through the comments there is a whole lot about it, since it is
 dependent.
 Feel free to participate, but I can say for now that it is slated for
 the 1.0 release.

 Cheers
 Oleg

 On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia
  wrote:

 Hi,

 As my NiFi data flow becomes more and more serious, I need to put it
 under version control. Since flow.xml.gz is generated automatically and is
 saved as a compressed file, I am wondering what would be the best practice
 regarding version control?

 Thanks,
 Jeff

 --
 Data Bean - A Big Data Solution Provider in Australia.


>>>
>>>
>>>
>>> --
>>> Data Bean - A Big Data Solution Provider in Australia.
>>
>>
>
>
>
> --
> Data Bean - A Big Data Solution Provider in Australia.


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Jeff - Data Bean Australia
Thanks Joe for pointing out the order issue. Given that, I need to
reconsider my approach, because the original thought was to help
facilitate existing version control tools, such as Git, and compare
different versions on the fly. Given the order issue, this approach doesn't
make more sense than simply storing the gz file.

In this case, do we have some tool to compare two flow.xml.gz files for
subtle changes? I am sure the UI-based auditing is helpful, though.

On Thu, Feb 18, 2016 at 11:07 AM, Joe Witt  wrote:

> Jeff
>
> I think what you're doing is just fine for now.  To Oleg's point we
> should make it better.
>
> We also have a database to which each flow change is written from an
> audit perspective, so we can show in the UI who made the most recent
> changes.  That is less about true CM and more about providing a
> meaningful user experience.
>
> The biggest knock against CM of our current flow.xml.gz and the
> templates is that the order in which their components are serialized
> is not presently guaranteed, so a diff won't be meaningful.  But
> as far as capturing the flow at specific intervals and storing it, you
> should be in good shape with your approach.
>
> Thanks
> Joe
>
> On Wed, Feb 17, 2016 at 4:52 PM, Jeff - Data Bean Australia
>  wrote:
> > Thanks Oleg for sharing this. They are definitely useful.
> >
> > But my question focused more on keeping the data flow definition files'
> > versions, so that Data Flow Developers, or NiFi Cluster Managers in NiFi's
> > terms, can keep track of our work.
> >
> > Currently I am using the following command line to generate a formatted
> > XML to put it into our Git repository:
> >
> > cat conf/flow.xml.gz | gzip -dc | xmllint --format -
> >
> >
> >
> >
> > On Thu, Feb 18, 2016 at 10:01 AM, Oleg Zhurakousky
> >  wrote:
> >>
> >> Jeff, what you are describing is in the works and actively discussed:
> >> https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
> >> and
> >>
> >>
> >> https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements
> >>
> >> The last one may not speak directly to the “ExtensionRegistry”, but if
> >> you look through the comments there is a whole lot about it, since it is
> >> dependent.
> >> Feel free to participate, but I can say for now that it is slated for the
> >> 1.0 release.
> >>
> >> Cheers
> >> Oleg
> >>
> >> On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia
> >>  wrote:
> >>
> >> Hi,
> >>
> >> As my NiFi data flow becomes more and more serious, I need to put it
> >> under version control. Since flow.xml.gz is generated automatically and
> >> is saved as a compressed file, I am wondering what would be the best
> >> practice regarding version control?
> >>
> >> Thanks,
> >> Jeff
> >>
> >> --
> >> Data Bean - A Big Data Solution Provider in Australia.
> >>
> >>
> >
> >
> >
> > --
> > Data Bean - A Big Data Solution Provider in Australia.
>



-- 
Data Bean - A Big Data Solution Provider in Australia.


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Joe Witt
Jeff

I think what you're doing is just fine for now.  To Oleg's point we
should make it better.

We also have a database to which each flow change is written from an
audit perspective, so we can show in the UI who made the most recent
changes.  That is less about true CM and more about providing a
meaningful user experience.

The biggest knock against CM of our current flow.xml.gz and the
templates is that the order in which their components are serialized
is not presently guaranteed, so a diff won't be meaningful.  But
as far as capturing the flow at specific intervals and storing it, you
should be in good shape with your approach.

Thanks
Joe

On Wed, Feb 17, 2016 at 4:52 PM, Jeff - Data Bean Australia
 wrote:
> Thanks Oleg for sharing this. They are definitely useful.
>
> But my question focused more on keeping the data flow definition files'
> versions, so that Data Flow Developers, or NiFi Cluster Managers in NiFi's
> terms, can keep track of our work.
>
> Currently I am using the following command line to generate a formatted XML
> to put it into our Git repository:
>
> cat conf/flow.xml.gz | gzip -dc | xmllint --format -
>
>
>
>
> On Thu, Feb 18, 2016 at 10:01 AM, Oleg Zhurakousky
>  wrote:
>>
>> Jeff, what you are describing is in the works and actively discussed:
>> https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
>> and
>>
>> https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements
>>
>> The last one may not speak directly to the “ExtensionRegistry”, but if
>> you look through the comments there is a whole lot about it, since it is
>> dependent.
>> Feel free to participate, but I can say for now that it is slated for the
>> 1.0 release.
>>
>> Cheers
>> Oleg
>>
>> On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia
>>  wrote:
>>
>> Hi,
>>
>> As my NiFi data flow becomes more and more serious, I need to put it
>> under version control. Since flow.xml.gz is generated automatically and is
>> saved as a compressed file, I am wondering what would be the best practice
>> regarding version control?
>>
>> Thanks,
>> Jeff
>>
>> --
>> Data Bean - A Big Data Solution Provider in Australia.
>>
>>
>
>
>
> --
> Data Bean - A Big Data Solution Provider in Australia.


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Jeff - Data Bean Australia
Thanks Oleg for sharing this. They are definitely useful.

But my question focused more on keeping the data flow definition files'
versions, so that Data Flow Developers, or NiFi Cluster Managers in NiFi's
terms, can keep track of our work.

Currently I am using the following command line to generate a formatted XML
to put it into our Git repository:

cat conf/flow.xml.gz | gzip -dc | xmllint --format -
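
A Python equivalent of that pipeline, for machines without xmllint (the paths
are assumptions; adjust to taste):

    import gzip
    import xml.dom.minidom

    # Decompress flow.xml.gz and pretty-print it so line-oriented Git
    # diffs have something to work with.
    with gzip.open('conf/flow.xml.gz', 'rb') as f:
        raw = f.read()

    pretty = xml.dom.minidom.parseString(raw).toprettyxml(indent='  ')
    with open('flow.formatted.xml', 'w') as out:
        out.write(pretty)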




On Thu, Feb 18, 2016 at 10:01 AM, Oleg Zhurakousky <
ozhurakou...@hortonworks.com> wrote:

> Jeff, what you are describing is in the works and actively discussed:
> https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
> and
>
> https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements
>
> The last one may not speak directly to the “ExtensionRegistry”, but if
> you look through the comments there is a whole lot about it, since it is
> dependent.
> Feel free to participate, but I can say for now that it is slated for the
> 1.0 release.
>
> Cheers
> Oleg
>
> On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia <
> databean...@gmail.com> wrote:
>
> Hi,
>
> As my NiFi data flow becomes more and more serious, I need to put it
> under version control. Since flow.xml.gz is generated automatically and is
> saved as a compressed file, I am wondering what would be the best practice
> regarding version control?
>
> Thanks,
> Jeff
>
> --
> Data Bean - A Big Data Solution Provider in Australia.
>
>
>


-- 
Data Bean - A Big Data Solution Provider in Australia.


Re: Version Control on NiFi flow.xml

2016-02-17 Thread Oleg Zhurakousky
Jeff, what you are describing is in the works and actively discussed:
https://cwiki.apache.org/confluence/display/NIFI/Extension+Registry
and
https://cwiki.apache.org/confluence/display/NIFI/Component+documentation+improvements

The last one may not speak directly to the “ExtensionRegistry”, but if you
look through the comments there is a whole lot about it, since it is dependent.
Feel free to participate, but I can say for now that it is slated for the 1.0
release.

Cheers
Oleg

On Feb 17, 2016, at 3:08 PM, Jeff - Data Bean Australia 
> wrote:

Hi,

As my NiFi data flow becomes more and more serious, I need to put it under
version control. Since flow.xml.gz is generated automatically and is saved as
a compressed file, I am wondering what would be the best practice regarding
version control?

Thanks,
Jeff

--
Data Bean - A Big Data Solution Provider in Australia.



Version Control on NiFi flow.xml

2016-02-17 Thread Jeff - Data Bean Australia
Hi,

As my NiFi data flow becomes more and more serious, I need to put it under
version control. Since flow.xml.gz is generated automatically and is saved as
a compressed file, I am wondering what would be the best practice regarding
version control?

Thanks,
Jeff

-- 
Data Bean - A Big Data Solution Provider in Australia.


Re: Maximum attribute size

2016-02-17 Thread Joe Witt
Lars,

You are right about the thought process.  We've never provided solid
guidance here but we should.  It is definitely the case that flow file
content is streamed to and from the underlying repository, and the only
way to access it is through that API.  Thus well-behaved extensions
and the framework itself can handle data essentially as large as the
underlying repository has space for.  The flow file attributes, though,
are held in memory in a map on each flowfile object, so it is important
to avoid vast (undefined) quantities of attributes or attributes with
really large (undefined) values.

There are things we can and should do to make even this relatively
transparent to users; it is actually why we support swapping flowfiles
to disk when queues get large, because even those in-memory attributes
can really add up.

Thanks
Joe

On Wed, Feb 17, 2016 at 11:06 AM, Lars Francke  wrote:
> Hi and sorry for all these questions.
>
> I know that FlowFile content is persisted to the content_repository and can
> handle reasonably large amounts of data. Is the same true for attributes?
>
> I download JSON files (up to 200kb I'd say) and I want to insert them as
> they are into a PostgreSQL JSONB column. I'd love to use the PutSQL
> processor for that but it requires parameters in attributes.
>
> I have a feeling that putting large objects in attributes is a bad idea?
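
To illustrate the content-API point, a minimal ExecuteScript (Jython) sketch
that streams the JSON body out of the content repository instead of copying it
into an attribute; session and REL_SUCCESS are bindings the processor
provides, and the downstream insert is left as a comment:

    from org.apache.nifi.processor.io import InputStreamCallback
    from org.apache.commons.io import IOUtils
    from java.nio.charset import StandardCharsets

    class CaptureContent(InputStreamCallback):
        # The framework streams content from the repository through
        # process(); nothing sits in the per-flowfile attribute map.
        def __init__(self):
            self.text = None
        def process(self, inputStream):
            self.text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)

    flowFile = session.get()
    if flowFile is not None:
        callback = CaptureContent()
        session.read(flowFile, callback)
        # callback.text now holds the JSON document; hand it to a JDBC
        # insert (or another sink) here instead of a PutSQL attribute.
        session.transfer(flowFile, REL_SUCCESS)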


Re: Nifi 'as a service'?

2016-02-17 Thread Bryan Bende
Keaton,

You can definitely build a REST service in NiFi! I would take a look at
HandleHttpRequest [1] and HandleHttpResponse [2].

HandleHttpRequest would be the entry point of your service; the FlowFiles
coming out of this processor represent the requests being made. You can
then perform whatever logic you need and send a response back with
HandleHttpResponse.

Let us know if that doesn't make sense.

Thanks,

Bryan

[1]
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.HandleHttpRequest/index.html
[2]
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.HandleHttpResponse/index.html
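
For example, a client of such a flow might look like the sketch below; the
host, port, path, and JSON payload are all hypothetical and depend entirely
on how HandleHttpRequest is configured:

    import json
    import urllib.request

    # Ask the flow to copy a source file to one of the predetermined
    # destinations (endpoint and payload fields invented for illustration).
    payload = json.dumps({'source': '/data/incoming/file.csv',
                          'destination': 'hdfs'}).encode('utf-8')
    req = urllib.request.Request('http://nifi-host:8011/copy', data=payload,
                                 headers={'Content-Type': 'application/json'})
    with urllib.request.urlopen(req) as resp:
        print(resp.status, resp.read().decode('utf-8'))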

On Wed, Feb 17, 2016 at 12:58 PM, Keaton Cleve 
wrote:

> Hi,
>
> Would it be possible to use Nifi 'as a service'? If yes, what would be the
> best pattern?
>
> Here is what I have in mind:
>
> I would like to set up a template with several possible predetermined
> destinations. But instead of having predefined sources that I would query
> on a cron-like schedule with GetFile or GetHDFS, I would like to have a
> REST API as an entry point, so users can request to copy a file from any
> directory into one or several of the predetermined destinations (this would
> involve some routing to the correct processor, I guess). The REST API would
> only support specifying sources which the template allows, but the specific
> directory / file would be dynamic.
>
> Does that make any sense?


Re: Generate URL based on different conditions

2016-02-17 Thread Jeff - Data Bean Australia
Thank you Matt and Joe for your help.

On Wed, Feb 17, 2016 at 4:22 PM, Matt Burgess  wrote:

> Here's a Gist template that uses Joe's approach of RouteOnAttribute then
> UpdateAttribute to generate URLs with the use case you described:
> https://gist.github.com/mattyb149/8fd87efa1338a70c
>
> On Tue, Feb 16, 2016 at 9:51 PM, Joe Witt  wrote:
>
>> Jeff,
>>
>> For each of the input files could it be that you would pull data from
>> multiple URLs?
>>
>> Have you had a chance to learn about the NiFi Expression language?
>> That will come in quite handy for constructing the URL used in
>> InvokeHTTP.
>>
>> The general pattern I think makes sense here is:
>> - Gather Data
>> - Extract Features from data to construct URL
>> - Fetch document/response from URL
>>
>> During 'Gather Data' you acquire the files.
>>
>> During 'Extract features' you pull out elements of the content of the
>> file into flow file attributes.  You can use RouteOnAttribute to send
>> to an UpdateAttribute processor which constructs a new attribute of
>> URL pattern A or URL pattern B respectively.  You can also collapse
>> that into a single UpdateAttribute possibly using the advanced UI and
>> set specific URLs based on patterns of attributes.  Lots of ways to
>> slice that.
>>
>> During 'Fetch document' you should be able to have just a single
>> InvokeHTTP which looks at some attribute you've defined, say 'the-url',
>> and specify InvokeHTTP's remote URL value to be
>> "${the-url}".
>>
>> We should publish a template for this pattern/approach if we've not
>> already, but let's see how you progress and decide what would be most
>> useful for others.
>>
>> Thanks
>> Joe
>>
>> On Tue, Feb 16, 2016 at 9:36 PM, Jeff - Data Bean Australia
>>  wrote:
>> > Hi,
>> >
>> > I got a use case like this:
>> >
>> > There are two files, say fileA and fileB; both of them contain multiple
>> > lines of items and are used to generate URLs. However, the algorithm for
>> > generating URLs is different. If items come from fileA, the URL template
>> > looks like this:
>> >
>> > foo--foo
>> >
>> > If items come from fileB, the template looks like this:
>> >
>> > bar--foo--whatever
>> >
>> > I am going to create a NiFi template for the data flow, from reading the
>> > list file up to downloading data using InvokeHTTP, and place an
>> > UpdateAttribute processor in front of the template to feed in different
>> > file names (I have only two files).
>> >
>> > The problem I have so far is how to generate the URLs based on different
>> > input, so that I can make a general NiFi template for reusability.
>> >
>> > Thanks,
>> > Jeff
>> >
>> >
>> >
>> > --
>> > Data Bean - A Big Data Solution Provider in Australia.
>>
>
>


-- 
Data Bean - A Big Data Solution Provider in Australia.
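
For reference, a minimal sketch of the property settings in the pattern above,
with the attribute names and URL shapes invented for illustration:

    RouteOnAttribute (one dynamic property per source file):
      from-file-a = ${filename:equals('fileA')}
      from-file-b = ${filename:equals('fileB')}
    UpdateAttribute on the from-file-a branch:
      the-url = http://example.com/foo-${item}-foo
    UpdateAttribute on the from-file-b branch:
      the-url = http://example.com/bar-${item}-whatever
    InvokeHTTP:
      Remote URL = ${the-url}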


Nifi 'as a service'?

2016-02-17 Thread Keaton Cleve
Hi,

Would it be possible to use Nifi 'as a service'? If yes, what would be the
best pattern?

Here is what I have in mind:

I would like to set up a template with several possible predetermined
destinations. But instead of having predefined sources that I would query on
a cron-like schedule with GetFile or GetHDFS, I would like to have a REST API
as an entry point, so users can request to copy a file from any directory
into one or several of the predetermined destinations (this would involve
some routing to the correct processor, I guess). The REST API would only
support specifying sources which the template allows, but the specific
directory / file would be dynamic.

Does that make any sense?