Re: nifi questions

2017-07-05 Thread Clay Teahouse
Thank you for the very helpful feedback.

On Wed, Jul 5, 2017 at 1:46 PM, Joe Witt  wrote:

> ah!  Good call Mike and thanks for adding that.
>
> On Wed, Jul 5, 2017 at 2:21 PM, Michael Hogue
>  wrote:
> > Clay,
> >
> >  Regarding number one, Joe is correct. There currently isn't a processor
> > that can process arbitrary protobuf messages, but InvokeGRPC and
> > ListenGRPC were recently added (targeting 1.4.0) that can accept and send
> > gRPC messages (which wrap protobuf) defined by an IDL [1].
> >
> > There's a how-to article with a few examples:
> > https://cwiki.apache.org/confluence/display/NIFI/Leveraging+gRPC+Processors
> >
> > It's hard for me to say if either would meet your use case, but I thought
> > I'd at least mention their existence here.
> >
> > Thanks,
> > Mike
> >
> > [1]
> > https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-grpc-bundle/nifi-grpc-processors/src/main/resources/proto
> >
> > On Tue, Jul 4, 2017 at 12:08 PM Joe Witt  wrote:
> >
> >> Clay
> >>
> >> Here are some answers to each.  Happy to discuss further.
> >>
> >> #1) No processors exist in the apache nifi codebase to receive or send
> >> data using google protobuf that I know of right now.  This could work
> >> very well with our record oriented format and schema aware
> >> readers/writers though so perhaps it would be a good mode to offer.
> >> If you're interested in contributing in this area please let us know.
> >>
> >> #2) All of our processors that accept messages such as Kafka, MQTT,
> >> amqp, jms, etc.. can bring in any format/schema of data.  They are
> >> effectively just bringing in binary data.  When it comes to
> >> routing/transforming/etc.. that is when it really matters about being
> >> format/schema aware.  We have built in support already for a number of
> >> formats/schemas but more importantly with the recent 1.2.0/1.3.0
> >> releases we've added this record concept I've mentioned.  This lets us
> >> have a series of common patterns/processors to handle records as a
> >> concept and then plug in various readers/writers which understand
> >> the specifics of serialization/deserialization.  So, it would be easy
> >> to extend the various methods of acquiring and delivering data for
> >> whatever record oriented data you have.  For now though with regard to
> >> protobuf I don't think we have anything out of the box.
> >>
> >> #3) I think my answer in #2 will help and I strongly encourage you to
> >> take a look here [1] and here [2]. In short, yes we offer a ton of
> >> flexibility in how you handle record oriented data.  As is generally
> >> the case in NiFi you should get a great deal of reuse out of existing
> >> capabilities with minimal need to customize.
> >>
> >> #4) I don't know how NiFi compares in performance to Apache Kafka's
> >> Connect concept or to any other project in a generic sense.  What we
> >> know is what NiFi is designed for and the use cases it is used
> >> against.  NiFi and Kafka Connect have very different execution models.
> >> With NiFi for common record oriented use cases including format and
> >> schema aware acquisition, routing, enrichment, conversion, and
> >> delivery of data achieving hundreds of thousands of records per second
> >> throughput is straightforward while also running a number of other
> >> flows on structured and unstructured data as well.  Just depends on
> >> your configuration, needs, and what the appropriate execution model
> >> is.  NiFi offers you a data broker in which you put the logic for how
> >> to handle otherwise decoupled producers and consumers.  Driving data
> >> into and out of Kafka with NiFi is very common.
> >>
> >> [1] https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
> >> [2] https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
> >> [3]
http://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries
> >>
> >> Thanks
> >> Joe
> >>
> >> On Tue, Jul 4, 2017 at 11:21 AM, Clay Teahouse 
> >> wrote:
> >> > Hello All,
> >> >
> >> > I am new to nifi. I'd appreciate your help with some questions.
> >> >
> >> > 1) Is there a processor like TCP Listen that would work with protobuf
> >> > messages? In particular, I am interested in processing protobuf
> >> > messages prefixed with the length of the message.
> >> > 2) Is there a processor like Consume MQTT that would work with
> >> > protobuf messages?
> >> > 3) As a general question, as with kafka connect, does nifi have a
> >> > means for specifying the converters, declaratively, or do I need to
> >> > write a separate processor for each converter?
> >> > 4) How does nifi compare to kafka connect, in terms of performance?
> >> >
> >> > thanks
> >> > Clay
> >>
>
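Clay's question #1 concerns length-prefixed protobuf framing over TCP. As a rough sketch of the deframing logic such a processor (or a custom one) would need, here is a minimal example in Python, assuming a 4-byte big-endian length prefix; the prefix width and endianness are assumptions for illustration, not something the thread specifies:

```python
import struct

def frame(payload: bytes) -> bytes:
    """Prepend a 4-byte big-endian length prefix to a serialized message."""
    return struct.pack(">I", len(payload)) + payload

def deframe(buffer: bytes):
    """Split a byte stream into complete length-prefixed messages.

    Returns (messages, remainder), where remainder holds any trailing
    partial frame that should be retained until more bytes arrive.
    """
    messages = []
    while len(buffer) >= 4:
        (length,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + length:
            break  # incomplete frame; wait for more data
        messages.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return messages, buffer

if __name__ == "__main__":
    # Two frames, the second cut short mid-stream to simulate a partial read.
    stream = frame(b"\x08\x01") + frame(b"\x12\x03abc")[:5]
    msgs, rest = deframe(stream)
    print(msgs)  # first message recovered; partial second frame kept in `rest`
```

The payload bytes here would be the serialized protobuf message; the framing layer never needs to understand the schema, which is why Joe's point about pluggable readers/writers applies cleanly on top of it.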


Re: Filesystem based Schema Registry, does it make sense?

2017-07-05 Thread Andre
Joe,

At least in my case, I would be happy to control versioning outside nifi
(e.g. via git).

I suspect that by adopting an approach like this, versioning would be
handled pretty much like it is in the AvroSchemaRegistry?

I assumed that with AvroSchemaRegistry you either name your schema with
versioning in mind (i.e. mySchemav1) or you need to overwrite the schema
upon change? Am I missing something?

Cheers

On 6 Jul 2017 1:59 AM, "Joe Witt"  wrote:

I think it does make sense, and someone at a meetup asked a similar
question.  There are some things to be considered, like how does one
annotate the version of a schema, the name, etc. when all they are
providing are files in a directory?  How can they support multiple versions
of a given schema (or maybe they just don't in this approach)?  But there is
no question that being able to just push an avsc file into a directory and
then have it be usable in the flow could be helpful.

On Jul 5, 2017 9:00 AM, "Andre"  wrote:

dev,

As I continue to explore the Record based processors I got myself wondering:

Does it make sense to have a file-system based schema registry?

The idea would be to create something like AvroSchemaRegistry, but instead
of adding each schema as a controller service property, we would have a
property pointing to a directory.

Each avsc file within that directory would then be validated with the root
"name" within the Avro schema used as the schema name (i.e. the equivalent
to AvroSchemaRegistry property name).

The rationale is that while the Hortonworks and Avro Schema Registries
work, I reckon one is sort of overkill for edge/DMZ NiFi deployments and
the other is painful to update in case of multiple NiFi clusters.

Having a file based registry with inotify or something of the sort would be
great for the folks already using external configuration management.


What do you think?
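To make the proposal concrete, here is a minimal sketch of the lookup behavior being described: scan a directory for .avsc files and key each schema by its root "name" field, mirroring how AvroSchemaRegistry keys schemas by property name. This is an illustration of the idea, not NiFi's actual controller-service API, and the inotify-style watching is omitted:

```python
import json
from pathlib import Path

def load_schema_registry(directory: str) -> dict:
    """Build a name -> schema map from all .avsc files in a directory.

    The root "name" field of each Avro schema serves as the registry key,
    the equivalent of the AvroSchemaRegistry property name.
    """
    registry = {}
    for avsc in sorted(Path(directory).glob("*.avsc")):
        schema = json.loads(avsc.read_text())
        name = schema.get("name")
        if not name:
            raise ValueError(f"{avsc} has no root 'name' field")
        if name in registry:
            raise ValueError(f"duplicate schema name '{name}' in {avsc}")
        registry[name] = schema
    return registry
```

Versioning, as Andre suggests, would then live entirely outside the registry (e.g. in git), with a re-scan or inotify event picking up overwritten files.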


UI Not Responsive

2017-07-05 Thread Karthik Kothareddy (karthikk) [CONT - Type 2]
All,

I am currently running NiFi 1.2.0 on a Linux (RHEL) machine. Everything was 
running fine until yesterday, when it started behaving weirdly. The UI is not at 
all responsive, and sometimes the page won't even load. I bumped up the Java 
heap space just to make sure I'm not overloading the system to a point where I 
cannot load the UI. I looked at nifi-bootstrap.log and it says the instance 
is running:

INFO [main] org.apache.nifi.bootstrap.Command Apache NiFi is currently running, 
listening to Bootstrap on port 54224, PID=41602

However, in nifi-app.log I found something interesting; the trace is below, 
and I keep getting this warning almost every 5-10 seconds:

2017-07-05 20:00:02,032 WARN [NiFi Web Server-16-acceptor-0@6fecd17b-ServerConnector@a852{SSL,[ssl, http/1.1]}{server-name:8443}] o.eclipse.jetty.server.AbstractConnector
java.nio.channels.ClosedSelectorException: null
   at sun.nio.ch.SelectorImpl.keys(SelectorImpl.java:68) ~[na:1.8.0_101]
   at org.eclipse.jetty.io.ManagedSelector.size(ManagedSelector.java:104) ~[jetty-io-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.io.SelectorManager.chooseSelector(SelectorManager.java:190) ~[jetty-io-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.io.SelectorManager.accept(SelectorManager.java:232) ~[jetty-io-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.io.SelectorManager.accept(SelectorManager.java:217) ~[jetty-io-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.server.ServerConnector.accepted(ServerConnector.java:383) ~[jetty-server-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.server.ServerConnector.accept(ServerConnector.java:374) ~[jetty-server-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:593) ~[jetty-server-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.9.v20160517.jar:9.3.9.v20160517]
   at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.9.v20160517.jar:9.3.9.v20160517]
   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]

Does anyone have similar problems launching the UI? Any help around this is 
appreciated.


Thanks
Karthik






Re: evaluateJsonPath processor

2017-07-05 Thread Yuri Krysko
Hi Matt,

Thanks. "$" works well.

On 7/5/17, 4:09 PM, "Matt Burgess"  wrote:

>Yuri,
>
>If your Return Type is set to "json" then you should be able to use
>"$" as the JSON Path rather than "$."
>
>Also to put the entire content into an attribute you could use
>ExtractText to match the entire body, this works whether the file is
>JSON or not.
>
>Regards,
>Matt
>
>On Wed, Jul 5, 2017 at 4:02 PM, Yuri Krysko 
>wrote:
>> Hello Everyone,
>>
>> Could anyone please advise why the $. expression is considered invalid in
>>the EvaluateJsonPath processor? The idea is to copy the entire content of a
>>JSON-formatted flowfile into its attribute.
>>
>> Thanks,
>> Yuri
>>
>> 
>>
>> LEGAL DISCLAIMER: M.C. Dean, Inc. and its subsidiaries considers this
>>e-mail and any files transmitted with it to be protected, proprietary or
>>privileged information intended solely for the use of the named
>>recipient(s). Any disclosure of this material or the information
>>contained herein, in whole or in part, to anyone outside of the intended
>>recipient or affiliates is strictly prohibited. M. C. Dean, Inc. accepts
>>no liability for the content of this e-mail or for the consequences of
>>any actions taken on the basis of the information contained in it,
>>unless that information is subsequently confirmed in writing. Employees
>>of M.C. Dean, Inc. are instructed not to infringe on any rights of the
>>recipient; any such communication violates company policy. If you are
>>not the intended recipient, any disclosure, copying, distribution, or
>>action taken or omitted in reliance on this information is strictly
>>prohibited by M.C. Dean, Inc.; please notify the sender immediately by
>>return e-mail, delete this communication and destroy all copies.






Re: evaluateJsonPath processor

2017-07-05 Thread Matt Burgess
Yuri,

If your Return Type is set to "json" then you should be able to use
"$" as the JSON Path rather than "$."

Also to put the entire content into an attribute you could use
ExtractText to match the entire body, this works whether the file is
JSON or not.

Regards,
Matt
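
A quick illustration of both options Matt describes, outside NiFi: with Return Type "json", the path "$" selects the root, i.e. the whole document, while a DOTALL-anchored regex of the kind you would configure in ExtractText captures the entire body whether or not it is JSON. The regex below is an assumed example pattern, not a NiFi default:

```python
import json
import re

# ExtractText-style capture: (?s) makes '.' span newlines,
# so the single group is the whole body.
WHOLE_BODY = re.compile(r"(?s)^(.*)$")

def body_to_attribute(content: str) -> str:
    """Capture the entire flowfile content, JSON or not."""
    return WHOLE_BODY.match(content).group(1)

content = '{"id": 1,\n "name": "nifi"}'
attribute = body_to_attribute(content)
assert attribute == content               # works for any text
assert json.loads(attribute)["id"] == 1   # and it is still valid JSON
```

The reason "$." is rejected is that the trailing dot introduces a child selector with no child name; "$" alone already denotes the root.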

On Wed, Jul 5, 2017 at 4:02 PM, Yuri Krysko  wrote:
> Hello Everyone,
>
> Could anyone please advise why the $. expression is considered invalid in 
> the EvaluateJsonPath processor? The idea is to copy the entire content of a 
> JSON-formatted flowfile into its attribute.
>
> Thanks,
> Yuri
>
> 
>


evaluateJsonPath processor

2017-07-05 Thread Yuri Krysko
Hello Everyone,

Could anyone please advise why the $. expression is considered invalid in 
the EvaluateJsonPath processor? The idea is to copy the entire content of a 
JSON-formatted flowfile into its attribute.

Thanks,
Yuri





Re: Custom NAR interfering with BundleUtils.findBundleForType

2017-07-05 Thread Scott Wagner

To provide closure to this thread:

Bryan, the exception indeed was being thrown there, but the root cause 
was that "ghost" processors were being created in place of the actual 
processors, because my custom NAR was confusing 
BundleUtils.findBundleForType.


Joe, thank you very much for the suggestion of what to look into.  We 
had copied most of the InvokeHTTP processor into our NAR with some 
modifications, but it was still including references to classes in 
nifi-standard-processors which was being dragged in through the pom.  I 
was able to extricate the 2 classes I need and copy them into my own 
project to at least get NiFi up and running; while this is not the 
correct permanent solution, I was able to move past the startup problems.


Thanks for being part of such a great and responsive community.

- Scott


Bryan Bende 
Wednesday, July 5, 2017 12:07 PM
Scott,

Thanks for providing the stacktrace... do any of your custom
processors use the @DefaultSchedule annotation? And if so, do any of
them set the tasks to a number less than 1?

The exception you are getting is from some code that prevents using
0 or a negative number of tasks for a processor that is not scheduled
for event-driven execution; basically, event-driven is the only mode
where it would make sense to have less than 1 task.

What Joe mentioned about including other standard processors in your
NAR could very well still be a problem, but that stacktrace might be
something else.

-Bryan

Scott Wagner 
Wednesday, July 5, 2017 11:44 AM
Hi Joe,

We are extending AbstractProcessor for our processors, and 
AbstractControllerService for our controller service.  However, we did 
include the InvokeHTTP processor with some modifications that are 
referencing some other classes that are in the 
nifi-processors-standard JAR.  I will look into breaking those out to 
remove that dependency.


The actual error that we are getting is below:

2017-07-04 12:28:29,076 WARN [main] org.apache.nifi.web.server.JettyServer Failed to start web server... shutting down.
org.apache.nifi.controller.serialization.FlowSynchronizationException: java.lang.IllegalArgumentException
at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:426)
at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1576)
at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:84)
at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:722)
at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:533)
at org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:72)
at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:876)
at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:532)
at org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:839)
at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:344)
at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1480)
at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1442)
at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:799)
at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:261)
at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:540)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:290)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)

at org.eclipse.jetty.server.Server.start(Server.java:452)
at 

Re: MiNiFi & SSL Context

2017-07-05 Thread davidrsmith
Marc

Thanks for the reply. Yes, I tried sending from MiNiFi to NiFi using an HTTP 
post (using username/password). In operations I would need to use SSL, but I 
know that in NiFi the SSL context is set up in the UI and stored in the 
flow.xml.gz file, as I created my flow in NiFi and then templated it before 
transforming it to a yml file.
I just wondered how I would get the SSL context set up in MiNiFi, or whether 
I need to set it up in NiFi and again transfer it across.
I will follow the links you posted and give it a try.

Many thanks 
Dave




Sent from Samsung tablet

 Original message 
From Marc  
Date: 05/07/2017  16:59  (GMT+00:00) 
To dev@nifi.apache.org,DAVID SMITH  
Subject Re: MiNiFi & SSL Context 
 
Hello David,
   I'm making the assumption here that you are attempting to use an
SSLContextService within a consumer such as InvokeHTTP:

   The configuration for MiNiFi Java will be very similar to that of NiFi.
The System guide [1] for MiNiFi references the NiFi System administration
guide's security configuration section [2]. As per the MiNIFi guide:
    "A StandardSSLContextService will be made automatically with the ID
"SSL-Context-Service" if "ssl protocol" is configured."

    Once these properties are defined you can use the
StandardSSLContextService.

   Am I off base in my assumption? Are you attempting to do something else?
Please let me know if I can provide more details. Thanks!

[1] https://nifi.apache.org/minifi/system-admin-guide.html
[2] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#
security-configuration

  Thanks,
  Marc


On Wed, Jul 5, 2017 at 7:35 AM, DAVID SMITH 
wrote:

> Hi
> I tried the Java version of MiNiFi yesterday and was very impressed; I do,
> however, have a question.
> What is the best way to set up an SSL context in MiNiFi so that I can do
> HTTPS rather than HTTP?
> Many thanks
> Dave
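
Per the MiNiFi system administration guide Marc references, the SSL context comes from the Security Properties section of config.yml rather than from the UI. A sketch of what that section looks like follows; the paths and passwords are placeholders to replace with your own, and the exact key names should be checked against the guide for your MiNiFi version:

```yaml
Security Properties:
  keystore: /opt/minifi/certs/keystore.jks        # placeholder path
  keystore type: JKS
  keystore password: changeit                     # placeholder
  key password: changeit                          # placeholder
  truststore: /opt/minifi/certs/truststore.jks    # placeholder path
  truststore type: JKS
  truststore password: changeit                   # placeholder
  ssl protocol: TLS
```

As the guide quoted above says, once "ssl protocol" is configured a StandardSSLContextService is created automatically with the ID "SSL-Context-Service", which processors such as InvokeHTTP can then reference.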


Re: nifi questions

2017-07-05 Thread Joe Witt
ah!  Good call Mike and thanks for adding that.

On Wed, Jul 5, 2017 at 2:21 PM, Michael Hogue
 wrote:
> Clay,
>
>  Regarding number one, Joe is correct. There currently isn't a processor that
> can process arbitrary protobuf messages, but InvokeGRPC and ListenGRPC were
> recently added (targeting 1.4.0) that can accept and send gRPC messages
> (which wrap protobuf) defined by an IDL [1].
>
> There's a how-to article with a few examples:
> https://cwiki.apache.org/confluence/display/NIFI/Leveraging+gRPC+Processors
>
> It's hard for me to say if either would meet your use case, but I thought
> I'd at least mention their existence here.
>
> Thanks,
> Mike
>
> [1]
> https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-grpc-bundle/nifi-grpc-processors/src/main/resources/proto
>
> On Tue, Jul 4, 2017 at 12:08 PM Joe Witt  wrote:
>
>> Clay
>>
>> Here are some answers to each.  Happy to discuss further.
>>
>> #1) No processors exist in the apache nifi codebase to receive or send
>> data using google protobuf that I know of right now.  This could work
>> very well with our record oriented format and schema aware
>> readers/writers though so perhaps it would be a good mode to offer.
>> If you're interested in contributing in this area please let us know.
>>
>> #2) All of our processors that accept messages such as Kafka, MQTT,
>> amqp, jms, etc.. can bring in any format/schema of data.  They are
>> effectively just bringing in binary data.  When it comes to
>> routing/transforming/etc.. that is when it really matters about being
>> format/schema aware.  We have built in support already for a number of
>> formats/schemas but more importantly with the recent 1.2.0/1.3.0
>> releases we've added this record concept I've mentioned.  This lets us
>> have a series of common patterns/processors to handle records as a
>> concept and then plug in various readers/writers which understand
>> the specifics of serialization/deserialization.  So, it would be easy
>> to extend the various methods of acquiring and delivering data for
>> whatever record oriented data you have.  For now though with regard to
>> protobuf I don't think we have anything out of the box.
>>
>> #3) I think my answer in #2 will help and I strongly encourage you to
>> take a look here [1] and here [2]. In short, yes we offer a ton of
>> flexibility in how you handle record oriented data.  As is generally
>> the case in NiFi you should get a great deal of reuse out of existing
>> capabilities with minimal need to customize.
>>
>> #4) I don't know how NiFi compares in performance to Apache Kafka's
>> Connect concept or to any other project in a generic sense.  What we
>> know is what NiFi is designed for and the use cases it is used
>> against.  NiFi and Kafka Connect have very different execution models.
>> With NiFi for common record oriented use cases including format and
>> schema aware acquisition, routing, enrichment, conversion, and
>> delivery of data achieving hundreds of thousands of records per second
>> throughput is straightforward while also running a number of other
>> flows on structured and unstructured data as well.  Just depends on
>> your configuration, needs, and what the appropriate execution model
>> is.  NiFi offers you a data broker in which you put the logic for how
>> to handle otherwise decoupled producers and consumers.  Driving data
>> into and out of Kafka with NiFi is very common.
>>
>> [1] https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
>> [2] https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
>> [3]
>> http://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries
>>
>> Thanks
>> Joe
>>
>> On Tue, Jul 4, 2017 at 11:21 AM, Clay Teahouse 
>> wrote:
>> > Hello All,
>> >
>> > I am new to nifi. I'd appreciate your help with some questions.
>> >
>> > 1) Is there a processor like TCP Listen that would work with protobuf
>> > messages? In particular, I am interested in processing protobuf messages
>> > prefixed with the length of the message.
>> > 2) Is there a processor like Consume MQTT that would work with protobuf
>> > messages?
>> > 3) As a general question, as with kafka connect, does nifi have a means
>> for
>> > specifying the converters, declaratively, or do I need to write a
>> separate
>> > processor for each converter?
>> > 4) How does nifi compare to kafka connect, in terms of performance?
>> >
>> > thanks
>> > Clay
>>
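
The record concept Joe describes in #2 and #3 — common format-agnostic processors plus pluggable readers/writers that own serialization — can be illustrated with a toy analogue. This is a sketch of the pattern only, not NiFi's actual RecordReader/RecordSetWriter API:

```python
import csv
import io
import json

# Toy analogue of pluggable readers/writers: a reader turns raw content
# into a list of record dicts; a writer serializes records back out.
def json_lines_reader(data: str):
    return [json.loads(line) for line in data.splitlines() if line.strip()]

def csv_writer(records):
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=sorted(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

def convert(data, reader, writer):
    """A 'ConvertRecord'-style step: format-agnostic, delegates to plugins."""
    return writer(reader(data))

result = convert('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n',
                 json_lines_reader, csv_writer)
print(result)
```

The point of the design is that `convert` (and routing, enrichment, etc.) never changes when a new format arrives; only a new reader or writer plugin is added, which is why Joe notes it would be easy to extend this to protobuf.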


Re: nifi questions

2017-07-05 Thread Michael Hogue
Clay,

 Regarding number one, Joe is correct. There currently isn't a processor that
can process arbitrary protobuf messages, but InvokeGRPC and ListenGRPC were
recently added (targeting 1.4.0) that can accept and send gRPC messages
(which wrap protobuf) defined by an IDL [1].

There's a how-to article with a few examples:
https://cwiki.apache.org/confluence/display/NIFI/Leveraging+gRPC+Processors

It's hard for me to say if either would meet your use case, but I thought
I'd at least mention their existence here.

Thanks,
Mike

[1]
https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-grpc-bundle/nifi-grpc-processors/src/main/resources/proto

On Tue, Jul 4, 2017 at 12:08 PM Joe Witt  wrote:

> Clay
>
> Here are some answers to each.  Happy to discuss further.
>
> #1) No processors exist in the apache nifi codebase to receive or send
> data using google protobuf that I know of right now.  This could work
> very well with our record oriented format and schema aware
> readers/writers though so perhaps it would be a good mode to offer.
> If you're interested in contributing in this area please let us know.
>
> #2) All of our processors that accept messages such as Kafka, MQTT,
> amqp, jms, etc.. can bring in any format/schema of data.  They are
> effectively just bringing in binary data.  When it comes to
> routing/transforming/etc.. that is when it really matters about being
> format/schema aware.  We have built in support already for a number of
> formats/schemas but more importantly with the recent 1.2.0/1.3.0
> releases we've added this record concept I've mentioned.  This lets us
> have a series of common patterns/processors to handle records as a
> concept and then plug in various readers/writers which understand
> the specifics of serialization/deserialization.  So, it would be easy
> to extend the various methods of acquiring and delivering data for
> whatever record oriented data you have.  For now though with regard to
> protobuf I don't think we have anything out of the box.
>
> #3) I think my answer in #2 will help and I strongly encourage you to
> take a look here [1] and here [2]. In short, yes we offer a ton of
> flexibility in how you handle record oriented data.  As is generally
> the case in NiFi you should get a great deal of reuse out of existing
> capabilities with minimal need to customize.
>
> #4) I don't know how NiFi compares in performance to Apache Kafka's
> Connect concept or to any other project in a generic sense.  What we
> know is what NiFi is designed for and the use cases it is used
> against.  NiFi and Kafka Connect have very different execution models.
> With NiFi for common record oriented use cases including format and
> schema aware acquisition, routing, enrichment, conversion, and
> delivery of data achieving hundreds of thousands of records per second
> throughput is straightforward while also running a number of other
> flows on structured and unstructured data as well.  Just depends on
> your configuration, needs, and what the appropriate execution model
> is.  NiFi offers you a data broker in which you put the logic for how
> to handle otherwise decoupled producers and consumers.  Driving data
> into and out of Kafka with NiFi is very common.
>
> [1] https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
> [2] https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
> [3]
> http://bryanbende.com/development/2017/06/20/apache-nifi-records-and-schema-registries
>
> Thanks
> Joe
>
> On Tue, Jul 4, 2017 at 11:21 AM, Clay Teahouse 
> wrote:
> > Hello All,
> >
> > I am new to nifi. I'd appreciate your help with some questions.
> >
> > 1) Is there a processor like TCP Listen that would work with protobuf
> > messages? In particular, I am interested in processing protobuf messages
> > prefixed with the length of the message.
> > 2) Is there a processor like Consume MQTT that would work with protobuf
> > messages?
> > 3) As a general question, as with kafka connect, does nifi have a means
> for
> > specifying the converters, declaratively, or do I need to write a
> separate
> > processor for each converter?
> > 4) How does nifi compare to kafka connect, in terms of performance?
> >
> > thanks
> > Clay
>


Re: Custom NAR interfering with BundleUtils.findBundleForType

2017-07-05 Thread Bryan Bende
Scott,

Thanks for providing the stacktrace... do any of your custom
processors use the @DefaultSchedule annotation? And if so, do any of
them set the tasks to a number less than 1?

The exception you are getting is from some code that prevents using
0 or a negative number of tasks for a processor that is not scheduled
for event-driven execution; basically, event-driven is the only mode
where it would make sense to have less than 1 task.

What Joe mentioned about including other standard processors in your
NAR could very well still be a problem, but that stacktrace might be
something else.

-Bryan


On Wed, Jul 5, 2017 at 12:44 PM, Scott Wagner  wrote:
> Hi Joe,
>
> We are extending AbstractProcessor for our processors, and
> AbstractControllerService for our controller service.  However, we did
> include the InvokeHTTP processor with some modifications that are
> referencing some other classes that are in the nifi-processors-standard JAR.
> I will look into breaking those out to remove that dependency.
>
> The actual error that we are getting is below:
>
> 2017-07-04 12:28:29,076 WARN [main] org.apache.nifi.web.server.JettyServer Failed to start web server... shutting down.
> org.apache.nifi.controller.serialization.FlowSynchronizationException: java.lang.IllegalArgumentException
> at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:426)
> at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1576)
> at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:84)
> at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:722)
> at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:533)
> at org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:72)
> at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:876)
> at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:532)
> at org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:839)
> at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:344)
> at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1480)
> at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1442)
> at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:799)
> at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:261)
> at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:540)
> at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
> at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
> at
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> at
> org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:290)
> at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> at org.eclipse.jetty.server.Server.start(Server.java:452)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
> at
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> at org.eclipse.jetty.server.Server.doStart(Server.java:419)
> at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at
> org.apache.nifi.web.server.JettyServer.start(JettyServer.java:705)
> at org.apache.nifi.NiFi.<init>(NiFi.java:160)
> at org.apache.nifi.NiFi.main(NiFi.java:267)
> Caused by: java.lang.IllegalArgumentException: null
> at
> org.apache.nifi.controller.StandardProcessorNode.setMaxConcurrentTasks(StandardProcessorNode.java:620)
> at
> org.apache.nifi.controller.StandardFlowSynchronizer.updateProcessor(StandardFlowSynchronizer.java:985)
> at
> 

Re: Communication between flows

2017-07-05 Thread Bryan Bende
Steve,

In 1.2.0 there were some new processors added called Wait/Notify...

With those you could send your original JSON (before splitting) to a
Wait processor and tell it to wait until the signal count is equal to
the number of splits, then you could put a Notify processor right
after PutMongo connected to the success relationship. For example, if
100 JSON documents get split out, the Wait processor is waiting for
100 signals or until it times out, and signals are only sent after
successful insertion to Mongo. You can check out this blog for an
example [1].

In 1.1.2, you might be able to put a MergeContent processor configured
to run in defragment mode connected to the success relationship of
PutMongo. Defragment mode is used to undo the splitting that was done
by an upstream processor, so it will only merge the fragments back
together if all of them made it through.

-Bryan

[1] https://ijokarumawak.github.io/nifi/2017/02/02/nifi-notify-batch/
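The defragment bookkeeping relies on the fragment.identifier / fragment.index / fragment.count attributes that the split processor writes onto each piece. A toy illustration of the idea (this is not NiFi's implementation):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of MergeContent's defragment-mode logic: fragments are
 *  grouped by fragment.identifier, and a group is only released once
 *  fragment.count of them have arrived. */
public class DefragmentSketch {
    private final Map<String, List<String>> pending = new HashMap<>();

    /** Returns the complete group once every fragment arrived, else null. */
    public List<String> offer(String fragmentId, int fragmentCount, String payload) {
        List<String> group = pending.computeIfAbsent(fragmentId, k -> new ArrayList<>());
        group.add(payload);
        if (group.size() == fragmentCount) {
            return pending.remove(fragmentId);  // all siblings made it through
        }
        return null;                            // a failed sibling keeps the group held back
    }
}
```

This is why the technique answers Steve's question: one fragment failing validation or the Mongo update never reaches MergeContent, so the group never completes and post-processing is never triggered.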


On Wed, Jul 5, 2017 at 12:46 PM, Byers, Steven K (Steve) CTR USARMY
MEDCOM JMLFDC (US)  wrote:
> Is there a mechanism or technique for communicating the results of a flow 
> file to its "sister" flow files?
>
> Here is a high-level description of what I am doing:
>
> Input to my flow is a JSON array of documents that get split (SplitJson) into 
> individual documents and each document becomes a distinct flow file.  Each 
> document (flow file) gets validated against a JSON schema (ValidateJson) then 
> gets updated into a Mongo collection (PutMongoUpdate).  At the end of all 
> this, I want to do some post processing but only if all documents processed 
> successfully.
>
> When a failure occurs (either in the validation or the Mongo update) is there 
> a way to communicate that to the success branch of the flow process so a 
> decision can be made about whether to proceed to post processing or not.
>
> I am using NiFi 1.1.2
>
> Thank you for any guidance you can offer,
>
> Steve
>
>
>


Communication between flows

2017-07-05 Thread Byers, Steven K (Steve) CTR USARMY MEDCOM JMLFDC (US)
Is there a mechanism or technique for communicating the results of a flow file 
to its "sister" flow files?

Here is a high-level description of what I am doing:

Input to my flow is a JSON array of documents that get split (SplitJson) into 
individual documents and each document becomes a distinct flow file.  Each 
document (flow file) gets validated against a JSON schema (ValidateJson) then 
gets updated into a Mongo collection (PutMongoUpdate).  At the end of all this, 
I want to do some post processing but only if all documents processed 
successfully.  

When a failure occurs (either in the validation or the Mongo update) is there a 
way to communicate that to the success branch of the flow process so a decision 
can be made about whether to proceed to post processing or not.

I am using NiFi 1.1.2

Thank you for any guidance you can offer,

Steve





Re: Custom NAR interfering with BundleUtils.findBundleForType

2017-07-05 Thread Scott Wagner

Hi Joe,

We are extending AbstractProcessor for our processors, and 
AbstractControllerService for our controller service.  However, we did 
include the InvokeHTTP processor with some modifications that are 
referencing some other classes that are in the nifi-processors-standard 
JAR.  I will look into breaking those out to remove that dependency.


The actual error that we are getting is below:

2017-07-04 12:28:29,076 WARN [main] 
org.apache.nifi.web.server.JettyServer Failed to start web server... 
shutting down.
org.apache.nifi.controller.serialization.FlowSynchronizationException: 
java.lang.IllegalArgumentException
at 
org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:426)
at 
org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1576)
at 
org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:84)
at 
org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:722)
at 
org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:533)
at 
org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:72)
at 
org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:876)
at 
org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:532)
at 
org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:839)
at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:344)
at 
org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1480)
at 
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1442)
at 
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:799)
at 
org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:261)
at 
org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:540)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
at 
org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:290)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)

at org.eclipse.jetty.server.Server.start(Server.java:452)
at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)

at org.eclipse.jetty.server.Server.doStart(Server.java:419)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at 
org.apache.nifi.web.server.JettyServer.start(JettyServer.java:705)

at org.apache.nifi.NiFi.<init>(NiFi.java:160)
at org.apache.nifi.NiFi.main(NiFi.java:267)
Caused by: java.lang.IllegalArgumentException: null
at 
org.apache.nifi.controller.StandardProcessorNode.setMaxConcurrentTasks(StandardProcessorNode.java:620)
at 
org.apache.nifi.controller.StandardFlowSynchronizer.updateProcessor(StandardFlowSynchronizer.java:985)
at 
org.apache.nifi.controller.StandardFlowSynchronizer.addProcessGroup(StandardFlowSynchronizer.java:1064)
at 
org.apache.nifi.controller.StandardFlowSynchronizer.addProcessGroup(StandardFlowSynchronizer.java:1183)
at 
org.apache.nifi.controller.StandardFlowSynchronizer.addProcessGroup(StandardFlowSynchronizer.java:1183)
at 
org.apache.nifi.controller.StandardFlowSynchronizer.addProcessGroup(StandardFlowSynchronizer.java:1183)
at 
org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:312)

... 33 common frames omitted



Joe Witt 
Wednesday, July 5, 2017 10:57 AM
Scott

In your custom NAR are you extending some other processor/component
such as one found in standard processors? You'll really need to break
that chain and not pull in the same components.

Re: MiNiFi & SSL Context

2017-07-05 Thread Marc
Hello David,
   I'm making the assumption here that you are attempting to use an
SSLContextService within a processor such as InvokeHTTP:

   The configuration for MiNiFi Java will be very similar to that of NiFi.
The System guide [1] for MiNiFi references the NiFi System administration
guide's security configuration section [2]. As per the MiNiFi guide:
"A StandardSSLContextService will be made automatically with the ID
"SSL-Context-Service" if "ssl protocol" is configured."

Once these properties are defined, you can use the
StandardSSLContextService.

   Am I off base in my assumption? Are you attempting to do something else?
Please let me know if I can provide more details. Thanks!

[1] https://nifi.apache.org/minifi/system-admin-guide.html
[2] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#
security-configuration

  Thanks,
  Marc
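For reference, the "ssl protocol" approach maps to a Security Properties section in MiNiFi's config.yml roughly like the following (paths and passwords are placeholders; consult the guides linked above for the authoritative keys):

```yaml
# Sketch of the relevant config.yml section for MiNiFi Java.
Security Properties:
  keystore: /etc/minifi/keystore.jks
  keystore type: JKS
  keystore password: changeme
  key password: changeme
  truststore: /etc/minifi/truststore.jks
  truststore type: JKS
  truststore password: changeme
  ssl protocol: TLS
```

With "ssl protocol" set, the auto-created "SSL-Context-Service" can then be referenced from processors such as InvokeHTTP to use HTTPS.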


On Wed, Jul 5, 2017 at 7:35 AM, DAVID SMITH 
wrote:

> Hi
> I tried the Java version of MiNiFi yesterday and was very impressed, I do
> however have a question.
> What is the best way to set up and SSL context in MiNiFi so that I can do
> HTTPS rather than HTTP?
> Many thanks
> Dave


Re: Filesystem based Schema Registry, does it make sense?

2017-07-05 Thread Joe Witt
I think it does make sense and someone at a meetup asked a similar
question.  There are some things to be considered like how does one
annotate the version of a schema, the name, etc.. when all they are
providing are files in a directory?  How can they support multiple versions
of a given schema (or maybe they just don't in this approach)?  But there is
no question that being able to just push an avsc file into a directory and
then have it be usable in the flow could be helpful.

On Jul 5, 2017 9:00 AM, "Andre"  wrote:

dev,

As I continue to explore the Record based processors I got myself wondering:

Does it make sense to have a file-system based schema registry?

The idea would be creating something like AvroSchemaRegistry, but instead of
adding each schema as a controller service property, we would have a
property pointing to a directory.

Each avsc file within that directory would then be validated with the root
"name" within the Avro schema used as the schema name (i.e. the equivalent
to AvroSchemaRegistry property name).

The rationale is that while the Hortonworks and Avro Schema Registries
work, I reckon one is sort of overkill for edge/DMZ NiFi deployments and
the other is painful to update in case of multiple NiFi clusters.

Having a file based registry with inotify or something of the sort would be
great for the folks already using external configuration management.


What do you think?
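The core of the proposal could be sketched as a directory scan keyed by each schema's root "name". The regex below is a dependency-free stand-in; a real implementation would parse with Avro's Schema.Parser and re-scan on inotify (WatchService) events:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a file-system backed schema lookup: each .avsc file in a
 *  configured directory is keyed by its root record "name", mirroring
 *  how AvroSchemaRegistry keys schemas by property name. */
public class DirectorySchemaScan {
    private static final Pattern ROOT_NAME =
            Pattern.compile("\"name\"\\s*:\\s*\"([^\"]+)\"");

    public static Map<String, String> scan(Path dir) {
        Map<String, String> schemas = new HashMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.avsc")) {
            for (Path f : files) {
                String text = new String(Files.readAllBytes(f), StandardCharsets.UTF_8);
                Matcher m = ROOT_NAME.matcher(text);
                if (m.find()) {                  // first "name" = root record name
                    schemas.put(m.group(1), text);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return schemas;
    }
}
```

Versioning, as Joe notes, is the open question: this flat layout has no place to express a schema version, unless a convention like one subdirectory per version were adopted.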


Re: Custom NAR interfering with BundleUtils.findBundleForType

2017-07-05 Thread Joe Witt
Scott

In your custom NAR are you extending some other processor/component
such as one found in standard processors?  You'll really need to break
that chain and not pull in the same components.

Can you share the actual error being provided at startup?

Thanks
joe

On Wed, Jul 5, 2017 at 11:53 AM, Scott Wagner  wrote:
> Hey all,
>
> I have been running on NiFi 1.1.1 for a while, and just started working
> on migrating to 1.3.0.  However, when I tried starting up, I was getting
> unhandled exceptions during startup of Jetty.  I have traced this down to
> something being wrong with a custom NAR file that we built for 1.1.1.
>
> The problem is when BundleUtils.findBundleForType is being called, it is
> finding 2 bundles for every internal NiFi processor class - the correct NAR
> that it comes with in 1.3.0, and my custom NAR file as well.  I'm sure that
> I have done something wrong in the packaging for my custom NAR file, but I
> don't know what it could be.  It was originally built from the NiFi NAR
> archetype late last year.
>
> If anyone can point me in the direction of what I'm doing wrong, I
> would greatly appreciate it.  I'm not a maven/packaging expert by any means
> so no advice is too basic.
>
> Thanks!
>
> - Scott


Custom NAR interfering with BundleUtils.findBundleForType

2017-07-05 Thread Scott Wagner

Hey all,

I have been running on NiFi 1.1.1 for a while, and just started 
working on migrating to 1.3.0.  However, when I tried starting up, I was 
getting unhandled exceptions during startup of Jetty.  I have traced 
this down to something being wrong with a custom NAR file that we built 
for 1.1.1.


The problem is when BundleUtils.findBundleForType is being called, 
it is finding 2 bundles for every internal NiFi processor class - the 
correct NAR that it comes with in 1.3.0, and my custom NAR file as 
well.  I'm sure that I have done something wrong in the packaging for my 
custom NAR file, but I don't know what it could be.  It was originally 
built from the NiFi NAR archetype late last year.


If anyone can point me in the direction of what I'm doing wrong, 
I would greatly appreciate it.  I'm not a maven/packaging expert by any 
means so no advice is too basic.


Thanks!

- Scott


Filesystem based Schema Registry, does it make sense?

2017-07-05 Thread Andre
dev,

As I continue to explore the Record based processors I got myself wondering:

Does it make sense to have a file-system based schema registry?

The idea would be creating something like AvroSchemaRegistry, but instead of
adding each schema as a controller service property, we would have a
property pointing to a directory.

Each avsc file within that directory would then be validated with the root
"name" within the Avro schema used as the schema name (i.e. the equivalent
to AvroSchemaRegistry property name).

The rationale is that while the Hortonworks and Avro Schema Registries
work, I reckon one is sort of overkill for edge/DMZ NiFi deployments and
the other is painful to update in case of multiple NiFi clusters.

Having a file based registry with inotify or something of the sort would be
great for the folks already using external configuration management.


What do you think?