Re: Anything like this been implemented yet?

2018-03-28 Thread Bryan Bende
Since it sounds like each query is a JSON document, can you create a
JSON array of all your queries and put that as the Custom Text of a
GenerateFlowFile processor?

Then follow it with SplitJson to split the array into a query per flow
file, assuming that is what you want.

You could also use ExecuteScript and enter all the queries as user-defined
properties, then write a small script that loops over the properties and
produces a flow file for each dynamic property, where the content is the
value of the property, i.e. the query string.

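For illustration, here is a rough Java sketch of that loop-over-dynamic-properties
idea, written as a minimal custom processor rather than as the ExecuteScript script
suggested above; the class name, attribute name, and relationship name are made up
for the example:

import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;
import java.util.Set;

import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.util.StandardValidators;

public class GenerateQueryFlowFiles extends AbstractProcessor {

    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("One flow file per user-defined query property")
            .build();

    @Override
    public Set<Relationship> getRelationships() {
        return Collections.singleton(REL_SUCCESS);
    }

    @Override
    protected PropertyDescriptor getSupportedDynamicPropertyDescriptor(final String name) {
        // every user-defined property is treated as "query name -> query string"
        return new PropertyDescriptor.Builder()
                .name(name)
                .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
                .dynamic(true)
                .build();
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) {
        for (final Map.Entry<PropertyDescriptor, String> entry : context.getProperties().entrySet()) {
            if (!entry.getKey().isDynamic()) {
                continue; // skip any non-dynamic (framework-defined) properties
            }
            FlowFile flowFile = session.create();
            flowFile = session.putAttribute(flowFile, "query.name", entry.getKey().getName());
            flowFile = session.write(flowFile,
                    out -> out.write(entry.getValue().getBytes(StandardCharsets.UTF_8)));
            session.transfer(flowFile, REL_SUCCESS);
        }
    }
}

The ExecuteScript route gives the same behavior without building and deploying a
NAR, so it is probably the quicker option for half a dozen queries.
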
On Wed, Mar 28, 2018 at 5:03 PM, Mike Thomsen  wrote:
> More specifically: I know you can do this functionality by chaining other
> processors or using a bunch of generateflowfiles. What I'm looking for is a
> one stop way of associating dozens of small queries to one processor and
> have it batch send them on its own with no backend dependency like a
> database.
>
> On Wed, Mar 28, 2018 at 5:02 PM, Mike Thomsen 
> wrote:
>
>> I have a client that would benefit from being able to run certain queries
>> periodically. Like half a dozen or more. Is there any processor where you
>> can associate a bunch of strings (like JSON) and send them out individually
>> as flowfiles?
>>
>> Thanks,
>>
>> Mike
>>


Re: [VOTE] Release Apache NiFi 1.6.0 (RC2)

2018-03-27 Thread Bryan Bende
+1 (binding)

- Verified everything in the release helper
- Verified the fix for the fingerprinting issue
- Successfully ran some test flows with record processors, HDFS
processors, and the keytab CS


On Tue, Mar 27, 2018 at 10:54 AM, Jeff Zemerick  wrote:
> +1 non-binding
>
> Built successfully and tests pass.
> Ran some simple flows -- worked as expected.
>
> On Tue, Mar 27, 2018 at 9:59 AM, Matt Burgess  wrote:
>> +1 (binding) Release this package as nifi-1.6.0
>>
>> Verified checksums, commit, L, etc. Ran full build with unit tests
>> and tried various flows, all looks well!
>>
>>
>> On Mon, Mar 26, 2018 at 11:34 PM, Joe Witt  wrote:
>>> Hello,
>>>
>>> I am pleased to be calling this vote for the source release of Apache
>>> NiFi nifi-1.6.0.
>>>
>>> The source zip, including signatures, digests, etc. can be found at:
>>> https://repository.apache.org/content/repositories/orgapachenifi-1123
>>>
>>> The Git tag is nifi-1.6.0-RC2
>>> The Git commit ID is b5935ec81a7cbc048820781ac62cd96bbea5b232
>>> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=b5935ec81a7cbc048820781ac62cd96bbea5b232
>>>
>>> Checksums of nifi-1.6.0-source-release.zip:
>>> SHA1: 009f1e2e3c17e38f21f27170b9c06228d11653c0
>>> SHA256: 39941a5b25427e2b4cc5ba8206084ff92df58863f29ddd097d4ac1e85424beb9
>>> SHA512: 
>>> 1773417a48665e3cda22180ea7f401bc8190ebddbf3f7bc29831e46e7ab0a07694c6e478d252fa573209d4a3c8132a522a8507b6a8784669ab7364847a07e234
>>>
>>> Release artifacts are signed with the following key:
>>> https://people.apache.org/keys/committer/joewitt.asc
>>>
>>> KEYS file available here:
>>> https://dist.apache.org/repos/dist/release/nifi/KEYS
>>>
>>> 146 issues were closed/resolved for this release:
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12342422
>>>
>>> Release note highlights can be found here:
>>> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.6.0
>>>
>>> The vote will be open for 72 hours.
>>> Please download the release candidate and evaluate the necessary items
>>> including checking hashes, signatures, build
>>> from source, and test.  The please vote:
>>>
>>> [ ] +1 Release this package as nifi-1.6.0
>>> [ ] +0 no opinion
>>> [ ] -1 Do not release this package because...


Re: Class Loading Conflicts - Different JAR Versions

2018-03-27 Thread Bryan Bende
Mike,

Here is a PR that should setup the ES stuff appropriately:

https://github.com/apache/nifi/pull/2586

If we merge this for 1.7 then we should mention something in the
migration notes. Not that there is anything someone needs to do, but
it is worth knowing about in case someone tries some interesting setup
like running 1.7 stuff in 1.6.

-Bryan


On Tue, Mar 27, 2018 at 9:37 AM, Otto Fowler <ottobackwa...@gmail.com> wrote:
> You can look at the aws nar for a sample of what I think Brian means.
>
>
>
> On March 27, 2018 at 07:31:54, Mike Thomsen (mikerthom...@gmail.com) wrote:
>
> Brian,
>
> So...
>
> nifi-foo-service-impl-nar + nifi-foo-processors-nar ---depend on--->
> nifi-foo-service-api-nar
> ---depends on---> nifi-standard-services-api-nar
>
> That's what I should look for in the ES NARs?
>
> On Mon, Mar 26, 2018 at 10:31 PM, Bryan Bende <bbe...@gmail.com> wrote:
>
>> Also, regarding the elastic search issue Mike mentioned…
>>
>> Usually when a processor can’t select the controller service at runtime
> it
>> is because theres an issue with the way the dependencies are setup
> between
>> NARs. There should generally be dependency paths like the following…
>>
>> nifi-foo-service-impl-nar --|
>> |-->
>> nifi-foo-service-api-nar -> nifi-standard-services-api-nar
>> nifi-foo-processors-nar ---|
>>
>> All these links would be dependencies of nar in the poms.
>>
>> The above is for the case where processors and service implementation are
>> packaged separately, which is slightly different than what I was
> suggesting
>> for the Mongo case.
>>
>> > On Mar 26, 2018, at 10:06 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> >
>> > I’m a +1 for moving the Mongo stuff out of standard services.
>> >
>> > Controller service APIs should always be broken out into their own NAR,
>> but the processors and implementation can usually be bundled together.
>> >
>> > So something like the following would work:
>> >
>> > nifi-mongo-bundle
>> > - nifi-mongodb-services-api
>> > - nifi-mongodb-services-api-nar (would have NAR dependency on standard
>> services API NAR)
>> > - nifi-mongodb-processors
>> > - nifi-mongodb-nar (would have NAR dependency on
>> nifi-mongodb-services-api-nar)
>> >
>> > The processors module would have the processors and the client service
>> implementation.
>> >
>> >
>> >> On Mar 26, 2018, at 9:13 PM, Brian Ghigiarelli <briang...@gmail.com>
>> wrote:
>> >>
>> >> Thanks for the clarification, Mike.
>> >>
>> >> My primary motivation for this upgrade is to support an aggregation
>> >> pipeline with the $out stage. Support for this in the Java driver was
>> >> introduced in 3.4 via https://jira.mongodb.org/browse/JAVA-2254. Prior
>> to
>> >> this, I believe you had to use the cursor to query all of the results,
>> >> which suffers from both query performance issues and concurrency
> issues
>> for
>> >> the data flow if we're trying to overwrite an entire collection at
> once.
>> >> That said, if I'm missing something, I'm happy to learn!
>> >>
>> >> Thanks again,
>> >> Brian
>> >>
>> >> On Mon, Mar 26, 2018 at 7:57 PM Mike Thomsen <mikerthom...@gmail.com>
>> wrote:
>> >>
>> >>> I was just following a convention at the time. So unless something
>> breaks
>> >>> because of the move, I don't see any reason why it would be an issue
> to
>> >>> move it.
>> >>>
>> >>> In fact, with NIFI-4325 the only reason I put the new ElasticSearch
>> >>> service/api in the standard-services segment of the code base was I
> ran
>> >>> into a bizarre issue at runtime where I could define the controller
>> >>> service, but the processor had no idea it was there. So that too
> might
>> >>> warrant some extra eyes now that #4325 was added to master today.
>> >>>
>> >>> FWIW, I can't think of any reason why the MongoDB client service
> needs
>> 3.4
>> >>> or 3.6 over 3.2.
>> >>>
>> >>> Brian, feel free to ping me on the review.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Mike
>> >>>
>> >>> On Mon, Mar 26, 2018 at 6:59 PM, Matt Burgess <mattyb...@apache.org>
>> >>> wrote:
>> >>

Re: ListSFTP incoming relationship

2018-03-27 Thread Bryan Bende
I'm not sure that would solve the problem because you'd still be
limited to one directory. What most people are asking for is the
ability to use a dynamic directory from an incoming flow file.

I think we might be trying to fit two different use-cases into one
processor which might not make sense.

Scenario #1... There is a directory that is constantly receiving new
data and holds a significant number of files, and I want to
periodically find the new files. This is what the current processors
are optimized for.

Scenario #2... There is a directory that is mostly static with a
moderate/small number of files, and at points in my flow I want to
dynamically perform a listing of this directory and retrieve the
files. This is more geared towards the mentality of running a
job/workflow.




On Tue, Mar 27, 2018 at 9:36 AM, Otto Fowler  wrote:
> What if the changes where ‘on top of’ some base set of properties, like
> directory?
> Like a filter, where if present from the incoming file will have the LIST*
> list only things
> that match a name or attribute?
>
>
>
> On March 27, 2018 at 00:08:41, Joe Witt (joe.w...@gmail.com) wrote:
>
> Scott
>
> This idea has come up a couple of times and there is definitely
> something intriguing to it. Where I think this idea stalls out though
> is in implementation.
>
> While I agree that the other List* processors might similarly benefit
> lets focus on ListFile. Today you tell ListFile what directory to
> start looking for files in. It goes off scanning that directory for
> hits and stores state about what it has already searched/seen. And it
> is important to keep track of how much it has already scanned because
> at times the search directory can be massive (100,000s of thousands or
> more files and directories to scan for example).
>
> In the proposed model the directory to be scanned could be provided
> dynamically by looking at an attribute of an incoming flowfile (or
> other criteria can be provided - not just the directory to scan). In
> this case the ListFile processor goes on scanning against that now.
> What about the previous directory (or directories) it was told to
> scan? Does it still track those too? What if it starts scanning the
> newly provided directory, hasn't finished pulling all the data or new
> data is continually arriving, and it is told to switch to another
> directory.
>
> I think if those questions can get solid answers and someone invests
> time in creating a PR then this could be pretty powerful. Would be
> good to see a written description of the use case(s) for this too.
>
> Thanks
> Joe
>
> On Mon, Mar 26, 2018 at 11:58 PM, scott  wrote:
>> Hello Devs,
>>
>> I would like to request a feature to a major processor, ListSFTP. But
> before
>> I do down the official road, I wanted to ask if anyone thought it was a
>> terrible idea or impossible, etc. The request is to add support for an
>> incoming relationship to the ListSFTP processor specifically, but I could
>> see it added to many of the commonly used head processes, such as
> ListFile.
>> I would envision functionality more like InvokeHTTP or ExecuteSQL, where
> an
>> incoming flow file could initiate the action, and the attributes in the
>> incoming flow file could be used to configure the processor actions. It's
>> the configuration aspect that most appeals to me, because it opens it up
> to
>> being centrally or dynamically configured.
>>
>> Thanks,
>>
>> Scott
>>


Re: Class Loading Conflicts - Different JAR Versions

2018-03-26 Thread Bryan Bende
Also, regarding the elastic search issue Mike mentioned…

Usually when a processor can’t select the controller service at runtime, it is
because there’s an issue with the way the dependencies are set up between NARs.
There should generally be dependency paths like the following…

nifi-foo-service-impl-nar --|
                            |--> nifi-foo-service-api-nar -> nifi-standard-services-api-nar
nifi-foo-processors-nar ----|

All of these links would be declared as dependencies of type nar in the poms.

The above is for the case where processors and service implementation are 
packaged separately, which is slightly different than what I was suggesting for 
the Mongo case.

> On Mar 26, 2018, at 10:06 PM, Bryan Bende <bbe...@gmail.com> wrote:
> 
> I’m a +1 for moving the Mongo stuff out of standard services.
> 
> Controller service APIs should always be broken out into their own NAR, but 
> the processors and implementation can usually be bundled together. 
> 
> So something like the following would work:
> 
> nifi-mongo-bundle
> - nifi-mongodb-services-api
> - nifi-mongodb-services-api-nar (would have NAR dependency on standard 
> services API NAR)
> - nifi-mongodb-processors 
> - nifi-mongodb-nar (would have NAR dependency on 
> nifi-mongodb-services-api-nar)
> 
> The processors module would have the processors and the client service 
> implementation.
> 
> 
>> On Mar 26, 2018, at 9:13 PM, Brian Ghigiarelli <briang...@gmail.com> wrote:
>> 
>> Thanks for the clarification, Mike.
>> 
>> My primary motivation for this upgrade is to support an aggregation
>> pipeline with the $out stage. Support for this in the Java driver was
>> introduced in 3.4 via https://jira.mongodb.org/browse/JAVA-2254. Prior to
>> this, I believe you had to use the cursor to query all of the results,
>> which suffers from both query performance issues and concurrency issues for
>> the data flow if we're trying to overwrite an entire collection at once.
>> That said, if I'm missing something, I'm happy to learn!
>> 
>> Thanks again,
>> Brian
>> 
>> On Mon, Mar 26, 2018 at 7:57 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>> 
>>> I was just following a convention at the time. So unless something breaks
>>> because of the move, I don't see any reason why it would be an issue to
>>> move it.
>>> 
>>> In fact, with NIFI-4325 the only reason I put the new ElasticSearch
>>> service/api in the standard-services segment of the code base was I ran
>>> into a bizarre issue at runtime where I could define the controller
>>> service, but the processor had no idea it was there. So that too might
>>> warrant some extra eyes now that #4325 was added to master today.
>>> 
>>> FWIW, I can't think of any reason why the MongoDB client service needs 3.4
>>> or 3.6 over 3.2.
>>> 
>>> Brian, feel free to ping me on the review.
>>> 
>>> Thanks,
>>> 
>>> Mike
>>> 
>>> On Mon, Mar 26, 2018 at 6:59 PM, Matt Burgess <mattyb...@apache.org>
>>> wrote:
>>> 
>>>> Brian (this one's all you ;),
>>>> 
>>>> I think that PR is quite welcome in my opinion, but I'd like to get
>>>> Mike Thomsen's and others' opinions on the subject too, I think Mike
>>>> wrote the original service (or at the least is very knowledgable about
>>>> Mongo and NiFi), he might have run into issues that led him to put it
>>>> there for a reason. If not, let's do it, if you're willing to submit
>>>> the contribution I/we'd be more than happy to review/incorporate it :)
>>>> 
>>>> Regards,
>>>> Matt
>>>> 
>>>> 
>>>> On Mon, Mar 26, 2018 at 6:20 PM, Brian Ghigiarelli <briang...@gmail.com>
>>>> wrote:
>>>>> Matt,
>>>>> 
>>>>> +1 [non-binding] for the idea to move the Mongo dependencies into that
>>>>> bundle. I think it will likely need the NAR dependency on
>>>>> nifi-standard-services-api in order to provide a link to the SSL
>>> Context
>>>>> Service. Is that ticket / PR worthy as a one-off in the short term? No
>>>>> doubt the Extension Registry will be a powerful capability.
>>>>> 
>>>>> Also, +2 for the greeting, though Bryan may choose to put the "y"
>>> first.
>>>> :-)
>>>>> 
>>>>> Thanks,
>>>>> Brian
>>>>> 
>>>>> On Mon, Mar 26, 2018 at 5:13 PM Matt Burgess <mat

Re: Class Loading Conflicts - Different JAR Versions

2018-03-26 Thread Bryan Bende
I’m a +1 for moving the Mongo stuff out of standard services.

Controller service APIs should always be broken out into their own NAR, but the 
processors and implementation can usually be bundled together. 

So something like the following would work:

nifi-mongo-bundle
 - nifi-mongodb-services-api
 - nifi-mongodb-services-api-nar (would have NAR dependency on standard 
services API NAR)
 - nifi-mongodb-processors 
 - nifi-mongodb-nar (would have NAR dependency on nifi-mongodb-services-api-nar)

The processors module would have the processors and the client service 
implementation.


> On Mar 26, 2018, at 9:13 PM, Brian Ghigiarelli <briang...@gmail.com> wrote:
> 
> Thanks for the clarification, Mike.
> 
> My primary motivation for this upgrade is to support an aggregation
> pipeline with the $out stage. Support for this in the Java driver was
> introduced in 3.4 via https://jira.mongodb.org/browse/JAVA-2254. Prior to
> this, I believe you had to use the cursor to query all of the results,
> which suffers from both query performance issues and concurrency issues for
> the data flow if we're trying to overwrite an entire collection at once.
> That said, if I'm missing something, I'm happy to learn!
> 
> Thanks again,
> Brian
> 
> On Mon, Mar 26, 2018 at 7:57 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
> 
>> I was just following a convention at the time. So unless something breaks
>> because of the move, I don't see any reason why it would be an issue to
>> move it.
>> 
>> In fact, with NIFI-4325 the only reason I put the new ElasticSearch
>> service/api in the standard-services segment of the code base was I ran
>> into a bizarre issue at runtime where I could define the controller
>> service, but the processor had no idea it was there. So that too might
>> warrant some extra eyes now that #4325 was added to master today.
>> 
>> FWIW, I can't think of any reason why the MongoDB client service needs 3.4
>> or 3.6 over 3.2.
>> 
>> Brian, feel free to ping me on the review.
>> 
>> Thanks,
>> 
>> Mike
>> 
>> On Mon, Mar 26, 2018 at 6:59 PM, Matt Burgess <mattyb...@apache.org>
>> wrote:
>> 
>>> Brian (this one's all you ;),
>>> 
>>> I think that PR is quite welcome in my opinion, but I'd like to get
>>> Mike Thomsen's and others' opinions on the subject too, I think Mike
>>> wrote the original service (or at the least is very knowledgable about
>>> Mongo and NiFi), he might have run into issues that led him to put it
>>> there for a reason. If not, let's do it, if you're willing to submit
>>> the contribution I/we'd be more than happy to review/incorporate it :)
>>> 
>>> Regards,
>>> Matt
>>> 
>>> 
>>> On Mon, Mar 26, 2018 at 6:20 PM, Brian Ghigiarelli <briang...@gmail.com>
>>> wrote:
>>>> Matt,
>>>> 
>>>> +1 [non-binding] for the idea to move the Mongo dependencies into that
>>>> bundle. I think it will likely need the NAR dependency on
>>>> nifi-standard-services-api in order to provide a link to the SSL
>> Context
>>>> Service. Is that ticket / PR worthy as a one-off in the short term? No
>>>> doubt the Extension Registry will be a powerful capability.
>>>> 
>>>> Also, +2 for the greeting, though Bryan may choose to put the "y"
>> first.
>>> :-)
>>>> 
>>>> Thanks,
>>>> Brian
>>>> 
>>>> On Mon, Mar 26, 2018 at 5:13 PM Matt Burgess <mattyb...@apache.org>
>>> wrote:
>>>> 
>>>>> Bri/yan,
>>>>> 
>>>>> IMO I think we'd be better off with moving the
>>>>> nifi-mongodb-client-service-api and nifi-mongodb-services bundles into
>>>>> the nifi-mongodb-bundle. That's something I missed while reviewing the
>>>>> PR that put them in standard services. I don't see a direct dependency
>>>>> on nifi-standard-services, and if the nifi-mongodb-client-service-api
>>>>> needs the nifi-standard-services-api as a dependency, it can make it a
>>>>> NAR dependency.
>>>>> 
>>>>> We might want to visit this as a much broader case, since I believe we
>>>>> could run into this with other services APIs (Elasticsearch, HBase,
>>>>> etc.)? Certainly when the Extension Registry becomes a thing, we might
>>>>> not want "external" service APIs to be part of
>>>>> nifi-standard-services-api.
>>>>> 
>>>>> Regards,
>&g

Re: Class Loading Conflicts - Different JAR Versions

2018-03-26 Thread Bryan Bende
Brian,

Is your custom processor using the MongoDBClientService provided by
NiFi's standard services API? Or does your NAR have a parent of
nifi-standard-services-api-nar in order to use other standard services?

Looking at where the Mongo JARs are from a build of master...

find work/nar/ -name *mongo-java*.jar
work/nar//extensions/nifi-standard-services-api-nar-1.6.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/mongo-java-driver-3.2.2.jar
work/nar//extensions/nifi-mongodb-services-nar-1.6.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/mongo-java-driver-3.2.2.jar
work/nar//extensions/nifi-mongodb-nar-1.6.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/mongo-java-driver-3.2.2.jar

I think the issue is that if your NAR has
nifi-standard-services-api-nar as a parent NAR (which it probably does
in order to use the SSLContext service, or any other standard service),
then you are getting mongo-java-driver-3.2.2 from MongoDBClientService,
since we have parent-first class loading.

Given this situation I think there are only two options... not having
nifi-standard-services-api-nar as a parent (which stops you from using
all standard services), or upgrading the version of mongo-java-driver
used by MongoDBClientService.

-Bryan


On Mon, Mar 26, 2018 at 3:16 PM, Brian Ghigiarelli  wrote:
> Hey all,
>
> Is there a means of troubleshooting a custom NAR that seems to be having
> runtime conflicts by picking up an older JAR that's provided by the NiFi
> standard bundle?
>
> In this particular case, I'm using some custom processors that are built
> against NiFi 1.4.0 and have a dependency on MongoDB 3.6.3. At runtime, I'm
> seeing the processor use classes from MongoDB 3.2.2 that's provided by
> NiFi. Both NiFi and the custom NAR compile successfully on their own, but
> using the custom NAR in NiFi causes NoSuchMethodError's due to a new method
> only available in MongoDB 3.4+.
>
> Thanks,
> Brian


Re: Weird issue w/ Redis pool service in an integration test

2018-03-26 Thread Bryan Bende
Ah yea, take a look at RedisDistributedMapCacheClientService, that one
is closer to what you need.

Something like

RedisConnectionPool redisConnectionPool =
context.getProperty(REDIS_CONNECTION_POOL).asControllerService(RedisConnectionPool.class);
RedisConnection redisConnection = redisConnectionPool.getConnection();

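As a rough sketch, assuming the property and class names already used in this
thread (REDIS_CONNECTION_POOL, AbstractRedisProcessor) and the Spring Data Redis
types exposed by the pool service, a withConnection helper that goes through the
controller service instead of calling RedisUtils directly might look like this:

import java.util.function.Function;

import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.redis.RedisConnectionPool;
import org.springframework.data.redis.connection.RedisConnection;

// Hypothetical base class; connections are obtained from the pool service rather
// than being built directly via RedisUtils, so the service's own configuration
// is what actually gets used.
public abstract class AbstractRedisProcessor extends AbstractProcessor {

    public static final PropertyDescriptor REDIS_CONNECTION_POOL = new PropertyDescriptor.Builder()
            .name("Redis Connection Pool")
            .identifiesControllerService(RedisConnectionPool.class)
            .required(true)
            .build();

    protected <T> T withConnection(final ProcessContext context,
                                   final Function<RedisConnection, T> action) {
        final RedisConnectionPool pool = context.getProperty(REDIS_CONNECTION_POOL)
                .asControllerService(RedisConnectionPool.class);

        RedisConnection connection = null;
        try {
            connection = pool.getConnection();
            return action.apply(connection);
        } finally {
            if (connection != null) {
                connection.close();
            }
        }
    }
}

Presumably this would also avoid the expression-language error in the test, since
RedisUtils would then be evaluating the service's own Connection String property
rather than one the processor never declared.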


On Mon, Mar 26, 2018 at 11:58 AM, Mike Thomsen <mikerthom...@gmail.com> wrote:
> Yeah, it does. Copied withConnection from the state provider. Looks like
> copya pasta may have struck again...
>
> On Mon, Mar 26, 2018 at 11:44 AM, Bryan Bende <bbe...@gmail.com> wrote:
>
>> I can't tell for sure, but the stacktrace looks like your
>> AbstractRedisProcessor is making a direct call to RedisUtils to create
>> a connection, rather than using the RedisConnectionPool to obtain a
>> connection.
>>
>> On Mon, Mar 26, 2018 at 11:38 AM, Bryan Bende <bbe...@gmail.com> wrote:
>> > Can you share the code for your AbstractRedisProcessor?
>> >
>> >
>> > On Mon, Mar 26, 2018 at 9:52 AM, Mike Thomsen <mikerthom...@gmail.com>
>> wrote:
>> >> Over the weekend I started playing around with a new processor called
>> >> PutRedisHash based on a request from the user list. I set up a really
>> >> simple IT and hit a problem pretty quickly. This passes validation:
>> >>
>> >> @Test
>> >> public void testStandalone() throws Exception {
>> >> final String attrName = "key.name";
>> >> final String attrValue = "simple_test";
>> >> RedisConnectionPool connectionPool = new
>> RedisConnectionPoolService();
>> >> TestRunner runner = TestRunners.newTestRunner(PutRedisHash.class);
>> >>
>> >> runner.addControllerService("connPool", connectionPool);
>> >> runner.setProperty(connectionPool, RedisUtils.CONNECTION_STRING,
>> >> "localhost:6379");
>> >> runner.enableControllerService(connectionPool);
>> >> runner.setProperty(PutRedisHash.REDIS_CONNECTION_POOL, "connPool");
>> >> runner.setProperty(PutRedisHash.NAME_ATTRIBUTE, attrName);
>> >> runner.assertValid();
>> >> }
>> >>
>> >> As soon as I enqueue some data and call run(), I see the following
>> >> exception get thrown in the processor. I checked the Connection String
>> >> property, and it is marked as supporting EL and does call
>> >> evaluationExpressionLanguage in the RedisUtils.createConnectionFactory
>> >> method.
>> >>
>> >> java.lang.IllegalStateException: Attempting to Evaluate Expressions but
>> >> PropertyDescriptor[Connection String] indicates that the Expression
>> >> Language is not supported. If you realize that this is the case and do
>> not
>> >> want this error to occur, it can be disabled by calling
>> >> TestRunner.setValidateExpressionUsage(false)
>> >> at
>> >> org.apache.nifi.util.MockPropertyValue.markEvaluated(
>> MockPropertyValue.java:133)
>> >> at
>> >> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(
>> MockPropertyValue.java:183)
>> >> at
>> >> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(
>> MockPropertyValue.java:177)
>> >> at
>> >> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(
>> MockPropertyValue.java:142)
>> >> at
>> >> org.apache.nifi.redis.util.RedisUtils.createConnectionFactory(
>> RedisUtils.java:260)
>> >> at
>> >> org.apache.nifi.processors.redis.AbstractRedisProcessor.getRedis(
>> AbstractRedisProcessor.java:41)
>> >> at
>> >> org.apache.nifi.processors.redis.AbstractRedisProcessor.withConnection(
>> AbstractRedisProcessor.java:50)
>> >> at
>> >> org.apache.nifi.processors.redis.PutRedisHash.onTrigger(
>> PutRedisHash.java:162)
>> >> at
>> >> org.apache.nifi.processor.AbstractProcessor.onTrigger(
>> AbstractProcessor.java:27)
>> >> at
>> >> org.apache.nifi.util.StandardProcessorTestRunner$RunProcessor.call(
>> StandardProcessorTestRunner.java:251)
>> >> at
>> >> org.apache.nifi.util.StandardProcessorTestRunner$RunProcessor.call(
>> StandardProcessorTestRunner.java:245)
>> >> at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
>> >> at java.util.concurrent.FutureTask.run(FutureTask.java)
>> >> at
>> >> java.util.concurrent.ScheduledThreadPoolExecutor$
>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>> >> at
>> >> java.util.concurrent.ScheduledThreadPoolExecutor$
>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> >> at
>> >> java.util.concurrent.ThreadPoolExecutor.runWorker(
>> ThreadPoolExecutor.java:1149)
>> >> at
>> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(
>> ThreadPoolExecutor.java:624)
>> >> at java.lang.Thread.run(Thread.java:748)
>> >>
>> >> Any ideas?
>>


Re: published by PublishKafkaRecord_0_10 doesn't embed schema.

2018-03-26 Thread Bryan Bende
You might be able to get the nifi-kafka-0-10-nar from 1.5.0 and run it in 1.4.0.

On Mon, Mar 26, 2018 at 11:28 AM, Milan Das <m...@interset.com> wrote:
> Hi Bryan,
> We are using NIFI 1.4.0. Can we backport this fix to NIFI 1.4?
>
> Thanks,
> Milan Das
>
> On 3/26/18, 11:26 AM, "Bryan Bende" <bbe...@gmail.com> wrote:
>
> Hello,
>
> What version of NiFi are you using?
>
> This should be fixed in 1.5.0:
>
> https://issues.apache.org/jira/browse/NIFI-4639
>
> Thanks,
>
> Bryan
>
>
> On Sun, Mar 25, 2018 at 6:45 PM, Milan Das <m...@interset.com> wrote:
> > Hello Nifi Users,
> >
> > Apparently, it seems like PublishKafkaRecord_0_10 doesn't embed schema 
> even if it Avro Record writer is configured with “Embed Avro Schema”.
> >
> > I have seen the following post from Bryan Brende.  Wondering if it is a 
> known issue or if I am missing anything here.
> >
> >
> >
> > 
> https://community.hortonworks.com/questions/110652/cant-consume-avro-messages-whcih-are-published-by.html
> >
> >
> >
> >
> >
> > This is how message looks in Kafka-console-consumer, when published 
> using “PublishKafkaRecord_0_11”
> >
> >
> >
> >
> >
> >  $ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 
> --topic test --from-beginning
> >
> >
> >
> > 
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$�zpA
> >
> > 
> �=h*�p��l
> >
> > 
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$+�=�ت��;�.Y7
> >
> > 
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$�D�p��"B��
>r0
> >
> > 
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$ekLl�;]�,Y�͙�
> >
> > 
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$�z��klŤ�1�'�z�
> >
> > 
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$���ξu��5�V}>�_
> >
> > 
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$��=%��VbK�
> >
> > 
> ��'~���X�controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$��0
> >
> >
> >
> >
> >
> > When I publish the message using my java class output from console 
> consumer prints the avro schema.
> >
> >
> >
> > 
> Objavro.schema�P{"type":"record","name":"ActiveDirectoryRecord","namespace":"com..schema","doc":"for
>  more info, refer to 
> http://docs.splunk.com/Documentation/CIM/4.2.0/User/Resource","fields":[{"name":"action","type":"string","doc":"The
>  action performed on the 
> resource."},{"name":"dest","type":"string","doc":"The target involved in the 
> authentication. May be aliased from more specific fields, such as dest_host, 
> dest_ip, or 
> dest_nt_host."},{"name":"signature_id","type":"int","doc":"Description of the 
> change performed (integer)"},{"name":"time","type":"string","doc":"ISO 8601 
> timestamp of the eventl <TRUNCATED…..  
> >,{"name":"privileges","type":["null",{"type":"array","items":"string"}],"doc":"The
>  list of privileges associated with a Privilege Escalation 
> event","default":null},{"name":"subcode","type":["null","string"],"doc":"The 
> error subcode for 
> auth;��d��g%�z�SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00usedid1Pz��
> >
> >
> >
> >
> >
> > Regards,
> >
> > Milan Das
> >
> >
>
>
>


Re: Weird issue w/ Redis pool service in an integration test

2018-03-26 Thread Bryan Bende
I can't tell for sure, but the stacktrace looks like your
AbstractRedisProcessor is making a direct call to RedisUtils to create
a connection, rather than using the RedisConnectionPool to obtain a
connection.

On Mon, Mar 26, 2018 at 11:38 AM, Bryan Bende <bbe...@gmail.com> wrote:
> Can you share the code for your AbstractRedisProcessor?
>
>
> On Mon, Mar 26, 2018 at 9:52 AM, Mike Thomsen <mikerthom...@gmail.com> wrote:
>> Over the weekend I started playing around with a new processor called
>> PutRedisHash based on a request from the user list. I set up a really
>> simple IT and hit a problem pretty quickly. This passes validation:
>>
>> @Test
>> public void testStandalone() throws Exception {
>> final String attrName = "key.name";
>> final String attrValue = "simple_test";
>> RedisConnectionPool connectionPool = new RedisConnectionPoolService();
>> TestRunner runner = TestRunners.newTestRunner(PutRedisHash.class);
>>
>> runner.addControllerService("connPool", connectionPool);
>> runner.setProperty(connectionPool, RedisUtils.CONNECTION_STRING,
>> "localhost:6379");
>> runner.enableControllerService(connectionPool);
>> runner.setProperty(PutRedisHash.REDIS_CONNECTION_POOL, "connPool");
>> runner.setProperty(PutRedisHash.NAME_ATTRIBUTE, attrName);
>> runner.assertValid();
>> }
>>
>> As soon as I enqueue some data and call run(), I see the following
>> exception get thrown in the processor. I checked the Connection String
>> property, and it is marked as supporting EL and does call
>> evaluationExpressionLanguage in the RedisUtils.createConnectionFactory
>> method.
>>
>> java.lang.IllegalStateException: Attempting to Evaluate Expressions but
>> PropertyDescriptor[Connection String] indicates that the Expression
>> Language is not supported. If you realize that this is the case and do not
>> want this error to occur, it can be disabled by calling
>> TestRunner.setValidateExpressionUsage(false)
>> at
>> org.apache.nifi.util.MockPropertyValue.markEvaluated(MockPropertyValue.java:133)
>> at
>> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(MockPropertyValue.java:183)
>> at
>> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(MockPropertyValue.java:177)
>> at
>> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(MockPropertyValue.java:142)
>> at
>> org.apache.nifi.redis.util.RedisUtils.createConnectionFactory(RedisUtils.java:260)
>> at
>> org.apache.nifi.processors.redis.AbstractRedisProcessor.getRedis(AbstractRedisProcessor.java:41)
>> at
>> org.apache.nifi.processors.redis.AbstractRedisProcessor.withConnection(AbstractRedisProcessor.java:50)
>> at
>> org.apache.nifi.processors.redis.PutRedisHash.onTrigger(PutRedisHash.java:162)
>> at
>> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>> at
>> org.apache.nifi.util.StandardProcessorTestRunner$RunProcessor.call(StandardProcessorTestRunner.java:251)
>> at
>> org.apache.nifi.util.StandardProcessorTestRunner$RunProcessor.call(StandardProcessorTestRunner.java:245)
>> at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
>> at java.util.concurrent.FutureTask.run(FutureTask.java)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>>
>> Any ideas?


Re: Weird issue w/ Redis pool service in an integration test

2018-03-26 Thread Bryan Bende
Can you share the code for your AbstractRedisProcessor?


On Mon, Mar 26, 2018 at 9:52 AM, Mike Thomsen  wrote:
> Over the weekend I started playing around with a new processor called
> PutRedisHash based on a request from the user list. I set up a really
> simple IT and hit a problem pretty quickly. This passes validation:
>
> @Test
> public void testStandalone() throws Exception {
> final String attrName = "key.name";
> final String attrValue = "simple_test";
> RedisConnectionPool connectionPool = new RedisConnectionPoolService();
> TestRunner runner = TestRunners.newTestRunner(PutRedisHash.class);
>
> runner.addControllerService("connPool", connectionPool);
> runner.setProperty(connectionPool, RedisUtils.CONNECTION_STRING,
> "localhost:6379");
> runner.enableControllerService(connectionPool);
> runner.setProperty(PutRedisHash.REDIS_CONNECTION_POOL, "connPool");
> runner.setProperty(PutRedisHash.NAME_ATTRIBUTE, attrName);
> runner.assertValid();
> }
>
> As soon as I enqueue some data and call run(), I see the following
> exception get thrown in the processor. I checked the Connection String
> property, and it is marked as supporting EL and does call
> evaluationExpressionLanguage in the RedisUtils.createConnectionFactory
> method.
>
> java.lang.IllegalStateException: Attempting to Evaluate Expressions but
> PropertyDescriptor[Connection String] indicates that the Expression
> Language is not supported. If you realize that this is the case and do not
> want this error to occur, it can be disabled by calling
> TestRunner.setValidateExpressionUsage(false)
> at
> org.apache.nifi.util.MockPropertyValue.markEvaluated(MockPropertyValue.java:133)
> at
> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(MockPropertyValue.java:183)
> at
> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(MockPropertyValue.java:177)
> at
> org.apache.nifi.util.MockPropertyValue.evaluateAttributeExpressions(MockPropertyValue.java:142)
> at
> org.apache.nifi.redis.util.RedisUtils.createConnectionFactory(RedisUtils.java:260)
> at
> org.apache.nifi.processors.redis.AbstractRedisProcessor.getRedis(AbstractRedisProcessor.java:41)
> at
> org.apache.nifi.processors.redis.AbstractRedisProcessor.withConnection(AbstractRedisProcessor.java:50)
> at
> org.apache.nifi.processors.redis.PutRedisHash.onTrigger(PutRedisHash.java:162)
> at
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> at
> org.apache.nifi.util.StandardProcessorTestRunner$RunProcessor.call(StandardProcessorTestRunner.java:251)
> at
> org.apache.nifi.util.StandardProcessorTestRunner$RunProcessor.call(StandardProcessorTestRunner.java:245)
> at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
> at java.util.concurrent.FutureTask.run(FutureTask.java)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Any ideas?


Re: Objection: LDAP Auth needs HTTPS

2018-03-26 Thread Bryan Bende
Hello,

Passing LDAP credentials in plain-text over http would not be secure.

You'll want to have the SSL connection pass through the load balancer
all the way to the NiFi nodes.

There are several articles on setting up a secure NiFi cluster:

https://pierrevillard.com/2016/11/29/apache-nifi-1-1-0-secured-cluster-setup/
https://holisticsecurity.io/2017/05/17/apache-nifi-and-tls-toolkit-ansible-roles-to-create-a-multi-node-secure-nifi-cluster/
https://bryanbende.com/development/2016/08/17/apache-nifi-1-0-0-authorization-and-multi-tenancy

Thanks,

Bryan

On Fri, Mar 23, 2018 at 5:32 PM, Qcho  wrote:
> Hi,
>
> I'm working on deploying a nifi cluster with kubernetes.
>
> The idea was to place a nifi-cluster behind an nginx-ingress.
>
> We want to be able to access:
>
> nifi.mydomain.com and allow load-balancing over:
>
> node1.nifi.mydomain.com
> node2.nifi.mydomain.com
> ... etc
>
> The problem here is that we want to perform SSL termination in the
> load-balancer at nifi.mydomain.com and then just talk plain http with each
> node. But then it seems Nifi is not allowing this because it thinks talking
> over http is always insecure and therefore no LDAP auth is supported.
>
> Any though on this? Anyone has been able to do some kind of proxying??
>
> Maybe we can add an enable-if-you-know-what-you-are-doing config for
> allowing http auth?
>
> Thanks in advance!


Re: published by PublishKafkaRecord_0_10 doesn't embed schema.

2018-03-26 Thread Bryan Bende
Hello,

What version of NiFi are you using?

This should be fixed in 1.5.0:

https://issues.apache.org/jira/browse/NIFI-4639

Thanks,

Bryan


On Sun, Mar 25, 2018 at 6:45 PM, Milan Das  wrote:
> Hello Nifi Users,
>
> Apparently, it seems like PublishKafkaRecord_0_10 doesn't embed schema even 
> if it Avro Record writer is configured with “Embed Avro Schema”.
>
> I have seen the following post from Bryan Brende.  Wondering if it is a known 
> issue or if I am missing anything here.
>
>
>
> https://community.hortonworks.com/questions/110652/cant-consume-avro-messages-whcih-are-published-by.html
>
>
>
>
>
> This is how message looks in Kafka-console-consumer, when published using 
> “PublishKafkaRecord_0_11”
>
>
>
>
>
>  $ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic 
> test --from-beginning
>
>
>
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$�zpA
>
>   
>   �=h*�p��l
>
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$+�=�ت��;�.Y7
>
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$�D�p��"B��
>r0
>
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$ekLl�;]�,Y�͙�
>
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$�z��klŤ�1�'�z�
>
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$���ξu��5�V}>�_
>
> �SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$��=%��VbK�
>
> ��'~���X�controller1.ad.interset.com�H22018-03-15T09:07:04-04:00CONTROLLER1$��0
>
>
>
>
>
> When I publish the message using my java class output from console consumer 
> prints the avro schema.
>
>
>
> Objavro.schema�P{"type":"record","name":"ActiveDirectoryRecord","namespace":"com..schema","doc":"for
>  more info, refer to 
> http://docs.splunk.com/Documentation/CIM/4.2.0/User/Resource","fields":[{"name":"action","type":"string","doc":"The
>  action performed on the 
> resource."},{"name":"dest","type":"string","doc":"The target involved in the 
> authentication. May be aliased from more specific fields, such as dest_host, 
> dest_ip, or 
> dest_nt_host."},{"name":"signature_id","type":"int","doc":"Description of the 
> change performed (integer)"},{"name":"time","type":"string","doc":"ISO 8601 
> timestamp of the eventl  >,{"name":"privileges","type":["null",{"type":"array","items":"string"}],"doc":"The
>  list of privileges associated with a Privilege Escalation 
> event","default":null},{"name":"subcode","type":["null","string"],"doc":"The 
> error subcode for 
> auth;��d��g%�z�SUCCESS6controller1.ad.interset.com�H22018-03-15T09:07:04-04:00usedid1Pz��
>
>
>
>
>
> Regards,
>
> Milan Das
>
>


Re: [CANCEL][VOTE] Release Apache NiFi 1.6.0 RC1

2018-03-25 Thread Bryan Bende
I plan to review/test the fix that Sivaprasanna made first thing tomorrow 
morning, unless someone gets to it before then.

> On Mar 25, 2018, at 10:29 AM, Joey Frazee <joey.fra...@icloud.com> wrote:
> 
> Joe, yes, referring to what Bryan discovered.
> 
> On Mar 25, 2018, 9:23 AM -0500, Joe Witt <joe.w...@gmail.com>, wrote:
>> Team
>> 
>> RC1 vote is cancelled to correct findings of vote process.
>> 
>> Joey:
>> I pushed the RC1 branch
>> https://github.com/apache/nifi/tree/NIFI-4995-RC1 as per release
>> guide. We will push the actual tag once we get to a release point.
>> 
>> Can you clarify what fingerprint issue you are referring to? Just
>> want to make sure this is what BryanB pointed out and not something
>> else.
>> 
>> Thanks
>> Joe
>> 
>> On Sun, Mar 25, 2018 at 10:12 AM, Joey Frazee <joey.fra...@icloud.com> wrote:
>>> -1
>>> 
>>> Ran through the usual release helper stuff, but it seems like the 
>>> fingerprint issue is going to cause problems, so not sure how useful 
>>> putting 1.6.0 out there will be if 1.6.1 will have to be turned around 
>>> immediately.
>>> 
>>> Did you mean to say there's a nifi-1.6.0 -RC tag? It doesn't look like the 
>>> tag got pushed.
>>> 
>>> -joey
>>> 
>>> On Mar 24, 2018, 12:38 AM -0500, Pierre Villard 
>>> <pierre.villard...@gmail.com>, wrote:
>>>> -1 (binding)
>>>> 
>>>> I confirm the issue mentioned by Bryan. That's actually what Matt and I
>>>> experienced when trying the PR about the S2S Metrics Reporting task [1]. I
>>>> thought it was due to my change but it appears it's not the case.
>>>> 
>>>> [1] https://github.com/apache/nifi/pull/2575
>>>> 
>>>> 2018-03-23 22:53 GMT+01:00 Bryan Bende <bbe...@gmail.com>:
>>>> 
>>>>> After voting I happened to be using the RC to test something else and
>>>>> came across a bug that I think warrants changing my vote to a -1.
>>>>> 
>>>>> I created a simple two node cluster and made a standard convert record
>>>>> flow. When I ran the flow I got a schema not found exception, so I
>>>>> used the debugger which showed AvroSchemaRegistry had no schemas, even
>>>>> though there was one in the UI.
>>>>> 
>>>>> I then used the debugger to make sure the onPropertyModified was
>>>>> getting when a schema was added, and it was which meant some after
>>>>> adding the schema but before running the flow, it was being removed.
>>>>> 
>>>>> As far as I can tell, the issue is related to changes introduced in
>>>>> NIFI-4864... the intent here was for components with property
>>>>> descriptors that have "dynamically modifies classpath" to be able to
>>>>> smartly reload when they are started based on knowing if more
>>>>> classpath resources were added.
>>>>> 
>>>>> The issue is that for components that don't have any property
>>>>> descriptors like this, they have a null fingerprint, and before
>>>>> starting it compares null to the fingerprint of empty string, and
>>>>> decides to reload [2].
>>>>> 
>>>>> I think the fix should be fairly easy to just short-circuit at the
>>>>> beginning of that method and return immediately if
>>>>> additionalResourcesFingerprint is null, but will have to do some
>>>>> testing.
>>>>> 
>>>>> [1] https://issues.apache.org/jira/browse/NIFI-4864
>>>>> [2] https://github.com/apache/nifi/blob/master/nifi-nar-
>>>>> bundles/nifi-framework-bundle/nifi-framework/nifi-framework-
>>>>> core-api/src/main/java/org/apache/nifi/controller/
>>>>> AbstractConfiguredComponent.java#L313-L314
>>>>> 
>>>>> 
>>>>> On Fri, Mar 23, 2018 at 4:20 PM, Matt Gilman <matt.c.gil...@gmail.com
>>>>> wrote:
>>>>>> +1 (binding) Release this package as nifi-1.6.0
>>>>>> 
>>>>>> Executed the release helper and verified new granular restrictions with
>>>>>> regards to flow versioning.
>>>>>> 
>>>>>> Thanks for RMing Joe!
>>>>>> 
>>>>>> Matt
>>>>>> 
>>>>>> On Fri, Mar 23, 2018 at 4:12 PM, Michael Moser <moser

Re: [VOTE] Release Apache NiFi 1.6.0

2018-03-23 Thread Bryan Bende
After voting I happened to be using the RC to test something else and
came across a bug that I think warrants changing my vote to a -1.

I created a simple two node cluster and made a standard convert record
flow. When I ran the flow I got a schema not found exception, so I
used the debugger which showed AvroSchemaRegistry had no schemas, even
though there was one in the UI.

I then used the debugger to make sure onPropertyModified was getting
called when a schema was added, and it was, which meant that at some
point after adding the schema but before running the flow the schema
was being removed.

As far as I can tell, the issue is related to changes introduced in
NIFI-4864... the intent here was for components with property
descriptors that have "dynamically modifies classpath" to be able to
smartly reload when they are started based on knowing if more
classpath resources were added.

The issue is that for components that don't have any property
descriptors like this, they have a null fingerprint, and before
starting it compares null to the fingerprint of empty string, and
decides to reload [2].

I think the fix should be fairly easy to just short-circuit at the
beginning of that method and return immediately if
additionalResourcesFingerprint is null, but will have to do some
testing.

[1] https://issues.apache.org/jira/browse/NIFI-4864
[2] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core-api/src/main/java/org/apache/nifi/controller/AbstractConfiguredComponent.java#L313-L314


On Fri, Mar 23, 2018 at 4:20 PM, Matt Gilman <matt.c.gil...@gmail.com> wrote:
> +1 (binding) Release this package as nifi-1.6.0
>
> Executed the release helper and verified new granular restrictions with
> regards to flow versioning.
>
> Thanks for RMing Joe!
>
> Matt
>
> On Fri, Mar 23, 2018 at 4:12 PM, Michael Moser <moser...@gmail.com> wrote:
>
>> +1 (binding)
>>
>> Ran through release helper to verify the release and run NiFi on Ubuntu
>> 16.04.  It worked as expected with no new comments to add.
>>
>> -- Mike
>>
>>
>> On Fri, Mar 23, 2018 at 4:02 PM, Scott Aslan <scottyas...@gmail.com>
>> wrote:
>>
>> > +1 (binding)
>> >
>> > - Ran through release helper
>> > - Setup secure NiFi and verified a test flow
>> >
>> > On Fri, Mar 23, 2018 at 3:29 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> >
>> > > +1 (binding)
>> > >
>> > > - Ran through release helper and everything checked out
>> > > - Verified some test flows with the restricted components + keytab CS
>> > >
>> > >
>> > >
>> > > On Fri, Mar 23, 2018 at 2:42 PM, Mark Payne <marka...@hotmail.com>
>> > wrote:
>> > > > +1 (binding)
>> > > >
>> > > > Was able to verify hashes, build with contrib-check, and start up
>> > > application. Performed some basic functionality tests and all worked as
>> > > expected.
>> > > >
>> > > > Thanks!
>> > > > -Mark
>> > > >
>> > > >
>> > > >> On Mar 23, 2018, at 6:02 AM, Takanobu Asanuma <
>> tasan...@yahoo-corp.jp
>> > >
>> > > wrote:
>> > > >>
>> > > >> Thanks for all your efforts, Joe.
>> > > >>
>> > > >> I have one question. The version of the generated package is
>> > > nifi-1.7.0-SNAPSHOT. Is this correct at this stage? If it's ok,
>> > > +1(non-binding).
>> > > >>
>> > > >> - Succeeded "mvn -T 2.0C clean install -DskipTests -Prpm"
>> > > >> - Started secure cluster mode
>> > > >> - Verified sample dataflows work fine
>> > > >> - Verified the whitelist feature(NIFI-4761) works fine
>> > > >>
>> > > >> -Takanobu Asanuma
>> > > >>
>> > > >> -Original Message-
>> > > >> From: Joe Witt [mailto:joew...@apache.org]
>> > > >> Sent: Friday, March 23, 2018 1:12 AM
>> > > >> To: dev@nifi.apache.org
>> > > >> Subject: [VOTE] Release Apache NiFi 1.6.0
>> > > >>
>> > > >> Hello,
>> > > >>
>> > > >> I am pleased to be calling this vote for the source release of
>> Apache
>> > > NiFi nifi-1.6.0.
>> > > >>
>> > > >> The source zip, including signatures, digests, etc. can be found at:
>> > > >> https://repository.apache.org/conten

Re: [VOTE] Release Apache NiFi 1.6.0

2018-03-23 Thread Bryan Bende
+1 (binding)

- Ran through release helper and everything checked out
- Verified some test flows with the restricted components + keytab CS



On Fri, Mar 23, 2018 at 2:42 PM, Mark Payne  wrote:
> +1 (binding)
>
> Was able to verify hashes, build with contrib-check, and start up 
> application. Performed some basic functionality tests and all worked as 
> expected.
>
> Thanks!
> -Mark
>
>
>> On Mar 23, 2018, at 6:02 AM, Takanobu Asanuma  wrote:
>>
>> Thanks for all your efforts, Joe.
>>
>> I have one question. The version of the generated package is 
>> nifi-1.7.0-SNAPSHOT. Is this correct at this stage? If it's ok, 
>> +1(non-binding).
>>
>> - Succeeded "mvn -T 2.0C clean install -DskipTests -Prpm"
>> - Started secure cluster mode
>> - Verified sample dataflows work fine
>> - Verified the whitelist feature(NIFI-4761) works fine
>>
>> -Takanobu Asanuma
>>
>> -Original Message-
>> From: Joe Witt [mailto:joew...@apache.org]
>> Sent: Friday, March 23, 2018 1:12 AM
>> To: dev@nifi.apache.org
>> Subject: [VOTE] Release Apache NiFi 1.6.0
>>
>> Hello,
>>
>> I am pleased to be calling this vote for the source release of Apache NiFi 
>> nifi-1.6.0.
>>
>> The source zip, including signatures, digests, etc. can be found at:
>> https://repository.apache.org/content/repositories/orgapachenifi-1122
>>
>> The Git tag is nifi-1.6.0-RC1
>> The Git commit ID is 49a71f4740c9fac38958961f78dd3cde874b0e45
>> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=49a71f4740c9fac38958961f78dd3cde874b0e45
>>
>> Checksums of nifi-1.6.0-source-release.zip:
>> SHA1: 4fb82f386a0aa83614cc01449e540527443811ce
>> SHA256: 4df6638ec87a5bee12e7978abc64137ee5da5fc8c399e34cf34ca1c3720ac891
>> SHA512: 
>> 6fa536f9618c6c153c04df5db59913eaf3dd54ae2389368129ac6237f75519e1eee7ba0ca70145a95de01517a1c3ea1f36975895d9d72bb04abb56b7934e013a
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/joewitt.asc
>>
>> KEYS file available here:
>> https://dist.apache.org/repos/dist/release/nifi/KEYS
>>
>> 135 issues were closed/resolved for this release:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12342422
>>
>> Release note highlights can be found here:
>> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.6.0
>>
>> The vote will be open for 72 hours.
>> Please download the release candidate and evaluate the necessary items 
>> including checking hashes, signatures, build from source, and test.  The 
>> please vote:
>>
>> [ ] +1 Release this package as nifi-1.6.0 [ ] +0 no opinion [ ] -1 Do not 
>> release this package because...
>


Re: FlattenJson

2018-03-23 Thread Bryan Bende
Most of the ideas discussed here are assuming there is one record per
flow file, which for any serious amount of data is not what you want
to do.

It might be better to have a new ExtractJsonToAttributes processor
that enforces limitations like a single JSON document per flow file and
all flat fields, so that if you don't split and flatten beforehand it
routes to failure.

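As a very rough sketch of that idea (this processor does not exist; the names,
relationships, and behavior below are all assumptions), using Jackson and routing
anything that is not a single flat JSON object to failure:

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.stream.io.StreamUtils;

public class ExtractJsonToAttributes extends AbstractProcessor {

    static final Relationship REL_SUCCESS = new Relationship.Builder().name("success").build();
    static final Relationship REL_FAILURE = new Relationship.Builder().name("failure").build();

    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public Set<Relationship> getRelationships() {
        return new HashSet<>(Arrays.asList(REL_SUCCESS, REL_FAILURE));
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }
        try {
            final byte[] content = new byte[(int) flowFile.getSize()];
            session.read(flowFile, in -> StreamUtils.fillBuffer(in, content));
            final JsonNode root = mapper.readTree(content);

            // enforce the limitations described above: exactly one flat JSON object per flow file
            if (root == null || !root.isObject()) {
                session.transfer(flowFile, REL_FAILURE);
                return;
            }

            final Map<String, String> attributes = new HashMap<>();
            final Iterator<Map.Entry<String, JsonNode>> fields = root.fields();
            while (fields.hasNext()) {
                final Map.Entry<String, JsonNode> field = fields.next();
                if (field.getValue().isContainerNode()) {
                    // nested object or array means the document was not flattened first
                    session.transfer(flowFile, REL_FAILURE);
                    return;
                }
                attributes.put(field.getKey(), field.getValue().asText());
            }

            flowFile = session.putAllAttributes(flowFile, attributes);
            session.transfer(flowFile, REL_SUCCESS);
        } catch (final Exception e) {
            getLogger().error("Failed to extract JSON to attributes", e);
            session.transfer(flowFile, REL_FAILURE);
        }
    }
}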

On Fri, Mar 23, 2018 at 11:07 AM, Jorge Machado <jom...@me.com> wrote:
> So I’m pretty lost now, all the suggestions from Matt will not solve my 
> problem that I need to have all contents of a flow file as attritube key 
> -paired…
>
> A good place to have it would be on ConvertAvroToJSON so that it has a option 
> to say if it goes to attribute or to FlowFile, defaulting to Flowfile.
>
> Would be the Changed accepted  ? I would create a PR for it.
>
>
> Jorge Machado
>
>
>
>
>
>> On 20 Mar 2018, at 22:35, Otto Fowler <ottobackwa...@gmail.com> wrote:
>>
>> We could start with routeOnJsonPath and do the record path as the need
>> arises?
>>
>>
>> On March 20, 2018 at 16:06:34, Matt Burgess (mattyb...@apache.org) wrote:
>>
>> Rather than restricting it to JSONPath, perhaps we should have a
>> RouteOnRecordPath or RouteRecord using the RecordPath API? Even better
>> would be the ability to use RecordPath functions in QueryRecord, but
>> that involves digging into Calcite as well. I realize JSONPath might
>> have more capabilities than RecordPath at the moment, but it seems a
>> shame to force the user to convert to JSON to use a "RouteOnJSONPath"
>> processor, the record-aware processors are meant to replace that kind
>> of format-specific functionality.
>>
>> Regards,
>> Matt
>>
>> On Tue, Mar 20, 2018 at 12:19 PM, Sivaprasanna
>> <sivaprasanna...@gmail.com> wrote:
>>> Like the idea that Otto suggested. RoutOnJSONPath makes more sense since
>>> making the flattened JSON write to attributes is restricted to that
>>> processor alone.
>>>
>>> On Tue, Mar 20, 2018 at 8:37 PM, Otto Fowler <ottobackwa...@gmail.com>
>>> wrote:
>>>
>>>> Why not create a new processor that does routeOnJSONPath and works on
>> the
>>>> flow file?
>>>>
>>>>
>>>> On March 20, 2018 at 10:39:37, Jorge Machado (jom...@me.com) wrote:
>>>>
>>>> So that is what we actually are doing EvaluateJsonPath the problem with
>>>> that is, that is hard to build something generic if we need to specify
>> each
>>>> property by his name, that’s why this idea.
>>>>
>>>> Should I make a PR for this or is this to business specific ?
>>>>
>>>>
>>>> Jorge Machado
>>>>
>>>>> On 20 Mar 2018, at 15:30, Bryan Bende <bbe...@gmail.com> wrote:
>>>>>
>>>>> Ok so I guess it depends whether you end up needing all 30 fields as
>>>>> attributes to achieve the logic in your flow, or if you only need a
>>>>> couple.
>>>>>
>>>>> If you only need a couple you could probably use EvaluateJsonPath
>>>>> after FlattenJson to extract just the couple of fields you need into
>>>>> attributes.
>>>>>
>>>>> If you need them all then I guess it makes sense to want the option to
>>>>> flatten into attributes.
>>>>>
>>>>> On Tue, Mar 20, 2018 at 10:14 AM, Jorge Machado <jom...@me.com> wrote:
>>>>>> From there on we use a lot of routeOnAttritutes and use that values
>> on
>>>> sql queries to other tables like select * from someTable where
>>>> id=${myExtractedAttribute}
>>>>>> To be honest I tryed JoltTransformJSON but I could not get it working
>> :)
>>>>>>
>>>>>> Jorge Machado
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 20 Mar 2018, at 15:12, Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>
>>>>>>> I think Bryan is asking about what happens AFTER this part of the
>>>>>>> flow. For example, if you are doing routing you can use QueryRecord
>>>>>>> (and you won't need the SplitJson), if you are doing transformations
>>>>>>> you can use JoltTransformJSON (often without SplitJson as well),
>> etc.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Matt
>>

Re: FlattenJson

2018-03-20 Thread Bryan Bende
The only issue is that typically with record processors you will have
many records per flow file, so what do you do in that case?

If RouteOnRecordPath is going to send out individual records, then I
think you could already achieve this with PartitionRecord +
SplitRecord.

PartitionRecord would create a flow file per grouping of records; you
could then use RouteOnAttribute to route each group to a different
part of the flow, and then SplitRecord with a record count of 1 if you
needed one record per flow file.

On Tue, Mar 20, 2018 at 4:06 PM, Matt Burgess <mattyb...@apache.org> wrote:
> Rather than restricting it to JSONPath, perhaps we should have a
> RouteOnRecordPath or RouteRecord using the RecordPath API? Even better
> would be the ability to use RecordPath functions in QueryRecord, but
> that involves digging into Calcite as well.  I realize JSONPath might
> have more capabilities than RecordPath at the moment, but it seems a
> shame to force the user to convert to JSON to use a "RouteOnJSONPath"
> processor, the record-aware processors are meant to replace that kind
> of format-specific functionality.
>
> Regards,
> Matt
>
> On Tue, Mar 20, 2018 at 12:19 PM, Sivaprasanna
> <sivaprasanna...@gmail.com> wrote:
>> Like the idea that Otto suggested. RouteOnJSONPath makes more sense since
>> making the flattened JSON write to attributes is restricted to that
>> processor alone.
>>
>> On Tue, Mar 20, 2018 at 8:37 PM, Otto Fowler <ottobackwa...@gmail.com>
>> wrote:
>>
>>> Why not create a new processor that does routeOnJSONPath and works on the
>>> flow file?
>>>
>>>
>>> On March 20, 2018 at 10:39:37, Jorge Machado (jom...@me.com) wrote:
>>>
>>> So that is what we are actually doing with EvaluateJsonPath; the problem with
>>> that is that it is hard to build something generic if we need to specify each
>>> property by its name, that’s why this idea.
>>>
>>> Should I make a PR for this or is this too business specific ?
>>>
>>>
>>> Jorge Machado
>>>
>>> > On 20 Mar 2018, at 15:30, Bryan Bende <bbe...@gmail.com> wrote:
>>> >
>>> > Ok so I guess it depends whether you end up needing all 30 fields as
>>> > attributes to achieve the logic in your flow, or if you only need a
>>> > couple.
>>> >
>>> > If you only need a couple you could probably use EvaluateJsonPath
>>> > after FlattenJson to extract just the couple of fields you need into
>>> > attributes.
>>> >
>>> > If you need them all then I guess it makes sense to want the option to
>>> > flatten into attributes.
>>> >
>>> > On Tue, Mar 20, 2018 at 10:14 AM, Jorge Machado <jom...@me.com> wrote:
>>> >> From there on we use a lot of RouteOnAttribute and use those values on
>>> sql queries to other tables like select * from someTable where
>>> id=${myExtractedAttribute}
>>> >> To be honest I tried JoltTransformJSON but I could not get it working :)
>>> >>
>>> >> Jorge Machado
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>> On 20 Mar 2018, at 15:12, Matt Burgess <mattyb...@apache.org> wrote:
>>> >>>
>>> >>> I think Bryan is asking about what happens AFTER this part of the
>>> >>> flow. For example, if you are doing routing you can use QueryRecord
>>> >>> (and you won't need the SplitJson), if you are doing transformations
>>> >>> you can use JoltTransformJSON (often without SplitJson as well), etc.
>>> >>>
>>> >>> Regards,
>>> >>> Matt
>>> >>>
>>> >>> On Tue, Mar 20, 2018 at 10:08 AM, Jorge Machado <jom...@me.com> wrote:
>>> >>>> Hi Bryan,
>>> >>>>
>>> >>>> thanks for the help.
>>> >>>> Our Flow: ExecuteSql -> convertToJSON -> SplitJson -> ExecuteScript
>>> with attached code 1.
>>> >>>>
>>> >>>> We are now writing a custom processor that does this, which is a copy
>>> of FlattenJson but instead of putting the result into a flowfile we put it
>>> into the attributes.
>>> >>>> That’s why I asked if it makes sense to contribute this back
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> Attached code 1:
>>> >>>>

Re: FlattenJson

2018-03-20 Thread Bryan Bende
Ok so I guess it depends whether you end up needing all 30 fields as
attributes to achieve the logic in your flow, or if you only need a
couple.

If you only need a couple you could probably use EvaluateJsonPath
after FlattenJson to extract just the couple of fields you need into
attributes.

If you need them all then I guess it makes sense to want the option to
flatten into attributes.

On Tue, Mar 20, 2018 at 10:14 AM, Jorge Machado <jom...@me.com> wrote:
> From there on  we use a lot of RouteOnAttribute and use those values on sql 
> queries to other tables like select * from someTable where 
> id=${myExtractedAttribute}
> To be honest I tried JoltTransformJSON but I could not get it working :)
>
> Jorge Machado
>
>
>
>
>
>> On 20 Mar 2018, at 15:12, Matt Burgess <mattyb...@apache.org> wrote:
>>
>> I think Bryan is asking about what happens AFTER this part of the
>> flow. For example, if you are doing routing you can use QueryRecord
>> (and you won't need the SplitJson), if you are doing transformations
>> you can use JoltTransformJSON (often without SplitJson as well), etc.
>>
>> Regards,
>> Matt
>>
>> On Tue, Mar 20, 2018 at 10:08 AM, Jorge Machado <jom...@me.com> wrote:
>>> Hi Bryan,
>>>
>>> thanks for the help.
>>> Our Flow: ExecuteSql -> convertToJSON ->  SplitJson -> ExecuteScript with 
>>> attached code 1.
>>>
>>> We are now writing a custom processor that does this, which is a copy of 
>>> FlattenJson but instead of putting the result into a flowfile we put it 
>>> into the attributes.
>>> That’s why I asked if it makes sense to contribute this back
>>>
>>>
>>>
>>> Attached code 1:
>>>
>>> import org.apache.commons.io.IOUtils
>>> import java.nio.charset.*
>>> def flowFile = session.get();
>>> if (flowFile == null) {
>>>return;
>>> }
>>> def slurper = new groovy.json.JsonSlurper()
>>> def attrs = [:] as Map<String,String>
>>> session.read(flowFile,
>>>{ inputStream ->
>>>def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>>>def obj = slurper.parseText(text)
>>>    obj.each {k,v ->
>>>if(v!=null && v.toString()!=""){
>>>  attrs[k] = v.toString()
>>>  }
>>>}
>>>} as InputStreamCallback)
>>> flowFile = session.putAllAttributes(flowFile, attrs)
>>> session.transfer(flowFile, REL_SUCCESS)
>>>
>>> some code removed
>>>
>>>
>>> Jorge Machado
>>>
>>>
>>>
>>>
>>>
>>>> On 20 Mar 2018, at 15:03, Bryan Bende <bbe...@gmail.com> wrote:
>>>>
>>>> Ok it is still not clear what the reason for needing it in attributes
>>>> is though... Is there another processor you are using after this that
>>>> only works off attributes?
>>>>
>>>> Just trying to understand if there is another way to accomplish what
>>>> you want to do.
>>>>
>>>> On Tue, Mar 20, 2018 at 9:50 AM, Jorge Machado <jom...@me.com> wrote:
>>>>> We are using NiFi for workflow, and from a database we get columns like job_status 
>>>>> and job_name plus some nested JSON columns (30 columns in total).
>>>>> We need to put them as attributes on the flow file, not in the content. 
>>>>> The first part (columns without JSON) is done by a Groovy script, but it 
>>>>> would be nice to use this standard processor and, instead of writing the 
>>>>> result to the flow file content, write it to attributes.
>>>>>
>>>>>
>>>>> Jorge Machado
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> On 20 Mar 2018, at 14:47, Bryan Bende <bbe...@gmail.com> wrote:
>>>>>>
>>>>>> What would be the main use case for wanting all the flattened values
>>>>>> in attributes?
>>>>>>
>>>>>> If the reason was to keep the original content, we could probably just
>>>>>> add an original relationship.
>>>>>>
>>>>>> Also, I think FlattenJson supports flattening a flow file where the
>>>>>> root is an array of JSON documents (although I'm not totally sure), so
>>>>>> you'd have to consider what to do in that case.
>>>>>>
>>>>>

Re: FlattenJson

2018-03-20 Thread Bryan Bende
Ok it is still not clear what the reason for needing it in attributes
is though... Is there another processor you are using after this that
only works off attributes?

Just trying to understand if there is another way to accomplish what
you want to do.

On Tue, Mar 20, 2018 at 9:50 AM, Jorge Machado <jom...@me.com> wrote:
> We are using NiFi for workflow, and from a database we get columns like job_status and 
> job_name plus some nested JSON columns (30 columns in total).
> We need to put them as attributes on the flow file, not in the content. The 
> first part (columns without JSON) is done by a Groovy script, but it would 
> be nice to use this standard processor and, instead of writing the result to the 
> flow file content, write it to attributes.
>
>
> Jorge Machado
>
>
>
>
>
>> On 20 Mar 2018, at 14:47, Bryan Bende <bbe...@gmail.com> wrote:
>>
>> What would be the main use case for wanting all the flattened values
>> in attributes?
>>
>> If the reason was to keep the original content, we could probably just
>> add an original relationship.
>>
>> Also, I think FlattenJson supports flattening a flow file where the
>> root is an array of JSON documents (although I'm not totally sure), so
>> you'd have to consider what to do in that case.
>>
>> On Tue, Mar 20, 2018 at 5:26 AM, Pierre Villard
>> <pierre.villard...@gmail.com> wrote:
>>> No, I do see how this could be convenient in some cases. My comment was
>>> more: you can certainly submit a PR for that feature, but it'll need to be
>>> clearly documented using the appropriate annotations, documentation, and
>>> property descriptions.
>>>
>>> 2018-03-20 10:20 GMT+01:00 Jorge Machado <jom...@me.com>:
>>>
>>>> Hi Pierre, I’m aware of that. So this means the change would not be
>>>> accepted, correct ?
>>>>
>>>> Regards
>>>>
>>>> Jorge Machado
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> On 20 Mar 2018, at 09:54, Pierre Villard <pierre.villard...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi Jorge,
>>>>>
>>>>> I think this should be carefully documented to remind users that the
>>>>> attributes are in memory. Doing what you propose would mean having in
>>>>> memory the full content of the flow file as long as the flow file is
>>>>> processed in the workflow (unless you remove attributes using
>>>>> UpdateAttributes).
>>>>>
>>>>> Pierre
>>>>>
>>>>> 2018-03-20 7:55 GMT+01:00 Jorge Machado <jom...@me.com>:
>>>>>
>>>>>> Hey guys,
>>>>>>
>>>>>> I would like to change the FlattenJson Processor so it is possible to
>>>>>> flatten to the attributes instead of only to the content. Is this a good
>>>> Idea ?
>>>>>> would the PR be accepted ?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> Jorge Machado
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>


Re: FlattenJson

2018-03-20 Thread Bryan Bende
What would be the main use case for wanting all the flattened values
in attributes?

If the reason was to keep the original content, we could probably just
add an original relationship.

Also, I think FlattenJson supports flattening a flow file where the
root is an array of JSON documents (although I'm not totally sure), so
you'd have to consider what to do in that case.

On Tue, Mar 20, 2018 at 5:26 AM, Pierre Villard
 wrote:
> No, I do see how this could be convenient in some cases. My comment was
> more: you can certainly submit a PR for that feature, but it'll need to be
> clearly documented using the appropriate annotations, documentation, and
> property descriptions.
>
> 2018-03-20 10:20 GMT+01:00 Jorge Machado :
>
>> Hi Pierre, I’m aware of that. So this means the change would not be
>> accepted, correct ?
>>
>> Regards
>>
>> Jorge Machado
>>
>>
>>
>>
>>
>> > On 20 Mar 2018, at 09:54, Pierre Villard 
>> wrote:
>> >
>> > Hi Jorge,
>> >
>> > I think this should be carefully documented to remind users that the
>> > attributes are in memory. Doing what you propose would mean having in
>> > memory the full content of the flow file as long as the flow file is
>> > processed in the workflow (unless you remove attributes using
>> > UpdateAttributes).
>> >
>> > Pierre
>> >
>> > 2018-03-20 7:55 GMT+01:00 Jorge Machado :
>> >
>> >> Hey guys,
>> >>
>> >> I would like to change the FlattenJson Processor so it is possible to
>> >> flatten to the attributes instead of only to the content. Is this a good
>> Idea ?
>> >> would the PR be accepted ?
>> >>
>> >> Cheers
>> >>
>> >> Jorge Machado
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>>
>>


Re: Custom NAR Class Loader Issue

2018-03-16 Thread Bryan Bende
This should be better going forward once the 1.6.0 release is out.

https://issues.apache.org/jira/browse/NIFI-4936
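
For reference, the workaround discussed below (a bundle pom overriding a
version pinned in the root pom's dependencyManagement) looks roughly like
this -- a sketch only, using the Guava example from the thread:

    <!-- in the bundle's own pom.xml, e.g. nifi-gcp-bundle -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
          <version>19.0</version>
        </dependency>
      </dependencies>
    </dependencyManagement>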

On Fri, Mar 16, 2018 at 3:49 PM, Otto Fowler <ottobackwa...@gmail.com> wrote:
> This is also a problem for new AWS processors, say if you use a newer aws
> java sdk core in a dependency but also have to depend on the nifi-aws-nar.
>
>
> On March 2, 2018 at 13:40:21, Bryan Bende (bbe...@gmail.com) wrote:
>
> Doug,
>
> I think the only solution is what you proposed about fixing the
> nifi-gcp-bundle...
>
> Basically, if a NAR needs a different version of a dependency that is
> already declared in the root pom's dependencyManagement, then the
> bundle's pom needs it own dependencyManagement to force it back to the
> specific version for the bundle.
>
> Ultimately it would be nice to re-evaluate our use of
> dependencyManagement in the root pom, because I don't think we should
> be forcing the version of so many things. Guava is a perfect example
> where we shouldn't be forcing all NARs to use a specific version of
> Guava, that defeats the whole purpose of NARs.
>
> There are other cases though where it does make sense because there is
> code in nifi/nifi-commons that we would likely want to keep in sync
> with the NARs that use the code.
>
> -Bryan
>
> On Fri, Mar 2, 2018 at 1:27 PM, Douglas Willcocks
> <douglas.willco...@artefact.com> wrote:
>> Hi all,
>>
>> I'm encountering a JAR version dependency issue when developing a custom
>> Processor that uses the GCPCredentialsService in Nifi 1.5.0.
>>
>> 1) The nifi-gcp-bundle distributed with Nifi 1.5.0 (which contains
>> GCPCredentialsService) is packaged with dependencies on:
>>
>> - com.google.cloud:google-cloud:0.8.0
>> - com.google.auth:google-auth-library-oauth2-http:0.6.0
>>
>> Which both transitively depend on com.google.guava:guava:19.0, except the
>> root-level pom.xml for Nifi (here
>> <https://github.com/apache/nifi/blob/master/pom.xml>) explicitly pins the
>> guava version to 18.0.
>>
>> 2) The custom Processor I am building also depends on
>> com.google.cloud:google-cloud:0.8.0, but actually uses methods from this
>> library that depend on features added in com.google.guava:guava:19.0.
>>
>> 3) In order to make the GCPCredentialsService available inside my
>> Processor
>> NAR, I have added the nifi-gcp-services-api-nar as a parent.
>>
>> 4) The implementation of NarClassLoader always starts the search for
>> classes in the parent, which means that when my processor looks for
>> com.google.guava class definitions, the com.google.guava:guava:18.0 JAR
>> packaged with the nifi-gcp-services-api-nar is always found first,
>> resulting in a java.lang.NoSuchMethodError exception in my Processor's
>> execution.
>>
>> I'm not sure what the best solution is – I guess I could create my own
>> version of the nifi-gcp-bundle that has the correct pom.xml file to force
>> the use of guava:19.0, but that seems a bit overkill for a change that is
>> essentially 5 lines of XML in a POM file.
>>
>> Any suggestions or thoughts?
>>
>> Thanks!
>> Douglas
>> --
>>
>> Douglas Willcocks
>> VP France, Data Science & Engineering
>>
>> 19 Rue Richer, 75009 Paris, France
>> M: +33 6 80 37 60 72
>> E: douglas.willco...@artefact.com
>>
>> W: http://www.artefact.com/
>>
>> --
>> This email and any attachments contains privileged and confidential
>> information intended only for the use of the addressee(s). If you are not
>> an intended recipient of this email, you are hereby notified that any
>> dissemination, copying or use of information is strictly prohibited. If
>> you
>> received this email in error or without authorization, please delete the
>> email from your system and notify us immediately by sending us an email.
>> If
>> you need any further assistance, please send a message to
>> he...@artefact.com
>> .


Re: DBCPConnectionPool encrypted password

2018-03-12 Thread Bryan Bende
Toivo,

I think there needs to be some improvements around variables &
sensitive property handling, but it is a challenging situation.

Some things you could investigate with the current capabilities..

- With the registry scenario, you could define a DBCPConnectionPool at
the root process group of each of your environments, then all your
versioned process groups can reference the DBCPConnectionPool from the
level above. When deploying a versioned flow you would still need to
go in the first time and update any processors to reference the
appropriate connection pool, but maybe this could be scripted?

- Similar to above, but if each versioned process group had its own
DBCPConnectionPool, then maybe after importing the flow you can script
the process of setting the password on the connection pool (see the sketch below).

- You could possibly implement a custom version of DBCPConnectionPool
that obtained the password from somewhere outside the flow, although
this isn't great because it only works for this one component.
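
As a rough sketch of the scripting idea (the host, ids, and the "Password"
property name shown here are just examples), the update is a single REST call:

    GET  https://nifi-host:8443/nifi-api/controller-services/<service-id>
         (read the current revision "version" from the response)

    PUT  https://nifi-host:8443/nifi-api/controller-services/<service-id>
    {
      "revision": { "version": <current version>, "clientId": "deploy-script" },
      "component": {
        "id": "<service-id>",
        "properties": { "Password": "env-specific-password" }
      }
    }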

-Bryan


On Mon, Mar 12, 2018 at 9:59 AM, Toivo Adams  wrote:
> Hi Bryan,
>
> We start using Registry soon anyway, so this is useful info.
> But it would be even better if we don’t need to enter passwords manually
> each time we deploy a template.
> Any ideas how to do this?
>
> Thank you
> Toivo
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: DBCPConnectionPool encrypted password

2018-03-12 Thread Bryan Bende
You may want to consider moving from templates to NiFi Registry for
your deployment approach. The idea of this approach is that your flow
will get saved to registry with no sensitive values, when you import
the flow to the next environment you enter the sensitive values there
the first time and they get encrypted like normal, and then on future
deployments it retains the values you entered in the current
environment. There was actually a bug with this that is fixed on
master and will be in the next release [1].

[1] https://issues.apache.org/jira/browse/NIFI-4920



On Mon, Mar 12, 2018 at 9:22 AM, Toivo Adams  wrote:
> Hi Bryan,
>
>>> Are you saying you are trying to externalize the value outside the
>>> w and keep it encrypted somewhere else?
>
> Yes, exactly. We have different passwords on different environments (dev,
> test, production).
> After development, the flow (currently a template) will be deployed to the test
> env, and if testing is successful we deploy the same flow to production.
> Ideally the flow should remain unmodified.
> So we keep the password outside of the flow – in a properties file.
>
> Thank you
> Toivo
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: DBCPConnectionPool encrypted password

2018-03-12 Thread Bryan Bende
Toivo,

The password property on DBCPConnectionPool is a "sensitive" property
which means it is already encrypted in the flow.xml.gz using
nifi.sensitive.props.key.

Are you saying you are trying to externalize the value outside the
flow and keep it encrypted somewhere else?

-Bryan


On Mon, Mar 12, 2018 at 6:02 AM, Toivo Adams  wrote:
> Hi,
>
> I need to encrypt the DBCPConnectionPool password.
> I have working decryption code.
> The problem is how to make the decrypted password available to DBCPConnectionPool.
>
> I thought of using expression language and JVM system properties.
> DBCPConnectionPool is capable of reading a system property value.
>
> But how do I set the system property value after decryption?
> I could create a custom Controller Service, but this service would have to be
> executed before DBCPConnectionPool.
> Is the order of execution of Controller Services defined anywhere?
>
> Thank you
> Toivo
>
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: [VOTE] Establish Fluid Design System, a sub-project of Apache NiFi

2018-03-09 Thread Bryan Bende
+1

On Fri, Mar 9, 2018 at 3:11 PM, Joe Witt  wrote:
> +1
>
> On Mar 9, 2018 3:10 PM, "Scott Aslan"  wrote:
>
> All,
>
> Following a solid discussion for the past couple of weeks [1] regarding the
> establishment of Fluid Design System as a sub-project of Apache NiFi, I'd
> like to
> call a formal vote to record this important community decision and
> establish consensus.
>
> The scope of this project is to define a theme-able set of high quality UI
> components and utilities for use across the various Apache NiFi web
> applications in order to provide a more consistent user experience.
>
> I am a +1 and looking forward to the future work in this area.
>
> The vote will be open for 72 hours and be a majority rule vote.
>
> [ ] +1 Establish Fluid Design System, a subproject of Apache NiFi
> [ ]   0 Do not care
> [ ]  -1 Do not establish Fluid Design System, a subproject of Apache NiFi
>
> Thanks,
>
> ScottyA
>
> [1] *http://mail-archives.apache.org/mod_mbox/nifi-dev/201802.mbox/%
> 3CCAKeSr4ibXX9xzGN1GhdVv5uTmWvfB3QULXF9orzw4FYD0n7taQ%40mail.gmail.com%3E
>  3CCAKeSr4ibXX9xzGN1GhdVv5uTmWvfB3QULXF9orzw4FYD0n7taQ%40mail.gmail.com%3E>*


Re: Is it possible to deploy NiFi to an WildFly app server ?

2018-03-07 Thread Bryan Bende
NiFi is not a single WAR that can be deployed somewhere. You should
think of it like other software that you install on your system, for
example a relational database. You wouldn't expect to deploy your
Postgres DB to your WildFly server.

On Wed, Mar 7, 2018 at 9:00 AM, Mike Thomsen  wrote:
> Most of NiFi isn't a JavaEE application, so I don't see how you could do
> more than maybe get part of the web UI working in WildFly.
>
> On Wed, Mar 7, 2018 at 6:36 AM, Александр  wrote:
>
>> We have a requirement to run NiFi in WildFly, the only approved app
>> server. Is it possible to do that?
>>
>> Thanks.
>>


Re: Policies for root Process Group.

2018-02-27 Thread Bryan Bende
Making a call to "/process-groups/root" should retrieve the root
process group which should then have an id element.
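
For example, a GET to /nifi-api/process-groups/root returns the root group
entity, something along the lines of (abridged):

    {
      "revision": { "version": 0 },
      "id": "<root-group-uuid>",
      "component": { "id": "<root-group-uuid>", "name": "NiFi Flow", ... }
    }

and that id is the actual root group UUID you can use in subsequent requests.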


On Mon, Feb 26, 2018 at 5:20 PM, Daniel Hernandez
 wrote:
> Thanks Matt,
>
> I now see what the problem is. To exhaust all my possibilities, may I
> ask: is there a way, using the API, to get the root UUID from the
> flow.xml.gz file? Because I see the file there after running the tests.
>
> Thanks,
>
>
> On Mon, Feb 26, 2018 at 3:26 PM, Daniel Hernandez <
> daniel.hernan...@civitaslearning.com> wrote:
>
>> Hi Matt,
>>
>> Thanks for your answer.
>>
>> Do you know if there is a way to preconfigure this value when running
>> Nifi's Docker image? I am making the calls from an integration test that
>> runs a docker container with the Nifi server. I already checked, and the value
>> under  in the flow.xml.gz file changes every time I deploy
>> the container; I guess it is created at startup.  Is it possible that I can
>> change my docker image to get a fixed root group value?
>>
>> Thanks,
>>
>> Daniel
>>
>> On Mon, Feb 26, 2018 at 11:35 AM, Daniel Hernandez > civitaslearning.com> wrote:
>>
>>> Hi,
>>>
>>> I am currently working on calling the Nifi REST API to get the 'root'
>>> process group and setting it as parent for a new process-group.
>>>
>>> However I am getting the next messages:
>>>
>>> Attempting GET request to: JerseyWebTarget {
>>> https://127.0.0.1:8443/nifi-api/process-groups/root }
>>> 2018-02-26 11:06:55.341 DEBUG  --- [   main]
>>> c.c.p.n.c.i.b.BootApiClient  :
>>> 2018-02-26 11:06:55.341 DEBUG  --- [   main]
>>> c.c.p.n.c.i.b.BootApiClient  : Received 403 response from GET
>>> to JerseyWebTarget { https://127.0.0.1:8443/nifi-api/process-groups/root
>>> }
>>>
>>> com.civitaslearning.platform.nifi.client.invoker.boot.exception.NifiForbiddenException:
>>> No applicable policies could be found. Contact the system administrator.
>>>
>>> This is the content of my authorizations.xml file:
>>>
>>> 
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/flow" action="R">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/restricted-components" action="W">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/tenants" action="R">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/tenants" action="W">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/policies" action="R">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/policies" action="W">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/controller" action="R">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/controller" action="W">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/process-groups/root" action="R">
>>>
>>> 
>>>
>>> 
>>>
>>> >> resource="/process-groups/root" action="W">
>>>
>>> 
>>>
>>> 
>>>
>>> 
>>>
>>> 
>>>
>>> And this is the content of authorizations.xml
>>>
>>> 
>>>
>>> 
>>>
>>> file-access-policy-provider
>>>
>>> org.apache.nifi.authorization.FileAccessPolicyProvide
>>> r
>>>
>>> file-user-group-prov
>>> ider
>>>
>>> ./conf/authorizations.xm
>>> l
>>>
>>> CN=civitas,
>>> OU=ApacheNifi
>>>
>>> 
>>>
>>>
>>> 
>>>
>>> 
>>>
>>> 
>>>
>>> managed-authorizer
>>>
>>> org.apache.nifi.authorization.StandardManagedAuthoriz
>>> er
>>>
>>> file-access-policy-p
>>> rovider
>>>
>>> 
>>>
>>> 
>>>
>>>
>>> And users.xml
>>>
>>>
>>> 
>>>
>>> 
>>>
>>> 
>>>
>>> 
>>>
>>> >> identity="CN=civitas, OU=ApacheNifi"/>
>>>
>>> 
>>>
>>> 
>>>
>>> I already create a policy using the same user cert so I guess the DN is
>>> valid.
>>> Am I defining the policy or making the call in a wrong way?
>>>
>>> Thanks in advance,
>>>
>>> Daniel Hernandez
>>>
>>>
>>>
>>


Re: get access token inside custom processor

2018-02-27 Thread Bryan Bende
Hello,

Your custom processor would be the same as if you were writing an
external client program.

You would need to provide the processor with a username and password
in the processor properties, and then it would need to make a call to
the token REST end-point.

Processors don't run as the user from the web UI, they run on behalf
of the NiFi framework and have no idea which user started/stopped
them.
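
A rough sketch of that call from Java (the class/method names here are made
up and error handling is omitted; the /access/token endpoint returns the JWT
as the response body, which you then send on later requests as an
"Authorization: Bearer <token>" header):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;

    public class NiFiTokenExample {
        // baseUrl is something like https://nifi-host:9443/nifi-api
        static String fetchToken(String baseUrl, String username, String password) throws Exception {
            HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl + "/access/token").openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

            String body = "username=" + URLEncoder.encode(username, "UTF-8")
                    + "&password=" + URLEncoder.encode(password, "UTF-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }

            // the whole response body is the token
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                return reader.readLine();
            }
        }
    }

(If the server cert isn't trusted by the JVM you would also need to set up an
SSLContext/truststore, which is omitted here.)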

Thanks,

Bryan

On Tue, Feb 27, 2018 at 1:27 AM, 尹文才  wrote:
> Hi guys, I'm trying to invoke some NiFi REST APIs inside my custom
> processor. The NiFi I'm using is 1.4.0 and it's a 3-node secured
> cluster; the username and password are kept inside an LDAP server.
> I know that in a secured NiFi cluster, in order to make any request I need
> the access token. My question is: how can I get the access token in my
> custom processor? Thanks. (I think the token should be
> available somewhere after a successful login, right?)
>
> regards,
> ben


Re: Policies for root Process Group.

2018-02-26 Thread Bryan Bende
You should be able to include a canned flow.xml.gz in your in your
container, just have nothing under the root group.


On Mon, Feb 26, 2018 at 3:50 PM, Matt Gilman  wrote:
> Daniel,
>
> Unfortunately, there is no way to set this currently. This is ultimately a
> lifecycle issue. The UUID of the root group may be inherited from a cluster
> or randomly generated if a node is standalone. From the admin guide:
>
> "For a brand new secure flow, providing the "Initial Admin Identity" gives
> that user access to get into the UI and to manage users, groups and
> policies. But if that user wants to start modifying the flow, they need to
> grant themselves policies for the root process group. The system is unable
> to do this automatically because in a new flow the UUID of the root process
> group is not permanent until the flow.xml.gz is generated. If the NiFi
> instance is an upgrade from an existing flow.xml.gz or a 1.x instance going
> from unsecure to secure, then the "Initial Admin Identity" user is
> automatically given the privileges to modify the flow."
>
> Because of this, when there is no existing flow, granting permissions to
> the root group would need to happen after this initial startup.
>
> Matt
>
>
> On Mon, Feb 26, 2018 at 3:26 PM, Daniel Hernandez <
> daniel.hernan...@civitaslearning.com> wrote:
>
>> Hi Matt,
>>
>> Thanks for your answer.
>>
>> Do you know if there is a way to preconfigure this value when running
>> Nifi's Docker image? I am making the calls from an integration test that
>> runs a docker container with the Nifi server. I already checked, and the value
>> under  in the flow.xml.gz file changes every time I deploy
>> the container; I guess it is created at startup.  Is it possible that I can
>> change my docker image to get a fixed root group value?
>>
>> Thanks,
>>
>> Daniel
>>
>> On Mon, Feb 26, 2018 at 11:35 AM, Daniel Hernandez <
>> daniel.hernan...@civitaslearning.com> wrote:
>>
>> > Hi,
>> >
>> > I am currently working on calling the Nifi REST API to get the 'root'
>> > process group and setting it as parent for a new process-group.
>> >
>> > However I am getting the next messages:
>> >
>> > Attempting GET request to: JerseyWebTarget {
>> https://127.0.0.1:8443/nifi-
>> > api/process-groups/root }
>> > 2018-02-26 11:06:55.341 DEBUG  --- [   main]
>> > c.c.p.n.c.i.b.BootApiClient  :
>> > 2018-02-26 11:06:55.341 DEBUG  --- [   main]
>> > c.c.p.n.c.i.b.BootApiClient  : Received 403 response from GET
>> > to JerseyWebTarget { https://127.0.0.1:8443/nifi-api/process-groups/root
>> }
>> >
>> > com.civitaslearning.platform.nifi.client.invoker.boot.exception.
>> NifiForbiddenException:
>> > No applicable policies could be found. Contact the system administrator.
>> >
>> > This is the content of my authorizations.xml file:
>> >
>> > 
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/flow" action="R">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/restricted-components" action="W">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/tenants" action="R">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/tenants" action="W">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/policies" action="R">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/policies" action="W">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/controller" action="R">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/controller" action="W">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/process-groups/root" action="R">
>> >
>> > 
>> >
>> > 
>> >
>> > > > resource="/process-groups/root" action="W">
>> >
>> > 
>> >
>> > 
>> >
>> > 
>> >
>> > 
>> >
>> > And this is the content of authorizations.xml
>> >
>> > 
>> >
>> > 
>> >
>> > file-access-policy-provider
>> >
>> > org.apache.nifi.authorization.FileAccessPolicyProvider> > class>
>> >
>> > file-user-group-
>> > provider
>> >
>> > ./conf/authorizations.
>> > xml
>> >
>> > CN=civitas,
>> > OU=ApacheNifi
>> >
>> > 
>> >
>> >
>> > 
>> >
>> > 
>> >
>> > 
>> >
>> > managed-authorizer
>> >
>> > org.apache.nifi.authorization.StandardManagedAuthorizer> > class>
>> >
>> > file-access-policy-
>> > provider
>> >
>> > 
>> >
>> > 
>> >
>> >
>> > And users.xml
>> >
>> >
>> > 
>> >
>> > 
>> >
>> > 
>> >
>> > 
>> >
>> > > > identity="CN=civitas, OU=ApacheNifi"/>
>> >
>> > 
>> >
>> > 
>> >
>> > I already create a policy using the same user cert so I guess the DN is
>> > valid.
>> > Am I defining the policy or making the call in a wrong way?
>> >
>> > Thanks in advance,
>> >
>> > Daniel Hernandez
>> >
>> >
>> >
>>


Re: CSV record parsing with custom date formats

2018-02-13 Thread Bryan Bende
As a possible work around, there were date functions added to record path
in 1.5.0, so if you had a schema that treated the field as a string, you
could reformat the column in place using UpdateRecord to get it into
whatever format it needs to be in.
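
For example (a sketch only -- assuming the column is called eventDate and
arrives as MM/dd/yyyy), with UpdateRecord's Replacement Value Strategy set to
"Record Path Value" and a user-defined property:

    /eventDate = format( toDate( /eventDate, "MM/dd/yyyy" ), "yyyy-MM-dd" )

the field stays a string in the schema but ends up in the format the writer
or destination expects.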

On Tue, Feb 13, 2018 at 9:17 PM Koji Kawamura 
wrote:

> Hi Derek,
>
> By looking at the code briefly, I guess you are using ValidateRecord
> processor with CSVReader and AvroWriter..
> As you pointed out, it seems DataTypeUtils.isCompatibleDataType does
> not use the date format user defined at CSVReader.
>
> Is it possible for you to share followings for us to reproduce and
> understand it better?
> - Sample input CSV file
> - NiFi flow template using CSVReader and AvroWriter
>
> Thanks,
> Koji
>
> On Wed, Feb 14, 2018 at 7:11 AM, Derek Straka  wrote:
> > I have a question about the expected behavior of convertSimpleIfPossible
> in
> > CSVRecordReader.java (NiFi 1.5.0).
> >
> > I have a custom CSV file that I am taking to an avro schema using
> > ValidateRecord.  The schema contains a logical date type and the CSV has
> > the date in the format MM/DD/.  I expected to provide the date string
> > in the controller element for the CSV reader and have everything parse
> > happily, but it ends up throwing an exception when it tries to parse
> things
> > in the avro writer (String->Date).  I don't think I should be blaming the
> > avro writer because I expected the CSV reader to parse the date for me.
> >
> > I did a little digging in the CSVRecordReader.java, and I see everything
> > flows through convertSimpleIfPossible when parsing the data, and each
> data
> > type is checked with DataTypeUtils.isCompatibleDataType prior to actually
> > trying to perform the conversion.
> >
> > The date string doesn't use the user provided format in the call to
> > DataTypeUtils.isCompatibleDataType, but instead uses the default for date
> > types.  The validation ends up failing when it uses the default date
> string
> > (-MM-DD), so it won't use LAZY_DATE_FORMAT as I expected.  Am I
> totally
> > off base, or it this unexpected behavior?
> >
> > Thanks.
> >
> > -Derek
>
-- 
Sent from Gmail Mobile


Re: Implementation of ListFile's Primary Node only in a cluster

2018-02-10 Thread Bryan Bende
Currently it means that the dataflow manager/developer is expected to
set the 'Execution Nodes' strategy to "Primary Node" at the time of
flow design.

We don't have anything that restricts the scheduling strategy of a
processor, but we probably should consider having an annotation like
@PrimaryNodeOnly that you can put on a processor and then the
framework will enforce that it can only be scheduled on primary node.
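
Something like this (purely hypothetical -- no such annotation exists today,
this is just what I would picture):

    import java.lang.annotation.Documented;
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Inherited;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    // Marker the framework could check when applying the execution node
    // setting, forcing "Primary Node" for any processor that carries it.
    @Documented
    @Inherited
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    public @interface PrimaryNodeOnly {
    }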

In the case of ListFile, I think the statement in the documentation is
only partially true...

When "Input Directory Location" is set to local, there should be no
issue with scheduling the processor on all nodes in the cluster, as it
would be listing a local directory and storing state locally.

When "Input Directory Location" is set to remote, it wouldn't make
sense to have all nodes listing the same remote directory and getting
the same results, and also the state is then stored in ZooKeeper under
a ZNode using the processor's UUID, and the processor has the same
UUID on each node so they would be overwriting each other's state in
ZK.

So ListFile probably can't be restricted to primary node only, where
as something like ListHDFS probably could because it is always listing
a remote destination.


On Fri, Feb 9, 2018 at 10:55 PM, Sivaprasanna  wrote:
> I was going through ListFile processor's code and found out that in the
> documentation,
> it is mentioned that "this processor is designed to run on Primary Node
> only in a cluster". I want to understand what "designed" stands for here.
> Does that mean the processor was built in a way that it only runs on the
> Primary node regardless of the "Execution Nodes" strategy set to otherwise
> or does it mean that dataflow manager/developer is expected to set the
> 'Execution Nodes' strategy to "Primary Node" at the time of flow design? If
> it is of the former case, how is it handled in the code? If it is handled,
> it should be in the framework side but I don't see any annotation
> indicating anything related to such a mechanism in the processor code, and
> moreover a related JIRA, NIFI-543, is also open, so I want to
> clear my doubt.
>
> -
> Sivaprasanna


Re: Will you accept contributions in Scala?

2018-02-10 Thread Bryan Bende
I agree more with Andy about sticking with Java. The more varying languages
used, the more challenging it is to maintain. Once the code is part of the
Apache NiFi git repo, it is now the responsibility of the committers and
PMC members to maintain it.

I’d even say I am somewhat against the groovy/Spock test code that Andy
mentioned. I have frequently spent hours trying to fix a Spock test that
broke from something I was working on. Every committer is familiar with
JUnit, but only a couple know Spock. Just using this as an example that
every committer knows Java, but only a couple probably know Scala, Clojure,
etc.

On Sat, Feb 10, 2018 at 10:25 AM Jeff  wrote:

> +1 to Joe's response.  If you can develop a component in Groovy or Scala
> (or Clojure!) more quickly/comfortably, or if allowing components written
> in other languages would encourage people to contribute more, I'm all for
> it.
>
> On Sat, Feb 10, 2018 at 7:42 AM Joe Witt  wrote:
>
> > i personally would be ok with it for an extension/processor provided it
> > integrates well with the build.
> >
> > i would agree with andys view for core framework stuff but for
> extensions i
> > think we can do it like mikethomsen suggested.
> >
> > others?
> >
> > thanks
> > joe
> >
> > On Feb 10, 2018 7:30 AM, "Mike Thomsen"  wrote:
> >
> > > I'm just a community contributor, so take that FWIW, but a compromise
> > might
> > > be to publish the Scala code as separate maven modules to maven central
> > and
> > > then submit a thoroughly tested processor written in Java. As long as
> you
> > > have enough unit and integration tests to give strong coverage, I
> > wouldn't
> > > imagine anyone here would have issues reviewing it. If the tests fail
> > > because of code issues in the external dependencies, the obvious answer
> > is
> > > to just hold the PR until the tests pass.
> > >
> > > On Fri, Feb 9, 2018 at 9:00 AM, Weiss, Adam <
> adam.we...@perkinelmer.com>
> > > wrote:
> > >
> > > > Devs,
> > > >
> > > > I have some interns starting with my team and we use Scala internally
> > for
> > > > our work.
> > > > If I wanted to have them work to contribute some new processors,
> would
> > > > they have to be Java to be included with the base distribution or
> could
> > > > they use Scala as long as it fit within your current build and test
> > > process?
> > > >
> > > > Thanks,
> > > > -Adam
> > > >
> > >
> >
>
-- 
Sent from Gmail Mobile


Re: Question regarding InstanceClassLoader

2018-02-07 Thread Bryan Bende
Hello,

Is there a specific issue/problem you are trying to figure out?

If you are just interested in how it works, the main code to look at
would be in FlowController in the "reload" methods, here is the one
for a processor node:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FlowController.java#L1238-L1275

The "additional urls" parameter to that method would be the calculated
set of URLs from the any property descriptors that have
dynamicallyModifiesClassPath(true).

Those urls are calculated here:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core-api/src/main/java/org/apache/nifi/controller/AbstractConfiguredComponent.java#L127-L151
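
For example, a property that contributes extra resources to the instance
classpath is declared like this (a minimal sketch; the name and validator are
just examples, the relevant part is dynamicallyModifiesClasspath(true)):

    // PropertyDescriptor is org.apache.nifi.components.PropertyDescriptor,
    // StandardValidators is org.apache.nifi.processor.util.StandardValidators
    public static final PropertyDescriptor EXTRA_CLASSPATH_RESOURCES = new PropertyDescriptor.Builder()
            .name("Extra Classpath Resources")
            .description("Comma-separated paths of JARs or directories to add to the "
                    + "classpath of this processor instance.")
            .required(false)
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .dynamicallyModifiesClasspath(true)
            .build();

When such a property is set or changed, the framework reloads the component
with the resulting URLs added to its InstanceClassLoader (the code linked above).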

Thanks,

Bryan


On Wed, Feb 7, 2018 at 12:30 PM, Sivaprasanna  wrote:
> From NiFi 1.1 onward, loading resources onto the classpath dynamically is
> supported. I read the articles related to this on:
>
>-
>
> https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.3/bk_developer-guide/content/per-instance-classloading.html
>- https://bryanbende.com/development/2016/11/24/apache-nifi-class-loading
>
> I understand we can make a property this way by setting
> PropertyDescriptor.dynamicallyModifiesClasspath(true). I want to know how
> this gets communicated to the framework and how the framework handles it. I
> dug more and found that ExtensionManager.createInstanceClassLoader() does
> this job. Correct me if I'm wrong. If that's the case, I want to know how
> the internal call happens, i.e. how setting dynamicallyModifiesClasspath
> triggers the instance class loader.


Re: setting up secure nifi

2018-01-31 Thread Bryan Bende
It’s the same problem, your initial admin should be:

CN=TC, OU=NIFI

Not

CN=TC,OU=NIFI,dc=example,dc=com

The first one is the DN of your client cert, the second one is not.

On Wed, Jan 31, 2018 at 7:23 PM Anil Rai <anilrain...@gmail.com> wrote:

> Hi Bryan,
>
> Thanks for the quick reply. I did follow your steps, but I am seeing the
> same error.
> Now the entry looks like
> CN=TC,OU=NIFI,dc=example,dc=com
>
> Also, what does dc stand for after CN and OU? Is that a problem?
> Is there a blog that talks about installing and making it HTTPS using the
> toolkit? I did not find any good post that covers it end to end, from
> installing to making it secure using the TLS toolkit.
>
> Any help is appreciated.
>
> Thanks
> Anil
>
>
>
> On Wed, Jan 31, 2018 at 6:42 PM, Bryan Bende <bbe...@gmail.com> wrote:
>
> > Hello,
> >
> > The identity in authorizers.xml for your initial admin does not match the
> > identity of your client cert.
> >
> > You should be putting “CN=TC, OU=NIFI” as the initial admin because that
> is
> > the DN of your client cert.
> >
> > You’ll need to stop NiFi, edit authorizers.xml, delete users.xml and
> > authorizations.xml, and start back up.
> >
> > Thanks,
> >
> > Bryan
> >
> > On Wed, Jan 31, 2018 at 6:11 PM Anil Rai <anilrain...@gmail.com> wrote:
> >
> > > All,
> > >
> > > I am trying to install nifi 1.5 and making it https. Below is the steps
> > > followed and the error i am getting. Below is the config and log files
> > > content. Please help
> > >
> > > 1. Installed nifi 1.5
> > > 2. Installed nifi toolkit 1.5
> > > 3. Ran toolkit - ./tls-toolkit.sh standalone -n 'localhost' -C
> > > 'CN=TC,OU=NIFI' -O -o ../security_output
> > > 4. Copied generated keystore, truststore and nifi properties to
> > nifi/config
> > > folder
> > > 5. Imported the generated certificate to chrome browser
> > > 6. Modified authorizers.xml as attached.
> > > 7. With required restarts. Now when i enter the below url in the
> > browser, I
> > > see the below error.
> > >
> > > https://localhost:9443/nifi/
> > >
> > > Insufficient Permissions
> > >
> > >- home
> > >
> > > Unknown user with identity 'CN=TC, OU=NIFI'. Contact the system
> > > administrator.
> > >
> > >
> > > authorizers.xml
> > > 
> > > 
> > > file-user-group-provider
> > > org.apache.nifi.authorization.
> > FileUserGroupProvider
> > > ./conf/users.xml
> > > 
> > >
> > > cn=TC,ou=NIFI,dc=example,dc=com
> > > 
> > >
> > > 
> > > file-access-policy-provider
> > >
> > > org.apache.nifi.authorization.FileAccessPolicyProvider
> > > file-user-group-provider
> > > ./conf/authorizations.xml
> > > cn=TC,ou=NIFI,dc=example,dc=com
> > > 
> > >
> > > 
> > > 
> > > 
> > >
> > > nifi-user.log
> > > ---
> > > 2018-01-31 17:51:20,220 INFO [main] o.a.n.a.FileUserGroupProvider
> > Creating
> > > new users file at
> > > /Users/anilrai/projects/tc/servicemax/nifi-1.5.0/./conf/users.xml
> > > 2018-01-31 17:51:20,234 INFO [main] o.a.n.a.FileUserGroupProvider
> > > Users/Groups file loaded at Wed Jan 31 17:51:20 EST 2018
> > > 2018-01-31 17:51:20,240 INFO [main] o.a.n.a.FileAccessPolicyProvider
> > > Creating new authorizations file at
> > > /Users/anilrai/projects/tc/servicemax/nifi-1.5.0/./conf/
> > authorizations.xml
> > > 2018-01-31 17:51:20,264 INFO [main] o.a.n.a.FileAccessPolicyProvider
> > > Populating authorizations for Initial Admin:
> > > cn=TC,ou=NIFI,dc=example,dc=com
> > > 2018-01-31 17:51:20,271 INFO [main] o.a.n.a.FileAccessPolicyProvider
> > > Authorizations file loaded at Wed Jan 31 17:51:20 EST 2018
> > > 2018-01-31 17:52:18,192 INFO [NiFi Web Server-28]
> > > o.a.n.w.a.c.IllegalStateExceptionMapper
> java.lang.IllegalStateException:
> > > Kerberos ticket login not supported by this NiFi.. Returning Conflict
> > > response.
> > > 2018-01-31 17:52:18,306 INFO [NiFi Web Server-67]
> > > o.a.n.w.a.c.IllegalStateExceptionMapper
> java.lang.IllegalStateException:
> OpenId Connect is not configured.. Returning Conflict response.

Re: setting up secure nifi

2018-01-31 Thread Bryan Bende
Hello,

The identity in authorizers.xml for your initial admin does not match the
identity of your client cert.

You should be putting “CN=TC, OU=NIFI” as the initial admin because that is
the DN of your client cert.

You’ll need to stop NiFi, edit authorizers.xml, delete users.xml and
authorizations.xml, and start back up.
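
In other words, both entries in authorizers.xml should read, for example:

    <property name="Initial User Identity 1">CN=TC, OU=NIFI</property>
    ...
    <property name="Initial Admin Identity">CN=TC, OU=NIFI</property>

(the value has to match the client cert's DN exactly, including the space
after the comma).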

Thanks,

Bryan

On Wed, Jan 31, 2018 at 6:11 PM Anil Rai  wrote:

> All,
>
> I am trying to install nifi 1.5 and make it https. Below are the steps I
> followed and the error I am getting, followed by the config and log file
> content. Please help
>
> 1. Installed nifi 1.5
> 2. Installed nifi toolkit 1.5
> 3. Ran toolkit - ./tls-toolkit.sh standalone -n 'localhost' -C
> 'CN=TC,OU=NIFI' -O -o ../security_output
> 4. Copied generated keystore, truststore and nifi properties to nifi/config
> folder
> 5. Imported the generated certificate to chrome browser
> 6. Modified authorizers.xml as attached.
> 7. With required restarts. Now when i enter the below url in the browser, I
> see the below error.
>
> https://localhost:9443/nifi/
>
> Insufficient Permissions
>
>- home
>
> Unknown user with identity 'CN=TC, OU=NIFI'. Contact the system
> administrator.
>
>
> authorizers.xml
> 
> 
> file-user-group-provider
> org.apache.nifi.authorization.FileUserGroupProvider
> ./conf/users.xml
> 
>
> cn=TC,ou=NIFI,dc=example,dc=com
> 
>
> 
> file-access-policy-provider
>
> org.apache.nifi.authorization.FileAccessPolicyProvider
> file-user-group-provider
> ./conf/authorizations.xml
> cn=TC,ou=NIFI,dc=example,dc=com
> 
>
> 
> 
> 
>
> nifi-user.log
> ---
> 2018-01-31 17:51:20,220 INFO [main] o.a.n.a.FileUserGroupProvider Creating
> new users file at
> /Users/anilrai/projects/tc/servicemax/nifi-1.5.0/./conf/users.xml
> 2018-01-31 17:51:20,234 INFO [main] o.a.n.a.FileUserGroupProvider
> Users/Groups file loaded at Wed Jan 31 17:51:20 EST 2018
> 2018-01-31 17:51:20,240 INFO [main] o.a.n.a.FileAccessPolicyProvider
> Creating new authorizations file at
> /Users/anilrai/projects/tc/servicemax/nifi-1.5.0/./conf/authorizations.xml
> 2018-01-31 17:51:20,264 INFO [main] o.a.n.a.FileAccessPolicyProvider
> Populating authorizations for Initial Admin:
> cn=TC,ou=NIFI,dc=example,dc=com
> 2018-01-31 17:51:20,271 INFO [main] o.a.n.a.FileAccessPolicyProvider
> Authorizations file loaded at Wed Jan 31 17:51:20 EST 2018
> 2018-01-31 17:52:18,192 INFO [NiFi Web Server-28]
> o.a.n.w.a.c.IllegalStateExceptionMapper java.lang.IllegalStateException:
> Kerberos ticket login not supported by this NiFi.. Returning Conflict
> response.
> 2018-01-31 17:52:18,306 INFO [NiFi Web Server-67]
> o.a.n.w.a.c.IllegalStateExceptionMapper java.lang.IllegalStateException:
> OpenId Connect is not configured.. Returning Conflict response.
> 2018-01-31 17:52:18,350 INFO [NiFi Web Server-27]
> o.a.n.w.s.NiFiAuthenticationFilter Attempting request for (CN=TC, OU=NIFI)
> GET https://localhost:9443/nifi-api/flow/current-user (source ip:
> 127.0.0.1)
> 2018-01-31 17:52:18,354 INFO [NiFi Web Server-27]
> o.a.n.w.s.NiFiAuthenticationFilter Authentication success for CN=TC,
> OU=NIFI
> 2018-01-31 17:52:18,424 INFO [NiFi Web Server-27]
> o.a.n.w.a.c.AccessDeniedExceptionMapper identity[CN=TC, OU=NIFI], groups[]
> does not have permission to access the requested resource. Unknown user
> with identity 'CN=TC, OU=NIFI'. Returning Forbidden response.
> --
>
> Generated users.xml
> 
> 
> 
> 
> 
>  identity="cn=TC,ou=NIFI,dc=example,dc=com"/>
> 
> 
> 
>
> Generated authorizations.xml
> --
> 
> 
> 
>  resource="/flow" action="R">
> 
> 
>  resource="/data/process-groups/4dedb986-0161-1000-0db6-e28e0a2db61d"
> action="R">
> 
> 
>  resource="/data/process-groups/4dedb986-0161-1000-0db6-e28e0a2db61d"
> action="W">
> 
> 
>  resource="/process-groups/4dedb986-0161-1000-0db6-e28e0a2db61d" action="R">
> 
> 
>  resource="/process-groups/4dedb986-0161-1000-0db6-e28e0a2db61d" action="W">
> 
> 
>  resource="/restricted-components" action="W">
> 
> 
>  resource="/tenants" action="R">
> 
> 
>  resource="/tenants" action="W">
> 
> 
>  resource="/policies" action="R">
> 
> 
>  resource="/policies" action="W">
> 
> 
>  resource="/controller" action="R">
> 
> 
>  resource="/controller" action="W">
> 
> 
> 
> 
> 
>
> nifi.properties
> 
> # web properties #
> 

Re: [DISCUSS] Addressing Lingering Pull Requests

2018-01-29 Thread Bryan Bende
I definitely agree with all of these points.

With our current setup, the only way a committer can close a PR is by
issuing a commit with the magic "This closes ..." clause.  The
submitter of the PR is the only one who can actually close it in
GitHub.

I don't want to hijack the discussion with a different topic, but it
might be worth considering switching to the ASF's GitBox integration
[1], which I believe lets us use Github as a real repository, rather
than just a mirror.

It seems like it would make it easier to manage the PRs in the event
that we did implement a policy like Mark and Joe described.

[1] https://gitbox.apache.org/repos/asf

On Mon, Jan 29, 2018 at 11:34 AM, Joe Witt  wrote:
> Mark
>
> Thanks for brining this up.  I do agree.
>
> We need to probably provide more description on the contributor guide
> or elsewhere of which aspects makes PRs easier to commit:
>  - They have unit tests which cover core capabilities but if they're
> cloud service dependent or highly network/disk oriented they have
> integration tests instead of unit tests for the high risk or
> environmentally sensitive bits.
>  - They have *thoroughly* reviewed and covered License and Notice
> updates, and these are done consistently with the L&N of the rest of the
> project.
>  - They pass all checks on Travis-CI
>  - If they require manual integration tests, detailed
> instructions/explanations of external system setup, configuration,
> and test processes are provided.
>
> And maybe some explanation of which items are very difficult to get
> good reviewer help on:
> - Things which integrate with external systems that are not easily
> replicated for testing.  Consider whiz-bang database StoreIt.  If we
> dont have others aware of or famiilar with the StoreIt system it is
> really tough to find a good reviewer and timely response.
>
> We also need to revisit this as we progress an extension registry mechanism.
>
> Thanks
>
> On Mon, Jan 29, 2018 at 11:29 AM, Mark Payne  wrote:
>> All,
>>
>> We do from time to time go through the backlog of PR's that need to be 
>> reviewed and
>> start a "cleansing" process, closing out any old PR's that appear to have 
>> stalled out.
>> When we do this, though, we typically will start sending out e-mails asking 
>> if there are
>> any stalled PR's that we shouldn't close and start trying to decipher which 
>> ones are okay
>> to close out and which ones are not. This puts quite an onus on the 
>> committer who is
>> trying to clean this up. It also can result in having a large number of 
>> outstanding Pull Requests,
>> which I believe makes the community look bad because it gives the appearance 
>> that we are
>> not doing a good job of being responsive to Pull Requests that are submitted.
>>
>> I would like to propose that we set a new "standard" that is: if we have any 
>> Pull Request
>> that has been stalled (and by "stalled" I mean a committer has reviewed the 
>> PR and did
>> not merge but asked for clarifications or modifications and the contributor 
>> has not pushed
>> any new commit or responded to the comments) for at least 30 days, that we 
>> go ahead
>> and close the Pull Request (after commenting on the PR that it is being 
>> closed due to a lack
>> of activity and that the contributor is more than welcome to open a new PR 
>> if necessary).
>>
>> I feel like this gives contributors enough time to address concerns and it 
>> is simple enough
>> to create a new Pull Request if the need arises. Alternatively, if the 
>> contributor realizes that
>> they need more time, they can simply comment on the PR that they are still 
>> interested in
>> working on it but just need more time, and the simple act of commenting will 
>> mean that the
>> PR is no longer stalled, as defined above.
>>
>> Any thoughts on such a proposal? Any better alternatives that people have in 
>> mind?
>>
>> Thanks
>> -Mark


Re: putTCP odd behavior with EL

2018-01-26 Thread Bryan Bende
In the default case, "Connection Per Flow File" is false, which means
there is one connection created and used across many flow files, which
will perform best.

Setting "Connection Per Flow File" to true, means it will close the
connection at the end of every on trigger call.

We could potentially evaluate the expression against flow files when "Connection Per
Flow File" is true, because then we can create a new connection every time, but
when it is false it would mean potentially having to keep open many
connections.

It also may be confusing though to have differing evaluation based on
another property.

On Fri, Jan 26, 2018 at 12:16 PM, Ryan Ward  wrote:
> Thanks, that's a good idea, as I would automatically assume if a property
> supports EL it would work against flowfiles. I see the ticket 3231 was
> specific to variable registry and explicitly says not flow files.  Any
> particular reason?
>
> On Fri, Jan 26, 2018 at 12:02 PM, Pierre Villard <
> pierre.villard...@gmail.com> wrote:
>
>> And to add a bit of info on the last remark from Joe, I'm working on
>> NIFI-4149 to make the scope of EL clearer to users. Didn't have time to
>> work on it lately but will definitely try to get back to it very soon.
>>
>> Pierre
>>
>> 2018-01-26 17:14 GMT+01:00 Joe Witt :
>>
>> > Stated differently - it considers what it can glean from variables
>> > made available other than the flowfile itself.  With Apache NiFi 1.5.0
>> > that means process group variables, variables set in nifi.properites,
>> > and environment variables.
>> >
>> > We need to make sure that we call this out when we indicate a given
>> > property supports expression language so the user knows the scope at
>> > which it will help them.
>> >
>> > On Fri, Jan 26, 2018 at 10:44 AM, Marco Gaido 
>> > wrote:
>> > > Hi Ryan,
>> > >
>> > > probably the reason of the behavior is that EL on PutTCP is enabled but
>> > it
>> > > is not run on the incoming flowfile. So it doesn't care the attributes
>> of
>> > > your flowfile. It considers only environment variables.
>> > >
>> > > Thanks,
>> > > Marco
>> > >
>> > > 2018-01-26 15:56 GMT+01:00 Ryan Ward :
>> > >
>> > >> I'm seeing odd behavior trying to use attributes for the hostname and
>> > port
>> > >> fields.
>> > >>
>> > >> using ${endpoint_port} (9003) + hardcoded IP results in flowfile
>> > yielding
>> > >> failed to process session due to java.lang.NumberFormatException: For
>> > >> input
>> > >> string: "": {}
>> > >> java.lang.NumberFormatException: For input string: ""
>> > >>   at
>> > >> java.lang.NumberFormatException.forInputString(
>> > >> NumberFormatException.java:65)
>> > >>   at java.lang.Integer.parseInt(Integer.java:592)
>> > >>   at java.lang.Integer.parseInt(Integer.java:615)
>> > >>   at
>> > >> org.apache.nifi.attribute.expression.language.StandardPropertyValue.
>> > >> asInteger(StandardPropertyValue.java:78)
>> > >>
>> > >>   at org.apache.nifi.processors.standard.PutTCP.createSender
>> > >> (PutTCP.java:111)
>> > >>   ...
>> > >>   at org.apache.nifi.processors.standard.PutTCP.createSender
>> > >> (PutTCP.java:179)
>> > >>
>> > >> using hardcoded 9003 + ${endpoint} results in flowfile failing due to
>> > >> connection refused
>> > >> DEBUG ...No available connections, creating a new one...
>> > >> ERROR ...No available connections, and unable to create a new one
>> to
>> > >> failure: java.net.ConnectException: Connection refused
>> > >>
>> > >> Adding listenTCP to the cluster on 9003 leaving ${endpoint} and
>> > hardcoded
>> > >> port
>> > >> DEBUG...Connected to local port 23250
>> > >> DEBUGRelinquishing sender
>> > >> Flow files transferred to success, its unclear where the data went or
>> > why I
>> > >> needed to have the nodes listening on this port. Is the attribute
>> value
>> > >> being ignored and defaulting to localhost? Watching this behavior via
>> > >> netstat I could see 127.0.0.1 was indeed connected to itself on 9003.
>> > Odd
>> > >> thing is no data came in on the ListenTCP either.
>> > >>
>> > >> Thanks,
>> > >> Ryan
>> > >>
>> >
>>


Re: RedisConnectionPoolService Makes NiFi Unresponsive

2018-01-24 Thread Bryan Bende
Hello,

Can you take a couple of thread dumps while this is happening and provide
them so we can take a look?

You can put a file name as the argument to nifi.sh dump to have it written
to a file.
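
For example, something like:

    ./bin/nifi.sh dump ./thread-dump-1.txt

run two or three times, roughly 30 seconds apart.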

Thanks,

Bryan

On Wed, Jan 24, 2018 at 6:48 AM we are  wrote:

> Hi,
>
> Recently we switched the server we run nifi on from a 24 core server to a 4
> core one, and since then approximately 4 times a day nifi stops responding
> until it is restarted. Then we switched to an 8-core server, and now it
> happens approximately every 2 days.
>
> When this happens, the UI becomes unresponsive, as well as the rest api.
> The number of nifi active threads metric returns 0 active threads, and the
> cpu is at 100% idle. There is no large spike in flowfiles, memory or cpu
> usage before nifi stops responding. But, when we checked the provenance
> repo we saw that events were getting created. The logs only show that
> events are being created, there are no errors or warnings. By looking into
> the content of the events we were able to determine that events were
> flowing up until a processor using the RedisConnectionPoolService.
>
> We tried to connect with the debugger to different processors and all of
> them, except 4, responded and the debugger connected successfully.
> The other 4 are using the RedisConnectionPoolService, and they didn't
> respond. 2 of these processors are custom ones we wrote, the other 2 are
> the built in wait-notify mechanism. When we tried to connect to the
> RedisConnectionPoolService the debugger wasn't able to connect to it as
> well. The redis service that the connection pool is connected to responds
> to us normally.
>
> We tried to look at the active threads using /opt/nifi/bin/nifi.sh dump,
> but we did not see anything strange.
>
> When we tried to dig into the problem we noticed that nifi uses an old
> version of spring-data-redis. We don't know if this is the problem but we
> opened an issue for this: https://issues.apache.org/jira/browse/NIFI-4811
>
> The maximum timer driven thread count is the default (10). Our custom
> processors are configured to a maximum of 10 concurrent tasks, and the
> wait/notify processors are configured to 5. The RedisConnectionPoolService
> is configured with the default values:
> Max Total: 20
> Max Idle: 8
> Min Idle: 0
> Block When Exhausted: true
> Max Evictable Idle Time: 60 seconds
> Time Between Eviction Runs: 30 seconds
> Num Tests Per Eviction Run: -1
>
> We made sure to always call connection.close() in our custom made
> processors.
> Is it possible that somehow connections are not released or evicted, and
> that is why nifi freezes like this? How can we determine that this is the
> case?
>
> Thanks!
> Daniel
>
-- 
Sent from Gmail Mobile


Re: [VOTE] Release Apache NiFi MiNiFi 0.4.0 RC2

2018-01-19 Thread Bryan Bende
+1 binding

Ran through everything in the release helper and looked good, thanks!

On Fri, Jan 19, 2018 at 3:03 PM, Matt Gilman  wrote:
> +1 Release this package as minifi-0.4.0
>
> Verified hashes, signature, build, etc. Ran sample flows and everything
> looks good.
>
> Thanks for RMing Aldrin!
>
> Matt
>
> On Fri, Jan 19, 2018 at 3:24 AM, Koji Kawamura 
> wrote:
>
>> Verified the hash, built on Windows Server 2016, Java 1.8.0_144
>> successfully.
>> Confirmed that the basic template transform capability with NiFi 1.5.0
>> including RemoteProcessGroup.
>> The getting-started example flow, tailing a file at MiNiFi (0.4.0 RC2)
>> and sending it via S2S to NiFi (1.5.0) worked as expected.
>> https://nifi.apache.org/minifi/getting-started.html
>> Confirmed that the config.yml created by the toolkit has the correct
>> ID for the input port.
>>
>> +1 (binding) Release this package as minifi-0.4.0
>>
>> On Fri, Jan 19, 2018 at 5:55 AM, Aldrin Piri  wrote:
>> > Hello,
>> >
>> > I am pleased to call this vote for the source release of Apache NiFi
>> MiNiFi nifi-minifi-0.4.0.
>> >
>> > The source zip, including signatures, digests, etc. can be found at:
>> > https://repository.apache.org/content/repositories/orgapachenifi-1119
>> >
>> > The Git tag is nifi-minifi-0.4.0-RC2
>> > The Git commit ID is 5e19068e2d35dd9b5d5f45e75efb0414d84d226b
>> > https://git-wip-us.apache.org/repos/asf?p=nifi-minifi.git;a=commit;h=
>> 5e19068e2d35dd9b5d5f45e75efb0414d84d226b
>> > https://github.com/apache/nifi-minifi/commit/
>> 5e19068e2d35dd9b5d5f45e75efb0414d84d226b
>> >
>> > Checksums of nifi-minifi-0.4.0-source-release.zip:
>> > MD5: 237bb2f3fe26ff6451273c3c39579bf6
>> > SHA1: 2d53dd08b55a2799da97008bf971657546c0a752
>> > SHA256: b0ee425f7c214d423f22b75aa2006dcdabf407cd29b0bd2c4f8a8ea3a3ec4b18
>> >
>> > Release artifacts are signed with the following key:
>> > https://people.apache.org/keys/committer/aldrin.asc
>> >
>> > KEYS file available here:
>> > https://dist.apache.org/repos/dist/release/nifi/KEYS
>> >
>> > 5 issues were closed/resolved for this release:
>> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>> projectId=12319921=12342439
>> >
>> > Release note highlights can be found here:
>> > https://cwiki.apache.org/confluence/display/MINIFI/
>> Release+Notes#ReleaseNotes-Version0.4.0
>> >
>> > The vote will be open until 4:00PM EST, 22 January 2018 [1].
>> >
>> > Please download the release candidate and evaluate the necessary items
>> including checking hashes, signatures, build
>> > from source, and test.  Then please vote:
>> >
>> > [ ] +1 Release this package as minifi-0.4.0
>> > [ ] +0 no opinion
>> > [ ] -1 Do not release this package because...
>> >
>> > Thanks!
>> >
>> > [1] You can determine this time for your local time zone at
>> https://s.apache.org/minifi-0.4.0-rc2-close
>>


Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-16 Thread Bryan Bende
>>> > > >>>>
>>> > > >>>>> thanks tony!
>>> > > >>>>>
>>> > > >>>>>> On Jan 12, 2018 10:48 PM, "Tony Kurc" <trk...@gmail.com>
>>> wrote:
>>> > > >>>>>>
>>> > > >>>>>> I put some of the data I was working with on the wiki -
>>> > > >>>>>>
>>> > > >>>>>> https://cwiki.apache.org/confluence/display/NIFI/NiFi+
>>> > > >> 1.5.0+nar+files
>>> > > >>>>>>
>>> > > >>>>>> On Fri, Jan 12, 2018 at 10:28 PM, Jeremy Dyer <
>>> jdy...@gmail.com
>>> > > >>>> wrote:
>>> > > >>>>>>
>>> > > >>>>>>> So my favorite option is Bryan’s option number “three” of
>>> using
>>> > > >> the
>>> > > >>>>>>> extension registry. Now my thought is do we really need to add
>>> > > >>>>> complexity
>>> > > >>>>>>> and do anything in the mean time or just focus on that?
>>> Meaning
>>> > > >> we
>>> > > >>>> have
>>> > > >>>>>>> roughly 500mb of available capacity today so why don’t we
>>> spend
>>> > > >> those
>>> > > >>>>> man
>>> > > >>>>>>> hours we would spend on getting the second repo up on the
>>> > > >> extension
>>> > > >>>>>>> registry instead?
>>> > > >>>>>>>
>>> > > >>>>>>> @Bryan do you have thoughts about the deployment of those nars
>>> > > >> in the
>>> > > >>>>>>> extension registry? Since we won’t be able to build the
>>> release
>>> > > >>>> binary
>>> > > >>>>>>> anymore would we still need to create separate repos for the
>>> > > >> nars or
>>> > > >>>>>> no?? I
>>> > > >>>>>>> have used the registry a little but I’m not 100% sure on your
>>> > > >> vision
>>> > > >>>>> for
>>> > > >>>>>>> the nars
>>> > > >>>>>>>
>>> > > >>>>>>> - Jeremy Dyer
>>> > > >>>>>>>
>>> > > >>>>>>> Sent from my iPhone
>>> > > >>>>>>>
>>> > > >>>>>>>> On Jan 12, 2018, at 10:18 PM, Tony Kurc <tk...@apache.org>
>>> > > >> wrote:
>>> > > >>>>>>>>
>>> > > >>>>>>>> I was looking at nar sizes, and thought some data may be
>>> > > >> helpful. I
>>> > > >>>>>> used
>>> > > >>>>>>> my recent RC1 verification as a basis for getting file sizes,
>>> and
>>> > > >>>> just
>>> > > >>>>>> got
>>> > > >>>>>>> the file size for each file in the assembly named "*.nar". I
>>> > > >> don't
>>> > > >>>> know
>>> > > >>>>>>> whether the images I pasted in will go through, but I made
>>> some
>>> > > >>>>> graphs.
>>> > > >>>>>>> The first is a histogram of nar file size in buckets of 10MB.
>>> The
>>> > > >>>>> second
>>> > > >>>>>>> basically is similar to a cumulative distribution, the x axis
>>> is
>>> > > >> the
>>> > > >>>>>> "rank"
>>> > > >>>>>>> of the nar (smallest to largest), and the y-axis is how what
>>> > > >> fraction
>>> > > >>>>> of
>>> > > >>>>>>> the all the sizes of the nars together are that rank or
>>> lower. In
>>> > > >>>> other
>>> > > >>>>>>> words, on the graph, the dot at 60 and ~27 means that the
>>> > > >> smallest 60

Re: how to declare dependency?

2018-01-16 Thread Bryan Bende
Anything that is intended to be re-used across NARs should be packaged
into a utility module that each NAR could include.

For example, all of the modules under nifi-extensions-utils [1]
contain re-usable code that NARs can share.

This case is different because it is not a service API, it is just
reusable code that will be included in any given NAR using a standard
compile dependency.

[1] 
https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-extension-utils
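
For example, the layout could look something like this (module names here are
just made up for illustration):

nifi-foo-bundle/
  nifi-foo-common/          <- plain jar containing the shared code
  nifi-foo-processors/      <- normal compile-scope dependency on nifi-foo-common
  nifi-foo-nar/
  nifi-bar-processors/      <- normal compile-scope dependency on nifi-foo-common
  nifi-bar-nar/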

On Tue, Jan 16, 2018 at 6:50 AM, Martin Mucha <alfon...@gmail.com> wrote:
> To extend this question a little bit further. This is an imaginary problem,
> but I feel the answer to it will help me in the future.
>
> Let's say that I need to have a dependency on nifi-record or some module
> providing an interface. I cannot add it as a compile scope dependency, since
> otherwise existing implementors of it won't work, as they exist in
> another classloader only, and no one implements the interface bundled in my nar.
> That's clear. I have to add a dependency on the nar in the nar module, and a
> provided scope dependency in the module with the processor. Easy.
>
> Now I want to reuse another thing from a different nar. Oops, that does not work;
> I cannot add 2 nar dependencies. What can one do about that? I assume there
> isn't a (severe) reusability limitation that one cannot reuse classes from
> more than one nar, right?
>
> Mar.
>
> 2018-01-15 22:50 GMT+01:00 Martin Mucha <alfon...@gmail.com>:
>
>> Thanks guys!
>>
>> Just to sum our case up for potential future readers. We did try to define
>> such libraries as 'compile' scope, but (probably due to nar packaging) we
>> were now providing those 'dependencies' in our nar. Not what we wanted.
>> Example: there was one module missing, which contained only an interface.
>> Marking it as compile fixed the errors, but then we weren't able to use
>> implementors of that interface, because now this interface was issued by
>> our nar, and obviously no one implemented that one.
>>
>> Instead, when we declared dependency on nifi-standard-services-api-nar,
>> provided scope worked ok, and everything else as well.
>>
>> Thanks for your help!
>> Martin.
>>
>> 2018-01-12 16:29 GMT+01:00 Bryan Bende <bbe...@gmail.com>:
>>
>>> In addition to what Matt said, the reason nifi-record is marked as
>>> provided is because it is part of nifi-standard-services-api-nar, and
>>> if your NAR was going to do anything with a record reader/writer you
>>> would have a NAR dependency on nifi-standard-services-api-nar so at
>>> runtime that is where your NAR would get nifi-record from. At build
>>> time it gets it from the provided dependency in order to compile.
>>>
>>> On Fri, Jan 12, 2018 at 10:16 AM, Matt Gilman <matt.c.gil...@gmail.com>
>>> wrote:
>>> > Mar,
>>> >
>>> > By using a dependency like that (without the version), it must be
>>> declared
>>> > in dependencyManagement someplace. If the jar isn't being pulled into
>>> the
>>> > resulting artifact it's likely because the dependency has a scope of
>>> > provided. You can override that scope to compile when you reference it.
>>> >
>>> > Matt
>>> >
>>> > On Fri, Jan 12, 2018 at 8:34 AM, Martin Mucha <alfon...@gmail.com>
>>> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> in nifi-standard-processors there is this dependency
>>> >>
>>> >> 
>>> >> org.apache.nifi
>>> >> nifi-record
>>> >> 
>>> >>
>>> >> if I just copy paste into our project some class from this bundle (to
>>> fix
>>> >> bugs) and add dependencies as mentioned one, it builds. However nifi
>>> wont
>>> >> start, because nifi-record related classes are not on classpath.
>>> >>
>>> >> What shall I do to get them on classpath? For nifi-standard-processors
>>> it's
>>> >> fine to have nifi-record as provided, but apparently for our project,
>>> >> extending the same parent, it's not. nifi-record is not provided or
>>> part of
>>> >> built nar. How to fix this?
>>> >>
>>> >> thanks,
>>> >> Mar.
>>> >>
>>>
>>
>>


Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-12 Thread Bryan Bende
Long term I'd like to see the extension registry take form and have
that be the solution (#3).

In the more near term, we could separate all of the NARs, except for
framework and maybe standard processors & services, into a separate
git repo.

In that new git repo we could organize things like Joe N just
described according to some kind of functional grouping. Each of these
functional bundles could produce its own tar/zip which we can make
available for download.

That would separate the release cycles between core NiFi and the other
NARs, and also avoid having any single binary artifact that gets too
large.



On Fri, Jan 12, 2018 at 3:43 PM, Joseph Niemiec  wrote:
> just a random thought.
>
> Drop In Lib packs... All the Hadoop ones in one package for example that
> can be added to a slim Nifi install. Another may be for Cloud, or Database
> Interactions, Integration (JMS, FTP, etc) of course defining these groups
> would be the tricky part... Or perhaps some type of installer which allows
> you to elect which packages to download to add to the slim install?
>
>
> On Fri, Jan 12, 2018 at 3:10 PM, Joe Witt  wrote:
>
>> Team,
>>
>> The NiFi convenience binary (tar.gz/zip) size has grown to 1.1GB now
>> in the latest release.  Apache infra expanded it to 1.6GB allowance
>> for us but has stated this is the last time.
>> https://issues.apache.org/jira/browse/INFRA-15816
>>
>> We need consider:
>> 1) removing old nars/less commonly used nars/or particularly massive
>> nars from the assembly we distribute by default.  Folks can still use
>> these things if they want just not from our convenience binary
>> 2) collapsing nars with highly repeating deps
>> 3) Getting the extension registry baked into the Flow Registry then
>> moving to separate releases for extension bundles.  The main release
>> then would be just the NiFi framework.
>>
>> Any other ideas ?
>>
>> I'll plan to start identifying candiates for removal soon.
>>
>> Thanks
>> Joe
>>
>
>
>
> --
> Joseph


Re: how to declare dependency?

2018-01-12 Thread Bryan Bende
In addition to what Matt said, the reason nifi-record is marked as
provided is because it is part of nifi-standard-services-api-nar, and
if your NAR was going to do anything with a record reader/writer you
would have a NAR dependency on nifi-standard-services-api-nar so at
runtime that is where your NAR would get nifi-record from. At build
time it gets it from the provided dependency in order to compile.

On Fri, Jan 12, 2018 at 10:16 AM, Matt Gilman  wrote:
> Mar,
>
> By using a dependency like that (without the version), it must be declared
> in dependencyManagement someplace. If the jar isn't being pulled into the
> resulting artifact it's likely because the dependency has a scope of
> provided. You can override that scope to compile when you reference it.
>
> Matt
>
> On Fri, Jan 12, 2018 at 8:34 AM, Martin Mucha  wrote:
>
>> Hi,
>>
>> in nifi-standard-processors there is this dependency
>>
>> <dependency>
>>     <groupId>org.apache.nifi</groupId>
>>     <artifactId>nifi-record</artifactId>
>> </dependency>
>>
>> if I just copy paste into our project some class from this bundle (to fix
>> bugs) and add dependencies as mentioned one, it builds. However nifi wont
>> start, because nifi-record related classes are not on classpath.
>>
>> What shall I do to get them on classpath? For nifi-standard-processors it's
>> fine to have nifi-record as provided, but apparently for our project,
>> extending the same parent, it's not. nifi-record is not provided or part of
>> built nar. How to fix this?
>>
>> thanks,
>> Mar.
>>


Re: [VOTE] Release Apache NiFi 1.5.0 (RC1)

2018-01-10 Thread Bryan Bende
+1 (binding)

- Ran through release helper with no issues
- Ran into a minor issue related to component versioning when using
the registry and created this JIRA [1], would be more of an issue for
next release

[1] https://issues.apache.org/jira/browse/NIFI-4763


On Wed, Jan 10, 2018 at 10:05 AM, Matt Burgess  wrote:
> +1 (binding), ran through release guide with no issues (verified sigs
> & sums), ran various flows using record processors, the new Jackson
> CSV parser, provenance reporting task with new filtering capability
> and output fields, etc.
>
> On Tue, Jan 9, 2018 at 5:19 AM, Joe Witt  wrote:
>> Hello,
>>
>> I am pleased to be calling this vote for the source release of Apache
>> NiFi nifi-1.5.0.
>>
>> The source zip, including signatures, digests, etc. can be found at:
>> https://repository.apache.org/content/repositories/orgapachenifi-1116
>>
>> The Git tag is nifi-1.5.0-RC1
>> The Git commit ID is 46d30c7e92f0ad034d9b35bf1d05c350ab5547ed
>> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=46d30c7e92f0ad034d9b35bf1d05c350ab5547ed
>>
>> Checksums of nifi-1.5.0-source-release.zip:
>> MD5: 046f2dde4af592dd8c05e55c2bbb3c4f
>> SHA1: 63b9a68b9f89200fd31f5561956a15b45b1b9c8c
>> SHA256: 40b155c4911414907835f2eb0d5a4da798935f27f1e5134218d904fe6c942d13
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/joewitt.asc
>>
>> KEYS file available here:
>> https://dist.apache.org/repos/dist/release/nifi/KEYS
>>
>> 195 issues were closed/resolved for this release:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020=12341668
>>
>> Release note highlights can be found here:
>> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.5.0
>>
>> The vote will be open for 72 hours.
>> Please download the release candidate and evaluate the necessary items
>> including checking hashes, signatures, build
>> from source, and test.  Then please vote:
>>
>> [ ] +1 Release this package as nifi-1.5.0
>> [ ] +0 no opinion
>> [ ] -1 Do not release this package because...


[ANNOUNCE] New Apache NiFi Committer Kevin Doran

2018-01-03 Thread Bryan Bende
On behalf of the Apache NiFi PMC, I am very pleased to announce that
Kevin Doran has accepted the PMC's invitation to become a committer on
the Apache NiFi project. We greatly appreciate all of Kevin's hard
work and generous contributions to the project. We look forward to his
continued involvement in the project.

Kevin has made significant contributions to the NiFi Registry project,
especially around setting up the security model & REST APIs, and was a
major contributor to getting the first release out. You also may have
interacted with him on the mailing lists already, as he is always
willing to dig into questions/issues and help out.

Welcome and congratulations!


[RESULT][VOTE] Release Apache NiFi Registry 0.1.0

2018-01-01 Thread Bryan Bende
Apache NiFi Community,

I am pleased to announce that the 0.1.0 release of Apache NiFi
Registry passes with:

12 +1 (binding) votes
  4 +1 (non-binding) votes
  0 0 votes
  0 -1 votes

Thanks to all who helped make this release possible.

Here is the PMC vote thread:
https://lists.apache.org/thread.html/c94d32297f1421929405a3c68b45fef24daaee17b266facd76153cc3@%3Cdev.nifi.apache.org%3E


Re: [VOTE] Release Apache NiFi Registry 0.1.0

2018-01-01 Thread Bryan Bende
+1 (binding)

On Mon, Jan 1, 2018 at 2:20 PM, Jeff <jtsw...@gmail.com> wrote:
> +1 (binding)
>
> - Ran through the helper guide
> - Ran through the initial testing steps in the helper guide
> - Ran through several use case scenarios, everything worked as expected,
> with a minor exception described below.
>
> NiFi Registry looks great!  I think this provides a huge quality of life
> improvement for NiFi and its users, and this is only the initial release!
> Amazing!
>
> One thing I did notice, when working with two process groups referencing
> the same versioned flow.  If one of these process groups is modified (for
> example, a processor was added) and the changes are committed, the second
> process group will report that it's up to date, though it doesn't have
> any of the changes that were committed.  It appears there's a period of
> time between the commit of the changes and NiFi's retrieval/integration of
> the updates from the registry. Refreshing the NiFi UI during this time
> period will still show both process groups as up to date, though the
> process group with the second instance of that versioned flow will still
> not have the changes.  Some moments later, the second process group's icon
> will report that there's a new version.  Updating that second process group
> to the new version works as expected.
>
> This is minor and doesn't affect NiFi Registry, but I thought I'd mention
> it here.  I'll post this on the NiFi PR [1] as well.
>
> [1] https://github.com/apache/nifi/pull/2219.
>
> On Mon, Jan 1, 2018 at 11:30 AM Andrew Lim <andrewlim.apa...@gmail.com>
> wrote:
>
>> +1 (non-binding)
>>
>> -Ran full clean install on OS X (10.11.6)
>> -Tested secure and unsecure Registry instances: creating/editing users,
>> buckets, policies, special privileges; performed similar testing on group
>> level
>> -Reviewed documentation
>> -Performed basic testing in NiFi: adding registry clients, putting a
>> process group under version control, saving different versions, changing
>> versions, importing a version, stopping version control
>>
>> Awesome initial release for this project!
>>
>> Drew
>>
>>
>> > On Dec 28, 2017, at 1:09 PM, Bryan Bende <bbe...@apache.org> wrote:
>> >
>> > Hello,
>> >
>> > I am pleased to be calling this vote for the source release of Apache
>> > NiFi Registry 0.1.0.
>> >
>> > The source zip, including signatures, digests, etc. can be found at:
>> > https://repository.apache.org/content/repositories/orgapachenifi-1115/
>> >
>> > The Git tag is nifi-registry-0.1.0-RC1
>> > The Git commit ID is 81b99e7b04491eabb72ddf30754053ca12d0fcca
>> >
>> https://git-wip-us.apache.org/repos/asf?p=nifi-registry.git;a=commit;h=81b99e7b04491eabb72ddf30754053ca12d0fcca
>> >
>> > Checksums of nifi-registry-0.1.0-source-release.zip:
>> > MD5: 56244c3c296cdc9c3fcc6d22590b80d1
>> > SHA1: 6354e91f868f40d6656ec2467bde307260ad63ca
>> > SHA256: 2c680e441e6c4bfa2381bf004e9b19a6a79401a6a83e04597d0a714a95efd301
>> >
>> > Release artifacts are signed with the following key:
>> > https://people.apache.org/keys/committer/bbende.asc
>> >
>> > KEYS file available here:
>> > https://dist.apache.org/repos/dist/release/nifi/KEYS
>> >
>> > 65 issues were closed/resolved for this release:
>> >
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320920=12340217
>> >
>> > Release note highlights can be found here:
>> >
>> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-NiFiRegistry0.1.0
>> >
>> > The vote will be open for 96 hours.
>> >
>> > Please download the release candidate and evaluate the necessary items
>> > including checking hashes, signatures, build from source, and test.
>> >
>> > The please vote:
>> >
>> > [ ] +1 Release this package as nifi-registry-0.1.0
>> > [ ] +0 no opinion
>> > [ ] -1 Do not release this package because...
>>
>>


Apache NiFi Registry 0.1.0 RC1 Release Helper Guide

2017-12-28 Thread Bryan Bende
Hello Apache NiFi community,

Please find the associated guidance to help those interested in
validating/verifying the Apache NiFi Registry release so they can
vote.

# Download latest KEYS file:
https://dist.apache.org/repos/dist/dev/nifi/KEYS

# Import keys file:
gpg --import KEYS

# [optional] Clear out local maven artifact repository

# Pull down nifi-registry-0.1.0 source release artifacts for review:

wget 
https://dist.apache.org/repos/dist/dev/nifi/nifi-registry/nifi-registry-0.1.0/nifi-registry-0.1.0-source-release.zip
wget 
https://dist.apache.org/repos/dist/dev/nifi/nifi-registry/nifi-registry-0.1.0/nifi-registry-0.1.0-source-release.zip.asc
wget 
https://dist.apache.org/repos/dist/dev/nifi/nifi-registry/nifi-registry-0.1.0/nifi-registry-0.1.0-source-release.zip.md5
wget 
https://dist.apache.org/repos/dist/dev/nifi/nifi-registry/nifi-registry-0.1.0/nifi-registry-0.1.0-source-release.zip.sha1
wget 
https://dist.apache.org/repos/dist/dev/nifi/nifi-registry/nifi-registry-0.1.0/nifi-registry-0.1.0-source-release.zip.sha256

# Verify the signature
gpg --verify nifi-registry-0.1.0-source-release.zip.asc

# Verify the hashes (md5, sha1, sha256) match the source and what was
provided in the vote email thread
md5sum nifi-registry-0.1.0-source-release.zip
sha1sum nifi-registry-0.1.0-source-release.zip
sha256sum nifi-registry-0.1.0-source-release.zip

# Unzip nifi-registry-0.1.0-source-release.zip

# Verify the build works including release audit tool (RAT) checks
cd nifi-registry-0.1.0
mvn clean install -Pcontrib-check

# Verify the contents contain a good README, NOTICE, and LICENSE.

# Verify the git commit ID is correct

# Verify the RC was branched off the correct git commit ID

# Look at the resulting convenience binary as found in
nifi-registry-assembly/target

# Make sure the README, NOTICE, and LICENSE are present and correct

# Run the resulting convenience binary and make sure it works as expected

# Test integration between the registry and NiFi

Start the registry

./bin/nifi-registry.sh start

Create a bucket in the registry
- Go to the registry UI at http://localhost:18080/nifi-registry
- Click the tool icon in the top right corner
- Click New Bucket from the bucket table
- Enter a name and click create

Get the NiFi PR which adds the support for integrating with the registry
https://github.com/apache/nifi/pull/2219

From that PR, edit the root pom and change:
0.0.1-SNAPSHOT
to
0.1.0

Build the PR and start NiFi

NOTE: You must have already built nifi-registry with "mvn clean
install" in order to build this PR because it depends on snapshot JARs
being in your local Maven repo.

Tell NiFi about your local registry instance
- Go the controller settings for NiFi from the top-right menu
- Select the Registry Clients tab
- Add a new Registry Client giving it a name and the url of
http://localhost:18080

Create a process group and place it under version control
- Right click on the PG and select the Version menu
- Select Start Version Control
- Choose the registry instance and bucket you want to use
- Enter a name, description, and comment

Go back to the registry and refresh the main page and you should see
the versioned flow you just saved

Import a new PG from a versioned flow
- Drag on a new PG like normal
- Instead of entering a name, click the Import link
- Now choose the flow you saved before

You should have a second identical PG now.

# Send a response to the vote thread indicating a +1, 0, -1 based on
your findings.

Thank you for your time and effort to validate the release!


[VOTE] Release Apache NiFi Registry 0.1.0

2017-12-28 Thread Bryan Bende
Hello,

I am pleased to be calling this vote for the source release of Apache
NiFi Registry 0.1.0.

The source zip, including signatures, digests, etc. can be found at:
https://repository.apache.org/content/repositories/orgapachenifi-1115/

The Git tag is nifi-registry-0.1.0-RC1
The Git commit ID is 81b99e7b04491eabb72ddf30754053ca12d0fcca
https://git-wip-us.apache.org/repos/asf?p=nifi-registry.git;a=commit;h=81b99e7b04491eabb72ddf30754053ca12d0fcca

Checksums of nifi-registry-0.1.0-source-release.zip:
MD5: 56244c3c296cdc9c3fcc6d22590b80d1
SHA1: 6354e91f868f40d6656ec2467bde307260ad63ca
SHA256: 2c680e441e6c4bfa2381bf004e9b19a6a79401a6a83e04597d0a714a95efd301

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/bbende.asc

KEYS file available here:
https://dist.apache.org/repos/dist/release/nifi/KEYS

65 issues were closed/resolved for this release:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320920=12340217

Release note highlights can be found here:
https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-NiFiRegistry0.1.0

The vote will be open for 96 hours.

Please download the release candidate and evaluate the necessary items
including checking hashes, signatures, build from source, and test.

Then please vote:

[ ] +1 Release this package as nifi-registry-0.1.0
[ ] +0 no opinion
[ ] -1 Do not release this package because...


Re: [DISCUSS] First Release of NiFi Registry

2017-12-27 Thread Bryan Bende
Looks like the work for the initial docs has wrapped up and we are in
a good place to kick out an RC.

I'll start pulling things together and should be able to get something
out tomorrow.

Pierre,

I created a JIRA [1] to capture an improvement for the scenario you
described with the nested process groups.

As you said, I think most of the other stuff has been captured in
JIRA's created by Joe P.

Thanks,

Bryan

[1] https://issues.apache.org/jira/browse/NIFIREG-86

On Mon, Dec 25, 2017 at 5:40 AM, Pierre Villard
<pierre.villard...@gmail.com> wrote:
> Hey guys,
>
> Not sure that's the best place to give my feedbacks after running some
> tests, let me know if I should open a new thread.
>
> (I believe Joe P. already made some similar comments, but just in case...)
>
> - in an unsecure environment, it's probably better to disable the "Add new
> policy" button (NIFIREG-78)
> - I've seen some logs that could be set to debug? “Access tokens are only
> issued over HTTPS. Returning Conflict response.”, “Registry is not
> configured to internally manage users, groups, or policies. Please contact
> your system administrator.. Returning Conflict response.“
> - general comment for NiFi UI: add tooltips on the icons of the upper
> status bar? We've quite a few new icons coming with the Registry and I
> guess it could help people not very familiar with it yet.
> - is it possible to do a diff between two versions in the Registry UI?
> - when adding a variable to a versioned PG, it does not show changes to
> commit. Is it expected? (it does not to me)
> - how to set a previous version as the new current one? does not seem
> possible unless you stop version control and start again?
> - very very minor comment, in the Registry UI, in the actions list, I'd set
> "Delete" instead of "delete".
>
> Another observation:
>
> I have PG A containing PG B, both versioned. And I have two instances of PG
> A in my NiFi UI PG A1 and PG A2.
> - I deleted PG B tracking in NiFi Registry. I now have 404 errors on the
> PGAx because PG B is not found in registry. All good. Then I disconnect PG
> B in PG A1. PG A1 is shown as OK / up-to-date with nothing to commit.
> - If I try to import a new instance of PG A, it’s not working because “The
> Flow Registry with ID 893e20cc-0160-1000-8ab8-e0507c36aa94 reports that no
> Flow exists with Bucket f66d8eb1-b893-41ad-974b-565bc33c8104, Flow
> 8d2df468-e8e4-4138-aaef-c7eadb71c2c4, Version 4”
> - In PG A2, if I delete PG B, then it shows local change but I cannot
> revert local changes: "Failed to retrieve flow with Flow Registry in order
> to calculate local differences due to Error retrieving flow snapshot:
> Versioned flow does not exist with identifier
> 08e85785-cb41-4cae-a516-6b4d3506960e"
> - In the end I have to delete PG B, commit changes to get everything back
> to normal. I’m wondering if disconnecting PG B shouldn’t be considered as a
> local change to be committed? Because, I could be in a situation where I
> don’t want to delete PG B, I just want to stop version control on it, no?
>
> I'll run some more tests in secured environments.
>
> Pierre
>
>
>
>
> 2017-12-21 18:50 GMT+01:00 Bryan Bende <bbe...@gmail.com>:
>
>> Just wanted to give an update on this...
>>
>> Great progress has been made in the last two weeks in terms of getting
>> ready for an RC. Still a few outstanding items, but I think we could
>> have those wrapped up soon and kick out an RC some time next week.
>> Depending when everything is ready we can adjust the voting period if
>> needed to account for holidays and make sure there is adequate time
>> for review.
>>
>> In the mean time, I encourage anyone who is interested to give it a
>> try. Here is some info about how to get started...
>>
>> 1) Get the code for the registry
>>
>> The Apache repo is here:
>>
>> https://git-wip-us.apache.org/repos/asf/nifi-registry.git
>>
>> The github repo is here if you prefer to fork that:
>>
>> https://github.com/apache/nifi-registry
>>
>> 2) Build the registry code
>>
>> cd nifi-registry
>> mvn clean install
>>
>> 3) Start the registry
>>
>> cd nifi-registry-assembly/target/nifi-registry-0.0.1-SNAPSHOT-
>> bin/nifi-registry-0.0.1-SNAPSHOT/
>> ./bin/nifi-registry.sh start
>>
>> 4) Create a bucket in the registry
>>
>> - Go to the registry UI at http://localhost:18080/nifi-registry
>> - Click the tool icon in the top right corner
>> - Click New Bucket from the bucket table
>> - Enter a name and click create
>>
>> 5) Get the NiFi PR which adds the support for integrating with the re

Re: [Non-DoD Source] Re: Moving from version 1.1.2 to 1.4.0

2017-12-26 Thread Bryan Bende
Can you give some more details about what the dependency is for?

If it is some utility code that exists in standard processors then we
should be looking to move that to other reusable modules so that you can
depend on it without depending on the processors.

If it is because you extended a processor from standard processors, you
would probably want to just copy the processor code into your own NAR and
modify/extend it.

On Tue, Dec 26, 2017 at 1:26 PM Byers, Steven K (Steve) CTR USARMY MEDCOM
JMLFDC (US) <steven.k.byers@mail.mil> wrote:

> Thanks for the reply, Bryan,
>
> Yes, two of our custom processors have a dependency on the standard
> processors.  The dependency cannot be removed or those processors will not
> compile.
>
> Thank you,
>
> Steven K. Byers
> DXC Technology - Contractor
> Software Developer - Joint Medical Logistics Functional Development Center
> (JMLFDC)
> Defense Health Agency (DHA)/ Health Information Technology (HIT)
> Directorate/ Solution Delivery Division (SDD)/Clinical Support Branch/JMLFDC
> 1681 Nelson Street, Fort Detrick, MD  21702
> (443) 538-7575 | (410) 872-4923
> Email: steven.k.byers@mail.mil
>
>
>
> -Original Message-
> From: Bryan Bende [mailto:bbe...@gmail.com]
> Sent: Tuesday, December 26, 2017 11:25 AM
> To: dev@nifi.apache.org
> Subject: [Non-DoD Source] Re: Moving from version 1.1.2 to 1.4.0
>
> Hello,
>
> This means your custom NAR is bundling the standard processors jar and as
> a result they are getting discovered twice, once from your NAR and once
> from the standard NAR.
>
> You’ll have to look at your maven dependencies for your custom NARs and
> figure out why the dependency on standard processors exists and remove it.
>
> Thanks,
>
> Bryan
>
>
> On Tue, Dec 26, 2017 at 11:09 AM Byers, Steven K (Steve) CTR USARMY MEDCOM
> JMLFDC (US) <steven.k.byers@mail.mil> wrote:
>
> > Hi devs,
> >
> > I'm looking into moving from NiFi 1.1.2 to 1.4.0.  We have several
> > custom processors and services.  Everything is compiling without any
> > problems but when I put the services into the 1.4.0 instance, NiFi
> > shows in the list of processors a 1.1.2 and 1.4.0 version of all
> > processors including the stock NiFi processors. If I just load our
> > custom processors that do not require a service, NiFi is fine and the
> > processor list looks like it should and the custom processors work
> > fine.  It seems to be something related to the custom services. Is
> > there some documentation or any guidance someone can provide to assist
> > with what I am doing?
> >
> > Thank you,
> >
> > Steven K. Byers
> > DXC Technology - Contractor
> > Software Developer - Joint Medical Logistics Functional Development
> > Center
> > (JMLFDC)
> > Defense Health Agency (DHA)/ Health Information Technology (HIT)
> > Directorate/ Solution Delivery Division (SDD)/Clinical Support
> > Branch/JMLFDC
> > 1681 Nelson Street, Fort Detrick, MD  21702
> > (443) 538-7575 | (410) 872-4923
> > Email: steven.k.byers@mail.mil
> >
> >
> >
> > --
> Sent from Gmail Mobile
>
-- 
Sent from Gmail Mobile


Re: Moving from version 1.1.2 to 1.4.0

2017-12-26 Thread Bryan Bende
Hello,

This means your custom NAR is bundling the standard processors jar and as a
result they are getting discovered twice, once from your NAR and once from
the standard NAR.

You’ll have to look at your maven dependencies for your custom NARs and
figure out why the dependency on standard processors exists and remove it.
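
If it is not obvious where it is coming from, running "mvn dependency:tree" on
the processor and NAR modules and searching the output for
nifi-standard-processors should show which dependency is pulling it in, so it
can be removed, excluded, or changed to provided scope.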

Thanks,

Bryan


On Tue, Dec 26, 2017 at 11:09 AM Byers, Steven K (Steve) CTR USARMY MEDCOM
JMLFDC (US)  wrote:

> Hi devs,
>
> I'm looking into moving from NiFi 1.1.2 to 1.4.0.  We have several custom
> processors and services.  Everything is compiling without any problems but
> when I put the services into the 1.4.0 instance, NiFi shows in the list of
> processors a 1.1.2 and 1.4.0 version of all processors including the stock
> NiFi processors. If I just load our custom processors that do not require a
> service, NiFi is fine and the processor list looks like it should and the
> custom processors work fine.  It seems to be something related to the
> custom
> services. Is there some documentation or any guidance someone can provide
> to
> assist with what I am doing?
>
> Thank you,
>
> Steven K. Byers
> DXC Technology - Contractor
> Software Developer - Joint Medical Logistics Functional Development Center
> (JMLFDC)
> Defense Health Agency (DHA)/ Health Information Technology (HIT)
> Directorate/ Solution Delivery Division (SDD)/Clinical Support
> Branch/JMLFDC
> 1681 Nelson Street, Fort Detrick, MD  21702
> (443) 538-7575 | (410) 872-4923
> Email: steven.k.byers@mail.mil
>
>
>
> --
Sent from Gmail Mobile


Re: Penalized FlowFiles still cause Processor Invocations

2017-12-22 Thread Bryan Bende
Hello,

Does your processor happen to have a @TriggerWhenEmpty annotation on it?

That would cause it to always execute regardless of what is in the queue,
so just wanted to rule that out.
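
As an aside, the yield() workaround you mention below boils down to a null
check at the top of onTrigger. A minimal sketch, assuming a typical processor
extending AbstractProcessor:

@Override
public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
    final FlowFile flowFile = session.get();
    if (flowFile == null) {
        // nothing usable in the queue right now (e.g. everything is penalized),
        // so back off instead of spinning until something becomes available
        context.yield();
        return;
    }
    // ... normal processing of the flow file ...
}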

Thanks,

Bryan


On Fri, Dec 22, 2017 at 12:45 PM, Oleksi Derkatch  wrote:

> Hi,
>
>
> I've noticed that if every FlowFile in a queue is penalized, I see
> constant invocations on the processor's onTrigger() method, despite the
> queue being "effectively" empty. This seems to cause millions of
> invocations of the processor that don't result in useful work.
>
>
> Am I understanding the situation correctly? Has anyone ever considered
> changing this behaviour so that we don't have all these calls to onTrigger
> when the queue only contains penalized flow files?
>
>
> Of course, we can update our Processors to just yield() when they get
> called with a null flow file, but this seems like a good thing for the
> engine to handle automatically.
>
>
>
> _
> Oleksi Derkatch  |  Software Engineer
> 7 Father David Bauer Drive, Suite 201  |  Waterloo, ON, Canada, N2L 0A2
> +1 519 594 0940 ext.212  |  +1 844 527 6748  |  www.vitalimages.com
> _
>
>
>
>
> Notice - Confidential Information The information in this communication
> and any attachments is strictly confidential and intended only for the use
> of the individual(s) or entity(ies) named above. If you are not the
> intended recipient, any dissemination, distribution, copying or other use
> of the information contained in this communication and/or any attachment is
> strictly prohibited. If you have received this communication in error,
> please first notify the sender immediately and then delete this
> communication from all data storage devices and destroy all hard copies.
>


Re: [DISCUSS] First Release of NiFi Registry

2017-12-21 Thread Bryan Bende
Just wanted to give an update on this...

Great progress has been made in the last two weeks in terms of getting
ready for an RC. Still a few outstanding items, but I think we could
have those wrapped up soon and kick out an RC some time next week.
Depending when everything is ready we can adjust the voting period if
needed to account for holidays and make sure there is adequate time
for review.

In the mean time, I encourage anyone who is interested to give it a
try. Here is some info about how to get started...

1) Get the code for the registry

The Apache repo is here:

https://git-wip-us.apache.org/repos/asf/nifi-registry.git

The github repo is here if you prefer to fork that:

https://github.com/apache/nifi-registry

2) Build the registry code

cd nifi-registry
mvn clean install

3) Start the registry

cd 
nifi-registry-assembly/target/nifi-registry-0.0.1-SNAPSHOT-bin/nifi-registry-0.0.1-SNAPSHOT/
./bin/nifi-registry.sh start

4) Create a bucket in the registry

- Go to the registry UI at http://localhost:18080/nifi-registry
- Click the tool icon in the top right corner
- Click New Bucket from the bucket table
- Enter a name and click create

5) Get the NiFi PR which adds the support for integrating with the registry

https://github.com/apache/nifi/pull/2219

Build that PR like normal.

NOTE: That you must have already built nifi-registry with "mvn clean
install" in order to build this PR because it depends on snapshot JARs
being in your local Maven repo.

6) Tell NiFi about your local registry instance

- Go the controller settings for NiFi from the top-right menu
- Select the Registry Clients tab
- Add a new Registry Client giving it a name and the url of
http://localhost:18080

7) Create a process group and place it under version control

- Right click on the PG and select the Version menu
- Select Start Version Control
- Choose the registry instance and bucket you want to use
- Enter a name, description, and comment

8) Go back to the registry and refresh the main page and you should
see the versioned flow you just saved

9) Import a new PG from a versioned flow

- Drag on a new PG like normal
- Instead of entering a name, click the Import link
- Now choose the flow you saved before

You should have a second identical PG now.

From there you can try making changes to one of them, view local
changes, revert changes, save a version 2, upgrade the other one to
version 2, etc.

Hope that helps.

-Bryan

On Fri, Dec 8, 2017 at 9:19 AM, Bryan Bende <bbe...@gmail.com> wrote:
> Mike,
>
> You brought up a good point... documentation is one of the things that
> still needs to be done.
>
> There is some information that might be helpful though...
>
> I would suggest reading this Wiki page for the feature proposal of
> "Configuration Management of Flows" [1].
>
> There is also a JIRA from a few months ago with initial mock ups for
> the registry UI [2].
>
> As part of the RC I can provide some instructions on how it can be
> tested with NiFi using PR 2219.
>
> Thanks,
>
> Bryan
>
> [1] 
> https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows
> [2] https://issues.apache.org/jira/browse/NIFIREG-3
>
>
> On Thu, Dec 7, 2017 at 8:21 PM, Mike Thomsen <mikerthom...@gmail.com> wrote:
>> Is there a good description/detail page somewhere going over the registry?
>>
>> On Thu, Dec 7, 2017 at 2:06 PM, Pierre Villard <pierre.villard...@gmail.com>
>> wrote:
>>
>>> Strong +1!!
>>>
>>> Really impressed by all the work you guys did on the registry stuff. Very
>>> impatient to use it in official releases!
>>>
>>> Le 7 déc. 2017 18:52, "Jeff" <jtsw...@gmail.com> a écrit :
>>>
>>> Bryan,
>>>
>>> +1 to getting an initial release of NiFi Registry out to the community.
>>> Definitely a huge step in the evolution of NiFi!
>>>
>>> On Thu, Dec 7, 2017 at 11:29 AM Russell Bateman <r...@windofkeltia.com>
>>> wrote:
>>>
>>> > Our down-stream users are excited at the prospect of using this registry
>>> > capability for their flows. So, we're eager to see it integrated into
>>> > the earliest NiFi version you can choose (1.5.0?).
>>> >
>>> > Russ
>>> >
>>> > On 12/07/2017 08:49 AM, Kevin Doran wrote:
>>> > > Thanks for kicking off this discussion thread, Bryan.
>>> > >
>>> > > I support prepping a release of NiFi Registry and making it version
>>> > 0.1.0 as you propose.
>>> > >
>>> > > Thanks!
>>> > > Kevin
>>> > >
>>> > > On 12/7/17, 10:45, "Joe Witt" <joe.w.

Re: [DISCUSS] First Release of NiFi Registry

2017-12-08 Thread Bryan Bende
Mike,

You brought up a good point... documentation is one of the things that
still needs to be done.

There is some information that might be helpful though...

I would suggest reading this Wiki page for the feature proposal of
"Configuration Management of Flows" [1].

There is also a JIRA from a few months ago with initial mock ups for
the registry UI [2].

As part of the RC I can provide some instructions on how it can be
tested with NiFi using PR 2219.

Thanks,

Bryan

[1] 
https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows
[2] https://issues.apache.org/jira/browse/NIFIREG-3


On Thu, Dec 7, 2017 at 8:21 PM, Mike Thomsen <mikerthom...@gmail.com> wrote:
> Is there a good description/detail page somewhere going over the registry?
>
> On Thu, Dec 7, 2017 at 2:06 PM, Pierre Villard <pierre.villard...@gmail.com>
> wrote:
>
>> Strong +1!!
>>
>> Really impressed by all the work you guys did on the registry stuff. Very
>> impatient to use it in official releases!
>>
>> Le 7 déc. 2017 18:52, "Jeff" <jtsw...@gmail.com> a écrit :
>>
>> Bryan,
>>
>> +1 to getting an initial release of NiFi Registry out to the community.
>> Definitely a huge step in the evolution of NiFi!
>>
>> On Thu, Dec 7, 2017 at 11:29 AM Russell Bateman <r...@windofkeltia.com>
>> wrote:
>>
>> > Our down-stream users are excited at the prospect of using this registry
>> > capability for their flows. So, we're eager to see it integrated into
>> > the earliest NiFi version you can choose (1.5.0?).
>> >
>> > Russ
>> >
>> > On 12/07/2017 08:49 AM, Kevin Doran wrote:
>> > > Thanks for kicking off this discussion thread, Bryan.
>> > >
>> > > I support prepping a release of NiFi Registry and making it version
>> > 0.1.0 as you propose.
>> > >
>> > > Thanks!
>> > > Kevin
>> > >
>> > > On 12/7/17, 10:45, "Joe Witt" <joe.w...@gmail.com> wrote:
>> > >
>> > >  Bryan - very exciting and awesome.  Having experimented with the
>> > >  registry on the JIRAs/PRs you mention I must say this is going to
>> > be a
>> > >  huge step forward for NiFi!
>> > >
>> > >  Since we'll also be doing a NiFi release soon (1.5.0?) I am happy
>> to
>> > >  volunteer to RM that as well if needed.
>> > >
>> > >  Thanks
>> > >
>> > >  On Thu, Dec 7, 2017 at 10:39 AM, Bryan Bende <bbe...@gmail.com>
>> > wrote:
>> > >  > Hey folks,
>> > >  >
>> > >  > There has been a lot of great work done on the NiFi Registry [1]
>> > and I
>> > >  > think we are probably very close to an initial release focused
>> on
>> > >  > storing "versioned flows".
>> > >  >
>> > >  > Since NiFi will have a dependency on client code provided by the
>> > >  > registry, the first release of the registry would need to occur
>> > before
>> > >  > the first release of NiFi that integrates with it. The work on
>> the
>> > >  > NiFi side is being done as part of NIFI-4436, which can be
>> > followed
>> > >  > along on PR 2219 [2].
>> > >  >
>> > >  > Currently nifi-registry master is set to 0.0.1-SNAPSHOT, but I
>> > would
>> > >  > propose the first release should be 0.1.0.
>> > >  >
>> > >  > Let me know if anyone has any thoughts or comments.  I'm happy
>> to
>> > act
>> > >  > as RM if no one else is interested in doing so, and we can start
>> > the
>> > >  > process of going through JIRA to see what is left.
>> > >  >
>> > >  > Thanks,
>> > >  >
>> > >  > Bryan
>> > >  >
>> > >  > [1] https://nifi.apache.org/registry.html
>> > >  > [2] https://github.com/apache/nifi/pull/2219
>> > >
>> > >
>> > >
>> >
>> >
>>


[DISCUSS] First Release of NiFi Registry

2017-12-07 Thread Bryan Bende
Hey folks,

There has been a lot of great work done on the NiFi Registry [1] and I
think we are probably very close to an initial release focused on
storing "versioned flows".

Since NiFi will have a dependency on client code provided by the
registry, the first release of the registry would need to occur before
the first release of NiFi that integrates with it. The work on the
NiFi side is being done as part of NIFI-4436, which can be followed
along on PR 2219 [2].

Currently nifi-registry master is set to 0.0.1-SNAPSHOT, but I would
propose the first release should be 0.1.0.

Let me know if anyone has any thoughts or comments.  I'm happy to act
as RM if no one else is interested in doing so, and we can start the
process of going through JIRA to see what is left.

Thanks,

Bryan

[1] https://nifi.apache.org/registry.html
[2] https://github.com/apache/nifi/pull/2219


Re: PutParquet fails - "Could not rename file"

2017-12-01 Thread Bryan Bende
I think there is an open PR for a "MoveHDFS" processor that might do
what you are describing, but currently I think you'd have to use
ExecuteScript to issue an hdfs mv command.

If you are interested in trying to fix the code for PutParquet, then I
would suggest trying to add an overwrite parameter to this method:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-hadoop-record-utils/src/main/java/org/apache/nifi/processors/hadoop/AbstractPutHDFSRecord.java#L408

Inside that method, if overwrite is true then we just need to delete
destFile before calling rename.

The value to pass in for the overwrite parameter is already available:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-hadoop-record-utils/src/main/java/org/apache/nifi/processors/hadoop/AbstractPutHDFSRecord.java#L288
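
Roughly, the idea inside that rename method would be something like this (just
a sketch of the approach, not tested against a real HDFS; the overwrite
parameter is the new piece being proposed and the variable names are
illustrative, FailureException is the same exception the method already throws):

if (overwrite && fileSystem.exists(destFile)) {
    // remove the existing file first so the rename below can succeed
    if (!fileSystem.delete(destFile, false)) {
        throw new FailureException("Could not delete existing file " + destFile + " before rename");
    }
}
if (!fileSystem.rename(tempFile, destFile)) {
    throw new FailureException("Could not rename file " + tempFile + " to its final filename");
}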



On Thu, Nov 30, 2017 at 10:14 PM, VinShar  wrote:
> Thanks for the reply. Actually I saw some posts where people wanted their files
> not to be overwritten by PutParquet, so I thought that maybe I have
> something wrong in the configuration of the flow.
>
> I know PutParquet internally renames the file on HDFS, but is there a processor
> that I can use in my flow to rename a file on HDFS? I see processors to get,
> fetch, delete and put on HDFS but couldn't figure out a way to rename. If
> this is a defect then I can save the file with a different name, delete the
> existing file through DeleteHDFS and then rename the new file to the file I deleted.
> If there is no processor to rename a file then I will try to modify the code or
> create a new processor by extending the existing one.
>
>
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: Append to Parquet

2017-11-30 Thread Bryan Bende
Hello,

As far as I know there is not an option in Parquet to append, due to
the way its internal format works.

The ParquetFileWriter has a mode which only has CREATE and OVERWRITE:

https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java#L105-L107
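
So if what you are really after is the equivalent of Spark's mode="overwrite"
(replace the file rather than append to it), that corresponds to the OVERWRITE
mode above. Outside of NiFi, with the plain parquet-avro API, that looks
roughly like this (a sketch; the path, schema, conf and record here are
placeholders):

try (ParquetWriter<GenericRecord> writer =
         AvroParquetWriter.<GenericRecord>builder(new Path("/tmp/example.parquet"))
             .withSchema(schema)   // Avro schema for the records
             .withConf(conf)       // Hadoop Configuration
             .withWriteMode(ParquetFileWriter.Mode.OVERWRITE)  // replace an existing file instead of failing
             .build()) {
    writer.write(record);
}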

-Bryan


On Thu, Nov 30, 2017 at 5:12 PM, VinShar  wrote:
> Hi,
>
> Is there any way to use PutParquet to append to an existing parquet file? I
> know that I can create a Kite DataSet and write parquets to it, but I am
> looking for an alternative to Spark's DataFrame.write.parquet(destination,
> mode="overwrite")
>
> Regards,
> Vinay
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: PutParquet fails - "Could not rename file"

2017-11-30 Thread Bryan Bende
Hello,

I haven't verified this against HDFS yet, but this may be a bug in the
processor...

The value of "Overwrite Files" is being passed to the Parquet Writer
to put it in "overwrite" mode, but since we first write a temp file,
but this would only help to overwrite the temp file if it was already
there.

Then we try to rename the temp file to the final name, and at this
point it fails because a file with the final name already exists. We
should be deleting the existing file before the rename when the file
already exists.

There is a unit test that tests this and passes, but it would be
running against a local filesystem so maybe it somehow behaves
differently than HDFS would.

-Bryan


On Thu, Nov 30, 2017 at 5:08 PM, VinShar  wrote:
> Hi,
>
> I am exploring NiFi and was trying to use it to save data in HDFS in Parquet
> format. I used PutParquet processor for the same. I am able to write a new
> file to HDFS but if I try to overwrite an existing one then I get the below
> exception ("Overwrite Files" property of processor was set to true). I have
> also attached screenshot of flow. I used UpdateAttribute processor to rename
> flow file to a constant name so that flow always overwrites existing file. I
> can see dot file getting created in hdfs but it gets deleted after failure
> which I think is right. File permissions are not an issue, the file was created
> by NiFi and if I use the GetHDFS processor then it is able to get the file and
> delete it from HDFS.
>
> I will appreciate any pointers to resolve this issue.
>
> 2017-11-30 19:49:12,900 ERROR [Timer-Driven Process Thread-5]
> o.a.nifi.processors.parquet.PutParquet
> PutParquet[id=46f35988-1e6a-36ec-89ff-6400608bee87] Failed to write due to
> org.apache.nifi.processors.hadoop.exception.FailureException: Could not
> rename file /user/nifi/.gg_usr_test to its final filename: {}
> org.apache.nifi.processors.hadoop.exception.FailureException: Could not
> rename file /user/nifi/.gg_usr_test to its final filename
> at
> org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.rename(AbstractPutHDFSRecord.java:420)
> at
> org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.lambda$onTrigger$1(AbstractPutHDFSRecord.java:345)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:360)
>
>
> 
>
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: Login Identity Provider

2017-11-14 Thread Bryan Bende
Jamie,

You can definitely implement your own LoginIdentityProvider...

It should work just like any other extension point, meaning you build
a NAR with your extension in it and drop it in the lib directory.

We don't have an archetype for this, but you could probably start with
the processor archetype and then rename the services file in META-INF
accordingly, and change the example processor to a
LoginIdentityProvider.

After that you drop your NAR into the lib directory, add your config
section to login-identity-providers.xml, and reference the id in
nifi.properties, just like any of the others.

The LDAP and Kerberos providers both are setup like this so you can
take a look at their code:

https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-ldap-iaa-providers-bundle
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-ldap-iaa-providers-bundle/nifi-ldap-iaa-providers/src/main/resources/META-INF/services/org.apache.nifi.authentication.LoginIdentityProvider
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-ldap-iaa-providers-bundle/nifi-ldap-iaa-providers/src/main/java/org/apache/nifi/ldap/LdapProvider.java
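
To give a rough idea of the shape of it, a provider class ends up looking
something like this (just a sketch -- the method names follow the LdapProvider
source linked above, so double-check them against the NiFi version you build
with, and the "Auth Service URL" property name is made up for the example):

// imports assumed: org.apache.nifi.authentication.*, java.util.concurrent.TimeUnit
public class MyProductLoginProvider implements LoginIdentityProvider {

    private String authServiceUrl;

    @Override
    public void initialize(final LoginIdentityProviderInitializationContext initializationContext) {
        // called once when the provider is created
    }

    @Override
    public void onConfigured(final LoginIdentityProviderConfigurationContext configurationContext) {
        // read the properties you declared in login-identity-providers.xml,
        // e.g. a hypothetical "Auth Service URL" property
        this.authServiceUrl = configurationContext.getProperty("Auth Service URL");
    }

    @Override
    public AuthenticationResponse authenticate(final LoginCredentials credentials) {
        // call your product's built-in login facility here with
        // credentials.getUsername() / credentials.getPassword(), then return the
        // resolved identity plus an expiration for the token NiFi issues
        return new AuthenticationResponse(credentials.getUsername(), credentials.getUsername(),
                TimeUnit.HOURS.toMillis(12), getClass().getSimpleName());
    }

    @Override
    public void preDestruction() {
        // clean up any resources on shutdown
    }
}

You would then list the class in
META-INF/services/org.apache.nifi.authentication.LoginIdentityProvider inside
your jar, exactly like the LDAP provider does in the second link above.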

Thanks,

Bryan


On Tue, Nov 14, 2017 at 4:10 PM, Jamie Wang  wrote:
> Hi,
>
> In NiFi 1.4.0, I see login for LDAP and Kerberos are supported. I am
> integrating NiFi as part of our system of products to interoperate with
> each other. We want to use our product's built-in login facility as the
> authentication mechanism.  Since LoginIdentityProvider is a pluggable
> component, is there any API support for us to develop our own pluggable
> LoginIdentityProvider? If so, is there any example, or how do I proceed with
> this? I appreciate any input or pointers you may have.
>
> p.s. Some time ago, I asked the group if anyone had loaded NiFi into their
> own process and I didn't get any answer; I assumed no one tried. Anyway,
> it is possible to load NiFi into your own process and we have done that. The
> login is another thing we want to integrate with our product.
>
> Thanks.
> Jamie


Re: Custom properties file for environmental promotion

2017-11-13 Thread Bryan Bende
In general that approach should work, there were a few community
efforts that did something like this in the past [1][2].

For the RPG, you may need to substitute another value as well, because
I believe the template also contains the UUID of the ports it is
connected to, which will be different depending on the target URI, but
I don't remember exactly how this part works, maybe others can chime
in.

Thanks,

Bryan

[1] https://github.com/aperepel/nifi-api-deploy
[2] https://github.com/hermannpencole/nifi-config

On Mon, Nov 13, 2017 at 1:47 PM, wildo  wrote:
> If you guys would please entertain one more question:
>
> Unfortunately the site is currently limited to Nifi 1.2, and this
> effectively renders the custom properties file useless as it relates to the
> PutHDFS kerberos principal/keytab. I was thinking about this as I exported
> the template in order to put into source control.
>
> Is there any reason that I couldn't modify the xml template to replace these
> values before uploading the template into the new environment?
>
> <entry>
>     <key>Kerberos Principal</key>
>     <value>d...@mydomain.com</value>
> </entry>
>
> <entry>
>     <key>Kerberos Keytab</key>
>     <value>/my/path/to/keytabs/dev.keytab</value>
> </entry>
>
>
> Similarly, when it comes to the remote process group, is there any reason
> that I can't just modify these uris before uploading the template to the new
> environment?
>
> https://dev001.mydomain.com:1234/nifi
> https://dev001.mydomain.com:1234/nifi,https://dev002.mydomain.com:1234/nifi
>
>
> It would be quite simple to write a script that does a simple substitution
> on these values. But is there any reason that that wouldn't work? I imagine
> I'm hardly the first to think about editing the template xml itself...
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: Custom properties file for environmental promotion

2017-11-08 Thread Bryan Bende
Currently, there is the variable properties file which would require a
service restart and also would need to be on all nodes in a cluster.

The last release (1.4.0) added a more user-friendly variable registry
in the UI which you can access from the context palette for a given
process group, near where the controller services for a PG are
located.

When editing variables in the UI, it will detect components that
reference them and automatically restart them. This variable registry
will be tightly integrated with the experience of using the flow
registry.

Given all of the above, there still isn't anything you can currently
do for RPGs though... you will unfortunately have to recreate it in
the target environment until they become editable.

As far as when the registry will be released, there are no set
timelines for Apache projects, so it will be based on when the community
believes it is mature enough to be released and when someone
volunteers to be the release manager.

That being said, a lot of good work has been done already and it is
maturing quickly.

Thanks,

Bryan


On Wed, Nov 8, 2017 at 10:21 AM, wildo  wrote:
> Great info Bryan- thanks!
>
> Regarding my first question, I talked to our admins and we only have one NIC
> anyway. So there is no need for me to limit it, and thus I don't have a need
> to use EL to discover the NIC. So that's good.
>
> Regarding the registry stuff, I found this [1] document which looks
> FANTASTIC. But I'm not able to find when/if this stuff will be released. My
> understanding is that it is not yet released, and therefore I'm assuming
> that specifying a custom.properties file via the
> nifi.variable.registry.properties is still the preferred method.
> Additionally, this will mean that:
>  1) Changes to this file require a service restart, correct?
>  2) Is it true that this needs to be specified for each node of a clustered
> environment?
>
> Thanks again!
>
> [1] https://nifi.apache.org/registry.html
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: Custom properties file for environmental promotion

2017-11-08 Thread Bryan Bende
Hello,

Regarding Remote Process Groups, this is definitely something that
needs to be improved. There is a JIRA to make the URL editable [1].

A significant amount of work has been done on the flow registry [2],
and this will become the primary way to deploy flows across
environments.

The typical scenario would be to save your dev flow to the registry,
and when importing it to QA or prod, you would then edit the RPG
(based on the JIRA to make it editable) to be the URL for that
environment.

After that the URL would be set for that environment and would not be
changed when upgrading to newer versions.

Hope this helps.

Thanks,

Bryan

[1] https://issues.apache.org/jira/browse/NIFI-4526
[2] https://github.com/apache/nifi-registry


On Tue, Nov 7, 2017 at 11:58 PM, wildo  wrote:
> We have nearly wrapped up our testing with our NiFi scripts in dev, and are
> now looking to push to QA. I found an article about creating a custom
> properties file in order to specify each of your environmental specific
> variables, and then specifying that file in nifi.properties at
> nifi.variable.registry.properties.
>
> This will work fine omitting two cases I can think of.
>
> 1) We have a number of ListenTCP processors which require the "local network
> interface" to be specified. I have read that Expression Language can access
> system properties, but I haven't seen any example about how to use this. Can
> anyone share how EL might be used to grab the local network interface for
> each environment automatically?
>
> 2) We use Remote Process Groups with Site-to-Site for load balancing. In the
> RPG, you have to specify an absolute url to the nodes in the remote site.
> The RPG doesn't indicate that EL is acceptable in this field. Can anyone
> chime in on the possibility of using EL to grab a property for the RPG url?
>
> Thanks!
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


Re: cluster property not in nifi.properties

2017-11-07 Thread Bryan Bende
Mark,

I believe that property is no longer used...

Grep'ing the source tree for it shows a few lingering references in
the admin guide and in src/test/resources, but nothing in regular
code.

It may be residual from the 0.x clustering model that was removed
during the 1.0.0 release.

-Bryan


On Tue, Nov 7, 2017 at 12:10 PM, Mark Bean  wrote:
> It was observed that nifi.cluster.request.replication.claim.timeout
> property is not in the default nifi.properties file. I assume this property
> is still relevant (i.e. hasn't been deprecated.) Should it be included in
> nifi.properties?
>
> Thanks,
> Mark


Re: Classloader issues

2017-11-06 Thread Bryan Bende
Correct :)

Technically we do support deploying extensions in JARs, but if you did
that you would only be able to utilize JARs that are directly in lib,
and since nifi-utils is not one of those, it explains why your
processor coming from the JAR cannot find StandardValidators.

Long story short, I think removing the JAR and copying in your NAR should work.

-Bryan


On Mon, Nov 6, 2017 at 1:20 PM, Phil H <gippyp...@gmail.com> wrote:
> Hey, now I see the problem - I had (somehow?!?) managed to copy the JAR file 
> rather than the NAR file
>
>
>
>> On 7 Nov 2017, at 5:17 am, Phil H <gippyp...@gmail.com> wrote:
>>
>> Thanks Bryan,
>>
>> This new NAR does not appear in the extensions directory (my other working 
>> ones do).
>>
>> As for your second question
>>
>> [phil@localhost JSONCondenser]$ ls  ~/nifi-1.3.0/lib/ | grep jar
>> javax.servlet-api-3.1.0.jar
>> jcl-over-slf4j-1.7.25.jar
>> jetty-schemas-3.1.jar
>> jul-to-slf4j-1.7.25.jar
>> log4j-over-slf4j-1.7.25.jar
>> logback-classic-1.2.3.jar
>> logback-core-1.2.3.jar
>> nifi-api-1.3.0.jar
>> nifi-framework-api-1.3.0.jar
>> nifi-JSONCondenser-processors-0.1.jar
>> nifi-nar-utils-1.3.0.jar
>> nifi-properties-1.3.0.jar
>> nifi-runtime-1.3.0.jar
>> slf4j-api-1.7.25.jar
>>
>>
>>> On 7 Nov 2017, at 5:13 am, Bryan Bende <bbe...@gmail.com> wrote:
>>>
>>> Thanks for the poms.
>>>
>>> Can you provide the output of listing
>>> NIFI_HOME/work/nar/extensions/.nar-unpacked/META-INF/bundled-dependencies/
>>> ?
>>>
>>> and also the output of listing JARs that are in NiFi's lib directory?
>>> ls -l NIFI_HOME/lib/ | grep jar
>>>
>>> Want to verify that nifi-utils JAR is actually in your NAR and also
>>> that no other unexpected JARs are in your lib directory.
>>>
>>> Thanks,
>>>
>>> Bryan
>>>
>>> On Mon, Nov 6, 2017 at 12:48 PM, Phil H <gippyp...@gmail.com> wrote:
>>>> ./pom.xml (Note that this issue occurred BEFORE I added the external JAR 
>>>> dependency, and I have that exact same org.json dependency in a bunch of 
>>>> other processors I have written without issue)
>>>>
>>>> <project xmlns="http://maven.apache.org/POM/4.0.0"
>>>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>>>> http://maven.apache.org/xsd/maven-4.0.0.xsd">
>>>>   <modelVersion>4.0.0</modelVersion>
>>>>
>>>>   <parent>
>>>>   <groupId>org.apache.nifi</groupId>
>>>>   <artifactId>nifi-nar-bundles</artifactId>
>>>>   <version>1.3.0</version>
>>>>   </parent>
>>>>
>>>>   <repositories>
>>>>   <repository>
>>>>   <id>project.local</id>
>>>>   <name>projects</name>
>>>>   <url>file:${project.basedir}/repo</url>
>>>>   </repository>
>>>>   </repositories>
>>>>
>>>>   <groupId>com.jidmu</groupId>
>>>>   <artifactId>JSONCondenser</artifactId>
>>>>   <version>0.1</version>
>>>>   <packaging>pom</packaging>
>>>>
>>>>   <modules>
>>>>   <module>nifi-JSONCondenser-processors</module>
>>>>   <module>nifi-JSONCondenser-nar</module>
>>>>   </modules>
>>>>
>>>>   <dependencies>
>>>>   <dependency>
>>>>   <groupId>org.json</groupId>
>>>>   <artifactId>JSON</artifactId>
>>>>   <version>1.0</version>
>>>>   </dependency>
>>>>   </dependencies>
>>>>
>>>> </project>
>>>>
>>>>
>>>>
>>>> ./nifi-JSONCondenser-processors/pom.xml
>>>> <project xmlns="http://maven.apache.org/POM/4.0.0"
>>>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>>>> http://maven.apache.org/xsd/maven-4.0.0.xsd">
>>>>   <modelVersion>4.0.0</modelVersion>
>>>>
>>>>   <parent>
>>>>   <groupId>com.jidmu</groupId>
>>>>   <artifactId>JSONCondenser</artifactId>
>>>>   <version>0.1</version>
>>>>   </parent>
>>>>
>>>>   <artifactId>nifi-JSONCondenser-processors</artifactId>
>>>>   <packaging>jar</packaging>
>>>>
>>>>   <dependencies>
>>>>   <dependency>
>>>>   <groupId>org.apache.nifi</groupId>
>>>>   <artifactId>nifi-api</artifactId>
>>>>   </dependency>
>>>>   <dependency>
>>>>   <groupId>org.apache.nifi</groupId>
>>>>   <artifactId>nifi-utils</artifactId>
>>>>   </dependency>
>>>>   <dependency>
>>>>   <groupId>org.apache.nifi</groupId>
>>>>   <artifactId>nifi-mock</artifactId>
>>>>   <scope>test</scope>
>>>>   </dependency>
>>>>   <dependency>
>>>>   <groupId>org.slf4j</groupId>
>>>>   <artifactId>slf4j-simple</artifactId>
>>>>   <scope>test</scope>
>>>>   </dependency>
>>>>   <dependency>
>>>>

Re: Classloader issues

2017-11-06 Thread Bryan Bende
Thanks for the poms.

Can you provide the output of listing
NIFI_HOME/work/nar/extensions/.nar-unpacked/META-INF/bundled-dependencies/
?

and also the output of listing JARs that are in NiFi's lib directory?
ls -l NIFI_HOME/lib/ | grep jar

Want to verify that nifi-utils JAR is actually in your NAR and also
that no other unexpected JARs are in your lib directory.

Thanks,

Bryan

On Mon, Nov 6, 2017 at 12:48 PM, Phil H <gippyp...@gmail.com> wrote:
> ./pom.xml (Note that this issue occurred BEFORE I added the external JAR 
> dependency, and I have that exact same org.json dependency in a bunch of 
> other processors I have written without issue)
>
> <project xmlns="http://maven.apache.org/POM/4.0.0"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
> http://maven.apache.org/xsd/maven-4.0.0.xsd">
> <modelVersion>4.0.0</modelVersion>
>
> <parent>
> <groupId>org.apache.nifi</groupId>
> <artifactId>nifi-nar-bundles</artifactId>
> <version>1.3.0</version>
> </parent>
>
> <repositories>
> <repository>
> <id>project.local</id>
> <name>projects</name>
> <url>file:${project.basedir}/repo</url>
> </repository>
> </repositories>
>
> <groupId>com.jidmu</groupId>
> <artifactId>JSONCondenser</artifactId>
> <version>0.1</version>
> <packaging>pom</packaging>
>
> <modules>
> <module>nifi-JSONCondenser-processors</module>
> <module>nifi-JSONCondenser-nar</module>
> </modules>
>
> <dependencies>
> <dependency>
> <groupId>org.json</groupId>
> <artifactId>JSON</artifactId>
> <version>1.0</version>
> </dependency>
> </dependencies>
>
> </project>
>
>
>
> ./nifi-JSONCondenser-processors/pom.xml
> <project xmlns="http://maven.apache.org/POM/4.0.0"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
> http://maven.apache.org/xsd/maven-4.0.0.xsd">
> <modelVersion>4.0.0</modelVersion>
>
> <parent>
> <groupId>com.jidmu</groupId>
> <artifactId>JSONCondenser</artifactId>
> <version>0.1</version>
> </parent>
>
> <artifactId>nifi-JSONCondenser-processors</artifactId>
> <packaging>jar</packaging>
>
> <dependencies>
> <dependency>
> <groupId>org.apache.nifi</groupId>
> <artifactId>nifi-api</artifactId>
> </dependency>
> <dependency>
> <groupId>org.apache.nifi</groupId>
> <artifactId>nifi-utils</artifactId>
> </dependency>
> <dependency>
> <groupId>org.apache.nifi</groupId>
> <artifactId>nifi-mock</artifactId>
> <scope>test</scope>
> </dependency>
> <dependency>
> <groupId>org.slf4j</groupId>
> <artifactId>slf4j-simple</artifactId>
> <scope>test</scope>
> </dependency>
> <dependency>
> <groupId>junit</groupId>
> <artifactId>junit</artifactId>
> <scope>test</scope>
> </dependency>
> </dependencies>
> </project>
>
> ./nifi-JSONCondenser-nar/pom.xml
> <project xmlns="http://maven.apache.org/POM/4.0.0"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
> http://maven.apache.org/xsd/maven-4.0.0.xsd">
> <modelVersion>4.0.0</modelVersion>
>
> <parent>
> <groupId>com.jidmu</groupId>
> <artifactId>JSONCondenser</artifactId>
> <version>0.1</version>
> </parent>
>
> <artifactId>nifi-JSONCondenser-nar</artifactId>
> <version>0.1</version>
> <packaging>nar</packaging>
> <properties>
> <maven.javadoc.skip>true</maven.javadoc.skip>
> <source.skip>true</source.skip>
> </properties>
>
> <dependencies>
> <dependency>
> <groupId>com.jidmu</groupId>
> <artifactId>nifi-JSONCondenser-processors</artifactId>
> <version>0.1</version>
> </dependency>
> </dependencies>
>
> </project>
>
>
>
>> On 7 Nov 2017, at 4:39 am, Bryan Bende <bbe...@gmail.com> wrote:
>>
>> It is most likely an issue with the Maven configuration in one of your 
>> modules.
>>
>> Can you share your project, or the pom files for the processors, NAR,
>> and bundle?
>>
>> Thanks,
>>
>> Bryan
>>
>>
>> On Mon, Nov 6, 2017 at 12:20 PM, Phil H <gippyp...@gmail.com> wrote:
>>> Nifi version is 1.3.0, running on Java 1.8.0_131, running on an
>>> out-of-the-box CentOS (if that’s relevant)
>>>
>>>> On 7 Nov 2017, at 4:17 am, Phil H <gippyp...@gmail.com> wrote:
>>>>
>>>> I added the StandardValidators reference back in and the original error 
>>>> reoccurs.  The offending code (which compiles fine using the same JDK) is 
>>>> totally standard stuff:
>>>>
>>>>   public static final PropertyDescriptor ID_PATH = new PropertyDescriptor
>>>>   .Builder().name("ID_PATH")
>>>>   .displayName("JSON ID Path")
>>>>   .description("The path to the JSON attribute that represents the 
>>>> unique ID for the object")
>>>>   .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
>>>>   .required(true)
>>>>   .build();
>>>>
>>>>
>>>> 2017-11-07 04:13:46,855 ERROR [main] org.apache.nifi.NiFi Failure to 
>>>> launch NiFi due to java.util.ServiceConfigurationError: 
>>>> org.apache.nifi.processor.Processor: Provider 
>>>> com.jidmu.processors.JSONCondenser.JSONCondenser could not be instantiated
>>>> java.util.ServiceConfigurationError: o

Re: Classloader issues

2017-11-06 Thread Bryan Bende
It is most likely an issue with the Maven configuration in one of your modules.

Can you share your project, or the pom files for the processors, NAR,
and bundle?

Thanks,

Bryan


On Mon, Nov 6, 2017 at 12:20 PM, Phil H  wrote:
> Nifi version is 1.3.0, running on Java 1.8.0_131, running on an out-of-the-box
> CentOS (if that’s relevant)
>
>> On 7 Nov 2017, at 4:17 am, Phil H  wrote:
>>
>> I added the StandardValidators reference back in and the original error 
>> reoccurs.  The offending code (which compiles fine using the same JDK) is 
>> totally standard stuff:
>>
>>public static final PropertyDescriptor ID_PATH = new PropertyDescriptor
>>.Builder().name("ID_PATH")
>>.displayName("JSON ID Path")
>>.description("The path to the JSON attribute that represents the 
>> unique ID for the object")
>>.addValidator(StandardValidators.NON_BLANK_VALIDATOR)
>>.required(true)
>>.build();
>>
>>
>> 2017-11-07 04:13:46,855 ERROR [main] org.apache.nifi.NiFi Failure to launch 
>> NiFi due to java.util.ServiceConfigurationError: 
>> org.apache.nifi.processor.Processor: Provider 
>> com.jidmu.processors.JSONCondenser.JSONCondenser could not be instantiated
>> java.util.ServiceConfigurationError: org.apache.nifi.processor.Processor: 
>> Provider com.jidmu.processors.JSONCondenser.JSONCondenser could not be 
>> instantiated
>>   at java.util.ServiceLoader.fail(ServiceLoader.java:232)
>>   at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
>>   at 
>> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
>>   at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
>>   at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
>>   at 
>> org.apache.nifi.nar.ExtensionManager.loadExtensions(ExtensionManager.java:138)
>>   at 
>> org.apache.nifi.nar.ExtensionManager.discoverExtensions(ExtensionManager.java:104)
>>   at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:699)
>>   at org.apache.nifi.NiFi.<init>(NiFi.java:160)
>>   at org.apache.nifi.NiFi.main(NiFi.java:267)
>> Caused by: java.lang.NoClassDefFoundError: 
>> org/apache/nifi/processor/util/StandardValidators
>>   at 
>> com.jidmu.processors.JSONCondenser.JSONCondenser.<init>(JSONCondenser.java:58)
>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
>> Method)
>>   at 
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>   at 
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>   at java.lang.Class.newInstance(Class.java:442)
>>   at 
>> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
>>   ... 7 common frames omitted
>> Caused by: java.lang.ClassNotFoundException: 
>> org.apache.nifi.processor.util.StandardValidators
>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>   ... 14 common frames omitted
>>
>>
>>> On 7 Nov 2017, at 4:02 am, Joe Witt  wrote:
>>>
>>> Can you share the code by chance for a review?  Otherwise, you'll want
>>> to add a debugger at runtime and examine the context classloader as it
>>> goes through to see where it goes wonky.
>>>
>>> What version are you on?
>>>
>>> Thanks
>>>
>>> On Mon, Nov 6, 2017 at 11:58 AM, Phil H  wrote:
 Hi guys,

 I have just (today) started having issues with a new processor I've 
 written where a seemingly random class (e.g.: StandardValidators) fails to 
 load when NiFi is initializing. I've created this processor like all my 
 others (using the maven archetype) and the VM I'm running it on has not 
 changed versions of any software.

 I removed any reference to StandardValidators, and the NAR then loaded 
 successfully. As I added some more functionality, I then had the same 
 ClassNotFoundException occur with a different random class.

 Because I have used the maven archetype, all my other processors obviously 
 use StandardValidators, so it's not like the class itself is somehow 
 corrupted. This feels like either a NiFi or maybe even JVM bug (or other 
 issue that is surfacing as this ClassNotFoundException).

 Pulling my hair out here - would love some help!

 Cheers,
 Phil
>>
>


Re: FetchSFTP vs GetSFTP

2017-11-01 Thread Bryan Bende
The list-fetch approach sounds correct, and the micro acquisition
cluster (if necessary) also sounds like a good idea.

Regarding multiple hosts, the connection pooling in FetchSFTP does
account for that. It's basically a map from the hostname string to a
holder of connections for that hostname.

-Bryan

On Tue, Oct 31, 2017 at 7:55 PM, Ryan Ward <ryan.wa...@gmail.com> wrote:
> Yep that's exactly how I have it set up with a push to RPG. Is that
> preferred? I just started playing with it to be honest. I can see how it
> could be tricky if you have to pull from multiple servers each flow file
> could potentially have a different sftp host address in the queues.
>
> All together we have to pull from about 60 servers. If this doesn't work
> out with the list/fetch  I plan to have a micro acquisition cluster just
> for gets.
>
> Ryan
>
> On Oct 31, 2017 4:26 PM, "Bryan Bende" <bbe...@gmail.com> wrote:
>
>> Ryan,
>>
>> The 10 seconds appears to be a hard-coded rule in the processor,
>> although it seems like it could be turned into a configurable
>> property.
>>
>> It would require a code change to make it grab a batch of flow files
>> during a single execution. In theory it shouldn't provide that much of
>> a difference, but might be an interesting experiment. It makes the
>> code more challenging to write though, not that that's a reason not to
>> do it.
>>
>> If you have a 5 node cluster, you are doing List on primary node and
>> then redistributing the results to all the nodes via an RPG so all
>> nodes can fetch?
>>
>> -Bryan
>>
>>
>> On Tue, Oct 31, 2017 at 3:43 PM, Ryan Ward <ryan.wa...@gmail.com> wrote:
>> > Joe/Bryan Thanks!
>> >
>> > I believe the one specific file per concurrent task/connection (and too
>> > many threads) is the issue I have we have a lot of small files and often
>> > times backed up . I'm going to drop the task count to take advantage of
>> the
>> > pooling. Is it possible to have Fetch do batches vs a single file? Would
>> > that improve throughput? Also is that 10 seconds configurable?
>> >
>> > Some background: I'm converting 2 single nodes into a 5 node cluster and
>> > trying to figure out the best approach.
>> >
>> > Thanks again!
>> >
>> >
>> >
>> > On Tue, Oct 31, 2017 at 2:56 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> >
>> >> Ryan,
>> >>
>> >> Personally I don't have experience running these processors at scale,
>> >> but from a code perspective they are fundamentally different...
>> >>
>> >> GetSFTP is a source processor, meaning it is not being fed by an upstream
>> >> connection, so when it executes it can create a connection and
>> >> retrieve up to max-selects during that one execution.
>> >>
>> >> FetchSFTP is being told to fetch one specific file, typically through
>> >> attributes on incoming flow files, so the concept of max-selects
>> >> doesn't really apply because there is only one thing to select during an
>> >> execution of the processor.
>> >>
>> >> FetchSFTP does employ connection pooling behind the scenes such that
>> >> it will keep open a connection for each concurrent task, as long as
>> >> each connection continues to be used within 10 seconds.
>> >>
>> >> -Bryan
>> >>
>> >>
>> >> On Tue, Oct 31, 2017 at 11:43 AM, Joe Witt <joe.w...@gmail.com> wrote:
>> >> > Ryan - dont know the code specifics behind FetchSFTP off-hand but i
>> >> > can confirm there are users at that range for it.
>> >> >
>> >> > Thanks
>> >> >
>> >> > On Tue, Oct 31, 2017 at 11:38 AM, Ryan Ward <ryan.wa...@gmail.com>
>> >> wrote:
>> >> >> I've found that on a single node getSFTP is able to pull more files
>> off
>> >> a
>> >> >> remote server than Fetch in a cluster. I noticed Fetch doesn't have a
>> >> max
>> >> >> selects so it is requiring way more connections (one per file?) and
>> >> >> concurrent threads to keep up.
>> >> >>
>> >> >> Was wondering if anyone is using List/Fetch at scale? In the multi
>> TB's
>> >> a
>> >> >> day range?
>> >> >>
>> >> >> Thanks,
>> >> >> Ryan
>> >>
>>


Re: FetchSFTP vs GetSFTP

2017-10-31 Thread Bryan Bende
Ryan,

Personally I don't have experience running these processors at scale,
but from a code perspective they are fundamentally different...

GetSFTP is a source processor, meaning it is not being fed by an upstream
connection, so when it executes it can create a connection and
retrieve up to max-selects during that one execution.

FetchSFTP is being told to fetch one specific file, typically through
attributes on incoming flow files, so the concept of max-selects
doesn't really apply because there is only one thing to select during an
execution of the processor.

FetchSFTP does employ connection pooling behind the scenes such that
it will keep open a connection for each concurrent task, as long as
each connection continues to be used within 10 seconds.

-Bryan


On Tue, Oct 31, 2017 at 11:43 AM, Joe Witt  wrote:
> Ryan - dont know the code specifics behind FetchSFTP off-hand but i
> can confirm there are users at that range for it.
>
> Thanks
>
> On Tue, Oct 31, 2017 at 11:38 AM, Ryan Ward  wrote:
>> I've found that on a single node getSFTP is able to pull more files off a
>> remote server than Fetch in a cluster. I noticed Fetch doesn't have a max
>> selects so it is requiring way more connections (one per file?) and
>> concurrent threads to keep up.
>>
>> Was wondering if anyone is using List/Fetch at scale? In the multi TB's a
>> day range?
>>
>> Thanks,
>> Ryan


Re: Two issues relating to a processor I'm developing

2017-10-30 Thread Bryan Bende
Mike,

Regarding the licensing, I believe LGPL is a no-go for Apache projects.

Take a look here:
https://www.apache.org/legal/resolved.html#category-x

-Bryan


On Sat, Oct 28, 2017 at 4:47 PM, Mike Thomsen  wrote:
> The processor breaks down a much larger file into a huge number of small
> data points. We're talking like turning a 1.1M line file into about 2.5B
> data points.
>
> My current approach is "read a file with GetFile, save to /tmp, break down
> into a bunch of large CSV record batches (like a few hundred thousand
> records per group)" and then commit.
>
> It's slow, and with some good debugging statements, I can see the processor
> tearing into the data just fine. However, I am thinking about adding a
> variant to this which would be an "iterative" version that would follow
> this pattern:
>
> "read the file, save to /tmp, load the file, keep the current read position
> intact, every onTrigger call sends out a batch w/ session.commit() until
> it's done reading. Then grab the next flowfile."
>
> Does anyone have any suggestions on good practices to follow here,
> potential concerns, etc.? (Note: I have to write the file to /tmp because a
> library I am using which I don't want to fork doesn't have an API that can
> read from a stream rather than a java.io.File)
>
> Also, are there any issues with accepting a contribution that makes use of
> a LGPL-licensed library, in the event that my client wants to open source
> it (we think they will)?
>
> Thanks,
>
> Mike
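
As a side note on the "iterative" variant described above, a heavily stripped-down
sketch of that pattern is below. It is purely illustrative: the class name, batch
size, and relationship are invented, error handling and restart safety are omitted,
and because partially-read state is carried across triggers the processor has to be
limited to a single thread.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Set;

    import org.apache.nifi.annotation.behavior.TriggerSerially;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.exception.ProcessException;

    @TriggerSerially
    public class IterativeBreakdown extends AbstractProcessor {

        static final Relationship REL_SUCCESS = new Relationship.Builder().name("success").build();
        private static final int BATCH_SIZE = 100_000;

        private BufferedReader current;   // read position carried across onTrigger calls

        @Override
        public Set<Relationship> getRelationships() {
            return Collections.singleton(REL_SUCCESS);
        }

        @Override
        public void onTrigger(final ProcessContext context, final ProcessSession session) {
            try {
                if (current == null) {
                    final FlowFile input = session.get();
                    if (input == null) {
                        return;
                    }
                    // spool to a temp file because the parsing library wants a java.io.File
                    final Path tmp = Files.createTempFile("breakdown", ".dat");
                    session.exportTo(input, tmp, false);
                    session.remove(input);   // a real impl would keep the original until fully processed
                    current = Files.newBufferedReader(tmp);
                }

                final List<String> batch = new ArrayList<>();
                String line;
                while (batch.size() < BATCH_SIZE && (line = current.readLine()) != null) {
                    batch.add(line);
                }

                if (batch.isEmpty()) {
                    current.close();
                    current = null;          // finished; the next trigger grabs a new flow file
                } else {
                    FlowFile out = session.create();
                    out = session.write(out, rawOut -> {
                        for (final String l : batch) {
                            rawOut.write((l + "\n").getBytes(StandardCharsets.UTF_8));
                        }
                    });
                    session.transfer(out, REL_SUCCESS);
                }
                session.commit();
            } catch (final IOException e) {
                throw new ProcessException(e);
            }
        }
    }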


Re: Multiple users or group as initial admin identity

2017-10-20 Thread Bryan Bende
Hi Fredrik,

These are some good ideas.

If we did support multiple initial admins, I would suggest it be done
through multiple elements, rather than a comma-separated list, since
commas are part of a DN that could represent a single user.

We already support this pattern in the new user group provider:





Down in the policy provider we currently only support a single
property called "Initial Admin", but that could possibly be:





I would think groups could be done similarly by providing a group to
the user group provider and then declaring that group to be an admin,
possibly:



and



-Bryan


On Thu, Oct 19, 2017 at 10:56 AM, Fredrik Skolmli  wrote:
> Hi all.
>
> With the ability to populate NiFi with users and groups from LDAP (as of
> 1.4.0(?)), I'm running into a few tasks that could be avoided or improved.
>
> I would like to specify a group as the initial admin identity instead of a
> single user, enabling the group members to log in and do the initial setup
> of new NiFi instances.
>
> Another option, as a quickfix, would be to allow the initial admin identity
> property to be a comma separated value (i.e. "admin1,admin2").
>
> The latter would be a rather small patch to implement, but I would
> appreciate some feedback from the community on what the best and most reliable
> approach would be. Or if both would be considered.
>
> ..or are there any other ideas on the roadmap to solve this that I haven't
> found in JIRA or thought of myself?
>
> Thanks.
>
> BR,
> Fredrik


Re: Syslog processing from cisco switches to Splunk

2017-10-19 Thread Bryan Bende
If you can provide an example message we can try to see why
ListenSyslog says it is invalid.

I'm not sure that will solve the issue, but would give you something
else to try.

On Thu, Oct 19, 2017 at 8:38 AM, Andrew Psaltis
 wrote:
> Dave,
> To clarify you are using the PutUDP processor, not the PutSplunk processor?
>
> On Thu, Oct 19, 2017 at 7:31 AM, DAVID SMITH 
> wrote:
>
>> Hi
>> We are trying to do something which on the face of it seems fairly simple
>> but will not work. We have a Cisco switch which is producing syslogs;
>> normally we use ZoneRanger to send them to Splunk and the records are
>> shown. However, we want to do a bit of content routing, so we are using NiFi
>> 0.7.3 with a ListenUDP on port 514 and we can see the records coming in to
>> NiFi. Without doing anything to the records we use a PutUDP to send records
>> to the Splunk server; NiFi says they have sent successfully but they never
>> show in Splunk. We have used a ListenUDP on another NiFi and the records
>> transfer and look exactly the same as they were sent. We have also used
>> ListenSyslog and PutSyslog, but the ListenSyslog says the records are
>> invalid.
>> Has anyone ever managed to do this, and can you give us any guidance on what we
>> may be missing?
>> Many thanks
>> Dave
>
>
>
>
> --
> Thanks,
> Andrew


Re: Metric access for reporting tasks

2017-10-11 Thread Bryan Bende
There should be a DeprecationNotice annotation in the nifi-api module.

I believe the intent was to use this, and then later add some
visualization in the UI/docs to indicate what is deprecated.

Anyone else feel free to correct me here.
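
As a rough illustration of what applying it could look like (attribute names are
from memory, so double-check against the annotation in nifi-api; the class here is
a placeholder, not one of the real reporting tasks):

    import org.apache.nifi.annotation.documentation.DeprecationNotice;
    import org.apache.nifi.annotation.documentation.Tags;
    import org.apache.nifi.reporting.AbstractReportingTask;
    import org.apache.nifi.reporting.ReportingContext;

    @Deprecated
    @DeprecationNotice(reason = "Use the generic metrics reporting task with a MetricReporterService instead")
    @Tags({"metrics", "reporting"})
    public class OldStyleMetricsReportingTask extends AbstractReportingTask {
        @Override
        public void onTrigger(final ReportingContext context) {
            // existing behavior unchanged; the annotations only document the deprecation
        }
    }

If memory serves there are also attributes on DeprecationNotice for pointing at the
replacement classes, which would be the natural place to reference the new
implementations.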

On Wed, Oct 11, 2017 at 1:54 PM, Omer Hadari <hadari.o...@gmail.com> wrote:
> Yes it is separate, just read my e-mail again and I see I wasn't clear,
> sorry about that.
> Thank you for adding me to the list. I'll create a ticket for the refactor
> and start working on it soon.
> Also, what's the appropriate way to deprecate a
> reporting-task/service/processor?
>
>
> On Wed, Oct 11, 2017 at 8:48 PM, Bryan Bende <bbe...@gmail.com> wrote:
>
>> I just added you to the contributors list in JIRA so you should be
>> able to assign things to yourself.
>>
>> I think initially putting all the metric services into the same NAR
>> will probably be fine. If we add others in the future that bring in
>> any conflicting dependencies then we can re-evaluate.
>>
>> The API is already separate in nifi-metrics-reporter-service-api right?
>>
>> Someone could implement their own metrics service in another NAR by
>> having a NAR dependency on nifi-metrics-reporter-service-api-nar.
>>
>>
>> On Wed, Oct 11, 2017 at 1:23 PM, Omer Hadari <hadari.o...@gmail.com>
>> wrote:
>> > I looked at it and I think they could live under the same nar. That might
>> > be preferred since we want each implementation to depend on the same
>> > version of dropwizard-metrics, and including it in each nar is redundant
>> > and might even cause problems (correct me if I am wrong here).
>> >
>> > If you think these (or future) implementations might have conflicting
>> > dependencies I guess it's also possible to separate each into its own
>> > submodule, or even separate specific problematic nars but keep most in
>> the
>> > same one. Anyway, I think the api should be extracted to its own module
>> so
>> > that anyone who wants to implement their own service can do so without
>> > including modules they don't need.
>> >
>> > By the way, if it's OK of course, could you please add me to the jira so
>> > that the issue can be assigned to me once opened? Thank you!
>> >
>> > On Wed, 11 Oct 2017 at 17:56 Bryan Bende <bbe...@gmail.com> wrote:
>> >
>> >> Omer,
>> >>
>> >> I think adding the new versions that implement the new
>> >> MetricReporterService, and marking the old ones as deprecated makes
>> >> sense. They could potentially be removed on a major future release
>> >> like 2.0.0.
>> >>
>> >> Were you envisioning the DataDogMetricReportService and
>> >> AmbariMetricReportingService living along side the
>> >> GraphiteMetricReportingService in nifi-metrics-reporting-task? or
>> >> would the DataDog and Ambari implementations live inside their
>> >> respective NARs and just depend on the MetricsReportingService API?
>> >>
>> >> I haven't really looked at the dependencies to see if putting them all
>> >> in one NAR would cause any issues.
>> >>
>> >> I have slight concerns over whether the Ambari one can be easily
>> >> converted to the new approach, obviously it would be good if we can,
>> >> but we need to ensure we port over the exact logic it is using.
>> >>
>> >> Thanks,
>> >>
>> >> Bryan
>> >>
>> >>
>> >> On Wed, Oct 11, 2017 at 8:40 AM, Omer Hadari <hadari.o...@gmail.com>
>> >> wrote:
>> >> > So I have created a generic metric reporting task, and implemented a
>> >> > Graphite service for it (Thank you Bryan for the quick reviews and
>> >> > responses!), and I am up to implementing the DataDog and Ambari
>> reporting
>> >> > tasks in the same manner as well. I think it's important for avoiding
>> >> > confusion when implementing another reporting task, and for creating a
>> >> > uniform UI for metric reporting (the same task, different
>> implementations
>> >> > of the controller service).
>> >> > I don't think I can remove the old ones though (It will obviously
>> break
>> >> > flows that use them). What do you think is best practice here?
>> >> > Personally I think implementing a "double" and deprecate the old ones
>> in
>> >> > some way is OK.
>> >> >
>> >

Re: Metric access for reporting tasks

2017-10-11 Thread Bryan Bende
I just added you to the contributors list in JIRA so you should be
able to assign things to yourself.

I think initially putting all the metric services into the same NAR
will probably be fine. If we add others in the future that bring in
any conflicting dependencies then we can re-evaluate.

The API is already separate in nifi-metrics-reporter-service-api right?

Someone could implement their own metrics service in another NAR by
having a NAR dependency on nifi-metrics-reporter-service-api-nar.


On Wed, Oct 11, 2017 at 1:23 PM, Omer Hadari <hadari.o...@gmail.com> wrote:
> I looked at it and I think they could live under the same nar. That might
> be preferred since we want each implementation to depend on the same
> version of dropwizard-metrics, and including it in each nar is redundant
> and might even cause problems (correct me if I am wrong here).
>
> If you think these (or future) implementations might have conflicting
> dependencies I guess it's also possible to separate each into its own
> submodule, or even separate specific problematic nars but keep most in the
> same one. Anyway, I think the api should be extracted to its own module so
> that anyone who wants to implement their own service can do so without
> including modules they don't need.
>
> By the way, if it's OK of course, could you please add me to the jira so
> that the issue can be assigned to me once opened? Thank you!
>
> On Wed, 11 Oct 2017 at 17:56 Bryan Bende <bbe...@gmail.com> wrote:
>
>> Omer,
>>
>> I think adding the new versions that implement the new
>> MetricReporterService, and marking the old ones as deprecated makes
>> sense. They could potentially be removed on a major future release
>> like 2.0.0.
>>
>> Were you envisioning the DataDogMetricReportService and
>> AmbariMetricReportingService living along side the
>> GraphiteMetricReportingService in nifi-metrics-reporting-task? or
>> would the DataDog and Ambari implementations live inside their
>> respective NARs and just depend on the MetricsReportingService API?
>>
>> I haven't really looked at the dependencies to see if putting them all
>> in one NAR would cause any issues.
>>
>> I have slight concerns over whether the Ambari one can be easily
>> converted to the new approach, obviously it would be good if we can,
>> but we need to ensure we port over the exact logic it is using.
>>
>> Thanks,
>>
>> Bryan
>>
>>
>> On Wed, Oct 11, 2017 at 8:40 AM, Omer Hadari <hadari.o...@gmail.com>
>> wrote:
>> > So I have created a generic metric reporting task, and implemented a
>> > Graphite service for it (Thank you Bryan for the quick reviews and
>> > responses!), and I am up to implementing the DataDog and Ambari reporting
>> > tasks in the same manner as well. I think it's important for avoiding
>> > confusion when implementing another reporting task, and for creating a
>> > uniform UI for metric reporting (the same task, different implementations
>> > of the controller service).
>> > I don't think I can remove the old ones though (It will obviously break
>> > flows that use them). What do you think is best practice here?
>> > Personally I think implementing a "double" and deprecate the old ones in
>> > some way is OK.
>> >
>> > For reference here is the original ticket:
>> > https://issues.apache.org/jira/browse/NIFI-4392
>> > and here is the PR: https://github.com/apache/nifi/pull/2171
>> >
>> > Thank you!
>> >
>> > On Mon, Sep 18, 2017 at 6:07 PM, Andrew Hulbert <andrew.hulb...@ccri.com
>> >
>> > wrote:
>> >
>> >> Hi Omer,
>> >>
>> >> If you're interested in some help to implement, test, or review a
>> >> graphite/grafana metrics reporter please let me know! We have written a
>> >> very simple version and are interested in getting support into the main
>> >> codebase as well.
>> >>
>> >> -Andrew
>> >>
>> >>
>> >> On 09/17/2017 05:57 PM, Joe Witt wrote:
>> >>
>> >>> Omer
>> >>>
>> >>> Is the right list and it's awesome you want to contribute.
>> >>>
>> >>> Yes for sure such contribs are welcome.  Just need to be sure all
>> >>> libraries
>> >>> used including transitive deps are fair game as far as licensing goes
>> and
>> >>> are properly accounted for.
>> >>>
>> >>> As far as refactoring to avoid code duplication it could be helpful.

Re: Metric access for reporting tasks

2017-10-11 Thread Bryan Bende
Omer,

I think adding the new versions that implement the new
MetricReporterService, and marking the old ones as deprecated makes
sense. They could potentially be removed on a major future release
like 2.0.0.

Were you envisioning the DataDogMetricReportService and
AmbariMetricReportingService living along side the
GraphiteMetricReportingService in nifi-metrics-reporting-task? or
would the DataDog and Ambari implementations live inside their
respective NARs and just depend on the MetricsReportingService API?

I haven't really looked at the dependencies to see if putting them all
in one NAR would cause any issues.

I have slight concerns over whether the Ambari one can be easily
converted to the new approach, obviously it would be good if we can,
but we need to ensure we port over the exact logic it is using.

Thanks,

Bryan


On Wed, Oct 11, 2017 at 8:40 AM, Omer Hadari  wrote:
> So I have created a generic metric reporting task, and implemented a
> Graphite service for it (Thank you Bryan for the quick reviews and
> responses!), and I am up to implementing the DataDog and Ambari reporting
> tasks in the same manner as well. I think it's important for avoiding
> confusion when implementing another reporting task, and for creating a
> uniform UI for metric reporting (the same task, different implementations
> of the controller service).
> I don't think I can remove the old ones though (It will obviously break
> flows that use them). What do you think is best practice here?
> Personally I think implementing a "double" and deprecate the old ones in
> some way is OK.
>
> For reference here is the original ticket:
> https://issues.apache.org/jira/browse/NIFI-4392
> and here is the PR: https://github.com/apache/nifi/pull/2171
>
> Thank you!
>
> On Mon, Sep 18, 2017 at 6:07 PM, Andrew Hulbert 
> wrote:
>
>> Hi Omer,
>>
>> If you're interested in some help to implement, test, or review a
>> graphite/grafana metrics reporter please let me know! We have written a
>> very simple version and are interested in getting support into the main
>> codebase as well.
>>
>> -Andrew
>>
>>
>> On 09/17/2017 05:57 PM, Joe Witt wrote:
>>
>>> Omer
>>>
>>> Is the right list and it's awesome you want to contribute.
>>>
>>> Yes for sure such contribs are welcome.  Just need to be sure all
>>> libraries
>>> used including transitive deps are fair game as far as licensing goes and
>>> are properly accounted for.
>>>
>>> As far as refactoring to avoid code duplication it could be helpful.  You
>>> might want to just do a jira and PR to do yours in a nice and clean and
>>> reusable way and once that is done and in then do another jira and PR to
>>> clean up the others.
>>>
>>> Thanks
>>> Joe
>>>
>>> On Sep 16, 2017 2:44 PM, "Omer Hadari"  wrote:
>>>
>>> Hello,

 I hope I am writing to the correct mailing list.
 We use graphite in my organization, and recently started to use nifi.
 We went on to write a simple reporting task for graphite, and I figured
 it could be used by other people as well, so why not contribute it.
 I was looking at other reporting tasks though (DataDog and Ambari), and
 there seems to me that there is some code duplication in how they access
 metrics. They both use very similar classes in order to to that:
 org.apache.nifi.reporting.ambari.metrics.MetricsService
 org.apache.nifi.reporting.ambari.metrics.MetricNames
 org.apache.nifi.reporting.datadog.metrics.MetricsService
 org.apache.nifi.reporting.datadog.metrics.MetricNames

 They are not identical, but again - very similar. I think this
 functionality can be easily exported to some other module, in order for
 more reporting tasks that need to generally report the same metrics to be
 written more easily.
 My questions are:
 a. Are more metric reporting tasks (like graphite) welcome
 b. If the refactor I am suggesting is in order, will it belong in
 nifi-commons or is a new module for reporting tasks in order?

 I would be more than happy to implement any and all changes I have just
 suggested by myself, and am simply asking these questions in order to
 best
 fit into your conventions and workflow.

 Thank you in advance!


>>


Re: Funnel Queue Slowness

2017-10-09 Thread Bryan Bende
Peter,

The images didn’t come across for me, but since you mentioned that a failure 
queue is involved, is it possible all the flow files going to failure are being 
penalized which would cause them to not be processed immediately?

-Bryan

> On Oct 8, 2017, at 10:49 PM, Peter Wicks (pwicks)  wrote:
> 
> I’ve been running into an issue on 1.4.0 where my Funnel sometimes runs slow. 
> I haven’t been able to create a nice reproducible test case to pass on.
> What I’m seeing is that my failure queue on the right will start to fill up, 
> even though there is plenty of room for them in the next queue. You can see 
> that the Tasks/Time is fairly low, only 24 in the last 5 minutes (first 
> image), so it’s not that the FlowFile’s are moving so fast that they just 
> appear to be in queue.
> 
> If I stop the downstream processor the files slowly trickle through the 
> funnel into the next queue. I had an Oldest FlowFile First prioritizer 
> on the downstream queue. I tried removing it but there was no change in 
> behavior.
> One time where I saw this behavior in the past was when my NiFi instance was 
> thread starved, but there are plenty of threads available on the instance and 
> all other processors are running fine. I also don’t understand why it 
> trickles the FlowFile’s in, from what I’ve seen in the code Funnel grabs 
> large batches at one time…
> 
> Thoughts?
> 
> (Sometimes my images don’t make it, let me know if that happens.)
> 





Re: ListenTcpRecord

2017-10-09 Thread Bryan Bende
Clay,

Multiple packets should not be an issue since it is reading a stream of data 
from the socket, but I don’t think the prefixed length will work.

The data coming across has to be in a format that one of the record readers can 
understand. If you have JSON data and then have something additional added to 
it like the length, then it's not going to be readable by a standard JSON 
library.

You could potentially create a custom record reader to handle your format.
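
The framing part of such a reader is small; conceptually it is just a loop that
strips the prefix and hands each payload to a JSON or protobuf parser. A minimal
sketch, assuming a 4-byte big-endian length prefix (your client's framing may
differ):

    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStream;

    // Reads length-prefixed frames from a stream; each returned byte[] would then
    // be parsed into a record inside a custom record reader implementation.
    public class LengthPrefixedFrameReader {

        private final DataInputStream in;

        public LengthPrefixedFrameReader(final InputStream in) {
            this.in = new DataInputStream(in);
        }

        /** Returns the next frame, or null when the stream is exhausted. */
        public byte[] nextFrame() throws IOException {
            final int length;
            try {
                length = in.readInt();        // 4-byte big-endian length prefix (assumption)
            } catch (final EOFException eof) {
                return null;                  // clean end of stream
            }
            final byte[] payload = new byte[length];
            in.readFully(payload);            // blocks until the whole frame has arrived
            return payload;
        }
    }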

-Bryan

> On Oct 7, 2017, at 10:37 AM, Clay Teahouse <clayteaho...@gmail.com> wrote:
> 
> Thank you, Bryan. I was able to set up the flow using your template and
> then fix my own.
> 
> Can ListenTCPRecord deal with records spread across multiple packets? More
> specifically, can it handle payloads prefixed with the length? I receive
> JSON and protobuf messages from a client and each record is prefixed with
> the length of the message.
> 
> -Clay
> 
> On Thu, Oct 5, 2017 at 8:22 AM, Bryan Bende <bbe...@gmail.com> wrote:
> 
>> Have you tried using the template?
>> 
>> https://gist.githubusercontent.com/bbende/fa2bff34e721fef21453986336664c
>> b2/raw/db658c64f75fec47785ab63920ee23582bf1492f/multi_line_log_
>> processing.xml
>> 
>> If you import that it will give you the exact flow from my post and
>> all you have to do is start everything.
>> 
>> PutTCP is sending to ListenTCP, the connection coming out of PutTCP to
>> LogAttribute is for the failure relationship just to easily see if
>> anything fails.
>> 
>> -Bryan
>> 
>> 
>> On Thu, Oct 5, 2017 at 7:05 AM, Clay Teahouse <clayteaho...@gmail.com>
>> wrote:
>>> Thanks Bryan, for the feedback.
>>> 
>>> I don't seem to be able  to replicate the example in your blog. So, is
>> the
>>> flow  GenerateFlowFile -> TCPPUT --> LogAttribute and ListenTCPRecord
>>> -->LogAttribute?  Both TCPPut and ListenTCPRecord  are sending data to
>>> LogAttribute?  Shouldn't ListenTCPRecord be receiving the data data?
>> Also,
>>> I am not able to link ListenTCPRecord to any other processor.
>>> 
>>> In any case, I tried to send logs to ListenTCPRecord using netcat and
>> have
>>> ListenTCPRecord send the processed data to LogAttribute, but
>>> ListenTCPRecord doesn't seem to be listening (I don't see any activity on
>>> the processor). I'd appreciate if you let me what I am doing wrong.
>>> 
>>> thanks
>>> Clay
>>> 
>>> On Wed, Oct 4, 2017 at 9:14 AM, Bryan Bende <bbe...@gmail.com> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> I wrote a post that shows an example of using ListenTCPRecord with a
>>>> GrokReader to receive multi-line log messages. There is a link to a
>>>> template of the flow at the very end.
>>>> 
>>>> You could easily change the example so that PutTCP is sending a single
>>>> JSON document, or an array of JSON documents, and ListenTCPRecord is
>>>> using a JsonTreeReader.
>>>> 
>>>> We don't have a protobuf record reader so that currently isn't an
>> option.
>>>> 
>>>> Let us know if you have any other questions.
>>>> 
>>>> -Bryan
>>>> 
>>>> [1] https://bryanbende.com/development/2017/10/04/apache-
>>>> nifi-processing-multiline-logs
>>>> 
>>>> On Wed, Oct 4, 2017 at 8:39 AM, Clay Teahouse <clayteaho...@gmail.com>
>>>> wrote:
>>>>> Hi All,
>>>>> 
>>>>> Does anyone have an example of ListenTcpRecord processor in action,
>> say
>>>> for
>>>>> example, with a json or protobuf reader? I am specifically wondering
>>>> about
>>>>> the record length/prefix.
>>>>> 
>>>>> thanks
>>>>> Clay
>>>> 
>> 





Re: route flow based on variable

2017-10-09 Thread Bryan Bende
Ben,

1) Yes, the variables are hierarchical, so a variable at the root group would 
be visible to all components, unless there is a variable with the same name at 
a lower level which would override it.

2) I haven’t tried this, but I would expect that you should still be able to 
use RouteOnAttribute to route on a variable… Let's say that your root group has 
a variable “env” and in one environment you have this set to “dev” and in 
another environment you have it set to “prod”. You might have a part of your 
flow that you only run in “prod” so you put a RouteOnAttribute with something 
like ${env:equals(“prod”)} which would only enable this path in prod.

Variables on their own are not associated with flow files, so if you wanted to 
do something more specific per-flow file, then the only way would be what you 
described with using UpdateAttribute.

Thanks,

Bryan

> On Oct 8, 2017, at 10:44 PM, 尹文才  wrote:
> 
> Hi guys, I've played around with the latest NIFI 1.4.0 release for a while
> and I think the new variable registry feature is great, however I have 2
> questions about this feature:
> 
> 1. It seems that I could only add variables to a processor group, could I
> add a global variable in the NIFI root processor group so it could be used
> anywhere inside NIFI?
> 
> 2. I want to route to different flows based on the variables I added, but
> currently the only way I know that could make this work is like this:
> myprocessor->updateAttribute(add the variable into FlowFile attribute)->
> routeOnAttribute->different flows based on the variable value
> 
> I didn't find any routeOnVariable processor, is there any easier way that I
> could use to implement conditional flow in NIFI? Thanks
> 
> /Ben





Re: RecordReader and RecordWriter: development work on ruleengine

2017-10-05 Thread Bryan Bende
Uwe,

I don't think there is specific documentation on how to write code
using the record readers and writers, but the best example to look at
would be ConvertRecord

ConvertRecord actually extends from AbstractRecordProcessor:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/AbstractRecordProcessor.java

AbstractRecordProcessor does the following:

- Create a stream callback for a flow file
- Create a reader from the input stream of the flow file
- Create a writer for the output stream of the flow file
- Read each record from the reader and pass it to the writer

For each record it delegates to a "process" method to let sub-classes
take action on the record, but ConvertRecord doesn't need to do
anything because the act of taking a record from a reader for format 1
and passing it to writer for format 2, is conversion in itself.
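
In code, that loop boils down to something like the sketch below. Treat it as
pseudocode rather than a drop-in snippet: the RECORD_READER/RECORD_WRITER property
descriptors, the flowFile variable, and REL_SUCCESS are assumed to be defined as in
AbstractRecordProcessor, and the factory method signatures have shifted a bit
between releases, so the linked source is the authoritative reference.

    final RecordReaderFactory readerFactory =
            context.getProperty(RECORD_READER).asControllerService(RecordReaderFactory.class);
    final RecordSetWriterFactory writerFactory =
            context.getProperty(RECORD_WRITER).asControllerService(RecordSetWriterFactory.class);

    FlowFile output = session.write(flowFile, (in, out) -> {
        try (final RecordReader reader = readerFactory.createRecordReader(flowFile, in, getLogger());
             final RecordSetWriter writer = writerFactory.createWriter(getLogger(), reader.getSchema(), out)) {

            writer.beginRecordSet();
            Record record;
            while ((record = reader.nextRecord()) != null) {
                // the "process" hook: a transforming processor would modify the record here,
                // while ConvertRecord simply passes it straight through
                writer.write(record);
            }
            writer.finishRecordSet();
        } catch (final SchemaNotFoundException | MalformedRecordException e) {
            throw new ProcessException(e);
        }
    });
    session.transfer(output, REL_SUCCESS);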

Hope that helps.

-Bryan


On Thu, Oct 5, 2017 at 2:14 PM, Uwe Geercken  wrote:
> Hello,
>
> I would like to pick up development on my processor that uses a ruleengine. 
> It is working since several month already but I want to convert the processor 
> to use the new RecordReader/RecordWriter framework. This way I can use any 
> data format, which is really, really cool and will benefit it's purpose.
>
> What the processor does is to get a record and run it through the ruleengine 
> to evaluate results based on business rules. The business rules are written 
> in an external web application. So the benefit is that the business logic is 
> not embedded in the code (process). You can change the logic without touching 
> the code/process. The ruleengine uses Java reflection to construct objects 
> and run methods based on the rules and against the data. Using reflection I 
> can instantiate any Java object, which makes the ruleengine universally 
> usable. Btw: I have used the ruleengine with Pentaho ETL, Nifi, Kafka and 
> Hadoop - and from Java code of course.
>
> The ruleengine also knows "actions". Actions are fired based on the the 
> result of rules (group of rules). So this is a way to fire an event based on 
> logic. Again - all outside of the code. You can manipulate the records of the 
> flowfile this way, e.g.
>
> So I am also interested in the methods that RecordReader provides to get to 
> the data. I am confident that I can make a really easy implementation that 
> works for any data because the framework abstracts everything away that is 
> about data format.
>
> Now looking through code of the existing processors helps and is educational 
> but to get started a documentation would be very helpful. Maybe you can point 
> me to a sample or documentation which shows the usage of the api for 
> RecordReader/RecordWriter and how I get a hold on the data/content of the 
> flowfile.
>
> Thanks for your time. Nifi is getting better and better. I am totally 
> convinced that the ruleengine will benefit many people and help externalize 
> logic, so that a proper division of responsibilities is possible: between 
> those that write code/create flows and those that manage a complete set of 
> logic, which is in fact the knowledge of a company. Central logic and clean 
> code will benefit both quality and agility (time to implement changes).
>
> Greetings,
>
> Uwe


Re: ListenTcpRecord

2017-10-05 Thread Bryan Bende
Have you tried using the template?

https://gist.githubusercontent.com/bbende/fa2bff34e721fef21453986336664cb2/raw/db658c64f75fec47785ab63920ee23582bf1492f/multi_line_log_processing.xml

If you import that it will give you the exact flow from my post and
all you have to do is start everything.

PutTCP is sending to ListenTCP, the connection coming out of PutTCP to
LogAttribute is for the failure relationship just to easily see if
anything fails.

-Bryan


On Thu, Oct 5, 2017 at 7:05 AM, Clay Teahouse <clayteaho...@gmail.com> wrote:
> Thanks Bryan, for the feedback.
>
> I don't seem to be able  to replicate the example in your blog. So, is the
> flow  GenerateFlowFile -> TCPPUT --> LogAttribute and ListenTCPRecord
> -->LogAttribute?  Both TCPPut and ListenTCPRecord  are sending data to
> LogAttribute?  Shouldn't ListenTCPRecord be receiving the data? Also,
> I am not able to link ListenTCPRecord to any other processor.
>
> In any case, I tried to send logs to ListenTCPRecord using netcat and have
> ListenTCPRecord send the processed data to LogAttribute, but
> ListenTCPRecord doesn't seem to be listening (I don't see any activity on
> the processor). I'd appreciate it if you let me know what I am doing wrong.
>
> thanks
> Clay
>
> On Wed, Oct 4, 2017 at 9:14 AM, Bryan Bende <bbe...@gmail.com> wrote:
>
>> Hello,
>>
>> I wrote a post that shows an example of using ListenTCPRecord with a
>> GrokReader to receive multi-line log messages. There is a link to a
>> template of the flow at the very end.
>>
>> You could easily change the example so that PutTCP is sending a single
>> JSON document, or an array of JSON documents, and ListenTCPRecord is
>> using a JsonTreeReader.
>>
>> We don't have a protobuf record reader so that currently isn't an option.
>>
>> Let us know if you have any other questions.
>>
>> -Bryan
>>
>> [1] https://bryanbende.com/development/2017/10/04/apache-
>> nifi-processing-multiline-logs
>>
>> On Wed, Oct 4, 2017 at 8:39 AM, Clay Teahouse <clayteaho...@gmail.com>
>> wrote:
>> > Hi All,
>> >
>> > Does anyone have an example of ListenTcpRecord processor in action, say
>> for
>> > example, with a json or protobuf reader? I am specifically wondering
>> about
>> > the record length/prefix.
>> >
>> > thanks
>> > Clay
>>


Re: ListenTcpRecord

2017-10-04 Thread Bryan Bende
Hello,

I wrote a post that shows an example of using ListenTCPRecord with a
GrokReader to receive multi-line log messages. There is a link to a
template of the flow at the very end.

You could easily change the example so that PutTCP is sending a single
JSON document, or an array of JSON documents, and ListenTCPRecord is
using a JsonTreeReader.

We don't have a protobuf record reader so that currently isn't an option.

Let us know if you have any other questions.

-Bryan

[1] 
https://bryanbende.com/development/2017/10/04/apache-nifi-processing-multiline-logs

On Wed, Oct 4, 2017 at 8:39 AM, Clay Teahouse  wrote:
> Hi All,
>
> Does anyone have an example of ListenTcpRecord processor in action, say for
> example, with a json or protobuf reader? I am specifically wondering about
> the record length/prefix.
>
> thanks
> Clay


Re: [VOTE] Release Apache NiFi 1.4.0 (RC2)

2017-09-29 Thread Bryan Bende
+1 (binding)

- Ran through the release helper and everything checked out.
- Ran a couple of sample flows with no issues


On Fri, Sep 29, 2017 at 9:46 AM, James Wing  wrote:
> Jeff, I agree the updated KEYS file has been published.  Thanks.
>
> On Fri, Sep 29, 2017 at 6:00 AM, Jeff  wrote:
>
>> James,
>>
>> I had to do a hard reload of the page in Chrome, since the browser kept
>> showing me a cached version without my key.  After the hard reload, I can
>> see my key at https://dist.apache.org/repos/dist/dev/nifi/KEYS.  Could you
>> try opening the KEYS link in incognito mode and verify that my key is
>> there?
>>
>> Thanks,
>> Jeff
>>
>> On Fri, Sep 29, 2017 at 1:06 AM James Wing  wrote:
>>
>> > +1 (binding). I ran through the release helper including signature,
>> hashes,
>> > build, and testing the binary.  I checked the LICENSE and NOTICE files.
>> > Everything looks good to me.
>> >
>> > One thing I noted is that Jeff's GPG key is not yet in the public KEYS
>> file
>> > at https://dist.apache.org/repos/dist/dev/nifi/KEYS, but it is added in
>> > the
>> > master branch KEYS file to be published with the release.  I believe that
>> > is OK for the signature, we've done this before, and perhaps we should
>> > consider changing the helper text in the future.
>> >
>> > Thanks, Jeff, for putting this release together.
>> >
>> >
>> > On Thu, Sep 28, 2017 at 12:54 PM, Jeff  wrote:
>> >
>> > > Hello,
>> > >
>> > > I am pleased to be calling this vote for the source release of Apache
>> > NiFi
>> > > nifi-1.4.0.
>> > >
>> > > The source zip, including signatures, digests, etc. can be found at:
>> > > https://repository.apache.org/content/repositories/orgapachenifi-
>> > >
>> > > The Git tag is nifi-1.4.0-RC2
>> > > The Git commit ID is e6508ba7d3da5bba54abd6233a7a8f9dd4c32151
>> > > https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=
>> > > e6508ba7d3da5bba54abd6233a7a8f9dd4c32151
>> > >
>> > > Checksums of nifi-1.4.0-source-release.zip:
>> > > MD5: 41e4083e602883a3e180032f32913414
>> > > SHA1: 26770625138126f45bed4989adb0a6b65a767aa2
>> > >
>> > > Release artifacts are signed with the following key:
>> > > https://people.apache.org/keys/committer/jstorck.asc
>> > >
>> > > KEYS file available here:
>> > > https://dist.apache.org/repos/dist/release/nifi/KEYS
>> > >
>> > > 199 issues were closed/resolved for this release:
>> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>> > > projectId=12316020=12340589
>> > >
>> > > Release note highlights can be found here:
>> > > https://cwiki.apache.org/confluence/display/NIFI/
>> > > Release+Notes#ReleaseNotes-Version-1.4.0
>> > >
>> > > The vote will be open for 72 hours.
>> > > Please download the release candidate and evaluate the necessary items
>> > > including checking hashes, signatures, build
>> > > from source, and test.  The please vote:
>> > >
>> > > [ ] +1 Release this package as nifi-1.4.0
>> > > [ ] +0 no opinion
>> > > [ ] -1 Do not release this package because...
>> > >
>> >
>>


Re: [VOTE] Release Apache NiFi 1.4.0

2017-09-25 Thread Bryan Bende
I think the reason for the upgrade issue was the following...

Normally there is an automatic upgrade of component versions, with the
following logic:

- If the flow says you are using version X of a component, and during
startup version X is not found, but version Y is found, and version Y
is the only version of that component, then version Y is selected.

- If the flow says you are using version X of a component, and during
startup more than one version of the component is found, then we can't
automatically select one, so a ghost component would be created as a
place-holder.

This is how all the components would normally go from 1.3.0 to 1.4.0
on an upgrade.

In Richard's flow, he was using the 1.3.0 XMLFileLookupService, and
when he upgraded it found a 1.4.0 version from the lookup services
NAR, and also a 1.4.0 version from the Mongo services NAR, and
therefore fell into the second case described above.

Deleting the service and re-creating it is one way to resolve the
issue, I also believe you could go into the controller services table
and select to "Change Version" on the service and select the version
from the lookup services NAR.


On Mon, Sep 25, 2017 at 12:28 PM, Matt Burgess  wrote:
> All,
>
> I verified that Joey is correct and that dependency causes the
> duplicates. I reopened NIFI-4345 and submitted a PR.
>
> Regards,
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-4345
> [2] https://github.com/apache/nifi/pull/2174
>
> On Mon, Sep 25, 2017 at 11:45 AM, Richard St. John  
> wrote:
>> Joey,
>>
>> That sounds like that is the issue.
>>
>> Rick.
>>
>> --
>> Richard St. John, PhD
>> Asymmetrik
>> 141 National Business Pkwy, Suite 110
>> Annapolis Junction, MD 20701
>>
>> On Sep 25, 2017, 11:44 AM -0400, Joey Frazee , wrote:
>>> I think there could be an issue with the deps in the 
>>> nifi-mongodb-services-nar. It includes nifi-lookup-services which should 
>>> either be unnecessary or should just be provided scope (just need the 
>>> services API dependency). So it’s possible that all the impls in 
>>> nifi-lookup-services are indeed included twice.
>>>
>>> Does that jive with what you’re seeing? I.e., for LookupService properties 
>>> do you see double of everything?
>>>
>>> -joey
>>>
>>> On Sep 25, 2017, 10:30 AM -0500, Richard St. John , 
>>> wrote:
>>> > Joe,
>>> >
>>> > The issue I encountered was related to, I believe, the packaging of the 
>>> > mongodb lookup service.  I am using the XMLlookup service and have a 
>>> > processor with a reference to the XML lookup service.  When I upgraded 
>>> > from 1.3 to 1.4, the processor became invalid due to “incompatible type” 
>>> > of service.  The lookup attribute processor appeared to be attempting to 
>>> > use the mongodb lookup service.  I re-added the xml lookup service, being 
>>> > careful to use the one in the nifi-lookup-services-nar and not the one 
>>> > packaged in the nifi-mongodb-services-nar.  After doing that, the lookup 
>>> > attribute processor was valid and able to link to the xml lookup service.
>>> >
>>> > Rick.
>>> >
>>> > --
>>> > Richard St. John, PhD
>>> > Asymmetrik
>>> > 141 National Business Pkwy, Suite 110
>>> > Annapolis Junction, MD 20701
>>> >
>>> > On Sep 25, 2017, 10:55 AM -0400, Joe Witt , wrote:
>>> > > -1 (binding) based on what Rick ran into.
>>> > >
>>> > > Otherwise though the release is looking good. I'm running through a
>>> > > series of tests now and things going well.
>>> > >
>>> > > Rick,
>>> > > I agree there are duplicate controller services and sourced to the
>>> > > mongo system. And we must fix/remove those.
>>> > >
>>> > > However, the issue for upgrading is one I'd like to better understand.
>>> > > What is the problem you're seeing? It is not required that controller
>>> > > services have unique class names. The requirement is that the
>>> > > artifact/coordinate is unique across the class name/extension
>>> > > bundle/version. So let's figure out why this is actually breaking you.
>>> > >
>>> > > Thanks
>>> > > Joe
>>> > >
>>> > > On Mon, Sep 25, 2017 at 10:47 AM, Richard St. John 
>>> > >  wrote:
>>> > > > -1 non-binding.
>>> > > >
>>> > > > There are duplicate lookup services registered and it’s causing issues
>>> > > > upgrading from 1.3.0 to 1.4.0. It seems to be related to the mongo 
>>> > > > lookup
>>> > > > service.
>>> > > >
>>> > > > Rick.
>>> > > >
>>> > > > --
>>> > > > Richard St. John, PhD
>>> > > > Asymmetrik
>>> > > > 141 National Business Pkwy, Suite 110
>>> > > > Annapolis Junction, MD 20701
>>> > > >
>>> > > > On Sep 24, 2017, 9:15 PM -0400, Jeff , wrote:
>>> > > >
>>> > > > There is an error in my previous email. 192 issues were closed and
>>> > > > resolved for this release.
>>> > > >
>>> > > > On Sun, Sep 24, 2017 at 9:11 PM Jeff  wrote:
>>> > > >
>>> > > > Hello,
>>> > > >
>>> > > > I am pleased 

Re: Reg. Nifi Clustering Load Balancing

2017-08-29 Thread Bryan Bende
Hello,

You can run a standard HTTP load-balancer in front of ListenHTTP and have
your producers use the URL of the load-balancer.

Nginx or apache httpd can be used.

Thanks,

Bryan


On Tue, Aug 29, 2017 at 11:40 AM, mayank rathi 
wrote:

> Does this help?
>
> [image: Inline image 1]
>
> On Tue, Aug 29, 2017 at 11:07 AM, Nishant Gupta <
> nishantgupta1...@gmail.com> wrote:
>
>> Hello Sir/Madam,
>>
>> I need to know how we can implement load balancing and single point access
>> to all data producer
>>
>> I have suppose 100 machines that produce data.(flowfiles)
>> That data is sent over HTTP from 100 machines to nifi and that we can
>> access using listenHTTP processor in Nifi.(Currently I am able to do that
>> for single node Nifi)
>> Is there any way to provide 100 machines with a single URL (for all 4-5
>> nodes in Nifi Cluster)
>>
>> Can you please let me know how to achieve it.??
>>
>> Thanks and Regards,
>> Nishant Gupta
>>
>
>
>
> --
> NOTICE: This email message is for the sole use of the intended
> recipient(s) and may contain confidential and privileged information. Any
> unauthorized review, use, disclosure or distribution is prohibited. If you
> are not the intended recipient, please contact the sender by reply email
> and destroy all copies of the original message.
>


Re: determine if instance is a Cluster Coordinator or Primary Node

2017-08-29 Thread Bryan Bende
Mark,

I don't believe there is currently anything like this in Authorizer API.

You would likely have to build something similar to what processors have...

In ProcessorInitializationContext they get access to a NodeType which
tells them if they are currently primary or not.

Then they can annotate a method with @PrimaryNodeStateChange to get
notified when primary node changes.
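
A rough sketch of that processor-side pattern, with the caveat that I'm
writing the type and annotation names from memory (I believe the full
annotation name is @OnPrimaryNodeStateChange -- please double-check
the exact signatures against nifi-api):

import org.apache.nifi.annotation.notification.OnPrimaryNodeStateChange;
import org.apache.nifi.annotation.notification.PrimaryNodeState;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.ProcessorInitializationContext;
import org.apache.nifi.processor.exception.ProcessException;

public class PrimaryAwareProcessor extends AbstractProcessor {

    private volatile boolean primary;

    @Override
    protected void init(final ProcessorInitializationContext context) {
        // the initialization context exposes whether this node is currently primary
        primary = context.getNodeTypeProvider().isPrimary();
    }

    @OnPrimaryNodeStateChange
    public void onPrimaryNodeStateChange(final PrimaryNodeState newState) {
        // invoked by the framework whenever the elected primary node changes
        primary = (newState == PrimaryNodeState.ELECTED_PRIMARY_NODE);
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
        if (!primary) {
            return; // only do the work on the primary node
        }
        // actual work would go here
    }
}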

-Bryan



On Tue, Aug 29, 2017 at 8:08 AM, Mark Bean  wrote:
> Is there a way to get access to Cluster configuration state? Specifically,
> can a Node determine which Node - or simply "itself" - is the Cluster
> Coordinator or the Primary Node?
>
> Use case: I have a custom authorizer which includes a background thread to
> re-authorize users and policies in case a user's credentials have changed.
> This thread can potentially change authorizations.xml and users.xml files
> which are kept in sync with ZooKeeper. I do not want each Node to execute
> the process making the same changes. It would be desirable to execute this
> process on only one Node (Coordinator or Primary) and let ZooKeeper
> coordinate the changes across the Cluster.
>
> Thanks,
> Mark


Re: how to execute code when processor is stopping

2017-08-11 Thread Bryan Bende
Ben,

I apologize if I am not understanding the situation, but...

In the case where your OnScheduled code is in a retry loop, if someone
stops the processor it will call your OnUnscheduled code which will
set the flag to bounce out of the loop. This sounds like what you
want, right?

In the case where OnScheduled times out, the framework calls
OnUnscheduled, which would call your code to set the flag, but wouldn't
that no longer matter at that point, since you aren't looping anymore
anyway?

If the framework calls OnScheduled again, your code should set the
flag back to whatever it needs to be to start looping again right?

An alternative that might avoid some of this would be to lazily
initialize the connection in the onTrigger method of the processor.
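
Just to make that lifecycle concrete, here is a minimal sketch of the
flag approach (the class and method names are invented for the example,
and the actual connection/sleep logic is elided):

import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.nifi.annotation.lifecycle.OnScheduled;
import org.apache.nifi.annotation.lifecycle.OnUnscheduled;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;

public abstract class RetryOnScheduledProcessor extends AbstractProcessor {

    private final AtomicBoolean scheduled = new AtomicBoolean(false);

    @OnScheduled
    public void openConnection(final ProcessContext context) {
        scheduled.set(true);
        while (scheduled.get()) {
            try {
                // try to open the database connection here; return once it succeeds
                return;
            } catch (final Exception e) {
                getLogger().warn("Connection attempt failed, will retry", e);
                // sleep briefly between attempts so the loop doesn't spin
            }
        }
    }

    @OnUnscheduled
    public void stopRetrying() {
        // called when the processor is stopped, and also after an OnScheduled
        // timeout, so either way the retry loop above bounces out
        scheduled.set(false);
    }
}

With the lazy approach you would drop the loop entirely and just attempt
the connection at the top of onTrigger, yielding the processor if it fails.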

-Bryan


On Fri, Aug 11, 2017 at 9:16 AM, 尹文才  wrote:
> thanks Pierre, my case is that I need to implement a database connection
> retry logic inside my OnScheduled method, when the database is not
> available I will retry until the connection is back online.
> The problem is when the database is offline it will throw timed out
> execution exception inside OnScheduled and then call OnUnscheduled. But
> when I manually stop the processor the OnUnsheduled
> will also get called. I know my logic sounds a little weird but I need to
> set some flag in the OnUnscheduled method to stop the retry logic inside
> OnScheduled in order to be able to stop the processor,
> otherwise the processor is not able to be stopped unless I restart the
> whole NIFI.
>
> Regards,
> Ben
>
> 2017-08-11 17:18 GMT+08:00 Pierre Villard :
>
>> Oh OK, get it now!
>>
>> Not sure what's your use case, but I don't think you can do that unless you
>> set some information when the process actually executes onTrigger for the
>> first time and you then check this value in your OnUnscheduled annotated
>> method.
>>
>> Pierre
>>
>> 2017-08-11 10:11 GMT+02:00 尹文才 :
>>
>> > Hi Pierre, I've checked the developer guide before I sent the email and
>> > according to the developer guide, the method annotated with OnUnScheduled
>> > will be called in 2 cases according to my understanding, please correct
>> me
>> > if I'm wrong:
>> > 1. when user tries to stop the processor in the NIFI UI, thus the
>> processor
>> > is no longer scheduled to run in this case, and the method will be
>> called.
>> > 2. when method annotated with OnScheduled throws exceptions, for example
>> > time out execution exception, the OnUnScheduled method will also be
>> called.
>> >
>> > My question is how to tell the first scenario from the second one?
>> Thanks.
>> >
>> > Regards,
>> > Ben
>> >
>> > 2017-08-11 15:51 GMT+08:00 Pierre Villard :
>> >
>> > > Hi Ben,
>> > >
>> > > You might want to have a look here:
>> > > https://nifi.apache.org/docs/nifi-docs/html/developer-
>> > > guide.html#component-lifecycle
>> > >
>> > > Pierre
>> > >
>> > > 2017-08-11 9:06 GMT+02:00 尹文才 :
>> > >
>> > > > Hi guys, I'm trying to execute some code in my processor when the
>> > > processor
>> > > > is asked to stop in the NIFI UI by the user, I checked the developer
>> > > guide
>> > > > and only find OnUnscheduled will be called when the processor is no
>> > long
>> > > > scheduled to run. I've tested this OnUnscheduled, it will also be
>> > called
>> > > > after timed out executing OnScheduled task. So is there a way to
>> > execute
>> > > > some code only when the processor is stopping?
>> > > >
>> > > > Regards,
>> > > > Ben
>> > > >
>> > >
>> >
>>


Re: get controller service's configuration

2017-08-10 Thread Bryan Bende
The way controller services are setup you have the following...

- DBCPService interface (provides getConnection()) extends
ControllerService interface (empty interface to indicate it is a CS)
- DBCPConnectionPool extends AbstractControllerService implements DBCPService
- Processor XYZ depends on DBCPService interface

The DBCPService interface is the common point between the processor
and the implementations. The processor XYZ classpath only knows about
the DBCPService interface, it doesn't know anything about the classes
that implement it... there could actually be several implementations
in different NARs, but it is up to the framework to provide access to
these.

Since the processor only depends on the interface, which in this case
only exposes getConnection(), you can't really assume the service has
certain properties because DBCPConnectionPool.DB_DRIVERNAME is
specific to the DBCPConnectionPool implementation... another
implementation may not have that property, or may call it something
different. The interface would have to provide getDriverName() so that
each implementation could provide that.
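
As a sketch of what that would look like, here is a hypothetical extended
interface (the name and method are illustrative only -- the real
DBCPService does not expose a driver name today):

import java.sql.Connection;

import org.apache.nifi.controller.ControllerService;

public interface DescribableDBCPService extends ControllerService {

    // same contract processors rely on today
    Connection getConnection();

    // would let a processor ask for the configured driver class name without
    // knowing which implementation (DBCPConnectionPool or otherwise) is behind it
    String getDriverName();
}

The processor would then depend on this interface, and each implementation
would be responsible for reporting its own driver name.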

-Bryan


On Thu, Aug 10, 2017 at 4:33 AM, 尹文才  wrote:
> Thanks Andy, I've tried your approach, in my case the controller service is
> a DBCPConnectionPool and when I tried to get driver class name property
> through context.getProperty(DBCPConnectionPool.DB_DRIVERNAME).getValue(),
> but the value is null.
> method getConfigurationContext() to get configuration context, but the
> method is protected. So I still didn't find a feasible way to get the
> controller service's properties.
>
> Regards,
> Ben
>
> 2017-08-10 12:18 GMT+08:00 Andy LoPresto :
>
>> You can get the current property values of a controller service from the
>> processor by using the ProcessContext object. For example, in GetHTTP [1],
>> in the @OnScheduled method, you could do:
>>
>> context.getControllerServiceLookup().getControllerService("my-
>> controller-service-id”);
>>
>> context.getProperty("controller-service-property-name");
>> context.getProperty(SomeControllerService.CONSTANT_PROPERTY_DESCRIPTOR);
>>
>> I forget if context.getProperty() will give the controller service
>> properties as well as the processor properties. If it doesn’t, you can cast
>> the retrieved ControllerService into AbstractControllerService or the
>> concrete class and access available properties directly from the
>> encapsulated ConfigurationContext.
>>
>> [1] https://github.com/apache/nifi/blob/master/nifi-nar-
>> bundles/nifi-standard-bundle/nifi-standard-processors/src/
>> main/java/org/apache/nifi/processors/standard/GetHTTP.java#L295
>>
>> Andy LoPresto
>> alopre...@apache.org
>> *alopresto.apa...@gmail.com *
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>>
>> On Aug 9, 2017, at 6:57 PM, 尹文才  wrote:
>>
>> Thanks Koji, I checked the link you provided and I think getting a
>> DataSource is no different than getting the DBCP service (they could just
>> get the connection). Actually I was trying to get the configured driver
>> class to check the database type.
>>
>> Regards,
>> Ben
>>
>> 2017-08-10 9:29 GMT+08:00 Koji Kawamura :
>>
>> Hi Ben,
>>
>> I'm not aware of ways to obtain configurations of a controller from a
>> processor. Those should be encapsulated inside a controller service.
>> If you'd like to create DataSource instance instead of just obtaining
>> a connection, this discussion might be helpful:
>> https://github.com/apache/nifi/pull/1417
>>
>> Although I would not recommend, if you really need to obtain all
>> configurations, you can do so by calling NiFi REST API from your
>> processor.
>>
>> Thanks,
>> Koji
>>
>> On Thu, Aug 10, 2017 at 10:09 AM, 尹文才  wrote:
>>
>> Hi guys, I have a customized processor with a DBCP controller service as
>>
>> a
>>
>> property. I could get the DBCP controller service in my code, but does
>> anyone know how to obtain all the configurations of the DBCP controller
>> service in java code(e.g. Database Connection URL, Database Driver
>> Location, etc) Thanks.
>>
>> Regards,
>> Ben
>>
>>
>>
>>


Re: Removing unneeded nar files from NiFi standard distribution.

2017-08-08 Thread Bryan Bende
Toivo,

Besides the Jetty NAR and the framework NAR, you should be able to
remove most of the other NARs without any negative impact.

The hope is that eventually most of these NARs can live in an
extension repository and then people can pick and choose which
processors to add to their distribution.

-Bryan

On Tue, Aug 8, 2017 at 7:57 AM, Toivo Adams  wrote:
> Hi,
>
> In the lib directory there are a lot of nar files which we don’t use.
> I assume unneeded nar files can be deleted?
> Any negative consequences?
>
> I assume start time may improve and maybe RAM consumption?
> Or is it not worth the trouble?
>
> Thanks
> Toivo
>
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/Removing-unneeded-nar-files-from-NiFi-standard-distribution-tp16599.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: Avro Reader reading from FlowFile throws SchemaNotFoundException

2017-08-07 Thread Bryan Bende
Hello,

I think I see what the problem is now...

The exception in your second email is coming from the CSV writer which
is set to get the schema from the "Schema Text" property, which is in
turn set to ${avro.schema}.

I believe what you showed in the section ### FLOWFILE ### is the
content of the flow file which is different than a flow file
attribute, so I don't think you actually have a flow file attribute
named avro.schema.

Ideally what you really want here is for the CSV writer to just use
the same schema that the reader used. In the unreleased code which is
currently set for 1.4.0-SNAPSHOT, there is a new Schema Access
Strategy called "Inherit Record Schema" which is what you want to
choose on the CSV writer.

In order to make it work in 1.3.0, you would need to take the schema
generated from QueryDatabaseTable and paste the schema text into the
"Schema Text" property on the CSV writer.

Alternatively you could create an AvroSchemaRegistry and declare the
schema in there, and then reference it by name.

-Bryan


On Mon, Aug 7, 2017 at 1:17 AM, Frederik  wrote:
> Hey,
>
> no problem. Please find the template file below. Let me know if I can
> provide anything else. thanks
>
> convert_record_template.xml
> 
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/Avro-Reader-reading-from-FlowFile-throws-SchemaNotFoundException-tp16574p16590.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: Avro Reader reading from FlowFile throws SchemaNotFoundException

2017-08-04 Thread Bryan Bende
Hello,

I'm assuming this error came from #2 when you tried to use Schema Text
set to ${avro.schema} ?

The error means your flow file doesn't have an attribute called
avro.schema, which it would need to have if you reference
${avro.schema}.

What were the results using Embedded Avro Schema? That should work.

-Bryan


On Thu, Aug 3, 2017 at 11:10 PM, Frederik  wrote:
> Hi, I created a FlowFile with QueryDataBaseTable and want to convert this
> straight to CSV via the ConvertRecord processor. I tried the AvroReader with
> the following Schema Access Strategies
>
> 1. Use Embedded Avro Schema
> 2. Use 'Schema Text' Property and Schema Text set to ${avro.schema}
>
> My FlowFile has the following start structure.
> Obj^A^B^Vavro.schema<8a>^B{"type":"record","name":"customer_crm_summary","namespace":"any.data","fields":[{"name":"device_tac_code","type":["null","string"]}]}
> I would think this all seems ok, but the AvroReader fails with the
> SchemaNotFoundException shown below.
> I tested this with Nifi 1.2 and 1.3
> Any ideas on what this could be? thanks
>
> 2017-08-04 14:41:45,236 ERROR [Timer-Driven Process Thread-1]
> o.a.n.processors.standard.ConvertRecord
> ConvertRecord[id=ab1cb1cb-015d-1000-4818-59412b1f3b2b] Failed to process
> records for
> StandardFlowFileRecord[uuid=6019524b-c23e-4b5f-977c-8be056aa217f,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1501814390874-1, container=default,
> section=1], offset=0,
> length=4090093],offset=0,name=7952490691251418,size=4089924]; will route to
> failure: org.apache.nifi.schema.access.SchemaNotFoundException: FlowFile did
> not contain appropriate attributes to determine Schema Text
> org.apache.nifi.schema.access.SchemaNotFoundException: FlowFile did not
> contain appropriate attributes to determine Schema Text
> at
> org.apache.nifi.schema.access.AvroSchemaTextStrategy.getSchema(AvroSchemaTextStrategy.java:46)
> at
> org.apache.nifi.serialization.SchemaRegistryService.getSchema(SchemaRegistryService.java:112)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:89)
> at com.sun.proxy.$Proxy121.getSchema(Unknown Source)
> at
> org.apache.nifi.processors.standard.AbstractRecordProcessor.onTrigger(AbstractRecordProcessor.java:106)
> at
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> at
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
> at
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
> at
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
> at
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2017-08-04 14:41:45,243 ERROR [Timer-Driven Process Thread-1]
> o.a.n.processors.standard.ConvertRecord
> ConvertRecord[id=ab1cb1cb-015d-1000-4818-59412b1f3b2b] Failed to process
> records for
> StandardFlowFileRecord[uuid=1414d8a0-c033-4d8d-b2a3-944c6124894f,cl
>
>
>
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/Avro-Reader-reading-from-FlowFile-throws-SchemaNotFoundException-tp16574.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: ERR_BAD_SSL_CLIENT_AUTH_CERT error after configuring secure cluster

2017-07-31 Thread Bryan Bende
Hello,

I think you should only make one call to the toolkit which should
generate a CA, the server certs, and the client cert all at the same
time. The -C flag is for the client cert which you already had on the
first call so I think it generated it already.

By running it twice like above, the first time is generating a CA and
server certs for servers 101-103, the second time its generating a new
CA, a server cert for server101, and a client cert, so now you are
using a client cert that was generated from a different CA than the
server certs.

-Bryan



On Mon, Jul 31, 2017 at 1:02 PM, nifi-san  wrote:
> Hello Experts,
>
> I have secured my three node nifi cluster and followed the links below:-
>
> https://pierrevillard.com/2016/11/29/apache-nifi-1-1-0-secured-cluster-setup/
>
> https://pierrevillard.com/tag/tls-toolkit/
>
> The only difference is that I used the toolkit standalone mode to generate
> the required certs.
>
> In spite of generating the client certificate with the command below, I see
> the following error on my browser:-
>
> "ERR_BAD_SSL_CLIENT_AUTH_CERT"
>
> Below are the commands used to generate the certificates and keystores:-
>
> tls-toolkit.sh standalone -n 'server10[1-3]xj.domain.com' -C 'CN=admin,
> OU=NIFIORG' -o.
>
> Client Cert:-
>
> tls-toolkit.sh standalone -n 'server101.domain.com' -C 'CN=admin,
> OU=NIFIORG' -o.
>
> Tried generating the client certificate using "localhost" as well instead of
> "server101.domain.com" but that did not help either.
>
> The cluster has come up successfully and is listening on the SSL port. Also, the
> users.xml and authorizations.xml have been populated properly with the
> initial Admin which is "CN=admin, OU=NIFIORG".
>
> I imported the cert created in p12 format into the browser, but every time I
> try to access the UI, I get the same error.
>
> Tried regenerating the certs for all the nodes and created a fresh new
> client cert as well but that did not help.
>
> I could not see any error in the logs but at the same time there was no
> authentication request in the user logs for the user "CN=admin, OU=NIFIORG".
>
> Appreciate any pointers how to resolve this issue.
>
>
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/ERR-BAD-SSL-CLIENT-AUTH-CERT-error-after-configuring-secure-cluster-tp16538.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: Sending Parquet files to S3

2017-07-31 Thread Bryan Bende
Hello,

The PutParquet processor uses the Hadoop client to write to a filesystem.

For example, to write to HDFS you would have a core-site.xml with a
filesystem like:


<property>
    <name>fs.defaultFS</name>
    <value>hdfs://yourhost</value>
</property>

And to write to a local filesystem you could have a core-site.xml with:


<property>
    <name>fs.defaultFS</name>
    <value>file:///</value>
</property>

If there is a way to declare s3 as a filesystem, then I would expect
it to work, but I am not familiar with doing that.

The alternative would be what you suggested where you would write to
HDFS first, and then use ListHDFS -> FetchHDFS -> PutS3Object.

Thanks,

Bryan


On Fri, Jul 28, 2017 at 2:17 PM, shitij  wrote:
> Hi,
> For sending parquet files to s3, can I use the PutParquet processor
> directly, giving it an s3 path or do I first write to HDFS and then use
> PutS3Object?
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/Sending-Parquet-files-to-S3-tp16525.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: RTC clarification

2017-07-07 Thread Bryan Bende
I agree with encouraging reviews from everyone, but I lean towards
"binding" reviews coming from committers.

If we allow any review to be binding, there could be completely
different levels of review that occur...

There could be someone who isn't a committer yet, but has been
contributing already and will probably do a very thorough review of
someone's contribution, and there could be someone else who we never
interacted with us before and writes "+1 LGTM" on a PR and we have no
idea if they know what they are talking about or if they even tried
the contribution to see if it works. Obviously a committer can also
write "+1 LGTM", but I think when that comes from a committer it holds
more weight.

I think we may also want to clarify if we are only talking about
"submitted by committer, reviewed by non-committer" or also talking
about "submitted by non-committer, reviewed by non-committer".

For the first case I can see the argument that since the contribution
is from a committer who is already trusted, they can make the right
judgement call based on the review. Whereas in the second case, just
because a community member submitted something and another community
member says it looks good, doesn't necessarily mean a committer should
come along and automatically merge it in.


On Fri, Jul 7, 2017 at 4:13 AM, Michael Hogue
 wrote:
> Thanks for fielding the question, Tony.
>
> Joe and James' statements both make sense. I suppose a case by case
> analysis could be carried out, too. For example, since I'm mostly
> unfamiliar with the code base but am looking to gain familiarity, I'm
> reviewing pretty straightforward or trivial PRs. My plan was to continue
> doing that until I felt comfortable reviewing something with a larger
> impact, such as the new TCPListenRecord processor implementation [1].
> However, as Tony explained, my original question was whether that sort of
> review would be binding or whether I should be doing it at all. I think
> both of those questions were answered here in that ultimately committer
> sign off is needed, but reviews may be binding regardless of source.
>
> Thanks for the feedback!
>
> Mike
>
>
> [1] https://github.com/apache/nifi/pull/1987
> On Fri, Jul 7, 2017 at 01:14 James Wing  wrote:
>
>> We should definitely encourage review feedback from non-committers.
>> Getting additional perspectives, interest, and enthusiasm from users is
>> critical for any project, doubly so for an integrating application where
>> committers cannot be experts in all the systems we are integrating with.  I
>> believe NiFi could use more review bandwidth.  Are we missing out on potential
>> reviewers because of the current policy?
>>
>> I do not have any experience with non-committer "binding reviews" as
>> described in the Apache Gossip thread.  How would that work?  Wouldn't a
>> committer have to review the review and decide to commit?  If we knew the
>> reviewer well enough to accept their judgement, why not make them a
>> committer?
>>
>> My expectation is that many non-committer reviews are helpful and
>> constructive, but not necessarily 100% comprehensive.  Reviewers might
>> comment on the JIRA ticket without working with the PR, or try the proposed
>> change without reviewing the code, tests, etc.  All great stuff, but
>> backstopped by committers.
>>
>> Thanks,
>>
>> James
>>
>> On Thu, Jul 6, 2017 at 7:30 PM, Joe Witt  wrote:
>>
>> > It is undefined at this point and I agree we should reach consensus
>> > and document it.
>> >
>> > I am in favor making non-committer reviews binding.
>> >
>> > Why do we do RTC:
>> > - To help bring along new committers/grow the community
>> > - To help promote quality by having peer reviews
>> >
>> > Enabling non-committer reviews to be binding still allows both of
>> > those to be true.
>> >
>> > On Thu, Jul 6, 2017 at 10:10 PM, Tony Kurc  wrote:
>> > > All, I was having a discussion with Mike Hogue - a recent contributor -
>> > off
>> > > list, and he had some questions about the review process. And largely
>> the
>> > > root of the question was, if a non-committer reviews a patch or PR
>> (which
>> > > Mike has spent some time doing), is that considered a "review"? I
>> didn't
>> > > have the answers, so I went on a hunt for documentation. I started with
>> > the
>> > > Contributor Guide [1]. The guide describes reviewing, and calls out a
>> > > Reviewer role, but doesn't specifically point out that Reviewer is a
>> > > committer, just that a committer "can actively promote contributions
>> into
>> > > the repository", and goes on to imply the non-committers can review.
>> > >
>> > > Given this, I was unable to answer this question:
>> > > If a committer "X" submits a patch or PR, it is reviewed by a
>> > non-committer
>> > > "Y", does that review satisfy the RTC requirement, and "X" may merge in
>> > the
>> > > patch?
>> > >
>> > > I found a related discussion on the 

Re: Custom NAR interfering with BundleUtils.findBundleForType

2017-07-05 Thread Bryan Bende
Scott,

Thanks for providing the stacktrace... do any of your custom
processors use the @DefaultSchedule annotation? And if so, do any of
them set the number of concurrent tasks to less than 1?

The exception you are getting comes from code that prevents using 0 or
a negative number of tasks for a processor that is not scheduled as
event driven; event driven is the only strategy where it would make
sense to have fewer than 1 task.
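
For reference, a valid default schedule for a timer-driven processor
declares at least one concurrent task, something along these lines
(attribute and package names written from memory, so double-check them):

import org.apache.nifi.annotation.configuration.DefaultSchedule;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.scheduling.SchedulingStrategy;

@DefaultSchedule(strategy = SchedulingStrategy.TIMER_DRIVEN, period = "1 min", concurrentTasks = 1)
public abstract class MyCustomProcessor extends AbstractProcessor {
    // processor body elided; the point is the default of 1 concurrent task
}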

What Joe mentioned about including other standard processors in your
NAR could very well still be a problem, but that stacktrace might be
something else.

-Bryan


On Wed, Jul 5, 2017 at 12:44 PM, Scott Wagner  wrote:
> Hi Joe,
>
> We are extending AbstractProcessor for our processors, and
> AbstractControllerService for our controller service.  However, we did
> include the InvokeHTTP processor with some modifications that are
> referencing some other classes that are in the nifi-processors-standard JAR.
> I will look into breaking those out to remove that dependency.
>
> The actual error that we are getting is below:
>
> 2017-07-04 12:28:29,076 WARN [main] org.apache.nifi.web.server.JettyServer
> Failed to start web server... shutting down.
> org.apache.nifi.controller.serialization.FlowSynchronizationException:
> java.lang.IllegalArgumentException
> at
> org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:426)
> at
> org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1576)
> at
> org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:84)
> at
> org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:722)
> at
> org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:533)
> at
> org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:72)
> at
> org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:876)
> at
> org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:532)
> at
> org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:839)
> at
> org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:344)
> at
> org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1480)
> at
> org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1442)
> at
> org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:799)
> at
> org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:261)
> at
> org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:540)
> at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
> at
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
> at
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> at
> org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:290)
> at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> at org.eclipse.jetty.server.Server.start(Server.java:452)
> at
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
> at
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> at org.eclipse.jetty.server.Server.doStart(Server.java:419)
> at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at
> org.apache.nifi.web.server.JettyServer.start(JettyServer.java:705)
> at org.apache.nifi.NiFi.(NiFi.java:160)
> at org.apache.nifi.NiFi.main(NiFi.java:267)
> Caused by: java.lang.IllegalArgumentException: null
> at
> org.apache.nifi.controller.StandardProcessorNode.setMaxConcurrentTasks(StandardProcessorNode.java:620)
> at
> org.apache.nifi.controller.StandardFlowSynchronizer.updateProcessor(StandardFlowSynchronizer.java:985)
> at
> 

Re: Communication between flows

2017-07-05 Thread Bryan Bende
Steve,

In 1.2.0 there were some new processors added called Wait/Notify...

With those you could send your original JSON (before splitting) to a
Wait processor and tell it to wait until the signal count is equal to
the number of splits, then you could put a Notify processor right
after PutMongo connected to the success relationship. For example, if
100 JSON documents get split out, the Wait processor is waiting for
100 signals or until it times out, and signals are only sent after
successful insertion to Mongo. You can checkout this blog for an
example [1].

In 1.1.2, you might be able to put a MergeContent processor configured
to run in defragment mode connected to the success relationship of
PutMongo. Defragment mode is used to undo the splitting that was done
by an upstream processor, so it will only defragment and merge back
together if all the fragments made it through.

-Bryan

[1] https://ijokarumawak.github.io/nifi/2017/02/02/nifi-notify-batch/


On Wed, Jul 5, 2017 at 12:46 PM, Byers, Steven K (Steve) CTR USARMY
MEDCOM JMLFDC (US)  wrote:
> Is there a mechanism or technique for communicating the results of a flow 
> file to its "sister" flow files?
>
> Here is a high-level description of what I am doing:
>
> Input to my flow is a JSON array of documents that get split (SplitJson) into 
> individual documents and each document becomes a distinct flow file.  Each 
> document (flow file) gets validated against a JSON schema (ValidateJson) then 
> gets updated into a Mongo collection (PutMongoUpdate).  At the end of all 
> this, I want to do some post processing but only if all documents processed 
> successfully.
>
> When a failure occurs (either in the validation or the Mongo update) is there 
> a way to communicate that to the success branch of the flow process so a 
> decision can be made about whether to proceed to post processing or not.
>
> I am using NiFi 1.1.2
>
> Thank you for any guidance you can offer,
>
> Steve
>
>
>


Re: HBase security label support

2017-06-29 Thread Bryan Bende
Mike,

I don't know of any work being done or any JIRAs that exist for this,
but seems like it would be good to support them. Most likely its just
that no one has asked for it yet.

I'd go ahead and create a JIRA, or if you were planning to incorporate
it into the HBase record processors then that sounds good too.

Thanks,

Bryan

On Thu, Jun 29, 2017 at 1:03 PM, Mike Thomsen  wrote:
> Are there any plans for implementing HBase security labels?
>
> Thanks,
>
> Mike


Re: conversion from AVRO file format to Parquet file format

2017-06-28 Thread Bryan Bende
Rohith,

Can you share more details about how you have configured PutParquet?
What Record Reader are you using and what Schema Access Strategy?

If your data is already in Avro then you would need to set the Record
Reader to an AvroRecordReader. The AvroRecordReader can be configured
to use the schema from the Avro datafile, or from a schema registry.

Then you have to configure the write schema directly in PutParquet
through the 'Schema Access Strategy'. In the future there will be an
option to just write with the same schema as the reader, but currently
the read and write schemas are separate.

The easiest thing to do is probably to create an AvroSchemaRegistry
and add your schema to it, then have the AvroRecordReader reference
this by name, and also have the PutParquet reference it by name, this
way they are ensured to use the same schema.

Let us know if this does not make sense.

Thanks,

Bryan


On Wed, Jun 28, 2017 at 4:39 AM, rohithkumars  wrote:
> Hello Team,
>
> Our team needs to convert data flow files from AVRO to PARQUET.
>
> We found two processors to do that.
>
> 1. Get_Parquet
> 2. Put_Parquet
>
> Is it possible to convert the binary AVRO file format to Parquet? The source
> schema is not getting automatically generated for Parquet.
>
> We also tried manually creating the schema and writing it out as Parquet,
> but no luck.
>
> Please guide us on this.
>
> Thanks,
> Rohith
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/conversion-from-AVRO-file-format-to-Parquet-file-format-tp16278.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: How to ingest files into HDFS via Apache NiFi from non-hadoop environment

2017-06-23 Thread Bryan Bende
Yes, I think running NiFi on edge nodes would make sense, this way
they can access the public network to receive data, but also access
HDFS on the private network.


On Fri, Jun 23, 2017 at 4:24 PM, Mothi86  wrote:
> Hi Bryan,
>
> Greetings and appreciate your instant reply. The data nodes are in a private
> network inside the hadoop cluster and NiFi is away from the hadoop cluster on a
> separate non-hadoop server. If we need NiFi to have access to the data nodes,
> does that mean we need to have NiFi within the cluster? Something like an edge
> node or management node which has access to the public network for twitter
> access or so, and also the private network of the data nodes.
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/How-to-ingest-files-into-HDFS-via-Apache-NiFi-from-non-hadoop-environment-tp16247p16249.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: How to ingest files into HDFS via Apache NiFi from non-hadoop environment

2017-06-23 Thread Bryan Bende
Hello,

Every node where NiFi is running must be able to connect to the data
node process on every node where HDFS is running. I believe the
default port for the HDFS data node process is usually 50010.

I'm assuming your 4 worker nodes are running HDFS, so NiFi would have
to access those.

-Bryan


On Fri, Jun 23, 2017 at 3:37 PM, Mothi86  wrote:
> Apache NiFi is installed on non-hadoop environment and targets to ingest
> processed files into HDFS (Kerberized cluster - 4 management node and 1 edge
> node on public network and 4 worker nodes on private network).
>
> Is it a workable solution to achieve the above use case? I face multiple errors
> even after performing the activities below. As a temporary alternative, I have
> installed NiFi on the edge node and everything works fine, but please advise if
> there is anything additional I have to do to make the above use case work.
>
> * Firewall restriction between NiFi and management server is open and ports
> (22,88,749,389) are open.
> * Firewall restriction between NiFi and edge node server is open and ports
> (22, 2181,9083) are open
> * krb5.conf file from hadoop cluster along with keytab for application user
> is copied to NiFi server. Running kinit using application user and keytab -
> successful token is listed under klist.
> * SSH operation is successful and also SFTP into hadoop server works fine.
> * configured hdfs-site.xml and core-site.xml files into NiFi.
>
> 
> 
>
>
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/How-to-ingest-files-into-HDFS-via-Apache-NiFi-from-non-hadoop-environment-tp16247.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

