Re: PutSolrContentStream Doc and DocID confusion

2016-04-20 Thread Bryan Bende
Ok, so it sounds like the incoming JSON is valid. A couple more questions
then...

What version of Solr are you using?
Can you insert the same JSON document outside of NiFi with success?

In one of the more recent versions (5.2 or 5.3 maybe), the Solr admin
console added a page to the UI where you can insert documents:
https://cwiki.apache.org/confluence/display/solr/Documents+Screen

If you don't have that screen in the UI then maybe curl:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers

Example:
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/my_collection/update/json/docs' \
  --data-binary '
{
  "id": "1",
  "title": "Doc 1"
}'

The reason I am interested in inserting the document outside of NiFi is
because NiFi is really not doing anything other than streaming the JSON to
the provided path.


On Wed, Apr 20, 2016 at 2:04 PM, dale.chang13 
wrote:

> Hi Brian,
>
> Yes, the JSON object I am storing is a valid JSON document. The Content
> Payload is set to true and the value is:
>
> {"docid":"a1602677-fc7c-43ea-adba-c1ed945ede3d_1831"}
>
>
> I believe I would have gotten a JSON syntax error saying that the JSON
> object was invalid.
>
> ---
>
> I have a PutSolrContentStream that routes FlowFiles to a LogAttribute on
> /connection_failure/ or /failure/
>
> Here is what I see in the Bulletin
>
> 14:55:15 EDT   ERROR 1ed45988-8ad6-3252-bd6d-7410b6dba8fd
> localhost:8181
> PutSolrContentStream[id=1ed45988-8ad6-3252-bd6d-7410b6dba8fd] Failed to
> send
>
> StandardFlowFileRecord[uuid=6b929b50-57d9-4178-b276-43eea977569a,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461178507835-17,
> container=default,
> section=17], offset=173166, length=122],offset=0,name=block.msg,size=122]
> to
> Solr due to org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://localhost:8983/solr/cobra_shard1_replica3:
> [doc=doj_civ_fraud_ws1_a1602677-fc7c-43ea-adba-c1ed945ede3d_1649] missing
> required field: docid; routing to failure:
> org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://localhost:8983/solr/cobra_shard1_replica3:
> [doc=document_a1602677-fc7c-43ea-adba-c1ed945ede3d_1649] missing required
> field: docid
>
> 14:55:15 EDT WARNING 0897791f-cc5d-4276-b5ce-76e610ce1478
> localhost:8181
> LogAttribute[id=0897791f-cc5d-4276-b5ce-76e610ce1478] logging for flow file
>
> StandardFlowFileRecord[uuid=8f1efd1e-7f71-44bf-8b88-bb890dce5905,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461178507835-17,
> container=default,
> section=17], offset=173898, length=122],offset=0,name=CANCELLED  Capital
> Allocations w/ Naveen, D Port, M Walkeryour ofc.msg,size=122]
> --
> Standard FlowFile Attributes
> Key: 'entryDate'
> Value: 'Wed Apr 20 14:54:59 EDT 2016'
> Key: 'lineageStartDate'
> Value: 'Wed Apr 20 14:54:59 EDT 2016'
> Key: 'fileSize'
> Value: '122'
> FlowFile Attribute Map Content
> Key: 'docid'
> Value: 'a1602677-fc7c-43ea-adba-c1ed945ede3d_1831'
> --
> {"docid":"a1602677-fc7c-43ea-adba-c1ed945ede3d_1831"}
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/PutSolrContentStream-Doc-and-DocID-confusion-tp9400p9411.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: PutSolrContentStream Doc and DocID confusion

2016-04-20 Thread Bryan Bende
Hello,

Just to clarify, do you have LogAttribute with the "Log Payload" property
set to true?

If the actual output of LogAttribute is
/[*docid*="95663ced-6a3b-4356-877a-7c5707c046e7_779"]/
then that is not a valid JSON document.

You would need the content of the flow file to be something like the
following:

{
  "docid" : "95663ced-6a3b-4356-877a-7c5707c046e7_779"
}

Can you verify that the payload is a valid JSON document like the one above
and then we can go from there.

Thanks,

Bryan


On Wed, Apr 20, 2016 at 1:26 PM, dale.chang13 
wrote:

> While using PutSolrContentStream to store a JSON object in SolrCloud, I've
> been running into this issue of being unable to store a document. I've
> uploaded a solr schema that says that the field *docid* is required and a
> string. Attempting to store a document in solr, this is the error I get:
>
> Failed to send StandardFlowFileRecord[...] to Solr due to
> org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at localhost:8983/solr/cobra_shard1_replica3:
> /[*doc*="95663ced-6a3b-4356-877a-7c5707c046e7_779"]/ missing required
> field:
> *docid*; routing to failure:
>
> HOWEVER,
>
> Using LogAttribute to print out the JSON object stored as the FlowFile's
> content and specifically docid, it has a key-value pair
> /[*docid*="95663ced-6a3b-4356-877a-7c5707c046e7_779"]/, which is the same
> string that is printed out in the error from PutSolrContentStream.
>
> My question: /Is there some confusion between the way Nifi uses *doc* and
> the attribute *docid*?/ It referred to the document via
> /[*doc*="95663ced-6a3b-4356-877a-7c5707c046e7_779"]/ after shard3.
>
> Additionally, it looks like replica3 is the only shard that has a problem in
> my SolrCloud instance.
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/PutSolrContentStream-Doc-and-DocID-confusion-tp9400.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: [DISCUSS] From Contributor to Committer

2016-04-20 Thread Bryan Bende
So are we saying as a community that a contributor has to first become a
committer, and then only after continued, consistent engagement could be
considered for PMC?

I don't have any issue with that approach, although it is not exactly what
I thought when we first created the two tiers.

On Wed, Apr 20, 2016 at 11:10 AM, Joe Witt  wrote:

> Tony,
>
> There appears to be consensus around these thoughts.  Perhaps we
> should document this on a Wiki page?
>
> I think generally for committer status it would be good to see a
> number of these things for a period of time and then for PMC status to
> see those contributions continue and ideally expand for a longer
> duration.  Another few months?
>
> Thanks
> Joe
>
> On Wed, Apr 13, 2016 at 2:58 PM, Joe Witt  wrote:
> > Tony,
> >
> > I agree with the points you raise and the completeness of the
> > perspective you share. I do think we should add to that a focus on
> > licensing and legal aspects.
> >
> > - The individual has shown an effort to aid the community in producing
> > releases which are consistent with ASF licensing requirements and the
> > guidance followed in the Apache NiFi community to adhere to those
> > policies.  This understanding could be shown when introducing new
> > dependencies (including transitive) by ensuring that all licensing and
> > notice updates have occurred.  Another good example is flagging
> > potentially copyrighted or insufficiently cited items like Skora found
> > recently in the Kafka tests.  One of our most important jobs as a
> > community is to put out legal releases and that is certainly a team
> > effort!
> >
> > Thanks
> > Joe
> >
> > On Sun, Apr 10, 2016 at 10:56 PM, Sean Busbey  wrote:
> >> Thanks for starting this Tony!
> >>
> >> As a PMC member, I really try to focus on things that help the
> >> community where we tend to have limited bandwidth: reviews weigh
> >> heavily, as does helping out new folks on users@, and doing public
> >> talking/workshops.
> >>
> >> I also am inclined to vote in favor of folks who show the kind of
> >> project dedication that we expect from PMC members. While we still
> >> need to do a better job of describing those things, at the moment I'm
> >> thinking of things like voting on release candidates, watching out for
> >> our trademarks, and investing the time needed to handle our licensing
> >> responsibilities.
> >>
> >> On Sun, Apr 10, 2016 at 9:38 AM, Tony Kurc  wrote:
> >>> First off, I recommend this reading this page to understand what the
> Apache
> >>> NiFi PMC draws from when making a decision
> >>>
> >>> http://community.apache.org/contributors/index.html
> >>>
> >>> I thought it would be helpful for me to walk through how I interpret
> that
> >>> guidance, and what that means for NiFi. For those that didn't read,
> there
> >>> are four aspects of contribution that are worth considering someone for
> >>> committership: community, project, documentation and code. Really, the
> >>> committer decision comes down to: has this person built up enough
> merit in
> >>> the community that I have a high degree of confidence that I trust
> him/her
> >>> with write access to the code and website.
> >>>
> >>> Given that merit and trust are subjective measures, how does the PMC
> make
> >>> those decisions? We, the PMC, have attempted to make this as
> evidence-based
> >>> as possible. When discussing a contributor for being considered for
> >>> committer access, we attempt to put together a corpus of interaction
> in the
> >>> community, both negative and positive, and use this as a basis for
> >>> discussion. The interaction with the community can include:
> >>>
> >>> - Interaction on the mailing lists - is this person helping others? Is
> this
> >>> person using the community to enhance his/her understanding of the
> project
> >>> or the apache foundation?
> >>> - Code contributions - is this person contributing code that advances
> the
> >>> project? How important is the code? Is this a niche capability, a core
> >>> capability? How challenging was the code? Was the code improving the
> >>> quality of the project (bug fix, adding  tests, or code that comes
> along
> >>> with comprehensive unit and/or integration tests). How does this person
> >>> react to criticism of his/her contribution? Is this person reacting
> >>> positively to patch or pull request feedback? Is the code high quality?
> >>> - Assisting others with their contributions - is this person providing
> >>> useful comments on pull requests or patches? Is this person testing new
> >>> features/functionality and providing feedback on the mailing list?
> >>> - Participating in project votes and discussions: is this person
> helping to
> >>> verify releases? Providing input to the roadmap? Is this person using
> the
> >>> lists to get feedback on features he/she plan to implement?
> >>> - Documentation contributions - is this person helping the 

Re: Compression of Data in HDFS

2016-04-07 Thread Bryan Bende
Hi James,

It looks like there may be a typo in what I wrote... I had $[path} but it
should be ${path}
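
So the PutHDFS Directory property should end up looking like this (using the
destination path from earlier in this thread):

/landing/teradata/compressed/prodeiw_arc/${path}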

Sorry about that, can you let us know if that worked.

Thanks,

Bryan

On Thu, Apr 7, 2016 at 3:05 AM, jamesgreen 
wrote:

> Hi Bryan
>
> I tried what you suggested but it just creates a path called
> "/landing/teradata/compressed/prodeiw_arc/$[path}" ?
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Compression-of-Data-in-HDFS-tp8821p8861.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: [VOTE] Release Apache NiFi 0.6.1 (RC1)

2016-04-07 Thread Bryan Bende
+1 (binding)

Ran through release helper and everything checked out.
Ran example flows and functioned as expected.

On Wed, Apr 6, 2016 at 11:16 PM, Aldrin Piri  wrote:

> +1, binding
>
> Build and tests worked on OS X and Windows 7.  Branch looked good on our
> Travis build and ran some flow templates from reviews from some of the
> associated issues with anticipated results.
>
> Verified hashes and signatures.
>
> On Wed, Apr 6, 2016 at 5:33 PM, Joe Witt  wrote:
>
> > Hello Apache NiFi Community,
> >
> > I am pleased to be calling this vote for the source release of Apache
> > NiFi 0.6.1.
> >
> > The source zip, including signatures, digests, etc. can be found at:
> >  https://dist.apache.org/repos/dist/dev/nifi/nifi-0.6.1/
> >
> > The Git tag is nifi-0.6.1-RC1
> > The Git commit hash is d51b24e18356b76ce649c8285af0806acd9071d0
> > *
> >
> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=d51b24e18356b76ce649c8285af0806acd9071d0
> > *
> >
> https://github.com/apache/nifi/commit/d51b24e18356b76ce649c8285af0806acd9071d0
> >
> > Checksums of nifi-0.6.1-source-release.zip:
> > MD5: b82a7fe60e03b679d7c06f57319fa396
> > SHA1: 32c45d51d1e1858eaba1df889711c69c17b44781
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/joewitt.asc
> >
> > KEYS file available here:
> > https://dist.apache.org/repos/dist/release/nifi/KEYS
> >
> > 11 issues were closed/resolved for this release:
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020=12335496
> > Release note highlights can be found here:
> >
> >
> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version0.6.1
> >
> > The vote will be open for 72 hours.
> > Please download the release candidate and evaluate the necessary items
> > including checking hashes, signatures, build from source, and test. Then
> > please vote:
> >
> > [ ] +1 Release this package as nifi-0.6.1
> > [ ] +0 no opinion
> > [ ] -1 Do not release this package because...
> >
> > Thanks!
> >
>


Re: Compression of Data in HDFS

2016-04-06 Thread Bryan Bende
Ok one more question...

On GetHDFS are you setting the Directory to
"\landing\databasename\prodeiw_arc\"
and then setting Recurse Sub-Directories to true to have it go into each
table's directory?

The reason I ask is because the FlowFiles coming out of GetHDFS have an
attribute on them called Path, the documentation says:

The path is set to the relative path of the file's directory on HDFS. For
example, if the Directory property is set to /tmp, then files picked up
from /tmp will have the path attribute set to "./". If the Recurse
Subdirectories property is set to true and a file is picked up from
/tmp/abc/1/2/3, then the path attribute will be set to "abc/1/2/3"

So theoretically if you were pointing to "\landing\databasename\prodeiw_arc\"
and then it recursed into "\landing\databasename\prodeiw_arc\tablename",
the path attribute would end up being "tablename".

You could then reference this in your PutHDFS processor by setting the
Directory to "/landing/teradata/compressed/prodeiw_arc/$[path}"



On Wed, Apr 6, 2016 at 8:46 AM, jamesgreen 
wrote:

> Hi Brian, Thanks for the help!
>
> I have tried two ways
> a.
> 1.  I use GetHDFS to retrieve data from the HDFS , I then use putHDFS
> and set
> the compression to GZIP.
> 2.  In the Directory I am putting the complete path i.e
> /landing/teradata/compressed/prodeiw_arc
> b.
> 1.   I use GetHDFS to retrieve data from the HDFS, I then use Compress
> Content to apply the compression and then use PutHDFS
> 2.  In the Directory I am putting the complete path i.e
> /landing/teradata/compressed/prodeiw_arc
>
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Compression-of-Data-in-HDFS-tp8821p8825.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Compression of Data in HDFS

2016-04-06 Thread Bryan Bende
Hello,

Can you describe your flow a bit more?

Are you using ListHDFS + FetchHDFS to retrieve the data from HDFS?

What value do you have for the Directory property in PutHDFS?

Thanks,

Bryan

On Wed, Apr 6, 2016 at 7:12 AM, jamesgreen 
wrote:

> I am trying to compress a whole lot of files from my HDFS and write to
> another folder on the HDFS
> My Folder Structure is as follows:
> \landing\databasename\prodeiw_arc\tablename\_SUCCESS
> \landing\databasename\prodeiw_arc\tablename\part-m-0
>
> \landing\databasename\prodeiw_arc\tablename2\_SUCCESS
> \landing\databasename\prodeiw_arc\tablename2\part-m-0
>
> I am trying to compress to the following
> \landing\compressed\prodeiw_arc\tablename\_SUCCESS
> \landing\compressed\prodeiw_arc\tablename\part-m-0
>
> \landing\compressed\prodeiw_arc\tablename2\_SUCCESS
> \landing\compressed\prodeiw_arc\tablename2\part-m-0
>
> I have found that it compresses to
> \landing\compressed\prodeiw_arc\_SUCCESS
> \landing\compressed\prodeiw_arc\tablename\part-m-0
>
> it will then continue to overwrite. Is there anyway I can keep the
> directory
> structure when doing a PutHDFS?
>
> Thanks and Regards
>
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Compression-of-Data-in-HDFS-tp8821.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: How to pass variables to nifi Processor

2016-04-06 Thread Bryan Bende
Hello,

For properties that support expression language, they can reference
properties passed in through the bootstrap.conf file.
So you could define a property in bootstrap.conf like -Dfoo=abc and in a
processor property reference that as ${foo}.
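
For example, a rough sketch for the GetSFTP case from your question (the
property name sftp.hostname is just an illustration, and the java.arg number
only needs to be one that is not already used in bootstrap.conf):

# bootstrap.conf
java.arg.15=-Dsftp.hostname=sftp-dev.example.com

# GetSFTP -> Hostname property (assuming it supports expression language in
# your version; the processor usage docs list which properties do)
${sftp.hostname}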

There are plans for this to be improved in the future through the idea of a
variable registry:
https://cwiki.apache.org/confluence/display/NIFI/Variable+Registry

You may also want to take a look at this project that provides a way to
deploy templates from one instance to another:
https://github.com/aperepel/nifi-api-deploy

Thanks,

Bryan


On Tue, Apr 5, 2016 at 1:32 PM, Rajeswari Raghunathan - Contractor <
rajeswari.raghunathan.contrac...@8451.com> wrote:

> Hi Team,
>
> We are using NIFI in our company to migrate all data from SFTP server to
> HDFS.
> We have different environment like dev, test, prod and  want to accomplish
> Continuous integration with NIFI. In order to do so, we need to separate
> all sensitive data like server hostname,username,password from hardcoding
> in Processor (eg:-GETSFTP).
> Can anyone help me by providing best solution for this problem?
>
> Regards,
> Rajeswari
>


Re: Error setting up environment for Custom Processor

2016-04-01 Thread Bryan Bende
Hello,

I think there are two options... You could try going through your pom files
and anywhere you see 0.1.0-incubating, change it to the latest NiFi release,
0.6.0.
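
For the first option, the parent declaration in your pom would end up looking
roughly like this (a sketch; the groupId/artifactId come from the error output
below, only the version changes):

<parent>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-nar-bundles</artifactId>
    <version>0.6.0</version>
</parent>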

A second option is to not rely on the nifi-nar-bundles parent, the steps to
remove that are described here:

https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions#MavenProjectsforExtensions-Inheritance

Let us know if it still isn't working after that.

Thanks,

Bryan

On Friday, April 1, 2016, idioma  wrote:

> Hi,
> I have followed this set of instructions in order to build custom NiFi
> Processor:
>
>
> https://community.hortonworks.com/articles/4318/build-custom-nifi-processor.html
> <
> https://community.hortonworks.com/articles/4318/build-custom-nifi-processor.html
> >
>
> However, when running mvn install or mvn clean install, I am returned with
> the following error:
>
> [INFO] Scanning for projects...
> Downloading:
> https://repo.maven.apache.org/maven2/org/apache/nifi/nifi-nar-bundl
> es/0.1.0-incubating/nifi-nar-bundles-0.1.0-incubating.pom
> [ERROR] The build could not read 1 project -> [Help 1]
> [ERROR]
> [ERROR]   The project hwx:HWX:1.0
> (D:\nifi-0.5.1\custom-processors\HWX\pom.xml)
> has 1 error
> [ERROR] Non-resolvable parent POM: Could not transfer artifact
> org.apache.ni
> fi:nifi-nar-bundles:pom:0.1.0-incubating from/to central
> (https://repo.maven.apa
> che.org/maven2): Connect to repo.maven.apache.org:443
> [repo.maven.apache.org/23.
> 235.43.215] failed: Connection refused: connect and 'parent.relativePath'
> pointsat wrong local POM @
> line 19, column 13 -> [Help 2]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
> swit   ch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions,
> please
> rea   d the following articles:
> [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildin
> gException
> [ERROR] [Help 2]
> http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableMo
> delException
>
> Can you help?
>
> Thank you!
>
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Error-setting-up-environment-for-Custom-Processor-tp8703.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


-- 
Sent from Gmail Mobile


Re: Can't connect to Secure HBase cluster

2016-03-31 Thread Bryan Bende
From doing some Googling it seems like the problem is similar to the one
described here where Hive could no longer talk to HBase after installing
Phoenix:
https://community.hortonworks.com/questions/1652/how-can-i-query-hbase-from-hive.html

The solution in that scenario was to add a phoenix jar to Hive's classpath,
which makes me think we would somehow have to make a phoenix jar available
on NiFi's classpath for the HBase Client Service.

I don't know enough about Phoenix to say for sure, but I created this JIRA
to capture the issue:
https://issues.apache.org/jira/browse/NIFI-1712


On Thu, Mar 31, 2016 at 2:28 PM, Guillaume Pool  wrote:

> Hi,
>
>
>
> Yes, here it is
>
>
>
>   
>
>
>
> 
>
>   fs.defaultFS
>
>   hdfs://supergrpcluster
>
> 
>
>
>
> 
>
>   fs.trash.interval
>
>   360
>
> 
>
>
>
> 
>
>
> ha.failover-controller.active-standby-elector.zk.op.retries
>
>   120
>
> 
>
>
>
> 
>
>   ha.zookeeper.quorum
>
>   sv-htndp2.hdp.supergrp.net:2181,
> sv-htndp1.hdp.supergrp.net:2181,sv-htndp3.hdp.supergrp.net:2181
>
> 
>
>
>
> 
>
>   hadoop.http.authentication.simple.anonymous.allowed
>
>   true
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.admin.groups
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.admin.hosts
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.hcat.groups
>
>   users
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.hcat.hosts
>
>   sv-htnmn2.hdp.supergrp.net
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.hdfs.groups
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.hdfs.hosts
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.hive.groups
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.hive.hosts
>
>   sv-htnmn2.hdp.supergrp.net
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.HTTP.groups
>
>   users
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.HTTP.hosts
>
>   sv-htnmn2.hdp.supergrp.net
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.knox.groups
>
>   users
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.knox.hosts
>
>   sv-htncmn.hdp.supergrp.net
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.oozie.groups
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.oozie.hosts
>
>   sv-htncmn.hdp.supergrp.net
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.root.groups
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.root.hosts
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.yarn.groups
>
>   *
>
> 
>
>
>
> 
>
>   hadoop.proxyuser.yarn.hosts
>
>   sv-htnmn1.hdp.supergrp.net
>
> 
>
>
>
> 
>
>   hadoop.security.auth_to_local
>
>   RULE:[1:$1@$0](ambari...@hdp.supergrp.net)s/.*/ambari-qa/
>
> RULE:[1:$1@$0](hb...@hdp.supergrp.net)s/.*/hbase/
>
> RULE:[1:$1@$0](h...@hdp.supergrp.net)s/.*/hdfs/
>
> RULE:[1:$1@$0](sp...@hdp.supergrp.net)s/.*/spark/
>
> RULE:[1:$1@$0](.*@HDP.SUPERGRP.NET)s/@.*//
>
> RULE:[2:$1@$0](amshb...@hdp.supergrp.net)s/.*/ams/
>
> RULE:[2:$1@$0](amshbasemas...@hdp.supergrp.net)s/.*/ams/
>
> RULE:[2:$1@$0](amshbas...@hdp.supergrp.net)s/.*/ams/
>
> RULE:[2:$1@$0](am...@hdp.supergrp.net)s/.*/ams/
>
> RULE:[2:$1@$0](d...@hdp.supergrp.net)s/.*/hdfs/
>
> RULE:[2:$1@$0](hb...@hdp.supergrp.net)s/.*/hbase/
>
> RULE:[2:$1@$0](h...@hdp.supergrp.net)s/.*/hive/
>
> RULE:[2:$1@$0](j...@hdp.supergrp.net)s/.*/mapred/
>
> RULE:[2:$1@$0](j...@hdp.supergrp.net)s/.*/hdfs/
>
> RULE:[2:$1@$0](k...@hdp.supergrp.net)s/.*/knox/
>
> RULE:[2:$1@$0](n...@hdp.supergrp.net)s/.*/yarn/
>
> RULE:[2:$1@$0](n...@hdp.supergrp.net)s/.*/hdfs/
>
> RULE:[2:$1@$0](oo...@hdp.supergrp.net)s/.*/oozie/
>
> RULE:[2:$1@$0](r...@hdp.supergrp.net)s/.*/yarn/
>
> RULE:[2:$1@$0](y...@hdp.supergrp.net)s/.*/yarn/
>
> DEFAULT
>
> 
>
>
>
> 
>
>   hadoop.security.authentication
>
>   kerberos
>
> 
>
>
>
> 
>
>   hadoop.security.authorization
>
>   true
>
> 
>
>
>
> 
>
>   hadoop.security.key.provider.path
>
>   
>
> 
>
>
>
> 
>
>   io.compression.codecs
>
>
> org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec
>
> 
>
>
>
> 
>
>   io.file.buffer.size
>
>   131072
>
> 
>
>
>
> 
>
>   io.serializations
>
>   org.apache.hadoop.io.serializer.WritableSerialization
>
> 
>
>
>
> 
>
>   ipc.client.connect.max.retries
>
>   50
>
> 
>
>
>
> 
>
>   ipc.client.connection.maxidletime
>
>   3
>
> 
>
>
>
> 
>
>   ipc.client.idlethreshold
>
>   8000
>
> 
>
>
>
> 
>
>   ipc.server.tcpnodelay
>
>   true
>
> 
>
>
>
> 
>
>   mapreduce.jobtracker.webinterface.trusted
>
>   false
>
> 
>
>
>
> 
>
>   net.topology.script.file.name
>
>   /etc/hadoop/conf/topology_script.py
>
> 
>
>
>
>   
>
>
>
> Thanks
>
>
>
> *From: *Jeff Lord 

Re: Import Kafka messages into Titan

2016-03-31 Thread Bryan Bende
For #2,  you can use templates to move the flow (or parts of it) to another
instance.
A possible approach is to organize the flow into process groups and create
a template per process group, making it potentially easier to update parts
of the flow independently.

This project might be helpful to look at in terms of automating deploying a
template from one instance to another:
https://github.com/aperepel/nifi-api-deploy

For properties that are environment specific, if the property supports
expression language, you can specify them in bootstrap.conf as -D
properties for each of your NiFi instances, and in your processors you can
reference them with Expression Language.
For example, in each bootstrap.conf there could be -Dkafka.topic=mytopic
and then in a PutKafka processor set the topic to ${kafka.topic}. This will
let your template be the same for each environment.
Unfortunately at a quick glance it looks like GetKafka topic name does not
support EL, which should probably be fixed to allow this.
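
A rough sketch of what that looks like across two environments (the java.arg
number is arbitrary, it just has to be one that is not already used in each
bootstrap.conf):

# bootstrap.conf on the dev instance
java.arg.15=-Dkafka.topic=mytopic-dev

# bootstrap.conf on the prod instance
java.arg.15=-Dkafka.topic=mytopic-prod

# PutKafka topic property (identical in the template for every environment)
${kafka.topic}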

In the future there is a plan to have a variable registry exposed through
the UI so that you wouldn't have to edit the bootstrap file to define these
properties.


On Thu, Mar 31, 2016 at 11:58 AM, Matt Burgess  wrote:

> I'll let someone else have a go at question 2 :)
>
> If you're using ExecuteScript with Groovy, you don't need
> EvaluateJsonPath, Groovy has a JSONSlurper that works nicely (examples on
> my blog).
>
> To put directly into Titan you don't need to convert the format, instead
> you'll want Gremlin (part of Apache Tinkerpop), point your processor's
> Module Path property at a folder containing the Gremlin JARs, then you can
> create the vertices and edges using the approach in the Titan documentation.
>
> This would make an excellent blog post, perhaps I'll give this a try
> myself but please feel welcome to share anything you learn along the way!
> If I get some spare time I'd like to write a PutGraph processor that does
> pretty much what we've outlined here.
>
> Regards,
> Matt
>
> Sent from my iPhone
>
> > On Mar 31, 2016, at 10:15 AM, idioma  wrote:
> >
> > Matt, thank you for this, this is brilliant. So, as it is I am thinking
> that I
> > would like to use the following:
> >
> > GetKafka -> EvaluateJsonPath -> ExecuteScript+Groovy Script
> >
> > My questions are two:
> >
> > 1) How do I import the Titan-compliant file into Titan? I guess I can
> modify
> > the script and load it into it.
> > 2) my second question is more naive and proves the fact my background is
> more
> > on Apache Camel with very little knowledge of NiFi. In a versionControl
> > environment, how do you push a process flow created with NiFi that mostly
> > involves standard components? Do you write customized version where you
> set
> > of kafka properties, for example?
> >
> > Thanks
> >
> >
> >
> > --
> > View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Import-Kafka-messages-into-Titan-tp8647p8667.html
> > Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Can't connect to Secure HBase cluster

2016-03-31 Thread Bryan Bende
Ok so it is not the Kerberos authentication that is causing the problem.

Would you be able to share a template of your flow?
If you are not familiar with templates, they are described here [1]. You
can paste the XML of the template on a gist [2] as an easy way to share it.

If you can't share a template, then can you tell us if anything else is
going on in your flow, any other processors being used?

Thanks,

Bryan

[1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#templates
[2] https://gist.github.com/

On Thu, Mar 31, 2016 at 10:59 AM, Guillaume Pool <gp...@live.co.za> wrote:

> Hi,
>
>
>
> Yes, I can connect using that user.
>
>
>
> Had to test it on HBase master as HBase not installed on NiFi server.
>
>
>
> Regards
>
> Guillaume
>
>
>
> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for
> Windows 10
>
>
>
> *From: *Bryan Bende <bbe...@gmail.com>
> *Sent: *Thursday, 31 March 2016 03:12 PM
> *To: *us...@nifi.apache.org
> *Subject: *Re: Can't connect to Secure HBase cluster
>
>
> Hello,
>
> In order to narrow down the problem, can you connect to the Hbase shell
> from the command line using the same keytab and principal?
>
> kinit -kt /app/env/nifi.keytab  n...@hdp.supergrp.net
> hbase shell
>
> Then scan a table or some operation. If that all works, then we need to
> find out why you are getting UnsupportedOperationException:
> org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory.
>
> Would you be able to share a template of your flow with us?
>
> Thanks,
>
> Bryan
>
> On Thu, Mar 31, 2016 at 7:27 AM, Guillaume Pool <gp...@live.co.za> wrote:
>
> Hi,
>
>
>
> I am trying to make a connection to a secured cluster that has phoenix
> installed.
>
>
>
> I am running HDP 2.3.2 and NiFi 0.6.0
>
>
>
> Getting the following error on trying to enable HBase_1_1_2_ClientService
>
>
>
> 2016-03-31 13:24:23,916 INFO [StandardProcessScheduler Thread-5]
> o.a.nifi.hbase.HBase_1_1_2_ClientService
> HBase_1_1_2_ClientService[id=e7e9b2ed-d336-34be-acb4-6c8b60c735c2] HBase
> Security Enabled, logging in as principal n...@hdp.supergrp.net with
> keytab /app/env/nifi.keytab
>
> 2016-03-31 13:24:23,984 WARN [StandardProcessScheduler Thread-5]
> org.apache.hadoop.util.NativeCodeLoader Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 2016-03-31 13:24:24,101 INFO [StandardProcessScheduler Thread-5]
> o.a.nifi.hbase.HBase_1_1_2_ClientService
> HBase_1_1_2_ClientService[id=e7e9b2ed-d336-34be-acb4-6c8b60c735c2]
> Successfully logged in as principal n...@hdp.supergrp.net with keytab
> /app/env/nifi.keytab
>
> 2016-03-31 13:24:24,177 ERROR [StandardProcessScheduler Thread-5]
> o.a.n.c.s.StandardControllerServiceNode
> HBase_1_1_2_ClientService[id=e7e9b2ed-d336-34be-acb4-6c8b60c735c2] Failed
> to invoke @OnEnabled method due to java.io.IOException:
> java.lang.reflect.InvocationTargetException
>
> 2016-03-31 13:24:24,182 ERROR [StandardProcessScheduler Thread-5]
> o.a.n.c.s.StandardControllerServiceNode
>
> java.io.IOException: java.lang.reflect.InvocationTargetException
>
> at
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
> ~[hbase-client-1.1.2.jar:1.1.2]
>
> at
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
> ~[hbase-client-1.1.2.jar:1.1.2]
>
> at
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
> ~[hbase-client-1.1.2.jar:1.1.2]
>
> at
> org.apache.nifi.hbase.HBase_1_1_2_ClientService$1.run(HBase_1_1_2_ClientService.java:215)
> ~[nifi-hbase_1_1_2-client-service-0.6.0.jar:0.6.0]
>
> at
> org.apache.nifi.hbase.HBase_1_1_2_ClientService$1.run(HBase_1_1_2_ClientService.java:212)
> ~[nifi-hbase_1_1_2-client-service-0.6.0.jar:0.6.0]
>
> at java.security.AccessController.doPrivileged(Native Method)
> ~[na:1.8.0_71]
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
> ~[na:1.8.0_71]
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> ~[hadoop-common-2.6.2.jar:na]
>
> at
> org.apache.nifi.hbase.HBase_1_1_2_ClientService.createConnection(HBase_1_1_2_ClientService.java:212)
> ~[nifi-hbase_1_1_2-client-service-0.6.0.jar:0.6.0]
>
> at
> org.apache.nifi.hbase.HBase_1_1_2_ClientService.onEnabled(HBase_1_1_2_ClientService.java:161)
> ~[nifi-hbase_1_1_2-client-service-0.6.0.jar:0.6.0]
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Nativ

Re: Splitting Incoming FlowFile, Output Multiple FlowFiles

2016-03-31 Thread Bryan Bende
Hello,

SplitText and SplitContent should be producing individual FlowFiles. Are
you seeing something different?

For SplitText you would set "Line Split Count" to 1 in order to get a
FlowFile for each line of the incoming CSV.

If you are doing extremely large files, it is generally recommended to do a
two-phase split where the first SplitText might have something like "Line
Split Count" set to 10,000-20,000 and then a second SplitText with "Line
Split Count" set to 1.

-Bryan


On Thu, Mar 31, 2016 at 8:35 AM, dale.chang13 
wrote:

> My specific use-case calls for ingesting a CSV table with many rows and
> then
> storing individual rows into HBase and Solr. Additionally, I would like to
> avoid developing custom processors, but it seems like the SplitText and
> SplitContent Processors do not return individual flowfiles, each with their
> own attributes.
>
> However, I was wondering what the best plan of attack would be when taking
> an incoming FlowFile and sending FlowFiles through Process Session?
> Creating
> multiple instances of Process Session? session.transfer within a loop?
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Splitting-Incoming-FlowFile-Output-Multiple-FlowFiles-tp8653.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: [VOTE] Release Apache NiFi 0.6.0 (RC2)

2016-03-26 Thread Bryan Bende
+1 (binding)

I did run into the problem previously mentioned with TestListFile. For me
this test never passes on OSX, but worked fine when building on a Linux VM.
I tested ListFile from the running application and it appears to work as
expected, so I don't see this as a reason to vote against the release.

Everything else in the release helper checked out, and I successfully
tested a flow using the HDFS and HBase processors.



On Sat, Mar 26, 2016 at 12:49 PM, Mark Payne  wrote:

> +1.
>
> Downloaded and verified signature and hashes.
> Built on OSX with contrib-check and had no problems.
> Verified README, NOTICE, and LICENSE files.
> Started software and verified functionality.
>
> Thanks
> -Mark
>
> Sent from my iPhone
>
> > On Mar 26, 2016, at 11:18 AM, Aldrin Piri  wrote:
> >
> > +1, binding
> >
> > Build, tests, hashes and signatures all check out.  Verified
> functionality
> > with some existing flows and verified anticipated and correct
> functionality
> > with the List processors (ListFile and ListSFTP)
> >
> > I realize there have been some intermittent problems with ListFile, but
> the
> > changes associated with 1664 were changes to the tests and we have had
> > passing builds across Travis, OS X, Windows 7, 8, and 10 as part of that
> > review process.  An issue has been raised to improve these, NIFI-1689,
> and
> > can be worked and incorporated into the follow on support and master
> > branches.
> >
> >> On Sat, Mar 26, 2016 at 12:34 AM, Tony Kurc  wrote:
> >>
> >> I've been holding off as I hoped my build problems were my environment,
> or
> >> bad timing luck, but I just can't seem to get through a windows 10
> build,
> >> and it looks like Andy had the same problems.
> >>
> >> C:\development\nifi-0.6.0>java -version
> >> java version "1.7.0_10"
> >> Java(TM) SE Runtime Environment (build 1.7.0_10-b18)
> >> Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)
> >>
> >> C:\development\nifi-0.6.0>mvn --version
> >> Apache Maven 3.3.3 (7994120775791599e205a5524ec3e0dfe41d4a06;
> >> 2015-04-22T07:57:37-04:00)
> >> Maven home: c:\development\apache-maven-3.3.3\bin\..
> >> Java version: 1.7.0_10, vendor: Oracle Corporation
> >> Java home: C:\Program Files\Java\jdk1.7.0_10\jre
> >> Default locale: en_US, platform encoding: Cp1252
> >> OS name: "windows 8", version: "6.2", arch: "amd64", family: "windows"
> >>
> >> Errors:
> >>
> >> Tests run: 12, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 3.056
> sec
> >> <<< FAILURE! - in org.apache.nifi.processors.standard.TestListFile
> >> testFilterFilePattern(org.apache.nifi.processors.standard.TestListFile)
> >> Time elapsed: 0.224 sec  <<< FAILURE!
> >> java.lang.AssertionError: expected:<4> but was:<0>
> >>at org.junit.Assert.fail(Assert.java:88)
> >>at org.junit.Assert.failNotEquals(Assert.java:834)
> >>at org.junit.Assert.assertEquals(Assert.java:645)
> >>at org.junit.Assert.assertEquals(Assert.java:631)
> >>at
> >>
> >>
> org.apache.nifi.processors.standard.TestListFile.testFilterFilePattern(TestListFile.java:438)
> >>
> >> testFilterPathPattern(org.apache.nifi.processors.standard.TestListFile)
> >> Time elapsed: 0.223 sec  <<< FAILURE!
> >> java.lang.AssertionError: expected:<4> but was:<0>
> >>at org.junit.Assert.fail(Assert.java:88)
> >>at org.junit.Assert.failNotEquals(Assert.java:834)
> >>at org.junit.Assert.assertEquals(Assert.java:645)
> >>at org.junit.Assert.assertEquals(Assert.java:631)
> >>at
> >>
> >>
> org.apache.nifi.processors.standard.TestListFile.testFilterPathPattern(TestListFile.java:492)
> >>
> >> testFilterHidden(org.apache.nifi.processors.standard.TestListFile)  Time
> >> elapsed: 0.227 sec  <<< FAILURE!
> >> java.lang.AssertionError: expected:<2> but was:<0>
> >>at org.junit.Assert.fail(Assert.java:88)
> >>at org.junit.Assert.failNotEquals(Assert.java:834)
> >>at org.junit.Assert.assertEquals(Assert.java:645)
> >>at org.junit.Assert.assertEquals(Assert.java:631)
> >>at
> >>
> >>
> org.apache.nifi.processors.standard.TestListFile.testFilterHidden(TestListFile.java:383)
> >>
> >> testReadable(org.apache.nifi.processors.standard.TestListFile)  Time
> >> elapsed: 0.219 sec  <<< FAILURE!
> >> java.lang.AssertionError: expected:<3> but was:<0>
> >>at org.junit.Assert.fail(Assert.java:88)
> >>at org.junit.Assert.failNotEquals(Assert.java:834)
> >>at org.junit.Assert.assertEquals(Assert.java:645)
> >>at org.junit.Assert.assertEquals(Assert.java:631)
> >>at
> >>
> >>
> org.apache.nifi.util.StandardProcessorTestRunner.assertTransferCount(StandardProcessorTestRunner.java:330)
> >>at
> >>
> >>
> org.apache.nifi.processors.standard.TestListFile.testReadable(TestListFile.java:624)
> >>
> >> testRecurse(org.apache.nifi.processors.standard.TestListFile)  Time
> >> elapsed: 0.226 sec  <<< FAILURE!
> >> 

Re: Custom Graphs/Metrics reporting

2016-03-24 Thread Bryan Bende
Hi Michael,

Currently the graphing functionality is not extensible, as far as I know.

One option that might be helpful is a ReportingTask. ReportingTasks are
components that get access to all of the metrics and provenance events and
can push them out to an external system. Currently there are ReportingTasks
for Ganglia and Ambari, and I believe an open pull-request to add one for
Riemann, but ReportingTasks are an extension point just like processors, so
a custom ReportingTask could be developed to push metrics to any system [1].
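
To give a feel for it, here is a very rough sketch of a custom ReportingTask
in Java (the class name is made up and the destination is left as a comment;
the calls shown should be roughly what the existing Ganglia/Ambari tasks use
to read the controller-level metrics):

import org.apache.nifi.controller.status.ProcessGroupStatus;
import org.apache.nifi.reporting.AbstractReportingTask;
import org.apache.nifi.reporting.ReportingContext;

public class MyMetricsReportingTask extends AbstractReportingTask {

    @Override
    public void onTrigger(final ReportingContext context) {
        // Aggregate status of the root process group, i.e. the whole flow
        final ProcessGroupStatus status = context.getEventAccess().getControllerStatus();

        final int flowFilesReceived = status.getFlowFilesReceived();
        final long bytesRead = status.getBytesRead();
        final long bytesWritten = status.getBytesWritten();

        // ... push these values to your external graphing/metrics system ...
    }
}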

Let us know if you had something else in mind.

-Bryan

[1]
https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#reporting-tasks

On Thu, Mar 24, 2016 at 6:51 AM, michael.griffit...@baesystems.com <
michael.griffit...@baesystems.com> wrote:

> Morning (in the UK) all,
>
> Does NiFi's API have the ability to extend the graphing functionality? It
> currently shows flow files or bytes read/written every 5 minutes. Is it
> possible to extend this to show custom metrics?
>
> Many thanks,
>
> Michael
>
> Michael Griffiths
> System Developer
> BAE Systems Applied Intelligence
> ___
> |  E: michael.griffit...@baesystems.com
>
> BAE Systems Applied Intelligence, Surrey Research Park, Guildford, Surrey,
> GU2 7RQ.
> www.baesystems.com/ai
>
> Please consider the environment before printing this email. This message
> should be regarded as confidential. If you have received this email in
> error please notify the sender and destroy it immediately. Statements of
> intent shall only become binding when confirmed in hard copy by an
> authorised signatory. The contents of this email may relate to dealings
> with other companies under the control of BAE Systems Applied Intelligence
> Limited, details of which can be found at
> http://www.baesystems.com/Businesses/index.htm.
>


Re: Closing in on the Apache NiFi 0.6.0 release

2016-03-22 Thread Bryan Bende
All,

While testing the original RC I came across a couple of issues with the new
GetSplunk processor.

The first issue relates to being able to specify the timezone through a
processor property, so that the timezone used for searching will match
Splunk's timezone.
The second is that GetSplunk incorrectly clears its state when NiFi starts
up, causing it to pull data it has already pulled.

I'd like to re-open NIFI-1420 and submit fixes for these problems before we
make the next RC. I should be able to do this shortly.

Thanks,

Bryan

On Tue, Mar 22, 2016 at 11:50 AM, Aldrin Piri  wrote:

> Joe,
>
> Looking through the associated tickets, both sound like worthwhile
> additions and can hold off until those items get through reviews.
>
> --Aldrin
>
> On Tue, Mar 22, 2016 at 11:47 AM, Joe Witt  wrote:
>
> > Aldrin,
> >
> > NIFI-1665 appears to correct a problematic behavior when pulling from
> > Kafka and when timeouts can occur.  Definitely think we should get
> > this in the build.  I also see that NIFI-1645 is up and given the
> > trouble that is causing for use of delimiter function we should engage
> > on this.
> >
> > Since you're working the windows build issue and these are in play do
> > you mind waiting a bit before sending the new RC ?
> >
> > Thanks
> > Joe
> >
> > On Mon, Mar 21, 2016 at 1:42 PM, Aldrin Piri 
> wrote:
> > > All,
> > >
> > > It looks like the last ticket for 0.6.0 has been merged and resolved.
> > >
> > > I will begin the RC process shortly working off of commit
> > > 736896246cf021dbed31d4eb1e22e0755e4705f0 [1] [2].
> > >
> > > [1]
> > >
> >
> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=736896246cf021dbed31d4eb1e22e0755e4705f0
> > > [2]
> > >
> >
> https://github.com/apache/nifi/commit/736896246cf021dbed31d4eb1e22e0755e4705f0
> > >
> > > On Mon, Mar 21, 2016 at 1:48 AM, Tony Kurc  wrote:
> > >
> > >> The Locale issue was reviewed, confirmed as fixed by reporter and
> merged
> > >> in.
> > >>
> > >> On Sun, Mar 20, 2016 at 10:35 PM, Joe Witt 
> wrote:
> > >>
> > >> > Team,
> > >> >
> > >> > There are a couple finishing touches PRs to fix a big defect in
> > >> > SplitText for certain input types, improve locale handling and test
> > >> > behavior for Kit bundle, and to clean up content viewing from
> > >> > connections.
> > >> >
> > >> > Getting good input on findings folks have so please keep it coming
> as
> > >> > that helps ensure a solid/healthy RC.
> > >> >
> > >> > Thanks
> > >> > Joe
> > >> >
> > >> > On Sat, Mar 19, 2016 at 6:21 PM, Tony Kurc 
> wrote:
> > >> > > Recommend https://issues.apache.org/jira/browse/NIFI-1651 be
> > included
> > >> in
> > >> > > 0.6.0
> > >> > >
> > >> > > On Wed, Mar 16, 2016 at 4:08 PM, Joe Witt 
> > wrote:
> > >> > >
> > >> > >> Team,
> > >> > >>
> > >> > >> Ok sooo close.  We have 5 tickets remaining.
> > >> > >>
> > >> > >> - Additional functionality/cleanup for SplitText [1]
> > >> > >> [status] Still in discussions. Recommend we move this change to
> > 0.7.0.
> > >> > >> Solid effort on both code contributor and reviewer side but this
> > is a
> > >> > >> tricky one.
> > >> > >>
> > >> > >> - Support Kerberos based authentication to REST API [2]
> > >> > >> [status] PR is in. Reviewing and PR tweaking appears active.
> Looks
> > >> > >> quite close and comments indicate great results.
> > >> > >>
> > >> > >> - Add Kerberos support to HBase processors [3]
> > >> > >> [status] Patch in. Under review.  Running on live test system
> with
> > >> > >> great results.
> > >> > >>
> > >> > >> - Add support for Spring Context loaded processors (Spring
> > >> > >> Integrations, Camel, ...) [4]
> > >> > >> [status] Appears ready. Getting review feedback.
> > >> > >>
> > >> > >> - Zookeeper interaction for NiFI state management should limit
> > state
> > >> to
> > >> > >> 1MB [6]
> > >> > >> [status] Patch is in and review under way.  Looks close.
> > >> > >>
> > >> > >> [1] https://issues.apache.org/jira/browse/NIFI-1118
> > >> > >> [2] https://issues.apache.org/jira/browse/NIFI-1274
> > >> > >> [3] https://issues.apache.org/jira/browse/NIFI-1488
> > >> > >> [4] https://issues.apache.org/jira/browse/NIFI-1571
> > >> > >> [5] https://issues.apache.org/jira/browse/NIFI-1626
> > >> > >>
> > >> > >> Thanks
> > >> > >>
> > >> > >> On Wed, Mar 16, 2016 at 4:04 PM, Joe Witt 
> > wrote:
> > >> > >> > Team,
> > >> > >> >
> > >> > >> > Ok sooo close.  We have 6 tickets remaining.
> > >> > >> >
> > >> > >> > - Additional functionality/cleanup for SplitText [1]
> > >> > >> > [status] Still in discussions. Recommend we move this change to
> > >> 0.7.0.
> > >> > >> > Solid effort on both code contributor and reviewer side but
> this
> > is
> > >> a
> > >> > >> > tricky one.
> > >> > >> >
> > >> > >> > - Support Kerberos based authentication to REST API [2]
> > >> > >> > [status] PR 

Processor additional documentation

2016-03-21 Thread Bryan Bende
 Geercken <
> > uwe.geerc...@web.de
> > > >
> > > > >>> wrote:
> > > > >>>>>
> > > > >>>>>> Dan,
> > > > >>>>>>
> > > > >>>>>> but maybe I have a wrong understanding: do I have to create an
> > > > >>> index.html
> > > > >>>>>> file? Currently I have only created an additionalDetails.html
> > > file.
> > > > >>>>>>
> > > > >>>>>> I will also try to reduce the html code to a minimum and see
> if
> > > it is
> > > > >>> a
> > > > >>>>>> problem with my code.
> > > > >>>>>>
> > > > >>>>>> Bye,
> > > > >>>>>>
> > > > >>>>>> Uwe
> > > > >>>>>>
> > > > >>>>>>> Gesendet: Freitag, 18. März 2016 um 19:03 Uhr
> > > > >>>>>>> Von: "dan bress" <danbr...@gmail.com>
> > > > >>>>>>> An: dev@nifi.apache.org
> > > > >>>>>>> Betreff: Re: Re: Processor additional documentation
> > > > >>>>>>>
> > > > >>>>>>> Uwe,
> > > > >>>>>>>   No its not a problem to have both index.html and
> > > > >>>>>> additionalDetails.html
> > > > >>>>>>> The NiFi framework generates nearly all of the documentation
> > for
> > > > >>> your
> > > > >>>>>>> processor for you.  It will generate information about the
> > > > >>> properties and
> > > > >>>>>>> relationships your processor exposes to its users.  If you
> need
> > > to
> > > > >>>>>> express
> > > > >>>>>>> more about your processor, then that is where
> > > additionalDetails.html
> > > > >>>>>> comes
> > > > >>>>>>> into play.  For example, if your processor uses a custom
> query
> > > > >>> language.
> > > > >>>>>>>
> > > > >>>>>>> Generated index.html example:
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>
> > >
> >
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.attributes.UpdateAttribute/index.html
> > > > >>>>>>>
> > > > >>>>>>> additionalDetails.html example:
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>
> > >
> >
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.attributes.UpdateAttribute/additionalDetails.html
> > > > >>>>>>>
> > > > >>>>>>> On Fri, Mar 18, 2016 at 10:54 AM Uwe Geercken <
> > > uwe.geerc...@web.de>
> > > > >>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Bryan,
> > > > >>>>>>>>
> > > > >>>>>>>> all looks ok. I looked into the nifi-home/work/docs folder.
> > > There
> > > > >>> is
> > > > >>>>>>>> nothing but a components folder. Inside there is a folder
> for
> > my
> > > > >>>>>> processor:
> > > > >>>>>>>> com.datamelt.nifi.test.TemplateProcessor and inside the
> folder
> > > > >>> there
> > > > >>>>>> is a
> > > > >>>>>>>> file index.html and it contains the code of my
> > > > >>> additionalDetails.html
> > > > >>>>>> file.
> > > > >>>>>>>>
> > > > >>>>>>>> when I open the file in the web browser it looks good. I
> > looked
> > > at
> > > > >>>>>> other
> > > > >>>>>>>> index.html files and they look similar.
> > > > >>>>>>>>
> > > > >>>>>>>> but I noted that some folders have an inde.html file AND an
> > > > >>>>>>>> additionalDetails.html file. maybe that is the problem?
> > > > >>>>>>>>
> > > > >>>>>>>> greetings,
> > > > >>>>>>>>
> > > > >>>>>>>> Uwe
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Gesendet: Freitag, 18. März 2016 um 16:18 Uhr
> > > > >>>>>>>> Von: "Bryan Bende" <bbe...@gmail.com>
> > > > >>>>>>>> An: dev@nifi.apache.org
> > > > >>>>>>>> Betreff: Re: Processor additional documentation
> > > > >>>>>>>> Hi Uwe,
> > > > >>>>>>>>
> > > > >>>>>>>> Do you have the additionalDetails.html file in your
> processors
> > > jar
> > > > >>>>>> project,
> > > > >>>>>>>> under src/main/resources?
> > > > >>>>>>>>
> > > > >>>>>>>> Similar to this:
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>
> > > > >>>
> > >
> >
> https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/resources
> > > > >>>>>>>>
> > > > >>>>>>>> The expected project structure is described here:
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>
> > > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions#MavenProjectsforExtensions-ExampleProcessorBundleStructure
> > > > >>>>>> <
> > > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions#MavenProjectsforExtensions-ExampleProcessorBundleStructure
> > > > >>>>
> > > > >>>>>>>> <
> > > > >>>>>>
> > > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions#MavenProjectsforExtensions-ExampleProcessorBundleStructure
> > > > >>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> If you think that part is setup correctly, can you check
> under
> > > > >>>>>>>> nifi_home/work/docs and see if
> > > > >>>>>> com.datamelt.nifi.test.TemplateProcessor is
> > > > >>>>>>>> there?
> > > > >>>>>>>>
> > > > >>>>>>>> -Bryan
> > > > >>>>>>>>
> > > > >>>>>>>> On Fri, Mar 18, 2016 at 11:04 AM, Uwe Geercken <
> > > > >>> uwe.geerc...@web.de>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> Hello,
> > > > >>>>>>>>>
> > > > >>>>>>>>> I am writing my first processor. As described in the
> > > > >>> documentation, I
> > > > >>>>>>>> have
> > > > >>>>>>>>> added an HTML file to be used when the user selects
> "Usage":
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>
> > docs/com.datamelt.nifi.test.TemplateProcessor/additionalDetails.html
> > > > >>>>>>>>>
> > > > >>>>>>>>> This is located in the root or the Processors nar file.
> > > > >>>>>>>>>
> > > > >>>>>>>>> The processor class is this:
> > > > >>>>>>>>>
> > > > >>>>>>>>> com/datamelt/nifi/test/TemplateProcessor.class
> > > > >>>>>>>>>
> > > > >>>>>>>>> The processor works, but selecting "Usage" won't show my
> HTML
> > > > >>> file.
> > > > >>>>>>>>>
> > > > >>>>>>>>> I understood that I write the HTML file and Nifi will picks
> > it
> > > > >>> up
> > > > >>>>>> when it
> > > > >>>>>>>>> starts. Or is this not true?
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for feedback,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Uwe
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
>


-- 
Sent from Gmail Mobile


Re: quick reminder on new nars

2016-03-19 Thread Bryan Bende
Ah perfect, didn't know that was already done.

On Wed, Mar 16, 2016 at 9:24 AM, Joe Witt <joe.w...@gmail.com> wrote:

> Bryan,
>
> I believe the archetype is already updated if this what you're referring
> to.
>
>
> https://github.com/apache/nifi/blob/master/nifi-maven-archetypes/nifi-processor-bundle-archetype/src/main/resources/archetype-resources/nifi-__artifactBaseName__-nar/pom.xml
>
> Thanks
> Joe
>
> On Wed, Mar 16, 2016 at 9:19 AM, Bryan Bende <bbe...@gmail.com> wrote:
> > Also sounds like we need to update the archetype based on whatever
> approach
> > we come up with, either adding those properties  to the NAR Pom in the
> > archetype, or having it use a specific parent.
> >
> > On Wednesday, March 16, 2016, Joe Witt <joe.w...@gmail.com> wrote:
> >
> >> Would certainly like to better understand what you have in mind.
> >>
> >> thanks
> >>
> >> On Wed, Mar 16, 2016 at 12:02 AM, Sean Busbey <bus...@apache.org
> >> <javascript:;>> wrote:
> >> > we could make a parent pom for all the nar modules.
> >> >
> >> > wanna see what that looks like?
> >> >
> >> > On Tue, Mar 15, 2016 at 8:46 PM, Joe Witt <joe.w...@gmail.com
> >> <javascript:;>> wrote:
> >> >> Team,
> >> >>
> >> >> During the previous build/release cycle it was found that
> >> >> javadocs/sources were being made for the Nar bundles themselves and
> >> >> was causing invalid licensing/notice information to be present.  All
> >> >> the existing bundles and the archetypes were fixed for this.  Just be
> >> >> sure on new nars to include these as well if you aren't copying from
> >> >> something existing or using the archetype.  I just fixed a couple of
> >> >> them for new things in the 0.6.0 release.
> >> >>
> >> >> The nar pom itself should have a properties section such as
> >> >>
> >> >> <properties>
> >> >>     <maven.javadoc.skip>true</maven.javadoc.skip>
> >> >>     <source.skip>true</source.skip>
> >> >> </properties>
> >> >>
> >> >> Perhaps there is a nicer maven way of ensuring this doesn't happen
> for
> >> Nars.
> >> >>
> >> >> Thanks
> >> >> Joe
> >>
> >
> >
> > --
> > Sent from Gmail Mobile
>


Re: Cross NAR Controller Services

2016-03-15 Thread Bryan Bende
Devin,

This Wiki page shows how to create the appropriate dependencies between
your NAR and the ControllerService:

https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions#MavenProjectsforExtensions-LinkingProcessorsandControllerServices

I also created an example project on GitHub to show a working example:
  https://github.com/bbende/nifi-dependency-example
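
The short version, roughly (a sketch of just the relevant snippets; see the
links above for the complete poms and versions):

<!-- processors jar pom: compile against the service API only -->
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-dbcp-service-api</artifactId>
    <scope>provided</scope>
</dependency>

<!-- your nar pom: depend on the standard services API nar so the class
     loaders are linked up at runtime -->
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-standard-services-api-nar</artifactId>
    <type>nar</type>
</dependency>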

Hope that helps.

-Bryan

On Tue, Mar 15, 2016 at 7:33 PM, Oleg Zhurakousky <
ozhurakou...@hortonworks.com> wrote:

> Devin
>
> Your problem is most likely in your NAR poms where you may satisfy compile
> dependency but not NAR to participate in class loader runtime  inheritance.
> Is there a way to look at your poms and also the general structure of the
> project?
>
> Oleg
>
> Sent from my iPhone
>
> > On Mar 15, 2016, at 18:51, Devin Fisher <
> devin.fis...@perfectsearchcorp.com> wrote:
> >
> > I'm having issues using a standard controller service
> (DBCPConnectionPool)
> > that is provided by nifi-dbcp-service-nar. But I'm having issues with my
> > nar. I have included a dependency on nifi-dbcp-service-api in my maven
> pom
> > and have used the property description that is the same as ExecuteSQL for
> > the DBCP_SERVICE property. When I load my processor in nifi I don't get a
> > list of DBCPConnectionPool controller service like I expect. I have an
> > ExecuteSQL processor in the same flow (for testing) and it list the
> > controller service I created just fine and uses it just fine.
> >
> > The problem seems to me (I don't have a development environment to
> confirm)
> > that the DBCPService.class that I use in my processor is not seen as the
> > same class object (because of the isolation features of NAR) as the one
> > that DBCPCOnnectionPool implements. I think I have mostly confirmed this
> by
> > implementing a dummy controller service that implements DBCPService in
> the
> > same NAR as my processor and my processor is able to list it just fine.
> But
> > the ExecuteSQL don't list my dummy controller service. So they seem to be
> > considered different classes.
> >
> > I think I'm doing something wrong because ExecuteSQL is not in the same
> nar
> > as DBCPConnectionPool. So they play nice together somehow but I don't see
> > what I need to do so that my nar works the same way.
> >
> > I'm enjoying developing against nifi and sorry if this is a rookie
> mistake.
> >
> > Devin
>


Re: Unable to connect Nifi to SQL Server

2016-03-08 Thread Bryan Bende
Hello,

If you are specifying the URI for the driver jar, then you would generally
put it outside of the NiFi lib directory, somewhere else.

The URI also needs the file:/// prefix, so for windows I think it would be:

file:///D:/some/path/to/driver/sqljdbc4.jar

Let us know if that works.

-Bryan

On Tue, Mar 8, 2016 at 3:16 PM, Ashraf Mohammed 
wrote:

> Hi dev@Nifi,
>
> I am trying to connect Nifi to source SQL Server database.  I keep getting
> the error " Cant load database driver"
>
> My settings are as follows:
>
>
> Database Driver Class Name:
> com.microsoft.sqlserver.jdbc.SQLServerDriver
>
> Database Driver Jar Url:
>  D:\nifi\nifi-0.5.1-bin\nifi-0.5.1\lib\sqljdbc4.jar
>
> Also note: I downloaded the SQLJDBC4.jar file from the Microsoft site, I
> unzipped the files, retrieved the jar file and moved it under the nifi lib
> folder.
>
> Any advice in fixing this issue please ?
> --
> Regards,
> Ashraf Y Mohammed
>


Re: ExecuteSQL and NiFi 0.5.1 - Error org.apache.avro.SchemaParseException: Empty name

2016-03-05 Thread Bryan Bende
I think this is a legitimate bug that was introduced in 0.5.0.

I created this ticket: https://issues.apache.org/jira/browse/NIFI-1596

For those interested, I think the line of code causing the problem is this:

https://github.com/apache/nifi/blob/0e926074661302c65c74ddee3af183ff49642da7/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/JdbcCommon.java#L133

I think that should be:
  if (!StringUtils.isBlank(tableNameFromMeta))

Haven't tried this, but based on the error that was reported it seems like
this could be the problem.
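
For anyone following along, here is a minimal sketch of that kind of guard
(hypothetical class name, method name, and fallback record name; this is not
the actual NIFI-1596 patch):

    import java.sql.ResultSetMetaData;
    import java.sql.SQLException;

    import org.apache.avro.Schema;
    import org.apache.avro.SchemaBuilder;
    import org.apache.commons.lang3.StringUtils;

    public class JdbcSchemaNameSketch {

        // Some drivers (Oracle, for example) return an empty string from getTableName(),
        // and Avro rejects an empty record name with "SchemaParseException: Empty name".
        // Falling back to a default name when the metadata value is blank avoids that.
        public static Schema createRecordShell(final ResultSetMetaData meta) throws SQLException {
            String tableName = "NiFi_ExecuteSQL_Record"; // assumed fallback name
            final String tableNameFromMeta = meta.getTableName(1);
            if (!StringUtils.isBlank(tableNameFromMeta)) {
                tableName = tableNameFromMeta;
            }
            // Column fields would be appended here the same way JdbcCommon does today.
            return SchemaBuilder.record(tableName).namespace("any.data").fields().endRecord();
        }
    }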

-Bryan

On Sat, Mar 5, 2016 at 4:52 PM, Marcelo Valle Ávila 
wrote:

> Hello Juan,
>
> Thanks for the response,
>
> I deployed a clean NiFi 0.5.1 installation, and the behavior is still there.
> Reading other users' mail on the mailing list, it seems that there is some
> incompatibility between NiFi 0.5.x and Oracle databases (maybe more).
>
> With DB2 databases it works fine.
>
> Regards
>
> 2016-03-04 19:27 GMT+01:00 Juan Sequeiros :
>
>> I wonder if on the controller service DBCPConnectionPool associated to
>> your ExecuteSQL processor you have something that can't be found since it's
>> stored on your older release.
>>
>>
>> On Fri, Mar 4, 2016 at 11:12 AM, Marcelo Valle Ávila 
>> wrote:
>>
>>> Hello community,
>>>
>>> I'm starting my first steps with NiFi, and enjoying how it works!
>>>
>>> I started with version 0.4.1 and a simple flow:
>>>
>>> ExecuteSQL -> ConvertAvroToJSON -> PutEventHub
>>>
>>> Reading from an Oracle database, and everything works like a charm!
>>>
>>> A few days ago NiFi 0.5.1 was released, and I tried a rolling
>>> upgrade, using my old NiFi flow. The upgrade went fine and my flow is
>>> loaded correctly.
>>>
>>> The problem is when I start the ExecuteSQL processor, it doesn't
>>> work... In the log file I can see this trace:
>>>
>>> ERROR [Timer-Driven Process Thread-8]
>>> o.a.nifi.processors.standard.ExecuteSQL
>>> org.apache.avro.SchemaParseException: Empty name
>>> at org.apache.avro.Schema.validateName(Schema.java:1076) ~[na:na]
>>> at org.apache.avro.Schema.access$200(Schema.java:79) ~[na:na]
>>> at org.apache.avro.Schema$Name.<init>(Schema.java:436) ~[na:na]
>>> at org.apache.avro.Schema.createRecord(Schema.java:145) ~[na:na]
>>> at
>>> org.apache.avro.SchemaBuilder$RecordBuilder.fields(SchemaBuilder.java:1732)
>>> ~[na:na]
>>> at
>>> org.apache.nifi.processors.standard.util.JdbcCommon.createSchema(JdbcCommon.java:138)
>>> ~[na:na]
>>> at
>>> org.apache.nifi.processors.standard.util.JdbcCommon.convertToAvroStream(JdbcCommon.java:72)
>>> ~[na:na]
>>> at
>>> org.apache.nifi.processors.standard.ExecuteSQL$1.process(ExecuteSQL.java:158)
>>> ~[na:na]
>>> at
>>> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1953)
>>> ~[nifi-framework-core-0.5.1.jar:0.5.1]
>>> at
>>> org.apache.nifi.processors.standard.ExecuteSQL.onTrigger(ExecuteSQL.java:152)
>>> ~[na:na]
>>> at
>>> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>>> ~[nifi-api-0.5.1.jar:0.5.1]
>>> at
>>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1139)
>>> [nifi-framework-core-0.5.1.jar:0.5.1]
>>> at
>>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:139)
>>> [nifi-framework-core-0.5.1.jar:0.5.1]
>>> at
>>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:49)
>>> [nifi-framework-core-0.5.1.jar:0.5.1]
>>> at
>>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:124)
>>> [nifi-framework-core-0.5.1.jar:0.5.1]
>>> at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>> [na:1.7.0_79]
>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>> [na:1.7.0_79]
>>> at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>> [na:1.7.0_79]
>>> at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>> [na:1.7.0_79]
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> [na:1.7.0_79]
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> [na:1.7.0_79]
>>> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
>>>
>>> I tried with a clean installation of NiFi 0.5.1 and a new clean flow, but
>>> the error still appears, and the processor doesn't start.
>>>
>>> With a downgrade to NiFi 0.4.1 the processor works perfectly.
>>>
>>> Do you have any idea of what can be failing?
>>> Do you think I'm doing something wrong?
>>>
>>> Thanks in advance!
>>> Marcelo
>>>
>>
>>
>>
>> --
>> Juan Carlos Sequeiros
>>
>
>


Re: GetMongo GC Overflow

2016-03-01 Thread Bryan Bende
Hello,

I'm not that familiar with MongoDB, but from looking at the existing
GetMongo processor, it seems to create a FlowFile per Document and only
calls session.commit() once at the very end, which could possibly be a
problem when producing a very significant amount of flow files.
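
For illustration only (a sketch, not the shipped GetMongo code; the commitEvery
batch size and the relationship parameter are assumptions), committing the
session every N FlowFiles rather than once at the very end might look like:

    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.charset.StandardCharsets;

    import org.apache.commons.io.IOUtils;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.io.OutputStreamCallback;
    import org.bson.Document;

    import com.mongodb.client.MongoCursor;

    class BatchedMongoTransferSketch {

        // Write each document to its own FlowFile, but commit the session periodically
        // so a multi-million document cursor does not hold every FlowFile until the end.
        static void transferInBatches(final ProcessSession session, final MongoCursor<Document> cursor,
                                      final Relationship success, final int commitEvery) {
            long count = 0;
            while (cursor.hasNext()) {
                final Document doc = cursor.next();
                FlowFile flowFile = session.create();
                flowFile = session.write(flowFile, new OutputStreamCallback() {
                    @Override
                    public void process(final OutputStream out) throws IOException {
                        IOUtils.write(doc.toJson(), out, StandardCharsets.UTF_8);
                    }
                });
                session.transfer(flowFile, success);
                if (++count % commitEvery == 0) {
                    session.commit(); // release this batch downstream
                }
            }
            session.commit(); // commit any remainder
        }
    }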

When you mentioned writing your own processor, did you do this as your own
project? or were you modifying the one in apache nifi and rebuilding the
whole project?

There should be some information in nifi_home/logs/nifi-app.log that
indicates why it didn't start up. If you could provide the error messages
and stack traces it would help us figure out what went wrong.

Thanks,

Bryan


On Tue, Mar 1, 2016 at 9:08 AM, ajansing  wrote:

> Running Mac OS X 10.10.5
>  Apache Maven 3.3.9
>  java version "1.8.0_72"
>  Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
>
> I've been trying to figure out how to use the GetMongo processor to output
> to a PutHDFS processor.
>
> Some things I think I've figured out:
>
> *Limit* acts exactly as .limit() for Mongo, where all it does is give you
> the first *n* elements in a collection.
> *Batch* isn't a command in Mongo (that I know of) and I can't see how this
> entry does anything for the processor.
>
> I'm working with a collection in the millions and I can't just simply leave
> the limit blank because the JVM runs out of memory. I tried to write my own
> processor and got it to compile under the *mvn clean install*, but when I
> copy the .nar file from the '...nar/target' directory to the
> 'nifi-0.6.0/lib' folder and then try to 'sh nifi.sh run' or 'start', to
> nifi
> refuses to finish booting up and terminates itself.
>
> Taking  GetMongo.java
> <
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-mongodb-bundle/nifi-mongodb-processors/src/main/java/org/apache/nifi/processors/mongodb/GetMongo.java
> >
> and its respective other files, I modified them and changed the following
> method:
>
>
> @Override
> public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
>     final ProcessorLog logger = getLogger();
>     final MongoCollection<Document> collection = getCollection(context);
>     int count = (int) collection.count();
>     int next = context.getProperty(BATCH_SIZE).asInteger();
>     int current = next;
>     while (count >= current) {
>         try {
>             final FindIterable<Document> it =
>                 collection.find().skip(current).limit(context.getProperty(LIMIT).asInteger());
>             final MongoCursor<Document> cursor = it.iterator();
>             try {
>                 FlowFile flowFile = null;
>                 while (cursor.hasNext()) {
>                     flowFile = session.create();
>                     flowFile = session.write(flowFile, new OutputStreamCallback() {
>                         @Override
>                         public void process(OutputStream out) throws IOException {
>                             IOUtils.write(cursor.next().toJson(), out);
>                         }
>                     });
>                     session.getProvenanceReporter().receive(flowFile, context.getProperty(URI).getValue());
>                     session.transfer(flowFile, REL_SUCCESS);
>                 }
>                 session.commit();
>             } finally {
>                 cursor.close();
>             }
>         } catch (final RuntimeException e) {
>             context.yield();
>             session.rollback();
>         }
>         current = current + next;
>     }
> }
>
>
> I also modified the test and abstracts so Maven would compile.
>
> Any thoughts?
>
> I'm trying to make a processor that can traverse over an entire collection
> in the millions; and later /any/ size.
>
> If anyone has already made one and can share, that'd be great too! Thanks!
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/GetMongo-GC-Overflow-tp7729.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Installing NIFI through ambari.

2016-02-26 Thread Bryan Bende
It may have something to do with the APP_ID in the metrics being
'nificluster'. It needs to line up with the APP_ID in the service. I would
recommend leaving it as the default value and see if that works.

On Friday, February 26, 2016, davi  wrote:

> I have a similar situation. I have installed a NIFI cluster (1 NCM / 3
> Nodes) with Ambari. NIFI is running fine and data are sent to the Ambari
> Metrics Collector. I verified with the Phoenix query, but it is still not
> showing on the Ambari web page. Result of Phoenix query:
>
> SELECT METRIC_NAME, SERVER_TIME, APP_ID FROM METRIC_RECORD
> WHERE APP_ID='nificluster' ORDER BY SERVER_TIME LIMIT 20;
>
> +----------------------------------+----------------+
> | METRIC_NAME                      | SERVER_TIME    |
> +----------------------------------+----------------+
> | jvm.daemon_thread_count          | 1456484338751  |
> | jvm.gc.runs.PSMarkSweep          | 1456484338751  |
> | jvm.gc.runs.PSScavenge           | 1456484338751  |
> | jvm.heap_usage                   | 1456484338751  |
> | jvm.gc.time.PSMarkSweep          | 1456484338751  |
> | jvm.gc.time.PSScavenge           | 1456484338751  |
> | jvm.heap_used                    | 1456484338751  |
> | jvm.uptime                       | 1456484338751  |
> | jvm.thread_states.blocked        | 1456484338751  |
> | jvm.thread_states.runnable       | 1456484338751  |
> | jvm.file_descriptor_usage        | 1456484338751  |
> | jvm.thread_states.timed_waiting  | 1456484338751  |
> | jvm.thread_states.terminated     | 1456484338751  |
> | jvm.non_heap_usage               | 1456484338751  |
> | jvm.thread_count                 | 1456484338751  |
> | ActiveThreads                    | 1456484338751  |
> | BytesReadLast5Minutes            | 1456484338751  |
> | BytesReceivedLast5Minutes        | 1456484338751  |
> | FlowFilesQueued                  | 1456484338751  |
> | BytesSentLast5Minutes            | 1456484338751  |
> +----------------------------------+----------------+
>
> And the Ambari view:
> <
> http://apache-nifi-developer-list.39713.n7.nabble.com/file/n7644/nifiservice.png
> >
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Installing-NIFI-through-ambari-tp7194p7644.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.



-- 
Sent from Gmail Mobile


Re: [VOTE] Release Apache NiFi 0.5.1 (RC2)

2016-02-24 Thread Bryan Bende
+1 Release this package as nifi-0.5.1 (binding)

Ran through release helper and everything checked out.
Tested flows against HDFS with and without Kerberos enabled.

On Wed, Feb 24, 2016 at 11:22 PM, Joe Witt  wrote:

> +1 Release this package as nifi-0.5.1 (binding)
>
> Have done the full build verification and have it running on a
> standalone node and in a cluster w/security enabled including
> LDAP/uname/pword.  All working nicely.
>
> On Tue, Feb 23, 2016 at 9:32 PM, Tony Kurc  wrote:
> > Hello,
> > I am pleased to be calling this vote for the source release of Apache
> NiFi
> > nifi-0.5.1.
> >
> > The source zip, including signatures, digests, etc. can be found at:
> > https://repository.apache.org/content/repositories/orgapachenifi-1076
> >
> > The Git tag is nifi-0.5.1-RC2
> > The Git commit ID is 672211b87b4f1e52f8ee5153c26a467b555a331e
> >
> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=672211b87b4f1e52f8ee5153c26a467b555a331e
> >
> > This release candidate is a branch off of support/nifi-0.5.x at
> > e2005fa059fbe128e2e278cda5ed7a27ab6e1ec3
> >
> > Checksums of nifi-0.5.1-source-release.zip:
> > MD5: 9139aaae5d0a42a0fbbb624c2e739cdd
> > SHA1: 374a24354f54c7b6c04ba0897f8e2e0b9a6a5127
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/tkurc.asc
> >
> > KEYS file available here:
> > https://dist.apache.org/repos/dist/release/nifi/KEYS
> >
> > 12 issues were closed/resolved for this release:
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12334887
> >
> > Release note highlights can be found here:
> >
> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version0.5.1
> >
> > The vote will be open for 72 hours.
> > Please download the release candidate and evaluate the necessary items
> > including checking hashes, signatures, build from source, and test. Then
> > please vote:
> >
> > [ ] +1 Release this package as nifi-0.5.1
> > [ ] +0 no opinion
> > [ ] -1 Do not release this package because...
> >
> > Thanks!
> > Tony
>


Re: Issue connecting to Oracle-HPSM using NIFI

2016-02-22 Thread Bryan Bende
Naveen,

Can you check nifi_home/logs/nifi-app.log to see if there is a full stack
trace beyond the message that says "unsupported feature"?

Would be curious to see what the full stack trace looks like.

-Bryan


On Mon, Feb 22, 2016 at 2:26 PM, Kilaru, Naveen Kumar <
naveenkumar.kil...@capitalone.com> wrote:

>
> Hi Apache-Nifi Dev Team,
>
>
> We are validating NIFI for our business solution and while we do a POC we
> are facing a strange issue using "nifi-1.1.1.0"
>
>
> We are trying to load data from an Oracle HPSM source system and we see the
> below error message while we try to execute the flow.
>
>
> Query used : select query SELECT * FROM SERVICEMANAGER.DEVICE2M1
>
>
> Unable to execute SQL select query SELECT * FROM SERVICEMANAGER.DEVICE2M1
> due to org.apache.nifi.processor.exception.ProcessException:
> java.sql.SQLException: Unsupported feature. No incoming flow file to route
> to failure:
>
> org.apache.nifi.processor.exception.ProcessException:
> java.sql.SQLException: Unsupported feature
>
> Please help us with this issue as we would like to use NIFI in our tech
> stack for current implementation.
>
> Note: Same flow was executed successfully with the query  "select * from
> sample"
>
> Thanks,
> Naveen
> 7326667677
> 
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates and may only be used
> solely in performance of work or services for Capital One. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed. If the reader of this message is not the intended
> recipient, you are hereby notified that any review, retransmission,
> dissemination, distribution, copying or other use of, or taking of any
> action in reliance upon this information is strictly prohibited. If you
> have received this communication in error, please contact the sender and
> delete the material from your computer.
>


Re: Set default yield duration

2016-02-17 Thread Bryan Bende
Adam,

I think the idea was for the user to always control these values through
the user interface, and therefore there isn't a way that I know of for a
specific processor to control the default values for yield duration and run
schedule.

I have always wanted to be able to do this though, specifically for the run
schedule. The main scenario is a "Get" processor that is going to poll an
external system and extract data. In this case, a run schedule of 0 seconds
is almost never the desired value, and if a user runs the processor
forgetting to change the duration, they start killing the external system.

Curious to hear what others think.

-Bryan


On Wed, Feb 17, 2016 at 1:19 AM, Adam Lamar  wrote:

> Hey everyone,
>
> I've been working on a processor that yields often, and the default yield
> duration of 1 second is too short. Is there any way to change this default
> for a specific processor only?
>
> Alternatively, can the run schedule default duration be changed in a
> similar way?
>
> Cheers,
> Adam
>


Re: Nifi - Adding attributes to flow file results in FlowFileHandlingException when flow file is transferred

2016-02-10 Thread Bryan Bende
Hello,

The error message is indicating that you are trying to transfer an unknown
FlowFile, because the code is transferring a reference to the original FlowFile
from before you updated the attributes. You would need to assign the result of
putAllAttributes (or putAttribute) and then transfer that:

flowFile = session.putAllAttributes(flowFile, attributes);
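
Expanded slightly for illustration (the attribute names and values below are
hypothetical; the point is only the reassignment of the returned FlowFile):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.Relationship;

    class AttributeUpdateSketch {

        // putAllAttributes() returns a new FlowFile reference; that returned reference
        // (not the one captured before the update) is what must be transferred.
        static void updateAndTransfer(final ProcessSession session, FlowFile flowFile,
                                      final Relationship relSuccess) {
            final Map<String, String> attributes = new HashMap<>();
            attributes.put("record.id", "12345");   // hypothetical attribute
            attributes.put("error.code", "none");   // hypothetical attribute

            flowFile = session.putAllAttributes(flowFile, attributes); // reassign the result
            session.transfer(flowFile, relSuccess);
        }
    }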

Thanks,

Bryan

On Wed, Feb 10, 2016 at 11:41 AM, M Singh 
wrote:

> Hi:
> I am processing some flow files and want to add success and failure
> attributes to the processed flow file and then transfer it.  But this is
> producing an exception:
> org.apache.nifi.processor.exception.FlowFileHandlingException:
> StandardFlowFileRecord[uuid=432bc163-28a0-4d08-b9e8-1674a649ae8c,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1455121175949-1, container=default,
> section=1], offset=0, length=1],offset=0,name=129790440390423,size=1] is
> not known in this session (StandardProcessSession[id=70]) at
> org.apache.nifi.controller.repository.StandardProcessSession.validateRecordState(StandardProcessSession.java:2361)
> ~[nifi-framework-core-0.5.0-SNAPSHOT.jar:0.5.0-SNAPSHOT]
> Here is the code segment in the onTrigger method:
> Note - If I comment out the lines (as shown below) where I tried to add
> attributes to the flow file, it works.  If I uncomment the lines (either
> adding single attributes or multiple), the exception is produced.
> try {
>     List<Record> records = new ArrayList<>();
>
>     // Prepare batch of records
>     for (int i = 0; i < flowFiles.size(); i++) {
>         final ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         session.exportTo(flowFiles.get(i), baos);
>         records.add(new Record().withData(ByteBuffer.wrap(baos.toByteArray())));
>     }
>
>     // Send the batch
>     PutRecordBatchRequest putRecordBatchRequest = new PutRecordBatchRequest();
>     putRecordBatchRequest.setDeliveryStreamName(streamName);
>     putRecordBatchRequest.setRecords(records);
>     PutRecordBatchResult results = client.putRecordBatch(putRecordBatchRequest);
>
>     // Separate out the successful and failed flow files
>     List<PutRecordBatchResponseEntry> responseEntries = results.getRequestResponses();
>     List<FlowFile> failedFlowFiles = new ArrayList<>();
>     List<FlowFile> successfulFlowFiles = new ArrayList<>();
>     for (int i = 0; i < responseEntries.size(); i++) {
>         PutRecordBatchResponseEntry entry = responseEntries.get(i);
>         FlowFile flowFile = flowFiles.get(i);
>         Map<String, String> attributes = new HashMap<>();
>         attributes.put(RECORD_ID, entry.getRecordId());
>         // NOTE - If I uncomment this line - or any other which adds attributes
>         // to the flowfile - I get the exception
>         // session.putAttribute(flowFile, RECORD_ID, entry.getRecordId());
>         if (!StringUtils.isBlank(entry.getErrorCode())) {
>             attributes.put(ERROR_CODE, entry.getErrorCode());
>             attributes.put(ERROR_MESSAGE, entry.getErrorMessage());
>             // session.putAllAttributes(flowFile, attributes);
>             failedFlowFiles.add(flowFile);
>         } else {
>             // session.putAllAttributes(flowFile, attributes);
>             successfulFlowFiles.add(flowFile);
>         }
>     }
>
>     if (failedFlowFiles.size() > 0) {
>         session.transfer(failedFlowFiles, REL_FAILURE);
>         getLogger().error("Failed to send {} records {}", new Object[]{stream, failedFlowFiles});
>     }
>
>     if (successfulFlowFiles.size() > 0) {
>         // Throws exception when attributes are added to flow files
>         session.transfer(successfulFlowFiles, REL_SUCCESS);
>         getLogger().info("Success sent {} records {}", new Object[]{stream, successfulFlowFiles});
>     }
>
>     records.clear();


Re: Using NiFI with a Secured HBASE setup

2016-02-08 Thread Bryan Bende
Laxman,

The current HBase integration does not support Kerberized HBase installs at
the moment.

I created a JIRA to track this:
https://issues.apache.org/jira/browse/NIFI-1488

-Bryan

On Mon, Feb 8, 2016 at 10:36 AM,  wrote:

> Hi,
>
> I have configured an HBase client and kinit on the machine that NiFi is
> running on, but I get the following error:
>
> 016-02-08 15:30:12,475 WARN [pool-26-thread-1]
> o.a.hadoop.hbase.ipc.AbstractRpcClient Exception encountered while
> connecting to the server : javax.security.sasl.SaslException: GSS initiate
> failed [Caused by GSSException: No valid credentials provided (Mechanism
> level: Failed to find any Kerberos tgt)]
> 2016-02-08 15:30:12,476 ERROR [pool-26-thread-1]
> o.a.hadoop.hbase.ipc.AbstractRpcClient SASL authentication failed. The most
> likely cause is missing or invalid credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed
> at
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(Unknown
> Source) ~[na:1.8.0_71]
> at
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
> ~[hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:642)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:166)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:769)
> ~[hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:766)
> ~[hbase-client-1.1.2.jar:1.1.2]
> at java.security.AccessController.doPrivileged(Native Method)
> ~[na:1.8.0_71]
> at javax.security.auth.Subject.doAs(Unknown Source) ~[na:1.8.0_71]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> ~[hadoop-common-2.6.2.jar:na]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:766)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:920)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:889)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1222)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:50918)
> [hbase-protocol-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1564)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1502)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1524)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1553)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1704)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:124)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3917)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.listTableNames(HBaseAdmin.java:413)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.listTableNames(HBaseAdmin.java:397)
> [hbase-client-1.1.2.jar:1.1.2]
> at
> org.apache.nifi.hbase.HBase_1_1_2_ClientService.onEnabled(HBase_1_1_2_ClientService.java:137)
> [nifi-hbase_1_1_2-client-service-0.4.1.jar:0.4.1]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ~[na:1.8.0_71]
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> ~[na:1.8.0_71]
> at 

Re: Using NiFI with a Secured HBASE setup

2016-02-08 Thread Bryan Bende
I should add that the HDFS processors do support communicating with a
kerberized HDFS.

There are properties on the processors to configure the keytab and
principal, as well as the following property in nifi.properties:

nifi.kerberos.krb5.file=
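
For example, with hypothetical paths and principal (the real values depend on
your environment):

    # conf/nifi.properties
    nifi.kerberos.krb5.file=/etc/krb5.conf

    # On the HDFS processor (e.g. PutHDFS) properties:
    #   Kerberos Principal: nifi@EXAMPLE.COM
    #   Kerberos Keytab:    /etc/security/keytabs/nifi.keytab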


Thanks,

Bryan

On Mon, Feb 8, 2016 at 11:29 AM, Bryan Bende <bbe...@gmail.com> wrote:

> Laxman,
>
> The current HBase integration does not support Kerberized HBase installs
> at the moment.
>
> I created a JIRA to track this:
> https://issues.apache.org/jira/browse/NIFI-1488
>
> -Bryan
>
> On Mon, Feb 8, 2016 at 10:36 AM, <laxman.siy...@thomsonreuters.com> wrote:
>
>> Hi,
>>
>> I have configured an HBase client and kinit on the machine that NiFi
>> is running on, but I get the following error:
>>
>> 016-02-08 15:30:12,475 WARN [pool-26-thread-1]
>> o.a.hadoop.hbase.ipc.AbstractRpcClient Exception encountered while
>> connecting to the server : javax.security.sasl.SaslException: GSS initiate
>> failed [Caused by GSSException: No valid credentials provided (Mechanism
>> level: Failed to find any Kerberos tgt)]
>> 2016-02-08 15:30:12,476 ERROR [pool-26-thread-1]
>> o.a.hadoop.hbase.ipc.AbstractRpcClient SASL authentication failed. The most
>> likely cause is missing or invalid credentials. Consider 'kinit'.
>> javax.security.sasl.SaslException: GSS initiate failed
>> at
>> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(Unknown
>> Source) ~[na:1.8.0_71]
>> at
>> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>> ~[hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:642)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:166)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:769)
>> ~[hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:766)
>> ~[hbase-client-1.1.2.jar:1.1.2]
>> at java.security.AccessController.doPrivileged(Native Method)
>> ~[na:1.8.0_71]
>> at javax.security.auth.Subject.doAs(Unknown Source) ~[na:1.8.0_71]
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>> ~[hadoop-common-2.6.2.jar:na]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:766)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:920)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:889)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1222)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:50918)
>> [hbase-protocol-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1564)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1502)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1524)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1553)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1704)
>> [hbase-client-1.1.2.jar:1.1.2]
>> at
>> org.apache

Re: Installing NIFI through ambari.

2016-02-07 Thread Bryan Bende
Hello,

The Apache NiFi community only provides an example service for viewing
metrics in Ambari [1]. This example does not actually install and manage
NiFi through Ambari; it only acts as a placeholder to view metrics.

-Bryan

[1] https://cwiki.apache.org/confluence/display/NIFI/Ambari+Metrics




On Sun, Feb 7, 2016 at 9:03 AM, pd  wrote:

> I'm trying to install NIFI on an active hadoop cluster (2.3.2) through Ambari
> (2.1.2). When I go through the installation screen, I have an option to
> select the master node, but once I select that it moves to the "Customize
> Services" screen, skipping the "Assign Slaves and Clients" selection screen.
> Does anyone know the fix/workaround?
>
> Ambari node:
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/NIFI
> root@g1:NIFI> ls -ltr
> total 72
> -rwxr-xr-x 1 root 14940 Feb  6 14:24 widgets.json*
> -rwxr-xr-x 1 root 17908 Feb  6 14:24 README.md*
> -rwxr-xr-x 1 root  2141 Feb  6 14:24 nifi-bootstrap.xml*
> -rwxr-xr-x 1 root  4249 Feb  6 14:24 metrics.json*
> -rwxr-xr-x 1 root  1834 Feb  6 14:24 metainfo.xml*
> -rwxr-xr-x 1 root   221 Feb  6 14:24 kerberos.json*
> drwxr-xr-x 3 root  4096 Feb  6 14:24 package/
> drwxr-xr-x 2 root  4096 Feb  6 14:24 screenshots/
> drwxr-xr-x 2 root  4096 Feb  6 14:24 demofiles/
> drwxr-xr-x 2 root  4096 Feb  7 08:22 configuration/
>
>
> <
> http://apache-nifi-developer-list.39713.n7.nabble.com/file/n7194/nifi-issue-screen.png
> >
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Installing-NIFI-through-ambari-tp7194.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Does Nifi-ambari reporting supports Ambari version 2.1.1

2016-02-06 Thread Bryan Bende
Joe/Shweta,

I view the Ambari stuff as two different pieces... the AmbariReportingTask
in NiFi which was developed against Ambari 2.1, but I believe should work
against any 2.X version because the Ambari Metrics Service was introduced
in 2.0.0. The reporting task can send over metrics independent of the
service definition, there does not even need to be a NiFi service installed
in Ambari for the metrics to be sent over.

The service definition is what installs the service in Ambari and gives you
the dashboard so you can see the metrics. The service definition that is
provided on the Wiki page is an example service that works only with HDP
2.3. The directory structure would have to be modified for a different
version. It may be as simple as changing "stacks/HDP/2.3" to
"stacks/HDP/".

-Bryan


On Sat, Feb 6, 2016 at 11:11 AM, shweta  wrote:

> Yes Joe, we followed that same document.
>
>
> Regards,
> Shweta
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Does-Nifi-ambari-reporting-supports-Ambari-version-2-1-1-tp7178p7180.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: How to capture more than 40 groups in Extract Text

2016-02-03 Thread Bryan Bende
Hi Shweta,

You may want to consider a custom processor at this point.
The csv-to-json example works ok for smaller csv files, but admittedly is
not a great solution when there are a lot of columns.
There has been interest from the community in the past on having a
ConvertCsvToJson processor, but no one has taken on the task yet [1].

-Bryan

[1] https://issues.apache.org/jira/browse/NIFI-1398
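
For what it's worth, the core of such a conversion is fairly small; a minimal
sketch (hypothetical class and column names, ignoring CSV quoting and escaping)
using Jackson might look like:

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.node.ObjectNode;

    public class CsvLineToJsonSketch {

        private static final ObjectMapper MAPPER = new ObjectMapper();

        // Turn one CSV record into a JSON object using the header names as keys,
        // avoiding the ExtractText capture-group limit entirely.
        public static String toJson(final String[] headers, final String[] values) throws Exception {
            final ObjectNode node = MAPPER.createObjectNode();
            for (int i = 0; i < headers.length && i < values.length; i++) {
                node.put(headers[i], values[i]);
            }
            return MAPPER.writeValueAsString(node);
        }

        public static void main(final String[] args) throws Exception {
            final String[] headers = "id,name,city".split(",");   // hypothetical columns
            final String[] values = "1,Shweta,Delhi".split(",");
            System.out.println(toJson(headers, values));          // {"id":"1","name":"Shweta","city":"Delhi"}
        }
    }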


On Tue, Feb 2, 2016 at 11:40 PM, shweta  wrote:

> Hi All,
>
> I have a requirement wherein I need to convert a csv file to JSON. The input
> csv file has 135 attributes.
> I referred to the nifi example template csv-to-json.xml which uses a
> combination of the ReplaceText and ExtractText processors.
> But I think ExtractText has a limitation of capturing not more than 40
> groups.
> Is there a way around this to handle the scenario?
>
> Regards,
> Shweta
>
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/How-to-capture-more-than-40-groups-in-Extract-Text-tp7115.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Are we thinking about Penalization all wrong?

2016-01-28 Thread Bryan Bende
I really like the idea of being able to have different penalty durations
for each connection, and I think this would make things a bit clearer than
relying on the developer to determine when to penalize things.

On Thu, Jan 28, 2016 at 11:34 AM, Mark Payne  wrote:

> All,
>
> I've been thinking about how we handle the concept of penalizing
> FlowFiles. We've had a lot of questions
> lately about how penalization works & the concept in general. Seems the
> following problems exist:
>
> - Confusion about difference between penalization & yielding
> - DFM sees option to configure penalization period on all processors, even
> if they don't penalize FlowFiles.
> - DFM cannot set penalty duration in 1 case and set a different value for
> a different case (different relationship, for example).
> - Developers often forget to call penalize()
> - Developer has to determine whether or not to penalize when building a
> processor. It is based on what the developer will
> think may make sense, but in reality DFM's sometimes want to penalize
> things when the processor doesn't behave that way.
>
> I'm wondering if it doesn't make sense to remove the concept of
> penalization all together from Processors and instead
> move the Penalty Duration so that it's a setting on the Connection. I
> think this would clear up the confusion and give the DFM
> more control over when/how long to penalize. Could set to the default to
> 30 seconds for self-looping connections and no penalization
> for other connections.
>
> Any thoughts?
>
> Thanks
> -Mark


Re: ListenLumberjack processor is working

2016-01-15 Thread Bryan Bende
Andre,

Very cool that you have made progress here. Being able to integrate with
logstash will be very useful.

I think the refactoring I'm doing for the RELP stuff should help reduce the
amount of code that had to be carried over from ListenSyslog. I'm happy to
help you update your code once my changes are in. Sorry it hasn't gotten in
sooner.

-Bryan


On Fri, Jan 15, 2016 at 8:39 AM, Andre  wrote:

> Hey folks,
>
> I've managed to progress on ListenLumberjack. The code is a bit
> 'spaghettic' at the moment, with a serious amount of logging enabled
> to allow some additional troubleshooting, but overall it "works".
>
> I am strongly considering refactoring the code as a whole once Bryan completes
> the ListenRELP processor.
>
> Functional code (I guess? :D ) should be available in here:
>
> https://github.com/trixpan/nifi-lumberjack-bundle/
>
> Known issues:
> * If logstash-forwarder goes silent for too long the processor will raise a
> Timeout. Couldn't find evidence of a keep alive within Lumberjack so I am
> considering catching this error as debug.
> * I suspect the code may have some memory leaks.
> * Tests haven't been created yet. To be honest I never wrote unit tests in
> my whole life so it will be another ride. :-)
>
> My results were the following:
>
> Single thread, 2 sec runs
> 2016/01/15 23:52:27.589014 Registrar: processing 4000 events
> 2016/01/15 23:52:29.169361 Registrar: processing 4000 events
> 2016/01/15 23:52:30.552031 Registrar: processing 4000 events
> 2016/01/15 23:52:32.998425 Registrar: processing 4000 events
> 2016/01/15 23:52:35.411438 Registrar: processing 4000 events
> 2016/01/15 23:52:37.062141 Registrar: processing 4000 events
> 2016/01/15 23:52:39.468577 Registrar: processing 4000 events
> 2016/01/15 23:52:40.940890 Registrar: processing 4000 events
> 2016/01/15 23:52:43.480875 Registrar: processing 4000 events
> 2016/01/15 23:52:45.026758 Registrar: processing 4000 events
>
> 4 threads, 2 sec runs
> 2016/01/15 23:56:03.376303 Registrar: processing 4000 events
> 2016/01/15 23:56:03.443074 Registrar: processing 4000 events
> 2016/01/15 23:56:03.471795 Registrar: processing 4000 events
> 2016/01/15 23:56:03.508283 Registrar: processing 4000 events
> 2016/01/15 23:56:03.534002 Registrar: processing 4000 events
> 2016/01/15 23:56:03.562387 Registrar: processing 4000 events
> 2016/01/15 23:56:03.587744 Registrar: processing 4000 events
> 2016/01/15 23:56:03.622716 Registrar: processing 4000 events
> 2016/01/15 23:56:03.649074 Registrar: processing 4000 events
> 2016/01/15 23:56:03.675780 Registrar: processing 4000 events
>
> Would anyone have a decent logstash testbed to put some extra pressure
> against the processor?
>


Re: Direction for Integration Tests

2016-01-13 Thread Bryan Bende
I also like Mans categories, as I know I have personally written some unit
tests that don't require any external dependencies, but might have veered
closer to an integration test than a true unit test.
Some examples I am thinking of are the tests for Put/GetSolr which use an
EmbeddedSolrServer, or TestListenSyslog which actually starts listening on
a port and receives messages over a socket.
I would still want those tests to run as part of the normal build, which I
think is what Mans is suggesting.

I'm not sure if the "no external dependencies" would always hold true as a
rule though. I could see there being some more complex tests between
internal components that might not have external
dependencies, but still makes sense to have in the IT part of the build. As
an example, if we ever create a mechanism to connect two processors
together and test the flow between them, this feels
more like a true integration test, yet it might not have any external
dependencies. So maybe it is a case by case basis.



On Wed, Jan 13, 2016 at 3:02 PM, Joe Skora  wrote:

> I like Mans' categories.
>
> Logically, I think the 2nd and 3rd categories fold together with "no
> dependency" integration tests being another case alongside "hadoop
> dependency", "jms dependency", "aws dependency", etc.
>
> Then I would expect that
>
>- every component should have unit tests,
>- most components (if not every) should have "no dependency" integration
>tests, and
>- externally connected components should have "external dependency"
>integration tests.
>
>
>
> On Wed, Jan 13, 2016 at 2:39 PM, M Singh 
> wrote:
>
> > Hi:
> > My thought is that we can classify tests into 3 major categories -
> >- Pure unit test - class isolated using mocks etc
> >- Integration with no external dependencies - some interaction with
> > other classes/in memory db, or mocks, but no external resources.
> >- Integration with external resource upto and including end-to-end -
> > Which require real external resources (http endpoint, aws etc)
> >
> > In this way, we can always keep unit and integration w/o ext dependencies
> > tests while conditionally enabling integration tests.
> > Mans
> >
> > On Wednesday, January 13, 2016 11:24 AM, Aldrin Piri <
> > aldrinp...@gmail.com> wrote:
> >
> >
> >  Definitely agree and can see value in those as well.  The core issue I
> am
> > trying to address with this is the one highlighted by Joe.
> >
> > As an intermediate step, I have created a PR [1] that shows how I
> envision
> > a baseline of this.  Currently this treats all ITs (in this external
> > resource context) the same without separate profiles.  What I am trying
> to
> > do is not impede a place for unit tests (or, perhaps more generically,
> > those that can run as part of the build).
> >
> > [1] https://github.com/apache/nifi/pull/173
> >
> > On Wed, Jan 13, 2016 at 8:19 AM, Oleg Zhurakousky <
> > ozhurakou...@hortonworks.com> wrote:
> >
> > > Aldrin
> > >
> > > IMHO there are two types of integration testing; 1. Integration
> testing a
> > > Processor/ControllerService with the actual target system, 2.
> Integration
> > > testing of the flow or part of the flow.
> > > The second one essentially is the same as the first one but the target
> > > system is NiFi itself, and we are seriously lacking on that type
> of
> > > testing since NiFi is highly modularized and there is not a single
> module
> > > where such testing could be performed. As I’ve mentioned earlier in
> this
> > > list, I’ve already started such module on my fork
> > > https://github.com/olegz/nifi/tree/int-test which allowed me already
> to
> > > discover a few bugs and I think we need to start talking about pushing
> it
> > > to the trunk.
> > >
> > > So what I am essentially saying is that while profile and other maven
> > > tricks may set a convention to allow one to provide "conditional target
> > > system testing” (which is essentially what you mean by integration test
> > > with AWS etc), we still need to think about automating internal NiFi
> > > integration testing.
> > >
> > > Cheers
> > > Oleg
> > >
> > >
> > > On Jan 12, 2016, at 10:26 PM, Aldrin Piri   > > aldrinp...@gmail.com>> wrote:
> > >
> > > All,
> > >
> > > We have had a few conversations regarding integration tests in the past
> > In
> > > part, this issue has resurfaced thanks to the work going on with
> > NIFI-1325
> > > in extending AWS related processors and extensions.  The issue at hand
> is
> > > that, In this particular case, the contributor made test cases to
> > > supplement their contribution but, because the integration level test
> > > classes are ignored, had to create separate unit test classes to get
> > these
> > > tests to run.  AbstractS3Test lays the ground work quite appropriately
> > for
> > > integration testing, but doesn't so much apply to unit tests where
> > > possible.
> > >
> > > What I 

Syslog Classes

2016-01-05 Thread Bryan Bende
All,

I'm working on NIFI-1273 to add support for the RELP protocol (Reliable
Event Logging Protocol) to the syslog processors. In order to do this I'll
likely have to add at least one more channel reader implementation to the
inner classes that already exist in ListenSyslog. I'm starting to think
there might be a bit too much going on in there and it might be easier to
manage and understand if the inner classes were moved to regular classes.

If we agree that is a good idea, then the question is where to put them.
In hindsight it probably would have been better to have a syslog bundle,
instead of putting the syslog processors in the nifi-standard-processors,
then all of these classes could live there. The processors don't have any
special dependencies which is why the standard bundle initially seemed like
a good idea.

Since we have to be careful of breaking changes, the options I see are:

1) Keep the syslog processors in nifi-standard-processors, and put these
classes under the util package where SyslogParser and SyslogEvent are.
Maybe create org.apache.nifi.processors.standard.util.syslog to group them
together under util.

2) Keep the syslog processors in nifi-standard-processors, but create a
nifi-syslog-utils project in nifi-commons and put all supporting code
there. I doubt that any other parts of NiFi would need to make use of this
artifact, but it would create a nice isolated syslog library. I think we
could safely move most of the inner classes there since they are private,
but not sure if we can move SyslogParser and SyslogEvent yet since they are
public classes in standard processors.

3) Create a syslog bundle with copies of the processors, do all new work
there, including NIFI-1273. Mark the existing processors as deprecated and
remove on 1.0. Seems unfortunate to deprecate processors one release after
releasing them, and would force anyone wanting RELP to switch to the new
bundle, but seems to be the only way to create a separate bundle if that is
what we wanted.

What do others think about this?

#1 is obviously the least intrusive and easiest, but I'm not sure it is the
best choice, especially given that we want to move to an extension registry
eventually, and would probably want to break apart some of standard
processors.

#2 might be a good middle ground. Leaving the processor part for another
time.

-Bryan


Re: Syslog Classes

2016-01-05 Thread Bryan Bende
Sounds good to me. #1 does address the immediate problem as you mentioned.

Since the util package has a lot of stuff in it, my preference would be to
have a syslog package under util, or even at the same level as util. If we
did that, would moving SyslogParser and SyslogEvent to that package be
considered a breaking change?
Seems "less breaking" than moving a processor to a new package which could
break someone's flow, but still possible someone is using one of those
classes since they are public.
I can move the other stuff and leave those two alone, but just wanted to
double-check.

-Bryan

On Tue, Jan 5, 2016 at 4:11 PM, Joe Witt <joe.w...@gmail.com> wrote:

> Bryan
>
> Great writeup on the tradeoffs.  From my perspective #1 seems quite
> fine for now.  I see no need to create new processors and it seems
> like the only problem to be solved right now is to make the code
> cleaner/more readable.  #1 it sounds like solves that.  There is
> perhaps another topic to address one day which is the grouping of
> processors within a bundle.  Syslog as its own nar was probably the
> right call but this is just fine for now.  When we go with a registry
> model then we will revisit these and many others anyway.
>
> Thanks
> Joe
>
> On Tue, Jan 5, 2016 at 10:22 AM, Bryan Bende <bbe...@gmail.com> wrote:
> > All,
> >
> > I'm working on NIFI-1273 to add support for the RELP protocol (Reliable
> > Event Logging Protocol) to the syslog processors. In order to do this
> I'll
> > likely have to add at least one more channel reader implementation to the
> > inner classes that already exist in ListenSyslog. I'm starting to think
> > there might be a bit too much going on in there and it might be easier to
> > manage and understand if the inner classes were moved to regular classes.
> >
> > If we agree that is a good idea, then the question is where to put
> them
> > In hindsight it probably would have been better to have a syslog bundle,
> > instead of putting the syslog processors in the nifi-standard-processors,
> > then all of these classes could live there. The processors don't have any
> > special dependencies which is why the standard bundle initially seemed
> like
> > a good idea.
> >
> > Since we have to be careful of breaking changes, the options I see are:
> >
> > 1) Keep the syslog processors in nifi-standard-processors, and put these
> > classes under the util package where SyslogParser and SyslogEvent are.
> > Maybe create org.apache.nifi.processors.standard.util.syslog to group
> them
> > together under util.
> >
> > 2) Keep the syslog processors in nifi-standard-processors, but create a
> > nifi-syslog-utils project in nifi-commons and put all supporting code
> > there. I doubt that any other parts of NiFi would need to make use of
> this
> > artifact, but it would create a nice isolated syslog library. I think we
> > could safely move most of the inner classes there since they are private,
> > but not sure if we can move SyslogParser and SyslogEvent yet since they
> are
> > public classes in standard processors.
> >
> > 3) Create a syslog bundle with copies of the processors, do all new work
> > there, including NIFI-1273. Mark the existing processors as deprecated
> and
> > remove on 1.0. Seems unfortunate to deprecate processors one release
> after
> > releasing them, and would force anyone wanting RELP to switch to the new
> > bundle, but seems to be the only way to create a separate bundle if that
> is
> > what we wanted.
> >
> > What do others think about this?
> >
> > #1 is obviously the least intrusive and easiest, but I'm not sure it is
> the
> > best choice, especially given that we want to move to an extension
> registry
> > eventually, and would probably want to break apart some of standard
> > processors.
> >
> > #2 might be a good middle ground. Leaving the processor part for another
> > time.
> >
> > -Bryan
>


Re: about the debug of nifi

2016-01-05 Thread Bryan Bende
Hello,

The nifi-ide-integration project will not have any src, it just generates
the project files for Eclipse or IntelliJ (in your case the .iml, .ipr, and
.iws files).
You will need to have the nifi source code somewhere else, and the
remainder of the instructions here [1] explain how to link them together.

-Bryan

[1] https://github.com/olegz/nifi-ide-integration/blob/master/README.md

On Tue, Jan 5, 2016 at 1:51 AM, 522250...@qq.com <522250...@qq.com> wrote:

>
> I want to debug NiFi, and I found the guidance in the wiki. Then I used Gradle
> following https://github.com/olegz/nifi-ide-integration/.  But after I run the
> command "./gradlew clean idea", I just cannot find any source in the
> "nifi-ide-integration" folder.  Here are the total contents of my folder:
>
> .:
> build.gradle  gradle  gradlew  gradlew.bat  nifi-ide-integration.iml
> nifi-ide-integration.ipr  nifi-ide-integration.iws  README.md
> settings.gradle  src
>
> ./gradle:
> wrapper
>
> ./gradle/wrapper:
> gradle-wrapper.jar  gradle-wrapper.properties
>
> ./src:
> main
>
> ./src/main:
> resources
>
> ./src/main/resources:
> log4j.properties
>
>
>
> Then how should I go about debugging NiFi?
> Thanks for any reply.
>
>
> 522250...@qq.com
>


Re: Syslog Classes

2016-01-05 Thread Bryan Bende
If it helps at all, here is a possible refactoring based on making a syslog
package under org.apache.nifi.processors.standard, and moving the parser
and event classes there as well:

https://github.com/bbende/nifi/tree/NIFI-1273/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/syslog


On Tue, Jan 5, 2016 at 6:14 PM, Tony Kurc <trk...@gmail.com> wrote:

> Excellent questions, Bryan. I'll give it some thought tonight.
>
> On Tue, Jan 5, 2016 at 11:22 AM, Bryan Bende <bbe...@gmail.com> wrote:
>
> > All,
> >
> > I'm working on NIFI-1273 to add support for the RELP protocol (Reliable
> > Event Logging Protocol) to the syslog processors. In order to do this
> I'll
> > likely have to add at least one more channel reader implementation to the
> > inner classes that already exist in ListenSyslog. I'm starting to think
> > there might be a bit too much going on in there and it might be easier to
> > manage and understand if the inner classes were moved to regular classes.
> >
> > If we agree that is a good idea, then the question is where to put
> them
> > In hindsight it probably would have been better to have a syslog bundle,
> > instead of putting the syslog processors in the nifi-standard-processors,
> > then all of these classes could live there. The processors don't have any
> > special dependencies which is why the standard bundle initially seemed
> like
> > a good idea.
> >
> > Since we have to be careful of breaking changes, the options I see are:
> >
> > 1) Keep the syslog processors in nifi-standard-processors, and put these
> > classes under the util package where SyslogParser and SyslogEvent are.
> > Maybe create org.apache.nifi.processors.standard.util.syslog to group
> them
> > together under util.
> >
> > 2) Keep the syslog processors in nifi-standard-processors, but create a
> > nifi-syslog-utils project in nifi-commons and put all supporting code
> > there. I doubt that any other parts of NiFi would need to make use of
> this
> > artifact, but it would create a nice isolated syslog library. I think we
> > could safely move most of the inner classes there since they are private,
> > but not sure if we can move SyslogParser and SyslogEvent yet since they
> are
> > public classes in standard processors.
> >
> > 3) Create a syslog bundle with copies of the processors, do all new work
> > there, including NIFI-1273. Mark the existing processors as deprecated
> and
> > remove on 1.0. Seems unfortunate to deprecate processors one release
> after
> > releasing them, and would force anyone wanting RELP to switch to the new
> > bundle, but seems to be the only way to create a separate bundle if that
> is
> > what we wanted.
> >
> > What do others think about this?
> >
> > #1 is obviously the least intrusive and easiest, but I'm not sure it is
> the
> > best choice, especially given that we want to move to an extension
> registry
> > eventually, and would probably want to break apart some of standard
> > processors.
> >
> > #2 might be a good middle ground. Leaving the processor part for another
> > time.
> >
> > -Bryan
> >
>


Re: Standing a REST interface on GetMongo processor

2015-12-23 Thread Bryan Bende
Hello,

You are correct that GetMongo is currently a source processor that does not
accept input.

I don't know of any existing plans for it to support incoming FlowFIles,
but it sounds like a useful feature and would be a good request to make in
JIRA [1].

-Bryan

[1] https://issues.apache.org/jira/browse/NIFI


On Wed, Dec 23, 2015 at 1:48 PM, AUDET Frederic <
frederic.au...@ca.thalesgroup.com> wrote:

> Hi guys,
>
> Is there a plan to allow an incoming flow (properties) to parameterize a
> GetMongo processor?
>
> I'd like to stand up a REST interface (HandleHttpRequest) to grab query
> parameters for a MongoDB; it looks like the GetMongo processor is top level
> in a flow diagram...
> -Fred
>
>
> 
> WARNING : This message is intended only for the named recipients. This
> message may contain information that is confidential. Any dissemination,
> copying, or use of this message or its contents by anyone other than the
> named recipient is strictly prohibited. If you are not a named recipient or
> an employee or agent responsible for delivering this message to a named
> recipient, please notify the sender immediately, and immediately destroy
> this message and any copies you may have. WARNING: Email may not be secure
> unless properly encrypted. It is possible for e-mails to be intercepted or
> affected by viruses. While we maintain virus checks on all e-mails, we
> accept no liability for viruses or other material introduced with this
> message.
>


Re: [VOTE] Release Apache NiFi 0.4.1 (rc1)

2015-12-21 Thread Bryan Bende
+1 (binding) Release this package as Apache NiFi 0.4.1

On Sat, Dec 19, 2015 at 4:55 PM, Tony Kurc  wrote:

> I've verified the signature, hashes, built on Ubuntu 14.04 (x86_64) with
> Oracle java 7. Built binaries worked as expected.
>
> Licence and notice look good
>
> +1
> +1 (binding)
>
> Went through the RC validation procedure.  All looks good.  Ran on a
> single node + cluster + SSL + LDAP and went through pretty much all of
> the bugs and verified behavior.
>
> Thanks
> Joe
>
> On Sat, Dec 19, 2015 at 12:59 PM, Matt Gilman 
> wrote:
> > [X] +1 Release this package as Apache NiFi 0.4.1 (binding)
> >
> > On Sat, Dec 19, 2015 at 10:45 AM, Joe Witt  wrote:
> >
> >> Hello NiFi Community,
> >>
> >> I am pleased to be calling this vote for the source release of Apache
> >> NiFi 0.4.1.
> >>
> >> The source zip, including signatures, digests, and associated
> >> convenience binaries can be found at
> >>   https://dist.apache.org/repos/dist/dev/nifi/nifi-0.4.1/
> >>
> >> The staged maven artifacts of the build can be found at
> >>   https://repository.apache.org/content/repositories/orgapachenifi-1067
> >>
> >> The Git tag is nifi-0.4.1-RC1
> >> The Git commit ID is d624ea48665041ffd9aa4f246b374bde10cfc878
> >>
> >>
>
> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=d624ea48665041ffd9aa4f246b374bde10cfc878
> >>
> >> Checksums of NiFi 0.4.1 Source Release
> >> MD5: 0e6e4dcf079a771d381b2031b4b89de1
> >> SHA1: 045466c7efed5331119eff0b84bbe29ce7d9baa3
> >>
> >> Release artifacts are signed with the following key
> >>   https://people.apache.org/keys/committer/joewitt.asc
> >>
> >> KEYS file available here
> >>   https://dist.apache.org/repos/dist/release/nifi/KEYS
> >>
> >> 17 issues were resolved for this release
> >>
> >>
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12334375
> >>
> >> Release note highlights
> >>
> >>
>
> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version0.4.1
> >>
> >> Migration/Upgrade guidance
> >>   https://cwiki.apache.org/confluence/display/NIFI/Migration+Guidance
> >>   https://cwiki.apache.org/confluence/display/NIFI/Upgrading+NiFi
> >>
> >> The vote will be open for 72 hours.
> >> Please download the release candidate and evaluate the necessary items
> >> including checking hashes, signatures, build from source, and test.
> >>
> >> Then please vote:
> >>
> >> [ ] +1 Release this package as Apache NiFi 0.4.1
> >> [ ] +0 no opinion
> >> [ ] -1 Do not release this package because...
> >>
>


Re: Cluster Setup

2015-12-18 Thread Bryan Bende
Hello,

You can run the NCM on the same machine as a node, but they still have to
be two separate processes. You would create two copies of the directory
where you extracted nifi and set one to be the manager and one to be the
node; you'll also have to give them different nifi.web.http.port values.
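
For example, the split might look roughly like this (the port numbers are
just examples, and this is not a complete clustering config):

NCM copy of nifi.properties:
  nifi.web.http.port=8080
  nifi.cluster.is.manager=true
  nifi.cluster.manager.protocol.port=8082
  nifi.cluster.is.node=false

Node copy of nifi.properties:
  nifi.web.http.port=8081
  nifi.cluster.is.node=true
  nifi.cluster.node.protocol.port=8083
  nifi.cluster.is.manager=false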

For the second error, you could try stopping the instance that is the node
and remove the flow.xml.gz that is under conf, and then start it again to
have it pull the flow from the NCM.

-Bryan

On Fri, Dec 18, 2015 at 10:50 AM, plj  wrote:

> Howdy,
>
>   I'm trying to set up a cluster for the 1st time.  I first tried to set up an
> NCM and a node on the same machine.  In nifi.properties I set:
> nifi.web.http.port=8081
> nifi.cluster.is.node=true
> nifi.cluster.node.address=
> nifi.cluster.node.protocol.port=8083
> nifi.cluster.is.manager=true
> nifi.cluster.manager.address=
> nifi.cluster.manager.protocol.port=8082
>
> I got the error:
> 2015-12-18 11:14:56,827 ERROR [NiFi logging handler] org.apache.nifi.StdErr
> Failed to start web server: ...
> nested exception is java.lang.IllegalStateException: Application may be
> configured as a cluster manager or a node, but not both.
>
> What did I do wrong?  The admin guide says:
> "it is also perfectly fine to install the NCM and one of the nodes on the
> same server, as the NCM is very lightweight".
>
> So I decided I could figure that out later and set
> nifi.cluster.is.node=false
> I started my NCM and then started a node on another machine.  I got the
> following error.
>
> 2015-12-18 11:06:45,112 INFO [Handle Controller Startup Failure Message
> from
> [id=24b57ab2-1853-4814-8c31-4e467d3364e3, apiAddress=localhost,
> apiPort=8081, socketAddress=localhost, socketPort=8083]]
> o.a.n.c.manager.impl.WebClusterManager Node Event:
> [id=24b57ab2-1853-4814-8c31-4e467d3364e3, apiAddress=localhost,
> apiPort=8081, socketAddress=localhost, socketPort=8083] -- 'Node could not
> join cluster because it failed to start up properly. Setting node to
> Disconnected. Node reported the following error: Failed to connect node to
> cluster because local flow is different than cluster flow.'
>
> How do I get the cluster flow and the local flow to be the same?
>
> thank you
>
>
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Cluster-Setup-tp5853.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Testing handling of static class methods

2015-12-18 Thread Bryan Bende
If you get it into a protected instance method, you can also make an inner
class in your test, something like TestablePutJMS extends PutJMS, and
overrides that method to return a mock or whatever you want. That is a
common pattern in a lot of the processor tests.
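
As a rough sketch of that pattern, assuming the static call has been moved
into a protected instance method as mentioned above (the method name and
signature here are hypothetical, and the imports for PutJMS and the JMS
utility classes are omitted):

import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.util.TestRunner;
import org.apache.nifi.util.TestRunners;
import org.junit.Test;
import org.mockito.Mockito;

public class TestPutJMS {

    // Overrides the hypothetical protected factory method so the test
    // never opens a real JMS connection.
    private static class TestablePutJMS extends PutJMS {
        @Override
        protected WrappedMessageProducer createMessageProducer(final ProcessContext context) {
            return Mockito.mock(WrappedMessageProducer.class);
        }
    }

    @Test
    public void testOnTriggerWithMockedProducer() {
        final TestRunner runner = TestRunners.newTestRunner(new TestablePutJMS());
        // set properties, enqueue a flow file, call runner.run(), assert transfers...
    }
}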

On Fri, Dec 18, 2015 at 3:44 PM, Matt Burgess  wrote:

> You could move the one static call into an instance method of PutJMS, and
> use Mockito.spy() to get a partial mock of the processor, then use when()
> to override the instance method in the test. Not sure if that's how it's
> done in other places but it's worked for me in the past.
>
> Regards,
> Matt
>
> Sent from my iPhone
>
> > On Dec 18, 2015, at 3:20 PM, Joe Skora  wrote:
> >
> > For unit testing, one problem I've run into is overriding the returns
> from
> > static class methods.
> >
> > For instance, PutJMS contains this code:
> >
> > try {
> >>wrappedProducer = JmsFactory.createMessageProducer(context, true);
> >>logger.info("Connected to JMS server {}",
> >>new Object[]{context.getProperty(URL).getValue()});
> >> } catch (final JMSException e) {
> >>logger.error("Failed to connect to JMS Server due to {}", new
> >> Object[]{e});
> >>session.transfer(flowFiles, REL_FAILURE);
> >>context.yield();
> >>return;
> >> }
> >
> > with the JmsFactory.createMessageProducer call being defined as
> >
> > public static WrappedMessageProducer createMessageProducer(...
> >
> > which presents a problem since it can't be easily overridden for a unit
> > test.  Exercising the
> >
> > How do you handle this problem?
> >
> > Regards,
> > Joe
>


Re: Facing Issue while connecting with HDFS

2015-12-15 Thread Bryan Bende
Hello,

By default NiFi will interact with HDFS as the user that started the NiFi
process. If you can only access HDFS as the superuser then you will
probably need to start NiFi as that user.

There is a "run.as" property in conf/bootstrap.conf where you can specify a
username to run NiFi as, but I'm not sure that is necessary here since it
sounds like you need to run it as superuser.

-Bryan

On Sun, Dec 13, 2015 at 11:49 PM, digvijayp 
wrote:

> Hi Bryan,
>
> I am saying that in our HDFS cluster a specific user has the
> permission (superuser) to access it. So whenever we have to do any operation
> on the cluster we need to first log in with our own id and then as the superuser, and then we
> can access the cluster.
> When I try to connect NiFi to the HDFS cluster it tries to log in with
> my id and not with the superuser. Because of this I am getting the
> authentication issue, so I need to provide the superuser credentials in NiFi
> so that it will first log in and then make the connection. I am not sure where to add
> these details in NiFi?
>
> Thanks for support.
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Facing-Issue-while-connecting-with-HDFS-tp5684p5769.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: How to iterate through complex JSON objects.

2015-12-15 Thread Bryan Bende
As an alternative approach, could you use SplitJSON first to split on the
items array?

You would get a FlowFile for each item, then when you use EvaluateJSONPath
you would be dealing with only a single FlowFile so you could extract the
id and title and use ReplaceText like you were already doing.

Then use MergeContent at the end to merge them back together, or depending
what you are doing maybe they don't need to be merged.
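
As a rough sketch (the JsonPath expressions assume the structure from your
message, and the exact property names may differ slightly between versions):

SplitJson
  JsonPath Expression: $.item

EvaluateJsonPath
  Destination: flowfile-attribute
  id:    $.id
  title: $.title

ReplaceText
  Replacement Value: ${id}, "${title}"

Each split FlowFile would then end up with content like: 2233, "testing with Java"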

On Tue, Dec 15, 2015 at 3:27 AM, shweta  wrote:

> Just figured out that by specifying the Return Type as Json in
> "EvaluateJsonPath" processor I got the entire array of values. So for JSON
> path expression  "$.item.*.id","$.item.*.title" , I got
> ["2233","2232","2231"],["testing with Java","testing with Java","testing
> with Java"]
> I'm just trying to figure out how I can transpose it and instead get
> something like this
>
> 2233, "testing with Java"
> 2232, "testing with Java"
> 2231, "testing with Java"
>
> to generate my desired csv.
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/How-to-iterate-through-complex-JSON-objects-tp5776p5791.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Moving BinFiles

2015-12-13 Thread Bryan Bende
Joe,

Thanks for bringing this up. I remember running into this issue a long time
ago when trying to extend BinFiles from a processor in another NAR.

I'm wondering if there could be another artifact in nifi-standard-bundle
like nifi-standard-processor-utils where stuff like this could go? There is
already a nifi-processor-utils artifact in nifi-commons, but not sure if it
makes sense to push an abstract processor all the way back there.

Curious what others think.

-Bryan

On Fri, Dec 11, 2015 at 9:33 PM, Joe Gresock  wrote:

> In NIFI-305, we refactored a parent class for MergeContent so the binning
> functionality could be reused.  In practice, this isn't quite useful yet,
> because BinFiles is still in the nifi-standard-processors, which contains
> an org.apache.nifi.processor.Processor file.  Therefore, any processor that
> tries to extend BinFiles must automatically pull in all the processors in
> nifi-standard-processors, which is probably undesirable if it's an
> extension.
>
> Is there anywhere in the code we could put parent classes like this for
> extensibility purposes?
>
> Thanks,
> Joe
>
> --
> I know what it is to be in need, and I know what it is to have plenty.  I
> have learned the secret of being content in any and every situation,
> whether well fed or hungry, whether living in plenty or in want.  I can do
> all this through him who gives me strength.*-Philippians 4:12-13*
>


Re: Facing Issue while connecting with HDFS

2015-12-11 Thread Bryan Bende
Digvijay,

Are you talking about Kerberos authentication to the HDFS cluster?

If so, in nifi.properties you specify your krb5.conf file:

nifi.kerberos.krb5.file=/etc/krb5.conf  (or wherever your conf file is)

 This is the file that would have your realms defined. Then on the HDFS
processors your
properties would be something like:

Kerberos Principal = myprincipal@MYREALM
Kerberos Keytab = /etc/security/keytabs/myprincipal.keytab

-Bryan


On Fri, Dec 11, 2015 at 6:00 AM, digvijayp 
wrote:

> Hi Bryan,
>
> Thanks for nice info...It would really help me.
> Facing one issue while designing putHDFS process.After putting the putHDFS
> configuration details I am getting the error mentioning security
> authentication .After the analysis I found that these is due to permission
> of userid.
> So where can we put the userid and password so that when it connect to hdfs
> first login by that id and then pulls back data.As I know it should be done
> in the nifi.properties file.But can you please share where I can add these
> details?
>
> Thanks
> Digvijay P.
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Facing-Issue-while-connecting-with-HDFS-tp5684p5720.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Facing Issue while connecting with HDFS

2015-12-10 Thread Bryan Bende
Site-to-Site is a direct connection between NiFi instances/clusters over a
socket, so TCP based.

There will always have to be at least one local machine involved. When NiFi
pulls/receives data from somewhere, it takes that data under control and
stores it in the NiFi content repository on disk (configured in
nifi.properties). As a FlowFile moves through the flow, a pointer to this
content is being passed around until it needs to be accessed. So when
PutHDFS needs to send to the other cluster it would read the content and
send to the other HDFS. The data would then eventually age-off from the
NiFi content repository depending how it is configured. So it would not
have to hold all of the data on the local machine, but it would always have
some portion of the most recent data that has been moved across.

Let us know if this doesn't make sense.

-Bryan




On Thu, Dec 10, 2015 at 1:52 AM, digvijayp 
wrote:

> Hi Bryan,
> So in the edge node approach, how is data sent with site-to-site? I mean to say, is it
> using any protocol to transfer it, like FTP or SFTP?
> As you are saying, if both clusters can fully talk to each other then you
> don't need this edge node approach; you could just have a NiFi instance, or
> cluster, that pulls from one HDFS and pushes to the other.
> So my query is: we have to use a FetchHDFS/GetHDFS processor which gets data from
> HDFS to the local machine and a PutHDFS processor which loads data from the local
> machine
> to HDFS. I don't want to use the local machine in between. So how can we
> manage
> the data transfer without using the local machine? Where can we do such
> configuration in NiFi?
>
> Thanks in advance.
>
> Digvijay P.
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Facing-Issue-while-connecting-with-HDFS-tp5684p5712.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Facing Issue while connecting with HDFS

2015-12-09 Thread Bryan Bende
It doesn't necessarily have to be on the same machine, but the machine NiFi
is on would have to be able to communicate with the name-node and
data-nodes in order to push/pull data to/from HDFS. In your example this
would mean your local machine would need to be able to access the name-node
and data-node on your VM.

In the cluster to cluster scenario... If each cluster is mostly closed off
from a networking perspective, you could potentially have edge nodes on
each cluster that were able to reach each other. Each of those edge nodes
could run a NiFi instance, and the two NiFi instances could talk directly
to each other. The first one would use List/FetchHDFS and would have to be
able to communicate with cluster #1, it would send data via site-to-site to
the second instance which would use PutHDFS and have to be able to
communicate with cluster #2. If both clusters can fully talk to each other
than you don't need this edge node approach, you could just have a NiFi
instance, or cluster, that pulls from one HDFS and pushes to the other.

As far as comparing to distcp, keep in mind that distcp launches a
map-reduce job to perform a heavily parallelized copy, this would work a
little different in NiFi. If you had a NiFi cluster you could scale it so
each node in the cluster was pulling data, otherwise with a single instance
it would be limited to how much processing that instance can perform.

Hope this helps.

-Bryan

On Wed, Dec 9, 2015 at 12:01 AM, digvijayp 
wrote:

> Thanks for the response Bryan
>
> I am getting the same error when using /root as the directory.
> So is it necessary to install NiFi on the same machine where we install
> Hadoop?
> Basically I am trying to explore using NiFi to flow data from one HDFS
> cluster to another HDFS cluster instead of doing it manually with distcp. Is NiFi
> recommended for such a scenario?
>
> Thanks in advance
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Facing-Issue-while-connecting-with-HDFS-tp5684p5703.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Facing Issue while connecting with HDFS

2015-12-08 Thread Bryan Bende
Hello,

The directory property should only need the path in HDFS. The other stuff
like the file system, host, and port would be determined from the provided
configuration files.

Do you receive any different response if you set the directory to just
"/root" (or some other path)?

If you still receive the same error you may want to verify you can connect
to that port from outside your vm, possibly without even using NiFi.

-Bryan

On Tuesday, December 8, 2015, digvijayp 
wrote:

> Hi Team,
>
> I have been exploring NiFi for couple of days now.
>
> NiFi is running on a machine which is not part of the Hadoop cluster. I want
> to put files into HDFS (on my machine I have configured the Hortonworks
> sandbox in a virtual machine). To write into HDFS I have created a
> PutHDFS processor in NiFi. As per my understanding I have to do the following
> settings to connect with HDFS:
>
> 1. Setting of Hadoop Configuration Resources: I have copied
> hdfs-site.xml
> and core-site.xml into the NiFi installation directory on Windows. The path to these xml
> files is
> given to the Hadoop Configuration Resources property.
> 2. Setting of Directory: For the directory structure I have given the URL
> "hdfs://127.0.0.1:8080//root" which is the Hortonworks default URL and port.
> Still I am getting the error "UnresolvedAddressException". Is there something I am
> missing?
>
> Thanks in advance,
> Digvijay P.
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Facing-Issue-while-connecting-with-HDFS-tp5684.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


-- 
Sent from Gmail Mobile


Re: Need access to edit Apache NiFi confluence pages.

2015-12-01 Thread Bryan Bende
David,

I just granted you access, let us know if it doesn't work.

Thanks,

Bryan


On Tue, Dec 1, 2015 at 2:30 PM, David Wynne  wrote:

> Please grant me access to the Confluence Apache-NiFi wiki page.
>


Re: Coding a Processor that writes to multiple output flowfiles at once

2015-11-23 Thread Bryan Bende
Hi Salvatore,

Have you looked at the append() method on ProcessSession which lets you
append to the content of a FlowFile?

You should be able to create several new FlowFiles, and then while reading
lines from the incoming FlowFile, append the appropriate parts to each of
the new FlowFiles.

An example processor that does something like this is the new RouteText
processor:
https://github.com/apache/nifi/blob/773576e041088d9e326f1d2e84b0ad8acbd6cfdc/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/RouteText.java#L485
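
To make that concrete, here is a very rough sketch of the idea (not the
actual RouteText code; imports and the surrounding class are omitted, the
REL_COLUMN_A/REL_COLUMN_B relationships are assumed to be defined in your
processor, and the simple two-column comma split is just a placeholder for
your real logic). This would live in a processor that extends
AbstractProcessor:

@Override
public void onTrigger(final ProcessContext context, final ProcessSession session) {
    final FlowFile original = session.get();
    if (original == null) {
        return;
    }

    // one new FlowFile per column, kept in a map so they can be re-assigned
    // from inside the read callback
    final Map<String, FlowFile> columns = new HashMap<>();
    columns.put("A", session.create(original));
    columns.put("B", session.create(original));

    // read the original content a single time
    session.read(original, new InputStreamCallback() {
        @Override
        public void process(final InputStream in) throws IOException {
            final BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                final String[] parts = line.split(",");
                for (final String key : new String[] {"A", "B"}) {
                    final String value = (key.equals("A") ? parts[0] : parts[1]) + "\n";
                    // append this line's value to the FlowFile for this column
                    FlowFile ff = columns.get(key);
                    ff = session.append(ff, new OutputStreamCallback() {
                        @Override
                        public void process(final OutputStream out) throws IOException {
                            out.write(value.getBytes(StandardCharsets.UTF_8));
                        }
                    });
                    columns.put(key, ff);
                }
            }
        }
    });

    session.transfer(columns.get("A"), REL_COLUMN_A);
    session.transfer(columns.get("B"), REL_COLUMN_B);
    session.remove(original);
}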

Let us know if this helps.

Thanks,

Bryan


On Mon, Nov 23, 2015 at 2:40 AM, Salvatore Papa 
wrote:

> Heya NiFi devs,
>
> I'm having a bit of trouble trying to wrap my head around a valid way of
> tackling this problem with the available Processor templates. I'd like to
> split an input flowfile into N different flowfiles, 1 going into 1 of N
> relationships.
>
> A simplistic way of viewing it would be: A very large CSV file, with N
> columns, and I want to split each column into its own flowfile, and each of
> these flowfiles to its own relationship (or with an attribute saying which
> column it belongs to).
>
> Basic premise is for an example with two columns, and only two lines:
> * Read a line, write first column value to flowfile A, write second column
> value to flowfile B
> * Read next line, appending first column value to flowfile A, appending
> second column value to flowfile B
> Followed by one of:
> * Send flowfile A to relationship A, and send flowfile B to relationship B
> or
> * Set attribute "A" to flowfile A, attribute "B" to flowfile B, then send
> both A and B to a 'success' relationship.
>
> Unfortunately, I can't seem to find a way to write to multiple flowfiles at
> once, or at least, write to an outputstream for one flowfile, then write to
> another outputstream for another flowfile, then continue writing to the
> first flowfile.
>
> If they weren't such large files, i'd be okay with reading the input file N
> times, pulling out the different part each time, but i'd like to only have
> to read each line (by extension, the file) only once.
>
> I've written AbstractProcessors before for simple One-to-One
> transformations, and even Merge processors which are an extension of
> AbstractSessionFactoryProcessors to do Many-to-One, and even Split
> AbstractProcessors for One-to-Many in serial (splitting at different
> places, even clone(flowfile, start, size)). But I can't work out a way to do
> this One-to-Many in parallel.
>
> Any ideas? Am I missing something useful? Do I just have to do it reading
> it multiple times? Just a really simple proof of concept explaining the
> design would be enough to get me started.
>
> Kind regards,
> Salvatore
>


Re: Nifi startup error after I install my first custom processor

2015-11-20 Thread Bryan Bende
The dependencies can definitely be confusing to wrap your head around. A
good rule of thumb is that you would generally not have a direct jar
dependency on anything under nifi-nar-bundles in the source tree. This
would mean no jar dependencies on processors, controller services,
reporting tasks etc. Now there are cases where a NAR can depend on another
NAR, this is common when a processor needs to use a controller service, a
good description of that is here [1], another example is how we have the
hadoop-libraries-nar so multiple NARs can depend on the same set of hadoop
libraries and not have to duplicate all those jars. Any other dependencies
like things from nifi-commons, such as processor utils and others, are fair
game to include in your NAR. If you look at the nifi lib directory you can
see the following jars:

jcl-over-slf4j-1.7.12.jar
jul-to-slf4j-1.7.12.jar
log4j-over-slf4j-1.7.12.jar
logback-classic-1.1.3.jar
logback-core-1.1.3.jar
nifi-api-0.4.0-SNAPSHOT.jar
nifi-documentation-0.4.0-SNAPSHOT.jar
nifi-nar-utils-0.4.0-SNAPSHOT.jar
nifi-properties-0.4.0-SNAPSHOT.jar
nifi-runtime-0.4.0-SNAPSHOT.jar
slf4j-api-1.7.12.jar

These are automatically available to every nar, anything else needs to be
brought in by bundling the jars in your NAR, or by depending on another NAR.

As for logging, nifi itself uses slf4j and logback, so the logger you get
from getLogger() would be controlled from the logback.xml in the conf
directory. I think for non-processor classes if you use the slf4j api that
will be your best bet as I believe it will automatically use the same
logback configuration, but others could correct me if I am wrong here. I
believe it is also possible to use log4j directly within your NAR, for
example I know some third party client libraries use log4j and by having a
dependency on log4j-over-slf4j it can somehow route all of the log4j calls
back through the main slf4j configuration. Hope I didn't confuse things
more here, let us know.
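
For the helper class case, a minimal sketch of what I mean (the class and
package names here are made up):

package com.example.myprocessor.util;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MyHelper {

    private static final Logger logger = LoggerFactory.getLogger(MyHelper.class);

    public void doWork(final String input) {
        // goes through the same logback configuration in conf/logback.xml
        logger.debug("Processing input {}", input);
    }
}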

[1]
https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions#MavenProjectsforExtensions-LinkingProcessorsandControllerServices



On Fri, Nov 20, 2015 at 1:26 AM, Mark Petronic <markpetro...@gmail.com>
wrote:

> Thanks so much Bryan. It is running now. It is starting to come together
> but I'm still a little unclear on when to include nifi files to pick up
> dependencies and when to directly include the upstream dependencies. Like
> say some nifi nar like nifi-processor-utils already has a dependency on
> commons-lang3 and my custom processor needs commons-lang3, should I satisfy
> that by depending on nifi-processor-utils (which I know will already be in
> the install) or should I add a dependency on commons-lang3 in my nar and
> duplicate the libraries? I realize that there are version issues to be
> concerned with which I believe is the genesis for nars to start with but
> assume I need the same version, too? What's the general best practice
> there?
>
> Also, now that I was able to run my processor, I have a question about
> logging. I see that the processors use getLogger() to get a logger. But I
> see in other places a more traditional use of LoggerFactory.getLogger().
> What is the best approach to logging from helper classes that I use in my
> processor that are not inner classes of the processor class but defined
> outside in separate class files because I intend to reuse them across
> multiple processor types? I need to debug an error that I am seeing in one
> of those helper classes and need to add some logging.
>
> Mark
>
> On Thu, Nov 19, 2015 at 8:23 PM, Bryan Bende <bbe...@gmail.com> wrote:
>
> > Hi Mark,
> >
> > Glad to hear you were able to get started building processors, and glad
> > that blog post helped!
> >
> > I pulled down your code and built and deployed it. It looks like the
> issue
> > is that your processors pom has a dependency on:
> > 
> > <dependency>
> >     <groupId>org.apache.nifi</groupId>
> >     <artifactId>nifi-standard-processors</artifactId>
> >     <version>0.4.0-SNAPSHOT</version>
> > </dependency>
> > 
> >
> > Which means your NAR ends up having the standard processors jar in it,
> but
> > they are also deployed in nifi-standard-nar which causes some problems.
> You
> > can see the jar is there by looking
> > in
> >
> work/nar/extensions/nifi-bigdata-nar-1.0.nar-unpacked/META-INF/bundled-dependencies/.
> >
> > I removed that dependency from the pom and it looked like the only
> > compilation errors were on some missing json related libraries (jackson
> and
> > json path). I added these to your processor pom and it seems to deploy
> now:
> >
> > 
> > <dependency>
> >     <groupId>com.jayway.jsonpath</groupId>
> >     <artifactId>json-path</artifactId>
> >     <version>2.0.0</version>
> > </dependency>
> > <dependency>
> >     <groupId>org.codehaus.jackson</groupId>
> >     <artifactId>jackson-mapper-asl</artifactId>
> >     <version>1.9.13</version>
> > </dependency>
> > 
> >
> >

Re: nifi fails build on CentOS 7

2015-11-19 Thread Bryan Bende
George,

Thanks for passing on this info about the syslog tests, those are new for
this current release so it is good to know what people are running into.

The ListenSyslog processor has a property that sets the SO_RCVBUF to allow
increasing the OS socket receive buffer, it defaults to 1MB and it looks
like the tests do not explicitly set this value, so I wonder if that is
causing the problem and maybe the test should set this property a bit lower
since it is not actually needed for the tests. In my experience running the
processor on a CentOS VM, it would just throw a warning if 1MB was too
high, but the processor still worked, so I'm not sure.

-Bryan


On Thu, Nov 19, 2015 at 6:36 AM, George Seremetidis 
wrote:

> I had the same problem with Centos 7 and Oracle JDK 8. Those exact unit
> tests would fail. Follow the instructions on
> https://nifi.apache.org/quickstart.html. I think you'll find that Centos
> sets max file handles to 1024 for a user.
>
> I also had to increase net.core.rmem_max to 1,200,000 to pass one of the
> Syslog tests.
>
> George
>
> George
>


Re: JSON / Avro issues

2015-11-05 Thread Bryan Bende
Jeff,

Are you using the 0.3.0 release?

I think this is the issue you ran into which is resolved for the next
release:
https://issues.apache.org/jira/browse/NIFI-944

With regard to ConvertJSONtoAvro, I believe it expects one JSON document per
line, with a newline at the end of each line (your second example).
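
So the content would look something like this (the field names are just an
example and would need to match your Avro schema):

{"name": "rec1", "value": 1}
{"name": "rec2", "value": 2}

rather than a single JSON array wrapping all of the records.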

-Bryan

On Thu, Nov 5, 2015 at 4:59 PM, Jeff  wrote:

> I built a simple flow that reads a tab separated file and attempts to
> convert to Avro.
>
> ConvertCSVtoAvro just says that the conversion failed.
>
> Where can I find more information on what the failure was?
>
> Using the same sample tab separated file, I create a JSON file out of it.
>
> The JSON to Avro processor also fails with very little explication.
>
>
> With regard to the ConvertCSVtoAvro processor:
> Since my file is tab delimited, do I simply open the "CSV
> delimiter" property, delete the comma, and hit the tab key, or is there a special
> syntax like ^t?
> My data has no CSV quote character, so do I leave this as " or
> delete it or check the empty box?
>
> With regard to the ConvertJSONtoAvro
> What is the expected JSON source file to look like?
> [
>  {fields values … },
>  {fields values …}
> ]
> Or
>  {fields values … }
>  {fields values …}
> or something else.
>
> Thanks,
>
> Sorry for sending this to 2 lists


Re: Gitting the hub right

2015-11-04 Thread Bryan Bende
Joe,

One way to avoid the merge commits is to use rebase. I believe we have it
outlined here:

https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-Keepingyourfeaturebranchcurrent

In short, you basically...
- checkout your master
- fetch upstream to get the latest apache nifi master
- merge the upstream master to your master
- checkout your feature branch
- rebase your feature branch to your master, which essentially takes away
your commits on that branch, brings it up to date with master, and puts
back your commits
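
In terms of the actual commands, that works out to roughly the following
(assuming your fork's remote is named origin, the apache repo is upstream,
and your feature branch is my-feature):

git checkout master
git fetch upstream
git merge upstream/master
git push origin master
git checkout my-feature
git rebase master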

-Bryan


On Wed, Nov 4, 2015 at 8:15 AM, Joe Skora  wrote:

> Ok, I've read numerous Github howto's, but still don't feel like I've been
> doing it quite right.
>
> Assuming that I've cloned the 'apache/nifi' to 'myname/nifi', what is the
> best way to integrate changes in 'apache/nifi'?  Whatever process I've
> followed so far has created another commit in my repo related to merging
> the upstream changes, which confuses things when comparing my repo to
> upstream.
>
> Regards,
> Joe
>


Re: Next release?

2015-11-03 Thread Bryan Bende
Also, to answer Ricky's question about how to merge in the pull request
once there is consensus...

There are multiple ways to do it, but I believe what a lot of PMC members
do is the following:
- Get a patch of the pull request by appending .patch to the end of the url
- git am --signoff < foo.patch
- git commit --allow-empty -m"This closes #___"
- git push

It may be as simple as clicking the merge button in github, but I haven't
tried :)


On Tue, Nov 3, 2015 at 10:37 AM, Aldrin Piri  wrote:

> We certainly follow the RTC process with NiFi. As Joe mentioned, as long as
> there is a consensus plus one, then you can push.
>
> I will put this on my plate to scope out at some point today and get you
> the review so you can give your new credentials some usage.
>
> Thanks!
>
> --aldrin
>
>
>
> On Tue, Nov 3, 2015 at 10:27 AM, Alan Jackoway  wrote:
>
> > I am not a committer, but I think that at a minimum another committer
> > should sign off on it. I don't mind if a different committer says "looks
> > good to me, you can merge that," but I don't think committers should put
> > their own code in without sign off.
> >
> > On Tue, Nov 3, 2015 at 10:23 AM, Oleg Zhurakousky <
> > ozhurakou...@hortonworks.com> wrote:
> >
> > > May I suggest something that works so well in a multitude of projects -
> > > one must never merge one's own PR, essentially ensuring that there is a
> > > consensus
> > >
> > > Sent from my iPhone
> > >
> > > > On Nov 3, 2015, at 09:00, Joe Witt  wrote:
> > > >
> > > > Ricky,
> > > >
> > > > Might I remind you, Sir, that you have the power to push!  :-)
> > > >
> > > > Let's make sure all the deps are understood (how large?) and that
> > > > licensing is fully accounted for.  As long as you have a good plus
> one
> > > > and we're sure it's good let's push.  Happy to work with you on it.
> > > >
> > > > Also be sure to move the ticket to the 040 release.  Do you have
> > > > privileges for that already?
> > > >
> > > > Thanks
> > > > Joe
> > > >
> > > >> On Tue, Nov 3, 2015 at 1:49 PM, Ricky Saltzer 
> > > wrote:
> > > >> Big +1 for these features! I have a pull request out right now for
> > > adding a
> > > >> Riemann processor . I've
> been
> > > using
> > > >> it on our internal cluster for the past few weeks without any
> issues,
> > > so it
> > > >> might be worth taking one last look and then possibly merge in for
> the
> > > >> release on the 19th.
> > > >>
> > > >>
> > > >>> On Tue, Nov 3, 2015 at 7:34 AM, Joe Witt 
> wrote:
> > > >>>
> > > >>> Team,
> > > >>>
> > > >>> As we work toward an 0.4.0 release here are the current highlights
> > > >>> I've captured from the current and resolved tickets.  I might have
> > > >>> missed key points but these seem (to me) like the major points:
> > > >>>
> > > >>> Version 0.4.0
> > > >>>
> > > >>> Highlights of the 0.4.0 release include:
> > > >>> - Added proper support for tailing log files.
> > > >>> - Updated the framework/UX to support new authentication mechanisms
> > > >>> based on username/password
> > > >>> - New processor to support Python/Jython scripts as processors.
> > > >>> - New processors to capture syslog data received via UDP/TCP
> > > >>> - Improved behavior of Execute and Put SQL processors
> > > >>> - Provided documentation to help the 'Getting Started' process
> > > >>> - Improved efficiency and file handling for merges/sessions dealing
> > > >>> with 1000s of objects
> > > >>> - New processors to List and Fetch data via SFTP
> > > >>> - Improved Kerberos ticket re-registration for HDFS processors
> > > >>> - Added processors to interact with Couchbase
> > > >>> - Increased convenience when searching for provenance events of a
> > > >>> given component
> > > >>> - Added SSL support to JMS processors
> > > >>>
> > > >>> Now, we have many outstanding tickets still assigned to 0.4.0 which
> > > >>> are unresolved.  I reassigned many but still many remain.  Please
> do
> > a
> > > >>> scan through if you reported them and see which ones can be moved
> off
> > > >>> of 040.
> > > >>>
> > > >>> We released 0.3.0 on Sep 19th.  I suggest we try to target Nov 19th
> > > >>> then for 0.4.0.  There is already quite a lot in this and so I
> think
> > > >>> we should get very specific about the items remaining which really
> > > >>> must be in 040 vs which we can push forward.
> > > >>>
> > > >>> I'll keep paring down the tickets on 040 and pinging folks to
> > > >>> understand likely target dates for completion.
> > > >>>
> > > >>> Thanks
> > > >>> Joe
> > > >>>
> > >  On Mon, Nov 2, 2015 at 3:06 PM, Joe Witt 
> > wrote:
> > >  The current process is outlined in our release guide.  But the
> main
> > > idea
> > > >>> is
> > >  that all who wish to participate in release validation do so from
> > the
> > > RC.
> > >  Unit tests are of course run by the builds but 

Pull Request Comments and JIRA

2015-10-30 Thread Bryan Bende
Does anyone know why in-line comments on pull-requests don't post back to
the JIRA?

Comments on the overall pull request do post back. I feel like this might
have been something that worked during incubation and no longer works, but
could be wrong.


Re: Recommendation on getting started as a contibutor

2015-10-27 Thread Bryan Bende
Mark,

I don't think there is a strong preference for patches vs. pull requests. A
lot of contributors use pull requests, and I personally find it easier to
review pull requests because you can give feedback in-line on the code. The
more important thing is that whatever is being submitted should always tie
back to a JIRA, and commit messages should start with the JIRA name.

-Bryan

On Tue, Oct 27, 2015 at 6:02 AM, Oleg Zhurakousky <
ozhurakou...@hortonworks.com> wrote:

> Mark
>
> The following output comes from RunNifi which starts org.apache.nifi.NiFi
> as a separate JVM process which means you are not really in full DEBUG mode
> anyway:
>
> opt/java/jdk1.7.0_75/bin/java
>
> -Dnifi.properties.file.path=/home/mpetronic/nifi-0.3.1-SNAPSHOT/./conf/nifi.properties
> -Dfile.encoding=ANSI_X3.4-1968 -classpath
>
> /home/mpetronic/repos/nifi-ide-integration/bin:/home/mpetronic/.m2/repository/org/apache/nifi/nifi-api/0.3.1-SNAPSHOT/nifi-api-0.3.1-SNAPSHOT.jar:/home/mpetronic/.m2/repository/org/apache/nifi/nifi-runtime/0.3.1-SNAPSHOT/nifi-runtime-0.3.1-SNAPSHOT.jar:/home/mpetronic/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-core/2.4/d99532ba3603f27bebf4cdd3653feb0e0b84cf6/log4j-core-2.4.jar:/home/mpetronic/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-api/1.7.12/8e20852d05222dc286bf1c71d78d0531e177c317/slf4j-api-1.7.12.jar:/home/mpetronic/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-log4j12/1.7.12/485f77901840cf4e8bf852f2abb9b723eb8ec29/slf4j-log4j12-1.7.12.jar:/home/mpetronic/.gradle/caches/modules-2/files-2.1/org.slf4j/jul-to-slf4j/1.7.12/8811e2e9ab9055e557598dc9aedc64fd43e0ab20/jul-to-slf4j-1.7.12.jar:/home/mpetronic/.m2/repository/org/apache/nifi/nifi-nar-utils/0.3.1-SNAPSHOT/nifi-nar-utils-0.3.1-SNAPSHOT.jar:/home/mpetronic/.m2/repository/org/apache/nifi/nifi-properties/0.3.1-SNAPSHOT/nifi-properties-0.3.1-SNAPSHOT.jar:/home/mpetronic/.m2/repository/org/apache/nifi/nifi-documentation/0.3.1-SNAPSHOT/nifi-documentation-0.3.1-SNAPSHOT.jar:/home/mpetronic/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-api/2.4/cc68e72d6d14098ba044123e10e048d203d3fd47/log4j-api-2.4.jar:/home/mpetronic/.gradle/caches/modules-2/files-2.1/log4j/log4j/1.2.17/5af35056b4d257e4b64b9e8069c0746e8b08629f/log4j-1.2.17.jar:/home/mpetronic/nifi-0.3.1-SNAPSHOT/conf
> org.apache.nifi.NiFi
>
> If you are in Eclipse, did you go through Run Configuration step where you
> configure your main class, which should be org.apache.nifi.NiFi?
>
> Oleg
>
> > On Oct 27, 2015, at 8:55 AM, Mark Petronic 
> wrote:
> >
> > Main Class is definitely configured to org.apache.nifi.NiFi per
> > instructions. That's what I see in the sample command lines I sent
> > you, too. Curious, what makes you think that was misconfigured?
> >
> > On Tue, Oct 27, 2015 at 8:52 AM, Oleg Zhurakousky
> >  wrote:
> >> It appears you’ve misconfigured your Run Configuration.
> >> It seems like your MainClass is org.apache.nifi.bootstrap.RunNifi. It
> should be org.apache.nifi.NiFi
> >>
> >> Can you verify?
> >>
> >> Oleg
> >>
> >>> On Oct 27, 2015, at 8:45 AM, Mark Petronic 
> wrote:
> >>>
> >>> On Tue, Oct 27, 2015 at 6:55 AM, Oleg Zhurakousky
> >>>  wrote:
>  I was just able to reproduce your exact error by disassociating it
> from the class path
> >>>
> >>> Oleg, thanks for the response.
> >>>
> >>> 1. I verified that my working directory is correct and points to my
> >>> running version of Nifi:
> >>>
> >>> /home/mpetronic/nifi-0.3.1-SNAPSHOT
> >>>
> >>> 2. I verified that src/main/resources is indeed the one and only entry
> >>> listing in my build path settings under the "Source" tab. However, I
> >>> don't see that reflected in the below command line.
> >>>
> >>> 3. Not sure how to dump the active classpath from Eclipse project
> >>> configuration but, if I run Nifi under Eclipse and go to the
> >>> properties of the running instance I see this as the command line used
> >>> to run it. Question is why are all my classpaths pointing to files in
> >>> the maven repository? Those are the values reflected in the project
> >>> build path under the "Libraries" tab that I got by default after
> >>> importing the nifi-ide-integration project. I did not edit anything
> >>> there.
> >>>
> >>> /opt/java/jdk1.7.0_75/bin/java
> >>>
> -Dnifi.properties.file.path=/home/mpetronic/nifi-0.3.1-SNAPSHOT/./conf/nifi.properties
> >>> -Dfile.encoding=ANSI_X3.4-1968 -classpath
> >>>
> 

Re: NiFi 0.3 : Query regarding HDFS processor

2015-10-22 Thread Bryan Bende
Hello,

There is a property on PutHDFS where you can specify the Hadoop
configuration files which tell the processor about your HDFS installation:

Hadoop Configuration Resources - A file or comma separated list of files
which contains the Hadoop file system configuration. Without this, Hadoop
will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or
will revert to a default configuration.

I don't think it is possible to bypass the name node since the name node
tracks where all the files are in HDFS.

-Bryan


On Thu, Oct 22, 2015 at 2:35 AM,  wrote:

> Hi Team,
>
> I have been exploring NiFi for couple of days now.
>
> NiFi is running on a machine which is not part of the Hadoop cluster. I want
> to put files into HDFS from an external source. How do I configure the
> Hadoop cluster host details in the NiFi?
>
> I can get the file from remote source using GetSFTP processor and write it
> into the Hadoop edge node using PutSFTP, followed by PutHDFS.
> But I would like to know if there is any way to directly write the
> FlowFile to HDFS using PutHDFS without writing to Hadoop name node.
>
> Could you please help me to identify the same?
>
> Regards,
> Ramkishan Betta,
> Consultant, BT e-serv India,
> Bengaluru - INDIA
>
>


Re: Ingest Original data from External system by data's dependent condition

2015-10-13 Thread Bryan Bende
FYI, in case someone wants to work it, the ticket for extracting from Avro
is: https://issues.apache.org/jira/browse/NIFI-962


On Tue, Oct 13, 2015 at 9:29 AM, Andrew Grande 
wrote:

> A typical pattern/workaround for this situation was to copy e.g. the json
> _in full_ into an attribute, leaving the payload in a binary format. But,
> as you can imagine, it's not ideal as FlowFile memory and disk pressure
> will be raised significantly and duplicate that of an existing content repo.
>
> Andrew
>
>
>
>
> On 10/13/15, 9:21 AM, "Joe Witt"  wrote:
>
> >Hello
> >
> >Is the only reason for converting from AVRO or whatever to JSON so
> >that you can extract attributes?
> >
> >I recommend not converting the data simply so that you can do that.  I
> >recommend building processes to extract attributes from the raw.  I
> >believe we have JIRA's targeted for the next release to do this for
> >AVRO just like JSON.  If you have other custom formats in mind i
> >recommend building 'ExtractXYZAttributes'.
> >
> >There is no mechanism in play today where we convert from format A to
> >B and then in the resulting B we keep the original A hanging around
> >that object.  You can do this of course by making archive/container
> >formats to hold both but this is also not recommended.
> >
> >Does this make sense?
> >
> >Thanks
> >Joe
> >
> >On Tue, Oct 13, 2015 at 9:06 AM, Oleg Zhurakousky
> > wrote:
> >> Sorry, I meant to say that you have to enrich the original file with a
> correlation attribute, otherwise there is nothing to correlate on.
> >> I am not sure if NiFi has any implementation of ContentEnricher (EIP),
> perhaps UpdateAttribute will do the trick.
> >>
> >> Oleg
> >>
> >>> On Oct 13, 2015, at 8:21 AM, yejug  wrote:
> >>>
> >>> Hi Oleg
> >>>
> >>> Thanks for the response, maybe I am missing something (I cannot find your
> >>> image =)), but your suggestion doesn't seem appropriate.
> >>>
> >>> The MergeContent processor there receives two types of flow files:
> >>> 1) one is a flow file with the original content (AVRO) but without a populated
> >>> "correlation" attribute, directly from GetKafka
> >>> 2) and the second type is a flow file with parsed content (JSON) and a
> >>> populated
> >>> "correlation" attribute
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Ingest-Original-data-from-External-system-by-data-s-dependent-condition-tp3093p3096.html
> >>> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
> >>>
> >>
> >
>


Re: Ingest Original data from External system by data's dependent condition

2015-10-13 Thread Bryan Bende
We do have an idea that we called HoldFile that hasn't been fully
implemented yet, but has come up a few times:
https://issues.apache.org/jira/browse/NIFI-190

The idea was basically for a processor to "hold" a FlowFile until it was
signaled by another processor to release it.
Seems like this is similar to the ClaimCheck idea and could play into the
scenarios being discussed... hold format A, convert to format B, add some
attributes to B, then release A, transferring those attributes to A.


On Tue, Oct 13, 2015 at 11:08 AM, Oleg Zhurakousky <
ozhurakou...@hortonworks.com> wrote:

> Great points Joe!
>
> One point I want to add to the discussion. . .
>
> As I am still learning the internals of the NiFi, the use case at the core
> of this thread is actually a very common EIP problem and while Aggregator
> (Merger) receiving from multiple inbound sources is one approach, it is not
> the only one.
> Another pattern that would probably fit better here is the ClaimCheck in
> combination with MessageStore.
> The way it would work is like this:
> - Original FlowFile (Message) is stored in MessageStore with the given key
> (ClaimCheck) which becomes an attribute to be passed downstream
> - Somewhere downstream, whenever you are ready for aggregation, use the
> ClaimCheck to access MessageStore to get the original Message to perform
> aggregation or whatever else.
>
> The general benefit is that accessing the original message may be required
> not only for aggregation but for any variety of use cases. Having
> ClaimCheck will give access to the original message to anyone who has it.
>
> So, I want to use this as an opportunity to ask the wider NiFi group (since I
> am still learning it myself) if such a pattern is supported? I know there is
> a ContentRepository so I am assuming it wouldn't be that difficult
>
> Cheers
> Oleg
>
> > On Oct 13, 2015, at 10:56 AM, Joe Witt  wrote:
> >
> > Lot of details passing by here but...
> >
> > Given formats A,B...Z coming in the following capabilities are
> > generally desired:
> > 1) Extract attributes of each event
> > 2) Make routing decisions on each event based on those extracted
> attributes
> > 3) Deliver raw/unmodified data to some endpoint (like HDFS)
> > 4) Convert/Transform data to some normalized format (and possibly schema
> too).
> > 5) Deliver converted data to some endpoint.
> >
> > Steps #1 and #4 involve (naturally) custom work for formats that are
> > not something we can readily support out of the box such as XML, JSON,
> > AVRO, etc...  Even the workaround suggested really only works for the
> > case where you know the original format well enough and we can support
> > it which means we'd like not have needed the workaround anyway.  So,
> > the issue remains that custom work is required for #1 and #4 cases...
> > Now, if you have packed formats that you think we could support please
> > let us know and we can see about some mechanism of dealing with those
> > formats generically - would be a power user tool of course but
> > avoiding custom work is great when achievable with the right user
> > experience/capability mix.
> >
> > Thanks
> > Joe
> >
> > On Tue, Oct 13, 2015 at 10:06 AM, yejug  wrote:
> >> Ok,
> >>
> >> Thank you guys for assistance.
> >>
> >> Looks like Joe's suggestion is more appropriate for me, but there is one BUT:
> >> in the case of 'ExtractXYZAttributes' we must implement implicit parsing of the
> >> encoded message and cannot reuse this logic, e.g. if we want to do an actual
> >> XXX -> Json (for example json =)) conversion in the future.
> >>
> >> With 99.9% certainty in my case, besides AVRO there will be more inputs (at minimum
> >> msgpack and some custom binary formats), which must be parsed as well as
> >> stored in the original input format.
> >>
> >> So I think, other than ConvertXXXToJson + Andrew's workaround, there are no more
> >> alternatives for me now
> >>
> >> Thanks again
> >>
> >>
> >>
> >> --
> >> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Ingest-Original-data-from-External-system-by-data-s-dependent-condition-tp3093p3101.html
> >> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
> >
>
>


Re: A flow to PutSQL the lines of a CSV file?

2015-10-05 Thread Bryan Bende
Russell,

How big are these CSVs in terms of rows and columns?

If they aren't too big, another option could be to use SplitText +
ReplaceText to split the csv into a FlowFile per line, and then convert
each line into SQL in ReplaceText. The downside is that this would create a
lot of FlowFiles for very large CSVs.

-Bryan

On Mon, Oct 5, 2015 at 4:14 PM, Russell Whitaker  wrote:

> Use case I'm attempting:
>
> 1.) ingest a CSV file with header lines;
> 2.) remove header lines (i.e. remove N lines at head);
> 3.) SQL INSERT each remaining line as a row in an existing mysql table.
>
> My thinking so far:
>
> #1 is given (CSV fetched already);
> #2 simple, should be handled in the context of ExecuteStreamCommand;
>
> #3 is where I'm scratching my head: I keep re-reading the Description
> field for
> the PutSQL processor in http://nifi.apache.org/docs.html but can't seem to
> parse this into what I need to do to turn a flowfile comprising lines of
> comma-separated text into a series of INSERT statements:
>
> "Executes a SQL UPDATE or INSERT command. The content of an incoming
> FlowFile is expected to be the SQL command to execute. The SQL command
> may use the ? to escape parameters. In this case, the parameters to
> use must exist as FlowFile attributes with the naming convention
> sql.args.N.type and sql.args.N.value, where N is a positive integer.
> The sql.args.N.type is expected to be a number indicating the JDBC
> Type."
>
> Of related interest: there seems to be only one CSV-relevant processor
> type in
> v0.3.0, ConvertCSVToAvro; I fear the need to have to do something like
> this:
>
> ConvertCSVToAvro->ConvertAvroToJSON->ConvertJSONToSQL->PutSQL
>
> Guidance, suggestions? Thanks!
>
> Russell
>
> --
> Russell Whitaker
> http://twitter.com/OrthoNormalRuss
> http://www.linkedin.com/pub/russell-whitaker/0/b86/329
>


Re: PutHDFS Configuration Issue

2015-09-30 Thread Bryan Bende
Glad that first issue was resolved! I am by no means a kerberos expert, but
having set this up once before, the setup should be something like the
following:

nifi.kerberos.krb5.file=/etc/krb5.conf  (or wherever your conf file is)

 This is the file that would have your realms defined. Then on PutHDFS your
properties would be something like:

Kerberos Principal = myprincipal@MYREALM
Kerberos Keytab = /etc/security/keytabs/myprincipal.keytab

MYREALM would have to be defined in krb5.conf.
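
For reference, the realm definition in krb5.conf looks something like this
(the realm name and KDC host are just placeholders):

[libdefaults]
  default_realm = MYREALM

[realms]
  MYREALM = {
    kdc = kdc.example.com
    admin_server = kdc.example.com
  }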

-Bryan


On Wed, Sep 30, 2015 at 1:04 PM, DomenicPuzio 
wrote:

> Thank you so much for the quick reply! Restarting NiFi resolved the issue!
> I
> had not thought of that.
>
> However, a new issue was raised that I am currently working through: "Can't
> find Kerberos realm". Do you have any idea where this should be set and
> what
> value it is looking for?
>
> I really appreciate the help!
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/PutHDFS-Configuration-Issue-tp2998p3003.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: PutHDFS Configuration Issue

2015-09-30 Thread Bryan Bende
Hi Domenic,

It sounds like you are on the right path... just to confirm, did you
restart NiFi after setting nifi.kerberos.krb5.file in nifi.properties?

It will only pick up changes to nifi.properties on a restart.

Also,

On Wed, Sep 30, 2015 at 12:38 PM, DomenicPuzio  wrote:

> Hello,
>
> I am trying to set up the PutHDFS processor on NiFi, and I am running into
> an issue. I have the Hadoop Configuration Resources set and pointing to my
> core-site.xml and hdfs-site.xml, and I have a Kerberos Principal and Keytab
> file. However, I am getting the error below.
>
> 'Kerberos Principle' is invalid because you are missing the
> nifi.kerberos.krb5.file property in nifi.properties.
>
> However, in my
>
> nifi/nifi-assembly/target/nifi-0.3.0-SNAPSHOT-bin/nifi-0.3.0-SNAPSHOT/conf/nifi.properties
> file, I have a line saying "nifi.kerberos.krb5.file=/path/to/my/key.keytab"
> (where I provide the actual path).
>
> Can you provide any assistance? I'd truly appreciate it!
>
> Thank you very much!
>
> Domenic
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/PutHDFS-Configuration-Issue-tp2998.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Syslog processors?

2015-09-28 Thread Bryan Bende
Corey,

No one was worked on this as far as I know, but there is definitely
interest. There is a ticket that I believe you created :)  [1]

You can however receive syslog messages over UDP using the ListenUDP
processor, and setting syslog to forward messages to the given port.

-Bryan

[1] https://issues.apache.org/jira/browse/NIFI-274


On Mon, Sep 28, 2015 at 12:14 PM, Corey Flowers 
wrote:

> Has anyone completed or submitted code for syslog processors yet?
>
> Sent from my iPhone
>


Re: NIFI DBCP connection pool not working for hive

2015-09-18 Thread Bryan Bende
Hello,

Can you provide the error message you received?

Is it about finding/loading the driver? or about actually creating the
connection?

Thanks,

Bryan

On Fri, Sep 18, 2015 at 2:11 PM, vthiru2006  wrote:

> Hi,
>
> I'm trying to make a hive connection pool via controller service - DBCP
> Connection Pool module.
>
> Database Connection URL --
>
> jdbc:hive2://.com:11000/default;principal=hive/.
> c...@na.company.int
> Database Driver Class Name  -- org.apache.hadoop.hive.jdbc.HiveDriver
> Database Driver Jar URL -- URL where my Cloudera
> hive-jdbc-standalone.jar is.
>
> For Connection, I even tried just giving
> jdbc:hive2://.com:11000/default and it didn't work.
>
> We use the same jar, URL and access hive metastore through our Java
> programs
> and it works.
>
> Can someone help with this!
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/NIFI-DBCP-connection-pool-not-working-for-hive-tp2863.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>


Re: Ways to migrate to 0.3.0

2015-09-15 Thread Bryan Bende
Rick,

You should be able to move the conf/flow.xml.gz from your snapshot
install to your new 0.3.0 conf directory.

-Bryan

On Tuesday, September 15, 2015, Rick Braddy  wrote:

> How can I migrate prior flow graph from 0.3.0-SNAPSHOT to the new 0.3.0
> tree?
>
> Rick
>


-- 
Sent from Gmail Mobile


Re: [VOTE] Release Apache NiFi 0.3.0

2015-09-15 Thread Bryan Bende
Signature and checksums look good
Build passes with contrib-check
README/LICENSE/NOTICE all look good
Test dataflows perform as expected

+1 (binding) - Release this package as nifi-0.3.0


On Tue, Sep 15, 2015 at 9:50 AM, Mark Payne  wrote:

> Downloaded source, verified checksums and that the key/signature was valid.
>
> Was able to build with contrib-check without any problems.
>
> README/LICENSE/NOTICE all look good. Application runs without any problems.
>
> +1 (binding) - Release this package as nifi-0.3.0
>
> Thanks
> -Mark
>
>
>
> > On Sep 14, 2015, at 11:13 PM, Matt Gilman 
> wrote:
> >
> > Hello
> > I am pleased to be calling this vote for the source release of Apache
> NiFi
> > 0.3.0.
> >
> > The source zip, including signatures, digests, etc. can be found at:
> > https://repository.apache.org/content/repositories/orgapachenifi-1060
> >
> > The Git tag is nifi-0.3.0-RC1
> > The Git commit ID is 2ec735e35025fed3c63d51128ec0609ffe1fa7e3
> >
> https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=2ec735e35025fed3c63d51128ec0609ffe1fa7e3
> >
> > Checksums of nifi-0.3.0-source-release.zip:
> > MD5: 0bca350d5d6d9c9a459304253b8121c4
> > SHA1: 4b14bf1c0ddc3d970ef44dac93e716e9e6964842
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/mcgilman.asc
> >
> > KEYS file available here:
> > https://dist.apache.org/repos/dist/release/nifi/KEYS
> >
> > 89 issue was closed/resolved for this release:
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12329653
> >
> > Release note highlights can be found here:
> >
> https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version0.3.0
> >
> > The vote will be open for 72 hours.
> > Please download the release candidate and evaluate the necessary items
> > including checking hashes, signatures, build from source, and test.  The
> > please vote:
> >
> > [ ] +1 Release this package as nifi-0.3.0
> > [ ] +0 no opinion
> > [ ] -1 Do not release this package because...
>
>


Re: Session Commits

2015-09-08 Thread Bryan Bende
Rick,

I believe it depends on how you implement your processor, meaning what
class you extend from.
If you extend AbstractProcessor, then you can see that it creates a session
and calls commit for you.
If you extend AbstractSessionFactoryProcessor, or implemented the Processor
interface, then you would need to handle creating the session and calling
commit/rollback.
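
A minimal sketch of the difference (the success relationship is just an
example; a real processor would also declare its properties and
relationships as usual):

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.ProcessSessionFactory;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

// Extending AbstractProcessor: the framework creates the session and
// commits it for you after onTrigger returns (or rolls it back on error).
class MyProcessor extends AbstractProcessor {
    static final Relationship REL_SUCCESS = new Relationship.Builder().name("success").build();

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }
        session.transfer(flowFile, REL_SUCCESS);
        // no session.commit() needed here
    }
}

// Extending AbstractSessionFactoryProcessor: you own the session lifecycle.
class MySessionFactoryProcessor extends AbstractSessionFactoryProcessor {
    @Override
    public void onTrigger(final ProcessContext context, final ProcessSessionFactory sessionFactory) {
        final ProcessSession session = sessionFactory.createSession();
        try {
            // ... get/create/transfer flow files ...
            session.commit();
        } catch (final Exception e) {
            session.rollback(true);
            throw new ProcessException(e);
        }
    }
}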

-Bryan

On Tue, Sep 8, 2015 at 2:54 PM, Rick Braddy  wrote:

> Hi,
>
> During development of some new processors, I have been looking closely at
> standard processors to understand best practices. The Developer Guide
> suggests that one should call session "commit()" upon completion of
> onTrigger() session processing, which makes sense.  However, I notice in a
> number of standard processors that commit() is not called at all; e.g., see
> SplitText processor as an example of this.  Session transfer() gets called
> but no commit() calls.
>
> So my question is: is the commit() call necessary, or are sessions being
> auto-committed if not rolled back?  Is there some penalty to calling
> session commit() vs. just calling transfer()?
>
> Sorry for so many questions, but without a Nifi API reference guide, this
> seems like the fastest way to understand what's intended by the framework.
>
> Thanks
> Rick
>


Re: POM dependency failures

2015-09-04 Thread Bryan Bende
Rick,

Can you check in your local Maven repository to see if the 0.3.0-SNAPSHOTs
are in there?  It is typically in HOME/.m2

As one example ~/.m2/repository/org/apache/nifi/nifi-api/  should have a
sub-directory for 0.3.0-SNAPSHOT.

-Bryan

On Fri, Sep 4, 2015 at 2:50 PM, Rick Braddy  wrote:

> Yes. I pulled source from Nifi on Github[1] and built it locally, which
> seemed to work okay at first.
>
> Is there a better process to use?
>
> [1] https://github.com/apache/nifi
>
>
> -Original Message-
> From: Aldrin Piri [mailto:aldrinp...@gmail.com]
> Sent: Friday, September 04, 2015 12:46 PM
> To: dev@nifi.apache.org
> Subject: Re: POM dependency failures
>
> Rick,
>
> Did you do a build (mvn install) of the 0.3.0 repository on the same
> machine you are doing this development on?  We do not, as a project,
> publish any snapshots and this would explain the issues in not being able
> to locate the associate poms.
>
> On Fri, Sep 4, 2015 at 1:39 PM, Rick Braddy  wrote:
>
> > Hi,
> >
> > Does anyone know why these dependencies would suddenly start failing
> > or how to go about resolving?
> >
> > [WARNING] The POM for org.apache.nifi:nifi-api:jar:0.3.0-SNAPSHOT is
> > missing, no dependency information available [WARNING] The POM for
> > org.apache.nifi:nifi-processor-utils:jar:0.3.0-SNAPSHOT is missing, no
> > dependency information available [WARNING] The POM for
> > org.apache.nifi:nifi-mock:jar:0.3.0-SNAPSHOT is missing, no dependency
> > information available
> >
> > These errors occur during Maven build in a separate NAR bundle project
> > created using Bryan's Custom Processor post here<
> > http://bryanbende.com/development/2015/02/04/custom-processors-for-apa
> > che-nifi/>
> > for developing custom processors.
> >
> > This Maven project had been building okay yesterday and just suddenly
> > stopped working today (and I don't believe I changed any POM files)...
> >
> > Thanks
> > Rick
> >
>


Re: POM dependency failures

2015-09-04 Thread Bryan Bende
Glad you got it working!

-Bryan

On Fri, Sep 4, 2015 at 3:20 PM, Rick Braddy <rbra...@softnas.com> wrote:

> That's what it was. Resolution was simple.
>
> ln -s ~/.m2 /root
>
> Now I can build, update and restart Nifi service as root and the
> dependencies are found in both my local account and root.
>
> Not ideal, but what's a developer to do...
>
> Thanks for the assist.
> Rick
>
> -Original Message-
> From: Rick Braddy [mailto:rbra...@softnas.com]
> Sent: Friday, September 04, 2015 1:03 PM
> To: dev@nifi.apache.org
> Subject: RE: POM dependency failures
>
> I may know what the problem is...
>
> I decided to run the build as 'root' user so my rebuild script would be
> able to copy over my NAR bundle and restart Nifi service... so it appears
> that since the Maven repository is associated with my local home directory,
> that may be why it's failing.
>
> I will create a link and see if that resolves the issue.
>
> Thanks for the pointer.
>
> Rick
>
> -Original Message-
> From: Rick Braddy [mailto:rbra...@softnas.com]
> Sent: Friday, September 04, 2015 1:00 PM
> To: dev@nifi.apache.org
> Subject: RE: POM dependency failures
>
> Bryan,
>
> Yes.  The 0.3.0-SNAPSHOT folders are there for each Nifi subsystem folder.
>
> root@rick-dev:~/.m2/repository/org/apache/nifi# ls
> nifi  nifi-external  nifi-mock
> nifi-processor-utils
> nifi-api  nifi-framework-bundle  nifi-nar-bundles
>  nifi-provenance-repository-bundle
> nifi-commons  nifi-maven-archetypes  nifi-nar-maven-plugin
>
> ./nifi-provenance-repository-bundle/0.3.0-SNAPSHOT
> ./nifi-maven-archetypes/0.3.0-SNAPSHOT
> ./nifi-mock/0.3.0-SNAPSHOT
> ./nifi/0.3.0-SNAPSHOT
> ./nifi-external/0.3.0-SNAPSHOT
> ./nifi-processor-utils/0.3.0-SNAPSHOT
> ./nifi-nar-bundles/0.3.0-SNAPSHOT
> ./nifi-commons/0.3.0-SNAPSHOT
> ./nifi-api/0.3.0-SNAPSHOT
> ./nifi-framework-bundle/0.3.0-SNAPSHOT
>
> Rick
>
> -Original Message-
> From: Bryan Bende [mailto:bbe...@gmail.com]
> Sent: Friday, September 04, 2015 12:54 PM
> To: dev@nifi.apache.org
> Subject: Re: POM dependency failures
>
> Rick,
>
> Can you check in your local Maven repository to see if the 0.3.0-SNAPSHOTs
> are in there?  It is typically in HOME/.m2
>
> As one example ~/.m2/repository/org/apache/nifi/nifi-api/  should have a
> sub-directory for 0.3.0-SNAPSHOT.
>
> -Bryan
>
> On Fri, Sep 4, 2015 at 2:50 PM, Rick Braddy <rbra...@softnas.com> wrote:
>
> > Yes. I pulled source from Nifi on Github[1] and built it locally,
> > which seemed to work okay at first.
> >
> > Is there a better process to use?
> >
> > [1] https://github.com/apache/nifi
> >
> >
> > -Original Message-
> > From: Aldrin Piri [mailto:aldrinp...@gmail.com]
> > Sent: Friday, September 04, 2015 12:46 PM
> > To: dev@nifi.apache.org
> > Subject: Re: POM dependency failures
> >
> > Rick,
> >
> > Did you do a build (mvn install) of the 0.3.0 repository on the same
> > machine you are doing this development on?  We do not, as a project,
> > publish any snapshots and this would explain the issues in not being
 > > able to locate the associated poms.
> >
> > On Fri, Sep 4, 2015 at 1:39 PM, Rick Braddy <rbra...@softnas.com> wrote:
> >
> > > Hi,
> > >
> > > Does anyone know why these dependencies would suddenly start failing
> > > or how to go about resolving?
> > >
> > > [WARNING] The POM for org.apache.nifi:nifi-api:jar:0.3.0-SNAPSHOT is
> > > missing, no dependency information available [WARNING] The POM for
> > > org.apache.nifi:nifi-processor-utils:jar:0.3.0-SNAPSHOT is missing,
> > > no dependency information available [WARNING] The POM for
> > > org.apache.nifi:nifi-mock:jar:0.3.0-SNAPSHOT is missing, no
> > > dependency information available
> > >
> > > These errors occur during Maven build in a separate NAR bundle
 > > > project created using Bryan's Custom Processor post here<
> > > http://bryanbende.com/development/2015/02/04/custom-processors-for-a
> > > pa
> > > che-nifi/>
> > > for developing custom processors.
> > >
> > > This Maven project had been building okay yesterday and just
> > > suddenly stopped working today (and I don't believe I changed any POM
> files)...
> > >
> > > Thanks
> > > Rick
> > >
> >
>


Re: Adding new processor to standard bundle, not showing in UI

2015-08-31 Thread Bryan Bende
Rick,

Everything you described sounds like the correct approach. One thing to
try, in the directory where you have nifi installed, there should be a work
directory which has a nar sub-directory... you could try stopping nifi,
deleting that nar directory, and starting again.

That directory contains expanded versions of all the nars, so since you
were installing on top of an existing installation, it is possible that it
is still running the previous version without your new processor. If nifi
starts and that directory isn't there, it will re-expand all the nars.
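
Roughly, something like this (adjust the path to wherever your runtime
install lives, not the source tree):

cd /path/to/nifi
./bin/nifi.sh stop
rm -rf work/nar      # the nars in lib get re-expanded here on the next start
./bin/nifi.sh start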

Let us know if it still doesn't work after that.

-Bryan

On Mon, Aug 31, 2015 at 7:07 PM, Rick Braddy  wrote:

> Hi,
>
> I'm developing a new processor that I want to test alongside the standard
> processors (like GetFile).  I have added the new .java file and it's
> compiling just fine; however, it's not showing up in the processor list of
> the GUI.
>
> The new processor has been added to:
>
> ~/nifi/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/resources/META-INF/services/
> org.apache.nifi.processor.Processor
>
> And the entire project was rebuilt using "mvn -T C2.0 clean install", then
> installed in the runtime tree using "tar xvfz..." (with Nifi service
> stopped), then restarted Nifi... but still my new processor is not showing
> up in the list.
>
> What am I missing?
>
> Rick
>
> P.S.  I realize I can also build a separate NAR bundle, but was trying
> this as a first step, before building my own processor bundle.
>
>
>
>
>
>


Re: GetSolr NoClassDefFoundError

2015-08-30 Thread Bryan Bende
Srikanth/Joe,

The NoClassDefFoundError was a side effect that resulted when we bumped
several dependencies to newer versions in the 0.2.x release, specifically
the httpclient.  We have the fix ready to go for 0.3.0, the Jira is -
https://issues.apache.org/jira/browse/NIFI-780

Thanks,

Bryan


On Sun, Aug 30, 2015 at 7:57 PM, Joe Witt joe.w...@gmail.com wrote:

 Hello Srikanth,

 Nope you should not have to mess with the classpath for that to work.
 Created a JIRA for it [1].

 Can you provide a bit about your config on GetSolr?

 Regarding the extension and class loader model the developer guide
 does a pretty good job of describing it [2].  You can get to those
 docs from from the site [3] or a running instance of NiFi.

 Thanks
 Joe

 [1] https://issues.apache.org/jira/browse/NIFI-910
 [2] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#nars
 [3] https://nifi.apache.org/docs.html

 On Sun, Aug 30, 2015 at 7:50 PM, Srikanth srikanth...@gmail.com wrote:
  Hello,
 
  I'm trying out my first dataflow using NiFi. The flow involves GetSolr ->
  LogAttributes. I ran into NoClassDefFoundError while trying this.
 
  Here is an extract from nifi-app.log
 
  18:36:26 EDT ERROR 7a7692b6-a9dd-4a95-a68a-932d54774144
  GetSolr[id=7a7692b6-a9dd-4a95-a68a-932d54774144]
  GetSolr[id=7a7692b6-a9dd-4a95-a68a-932d54774144] failed to process due
 to
  java.lang.NoClassDefFoundError: org/apache/http/message/TokenParser;
  rolling back session: java.lang.NoClassDefFoundError:
  org/apache/http/message/TokenParser
  2015-08-30 18:34:55,908 WARN [Timer-Driven Process Thread-7]
  o.a.n.c.t.ContinuallyRunProcessorTask
  java.lang.NoClassDefFoundError: org/apache/http/message/TokenParser
  at
 
 org.apache.http.client.utils.URLEncodedUtils.parse(URLEncodedUtils.java:280)
  ~[httpclient-4.4.1.jar:4.4.1]
  at
 
 org.apache.http.client.utils.URLEncodedUtils.parse(URLEncodedUtils.java:237)
  ~[httpclient-4.4.1.jar:4.4.1]
 
 
  Should I be manually adding certain jars to class path?
 
  A more general question, how are jars built, distributed and loaded in
  NiFi.
  Asking this mainly because of the extendable design of this project.
 
  I have to say it was super easy to get started with NiFi. Doc was well
  organized.
  I had my first visit to NiFi page y'day morning and I'm writing my first
  dataflow today evening...hit my first exception too!!
 
  Regards,
  Srikanth



Re: 0.3.0 In-depth Testing

2015-08-26 Thread Bryan Bende
All,

I was testing a scenario this morning using the Flume processors to
leverage
Flume's Syslog sources, and ran into an issue that I created a ticket for
[1].

Assuming the patch is reviewed and agreed upon, I would suggest this could
be included in 0.3.0 as it only affects a single line of code in
ExecuteFlumeSource.
If there is any objection to this let me know.

Thanks,

Bryan

[1] https://issues.apache.org/jira/browse/NIFI-895

On Wed, Aug 26, 2015 at 10:08 AM, Aldrin Piri aldrinp...@gmail.com wrote:

 Folks,

 I will be acting as the Release Manager for the upcoming 0.3.0 release.  I
 am pleased to announce the tickets that were originally assigned with a fix
 version of 0.3.0 have been completed and many great features and
 improvements have been incorporated.

 Over the past couple of days, some have been doing some in-depth and
 rigorous testing to find and iron out any quirks in using NiFi.  This has,
 so far, generated two issues to be addressed.

 Anyone that has a few spare moments and would like to join in, please feel
 free to do so and identify any unexpected behavior that may arise under
 different scenarios in which you use NiFi or even exceptional cases.

 Points of emphasis have been handling system conditions that may arise due
 to issues concerning disk space and the underlying repositories.

 Thanks!

 --Aldrin



Re: help with adding a process

2015-08-26 Thread Bryan Bende
Pat,

If you are developing your bundle within the nifi source code, you also
need to update the nifi-assembly/pom.xml to include your new nar in the
final package that gets built.
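
For example, an entry along these lines in the dependencies section of
nifi-assembly/pom.xml (the groupId/artifactId/version below are placeholders
and need to match whatever your nar's pom declares):

<dependency>
    <groupId>org.mitre</groupId>
    <artifactId>nifi-images-nar</artifactId>
    <version>0.3.0-SNAPSHOT</version>
    <type>nar</type>
</dependency>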

You can check if it is included after doing a full build, and checking the
lib directory (located wherever you are running nifi from) and seeing if
your image nar is in there.

-Bryan

On Wed, Aug 26, 2015 at 4:02 PM, plj p...@mitre.org wrote:

 I had not set the org.apache.nifi.processor.Processor file but I have
 now.  No change.

 Pat

 From: Aldrin Piri [via Apache NiFi (incubating) Developer List] [mailto:
 ml-node+s39713n2580...@n7.nabble.com]
 Sent: Wednesday, August 26, 2015 3:36 PM
 To: Jones, Patrick L. p...@mitre.org
 Subject: Re: help with adding a process

 Pat,

 As a quick check, did you add the fully qualified package name of your
 processor (org.mitre.nitfimages.Processor Classname) to the
 org.apache.nifi.processor.Processor file within the nifi-image-processors
 src/main/resources/META-INF/services directory?  The overall structure
 provided seems to be appropriate.

 As an aside, if you are creating a bundle for your organization, you may be
 better suited creating a separate bundle to include in NiFi releases.  We
 have an archetype available to aid in this process, the documentation of
 which is available on the NiFi Confluence wiki [1].

 [1]

 https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions#MavenProjectsforExtensions-MavenProcessorArchetype

 On Wed, Aug 26, 2015 at 3:25 PM, plj [hidden email] wrote:

  Howdy,
 
I'm trying to add my own process to NiFi.  So far I have written the
   process and it compiles.  What I want now is to actually see my process
   show up in NiFi's web GUI list of processors.
 
  What I did: I tried to follow the Developer Guide, adapted to what the code
  looks like (I sort of used nifi-geo-bundle as an example).  So in:
 
  root/nifi-nar-bundles
 
  I added
   nifi-image-bundles
   |-- pom.xml
   |-- nifi-image-processor
   |   `-- src
   |       `-- main
   |           |-- java
   |           |   `-- org.mitre.nitfimages
   |           `-- resources
   |               `-- ...
   `-- nifi-images-nar
       `-- pom.xml
 
  So it all compiles in maven but I don't know what I have to do to make my
  process show up.  Clearly I'm missing a step or 3 to get my stuff added.
  Can you please enlighten me?
 
  thanks in advance,
 
  Pat
 
 
 
 
  --
  View this message in context:
 
 http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/help-with-adding-a-process-tp2579.html
  Sent from the Apache NiFi (incubating) Developer List mailing list
 archive
  at Nabble.com.
 

 




 --
 View this message in context:
 http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/help-with-adding-a-process-tp2579p2584.html
 Sent from the Apache NiFi (incubating) Developer List mailing list archive
 at Nabble.com.



Re: [VOTE] Release Apache NiFi nifi-nar-maven-plugin 1.1.0

2015-08-20 Thread Bryan Bende
+1 Release this package as nifi-nar-maven-plugin-1.1.0

Verified all steps in Matt's helper email, functions as expected.



On Wed, Aug 19, 2015 at 11:25 PM, Matt Gilman matt.c.gil...@gmail.com
wrote:

 +1 (binding) Release this package as nifi-nar-maven-plugin-1.1.0

 On Wed, Aug 19, 2015 at 11:21 PM, Joe Witt joe.w...@gmail.com wrote:

  +1 (binding) Release this package as nifi-nar-maven-plugin-1.1.0
 
  Verified sigs, hashes, builds clean w/contrib-check.  Functions as
  expected.
 
  Minor:
  - The README.md contains two references to our old incubator
  addresses.  This should be resolved in a future release.
 
  On Wed, Aug 19, 2015 at 10:57 PM, Matt Gilman matt.c.gil...@gmail.com
  wrote:
   Hello
  
   I am pleased to be calling this vote for the source release of Apache
  NiFi
   nifi-nar-maven-plugin-1.1.0.
  
   The source zip, including signatures, digests, etc. can be found at:
   https://repository.apache.org/content/repositories/orgapachenifi-1059
  
   The Git tag is nifi-nar-maven-plugin-1.1.0-RC1
   The Git commit ID is 80841130461e8346c0bd643b4097b36bf005b3a2
  
 
 https://git-wip-us.apache.org/repos/asf?p=nifi-maven.git;a=commit;h=80841130461e8346c0bd643b4097b36bf005b3a2
  
   Checksums of nifi-nar-maven-plugin-1.1.0-source-release.zip:
   MD5: 83c70a2a1372d77b3c9e6bb5828db70f
   SHA1: 90e5667a465a092ffeea36f0621706fc55b4
  
   Release artifacts are signed with the following key:
   https://people.apache.org/keys/committer/mcgilman.asc
  
   KEYS file available here:
   https://dist.apache.org/repos/dist/release/nifi/KEYS
  
   1 issue was closed/resolved for this release:
  
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=1201
  
   Release note highlights can be found here:
  
 
 https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-NiFiNARMavenPluginVersion1.1.0
  
   The vote will be open for 72 hours.
   Please download the release candidate and evaluate the necessary items
    including checking hashes, signatures, build from source, and test.  Then
    please vote:
  
   [ ] +1 Release this package as nifi-nar-maven-plugin-1.1.0
   [ ] +0 no opinion
    [ ] -1 Do not release this package because...
 



Re: Process to create multiple files

2015-08-17 Thread Bryan Bende
Pat,

You don't need to extend anything special to produce multiple files. You
can create as many FlowFiles as you want and transfer them to a
relationship, such as...

FlowFile childFlowFile = session.create(original);
// ... create more child flow files as needed ...

List<FlowFile> flowFiles = new ArrayList<FlowFile>();
flowFiles.add(childFlowFile);

session.transfer(flowFiles, YOUR_RELATIONSHIP);

Good examples might be SplitJson or SplitXML.

-Bryan


On Mon, Aug 17, 2015 at 2:05 PM, plj p...@mitre.org wrote:

 Ok that makes sense. What processor would I extend/implement to allow me
 to send out multiple files from the process that unpacks the NITF bundle?
 Any examples?

 Thank,

 Pat


 From: Joe Witt [via Apache NiFi (incubating) Developer List] [mailto:
 ml-node+s39713n2513...@n7.nabble.com]
 Sent: Monday, August 17, 2015 1:56 PM
 To: Jones, Patrick L. p...@mitre.org
 Subject: Re: Process to create multiple files

 Hello Pat,

 Yeah makes sense.  You would need a custom processor to support the
 NITF format [1]

 There does appear to be a Java library to deal with it but its
 licensing isn't ASL v2 compatible so we're not likely to be able to
 play along anytime soon.  But that should be a good way to get started
 building what you need.

 The processor would unpack the items out of the NITF bundle and then
 could send out the full size images if that is how it works.  Those
 full size images can then be resized to thumbnails.  Tons of ways to
 play this one and it is a very common style of use case.

 Thanks
 Joe

 [1] https://en.wikipedia.org/wiki/National_Imagery_Transmission_Format
 [2] https://github.com/codice/imaging-nitf

  On Mon, Aug 17, 2015 at 1:45 PM, plj [hidden email] wrote:

  Howdy,
 
 Thanks for the reply.  I think my situation is different
 than you suggest.  I have an image file in NITF format.  That file may have
 multiple images and multiple bands inside that one file.  The software that
 I have that reads that file does create a thumbnail for each image and each
 band inside that file.   I don't know how many thumbnails I need until I
 read the files metadata.
 
 
  Thank
 
  Pat
 
  From: Dan Bress [via Apache NiFi (incubating) Developer List] [mailto:[hidden email]]
  Sent: Monday, August 17, 2015 1:40 PM
  To: Jones, Patrick L. [hidden email]
  Subject: Re: Process to create multiple files
 
  plj,
 I would not recommend having this processor create multiple
 thumbnails.  What I would recommend is the following:
 
 Create a new processor called CreateThumbnail or RescaleImage
 
  Then have a configuration on the processor that says what size the
 output image should be(e.g. 128x128, or 1/X of original size).
 
  Your new processor will read in the incoming image, and rescale it
 down to the user specified size and pass it forward.
 
    Now if you want to create 128x128, 64x64, and 32x32 sized images
 you would do the following.
 
  (GetFile)->(RescaleImage configured to 128x128)->(PutFile)
    |->(RescaleImage configured to 64x64)->(PutFile)
    \->(RescaleImage configured to 32x32)->(PutFile)
 
  Where GetFile has 3 success relationships, each going to a different
 RescaleImage processor.
 
  I think it makes more sense to have one processor create one file, then
 you can use the flow to visually configure how many copies of the file you
 want to make.  This should make this processor simpler and more reusable.
 
 
  Dan Bress
  Software Engineer
  ONYX Consulting Services
 
  
   From: plj [hidden email]
   Sent: Monday, August 17, 2015 1:27 PM
   To: [hidden email]
  Subject: Process to create  multiple files
 
  Howdy,
 
I'm new to NiFi so please bear with me.  What I want to accomplish is:
read an image file
   process the file to create one or more thumbnails from the image.
   Send the resulting thumbnails along the flow
 
  So I can use GetFile to read the file and then send it along.  I
 think I
  need to write a custom java processor that will process the image file
 and
  then send each of the thumbnail files (say .jpg for now) on to the next
  thing in the flow (say PutFile for example).
 
  Are there suggestions on what I should implement or extend to create
 my
  custom processor?  It will take in one file and output multiple files.
  Would extending PutFile so that it processed and then puts each
 thumbnail
  on the flow be a good strategy?  Other ideas?
 
  Thank you,
 
 
 
 
 
  --
  View this message in context:
 http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/Process-to-create-multiple-files-tp2510.html
  Sent from the Apache NiFi (incubating) Developer List mailing list
 archive at Nabble.com.
 
  
 

Re: [DISCUSS] Feature proposal: First-class Avro Support

2015-08-15 Thread Bryan Bende
Ryan,

Thanks for the feedback and suggestions! We will definitely factor all of
this into the design, and when I get a chance I will update the Wiki page
accordingly.

Thanks,

Bryan

On Sat, Aug 15, 2015 at 5:45 PM, Ryan Blue b...@cloudera.com wrote:

 On 08/12/2015 06:09 PM, Bryan Bende wrote:

 All,

 Given how popular Avro has become, I'm very interested in making progress
 on providing first-class support with in NiFi. I took a stab at filling in
 some of the requirements on the Feature Proposal Wiki page [1] and wanted
 to get feedback from everyone to see if these ideas are headed in the
 right
 direction.

 Are there any major features missing from that list? any other
 recommendations?

 I'm also proposing that we create a new Avro bundle to capture the
 functionality that is decided upon, and we can consider whether any of the
 existing Avro-specific functionality in the Kite bundle could eventually
 move to the Avro bundle. If anyone feels strongly about this, or has an
 alternative recommendation, let us know.

 [1]
 https://cwiki.apache.org/confluence/display/NIFI/First-class+Avro+Support

 Thanks,

 Bryan


 Thanks for putting this together, Bryan!

 I have a few thoughts and observations about the proposal:

 * Conversion to Avro is an easier problem than conversion from Avro. Item
 #2 is to convert from Avro to other formats like CSV, but that isn't
 possible for some Avro schemas. For example, Avro supports nested lists and
 maps that have no good representation in CSV so we'll have to be careful
 about that conversion. It is possible for a lot of data and is definitely
 valuable, though.

 * For #3, converting Avro records, I'd also like to see the addition of
 transformation expressions. For example, I might have a timestamp in
 seconds that I need to convert to the Avro timestamp-millis type by
 multiplying the value by 1000.

 * There are a few systems like Flume that use Avro serialization for
 individual records, without the Avro file container. This complicates
 behavior a bit. Your suggestion to have merge/split is great, but we should
 plan on having a couple of scenarios for it:
   - Merge/split between files and bare records with schema header
   - Merge/split Avro files to produce different sized files

 * The extract fingerprint processor could be more general and populate a
 few fields from the Avro header:
   - Schema definition (full, not fp)
   - Schema fingerprint
   - Schema root record name (if schema is a record)
   - Key/value metadata, like compression codec

 * It looks like #7, evaluate paths, and #8, update records, are intended
 for the case where the content is a bare Avro record. I'm not sure that
 evaluating paths would work for Avro files.

 * For the update records processor, this is really similar to the
 processor to convert between Avro schemas, #3. I suggest merging the two
 and making it easy to work with either a file or a record via record-level
 callback. This would be useful elsewhere as well. Maybe tell the difference
 between file and record by checking for the filename attribute?

 On the subject of where these processors go, I'm not attached to them
 being in the Kite bundle. It would probably be better to separate that out.
 However, there are some specific features in the Kite bundle that I think
 are really valuable:
   - Use a schema file from a HDFS path (requires Hadoop config)
   - Use the current schema of a dataset/table

 Those make it possible to update a table schema, then have that change
 propagate to the conversion in NiFi. So if I start receiving a new field in
 my JSON data, I just update a table definition and then the processor picks
 up the change either automatically or with a restart.

 The other complication is that the libraries for reading JSON and CSV (and
 from an InputFormat if you are interested) are in Kite, so you'll have a
 Kite dependency either way. We can look at separating the support into
 stand-alone Kite modules or moving it into the upstream Avro project.

 Overall, this looks like a great addition!

 rb


 --
 Ryan Blue
 Software Engineer
 Cloudera, Inc.



Re: [DISCUSS] Removal of the 'master' vs 'develop' distinction

2015-08-13 Thread Bryan Bende
If we worked on master and had a prod branch that was the last release,
then we have the same thing we do now, just with different names. This
would be GitLab Flow as Brandon pointed out.

That being said, I don't have experience with the release process, and
maybe the prod branch does not provide any value for us. The prod branch
would normally be used to create quick fix branches based off production,
or when doing automated/continuous deployments to a production system, but
if we aren't doing either of those things then maybe it is not worth it.

-Bryan

On Thu, Aug 13, 2015 at 2:23 PM, Brandon DeVries b...@jhu.edu wrote:

 Personally, I still think GitLab Flow[1] is all we need for us to be Really
 Useful Engines.

 [1] https://about.gitlab.com/2014/09/29/gitlab-flow/

 Brandon

 On Thu, Aug 13, 2015 at 2:15 PM Joe Witt joew...@apache.org wrote:

  Resending
  On Aug 13, 2015 12:22 PM, Joe Witt joe.w...@gmail.com wrote:
 
   Team,
  
   It was proposed by Ryan Blue on another thread that we consider
   dropping the master vs develop distinction.  In the interest of his,
   in my view, very good point I didn't want it to get buried in that
   thread.
  
   [1] is the thread when we last discussed gitflow/develop/master on
   entry to the incubator.
  
   And from that thread here is the part I wish I had better understood
   when the wise Mr Benson said it:
  
   Another issue with gitflow is the master branch. The master branch is
   supposed to get merged to for releases. The maven-release-plugin won't
   do that, and the jgitflow plugin is unsafe. So one option is to 'use
   gitflow' but not bother with the master versus develop distinction,
   the other is to do manual merges to master at release points.
  
   I think we should follow this guidance: 'use gitflow' but not bother
   with the master versus develop distinction.  I say this from having
   done the release management job now a couple of times including having
   done a 'hotfix'.
  
   My comments here are not a rejection of that master/develop concept in
   general.  It is simply pointing out that for the Apache NiFi community
   it is not adding value but is creating confusion and delay [2].
  
   Thanks
   Joe
  
   [1] http://s.apache.org/GIW
   [2] Sir Topham Hatt - Thomas and Friends (tm)
  
 



Re: Trouble sending files

2015-07-30 Thread Bryan Bende
Hi Patrick.

There are generally two approaches to sending data between nifi instances...

- The sending instance brings data to a remote process group which is
pointing at the receiving nifi, the receiving nifi has an input port to
receive the data
- The sending instance brings data to an output port, and the receiving
instance has a remote process group pointing at the output port of the
sending instance

Basically push vs. pull. Either approach will work, but the advantage of
the first approach (pushing) is that the sending nifi is deciding where to
send its data, as opposed to data waiting at an output port for anyone to
consume.

You can add a remote process group from the top menu, and you will be
prompted to enter the url location of the remote nifi. Let us know if you
have any questions getting this to work.
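
As a rough sketch of the push setup (host name and web port below are
placeholders; the two site-to-site properties are the ones you already have
in your message):

# receiving instance, conf/nifi.properties
nifi.remote.input.socket.port=1234
nifi.remote.input.secure=false
# plus an Input Port on its root canvas, e.g. named "from-sender"

# sending instance: add a Remote Process Group pointing at the receiver's UI,
# e.g. http://receiver-host:8080/nifi, then connect your queue to the
# "from-sender" port that appears inside that Remote Process Group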

-Bryan



On Thu, Jul 30, 2015 at 2:07 PM, plj p...@mitre.org wrote:

 Howdy

   I'm trying to send files between 2 nifi instances on different computers.
 I read a bunch of files, they are sitting in the Q.  I wire them to an
 output port.  On the other machine I have an input port of the same name
 which is happily waiting for the files.  Nothing happens.
   I have on both machines set:
 nifi.remote.input.socket.port=1234
 nifi.remote.input.secure=false

 What else do I have to do to let Nifi know about another instance/instances
 and cause them to communicate.

 I don't see any errors in the logs.

 thank you,

 Patrick Jones
 
 http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/file/n2308/get_put_nifi.png
 



 --
 View this message in context:
 http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/Trouble-sending-files-tp2308.html
 Sent from the Apache NiFi (incubating) Developer List mailing list archive
 at Nabble.com.



Re: [VOTE] Release Apache NiFi 0.2.1

2015-07-23 Thread Bryan Bende
+1 (binding)

- Signature and hashes look good
- LICENSE, NOTICE, README look good in source and assembly
- Source builds with contrib-check
- Started resulting binary with https and client auth, successfully
prompted to request an account
- Basic flow works


On Thu, Jul 23, 2015 at 6:49 PM, Joe Witt joe.w...@gmail.com wrote:

 Hello

 I am pleased to be calling this vote for the source release of Apache
 NiFi nifi-0.2.1.

 The source zip, including signatures, digests, etc. can be found at:
 https://repository.apache.org/content/repositories/orgapachenifi-1058

 The Git tag is nifi-0.2.1-RC1
 The Git commit ID is 08cd40ddcbee21b7a7d9ff5980264a2b4ee5f1f3


 https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=commit;h=08cd40ddcbee21b7a7d9ff5980264a2b4ee5f1f3

 Checksums of nifi-0.2.1-source-release.zip:
 MD5: e87b9c20660cea6045f8b16a99278aa8
 SHA1: 2dcfd583a3e16c39787dc3872362ace6e35d1514

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/joewitt.asc

 Should the release vote succeed convenience binaries will be made
 available at the proper dist location and NiFi downloads page.

 KEYS file available here:
 https://dist.apache.org/repos/dist/release/nifi/KEYS

 2 issues were closed/resolved for this release:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12333089

 The vote will be open for 72 hours. (18:45EST 26 July 2015)
 Please download the release candidate and evaluate the necessary items
 including checking hashes, signatures, build from source, and test.
 Then please vote:

 [ ] +1 Release this package as nifi-0.2.1
 [ ] +0 no opinion
 [ ] -1 Do not release this package because...


