Re: SplitJson:GC Overhead Limit Exceeded

2016-11-17 Thread Matt Burgess
If we consider streaming for SplitJson (or a new version of it), we wouldn't be able to support the "micro-batch" functionality as it exists in SplitJson today (like the fragment.count attribute, for example). Might not be a concern, or might warrant a new processor (e.g., SplitJsonStreaming). Regards,

Re: Nifi vs Sqoop

2016-11-10 Thread Matt Burgess
Nicolas, The Max Value Columns property of QueryDatabaseTable is the specification by which the processor fetches only the new lines. In your case you would put "lastmodificationdate" as the Max Value Column. The first time the processor is triggered, it will execute a "SELECT * from myTable" and
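The max-value idea described above can be sketched outside NiFi. This is only an illustration, assuming a SQLite table and a hypothetical helper named fetch_new_rows; the follow-up query with a WHERE clause reflects the general incremental-fetch pattern and is not taken from the truncated message.

```python
import sqlite3


def fetch_new_rows(conn, table, max_value_column, last_max=None):
    """Return (rows, new_max) for rows whose max-value column exceeds last_max.

    Hypothetical helper for illustration; table and column names are
    interpolated directly, so do not pass untrusted input.
    """
    if last_max is None:
        # First run: no maximum has been recorded yet, so fetch everything.
        sql = f"SELECT * FROM {table}"
        params = ()
    else:
        # Later runs: only rows newer than the stored maximum.
        sql = f"SELECT * FROM {table} WHERE {max_value_column} > ?"
        params = (last_max,)
    cur = conn.execute(sql, params)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    idx = columns.index(max_value_column)
    # Keep the previous maximum when no new rows were returned.
    new_max = max((r[idx] for r in rows), default=last_max)
    return rows, new_max
```

Calling the helper repeatedly and feeding new_max back in as last_max imitates the state the processor keeps between runs.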

Re: PutElasticsearchHttp error behaviour

2016-11-07 Thread Matt Burgess
I agree, this should be caught and the flow file routed to failure. Do you mind filing a Jira for this? Thanks, Matt On Mon, Nov 7, 2016 at 2:40 PM, Gaspar, Carson wrote: > Yes, I know why it failed, and how to fix it. What I don’t understand is why > it was retried forever, causing backpressure

Re: PutHiveStreaming has only one concurrent task

2016-11-07 Thread Matt Burgess
Olav, I contributed that processor, and for the life of me I can't remember why it is forcing serial execution. My guess is that it was a limitation of the version of the Hive Streaming libraries. I will try it with multiple concurrent tasks to see if it works, and if so, I will write a Jira to su

Re: ExecuteSQL just once

2016-10-31 Thread Matt Burgess
ow files. > > I'd say probably something that would make a convenient "run once unless > this flag is cleared" type setting would have been what I was looking for. > I admit it isn't quite as likely to be needed in a normal production > environment though, just deve

Re: ExecuteSQL just once

2016-10-30 Thread Matt Burgess
We have QueryDatabaseTable and GenerateTableFetch for such things, especially the former for 1.0.0. QueryDatabaseTable allows you to pick "thedate" as a max-value column that it keeps track of, and you can specify the initial maximum value (not sure what version that was added in as I'm AFK). S

Re: fetch elasticsearch http

2016-10-26 Thread Matt Burgess
cific index using nifi > then store them on HDFS. > > On Tue, Oct 25, 2016 at 6:34 PM, Matt Burgess wrote: >> >> Johny, >> >> What version of NiFi are you using? Also are you trying to get >> documents from ES using FetchElasticSearch(Http) or put docs to it >>

Re: fetch elasticsearch http

2016-10-25 Thread Matt Burgess
Johny, What version of NiFi are you using? Also are you trying to get documents from ES using FetchElasticSearch(Http) or put docs to it using PutElasticsearch(Http)? For Fetching, the Document Identifier is the _id of the document you want to retrieve. If you're looking to do a search on documen

Re: Build failure under CentOS 6.7

2016-10-18 Thread Matt Burgess
This might be an issue with a parallel build, I wonder if it builds successfully if you don't include the "-T C2.0" on the Maven command-line. Regards, Matt On Tue, Oct 18, 2016 at 3:08 PM, Andy LoPresto wrote: > Michael, > > Just to help us close this out, could you please provide the full outp

Re: Nifi template for Facebook api

2016-10-18 Thread Matt Burgess
I'm not aware of any existing templates, but one of my Stack Overflow answers [1] refers to the processor(s) used, as well as a link to a lengthy discussion about SSL/auth topics thanks to Andy LoPresto. Regards, Matt [1] http://stackoverflow.com/questions/36471725/procedure-to-fetch-facebook-da

Re: Flow file from a (long) string

2016-10-18 Thread Matt Burgess
Alessio, The ReplaceText processor [1] will allow you to set the content of a flow file to a string of your choosing. You can use GenerateFlowFile [2] with a size of 0B and type Text, to get a flow file routed to ReplaceText, which then sets the content. This is a common pattern seen in some of th

Re: Penalize Flow File on Failure

2016-10-14 Thread Matt Burgess
f-referencing flow files to hit the main processor again immediately? > > Regards, > Manish > > -Original Message- > From: Matt Burgess [mailto:mattyb...@apache.org] > Sent: Friday, October 14, 2016 10:25 PM > To: users@nifi.apache.org > Subject: Re: Penalize Flow

Re: Penalize Flow File on Failure

2016-10-14 Thread Matt Burgess
Manish, The use of penalize(), yield(), etc. is not enforced by the framework, so processors can have different behavior, sometimes on purpose, and sometimes inadvertently. The Developer's Guide has guidance on when to use such methods [1], and reviewers often check the submissions to see if they

Re: NiFi for backup solution

2016-10-13 Thread Matt Burgess
Rai, There are incremental data movement processors in NiFi depending on your source/target. For example, if your sources are files, you can use ListFile in combination with FetchFile, the former will keep track of which files it has found thus far, so if you put new files into the location (or up

Re: listdatabasetables -> querydatabasetable

2016-10-10 Thread Matt Burgess
Selvam, For the first version of QueryDatabaseTable and GenerateTableFetch, the idea was to just allow this for a single table, as it simplifies the logic and behavior of the processor. Having said that, I think what you propose is a useful extension, and I have written up a Jira [1] to cover it.

[DISCUSS] Slack team for Apache NiFi

2016-10-10 Thread Matt Burgess
All, I'd like to revisit the idea of having (and promoting) a Slack team for the Apache NiFi community. We broached the subject in an email thread a while back [1]. The email lists are great and IRC per se is still popular in the open-source community, but I think for folks that are more comfortab

Re: Download Multiple Files from Queue

2016-10-10 Thread Matt Burgess
Manish, This is possible via the REST API [1]: 1) Identify the UUID of the connection you're interested in. This can be done manually using the UI (right-click on the connection, choose Configure, then on the Settings tab there is an "Id" field. You can also use the flow API (nifi-api/flow/proces

Re: Routeonattribute

2016-10-08 Thread Matt Burgess
Selvam, Are those two branches meeting at the same RouteOnAttribute (aka filterpoint)? If so, I'm assuming you'd like the GetFile/ExtractText to inform the RouteOnAttribute processor how to handle the flow files coming in from the other branch (please correct me if I've misunderstood). In Flow-Bas

Re: SelectHiveQL Error

2016-10-07 Thread Matt Burgess
ent error set of >> errors. >> >> >> >> >Error getting Hive connection >> >> >org.apache.commons.dbcp.SQLNestedException: Cannot create >> > PoolableConnectionFactory (Could not open client transport with JDBC Uri: >> > jdbc:hive2://…) >> >> >Caus

Re: SelectHiveQL Error

2016-10-06 Thread Matt Burgess
>> driver of class 'org.apache.hive.jdbc.HiveDriver' for connect URL >> 'jdbc:hive://server:port/default' >> at >> org.apache.commons.dbcp.BasicDataSource.createConnectionFactory(BasicDataSource.java:1452) >> ~[commons-dbcp-1.4.jar:1.4] &g

Re: SelectHiveQL Error

2016-10-06 Thread Matt Burgess
Dan, That is a catch-all error returned when (in this case, probably) something is misconfigured. Are there more error lines below that in the log? The driver class and all its dependencies are present in the Hive NAR, so there is likely an underlying error that, while being propagated up, returns the g

Re: JoltTransformJSON error

2016-10-05 Thread Matt Burgess
I'm not near my computer but my knee-jerk reaction is that all the jolt-app-demo transforms are actually Chain transforms, some (like your example) with a single transform inside (like a Shift). Try removing the array brackets if you're selecting a Shift transform, or choose Chain and keep them

Re: importing data from postgresql to disk using NIFI

2016-10-04 Thread Matt Burgess
Selvam, Do you know which table is giving that error, and what columns are in it? Type is "other", which implies it contains non-standard values. It is possible we could improve ExecuteSQL to try and treat it like a string, but that wouldn't likely work for all cases, so it might be perfor

Re: Nifi for java Program

2016-10-04 Thread Matt Burgess
Selvam, Are you looking to run an external java program (like running "java -jar MyCode.jar" from the command-line)? If so, you can use the ExecuteProcess [1] or ExecuteStreamCommand [2] processor(s). If you are looking to call code from a JAR directly, you could use the ExecuteScript processor [

Re: PutHiveQL and Hive Connection Pool with HDInsight

2016-09-30 Thread Matt Burgess
a new > directory is created by an incoming flow file. If yes, we just want to call > Alter Table add partition to refresh Hive metadata with newly created > partition. > > > > > > Regards, > > Manish > > > > *From:* Matt Burgess [mailto:mattyb...@apa

Re: PutHiveQL and Hive Connection Pool with HDInsight

2016-09-30 Thread Matt Burgess
ustername.azurehdinsight.net:443/ > somedbname;ssl=true?hive.server2.transport.mode=http; > hive.server2.thrift.http.path=/hive2. > > But, I was getting *java.lang.NoSuchFieldError: INSTANCE: > java.lang.NoSuchFieldError: INSTANCE*. > > > > I will try again with *transportMo

Re: PutHiveQL and Hive Connection Pool with HDInsight

2016-09-29 Thread Matt Burgess
Manish, According to [1], status 72 means a bad URL, perhaps you need a transportMode and/or httpPath parameter in the URL (as described in the post)? Regards, Matt [1] https://community.hortonworks.com/questions/23864/hive-http-transport-mode-problem.html On Thu, Sep 29, 2016 at 9:06 AM, Mani

Re: Create NiFi Templates

2016-09-28 Thread Matt Burgess
Ashish, I don't have the 0.7 docs in front of me so I'm not sure if/how it is possible there, but it is definitely possible via the 1.0 REST API [1]. Procedure is as follows: If the process group already exists in the flow (and you know its ID), here are some REST API calls that should create a t

Re: ExecuteSQL & BigInt fieds

2016-09-27 Thread Matt Burgess
All, I just reviewed and merged this fix. If you need a workaround in the meantime, if you can change your table such that the 'code' column is an unsigned bigint, then I think it works. That's what I tested for a related issue NIFI-2531, but forgot the signed bigint case :( Regards, Matt On Tue

Re: PutHiveQL Multiple Ordered Statements

2016-09-23 Thread Matt Burgess
actText > doesn't have anything like that. Thoughts? > > --Peter > > > -Original Message- > From: Matt Burgess [mailto:mattyb...@apache.org] > Sent: Friday, September 23, 2016 8:02 AM > To: users@nifi.apache.org > Subject: Re: PutHiveQL Multiple Ordered

Re: PutHiveQL Multiple Ordered Statements

2016-09-23 Thread Matt Burgess
however I am not really sure how to apply the correct priority > attribute to the correct split. Does split already apply a split index? (I > haven't checked) > > Thanks, > Peter > > -Original Message- > From: Matt Burgess [mailto:mattyb...@apache.or

Re: PutHiveQL Multiple Ordered Statements

2016-09-23 Thread Matt Burgess
Peter, Since each of your statements ends with a semicolon, I would think you could use SplitText with Enable Multiline Mode and a delimiter of ';' to get flowfiles containing a single statement apiece, then route those to a single PutHiveQL. Not sure what the exact regex would look like but on it
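The splitting step suggested above can be approximated in a few lines. This is a rough sketch with a hypothetical function name, assuming no statement contains a semicolon inside a string literal or comment; it drops the delimiter and keeps the statements in their original order.

```python
def split_statements(script):
    """Split a multi-statement script on ';' and drop empty fragments.

    Naive by design: a ';' inside a quoted string would also split.
    """
    parts = (fragment.strip() for fragment in script.split(";"))
    return [fragment for fragment in parts if fragment]
```

Each returned string stands in for the content of one flow file routed on to the single downstream processor.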

Re: QueryDatabaseTable Processor

2016-09-14 Thread Matt Burgess
Guillaume, that is certainly something that could be added. Would you mind filing a Jira for this at https://issues.apache.org/jira/browse/NIFI ? Thanks, Matt On Wed, Sep 14, 2016 at 12:52 AM, Guillaume Pool wrote: > Hi, > > On the Processor there are pre-processing options for handling Oracle d

Re: Nifi 1.0.0 compatibility with Hive 1.1.0

2016-09-09 Thread Matt Burgess
Yari, NiFi was coded and built against Apache Hive 1.2.1. Some of the API, files, and folders had changed between Hive 1.1.0 and Hive 1.2.1, such as the org.apache.hadoop.hive.ql.io.filters package being added in 1.2.0. Bringing in Apache ORC won't work for the same reason, it was split from Hive

Re: Dynamic property in QueryDatabaseTable

2016-09-08 Thread Matt Burgess
Ravisankar, The dynamic property needs to have a certain name, in general of the form initial.maxvalue.{max_value_column}. So if you have a max value column called last_updated, you will want to add a dynamic property called initial.maxvalue.last_updated, and you set the value to whatever you wan
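The naming rule above is mechanical enough to express as a one-line helper. The function name is hypothetical and exists only to show how the property name is derived from the column name.

```python
def initial_max_value_property(max_value_column):
    """Build the dynamic property name for a given max value column."""
    return f"initial.maxvalue.{max_value_column}"
```

For a column called last_updated this yields the property name given in the message, initial.maxvalue.last_updated.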

Re: Appending files in Hadoop with PutHDFS ...

2016-09-07 Thread Matt Burgess
file) but the latency > it introduces is not acceptable. What are some other options that we can try? > > Suyog Kulkarni > suyog_kulka...@csx.com > > > -Original Message- > From: Matt Burgess [mailto:mattyb...@apache.org] > Sent: Wednesday, September 07, 201

Re: Appending files in Hadoop with PutHDFS ...

2016-09-07 Thread Matt Burgess
Suyog, PutHDFS does not support appending files at the moment. I believe the Jira you mentioned is NIFI-958 [1], which is marked Resolved but should be Closed as duplicate. This case was split into two others, NIFI-1321 for PutFile [2] and NIFI-1322 for PutHDFS [3]. The latter is not resolved or b

Re: Posting input files to NiFi using REST

2016-09-05 Thread Matt Burgess
, 2016 at 7:29 PM, James McMahon wrote: > Thank you Matt. This helps me configure NiFi to field the request. Is there > an example you can point me towards that shows how to build, issue, and send > such a request - in Python or java, for example? > > On Sun, Sep 4, 2016 at 8:22 A

Re: Posting input files to NiFi using REST

2016-09-04 Thread Matt Burgess
James, For simple calls that return immediately, ListenHttp probably works fine. For more flexible and powerful processing of HTTP requests (and responses), you might be better off with HandleHttpRequest and HandleHttpResponse. There is an example of this under Hello_NiFi_Web_Service [1]. Regards

Re: Processor to enrich attribute from external service

2016-09-02 Thread Matt Burgess
onal “Extract” type processor. All the > downstream processor can simply work with “jsonPath” for additional lookup > inside the attribute. > > > > Regards, > > Manish > > > > From: Matt Burgess [mailto:mattyb...@gmail.com] > Sent: Friday, September 02, 2016

Re: Processor to enrich attribute from external service

2016-09-02 Thread Matt Burgess
Manish, Some of the queries in those processors could bring back lots of data, and putting them into an attribute could cause memory issues. Another concern is when the result is binary data, such as ExecuteSQL returning an Avro file. And since the return of these is a collection of records, th

Re: Drop FlowFIle in ExecuteScript

2016-08-31 Thread Matt Burgess
Actually I just found them hosted on javadoc.io, nice service that will grab the javadoc from Maven Central and host it for "any" artifact: https://www.javadoc.io/doc/org.apache.nifi/nifi-api/1.0.0 Regards, Matt On Wed, Aug 31, 2016 at 11:20 AM, Matt Burgess wrote: > The Javado

Re: Drop FlowFIle in ExecuteScript

2016-08-31 Thread Matt Burgess
is there any API documentation for the Session object online ? > > On 31 August 2016 at 16:03, Matt Burgess wrote: >> >> Mike, >> >> You can explicitly drop the flow file using session.remove(flowFile). >> I believe for auto-terminating connections that

Re: Drop FlowFIle in ExecuteScript

2016-08-31 Thread Matt Burgess
Mike, You can explicitly drop the flow file using session.remove(flowFile). I believe for auto-terminating connections that is what is happening under the hood. Regards, Matt On Wed, Aug 31, 2016 at 11:01 AM, Mike Harding wrote: > Hi all, > > I have an ExecuteScript processor that creates new f

Re: NiFi processor to convert CSV to XML

2016-08-25 Thread Matt Burgess
Ram, You could use the ExecuteScript processor if you are comfortable with scripting in Groovy, JavaScript, Jython, JRuby, or Lua. I have an example [1] of reading in a file and splitting on a delimiter (like a comma). If you use Groovy, you can leverage the MarkupBuilder [2] to build XML. Please

Re: Need to read a small local file into a flow file property

2016-08-24 Thread Matt Burgess
Chris, Are you looking to have a flow file that has its own content also as an attribute? With EvaluateJsonPath, are you taking in the entire document? If so, you could use ExtractText with a regex that captures all text and puts it in an attribute, I believe the content of the flow file is untouc

Re: adding dependencies like jdbc drivers to the build

2016-08-22 Thread Matt Burgess
All, I took a shot at adding the ability to specify multiple URLs, files, and folders to the DBCPConnectionPool configuration (NIFI-2604). The branch is here if you'd like to build and try: https://github.com/mattyb149/nifi/tree/NIFI-2604 The property name, description, etc. has changed, which w

Re: Query related to ExecuteScript

2016-08-18 Thread Matt Burgess
ead.QueuedThreadPool Unexpected thread death: >> org.eclipse.jetty.util.thread.QueuedThreadPool$3@54b057d5 in NiFi Web >> Server{STARTED,8<=13<=200,i=4,q=0} >> 2016-08-18 20:49:42,759 ERROR [NiFi Web Server-111] org.apache.nifi.NiFi >> An Unknown Error Occurred in Thread Thread

Re: Query related to ExecuteScript

2016-08-17 Thread Matt Burgess
() > > When I run this , Nifi hangs for some reason. Am I doing something grossly > wrong ? > > > > > > > On Wed, Aug 17, 2016 at 3:57 PM, Matt Burgess wrote: > >> If you need an input flowfile, you're probably better off with >> ExecuteStreamCommand than Exec

Re: Query related to ExecuteScript

2016-08-17 Thread Matt Burgess
If you need an input flowfile, you're probably better off with ExecuteStreamCommand than ExecuteScript for this use case. ExecuteStreamCommand is much like ExecuteProcess but it accepts input flow files. Regards, Matt > On Aug 17, 2016, at 6:49 PM, koustav choudhuri wrote: > > HI All > > I

Re: v0.* QueryDatabaseTable vs v1 GenerateTableFetch

2016-08-15 Thread Matt Burgess
Oops sorry, had replied before I saw this :) > On Aug 15, 2016, at 11:15 PM, Peter Wicks (pwicks) wrote: > > Oh, disregard J. I misread GenerateTableFetch as being an actual data fetch > vs a query builder. > > From: Peter Wicks (pwicks) > Sent: Monday, August 15, 2016 9:11 PM > To: 'users@

Re: v0.* QueryDatabaseTable vs v1 GenerateTableFetch

2016-08-15 Thread Matt Burgess
Peter, Another difference between the two (besides the paging) is that QueryDatabaseTable executes SQL and GenerateTableFetch generates SQL. With the paging capability (which with Remote Process Groups enables distributed fetch a la Sqoop), you're likely correct that GTF will replace / deprecat

Re: MergeContent with varying number of entries in bins.

2016-08-10 Thread Matt Burgess
Michael, There are a handful of examples of ExecuteScript using Javascript and/or Jython, on my blog (http://funnifi.blogspot.com) and other locations: Javascript: http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited.html https://mail-archives.apache.org/mod_mbox/nifi-users/20

Re: ExecuteSQL question

2016-08-03 Thread Matt Burgess
hen finally write text file back to file system to be picked up next time? > Thanks > Conrad > > On 03/08/2016, 14:02, "Matt Burgess" wrote: > >Conrad, > >Is it possible to add a view (materialized or not) to the RDBMS? That >view could take car

Re: ExecuteSQL question

2016-08-03 Thread Matt Burgess
Conrad, Is it possible to add a view (materialized or not) to the RDBMS? That view could take care of the denormalization and then QueryDatabaseTable could point at the view. The DB would take care of the push-down filters, which functionally is like if you had a QueryDatabaseTable for each table

Re: JsonSplit Question/Help

2016-07-26 Thread Matt Burgess
Sven, You can use the SplitJson processor with a JSONPath value of $.twitter.hashtags, it will create a new flowfile for each hashtag. Then you can use EvaluateJsonPath to get the text value from each of the flow files. Regards, Matt On Tue, Jul 26, 2016 at 5:09 PM, Sven Davison wrote: > I’m tr
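The two-step flow above (split on $.twitter.hashtags, then pull the text out of each fragment) can be sketched in plain Python. The document shape is an assumption: each hashtag is taken to be an object with a "text" field, and both function names are hypothetical.

```python
import json


def split_hashtags(document):
    """Return one JSON string per element of twitter.hashtags."""
    doc = json.loads(document)
    return [json.dumps(tag) for tag in doc["twitter"]["hashtags"]]


def extract_text(fragment):
    """Read the 'text' field from a single hashtag fragment."""
    return json.loads(fragment)["text"]
```

Mapping extract_text over the result of split_hashtags gives one text value per hashtag, which is roughly what the split followed by the JSON path evaluation would produce as separate flow files.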

Re: export from Teradata

2016-07-20 Thread Matt Burgess
Anuj Handa wrote: >>>> Hi Dima, >>>> >>>> You will have to create an über jar from the JDBC drivers provided by >>>> Teradata and copy the uberjar into the lib folder of nifi. >>>> >>>> As Matt pointed out the instructio

Re: export from Teradata

2016-07-14 Thread Matt Burgess
Dima, There was a discussion on how to get the SQL processors working with Teradata a little while ago: http://mail-archives.apache.org/mod_mbox/nifi-users/201605.mbox/%3CCAEXY4srXZkb2pMGiOFGs%3DrSc_mHCFx%2BvjW32RjPhz_K1pMr%2B%2Bg%40mail.gmail.com%3E Looks like it involves making a fat JAR to in

Re: Create a PutRethinkDB processor

2016-07-10 Thread Matt Burgess
Stéphane, In 0.7.0 and forward, you will be able to set the number of concurrent tasks for ExecuteScript to whatever you like [1]. For InvokeScriptedProcessor, a current issue is that it only expects (and interacts with) a Processor interface, which includes an "initialize" method but doesn't che

Re: Json routing

2016-07-07 Thread Matt Burgess
hu, Jul 7, 2016 at 11:45 AM, Matt Burgess wrote: >> It will be fixed in 0.7.0 [1]. Also you could use >> InvokeScriptedProcessor to replace both the ExecuteScript and >> RouteOnAttribute, since the scripted processor can define the >> relationships and provide th

Re: Json routing

2016-07-07 Thread Matt Burgess
It will be fixed in 0.7.0 [1]. Also you could use InvokeScriptedProcessor to replace both the ExecuteScript and RouteOnAttribute, since the scripted processor can define the relationships and provide the logic to extract the arbitrary JSON keys. Regards, Matt [1] https://issues.apache.org/jira/b

Re: ExecuteProcess (fetch output)

2016-07-03 Thread Matt Burgess
A single dot will match a single character, so I think you'll need ".*". Also ExtractText might be looking for a grouping, so you may need "(.*)". If that doesn't handle multi-lines I think there's a processor property for that. Sorry I'm not at my computer so can't confirm. > On Jul 3, 2016, a
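The difference between the three patterns mentioned above is easy to check with Python's re module. This is a sketch only: the helper name is hypothetical, and re.DOTALL stands in for whatever multi-line setting the processor offers.

```python
import re


def capture(pattern, text, dotall=False):
    """Return the first capture group if the pattern has one, else the whole match."""
    flags = re.DOTALL if dotall else 0
    match = re.search(pattern, text, flags)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)
```

With the input "line one\nline two", the pattern "." captures a single character, "(.*)" stops at the first line break, and "(.*)" with dotall=True captures both lines.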

Re: PutCassandraQL failing on ISO-8601-formatted timestamp

2016-07-01 Thread Matt Burgess
events starting with ID 16 > 2016-07-01 12:13:18,095 INFO [Provenance Repository Rollover Thread-1] > o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal files > (8 records) into single Provenance Log File ./provenance_repository/8.prov in > 165 milliseconds >

Re: PutCassandraQL failing on ISO-8601-formatted timestamp

2016-07-01 Thread Matt Burgess
'cql.args.3.type' > Value: 'float' > Key: 'cql.args.3.value' > Value: '6.7' > Key: 'j.id' > Value: 'temp3' > Key: 'j.ts' > Value: '2016-06-30T20:04:36Z' > Key:

Re: Problem with EvaluationJsonPath

2016-06-24 Thread Matt Burgess
Anuj, It seems like the value at that path is an array, perhaps add [0] to your JSON Path. Is "0" the expected value of that field? If not then perhaps the JSON path itself is incorrect, you could test it with some sample data at http://jsonpath.com/ Regards, Matt On Fri, Jun 24, 2016 at 3:34 P

Re: PutCassandraQL failing on ISO-8601-formatted timestamp

2016-06-21 Thread Matt Burgess
Jeff, That appears to be a correct ISO-8601 date, so I'm not sure what's going on there. I checked the NiFi code and the Cassandra Java driver Jira and didn't see anything related (that wasn't already fixed, in the latter case). The upcoming 0.7.0 release has an updated Cassandra driver, perhaps t

Re: IDE-specific setup

2016-06-21 Thread Matt Burgess
Same as Bryan, I choose New (Project or Module) from Existing Sources and point at the POM for that directory/project/module, IntelliJ does a good job of getting everything set up. On Tue, Jun 21, 2016 at 5:29 PM, Bryan Bende wrote: > I personally use IntelliJ and it generally does pretty well at

Re: Escape * and new line character

2016-06-21 Thread Matt Burgess
Huagen, I agree with Bryan that other processors may be better here. You could use ListFile -> FetchFile, or as Bryan said, you could use GetFile. Regards, Matt On Tue, Jun 21, 2016 at 9:38 AM, Bryan Rosander wrote: > Hi Huagen, > > 1. The ExecuteStreamCommand uses a ProcessBuilder under the co

Re: GetHTTP->ExtractText (Regex/User problem?)

2016-06-20 Thread Matt Burgess
Looks like "content" is in smart quotes, try plain quotes instead. On Mon, Jun 20, 2016 at 1:43 PM, Sven Davison wrote: > I had tried that but got a NULL value result. Is there a setting w/in the > extractor that I need to change too? > > > > > > > > -Sven > > Sent from Mail for Windows 10 > > >

Re: Writing files to MapR File system using putHDFS

2016-06-14 Thread Matt Burgess
> > Ravi Papisetti > > Technical Leader > > Services Technology Incubation Center > <http://wwwin.cisco.com/CustAdv/ts/cstg/stic/> > > rpapi...@cisco.com > > Phone: +1 512 340 3377 > > > [image: stic-logo-email-blue] > > From: Matt Burgess > Rep

Re: Writing files to MapR File system using putHDFS

2016-06-13 Thread Matt Burgess
Sumo, I'll try the MapR PR with your additional settings below. If they work, they'll need to be added to the doc (or ideally, the profile if possible). That's what I suspected had been missing but haven't had a chance to try yet, will do that shortly :) Thanks, Matt > On Jun 13, 2016, at 9:1

Re: Failure when running a workflow created from a template from another NiFi version.

2016-06-09 Thread Matt Burgess
Not to stir the pot but the only time I've seen that error was a Jira that was fixed for 0.6.1, in fact I think I recommended such an upgrade for someone with the same problem. I'm glad the issue went away but it's weird that it showed up in 0.6.1... > On Jun 9, 2016, at 7:41 PM, James Wing w

Re: PutElasticsearch Identifier attribute question

2016-06-07 Thread Matt Burgess
":"160889137" causes the issue. > > >> On Tue, Jun 7, 2016 at 6:53 PM, Matt Burgess wrote: >> Igor, >> >> The "id" field you have is in your content, but PutElasticsearch is >> looking for a flow file attribute. This can be fixed by put

Re: PutElasticsearch Identifier attribute question

2016-06-07 Thread Matt Burgess
Igor, The "id" field you have is in your content, but PutElasticsearch is looking for a flow file attribute. This can be fixed by putting an EvaluateJsonPath processor before the PutElasticsearch processor, with the Destination property set to "flowfile-attribute" and add a dynamic property called

Re: Dependency for SSL

2016-05-25 Thread Matt Burgess
If that library has dependencies, you may need to remove the jar so that it brings in the POM, which should get the JAR and its dependencies. Regards, Matt > On May 25, 2016, at 5:38 PM, Kumiko Yada wrote: > > Thank you Bryan and Matt. > > I added the dependency in the pom of processor jar

Re: Dependency for SSL

2016-05-25 Thread Matt Burgess
Kumiko, I'm guessing that entry is in your processor's POM. I believe you need the following in your NAR's POM as well: org.apache.nifi nifi-standard-services-api-nar nar Regards, Matt On Wed, May 25, 2016 at 3:44 PM, Kumiko Yada wrote: > Hello, >

Re: Nifi into Titan graph

2016-05-22 Thread Matt Burgess
Pat, I did a very deep dive into Tinkerpop3 this weekend, I was looking for a very generic solution (to involve GremlinServer at the least but hopefully using RemoteGraph/RemoteConnection for any server that can accept graph traversals, not a Titan one in particular). Also I wanted to abstract the

Re: EvaluateXPath and xml namespace

2016-05-22 Thread Matt Burgess
ked. > > How may I handle this error? > > By the way, I'm using Nifi 0.6.0 > > Thanks. > Hong > > > > *Hong Li* > > *Centric Consulting* > > *In Balance* > (888) 781-7567 office > (614) 296-7644 mobile > www.centricconsulting.com | @Centr

Re: EvaluateXPath and xml namespace

2016-05-21 Thread Matt Burgess
Hong, The use of a default namespace makes the XPath more tricky, as the namespace technically exists as a prefix although it is not visible in the document. As an example, I used this sample content: http://cp.com/rules/client";> Hello In order to get the value "Hello", I had to use wildcar
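The effect of a default namespace can be reproduced with Python's standard XML parser. The sketch below matches elements by local name, which is one common wildcard-style workaround; the element names (rules, message) are hypothetical because the sample in the message is garbled, and the exact XPath used in the original reply is cut off.

```python
import xml.etree.ElementTree as ET


def find_by_local_name(xml_text, local_name):
    """Return the text of every element whose tag matches local_name,
    ignoring whatever namespace the element is in."""
    root = ET.fromstring(xml_text)
    return [
        element.text
        for element in root.iter()
        # Namespaced tags look like '{uri}name'; keep only the part after '}'.
        if element.tag.rsplit("}", 1)[-1] == local_name
    ]
```

A plain lookup such as root.find("message") returns nothing for a document with a default namespace, because the parsed tag name includes the namespace URI.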

Re: JSON Schema

2016-05-17 Thread Matt Burgess
Madhu, This is a good idea for a processor (ValidateJson like the existing ValidateXml processor), I've written up [1] in Jira for it. In the meantime, here's a Groovy script you could use in ExecuteScript, just need to download the two JAR dependencies ([2] and [3]) and add them to your Module D

Re: QueryDatabaseTable errors

2016-05-12 Thread Matt Burgess
We can probably do better with the error displayed in the bulletin, maybe by propagating the message from the cause to the RuntimeException or something. > On May 12, 2016, at 6:31 PM, Thad Guidry wrote: > > The odbc6.jar is in the classpath already ( dropped it into NiFi /lib) > > The URL is

Re: Elastic search processor in NiFi

2016-05-12 Thread Matt Burgess
Ravi, You can use InvokeHttp along with the Elasticsearch Search API [1] to get all the documents for a specific index and type. If you are including the _source in your query then I think you get the contents back too. Otherwise if you are getting just the IDs back, then you can use SplitJson to

Re: How to extract mutiple json properties/fields into processor properties?

2016-05-11 Thread Matt Burgess
Are you using update attribute to fill HTTP header attributes? In any case, I think InvokeHttp will be a solution. Regards, Matt > On May 11, 2016, at 6:15 PM, Keith Lim wrote: > > Thanks Brian, that works. I have a follow up question. I want to use the > update attribute flowfile from this

Re: SplitJson configuration question

2016-05-11 Thread Matt Burgess
I believe $.* should work to split at the root. > On May 11, 2016, at 5:23 PM, Igor Kravzov wrote: > > Looks like I am missing something. How to configure SplitJson to split array > like below to individual JSON files. Basically split on "root" of array. > > [{ > "id":1, >"data":"data1"
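Splitting at the root of a JSON array amounts to emitting one document per element. The function below is a plain-Python sketch of that behavior with a hypothetical name; it is not the processor's implementation.

```python
import json


def split_root_array(content):
    """Return one JSON string per element of a top-level JSON array."""
    elements = json.loads(content)
    if not isinstance(elements, list):
        raise ValueError("expected a JSON array at the root")
    return [json.dumps(element) for element in elements]
```

Each returned string corresponds to the content of one outgoing flow file.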

Re: SelectHiveQL HiveConnectionPool issues

2016-05-09 Thread Matt Burgess
select to export it in Avro I get the following exception: > > [image: Inline images 1] > > I'm assuming this is happening because the underlying data on HDFS my hive > table is reading from is not Avro? its currently standard JSON. > > Thanks, > Mike > > >

Re: SelectHiveQL HiveConnectionPool issues

2016-05-09 Thread Matt Burgess
Your URL has a scheme of "mysql", try replacing with "hive2", and also maybe explicitly setting the port: jdbc:hive2://:1/default If that doesn't work, can you see if there is an error/stack trace in logs/nifi-app.log? Regards, Matt On Mon, May 9, 2016 at 12:04 PM, Mike Harding wrote: > Hi

Re: PutElasticsearch error

2016-05-06 Thread Matt Burgess
Pierre is correct Sent from my iPhone > On May 6, 2016, at 5:20 PM, Pierre Villard > wrote: > > Hi Igor, > > I believe ES processor uses port 9300 (transport port) and not 9200 port > (http port) > > Pierre. > > 2016-05-06 23:16 GMT+02:00 Igor Kravzov : >> I configured ES processor with ES

Re: Lua usage in ExecuteScript Processor

2016-05-04 Thread Matt Burgess
>> >> >>> I am trying to read the lua file this way, but its not working. How to >> >>> read the lua files from module directory and use it in execution? >> >>> >> >>> luajava.LuaState = luajava.LuaStateFactory.newLuaState() >> >

Re: Lua usage in ExecuteScript Processor

2016-05-04 Thread Matt Burgess
>> luajava.LuaState.LdoFile("common_log_format.lua"); >>> >>> >>> On Wed, Apr 20, 2016 at 4:29 PM, Madhukar Thota >>> wrote: >>>> >>>> Thanks Matt. This will be helpful to get started. I will definitely >>>> contribu

Re: ExecuteScript Processor Performance

2016-05-02 Thread Matt Burgess
Madhu, In addition to Joe's suggestions, currently ExecuteScript only allows for one task at a time, which is currently a pretty bad bottleneck if you are dealing with lots of throughput. However I have written up a Jira [1] for this and issued a PR [2] to fix it, feel free to try that out and/or

Re: Doing development on nifi

2016-04-28 Thread Matt Burgess
Stéphane, Welcome to NiFi, glad to have you aboard! May I ask what version you are using? I believe as of at least 0.6.0, you can view the items in a queued connection. So for your example, you can have a GetHttp into a SplitJson, but don't start the SplitJson, just the GetHttp. You will see any

Re: Is it possible to call a HIVE table from a ExecuteScript Processor?

2016-04-28 Thread Matt Burgess
to issue the PR for > this? > > > Cheers, > Mike > > On Tue, 26 Apr 2016 at 14:47, Matt Burgess wrote: >> >> Hive doesn't work with ExecuteSQL as its JDBC driver does not support >> all the JDBC API calls made by ExecuteSQL / PutSQL. However I am

Re: Nifi parsing examples

2016-04-27 Thread Matt Burgess
Sorry that was just for example #2 :) > On Apr 27, 2016, at 3:59 PM, Matt Burgess wrote: > > If you can represent the expected string format as a regular > expression, you can use the replaceAll() function [1] with > back-references: > > ${url:replaceAll('(http:

Re: Nifi parsing examples

2016-04-27 Thread Matt Burgess
If you can represent the expected string format as a regular expression, you can use the replaceAll() function [1] with back-references: ${url:replaceAll('(http://[a-zA-Z0-9]+:)[a-zA-Z0-9]+(@.*)','$1x$2')} original: http://username:p...@host.com after: http://username:xx...@host.com Note I h
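The same back-reference technique can be tried in Python before putting it into an expression. This sketch reuses the regular expression from the message; the function name and the default mask string are hypothetical, and Python's group syntax differs from the $1/$2 form used above.

```python
import re

# Group 1 keeps the scheme and username, group 2 keeps '@' and the rest.
CREDENTIAL_PATTERN = re.compile(r"(http://[a-zA-Z0-9]+:)[a-zA-Z0-9]+(@.*)")


def mask_password(url, mask="x"):
    """Replace the password portion of an http://user:password@host URL."""
    return CREDENTIAL_PATTERN.sub(
        lambda match: match.group(1) + mask + match.group(2), url
    )
```

URLs that do not match the pattern (for example ones with no credentials) are returned unchanged, which is also how a non-matching replaceAll behaves.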

Re: ReplaceText processor configuration help

2016-04-26 Thread Matt Burgess
for some alternatives like using Groovy for JSON-to-JSON > conversion. But not sure how StandardCharsets.UTF_8 will work with > multi-byte languages. > > > On Tue, Apr 26, 2016 at 12:11 PM, Matt Burgess wrote: >> >> Yes, I think you'll be better off with Aldr

Re: ReplaceText processor configuration help

2016-04-26 Thread Matt Burgess
Yes, I think you'll be better off with Aldrin's suggestion of ReplaceText. Then you can put the value of the attribute(s) directly into the content. For example, if you have two attributes "entities" and "users", and you want a JSON doc with those two objects inside, you can use ReplaceText with t

Re: Is it possible to call a HIVE table from a ExecuteScript Processor?

2016-04-26 Thread Matt Burgess
Hive doesn't work with ExecuteSQL as its JDBC driver does not support all the JDBC API calls made by ExecuteSQL / PutSQL. However I am working on a Hive NAR to include ExecuteHiveQL and PutHiveQL processors (https://issues.apache.org/jira/browse/NIFI-981), there is a prototype pull request on GitH

Re: Lua usage in ExecuteScript Processor

2016-04-20 Thread Matt Burgess
Madhu, I know very little about Lua, so I haven't tried making a Lua version of my JSON-to-JSON scripts/blogs (funnifi.blogspot.com), but here's something that works to get you started. The following Luaj script creates a flow file, writes to it, adds an attribute, then transfers it to success. Ho

Re: ExecuteSQL to elasticsearch

2016-04-07 Thread Matt Burgess
": "lab1", "CounterName": > "AvgDiskSecTransfer", "InstanceName": "C:", "MetricValue": > 9.60508652497083E-4}, > {"DateTime": "2016-04-07 17:22:00.0", "HostName": "lab1", "CounterName&q

Re: ExecuteSQL to elasticsearch

2016-04-07 Thread Matt Burgess
Can you provide a sample JSON output from your ConvertAvroToJson processor? It could help identify the location of any mapping/parser exceptions. Thanks, Matt On Thu, Apr 7, 2016 at 1:31 PM, Madhukar Thota wrote: > I am able to construct the dataflow with the following processors > > ExecuteSQL
