Re: Help with replace method

2016-04-26 Thread Joe Percivall
Hello Igor,

I got your template working by using the below replacement string and changing 
the "Replacement Strategy" to "Always Replace". I've attached a template that 
works for me.

{"test":"${teststr:replaceAll('"','\\\\"')}"}


The backslashes are a bit tricky because they escape other characters and are
also used to escape themselves. So when you try to use them literally, you can
end up needing to repeat them multiple times (in this case, four).
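To see the same doubling outside NiFi, here is a plain-Java sketch (not NiFi code — NiFi's Expression Language adds its own quoting on top of this) using Java's regex-based replaceAll, where both the string literal and the regex replacement syntax each consume one level of backslashes:

```java
public class EscapeDemo {
    public static void main(String[] args) {
        String input = "Here \"we\" go";
        // Four backslashes in the source become two at runtime ("\\"),
        // and the regex replacement engine collapses those two into one,
        // so each " in the input is replaced by \" in the output.
        String out = input.replaceAll("\"", "\\\\\"");
        System.out.println(out); // prints: Here \"we\" go
    }
}
```

Each level of interpretation (source literal, then regex replacement) strips one layer, which is exactly why four backslashes are needed to end up with one.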

Hope this helps,
Joe
- - - - - - 
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com



On Tuesday, April 26, 2016 6:10 PM, Igor Kravzov  wrote:



Attached please find the test template. NiFi 0.6.1.
I am trying to replace " with \" in a text, so "Here "we" go" should become
\"Here \"we\" go\".


The call is in ReplaceText processor: {"test":"${teststr:replace('"','\\"')}"}
teststr is created in UpdateAttribute.


For some reason I am unable to make it work. What could be wrong?

Thanks in advance.

[Attachment: "Fixed_Replace_Test" template (NiFi flow XML omitted)]

Help with replace method

2016-04-26 Thread Igor Kravzov
Attached please find the test template. NiFi 0.6.1
I am trying to replace " with \" in a text, so "Here "we" go" should
become \"Here \"we\" go\".


The call is in ReplaceText
processor: {"test":"${teststr:replace('"','\\"')}"}
teststr is created in UpdateAttribute.

For some reason I am unable to make it work. What could be wrong?

Thanks in advance.
[Attachment: "Replace Test" template (NiFi flow XML omitted)]

Re: Nifi + opentsdb

2016-04-26 Thread Madhukar Thota
Thanks, guys, for the input. I will start with InvokeHTTP for now, but I
would like to write a processor for OpenTSDB and will contribute it back to
the community.

On Tue, Apr 26, 2016 at 1:04 AM, karthi keyan 
wrote:

> Madhu,
>
> As Joe said, OpenTSDB has REST support, so you can use InvokeHTTP. Or, if
> you are thinking of creating a custom processor, give the Telnet API that
> OpenTSDB also supports a try.
>
> Just push the metrics over that Telnet API.
>
> Best,
> Karthik
>
> On Tue, Apr 26, 2016 at 8:23 AM, Joe Percivall 
> wrote:
>
>> From a quick look at the documentation, it looks like OpenTSDB has an HTTP
>> API [1] you could use to POST/GET. So one option may be to use the
>> InvokeHttp [2] processor to issue GET/POST requests against that API.
>>
>> If you need help configuring a flow to properly set headers or content to
>> GET/POST just let us know.
>>
>> [1] http://opentsdb.net/docs/build/html/api_http/index.html
>> [2]
>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.InvokeHTTP/index.html
>>
>>
>> Joe
>> - - - - - -
>> Joseph Percivall
>> linkedin.com/in/Percivall
>> e: joeperciv...@yahoo.com
>>
>>
>>
>>
>> On Monday, April 25, 2016 10:46 PM, Joe Witt  wrote:
>> Madhu,
>>
>> I'm not aware of anyone doing so but as always we'd be happy to help
>> it be brought in as a contrib.
>>
>> Thanks
>> Joe
>>
>>
>> On Mon, Apr 25, 2016 at 7:50 PM, Madhukar Thota
>>  wrote:
>> > Friends,
>> >
>> > Just checking to see if anyone in the community is using NiFi or a custom
>> > NiFi processor to write data into OpenTSDB? Any input is appreciated.
>> >
>> > -Madhu
>>
>
>


Re: ReplaceText processor configuration help

2016-04-26 Thread Matt Burgess
You can certainly use ExecuteScript with Groovy for JSON-to-JSON
conversion; I have a blog post about that here:
http://funnifi.blogspot.com/2016/02/executescript-json-to-json-conversion.html
I am using UTF-8 in that example but you can use whatever you like,
or even let the user choose the value for a dynamic property like
"scriptCharset" or something, and use that value (scriptCharset.value)
in your Groovy script.

ExecuteScript can be very slow for high data loads as it executes 1
task at a time. I am looking into improving that, but if you need more
performance, you can use InvokeScriptedProcessor to do the same thing
(I can outline the differences if you like).

Regards,
Matt

On Tue, Apr 26, 2016 at 1:05 PM, Igor Kravzov  wrote:
> Hi Matt,
>
> You described an interesting process. I will think about it.
> Initially I wanted to grab just some properties, like "entities" and "text",
> of the original JSON and create a new one.
> ReplaceText works fine as long as the "text" value does not have quotes inside
> the text. Once it has quotes and goes through the replace processor, the JSON
> becomes broken.
>
> Also looking for some alternatives like using Groovy for JSON-to-JSON
> conversion. But I'm not sure how StandardCharsets.UTF_8 will work with
> multi-byte languages.
>
>
> On Tue, Apr 26, 2016 at 12:11 PM, Matt Burgess  wrote:
>>
>> Yes, I think you'll be better off with Aldrin's suggestion of
>> ReplaceText. Then you can put the value of the attribute(s) directly
>> into the content.  For example, if you have two attributes "entities"
>> and "users", and you want a JSON doc with those two objects inside,
>> you can use ReplaceText with the following for replacement:
>>
>> {"entities": ${entities}, "users": ${users}}
>>
>> Note this "manually" transforms the JSON. Before we get the
>> TransformJSON processor, this is a decent workaround if you know what
>> the resulting JSON document should look like (and if you have
>> attributes containing the desired values).
>>
>> If you're doing this to insert into Elasticsearch, you might want to
>> handle entities and users separately and have "types" in ES for
>> "entities" and "users". In that case you could use EvaluateJsonPath to
>> get both attributes out, then wire the "success" relationship to two
>> different ReplaceTexts, one to store the entities and one for users.
>> Then you could add an attribute called "es.type" (for example), set to
>> "entities" and "users" respectively. Then you can send both forks to a
>> PutElasticsearch, setting the Type property to "${es.type}". That will
>> put the entities documents into the entities type and the same for
>> users. This will help with indexing versus one huge document.
>>
>> This process can be broken down into individual entities and users, if
>> you'd like a separate ES document for each. In that case you'd likely
>> need a SplitJson after the ReplaceText, pointing at the array of
>> entity/user objects. Then you'll get a flow file per entity/user,
>> meaning you'll get a separate ES doc for each entity and user,
>> stored/indexed/categorized by its type.
>>
>> Does this help solve your use case? If not please let me know, I'm
>> happy to help work through this :)
>>
>> Regards,
>> Matt
>>
>> On Tue, Apr 26, 2016 at 11:51 AM, Igor Kravzov 
>> wrote:
>> > I see.
>> > But I think I found the problem. It's AttributesToJson that escapes the
>> > result.
>> >
>> > On Apr 26, 2016 11:46 AM, "McDermott, Chris Kevin (MSDU -
>> > STaTS/StorefrontRemote)"  wrote:
>> >>
>> >> Hi Igor,
>> >>
>> >> jsonPath will return JSON as an unescaped String.
>> >>
>> >> Chris
>> >>
>> >> From: Igor Kravzov
>> >> >
>> >> Reply-To: "users@nifi.apache.org"
>> >> >
>> >> Date: Monday, April 25, 2016 at 2:27 PM
>> >> To: "users@nifi.apache.org"
>> >> >
>> >> Subject: Re: ReplaceText processor configuration help
>> >>
>> >> Hi Chris,
>> >>
>> >> How will it help in my situation?
>> >>
>> >> On Mon, Apr 25, 2016 at 1:50 PM, McDermott, Chris Kevin (MSDU -
>> >> STaTS/StorefrontRemote)
>> >> > wrote:
>> >> Igor,
>> >>
>> >> I think the jsonPath extension to the EL is going to be the ticket [1].
>> >> A
>> >> patch is available if you are willing to build NiFi yourself to test it
>> >> out.
>> >>
>> >> Cheers,
>> >> Chris
>> >>
>> >> [1] https://issues.apache.org/jira/browse/NIFI-1660
>> >>
>> >>
>> >> From: Igor Kravzov
>> >>
>> >> >>
>> >> Reply-To:
>> >>
>> >> "users@nifi.apache.org>"
>> 

Re: ReplaceText processor configuration help

2016-04-26 Thread Igor Kravzov
Hi Matt,

You described an interesting process. I will think about it.
Initially I wanted to grab just some properties, like "entities" and
"text", of the original JSON and create a new one.
ReplaceText works fine as long as the "text" value does not have quotes inside
the text. Once it has quotes and goes through the replace processor, the JSON
becomes broken.

Also looking for some alternatives like using Groovy for JSON-to-JSON
conversion. But I'm not sure how StandardCharsets.UTF_8 will work with
multi-byte languages.


On Tue, Apr 26, 2016 at 12:11 PM, Matt Burgess  wrote:

> Yes, I think you'll be better off with Aldrin's suggestion of
> ReplaceText. Then you can put the value of the attribute(s) directly
> into the content.  For example, if you have two attributes "entities"
> and "users", and you want a JSON doc with those two objects inside,
> you can use ReplaceText with the following for replacement:
>
> {"entities": ${entities}, "users": ${users}}
>
> Note this "manually" transforms the JSON. Before we get the
> TransformJSON processor, this is a decent workaround if you know what
> the resulting JSON document should look like (and if you have
> attributes containing the desired values).
>
> If you're doing this to insert into Elasticsearch, you might want to
> handle entities and users separately and have "types" in ES for
> "entities" and "users". In that case you could use EvaluateJsonPath to
> get both attributes out, then wire the "success" relationship to two
> different ReplaceTexts, one to store the entities and one for users.
> Then you could add an attribute called "es.type" (for example), set to
> "entities" and "users" respectively. Then you can send both forks to a
> PutElasticsearch, setting the Type property to "${es.type}". That will
> put the entities documents into the entities type and the same for
> users. This will help with indexing versus one huge document.
>
> This process can be broken down into individual entities and users, if
> you'd like a separate ES document for each. In that case you'd likely
> need a SplitJson after the ReplaceText, pointing at the array of
> entity/user objects. Then you'll get a flow file per entity/user,
> meaning you'll get a separate ES doc for each entity and user,
> stored/indexed/categorized by its type.
>
> Does this help solve your use case? If not please let me know, I'm
> happy to help work through this :)
>
> Regards,
> Matt
>
> On Tue, Apr 26, 2016 at 11:51 AM, Igor Kravzov 
> wrote:
> > I see.
> > But I think I found the problem. It's AttributesToJson that escapes the
> > result.
> >
> > On Apr 26, 2016 11:46 AM, "McDermott, Chris Kevin (MSDU -
> > STaTS/StorefrontRemote)"  wrote:
> >>
> >> Hi Igor,
> >>
> >> jsonPath will return JSON as an unescaped String.
> >>
> >> Chris
> >>
> >> From: Igor Kravzov >
> >> Reply-To: "users@nifi.apache.org"
> >> >
> >> Date: Monday, April 25, 2016 at 2:27 PM
> >> To: "users@nifi.apache.org"
> >> >
> >> Subject: Re: ReplaceText processor configuration help
> >>
> >> Hi Chris,
> >>
> >> How will it help in my situation?
> >>
> >> On Mon, Apr 25, 2016 at 1:50 PM, McDermott, Chris Kevin (MSDU -
> >> STaTS/StorefrontRemote)
> >> > wrote:
> >> Igor,
> >>
> >> I think the jsonPath extension to the EL is going to be the ticket
> [1].  A
> >> patch is available if you are willing to build NiFi yourself to test it
> out.
> >>
> >> Cheers,
> >> Chris
> >>
> >> [1] https://issues.apache.org/jira/browse/NIFI-1660
> >>
> >>
> >> From: Igor Kravzov
> >>  igork.ine...@gmail.com>>
> >> Reply-To:
> >> "users@nifi.apache.org users@nifi.apache.org>"
> >>  users@nifi.apache.org>>
> >> Date: Monday, April 25, 2016 at 11:45 AM
> >> To:
> >> "users@nifi.apache.org users@nifi.apache.org>"
> >>  users@nifi.apache.org>>
> >> Subject: Re: ReplaceText processor configuration help
> >>
> >> Aldrin,
> >>
> >> The overall goal is to extract some subset of attributes from tweet's
> >> JSON, create a new JSON and ingest it into Elasticsearch for indexing.
> >> Hope this helps.
> >>
> >> On Mon, Apr 25, 2016 at 11:18 AM, Aldrin Piri
> >>  aldrinp...@gmail.com>>
> >> wrote:
> >> Igor,
> >>
> >> Thanks for the template.  It looks like the trouble is with
> >> AttributesToJSON 

Re: ReplaceText processor configuration help

2016-04-26 Thread Matt Burgess
Yes, I think you'll be better off with Aldrin's suggestion of
ReplaceText. Then you can put the value of the attribute(s) directly
into the content.  For example, if you have two attributes "entities"
and "users", and you want a JSON doc with those two objects inside,
you can use ReplaceText with the following for replacement:

{"entities": ${entities}, "users": ${users}}

Note this "manually" transforms the JSON. Before we get the
TransformJSON processor, this is a decent workaround if you know what
the resulting JSON document should look like (and if you have
attributes containing the desired values).
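As a plain-Java sketch of what that substitution amounts to (attribute names and sample values here are invented for illustration): because the attribute values are already JSON, concatenating them into the template unescaped yields a valid nested document.

```java
public class JsonAssembleDemo {
    public static void main(String[] args) {
        // Hypothetical attribute values extracted earlier (e.g. by EvaluateJsonPath).
        String entities = "{\"hashtags\": [\"nifi\"]}";
        String users = "{\"name\": \"igor\"}";
        // Equivalent of the ReplaceText replacement value
        // {"entities": ${entities}, "users": ${users}} -- no extra escaping applied.
        String content = "{\"entities\": " + entities + ", \"users\": " + users + "}";
        // prints: {"entities": {"hashtags": ["nifi"]}, "users": {"name": "igor"}}
        System.out.println(content);
    }
}
```

This is the key difference from AttributesToJSON, which would treat each attribute value as a plain string and escape the inner quotes.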

If you're doing this to insert into Elasticsearch, you might want to
handle entities and users separately and have "types" in ES for
"entities" and "users". In that case you could use EvaluateJsonPath to
get both attributes out, then wire the "success" relationship to two
different ReplaceTexts, one to store the entities and one for users.
Then you could add an attribute called "es.type" (for example), set to
"entities" and "users" respectively. Then you can send both forks to a
PutElasticsearch, setting the Type property to "${es.type}". That will
put the entities documents into the entities type and the same for
users. This will help with indexing versus one huge document.

This process can be broken down into individual entities and users, if
you'd like a separate ES document for each. In that case you'd likely
need a SplitJson after the ReplaceText, pointing at the array of
entity/user objects. Then you'll get a flow file per entity/user,
meaning you'll get a separate ES doc for each entity and user,
stored/indexed/categorized by its type.

Does this help solve your use case? If not please let me know, I'm
happy to help work through this :)

Regards,
Matt

On Tue, Apr 26, 2016 at 11:51 AM, Igor Kravzov  wrote:
> I see.
> But I think I found the problem. It's AttributesToJson that escapes the result.
>
> On Apr 26, 2016 11:46 AM, "McDermott, Chris Kevin (MSDU -
> STaTS/StorefrontRemote)"  wrote:
>>
>> Hi Igor,
>>
>> jsonPath will return JSON as an unescaped String.
>>
>> Chris
>>
>> From: Igor Kravzov >
>> Reply-To: "users@nifi.apache.org"
>> >
>> Date: Monday, April 25, 2016 at 2:27 PM
>> To: "users@nifi.apache.org"
>> >
>> Subject: Re: ReplaceText processor configuration help
>>
>> Hi Chris,
>>
>> How will it help in my situation?
>>
>> On Mon, Apr 25, 2016 at 1:50 PM, McDermott, Chris Kevin (MSDU -
>> STaTS/StorefrontRemote)
>> > wrote:
>> Igor,
>>
>> I think the jsonPath extension to the EL is going to be the ticket [1].  A
>> patch is available if you are willing to build NiFi yourself to test it out.
>>
>> Cheers,
>> Chris
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-1660
>>
>>
>> From: Igor Kravzov
>> >>
>> Reply-To:
>> "users@nifi.apache.org>"
>> >>
>> Date: Monday, April 25, 2016 at 11:45 AM
>> To:
>> "users@nifi.apache.org>"
>> >>
>> Subject: Re: ReplaceText processor configuration help
>>
>> Aldrin,
>>
>> The overall goal is to extract some subset of attributes from tweet's
>> JSON, create a new JSON and ingest it into Elasticsearch for indexing.
>> Hope this helps.
>>
>> On Mon, Apr 25, 2016 at 11:18 AM, Aldrin Piri
>> >>
>> wrote:
>> Igor,
>>
>> Thanks for the template.  It looks like the trouble is with
>> AttributesToJSON converting the attribute, which in your case, is a JSON
>> blob, into additional JSON and thus the escaping to ensure nothing is lost.
>> Are you just trying to get that entity body out to a file?  If so, the
>> AttributesToJSON is likely not needed and you should be able to use
>> something like ReplaceText to write the attribute to the FlowFile body.
>> Please let us know your overall goal and we can see if the right mix of
>> components already exists or if we are running into a path that may need
>> some additional functionality.
>>
>> Thanks!
>> Aldrin
>>
>>
>>
>> On Mon, Apr 25, 2016 at 10:33 AM, Igor Kravzov
>> >>
>> 

Re: ReplaceText processor configuration help

2016-04-26 Thread McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)
Hi Igor,

jsonPath will return JSON as an unescaped String.

Chris

From: Igor Kravzov >
Reply-To: "users@nifi.apache.org" 
>
Date: Monday, April 25, 2016 at 2:27 PM
To: "users@nifi.apache.org" 
>
Subject: Re: ReplaceText processor configuration help

Hi Chris,

How will it help in my situation?

On Mon, Apr 25, 2016 at 1:50 PM, McDermott, Chris Kevin (MSDU - 
STaTS/StorefrontRemote) 
> wrote:
Igor,

I think the jsonPath extension to the EL is going to be the ticket [1].  A 
patch is available if you are willing to build NiFi yourself to test it out.

Cheers,
Chris

[1] https://issues.apache.org/jira/browse/NIFI-1660


From: Igor Kravzov 
>>
Reply-To: 
"users@nifi.apache.org>"
 
>>
Date: Monday, April 25, 2016 at 11:45 AM
To: 
"users@nifi.apache.org>"
 
>>
Subject: Re: ReplaceText processor configuration help

Aldrin,

The overall goal is to extract some subset of attributes from tweet's JSON, 
create a new JSON and ingest it into Elasticsearch for indexing.
Hope this helps.

On Mon, Apr 25, 2016 at 11:18 AM, Aldrin Piri 
>>
 wrote:
Igor,

Thanks for the template.  It looks like the trouble is with AttributesToJSON 
converting the attribute, which in your case, is a JSON blob, into additional 
JSON and thus the escaping to ensure nothing is lost.  Are you just trying to 
get that entity body out to a file?  If so, the AttributesToJSON is likely not 
needed and you should be able to use something like ReplaceText to write the 
attribute to the FlowFile body.  Please let us know your overall goal and we 
can see if the right mix of components already exists or if we are running into 
a path that may need some additional functionality.

Thanks!
Aldrin



On Mon, Apr 25, 2016 at 10:33 AM, Igor Kravzov 
>>
 wrote:
Hi Aldrin,


Attached please find the template. In this workflow I want to pull the "entities"
and "user" entries from the Twitter JSON as an entire structure. I can only do it
if I set Return Type to JSON.
Subsequently I use AttributesToJSON to create a new JSON file. But the returned
values for "entities" and "user" are escaped, so I had to clean these before
converting to JSON.

Hope this helps.

On Mon, Apr 25, 2016 at 10:15 AM, Aldrin Piri 
>>
 wrote:
Hi Igor,

That should certainly be possible.  Would you mind opening up a ticket 
(https://issues.apache.org/jira/browse/NIFI) and providing a template of your 
flow that is causing the issue?

Thanks!

On Mon, Apr 25, 2016 at 10:09 AM, Igor Kravzov 
>>
 wrote:
Thanks Pierre. It worked. Looks like I was doing something wrong inside my
workflow.
Wouldn't it be feasible for the EvaluateJsonPath processor to have an option to
return an escaped or unescaped JSON result?

On Mon, Apr 25, 2016 at 7:20 AM, Pierre Villard 
>>
 wrote:
Hi Igor,

Please use ReplaceText processors.

1.
Search value : \\
Replace value : Empty string set

2.
Search value : "\{
Replace value : \{

3.
Search value : \}"
Replace value : \}

Template example attached.

HTH
Pierre
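The three cleanup steps can be checked outside NiFi. This Java sketch (sample input invented; it uses literal String.replace rather than the regex form the processor applies) performs the same replacements in order:

```java
public class UnescapeDemo {
    public static void main(String[] args) {
        // An escaped JSON value, as AttributesToJSON might emit it.
        String s = "{\"entities\": \"{\\\"tag\\\": 1}\"}";
        s = s.replace("\\", "");   // 1. remove backslashes
        s = s.replace("\"{", "{"); // 2. "{  ->  {
        s = s.replace("}\"", "}"); // 3. }"  ->  }
        System.out.println(s);     // prints: {"entities": {"tag": 1}}
    }
}
```

After the three passes the inner value is a nested JSON object again rather than an escaped string.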


2016-04-24 20:12 GMT+02:00 Igor Kravzov 
>>:

I am not that good at regex. What would be the proper configuration to do the
following:

  1.  Remove backslash from text.
  2.  Replace "{ with {
  3.  replace }" with }

Basically I need to clean escaped JSON.

Like before:

 

Re: Is it possible to call a HIVE table from a ExecuteScript Processor?

2016-04-26 Thread Matt Burgess
Hive doesn't work with ExecuteSQL as its JDBC driver does not support
all the JDBC API calls made by ExecuteSQL / PutSQL.  However I am
working on a Hive NAR to include ExecuteHiveQL and PutHiveQL
processors (https://issues.apache.org/jira/browse/NIFI-981), there is
a prototype pull request on GitHub
(https://github.com/apache/nifi/pull/372) if you'd like to try them
out. I am currently adding support for Kerberos and finishing up, then
will issue a new PR for the processors.

To use ExecuteScript in the meantime, you've got a couple of options
after downloading the driver and all its dependencies (or better yet,
the single "fat JAR"):

1) Add the location of the JAR(s) to the Module Directory property of
the ExecuteScript dialog. You will have to create your own connection,
if you're using Groovy then its Sql facility is quite nice
(http://www.schibsted.pl/2015/06/groovy-sql-an-easy-way-to-database-scripting/)

2) Create a Database Connection Pool configured to point at the JAR(s)
and use the Hive driver (org.apache.hive.jdbc.HiveDriver). Then you
can get a connection from there and continue on with Groovy SQL (for
example). I have a blog post about this:
http://funnifi.blogspot.com/2016/04/sql-in-nifi-with-executescript.html

Regards,
Matt

On Tue, Apr 26, 2016 at 8:07 AM, Pierre Villard
 wrote:
> Hi Mike,
>
> I never tried but using the JDBC client you should be able to query your
> Hive table using ExecuteSQL processor.
>
> Hope that helps,
> Pierre
>
>
> 2016-04-26 13:53 GMT+02:00 Mike Harding :
>>
>> Hi All,
>>
>> I have a requirement to access a lookup Hive table to translate a code
>> number in a FlowFile to a readable name. I'm just unsure how trivial it is
>> to connect to the db from an ExecuteScript processor?
>>
>> Nifi and the hiveserver2 sit on the same node so I'm wondering if its
>> possible to use HiveServer2's JDBC client
>> (https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC)
>> without any issues?
>>
>> Thanks in advance,
>> Mike
>
>


Re: Is it possible to call a HIVE table from a ExecuteScript Processor?

2016-04-26 Thread Pierre Villard
Hi Mike,

I never tried but using the JDBC client you should be able to query your
Hive table using ExecuteSQL processor.

Hope that helps,
Pierre


2016-04-26 13:53 GMT+02:00 Mike Harding :

> Hi All,
>
> I have a requirement to access a lookup Hive table to translate a code
> number in a FlowFile to a readable name. I'm just unsure how trivial it is
> to connect to the db from an ExecuteScript processor?
>
> Nifi and the hiveserver2 sit on the same node so I'm wondering if its
> possible to use HiveServer2's JDBC client (
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC)
> without any issues?
>
> Thanks in advance,
> Mike
>


Is it possible to call a HIVE table from a ExecuteScript Processor?

2016-04-26 Thread Mike Harding
Hi All,

I have a requirement to access a lookup Hive table to translate a code
number in a FlowFile to a readable name. I'm just unsure how trivial it is
to connect to the db from an ExecuteScript processor?

Nifi and the hiveserver2 sit on the same node so I'm wondering if its
possible to use HiveServer2's JDBC client (
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC)
without any issues?

Thanks in advance,
Mike


RE: AvroRuntimeException : Duplicate field name

2016-04-26 Thread Panos Geo

Hello Toivo, all,


Thank you very much for your feedback.


I have tried this with the following JDBC drivers : mariadb-java-client v1.2.3, 
v.1.3.2, v.1.3.6, v.1.4.2 (latest) and also with 
mysql-connector-java-5.1.38-bin and they all exhibit the same behaviour.


Essentially, the exception is thrown because of the combination of the
following:

1. Aliases are not being used in the SQL query results with the above JDBC drivers.
2. We have SQL queries that use tables that contain the same field names.

A friend helped me find a solution to this, and it is essentially to use 
"useOldAliasMetadataBehavior=true” in the connection string to the database. So 
it should be something like : 
jdbc:mariadb://localhost:3306/db?useOldAliasMetadataBehavior=true


It appears that the implementation of aliases changed in the latest
JDBC drivers, so you need the above flag in order for this to work.


I hope this helps other people as well.


Best regards,
Panos

Date: Mon, 25 Apr 2016 17:37:36 +0300
Subject: Re: AvroRuntimeException : Duplicate field name
From: toivo.ad...@gmail.com
To: users@nifi.apache.org


Hi Panos,



NiFi includes a JUnit test:
/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/util/TestJdbcHugeStream.java

This test also uses aliases:

    final ResultSet resultSet = st.executeQuery("select "
        + " PER.ID as PersonId, PER.NAME as PersonName, PER.CODE as PersonCode"
        + ", PRD.ID as ProductId, PRD.NAME as ProductName, PRD.CODE as ProductCode"
        + ", REL.ID as RelId, REL.NAME as RelName, REL.CODE as RelCode"
        + ", ROW_NUMBER() OVER () as rownr "
        + " from persons PER, products PRD, relationships REL");


When the Derby database is used, the test is successful.

Could you please run this test against MariaDB?
Maybe the MariaDB JDBC driver has some unexpected behavior?

Unfortunately I don't have a running MariaDB instance at the moment.
I also can't rule out a bug in NiFi.

Thanks
Toivo
2016-04-25 12:13 GMT+03:00 Panos Geo :



Hello Toivo, all,
I am not sure what else you would require from me to test this. In my view,
this appears to be a bug: no matter what I've tried, any SQL statement in
which fields from different tables have the same name forces the ExecuteSQL
processor to throw an "org.apache.avro.AvroRuntimeException: Duplicate field
name in record".

For example, this statement:
SELECT plant.name, area.area_id, area.name
FROM plant, area WHERE plant.plant_id=area.plant_id;

Aliases in the SQL statement are ignored by the ExecuteSQL processor. Direct
(i.e. not through NiFi) execution of the SQL statement in MariaDB works
correctly.

I would appreciate any help with this.

Many thanks in advance,
Panos



Date: Thu, 21 Apr 2016 19:36:22 +0300
Subject: Re: AvroRuntimeException : Duplicate field name
From: toivo.ad...@gmail.com
To: users@nifi.apache.org

It seems ExecuteSQL Junit test case must be created.
Then we can investigate problem.

Unfortunately I don't have other ideas (at least at the moment).

thanks
toivo
2016-04-21 19:26 GMT+03:00 Panos Geo :



No worries, I appreciate your help anyhow.
I am using MariaDB, but I get the below as a warning when I start NiFi, even
before triggering the processor to execute the SQL statement. When I do
trigger the processor to execute the SQL query, I see the
AvroRuntimeException as a full error...

Thanks,
Panos
Date: Thu, 21 Apr 2016 19:16:46 +0300
Subject: Re: AvroRuntimeException : Duplicate field name
From: toivo.ad...@gmail.com
To: users@nifi.apache.org

Sorry, I didn't read your email carefully enough.

Alias should work.
Which database you are using?

thanks
toivo
2016-04-21 19:05 GMT+03:00 Panos Geo :



Hello Toivo,
Many thanks for your reply! As I indicated in my initial email, using
aliases doesn't make any difference. It appears as if they are ignored, and
I am getting the same error.

Any other thoughts?

Many thanks,
Panos

Date: Thu, 21 Apr 2016 18:59:33 +0300
Subject: Re: AvroRuntimeException : Duplicate field name
From: toivo.ad...@gmail.com
To: users@nifi.apache.org

Hi,

Field names should be unique.
Currently, after executing the query, both 'plant.name' and 'area.name'
become just the same 'name'.

You can use alias to have unique name, like:
SELECT plant.name as pname, area.area_id, area.name as aname
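
This effect can be illustrated outside NiFi. The sketch below uses Python's
sqlite3 module purely as a stand-in for the real MariaDB/JDBC setup (an
assumption for illustration only); the table names mirror the query in this
thread. The point carries over: the column labels the driver reports are
what a consumer such as ExecuteSQL would use as record field names, so
unaliased duplicate columns collide while aliased ones stay unique.

```python
import sqlite3

# In-memory stand-in for the plant/area tables from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plant (plant_id INTEGER, name TEXT);
    CREATE TABLE area  (area_id INTEGER, plant_id INTEGER, name TEXT);
    INSERT INTO plant VALUES (1, 'Plant A');
    INSERT INTO area  VALUES (10, 1, 'Area X');
""")

# Without aliases, both columns are reported under the same label 'name',
# which is exactly the kind of duplicate that breaks Avro record building.
cur = conn.execute(
    "SELECT plant.name, area.name "
    "FROM plant JOIN area ON plant.plant_id = area.plant_id")
print([d[0] for d in cur.description])  # ['name', 'name']

# With aliases, each column gets a unique label.
cur = conn.execute(
    "SELECT plant.name AS pname, area.name AS aname "
    "FROM plant JOIN area ON plant.plant_id = area.plant_id")
print([d[0] for d in cur.description])  # ['pname', 'aname']
```

Whether the aliases actually reach the consumer depends on the driver's
metadata behavior, which is why a driver-side setting can still reintroduce
the collision even when the SQL itself is correct.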

thanks
toivo

2016-04-21 18:51 GMT+03:00 Panos Geo :

Hello all,

I would appreciate your help with the Avro error that I am seeing. I am
executing the following very simple SQL SELECT statement using the
ExecuteSQL processor:

SELECT plant.name, area.area_id, area.name
FROM plant, area WHERE plant.plant_id=area.plant_id;

...and I get the following error message
“org.apache.avro.AvroRuntimeException: Duplicate field name in record
any.data.plant: name type:UNION