Re: Log4j/logback parser via syslog

2016-02-12 Thread Bryan Bende
I believe groovy, python, jython, jruby, ruby, javascript, and lua.

The associated JIRA is here:
https://issues.apache.org/jira/browse/NIFI-210

There are some cool blogs about them here:
http://funnifi.blogspot.com/2016/02/executescript-processor-hello-world.html

-Bryan

On Fri, Feb 12, 2016 at 10:48 AM, Madhukar Thota 
wrote:

> Thanks Bryan. I will look into the ExtractText processor.
>
> Do you know which scripting languages are supported by the new processors?
>
> -Madhu
>
> On Fri, Feb 12, 2016 at 9:27 AM, Bryan Bende  wrote:
>
>> Hello,
>>
>> Currently there are no built-in processors to parse log formats, but have
>> you taken a look at the ExtractText processor [1]?
>>
>> If you can come up with a regular expression for whatever you are trying
>> to extract, then you should be able to use ExtractText.
>>
>> Other options...
>>
>> You could write a custom processor, but this sounds like it might be
>> overkill for your scenario.
>> In the next release (hopefully out in a few days) there will be two new
>> processors that support scripting languages. It may be easier to use a
>> scripting language to manipulate/parse the text.
>>
>> Thanks,
>>
>> Bryan
>>
>> [1]
>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExtractText/index.html
>>
>>
>> On Fri, Feb 12, 2016 at 12:16 AM, Madhukar Thota <
>> madhukar.th...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> I am very new to Apache NiFi and just started learning about how to use
>>> it.
>>>
>>> We have a requirement to parse log4j/logback pattern messages coming
>>> from SyslogAppenders via syslog over UDP. I can read the standard syslog
>>> messages, but how can I further extract the log4j/logback messages from
>>> the syslog body?
>>>
>>> Are there any log parsers (log4j/logback/Apache access log format)
>>> available in Apache NiFi?
>>>
>>>
>>> Any help on this is much appreciated.
>>>
>>> Thanks in Advance.
>>>
>>>
>>
>
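
As a rough illustration of the regular-expression approach Bryan describes,
the pattern below pulls fields out of a syslog body. It assumes a
hypothetical log4j layout of "%d{ISO8601} %-5p [%t] %c - %m"; the layout, the
sample line, and the names LOG4J_LINE and parse_body are assumptions, not
something from this thread. It is written in Python only to make the pattern
easy to test. ExtractText uses Java regular expressions, so the pattern would
need adapting before use there (for example, the named-group syntax differs).

```python
import re

# Assumed layout: "%d{ISO8601} %-5p [%t] %c - %m" (hypothetical, adjust to
# whatever pattern your SyslogAppender is really configured with).
LOG4J_LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<logger>\S+)\s+-\s+"
    r"(?P<message>.*)"
)


def parse_body(body):
    """Return the log4j fields found in a syslog body, or None if no match."""
    match = LOG4J_LINE.search(body)
    return match.groupdict() if match else None


# Made-up sample: a syslog header followed by the log4j-formatted message.
sample = (
    "<190>Feb 12 10:48:03 apphost myapp: "
    "2016-02-12 10:48:03,123 ERROR [main] com.example.App - Connection refused"
)
fields = parse_body(sample)
print(fields)
```

Using search() rather than match() lets the pattern skip whatever syslog
header precedes the log4j text.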


Re: Log4j/logback parser via syslog

2016-02-12 Thread Madhukar Thota
Thanks Bryan. Looking forward to the release.



On Fri, Feb 12, 2016 at 10:55 AM, Bryan Bende  wrote:

> I believe groovy, python, jython, jruby, ruby, javascript, and lua.
>
> The associated JIRA is here:
> https://issues.apache.org/jira/browse/NIFI-210
>
> There are some cool blogs about them here:
>
> http://funnifi.blogspot.com/2016/02/executescript-processor-hello-world.html
>
> -Bryan
>
> On Fri, Feb 12, 2016 at 10:48 AM, Madhukar Thota wrote:
>
>> Thanks Bryan. I will look into the ExtractText processor.
>>
>> Do you know which scripting languages are supported by the new processors?
>>
>> -Madhu
>>
>> On Fri, Feb 12, 2016 at 9:27 AM, Bryan Bende  wrote:
>>
>>> Hello,
>>>
>>> Currently there are no built-in processors to parse log formats, but
>>> have you taken a look at the ExtractText processor [1]?
>>>
>>> If you can come up with a regular expression for whatever you are trying
>>> to extract, then you should be able to use ExtractText.
>>>
>>> Other options...
>>>
>>> You could write a custom processor, but this sounds like it might be
>>> overkill for your scenario.
>>> In the next release (hopefully out in a few days) there will be two new
>>> processors that support scripting languages. It may be easier to use a
>>> scripting language to manipulate/parse the text.
>>>
>>> Thanks,
>>>
>>> Bryan
>>>
>>> [1]
>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExtractText/index.html
>>>
>>>
>>> On Fri, Feb 12, 2016 at 12:16 AM, Madhukar Thota <
>>> madhukar.th...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> I am very new to Apache NiFi and just started learning about how to use
>>>> it.
>>>>
>>>> We have a requirement to parse log4j/logback pattern messages coming
>>>> from SyslogAppenders via syslog over UDP. I can read the standard syslog
>>>> messages, but how can I further extract the log4j/logback messages from
>>>> the syslog body?
>>>>
>>>> Are there any log parsers (log4j/logback/Apache access log format)
>>>> available in Apache NiFi?
>>>>
>>>>
>>>> Any help on this is much appreciated.
>>>>
>>>> Thanks in Advance.
>>>>
>>>>
>>>
>>
>


Re: concatenate attributes

2016-02-12 Thread Brandon DeVries
Chris,

Have you tried just:

${plugin}.${type_instance}

?  Are you sure (via LogAttribute or something) that the attributes you're
using are actually there?  What are you getting with the values you've
tried so far?

Brandon


On Fri, Feb 12, 2016 at 1:58 PM, Christopher Wilson 
wrote:

> I wanted to create an attribute by concatenating two attributes with a
> connector string ('.' or '_') using the UpdateAttribute processor.
>
> For example, I have some JSON-formatted data coming off CollectD that I'm
> trying to reshape for OpenTSDB.  I'm pulling out attributes like 'plugin'
> ("cpu") and 'type_instance' ("user") and I want to concatenate them into an
> OpenTSDB metric like "cpu.user".
>
> I've tried append in various configurations, ${plugin}"."${type_instance}
> as an entry, and a few others and have not gotten to the end yet.
>
> Any help appreciated, thanks.
>
> -Chris
>


Re: concatenate attributes

2016-02-12 Thread Lee Laim
Chris,

Additionally, have you tried using the 'append' function from the expression
language?

${plugin:append('.'):append(${type_instance})}


-Lee
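
To make the expected result concrete, here is a small Python model of what
either expression above should produce for Chris's example values. It is
not NiFi code; join_attributes and the attribute dictionary are hypothetical
names used only for illustration.

```python
def join_attributes(attributes, names, separator="."):
    """Join the named attribute values with `separator`.

    Raises KeyError when a name is absent, so a missing attribute is
    noticed instead of silently producing a malformed metric name.
    """
    return separator.join(attributes[name] for name in names)


# Example values taken from the question: plugin "cpu", type_instance "user".
flowfile_attributes = {"plugin": "cpu", "type_instance": "user"}
metric = join_attributes(flowfile_attributes, ["plugin", "type_instance"])
print(metric)
```

Failing loudly on a missing name mirrors Brandon's point: confirm the
attributes exist before debugging the expression itself.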

On Fri, Feb 12, 2016 at 11:58 AM, Christopher Wilson 
wrote:

> I wanted to create an attribute by concatenating two attributes with a
> connector string ('.' or '_') using the UpdateAttribute processor.
>
> For example, I have some JSON-formatted data coming off CollectD that I'm
> trying to reshape for OpenTSDB.  I'm pulling out attributes like 'plugin'
> ("cpu") and 'type_instance' ("user") and I want to concatenate them into an
> OpenTSDB metric like "cpu.user".
>
> I've tried append in various configurations, ${plugin}"."${type_instance}
> as an entry, and a few others and have not gotten to the end yet.
>
> Any help appreciated, thanks.
>
> -Chris
>


Re: Log4j/logback parser via syslog

2016-02-12 Thread Joe Percivall
Hello Madhu,


If you're looking for a template that shows how to create a dynamic property
for RouteOnAttribute to use, I'd suggest checking out this template [1]. It is
a simple template that checks whether an attribute matches 'NiFi'.

Also, provenance can be a very powerful debugging tool. If a flowfile gets
routed to a relationship you don't expect, simply check the provenance for the
destination of the relationship. You'll be able to see the exact attributes for
any recent flowfile that was routed there.

[1]
https://github.com/hortonworks-gallery/nifi-templates/blob/master/templates/simple-httpget-route-flow.xml

 
Hope that helps,
Joe

- - - - - - 
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com



On Friday, February 12, 2016 2:28 PM, Madhukar Thota  
wrote:



I am getting my log4j logs on facility value 23 (LOCAL7). How can I route only
facility 23 logs for further extraction?

I added a RouteOnAttribute processor and defined this property:
${facility:contains(23)}, but none of the messages are getting matched. I am
not sure my defined property is correct. How can I route messages based on a
field value to different processors?

-Madhu
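
For reference, a syslog PRI value encodes facility and severity as
facility * 8 + severity, so LOCAL7 (23) covers PRI values 184 through 191.
The sketch below decodes the PRI from a raw message and keeps only LOCAL7
traffic. The sample message and the function names are assumptions. How the
'facility' attribute is actually populated in the flow (a number or a name)
is not confirmed in this thread, so it is worth checking the attribute
values in provenance first.

```python
import re

# A syslog message starts with "<PRI>", where PRI = facility * 8 + severity.
PRI_PREFIX = re.compile(r"^<(\d{1,3})>")
LOCAL7 = 23


def facility_and_severity(message):
    """Return (facility, severity) decoded from the PRI, or None if absent."""
    match = PRI_PREFIX.match(message)
    if match is None:
        return None
    pri = int(match.group(1))
    return pri // 8, pri % 8


def is_local7(message):
    """True when the message carries a PRI whose facility is exactly 23."""
    decoded = facility_and_severity(message)
    return decoded is not None and decoded[0] == LOCAL7


# Made-up sample: PRI 190 = 23 * 8 + 6 (LOCAL7, informational).
sample = "<190>Feb 12 14:28:00 apphost myapp: INFO starting up"
print(facility_and_severity(sample), is_local7(sample))
```

An exact comparison on the decoded number avoids any ambiguity that a
substring test might introduce.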


Re: Thread Control of Processors

2016-02-12 Thread Jeff - Data Bean Australia
Thank you both, Joe and Simon, for pointing out InvokeHTTP and ControlRate
to me.

Regarding InvokeHTTP, I have a couple of questions for you.

Both InvokeHTTP and GetHTTP have settings such as "Concurrent Tasks" and
"Run schedule". If my use case is only about the GET method, why is
InvokeHTTP better? I noticed that InvokeHTTP inherits from AbstractProcessor,
while AbstractProcessor and GetHTTP share the same parent,
AbstractSessionFactoryProcessor. Can you explain what enhancement
InvokeHTTP gets by going one level further down the hierarchy?

ControlRate and InvokeHTTP are at the same level in the class hierarchy.
Simon pointed out that ControlRate and InvokeHTTP can work together for
more intuitive control. This looks wonderful. Both of you mentioned
backpressure. What is the difference regarding backpressure between using
InvokeHTTP alone and combining it with ControlRate?

Can ControlRate work with GetHTTP as well? If yes, what would be the
difference?

Thanks,
Jeff

On Fri, Feb 12, 2016 at 7:06 PM, Simon Ball  wrote:

> Jeff,
>
> Another approach I've used with some success is the ControlRate processor
> before the InvokeHttp, which gives you an intuitive way of limiting the
> number of requests in a specified time interval. Note however that this
> does need to be combined with back pressure control to prevent requests
> queuing behind the InvokeHttp. Feeding failure retries back into the
> ControlRate, or a funnel before the invoke also gives you a bit more
> control over groupings of retries for example.
>
> Simon
>
> Sent from my iPhone
>
> —
> Simon Elliston Ball
> Solutions Engineer - EMEA
> +44 7930 424111
> Hortonworks - We Do Hadoop
>
>
> On 12 Feb 2016, at 05:34, Joe Witt  wrote:
>
> Jeff,
>
> This is definitely a strong use case for nifi.
>
> It might be that InvokeHTTP is the better choice here.
>
> If what you'd like to do is effectively throttle the rate at which you
> hit the web service with the InvokeHttp calls you can schedule that
> processor to run as often as you like (for example every 100 ms).
> Then use backpressure settings on the queues feeding that InvokeHTTP
> processor.  Effectively you can control where data will back up in the
> flow while it is being throttled.
>
> If the lookup data is a good candidate for caching then there may be
> other great options to make this more efficient.
>
> Perhaps you can share a flow template of what you have so far and we
> can make recommendations on next steps?
>
> Thanks
> Joe
>
> On Fri, Feb 12, 2016 at 12:28 AM, Jeff - Data Bean Australia
>  wrote:
>
> Hi
>
> I got a use case like this:
>
> There is a file that contains thousands of items, each on one line. For each
> item, it will trigger one GetHTTP processor to fetch some data.
>
> Here is what I am trying to do:
>
> 1. Fetch this file
> 2. For each line, I generate one file using SplitText
> 3. Drive GetHTTP downstream.
>
> However, given there are more than 2000 lines, more than 2000 HTTP Get
> processes will be created and flooded into one web site, which doesn't sound
> like a good idea. So I would like to control the processors, so that only a
> couple of them will be running at the same time, and maybe delay for a
> couple of seconds after finish.
>
> How can I do that in NiFi?
>
> Thanks,
> Jeff
>
> --
> Data Bean - A Big Data Solution Provider in Australia.
>
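
The throttling idea discussed in this thread (a few concurrent requests, a
pause after each one, and a bounded queue so work backs up upstream) can be
sketched outside NiFi as follows. This is only a loose Python analogy, not
how NiFi implements scheduling or back pressure. run_throttled, its
parameters, and the stand-in fetch function are all hypothetical.

```python
import queue
import threading
import time


def run_throttled(items, fetch, workers=2, min_interval=0.01, max_queued=5):
    """Call fetch(item) for every item with at most `workers` calls in flight.

    The bounded queue models back pressure: put() blocks once `max_queued`
    items are waiting. Each worker pauses `min_interval` seconds per call.
    """
    pending = queue.Queue(maxsize=max_queued)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this worker
                return
            try:
                outcome = fetch(item)
            except Exception as exc:  # keep the worker alive, record failure
                outcome = exc
            with results_lock:
                results.append((item, outcome))
            time.sleep(min_interval)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for item in items:
        pending.put(item)  # blocks while the queue is full
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


# Stand-in for an HTTP GET; a real flow would call the web service here.
def fake_fetch(item):
    return item.upper()


fetched = run_throttled(["a", "b", "c", "d"], fake_fetch, min_interval=0.001)
print(sorted(fetched))
```

Lowering `workers` or raising `min_interval` reduces the load on the remote
site, while `max_queued` decides how much work may pile up in front of the
workers before the producer is made to wait.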


-- 
Data Bean - A Big Data Solution Provider in Australia.