[ 
https://issues.apache.org/jira/browse/METRON-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984811#comment-15984811
 ] 

Otto Fowler commented on METRON-893:
------------------------------------

What if you could implement your cleaning in Stellar functions, which would be 
in libraries that were loaded as plugins and available to all your parsers?

my_field = ALI_CLEANMYFIELD(my_field)

The idea would be:

* Metron has an archetype for creating Stellar ‘libraries’
* You write your Stellar functions and the unit/integration tests for them, and 
maintain that project outside the Metron tree (as hopefully you will be able 
to do soon with parsers: METRON-777, METRON-258)
* You use the Metron management UI to install your Stellar libraries
* You call your Stellar functions from your parser configuration
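
To make the second step a bit more concrete, here is a rough sketch of the kind 
of cleaning logic such a library might hold, using the duplicate hostname/IP 
case from the issue as the example. The class and method names (FieldCleaner, 
cleanMyField) are made up for illustration, and the code deliberately has no 
Metron dependency. In a real Stellar library I would expect this logic to be 
wrapped in a class that implements Metron's StellarFunction interface and 
carries the @Stellar annotation, so that it becomes callable as something like 
ALI_CLEANMYFIELD.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/**
 * Hypothetical cleaning logic behind a function such as ALI_CLEANMYFIELD.
 * Plain Java only; the Stellar wrapper class is intentionally left out.
 */
final class FieldCleaner {

    private FieldCleaner() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Treats the input as a comma-separated list, trims and lower-cases each
     * entry, drops empty entries and duplicates, and keeps first-seen order.
     * Returns null when the input is null.
     */
    static String cleanMyField(Object raw) {
        if (raw == null) {
            return null;
        }
        Set<String> unique = new LinkedHashSet<>();
        for (String part : raw.toString().split(",")) {
            String value = part.trim().toLowerCase(Locale.ROOT);
            if (!value.isEmpty()) {
                unique.add(value);
            }
        }
        return String.join(",", unique);
    }
}
```

Keeping the logic in a plain static method like this means the unit tests can 
run without a topology, and the Stellar wrapper stays a thin adapter.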

> Adding normalization bolt for parsing topology
> ----------------------------------------------
>
>                 Key: METRON-893
>                 URL: https://issues.apache.org/jira/browse/METRON-893
>             Project: Metron
>          Issue Type: New Feature
>            Reporter: Ali Nazemian
>            Priority: Minor
>
> We are facing certain use cases in Metron production that are related to 
> noisy streams: for example, a wrong timestamp, a duplicate hostname/IP 
> address, etc. To deal with the normalization, we have added an additional 
> step to the corresponding parsers to do the data cleaning. Clearly, parsing 
> is a standard step that mostly depends on the device generating the data and 
> can be reused for the same type of device everywhere, but normalization is 
> very production dependent, and there is no point in mixing normalization 
> with parsing. It would be nice to have a separate bolt in the parsing 
> topology dedicated to the production-related cleaning process. In that 
> case, everybody can easily contribute additional parsers to the Metron 
> community without worrying about mixing parsers and the data cleaning 
> process.



