[
https://issues.apache.org/jira/browse/METRON-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984811#comment-15984811
]
Otto Fowler commented on METRON-893:
------------------------------------
What if you could implement your cleaning in Stellar functions, which would be
in libraries that were loaded as plugins and available to all your parsers?
my_field = ALI_CLEANMYFIELD(my_field)
The idea would be:
* Metron has an archetype for creating Stellar ‘libraries’
* You write your Stellar functions and the unit/integration tests for them, and
maintain that project outside the Metron tree (as hopefully you will soon be
able to do with parsers: METRON-777, METRON-258)
* You use the Metron management UI to install your Stellar libraries
* You call your Stellar functions from your parser configuration
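As a rough illustration of the kind of logic such a library might hold, here is a
minimal sketch of the cleaning step behind ALI_CLEANMYFIELD in plain Java. The
class name AliCleaning, the method cleanMyField, and the cleaning rule itself
(trim, lower-case, and de-duplicate comma-separated hostnames/IPs) are all
assumptions made for the example. Registering the method as a Stellar function
is left out on purpose, since that depends on the Metron version in use.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of Metron.
public class AliCleaning {

    // Assumed rule: split on commas, trim, lower-case, drop empty entries
    // and duplicates, and keep the first-seen order.
    public static String cleanMyField(Object value) {
        if (value == null) {
            return null; // nothing to clean
        }
        Set<String> unique = new LinkedHashSet<>();
        for (String part : value.toString().split(",")) {
            String cleaned = part.trim().toLowerCase(Locale.ROOT);
            if (!cleaned.isEmpty()) {
                unique.add(cleaned);
            }
        }
        return String.join(",", unique);
    }

    public static void main(String[] args) {
        // Duplicate hostname with inconsistent case and spacing
        System.out.println(cleanMyField(" Host-A, host-a ,10.0.0.1"));
    }
}
```

Keeping the logic in a plain static method like this means it can be unit tested
on its own, outside the Metron tree, before being wrapped as a Stellar function.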
> Adding normalization bolt for parsing topology
> ----------------------------------------------
>
> Key: METRON-893
> URL: https://issues.apache.org/jira/browse/METRON-893
> Project: Metron
> Issue Type: New Feature
> Reporter: Ali Nazemian
> Priority: Minor
>
> We are facing certain use cases in Metron production that are related to
> noisy streams, for example a wrong timestamp, a duplicate hostname/IP
> address, etc. To deal with the normalization, we have added an
> additional step to the corresponding parsers to do the data cleaning.
> Clearly, parsing is a standard step that mostly depends on the device
> that is generating the data and can be reused for the same type of device
> everywhere, but normalization is very production dependent and there is no
> point in mixing normalization with parsing. It would be nice to have a
> separate bolt in the parsing topology dedicated to the production-related
> cleaning process. In that case, everybody can easily contribute additional
> parsers to the Metron community without worrying about mixing parsers
> with the data cleaning process.