Devs,

I am continuing to drive the migration of our logging pipeline to NiFi, and in the process I have identified some areas of log processing that could be improved by introducing new processors.
Would anyone oppose the idea of introducing the following processors?

1. ParseCEF (think of it like logstash-codec-cef)
A processor to parse the CEF format (https://www.protect724.hpe.com/docs/DOC-1072). CEF attributes would be converted into NiFi FlowFile attributes.

2. ParseKV (think of it like Splunk's kv parser)
A processor to split strings into keys and values (delimiter based); the resulting pairs would be added to the FlowFile attributes. The parser would support extracting multiple instances of the same key via attributes like parse.kv.key_name.0, parse.kv.key_name.1, etc.

3. QueryBulkWhoisAPI
This processor would read a batch of FlowFiles, extract the appropriate field (e.g. IP address), make a batch whois query, parse the results, and then append them to the individual FlowFiles.

This processor would complement QueryDNS (PR#496). QueryDNS only makes individual queries, which, depending on the API access conditions, may lead to blacklisting. Some providers will license access (e.g. Spamhaus RBLs), while others (e.g. Shadowserver) suggest using bulk queries instead.

Keen to hear your opinion.
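
P.S. To make the ParseKV idea a bit more concrete, here is a rough Python sketch of the parsing logic only (not NiFi code, and not a proposed implementation). The function name parse_kv, its parameters, the default delimiters, and the choice to index every key (even one that appears once) are all assumptions for illustration; only the parse.kv.key_name.N naming comes from the proposal above.

```python
from collections import defaultdict


def parse_kv(text, pair_delim=" ", kv_delim="=", prefix="parse.kv"):
    """Split `text` into key/value pairs and return a flat attribute dict.

    Hypothetical helper: the defaults and the always-indexed naming are
    assumptions, not settled behaviour of the proposed processor.
    """
    # Collect every value seen for each key, preserving order of appearance.
    values_by_key = defaultdict(list)
    for token in text.split(pair_delim):
        key, sep, value = token.partition(kv_delim)
        if not sep or not key:
            # Token has no key/value delimiter (or an empty key): skip it.
            continue
        values_by_key[key].append(value)

    # Flatten into attribute names such as parse.kv.src.0, parse.kv.src.1.
    attributes = {}
    for key, values in values_by_key.items():
        for index, value in enumerate(values):
            attributes[f"{prefix}.{key}.{index}"] = value
    return attributes


if __name__ == "__main__":
    sample = "src=10.0.0.1 dst=10.0.0.2 src=10.0.0.3"
    for name, value in parse_kv(sample).items():
        print(name, "=", value)
```

In a real processor the two delimiters would presumably be configurable properties, and quoting or escaping of delimiters inside values would need to be handled; the sketch ignores both.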
