mmiklavc commented on issue #1341: METRON-614: Eliminate use of the default 
Charset
URL: https://github.com/apache/metron/pull/1341#issuecomment-493500295
 
 
   I think setting the default to `UTF-8` in the parsers and documenting it would 
be the way to go. Provide a per-sensor config option, e.g. `inputDataCharset`, 
that lets users configure it for the edge case. Emphasis on per-sensor because 
99/100 sensors will probably be `UTF-8`, and then one will be something wild 
like `EBCDIC` because hey, why not.
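
   A minimal sketch of how that per-sensor lookup might work, assuming the sensor config arrives as a plain `Map<String, Object>`. The class name `SensorCharsets` and both method names are hypothetical, and `inputDataCharset` is only the option name proposed above, not an existing Metron setting:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper, not part of Metron: resolves the charset for one sensor.
final class SensorCharsets {
  // Option name as proposed in this comment (an assumption, not a confirmed key).
  static final String INPUT_DATA_CHARSET = "inputDataCharset";

  // Returns the configured charset, or UTF-8 when the sensor does not set one.
  // Charset.forName throws an IllegalArgumentException subclass for an
  // unknown or malformed name, so a bad config fails fast.
  static Charset resolve(Map<String, Object> sensorConfig) {
    Object value = sensorConfig == null ? null : sensorConfig.get(INPUT_DATA_CHARSET);
    if (value == null) {
      return StandardCharsets.UTF_8;
    }
    return Charset.forName(value.toString().trim());
  }

  // Decodes a raw message using the sensor's charset instead of the JVM default.
  static String decode(byte[] rawMessage, Map<String, Object> sensorConfig) {
    return new String(rawMessage, resolve(sensorConfig));
  }
}
```

   With this shape, the 99 sensors that omit the option get `UTF-8`, and the one odd sensor only has to set `inputDataCharset` in its own config.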
   
   In general, I agree that it would be odd for any network sensors to be set 
to anything other than `UTF-8`. We're probably looking at other sources of 
mischief, though. A couple of examples could be streaming and bulk-loaded 
enrichments. I would not be surprised to find someone at some point loading 
`ISO-8859-1` or `Windows-1252`. In multiple big data projects prior to Metron I 
had to deal with encodings like this.
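
   To illustrate why a mismatched encoding in an enrichment load matters, here is a small sketch. The class `CharsetMismatchDemo` and its `roundTrip` method are made up for illustration and use only standard JDK calls. It encodes text with one charset and decodes it with another, which is what happens when `ISO-8859-1` data is read as `UTF-8`:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Metron.
final class CharsetMismatchDemo {
  // Encodes text as the producer would, then decodes it as the consumer would.
  static String roundTrip(String text, Charset writtenAs, Charset readAs) {
    byte[] onTheWire = text.getBytes(writtenAs);
    // new String(byte[], Charset) substitutes U+FFFD for malformed input
    // rather than throwing, so the corruption is silent.
    return new String(onTheWire, readAs);
  }
}
```

   For example, `roundTrip("café", StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)` does not return the original string, because the single byte written for `é` is not valid `UTF-8` on its own. Nothing throws, so the bad value would land in the enrichment store unnoticed.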

