mmiklavc commented on issue #1341: METRON-614: Eliminate use of the default Charset
URL: https://github.com/apache/metron/pull/1341#issuecomment-493500295

I think setting a default to `UTF-8` in the parsers and documenting it would be the way to go. Provide a per-sensor config option, e.g. `inputDataCharset` that lets users configure it for the edge case. Emphasis on per-sensor because 99/100 sensors will probably be `UTF-8`, and then one will be something wild like `EBCDIC` because hey, why not.

In general, I agree that it would be odd for any network sensors to be set to anything other than `UTF-8`. We're probably looking at other sources of mischief, though. A couple examples could be streaming and bulk loaded enrichments. I would not be surprised to find someone at some point loading `ISO-8859-1` or `Windows-1252`. In multiple big data projects prior to Metron I had to deal with encodings like this.
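To make the proposal concrete, here is a minimal sketch of what a per-sensor charset lookup with a `UTF-8` default could look like. It is an illustration only, not Metron's actual parser API: the class name `SensorCharsetExample`, the methods `resolveCharset` and `decode`, and the assumption that the sensor config arrives as a `Map<String, Object>` are all hypothetical. Only the key name `inputDataCharset` comes from the comment above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of the Metron code base.
public class SensorCharsetExample {

  // Config key name as suggested in the comment; its use here is an assumption.
  public static final String CHARSET_KEY = "inputDataCharset";

  // Returns the charset configured for a sensor, or UTF-8 when none is set.
  public static Charset resolveCharset(Map<String, Object> parserConfig) {
    Object configured = parserConfig.get(CHARSET_KEY);
    if (configured == null) {
      return StandardCharsets.UTF_8;
    }
    // Charset.forName throws if the name is illegal or unsupported,
    // so a misconfigured sensor fails loudly instead of decoding garbage.
    return Charset.forName(configured.toString().trim());
  }

  // Decodes a raw message with an explicit charset, never the JVM default.
  public static String decode(byte[] rawMessage, Map<String, Object> parserConfig) {
    return new String(rawMessage, resolveCharset(parserConfig));
  }

  public static void main(String[] args) {
    // "caf\u00e9" encoded as ISO-8859-1: the accented character is the single byte 0xE9.
    byte[] latin1Bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

    Map<String, Object> defaultConfig = new HashMap<>();
    Map<String, Object> latin1Config = new HashMap<>();
    latin1Config.put(CHARSET_KEY, "ISO-8859-1");

    // With the per-sensor override the text round-trips; with the UTF-8
    // default, 0xE9 is malformed input and becomes a replacement character.
    System.out.println(decode(latin1Bytes, latin1Config).equals("caf\u00e9"));
    System.out.println(decode(latin1Bytes, defaultConfig).equals("caf\u00e9"));
  }
}
```

Resolving the charset once per sensor and passing it explicitly to every byte-to-string conversion is what removes the dependency on the platform default. The same lookup could in principle be reused for streaming and bulk-loaded enrichments, where `ISO-8859-1` or `Windows-1252` input is more likely to show up.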
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
