[ 
https://issues.apache.org/jira/browse/METRON-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Leet updated METRON-614:
-------------------------------
    Description: 
During work on METRON-612, I noticed a lot of our warnings are from implicit 
use of the default Charset (e.g. when creating InputStreamReaders).  Given that 
this value is platform dependent, we should at least consider explicitly using 
a particular Charset.

In addition, after fixing this, I would like to see it upgraded to a compile 
error, because in my experience almost everyone tends to use methods that use 
the implicit default Charset.  This can be done by changing the compiler arg in 
the pom to instead be
{code}
<compilerArgs>
  <arg>-Xlint:unchecked</arg>
  <arg>-Xep:DefaultCharset:ERROR</arg>
</compilerArgs>
{code}

My assumption is that we'll want UTF-8, but given a lack of expertise and that 
usually there's a lot of opinions on character set issues, we may want to 
consider other options.

http://errorprone.info/bugpattern/DefaultCharset
https://docs.oracle.com/javase/8/docs/api/java/nio/charset/Charset.html lists 
standard Charsets and provides more details.
{code}
Charset Description
US-ASCII        Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block 
of the Unicode character set
ISO-8859-1      ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
UTF-8   Eight-bit UCS Transformation Format
UTF-16BE        Sixteen-bit UCS Transformation Format, big-endian byte order
UTF-16LE        Sixteen-bit UCS Transformation Format, little-endian byte order
UTF-16  Sixteen-bit UCS Transformation Format, byte order identified by an 
optional byte-order mark
{code}

  was:
During work on Metron-612, I noticed a lot of our warnings are from implicit 
use of the default Charset (e.g. when creating InputStreamReaders).  Given that 
this value is platform dependent, we should at least consider explicitly using 
a particular Charset.

In addition, after fixing this, I would like to see it upgraded to a compile 
error, because in my experience almost everyone tends to use methods that use 
the implicit default Charset.  This can be done by changing the compiler arg in 
the pom to instead be
{code}
<compilerArgs>
  <arg>-Xlint:unchecked</arg>
  <arg>-Xep:DefaultCharset:ERROR</arg>
</compilerArgs>
{code}

My assumption is that we'll want UTF-8, but given a lack of expertise and that 
usually there's a lot of opinions on character set issues, we may want to 
consider other options.

http://errorprone.info/bugpattern/DefaultCharset
https://docs.oracle.com/javase/8/docs/api/java/nio/charset/Charset.html lists 
standard Charsets and provides more details.
{code}
Charset Description
US-ASCII        Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block 
of the Unicode character set
ISO-8859-1      ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
UTF-8   Eight-bit UCS Transformation Format
UTF-16BE        Sixteen-bit UCS Transformation Format, big-endian byte order
UTF-16LE        Sixteen-bit UCS Transformation Format, little-endian byte order
UTF-16  Sixteen-bit UCS Transformation Format, byte order identified by an 
optional byte-order mark
{code}


> Eliminate use of the default Charset
> ------------------------------------
>
>                 Key: METRON-614
>                 URL: https://issues.apache.org/jira/browse/METRON-614
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Justin Leet
>            Priority: Minor
>
> During work on METRON-612, I noticed a lot of our warnings are from implicit 
> use of the default Charset (e.g. when creating InputStreamReaders).  Given 
> that this value is platform dependent, we should at least consider explicitly 
> using a particular Charset.
> In addition, after fixing this, I would like to see it upgraded to a compile 
> error, because in my experience almost everyone tends to use methods that use 
> the implicit default Charset.  This can be done by changing the compiler arg 
> in the pom to instead be
> {code}
> <compilerArgs>
>   <arg>-Xlint:unchecked</arg>
>   <arg>-Xep:DefaultCharset:ERROR</arg>
> </compilerArgs>
> {code}
> My assumption is that we'll want UTF-8, but given a lack of expertise and 
> that usually there's a lot of opinions on character set issues, we may want 
> to consider other options.
> http://errorprone.info/bugpattern/DefaultCharset
> https://docs.oracle.com/javase/8/docs/api/java/nio/charset/Charset.html lists 
> standard Charsets and provides more details.
> {code}
> Charset       Description
> US-ASCII      Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block 
> of the Unicode character set
> ISO-8859-1    ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
> UTF-8 Eight-bit UCS Transformation Format
> UTF-16BE      Sixteen-bit UCS Transformation Format, big-endian byte order
> UTF-16LE      Sixteen-bit UCS Transformation Format, little-endian byte order
> UTF-16        Sixteen-bit UCS Transformation Format, byte order identified by 
> an optional byte-order mark
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to