Re: RFC: System.console().encoding()
On 15.09.2016 at 17:56, Aleksey Shipilev wrote:
> ...which opens a way to poll this without a Reflection hack. Extended
> the JMH hack with it, but it is still fragile:
> http://hg.openjdk.java.net/code-tools/jmh/rev/8c20adb08b2d

Maybe keep it simple - no need for the (prop != null) check - and keep it in line with the design of the other two tries:

    // Try 3. Try to poll internal properties.
    try {
        return Charset.forName(System.getProperty("sun.stdout.encoding"));
    } catch (Exception e) {
        // fall-through
    }

-Ulf
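A minimal sketch of the simplified "Try 3" as a self-contained helper. The explicit null check can go because Charset.forName(null) throws IllegalArgumentException, which the catch block already handles. The class name, the method names, and the fallback order after sun.stdout.encoding are assumptions made for illustration; this is not the actual JMH code.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not taken from JMH.
final class ConsoleCharsetProbe {

    /** Returns the charset named by a system property, or null if unset or unknown. */
    static Charset fromProperty(String key) {
        try {
            // An unset property yields null, and Charset.forName(null) throws
            // IllegalArgumentException, so no separate null check is needed.
            return Charset.forName(System.getProperty(key));
        } catch (Exception e) {
            return null; // fall through to the next candidate
        }
    }

    /** Tries the internal property first, then file.encoding, then the JVM default. */
    static Charset guessConsoleCharset() {
        Charset cs = fromProperty("sun.stdout.encoding"); // internal, may be absent
        if (cs == null) {
            cs = fromProperty("file.encoding");
        }
        return cs != null ? cs : Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println("Guessed console charset: " + guessConsoleCharset());
    }
}
```

Since sun.stdout.encoding is an internal property, the result should be treated as a best-effort guess.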
Re: RFC: System.console().encoding()
On 9/15/16 8:56 AM, Aleksey Shipilev wrote:
> On 09/15/2016 09:06 AM, Xueming Shen wrote:
>> Console is supposed to be a "char/String" based class, "encoding"
>> really should have no business here in its api. Simply for some
>> implementation convenience is really not a good reason to add such a
>> public method.
>
> Let's look at it this way: there is a problem with console encoding
> that Console class solves, nicely abstracting the subtleties away. In
> doing so, it polls GetConsoleCP, the WinAPI call:

What I meant is that the Console was/is designed so that the user can access the console/terminal without knowing or dealing with the "encoding". The encoding concept has been purposely hidden from the very beginning, on the assumption/belief that it is an implementation detail you really don't need to know when using the Console class in a general use scenario.

Obviously it would be helpful and convenient if this encoding info could be accessed in your use case, but given that this is really not a "normal" use scenario, what I'm saying is that the system properties might be a better place for such information. It seems that jmh.util.Utils is already accessing System.properties for other system/VM-wide info, such as the VM version, os.name ... as well as the file.encoding you might need for non-console in/output.

Sherman

> JNIEXPORT jstring JNICALL
> Java_java_io_Console_encoding(JNIEnv *env, jclass cls)
> {
>     char buf[64];
>     int cp = GetConsoleCP();
>     if (cp >= 874 && cp <= 950)
>         sprintf(buf, "ms%d", cp);
>     else
>         sprintf(buf, "cp%d", cp);
>     return JNU_NewStringPlatform(env, buf);
> }
>
> If by "convenience" you mean avoiding doing the JNI call that polls
> that OS-specific bit of data, then yes, APIs provide lots of those
> conveniences.
>
>> That said, I would be fine to have such informative info in the
>> system properties, together with its siblings, file.encoding and
>> another "supposed to be private" property sun.jnu.encoding.
>
> Actually, if you look into the launcher, it does:
>
> static char* getConsoleEncoding()
> {
>     char* buf = malloc(16);
>     int cp;
>     if (buf == NULL) {
>         return NULL;
>     }
>     cp = GetConsoleCP();
>     if (cp >= 874 && cp <= 950)
>         sprintf(buf, "ms%d", cp);
>     else
>         sprintf(buf, "cp%d", cp);
>     return buf;
> }
> ...
> sprops.sun_stdout_encoding = getConsoleEncoding();
>
> ...which opens a way to poll this without a Reflection hack. Extended
> the JMH hack with it, but it is still fragile:
> http://hg.openjdk.java.net/code-tools/jmh/rev/8c20adb08b2d
>
> Thanks,
> -Aleksey
Re: RFC: System.console().encoding()
On 09/15/2016 09:06 AM, Xueming Shen wrote:
> Console is supposed to be a "char/String" based class, "encoding"
> really should have no business here in its api. Simply for some
> implementation convenience is really not a good reason to add such a
> public method.

Let's look at it this way: there is a problem with console encoding that Console class solves, nicely abstracting the subtleties away. In doing so, it polls GetConsoleCP, the WinAPI call:

    JNIEXPORT jstring JNICALL
    Java_java_io_Console_encoding(JNIEnv *env, jclass cls)
    {
        char buf[64];
        int cp = GetConsoleCP();
        if (cp >= 874 && cp <= 950)
            sprintf(buf, "ms%d", cp);
        else
            sprintf(buf, "cp%d", cp);
        return JNU_NewStringPlatform(env, buf);
    }

If by "convenience" you mean avoiding doing the JNI call that polls that OS-specific bit of data, then yes, APIs provide lots of those conveniences.

> That said, I would be fine to have such informative info in the
> system properties, together with its siblings, file.encoding and
> another "supposed to be private" property sun.jnu.encoding.

Actually, if you look into the launcher, it does:

    static char* getConsoleEncoding()
    {
        char* buf = malloc(16);
        int cp;
        if (buf == NULL) {
            return NULL;
        }
        cp = GetConsoleCP();
        if (cp >= 874 && cp <= 950)
            sprintf(buf, "ms%d", cp);
        else
            sprintf(buf, "cp%d", cp);
        return buf;
    }
    ...
    sprops.sun_stdout_encoding = getConsoleEncoding();

...which opens a way to poll this without a Reflection hack. Extended the JMH hack with it, but it is still fragile:
http://hg.openjdk.java.net/code-tools/jmh/rev/8c20adb08b2d

Thanks,
-Aleksey
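The naming rule in the two C snippets above (code pages 874 through 950 become "ms<N>", everything else "cp<N>") can be sketched in Java as follows. The class and method names are hypothetical. Whether a produced name resolves to a Charset depends on which charsets the running JDK ships, so the lookup returns null instead of assuming success.

```java
import java.nio.charset.Charset;

// Hypothetical Java rendering of the code-page naming rule quoted above.
final class ConsoleCodePage {

    /** Mirrors the sprintf logic: "ms" prefix for 874..950, "cp" prefix otherwise. */
    static String encodingName(int codePage) {
        return (codePage >= 874 && codePage <= 950)
                ? "ms" + codePage
                : "cp" + codePage;
    }

    /** Resolves the derived name to a Charset, or returns null if the JDK does not know it. */
    static Charset toCharset(int codePage) {
        try {
            return Charset.forName(encodingName(codePage));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return null;
        }
    }

    public static void main(String[] args) {
        for (int cp : new int[] {437, 874, 932, 1252}) {
            System.out.println(cp + " -> " + encodingName(cp) + " -> " + toCharset(cp));
        }
    }
}
```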
Re: RFC: System.console().encoding()
On 09/15/2016 02:06 AM, Xueming Shen wrote:
> -1 :-)
>
> Console is supposed to be a "char/String" based class, "encoding"
> really should have no business here in its api. Simply for some
> implementation convenience is really not a good reason to add such a
> public method.

Let's look at the two likely cases fairly, though: if the console is purely char-based, it could easily report a Unicode-based encoding (like UTF-16BE), implying full support for any string that is output or input. If the console is byte-based, then the encoding definitely provides real, useful information that could be relevant to the application. Overall it seems harmless to me.

> That said, I would be fine to have such informative info in the
> system properties, together with its siblings, file.encoding and
> another "supposed to be private" property sun.jnu.encoding.
>
> sherman
>
> On 9/14/16, 11:42 PM, Aleksey Shipilev wrote:
>> Hi,
>>
>> Claes pointed out that our own reflective hacks to figure out console
>> encoding do not work anymore [1]. But, we need the console encoding for
>> reliably printing on the console from within different sources. Note
>> that you would normally just use System.console() and its PrintWriter,
>> but reality is a bit more complicated, and sometimes you need to write
>> the plain char stream directly into the byte[]-accepting methods,
>> encoding on your own.
>>
>> So, my question: should we, in the light of extended Jigsaw-solving
>> crunch, provide the public Console.encoding() method that would return
>> the console charset?
>>
>> Thanks,
>> -Aleksey
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/jmh-dev/2016-September/002330.html

--
- DML
Re: RFC: System.console().encoding()
> out of curiosity... what will you do if you find the encoding lacking what
> you need?

Oh, display a warning. It helps to figure out where those "???" characters are coming from... Naive, I know. But it's the best one can do, and it works (most of the time).

D.
Re: RFC: System.console().encoding()
On 15.09.2016 09:21, Dawid Weiss wrote:
>> Console is supposed to be a "char/String" based class, "encoding" really
>> should have no business here in its api.
>
> While I agree with your concerns about the functional side of the API,
> I disagree about this method having no practical use. I can give you a
> concrete example. The use case that we had was to check whether the
> "terminal" (console) would be able to handle non-ASCII characters. A
> Writer doesn't tell you anything. An encoding does provide at least
> some confidence that certain characters will be translated properly --
> if your encoding is US-ASCII or ISO8859-1 then Polish diacritics won't
> get displayed for sure. This doesn't mean 100% confidence in actual
> glyph rendering of course, but it's a cheap and safe sanity check of
> the terminal's capabilities.

Out of curiosity... what will you do if you find the encoding lacking what you need?

bye Jochen
Re: RFC: System.console().encoding()
+1

Won't be enough, though, since in JMH it appears you're also getting the encoding from System.out (java.io.PrintStream) via reflective hacks.

/Claes

On 2016-09-15 08:42, Aleksey Shipilev wrote:
> Hi,
>
> Claes pointed out that our own reflective hacks to figure out console
> encoding do not work anymore [1]. But, we need the console encoding for
> reliably printing on the console from within different sources. Note
> that you would normally just use System.console() and its PrintWriter,
> but reality is a bit more complicated, and sometimes you need to write
> the plain char stream directly into the byte[]-accepting methods,
> encoding on your own.
>
> So, my question: should we, in the light of extended Jigsaw-solving
> crunch, provide the public Console.encoding() method that would return
> the console charset?
>
> Thanks,
> -Aleksey
>
> [1]
> http://mail.openjdk.java.net/pipermail/jmh-dev/2016-September/002330.html
Re: RFC: System.console().encoding()
> Console is supposed to be a "char/String" based class, "encoding" really
> should have no business here in its api.

While I agree with your concerns about the functional side of the API, I disagree about this method having no practical use. I can give you a concrete example. The use case that we had was to check whether the "terminal" (console) would be able to handle non-ASCII characters. A Writer doesn't tell you anything. An encoding does provide at least some confidence that certain characters will be translated properly -- if your encoding is US-ASCII or ISO8859-1 then Polish diacritics won't get displayed for sure. This doesn't mean 100% confidence in actual glyph rendering of course, but it's a cheap and safe sanity check of the terminal's capabilities.

An (undocumented) proprietary property? Sure, but I really don't see the reason why this shouldn't be in the API, unless you know of terminals that handle Unicode-based streams directly (in which case the encoding method would simply return UTF-32).

Dawid
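The sanity check described above can be sketched with CharsetEncoder.canEncode, assuming the console charset has already been obtained somehow. The class name, method name, and sample string are hypothetical. As the message notes, a positive answer says nothing about glyph rendering.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the "cheap sanity check" of a console charset.
final class TerminalCapabilityCheck {

    /** True if every character of the sample can be encoded in the given charset. */
    static boolean canDisplay(Charset charset, String sample) {
        // Some charsets are decode-only; newEncoder() would throw for those.
        return charset.canEncode() && charset.newEncoder().canEncode(sample);
    }

    public static void main(String[] args) {
        String polish = "za\u017C\u00F3\u0142\u0107"; // sample word with Polish diacritics
        Charset[] candidates = {
            StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8
        };
        for (Charset cs : candidates) {
            if (!canDisplay(cs, polish)) {
                System.err.println("Warning: " + cs + " cannot encode the sample text");
            }
        }
    }
}
```

Printing a warning when the check fails lets a user trace stray "?" characters back to the console encoding.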
Re: RFC: System.console().encoding()
-1 :-)

Console is supposed to be a "char/String" based class, "encoding" really should have no business here in its api. Simply for some implementation convenience is really not a good reason to add such a public method.

That said, I would be fine to have such informative info in the system properties, together with its siblings, file.encoding and another "supposed to be private" property sun.jnu.encoding.

sherman

On 9/14/16, 11:42 PM, Aleksey Shipilev wrote:
> Hi,
>
> Claes pointed out that our own reflective hacks to figure out console
> encoding do not work anymore [1]. But, we need the console encoding for
> reliably printing on the console from within different sources. Note
> that you would normally just use System.console() and its PrintWriter,
> but reality is a bit more complicated, and sometimes you need to write
> the plain char stream directly into the byte[]-accepting methods,
> encoding on your own.
>
> So, my question: should we, in the light of extended Jigsaw-solving
> crunch, provide the public Console.encoding() method that would return
> the console charset?
>
> Thanks,
> -Aleksey
>
> [1]
> http://mail.openjdk.java.net/pipermail/jmh-dev/2016-September/002330.html
Re: RFC: System.console().encoding()
+1 for adding a public Console.encoding(). I remember needing it as well; the current hacks are very ugly.

Dawid

On Thu, Sep 15, 2016 at 8:42 AM, Aleksey Shipilev wrote:
> Hi,
>
> Claes pointed out that our own reflective hacks to figure out console
> encoding do not work anymore [1]. But, we need the console encoding for
> reliably printing on the console from within different sources. Note
> that you would normally just use System.console() and its PrintWriter,
> but reality is a bit more complicated, and sometimes you need to write
> the plain char stream directly into the byte[]-accepting methods,
> encoding on your own.
>
> So, my question: should we, in the light of extended Jigsaw-solving
> crunch, provide the public Console.encoding() method that would return
> the console charset?
>
> Thanks,
> -Aleksey
>
> [1]
> http://mail.openjdk.java.net/pipermail/jmh-dev/2016-September/002330.html
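The "encoding on your own" case from the quoted question can be sketched as follows: the text is turned into bytes with the console charset and handed to a byte[]-accepting method. The class and method names are hypothetical, and Charset.defaultCharset() only stands in for a properly detected console charset.

```java
import java.nio.charset.Charset;

// Hypothetical sketch of writing pre-encoded text through a byte[]-accepting method.
final class RawConsoleBytes {

    /**
     * Encodes text for the console. String.getBytes(Charset) substitutes the
     * charset's replacement bytes (typically '?') for unmappable characters.
     */
    static byte[] encodeForConsole(String text, Charset consoleCharset) {
        return text.getBytes(consoleCharset);
    }

    public static void main(String[] args) {
        Charset cs = Charset.defaultCharset(); // stand-in for the real console charset
        byte[] line = encodeForConsole("za\u017C\u00F3\u0142\u0107\n", cs);
        // PrintStream.write(byte[], int, int) passes the bytes through unchanged.
        System.out.write(line, 0, line.length);
        System.out.flush();
    }
}
```

If the charset used here differs from what the console actually expects, the output is garbled, which is why the thread asks for a supported way to query it.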