Re: Netbeans encoding
Okay, legacy web application - what appserver?

Seems like the bug here is that, if NetBeans is starting the application server, it should detect (from configuration files or wherever) the encoding that the application server will use for logging (or already know what it is). That would be an actual fix that doesn't interfere with anything else. Failing to detect or know the output encoding of an application server is the actual bug here.

In the meantime, you could either keep your netbeans.conf setting, *or* set the system property file.encoding to your system default in your appserver's configuration, to be consistent with NetBeans' expectations.

-Tim
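To see which encoding a given JVM (the IDE's or the application server's) is actually using, something like the following can be run on each side. This is only a sketch: the class name `ShowEncoding` is made up for illustration, and where a `-Dfile.encoding=...` flag would go depends on the application server, which has not been named in this thread.

```java
import java.nio.charset.Charset;

class ShowEncoding {
    /** Returns a one-line summary of the encoding settings this JVM is running with. */
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with no flags and once as `java -Dfile.encoding=UTF-8 ShowEncoding` shows whether the flag changes what the JVM reports.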
Re: Netbeans encoding
I'm working right now on a legacy web project that uses a bunch of System.out.println calls for logging and debugging. It was plagued by a lot of encoding problems, and I fixed most of them by forcing everything to be UTF-8. There is no sense in using the platform default encoding when the text output is something to be embedded into HTML, and juggling/converting strings of different encodings around is what I consider a form of torture. Reading text/config/html/java/whatever files written in different encodings and having to guess the correct encoding on a case-by-case basis just makes the torture worse.

Also, I did the translation of Checkstyle to Portuguese, and the translation files are encoded in UTF-8. When using them in my project, Checkstyle prints localized messages of code-style violations, and those came out garbled in the Netbeans console. I don't know if this is a Checkstyle bug, but even if it is, having to do encoding checks and conversions just to println a String for debugging purposes is a burden that no programmer deserves. And even if it is a Checkstyle bug and someone fixes it, there are probably millions of other tools out there with the same bug.

OK, Netbeans can't do anything to fix a bunch of tools that have encoding problems. However, this shows that the hole here is much bigger: the very existence of the concept of a default platform encoding is the root cause of all those problems. Even a simple System.out.println statement may suffer from this problem, because you don't know and can't know (and ideally shouldn't need to know or care) whether the String is going to the console, to a file, to a socket or anywhere else. It only works safely when all the produced strings are born, live and die within the same machine, an assumption that is, and always was, plainly false.

This is why I strongly support the idea of deprecating any methods that rely on the default platform encoding. Netbeans could also do its part by never relying on it.

Victor Williams Stafusa da Silva

2018-04-21 3:53 GMT-03:00 Tim Boudreau:

> But you didn't answer my question: *What* are you running when the problem
> shows up? Your own Java project? If so, Ant, Maven or something else
> (i.e. build system where this is settable/detectable or not?)? Or some
> application server or third party thing?
Re: Netbeans encoding
No argument that the situation needs a fix.

But you didn't answer my question: *What* are you running when the problem shows up? Your own Java project? If so, Ant, Maven or something else (i.e. a build system where this is settable/detectable or not)? Or some application server or third party thing?

What I'm trying to nail down is the point of minimal intervention where this could either be detected or made settable. External processes in Java have binary output; the IDE decides what character set to impose over that. That decision can be improved, but without knowing what the stream is coming from, there's no place to start. Without knowing what it is you're looking at the output of when you see this problem, there's no progress to be made.

I'm all for UTF-8 everywhere in theory, and on my own systems, but defaulting to that is likely to break things for at least as many people as it helps. So, in the interest of solving it with a scalpel instead of a sledgehammer, could you give a little detail on where the problematic output is coming from and how it is generated?

Thanks,

-Tim

On Fri, Apr 20, 2018 at 8:33 PM, Victor Williams Stafusa da Silva wrote:

> In my case, I'm running on Windows, with the dreaded and hated Windows-1252
> default encoding.

--
http://timboudreau.com
Re: Netbeans encoding
TB> That's not that uncommon, but the right solution is to *detect* that the output is UTF-8 when the IDE runs whatever it is you're running.

That's hard to do in general, unfortunately. Web browsers do character set detection by a statistical analysis of character frequencies in input documents [1], and it's not at all guaranteed to be correct (and magic BOM characters are far from universal). Console output likely has a different statistical profile of characters than web pages do (making existing libraries less accurate), and furthermore, console output is not available "all at once" for analysis. After how many lines of output should the console detect the encoding? Should it then switch from one encoding to another? Should it switch again if more lines are printed and the analysis can be made more accurate?

VWSdS> Netbeans should ... at least provide an easy and accessible way to [l]et the user define it.

That will be more reliable. Options could be made available from the right-click menu in the terminal. There could also be an option that tries to auto-detect the encoding only at the time the "Auto-Detect" option is invoked, based on whatever data is currently in the terminal. That alleviates some of the previously mentioned issues.

Though this still sounds like a problem with the app: console text output should generally be in whatever encoding the OS expects on its own terminal/command-line tool. Otherwise the output would be equally garbled in the OS terminal.

-- Eirik

[1] https://www-archive.mozilla.org/projects/intl/chardet.html

On 4/20/18, 8:33 PM, "Victor Williams Stafusa da Silva" wrote:

>In my case, I'm running on Windows, with the dreaded and hated Windows-1252
>default encoding.
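A much cheaper check than statistical detection is to ask whether the bytes seen so far are well-formed UTF-8 at all. The sketch below does only that. It is not the frequency analysis described in [1], the class and method names are made up for illustration, and pure-ASCII output passes trivially, so a positive answer is a hint rather than proof.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    /** Returns true if the whole byte array decodes as UTF-8 without any error. */
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

The single byte 0xE0 (an "à" written in Windows-1252) is rejected, while its UTF-8 form (0xC3 0xA0) is accepted. A buffer cut in the middle of a multi-byte character is rejected too, which ties into the "after how many lines" question above.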
Re: Netbeans encoding
In my case, I'm running on Windows, with the dreaded and hated Windows-1252 default encoding.

Using the default OS encoding is really bad for portability and causes a lot of encoding problems. See this JEP draft, maybe for Java 11: http://openjdk.java.net/jeps/8187041 - There are three proposed alternatives: 1) Keep the status quo; 2) Deprecate all the methods that use the platform default encoding; 3) Force UTF-8 being the default regardless of anything.

As a speaker of Portuguese, a language that is full of diacritics, I'm already very sick of years and years of being haunted by encoding problems in buggy software. But it could be much worse if my language were Chinese or Japanese.

Since option 1 is unacceptable and 3 is too drastic and dangerous due to backwards-compatibility concerns, I think that this JEP, if it eventually gets delivered, will go with option 2.

Anyway, regardless of this JEP or its future, Netbeans should either get the correct encoding in the console window or at least provide an easy and accessible way to let the user define it.

Victor Williams Stafusa da Silva

2018-04-20 20:00 GMT-03:00 Tim Boudreau:

> Your problem is most likely your operating system's default file encoding
> here (perhaps MacRoman?). The IDE is assuming that process output is
> whatever your operating system's default encoding is, which is the right
> assumption, since that *is* what command-line utilities will output. It
> happens that the process you're running is outputting UTF-8 *rather than*
> the OS's default encoding.
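The garbling itself is easy to reproduce outside the IDE: take bytes that a process wrote as UTF-8 and decode them as Windows-1252. A minimal sketch follows; the class name `MojibakeDemo` is made up for illustration, and the sample word is written with a Unicode escape so the source file stays ASCII-only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    /** Decodes raw bytes with whatever charset the reader assumes. */
    static String decodeAs(byte[] bytes, Charset assumed) {
        return new String(bytes, assumed);
    }

    public static void main(String[] args) {
        // The writer side: a process that emits UTF-8 (the escape is a-grave).
        byte[] written = "voil\u00e0".getBytes(StandardCharsets.UTF_8);

        // The reader side: same bytes, two different assumptions.
        String asUtf8 = decodeAs(written, StandardCharsets.UTF_8);
        String asCp1252 = decodeAs(written, Charset.forName("windows-1252"));

        System.out.println("lengths: " + asUtf8.length() + " vs " + asCp1252.length());
        System.out.println("equal: " + asUtf8.equals(asCp1252));
    }
}
```

The two-byte UTF-8 sequence 0xC3 0xA0 becomes two separate characters under Windows-1252 (U+00C3 followed by a no-break space), which is the familiar garbled look of accented text in the console.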
Re: Netbeans encoding
Your problem is most likely your operating system's default file encoding here (perhaps MacRoman?). The IDE is assuming that process output is whatever your operating system's default encoding is, which is the right assumption, since that *is* what command-line utilities will output. It happens that the process you're running is outputting UTF-8 *rather than* the OS's default encoding.

Setting that as a default would be assuming that every operating system uses UTF-8 regardless of what it does - it would be wrong a lot of the time. It just happens to solve the case that whatever you're running is outputting UTF-8 in spite of what the operating system provides.

That's not that uncommon, but the right solution is to *detect* that the output is UTF-8 when the IDE runs whatever it is you're running.

So... what are you running? Is this project output? If so, what kind of project? Or server output of some kind? A correct fix would be to (if possible) detect what that is and that it will output UTF-8, and have the IDE open the output of that process with the right encoding.

-Tim

On Fri, Apr 20, 2018 at 6:18 PM, Victor Williams Stafusa da Silva wrote:

> 1. Is there a reason to not add it there by default?

--
http://timboudreau.com
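What "open the output of that process with the right encoding" could look like in plain Java is sketched below. This is not NetBeans code: `ProcessOutputReader` and its methods are hypothetical names, and how the IDE would learn which charset to pass in is exactly the open question here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class ProcessOutputReader {
    /** Decodes a raw byte stream into lines using the charset the caller chooses. */
    static List<String> readLines(InputStream raw, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(raw, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Runs a command and decodes its merged stdout/stderr with the given charset. */
    static List<String> run(Charset charset, String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        List<String> lines = readLines(process.getInputStream(), charset);
        process.waitFor();
        return lines;
    }
}
```

The point of the design is that the process only ever hands over bytes; the charset is a parameter of the reader, so it can come from detection, from project configuration, or from a user setting.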
Netbeans encoding
I have long had problems with the console output encoding in Netbeans, which always showed garbled non-ASCII characters for me.

After deciding that enough was enough, I went searching for a solution and found a very simple one on StackOverflow. Just add -J-Dfile.encoding=UTF-8 to the netbeans_default_options line of the netbeans.conf file and voilà, it works!

However, this made me think:

1. Is there a reason not to add it there by default?

2. If it can't be added there by default for some reason, can it at least be made more user-friendly and less arcane for the normal user to configure?

Victor Williams Stafusa da Silva
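For reference, the change described above would look roughly like this. It is a sketch only: `<existing options>` is a placeholder for whatever your installation already has on that line, which varies between NetBeans versions, and netbeans.conf normally lives in the etc directory of the NetBeans installation.

```
# netbeans.conf
netbeans_default_options="<existing options> -J-Dfile.encoding=UTF-8"
```

The -J prefix passes the option through to the JVM that runs the IDE, so the IDE itself starts with file.encoding set to UTF-8.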