Re: Netbeans encoding

2018-04-22 Thread Tim Boudreau
Okay, legacy web application - what appserver?

Seems like the bug here is that NetBeans, when it starts the application
server, does not detect (from configuration files or wherever) the
encoding that the application server will use for logging (or otherwise
know what it is).  Detecting it would be an actual fix without interfering
with anything else.

Failing to detect or know the output encoding of an application server is
the actual bug here.

In the meantime, you could either keep your netbeans.conf setting, *or* set
the system property file.encoding to your system default in your
appserver's configuration to be consistent with NetBeans' expectations.

-Tim


Re: Netbeans encoding

2018-04-21 Thread Victor Williams Stafusa da Silva
I'm working right now on a legacy web project that uses a bunch of
System.out.println calls for logging and debugging. It was plagued by a lot
of encoding problems, and I fixed most of them by forcing everything to be
UTF-8. There is no sense in using the platform default encoding when the
text output is going to be embedded into HTML, and juggling/converting
strings between different encodings is what I consider a form of
torture. Also, reading text/config/html/java/whatever files written in
different encodings and having to guess the correct encoding on a
case-by-case basis makes the torture even worse.
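For illustration, forcing System.out to UTF-8 could look roughly like the
Java sketch below. This is not the project's actual code: the class name
Utf8Console and the helper utf8Stream are hypothetical, and the sketch
assumes that whatever reads the output also decodes it as UTF-8.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class Utf8Console {

    // Wraps any byte sink in a PrintStream that always encodes as UTF-8,
    // instead of whatever the platform default happens to be.
    // (Hypothetical helper, named for this example only.)
    static PrintStream utf8Stream(OutputStream sink) {
        try {
            return new PrintStream(sink, true, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Replace System.out once, early in startup; existing println calls
        // elsewhere in the code stay unchanged.
        System.setOut(utf8Stream(new FileOutputStream(FileDescriptor.out)));
        // Unicode escapes keep this source file itself encoding-neutral.
        System.out.println("Verifica\u00e7\u00e3o conclu\u00edda");
    }
}
```

This only fixes the producing side. If the console that displays the bytes
still assumes the platform default, the text is garbled all the same.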

Also, I did the translation of Checkstyle to Portuguese, and the
translation files are encoded in UTF-8. When I use them in my project,
Checkstyle prints localized messages for code-style violations, and those
come out garbled in the NetBeans console. I don't know if this is a
Checkstyle bug, but even if it is, having to do encoding checks and
conversions just to println a String for debugging purposes is a burden
that no programmer deserves. Also, even if it is a Checkstyle bug and
someone fixes it, there are probably many more tools out there with the
same bug.

OK, NetBeans can't do anything to fix a bunch of tools that have encoding
problems. However, this shows that the hole here is much bigger: the mere
existence of the concept of a default platform encoding is the root cause
of all those problems. Even a simple System.out.println statement may
suffer from this problem because you don't know and can't know (and ideally
shouldn't need to know or care) whether the String is going to the console,
to a file, to a socket or anywhere else. It only works safely when all
the produced strings are born, live and die within the same machine, an
assumption that is, and always was, plainly false. This is why I strongly
support the idea of deprecating any methods that rely on the default
platform encoding. NetBeans could also do its part by never relying on it.

Victor Williams Stafusa da Silva


Re: Netbeans encoding

2018-04-21 Thread Tim Boudreau
No argument that the situation needs a fix.

But you didn't answer my question:  *What* are you running when the problem
shows up?  Your own Java project?  If so, Ant, Maven or something else
(i.e. build system where this is settable/detectable or not?)?  Or some
application server or third party thing?

What I'm trying to nail down is the point of minimal intervention
where this could either be detected or made settable.  External processes
in Java have binary output;  the IDE decides what character set to impose
over that.  That decision can be improved, but without knowing what the
stream is coming from, there's no place to start.  Without knowing what it
is you're looking at the output of when you see this problem, there's no
progress to be made.
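To make "binary output plus an imposed character set" concrete, here is a
minimal Java sketch. It is not NetBeans code: ProcessOutputDecoding,
readLines and run are hypothetical names for this example. The point is
only that the launcher, not the child process, picks the Charset used to
turn bytes into text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class ProcessOutputDecoding {

    // Decodes a raw byte stream into lines using the charset the caller imposes.
    static List<String> readLines(InputStream rawBytes, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(rawBytes, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Starts a command and reads its combined stdout/stderr with the given charset.
    // Which charset is right depends entirely on what the command emits.
    static List<String> run(Charset outputCharset, String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        List<String> lines = readLines(process.getInputStream(), outputCharset);
        process.waitFor();
        return lines;
    }
}
```

A caller that knows the child writes UTF-8 would pass StandardCharsets.UTF_8;
one that knows nothing falls back to Charset.defaultCharset(), which is the
guess that goes wrong in this thread.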

I'm all for UTF-8 everywhere in theory, and on my own systems, but
defaulting to that is likely to break things for at least as many people as
it helps.  So, in the interest of solving it with a scalpel instead of a
sledgehammer, could you give a little detail on where the problematic
output is coming from and how it is generated?

Thanks,

-Tim




-- 
http://timboudreau.com


Re: Netbeans encoding

2018-04-20 Thread Eirik Bakke
TB> That's not that uncommon, but the right solution is to *detect* that
the output is UTF-8 when the IDE runs whatever it is you're running.

That's hard to do in general, unfortunately. Web browsers do character set
detection by a statistical analysis of character frequencies in input
documents [1]--and it's not at all guaranteed to be correct (and magic BOM
characters are far from universal). Console output likely has a different
statistical profile of characters than that of web pages (making existing
libraries less accurate), and furthermore, console output is not available
"all at once" for analysis. After how many lines of output should the
console detect the encoding? Should it then switch from one encoding to
another? Should it switch again if more lines are printed and the analysis
can be made more accurate?
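One narrow check that is cheap is "is this chunk well-formed UTF-8?". The
Java sketch below shows that check only; it is not statistical detection
and not what browsers do. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    // Returns true if the bytes form a complete, well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                // Fail loudly instead of silently substituting U+FFFD.
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

It inherits the problems described above. Pure ASCII output passes for
nearly any encoding, so it says nothing until the first non-ASCII byte
arrives, and a chunk cut in the middle of a multi-byte character is
reported as invalid even though the stream as a whole is fine.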

VWSdS> Netbeans should ... at least provide an easy and accessible way to
[l]et the user define it.
That will be more reliable. Options could be made available from the
right-click menu in the terminal. There could also be an option here that
tries to auto-detect the encoding only at the time that the "Auto-Detect"
option is invoked, based on whatever data is currently in the terminal.
That alleviates some of the previously mentioned issues.


Though this still sounds like a problem with the app--console text output
should generally be in whatever encoding the OS expects on its own
terminal/command-line tool. Otherwise the output would be equally garbled
in the OS terminal.

-- Eirik

[1] https://www-archive.mozilla.org/projects/intl/chardet.html






Re: Netbeans encoding

2018-04-20 Thread Victor Williams Stafusa da Silva
In my case, I'm running on Windows, with the dreaded and hated Windows-1252
default encoding.

Using the default OS encoding is really bad for portability and causes a lot
of encoding problems. See this JEP draft, maybe for Java 11:
http://openjdk.java.net/jeps/8187041 - There are three proposed
alternatives: 1) Keep the status quo; 2) Deprecate all the methods that
use the platform default encoding; 3) Force UTF-8 to be the default
regardless of anything.

As a speaker of Portuguese, a language that is full of diacritics, I'm
already very sick of years and years of being haunted by encoding problems
in buggy software. But it could be much worse if my language were Chinese or
Japanese.

Since option 1 is unacceptable and 3 is too drastic and dangerous due to
backwards-compatibility concerns, I think that this JEP, if it eventually
gets delivered, will go to option 2.

Anyway, regardless of this JEP or its future, NetBeans should either get
the correct encoding in the console window or at least provide an easy and
accessible way to let the user define it.

Victor Williams Stafusa da Silva




Re: Netbeans encoding

2018-04-20 Thread Tim Boudreau
Your problem is most likely your operating system's default file encoding
here (perhaps MacRoman?).  The IDE is assuming that process output is
whatever your operating system's default encoding is, which is the right
assumption, since that *is* what command-line utilities will output.  It
happens that the process you're running is outputting UTF-8 *rather than* the
OS's default encoding.
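The mismatch is easy to reproduce. In the small Java sketch below (class
and method names are made up for the example), text is encoded the way the
producing process does it and decoded the way the console does it.
windows-1252 stands in for the OS default here, as an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes text with the producer's charset, then decodes the same bytes
    // with the consumer's charset, as a console would.
    static String roundTrip(String text, Charset producer, Charset consumer) {
        return new String(text.getBytes(producer), consumer);
    }

    public static void main(String[] args) {
        Charset windows1252 = Charset.forName("windows-1252");
        String original = "a\u00e7\u00e3o"; // the Portuguese word for "action"

        // Producer writes UTF-8, console assumes windows-1252: garbled.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8, windows1252));
        // Both sides agree: the text survives.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
    }
}
```

With mismatched charsets each accented letter turns into two characters
(the string becomes "aÃ§Ã£o"). That is the usual signature of UTF-8 bytes
read as a single-byte encoding.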

Setting that as a default would be assuming that every operating system
uses UTF-8 regardless of what it does - it would be wrong a lot of the
time.  It just happens to solve the case that whatever you're running is
outputting UTF-8 in spite of what the operating system provides.

That's not that uncommon, but the right solution is to *detect* that the
output is UTF-8 when the IDE runs whatever it is you're running.

So... what are you running?  Is this project output?  If so, what kind of
project?  Or server output of some kind?  A correct fix would be to detect,
if possible, what that is and that it will output UTF-8, and have the
IDE open the output of that process with the right encoding.

-Tim



-- 
http://timboudreau.com


Netbeans encoding

2018-04-20 Thread Victor Williams Stafusa da Silva
I have long had problems with the console output encoding in NetBeans,
which always showed garbled non-ASCII characters for me.

After deciding that enough was enough, I went searching for a solution and
found a very simple one on StackOverflow. Just add -J-Dfile.encoding=UTF-8
to the netbeans_default_options line of the netbeans.conf file and voilà, it
works!
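
For reference, the edited line would look roughly like the fragment below.
The part shown as <existing options> is a placeholder for whatever the line
already contains in a given installation, not literal text. Only the
trailing flag is new.

```
# netbeans.conf
netbeans_default_options="<existing options> -J-Dfile.encoding=UTF-8"
```

The -J prefix passes the option through to the JVM that runs the IDE, so
this changes the IDE's own default charset rather than anything in the
processes it launches.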

However, this makes me wonder:

1. Is there a reason to not add it there by default?

2. If it can't be added there by default for some reason, can it at least
be something more user-friendly and less arcane to be configured by the
normal user?

Victor Williams Stafusa da Silva