Re: UnicodeLittleUnmarked or UTF-16LE in NTLM code?
On Mon, 2013-07-08 at 11:10 +0100, sebb wrote:
> The NTLM code uses the charset UnicodeLittleUnmarked a lot.
>
> The official page:
> http://docs.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html
> says they are the same, but different APIs use a different canonical name.
> I assume the methods will therefore take either.
>
> Might be worth changing to the slightly shorter - but more obviously 16 bit - name?
>
> In any case, extracting as a constant and documenting the choice would be a good idea.
> Especially since the code also uses US-ASCII or ASCII sometimes (why?)

Sebastian,

I would like to propose moving to Java 1.6 at some point (sooner rather than later). One of the reasons to make this move is to be able to use the Charset variant of the String#getBytes() method and to clean up the use of various charsets throughout the code base, not just NTLM code.

Maintaining Java 1.5 compatibility has been getting increasingly difficult and increasingly pointless. The question is whether this is too late for 4.3 or not.

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org
For additional commands, e-mail: dev-h...@hc.apache.org
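A minimal Java sketch of the "extract a constant" idea together with the Charset overload of String#getBytes(). The class name NtlmCharsets and its members are hypothetical and not taken from the HttpClient code base. The sketch assumes the running JRE registers UnicodeLittleUnmarked as an alias of UTF-16LE, as the encoding table linked above describes, and it needs Java 1.6 for getBytes(Charset).

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical holder for the charset used by the NTLM code; names are illustrative only.
final class NtlmCharsets {

    // UTF-16LE: 16-bit little-endian code units, no byte order mark written.
    static final Charset UNICODE_LE = Charset.forName("UTF-16LE");

    // Legacy java.io/java.lang name, kept here only to demonstrate the alias lookup.
    static final String LEGACY_NAME = "UnicodeLittleUnmarked";

    private NtlmCharsets() {
    }

    // Charset overload of String#getBytes (Java 1.6+): no UnsupportedEncodingException to catch.
    static byte[] encodeUnicode(String s) {
        return s.getBytes(UNICODE_LE);
    }

    // True when the legacy name resolves to the same charset as "UTF-16LE".
    static boolean legacyNameIsAlias() {
        return Charset.forName(LEGACY_NAME).equals(UNICODE_LE);
    }

    public static void main(String[] args) {
        System.out.println(UNICODE_LE.name());
        System.out.println(legacyNameIsAlias());
        System.out.println(Arrays.toString(encodeUnicode("AB")));
    }
}
```

Keeping a single documented constant means the choice of name is made once, and call sites no longer need a try/catch around the String-based getBytes(String) overload.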
Re: UnicodeLittleUnmarked or UTF-16LE in NTLM code?
On 8 July 2013 11:10, sebb seb...@gmail.com wrote:
> The NTLM code uses the charset UnicodeLittleUnmarked a lot.
>
> The official page:
> http://docs.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html
> says they are the same, but different APIs use a different canonical name.
> I assume the methods will therefore take either.
>
> Might be worth changing to the slightly shorter - but more obviously 16 bit - name?
>
> In any case, extracting as a constant and documenting the choice would be a good idea.
> Especially since the code also uses US-ASCII or ASCII sometimes (why?)

I've just been looking at
http://davenport.sourceforge.net/ntlm.html
and this says that certain fields always use OEM encoding. This is documented as being the local machine's native character set (DOS codepage); however, the code seems to use ASCII (or US-ASCII) for this.

That seems wrong - although ASCII is likely to be a subset of the default encoding, this is not 100% guaranteed. If the code does make this assumption, I think it should be documented.
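To make the OEM-versus-ASCII concern concrete, here is a small Java sketch. The class name OemEncodingCheck, its methods and the sample strings are hypothetical, and IBM850 is only one example of a DOS code page that a JRE may or may not ship. It shows that encoding through US-ASCII silently replaces every non-ASCII character with '?', so the ASCII shortcut is only safe for values that are pure ASCII.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper, not part of the NTLM code discussed in this thread.
final class OemEncodingCheck {

    static final Charset ASCII = Charset.forName("US-ASCII");

    private OemEncodingCheck() {
    }

    // True when every character of the value can be represented in US-ASCII.
    static boolean isPureAscii(String value) {
        return ASCII.newEncoder().canEncode(value);
    }

    // Mirrors the ASCII shortcut: unmappable characters are replaced with '?'.
    static byte[] encodeAsAscii(String value) {
        return value.getBytes(ASCII);
    }

    public static void main(String[] args) {
        String plain = "WORKSTATION";
        String accented = "M\u00fcller"; // contains u-umlaut, which is outside ASCII

        System.out.println(isPureAscii(plain));
        System.out.println(isPureAscii(accented));
        System.out.println(Arrays.toString(encodeAsAscii(accented)));

        // Assumption: IBM850 stands in for "the local OEM code page"; it is optional in a JRE.
        if (Charset.isSupported("IBM850")) {
            byte[] oem = accented.getBytes(Charset.forName("IBM850"));
            System.out.println(Arrays.toString(oem));
        }
    }
}
```

A check like isPureAscii() is one way to document the assumption in code: values that fail it would need a real OEM code page rather than US-ASCII.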
Re: UnicodeLittleUnmarked or UTF-16LE in NTLM code?
On 8 July 2013 11:29, Oleg Kalnichevski ol...@apache.org wrote:
> On Mon, 2013-07-08 at 11:10 +0100, sebb wrote:
>> The NTLM code uses the charset UnicodeLittleUnmarked a lot.
>>
>> The official page:
>> http://docs.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html
>> says they are the same, but different APIs use a different canonical name.
>> I assume the methods will therefore take either.
>>
>> Might be worth changing to the slightly shorter - but more obviously 16 bit - name?
>>
>> In any case, extracting as a constant and documenting the choice would be a good idea.
>> Especially since the code also uses US-ASCII or ASCII sometimes (why?)
>
> Sebastian,
>
> I would like to propose moving to Java 1.6 at some point (sooner rather than later). One of the reasons to make this move is to be able to use the Charset variant of the String#getBytes() method

Yes, that's definitely easier.

> and to clean up the use of various charsets throughout the code base, not just NTLM code.

Not sure that requires Java 1.6.

> Maintaining Java 1.5 compatibility has been getting increasingly difficult and increasingly pointless. The question is whether this is too late for 4.3 or not.

There's still quite a lot of Java 1.5 out there, so I would suggest holding off requiring 1.6 until after 4.3. There are a lot of useful fixes etc. in 4.3, so why not make them available to people still stuck on Java 5?

Also, conversion to Java 1.6 requires lots of changes to @Override. It's more work than might at first appear.

> Oleg
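The @Override remark can be illustrated with a short Java sketch; the class and interface names are hypothetical. Under Java 5 language rules, @Override on a method that implements an interface method is a compile error, while Java 6 and later accept it. A code base written for 1.5 therefore has none of these annotations, and adopting the 1.6 convention means touching every such method.

```java
// Hypothetical example types; nothing here comes from the HttpClient sources.
final class OverrideDemo {

    interface Greeter {
        String greet(String name);
    }

    static final class PoliteGreeter implements Greeter {

        // Accepted at source level 1.6 and later.
        // Rejected at source level 1.5, because greet() implements an
        // interface method instead of overriding a superclass method.
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }

        // Accepted at both levels: toString() overrides a method of Object.
        @Override
        public String toString() {
            return "PoliteGreeter";
        }
    }

    private OverrideDemo() {
    }

    public static void main(String[] args) {
        Greeter greeter = new PoliteGreeter();
        System.out.println(greeter.greet("Oleg"));
    }
}
```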
Re: UnicodeLittleUnmarked or UTF-16LE in NTLM code?
On Mon, 2013-07-08 at 11:35 +0100, sebb wrote:
> On 8 July 2013 11:29, Oleg Kalnichevski ol...@apache.org wrote:
>> On Mon, 2013-07-08 at 11:10 +0100, sebb wrote:
>>> The NTLM code uses the charset UnicodeLittleUnmarked a lot.
>>>
>>> The official page:
>>> http://docs.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html
>>> says they are the same, but different APIs use a different canonical name.
>>> I assume the methods will therefore take either.
>>>
>>> Might be worth changing to the slightly shorter - but more obviously 16 bit - name?
>>>
>>> In any case, extracting as a constant and documenting the choice would be a good idea.
>>> Especially since the code also uses US-ASCII or ASCII sometimes (why?)
>>
>> Sebastian,
>>
>> I would like to propose moving to Java 1.6 at some point (sooner rather than later). One of the reasons to make this move is to be able to use the Charset variant of the String#getBytes() method
>
> Yes, that's definitely easier.
>
>> and to clean up the use of various charsets throughout the code base, not just NTLM code.
>
> Not sure that requires Java 1.6.
>
>> Maintaining Java 1.5 compatibility has been getting increasingly difficult and increasingly pointless. The question is whether this is too late for 4.3 or not.
>
> There's still quite a lot of Java 1.5 out there, so I would suggest holding off requiring 1.6 until after 4.3. There are a lot of useful fixes etc. in 4.3, so why not make them available to people still stuck on Java 5?

All right. Fair enough. Let's discuss the move post 4.3.

Oleg
Re: UnicodeLittleUnmarked or UTF-16LE in NTLM code?
On Jul 8, 2013, at 6:30, Oleg Kalnichevski ol...@apache.org wrote:
> On Mon, 2013-07-08 at 11:10 +0100, sebb wrote:
>> The NTLM code uses the charset UnicodeLittleUnmarked a lot.
>>
>> The official page:
>> http://docs.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html
>> says they are the same, but different APIs use a different canonical name.
>> I assume the methods will therefore take either.
>>
>> Might be worth changing to the slightly shorter - but more obviously 16 bit - name?
>>
>> In any case, extracting as a constant and documenting the choice would be a good idea.
>> Especially since the code also uses US-ASCII or ASCII sometimes (why?)
>
> Sebastian,
>
> I would like to propose moving to Java 1.6 at some point (sooner rather than later). One of the reasons to make this move is to be able to use the Charset variant of the String#getBytes() method and to clean up the use of various charsets throughout the code base, not just NTLM code.
>
> Maintaining Java 1.5 compatibility has been getting increasingly difficult and increasingly pointless. The question is whether this is too late for 4.3 or not.

I would move to Java 6 now; a major release is as good a time as any and gives us the opportunity for changes that are easier to make than in a minor release. I would not want to be stuck with supporting Java 5 until the next major release.

Gary

> Oleg