Re: The Case For Autodecode

2016-06-04 Thread Steven Schveighoffer via Digitalmars-d
On 6/4/16 4:57 AM, Patrick Schluter wrote: On Friday, 3 June 2016 at 20:18:31 UTC, Steven Schveighoffer wrote: On 6/3/16 3:52 PM, ag0aep6g wrote: Does it work for char -> wchar, too? It does not. 0x is a valid code point, and I think so are all the other values that would result. In

Re: The Case For Autodecode

2016-06-04 Thread Observer via Digitalmars-d
On Friday, 3 June 2016 at 11:24:40 UTC, ag0aep6g wrote: Finally, this is not the only argument in favor of *keeping* autodecoding, of course. Not wanting to break user code is the big one there, I guess. I'm not familiar with the details of autodecoding, but one thing strikes me about this who

Re: The Case For Autodecode

2016-06-04 Thread Patrick Schluter via Digitalmars-d
On Friday, 3 June 2016 at 20:18:31 UTC, Steven Schveighoffer wrote: On 6/3/16 3:52 PM, ag0aep6g wrote: On 06/03/2016 09:09 PM, Steven Schveighoffer wrote: Except many chars *do* properly convert. This should work: char c = 'a'; dchar d = c; assert(d == 'a'); Yeah, that's what I meant by "sta

Re: The Case For Autodecode

2016-06-03 Thread ag0aep6g via Digitalmars-d
On 06/03/2016 11:13 PM, Steven Schveighoffer wrote: No, but I like the idea of preserving the erroneous character you tried to convert. Makes sense. But is there an invalid wchar? I looked through the Wikipedia article on UTF-16, and it didn't seem to say there was one. If we use U+FFFD, tha

Re: The Case For Autodecode

2016-06-03 Thread Steven Schveighoffer via Digitalmars-d
On 6/3/16 4:39 PM, ag0aep6g wrote: On 06/03/2016 10:18 PM, Steven Schveighoffer wrote: But you can get a standalone code unit that is part of a coded sequence quite easily foo(string s) { auto x = s[0]; dchar d = x; } I don't think we're disagreeing on anything. I'm calling UTF-8 code

Re: The Case For Autodecode

2016-06-03 Thread ag0aep6g via Digitalmars-d
On 06/03/2016 10:18 PM, Steven Schveighoffer wrote: But you can get a standalone code unit that is part of a coded sequence quite easily foo(string s) { auto x = s[0]; dchar d = x; } I don't think we're disagreeing on anything. I'm calling UTF-8 code units below 0x80 "standalone" code

Re: The Case For Autodecode

2016-06-03 Thread Steven Schveighoffer via Digitalmars-d
On 6/3/16 3:52 PM, ag0aep6g wrote: On 06/03/2016 09:09 PM, Steven Schveighoffer wrote: Except many chars *do* properly convert. This should work: char c = 'a'; dchar d = c; assert(d == 'a'); Yeah, that's what I meant by "standalone code unit". Code units that on their own represent a code poi

Re: The Case For Autodecode

2016-06-03 Thread ag0aep6g via Digitalmars-d
On 06/03/2016 09:09 PM, Steven Schveighoffer wrote: Except many chars *do* properly convert. This should work: char c = 'a'; dchar d = c; assert(d == 'a'); Yeah, that's what I meant by "standalone code unit". Code units that on their own represent a code point would not be touched. As I me

Re: The Case For Autodecode

2016-06-03 Thread Steven Schveighoffer via Digitalmars-d
On 6/3/16 3:12 PM, Steven Schveighoffer wrote: On 6/3/16 3:09 PM, Steven Schveighoffer wrote: Hm... an interesting possibility: dchar _dchar_convert(char c) { return cast(int)cast(byte)c; // get sign extension for non-ASCII } Allows this too: dchar d = char.init; // calls conversion funct

Re: The Case For Autodecode

2016-06-03 Thread Steven Schveighoffer via Digitalmars-d
On 6/3/16 3:09 PM, Steven Schveighoffer wrote: Hm... an interesting possibility: dchar _dchar_convert(char c) { return cast(int)cast(byte)c; // get sign extension for non-ASCII } Allows this too: dchar d = char.init; // calls conversion function assert(d == dchar.init); :) -Steve

Re: The Case For Autodecode

2016-06-03 Thread Steven Schveighoffer via Digitalmars-d
On 6/3/16 2:55 PM, ag0aep6g wrote: On 06/03/2016 08:36 PM, Steven Schveighoffer wrote: but a direct cast of the bits from char does NOT mean the same thing as a dchar. That gives me an idea. A bitwise reinterpretation of int to float is nonsensical, too. Yet int implicitly converts to float an

Re: The Case For Autodecode

2016-06-03 Thread ag0aep6g via Digitalmars-d
On 06/03/2016 08:36 PM, Steven Schveighoffer wrote: but a direct cast of the bits from char does NOT mean the same thing as a dchar. That gives me an idea. A bitwise reinterpretation of int to float is nonsensical, too. Yet int implicitly converts to float and (for small values) preserves the

Re: The Case For Autodecode

2016-06-03 Thread Patrick Schluter via Digitalmars-d
On Friday, 3 June 2016 at 18:36:45 UTC, Steven Schveighoffer wrote: The real problem here is that char implicitly casts to dchar. That should not be allowed. Indeed.

Re: The Case For Autodecode

2016-06-03 Thread ag0aep6g via Digitalmars-d
On 06/03/2016 07:51 PM, Patrick Schluter wrote: You mean that '¶' is represented internally as 1 byte 0xB6 and that it can be handled as such without error? This would mean that char literals are broken. The only valid way to represent '¶' in memory is 0xC2 0xB6. Sorry if I misunderstood, I'm onl

Re: The Case For Autodecode

2016-06-03 Thread Steven Schveighoffer via Digitalmars-d
On 6/3/16 1:51 PM, Patrick Schluter wrote: On Friday, 3 June 2016 at 11:24:40 UTC, ag0aep6g wrote: This is mostly me trying to make sense of the discussion. So everyone hates autodecoding. But Andrei seems to hate it a good bit less than everyone else. As far as I could follow, he has one reaso

Re: The Case For Autodecode

2016-06-03 Thread Patrick Schluter via Digitalmars-d
On Friday, 3 June 2016 at 11:24:40 UTC, ag0aep6g wrote: This is mostly me trying to make sense of the discussion. So everyone hates autodecoding. But Andrei seems to hate it a good bit less than everyone else. As far as I could follow, he has one reason for that, which might not be clear to ev

Re: The Case For Autodecode

2016-06-03 Thread ag0aep6g via Digitalmars-d
On 06/03/2016 03:56 PM, Kagamin wrote: A lot of discussion is disagreement on understanding of correctness of unicode support. I see 4 possible meanings here: 1. Implemented according to spec. 2. Provides level 1 unicode support. 3. Provides level 2 unicode support. 4. Achieves the goal of unicod

Re: The Case For Autodecode

2016-06-03 Thread Kagamin via Digitalmars-d
On Friday, 3 June 2016 at 11:24:40 UTC, ag0aep6g wrote: Finally, this is not the only argument in favor of *keeping* autodecoding, of course. Not wanting to break user code is the big one there, I guess. A lot of discussion is disagreement on understanding of correctness of unicode support. I

Re: The Case For Autodecode

2016-06-03 Thread Steven Schveighoffer via Digitalmars-d
On 6/3/16 7:24 AM, ag0aep6g wrote: This is mostly me trying to make sense of the discussion. So everyone hates autodecoding. But Andrei seems to hate it a good bit less than everyone else. As far as I could follow, he has one reason for that, which might not be clear to everyone: I don't hate

The Case For Autodecode

2016-06-03 Thread ag0aep6g via Digitalmars-d
This is mostly me trying to make sense of the discussion. So everyone hates autodecoding. But Andrei seems to hate it a good bit less than everyone else. As far as I could follow, he has one reason for that, which might not be clear to everyone: char converts implicitly to dchar, so the compi