Re: The Case Against Autodecode

2016-05-26 Thread Vladimir Panteleev via Digitalmars-d
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu wrote: 4. Autodecoding is slow and has no place in high speed string processing. I would agree only with the amendment "...if used naively", which is important. Knowledge of how autodecoding works is a prerequisite for writing

Re: The Case Against Autodecode

2016-05-26 Thread Andrei Alexandrescu via Digitalmars-d
On 05/26/2016 07:23 PM, H. S. Teoh via Digitalmars-d wrote: Therefore, instead of: myString.splitter!"abc".joiner!"def".count; we have to write: myString.representation .splitter!("abc".representation) .joiner!("def".representation)
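The two pipelines quoted above can be compared in a minimal sketch. It assumes the separators are passed as ordinary runtime arguments (the `splitter!"abc"` spelling in the quote reads as shorthand), and the function names `countDecoded` and `countCodeUnits` are hypothetical, chosen only for illustration.

```d
import std.algorithm.iteration : joiner, splitter;
import std.algorithm.searching : count;
import std.string : representation;

// Autodecoding pipeline: the joined range yields decoded code points
// (dchar), so the result is a code point count.
size_t countDecoded(string s)
{
    return s.splitter("abc").joiner("def").count;
}

// Same pipeline over the raw UTF-8 code units (immutable(ubyte)[]),
// which skips decoding; the result is a code unit count.
size_t countCodeUnits(string s)
{
    return s.representation
            .splitter("abc".representation)
            .joiner("def".representation)
            .count;
}

unittest
{
    // Pure ASCII: both views agree.
    assert(countDecoded("fooabcbar") == 9);
    assert(countCodeUnits("fooabcbar") == 9);
    // U+00E9 is one code point but two UTF-8 code units.
    assert(countDecoded("h\u00E9abcllo") == 8);
    assert(countCodeUnits("h\u00E9abcllo") == 9);
}
```

The two functions only agree on ASCII input, which is worth keeping in mind when swapping one form for the other.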

Re: The Case Against Autodecode

2016-05-26 Thread H. S. Teoh via Digitalmars-d
On Thu, May 26, 2016 at 12:00:54PM -0400, Andrei Alexandrescu via Digitalmars-d wrote: [...] > On 05/12/2016 04:15 PM, Walter Bright wrote: [...] > > 4. Autodecoding is slow and has no place in high speed string processing. > > I would agree only with the amendment "...if used naively", which is

Re: The Case Against Autodecode

2016-05-26 Thread Jack Stouffer via Digitalmars-d
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu wrote: instead, it should use standard library algorithms for searching, matching etc. When needed, iterating every code unit is trivially done through indexing. For an example where the std.algorithm/range functions don't cut

Re: The Case Against Autodecode

2016-05-26 Thread Andrei Alexandrescu via Digitalmars-d
This might be a good time to discuss this a tad further. I'd appreciate if the debate stayed on point going forward. Thanks! My thesis: the D1 design decision to represent strings as char[] was disastrous and probably one of the largest weaknesses of D1. The decision in D2 to use

Re: The Case Against Autodecode

2016-05-17 Thread sarn via Digitalmars-d
On Tuesday, 17 May 2016 at 09:53:17 UTC, Kagamin wrote: With UTF-8 problems happened on a massive scale in LAMP setups: mysql used latin1 as a default encoding and almost everything worked fine. ^ latin-1 with Swedish collation rules. And even if you set the encoding to "utf8", almost

Re: The Case Against Autodecode

2016-05-17 Thread Kagamin via Digitalmars-d
On Friday, 13 May 2016 at 21:46:28 UTC, Jonathan M Davis wrote: The history of why UTF-16 was chosen isn't really relevant to my point (Win32 has the same problem as Java and for similar reasons). My point was that if you use UTF-8, then it's obvious _really_ fast when you screwed up

Re: The Case Against Autodecode

2016-05-16 Thread jmh530 via Digitalmars-d
On Sunday, 15 May 2016 at 23:10:38 UTC, Jon D wrote: Runs for each combination were done five times and the median times used. The median times and the char[] to ubyte[] ratio are below: | | |char[] | ubyte[] | | Compiler | Text type | time (ms) | time (ms) | ratio |

Re: The Case Against Autodecode

2016-05-15 Thread H. S. Teoh via Digitalmars-d
On Mon, May 16, 2016 at 12:31:04AM +, Jack Stouffer via Digitalmars-d wrote: > On Sunday, 15 May 2016 at 23:10:38 UTC, Jon D wrote: > >Given the importance of performance in the auto-decoding topic, it > >seems reasonable to quantify it. I took a stab at this. It would of > >course be prudent

Re: The Case Against Autodecode

2016-05-15 Thread Jack Stouffer via Digitalmars-d
On Sunday, 15 May 2016 at 23:10:38 UTC, Jon D wrote: Given the importance of performance in the auto-decoding topic, it seems reasonable to quantify it. I took a stab at this. It would of course be prudent to have others conduct similar analysis rather than rely on my numbers alone. Here is

Re: The Case Against Autodecode

2016-05-15 Thread Jon D via Digitalmars-d
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote: > I am as unclear about the problems of autodecoding as I am about the necessity > to remove curl. Whenever I ask I hear some arguments that work well emotionally > but are scant on

Re: The Case Against Autodecode

2016-05-15 Thread Ola Fosheim Grøstad via Digitalmars-d
On Sunday, 15 May 2016 at 01:45:25 UTC, Bill Hicks wrote: From a technical point, D is not successful, for the most part. C/C++ at least can use the excuse that they were created during a time when we didn't have the experience and the knowledge that we do now. Not really. The dominating

Re: The Case Against Autodecode

2016-05-14 Thread Bill Hicks via Digitalmars-d
On Friday, 13 May 2016 at 09:28:45 UTC, Chris wrote: PS I wonder does Bill Hicks know you're using his name? But I guess he's lost interest in this planet and happily lives on Mars now. Maybe I'm using the name to avoid being harassed. Or maybe, there are thousands of people in the world

Re: The Case Against Autodecode

2016-05-14 Thread Bill Hicks via Digitalmars-d
On Friday, 13 May 2016 at 07:26:53 UTC, poliklosio wrote: Also, you are missing the point by claiming that a technical problem is sure to kill D. Note that very successful languages like C++, python and so on also have undergone heated discussions about various features, and often live

Re: The Case Against Autodecode

2016-05-13 Thread Steven Schveighoffer via Digitalmars-d
On 5/12/16 4:15 PM, Walter Bright wrote: 10. Autodecoded arrays cannot be RandomAccessRanges, losing a key benefit of being arrays in the first place. I'll repeat what I said in the other thread. The problem isn't auto-decoding. The problem is hijacking the char[] and wchar[] (and variants)
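Point 10 can be checked at compile time. The following is a small sketch, assuming a Phobos version in which autodecoding is still in effect; it uses only the range traits from `std.range.primitives` and `byCodeUnit` from `std.utf`.

```d
import std.range.primitives : ElementType, hasLength, isRandomAccessRange;
import std.utf : byCodeUnit;

// As a range, string (immutable(char)[]) yields decoded dchar values,
// so it is treated as neither random-access nor as having a length.
static assert(is(ElementType!string == dchar));
static assert(!isRandomAccessRange!string);
static assert(!hasLength!string);

// The same bytes viewed as immutable(ubyte)[] keep all array capabilities.
static assert(isRandomAccessRange!(immutable(ubyte)[]));

// byCodeUnit wraps a string in a range of code units without decoding.
static assert(isRandomAccessRange!(typeof("".byCodeUnit)));
```

Indexing and `.length` still work on the array itself; it is only the range interface that hides them.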

Re: The Case Against Autodecode

2016-05-13 Thread Chris via Digitalmars-d
On Friday, 13 May 2016 at 14:06:28 UTC, Vladimir Panteleev wrote: On Friday, 13 May 2016 at 13:41:30 UTC, Chris wrote: PS Why do I get a "StopForumSpam error" every time I post today? Has anyone else experienced the same problem: "StopForumSpam error: Socket error: Lookup error:

Re: The Case Against Autodecode

2016-05-13 Thread Vladimir Panteleev via Digitalmars-d
On Friday, 13 May 2016 at 13:41:30 UTC, Chris wrote: PS Why do I get a "StopForumSpam error" every time I post today? Has anyone else experienced the same problem: "StopForumSpam error: Socket error: Lookup error: getaddrinfo error: Name or service not known. Please solve a CAPTCHA to

Re: The Case Against Autodecode

2016-05-13 Thread Chris via Digitalmars-d
On Friday, 13 May 2016 at 13:17:44 UTC, Walter Bright wrote: On 5/13/2016 2:12 AM, Chris wrote: If autodecode is killed, could we have a test version asap? I'd be willing to test my programs with autodecode turned off and see what happens. Others should do likewise and we could come up with a

Re: The Case Against Autodecode

2016-05-13 Thread Walter Bright via Digitalmars-d
On 5/13/2016 3:43 AM, Marc Schütz wrote: On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: 7. Autodecode cannot be used with unicode path/filenames, because it is legal (at least on Linux) to have invalid UTF-8 as filenames. It turns out in the wild that pure Unicode is not

Re: The Case Against Autodecode

2016-05-13 Thread Walter Bright via Digitalmars-d
On 5/12/2016 11:50 PM, Bill Hicks wrote: And I get called a troll and other names when I list half a dozen things wrong with D, my posts get removed/censored, etc, all because I try to inform people not to waste time with D because it's a broken and failed language. Posts that engage in

Re: The Case Against Autodecode

2016-05-13 Thread Walter Bright via Digitalmars-d
On 5/13/2016 2:12 AM, Chris wrote: If autodecode is killed, could we have a test version asap? I'd be willing to test my programs with autodecode turned off and see what happens. Others should do likewise and we could come up with a transition strategy based on what happened. You can avoid

Re: The Case Against Autodecode

2016-05-13 Thread Kagamin via Digitalmars-d
On Friday, 13 May 2016 at 10:38:09 UTC, Jonathan M Davis wrote: IIRC, Andrei talked in TDPL about how Java's choice to go with UTF-16 was worse than the choice to go with UTF-8, because it was correct in many more cases UTF-16 was a migration from UCS-2, and UCS-2 was superior at the time.

Re: The Case Against Autodecode

2016-05-13 Thread Nick Treleaven via Digitalmars-d
On Friday, 13 May 2016 at 00:47:04 UTC, Jack Stouffer wrote: If you're serious about removing auto-decoding, which I think you and others have shown has merits, you have to have THE SIMPLEST migration path ever, or you will kill D. I'm talking a simple press of a button. char[] is always

Re: The Case Against Autodecode

2016-05-13 Thread Marc Schütz via Digitalmars-d
On Friday, 13 May 2016 at 10:38:09 UTC, Jonathan M Davis wrote: Ideally, algorithms would be Unicode aware as appropriate, but the default would be to operate on code units with wrappers to handle decoding by code point or grapheme. Then it's easy to write fast code while still allowing for

Re: The Case Against Autodecode

2016-05-13 Thread Marc Schütz via Digitalmars-d
On Thursday, 12 May 2016 at 23:16:23 UTC, H. S. Teoh wrote: Therefore, autodecoding actually only produces intuitively correct results when your string has a 1-to-1 correspondence between grapheme and code point. In general, this is only true for a small subset of languages, mainly a few
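A short sketch of the grapheme versus code point mismatch described above, using a decomposed "é" (the letter e followed by U+0301, the combining acute accent). The helper names are hypothetical; `walkLength` and `byGrapheme` are the Phobos functions.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// Number of UTF-8 code units (array length).
size_t codeUnits(string s) { return s.length; }

// Number of code points, which is what autodecoding iterates over.
size_t codePoints(string s) { return s.walkLength; }

// Number of user-perceived characters (grapheme clusters).
size_t graphemes(string s) { return s.byGrapheme.walkLength; }

unittest
{
    string s = "e\u0301"; // 'e' + combining acute accent
    assert(codeUnits(s) == 3);  // 1 byte for 'e', 2 bytes for U+0301
    assert(codePoints(s) == 2); // autodecoding sees two elements
    assert(graphemes(s) == 1);  // a reader sees one character
}
```

For this input the autodecoded view matches neither the storage size nor what a reader would count.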

Re: The Case Against Autodecode

2016-05-13 Thread Chris via Digitalmars-d
On Friday, 13 May 2016 at 10:38:09 UTC, Jonathan M Davis wrote: Based on what I've seen in previous conversations on auto-decoding over the past few years (be it in the newsgroup, on github, or at dconf), most of the core devs think that auto-decoding was a major blunder that we continue to

Re: The Case Against Autodecode

2016-05-13 Thread Marc Schütz via Digitalmars-d
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: 7. Autodecode cannot be used with unicode path/filenames, because it is legal (at least on Linux) to have invalid UTF-8 as filenames. It turns out in the wild that pure Unicode is not universal - there's lots of dirty Unicode that

Re: The Case Against Autodecode

2016-05-13 Thread Jonathan M Davis via Digitalmars-d
On Thursday, May 12, 2016 13:15:45 Walter Bright via Digitalmars-d wrote: > On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote: > > I am as unclear about the problems of autodecoding as I am about the > > necessity to remove curl. Whenever I ask I hear some arguments that work > > well emotionally

Re: The Case Against Autodecode

2016-05-13 Thread Kagamin via Digitalmars-d
On Friday, 13 May 2016 at 06:50:49 UTC, Bill Hicks wrote: not to waste time with D because it's a broken and failed language. D is a better broken thing among all the broken things in this broken world, so it's to be expected to be preferred to spend time on.

Re: The Case Against Autodecode

2016-05-13 Thread Chris via Digitalmars-d
On Friday, 13 May 2016 at 06:50:49 UTC, Bill Hicks wrote: Wow, that's eleven things wrong with just one tiny element of D, with the potential to cause problems, whether fixed or not. And I get called a troll and other names when I list half a dozen things wrong with D, my posts get

Re: The Case Against Autodecode

2016-05-13 Thread Chris via Digitalmars-d
On Friday, 13 May 2016 at 01:00:54 UTC, Walter Bright wrote: On 5/12/2016 5:47 PM, Jack Stouffer wrote: D is much less popular now than was Python at the time, and Python 2 problems were more straight forward than the auto-decoding problem. You'll need a very clear migration path, years long

Re: The Case Against Autodecode

2016-05-13 Thread Ola Fosheim Grøstad via Digitalmars-d
On Friday, 13 May 2016 at 00:47:04 UTC, Jack Stouffer wrote: D is much less popular now than was Python at the time, and Python 2 problems were more straight forward than the auto-decoding problem. You'll need a very clear migration path, years long deprecations, and automatic tools in order

Re: The Case Against Autodecode

2016-05-13 Thread poliklosio via Digitalmars-d
On Friday, 13 May 2016 at 06:50:49 UTC, Bill Hicks wrote: On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: (...) Wow, that's eleven things wrong with just one tiny element of D, with the potential to cause problems, whether fixed or not. And I get called a troll and other names

Re: The Case Against Autodecode

2016-05-13 Thread Ethan Watson via Digitalmars-d
On Friday, 13 May 2016 at 06:50:49 UTC, Bill Hicks wrote: *rant* Actually, chap, it's the attitude that's the turn-off in your post there. Listing problems in order to improve them, and listing problems to convince people something is a waste of time are incompatible mindsets around here.

Re: The Case Against Autodecode

2016-05-13 Thread Bill Hicks via Digitalmars-d
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: Here are some that are not matters of opinion. 1. Ranges of characters do not autodecode, but arrays of characters do. This is a glaring inconsistency. 2. Every time one wants an algorithm to work with both strings and ranges,

Re: The Case Against Autodecode

2016-05-12 Thread Jack Stouffer via Digitalmars-d
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: 2. Every time one wants an algorithm to work with both strings and ranges, you wind up special casing the strings to defeat the autodecoding, or to decode the ranges. Having to constantly special case it makes for more special

Re: The Case Against Autodecode

2016-05-12 Thread Walter Bright via Digitalmars-d
On 5/12/2016 5:47 PM, Jack Stouffer wrote: D is much less popular now than was Python at the time, and Python 2 problems were more straight forward than the auto-decoding problem. You'll need a very clear migration path, years long deprecations, and automatic tools in order to make the

Re: The Case Against Autodecode

2016-05-12 Thread Jack Stouffer via Digitalmars-d
On Friday, 13 May 2016 at 00:47:04 UTC, Jack Stouffer wrote: I'm not exaggerating here. Python, a language which was much more popular than D at the time, came out with two versions in 2008: Python 2.7 which had numerous unicode problems, and Python 3.0 which fixed those problems. Almost eight

Re: The Case Against Autodecode

2016-05-12 Thread Walter Bright via Digitalmars-d
On 5/12/2016 4:52 PM, Marco Leise wrote: I'd like 'string' to mean valid UTF-8 in D as far as the encoding goes. A filename should not be a 'string'. I would have agreed with you in the past, but more and more it just doesn't seem practical. UTF-8 is dirty in the real world, and D code will

Re: The Case Against Autodecode

2016-05-12 Thread Jack Stouffer via Digitalmars-d
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: Here are some that are not matters of opinion. If you're serious about removing auto-decoding, which I think you and others have shown has merits, you have to have THE SIMPLEST migration path ever, or you will kill D. I'm talking

Re: The Case Against Autodecode

2016-05-12 Thread Marco Leise via Digitalmars-d
On Thu, 12 May 2016 13:15:45 -0700, Walter Bright wrote: > 7. Autodecode cannot be used with unicode path/filenames, because it is legal > (at least on Linux) to have invalid UTF-8 as filenames. More precisely they are byte strings with '/' reserved to separate

Re: The Case Against Autodecode

2016-05-12 Thread Walter Bright via Digitalmars-d
On 5/12/2016 4:23 PM, Daniel Kozak wrote: But what really pisses me off is that the current string type is an alias to immutable(char)[] (so it is not usable at all). This is a real problem for me, because it makes working with arrays of chars almost impossible. Even char[] is unusable. So I am forced to

Re: The Case Against Autodecode

2016-05-12 Thread Daniel Kozak via Digitalmars-d
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote: > I am as unclear about the problems of autodecoding as I am about the necessity > to remove curl. Whenever I ask I hear some arguments that work well emotionally > but are scant on

Re: The Case Against Autodecode

2016-05-12 Thread H. S. Teoh via Digitalmars-d
On Thu, May 12, 2016 at 08:24:23PM +, Vladimir Panteleev via Digitalmars-d wrote: [...] > 12. The result of autodecoding, a range of Unicode code points, is > rarely actually useful, and code that relies on autodecoding is rarely > actually, universally correct. Graphemes are occasionally

Re: The Case Against Autodecode

2016-05-12 Thread H. S. Teoh via Digitalmars-d
On Thu, May 12, 2016 at 08:24:23PM +, Vladimir Panteleev via Digitalmars-d wrote: > On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: [...] > >1. Ranges of characters do not autodecode, but arrays of characters > >do. This is a glaring inconsistency. > > > >2. Every time one

The Case Against Autodecode

2016-05-12 Thread Walter Bright via Digitalmars-d
On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote: > I am as unclear about the problems of autodecoding as I am about the necessity > to remove curl. Whenever I ask I hear some arguments that work well emotionally > but are scant on reason and engineering. Maybe it's time to rehash them? I just >

Re: The Case Against Autodecode

2016-05-12 Thread Vladimir Panteleev via Digitalmars-d
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote: On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote: > I am as unclear about the problems of autodecoding as I am about the necessity > to remove curl. Whenever I ask I hear some arguments that work well emotionally > but are scant on
