Re: char array weirdness

2016-03-31 Thread Jack Stouffer via Digitalmars-d-learn
On Thursday, 31 March 2016 at 12:49:57 UTC, ag0aep6g wrote: I get these timings then: auto-decoding 642 ms, 969 μs, and 1 hnsec byCodeUnit 84 ms, 980 μs, and 3 hnsecs And 643 / 85 ≅ 8. Ok, so not as bad as 100x, but still not great by any means. I think I

Re: char array weirdness

2016-03-31 Thread ag0aep6g via Digitalmars-d-learn
On 31.03.2016 07:40, Jack Stouffer wrote: $ ldc2 -O3 -release -boundscheck=off test.d $ ./test auto-decoding 1 sec, 757 ms, and 946 μs byCodeUnit 87 ms, 731 μs, and 8 hnsecs byGrapheme 14 secs, 769 ms, 796 μs, and 6 hnsecs So the auto-decoding version takes about twenty

Re: char array weirdness

2016-03-30 Thread Jack Stouffer via Digitalmars-d-learn
On Wednesday, 30 March 2016 at 22:49:24 UTC, ag0aep6g wrote: When byCodeUnit takes no time at all, isn't 1µs infinite times slower, instead of 100 times? And I think byCodeUnit's 1µs is so low that noise is going to mess with any ratios you make. It's not that it's taking no time at all,

Re: char array weirdness

2016-03-30 Thread ag0aep6g via Digitalmars-d-learn
On 30.03.2016 19:30, Jack Stouffer wrote: Just to drive this point home, I made a very simple benchmark. Iterating over code points when you don't need to is 100x slower than iterating over code units. [...] enum testCount = 1_000_000; enum var = "Lorem ipsum dolor sit amet, consectetur
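For readers who want to reproduce this kind of comparison, here is a minimal sketch. The benchmark in the message above is cut off, so this is not the original code: the sample text, the iteration count, and the helper names `viaAutoDecoding` and `viaCodeUnits` are assumptions made for illustration. It uses `benchmark` from `std.datetime.stopwatch`, which is where current Phobos keeps it.

```d
import std.datetime.stopwatch : benchmark;
import std.range.primitives : empty, front, popFront;
import std.stdio : writeln;
import std.utf : byCodeUnit;

// Assumed sample text; the string used in the thread is truncated.
static immutable string sample = "Lorem ipsum dolor sit amet";

// Walks the string with the range primitives, which decode UTF-8
// into dchar code points on every call to front/popFront.
size_t viaAutoDecoding(string s)
{
    size_t total;
    for (auto r = s; !r.empty; r.popFront())
        total += r.front;
    return total;
}

// Walks the same data as raw char code units, with no decoding.
size_t viaCodeUnits(string s)
{
    size_t total;
    for (auto r = s.byCodeUnit; !r.empty; r.popFront())
        total += r.front;
    return total;
}

void main()
{
    // Assumed iteration count, chosen only to keep the run short.
    auto results = benchmark!(() => viaAutoDecoding(sample),
                              () => viaCodeUnits(sample))(100_000);
    writeln("auto-decoding: ", results[0]);
    writeln("byCodeUnit:    ", results[1]);
}
```

As the replies in this thread point out, very small timings are dominated by noise, and an optimizing compiler may remove work whose result is unused, so any ratio from a sketch like this should be read with caution.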

Re: char array weirdness

2016-03-30 Thread Jack Stouffer via Digitalmars-d-learn
On Wednesday, 30 March 2016 at 05:16:04 UTC, H. S. Teoh wrote: If we didn't have autodecoding, it would be a simple matter of searching for sentinel substrings. This also indicates that most of the work done by autodecoding is unnecessary -- it's wasted work since most of the string data is

Re: char array weirdness

2016-03-29 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Mar 29, 2016 at 08:05:29PM -0400, Steven Schveighoffer via Digitalmars-d-learn wrote: [...] > Phobos treats narrow strings (wchar[], char[]) as ranges of dchar. It > was discovered that auto decoding strings isn't always the smartest > thing to do, especially for performance. > > So you

Re: char array weirdness

2016-03-29 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Mar 30, 2016 at 03:22:48AM +, Jack Stouffer via Digitalmars-d-learn wrote: > On Tuesday, 29 March 2016 at 23:42:07 UTC, H. S. Teoh wrote: > >Believe it or not, it was only last year (IIRC, maybe the year > >before) that Walter "discovered" that Phobos does autodecoding, and > >got

Re: char array weirdness

2016-03-29 Thread Jack Stouffer via Digitalmars-d-learn
On Tuesday, 29 March 2016 at 23:42:07 UTC, H. S. Teoh wrote: Believe it or not, it was only last year (IIRC, maybe the year before) that Walter "discovered" that Phobos does autodecoding, and got pretty upset over it. If even Walter wasn't aware of this for that long... The link (I think

Re: char array weirdness

2016-03-29 Thread Basile B. via Digitalmars-d-learn
On Wednesday, 30 March 2016 at 00:05:29 UTC, Steven Schveighoffer wrote: On 3/29/16 7:42 PM, H. S. Teoh via Digitalmars-d-learn wrote: On Tue, Mar 29, 2016 at 11:15:26PM +, Basile B. via Digitalmars-d-learn wrote: On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: void main ()

Re: char array weirdness

2016-03-29 Thread Steven Schveighoffer via Digitalmars-d-learn
On 3/29/16 7:42 PM, H. S. Teoh via Digitalmars-d-learn wrote: On Tue, Mar 29, 2016 at 11:15:26PM +, Basile B. via Digitalmars-d-learn wrote: On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: void main () { import std.range.primitives; char[] val = ['1', '0', 'h', '3',

Re: char array weirdness

2016-03-29 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Mar 29, 2016 at 11:15:26PM +, Basile B. via Digitalmars-d-learn wrote: > On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: > >void main () { > >import std.range.primitives; > >char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's']; > >pragma(msg,

Re: char array weirdness

2016-03-29 Thread Jack Stouffer via Digitalmars-d-learn
On Tuesday, 29 March 2016 at 23:15:26 UTC, Basile B. wrote: I've seen you so many times as a reviewer on dlang that I believe this Q is a joke. Even if obviously nobody can know everything... https://www.youtube.com/watch?v=l97MxTx0nzs seriously you didn't know that auto decoding is on and that

Re: char array weirdness

2016-03-29 Thread Basile B. via Digitalmars-d-learn
On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: void main () { import std.range.primitives; char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's']; pragma(msg, ElementEncodingType!(typeof(val))); pragma(msg, typeof(val.front)); } prints char dchar
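The behaviour in the quoted snippet can be checked at compile time. The sketch below restates it with `static assert` and adds `std.utf.byCodeUnit`, the workaround recommended elsewhere in this thread; the helper name `firstCodeUnit` is hypothetical, and the results assume a Phobos build with auto-decoding enabled, which is the default.

```d
import std.range.primitives : ElementEncodingType, ElementType, front;
import std.utf : byCodeUnit;

// What the array stores: UTF-8 code units.
static assert(is(ElementEncodingType!(char[]) == char));
// What the range API reports as the element type: decoded code points.
static assert(is(ElementType!(char[]) == dchar));

// Hypothetical helper: read the first element without decoding it.
char firstCodeUnit(char[] val)
{
    return val.byCodeUnit.front;
}

void main()
{
    char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's'];

    // front decodes, so its type is dchar even though val holds chars.
    static assert(is(typeof(val.front) == dchar));
    assert(val.front == '1');

    // The byCodeUnit wrapper yields the stored char directly.
    assert(firstCodeUnit(val) == '1');
}
```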

Re: char array weirdness

2016-03-29 Thread Marco Leise via Digitalmars-d-learn
On Mon, 28 Mar 2016 16:29:50 -0700, "H. S. Teoh via Digitalmars-d-learn" wrote: > […] your diacritics may get randomly reattached to > stuff they weren't originally attached to, or you may end up with wrong > sequences of Unicode code points (e.g. diacritics
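One way to see the kind of problem described in that quote is a sketch like the following. The message is truncated, so this is an illustration chosen here, not the example from the thread: it takes "e" followed by a combining acute accent (U+0301) and shows that code points are still not user-perceived characters. The helper name `graphemeCount` is hypothetical.

```d
import std.algorithm.comparison : equal;
import std.range : retro;
import std.range.primitives : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

// Hypothetical helper: number of grapheme clusters in a string.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    string s = "e\u0301"; // 'e' plus COMBINING ACUTE ACCENT

    assert(s.byCodeUnit.walkLength == 3); // 1 byte for 'e', 2 for U+0301
    assert(s.walkLength == 2);            // two code points after decoding
    assert(graphemeCount(s) == 1);        // one grapheme cluster

    // Reversing by code point moves the accent onto a different letter:
    // 'e' + accent + 'a' becomes 'a' + accent + 'e'.
    assert(equal("e\u0301a".retro, "a\u0301e"));
}
```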

Re: char array weirdness

2016-03-29 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, March 28, 2016 16:29:50 H. S. Teoh via Digitalmars-d-learn wrote: > On Mon, Mar 28, 2016 at 04:07:22PM -0700, Jonathan M Davis via > Digitalmars-d-learn wrote: [...] > > > The range API considers all strings to have an element type of dchar. > > char, wchar, and dchar are UTF code units

Re: char array weirdness

2016-03-28 Thread Steven Schveighoffer via Digitalmars-d-learn
On 3/28/16 7:06 PM, Anon wrote: The compiler doesn't know that, and it isn't true in general. You could have, for example, U+3042 in your char[]. That would be encoded as three chars. It wouldn't make sense (or be correct) for val.front to yield '\xe3' (the first byte of U+3042 in UTF-8). I
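The U+3042 case from the quoted message can be written out as a short sketch. The helper names `codeUnitCount` and `codePointCount` are made up for this example; the rest uses only `std.range.primitives` and `std.utf.byCodeUnit`.

```d
import std.range.primitives : front, walkLength;
import std.utf : byCodeUnit;

// Hypothetical helper: number of UTF-8 code units (bytes) in s.
size_t codeUnitCount(string s)
{
    return s.byCodeUnit.walkLength;
}

// Hypothetical helper: number of code points, obtained by decoding.
size_t codePointCount(string s)
{
    return s.walkLength;
}

void main()
{
    string s = "\u3042"; // HIRAGANA LETTER A

    assert(codeUnitCount(s) == 3);  // encoded as E3 81 82 in UTF-8
    assert(codePointCount(s) == 1); // but it is a single code point

    assert(s[0] == 0xE3);           // indexing gives the raw first byte
    assert(s.front == '\u3042');    // front gives the decoded code point
}
```

Indexing and `.length` always work on code units, while `front` and `walkLength` on a plain string decode; that difference is the source of the confusion in the original question.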

Re: char array weirdness

2016-03-28 Thread Jack Stouffer via Digitalmars-d-learn
On Monday, 28 March 2016 at 23:07:22 UTC, Jonathan M Davis wrote: ... Thanks for the detailed responses. I think I'll compile this info and put it in a blog post so people can just point to it when someone else is confused.

Re: char array weirdness

2016-03-28 Thread H. S. Teoh via Digitalmars-d-learn
On Mon, Mar 28, 2016 at 04:07:22PM -0700, Jonathan M Davis via Digitalmars-d-learn wrote: [...] > The range API considers all strings to have an element type of dchar. > char, wchar, and dchar are UTF code units - UTF-8, UTF-16, and UTF-32 > respectively. One or more code units make up a code

Re: char array weirdness

2016-03-28 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, March 28, 2016 16:02:26 H. S. Teoh via Digitalmars-d-learn wrote: > For the time being, I'd recommend std.utf.byCodeUnit as a workaround. Yeah, though as I've started using it, I've quickly found that enough of Phobos doesn't support it yet, that it's problematic. e.g.

Re: char array weirdness

2016-03-28 Thread ag0aep6g via Digitalmars-d-learn
On 29.03.2016 00:49, Jack Stouffer wrote: But the value fits into a char; a dchar is a waste of space. Why on Earth would a different type be given for the front value than the type of the elements themselves? UTF-8 strings are decoded by the range primitives. That is, `front` returns one

Re: char array weirdness

2016-03-28 Thread Anon via Digitalmars-d-learn
On Monday, 28 March 2016 at 23:06:49 UTC, Anon wrote: Any because you're using ranges, *And because you're using ranges,

Re: char array weirdness

2016-03-28 Thread Anon via Digitalmars-d-learn
On Monday, 28 March 2016 at 22:49:28 UTC, Jack Stouffer wrote: On Monday, 28 March 2016 at 22:43:26 UTC, Anon wrote: On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: void main () { import std.range.primitives; char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's'];

Re: char array weirdness

2016-03-28 Thread H. S. Teoh via Digitalmars-d-learn
On Mon, Mar 28, 2016 at 10:49:28PM +, Jack Stouffer via Digitalmars-d-learn wrote: > On Monday, 28 March 2016 at 22:43:26 UTC, Anon wrote: > >On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: > >>void main () { > >>import std.range.primitives; > >>char[] val = ['1', '0',

Re: char array weirdness

2016-03-28 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, March 28, 2016 22:34:31 Jack Stouffer via Digitalmars-d-learn wrote: > void main () { > import std.range.primitives; > char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's']; > pragma(msg, ElementEncodingType!(typeof(val))); > pragma(msg, typeof(val.front)); > }

Re: char array weirdness

2016-03-28 Thread Jack Stouffer via Digitalmars-d-learn
On Monday, 28 March 2016 at 22:43:26 UTC, Anon wrote: On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: void main () { import std.range.primitives; char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's']; pragma(msg, ElementEncodingType!(typeof(val)));

Re: char array weirdness

2016-03-28 Thread Anon via Digitalmars-d-learn
On Monday, 28 March 2016 at 22:34:31 UTC, Jack Stouffer wrote: void main () { import std.range.primitives; char[] val = ['1', '0', 'h', '3', '6', 'm', '2', '8', 's']; pragma(msg, ElementEncodingType!(typeof(val))); pragma(msg, typeof(val.front)); } prints char dchar