Re: Same process to different results?

2015-07-01 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jul 01, 2015 at 02:14:49PM -0400, Steven Schveighoffer via 
Digitalmars-d-learn wrote:
> On 7/1/15 1:44 PM, Steven Schveighoffer wrote:
> 
> >Schizophrenia of Phobos.
> >
> >Phobos thinks a string is a range of dchar instead of a range of
> >char.  So what cycle, take, and array all output are dchar ranges and
> >arrays.
> >
> >When you cast the dchar[] result to a string (which is an
> >immutable(char)[]), it then treats all the zero bytes in each dchar
> >element as '\0', apparently printing a blank.
> 
> This has to be one of the most obvious cases I've ever seen that
> Phobos treating string as a range of dchar was the wrong decision.
> That one can't use ranges to make a new string is ridiculous. Just the
> thought of "fixing" this by re-encoding...
[...]

Yeah, although Andrei has vetoed all suggestions of getting rid of
autodecoding, this is one of the glaring cases where it's obviously a
bad idea.

It almost makes me want to create my own custom string type that serves
up char instead of dchar.
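
For what it's worth, here is a minimal sketch of what "serving up char" could look like without a custom type, assuming a Phobos release that ships std.utf.byCodeUnit. The helper name cycleUnits is made up for illustration:

```d
import std.array : array;
import std.conv : to;
import std.range : cycle, take;
import std.stdio : writeln;
import std.traits : Unqual;
import std.utf : byCodeUnit;

// Hypothetical helper: repeat `s` cyclically until `n` UTF-8 code units
// have been produced, without decoding to dchar along the way.
string cycleUnits(string s, size_t n)
{
    auto units = s.byCodeUnit.cycle.take(n).array;
    // The elements are (possibly qualified) char, not dchar.
    static assert(is(Unqual!(typeof(units[0])) == char));
    return units.to!string;
}

void main()
{
    writeln(cycleUnits("AaA", 10)); // prints AaAAaAAaAA
}
```

One caveat with this approach: take counts code units, so with non-ASCII input it can cut a multi-byte UTF-8 sequence in the middle.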


T

-- 
There are four kinds of lies: lies, damn lies, and statistics.


Re: Same process to different results?

2015-07-01 Thread Steven Schveighoffer via Digitalmars-d-learn

On 7/1/15 1:44 PM, Steven Schveighoffer wrote:


Schizophrenia of Phobos.

Phobos thinks a string is a range of dchar instead of a range of char.
So what cycle, take, and array all output are dchar ranges and arrays.

When you cast the dchar[] result to a string (which is an
immutable(char)[]), it then treats all the zero bytes in each dchar
element as '\0', apparently printing a blank.


This has to be one of the most obvious cases I've ever seen that Phobos 
treating string as a range of dchar was the wrong decision. That one 
can't use ranges to make a new string is ridiculous. Just the thought of 
"fixing" this by re-encoding...


-Steve


Re: Same process to different results?

2015-07-01 Thread Taylor Hillegeist via Digitalmars-d-learn
On Wednesday, 1 July 2015 at 17:00:51 UTC, Taylor Hillegeist 
wrote:

When I run the code (compiled on DMD 2.067.1):


---
import std.algorithm;
import std.stdio;
import std.range;

string A="AaA";
string B="BbBb";
string C="CcCcC";

void main(){
int L=25;

  int seg1len=(L-B.length)/2;
  int seg2len=B.length;
  int seg3len=L-seg1len-seg2len;

  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array).writeln;

  string q = cast(string)
  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array);

  q.writeln;

}
---

I get a weird result of
AaAAaAAaAABbBbCcCcCCcCcCC
A   a   A   A   a   A   A   a   A   A   B   b   B   b   C   c   C   c   C   C   c   C   c   C   C

Any ideas why?


Some way or another the type was converted to a dchar[]
during this process:

 A.cycle.take(seg1len).array
~B.cycle.take(seg2len).array
~C.cycle.take(seg3len).array

Why would it change the type so sneakily? Unless maybe it's the 
default behavior with strings, since 32 bits => (typically) one 
grapheme?

I bet cycle did this.


Re: Same process to different results?

2015-07-01 Thread anonymous via Digitalmars-d-learn
On Wednesday, 1 July 2015 at 17:13:03 UTC, Taylor Hillegeist 
wrote:

  string q = cast(string)
  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array);
  q.writeln;

I was wondering if it might be the cast?


Yes, the cast is wrong. You're reinterpreting (not converting) an 
array of `dchar`s (UTF-32 code units) as an array of `char`s 
(UTF-8 code units).
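
A small sketch of what "reinterpreting" means here, under the assumption that the array cast keeps the same bytes and only rescales the length (the variable names are illustrative):

```d
import std.stdio : writeln;

void main()
{
    dchar[] wide = "Aa"d.dup;           // 2 UTF-32 code units, 8 bytes

    // The cast does not transcode. It views the same 8 bytes as 8 chars.
    auto reinterpreted = cast(string) wide;
    assert(reinterpreted.length == 4 * wide.length);

    // Each ASCII dchar contributes one letter byte and three zero bytes.
    size_t zeros = 0;
    foreach (char c; reinterpreted)
        if (c == '\0')
            ++zeros;
    assert(zeros == 6);

    writeln(reinterpreted.length, " chars, ", zeros, " of them NUL");
}
```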


If you print the numeric values of the string, e.g. via 
std.string.representation, you can see that every actual 
character has three null bytes following it:


import std.string: representation;
writeln(q.representation);

[65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 
65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 
66, 0, 0, 0, 98, 0, 0, 0, 66, 0, 0, 0, 98, 0, 0, 0, 67, 0, 0, 0, 
99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0, 
99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0]



Use std.conv.to for less surprising conversions. And don't use 
casts unless you know exactly what you're doing.
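
As a sketch, the original expression with the cast swapped for std.conv.to could look like this. The helper name build is made up, and the lengths are size_t instead of int, which is an assumption on my part so that it also compiles on 64-bit targets:

```d
import std.array : array;
import std.conv : to;
import std.range : cycle, take;
import std.stdio : writeln;

// Hypothetical helper wrapping the three-segment expression from the
// original post. Assumes total >= b.length, otherwise the unsigned
// subtraction wraps around.
string build(string a, string b, string c, size_t total)
{
    immutable seg1len = (total - b.length) / 2;
    immutable seg2len = b.length;
    immutable seg3len = total - seg1len - seg2len;

    // The concatenation is a dchar[]; to!string transcodes it to UTF-8
    // instead of reinterpreting its bytes.
    return (a.cycle.take(seg1len).array
          ~ b.cycle.take(seg2len).array
          ~ c.cycle.take(seg3len).array).to!string;
}

void main()
{
    writeln(build("AaA", "BbBb", "CcCcC", 25));
    // prints AaAAaAAaAABbBbCcCcCCcCcCC
}
```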


Re: Same process to different results?

2015-07-01 Thread Steven Schveighoffer via Digitalmars-d-learn

On 7/1/15 1:00 PM, Taylor Hillegeist wrote:

When I run the code (compiled on DMD 2.067.1):


---
import std.algorithm;
import std.stdio;
import std.range;

string A="AaA";
string B="BbBb";
string C="CcCcC";

void main(){
 int L=25;

   int seg1len=(L-B.length)/2;
   int seg2len=B.length;
   int seg3len=L-seg1len-seg2len;

   (A.cycle.take(seg1len).array
   ~B.cycle.take(seg2len).array
   ~C.cycle.take(seg3len).array).writeln;

   string q = cast(string)
   (A.cycle.take(seg1len).array
   ~B.cycle.take(seg2len).array
   ~C.cycle.take(seg3len).array);

   q.writeln;

}
---

I get a weird result of
AaAAaAAaAABbBbCcCcCCcCcCC
A   a   A   A   a   A   A   a   A   A   B   b   B   b   C   c   C   c   C   C   c   C   c   C   C

Any ideas why?


Schizophrenia of Phobos.

Phobos thinks a string is a range of dchar instead of a range of char. 
So what cycle, take, and array all output are dchar ranges and arrays.


When you cast the dchar[] result to a string (which is an 
immutable(char)[]), it then treats all the zero bytes in each dchar 
element as '\0', apparently printing a blank.
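
The element types can be checked at compile time. A minimal sketch (the helper name firstUnits is only for illustration):

```d
import std.array : array;
import std.range : cycle, take, ElementType;

// Hypothetical helper: the same cycle/take/array chain as in the
// original post, on a single string.
auto firstUnits(string s, size_t n)
{
    return s.cycle.take(n).array;
}

void main()
{
    // As a range, a string yields decoded code points.
    static assert(is(ElementType!string == dchar));

    // So the array built from it is dchar[], not string.
    auto piece = firstUnits("AaA", 4);
    static assert(is(typeof(piece) == dchar[]));
    assert(piece == "AaAA"d);
}
```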


-Steve


Re: Same process to different results?

2015-07-01 Thread Taylor Hillegeist via Digitalmars-d-learn

On Wednesday, 1 July 2015 at 17:06:01 UTC, Adam D. Ruppe wrote:
I betcha it is because A, B, and C are modified by the first 
pass. A lot of the range functions consume their input.


Running them one at a time produces the same result.

For some reason:

  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array).writeln;

is different from:

  string q = cast(string)
  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array);
  q.writeln;

I was wondering if it might be the cast?
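
On the consumption theory, a quick sketch of why it does not apply here: cycle receives a copy of the string slice, so the original variable is left alone. The names below are illustrative only.

```d
import std.array : array;
import std.range : cycle, take;

void main()
{
    string a = "AaA";

    // Run the same chain twice on the same variable.
    auto first  = a.cycle.take(5).array;
    auto second = a.cycle.take(5).array;

    assert(a == "AaA");        // the source string is untouched
    assert(first == second);   // both passes yield the same elements
    assert(first == "AaAAa"d);
}
```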


Re: Same process to different results?

2015-07-01 Thread Adam D. Ruppe via Digitalmars-d-learn
I betcha it is because A, B, and C are modified by the first 
pass. A lot of the range functions consume their input.


Same process to different results?

2015-07-01 Thread Taylor Hillegeist via Digitalmars-d-learn

When I run the code (compiled on DMD 2.067.1):


---
import std.algorithm;
import std.stdio;
import std.range;

string A="AaA";
string B="BbBb";
string C="CcCcC";

void main(){
int L=25;

  int seg1len=(L-B.length)/2;
  int seg2len=B.length;
  int seg3len=L-seg1len-seg2len;

  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array).writeln;

  string q = cast(string)
  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array);

  q.writeln;

}
---

I get a weird result of
AaAAaAAaAABbBbCcCcCCcCcCC
A   a   A   A   a   A   A   a   A   A   B   b   B   b   C   c   C   c   C   C   c   C   c   C   C


Any ideas why?