Re: shuffle a character array

2016-07-21 Thread pineapple via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 18:32:15 UTC, Jesse Phillips wrote:
I think you mean that your range library treats them as arrays 
of code units, meaning your library will break (some) unicode 
strings.


Right - I disagree with the assessment that all (or even most) 
char[] types are intended to represent unicode strings, rather 
than arrays containing chars.


If you want your array to be interpreted as a unicode string, 
then you should use std.uni's byGrapheme or similar functions.
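
For illustration, here is a minimal sketch of the three ways a char[] can be counted: as code units, as code points, and as graphemes. byGrapheme comes from std.uni and byCodeUnit from std.utf; the three helper function names are made up for this example and are not part of Phobos.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

/// Hypothetical helper: number of UTF-8 code units (bytes) in the text.
size_t codeUnitCount(const(char)[] text)
{
    return text.byCodeUnit.walkLength;
}

/// Hypothetical helper: number of code points, using Phobos auto-decoding.
size_t codePointCount(const(char)[] text)
{
    return text.walkLength;
}

/// Hypothetical helper: number of user-perceived characters (grapheme clusters).
size_t graphemeCount(const(char)[] text)
{
    return text.byGrapheme.walkLength;
}

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
    const(char)[] text = "e\u0301";
    assert(codeUnitCount(text) == 3);  // 1 byte for 'e', 2 bytes for U+0301
    assert(codePointCount(text) == 2); // two code points
    assert(graphemeCount(text) == 1);  // one grapheme
}
```

Which of the three counts is "right" depends on what the array is meant to hold, which is the point under dispute in this thread.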


Re: shuffle a character array

2016-07-20 Thread Mike Parker via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 16:44:11 UTC, ketmar wrote:

On Wednesday, 20 July 2016 at 13:33:34 UTC, Mike Parker wrote:

There is no auto-decoding going on here,

...

as char[] and wchar[] are rejected outright since they are not 
considered random access ranges.

...due to autodecoding.


No, due to them being multi-byte formats. I don't see what auto 
decoding has to do with it. That's a separate concept. We could 
take auto decoding out of Phobos and still disqualify them as 
random access ranges.


Re: shuffle a character array

2016-07-20 Thread Jesse Phillips via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 16:03:27 UTC, pineapple wrote:

On Wednesday, 20 July 2016 at 13:33:34 UTC, Mike Parker wrote:
There is no auto-decoding going on here, as char[] and wchar[] 
are rejected outright since they are not considered random 
access ranges.


They are considered random access ranges by my ranges library, 
because they are treated as arrays of characters and not as 
unicode strings.


I think you mean that your range library treats them as arrays of 
code units, meaning your library will break (some) unicode 
strings.


Note that auto decoding and being a random access range are 
different things. The isRandomAccessRange check has to 
special-case "narrow" strings, else they would be considered 
random access even though front automatically decodes.


static assert(!isNarrowString!R);
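
The effect of that check can be observed from user code. The following is only a sketch that queries the public traits; it does not reproduce the Phobos implementation.

```d
import std.range.primitives : isBidirectionalRange, isRandomAccessRange;
import std.traits : isNarrowString;

// Narrow strings (UTF-8 and UTF-16) are bidirectional ranges only.
static assert(isNarrowString!(char[]));
static assert(isNarrowString!(wchar[]));
static assert(isNarrowString!string);
static assert(!isRandomAccessRange!(char[]));
static assert(!isRandomAccessRange!(wchar[]));
static assert(!isRandomAccessRange!string);
static assert(isBidirectionalRange!(char[]));

// UTF-32 and plain integer arrays have one element per index,
// so they qualify as random access ranges.
static assert(!isNarrowString!(dchar[]));
static assert(isRandomAccessRange!(dchar[]));
static assert(isRandomAccessRange!(ubyte[]));

void main() {}
```

If this compiles, every assertion held, which is why randomShuffle accepts dchar[] and ubyte[] but rejects char[].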


Re: shuffle a character array

2016-07-20 Thread Ali Çehreli via Digitalmars-d-learn

On 07/20/2016 10:40 AM, Jack Stouffer wrote:

On Wednesday, 20 July 2016 at 17:31:18 UTC, Ali Çehreli wrote:

making it impossible to access randomly


making it impossible to access randomly __correctly__, unless you're
safely assuming there's only ASCII in your string.


Yes, perhaps I should have said "making it not meaningful to access 
randomly" (in general, as you note).


Ali



Re: shuffle a character array

2016-07-20 Thread Jack Stouffer via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 17:31:18 UTC, Ali Çehreli wrote:

making it impossible to access randomly


making it impossible to access randomly __correctly__, unless 
you're safely assuming there's only ASCII in your string.


Re: shuffle a character array

2016-07-20 Thread ketmar via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 17:31:18 UTC, Ali Çehreli wrote:
I think that both their not being random access ranges and the 
presence of auto-decoding in Phobos are design decisions due to 
the fact that char[] is a multi-byte encoding.


Phobos could choose not to auto-decode but char[] would still 
be multi-byte, making it impossible to access randomly.


but it does happen that we have autodecoding, and 
non-random-access char ranges, and it is clearly tied. so, 
leaving aside "what if..." things, we can say that it is 
autodecoding issue. ;-)


Re: shuffle a character array

2016-07-20 Thread Ali Çehreli via Digitalmars-d-learn

On 07/20/2016 09:44 AM, ketmar wrote:

On Wednesday, 20 July 2016 at 13:33:34 UTC, Mike Parker wrote:

There is no auto-decoding going on here,

...


as char[] and wchar[] are rejected outright since they are not
considered random access ranges.

...due to autodecoding.


I think that both their not being random access ranges and the 
presence of auto-decoding in Phobos are design decisions due to the 
fact that char[] is a multi-byte encoding.


Phobos could choose not to auto-decode but char[] would still be 
multi-byte, making it impossible to access randomly.


Ali



Re: shuffle a character array

2016-07-20 Thread ag0aep6g via Digitalmars-d-learn

On 07/20/2016 06:18 PM, Mike Parker wrote:

The relevant lines I quoted from the docs above explain quite clearly
that it's because they are multi-byte formats. Indexing them is not
inefficient, it simply makes no sense. What does it mean to take the
value at index i when it is part of a multi-byte sequence that continues
at index i+1? Auto-decoding has nothing to do with it.


Without auto decoding, char[] would (most probably) be a random access 
range of code units. Taking the value at index i would return the code 
unit at index i, like it does for the array.


It's not that way, because narrow strings are decoded by the range 
primitives (auto decoding).
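
A small sketch of that difference, assuming a UTF-8 char[] that contains one two-byte character: array indexing hands back code units, the range primitive front hands back a decoded code point, and std.utf.byCodeUnit gives a range view that does not decode.

```d
import std.range.primitives : ElementType, front;
import std.utf : byCodeUnit;

void main()
{
    // U+00E9 (e with acute accent) takes two bytes in UTF-8: 0xC3 0xA9.
    char[] text = "h\u00E9llo".dup;

    // Array indexing yields a code unit (char), even in the middle
    // of a multi-byte sequence.
    static assert(is(typeof(text[1]) == char));
    assert(text[1] == 0xC3);
    assert(text[2] == 0xA9);

    // The range primitive front decodes and yields a code point (dchar).
    static assert(is(ElementType!(char[]) == dchar));
    auto first = text.front;
    static assert(is(typeof(first) == dchar));
    assert(first == 'h');

    // byCodeUnit opts out of decoding: the elements are char again.
    auto units = text.byCodeUnit;
    static assert(is(ElementType!(typeof(units)) == char));
    assert(units.length == 6);
    assert(units[2] == 0xA9);
}
```

So the array itself is indexable; it is only the range interface that Phobos layers on top of narrow strings that refuses random access.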


Re: shuffle a character array

2016-07-20 Thread ketmar via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 13:33:34 UTC, Mike Parker wrote:

There is no auto-decoding going on here,

...

as char[] and wchar[] are rejected outright since they are not 
considered random access ranges.

...due to autodecoding.


Re: shuffle a character array

2016-07-20 Thread Mike Parker via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 16:08:26 UTC, pineapple wrote:



Pardon my being scatterbrained (and there not being an "edit 
post" function) - you're referring to phobos not considering 
char[] and wchar[] to have random access? The reason they are 
not considered to have random access is because they are 
auto-decoded by other functions that handle them, and the 
auto-decoding makes random access inefficient, not because 
randomShuffle itself auto-decodes them.


The relevant lines I quoted from the docs above explain quite 
clearly that it's because they are multi-byte formats. Indexing 
them is not inefficient, it simply makes no sense. What does it 
mean to take the value at index i when it is part of a multi-byte 
sequence that continues at index i+1? Auto-decoding has nothing 
to do with it.


Re: shuffle a character array

2016-07-20 Thread pineapple via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 16:04:50 UTC, pineapple wrote:

On Wednesday, 20 July 2016 at 16:03:27 UTC, pineapple wrote:

On Wednesday, 20 July 2016 at 13:33:34 UTC, Mike Parker wrote:
There is no auto-decoding going on here, as char[] and 
wchar[] are rejected outright since they are not considered 
random access ranges.


They are considered random access ranges by my ranges library, 
because they are treated as arrays of characters and not as 
unicode strings.


On second thought that's not even relevant - the linked-to 
module performs an out-of-place shuffle and so does not even 
require the input range to have random access.


Pardon my being scatterbrained (and there not being an "edit 
post" function) - you're referring to phobos not considering 
char[] and wchar[] to have random access? The reason they are not 
considered to have random access is because they are auto-decoded 
by other functions that handle them, and the auto-decoding makes 
random access inefficient, not because randomShuffle itself 
auto-decodes them.


Re: shuffle a character array

2016-07-20 Thread pineapple via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 16:03:27 UTC, pineapple wrote:

On Wednesday, 20 July 2016 at 13:33:34 UTC, Mike Parker wrote:
There is no auto-decoding going on here, as char[] and wchar[] 
are rejected outright since they are not considered random 
access ranges.


They are considered random access ranges by my ranges library, 
because they are treated as arrays of characters and not as 
unicode strings.


On second thought that's not even relevant - the linked-to module 
performs an out-of-place shuffle and so does not even require the 
input range to have random access.


Re: shuffle a character array

2016-07-20 Thread pineapple via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 13:33:34 UTC, Mike Parker wrote:
There is no auto-decoding going on here, as char[] and wchar[] 
are rejected outright since they are not considered random 
access ranges.


They are considered random access ranges by my ranges library, 
because they are treated as arrays of characters and not as 
unicode strings.


Re: shuffle a character array

2016-07-20 Thread celavek via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 10:40:04 UTC, pineapple wrote:



There's also the shuffle module in mach.range which doesn't do 
any auto-decoding: 
https://github.com/pineapplemachine/mach.d/blob/master/mach/range/random/shuffle.d


Interesting project. Thanks for the link.



Re: shuffle a character array

2016-07-20 Thread Mike Parker via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 10:40:04 UTC, pineapple wrote:

On Wednesday, 20 July 2016 at 08:02:07 UTC, Mike Parker wrote:
You can then go to the documentation for 
std.range.primitives.isRandomAccessRange [2], where you'll 
find the following:


"Although char[] and wchar[] (as well as their qualified 
versions including string and wstring) are arrays, 
isRandomAccessRange yields false for them because they use 
variable-length encodings (UTF-8 and UTF-16 respectively). 
These types are bidirectional ranges only."


There's also the shuffle module in mach.range which doesn't do 
any auto-decoding: 
https://github.com/pineapplemachine/mach.d/blob/master/mach/range/random/shuffle.d


There is no auto-decoding going on here, as char[] and wchar[] 
are rejected outright since they are not considered random access 
ranges.


Re: shuffle a character array

2016-07-20 Thread pineapple via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 08:02:07 UTC, Mike Parker wrote:
You can then go to the documentation for 
std.range.primitives.isRandomAccessRange [2], where you'll find 
the following:


"Although char[] and wchar[] (as well as their qualified 
versions including string and wstring) are arrays, 
isRandomAccessRange yields false for them because they use 
variable-length encodings (UTF-8 and UTF-16 respectively). 
These types are bidirectional ranges only."


There's also the shuffle module in mach.range which doesn't do 
any auto-decoding: 
https://github.com/pineapplemachine/mach.d/blob/master/mach/range/random/shuffle.d


Re: shuffle a character array

2016-07-20 Thread celavek via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 08:30:37 UTC, Mike Parker wrote:



representation does not allocate any new memory. It points to 
the same memory, same data. If we think of D arrays as 
something like this:


struct Array(T) {
    size_t len;
    T* ptr;
}

Then representation is doing this:

Array!char original;
auto view = Array!ubyte(original.len, cast(ubyte*) original.ptr);

So, yes, the char data will still be shuffled in place. All 
you're doing is getting a ubyte view onto it so that it can be 
treated as a range.


Thank you for the very useful information. I really appreciate 
you taking the time to explain these, maybe trivial, things to me.

I confirmed the behavior with a test. Working as expected.



Re: shuffle a character array

2016-07-20 Thread Mike Parker via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 08:18:55 UTC, celavek wrote:



As far as my current understanding goes, the shuffle will be 
done in place. If I use the "representation", would that still 
hold, that is, will I be able to use the same char[] but in the 
shuffled form? (Of course I will test that.)


representation does not allocate any new memory. It points to the 
same memory, same data. If we think of D arrays as something like 
this:


struct Array(T) {
    size_t len;
    T* ptr;
}

Then representation is doing this:

Array!char original;
auto view = Array!ubyte(original.len, cast(ubyte*) original.ptr);

So, yes, the char data will still be shuffled in place. All 
you're doing is getting a ubyte view onto it so that it can be 
treated as a range.
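
The claim is easy to check. The sketch below assumes ASCII-only input; shuffleAscii is a hypothetical helper name, and the seed 42 is arbitrary.

```d
import std.algorithm.sorting : sort;
import std.random : Random, randomShuffle;
import std.string : representation;

/// Hypothetical helper: shuffles ASCII-only text in place via its ubyte[] view.
void shuffleAscii(char[] text, ref Random rng)
{
    randomShuffle(text.representation, rng);
}

void main()
{
    char[] text = "abcdef".dup;
    ubyte[] bytes = text.representation;

    // Same length and same pointer: representation made no copy.
    assert(bytes.length == text.length);
    assert(cast(void*) bytes.ptr is cast(void*) text.ptr);

    // Writing through the view changes the original char[].
    bytes[0] = cast(ubyte) 'z';
    assert(text == "zbcdef");

    // Shuffling the view reorders the char[] itself.
    auto rng = Random(42);
    shuffleAscii(text, rng);

    // The order is now random, but the same six characters are still there.
    auto sorted = text.representation.dup;
    sort(sorted);
    assert(sorted == "bcdefz".representation);
}
```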


Re: shuffle a character array

2016-07-20 Thread celavek via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 08:02:07 UTC, Mike Parker wrote:


If you are absolutely, 100% certain that you are dealing with 
ASCII, you can do this:


```
import std.random : randomShuffle;
import std.string : representation;
randomShuffle(charArray.representation);
```

That will give you a ubyte[] for char[] and a ushort[] for 
wchar[].


[1] https://dlang.org/phobos/std_random.html#.randomShuffle
[2] https://dlang.org/phobos/std_range_primitives.html#isRandomAccessRange


Ahhh! That again. I was thinking about using the representation. 
I should take a deeper look at the documentation.

As far as my current understanding goes, the shuffle will be done 
in place. If I use the "representation", would that still hold, 
that is, will I be able to use the same char[] but in the shuffled 
form? (Of course I will test that.)


Thank you


Re: shuffle a character array

2016-07-20 Thread Mike Parker via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 08:05:20 UTC, Mike Parker wrote:

On Wednesday, 20 July 2016 at 08:02:07 UTC, Mike Parker wrote:

On Wednesday, 20 July 2016 at 07:49:38 UTC, celavek wrote:


If you are absolutely, 100% certain that you are dealing with 
ASCII, you can do this:




And I forgot to add:

Otherwise, you'll want to convert to dchar[] (probably via 
std.utf.toUTF32) and pass that along instead.


Actually, std.conv.to might be better, since toUTF32 returns 
dstring:


auto dcharArray = to!(dchar[])(charArray);
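
Put together, the non-ASCII route might look like the sketch below. shuffledCodePoints is a hypothetical helper name. Note that this shuffles code points, not graphemes, so a combining mark can end up separated from its base character.

```d
import std.conv : to;
import std.random : Random, randomShuffle;
import std.range : walkLength;

/// Hypothetical helper: returns a shuffled copy, working on code points.
string shuffledCodePoints(const(char)[] text, ref Random rng)
{
    auto codePoints = text.to!(dchar[]); // decodes UTF-8 into a new UTF-32 array
    randomShuffle(codePoints, rng);      // dchar[] is a random access range
    return codePoints.to!string;         // re-encodes as UTF-8
}

void main()
{
    auto rng = Random(1); // arbitrary seed
    string result = shuffledCodePoints("h\u00E9llo", rng);

    // Still 5 code points and 6 UTF-8 code units, whatever the order.
    assert(result.walkLength == 5);
    assert(result.length == 6);
}
```

Unlike the representation approach, this allocates and is not in place; the original char[] is left untouched.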


Re: shuffle a character array

2016-07-20 Thread Mike Parker via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 08:02:07 UTC, Mike Parker wrote:

On Wednesday, 20 July 2016 at 07:49:38 UTC, celavek wrote:


If you are absolutely, 100% certain that you are dealing with 
ASCII, you can do this:




And I forgot to add:

Otherwise, you'll want to convert to dchar[] (probably via 
std.utf.toUTF32) and pass that along instead.


Re: shuffle a character array

2016-07-20 Thread Mike Parker via Digitalmars-d-learn

On Wednesday, 20 July 2016 at 07:49:38 UTC, celavek wrote:



I thought that I could use a dynamic array as a range ...


You can. However, if you take a look at the documentation for 
std.random.randomShuffle [1], you'll find the following 
constraint:


if (isRandomAccessRange!Range);

You can then go to the documentation for 
std.range.primitives.isRandomAccessRange [2], where you'll find 
the following:


"Although char[] and wchar[] (as well as their qualified versions 
including string and wstring) are arrays, isRandomAccessRange 
yields false for them because they use variable-length encodings 
(UTF-8 and UTF-16 respectively). These types are bidirectional 
ranges only."


If you are absolutely, 100% certain that you are dealing with 
ASCII, you can do this:


```
import std.random : randomShuffle;
import std.string : representation;
randomShuffle(charArray.representation);
```

That will give you a ubyte[] for char[] and a ushort[] for 
wchar[].
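
If "100% certain" is more than you can promise, one option is to verify the assumption before shuffling. This is only a sketch: shuffleAsciiChecked is a hypothetical helper, and throwing via enforce is just one possible way to handle a failure.

```
import std.algorithm.searching : all;
import std.exception : enforce;
import std.random : randomShuffle;
import std.string : representation;

/// Hypothetical helper: shuffles in place, refusing non-ASCII input.
void shuffleAsciiChecked(char[] text)
{
    auto bytes = text.representation;
    // Every ASCII code unit is below 0x80; anything else is part of
    // a multi-byte UTF-8 sequence and must not be moved on its own.
    enforce(bytes.all!(b => b < 0x80), "input contains non-ASCII code units");
    randomShuffle(bytes);
}

void main()
{
    char[] ok = "hello".dup;
    shuffleAsciiChecked(ok);
    assert(ok.length == 5);

    bool threw = false;
    try
        shuffleAsciiChecked("h\u00E9llo".dup);
    catch (Exception e)
        threw = true;
    assert(threw);
}
```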


[1] https://dlang.org/phobos/std_random.html#.randomShuffle
[2] https://dlang.org/phobos/std_range_primitives.html#isRandomAccessRange