Re: Utf8 to Utf32 cast cost

2015-06-10 Thread Marco Leise via Digitalmars-d-learn
On Mon, 8 Jun 2015 12:59:31 +0200,
Daniel Kozák via Digitalmars-d-learn wrote:

> 
> On Mon, 08 Jun 2015 10:41:59 +
> Kadir Erdem Demir via Digitalmars-d-learn
>  wrote:
> 
> > I want to use my char array with the awesome, cool std.algorithm
> > functions. Since many of these algorithms require things like
> > slicing, I prefer to create my strings with UTF-32 chars. But by
> > default all string literals are UTF-8 for performance.
> >
> > With my current knowledge I use to!dchar to convert a UTF-8 array
> > (char[]) to a UTF-32 array (dchar[]):
> > 
> > dchar[] range = to!dchar("erdem".dup)
> > 
> > How costly is this?
> 
> import std.conv;
> import std.utf;
> import std.datetime;
> import std.stdio;
> 
> void f0() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = to!dstring(somestr);
> }
>
>
> void f1() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = toUTF32(somestr);
> }
>
> void main() {
>     auto r = benchmark!(f0,f1)(1_000_000);
>     auto f0Result = to!Duration(r[0]);
>     auto f1Result = to!Duration(r[1]);
>     writeln("f0 time: ",f0Result);
>     writeln("f1 time: ",f1Result);
> }
> 
> 
> /// output ///
> f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> f1 time: 600 ms, 979 μs, and 8 hnsecs
> 

Please make the result of the transcode influence the program
output, e.g. add the first character of the UTF-32 string to
some global variable and print it at the end. As written, the
compiler is allowed (at least in theory) to deduce that f0/f1
are pure functions that return nothing, so you may end up
benchmarking anything from the code you wrote down to an empty
loop. I'm speaking from experience here:
https://github.com/mleise/fast/blob/master/source/fast/internal.d#L99
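
A minimal sketch of that suggestion, assuming a current Phobos where
benchmark lives in std.datetime.stopwatch and returns Durations (in
the 2015 code above it came from std.datetime). The name `sink` is
made up for this illustration:

```d
import std.conv : to;
import std.datetime.stopwatch : benchmark; // assumption: recent Phobos layout
import std.stdio : writeln;
import std.utf : toUTF32;

// Hypothetical global accumulator. Because it is printed at the end,
// the conversions feeding it cannot be treated as dead code.
__gshared uint sink;

void f0()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
    sink += str[0]; // make the result observable
}

void f1()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
    sink += str[0]; // make the result observable
}

void main()
{
    auto r = benchmark!(f0, f1)(1_000_000);
    writeln("f0 time: ", r[0]);
    writeln("f1 time: ", r[1]);
    writeln("sink: ", sink); // ties the program output to the work done
}
```

Whether a given compiler would really remove the original loop bodies
depends on its optimizer and flags; the sink just takes that question
off the table.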

-- 
Marco



Re: Utf8 to Utf32 cast cost

2015-06-10 Thread Marco Leise via Digitalmars-d-learn
On Mon, 08 Jun 2015 11:13:25 +
"Daniel Kozak" wrote:

> BTW on ldc(ldc -O3 -singleobj -release -boundscheck=off) 
> transcode is the fastest:
> 
> f0 time: 1 sec, 115 ms, 48 μs, and 7 hnsecs // to!dstring
> f1 time: 449 ms and 329 μs // toUTF32
> f2 time: 272 ms, 969 μs, and 1 hnsec // transcode

Three functions, each twice as fast and twice as hidden as the
one before. :)

-- 
Marco



Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Anonymouse via Digitalmars-d-learn

On Monday, 8 June 2015 at 18:48:17 UTC, Daniel Kozak wrote:
> Yep, but I don't care, I am the one who makes transcode faster,
> so I am happy with the results :P.
>
> P.S. I do care, and when I have some spare time I will probably
> improve to!dstring too.


Ah, so you are. I confused you with Kadir Erdem Demir.


Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Daniel Kozak via Digitalmars-d-learn

On Mon, 08 Jun 2015 18:16:57 +
Anonymouse via Digitalmars-d-learn  wrote:

> On Monday, 8 June 2015 at 11:44:47 UTC, Daniel Kozák wrote:
> > No difference: even with GC.disable() the results are the same.
> 
> Profile! Callgrind is your friend~
Yep, but I don't care, I am the one who makes transcode faster, so I am happy
with the results :P.

P.S. I do care, and when I have some spare time I will probably
improve to!dstring too.



Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Anonymouse via Digitalmars-d-learn

On Monday, 8 June 2015 at 11:44:47 UTC, Daniel Kozák wrote:

> No difference: even with GC.disable() the results are the same.


Profile! Callgrind is your friend~


Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Daniel Kozák via Digitalmars-d-learn

On Mon, 08 Jun 2015 11:32:07 +
Kagamin via Digitalmars-d-learn 
wrote:

> On Monday, 8 June 2015 at 10:59:45 UTC, Daniel Kozák wrote:
> > import std.conv;
> > import std.utf;
> > import std.datetime;
> > import std.stdio;
> >
> > void f0() {
> >     string somestr = "some not so long utf8 string forbenchmarking";
> >     dstring str = to!dstring(somestr);
> > }
> >
> >
> > void f1() {
> >     string somestr = "some not so long utf8 string forbenchmarking";
> >     dstring str = toUTF32(somestr);
> > }
> >
> > void main() {
> >     auto r = benchmark!(f0,f1)(1_000_000);
> >     auto f0Result = to!Duration(r[0]);
> >     auto f1Result = to!Duration(r[1]);
> >     writeln("f0 time: ",f0Result);
> >     writeln("f1 time: ",f1Result);
> > }
> >
> >
> > /// output ///
> > f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> > f1 time: 600 ms, 979 μs, and 8 hnsecs
> 
> Chances are you're benchmarking the GC. Try 
> benchmark!(f0,f1,f0,f1,f0,f1);

No difference: even with GC.disable() the results are the same.
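
For reference, a rough sketch of how such a check could look. The
exact code that was run is not shown in this thread, so the layout
below is an assumption, as are the modern std.datetime.stopwatch
import and the reduced iteration count (memory is not reclaimed
while collections are off):

```d
import core.memory : GC;
import std.conv : to;
import std.datetime.stopwatch : benchmark; // assumption: recent Phobos layout
import std.stdio : writeln;
import std.utf : toUTF32;

void f0()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}

void f1()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

void main()
{
    GC.disable();             // suppress collection cycles during the measurement
    scope (exit) GC.enable(); // restore normal behaviour afterwards

    // Fewer iterations than in the thread, since garbage piles up meanwhile.
    auto r = benchmark!(f0, f1)(100_000);
    writeln("f0 time: ", r[0]);
    writeln("f1 time: ", r[1]);
}
```

Note that GC.disable() only stops collections; every call still
allocates its result, so allocation cost remains part of both numbers.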



Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Kagamin via Digitalmars-d-learn

On Monday, 8 June 2015 at 10:59:45 UTC, Daniel Kozák wrote:

> import std.conv;
> import std.utf;
> import std.datetime;
> import std.stdio;
>
> void f0() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = to!dstring(somestr);
> }
>
>
> void f1() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = toUTF32(somestr);
> }
>
> void main() {
>     auto r = benchmark!(f0,f1)(1_000_000);
>     auto f0Result = to!Duration(r[0]);
>     auto f1Result = to!Duration(r[1]);
>     writeln("f0 time: ",f0Result);
>     writeln("f1 time: ",f1Result);
> }
>
>
> /// output ///
> f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> f1 time: 600 ms, 979 μs, and 8 hnsecs


Chances are you're benchmarking the GC. Try 
benchmark!(f0,f1,f0,f1,f0,f1);


Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Daniel Kozak via Digitalmars-d-learn

On Monday, 8 June 2015 at 11:06:07 UTC, Daniel Kozák wrote:
> On Mon, 08 Jun 2015 10:51:53 +
> weaselcat via Digitalmars-d-learn wrote:
>
> > On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
> > > On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> > > > I want to use my char array with the awesome, cool
> > > > std.algorithm functions. Since many of these algorithms
> > > > require things like slicing, I prefer to create my strings
> > > > with UTF-32 chars. But by default all string literals are
> > > > UTF-8 for performance.
> > > >
> > > > With my current knowledge I use to!dchar to convert a UTF-8
> > > > array (char[]) to a UTF-32 array (dchar[]):
> > > >
> > > > dchar[] range = to!dchar("erdem".dup)
> > > >
> > > > How costly is this?
> > > > Is there a way to get a UTF-32 string directly, without a
> > > > cast?
> > >
> > > 1. dstring range = to!dstring("erdem"); //without dup
> > > 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> > > 3. dstring range = "erdem"d; //directly
> > > 4. dchar[] range = "erdem"d.dup; //mutable
> >
> > what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32
>
> from: http://dlang.org/phobos/std_encoding.html#.transcode
>
> Supersedes:
> This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and
> std.utf.toUTF32() (but note that to!() supersedes it more
> conveniently).


BTW on ldc(ldc -O3 -singleobj -release -boundscheck=off) 
transcode is the fastest:


f0 time: 1 sec, 115 ms, 48 μs, and 7 hnsecs // to!dstring
f1 time: 449 ms and 329 μs // toUTF32
f2 time: 272 ms, 969 μs, and 1 hnsec // transcode
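
The f2 source is not shown in this thread. A minimal sketch of how
std.encoding.transcode is called for UTF-8 to UTF-32 (the sample
string is only an illustration) could be:

```d
import std.encoding : transcode;
import std.stdio : writeln;

void main()
{
    // "Kozák", with the non-ASCII letter written as an escape sequence.
    string utf8 = "Koz\u00E1k";

    dstring utf32;          // receives the result through an `out` parameter
    transcode(utf8, utf32);

    assert(utf8.length == 6);  // 6 UTF-8 code units (the accented letter takes 2)
    assert(utf32.length == 5); // 5 UTF-32 code units, one per code point
    writeln(utf32);
}
```

Unlike to!dstring and toUTF32, transcode does not return its result,
which makes it a little less convenient to use inside an expression.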


Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Daniel Kozák via Digitalmars-d-learn

On Mon, 08 Jun 2015 10:51:53 +
weaselcat via Digitalmars-d-learn 
wrote:

> On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
> > On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> > >> I want to use my char array with the awesome, cool std.algorithm
> > >> functions. Since many of these algorithms require things like
> > >> slicing, I prefer to create my strings with UTF-32 chars. But by
> > >> default all string literals are UTF-8 for performance.
> > >>
> > >> With my current knowledge I use to!dchar to convert a UTF-8 array
> > >> (char[]) to a UTF-32 array (dchar[]):
> > >>
> > >> dchar[] range = to!dchar("erdem".dup)
> > >>
> > >> How costly is this?
> > >> Is there a way to get a UTF-32 string directly, without a
> > >> cast?
> >
> > 1. dstring range = to!dstring("erdem"); //without dup
> > 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> > 3. dstring range = "erdem"d; //directly
> > 4. dchar[] range = "erdem"d.dup; //mutable
> 
> what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32

from: http://dlang.org/phobos/std_encoding.html#.transcode

Supersedes:
This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and
std.utf.toUTF32() (but note that to!() supersedes it more conveniently).


Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Kadir Erdem Demir via Digitalmars-d-learn

Thanks a lot, your answers are very useful to me.
Nothing is wrong with toUTF32, I just didn't know about it.


Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Daniel Kozák via Digitalmars-d-learn

On Mon, 08 Jun 2015 10:41:59 +
Kadir Erdem Demir via Digitalmars-d-learn
 wrote:

> I want to use my char array with the awesome, cool std.algorithm
> functions. Since many of these algorithms require things like
> slicing, I prefer to create my strings with UTF-32 chars. But by
> default all string literals are UTF-8 for performance.
>
> With my current knowledge I use to!dchar to convert a UTF-8 array
> (char[]) to a UTF-32 array (dchar[]):
> 
> dchar[] range = to!dchar("erdem".dup)
> 
> How costly is this?

import std.conv;
import std.utf;
import std.datetime;
import std.stdio;

void f0() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}


void f1() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

void main() {
    auto r = benchmark!(f0,f1)(1_000_000);
    auto f0Result = to!Duration(r[0]);
    auto f1Result = to!Duration(r[1]);
    writeln("f0 time: ",f0Result);
    writeln("f1 time: ",f1Result);
}


/// output ///
f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
f1 time: 600 ms, 979 μs, and 8 hnsecs



Re: Utf8 to Utf32 cast cost

2015-06-08 Thread weaselcat via Digitalmars-d-learn

On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:

> On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> > I want to use my char array with the awesome, cool std.algorithm
> > functions. Since many of these algorithms require things like
> > slicing, I prefer to create my strings with UTF-32 chars. But by
> > default all string literals are UTF-8 for performance.
> >
> > With my current knowledge I use to!dchar to convert a UTF-8 array
> > (char[]) to a UTF-32 array (dchar[]):
> >
> > dchar[] range = to!dchar("erdem".dup)
> >
> > How costly is this?
> > Is there a way to get a UTF-32 string directly, without a
> > cast?
>
> 1. dstring range = to!dstring("erdem"); //without dup
> 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> 3. dstring range = "erdem"d; //directly
> 4. dchar[] range = "erdem"d.dup; //mutable


what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32


Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Daniel Kozák via Digitalmars-d-learn

On Mon, 08 Jun 2015 10:41:59 +
Kadir Erdem Demir via Digitalmars-d-learn
 wrote:

> I want to use my char array with the awesome, cool std.algorithm
> functions. Since many of these algorithms require things like
> slicing, I prefer to create my strings with UTF-32 chars. But by
> default all string literals are UTF-8 for performance.
>
> With my current knowledge I use to!dchar to convert a UTF-8 array
> (char[]) to a UTF-32 array (dchar[]):
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?
> Is there a way to get a UTF-32 string directly, without a
> cast?

dstring str = "erdem"d;
dstring str2 = std.utf.toUTF32(someUtf8Or16Or32String);






Re: Utf8 to Utf32 cast cost

2015-06-08 Thread Ilya Yaroshenko via Digitalmars-d-learn

On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> I want to use my char array with the awesome, cool std.algorithm
> functions. Since many of these algorithms require things like
> slicing, I prefer to create my strings with UTF-32 chars. But by
> default all string literals are UTF-8 for performance.
>
> With my current knowledge I use to!dchar to convert a UTF-8 array
> (char[]) to a UTF-32 array (dchar[]):
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?
> Is there a way to get a UTF-32 string directly, without a
> cast?


1. dstring range = to!dstring("erdem"); //without dup
2. dchar[] range = to!(dchar[])("erdem"); //mutable
3. dstring range = "erdem"d; //directly
4. dchar[] range = "erdem"d.dup; //mutable
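
As a quick sanity check, the four variants can be put side by side
in a small program. This is only a sketch; the variable names a to d
are made up for the comparison:

```d
import std.conv : to;

void main()
{
    dstring a = to!dstring("erdem");   // 1. converted at run time, immutable
    dchar[] b = to!(dchar[])("erdem"); // 2. converted at run time, mutable
    dstring c = "erdem"d;              // 3. UTF-32 literal, no conversion needed
    dchar[] d = "erdem"d.dup;          // 4. mutable copy of the literal

    // All four hold the same five code points.
    assert(a == c && b == d && a == b);

    // The mutable variants can be changed in place ...
    b[0] = 'E';
    assert(b == "Erdem"d);

    // ... and every variant slices by code point, since one dchar is one code point.
    assert(c[1 .. 3] == "rd"d);
}
```

Variants 3 and 4 avoid the run-time conversion entirely, which only
helps when the text is known at compile time.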


Utf8 to Utf32 cast cost

2015-06-08 Thread Kadir Erdem Demir via Digitalmars-d-learn
I want to use my char array with the awesome, cool std.algorithm
functions. Since many of these algorithms require things like
slicing, I prefer to create my strings with UTF-32 chars. But by
default all string literals are UTF-8 for performance.

With my current knowledge I use to!dchar to convert a UTF-8 array
(char[]) to a UTF-32 array (dchar[]):

dchar[] range = to!dchar("erdem".dup)

How costly is this?
Is there a way to get a UTF-32 string directly, without a
cast?