Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Bauss via Digitalmars-d-announce

On Tuesday, 28 June 2016 at 03:58:23 UTC, Jay Norwood wrote:

On Tuesday, 28 June 2016 at 03:11:26 UTC, Jay Norwood wrote:

On Tuesday, 28 June 2016 at 01:53:22 UTC, deadalnix wrote:
If we were in an interview, I'd ask you "what does this return 
if you pass it an empty string?"


oops.  I see ... need to test for empty string.

nothrow pure size_t strlen2(const(char)* c) {
  if (c is null || *c==0)
    return 0;
  const(char)* c_save = c;
  while (*c){ c+=4; }
  while (*c==0){ c--; }
  c++;
  return c - c_save;
}


Why not just do

nothrow pure size_t strlen2(const(char)* c) {
  if (!c)
    return 0;

  ...
}


Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Jay Norwood via Digitalmars-d-announce

On Tuesday, 28 June 2016 at 03:11:26 UTC, Jay Norwood wrote:

On Tuesday, 28 June 2016 at 01:53:22 UTC, deadalnix wrote:
If we were in an interview, I'd ask you "what does this return 
if you pass it an empty string?"


oops.  I see ... need to test for empty string.

nothrow pure size_t strlen2(const(char)* c) {
  if (c is null || *c==0)
    return 0;
  const(char)* c_save = c;
  while (*c){ c+=4; }
  while (*c==0){ c--; }
  c++;
  return c - c_save;
}
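
A minimal sketch, not the code posted in this thread: instead of special-casing `*c==0`, the back-scan can be bounded by the start of the string, which also covers the empty-string case. The name strlen4 and the bounded loop are assumptions made for illustration, and the function still requires the input to end in four zero chars.

```d
// Hypothetical variant of strlen2; assumes four terminating zero chars.
nothrow pure size_t strlen4(const(char)* c) {
  if (c is null)
    return 0;
  const(char)* p = c;
  // Stride by 4: with four terminating zeros, p lands on one of them.
  while (*p) { p += 4; }
  // Step back to the first terminating zero, never before the start.
  while (p > c && *(p - 1) == 0) { p--; }
  return p - c;
}

void main() {
  assert(strlen4(null) == 0);
  assert(strlen4("\0\0\0\0".ptr) == 0);       // empty string
  assert(strlen4("a\0\0\0\0".ptr) == 1);
  assert(strlen4("abc\0\0\0\0".ptr) == 3);
  assert(strlen4("abcd\0\0\0\0".ptr) == 4);
}
```

The extra `p > c` comparison only runs in the short back-scan (at most three steps plus the stopping check), so it should not change the cost of the main stride loop.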


Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Jay Norwood via Digitalmars-d-announce

On Tuesday, 28 June 2016 at 01:53:22 UTC, deadalnix wrote:
If we were in an interview, I'd ask you "what does this return if 
you pass it an empty string?"


I'd say use this one instead, to avoid negative size_t. It is 
also a little faster for the same measurement.


nothrow pure size_t strlen2(const(char)* c) {
  if (c is null)
    return 0;
  const(char)* c_save = c;
  while (*c){ c+=4; }
  while (*c==0){ c--; }
  c++;
  return c - c_save;
}

2738
540
2744



Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread deadalnix via Digitalmars-d-announce

On Sunday, 26 June 2016 at 16:40:08 UTC, Jay Norwood wrote:
After watching Andrei's sentinel thing, I'm playing with strlen 
on char strings with 4 terminating 0s instead of a single one.  
Seems to work and is 4x faster compared to the runtime version.


nothrow pure size_t strlen2(const(char)* c) {
 if (c is null)
   return 0;
 size_t l=0;
 while (*c){ c+=4; l+=4;}
 while (*c==0){ c--; l--;}
 return l+1;
}

This is the timing of my test case, which I can post if anyone 
is interested.

strlen\Release>strlen
2738
681


If we were in an interview, I'd ask you "what does this return if 
you pass it an empty string?"




Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Mike Parker via Digitalmars-d-announce

On Monday, 27 June 2016 at 19:51:48 UTC, Jay Norwood wrote:

I also found it strange, the non-zero initialization values for 
char, dchar, wchar.  I suppose there's some reason?


int [100]  to zeros.
char [100]  to 0xff;
dchar [100]   to 0xffff;
wchar [100]   to 0xffff;


The same reason float and double are default initialized to NaN. 
char, wchar and dchar are default initialized to invalid Unicode 
values.
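
For reference, a small sketch that checks these defaults at compile time and at run time. The array name buf is just for illustration; the `.init` values are the ones defined by the language.

```d
import std.math : isNaN;

void main() {
  // Character types default to values that are not valid code units/points.
  static assert(char.init == 0xFF);
  static assert(wchar.init == 0xFFFF);
  static assert(dchar.init == 0x0000FFFF);
  // Integers default to zero.
  static assert(int.init == 0);

  // Floating point types default to NaN.
  assert(isNaN(float.init));
  assert(isNaN(double.init));

  // A default-initialized char array is filled with 0xFF, not zeros.
  char[100] buf;
  foreach (ch; buf)
    assert(ch == 0xFF);
}
```

So a char buffer meant to hold a zero-terminated string has to be cleared or terminated explicitly.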


Re: Release D 2.071.1

2016-06-27 Thread Jack Stouffer via Digitalmars-d-announce

On Monday, 27 June 2016 at 22:11:53 UTC, Martin Nowak wrote:

Glad to announce D 2.071.1.

http://dlang.org/download.html

This point release fixes a few issues over 2.071.0, see the 
changelog for more details.


http://dlang.org/changelog/2.071.1.html

-Martin


Glad to see this out :)


Re: Release D 2.071.1

2016-06-27 Thread Jack Stouffer via Digitalmars-d-announce
On Monday, 27 June 2016 at 23:15:06 UTC, Robert burner Schadek 
wrote:

Awesome, releases are becoming more and more boring. I like it!


I wouldn't call 1.0 * -1.0 == 1.0 boring!


Re: Release D 2.071.1

2016-06-27 Thread Robert burner Schadek via Digitalmars-d-announce

Awesome, releases are becoming more and more boring. I like it!


Release D 2.071.1

2016-06-27 Thread Martin Nowak via Digitalmars-d-announce
Glad to announce D 2.071.1.

http://dlang.org/download.html

This point release fixes a few issues over 2.071.0, see the changelog
for more details.

http://dlang.org/changelog/2.071.1.html

-Martin


Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Ola Fosheim Grøstad via Digitalmars-d-announce

On Monday, 27 June 2016 at 21:41:57 UTC, Jay Norwood wrote:
measurements. I'm using a 100KB char array terminated by four 
zeros, and doing strlen on substring pointers into it 
incremented by 1 for 100K times.


But this is a rather atypical use case for zero terminated 
strings? It would make more sense to test it on a massive amount 
of short strings stuffed into a very large hash-table. 
(filenames, keys, user names, email addresses etc)




Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Jay Norwood via Digitalmars-d-announce
On Monday, 27 June 2016 at 20:43:40 UTC, Ola Fosheim Grøstad 
wrote:
Just keep in mind that the major bottleneck now is loading 64 
bytes from memory into cache. So if you test performance you 
have to make sure to invalidate the caches before you test and 
test with spurious reads over a very large memory area to get 
realistic results.


But essentially, the operation is not heavy, so to speed it up 
you need to predict and prefetch from memory in time, meaning 
no library solution is sufficient. (you need to prefetch memory 
way before your library function is called)


I doubt the external memory accesses are involved in these 
measurements. I'm using a 100KB char array terminated by four 
zeros, and doing strlen on substring pointers into it incremented 
by 1 for 100K times.  The middle of the three timings is for 
strlen2, while the two outer timings are for strlen during the 
same program execution.


I'm initializing the 100KB immediately prior to the measurement. 
The 100KB array should all be in L1 or L2 cache by the time I 
make even the first of the three time measurements.


The prefetch shouldn't have a problem predicting this.

2749
688
2783

2741
683
2738
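
The actual test program was not posted, so the following is only a sketch of the kind of measurement described above: a 100KB char array terminated by four zeros, with strlen called on pointers advanced by one char, 100K times. The buffer layout, the names, and the use of StopWatch are assumptions, and strlen2 here is a variant with a bounded back-scan rather than the exact code from the thread.

```d
import core.stdc.string : strlen;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Assumes the string ends in four zero chars.
nothrow pure size_t strlen2(const(char)* c) {
  if (c is null)
    return 0;
  const(char)* p = c;
  while (*p) { p += 4; }
  while (p > c && *(p - 1) == 0) { p--; }
  return p - c;
}

void main() {
  enum N = 100_000;
  auto buf = new char[N + 4];
  buf[0 .. N] = 'a';     // payload
  buf[N .. $] = '\0';    // four terminating zeros

  size_t sum1, sum2;
  auto sw = StopWatch(AutoStart.yes);
  foreach (i; 0 .. N)
    sum1 += strlen(buf.ptr + i);
  auto t1 = sw.peek;

  sw.reset();            // keeps running, restarts from zero
  foreach (i; 0 .. N)
    sum2 += strlen2(buf.ptr + i);
  auto t2 = sw.peek;

  // Both functions must agree on every substring.
  assert(sum1 == sum2);
  writeln("strlen:  ", t1.total!"usecs", " usecs");
  writeln("strlen2: ", t2.total!"usecs", " usecs");
}
```

Summing the results and checking them keeps the compiler from discarding the calls; absolute timings will depend on the compiler, flags and C runtime in use.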




Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Ola Fosheim Grøstad via Digitalmars-d-announce
On Monday, 27 June 2016 at 06:31:49 UTC, Ola Fosheim Grøstad 
wrote:

On Monday, 27 June 2016 at 05:27:12 UTC, chmike wrote:
Ending strings with a single null byte/char is to save space. 
It was critical in the '70s when C was created and memory 
space was very limited. That's not the case anymore and I 
guess the


Not only to save space, some CPUs also had cheap incrementing 
load/stores and branching on zero is faster than sacrificing 
another register for a counter.


I incidentally just found my 1992 implementation for Motorola 
68K, to illustrate:


_mystrcpy
        move.l  4(sp),a1        ; pointer for destination
        move.l  8(sp),a0        ; pointer for source

mystrcpy
        move.l  a0,d0
1$      move.b  (a0)+,(a1)+     ; copy
        bne.s   1$              ; jump back up if not zero
        rts

As you can see it is a tight loop. Other CPUs are even tighter, 
and have single-instruction loops (even 8086?)


So not only storage, also performance on specific CPUs. Which is 
a good reason for keeping datatypes in standard libraries 
abstract, different CPUs favour different representations. Even 
on very basic datatypes.






Re: Another audio plugin in D

2016-06-27 Thread Jacob Carlborg via Digitalmars-d-announce

On 27/06/16 21:22, Guillaume Piolat wrote:


My wording was a bit strong.

As you may remember, the workaround involved "leaking" the dynlib.

On OS X I keep having a lingering crash which is a bit random, happens
with multiple instantiation/closing of a dynlib. It is a bit hard to
reproduce and I failed to remove it. It follows a hysteresis pattern:
when it's here it reproduces reliably, then disappears. With LDC-b2 I
thought it was gone (was codegen I thought), but seems still here somehow.

I'm not sure at all if it's related at all to dynlib unloading (wild
guess probability: 50%).


Ok, I see. We need to add proper support of dynamic libraries on OS X.

--
/Jacob Carlborg


Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Ola Fosheim Grøstad via Digitalmars-d-announce

On Monday, 27 June 2016 at 19:51:48 UTC, Jay Norwood wrote:
Your link's use of padding pads out with a variable number of 
zeros, so that a larger data type can be used for the compare 
operations.  This isn't the same as my example, which is 
simpler due to not having to fiddle with alignment and data 
type casting.


That's true, and it is fun to think about different string 
implementations. Just keep in mind that prior to the 90s, text 
was the essential datatype for many programmers and inventing new 
ways to do strings has been heavily explored. I remember the first 
exercise we got at the university when doing the OS course was to 
implement "strlen", "strcpy" and "strcmp" in C or machine 
language. It can be fun.


Just keep in mind that the major bottleneck now is loading 64 
bytes from memory into cache. So if you test performance you have 
to make sure to invalidate the caches before you test and test 
with spurious reads over a very large memory area to get 
realistic results.


But essentially, the operation is not heavy, so to speed it up 
you need to predict and prefetch from memory in time, meaning no 
library solution is sufficient. (you need to prefetch memory way 
before your library function is called)




Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Jay Norwood via Digitalmars-d-announce
On Monday, 27 June 2016 at 16:38:58 UTC, Ola Fosheim Grøstad 
wrote:
Yes, and the idea of speeding up strings by padding out with 
zeros is not new. ;-) I recall suggesting it back in 1999 when 
discussing the benefits of having a big endian cpu when sorting 
strings. If it is big endian you can compare ascii as 32/64 bit 
integers, so if you align the string and pad out with zeros 
then you can speed up strcmp() by a significant factor. Oh, 
here it is:


Your link's use of padding pads out with a variable number of 
zeros, so that a larger data type can be used for the compare 
operations.  This isn't the same as my example, which is simpler 
due to not having to fiddle with alignment and data type casting.


I didn't find a strlen implementation for dchar or wchar in the D 
libraries.


I also found it strange, the non-zero initialization values for 
char, dchar, wchar.  I suppose there's some reason?


int [100]  to zeros.
char [100]  to 0xff;
dchar [100]   to 0xffff;
wchar [100]   to 0xffff;









Re: Beta D 2.071.1-b2

2016-06-27 Thread Martin Nowak via Digitalmars-d-announce
On 06/16/2016 08:43 PM, Jack Stouffer wrote:
> On Sunday, 29 May 2016 at 21:53:23 UTC, Martin Nowak wrote:
>> Second beta for the 2.071.1 release.
>>
>> http://dlang.org/download.html#dmd_beta
>> http://dlang.org/changelog/2.071.1.html
>>
>> Please report any bugs at https://issues.dlang.org
>>
>> -Martin
> 
> This release would fix some pretty serious bugs. What's the holdup?

I couldn't find enough time to fix
https://issues.dlang.org/show_bug.cgi?id=16085. Let's do the point
release now anyhow and follow-up later on.



Re: Beta D 2.071.1-b2

2016-06-27 Thread Martin Nowak via Digitalmars-d-announce
On 06/16/2016 09:47 PM, deadalnix wrote:
> 196418a8b3ec1c5f284da5009b4bb18e3f70d99f still not in after 3 months.
> This is typesystem breaking. While I understand it wasn't picked for
> 2.071 , I'm not sure why it wasn't for 2.071.1 .

Because it didn't target stable.



Re: Another audio plugin in D

2016-06-27 Thread Guillaume Piolat via Digitalmars-d-announce

On Monday, 27 June 2016 at 18:59:35 UTC, Jacob Carlborg wrote:

On 27/06/16 13:02, Guillaume Piolat wrote:

Unloading of shared libraries on OS X continues to be a 
problem though, it would be nice if it worked in 64-bit.


I know the current situation is not ideal, but does it cause 
any problems?


My wording was a bit strong.

As you may remember, the workaround involved "leaking" the dynlib.

On OS X I keep having a lingering crash which is a bit random, 
happens with multiple instantiation/closing of a dynlib. It is a 
bit hard to reproduce and I failed to remove it. It follows a 
hysteresis pattern: when it's here it reproduces reliably, then 
disappears. With LDC-b2 I thought it was gone (was codegen I 
thought), but seems still here somehow.


I'm not sure at all if it's related at all to dynlib unloading 
(wild guess probability: 50%).


Re: Another audio plugin in D

2016-06-27 Thread Jacob Carlborg via Digitalmars-d-announce

On 27/06/16 13:02, Guillaume Piolat wrote:


Unloading of shared libraries on OS X continues to be a problem though,
it would be nice if it worked in 64-bit.


I know the current situation is not ideal, but does it cause any problems?

--
/Jacob Carlborg


Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Brad Roberts via Digitalmars-d-announce

On 6/26/2016 11:47 AM, Jay Norwood via Digitalmars-d-announce wrote:

On Sunday, 26 June 2016 at 16:59:54 UTC, David Nadlinger wrote:

Please keep general discussions like this off the announce list, which
would e.g. be suitable for announcing a fleshed out collection of
high-performance string handling routines.

A couple of quick hints:
 - This is not a correct implementation of strlen, as it already
assumes that the array is terminated by four zero bytes. That
iterating memory with a stride of 4 instead of 1 will be faster is a
self-evident truth.
 - You should be benchmarking against a "proper" SIMD-optimised strlen
implementation.

 — David



This is more of just an observation that the choice of the single zero
sentinel for C string termination comes at a cost of 4x strlen speed vs
using four terminating zeros.

I don't see a SIMD strlen implementation in the D libraries.

The strlen2 function I posted works on any string that is terminated by
four zeros, and returns the same len as strlen in that case, but much
faster.

How to get strings initialized with four terminating zeros at compile
time is a separate issue.  I don't know the solution, else I might
consider doing more with this.


Yup.. there's a reason that many many hours have been spent optimizing 
strlen and other memory related length and comparison routines.  They 
are used a lot and the number of ways of making them fast varies almost 
as much as the number of CPUs that exist.  This effort is embedded in 
the code gen of compilers (other than dmd) and libc runtimes.  Trying to 
re-invent it is noble, and very educational, but largely redundant.


Re: [Semi OT] About code review

2016-06-27 Thread Johan Engelen via Digitalmars-d-announce

On Monday, 27 June 2016 at 00:01:34 UTC, deadalnix wrote:
Several people during DConf asked about tips and tricks on code 
review. So I wrote an article about it:


http://www.deadalnix.me/2016/06/27/on-code-review/


It's a nice read.

One comment: perhaps the balance has tipped a bit much to "making 
a good PR", rather than "doing a good review". I feel the merit 
of a review is to improve the contribution, rather than to decide 
whether it is mergeable or not.  Although it is in the article, I 
think it could be given a little more attention: the review 
itself should contribute to the project, i.e. the reviewer should 
(try hard to) propose alternatives if something should/could be 
improved. Criticism is very easy, _constructive_ criticism isn't; 
I think the latter is needed to gain a contributor, and the former 
does the opposite.


-Johan



Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Ola Fosheim Grøstad via Digitalmars-d-announce

On Monday, 27 June 2016 at 16:22:56 UTC, Jay Norwood wrote:
This strlen2 doesn't require special alignment or casting of 
char pointer types to some larger type. That keeps the strlen2 
implementation fairly simple.


Yes, and the idea of speeding up strings by padding out with 
zeros is not new. ;-) I recall suggesting it back in 1999 when 
discussing the benefits of having a big endian cpu when sorting 
strings. If it is big endian you can compare ascii as 32/64 bit 
integers, so if you align the string and pad out with zeros then 
you can speed up strcmp() by a significant factor. Oh, here it is:


http://disinterest.org/resource/MUD-Dev/1999q1/009759.html

Of course, this is all moot now: little endian + SIMD has made 
such tricks redundant. SIMD probably makes your strlen2 redundant 
too. The bottleneck tends to be memory access/prefetching for 
simple algorithms.
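
To illustrate the padded comparison described above, here is a rough sketch rather than the code from the linked post. It assumes both strings are zero-padded to a multiple of four bytes with at least one terminating zero; the name cmpPadded is hypothetical. Reading each group of four bytes as a big-endian 32-bit integer makes integer order match byte-wise order.

```d
import std.bitmanip : bigEndianToNative;

// Hypothetical helper: compares two zero-padded strings 4 bytes at a time.
// Returns a negative value, zero, or a positive value, like strcmp.
int cmpPadded(const(ubyte)* a, const(ubyte)* b) {
  for (;; a += 4, b += 4) {
    ubyte[4] wa = a[0 .. 4];
    ubyte[4] wb = b[0 .. 4];
    // On a big endian CPU this conversion is a plain load.
    immutable x = bigEndianToNative!uint(wa);
    immutable y = bigEndianToNative!uint(wb);
    if (x != y)
      return x < y ? -1 : 1;
    // Equal words whose last byte is zero: both strings ended here.
    if ((x & 0xFF) == 0)
      return 0;
  }
}

void main() {
  static const(ubyte)* p(string s) { return cast(const(ubyte)*) s.ptr; }

  assert(cmpPadded(p("abc\0"), p("abc\0")) == 0);
  assert(cmpPadded(p("abc\0"), p("abcd\0\0\0\0")) < 0);
  assert(cmpPadded(p("b\0\0\0"), p("abcd\0\0\0\0")) > 0);
  assert(cmpPadded(p("\0\0\0\0"), p("\0\0\0\0")) == 0);
}
```

On a little endian machine the byte swap eats into the gain, which is part of why the trick was tied to big endian CPUs.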




Re: 4x faster strlen with 4 char sentinel

2016-06-27 Thread Jay Norwood via Digitalmars-d-announce
On Monday, 27 June 2016 at 06:31:49 UTC, Ola Fosheim Grøstad 
wrote:
Besides there are plenty of other advantages to using a 
terminating sentinel depending on the use scenario. E.g. if you 
want many versions of the same tail or if you are splitting a 
string at white space (overwrite a white space char with a 
zero).


This strlen2 doesn't require special alignment or casting of char 
pointer types to some larger type. That keeps the strlen2 
implementation fairly simple.


The implementation is only testing one char per increment.  It 
doesn't require the extra xor processing used in some of the 
examples.


I haven't checked if there is a strlen for dchar or wchar, but it 
should also speed up those.







Re: [Semi OT] About code review

2016-06-27 Thread Steven Schveighoffer via Digitalmars-d-announce

On 6/26/16 8:01 PM, deadalnix wrote:

Several people during DConf asked about tips and tricks on code review.
So I wrote an article about it:

http://www.deadalnix.me/2016/06/27/on-code-review/


Very nice. One thing missing: Always remember to update documentation 
when submitting updates! Can probably be lumped together with tests section.


-Steve


Re: PowerNex - New release of my D kernel

2016-06-27 Thread Guest via Digitalmars-d-announce

Also mentioned on OSNews: http://www.osnews.com/comments/29268


Another audio plugin in D

2016-06-27 Thread Guillaume Piolat via Digitalmars-d-announce

Greetings,

Auburn Sounds has released its second product fully made in D. It 
is intended to solve the following audio mixing problems:

- "I need to put more stereo in this track" and
- "regular panning doesn't sound that good on headphones".

https://www.auburnsounds.com/products/Panagement.html
https://www.youtube.com/watch?v=YytzPk09cQk

On the dplug side (https://github.com/p0nce/dplug/) rendering got 
optimized further, Audio Unit v2 was added and LDC became the 
compiler of choice for all releases. I couldn't be happier 
about LDC development.


On this note: if LDC ever supports iPhone and dplug implements 
Audio Unit v3, this might open up the iPhone market for audio 
effects since AU are sellable on the AppStore directly.


Unloading of shared libraries on OS X continues to be a problem 
though, it would be nice if it worked in 64-bit.


Last piece of news: I also started freelancing by accident 
(automating signaletics, in D too). So if you need a D programmer 
drop me an email!


Re: [Semi OT] About code review

2016-06-27 Thread Walter Bright via Digitalmars-d-announce

On 6/26/2016 5:01 PM, deadalnix wrote:

http://www.deadalnix.me/2016/06/27/on-code-review/


Nice article!


Re: Button: A fast, correct, and elegantly simple build system.

2016-06-27 Thread Jason White via Digitalmars-d-announce

On Monday, 27 June 2016 at 06:43:26 UTC, Rory McGuire wrote:
FYI, I implemented this feature today (no Batch/PowerShell 
output yet though):

http://jasonwhite.github.io/button/docs/commands/convert

I think Bash should work on most Unix-like platforms.



And there is this[0] for windows, if you wanted to try bash on 
windows:



[0]: https://msdn.microsoft.com/en-us/commandline/wsl/about


Thanks, but I'll be sticking to bash on Linux. ;)

I'll add Batch (and maybe PowerShell) output when Button is 
supported on Windows. It should be very easy.