Re: The One Billion Row Challenge

2024-01-13 Thread Sergey via Digitalmars-d-learn

On Saturday, 13 January 2024 at 23:25:07 UTC, monkyyy wrote:

On Thursday, 11 January 2024 at 11:21:39 UTC, Sergey wrote:
On Thursday, 11 January 2024 at 08:57:43 UTC, Christian 
Köstlin wrote:

Did someone already try to do this in dlang?
I guess it will be very hard to beat the java solutions 
running with graalvm!


https://news.ycombinator.com/item?id=38851337

Kind regards,
Christian


I think C++ people already beat Java's performance 
https://github.com/buybackoff/1brc?tab=readme-ov-file#native


I feel we could beat c++ if they didn't radix sort


The project is very hard. Many optimizations and tricks were 
applied by others.
It requires a lot of skill to implement everything at a high 
level.


Re: The One Billion Row Challenge

2024-01-13 Thread monkyyy via Digitalmars-d-learn

On Thursday, 11 January 2024 at 11:21:39 UTC, Sergey wrote:
On Thursday, 11 January 2024 at 08:57:43 UTC, Christian Köstlin 
wrote:

Did someone already try to do this in dlang?
I guess it will be very hard to beat the java solutions 
running with graalvm!


https://news.ycombinator.com/item?id=38851337

Kind regards,
Christian


I think C++ people already beat Java's performance 
https://github.com/buybackoff/1brc?tab=readme-ov-file#native


I feel we could beat c++ if they didn't radix sort


Re: Help optimize D solution to phone encoding problem: extremely slow performance.

2024-01-13 Thread Sergey via Digitalmars-d-learn

On Saturday, 13 January 2024 at 19:35:57 UTC, Renato wrote:

On Saturday, 13 January 2024 at 17:00:58 UTC, Anonymouse wrote:

On Saturday, 13 January 2024 at 12:55:27 UTC, Renato wrote:

[...]
I will have to try it... I thought that `BigInt` was to blame 
for the slowness (from what I could read from the trace logs), 
but after replacing that with basically a byte array key (see 
[commit 
here](https://github.com/renatoathaydes/prechelt-phone-number-encoding/commit/0e9025b9aacdcfef5a2649be4cc82b9bc607fd6c)) it barely improved. It's still much slower than Common Lisp and very, very far from Java and Rust.


In the repo it is hard to find the proper version.
I've checked the Rust code on the master branch and it looks a bit 
different from the D implementation.


I would suggest rewriting it the same way the Rust version is implemented.
You might want to try the following:
* do not use BigInt from std. It could be quite slow. Try the 
GMP library from Dub instead
* don't call "idup" every time
* instead of byLine, try byLineCopy
* instead of "arr ~= data", try Appender 
(https://dlang.org/library/std/array/appender.html)
* you could also use splitter 
(https://dlang.org/library/std/algorithm/iteration/splitter.html) 
to lazily process each part of the data
* the isLastDigit function has many checks, but I think it could be 
implemented more simply, the way the Rust version does it
* also consider using range functions (filter, map), as you 
do in Rust, instead of for loops
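
A few of these suggestions can be sketched as follows. This is only a minimal illustration, not code from the repository: the function names `fields`, `fieldsLazy` and `readLines`, and the comma-separated input, are hypothetical and chosen just to show `Appender`, `splitter`, `byLineCopy` and a `map`/`filter` pipeline together.

```d
import std.algorithm.iteration : filter, map, splitter;
import std.array : appender, array;
import std.stdio : File, writeln;
import std.string : strip;

/// Hypothetical helper: collects the non-empty, stripped fields of a
/// comma-separated line, using Appender instead of `arr ~= data`.
string[] fields(string line)
{
    auto result = appender!(string[])();
    // splitter is lazy: each part is a slice of `line`, nothing is copied.
    foreach (part; line.splitter(','))
    {
        auto trimmed = part.strip;
        if (trimmed.length)
            result.put(trimmed);
    }
    return result.data;
}

/// The same logic as a range pipeline (map/filter) instead of a loop.
auto fieldsLazy(string line)
{
    return line.splitter(',')
               .map!(p => p.strip)
               .filter!(p => p.length > 0);
}

/// Hypothetical helper: reads all lines of a file. byLineCopy yields a
/// fresh string per line, so no separate idup call is needed.
string[] readLines(string path)
{
    auto lines = appender!(string[])();
    foreach (line; File(path).byLineCopy)
        lines.put(line);
    return lines.data;
}

void main(string[] args)
{
    assert(fields(" a, b ,,c ") == ["a", "b", "c"]);
    assert(fieldsLazy(" a, b ,,c ").array == ["a", "b", "c"]);
    // Only touch the file system when a path is given on the command line.
    if (args.length > 1)
        writeln(readLines(args[1]).length, " lines");
}
```

Whether any of this helps in the actual program would still have to be measured; the sketch only shows the shape of the calls.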


Re: Socket handle leak and active handle warning with Vibe-D

2024-01-13 Thread bomat via Digitalmars-d-learn

I am still getting this in 2024 and vibe.d 0.9.7:
```
Warning: 1 socket handles leaked at driver shutdown.
```

I was wondering if maybe someone has new info on this...


Re: Help optimize D solution to phone encoding problem: extremely slow performance.

2024-01-13 Thread Renato via Digitalmars-d-learn

On Saturday, 13 January 2024 at 17:00:58 UTC, Anonymouse wrote:

On Saturday, 13 January 2024 at 12:55:27 UTC, Renato wrote:

[...]
Not a great profiling experience :). Does anyone have a better 
suggestion to "parse" the trace file?


As a drive-by suggestion (and I hope it doesn't derail anything): 
if you have the opportunity to run it on Linux, have you 
tried profiling with callgrind instead, with {Q,K}Cachegrind to 
visualise things? Your repositories probably have them. 
(callgrind is a part of valgrind.)


The wiki only mentions callgrind in passing, but it has worked 
well for me. [(example)](https://i.imgur.com/WWZAwy3.png)


Thanks for the suggestion, this looks promising as I do have a 
Linux laptop (just not my main one).


I will have to try it... I thought that `BigInt` was to blame for 
the slowness (from what I could read from the trace logs), but 
after replacing that with basically a byte array key (see [commit 
here](https://github.com/renatoathaydes/prechelt-phone-number-encoding/commit/0e9025b9aacdcfef5a2649be4cc82b9bc607fd6c)) it barely improved. It's still much slower than Common Lisp and very, very far from Java and Rust.


Re: Help optimize D solution to phone encoding problem: extremely slow performance.

2024-01-13 Thread Anonymouse via Digitalmars-d-learn

On Saturday, 13 January 2024 at 12:55:27 UTC, Renato wrote:

[...]
Not a great profiling experience :). Does anyone have a better 
suggestion to "parse" the trace file?


As a drive-by suggestion (and I hope it doesn't derail anything): 
if you have the opportunity to run it on Linux, have you 
tried profiling with callgrind instead, with {Q,K}Cachegrind to 
visualise things? Your repositories probably have them. 
(callgrind is a part of valgrind.)


The wiki only mentions callgrind in passing, but it has worked 
well for me. [(example)](https://i.imgur.com/WWZAwy3.png)


Re: Help optimize D solution to phone encoding problem: extremely slow performance.

2024-01-13 Thread Renato via Digitalmars-d-learn

On Saturday, 13 January 2024 at 11:03:42 UTC, Renato wrote:


I tried to profile the D code but the profiler seems to be 
broken on my OS (Mac):




I profiled it on Linux and stored [the trace.log 
file](https://gist.github.com/renatoathaydes/fd8752ed81b0cf792ed7ef5aa3d68acd) on a public Gist.


I tried using the HTML viewer [recommended in the 
wiki](https://wiki.dlang.org/Development_tools) but that doesn't 
work... first try:


```
Running 
../../../.dub/packages/d-profile-viewer/~master/d-profile-viewer/d-profile-viewer
Corrupt trace.log (can't compute ticks per second), please 
re-profile and try again

```

Second try with a longer trace (the one I saved in the gist):

```
Running 
../../../.dub/packages/d-profile-viewer/~master/d-profile-viewer/d-profile-viewer

std.conv.ConvException@/home/renato/dlang/ldc-1.36.0/bin/../import/std/conv.d(2533):
 Unexpected '-' when converting from type char[] to type ulong

??:?
 [0x55c4be9bc99e]
??:?
 [0x55c4be9bc612]
??:?
 [0x55c4be9de81e]
??:?
 [0x55c4be9c4fbf]
/home/renato/dlang/ldc-1.36.0/bin/../import/std/conv.d:2533 
[0x55c4be98ba2f]
/home/renato/dlang/ldc-1.36.0/bin/../import/std/conv.d:2002 
[0x55c4be98b6fc]
/home/renato/dlang/ldc-1.36.0/bin/../import/std/conv.d:210 
[0x55c4be9665ec]

../../../.dub/packages/d-profile-viewer/~master/d-profile-viewer/source/app.d:1095
 [0x55c4be965e21]
../../../.dub/packages/d-profile-viewer/~master/d-profile-viewer/source/app.d:1138
 [0x55c4be96698a]
??:?
 [0x55c4be9c4c9c]
```

Not a great profiling experience :). Does anyone have a better 
suggestion to "parse" the trace file?




Help optimize D solution to phone encoding problem: extremely slow performance.

2024-01-13 Thread Renato via Digitalmars-d-learn
I like to use a phone encoding problem to determine the 
strengths and weaknesses of programming languages because this 
problem is easy enough that I can write solutions in any language in a 
few hours, but complex enough to exercise lots of interesting 
parts of the language.


You can check [my initial blog 
post](https://renato.athaydes.com/posts/revisiting-prechelt-paper-comparing-languages) about this, which was an analysis of the original study by Prechelt about the productivity differences between programmers using different languages.


The original problem has a flaw when used on modern computers: 
the programs find solutions so quickly that most of the time 
in the benchmarks was being spent printing solutions to 
stdout. So I modified the problem so that there's an option to 
either print the solutions or just count them. 
The programs still need to do all the work, but are not 
required to print anything other than how many solutions were 
found.


Anyway, I ported the Common Lisp solution to D because, like CL, 
D has built-in data structures like associative arrays and 
`BigInt` (at least it's in the stdlib)... I thought this would 
actually give D an edge! But it turns out D performs very badly 
for larger input sets. It starts quite well on smaller inputs, 
but scales very poorly.


My initial Rust solution also performed very poorly, so I'm 
afraid the same is happening with my initial D solution, despite 
my best effort to write something "decent".


You can find my D solution [in this Pull 
Request](https://github.com/renatoathaydes/prechelt-phone-number-encoding/pull/16).


The solution is almost identical, to the extent possible, to the 
Common Lisp solution, which can be [found 
here](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/fastest-implementations-print-or-count/src/lisp/main.lisp).


The secret to high performance for the algorithm being used is 
having a very efficient `BigInt` implementation and a fast hashing 
function for the hash table (associative array in D, with 
`BigInt` as keys). Hence, I suppose D's hash function or `BigInt` 
are not that fast (Rust's default hash is also not great due to 
security concerns, but it's very easy to use a custom one which 
is much faster by changing a single line of code).
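
To make the two key types concrete, here is a minimal sketch. It is not the code from the pull request. The letter-to-digit table follows the mapping in Prechelt's task description; the helper names `wordToBigInt` and `wordToBytes`, and the leading `1` used to keep leading zeros distinct in the `BigInt` key, are assumptions made only for illustration.

```d
import std.ascii : isAlpha, toLower;
import std.bigint : BigInt;

// Digit for each letter a..z, following the table in Prechelt's task
// description: e=0, jnq=1, rwx=2, dsy=3, ft=4, am=5, civ=6, bku=7,
// lop=8, ghz=9.
immutable ubyte[26] letterDigit = [
    5, 7, 6, 3, 0, 4, 9, 9, 6, 1, 7, 8, 5,  // a..m
    1, 8, 8, 1, 2, 3, 4, 7, 6, 2, 2, 3, 9,  // n..z
];

/// Hypothetical helper: encodes a word as a BigInt key. The leading 1 is
/// an assumption, used so that words starting with 'e' (digit 0) do not
/// lose their first digit.
BigInt wordToBigInt(string word)
{
    auto n = BigInt(1);
    foreach (char c; word)
    {
        if (isAlpha(c))
        {
            int digit = letterDigit[toLower(c) - 'a'];
            n = n * 10 + digit;
        }
    }
    return n;
}

/// Hypothetical helper: encodes the same word as an immutable byte-array
/// key, one digit per byte.
immutable(ubyte)[] wordToBytes(string word)
{
    ubyte[] digits;
    digits.reserve(word.length);
    foreach (char c; word)
    {
        if (isAlpha(c))
            digits ~= letterDigit[toLower(c) - 'a'];
    }
    return digits.idup;
}

void main()
{
    // Two associative arrays holding the same dictionary under each key type.
    string[][BigInt] byNumber;
    string[][immutable(ubyte)[]] byDigits;
    foreach (w; ["Torf", "fort", "je"])
    {
        byNumber[wordToBigInt(w)] ~= w;
        byDigits[wordToBytes(w)] ~= w;
    }
    // "Torf" and "fort" both map to the digits 4824.
    assert(byNumber[BigInt(14824)] == ["Torf", "fort"]);
    immutable(ubyte)[] key = [4, 8, 2, 4];
    assert(byDigits[key] == ["Torf", "fort"]);
}
```

Both maps hold the same words; only hashing and comparing the keys differs, which is the part worth measuring in a profiler.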


Anyway, here's the current relative performance. The other 
languages are pretty heavily optimised. The Java solution uses a 
different algorithm, so it's not directly comparable. [The Rust 
solution](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/fastest-implementations-print-or-count/src/rust/phone_encoder/src/main.rs) uses approximately the same algorithm as CL and D but, instead of `BigInt`, it uses a `Vec` as key, as that turned out to be faster. I may try that technique in D as well, but notice that even using Rust's slower BigInt, it was orders of magnitude faster than D.


```
Benchmarking...
Proc,Run,Memory(bytes),Time(ms)
===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,109895680,377
java-Main,179634176,1025
java-Main,167149568,1621
java-Main,180203520,2493
java-Main,96481280,6112
java-Main,95526912,7989
===> sbcl --script src/lisp/main.fasl
sbcl,31780864,74
sbcl,79437824,888
sbcl,79613952,3991
sbcl,80654336,7622
sbcl,80420864,18623
sbcl,83402752,29503
===> ./rust
./rust,23257088,58
./rust,23437312,260
./rust,23433216,1077
./rust,23416832,2017
./rust,7106560,6757
./rust,7110656,10165
===> src/d/dencoder
src/d/dencoder,38748160,223
src/d/dencoder,75800576,3154
src/d/dencoder,75788288,14905
src/d/dencoder,75751424,30271
```

I had to abort D as it was taking too long.

The above is with the `print` option (that's normally slow when 
stdout is not buffered, but I did buffer D's writes and it's 
still very slow).


With the `count` option, which does not print anything except a 
number at the end, D took so long I couldn't even collect its 
results (I waited several minutes)... the other languages 
finished in much less than a minute:


```
Benchmarking...
Proc,Run,Memory(bytes),Time(ms)
===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,124112896,7883
java-Main,107487232,9273
===> sbcl --script src/lisp/main.fasl
sbcl,82669568,25638
sbcl,83759104,33501
===> ./rust
./rust,7061504,9488
./rust,7127040,11441
===> src/d/dencoder
^C
(ldc-1.36.0)
```

I tried to profile the D code but the profiler seems to be broken 
on my OS (Mac):


```
▶ dub build -b profile
Starting Performing "profile" build using 
/Users/renato/dlang/ldc-1.36.0/bin/ldc2 for x86_64.
Building prechelt ~master: building configuration 
[application]

 Linking dencoder
(ldc-1.36.0)
prechelt-phone-number-encoding/src/d  dlang ✗ 
9m ◒

▶ cd ../..
(ldc-1.36.0)
programming/projects/prechelt-phone-number-encoding  dlang ✗  
9m ◒

▶ src/d/dencoder
[1]