Re: Fast Random Data Generation (Was: Re: Unidentified subject!)

2024-02-13 Thread Linux-Fan

David Christensen writes:


On 2/12/24 08:30, Linux-Fan wrote:

David Christensen writes:


On 2/11/24 02:26, Linux-Fan wrote:

I wrote a program to automatically generate random bytes in multiple threads:
https://masysma.net/32/big4.xhtml


What algorithm did you implement?


I copied the algorithm from here:
https://www.javamex.com/tutorials/random_numbers/numerical_recipes.shtml


That Java code uses locks, which implies it uses global state and cannot be  
run multi-threaded (?).  (E.g. one process with one JVM.)


Indeed, the example code uses locks which is bad from a performance point of  
view. That is why _my_ implementation works without this fine-grained  
locking and instead ensures that each thread uses its own instance as to  
avoid the lock. IOW: I copied the algorithm but of course adjusted the code  
to my use case.


My version basically runs as follows:

* Create one queue of ByteBuffers
* Create multiple threads
   * Each thread runs their own RNG instance
   * Upon finishing the creation of a buffer, it enqueues
 the resulting ByteBuffer into the queue (this is the only
 part where multiple threads access concurrently)
* The main thread dequeues from the queue and writes the
  buffers to the output file

Is it possible to obtain parallel operation on an SMP machine with multiple  
virtual processors?  (Other than multiple OS processes with one PRNG on one  
JVM each?)


Even the locked random could be instantiated multiple times (each instance  
gets their own lock) and this could still be faster than running just  
one of it. However, since the computation itself is fast, I suppose the  
performance hit from managing the locks could be significant. Multiple OS  
processes would also work, but is pretty uncommon in Java land AFAIR.


I found it during the development of another application where I needed a  
lot of random data for simulation purposes :)


My implementation code is here:
https://github.com/m7a/bo-big/blob/master/latest/Big4.java


See the end of that file to compare with the “Numerical Recipes” RNG linked  
further above to observe the difference wrt. locking :)



If I were to do it again today, I'd probably switch to any of these PRNGS:

* https://burtleburtle.net/bob/rand/smallprng.html
* https://www.pcg-random.org/


Hard core.  I'll let the experts figure it out; and then I will use their  
libraries and programs.


IIRC one of the findings of PCG was that the default RNGs of many  
programming languages and environments are surprisingly bad. I only arrived at  
using a non-default implementation after facing some issues with the Java  
integrated ThreadLocalRandom ”back then” :)


It may indeed be worth pointing out (as Jeffrey Walton already mentioned in  
another subthread) that these RNGs discussed here are _not_ cryptographic  
RNGs. I think for disk testing purposes it is OK to use fast non- 
cryptographic RNGs, but other applications may have higher demands on their  
RNGs.


HTH
Linux-Fan

öö

[...]


pgpLYMPAsjGtj.pgp
Description: PGP signature


Re: Fast Random Data Generation (Was: Re: Unidentified subject!)

2024-02-12 Thread David Christensen

On 2/12/24 08:30, Linux-Fan wrote:

David Christensen writes:


On 2/11/24 02:26, Linux-Fan wrote:
I wrote a program to automatically generate random bytes in multiple 
threads:

https://masysma.net/32/big4.xhtml


What algorithm did you implement?


I copied the algorithm from here:
https://www.javamex.com/tutorials/random_numbers/numerical_recipes.shtml



That Java code uses locks, which implies it uses global state and cannot 
be run multi-threaded (?).  (E.g. one process with one JVM.)



Is it possible to obtain parallel operation on an SMP machine with 
multiple virtual processors?  (Other than multiple OS processes with one 
PRNG on one JVM each?)



I found it during the development of another application where I needed 
a lot of random data for simulation purposes :)


My implementation code is here:
https://github.com/m7a/bo-big/blob/master/latest/Big4.java

If I were to do it again today, I'd probably switch to any of these PRNGS:

* https://burtleburtle.net/bob/rand/smallprng.html
* https://www.pcg-random.org/



Hard core.  I'll let the experts figure it out; and then I will use 
their libraries and programs.



David



Re: Fast Random Data Generation (Was: Re: Unidentified subject!)

2024-02-12 Thread Jeffrey Walton
On Mon, Feb 12, 2024 at 3:02 PM Linux-Fan  wrote:
>
> David Christensen writes:
>
> > On 2/11/24 02:26, Linux-Fan wrote:
> >> I wrote a program to automatically generate random bytes in multiple 
> >> threads:
> >> https://masysma.net/32/big4.xhtml
> >>
> >> Before knowing about `fio` this way my way to benchmark SSDs :)
> >>
> >> Example:
> >>
> >> | $ big4 -b /dev/null 100 GiB
> >> | Ma_Sys.ma Big 4.0.2, Copyright (c) 2014, 2019, 2020 Ma_Sys.ma.
> >> | For further info send an e-mail to ma_sys...@web.de.
>
> [...]
>
> >> | 99.97% +8426 MiB 7813 MiB/s 102368/102400 MiB
> >> | Wrote 102400 MiB in 13 s @ 7812.023 MiB/s
> >
> >
> > What algorithm did you implement?
>
> I copied the algorithm from here:
> https://www.javamex.com/tutorials/random_numbers/numerical_recipes.shtml
>
> I found it during the development of another application where I needed a
> lot of random data for simulation purposes :)

A PRNG for a simulation has different requirements than a PRNG for
cryptographic purposes. A simulation usually needs numbers fast from a
uniform distribution. Simulations can use predictable numbers. Often a
Linear Congurential Generator (LCG) will do just fine even though they
were broken about 35 years ago. See Also see Joan Boyer's Inferring
Sequences Produced by Pseudo-Random Number Generators,
.

A cryptographic application will have more stringent requirements. A
cryptographic generator may (will?) take longer to generate a number,
the numbers need to come from a uniform distribution, and the numbers
need to be prediction resistant. You can read about the cryptographic
qualities of random numbers in NIST SP800-90 and friends.

> My implementation code is here:
> https://github.com/m7a/bo-big/blob/master/latest/Big4.java
>
> If I were to do it again today, I'd probably switch to any of these PRNGS:
>
>  * https://burtleburtle.net/bob/rand/smallprng.html
>  * https://www.pcg-random.org/
>
> >> Secure Random can be obtained from OpenSSL:
> >>
> >> | $ time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 
> >> 1024
> >> * 1024)); done
> >> |
> >> | real0m49.288s
> >> | user0m44.710s
> >> | sys0m4.579s
> >>
> >> Effectively 2078 MiB/s (quite OK for single-threaded operation). It is not
> >> designed to generate large amounts of random data as the size is limited by
> >> integer range...

Jeff



Re: Fast Random Data Generation (Was: Re: Unidentified subject!)

2024-02-12 Thread Linux-Fan

David Christensen writes:


On 2/11/24 02:26, Linux-Fan wrote:

I wrote a program to automatically generate random bytes in multiple threads:
https://masysma.net/32/big4.xhtml

Before knowing about `fio` this way my way to benchmark SSDs :)

Example:

| $ big4 -b /dev/null 100 GiB
| Ma_Sys.ma Big 4.0.2, Copyright (c) 2014, 2019, 2020 Ma_Sys.ma.
| For further info send an e-mail to ma_sys...@web.de.


[...]


| 99.97% +8426 MiB 7813 MiB/s 102368/102400 MiB
| Wrote 102400 MiB in 13 s @ 7812.023 MiB/s



What algorithm did you implement?


I copied the algorithm from here:
https://www.javamex.com/tutorials/random_numbers/numerical_recipes.shtml

I found it during the development of another application where I needed a  
lot of random data for simulation purposes :)


My implementation code is here:
https://github.com/m7a/bo-big/blob/master/latest/Big4.java

If I were to do it again today, I'd probably switch to any of these PRNGS:

* https://burtleburtle.net/bob/rand/smallprng.html
* https://www.pcg-random.org/


Secure Random can be obtained from OpenSSL:

| $ time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 1024  
* 1024)); done

|
| real    0m49.288s
| user    0m44.710s
| sys    0m4.579s

Effectively 2078 MiB/s (quite OK for single-threaded operation). It is not  
designed to generate large amounts of random data as the size is limited by  
integer range...



Thank you for posting the openssl(1) incantation.


You're welcome.

[...]

HTH
Linux-Fan

öö


pgpjvuqb6Fy1L.pgp
Description: PGP signature


Re: Fast Random Data Generation (Was: Re: Unidentified subject!)

2024-02-11 Thread David Christensen

On 2/11/24 02:26, Linux-Fan wrote:
I wrote a program to automatically generate random bytes in multiple 
threads:

https://masysma.net/32/big4.xhtml

Before knowing about `fio` this way my way to benchmark SSDs :)

Example:

| $ big4 -b /dev/null 100 GiB
| Ma_Sys.ma Big 4.0.2, Copyright (c) 2014, 2019, 2020 Ma_Sys.ma.
| For further info send an e-mail to ma_sys...@web.de.
|| 0.00% +0 MiB 0 MiB/s 0/102400 MiB
| 3.48% +3562 MiB 3255 MiB/s 3562/102400 MiB
| 11.06% +7764 MiB 5407 MiB/s 11329/102400 MiB
| 19.31% +8436 MiB 6387 MiB/s 19768/102400 MiB
| 27.71% +8605 MiB 6928 MiB/s 28378/102400 MiB
| 35.16% +7616 MiB 7062 MiB/s 35999/102400 MiB
| 42.58% +7595 MiB 7150 MiB/s 43598/102400 MiB
| 50.12% +7720 MiB 7230 MiB/s 51321/102400 MiB
| 58.57% +8648 MiB 7405 MiB/s 59975/102400 MiB
| 66.96% +8588 MiB 7535 MiB/s 68569/102400 MiB
| 75.11% +8343 MiB 7615 MiB/s 76916/102400 MiB
| 83.38% +8463 MiB 7691 MiB/s 85383/102400 MiB
| 91.74% +8551 MiB 7762 MiB/s 93937/102400 MiB
| 99.97% +8426 MiB 7813 MiB/s 102368/102400 MiB
|| Wrote 102400 MiB in 13 s @ 7812.023 MiB/s



What algorithm did you implement?



Secure Random can be obtained from OpenSSL:

| $ time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 
1024 * 1024)); done

|
| real    0m49.288s
| user    0m44.710s
| sys    0m4.579s

Effectively 2078 MiB/s (quite OK for single-threaded operation). It is 
not designed to generate large amounts of random data as the size is 
limited by integer range...



Thank you for posting the openssl(1) incantation.


Benchmarking my daily driver laptop without Intel Secure Key:

2024-02-11 21:54:04 dpchrist@laalaa ~
$ lscpu | grep 'Model name'
Model name: Intel(R) Core(TM) i7-2720QM CPU @ 
2.20GHz


2024-02-11 21:54:09 dpchrist@laalaa ~
$ time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 
1024 * 1024)); done


real1m40.149s
user1m25.174s
sys 0m14.952s


So, ~1.072E+9 bytes per second.


Benchmarking a workstation with Intel Secure Key:

2024-02-11 21:54:40 dpchrist@taz ~
$ lscpu | grep 'Model name'
Model name: Intel(R) Xeon(R) E-2174G CPU @ 3.80GHz

2024-02-11 21:54:46 dpchrist@taz ~
$ time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 
1024 * 1024)); done


real1m14.696s
user1m0.338s
sys 0m14.353s


So, ~1.437E+09 bytes per second.


David




Re: Fast Random Data Generation (Was: Re: Unidentified subject!)

2024-02-11 Thread Gremlin

On 2/11/24 05:26, Linux-Fan wrote:

David Christensen writes:


On 2/11/24 00:11, Thomas Schmitt wrote:


[...]


Increase block size:

2024-02-11 01:18:51 dpchrist@laalaa ~
$ dd if=/dev/urandom of=/dev/null bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 3.62874 s, 296 MB/s


Here (Intel Xeon W-2295)

| $ dd if=/dev/urandom of=/dev/null bs=1M count=1K
| 1024+0 records in
| 1024+0 records out
| 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.15018 s, 499 MB/s



Raspberry pi 5:

dd if=/dev/urandom of=/dev/null bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 3.0122 s, 356 MB/s

Now lets do it right and use random..
dd if=/dev/random of=/dev/null bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.9859 s, 360 MB/s


Secure Random can be obtained from OpenSSL:

| $ time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 
1024 * 1024)); done

|
| real    0m49.288s
| user    0m44.710s
| sys    0m4.579s


time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 1024 
* 1024)); done


real1m30.884s
user1m24.528s
sys 0m6.116s





Fast Random Data Generation (Was: Re: Unidentified subject!)

2024-02-11 Thread Linux-Fan

David Christensen writes:


On 2/11/24 00:11, Thomas Schmitt wrote:


[...]


Increase block size:

2024-02-11 01:18:51 dpchrist@laalaa ~
$ dd if=/dev/urandom of=/dev/null bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 3.62874 s, 296 MB/s


Here (Intel Xeon W-2295)

| $ dd if=/dev/urandom of=/dev/null bs=1M count=1K
| 1024+0 records in
| 1024+0 records out
| 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.15018 s, 499 MB/s


Concurrency:

threads throughput
1   296 MB/s
2   285+286=571 MB/s
3   271+264+266=801 MB/s
4   249+250+241+262=1,002 MB/s
5   225+214+210+224+225=1,098 MB/s
6   223+199+199+204+213+205=1,243 MB/s
7   191+209+210+204+213+201+197=1,425 MB/s
8   205+198+180+195+205+184+184+189=1,540 MB/s


I wrote a program to automatically generate random bytes in multiple threads:
https://masysma.net/32/big4.xhtml

Before knowing about `fio` this way my way to benchmark SSDs :)

Example:

| $ big4 -b /dev/null 100 GiB
| Ma_Sys.ma Big 4.0.2, Copyright (c) 2014, 2019, 2020 Ma_Sys.ma.
| For further info send an e-mail to ma_sys...@web.de.
| 
| 0.00% +0 MiB 0 MiB/s 0/102400 MiB

| 3.48% +3562 MiB 3255 MiB/s 3562/102400 MiB
| 11.06% +7764 MiB 5407 MiB/s 11329/102400 MiB
| 19.31% +8436 MiB 6387 MiB/s 19768/102400 MiB
| 27.71% +8605 MiB 6928 MiB/s 28378/102400 MiB
| 35.16% +7616 MiB 7062 MiB/s 35999/102400 MiB
| 42.58% +7595 MiB 7150 MiB/s 43598/102400 MiB
| 50.12% +7720 MiB 7230 MiB/s 51321/102400 MiB
| 58.57% +8648 MiB 7405 MiB/s 59975/102400 MiB
| 66.96% +8588 MiB 7535 MiB/s 68569/102400 MiB
| 75.11% +8343 MiB 7615 MiB/s 76916/102400 MiB
| 83.38% +8463 MiB 7691 MiB/s 85383/102400 MiB
| 91.74% +8551 MiB 7762 MiB/s 93937/102400 MiB
| 99.97% +8426 MiB 7813 MiB/s 102368/102400 MiB
| 
| Wrote 102400 MiB in 13 s @ 7812.023 MiB/s


[...]

Secure Random can be obtained from OpenSSL:

| $ time for i in `seq 1 100`; do openssl rand -out /dev/null $((1024 * 1024 * 
1024)); done
|
| real  0m49.288s
| user  0m44.710s
| sys   0m4.579s

Effectively 2078 MiB/s (quite OK for single-threaded operation). It is not  
designed to generate large amounts of random data as the size is limited by  
integer range...


HTH
Linux-Fan

öö


pgpoAijtbLtka.pgp
Description: PGP signature