Re: Mir vs. Numpy: Reworked!

2020-12-04 Thread 9il via Digitalmars-d-announce

On Friday, 4 December 2020 at 03:48:15 UTC, Walter Bright wrote:

On 12/3/2020 8:27 AM, 9il wrote:
Since the first announcement [0] the original benchmark [1] 
has been boosted [2] with Mir-like implementations.


This is really great! Can you write an article about it? Such 
would be really helpful in letting people know about it.


Thanks! The README is really great as the benchmark description. 
I will do a small article about Mir this year.


Re: Mir vs. Numpy: Reworked!

2020-12-04 Thread 9il via Digitalmars-d-announce

On Friday, 4 December 2020 at 02:35:49 UTC, data pulverizer wrote:

On Thursday, 3 December 2020 at 21:28:04 UTC, jmh530 wrote:

Am I correct in assuming that the data in the NDSlice is also a 
single array?


sweep_ndslice uses (2*N - 1) arrays to index U; this allows LDC 
to unroll the loop.


For example, in the 2D case, withNeighboursSum [2] stores the 
pointer to the result and the pointers to the rows above and below.


matrix:
------
--a--- above iterator
--r--- the result
--b--- below iterator
------
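
The three-row layout above can be sketched in plain Python. This is only an illustration of the idea, not the benchmark's or Mir's code: the function name `neighbours_sum_2d` and the flat row-major list are assumptions made for the example, and the three offsets stand in for the above, result, and below iterators.

```python
def neighbours_sum_2d(u, rows, cols):
    """Return a copy of the flat row-major grid `u` in which every interior
    cell is replaced by the sum of its four direct neighbours."""
    out = list(u)
    for i in range(1, rows - 1):
        above = (i - 1) * cols   # start of the row above (the "a" row)
        result = i * cols        # start of the row being written (the "r" row)
        below = (i + 1) * cols   # start of the row below (the "b" row)
        for j in range(1, cols - 1):
            out[result + j] = (u[above + j] + u[below + j]
                               + u[result + j - 1] + u[result + j + 1])
    return out
```

Because the three row offsets are fixed before the inner loop starts, the inner loop only advances along `j`, which is the kind of simple access pattern a compiler can unroll.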

Also, for AVX-512 targets, it allows the loop to be vectorized [1]. 
The benchmark was run on an AVX2 CPU.


[1] https://github.com/typohnebild/numpy-vs-mir/issues/4
[2] 
http://mir-algorithm.libmir.org/mir_ndslice_topology.html#.withNeighboursSum


Re: Mir vs. Numpy: Reworked!

2020-12-04 Thread 9il via Digitalmars-d-announce

On Thursday, 3 December 2020 at 17:08:58 UTC, jmh530 wrote:

On Thursday, 3 December 2020 at 16:27:59 UTC, 9il wrote:
Looks good, but a few typos:


Thanks!


Re: Mir vs. Numpy: Reworked!

2020-12-04 Thread 9il via Digitalmars-d-announce

On Thursday, 3 December 2020 at 16:50:39 UTC, Andre Pany wrote:

On Thursday, 3 December 2020 at 16:27:59 UTC, 9il wrote:

Hi all,

Since the first announcement [0] the original benchmark [1] 
has been boosted [2] with Mir-like implementations.


D+Mir:
 1. is more abstract than NumPy
 2. requires less code for multidimensional algorithms
 3. doesn't require indexing
 4. uses recursion across dimensions
 5. is a few times faster than NumPy for non-trivial real-world 
applications.


Why is Mir faster than NumPy?

1. Mir allows the compiler to generate specialized kernels, 
while NumPy constrains a user to write code that needs to 
access memory two or more times.


Another Mir killer feature is the ability to write generalized 
N-dimensional implementations, while NumPy code needs to have 
separate implementations for the 1D, 2D, and 3D cases. For 
example, the main D loop in the benchmark can compile for 4D, 
5D, and higher-dimensional optimizations.
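
As a rough illustration of a single implementation that works for any number of dimensions, here is a Python sketch that recurses across dimensions over a flat row-major buffer. It is not the benchmark's D code and does not use Mir; the names `strides_for` and `neighbours_sum_nd` are hypothetical and exist only for this example.

```python
def strides_for(shape):
    """Row-major strides (in elements) for the given shape."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def neighbours_sum_nd(u, shape):
    """Replace every interior cell of the flat N-dimensional grid `u`
    with the sum of its 2*N direct neighbours."""
    strides = strides_for(shape)
    out = list(u)

    def recurse(dim, offset):
        if dim == len(shape):
            # All indices are fixed: add both neighbours along each axis.
            out[offset] = sum(u[offset - s] + u[offset + s] for s in strides)
            return
        # Peel off one dimension and recurse into the remaining ones.
        for i in range(1, shape[dim] - 1):
            recurse(dim + 1, offset + i * strides[dim])

    recurse(0, 0)
    return out
```

The same function handles `shape = [4]`, `[3, 3]`, or `[3, 3, 3]` without a separate code path per dimension count. In Python the recursion runs at run time, so the sketch shows the structure of the idea rather than its performance.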


2. @nogc iteration loop. @nogc helps when you need to control 
what is going on with your memory allocations in the critical 
code part.


[0] 
https://forum.dlang.org/post/pemharpztorlqkxdo...@forum.dlang.org

[1] https://github.com/typohnebild/numpy-vs-mir
[2] https://github.com/typohnebild/numpy-vs-mir/pull/1

The benchmark [1] has been created by Christoph Alt and Tobias 
Schmidt.


Kind regards,
Ilya


Hi Ilya,

Thanks a lot for sharing the update. I am currently working on 
porting a Python package called FMPY to D. This package makes 
use of NumPy, and I hope I can use MIR here.


You may want to express FMI entities as algebraic types rather 
than classes. mir.algebraic can be really helpful here:


http://mir-core.libmir.org/mir_algebraic.html

Somehow it is hard to get started with learning MIR. What might 
help Python developers is some articles showing NumPy code 
side by side with the equivalent MIR code.


It is hard for me to write articles. I will try to write a small 
one this year, but it would be Mir only. Maybe this benchmark can 
be used as an example and if one wishes to write a side-by-side 
comparison with NumPy I would be happy to comment and explain the 
D implementation and what it is doing internally.


What I miss in MIR is a function to read and write CSV files. 
Is something like numpy.genfromtxt planned?


I am unlikely to add it myself, but I can do a code review.

Currently, we can load/save NumPy binary data with numir:

https://libmir.github.io/numir/io.html

Kind regards,
Ilya


Re: Mir vs. Numpy: Reworked!

2020-12-04 Thread jmh530 via Digitalmars-d-announce

On Friday, 4 December 2020 at 20:26:17 UTC, data pulverizer wrote:

[snip]

I see, looking at some of the code, the field case is literally 
doing the indexing calculation right there. I guess ndslice is 
doing the same thing, just with "Mir magic" in the background? 
Still, ndslice is able to get a consistently higher rate of 
flops than the field case - interesting. One thing I discovered 
about these kinds of plots is that introducing a log scale or 
two, particularly for timed comparisons, can make the 
differences between methods that look close clearer. 
A log plot might show some consistent difference between the 
timings of ndslice and the field case. Underneath they should 
be doing essentially the same thing, so teasing out what is 
causing the difference would be interesting. Is Mir doing some 
more efficient form of the indexing calculation than naked 
field calculations?


I'm still not sure why slice is so slow. Doesn't that 
completely rely on the opSlice implementations? On the choice 
of indexing method and underlying data structure? Isn't it just 
a symbolic interface behind which you write whatever you want?


Ilya might have a better ability to answer that than me.


Re: Mir vs. Numpy: Reworked!

2020-12-04 Thread data pulverizer via Digitalmars-d-announce

On Friday, 4 December 2020 at 14:48:32 UTC, jmh530 wrote:


It looks like all the `sweep_XXX` functions are only defined 
for contiguous slices, as that would be the default if you 
define a Slice!(T, N).


How the functions access the data is a big difference. If you 
compare the `sweep_field` version with the `sweep_naive` 
version, the `sweep_field` function is able to access through 
one index, whereas the `sweep_naive` function has to use two in 
the 2d version and 3 in the 3d version.


Also, the main difference in the NDSlice version is that it 
uses *built-in* MIR functionality, like how `sweep_ndslice` 
uses the `each` function from MIR, whereas `sweep_field` uses a 
for loop. I think this is partially to show that the built-in 
MIR functionality is as fast as if you tried to do it with a 
for loop yourself.


I see, looking at some of the code, the field case is literally 
doing the indexing calculation right there. I guess ndslice is 
doing the same thing, just with "Mir magic" in the background? 
Still, ndslice is able to get a consistently higher rate of flops 
than the field case - interesting. One thing I discovered about 
these kinds of plots is that introducing a log scale or two, 
particularly for timed comparisons, can make the differences 
between methods that look close clearer. A log plot might show 
some consistent difference between the timings of ndslice and the 
field case. Underneath they should be doing essentially the same 
thing, so teasing out what is causing the difference would be 
interesting. Is Mir doing some more efficient form of the 
indexing calculation than naked field calculations?


I'm still not sure why slice is so slow. Doesn't that completely 
rely on the opSlice implementations? On the choice of indexing 
method and underlying data structure? Isn't it just a symbolic 
interface behind which you write whatever you want?




Re: Mir vs. Numpy: Reworked!

2020-12-04 Thread jmh530 via Digitalmars-d-announce

On Friday, 4 December 2020 at 02:35:49 UTC, data pulverizer wrote:

[snip]
NDSlice is even faster for this case - cool. Am I correct in 
assuming that the data in the NDSlice is also a single array?


It looks like all the `sweep_XXX` functions are only defined for 
contiguous slices, as that would be the default if you define a 
Slice!(T, N).


How the functions access the data is a big difference. If you 
compare the `sweep_field` version with the `sweep_naive` version, 
the `sweep_field` function is able to access through one index, 
whereas the `sweep_naive` function has to use two in the 2d 
version and 3 in the 3d version.
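
To make the one-index versus two-index distinction concrete, here is a small Python sketch. It is not taken from the benchmark: `interior_sum_naive` and `interior_sum_field` are hypothetical names, and summing the interior cells is just a stand-in for the real sweep. The first function indexes a nested grid with `(i, j)`, while the second walks a flat row-major buffer with a single running index.

```python
def interior_sum_naive(grid):
    """Sum interior cells using two indices per access."""
    total = 0
    for i in range(1, len(grid) - 1):
        for j in range(1, len(grid[0]) - 1):
            total += grid[i][j]
    return total


def interior_sum_field(field, rows, cols):
    """Sum interior cells of a flat row-major buffer using one running index."""
    total = 0
    for i in range(1, rows - 1):
        idx = i * cols + 1          # first interior cell of row i
        for _ in range(1, cols - 1):
            total += field[idx]
            idx += 1                # step to the next cell in the same row
    return total


rows, cols = 3, 4
field = list(range(rows * cols))
grid = [field[r * cols:(r + 1) * cols] for r in range(rows)]
```

Both functions visit the same cells. The difference is only in how the address of each cell is computed, which is the point being made about `sweep_field` and `sweep_naive`.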


Also, the main difference in the NDSlice version is that it uses 
*built-in* MIR functionality, like how `sweep_ndslice` uses the 
`each` function from MIR, whereas `sweep_field` uses a for loop. 
I think this is partially to show that the built-in MIR 
functionality is as fast as if you tried to do it with a for loop 
yourself.