Re: Winning the Base64 benchmarks.

2019-10-17 Thread cblake
Back when I used Cython more, I did always use Cython and just gradually type 
in things as required (or for some things do the equivalent of #include some C 
for some SSE intrinsics type codes). I agree it is not generally "popular", but 
it is a very legitimate mode of usage. Adding more and more cdef's etc. brings 
your code closer and closer to C semantically while staying in Cython syntax. 
Partly, I had better profiling tools for such code than "pure python" style.

They even have a nice --annotate mode supporting this usage style that spits 
out an HTML page with clickable generated code colorized by how C-like it is. 
That kind of source to "assembly" visualization tool would be nice for Nim as 
well, or even just gcc/clang going all the way to "real" assembly. It's got a 
bit more "oomph" for things spanning orders of magnitude of efficiency like 
pure Py and C, though, and the CPython API calls Cython generates may be a 
little easier to read than real assembly, much like the C code Nim generates.

Now, why good modes of usage (of almost anything) are not more popular..well, 
that's some question for the ages. People are imitative. "It's not popular 
because it's not popular." How to bootstrap something catching on is 
just..tricky.


Re: Winning the Base64 benchmarks.

2019-10-17 Thread treeform
I think the big difference between making Python faster and making Nim faster 
is that in Nim you just use better algorithms, like precomputed tables ... 
something natural. In Python you switch to a different language, like C modules 
or Cython. It's just not the same. Why not always just use C and Cython then?

In Nim the optimization path seems a natural extension of what you do anyway.


Re: Winning the Base64 benchmarks.

2019-10-17 Thread torarinvik
Truth as it's told! Wise post indeed. I hear people talking about speed of the 
implementation rather than the language? I always heard people separating 
between CPython the implementation and the Python language. If one had a super 
powerful AI optimizer I would assume that all languages would be almost equally 
fast. The AI would do the heavy lifting optimizing algorithms generating 
beautiful machine code and people would do human friendly programming and 
higher concepts and ideas. People would program assembly in their spare time 
like other people chop wood for recreation :D :D


Re: Winning the Base64 benchmarks.

2019-10-17 Thread dom96
> benchmarks are a game.
>
> you can always beat or come close to C with Nim.

Agree with you 100%, especially with these statements. For compiled languages 
like C/Nim/Rust/Go it really is just a case of putting in the time to optimise 
the code for a specific architecture or compiler. Even languages like Python 
can cheat and call into a C library. With enough effort you can make anything 
run fast; it's the effort that matters, though. So comparing the performance 
of languages doesn't really make sense.

What does make sense is to compare the stdlib performance, but it's important 
not to blame the language for this and be aware that fixing the performance 
likely just needs someone knowledgeable to put in the effort like you have done 
here.


Re: Winning the Base64 benchmarks.

2019-10-16 Thread refaqtor
just gotta say, "cool!" and Thanks, all! 3 hours from "... noticed that..." to 
"PR up:"!

and love that compile time lookup table goodness!

I use nim heavily on my projects. I hope to get some bits polished enough to 
give back at some point.


Re: Winning the Base64 benchmarks.

2019-10-16 Thread Libman
> benchmarks are a game.

I agree, but in a very positive interpretation of that phrase.

Competitive games are essential, both to individual human development as well 
as software projects. They are a feedback mechanism that challenges potential 
complacency, and helps bring out the best that is within us.

Even if benchmarks don't have a perfect correlation with every real-world 
performance scenario, they have a strong correlation with many. Participating, 
tuning, and winning benchmarks shows that Nim has a community that cares about 
its success. 


Re: Winning the Base64 benchmarks.

2019-10-16 Thread treeform
PR up: 
[https://github.com/nim-lang/Nim/pull/12436](https://github.com/nim-lang/Nim/pull/12436)


Re: Winning the Base64 benchmarks.

2019-10-16 Thread treeform
I am working on a PR.

The gist is just a proof of concept.

You are right. I need to check for "" otherwise my code breaks. I just added 
that in. Thanks!


Re: Winning the Base64 benchmarks.

2019-10-16 Thread juancarlospaco
Where's the PR? :P


if unlikely(str.len == 0): return "" # For eg. encode("")

Special-case a fast return for the empty string?


Re: Winning the Base64 benchmarks.

2019-10-16 Thread Araq
Sounds good.


Re: Winning the Base64 benchmarks.

2019-10-16 Thread treeform
I am happy to take over the stdlib base64 API and make it stable if you guys 
agree with my proposals below:

This is the place where stdlib does not handle errors:

[https://github.com/nim-lang/Nim/blob/master/lib/pure/base64.nim#L123](https://github.com/nim-lang/Nim/blob/master/lib/pure/base64.nim#L123)

It needs to throw an exception saying "invalid base64 encoding" instead of 
setting the value to 63... The RFC says to do so: 
[https://tools.ietf.org/html/rfc4648#section-3.3](https://tools.ietf.org/html/rfc4648#section-3.3)
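
As a rough illustration of that proposal (not the code from the PR or the gist), a decoder can consult an inverse table built at compile time and raise on anything outside the alphabet. The names `buildInverse`, `inverse` and `decodeSketch` are hypothetical, and the choice of `ValueError` is an assumption:

```nim
# Hypothetical sketch: strict base64 decoding that rejects invalid input.
const alphabet =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

proc buildInverse(): array[256, int8] =
  # Map every byte to its 6-bit value, or -1 if it is not in the alphabet.
  for i in 0 .. 255: result[i] = -1'i8
  for i, c in alphabet: result[ord(c)] = int8(i)

const inverse = buildInverse() # evaluated at compile time

proc decodeSketch(s: string): string =
  if s.len mod 4 != 0:
    raise newException(ValueError, "invalid base64 length")
  # Ignore up to two trailing '=' padding characters.
  var n = s.len
  var pad = 0
  while pad < 2 and n > 0 and s[n - 1] == '=':
    dec n
    inc pad
  var acc = 0  # bit accumulator
  var bits = 0 # number of pending bits in acc
  for i in 0 ..< n:
    let v = inverse[ord(s[i])]
    if v < 0:
      # Raise instead of silently substituting a value such as 63.
      raise newException(ValueError,
        "invalid base64 character at index " & $i)
    acc = ((acc shl 6) or int(v)) and 0xFFFF
    bits += 6
    if bits >= 8:
      bits -= 8
      result.add chr((acc shr bits) and 0xFF)
```

With this sketch, `decodeSketch("TWFu")` yields `"Man"`, while an input such as `"TW*u"` raises `ValueError`.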

[https://github.com/nim-lang/Nim/blob/master/lib/pure/base64.nim#L47](https://github.com/nim-lang/Nim/blob/master/lib/pure/base64.nim#L47)

The features it should not support are lineLen and setting a custom newLine. 
They make the code slower.

Python does not support this either: 
[https://docs.python.org/2/library/base64.html](https://docs.python.org/2/library/base64.html)

I think I have seen this in ObjectiveC. I think this is only needed when used 
in emails?

The RFC says to not support it: 
[https://tools.ietf.org/html/rfc4648#section-3.1](https://tools.ietf.org/html/rfc4648#section-3.1)
Leave it up to the email MIME spec. A Multipurpose Internet Mail Extensions 
wrapper should do this part. See: 
[https://tools.ietf.org/html/rfc2045](https://tools.ietf.org/html/rfc2045)
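
One way to picture that split is a small wrapper applied after encoding, so the hot encoding loop never has to think about line breaks. This is only a sketch: `wrapLines` is a hypothetical helper, its `lineLen` and `newLine` parameters just borrow the names mentioned above, and the defaults assume MIME's 76-character lines with CRLF:

```nim
# Hypothetical helper: line wrapping kept outside the base64 encoder.
proc wrapLines(encoded: string, lineLen = 76, newLine = "\r\n"): string =
  ## Inserts `newLine` between chunks of at most `lineLen` characters.
  doAssert lineLen > 0
  var i = 0
  while i < encoded.len:
    if i > 0: result.add newLine
    let last = min(i + lineLen, encoded.len)
    result.add encoded[i ..< last]
    i = last
```

A MIME layer could then call something like `wrapLines(encode(data))`, while every other caller gets the unwrapped output.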

What makes my code faster is the lookup table and dropping support for MIME 
stuff which should not be there.
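
For readers who have not opened the gist, here is a minimal table-driven encoder in the same spirit. It is a sketch rather than the gist's actual code: `encodeSketch` is a hypothetical name, and the table here is simply the constant alphabet string indexed by each 6-bit group:

```nim
# Hypothetical sketch: table-driven base64 encoding without line wrapping.
const alphabet =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

proc encodeSketch(s: string): string =
  if s.len == 0: return ""
  result = newStringOfCap((s.len + 2) div 3 * 4)
  var i = 0
  # Each full 3-byte group becomes four 6-bit indices into the table.
  while i + 2 < s.len:
    let n = (ord(s[i]) shl 16) or (ord(s[i + 1]) shl 8) or ord(s[i + 2])
    result.add alphabet[(n shr 18) and 63]
    result.add alphabet[(n shr 12) and 63]
    result.add alphabet[(n shr 6) and 63]
    result.add alphabet[n and 63]
    i += 3
  # Handle the 1 or 2 leftover bytes and pad with '='.
  let rest = s.len - i
  if rest == 1:
    let n = ord(s[i]) shl 16
    result.add alphabet[(n shr 18) and 63]
    result.add alphabet[(n shr 12) and 63]
    result.add "=="
  elif rest == 2:
    let n = (ord(s[i]) shl 16) or (ord(s[i + 1]) shl 8)
    result.add alphabet[(n shr 18) and 63]
    result.add alphabet[(n shr 12) and 63]
    result.add alphabet[(n shr 6) and 63]
    result.add '='
```

For example, `encodeSketch("Man")` gives `"TWFu"` and `encodeSketch("M")` gives `"TQ=="`.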


Winning the Base64 benchmarks.

2019-10-16 Thread treeform
I was looking at Nim benchmarks here: 
[https://github.com/kostya/benchmarks#base64](https://github.com/kostya/benchmarks#base64)
 , and noticed that Nim's base64 is far behind the simple C implementation. 
I took the plain C algorithm and ported it to Nim without using any crazy C 
pointers etc... and it's just as fast as C on my computer. I learned 3 things:

  * benchmarks are a game.
  * you can always beat or come close to C with Nim.
  * Nim's base64 standard library implementation is slow. (it has more features 
but does not handle errors?)



Here is my as-fast-as-C implementation: 
[https://gist.github.com/treeform/900f55d4bc08e57fe2257360b5f9fa68](https://gist.github.com/treeform/900f55d4bc08e57fe2257360b5f9fa68)

It looks like you can go faster if you use SIMD stuff, like this C library: 
[https://github.com/aklomp/base64](https://github.com/aklomp/base64)


Re: Winning the Base64 benchmarks.

2019-10-16 Thread Araq
> it has more features but does not handle errors?

Since base64.nim says "unstable API" we could take your code... :-)