proposal for efficient 64-bit arithmetic without value objects

2013-10-30 Thread Vyacheslav Egorov
Hi,

There are algorithms that require 64-bit arithmetic, however currently
JavaScript does not provide any efficient way to perform it.

Similar to Math.imul which was added for the purposes of efficient
32-bit multiplication I propose extending Math object with a family of
operations Math.s64op where s is one of {i, u} and op is one
of { sub, add, mul, div, neg }

These functions are specified as follows for op in { sub, add, mul, div }:

Math.s64op

1. accepts 4 arguments al, ah, bl, bh
2. x' = ToUint32(x), for x in {al, ah, bl, bh }
3. pairs (al', ah') and (bl', bh') are interpreted as 64-bit signed
(if s == i) or unsigned (if s == u) integer values with first
member of the pair containing low bits and second - high bits.
4. Result is computed as a standard overflowing operation on 64-bit
values and is decomposed into two int32 values (cl', ch') = (al', ah')
op (bl', bh')
5. A one shot property Math.H is created that returns ch' on the first
access and deletes itself.
6. Return cl'

Unary operation is specified in a similar straightforward fashion.

Here is addition of three unsigned 64bit integers using described functions:

var al, ah, bl, bh, cl, ch, dl, dh;

dl = Math.u64add(Math.u64add(al, ah, bl, bh), Math.H, cl, ch);
dh = Math.H;

This API is designed purposefully to allow good optimizations of the
resulting code, e.g. optimizing compiler knowing one shot semantics of
H property can eradicate any memory traffic and keep things in
registers even collocating (dl, dh) to a single 64-bit register on
64-bit platform.

I think such API has many advantages:

- can greatly improve performance of numeric algorithms relying on 64-bit math;
- easily polyfillable in the current JavaScript;
- does not depend on any complicated language changes (e.g. value objects);
- simple to implement and optimize on any platform (including 32bit ones);

--
Vyacheslav Egorov
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: proposal for efficient 64-bit arithmetic without value objects

2013-10-30 Thread Vyacheslav Egorov
 Rationale being faster polyfilled execution

The main reason for H being one shot is to allow optimizing compiler
*elide* updating it in most cases to eliminate memory traffic.

After thinking about it a bit I propose the following alternative step 5:

Math.H is from the very beggining a non-configurable non-writable
accessor property with a getter that returns hidden inner value and
always zeros inner value.

--
Vyacheslav Egorov


On Wed, Oct 30, 2013 at 5:28 PM, Olov Lassus olov.las...@gmail.com wrote:
 2013/10/30 Vyacheslav Egorov m...@mrale.ph

 5. A one shot property Math.H is created that returns ch' on the first
 access and deletes itself.


 Alternative step 5: Math.H is assigned ch'.

 Rationale being faster polyfilled execution, in combination with a lack of
 imagination from my side to come up with a use case where any code would be
 interested in knowing (at run-time) whether Math.H exists or not (i.e.
 whether it has already been read). Does such a use case exist?

 If all of JSC, Chakra, V8 et.al reliably optimizes away most overhead of a
 polyfilled Math.H getter then perhaps this does not matter.

 /Olov

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: proposal for efficient 64-bit arithmetic without value objects

2013-10-30 Thread Vyacheslav Egorov
Some people find global state that this proposal introduces bad. I
see two ways addressing this:

- Returning {lo, hi} object.

Pros: no global state, in combination with destructuring allows to
write concise code, overhead can still be optimized away.
Cons: performance of polyfill is abysmal on bad and moderately good
VMs, requires allocation sinking pass to optimize away object
allocation.

- Make H property of the respective operation (e.g. u64mul updates its
own property H)

Pros: easy to implement, good perf on bad VMs
Cons: still kinda global state

- Math.s64op can become Math.createOperator(s64, op) that
returns function with H property:

var add = Math.createOperator(u64, add);
var dl = add(add(al, ah, bl, bh), add.H, cl, ch);
var dh = add.H;

Pros: no global state, relatively good performance on the non advanced
VMs, can be actually extended(!) e.g. SIMD operations can be exposed
as Math.createOperator(simd128, add)

--
Vyacheslav Egorov


On Wed, Oct 30, 2013 at 5:46 PM, Olov Lassus olov.las...@gmail.com wrote:
 2013/10/30 Vyacheslav Egorov m...@mrale.ph

  Rationale being faster polyfilled execution

 The main reason for H being one shot is to allow optimizing compiler
 *elide* updating it in most cases to eliminate memory traffic.


 Aaah. Thanks for pointing this out - I thought only of the polyfill
 performance so I neglected this key aspect of your proposal.


 After thinking about it a bit I propose the following alternative step 5:

 Math.H is from the very beggining a non-configurable non-writable
 accessor property with a getter that returns hidden inner value and
 always zeros inner value.


 +1 (for now) :)

 /Olov

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: proposal for efficient 64-bit arithmetic without value objects

2013-10-30 Thread Vyacheslav Egorov
Yes, all API variants I have proposed should result in the equivalent
performance, to the best of my knowledge.

I would even say that {lo, hi} one is easier on VMs for two reasons:

- VMs tend to have some sort of escape analysis / allocation sinking
and they can incorporate { lo, hi } support into this pass;

- If VM desires to allocate { lo, hi } value to a single register it
might be easier to do that when values are explicitly grouped, VM does
not have to rediscover pairing --- it is already there.

You also correctly reasoned that I proposed magic property API for the
purposes of faster polyfilling.

So given the choice between { lo, hi } and magical property API if I
would prefer { lo, hi } iff I ignore polyfill performance.

 The other main use case I can think of is large compiled C++ codebases

I saw crypto libraries (e.g. NaCl) heavily relying on 64bit arithmetic.

 Are there any other use cases you have in mind that really demand high 
 polyfill performance?

I am interested in the whole number hierarchy actually (int32 - int64
- bigint). But I have no clear idea what to do here.

One possibility would be to allow passing type arrays alongside with
primitive numbers into something Math.bigop. But this is pretty ugly
and probably also results in abysmal polyfill performance.


--
Vyacheslav Egorov


On Wed, Oct 30, 2013 at 9:56 PM, Luke Wagner l...@mozilla.com wrote:
 Just to be sure, do you agree that both the {lo, hi}-returning API and the 
 magic-property API should both be able to achieve equivalent performance on a 
 JS engine that has specifically added and optimized these int64 builtins?  I 
 think this is true.

 Assuming so, the reason to prefer the rather more awkward magic-property API 
 would be purely because its polyfill is more efficient.  This is a tough 
 choice, but it seems like bending the spec for the polyfill is overly 
 conservative in this case.  A lot of the use cases I hear for int64 come from 
 crypto and other very specific algorithms which already have implementations 
 in JS.  In such cases, it seems like the authors have to write a new version 
 of the algorithm using the new builtins anyway so, if performance was 
 important, they could just keep around the old version and pick which version 
 to call based on whether the new builtins are present.  Or they can just wait 
 until the optimization is broadly available before switching.

 The other main use case I can think of is large compiled C++ codebases.  
 However, in our experience, C++ codebases tend not to heavily use int64 so 
 the overhead of the polyfill would be less significant.

 Are there any other use cases you have in mind that really demand high 
 polyfill performance?

 API considerations aside, though, I like the idea of bringing fast 64-bit 
 arithmetic to JS without waiting for value objects.

 Cheers,
 Luke
 ___
 es-discuss mailing list
 es-discuss@mozilla.org
 https://mail.mozilla.org/listinfo/es-discuss
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Efficient 64 bit arithmetic

2014-07-09 Thread Vyacheslav Egorov
Hi Fabrice,

The main reason why I was proposing (hi, lo)-pair based API instead of
lower version is to

a) avoid necessity writing low-level looking code in the high-level
language which is JS;

b) avoid pattern matching in the JS code to recognize construction
built out of low-level 32-bit counterparts.

This is too much like first writing 32-bit assembly and then trying to
pattern match it to emit 64-bit one. If we are starting to add
low-level intrisics like that maybe it's time to actually agree that
we want a bytecode with a full range of numeric types and operations?

Math.H part of the proposal was an awful hack though in attempt to
avoid degrading polyfill performance while keeping API high-level and
atomic.

To be honest I think the actual API for high level language like JS
should be either value-object or pair based.

Given that ES6 defines typed-arrays it might be possible to make API
based on that:

var a = new Int32Array(2),
  b = new Int32Array(2),
  c = new Int32Array(2);
Math.div64(c, a, b);

But again it has bad performance unless you pool both a, b, c in the
right way (pooling is good for unoptimized code, but pooling globally
is bad for optimized code).

Lowered version

cl = Math.div64l(al, ah, bl, bh);
ch = Math.div64h(al, ah, bl, bh);

should perform well (no allocations, potentially only some boxing if
engine requires that) and can be assembled into a single instruction
by the optimizer... but this just looks slightly bizarre.

// Vyacheslav Egorov
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Efficient 64 bit arithmetic

2014-07-09 Thread Vyacheslav Egorov
 Note: my proposal is not limited to 64 bit operations. It can be used to
efficiently implement arbitrary precision arithmetic.

True, it is easier to arrive to the efficient and clear code with lowered
representation. With (lo, hi)-result API compiler will have to figure more
things out on it's own. Though I don't see any issues as both APIs are
strictly of the same power:

Math.imuluh(a, b) === Math.u64mul(a, 0, b, 0).hi

But then again the question would be: does it make sense to let people
implement say bigints in pure JavaScript or better give them bigints as
part of the language.

 If you really want such an API, it is not worth proposing a lowered
version because division is slow anyway

This is also a fair point. But having inconsistent APIs is very weird if
you ask me.





// Vyacheslav Egorov


On Wed, Jul 9, 2014 at 12:07 PM, Fabrice Bellard fabr...@bellard.org
wrote:

 Note: my proposal is not limited to 64 bit operations. It can be used to
 efficiently implement arbitrary precision arithmetic.

 64 bit integer division and remainder are less critical than the other
 operations because they are seldom used and slow anyway. Moreover, it is
 more difficult to define them because they are undefined cases (division by
 zero and overflow). So not proposing a specific API for them can be
 acceptable.

 If you really want such an API, it is not worth proposing a lowered
 version because division is slow anyway. You can define for example:

 {quo_lo:q0, quo_hi:q1, rem_lo:r0, rem_hi:r1 } = Math.idiv64(lo0, hi0, lo1,
 hi1)

 for signed 64 bit division and

 {quo_lo:q0, quo_hi:q1, rem_lo:r0, rem_hi:r1 } = Math.idivu64(lo0, hi0,
 lo1, hi1)

 for unsigned 64 bit division. In case of division by zero or overflow,
 null is returned. Instead of returning an object, you can use a
 preallocated Int32Array to return the 4 32 bit numbers but you also need to
 return a boolean to indicate the error cases.

 Fabrice.


 On 07/09/14 11:36, Vyacheslav Egorov wrote:

 Hi Fabrice,

 The main reason why I was proposing (hi, lo)-pair based API instead of
 lower version is to

 a) avoid necessity writing low-level looking code in the high-level
 language which is JS;

 b) avoid pattern matching in the JS code to recognize construction
 built out of low-level 32-bit counterparts.

 This is too much like first writing 32-bit assembly and then trying to
 pattern match it to emit 64-bit one. If we are starting to add
 low-level intrisics like that maybe it's time to actually agree that
 we want a bytecode with a full range of numeric types and operations?

 Math.H part of the proposal was an awful hack though in attempt to
 avoid degrading polyfill performance while keeping API high-level and
 atomic.

 To be honest I think the actual API for high level language like JS
 should be either value-object or pair based.

 Given that ES6 defines typed-arrays it might be possible to make API
 based on that:

 var a = new Int32Array(2),
b = new Int32Array(2),
c = new Int32Array(2);
 Math.div64(c, a, b);

 But again it has bad performance unless you pool both a, b, c in the
 right way (pooling is good for unoptimized code, but pooling globally
 is bad for optimized code).

 Lowered version

 cl = Math.div64l(al, ah, bl, bh);
 ch = Math.div64h(al, ah, bl, bh);

 should perform well (no allocations, potentially only some boxing if
 engine requires that) and can be assembled into a single instruction
 by the optimizer... but this just looks slightly bizarre.

 // Vyacheslav Egorov
 ___
 es-discuss mailing list
 es-discuss@mozilla.org
 https://mail.mozilla.org/listinfo/es-discuss


___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss