Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread laurent

Hm, yes, there's no difference between for and while.
What do you think about a[ a.length ] versus a.push()?

How did you get the decompiled functions?

Thanks for pushing it.
L

Juan Pablo Califano wrote:

I'd take these results with a pinch of salt. I don't think they're
conclusive, since one test seems to affect the performance of the others
(and we are talking about really small differences anyway, which I think
could be attributed to other factors than the tested code itself).

Let's take, for instance the for / while tests. If I run all your tests, I
get results similar to yours.

takeLengthPlusOut : 1958
takeForWithLengthPlus : 1788

However, if I run just those two tests, the results change. And if I alter
the order in which the tests are run, the first one always seems to take
less time.

// running just these two, tracing the results in the same order the loops
// are executed

takeLengthPlusOut : 1876
takeForWithLengthPlus : 2003

takeLengthPlusOut : 1876
takeForWithLengthPlus : 1935

takeForWithLengthPlus : 1874
takeLengthPlusOut : 1946

takeForWithLengthPlus : 1861
takeLengthPlusOut : 1935

I think this particular for / while case is very illustrative of some
external bias, because the execution order consistently affects the results
and because, if you disassemble both loops, they're nearly identical. The
only difference is the position of one operation that runs before the loop
(it's not even in the body of the loop or its conditional test).


FOR:

function takeForWithLengthPlus():String /* disp_id 0*/
{
  // local_count=4 max_scope=1 max_stack=3 code_len=61
  0   getlocal0
  1   pushscope
  2   pushbyte        0
  4   setlocal1
  5   pushnull
  6   coerce          Array
  8   setlocal2
  9   pushnan
  10  setlocal3
  11  findpropstrict  Array
  13  constructprop   Array (0)
  16  coerce          Array
  18  setlocal2
  19  findpropstrict  flash.utils::getTimer
  21  callproperty    flash.utils::getTimer (0)
  24  convert_d
  25  setlocal3
  26  pushbyte        0
  28  setlocal1
  29  jump            L1


  L2:
  33  label
  34  getlocal2
  35  getlocal2
  36  getproperty     length
  38  getlocal1
  39  setproperty     null
  41  inclocal_i      1

  L1:
  43  getlocal1
  44  pushint         10000000 // 0x989680
  46  iflt            L2

  50  pushstring      "takeForWithLengthPlus : "
  52  findpropstrict  flash.utils::getTimer
  54  callproperty    flash.utils::getTimer (0)
  57  getlocal3
  58  subtract
  59  add
  60  returnvalue
}


WHILE:

function takeLengthPlusOut():String /* disp_id 0*/
{
  // local_count=4 max_scope=1 max_stack=3 code_len=61
  0   getlocal0
  1   pushscope
  2   pushbyte        0
  4   setlocal1
  5   pushnull
  6   coerce          Array
  8   setlocal2
  9   pushnan
  10  setlocal3
  11  pushbyte        0
  13  setlocal1
  14  findpropstrict  Array
  16  constructprop   Array (0)
  19  coerce          Array
  21  setlocal2
  22  findpropstrict  flash.utils::getTimer
  24  callproperty    flash.utils::getTimer (0)
  27  convert_d
  28  setlocal3
  29  jump            L1


  L2:
  33  label
  34  getlocal2
  35  getlocal2
  36  getproperty     length
  38  getlocal1
  39  setproperty     null
  41  inclocal_i      1

  L1:
  43  getlocal1
  44  pushint         10000000 // 0x989680
  46  iflt            L2

  50  pushstring      "takeLengthPlusOut : "
  52  findpropstrict  flash.utils::getTimer
  54  callproperty    flash.utils::getTimer (0)
  57  getlocal3
  58  subtract
  59  add
  60  returnvalue
}

The only difference is that in the for loop, i = 0 is executed immediately
before jumping to the test of i against the loop limit:

  26  pushbyte        0
  28  setlocal1

whereas in the while loop, that assignment is executed before constructing
the array:

  11  pushbyte        0
  13  setlocal1


Cheers
Juan Pablo Califano

2008/7/29, laurent [EMAIL PROTECTED]:
  

Hi,

I have often asked myself whether a[ a.length ] = xxx is faster or slower than
a.push( xxx ), so I ran some tests after waking up, fresh with coffee.
So now I know the answer, and I learned a bit more about while and for and,
more obviously, about using them with decrementing or incrementing counters.

from results,  means faster :

for  while hey yes...oO
increment  decrement
length  push
Incrementing, or more generally doing any arithmetic in the same statement
that stores the value in a variable, is slower than separating those actions.
For example:
  a[ a.length ] = i++;
is slower than:
  i++;
  a[ a.length ] = i;
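
One caveat about the two snippets above, as a small sketch: they are not quite the same test, because a post-increment evaluates to the old value of i. The class name and the values are only for illustration.

```actionscript
package {
    import flash.display.Sprite;

    // Illustrative only: shows that the two forms store different values.
    public class IncrementOrder extends Sprite {
        public function IncrementOrder() {
            var a:Array = [];
            var i:int = 0;

            // Combined: i++ evaluates to the OLD value of i, then increments.
            a[a.length] = i++;   // stores 0, i is now 1

            // Separate: i is incremented first, then the NEW value is stored.
            i++;                 // i is now 2
            a[a.length] = i;     // stores 2

            trace(a);            // 0,2
        }
    }
}
```

To compare like with like, the combined form would have to be a[ a.length ] = ++i.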

An incrementing while is faster than a decrementing for.
This is a totally useless 

Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Juan Pablo Califano
My understanding is that push should be slower than accessing the index
directly, but take that as common sense: a function call should involve a
bit more processing, but I don't know the specifics. I do know that the two
compile to different bytecode. Off the top of my head, the indexed
assignment was 4 ops, something like

push theobject
push index
push value
setproperty   // theobject[index] = value

Making a push involves a callproperty operation, where you pass the object,
the method and the arguments. Maybe it's the same number of operations, but
I think the call operation is a bit more complex internally, as it creates a
new frame on the stack, has to pass the arguments, executes the function,
clears the stack frame and then returns to the caller.
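
For reference, here is a minimal sketch of the two source forms being compared. The class and variable names are made up for the example; the only library facts it relies on are that Array.push accepts one or more values and returns the new length.

```actionscript
package {
    import flash.display.Sprite;

    // Illustrative only: the two ways of appending discussed in this thread.
    public class AppendForms extends Sprite {
        public function AppendForms() {
            var byIndex:Array = [];
            var byPush:Array = [];

            for (var i:int = 0; i < 3; i++) {
                byIndex[byIndex.length] = i; // property read + indexed write
                byPush.push(i);              // method call on the array
            }

            trace(byIndex); // 0,1,2
            trace(byPush);  // 0,1,2

            // push() also accepts several values and returns the new length.
            var newLength:uint = byPush.push(3, 4);
            trace(newLength); // 5
        }
    }
}
```

Both forms leave the array in the same state, so the choice between them is about speed and readability only.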

For disassembling, I'm using abcdump, a tool that is included in the Tamarin
project. You can find a compiled version for Windows and usage instructions
here (though it's super easy to use):

http://iteratif.free.fr/blog/index.php?2006/11/15/61-un-premier-decompileur-as3
It's in French, but looking at the text added to your reply, I assume you'll
have no problems with that ;)

Another post about the subject, in English
http://www.5etdemi.com/blog/archives/2007/01/as3-decompiler/



Cheers
Juan Pablo Califano


Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread laurent
I will use while and length in future code, even if the differences
seem very small.

Thanks for the links and the details :]

L



Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread laurent


Can you decompile the push method to see how it uses the stack? :)

Juan Pablo Califano wrote:

PS: To add to how accessing the last element in the array works, this is the
relevant bit:

  34  getlocal2
  35  getlocal2
  36  getproperty     length
  38  getlocal1
  39  setproperty     null


local2 is the array and local1 is the i variable.

So, treating this snippet as a self-contained block as far as stack state
goes, and breaking it down, what happens is this:

   offset  stack state after the instruction       ActionScript pseudo-equivalent
   34      theArray
   35      theArray, theArray
   36      theArray, theArray.length
   38      theArray, theArray.length, variable_i
   39      [empty]                                 theArray[theArray.length] = variable_i


Cheers
Juan Pablo Califano



Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Juan Pablo Califano
If you mean decompiling the push method itself, you can't, because it's not
ActionScript but native code, implemented directly in the player.

If you mean how the push method is called, it'd be something like this:

Actionscript:

 function test():void {
  var i:int = 0;
  var arr:Array = new Array();
  arr.push(i);
 }

Disassembled bytecode:

function test():void /* disp_id 0*/
{
  // local_count=3 max_scope=1 max_stack=2 code_len=26
  0   getlocal0
  1   pushscope
  2   pushbyte        0
  4   setlocal1
  5   pushnull
  6   coerce          Array
  8   setlocal2
  9   pushbyte        0
  11  setlocal1
  12  findpropstrict  Array
  14  constructprop   Array (0)
  17  coerce          Array
  19  setlocal2
  20  getlocal2
  21  getlocal1
  22  callpropvoid    http://adobe.com/AS3/2006/builtin::push (1)
  25  returnvoid
}

The relevant part is this (local2 is the array and local1 the variable i):


  20  getlocal2
  21  getlocal1
  22  callpropvoid    http://adobe.com/AS3/2006/builtin::push (1)

Basically, you push the array onto the stack, then the arguments (the i
variable), and then execute the callpropvoid instruction. That instruction
pops the stack to get the arguments (the number of arguments is specified by
the caller; in this case it's 1, as you can see between the parens), and
then pops the stack again to get the object (the array in this case). Then
the player calls the method named in callpropvoid (push) on the array and
passes the arguments to it (the variable i). push does return a value (the
new length), but it is discarded here; if the result were used, callproperty
would have been emitted instead of callpropvoid.
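
To make those steps concrete, here is a toy model written in ActionScript itself. It is emphatically not how the player implements the instruction; the operand stack is faked with an Array, and callPropVoid is a hypothetical helper that only mirrors the order of pops described above.

```actionscript
package {
    import flash.display.Sprite;

    // Toy model of the operand stack around callpropvoid. This is NOT the
    // player's implementation; it only mirrors the steps described above.
    public class CallPropVoidModel extends Sprite {
        public function CallPropVoidModel() {
            var stack:Array = [];
            var arr:Array = [];
            var i:int = 0;

            stack.push(arr);  // 20 getlocal2: the receiver (the array)
            stack.push(i);    // 21 getlocal1: the single argument

            callPropVoid(stack, "push", 1); // 22 callpropvoid push (1)

            trace(stack.length); // 0: receiver and argument were consumed
            trace(arr);          // 0: the array now holds the value of i
        }

        // Hypothetical helper: pops argCount arguments, then the receiver,
        // calls the named method and discards whatever it returns.
        private function callPropVoid(stack:Array, name:String, argCount:int):void {
            var args:Array = stack.splice(stack.length - argCount, argCount);
            var receiver:Object = stack.pop();
            (receiver[name] as Function).apply(receiver, args);
        }
    }
}
```

A callproperty model would differ in one step only: it would push the method's return value back onto the stack instead of dropping it.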
___
Flashcoders mailing list
Flashcoders@chattyfig.figleaf.com
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders


Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Mark Winterhalder
On Wed, Jul 30, 2008 at 5:44 PM, laurent [EMAIL PROTECTED] wrote:

 Can you decompile the push method to see how it use the stack ? :)

I take it from the :) that you already know the answer, but just in case:
it's intrinsic, and you can probably look up the C++ code in the Tamarin
sources.

I have to agree with Juan that you shouldn't take those results
seriously. The numbers are closer to each other than the margin of
error.
When I set up such a test (sorry, no code on this machine), I wait a
few frames so the player is fully initialized, and then run several
tests of each variant. I take the best result and discard the rest --
the average or median are worthless, they just tell you how much other
processes on the system interfere with the test, and the worst tells
you if the garbage collector did its round.
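
A rough sketch of such a setup might look like the following. It is not the code referred to above, just one way to follow the same recipe: wait some frames, run each variant several times, keep the best time. It also alternates which variant runs first, since execution order was shown earlier in the thread to skew results. WARMUP_FRAMES, RUNS, ITERATIONS and all method names are arbitrary choices for the example.

```actionscript
package {
    import flash.display.Sprite;
    import flash.events.Event;
    import flash.utils.getTimer;

    // Sketch of a best-of-N micro-benchmark; all constants are arbitrary.
    public class BestOfBenchmark extends Sprite {
        private static const WARMUP_FRAMES:int = 10;
        private static const RUNS:int = 10;
        private static const ITERATIONS:int = 1000000;

        private var framesSeen:int = 0;

        public function BestOfBenchmark() {
            // Wait a few frames so the player is fully initialized.
            addEventListener(Event.ENTER_FRAME, onEnterFrame);
        }

        private function onEnterFrame(event:Event):void {
            if (++framesSeen < WARMUP_FRAMES) return;
            removeEventListener(Event.ENTER_FRAME, onEnterFrame);

            var bestIndex:Number = Number.MAX_VALUE;
            var bestPush:Number = Number.MAX_VALUE;
            for (var run:int = 0; run < RUNS; run++) {
                // Alternate the order so neither variant always runs first.
                if (run % 2 == 0) {
                    bestIndex = Math.min(bestIndex, time(appendByIndex));
                    bestPush = Math.min(bestPush, time(appendByPush));
                } else {
                    bestPush = Math.min(bestPush, time(appendByPush));
                    bestIndex = Math.min(bestIndex, time(appendByIndex));
                }
            }
            trace("a[a.length] best of " + RUNS + ": " + bestIndex + " ms");
            trace("a.push()    best of " + RUNS + ": " + bestPush + " ms");
        }

        // Returns the elapsed time of one call, in milliseconds.
        private function time(test:Function):int {
            var start:int = getTimer();
            test();
            return getTimer() - start;
        }

        private function appendByIndex():void {
            var a:Array = [];
            for (var i:int = 0; i < ITERATIONS; i++) a[a.length] = i;
        }

        private function appendByPush():void {
            var a:Array = [];
            for (var i:int = 0; i < ITERATIONS; i++) a.push(i);
        }
    }
}
```

Calling each variant through a Function reference adds a small overhead, but it is the same for both, so it should not change which one comes out ahead.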

In any case, don't go down the "X does Y the fastest, so I'll only use
X from now on" route. If you're writing some selected inner loops or a
math library, OK, but most of your code won't get executed thousands
of times each frame, so it's much more important that it's readable
and easy to understand for others (including future-you). This is
especially true if your fastest solution is verbose and repetitive
--  it's inconvenient if you have to modify it in the future, add
traces for debugging or step through with the debugger.

Also, whether writing something in a single line or two is faster
depends on the compiler, and might change when better optimization is
introduced in a future version. Common ways of doing something are
probably more likely to get optimized.

Stating the obvious, try to find a better algorithm first.

And finally, have a look at haXe. That touches the compiler
optimization part again -- the haXe compiler knows much more about your
code than the AS3 compilers do, so it can do better optimization. Part of
it is explained here:
http://blog.haxe.org/entry/31
Note inlining and haxe.rtti.Generic. For an AS3 vs haXe example, read:
http://blog.haxe.org/entry/35

Mark


RE: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Kerry Thompson
Juan Pablo Califano wrote:

 If you mean decompiling the push method itself, you can't, because it's not
 ActionScript but native code, implemented directly in the player.

Nice work, Juan Pablo.

The code you have been posting prompts me to comment on the underlying
mechanism of Flash. I know, from experience, that a lot of Flash coders (and
Director, and Java) don't understand about bytecode vs. native code.

If you're writing in a true compiled language like C++, your code will
compile to machine language specific to your CPU. Machine code is 1's and
0's, the on/off switches that are the basis of any binary computer. 

Flash is cross-platform, though. It has to work on Intel processors,
PowerPC, and others. It has to work on different OS's like Windows, Mac, and
Unix. The machine code is different for every processor, and the
implementation is specific to an OS. So, the Flash compiler can't compile to
machine code.

Instead, Macromedia, and now Adobe, have written a player for each of the
supported platforms. The player is in machine code (ones and zeros), but our
ActionScript code is not. ActionScript compiles to an intermediate bytecode,
or token. The player reads these tokens, and executes the appropriate
machine code.

That's what makes Flash slower than C++, and also more secure--it's much
more difficult to write malicious code if you don't have direct access to
the machine, but have to go through an interpreter.

This idea has been around for 25 years or so. The first implementation I
used was UCSD Pascal, which, like Flash, compiled down to an intermediate
token (p-code) which was, in turn, interpreted and executed by the player
(we called it a virtual machine back then). It has only been in the last 10
years or so that machines have gotten fast enough to run this sort of code
satisfactorily.

If you understand this, you can find the bottlenecks in your code more
easily, and optimize it. Loops are often the main culprit, as they have to
interpret the bytecode each time through the loop. Also, if you're working
with something whose length stays fixed during the loop, like an array or XML
nodes (really the same thing), it's faster if you store the length in a
local (register) variable. An illustration:

var arrLen:int = myArray.length; // length is a property of Array, not a method
for (var i:int = 0; i < arrLen; i++)

works faster than
for (var i:int = 0; i < myArray.length; i++)

Cordially,

Kerry Thompson



Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Mark Winterhalder
Nitpicking, but just like anything digital, the SWF opcodes are essentially
1s and 0s, too. :)

Anyway, the new VM supports JIT compilation to native machine code. I
must admit I don't know if /all/ code gets JIT compiled or only
hotspots, and I don't know if it will be recompiled for each use to
hardcode variables, but that would also have implications.

Mark





RE: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Kerry Thompson
Mark Winterhalder wrote:

 Nitpicking, but just as anything digital the SWF opcodes essentially
 are 1s and 0s, too. :)

Fair enough. Following that to its logical conclusion, _everything_ on your
computer is 1s and 0s, including the text in this email ^_^

You clearly understand what I was saying, Mark, but just a brief
reiteration: compiled ActionScript has to be interpreted by the VM, which is
_always_ slower than compiling directly to machine language.

When I was doing Director full time, I ran some tests that showed C++ to run
up to 400 times as fast as Lingo. I lobbied for years to get a true
machine-language compiler for Lingo, at least for desktop apps. I was struck
by how few developers understood the implications, and without other
developers clamoring for the need for speed, Macromedia never went there.
Director could have been a major player in the 3D game world.

And don't tell me that Director 3D is fast enough. Hard-core gamers buy
$8,000 machines to squeeze every last fps out of their games. With lights,
shaders, high-poly objects, multiple cameras, Director is just not fast
enough for a Quake or Doom LAN party. And, of course, neither is Flash.

 Anyway, the new VM supports JIT compilation to native machine code. I
 must admit I don't know if /all/ code gets JIT compiled or only
 hotspots, and I don't know if it will be recompiled for each use to
 hardcode variables, but that would also have implications.

One major implication would be in loops. The compiler would have no way of
knowing if an array would change length in a loop, for example, so it
couldn't hard-code the length.
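
A small sketch of why that matters: when the loop body appends to the array, reading length on every pass and reading it once up front give different results. The class name and values are made up for the example.

```actionscript
package {
    import flash.display.Sprite;

    // Illustrative only: a loop body that changes the array's length.
    public class LengthInLoop extends Sprite {
        public function LengthInLoop() {
            var live:Array = [1, 2, 3];
            var cached:Array = [1, 2, 3];
            var i:int;

            // length re-read on every pass: sees the element appended below.
            for (i = 0; i < live.length; i++) {
                if (live[i] == 2) live.push(99);
            }
            trace(i); // 4

            // length read once: the appended element is never visited.
            var n:int = cached.length;
            for (i = 0; i < n; i++) {
                if (cached[i] == 2) cached.push(99);
            }
            trace(i); // 3
        }
    }
}
```

So caching the length by hand is only safe when you know the body leaves the array's length alone, which is exactly the knowledge the compiler lacks.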

Cordially,

Kerry Thompson



Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Mark Winterhalder
 You clearly understand what I was saying, Mark, but just a brief
 reiteration: compiled ActionScript has to be interpreted by the VM, which is
 _always_ slower than compiling directly to machine language.

Yes, I understand and am not even disagreeing. :)

However, there have been benchmarks where Java was actually marginally
/faster/ than C++ for some specific tests. This seems
counterintuitive, but the JIT compiler knows more at runtime than the
traditional compiler can know in advance, and I'm guessing that's why
it can (not generally, but in some rare situations) do better
optimizations. When you have ideal programmers then of course compiled
languages will be faster, but that difference is shrinking as
technology evolves.
But of course we're talking about the Flash Player here, and size,
portability and start-up time are more important design goals than
execution speed, so we will most definitely always have to live
with a very noticeable performance penalty. Then again, we don't have
to manage memory ourselves, which is a big plus.
(By the way, on memory allocation and optimization: recycling instances
where possible is also a good idea.)
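
As a minimal sketch of what recycling instances could look like, here is a tiny pool for flash.geom.Point objects. PointPool, acquire and release are hypothetical names; nothing like this ships with the player.

```actionscript
package {
    import flash.geom.Point;

    // Minimal pool sketch; PointPool and its method names are hypothetical.
    // Usage: var p:Point = pool.acquire(10, 20); ... pool.release(p);
    public class PointPool {
        private var free:Array = [];

        // Hands out a recycled Point if one is available, else allocates one.
        public function acquire(x:Number = 0, y:Number = 0):Point {
            var p:Point;
            if (free.length > 0) {
                p = free.pop() as Point;
            } else {
                p = new Point();
            }
            p.x = x;
            p.y = y;
            return p;
        }

        // Returns a Point to the pool; the caller must not use it afterwards.
        public function release(p:Point):void {
            free.push(p);
        }
    }
}
```

Whether this pays off depends on how many short-lived objects the code creates per frame, so it is worth measuring before and after.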

If somebody knows a good explanation about the when and how of the
AVM2 JIT compiler, I'd be curious. The same goes for a table that
shows relative performance of stuff the renderer does -- like with
alpha vs. without, whether rendering time grows linearly with the number
of pixels, how much time is wasted on DisplayObjects outside of the
visible Stage, stuff like that.

Mark





On Wed, Jul 30, 2008 at 9:05 PM, Kerry Thompson [EMAIL PROTECTED] wrote:
 Mark Winterhalder wrote:

 Nitpicking, but just as anything digital the SWF opcodes essentially
 are 1s and 0s, too. :)

 Fair enough. Following that to its logical conclusion, _everything_ on your
 computer is 1s and 0s, including the text in this email ^_^

 You clearly understand what I was saying, Mark, but just a brief
 reiteration: compiled ActionScript has to be interpreted by the VM, which is
 _always_ slower than compiling directly to machine language.

 When I was doing Director full time, I ran some tests that showed C++ to run
 up to 400 times as fast as Lingo. I lobbied for years to get a true
 machine-language compiler for Lingo, at least for desktop apps. I was struck
 by how few developers understood the implications, and without other
 developers clamoring for the need for speed, Macromedia never went there.
 Director could have been a major player in the 3D game world.

 And don't tell me that Director 3D is fast enough. Hard-core gamers buy
 $8,000 machines to squeeze every last fps out of their games. With lights,
 shaders, high-poly objects, multiple cameras, Director is just not fast
 enough for a Quake or Doom LAN party. And, of course, neither is Flash.

 Anyway, the new VM supports JIT compilation to native machine code. I
 must admit I don't know if /all/ code gets JIT compiled or only
 hotspots, and I don't know if it will be recompiled for each use to
 hardcode variables, but that would also have implications.

 One major implication would be in loops. The compiler would have no way of
 knowing if an array would change length in a loop, for example, so it
 couldn't hard code the length.
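
To make the point about array length concrete, here is a small sketch of the two loop styles. The variable names are invented for the example, and it makes no claim about how much time either version takes; it only shows that copying the length into a local is a decision the programmer can make when the loop body is known not to resize the array.

```actionscript
var items:Array = [ 10, 20, 30 ];
var total:int = 0;
var i:int;

// items.length is read again on every pass, because the array could
// change length inside the loop.
for ( i = 0; i < items.length; i++ ) {
    total += items[ i ];
}

// Copying the length into a local first is only safe if the loop body
// never adds or removes elements.
var n:int = items.length;
var cachedTotal:int = 0;
for ( i = 0; i < n; i++ ) {
    cachedTotal += items[ i ];
}

trace( total, cachedTotal ); // 60 60
```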

 Cordially,

 Kerry Thompson

___
Flashcoders mailing list
Flashcoders@chattyfig.figleaf.com
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders


Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Juan Pablo Califano
Check these slides:

http://www.onflex.org/ACDS/AS3TuningInsideAVM2JIT.pdf

From page 43:

* We make a simple hotspot-like decision about whether to interpret or JIT
* Initialization functions ($init, $cinit) are interpreted
* Everything else is JIT
* Upshot: Don't put performance-intensive code in class initialization
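
One way to read the last bullet, as a sketch only: keep the static initializer down to a single call and put the heavy loop in an ordinary method. The class SineTable and its members are hypothetical, and the comments about what is interpreted or JIT-compiled simply restate the slide rather than anything measured here.

```actionscript
package {
    public class SineTable {
        // Assumption, following the slide: this static initializer is part of
        // $cinit and is interpreted, so it only makes one method call.
        private static var table:Array = build( 360 );

        // The loop lives in an ordinary static method, which according to the
        // slide falls under "everything else is JIT".
        private static function build( steps:int ):Array {
            var result:Array = [];
            for ( var i:int = 0; i < steps; i++ ) {
                result[ i ] = Math.sin( i * Math.PI / 180 );
            }
            return result;
        }

        // Look up the sine of a whole number of degrees, negative values included.
        public static function sinDeg( degrees:int ):Number {
            return table[ ( ( degrees % 360 ) + 360 ) % 360 ];
        }
    }
}
```

Calling SineTable.sinDeg( 90 ) then reads a precomputed value instead of doing the calculation during class initialization.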


Cheers
Juan Pablo Califano




Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-30 Thread Mark Winterhalder
On Wed, Jul 30, 2008 at 11:02 PM, Juan Pablo Califano
[EMAIL PROTECTED] wrote:
 Check these slides:

 http://www.onflex.org/ACDS/AS3TuningInsideAVM2JIT.pdf

 From page 43:

 * We make a simple hotspot-like decision about whether to interpret or JIT
 * Initialization functions ($init, $cinit) are interpreted
 * Everything else is JIT
 * Upshot: Don't put performance-intensive code in class initialization

Thanks for the link, but I was hoping for something more specific,
like an article that explains when the compilation happens. For
example, a method could be compiled initially, when it first runs, or
each time it gets called.

I'm just curious, it's not important to know.

Mark


[Flashcoders] faster, longer, better ... for programming maniaks

2008-07-29 Thread laurent

Hi,

I have often asked myself whether a[ a.length ] = xxx is faster or slower than
a.push( xxx ), so I ran some tests right after waking up, fresh with coffee.
Now I know the answer, and I also learned a bit more about while and for, and
about using them with decrementing or incrementing counters.


From the results, > means faster:

for > while (hey yes... oO)
increment > decrement
length > push
Incrementing, or probably any arithmetic operation, in the same statement that
stores the value is slower than doing the two separately, like:

   a[ a.length ] = i++;
slower than:
   a[ a.length ] = i;
   i++;

An incrementing while is faster than a decrementing for.
This is totally useless information, because if you can use an incrementing
while instead of a decrementing for, then you can just use an incrementing for.
I had heard that every loop is compiled to a while loop, so I started coding
everything with while, which is faster to write and more elegant.
Now I'll go back to those for loops. I promise I never stopped loving you
guys...


Here are some convincing results:
takeLengthMinus : 1695
takeLengthMinusOut : 1598
takeLengthPlus : 1580
takeLengthPlusOut : 1550


takePushMinus : 1860
takePushMinusOut : 1768
takePushPlus : 1756
takePushPlusOut : 1685

Don't compare results between separate paragraphs, because they were not
all run together:


takeForWithLengthMinus : 1624
takeForWithLengthPlus : 1581

The for loop updates its counter outside the assignment, so it has to be
compared to the other "Out" variants:

takeLengthMinusOut : 1686
takeLengthPlusOut : 1610
takePushMinusOut : 1788
takePushPlusOut : 1666
takeForWithLengthMinus : 1626
takeForWithLengthPlus : 1563

I guess that's why I learned to use for( i = 0; i < n; i++ ) for my
first loops.
So if we want our code to be faster we have to make it longer, and actually
more human readable. At the same time that means more computer readable, as
it gets faster... hm, is the computer so close to human...?!


I [ meaning my brain ] actually use a dichotomic search to find my current
client's folder in the list of all my work.


cheers.
L

And here is the code:


import flash.utils.getTimer;

// Each test fills a fresh Array and returns its label plus the elapsed milliseconds.

// while loop, decrement inside the condition, write via a[ a.length ]
function takeLengthMinus():String{
    var i : int = 1000;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i-- ){
        a[ a.length ] = i;
    }
    return "takeLengthMinus : " + ( getTimer() - t );
}

// while loop, decrement as its own statement, write via a[ a.length ]
function takeLengthMinusOut():String{
    var i : int = 1000;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i ){
        a[ a.length ] = i;
        i--;
    }
    return "takeLengthMinusOut : " + ( getTimer() - t );
}

// while loop, increment inside the assignment, write via a[ a.length ]
function takeLengthPlus():String{
    var i : int = 0;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i < 1000 ){
        a[ a.length ] = i++;
    }
    return "takeLengthPlus : " + ( getTimer() - t );
}

// while loop, increment as its own statement, write via a[ a.length ]
function takeLengthPlusOut():String{
    var i : int = 0;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i < 1000 ){
        a[ a.length ] = i;
        i++;
    }
    return "takeLengthPlusOut : " + ( getTimer() - t );
}

// while loop, decrement inside the condition, write via push()
function takePushMinus():String{
    var i : int = 1000;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i-- ){
        a.push( i );
    }
    return "takePushMinus : " + ( getTimer() - t );
}

// while loop, decrement as its own statement, write via push()
function takePushMinusOut():String{
    var i : int = 1000;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i ){
        i--;
        a.push( i );
    }
    return "takePushMinusOut : " + ( getTimer() - t );
}

// while loop, increment inside the call, write via push()
function takePushPlus():String{
    var i : int = 0;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i < 1000 ){
        a.push( i++ );
    }
    return "takePushPlus : " + ( getTimer() - t );
}

// while loop, increment as its own statement, write via push()
function takePushPlusOut():String{
    var i : int = 0;
    var a : Array = new Array();
    var t : Number = getTimer();
    while( i < 1000 ){
        a.push( i );
        i++;
    }
    return "takePushPlusOut : " + ( getTimer() - t );
}

// for loop counting down, write via a[ a.length ]
function takeForWithLengthMinus():String{
    var i : int;
    var a : Array = new Array();
    var t : Number = getTimer();
    for( i = 1000; i > 0; i-- ){
        a[ a.length ] = i;
    }
    return "takeForWithLengthMinus : " + ( getTimer() - t );
}

// for loop counting up, write via a[ a.length ]
function takeForWithLengthPlus():String{
    var i : int;
    var a : Array = new Array();
    var t : Number = getTimer();
    for( i = 0; i < 1000; i++ ){
        a[ a.length ] = i;
    }
    return "takeForWithLengthPlus : " + ( getTimer() - t );
}

// Only the variants with the counter operation outside are enabled here.
//trace( takeLengthMinus() );
trace( takeLengthMinusOut() );
//trace( takeLengthPlus() );
trace( takeLengthPlusOut() );
//trace( takePushMinus() );
trace( takePushMinusOut() );
//trace( takePushPlus() );
trace( takePushPlusOut() );
trace( takeForWithLengthMinus() );
trace( takeForWithLengthPlus() );



Re: [Flashcoders] faster, longer, better ... for programming maniaks

2008-07-29 Thread Juan Pablo Califano
I'd take these results with a pinch of salt. I don't think they're
conclusive, since one test seems to affect the performance of the others
(and we are talking about really small differences anyway, which I think
could be attributed to other factors than the tested code itself).

Let's take, for instance, the for / while tests. If I run all your tests, I
get results similar to yours.

takeLengthPlusOut : 1958
takeForWithLengthPlus : 1788

However, if I run just those two tests the results change. And if I alter
the order in which the tests are run, the first one always seems to take
less time.

// running just these two, tracing the results in the same order the loops
are executed

takeLengthPlusOut : 1876
takeForWithLengthPlus : 2003

takeLengthPlusOut : 1876
takeForWithLengthPlus : 1935

takeForWithLengthPlus : 1874
takeLengthPlusOut : 1946

takeForWithLengthPlus : 1861
takeLengthPlusOut : 1935

I think this particular for / while case is very illustrative of some
external bias, because the execution order consistently affects the results
and because if you disassemble both loops, they're nearly identical. The
only difference is the order in which one operation previous to the loop is
executed (it's not even the body of the loop or its conditional test).


FOR:

function takeForWithLengthPlus():String /* disp_id 0*/
{
  // local_count=4 max_scope=1 max_stack=3 code_len=61
  0   getlocal0
  1   pushscope
  2   pushbyte        0
  4   setlocal1
  5   pushnull
  6   coerce          Array
  8   setlocal2
  9   pushnan
  10  setlocal3
  11  findpropstrict  Array
  13  constructprop   Array (0)
  16  coerce          Array
  18  setlocal2
  19  findpropstrict  flash.utils::getTimer
  21  callproperty    flash.utils::getTimer (0)
  24  convert_d
  25  setlocal3
  26  pushbyte        0
  28  setlocal1
  29  jump            L1


  L2:
  33  label
  34  getlocal2
  35  getlocal2
  36  getproperty     length
  38  getlocal1
  39  setproperty     null
  41  inclocal_i      1

  L1:
  43  getlocal1
  44  pushint         1000 // 0x989680
  46  iflt            L2

  50  pushstring      "takeForWithLengthPlus : "
  52  findpropstrict  flash.utils::getTimer
  54  callproperty    flash.utils::getTimer (0)
  57  getlocal3
  58  subtract
  59  add
  60  returnvalue
}


WHILE:

function takeLengthPlusOut():String /* disp_id 0*/
{
  // local_count=4 max_scope=1 max_stack=3 code_len=61
  0   getlocal0
  1   pushscope
  2   pushbyte        0
  4   setlocal1
  5   pushnull
  6   coerce          Array
  8   setlocal2
  9   pushnan
  10  setlocal3
  11  pushbyte        0
  13  setlocal1
  14  findpropstrict  Array
  16  constructprop   Array (0)
  19  coerce          Array
  21  setlocal2
  22  findpropstrict  flash.utils::getTimer
  24  callproperty    flash.utils::getTimer (0)
  27  convert_d
  28  setlocal3
  29  jump            L1


  L2:
  33  label
  34  getlocal2
  35  getlocal2
  36  getproperty     length
  38  getlocal1
  39  setproperty     null
  41  inclocal_i      1

  L1:
  43  getlocal1
  44  pushint         1000 // 0x989680
  46  iflt            L2

  50  pushstring      "takeLengthPlusOut : "
  52  findpropstrict  flash.utils::getTimer
  54  callproperty    flash.utils::getTimer (0)
  57  getlocal3
  58  subtract
  59  add
  60  returnvalue
}

The only difference is that in the for loop, i = 0 is run immediately
before testing i against 1000:

  26  pushbyte        0
  28  setlocal1

whereas in the while loop, that assignment is executed before constructing
the array:

  11  pushbyte        0
  13  setlocal1
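
Since the run order seems to bias the timings, one possible way to reduce that bias is to repeat both tests several times, swap which one goes first on each round, and keep the best time of each. The following is only a sketch of that idea: whileLoop, forLoop, iterations and rounds are names and values chosen for this example, not part of the original tests.

```actionscript
import flash.utils.getTimer;

var iterations:int = 1000000; // hypothetical count, adjust until times are measurable
var rounds:int = 6;           // hypothetical number of repetitions

// Same loop body as takeLengthPlusOut, but returns the elapsed milliseconds.
function whileLoop():int {
    var i:int = 0;
    var a:Array = new Array();
    var t:int = getTimer();
    while ( i < iterations ) {
        a[ a.length ] = i;
        i++;
    }
    return getTimer() - t;
}

// Same loop body as takeForWithLengthPlus, but returns the elapsed milliseconds.
function forLoop():int {
    var i:int;
    var a:Array = new Array();
    var t:int = getTimer();
    for ( i = 0; i < iterations; i++ ) {
        a[ a.length ] = i;
    }
    return getTimer() - t;
}

// Keep the best time of each test across all rounds.
var bestWhile:int = int.MAX_VALUE;
var bestFor:int = int.MAX_VALUE;

for ( var r:int = 0; r < rounds; r++ ) {
    // Swap the order every round so neither test always runs first.
    if ( r % 2 == 0 ) {
        bestWhile = Math.min( bestWhile, whileLoop() );
        bestFor = Math.min( bestFor, forLoop() );
    } else {
        bestFor = Math.min( bestFor, forLoop() );
        bestWhile = Math.min( bestWhile, whileLoop() );
    }
}

trace( "while : " + bestWhile );
trace( "for : " + bestFor );
```

If the two best times still differ by less than the spread between rounds, the difference is probably noise rather than something caused by the loop construct.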


Cheers
Juan Pablo Califano

2008/7/29, laurent [EMAIL PROTECTED]:

 Hi,

 I often asked myself if a[ a.length ] = xxx was faster or slower then
 a.push( xxx ), I did some test at wake up, fresh with coffee.
 So now I know the answer and I got a bit more about while and for, and more
 obvious about using them with decremental or incremental counters.

 from results, > means faster :

 for > while hey yes...oO
 increment > decrement
 length > push
 increment or certainly make any number operation at same time than putting
 the value in variable is slower than separate those actions, like:
   a[ a.length ] = i++;
 slower than:
   i++;
   a[ a.length ] = i;

 while incremental is faster than for decremental.
 This is a totaly useless information as if you can use the while
 incremental instead of For decremental, then just use For incremental.
 I heard that any loop was compiled to a while loop so I started coding
 everything with a