Re: Canvas performance on Mac OS

2015-04-19 Thread Jim Graham
Is that on Sierpinski?  The main question I still have is whether this 
helps other apps.  The usage pattern in Sierpinski is fairly specific 
and may not translate to all apps (we're still planning on doing this as 
it is an easy improvement that doesn't seem to have a down-side, but I 
don't want to break out any champagne if it is just this one benchmark)...


...jim

On 4/16/15 12:58 PM, Johan Vos wrote:

Hi Jim,

On iOS, the performance jumped from 2 fps to 15 fps on my old iPad.
Excellent work!

- Johan

2015-04-14 21:16 GMT+02:00 Jim Graham james.gra...@oracle.com:


Hi Chris,

We identified a fairly localized optimization that we might be able to
apply to enhance the performance of your Sierpinski program.  We don't have
any figures yet on whether this will improve other applications/benchmarks
that people have been discussing, but the improvements with your Sierpinski
program are quite dramatic on a number of platforms and GPUs.

This issue is now being tracked as:
https://javafx-jira.kenai.com/browse/RT-40533

If others could apply the indicated patch to an OpenJFX build and provide
feedback on any improvements (or bugs!) that they see, that would help.  In
the meantime, we have a lot of testing to do to verify the correctness of
the changes...

 ...jim


On 4/8/15 9:25 AM, Chris Newland wrote:


Hi Jim,

I'll post the verbose prism output from my iMac when I get home.

Just tried this on my Linux workstation and the performance gap is the
same between es2 and sw so I don't think it's an OSX issue.

uname -a
Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux

$JAVA_HOME/bin/java -classpath target/DemoFX.jar
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 20
fps: 31
fps: 32
fps: 33
fps: 35
fps: 34
fps: 33

$JAVA_HOME/bin/java -Dprism.order=sw -classpath target/DemoFX.jar
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 56
fps: 60
fps: 59
fps: 60
fps: 61
fps: 61
fps: 60

This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
graphics card running driver 304.125

Regards,

Chris


On Wed, April 8, 2015 00:16, Jim Graham wrote:


OK, I took the time to put my rMBP on a diet yesterday and find room to
install a 10.10 partition.  I get the same numbers for Sierpinski on 10.10,
so my theory that something changed in the OGL implementation for 10.10
doesn't hold water.

But, I then tried it using the integrated graphics.  I get really poor
performance using the integrated Intel 4000 graphics, but I get great
numbers on the discrete nVidia 650m.  It makes sense that the Intel
graphics wouldn't be as powerful as the discrete graphics, but we
shouldn't be taxing it that much to make that big of a difference.

Just to be sure - is that iMac a dual graphics system, or is it
all-AMD-all-the-time?  You can see which GPU is being used if you run it
with -Dprism.verbose=true...

...jim


On 4/2/15 4:13 PM, Jim Graham wrote:

On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
running a newer version of MacOS?

...jim


On 3/31/15 3:40 PM, Chris Newland wrote:

  Hi Hervé,



That's a valid question :)


Probably because


a) All my non-UI graphics experience is with immediate-mode / raster
systems

b) I'm interested in using JavaFX for particle effects / demoscene /
gaming so assumed (perhaps wrongly?) that scenegraph was not the way
to go for that due to the very large number of nodes.

Numbers for my Sierpinski filled triangle example:


System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB


java -Dprism.order=es2 -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 23
fps: 18
fps: 25
fps: 18
fps: 23
fps: 23
fps: 19
fps: 25


java -Dprism.order=sw -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60


There are never more than 2500 filled triangles on screen. JDK is
1.8.0_40


I would say there is a performance problem here? (or at least a need
for documentation so as to set expectations for gc.fillPolygon).

Best regards,


Chris





On Tue, March 31, 2015 22:00, Hervé Girod wrote:

Why don't you use Nodes rather than Canvas?




Sent from my iPhone



On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com wrote:



Hi Jim,



Thanks, that makes things much clearer.



I was surprised how much was going on under the hood of GraphicsContext
and hoped it was just magic glue that gave the best of GPU
acceleration where available and immediate-mode-like simple
rasterizing where not.

I've managed to find an anomaly with GraphicsContext.fillPolygon
where the software pipeline achieves the full 60fps but ES2 can
only manage 30-35fps. It uses lots of overlapping filled triangles
so I expect it suffers from the problem you've described.

SSCCE:
https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/
com/ch


Re: Canvas performance on Mac OS

2015-04-16 Thread Chris Newland
Hi Jim,

Thanks for looking into this.

The patch definitely improves es2 performance on Debian Linux amd64 from
around 33fps to around 53fps for me (nVidia FX580).
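
The fps figures above can also be read as per-frame time budgets. A minimal sketch of that arithmetic follows; the class name FrameBudget and its method are hypothetical and not part of DemoFX or OpenJFX. At 33fps each frame takes roughly 30 ms, at 53fps roughly 19 ms, and a 60fps target leaves about 16.7 ms.

```java
// Hypothetical helper, not part of DemoFX or OpenJFX: converts the frame
// rates quoted in this thread into per-frame time budgets.
class FrameBudget {
    /** Milliseconds available per frame at the given frames-per-second rate. */
    static double millisPerFrame(double fps) {
        if (fps <= 0) {
            throw new IllegalArgumentException("fps must be positive");
        }
        return 1000.0 / fps;
    }

    public static void main(String[] args) {
        // Unpatched es2, patched es2, and the 60fps target, as quoted above.
        double[] rates = {33, 53, 60};
        for (double fps : rates) {
            System.out.printf("%.0f fps -> %.1f ms per frame%n", fps, millisPerFrame(fps));
        }
    }
}
```

Seen this way, the patch saves on the order of 11 ms per frame on this machine, which is most of the gap to the 60fps budget.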

I've made patched overlay builds of OpenJFX (Linux) 8 and 9 available on
my OpenJFX CI server for anyone who wants to try it:
http://108.61.191.178/

Will test on OSX tonight.

Cheers,

Chris


On Tue, April 14, 2015 20:16, Jim Graham wrote:
 Hi Chris,


 We identified a fairly localized optimization that we might be able to
 apply to enhance the performance of your Sierpinski program.  We don't have
 any figures yet on whether this will improve other applications/benchmarks
 that people have been discussing, but the improvements with your
 Sierpinski program are quite dramatic on a number
 of platforms and GPUs.

 This issue is now being tracked as:
 https://javafx-jira.kenai.com/browse/RT-40533


 If others could apply the indicated patch to an OpenJFX build and
 provide feedback on any improvements (or bugs!) that they see, that would
 help.  In the meantime, we have a lot of testing to do to verify the
 correctness of the changes...

 ...jim


 On 4/8/15 9:25 AM, Chris Newland wrote:

 Hi Jim,


 I'll post the verbose prism output from my iMac when I get home.


 Just tried this on my Linux workstation and the performance gap is the
 same between es2 and sw so I don't think it's an OSX issue.

 uname -a
 Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux


 $JAVA_HOME/bin/java -classpath target/DemoFX.jar
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 20
 fps: 31
 fps: 32
 fps: 33
 fps: 35
 fps: 34
 fps: 33


 $JAVA_HOME/bin/java -Dprism.order=sw -classpath target/DemoFX.jar
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 56
 fps: 60
 fps: 59
 fps: 60
 fps: 61
 fps: 61
 fps: 60


 This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
 graphics card running driver 304.125

 Regards,


 Chris



 On Wed, April 8, 2015 00:16, Jim Graham wrote:

 OK, I took the time to put my rMBP on a diet yesterday and find room
 to install a 10.10 partition.  I get the same numbers for Sierpinski
 on 10.10, so my theory that something changed in the OGL
 implementation for 10.10 doesn't hold water.

 But, I then tried it using the integrated graphics.  I get really
 poor performance using the integrated Intel 4000 graphics, but I get
 great numbers on the discrete nVidia 650m.  It makes sense that the
 Intel
 graphics wouldn't be as powerful as the discrete graphics, but we
 shouldn't be taxing it that much to make that big of a difference.

 Just to be sure - is that iMac a dual graphics system, or is it
 all-AMD-all-the-time?  You can see which GPU is being used if you run
 it with -Dprism.verbose=true...

 ...jim



 On 4/2/15 4:13 PM, Jim Graham wrote:


 On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are
 you running a newer version of MacOS?

 ...jim



 On 3/31/15 3:40 PM, Chris Newland wrote:


 Hi Hervé,



 That's a valid question :)



 Probably because



 a) All my non-UI graphics experience is with immediate-mode /
 raster systems

 b) I'm interested in using JavaFX for particle effects /
 demoscene / gaming so assumed (perhaps wrongly?) that scenegraph
 was not the way to go for that due to the very large number of
 nodes.

 Numbers for my Sierpinski filled triangle example:



 System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB



 java -Dprism.order=es2 -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 23
 fps: 18
 fps: 25
 fps: 18
 fps: 23
 fps: 23
 fps: 19
 fps: 25



 java -Dprism.order=sw -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60



 There are never more than 2500 filled triangles on screen. JDK is
  1.8.0_40



 I would say there is a performance problem here? (or at least a
 need for documentation so as to set expectations for
 gc.fillPolygon).

 Best regards,



 Chris






 On Tue, March 31, 2015 22:00, Hervé Girod wrote:


 Why don't you use Nodes rather than Canvas ?




 Sent from my iPhone




 On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com wrote:




 Hi Jim,




 Thanks, that makes things much clearer.




 I was surprised how much was going on under the hood of
 GraphicsContext and hoped it was just magic glue that gave the best
 of GPU acceleration where available and immediate-mode-like simple
 rasterizing where not.

 I've managed to find an anomaly with GraphicsContext.fillPolygon
 where the software pipeline achieves the full 60fps but ES2
 can only manage 30-35fps. It uses lots of overlapping filled
 triangles so I expect it suffers from the problem you've
 described.

 SSCCE:
 https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

 Was full frame rate canvas drawing an 

Re: Canvas performance on Mac OS

2015-04-16 Thread Chris Newland
Confirmed, full 60fps performance on 2011 iMac with this fix:

/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/bin/java
-cp target/DemoFX.jar com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 23
fps: 19
fps: 26
fps: 21
fps: 21
fps: 26
fps: 17

/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk_PATCHED/Contents/Home/bin/java
-cp target/DemoFX.jar com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 53
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
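
To compare the two runs above with a single number each, the "fps: N" lines can be averaged. The following is only a sketch: the class FpsLog and its meanFps method are hypothetical helpers, not part of DemoFX, and dropping the first sample as start-up noise is an assumption about how to read the log.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of DemoFX: averages "fps: N" lines such as
// the ones printed above, skipping leading samples as start-up noise.
class FpsLog {
    /** Mean of the fps samples in the given lines, ignoring the first 'skip' samples. */
    static double meanFps(List<String> lines, int skip) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> line.startsWith("fps:"))
                .skip(skip)
                .mapToInt(line -> Integer.parseInt(line.substring("fps:".length()).trim()))
                .average()
                .orElse(Double.NaN); // no samples left after skipping
    }

    public static void main(String[] args) {
        // Samples copied from the two runs in this message.
        List<String> unpatched = Arrays.asList(
                "fps: 1", "fps: 23", "fps: 19", "fps: 26",
                "fps: 21", "fps: 21", "fps: 26", "fps: 17");
        List<String> patched = Arrays.asList(
                "fps: 1", "fps: 53", "fps: 60", "fps: 60", "fps: 60",
                "fps: 60", "fps: 60", "fps: 60", "fps: 60");
        System.out.println("unpatched mean fps: " + meanFps(unpatched, 1));
        System.out.println("patched mean fps:   " + meanFps(patched, 1));
    }
}
```

With the first sample skipped, the unpatched run averages just under 22fps and the patched run just over 59fps.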

I've uploaded OSX SDK overlay builds containing this webrev to
http://108.61.191.178/ if anyone wants to test the fix on their OSX
system.

Thanks a lot Jim and team for looking into this!

Cheers,

Chris


On Thu, April 16, 2015 09:39, Chris Newland wrote:
 Hi Jim,


 Thanks for looking into this.


 The patch definitely improves es2 performance on Debian Linux amd64 from
 around 33fps to around 53fps for me (nVidia FX580).

 I've made patched overlay builds of OpenJFX (Linux) 8 and 9 available on
 my OpenJFX CI server for anyone who wants to try it: http://108.61.191.178/


 Will test on OSX tonight.


 Cheers,


 Chris




Re: Canvas performance on Mac OS

2015-04-14 Thread Jim Graham

Hi Chris,

We identified a fairly localized optimization that we might be able to 
apply to enhance the performance of your Sierpinski program.  We don't 
have any figures yet on whether this will improve other 
applications/benchmarks that people have been discussing, but the 
improvements with your Sierpinski program are quite dramatic on a number 
of platforms and GPUs.


This issue is now being tracked as: 
https://javafx-jira.kenai.com/browse/RT-40533


If others could apply the indicated patch to an OpenJFX build and 
provide feedback on any improvements (or bugs!) that they see, that 
would help.  In the meantime, we have a lot of testing to do to verify 
the correctness of the changes...


...jim

On 4/8/15 9:25 AM, Chris Newland wrote:

Hi Jim,

I'll post the verbose prism output from my iMac when I get home.

Just tried this on my Linux workstation and the performance gap is the
same between es2 and sw so I don't think it's an OSX issue.

uname -a
Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux

$JAVA_HOME/bin/java -classpath target/DemoFX.jar
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 20
fps: 31
fps: 32
fps: 33
fps: 35
fps: 34
fps: 33

$JAVA_HOME/bin/java -Dprism.order=sw -classpath target/DemoFX.jar
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 56
fps: 60
fps: 59
fps: 60
fps: 61
fps: 61
fps: 60

This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
graphics card running driver 304.125

Regards,

Chris


On Wed, April 8, 2015 00:16, Jim Graham wrote:

OK, I took the time to put my rMBP on a diet yesterday and find room to
install a 10.10 partition.  I get the same numbers for Sierpinski on 10.10,
so my theory that something changed in the OGL implementation for 10.10
doesn't hold water.

But, I then tried it using the integrated graphics.  I get really poor
performance using the integrated Intel 4000 graphics, but I get great
numbers on the discrete nVidia 650m.  It makes sense that the Intel
graphics wouldn't be as powerful as the discrete graphics, but we
shouldn't be taxing it that much to make that big of a difference.

Just to be sure - is that iMac a dual graphics system, or is it
all-AMD-all-the-time?  You can see which GPU is being used if you run it
with -Dprism.verbose=true...

...jim


On 4/2/15 4:13 PM, Jim Graham wrote:


On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
running a newer version of MacOS?

...jim


On 3/31/15 3:40 PM, Chris Newland wrote:


Hi Hervé,


That's a valid question :)


Probably because


a) All my non-UI graphics experience is with immediate-mode / raster
systems

b) I'm interested in using JavaFX for particle effects / demoscene /
gaming so assumed (perhaps wrongly?) that scenegraph was not the way
to go for that due to the very large number of nodes.

Numbers for my Sierpinski filled triangle example:


System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB


java -Dprism.order=es2 -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 23
fps: 18
fps: 25
fps: 18
fps: 23
fps: 23
fps: 19
fps: 25


java -Dprism.order=sw -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60


There are never more than 2500 filled triangles on screen. JDK is
1.8.0_40


I would say there is a performance problem here? (or at least a need
for documentation so as to set expectations for gc.fillPolygon).

Best regards,


Chris





On Tue, March 31, 2015 22:00, Hervé Girod wrote:


Why don't you use Nodes rather than Canvas ?



Sent from my iPhone




On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com wrote:



Hi Jim,



Thanks, that makes things much clearer.



I was surprised how much was going on under the hood of GraphicsContext
and hoped it was just magic glue that gave the best of GPU
acceleration where available and immediate-mode-like simple
rasterizing where not.

I've managed to find an anomaly with GraphicsContext.fillPolygon
where the software pipeline achieves the full 60fps but ES2 can
only manage 30-35fps. It uses lots of overlapping filled triangles
so I expect it suffers from the problem you've described.

SSCCE:
https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

Was full frame rate canvas drawing an expected use case for JavaFX or
would I be better off with Graphics2D?

Thanks,



Chris




On Mon, March 30, 2015 20:04, Jim Graham wrote:
Hi Chris,




drawLine() is a very simple primitive that can be optimized
with a GPU
shader.  It either looks like a (potentially rotated) rectangle
or a rounded rect - and we have optimized shaders for both
cases.  A large number of drawLine() calls turns into simply
accumulating a large vertex list and uploading it to the GPU
with an appropriate shader which is very fast.


Re: Canvas performance on Mac OS

2015-04-09 Thread Mike
This is important 
Thanks guys 

Sent from my iPhone

 On Apr 8, 2015, at 9:25 AM, Chris Newland cnewl...@chrisnewland.com wrote:
 
 Hi Jim,
 
 I'll post the verbose prism output from my iMac when I get home.
 
 Just tried this on my Linux workstation and the performance gap is the
 same between es2 and sw so I don't think it's an OSX issue.
 
 uname -a
 Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux
 
 $JAVA_HOME/bin/java -classpath target/DemoFX.jar
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 20
 fps: 31
 fps: 32
 fps: 33
 fps: 35
 fps: 34
 fps: 33
 
 $JAVA_HOME/bin/java -Dprism.order=sw -classpath target/DemoFX.jar
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 56
 fps: 60
 fps: 59
 fps: 60
 fps: 61
 fps: 61
 fps: 60
 
 This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
 graphics card running driver 304.125
 
 Regards,
 
 Chris
 
 
 On Wed, April 8, 2015 00:16, Jim Graham wrote:
 OK, I took the time to put my rMBP on a diet yesterday and find room to
 install a 10.10 partition.  I get the same numbers for Sierpinski on 10.10,
 so my theory that something changed in the OGL implementation for 10.10
 doesn't hold water.
 
 But, I then tried it using the integrated graphics.  I get really poor
 performance using the integrated Intel 4000 graphics, but I get great
 numbers on the discrete nVidia 650m.  It makes sense that the Intel
 graphics wouldn't be as powerful as the discrete graphics, but we
 shouldn't be taxing it that much to make that big of a difference.
 
 Just to be sure - is that iMac a dual graphics system, or is it
 all-AMD-all-the-time?  You can see which GPU is being used if you run it
 with -Dprism.verbose=true...
 
 ...jim
 
 
 On 4/2/15 4:13 PM, Jim Graham wrote:
 
 On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
 running a newer version of MacOS?
 
 ...jim
 
 
 On 3/31/15 3:40 PM, Chris Newland wrote:
 
 Hi Hervé,
 
 
 That's a valid question :)
 
 
 Probably because
 
 
 a) All my non-UI graphics experience is with immediate-mode / raster
 systems
 
 b) I'm interested in using JavaFX for particle effects / demoscene /
 gaming so assumed (perhaps wrongly?) that scenegraph was not the way
 to go for that due to the very large number of nodes.
 
 Numbers for my Sierpinski filled triangle example:
 
 
 System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB


 java -Dprism.order=es2 -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 23
 fps: 18
 fps: 25
 fps: 18
 fps: 23
 fps: 23
 fps: 19
 fps: 25
 
 
 java -Dprism.order=sw -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 
 
 There are never more than 2500 filled triangles on screen. JDK is
 1.8.0_40
 
 
 I would say there is a performance problem here? (or at least a need
 for documentation so as to set expectations for gc.fillPolygon).
 
 Best regards,
 
 
 Chris
 
 
 
 
 
 On Tue, March 31, 2015 22:00, Hervé Girod wrote:
 
 Why don't you use Nodes rather than Canvas ?
 
 
 
 Sent from my iPhone
 
 
 
 On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com wrote:
 
 
 
 Hi Jim,
 
 
 
 Thanks, that makes things much clearer.
 
 
 
 I was surprised how much was going on under the hood of GraphicsContext
 and hoped it was just magic glue that gave the best of GPU
 acceleration where available and immediate-mode-like simple
 rasterizing where not.

 I've managed to find an anomaly with GraphicsContext.fillPolygon
 where the software pipeline achieves the full 60fps but ES2 can
 only manage 30-35fps. It uses lots of overlapping filled triangles
 so I expect it suffers from the problem you've described.

 SSCCE:
 https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

 Was full frame rate canvas drawing an expected use case for JavaFX or
 would I be better off with Graphics2D?
 
 Thanks,
 
 
 
 Chris
 
 
 
 On Mon, March 30, 2015 20:04, Jim Graham wrote:
 Hi Chris,
 
 
 
 
 drawLine() is a very simple primitive that can be optimized with a GPU
 shader.  It either looks like a (potentially rotated) rectangle
 or a rounded rect - and we have optimized shaders for both
 cases.  A large number of drawLine() calls turns into simply
 accumulating a large vertex list and uploading it to the GPU
 with an appropriate shader which is very fast.

 drawPolygon() is a very complex operation that involves things
 like:

 - dealing with line joins between segments that don't exist for
   drawLine()
 - dealing with only rendering common points of intersection once
 
 To handle all of that complexity we have to involve a
 rasterizer that takes the entire collection of lines, analyzes
 the stroke attributes and interactions and computes a coverage
 mask for each pixel in the region. We do that in software
 

Re: Canvas performance on Mac OS

2015-04-08 Thread Chris Newland
Hi Jim,

Definitely discrete GPU on the iMac:

java -cp target/DemoFX.jar -Dprism.verbose=true
com.chrisnewland.demofx.standalone.Sierpinski

Prism pipeline init order: es2 sw
Using native-based Pisces rasterizer
Using dirty region optimizations
Not using texture mask for primitives
Not forcing power of 2 sizes for textures
Using hardware CLAMP_TO_ZERO mode
Opting in for HiDPI pixel scaling
Prism pipeline name = com.sun.prism.es2.ES2Pipeline
Loading ES2 native library ... prism_es2
succeeded.
GLFactory using com.sun.prism.es2.MacGLFactory
(X) Got class = class com.sun.prism.es2.ES2Pipeline
Initialized prism pipeline: com.sun.prism.es2.ES2Pipeline
Maximum supported texture size: 16384
Maximum texture size clamped to 4096
Non power of two texture support = true
Maximum number of vertex attributes = 16
Maximum number of uniform vertex components = 3072
Maximum number of uniform fragment components = 3072
Maximum number of varying components = 128
Maximum number of texture units usable in a vertex shader = 16
Maximum number of texture units usable in a fragment shader = 16
Graphics Vendor: ATI Technologies Inc.
   Renderer: AMD Radeon HD 6970M OpenGL Engine
Version: 2.1 ATI-1.24.38
 vsync: true vpipe: true
fps: 1
ES2ResourceFactory: Prism - createStockShader: Solid_Color.frag
ES2ResourceFactory: Prism - createStockShader: FillPgram_Color.frag
Loading Prism common native library ...
succeeded.
ES2ResourceFactory: Prism - createStockShader: Texture_Color.frag
ES2ResourceFactory: Prism - createStockShader: Solid_TextureRGB.frag
fps: 23
fps: 18
fps: 25
fps: 18
fps: 23
fps: 23
fps: 19
fps: 25
fps: 18
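
The GPU in use can be read off the "Graphics Vendor:" and "Renderer:" lines in the verbose output above. If that output is captured to a file or a list of lines, a small scan like the one below could pull those fields out. This is a sketch only: PrismVerbose and its field method are hypothetical, not an OpenJFX API, and it assumes the "Label: value" layout shown in this message stays the same across builds.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not an OpenJFX API: scans captured
// -Dprism.verbose=true output for a "Label: value" line.
class PrismVerbose {
    /** Returns the value of the first line starting with "label:", if any. */
    static Optional<String> field(List<String> lines, String label) {
        String prefix = label + ":";
        return lines.stream()
                .map(String::trim) // the Renderer line is indented in the log
                .filter(line -> line.startsWith(prefix))
                .map(line -> line.substring(prefix.length()).trim())
                .findFirst();
    }

    public static void main(String[] args) {
        // A few lines copied from the es2 run above.
        List<String> captured = Arrays.asList(
                "Prism pipeline name = com.sun.prism.es2.ES2Pipeline",
                "Graphics Vendor: ATI Technologies Inc.",
                "   Renderer: AMD Radeon HD 6970M OpenGL Engine",
                "Version: 2.1 ATI-1.24.38");
        System.out.println(field(captured, "Graphics Vendor").orElse("unknown"));
        System.out.println(field(captured, "Renderer").orElse("unknown"));
    }
}
```

For the software pipeline run no such lines are printed, so the lookup would come back empty.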

With software pipeline:

java -cp target/DemoFX.jar -Dprism.verbose=true -Dprism.order=sw
com.chrisnewland.demofx.standalone.Sierpinski

Prism pipeline init order: sw
Using native-based Pisces rasterizer
Using dirty region optimizations
Not using texture mask for primitives
Not forcing power of 2 sizes for textures
Using hardware CLAMP_TO_ZERO mode
Opting in for HiDPI pixel scaling
*** Fallback to Prism SW pipeline
Prism pipeline name = com.sun.prism.sw.SWPipeline
(X) Got class = class com.sun.prism.sw.SWPipeline
Initialized prism pipeline: com.sun.prism.sw.SWPipeline
 vsync: true vpipe: false
fps: 1
Loading Prism common native library ...
succeeded.
fps: 53
fps: 60
fps: 60
fps: 60
fps: 60

But earlier I got a similar performance drop for es2 on a Linux system with
discrete Nvidia graphics (see my previous email).

I'll see if I can find a Windows box with discrete graphics to test if all
platforms exhibit this behaviour.

Cheers,

Chris


On Wed, April 8, 2015 00:16, Jim Graham wrote:
 OK, I took the time to put my rMBP on a diet yesterday and find room to
 install a 10.10 partition.  I get the same numbers for Sierpinski on 10.10,
 so my theory that something changed in the OGL implementation for 10.10
 doesn't hold water.

 But, I then tried it using the integrated graphics.  I get really poor
 performance using the integrated Intel 4000 graphics, but I get great
 numbers on the discrete nVidia 650m.  It makes sense that the Intel
 graphics wouldn't be as powerful as the discrete graphics, but we
 shouldn't be taxing it that much to make that big of a difference.

 Just to be sure - is that iMac a dual graphics system, or is it
 all-AMD-all-the-time?  You can see which GPU is being used if you run it
 with -Dprism.verbose=true...

 ...jim


 On 4/2/15 4:13 PM, Jim Graham wrote:

 On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
 running a newer version of MacOS?

 ...jim


 On 3/31/15 3:40 PM, Chris Newland wrote:

 Hi Hervé,


 That's a valid question :)


 Probably because


 a) All my non-UI graphics experience is with immediate-mode / raster
 systems

 b) I'm interested in using JavaFX for particle effects / demoscene /
 gaming so assumed (perhaps wrongly?) that scenegraph was not the way
 to go for that due to the very large number of nodes.

 Numbers for my Sierpinski filled triangle example:


 System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB


 java -Dprism.order=es2 -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 23
 fps: 18
 fps: 25
 fps: 18
 fps: 23
 fps: 23
 fps: 19
 fps: 25


 java -Dprism.order=sw -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60


 There are never more than 2500 filled triangles on screen. JDK is
 1.8.0_40


 I would say there is a performance problem here? (or at least a need
 for documentation so as to set expectations for gc.fillPolygon).

 Best regards,


 Chris





 On Tue, March 31, 2015 22:00, Hervé Girod wrote:

 Why don't you use Nodes rather than Canvas ?



 Sent from my iPhone



 On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com wrote:



 Hi Jim,



 Thanks, that makes things much clearer.



 I was surprised how much was 

Re: Canvas performance on Mac OS

2015-04-08 Thread Robert Krüger
All my MBP numbers are on integrated Intel graphics as well. I tested on an
old MBP that only has Intel graphics and on my more recent MBP the Nvidia
is deactivated due to a problem with it.

On Wed, Apr 8, 2015 at 1:16 AM, Jim Graham james.gra...@oracle.com wrote:

 OK, I took the time to put my rMBP on a diet yesterday and find room to
 install a 10.10 partition.  I get the same numbers for Sierpinski on 10.10,
 so my theory that something changed in the OGL implementation for 10.10
 doesn't hold water.

 But, I then tried it using the integrated graphics.  I get really poor
 performance using the integrated Intel 4000 graphics, but I get great
 numbers on the discrete nVidia 650m.  It makes sense that the Intel
 graphics wouldn't be as powerful as the discrete graphics, but we shouldn't
 be taxing it that much to make that big of a difference.

 Just to be sure - is that iMac a dual graphics system, or is it
 all-AMD-all-the-time?  You can see which GPU is being used if you run it
 with -Dprism.verbose=true...

 ...jim


 On 4/2/15 4:13 PM, Jim Graham wrote:

 On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
 running a newer version of MacOS?

  ...jim

 On 3/31/15 3:40 PM, Chris Newland wrote:

 Hi Hervé,

 That's a valid question :)

 Probably because

 a) All my non-UI graphics experience is with immediate-mode / raster
 systems

 b) I'm interested in using JavaFX for particle effects / demoscene /
 gaming so assumed (perhaps wrongly?) that scenegraph was not the way
 to go for that due to the very large number of nodes.

 Numbers for my Sierpinski filled triangle example:

 System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB

 java -Dprism.order=es2 -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 23
 fps: 18
 fps: 25
 fps: 18
 fps: 23
 fps: 23
 fps: 19
 fps: 25

 java -Dprism.order=sw -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60

 There are never more than 2500 filled triangles on screen. JDK is
 1.8.0_40

 I would say there is a performance problem here? (or at least a need for
 documentation so as to set expectations for gc.fillPolygon).

 Best regards,

 Chris




 On Tue, March 31, 2015 22:00, Hervé Girod wrote:

 Why don't you use Nodes rather than Canvas ?


 Sent from my iPhone


  On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com
 wrote:


 Hi Jim,


 Thanks, that makes things much clearer.


 I was surprised how much was going on under the hood of GraphicsContext
   and hoped it was just magic glue that gave the best of GPU
 acceleration where available and immediate-mode-like simple rasterizing
 where not.

 I've managed to find an anomaly with GraphicsContext.fillPolygon where
 the software pipeline achieves the full 60fps but ES2 can only manage
 30-35fps. It uses lots of overlapping filled triangles so I expect
 suffers from the problem you've described.

 SSCCE:
 https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

 Was full frame rate canvas drawing an expected use case for JavaFX or
 would I be better off with Graphics2D?

 Thanks,


 Chris


  On Mon, March 30, 2015 20:04, Jim Graham wrote:
 Hi Chris,



 drawLine() is a very simple primitive that can be optimized with a
 GPU
 shader.  It either looks like a (potentially rotated) rectangle or a
 rounded rect - and we have optimized shaders for both cases.  A large
   number of drawLine() calls turns into simply accumulating a large
 vertex list and uploading it to the GPU with an appropriate shader
 which is very fast.

 drawPolygon() is a very complex operation that involves things like:

 - dealing with line joins between segments that don't exist for
   drawLine()
 - dealing with only rendering common points of intersection once

 To handle all of that complexity we have to involve a rasterizer that
   takes the entire collection of lines, analyzes the stroke attributes
 and interactions and computes a coverage mask for each pixel in the
 region. We do that in software currently for all pipelines.

 For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path
   rasterization.

 For the SW pipeline, drawLine is a simplified case of drawPolygon and
 so the overhead of lots of calls to drawLine() dominates its
 performance.

 I would expect ES2 to blow the SW pipeline out of the water with
 drawLine() performance (as long as there are no additional rendering
 primitives interspersed in the set of lines).

 But, both should be on the same footing for the drawPolygon case.
 Does
 the ES2 pipeline compare similarly (hopefully better than) the SW
 pipeline for the polygon case?

 One thing I noticed is that we have no optimized case for drawLine()
 on the SW pipeline.  It generates a path containing a single 

Re: Canvas performance on Mac OS

2015-04-08 Thread Chris Newland
Hi Jim,

I'll post the verbose prism output from my iMac when I get home.

Just tried this on my Linux workstation and the performance gap is the
same between es2 and sw so I don't think it's an OSX issue.

uname -a
Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux

$JAVA_HOME/bin/java -classpath target/DemoFX.jar
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 20
fps: 31
fps: 32
fps: 33
fps: 35
fps: 34
fps: 33

$JAVA_HOME/bin/java -Dprism.order=sw -classpath target/DemoFX.jar
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 56
fps: 60
fps: 59
fps: 60
fps: 61
fps: 61
fps: 60

This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
graphics card running driver 304.125

Regards,

Chris


On Wed, April 8, 2015 00:16, Jim Graham wrote:
 OK, I took the time to put my rMBP on a diet yesterday and find room to
 install a 10.10 partition.  I get the same numbers for Sierpinski on 10.10,
 so my theory that something changed in the OGL implementation for 10.10
 doesn't hold water.

 But, I then tried it using the integrated graphics.  I get really poor
 performance using the integrated Intel 4000 graphics, but I get great
 numbers on the discrete nVidia 650m.  It makes sense that the Intel
 graphics wouldn't be as powerful as the discrete graphics, but we
 shouldn't be taxing it that much to make that big of a difference.

 Just to be sure - is that iMac a dual graphics system, or is it
 all-AMD-all-the-time?  You can see which GPU is being used if you run it
 with -Dprism.verbose=true...

 ...jim


 On 4/2/15 4:13 PM, Jim Graham wrote:

 On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
 running a newer version of MacOS?

 ...jim


 On 3/31/15 3:40 PM, Chris Newland wrote:

 Hi Hervé,


 That's a valid question :)


 Probably because


 a) All my non-UI graphics experience is with immediate-mode / raster
 systems

 b) I'm interested in using JavaFX for particle effects / demoscene /
 gaming so assumed (perhaps wrongly?) that scenegraph was not the way
 to go for that due to the very large number of nodes.

 Numbers for my Sierpinski filled triangle example:


 System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M
 1024 MB


 java -Dprism.order=es2 -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 23
 fps: 18
 fps: 25
 fps: 18
 fps: 23
 fps: 23
 fps: 19
 fps: 25


 java -Dprism.order=sw -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60


 There are never more than 2500 filled triangles on screen. JDK is
 1.8.0_40


 I would say there is a performance problem here? (or at least a need
 for documentation so as to set expectations for gc.fillPolygon).

 Best regards,


 Chris





 On Tue, March 31, 2015 22:00, Hervé Girod wrote:

 Why don't you use Nodes rather than Canvas ?



 Sent from my iPhone



 On Mar 31, 2015, at 22:31, Chris Newland
 cnewl...@chrisnewland.com
 wrote:



 Hi Jim,



 Thanks, that makes things much clearer.



 I was surprised how much was going on under the hood of
 GraphicsContext
 and hoped it was just magic glue that gave the best of GPU
 acceleration where available and immediate-mode-like simple
 rasterizing where not.

 I've managed to find an anomaly with GraphicsContext.fillPolygon
 where the software pipeline achieves the full 60fps but ES2 can
 only manage 30-35fps. It uses lots of overlapping filled triangles
 so I expect suffers from the problem you've described.

 SSCCE:
 https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

 Was full frame rate canvas drawing an expected use case for
 JavaFX or
 would I be better off with Graphics2D?

 Thanks,



 Chris



 On Mon, March 30, 2015 20:04, Jim Graham wrote:
 Hi Chris,




 drawLine() is a very simple primitive that can be optimized
 with a GPU
 shader.  It either looks like a (potentially rotated) rectangle
 or a rounded rect - and we have optimized shaders for both
 cases.  A large number of drawLine() calls turns into simply
 accumulating a large vertex list and uploading it to the GPU
 with an appropriate shader which is very fast.

 drawPolygon() is a very complex operation that involves things
 like:


 - dealing with line joins between segments that don't exist for
   drawLine()
 - dealing with only rendering common points of intersection once

 To handle all of that complexity we have to involve a
 rasterizer that takes the entire collection of lines, analyzes
 the stroke attributes and interactions and computes a coverage
 mask for each pixel in the region. We do that in software
 currently for all pipelines.

 For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs
 CPU path
 rasterization.

 For the SW pipeline, drawLine is a simplified case of
 drawPolygon and so the overhead of lots of calls to drawLine()
 

Re: Canvas performance on Mac OS

2015-04-07 Thread Jim Graham
OK, I took the time to put my rMBP on a diet yesterday and find room to 
install a 10.10 partition.  I get the same numbers for Sierpinski on 
10.10, so my theory that something changed in the OGL implementation for 
10.10 doesn't hold water.


But, I then tried it using the integrated graphics.  I get really poor 
performance using the integrated Intel 4000 graphics, but I get great 
numbers on the discrete nVidia 650m.  It makes sense that the Intel 
graphics wouldn't be as powerful as the discrete graphics, but we 
shouldn't be taxing it that much to make that big of a difference.


Just to be sure - is that iMac a dual graphics system, or is it 
all-AMD-all-the-time?  You can see which GPU is being used if you run it 
with -Dprism.verbose=true...


...jim

On 4/2/15 4:13 PM, Jim Graham wrote:

On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
running a newer version of MacOS?

 ...jim

On 3/31/15 3:40 PM, Chris Newland wrote:

Hi Hervé,

That's a valid question :)

Probably because

a) All my non-UI graphics experience is with immediate-mode / raster
systems

b) I'm interested in using JavaFX for particle effects / demoscene /
gaming so assumed (perhaps wrongly?) that scenegraph was not the way
to go
for that due to the very large number of nodes.

Numbers for my Sierpinski filled triangle example:

System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB

java -Dprism.order=es2 -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 23
fps: 18
fps: 25
fps: 18
fps: 23
fps: 23
fps: 19
fps: 25

java -Dprism.order=sw -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60

There are never more than 2500 filled triangles on screen. JDK is
1.8.0_40

I would say there is a performance problem here? (or at least a need for
documentation so as to set expectations for gc.fillPolygon).

Best regards,

Chris




On Tue, March 31, 2015 22:00, Hervé Girod wrote:

Why don't you use Nodes rather than Canvas ?


Sent from my iPhone



On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com
wrote:


Hi Jim,


Thanks, that makes things much clearer.


I was surprised how much was going on under the hood of GraphicsContext
  and hoped it was just magic glue that gave the best of GPU
acceleration where available and immediate-mode-like simple rasterizing
where not.

I've managed to find an anomaly with GraphicsContext.fillPolygon where
the software pipeline achieves the full 60fps but ES2 can only manage
30-35fps. It uses lots of overlapping filled triangles so I expect
suffers from the problem you've described.

SSCCE:
https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

Was full frame rate canvas drawing an expected use case for JavaFX or
would I be better off with Graphics2D?

Thanks,


Chris



On Mon, March 30, 2015 20:04, Jim Graham wrote:
Hi Chris,



drawLine() is a very simple primitive that can be optimized with a
GPU
shader.  It either looks like a (potentially rotated) rectangle or a
rounded rect - and we have optimized shaders for both cases.  A large
  number of drawLine() calls turns into simply accumulating a large
vertex list and uploading it to the GPU with an appropriate shader
which is very fast.

drawPolygon() is a very complex operation that involves things like:

- dealing with line joins between segments that don't exist for
  drawLine()
- dealing with only rendering common points of intersection once

To handle all of that complexity we have to involve a rasterizer that
  takes the entire collection of lines, analyzes the stroke attributes
and interactions and computes a coverage mask for each pixel in the
region. We do that in software currently for all pipelines.

For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path
  rasterization.

For the SW pipeline, drawLine is a simplified case of drawPolygon and
so the overhead of lots of calls to drawLine() dominates its
performance.

I would expect ES2 to blow the SW pipeline out of the water with
drawLine() performance (as long as there are no additional rendering
primitives interspersed in the set of lines).

But, both should be on the same footing for the drawPolygon case.
Does
the ES2 pipeline compare similarly (hopefully better than) the SW
pipeline for the polygon case?

One thing I noticed is that we have no optimized case for drawLine()
on the SW pipeline.  It generates a path containing a single MOVETO
and LINETO and feeds it to the generalized path rasterizer when it
could instead compute the rounded/square rectangle and render it more
directly.  If we added that support then I'd expect the SW pipeline to
perform the set of drawLine calls faster than drawPolygon as well...

...jim




On 3/28/15 3:22 AM, Chris Newland wrote:


Hi Robert,



I've not filed a 

Re: Canvas performance on Mac OS

2015-04-07 Thread Jim Graham
If I modify the Sierpinski program to use moveTo/lineTo/lineTo on a path 
and fill the entire path at once the performance improves dramatically 
on both Intel and nVidia GPUs. It is faster still if I replace the 
triangles with fillRect calls, but not by as large a margin. It would 
appear that we are getting entirely bogged down by uploading lots of 
little alpha coverage tiles for each individual polygon to the GPU (odd 
that this overhead would be greater for the Intel integrated graphics 
that uses main system RAM than the nVidia discrete graphics which uses a 
separate memory system, but there could be something to be said for the 
discrete VRAM being faster).


...jim
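
The batching Jim describes (one path holding every triangle, filled with a single call) can be sketched as below. This is not the patch under discussion and not the DemoFX code. PathTarget is a hypothetical interface introduced here so the sketch runs without JavaFX on the classpath; its method names mirror the beginPath/moveTo/lineTo/closePath/fill calls on GraphicsContext, so in a real program a thin adapter around the canvas GraphicsContext would take its place. Triangle and CountingTarget are likewise illustrative.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: append many triangles to one path so fill() runs once per frame. */
final class BatchedFillSketch {

    /** Hypothetical stand-in; method names mirror the GraphicsContext path calls. */
    interface PathTarget {
        void beginPath();
        void moveTo(double x, double y);
        void lineTo(double x, double y);
        void closePath();
        void fill();
    }

    /** A triangle given by its three corners. */
    static final class Triangle {
        final double x1, y1, x2, y2, x3, y3;

        Triangle(double x1, double y1, double x2, double y2, double x3, double y3) {
            this.x1 = x1; this.y1 = y1;
            this.x2 = x2; this.y2 = y2;
            this.x3 = x3; this.y3 = y3;
        }
    }

    /** Adds every triangle as a closed subpath, then fills the whole path in one call. */
    static void fillAll(PathTarget target, List<Triangle> triangles) {
        target.beginPath();
        for (Triangle t : triangles) {
            target.moveTo(t.x1, t.y1);
            target.lineTo(t.x2, t.y2);
            target.lineTo(t.x3, t.y3);
            target.closePath();
        }
        target.fill(); // one fill for the frame instead of one per triangle
    }

    /** Illustrative target that only counts calls, used to show the call pattern. */
    static final class CountingTarget implements PathTarget {
        int subpaths, segments, fillCalls;

        @Override public void beginPath() { }
        @Override public void moveTo(double x, double y) { subpaths++; }
        @Override public void lineTo(double x, double y) { segments++; }
        @Override public void closePath() { }
        @Override public void fill() { fillCalls++; }
    }

    public static void main(String[] args) {
        List<Triangle> triangles = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            triangles.add(new Triangle(i, 0, i + 1, 0, i, 1));
        }
        CountingTarget target = new CountingTarget();
        fillAll(target, triangles);
        System.out.println("subpaths=" + target.subpaths + " fillCalls=" + target.fillCalls);
    }
}
```

One caveat with this approach: a single path is filled with one paint and one fill rule, so triangles that overlap, or that need different colours or alpha, will not necessarily look the same as they do when each is filled separately.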

On 3/31/15, 1:31 PM, Chris Newland wrote:

Hi Jim,

Thanks, that makes things much clearer.

I was surprised how much was going on under the hood of GraphicsContext
and hoped it was just magic glue that gave the best of GPU acceleration
where available and immediate-mode-like simple rasterizing where not.

I've managed to find an anomaly with GraphicsContext.fillPolygon where the
software pipeline achieves the full 60fps but ES2 can only manage
30-35fps. It uses lots of overlapping filled triangles so I expect suffers
from the problem you've described.

SSCCE:
https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

Was full frame rate canvas drawing an expected use case for JavaFX or
would I be better off with Graphics2D?

Thanks,

Chris

On Mon, March 30, 2015 20:04, Jim Graham wrote:

Hi Chris,


drawLine() is a very simple primitive that can be optimized with a GPU
shader.  It either looks like a (potentially rotated) rectangle or a
rounded rect - and we have optimized shaders for both cases.  A large
number of drawLine() calls turns into simply accumulating a large vertex
list and uploading it to the GPU with an appropriate shader which is very
fast.

drawPolygon() is a very complex operation that involves things like:

- dealing with line joins between segments that don't exist for
  drawLine()
- dealing with only rendering common points of intersection once

To handle all of that complexity we have to involve a rasterizer that
takes the entire collection of lines, analyzes the stroke attributes and
interactions and computes a coverage mask for each pixel in the region. We
do that in software currently for all pipelines.

For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path
rasterization.

For the SW pipeline, drawLine is a simplified case of drawPolygon and so
the overhead of lots of calls to drawLine() dominates its performance.

I would expect ES2 to blow the SW pipeline out of the water with
drawLine() performance (as long as there are no additional rendering
primitives interspersed in the set of lines).

But, both should be on the same footing for the drawPolygon case.  Does
the ES2 pipeline compare similarly (hopefully better than) the SW pipeline
for the polygon case?

One thing I noticed is that we have no optimized case for drawLine() on
the SW pipeline.  It generates a path containing a single MOVETO and LINETO
and feeds it to the generalized path rasterizer when it could instead
compute the rounded/square rectangle and render it more directly.  If we
added that support then I'd expect the SW pipeline to perform the set of
drawLine calls faster than drawPolygon as well...

...jim
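
The remark that a stroked line "looks like a (potentially rotated) rectangle" can be made concrete with a little geometry. The sketch below is not Prism code; it only illustrates, under the assumption of a plain straight segment with a uniform stroke width, how the four corners of that rectangle could be computed directly instead of going through a general path rasterizer. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Sketch only: turn a stroked line segment into the four corners of a rectangle. */
final class LineQuadSketch {

    /**
     * Returns {x0,y0, x1,y1, x2,y2, x3,y3}, the corners of the rectangle covered by
     * a line from (x1,y1) to (x2,y2) drawn with the given stroke width.
     * If squareCap is true, both ends are extended by half the width.
     */
    static double[] lineToQuad(double x1, double y1, double x2, double y2,
                               double width, boolean squareCap) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        double len = Math.hypot(dx, dy);
        if (len == 0) {
            throw new IllegalArgumentException("zero-length line has no direction");
        }
        double half = width / 2;
        // Unit direction along the line and the perpendicular offset of half the width.
        double ux = dx / len;
        double uy = dy / len;
        double nx = -uy * half;
        double ny = ux * half;
        // Optionally push both end points outwards for a square cap.
        double ext = squareCap ? half : 0;
        double sx = x1 - ux * ext;
        double sy = y1 - uy * ext;
        double ex = x2 + ux * ext;
        double ey = y2 + uy * ext;
        return new double[] {
            sx + nx, sy + ny,
            ex + nx, ey + ny,
            ex - nx, ey - ny,
            sx - nx, sy - ny
        };
    }

    public static void main(String[] args) {
        // Horizontal line of width 2 with square caps.
        System.out.println(Arrays.toString(lineToQuad(0, 0, 10, 0, 2, true)));
    }
}
```

Round caps and joins between consecutive segments are deliberately left out; those are exactly the cases the message says need more machinery.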


On 3/28/15 3:22 AM, Chris Newland wrote:


Hi Robert,


I've not filed a Jira yet as I was hoping to find time to investigate
thoroughly but when I saw your question I thought I'd better add my
findings.

I believe the issue is in the ES2Pipeline as if I run with
-Dprism.order=sw then strokePolygon outperforms the series of strokeLine
  commands as expected:

java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
Result: 44fps


java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
Result: 60fps


Will see if I can find the root cause as I've got plenty more examples
where ES2Pipeline performs horribly on my Mac which should have no
problem throwing around a few thousand polys.

I realise there's a *lot* of indirection involved in making JavaFX
support such a wide range of underlying graphics systems but I do think
there's a bug here.

Will file a Jira if I can contribute a bit more than feels slow ;)


Cheers,


Chris


On Sat, March 28, 2015 10:06, Robert Krüger wrote:


This is consistent with what I am observing. Is this something that
Oracle
is aware of? Looking at Jira, I don't see that anyone is working on
this:


https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)

Given that one of the main reasons to use JFX for me is to
be able to develop with one code base for at least OSX and Windows and
the official statement what JavaFX is 

Re: Canvas performance on Mac OS

2015-04-05 Thread Robert Krüger
Hi,

On Sat, Apr 4, 2015 at 10:31 PM, Chris Newland cnewl...@chrisnewland.com
wrote:

 Hi Jim,



-snip


 I think my question is:

 Does the OpenJFX group think JavaFX is a suitable technology for full
 frame rate canvas-style graphics or is the degree of indirection between
 application code and the graphics hardware just too great?


I think there is also a general problem not related to 2d drawing at least
on 10.10.2. For RT-40377 I created a simple node-based alternative which is
animating _one_ circle and in full-screen mode I get 25-35 fps on my retina
MBP. Maybe it's unrelated but maybe there is an additional throttle
somewhere also affecting your case.


 I would have expected the hardware I've tested on to eat 2500 triangles at
 60fps for breakfast even with no GPU acceleration.


Yes, for my case with one circle I would have expected almost no CPU but I
still get 15% which I find quite a bit for rendering one circle 30
times/sec.


 I'm going to knock up a version of this code that uses Graphics2D for
 comparison.


If you do that, please also include numbers for running that code with
Apple Java 6 as well, because there are quite a few people still saying
that Apple's Java 6 outperforms Oracle's Java by a lot in 2D Graphics.


 Cheers,

 Chris


I don't know what else to do but to lobby here and invest some work in Jira
issues with reproducible test cases. There is a huge performance problem on
the Mac (I have to admit, I have no Windows machine to compare myself) with
the potential to drive companies like ours, which is seriously
considering/testing the technology for our product development, away from
the technology. I would also hope that other people who have encountered
this like the Ultramixer guys don't give up on this and keep posting
qualified information, making the case for this and supporting the Oracle
team by reproducible benchmarks/test cases.

Cheers,

Robert


Re: Canvas performance on Mac OS

2015-04-04 Thread Chris Newland
Hi Jim,

The first numbers were for my 27" 2011 iMac which runs OSX 10.9 Mavericks.

Here are my numbers for a 2013 MacBook Pro (13" Retina) 2.4 GHz Intel Core
i5 / 8GB / Intel Iris 1536 MB / OSX 10.10.2 Yosemite

I don't get 60fps with either pipeline:

java -Dprism.order=es2 -cp target/classes
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 22
fps: 30
fps: 30
fps: 32

java -Dprism.order=sw -cp target/classes
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 28
fps: 34
fps: 33
fps: 34

The OSX Activity Monitor shows the CPU for the Java process near 100% so
it's CPU bound for both pipelines.

On my iMac where I get 60fps with sw pipeline the CPU is only 50%.
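
The "fps:" figures in these runs are printed roughly once per second. How DemoFX computes them is not shown in this thread, so the following is only a minimal sketch of one common approach, assuming the caller supplies a nanosecond timestamp for each frame (JavaFX's AnimationTimer.handle(long now) provides such a value). The class name and method are hypothetical.

```java
/** Sketch only: derive a frames-per-second figure from per-frame timestamps. */
final class FpsCounterSketch {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private long windowStart = -1; // timestamp that opened the current window
    private int frames;            // frames seen since windowStart

    /**
     * Call once per rendered frame with a nanosecond timestamp.
     * Returns the rounded frame rate when at least one second has elapsed
     * since the window opened, otherwise -1.
     */
    int onFrame(long nowNanos) {
        if (windowStart < 0) {
            windowStart = nowNanos; // the first frame only opens the window
            return -1;
        }
        frames++;
        long elapsed = nowNanos - windowStart;
        if (elapsed < ONE_SECOND_NANOS) {
            return -1;
        }
        int fps = (int) Math.round(frames * (double) ONE_SECOND_NANOS / elapsed);
        frames = 0;
        windowStart = nowNanos;
        return fps;
    }

    public static void main(String[] args) {
        FpsCounterSketch counter = new FpsCounterSketch();
        long frameInterval = ONE_SECOND_NANOS / 60; // simulated frame spacing
        for (int i = 0; i < 200; i++) {
            int fps = counter.onFrame(i * frameInterval);
            if (fps >= 0) {
                System.out.println("fps: " + fps);
            }
        }
    }
}
```

Dividing by the measured elapsed time, rather than assuming each window is exactly one second long, keeps the figure stable when frames do not land exactly on the second boundary.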

I've written a bunch of other JavaFX effects and it's only the routines
that use strokePolygon and fillPolygon that don't get 60fps once the
polygon count goes above a few hundred.

I've checked the JIT compilation in my application code with JITWatch and
everything is compiled and inlined as I'd expect.

GC logs show a GC every couple of seconds freeing up about 30MB:

13.505: [GC (Allocation Failure) [PSYoungGen: 31983K->96K(36352K)] 37760K->5889K(123904K), 0.0013589 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
fps: 32
fps: 32
15.089: [GC (Allocation Failure) [PSYoungGen: 31328K->160K(36352K)] 37121K->5969K(123904K), 0.0008222 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
fps: 33
16.683: [GC (Allocation Failure) [PSYoungGen: 30880K->194K(35840K)] 36689K->6011K(123392K), 0.0005803 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

I think my question is:

Does the OpenJFX group think JavaFX is a suitable technology for full
frame rate canvas-style graphics or is the degree of indirection between
application code and the graphics hardware just too great?

I would have expected the hardware I've tested on to eat 2500 triangles at
60fps for breakfast even with no GPU acceleration.

I'm going to knock up a version of this code that uses Graphics2D for
comparison.

Cheers,

Chris

On Fri, April 3, 2015 00:13, Jim Graham wrote:
 On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
 running a newer version of MacOS?

 ...jim


 On 3/31/15 3:40 PM, Chris Newland wrote:

 Hi Hervé,


 That's a valid question :)


 Probably because


 a) All my non-UI graphics experience is with immediate-mode / raster
 systems

 b) I'm interested in using JavaFX for particle effects / demoscene /
 gaming so assumed (perhaps wrongly?) that scenegraph was not the way to
 go for that due to the very large number of nodes.

 Numbers for my Sierpinski filled triangle example:


 System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024
 MB


 java -Dprism.order=es2 -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 23
 fps: 18
 fps: 25
 fps: 18
 fps: 23
 fps: 23
 fps: 19
 fps: 25


 java -Dprism.order=sw -cp target/classes/
 com.chrisnewland.demofx.standalone.Sierpinski
 fps: 1
 fps: 54
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60
 fps: 60


 There are never more than 2500 filled triangles on screen. JDK is
 1.8.0_40


 I would say there is a performance problem here? (or at least a need
 for documentation so as to set expectations for gc.fillPolygon).

 Best regards,


 Chris





 On Tue, March 31, 2015 22:00, Hervé Girod wrote:

 Why don't you use Nodes rather than Canvas ?



 Sent from my iPhone



 On Mar 31, 2015, at 22:31, Chris Newland
 cnewl...@chrisnewland.com
 wrote:



 Hi Jim,



 Thanks, that makes things much clearer.



 I was surprised how much was going on under the hood of
 GraphicsContext
 and hoped it was just magic glue that gave the best of GPU
 acceleration where available and immediate-mode-like simple
 rasterizing where not.

 I've managed to find an anomaly with GraphicsContext.fillPolygon
 where the software pipeline achieves the full 60fps but ES2 can only
 manage 30-35fps. It uses lots of overlapping filled triangles so I
 expect suffers from the problem you've described.

 SSCCE:
 https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

 Was full frame rate canvas drawing an expected use case for JavaFX
 or would I be better off with Graphics2D?

 Thanks,



 Chris



 On Mon, March 30, 2015 20:04, Jim Graham wrote:
 Hi Chris,




 drawLine() is a very simple primitive that can be optimized with
 a GPU
 shader.  It either looks like a (potentially rotated) rectangle or
 a rounded rect - and we have optimized shaders for both cases.  A
 large number of drawLine() calls turns into simply accumulating a
 large vertex list and uploading it to the GPU with an appropriate
 shader which is very fast.

 drawPolygon() is a very complex operation that involves things
 like:


 - dealing with line joins between segments that don't exist for
   drawLine()
 - dealing with only rendering common points of intersection once

 To handle all of that complexity we have to involve a rasterizer
 that 

Re: Canvas performance on Mac OS

2015-04-02 Thread Jim Graham
On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you 
running a newer version of MacOS?


...jim

On 3/31/15 3:40 PM, Chris Newland wrote:

Hi Hervé,

That's a valid question :)

Probably because

a) All my non-UI graphics experience is with immediate-mode / raster systems

b) I'm interested in using JavaFX for particle effects / demoscene /
gaming so assumed (perhaps wrongly?) that scenegraph was not the way to go
for that due to the very large number of nodes.

Numbers for my Sierpinski filled triangle example:

System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB

java -Dprism.order=es2 -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 23
fps: 18
fps: 25
fps: 18
fps: 23
fps: 23
fps: 19
fps: 25

java -Dprism.order=sw -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60

There are never more than 2500 filled triangles on screen. JDK is 1.8.0_40

I would say there is a performance problem here? (or at least a need for
documentation so as to set expectations for gc.fillPolygon).

Best regards,

Chris




On Tue, March 31, 2015 22:00, Hervé Girod wrote:

Why don't you use Nodes rather than Canvas ?


Sent from my iPhone



On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com
wrote:


Hi Jim,


Thanks, that makes things much clearer.


I was surprised how much was going on under the hood of GraphicsContext
  and hoped it was just magic glue that gave the best of GPU
acceleration where available and immediate-mode-like simple rasterizing
where not.

I've managed to find an anomaly with GraphicsContext.fillPolygon where
the software pipeline achieves the full 60fps but ES2 can only manage
30-35fps. It uses lots of overlapping filled triangles so I expect
suffers from the problem you've described.

SSCCE:
https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

Was full frame rate canvas drawing an expected use case for JavaFX or
would I be better off with Graphics2D?

Thanks,


Chris



On Mon, March 30, 2015 20:04, Jim Graham wrote:
Hi Chris,



drawLine() is a very simple primitive that can be optimized with a
GPU
shader.  It either looks like a (potentially rotated) rectangle or a
rounded rect - and we have optimized shaders for both cases.  A large
  number of drawLine() calls turns into simply accumulating a large
vertex list and uploading it to the GPU with an appropriate shader
which is very fast.

drawPolygon() is a very complex operation that involves things like:

- dealing with line joins between segments that don't exist for
  drawLine()
- dealing with only rendering common points of intersection once

To handle all of that complexity we have to involve a rasterizer that
  takes the entire collection of lines, analyzes the stroke attributes
and interactions and computes a coverage mask for each pixel in the
region. We do that in software currently for all pipelines.

For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path
  rasterization.

For the SW pipeline, drawLine is a simplified case of drawPolygon and
so the overhead of lots of calls to drawLine() dominates its
performance.

I would expect ES2 to blow the SW pipeline out of the water with
drawLine() performance (as long as there are no additional rendering
primitives interspersed in the set of lines).

But, both should be on the same footing for the drawPolygon case.
Does
the ES2 pipeline compare similarly (hopefully better than) the SW
pipeline for the polygon case?

One thing I noticed is that we have no optimized case for drawLine()
on the SW pipeline.  It generates a path containing a single MOVETO
and LINETO and feeds it to the generalized path rasterizer when it
could instead compute the rounded/square rectangle and render it more
directly.  If we added that support then I'd expect the SW pipeline to
perform the set of drawLine calls faster than drawPolygon as well...

...jim




On 3/28/15 3:22 AM, Chris Newland wrote:


Hi Robert,



I've not filed a Jira yet as I was hoping to find time to
investigate thoroughly but when I saw your question I thought I'd
better add my findings.

I believe the issue is in the ES2Pipeline as if I run with
-Dprism.order=sw then strokePolygon outperforms the series of
strokeLine commands as expected:

java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
Result: 44fps



java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
Result: 60fps



Will see if I can find the root cause as I've got plenty more
examples where ES2Pipeline performs horribly on my Mac which should
have no problem throwing around a few thousand polys.

I realise there's a *lot* of indirection involved in making JavaFX
support such a wide range of underlying 

Re: Canvas performance on Mac OS

2015-03-31 Thread Hervé Girod
Why don't you use Nodes rather than Canvas ?

Sent from my iPhone

 On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com wrote:
 
 Hi Jim,
 
 Thanks, that makes things much clearer.
 
 I was surprised how much was going on under the hood of GraphicsContext
 and hoped it was just magic glue that gave the best of GPU acceleration
 where available and immediate-mode-like simple rasterizing where not.
 
 I've managed to find an anomaly with GraphicsContext.fillPolygon where the
 software pipeline achieves the full 60fps but ES2 can only manage
 30-35fps. It uses lots of overlapping filled triangles so I expect suffers
 from the problem you've described.
 
 SSCCE:
 https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java
 
 Was full frame rate canvas drawing an expected use case for JavaFX or
 would I be better off with Graphics2D?
 
 Thanks,
 
 Chris
 
 On Mon, March 30, 2015 20:04, Jim Graham wrote:
 Hi Chris,
 
 
 drawLine() is a very simple primitive that can be optimized with a GPU
 shader.  It either looks like a (potentially rotated) rectangle or a
 rounded rect - and we have optimized shaders for both cases.  A large
 number of drawLine() calls turns into simply accumulating a large vertex
 list and uploading it to the GPU with an appropriate shader which is very
 fast.
 
 drawPolygon() is a very complex operation that involves things like:
 
 - dealing with line joins between segments that don't exist for
   drawLine()
 - dealing with only rendering common points of intersection once
 
 To handle all of that complexity we have to involve a rasterizer that
 takes the entire collection of lines, analyzes the stroke attributes and
 interactions and computes a coverage mask for each pixel in the region. We
 do that in software currently for all pipelines.
 
 For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path
 rasterization.
 
 For the SW pipeline, drawLine is a simplified case of drawPolygon and so
 the overhead of lots of calls to drawLine() dominates its performance.
 
 I would expect ES2 to blow the SW pipeline out of the water with
 drawLine() performance (as long as there are no additional rendering
 primitives interspersed in the set of lines).
 
 But, both should be on the same footing for the drawPolygon case.  Does
 the ES2 pipeline compare similarly (hopefully better than) the SW pipeline
 for the polygon case?
 
 One thing I noticed is that we have no optimized case for drawLine() on
 the SW pipeline.  It generates a path containing a single MOVETO and LINETO
 and feeds it to the generalized path rasterizer when it could instead
 compute the rounded/square rectangle and render it more directly.  If we
 added that support then I'd expect the SW pipeline to perform the set of
 drawLine calls faster than drawPolygon as well...
 
 ...jim
 
 
 On 3/28/15 3:22 AM, Chris Newland wrote:
 
 Hi Robert,
 
 
 I've not filed a Jira yet as I was hoping to find time to investigate
 thoroughly but when I saw your question I thought I'd better add my
 findings.
 
 I believe the issue is in the ES2Pipeline as if I run with
 -Dprism.order=sw then strokePolygon outperforms the series of strokeLine
 commands as expected:
 
 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m line Result: 44fps
 
 
 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly Result: 60fps
 
 
 Will see if I can find the root cause as I've got plenty more examples
 where ES2Pipeline performs horribly on my Mac which should have no
 problem throwing around a few thousand polys.
 
 I realise there's a *lot* of indirection involved in making JavaFX
 support such a wide range of underlying graphics systems but I do think
 there's a bug here.
 
 Will file a Jira if I can contribute a bit more than "feels slow" ;)
 
 
 Cheers,
 
 
 Chris
 
 
 On Sat, March 28, 2015 10:06, Robert Krüger wrote:
 
 This is consistent with what I am observing. Is this something that
 Oracle
 is aware of? Looking at Jira, I don't see that anyone is working on
 this:
 
 
 https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)
 
 Given that one of the main reasons to use JFX for me is to
 be able to develop with one code base for at least OSX and Windows and
 the official statement what JavaFX is for, i.e.
 
 JavaFX is a set of graphics and media packages that enables
 developers to design, create, test, debug, and deploy rich client
 applications that operate consistently across diverse platforms
 
 and the fact that this is clearly not the case currently (8u40) as
 soon as I do something else than simple forms, I run into
 performance/quality problems on the Mac, I am a bit unsure what to
 make of all that. Is Mac OSX
 a second-class citizen as far as dev 

Re: Canvas performance on Mac OS

2015-03-31 Thread Chris Newland
Hi Jim,

Thanks, that makes things much clearer.

I was surprised how much was going on under the hood of GraphicsContext
and hoped it was just magic glue that gave the best of GPU acceleration
where available and immediate-mode-like simple rasterizing where not.

I've managed to find an anomaly with GraphicsContext.fillPolygon where the
software pipeline achieves the full 60fps but ES2 can only manage
30-35fps. It uses lots of overlapping filled triangles so I expect it
suffers from the problem you've described.

SSCCE:
https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

Was full frame rate canvas drawing an expected use case for JavaFX or
would I be better off with Graphics2D?

Thanks,

Chris

On Mon, March 30, 2015 20:04, Jim Graham wrote:
 Hi Chris,


 drawLine() is a very simple primitive that can be optimized with a GPU
 shader.  It either looks like a (potentially rotated) rectangle or a
 rounded rect - and we have optimized shaders for both cases.  A large
 number of drawLine() calls turns into simply accumulating a large vertex
 list and uploading it to the GPU with an appropriate shader which is very
 fast.

 drawPolygon() is a very complex operation that involves things like:

 - dealing with line joins between segments that don't exist for
   drawLine()
 - dealing with only rendering common points of intersection once

 To handle all of that complexity we have to involve a rasterizer that
 takes the entire collection of lines, analyzes the stroke attributes and
 interactions and computes a coverage mask for each pixel in the region. We
 do that in software currently for all pipelines.

 For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path
 rasterization.

 For the SW pipeline, drawLine is a simplified case of drawPolygon and so
 the overhead of lots of calls to drawLine() dominates its performance.

 I would expect ES2 to blow the SW pipeline out of the water with
 drawLine() performance (as long as there are no additional rendering
 primitives interspersed in the set of lines).

 But, both should be on the same footing for the drawPolygon case.  Does
 the ES2 pipeline compare similarly (hopefully better than) the SW pipeline
 for the polygon case?

 One thing I noticed is that we have no optimized case for drawLine() on
 the SW pipeline.  It generates a path containing a single MOVETO and LINETO
 and feeds it to the generalized path rasterizer when it could instead
 compute the rounded/square rectangle and render it more directly.  If we
 added that support then I'd expect the SW pipeline to perform the set of
 drawLine calls faster than drawPolygon as well...

 ...jim


 On 3/28/15 3:22 AM, Chris Newland wrote:

 Hi Robert,


 I've not filed a Jira yet as I was hoping to find time to investigate
 thoroughly but when I saw your question I thought I'd better add my
 findings.

 I believe the issue is in the ES2Pipeline as if I run with
 -Dprism.order=sw then strokePolygon outperforms the series of strokeLine
  commands as expected:

 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m line Result: 44fps


 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly Result: 60fps


 Will see if I can find the root cause as I've got plenty more examples
 where ES2Pipeline performs horribly on my Mac which should have no
 problem throwing around a few thousand polys.

 I realise there's a *lot* of indirection involved in making JavaFX
 support such a wide range of underlying graphics systems but I do think
 there's a bug here.

 Will file a Jira if I can contribute a bit more than "feels slow" ;)


 Cheers,


 Chris


 On Sat, March 28, 2015 10:06, Robert Krüger wrote:

 This is consistent with what I am observing. Is this something that
 Oracle
 is aware of? Looking at Jira, I don't see that anyone is working on
 this:


 https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)

 Given that one of the main reasons to use JFX for me is to
 be able to develop with one code base for at least OSX and Windows and
 the official statement what JavaFX is for, i.e.

 JavaFX is a set of graphics and media packages that enables
 developers to design, create, test, debug, and deploy rich client
 applications that operate consistently across diverse platforms

 and the fact that this is clearly not the case currently (8u40) as
 soon as I do something else than simple forms, I run into
 performance/quality problems on the Mac, I am a bit unsure what to
 make of all that. Is Mac OSX
 a second-class citizen as far as dev resources are concerned?

 Tobi and Chris, have you filed Jira Issues on Mac graphics
 performance that can be tracked?

 I will file an issue with a simple test case and hope for the best.






 On 

Re: Canvas performance on Mac OS

2015-03-31 Thread Chris Newland
Hi Hervé,

That's a valid question :)

Probably because

a) All my non-UI graphics experience is with immediate-mode / raster systems

b) I'm interested in using JavaFX for particle effects / demoscene /
gaming so assumed (perhaps wrongly?) that scenegraph was not the way to go
for that due to the very large number of nodes.

Numbers for my Sierpinski filled triangle example:

System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M 1024 MB

java -Dprism.order=es2 -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 23
fps: 18
fps: 25
fps: 18
fps: 23
fps: 23
fps: 19
fps: 25

java -Dprism.order=sw -cp target/classes/
com.chrisnewland.demofx.standalone.Sierpinski
fps: 1
fps: 54
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60
fps: 60

There are never more than 2500 filled triangles on screen. JDK is 1.8.0_40
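
For reference, fps figures like the ones above can be produced with a counter along the following lines. This is only a sketch: it assumes it is fed the nanosecond timestamp that AnimationTimer.handle(long now) supplies, the FpsCounter class is made up for illustration, and it is not necessarily how the Sierpinski demo measures its frame rate.

```java
// Hypothetical frame counter, not taken from DemoFX.
final class FpsCounter {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private long windowStart = -1; // timestamp that opened the current window
    private int frames;            // frames counted since windowStart
    private int lastFps;           // frames counted in the last full window

    /** Call once per frame; returns true when a new fps value is ready. */
    boolean onFrame(long nowNanos) {
        if (windowStart < 0) {
            windowStart = nowNanos; // the first frame only opens the window
            return false;
        }
        frames++;
        if (nowNanos - windowStart >= ONE_SECOND_NANOS) {
            lastFps = frames;
            frames = 0;
            windowStart = nowNanos;
            return true;
        }
        return false;
    }

    int fps() {
        return lastFps;
    }

    public static void main(String[] args) {
        // Simulate a steady 60 Hz pulse instead of a real AnimationTimer.
        FpsCounter counter = new FpsCounter();
        for (int i = 0; i <= 180; i++) {
            if (counter.onFrame(i * 16_666_667L)) {
                System.out.println("fps: " + counter.fps());
            }
        }
    }
}
```

In a JavaFX application the onFrame call would sit at the top of the AnimationTimer.handle(long now) override, before any drawing.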

I would say there is a performance problem here? (or at least a need for
documentation so as to set expectations for gc.fillPolygon).

Best regards,

Chris




On Tue, March 31, 2015 22:00, Hervé Girod wrote:
 Why don't you use Nodes rather than Canvas ?


 Sent from my iPhone


 On Mar 31, 2015, at 22:31, Chris Newland cnewl...@chrisnewland.com
 wrote:


 Hi Jim,


 Thanks, that makes things much clearer.


 I was surprised how much was going on under the hood of GraphicsContext
  and hoped it was just magic glue that gave the best of GPU
 acceleration where available and immediate-mode-like simple rasterizing
 where not.

 I've managed to find an anomaly with GraphicsContext.fillPolygon where
 the software pipeline achieves the full 60fps but ES2 can only manage
 30-35fps. It uses lots of overlapping filled triangles so I expect it
 suffers from the problem you've described.

 SSCCE:
 https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java

 Was full frame rate canvas drawing an expected use case for JavaFX or
 would I be better off with Graphics2D?

 Thanks,


 Chris


 On Mon, March 30, 2015 20:04, Jim Graham wrote:
 Hi Chris,



 drawLine() is a very simple primitive that can be optimized with a
 GPU
 shader.  It either looks like a (potentially rotated) rectangle or a
 rounded rect - and we have optimized shaders for both cases.  A large
  number of drawLine() calls turns into simply accumulating a large
 vertex list and uploading it to the GPU with an appropriate shader
 which is very fast.

 drawPolygon() is a very complex operation that involves things like:

 - dealing with line joins between segments that don't exist for
   drawLine()
 - dealing with only rendering common points of intersection once

 To handle all of that complexity we have to involve a rasterizer that
  takes the entire collection of lines, analyzes the stroke attributes
 and interactions and computes a coverage mask for each pixel in the
 region. We do that in software currently for all pipelines.

 For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path
  rasterization.

 For the SW pipeline, drawLine is a simplified case of drawPolygon and
 so the overhead of lots of calls to drawLine() dominates its
 performance.

 I would expect ES2 to blow the SW pipeline out of the water with
 drawLine() performance (as long as there are no additional rendering
 primitives interspersed in the set of lines).

 But, both should be on the same footing for the drawPolygon case.
 Does
 the ES2 pipeline compare similarly (hopefully better than) the SW
 pipeline for the polygon case?

 One thing I noticed is that we have no optimized case for drawLine()
 on the SW pipeline.  It generates a path containing a single MOVETO
 and LINETO and feeds it to the generalized path rasterizer when it
 could instead compute the rounded/square rectangle and render it more
 directly.  If we added that support then I'd expect the SW pipeline to
 perform the set of drawLine calls faster than drawPolygon as well...

 ...jim



 On 3/28/15 3:22 AM, Chris Newland wrote:


 Hi Robert,



 I've not filed a Jira yet as I was hoping to find time to
 investigate thoroughly but when I saw your question I thought I'd
 better add my findings.

 I believe the issue is in the ES2Pipeline as if I run with
 -Dprism.order=sw then strokePolygon outperforms the series of
 strokeLine commands as expected:

 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m line Result:
 44fps



 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly Result:
 60fps



 Will see if I can find the root cause as I've got plenty more
 examples where ES2Pipeline performs horribly on my Mac which should
 have no problem throwing around a few thousand polys.

 I realise there's a *lot* of indirection involved in making JavaFX
 support such a wide range of underlying graphics systems but I do
 think there's a bug here.

 Will file a Jira if I can contribute a bit more than "feels slow" ;)

Re: Canvas performance on Mac OS

2015-03-30 Thread Jim Graham



On 3/30/15 12:04 PM, Jim Graham wrote:

drawPolygon() is a very complex operation that involves things like:

- dealing with only rendering common points of intersection once


An example of the distinction here - try a test case where you execute 
the exact same diagonal line primitive 1,000 times on top of itself 
(identical coordinates for all of them).


Then change the example to use a Polygon that goes from point A to point 
B and back, over itself 1,000 times.


The result of all of those lines will have jagged edges even though the 
lines themselves are antialiased because the partially filled pixels 
along the edges slowly accumulate opacity until their carefully blended 
edges get lost in the accumulated error.


The result of the polygon will be identical to just drawing an 
antialiased line from point A to point B because it is turned into a 
single coverage result by the software rasterizer.


Another similar example - set an opacity of 0.1 on all of those 
rendering calls.  The (multi-)drawLine example will look like an opaque 
line of 1.0 opacity, but the polygon will still look like it has an 
opacity of 0.1 because the coverages are accumulated across the entire 
polygon before any rendering occurs and so each pixel is only blended 
once...
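
The two cases above can be checked numerically. The following is a minimal sketch, assuming plain source-over blending of one colour onto an initially transparent pixel; the AlphaAccumulation class and its method are made up for illustration and are not Prism code.

```java
// Hypothetical illustration, not Prism code: models a single pixel under
// source-over blending, where each draw leaves
// alpha = srcAlpha + dstAlpha * (1 - srcAlpha).
final class AlphaAccumulation {

    /** Resulting alpha after blending the same source alpha 'times' times. */
    static double blendRepeatedly(double srcAlpha, int times) {
        double dst = 0.0; // start from a fully transparent pixel
        for (int i = 0; i < times; i++) {
            dst = srcAlpha + dst * (1.0 - srcAlpha); // one source-over blend
        }
        return dst;
    }

    public static void main(String[] args) {
        // 1,000 separate drawLine() calls at opacity 0.1: blended 1,000 times.
        System.out.println("interior, 1000 lines: " + blendRepeatedly(0.1, 1000));
        // One polygon: coverage is accumulated first, then blended once.
        System.out.println("interior, 1 polygon:  " + blendRepeatedly(0.1, 1));
        // An antialiased edge pixel with 25% coverage at full opacity.
        System.out.println("edge, 1000 lines:     " + blendRepeatedly(0.25, 1000));
        System.out.println("edge, 1 polygon:      " + blendRepeatedly(0.25, 1));
    }
}
```

After n blends the alpha is 1 - (1 - a)^n, so any partially covered pixel drifts towards fully opaque as n grows. That is why the repeated lines lose their soft edges while the single polygon keeps them.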


...jim


Re: Canvas performance on Mac OS

2015-03-30 Thread Jim Graham

Hi Chris,

drawLine() is a very simple primitive that can be optimized with a GPU 
shader.  It either looks like a (potentially rotated) rectangle or a 
rounded rect - and we have optimized shaders for both cases.  A large 
number of drawLine() calls turns into simply accumulating a large vertex 
list and uploading it to the GPU with an appropriate shader which is 
very fast.


drawPolygon() is a very complex operation that involves things like:

- dealing with line joins between segments that don't exist for drawLine()
- dealing with only rendering common points of intersection once

To handle all of that complexity we have to involve a rasterizer that 
takes the entire collection of lines, analyzes the stroke attributes and 
interactions and computes a coverage mask for each pixel in the region. 
 We do that in software currently for all pipelines.


For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs CPU path 
rasterization.


For the SW pipeline, drawLine is a simplified case of drawPolygon and so 
the overhead of lots of calls to drawLine() dominates its performance.


I would expect ES2 to blow the SW pipeline out of the water with 
drawLine() performance (as long as there are no additional rendering 
primitives interspersed in the set of lines).


But, both should be on the same footing for the drawPolygon case.  Does 
the ES2 pipeline compare similarly to (hopefully better than) the SW 
pipeline for the polygon case?


One thing I noticed is that we have no optimized case for drawLine() on 
the SW pipeline.  It generates a path containing a single MOVETO and 
LINETO and feeds it to the generalized path rasterizer when it could 
instead compute the rounded/square rectangle and render it more 
directly.  If we added that support then I'd expect the SW pipeline to 
perform the set of drawLine calls faster than drawPolygon as well...
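
To illustrate what "compute the rectangle directly" could mean, here is a rough sketch of the geometry for a line with flat (butt) caps. It is an assumption-laden illustration, not the Prism implementation; the LineAsRect class and its corners method are hypothetical names.

```java
// Hypothetical sketch, not Prism code: the outline of a stroked line segment
// with flat (butt) caps is a rectangle rotated to follow the line.
final class LineAsRect {

    /** Returns the four corners as {x0,y0, x1,y1, x2,y2, x3,y3}. */
    static double[] corners(double x1, double y1, double x2, double y2, double width) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        double len = Math.hypot(dx, dy);
        if (len == 0.0) {
            throw new IllegalArgumentException("zero-length line has no direction");
        }
        // Unit normal to the line, scaled to half the stroke width.
        double nx = -dy / len * (width / 2.0);
        double ny = dx / len * (width / 2.0);
        return new double[] {
            x1 + nx, y1 + ny, // start point, offset to one side
            x2 + nx, y2 + ny, // end point, same side
            x2 - nx, y2 - ny, // end point, other side
            x1 - nx, y1 - ny  // start point, other side
        };
    }

    public static void main(String[] args) {
        // A horizontal line of width 2 from (0,0) to (10,0).
        System.out.println(java.util.Arrays.toString(corners(0, 0, 10, 0, 2)));
    }
}
```

A square cap would additionally push each end outwards along the line by half the width, and a round cap needs the rounded-rect treatment instead, so this covers only the simplest of the cases mentioned above.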


...jim

On 3/28/15 3:22 AM, Chris Newland wrote:

Hi Robert,

I've not filed a Jira yet as I was hoping to find time to investigate
thoroughly but when I saw your question I thought I'd better add my
findings.

I believe the issue is in the ES2Pipeline as if I run with
-Dprism.order=sw then strokePolygon outperforms the series of strokeLine
commands as expected:

java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
Result: 44fps

java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
Result: 60fps

Will see if I can find the root cause as I've got plenty more examples
where ES2Pipeline performs horribly on my Mac which should have no problem
throwing around a few thousand polys.

I realise there's a *lot* of indirection involved in making JavaFX support
such a wide range of underlying graphics systems but I do think there's a
bug here.

Will file a Jira if I can contribute a bit more than "feels slow" ;)

Cheers,

Chris

On Sat, March 28, 2015 10:06, Robert Krüger wrote:

This is consistent with what I am observing. Is this something that
Oracle
is aware of? Looking at Jira, I don't see that anyone is working on this:

https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)

Given that one of the main reasons to use JFX for me is to be
able to develop with one code base for at least OSX and Windows and the
official statement what JavaFX is for, i.e.

JavaFX is a set of graphics and media packages that enables developers
to design, create, test, debug, and deploy rich client applications that
operate consistently across diverse platforms

and the fact that this is clearly not the case currently (8u40) as soon
as I do something else than simple forms, I run into performance/quality
problems on the Mac, I am a bit unsure what to make of all that. Is Mac
OSX
a second-class citizen as far as dev resources are concerned?

Tobi and Chris, have you filed Jira Issues on Mac graphics performance
that can be tracked?

I will file an issue with a simple test case and hope for the best.





On Fri, Mar 27, 2015 at 11:08 PM, Chris Newland
cnewl...@chrisnewland.com
wrote:



Possibly related:


I can reproduce a massive (90%) performance drop on OSX between drawing
a wireframe polygon on a Canvas using a series of gc.strokeLine(double
x1, double y1, double x2, double y2) commands versus using a single
gc.strokePolygon(double[] xPoints, double[] yPoints, int count)
command.

Creating the polygons manually with strokeLine() is significantly
faster using the ES2Pipeline on OSX.

This is reproducible in a little GitHub JavaFX benchmarking project
I've
created: https://github.com/chriswhocodes/DemoFX


Build with ant


Run with:


# use strokeLine
./run.sh -c 5000 -m line
result: 60 (sixty) fps


# use strokePolygon
./run.sh -c 5000 -m poly
result: 6 (six) fps


System is 2011 iMac 27 / Mavericks / 3.4GHz Core i7 / 20GB RAM /
Radeon
6970M 1024MB



Re: Canvas performance on Mac OS

2015-03-28 Thread Robert Krüger
I have filed this now:

https://javafx-jira.kenai.com/browse/RT-40377


On Sat, Mar 28, 2015 at 11:06 AM, Robert Krüger krue...@lesspain.de wrote:

 This is consistent with what I am observing. Is this something that Oracle
 is aware of? Looking at Jira, I don't see that anyone is working on this:


 https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)

 Given that one of the main reasons to use JFX for me is to be
 able to develop with one code base for at least OSX and Windows and the
 official statement what JavaFX is for, i.e.

 JavaFX is a set of graphics and media packages that enables developers to
 design, create, test, debug, and deploy rich client applications that
 operate consistently across diverse platforms

 and the fact that this is clearly not the case currently (8u40) as soon as
 I do something else than simple forms, I run into performance/quality
 problems on the Mac, I am a bit unsure what to make of all that. Is Mac OSX
 a second-class citizen as far as dev resources are concerned?

 Tobi and Chris, have you filed Jira Issues on Mac graphics performance
 that can be tracked?

 I will file an issue with a simple test case and hope for the best.




 On Fri, Mar 27, 2015 at 11:08 PM, Chris Newland cnewl...@chrisnewland.com
  wrote:

 Possibly related:

 I can reproduce a massive (90%) performance drop on OSX between drawing a
 wireframe polygon on a Canvas using a series of gc.strokeLine(double x1,
 double y1, double x2, double y2) commands versus using a single
 gc.strokePolygon(double[] xPoints, double[] yPoints, int count) command.

 Creating the polygons manually with strokeLine() is significantly faster
 using the ES2Pipeline on OSX.

 This is reproducible in a little GitHub JavaFX benchmarking project I've
 created: https://github.com/chriswhocodes/DemoFX

 Build with ant

 Run with:

 # use strokeLine
 ./run.sh -c 5000 -m line
 result: 60 (sixty) fps

 # use strokePolygon
 ./run.sh -c 5000 -m poly
 result: 6 (six) fps

 System is 2011 iMac 27 / Mavericks / 3.4GHz Core i7 / 20GB RAM / Radeon
 6970M 1024MB

 Looking at the code paths in javafx.scene.canvas.GraphicsContext:

 gc.strokeLine() maps to writeOp4(x1, y1, x2, y2, NGCanvas.STROKE_LINE)

 gc.strokePolygon() maps to writePoly(xPoints, yPoints, nPoints, true,
 NGCanvas.STROKE_PATH) which involves significantly more work with adding
 to and flushing a GrowableDataBuffer.

 I've not had time to dig any deeper than this but it's surely a bug when
 building a poly manually is 10x faster than using the convenience method.

 Cheers,

 Chris

 On Fri, March 27, 2015 21:26, Tobias Bley wrote:
  In my opinion the whole graphics performance on MacOSX isn’t good at
  all with JavaFX….
 
 
  On 27.03.2015 at 22:10, Robert Krüger krue...@lesspain.de wrote:
 
 
  The bad full screen performance is without the arcs. It is just one
  call to fillRect, two to strokeOval and one to fillOval, that's all. I
  will build a simple test case and file an issue.
 
  On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham james.gra...@oracle.com
  wrote:
 
 
  Hi Robert,
 
 
  Please file a Jira issue with a simple test case.  Arcs are handled
  as a generalized shape rather than via a predetermined shader, but it
  shouldn't be that slow.  Something else may be going on.
 
  Another test might be to replace the arcs with rectangles or ellipses
  and see if the performance changes...
 
  ...jim
 
 
 
  On 3/27/15 1:52 PM, Robert Krüger wrote:
 
 
  Hi,
 
 
  I have a super-simple animation implemented using AnimationTimer
  and Canvas
  where the canvas just performs a few draw operations, i.e. fills the
   screen with a color and then draws and fills 2-3 circles and I have
  already observed that each drawing operation I add, results in
  significant CPU load (e.g. when I draw  10 arcs in addition to the
  circles, the CPU load goes up to 30-40% on a MacBook Pro for a
  Canvas size of 600x600(!)).
 
 
  Now I tested the animation in full screen mode (only with a few
  circles) and playback is unusable for a serious application (very
  choppy). Is 2D canvas performance known to be very bad on Mac or am
  I doing something
  wrong? Are there workarounds for this?
 
  Thanks,
 
 
  Robert
 
 
 
 
 
  --
  Robert Krüger
  Managing Partner
  Lesspain GmbH & Co. KG
 
 
  www.lesspain-software.com
 
 





 --
 Robert Krüger
 Managing Partner
 Lesspain GmbH & Co. KG

 www.lesspain-software.com




-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Canvas performance on Mac OS

2015-03-28 Thread Robert Krüger
This is consistent with what I am observing. Is this something that Oracle
is aware of? Looking at Jira, I don't see that anyone is working on this:

https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)

Given that one of the main reasons for me to use JFX is to be able to
develop with one code base for at least OSX and Windows, and given the
official statement of what JavaFX is for, i.e.

"JavaFX is a set of graphics and media packages that enables developers to
design, create, test, debug, and deploy rich client applications that
operate consistently across diverse platforms"

and the fact that this is clearly not the case currently (8u40), since as
soon as I do anything other than simple forms I run into performance/quality
problems on the Mac, I am a bit unsure what to make of all that. Is Mac OSX
a second-class citizen as far as dev resources are concerned?

Tobi and Chris, have you filed Jira Issues on Mac graphics performance that
can be tracked?

I will file an issue with a simple test case and hope for the best.




On Fri, Mar 27, 2015 at 11:08 PM, Chris Newland cnewl...@chrisnewland.com
wrote:

 Possibly related:

 I can reproduce a massive (90%) performance drop on OSX between drawing a
 wireframe polygon on a Canvas using a series of gc.strokeLine(double x1,
 double y1, double x2, double y2) commands versus using a single
 gc.strokePolygon(double[] xPoints, double[] yPoints, int count) command.

 Creating the polygons manually with strokeLine() is significantly faster
 using the ES2Pipeline on OSX.
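
The "manual" approach described above amounts to the loop sketched below. The PolygonAsLines class and its LineSink interface are hypothetical names introduced for illustration so the loop can run without JavaFX on the classpath; in a real Canvas application the sink would be a method reference to GraphicsContext.strokeLine.

```java
// Hypothetical helper, not part of JavaFX: walks a closed polygon and emits
// one line segment per edge instead of a single polygon call.
final class PolygonAsLines {

    /** Receives one segment; gc::strokeLine has the same parameter list. */
    interface LineSink {
        void line(double x1, double y1, double x2, double y2);
    }

    static void strokePolygonAsLines(double[] xs, double[] ys, int n, LineSink sink) {
        for (int i = 0; i < n; i++) {
            int next = (i + 1) % n; // wrap around to close the polygon
            sink.line(xs[i], ys[i], xs[next], ys[next]);
        }
    }

    public static void main(String[] args) {
        double[] xs = {0, 100, 50};
        double[] ys = {0, 0, 80};
        // Print the segments; with JavaFX this would be gc::strokeLine.
        strokePolygonAsLines(xs, ys, 3, (x1, y1, x2, y2) ->
            System.out.println(x1 + "," + y1 + " -> " + x2 + "," + y2));
    }
}
```

As Jim's replies elsewhere in this thread point out, the two approaches are not pixel-identical: separate lines have no joins between segments, and pixels shared by two segments are blended twice.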

 This is reproducible in a little GitHub JavaFX benchmarking project I've
 created: https://github.com/chriswhocodes/DemoFX

 Build with ant

 Run with:

 # use strokeLine
 ./run.sh -c 5000 -m line
 result: 60 (sixty) fps

 # use strokePolygon
 ./run.sh -c 5000 -m poly
 result: 6 (six) fps

 System is 2011 iMac 27 / Mavericks / 3.4GHz Core i7 / 20GB RAM / Radeon
 6970M 1024MB

 Looking at the code paths in javafx.scene.canvas.GraphicsContext:

 gc.strokeLine() maps to writeOp4(x1, y1, x2, y2, NGCanvas.STROKE_LINE)

 gc.strokePolygon() maps to writePoly(xPoints, yPoints, nPoints, true,
 NGCanvas.STROKE_PATH) which involves significantly more work with adding
 to and flushing a GrowableDataBuffer.

 I've not had time to dig any deeper than this but it's surely a bug when
 building a poly manually is 10x faster than using the convenience method.

 Cheers,

 Chris

 On Fri, March 27, 2015 21:26, Tobias Bley wrote:
  In my opinion the whole graphics performance on MacOSX isn’t good at
  all with JavaFX….
 
 
  On 27.03.2015 at 22:10, Robert Krüger krue...@lesspain.de wrote:
 
 
  The bad full screen performance is without the arcs. It is just one
  call to fillRect, two to strokeOval and one to fillOval, that's all. I
  will build a simple test case and file an issue.
 
  On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham james.gra...@oracle.com
  wrote:
 
 
  Hi Robert,
 
 
  Please file a Jira issue with a simple test case.  Arcs are handled
  as a generalized shape rather than via a predetermined shader, but it
  shouldn't be that slow.  Something else may be going on.
 
  Another test might be to replace the arcs with rectangles or ellipses
  and see if the performance changes...
 
  ...jim
 
 
 
  On 3/27/15 1:52 PM, Robert Krüger wrote:
 
 
  Hi,
 
 
  I have a super-simple animation implemented using AnimationTimer
  and Canvas
  where the canvas just performs a few draw operations, i.e. fills the
   screen with a color and then draws and fills 2-3 circles and I have
  already observed that each drawing operation I add, results in
  significant CPU load (e.g. when I draw  10 arcs in addition to the
  circles, the CPU load goes up to 30-40% on a MacBook Pro for a
  Canvas size of 600x600(!)).
 
 
  Now I tested the animation in full screen mode (only with a few
  circles) and playback is unusable for a serious application (very
  choppy). Is 2D canvas performance known to be very bad on Mac or am
  I doing something
  wrong? Are there workarounds for this?
 
  Thanks,
 
 
  Robert
 
 
 
 
 
  --
  Robert Krüger
  Managing Partner
  Lesspain GmbH & Co. KG
 
 
  www.lesspain-software.com
 
 





-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Canvas performance on Mac OS

2015-03-28 Thread Chris Newland
Hi Robert,

I've not filed a Jira yet as I was hoping to find time to investigate
thoroughly but when I saw your question I thought I'd better add my
findings.

I believe the issue is in the ES2Pipeline as if I run with
-Dprism.order=sw then strokePolygon outperforms the series of strokeLine
commands as expected:

java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
Result: 44fps

java -cp target/DemoFX.jar -Dprism.order=sw
com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
Result: 60fps

Will see if I can find the root cause as I've got plenty more examples
where ES2Pipeline performs horribly on my Mac which should have no problem
throwing around a few thousand polys.

I realise there's a *lot* of indirection involved in making JavaFX support
such a wide range of underlying graphics systems but I do think there's a
bug here.

Will file a Jira if I can contribute a bit more than "feels slow" ;)

Cheers,

Chris

On Sat, March 28, 2015 10:06, Robert Krüger wrote:
 This is consistent with what I am observing. Is this something that
 Oracle
 is aware of? Looking at Jira, I don't see that anyone is working on this:

 https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)

 Given that one of the main reasons to use JFX for me is to be
 able to develop with one code base for at least OSX and Windows and the
 official statement what JavaFX is for, i.e.

 JavaFX is a set of graphics and media packages that enables developers
 to design, create, test, debug, and deploy rich client applications that
 operate consistently across diverse platforms

 and the fact that this is clearly not the case currently (8u40) as soon
 as I do something else than simple forms, I run into performance/quality
 problems on the Mac, I am a bit unsure what to make of all that. Is Mac
 OSX
 a second-class citizen as far as dev resources are concerned?

 Tobi and Chris, have you filed Jira Issues on Mac graphics performance
 that can be tracked?

 I will file an issue with a simple test case and hope for the best.





 On Fri, Mar 27, 2015 at 11:08 PM, Chris Newland
 cnewl...@chrisnewland.com
 wrote:


 Possibly related:


 I can reproduce a massive (90%) performance drop on OSX between drawing
 a wireframe polygon on a Canvas using a series of gc.strokeLine(double
 x1, double y1, double x2, double y2) commands versus using a single
 gc.strokePolygon(double[] xPoints, double[] yPoints, int count)
 command.

 Creating the polygons manually with strokeLine() is significantly
 faster using the ES2Pipeline on OSX.

 This is reproducible in a little GitHub JavaFX benchmarking project
 I've
 created: https://github.com/chriswhocodes/DemoFX


 Build with ant


 Run with:


 # use strokeLine
 ./run.sh -c 5000 -m line
 result: 60 (sixty) fps


 # use strokePolygon
 ./run.sh -c 5000 -m poly
 result: 6 (six) fps


 System is 2011 iMac 27 / Mavericks / 3.4GHz Core i7 / 20GB RAM /
 Radeon
 6970M 1024MB


 Looking at the code paths in javafx.scene.canvas.GraphicsContext:


 gc.strokeLine() maps to writeOp4(x1, y1, x2, y2, NGCanvas.STROKE_LINE)

 gc.strokePolygon() maps to writePoly(xPoints, yPoints, nPoints, true,
 NGCanvas.STROKE_PATH) which involves significantly more work with
 adding to and flushing a GrowableDataBuffer.

 I've not had time to dig any deeper than this but it's surely a bug
 when building a poly manually is 10x faster than using the convenience
 method.

 Cheers,


 Chris


 On Fri, March 27, 2015 21:26, Tobias Bley wrote:

 In my opinion the whole graphics performance on MacOSX isn’t
 good at all with JavaFX….


 Am 27.03.2015 um 22:10 schrieb Robert Krüger
 krue...@lesspain.de:



 The bad full screen performance is without the arcs. It is just one
  call to fillRect, two to strokeOval and one to fillOval, that's
 all. I will build a simple test case and file an issue.

 On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham
 james.gra...@oracle.com
 wrote:



 Hi Robert,



 Please file a Jira issue with a simple test case.  Arcs are
 handled as a generalized shape rather than via a predetermined
 shader, but it shouldn't be that slow.  Something else may be
 going on.

 Another test might be to replace the arcs with rectangles or
 ellipses and see if the performance changes...

 ...jim




 On 3/27/15 1:52 PM, Robert Krüger wrote:



 Hi,



 I have a super-simple animation implemented using
 AnimationTimer
 and Canvas where the canvas just performs a few draw operations,
 i.e. fills the screen with a color and then draws and fills 2-3
 circles and I have already observed that each drawing operation
 I add, results in
 significant CPU load (e.g. when I draw  10 arcs in addition to
 the circles, the CPU load goes up to 30-40% on a Mac Book Pro
 for a Canvas size of 600x600(!).



 Now I tested the animation in full screen mode (only with a 

Re: Canvas performance on Mac OS

2015-03-28 Thread Robert Krüger
On Sat, Mar 28, 2015 at 11:22 AM, Chris Newland cnewl...@chrisnewland.com
wrote:

 Hi Robert,

 I've not filed a Jira yet as I was hoping to find time to investigate
 thoroughly but when I saw your question I thought I'd better add my
 findings.

 I believe the issue is in the ES2Pipeline as if I run with
 -Dprism.order=sw then strokePolygon outperforms the series of strokeLine
 commands as expected:

 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
 Result: 44fps

 java -cp target/DemoFX.jar -Dprism.order=sw
 com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
 Result: 60fps

 Will see if I can find the root cause as I've got plenty more examples
 where ES2Pipeline performs horribly on my Mac which should have no problem
 throwing around a few thousand polys.

 I realise there's a *lot* of indirection involved in making JavaFX support
 such a wide range of underlying graphics systems but I do think there's a
 bug here.

 Will file a Jira if I can contribute a bit more than "feels slow" ;)

 Cheers,

 Chris


Great, thanks!


Re: Canvas performance on Mac OS

2015-03-27 Thread Jim Graham

Hi Robert,

Please file a Jira issue with a simple test case.  Arcs are handled as a 
generalized shape rather than via a predetermined shader, but it 
shouldn't be that slow.  Something else may be going on.


Another test might be to replace the arcs with rectangles or ellipses 
and see if the performance changes...
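
To compare such variants with numbers rather than by eye, one option is to count frames from the timestamps that AnimationTimer.handle(long now) delivers (in nanoseconds). The following is only a minimal sketch under that assumption: FpsMeter is a hypothetical helper, not part of JavaFX, and it is kept free of JavaFX imports so it can be run on its own with simulated timestamps.

```java
/** Hypothetical frame-rate counter; feed it the value passed to AnimationTimer.handle(long now). */
final class FpsMeter {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private boolean started;   // false until the first frame has been seen
    private long windowStart;  // timestamp at which the current one-second window began
    private int frames;        // frames counted in the current window

    /**
     * Records one frame. Returns the number of frames counted in the window
     * that just ended once at least one second has elapsed, or -1 otherwise.
     */
    int onFrame(long nowNanos) {
        if (!started) {
            started = true;
            windowStart = nowNanos; // the first frame only opens the window
            return -1;
        }
        frames++;
        if (nowNanos - windowStart >= ONE_SECOND_NANOS) {
            int fps = frames;
            frames = 0;
            windowStart = nowNanos;
            return fps;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulate a steady 60 Hz pulse instead of a real AnimationTimer.
        FpsMeter meter = new FpsMeter();
        for (int k = 0; k <= 120; k++) {
            int fps = meter.onFrame(k * 16_666_667L);
            if (fps >= 0) {
                System.out.println("fps: " + fps);
            }
        }
    }
}
```

In a real test case the draw calls and meter.onFrame(now) would both go inside handle(long now), so that swapping arcs for rectangles or ellipses changes only the drawing code while the measurement stays the same.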


...jim

On 3/27/15 1:52 PM, Robert Krüger wrote:

Hi,

I have a super-simple animation implemented using AnimationTimer and Canvas
where the canvas just performs a few draw operations, i.e. fills the screen
with a color and then draws and fills 2-3 circles and I have already
observed that each drawing operation I add results in significant CPU load
(e.g. when I draw 10 arcs in addition to the circles, the CPU load goes
up to 30-40% on a Mac Book Pro for a Canvas size of 600x600(!)).

Now I tested the animation in full screen mode (only with a few circles)
and playback is unusable for a serious application (very choppy). Is 2D
canvas performance known to be very bad on Mac or am I doing something
wrong? Are there workarounds for this?

Thanks,

Robert



Re: Canvas performance on Mac OS

2015-03-27 Thread Robert Krüger
The bad full screen performance is without the arcs. It is just one call to
fillRect, two to strokeOval and one to fillOval, that's all. I will build a
simple test case and file an issue.

On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham james.gra...@oracle.com wrote:

 Hi Robert,

 Please file a Jira issue with a simple test case.  Arcs are handled as a
 generalized shape rather than via a predetermined shader, but it shouldn't
 be that slow.  Something else may be going on.

 Another test might be to replace the arcs with rectangles or ellipses and
 see if the performance changes...

 ...jim


 On 3/27/15 1:52 PM, Robert Krüger wrote:

 Hi,

 I have a super-simple animation implemented using AnimationTimer and
 Canvas
 where the canvas just performs a few draw operations, i.e. fills the
 screen
 with a color and then draws and fills 2-3 circles and I have already
 observed that each drawing operation I add, results in significant CPU
 load
 (e.g. when I draw  10 arcs in addition to the circles, the CPU load goes
 up to 30-40% on a Mac Book Pro for a Canvas size of 600x600(!).

 Now I tested the animation in full screen mode (only with a few circles)
 and playback is unusable for a serious application (very choppy). Is 2D
 canvas performance known to be very bad on Mac or am I doing something
 wrong? Are there workarounds for this?

 Thanks,

 Robert




-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Canvas performance on Mac OS

2015-03-27 Thread Tobias Bley
In my opinion the whole graphics performance on MacOSX isn’t good at all with 
JavaFX….


 Am 27.03.2015 um 22:10 schrieb Robert Krüger krue...@lesspain.de:
 
 The bad full screen performance is without the arcs. It is just one call to
 fillRect, two to strokeOval and one to fillOval, that's all. I will build a
 simple test case and file an issue.
 
 On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham james.gra...@oracle.com wrote:
 
 Hi Robert,
 
 Please file a Jira issue with a simple test case.  Arcs are handled as a
 generalized shape rather than via a predetermined shader, but it shouldn't
 be that slow.  Something else may be going on.
 
 Another test might be to replace the arcs with rectangles or ellipses and
 see if the performance changes...
 
...jim
 
 
 On 3/27/15 1:52 PM, Robert Krüger wrote:
 
 Hi,
 
 I have a super-simple animation implemented using AnimationTimer and
 Canvas
 where the canvas just performs a few draw operations, i.e. fills the
 screen
 with a color and then draws and fills 2-3 circles and I have already
 observed that each drawing operation I add, results in significant CPU
 load
 (e.g. when I draw  10 arcs in addition to the circles, the CPU load goes
 up to 30-40% on a Mac Book Pro for a Canvas size of 600x600(!).
 
 Now I tested the animation in full screen mode (only with a few circles)
 and playback is unusable for a serious application (very choppy). Is 2D
 canvas performance known to be very bad on Mac or am I doing something
 wrong? Are there workarounds for this?
 
 Thanks,
 
 Robert
 
 
 
 
 -- 
 Robert Krüger
 Managing Partner
 Lesspain GmbH & Co. KG
 
 www.lesspain-software.com



Re: Canvas performance on Mac OS

2015-03-27 Thread Chris Newland
Possibly related:

I can reproduce a massive (90%) performance drop on OSX between drawing a
wireframe polygon on a Canvas using a series of gc.strokeLine(double x1,
double y1, double x2, double y2) commands versus using a single
gc.strokePolygon(double[] xPoints, double[] yPoints, int count) command.

Creating the polygons manually with strokeLine() is significantly faster
using the ES2Pipeline on OSX.

This is reproducible in a little GitHub JavaFX benchmarking project I've
created: https://github.com/chriswhocodes/DemoFX

Build with ant

Run with:

# use strokeLine
./run.sh -c 5000 -m line
result: 60 (sixty) fps

# use strokePolygon
./run.sh -c 5000 -m poly
result: 6 (six) fps

System is 2011 iMac 27 / Mavericks / 3.4GHz Core i7 / 20GB RAM / Radeon
6970M 1024MB

Looking at the code paths in javafx.scene.canvas.GraphicsContext:

gc.strokeLine() maps to writeOp4(x1, y1, x2, y2, NGCanvas.STROKE_LINE)

gc.strokePolygon() maps to writePoly(xPoints, yPoints, nPoints, true,
NGCanvas.STROKE_PATH) which involves significantly more work with adding
to and flushing a GrowableDataBuffer.

I've not had time to dig any deeper than this but it's surely a bug when
building a poly manually is 10x faster than using the convenience method.
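
For illustration, the "manual" variant amounts to walking the same point arrays that strokePolygon takes and issuing one strokeLine per edge. This is a minimal sketch, not the DemoFX code: PolygonAsLines and LineSink are hypothetical names, and LineSink stands in for gc.strokeLine so that the example runs without JavaFX on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of JavaFX or DemoFX. */
final class PolygonAsLines {

    /** Stand-in for gc.strokeLine(x1, y1, x2, y2); the method reference gc::strokeLine has this shape. */
    @FunctionalInterface
    interface LineSink {
        void strokeLine(double x1, double y1, double x2, double y2);
    }

    /** Emits one line per edge of the closed polygon, including the closing edge back to the first point. */
    static void strokeAsLines(double[] xPoints, double[] yPoints, int nPoints, LineSink sink) {
        if (nPoints < 2) {
            return; // nothing to connect
        }
        for (int i = 0; i < nPoints; i++) {
            int next = (i + 1) % nPoints; // wrap around to close the outline
            sink.strokeLine(xPoints[i], yPoints[i], xPoints[next], yPoints[next]);
        }
    }

    public static void main(String[] args) {
        double[] xs = {0, 10, 10};
        double[] ys = {0, 0, 10};
        List<String> edges = new ArrayList<>();
        // Record the edges instead of drawing them.
        strokeAsLines(xs, ys, 3, (x1, y1, x2, y2) ->
                edges.add(x1 + "," + y1 + " -> " + x2 + "," + y2));
        edges.forEach(System.out::println);
    }
}
```

With a real GraphicsContext this would be called as PolygonAsLines.strokeAsLines(xPoints, yPoints, nPoints, gc::strokeLine). The result is not guaranteed to be pixel-identical to strokePolygon, since separate lines end in line caps at each vertex rather than being connected by line joins.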

Cheers,

Chris

On Fri, March 27, 2015 21:26, Tobias Bley wrote:
 In my opinion the whole graphics performance on MacOSX isn’t good at
 all with JavaFX….


 Am 27.03.2015 um 22:10 schrieb Robert Krüger krue...@lesspain.de:


 The bad full screen performance is without the arcs. It is just one
 call to fillRect, two to strokeOval and one to fillOval, that's all. I
 will build a simple test case and file an issue.

 On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham james.gra...@oracle.com
 wrote:


 Hi Robert,


 Please file a Jira issue with a simple test case.  Arcs are handled
 as a generalized shape rather than via a predetermined shader, but it
 shouldn't be that slow.  Something else may be going on.

 Another test might be to replace the arcs with rectangles or ellipses
 and see if the performance changes...

 ...jim



 On 3/27/15 1:52 PM, Robert Krüger wrote:


 Hi,


 I have a super-simple animation implemented using AnimationTimer
 and Canvas
 where the canvas just performs a few draw operations, i.e. fills the
  screen with a color and then draws and fills 2-3 circles and I have
 already observed that each drawing operation I add, results in
 significant CPU load (e.g. when I draw  10 arcs in addition to the
 circles, the CPU load goes up to 30-40% on a Mac Book Pro for a
 Canvas size of 600x600(!).


 Now I tested the animation in full screen mode (only with a few
 circles) and playback is unusable for a serious application (very
 choppy). Is 2D canvas performance known to be very bad on Mac or am
 I doing something
 wrong? Are there workarounds for this?

 Thanks,


 Robert





 --
 Robert Krüger
 Managing Partner
 Lesspain GmbH & Co. KG


 www.lesspain-software.com