This is important.

Thanks, guys.

Sent from my iPhone
> On Apr 8, 2015, at 9:25 AM, Chris Newland <cnewl...@chrisnewland.com> wrote:
>
> Hi Jim,
>
> I'll post the verbose prism output from my iMac when I get home.
>
> Just tried this on my Linux workstation and the performance gap is the
> same between es2 and sw so I don't think it's an OSX issue.
>
> uname -a
> Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux
>
> "$JAVA_HOME/bin/java" -classpath target/DemoFX.jar com.chrisnewland.demofx.standalone.Sierpinski
> fps: 1
> fps: 20
> fps: 31
> fps: 32
> fps: 33
> fps: 35
> fps: 34
> fps: 33
>
> "$JAVA_HOME/bin/java" -Dprism.order=sw -classpath target/DemoFX.jar com.chrisnewland.demofx.standalone.Sierpinski
> fps: 1
> fps: 54
> fps: 56
> fps: 60
> fps: 59
> fps: 60
> fps: 61
> fps: 61
> fps: 60
>
> This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
> graphics card running driver 304.125
>
> Regards,
>
> Chris
>
>> On Wed, April 8, 2015 00:16, Jim Graham wrote:
>> OK, I took the time to put my rMBP on a diet yesterday and find room to
>> install a 10.10 partition. I get the same numbers for Sierpinski on 10.10,
>> so my theory that something changed in the OGL implementation for 10.10
>> doesn't hold water.
>>
>> But, I then tried it using the integrated graphics. I get really poor
>> performance using the integrated Intel 4000 graphics, but I get great
>> numbers on the discrete nVidia 650m. It makes sense that the Intel
>> graphics wouldn't be as powerful as the discrete graphics, but we
>> shouldn't be taxing it that much to make that big of a difference.
>>
>> Just to be sure - is that iMac a dual graphics system, or is it
>> all-AMD-all-the-time? You can see which GPU is being used if you run it
>> with -Dprism.verbose=true...
>>
>> ...jim
>>
>>> On 4/2/15 4:13 PM, Jim Graham wrote:
>>>
>>> On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw. Are you
>>> running a newer version of MacOS?
>>>
>>> ...jim
>>>
>>>> On 3/31/15 3:40 PM, Chris Newland wrote:
>>>>
>>>> Hi Hervé,
>>>>
>>>> That's a valid question :)
>>>>
>>>> Probably because
>>>>
>>>> a) All my non-UI graphics experience is with immediate-mode / raster
>>>> systems
>>>>
>>>> b) I'm interested in using JavaFX for particle effects / demoscene /
>>>> gaming so assumed (perhaps wrongly?) that scenegraph was not the way
>>>> to go for that due to the very large number of nodes.
>>>>
>>>> Numbers for my Sierpinski filled triangle example:
>>>>
>>>> System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M
>>>> 1024 MB
>>>>
>>>> java -Dprism.order=es2 -cp target/classes/ com.chrisnewland.demofx.standalone.Sierpinski
>>>> fps: 1
>>>> fps: 23
>>>> fps: 18
>>>> fps: 25
>>>> fps: 18
>>>> fps: 23
>>>> fps: 23
>>>> fps: 19
>>>> fps: 25
>>>>
>>>> java -Dprism.order=sw -cp target/classes/ com.chrisnewland.demofx.standalone.Sierpinski
>>>> fps: 1
>>>> fps: 54
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>>
>>>> There are never more than 2500 filled triangles on screen. JDK is
>>>> 1.8.0_40
>>>>
>>>> I would say there is a performance problem here? (or at least a need
>>>> for documentation so as to set expectations for gc.fillPolygon).
>>>>
>>>> Best regards,
>>>>
>>>> Chris
>>>>
>>>>> On Tue, March 31, 2015 22:00, Hervé Girod wrote:
>>>>>
>>>>> Why don't you use Nodes rather than Canvas ?
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>>> On Mar 31, 2015, at 22:31, Chris Newland <cnewl...@chrisnewland.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi Jim,
>>>>>>
>>>>>> Thanks, that makes things much clearer.
>>>>>>
>>>>>> I was surprised how much was going on under the hood of
>>>>>> GraphicsContext and hoped it was just magic glue that gave the
>>>>>> best of GPU acceleration where available and immediate-mode-like
>>>>>> simple rasterizing where not.
>>>>>>
>>>>>> I've managed to find an anomaly with GraphicsContext.fillPolygon
>>>>>> where the software pipeline achieves the full 60fps but ES2 can
>>>>>> only manage 30-35fps. It uses lots of overlapping filled triangles
>>>>>> so I expect suffers from the problem you've described.
>>>>>>
>>>>>> SSCCE:
>>>>>> https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java
>>>>>>
>>>>>> Was full frame rate canvas drawing an expected use case for
>>>>>> JavaFX or would I be better off with Graphics2D?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>> On Mon, March 30, 2015 20:04, Jim Graham wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> drawLine() is a very simple primitive that can be optimized
>>>>>>> with a GPU shader. It either looks like a (potentially rotated)
>>>>>>> rectangle or a rounded rect - and we have optimized shaders for
>>>>>>> both cases. A large number of drawLine() calls turns into simply
>>>>>>> accumulating a large vertex list and uploading it to the GPU
>>>>>>> with an appropriate shader which is very fast.
>>>>>>>
>>>>>>> drawPolygon() is a very complex operation that involves things
>>>>>>> like:
>>>>>>>
>>>>>>> - dealing with line joins between segments that don't exist for
>>>>>>>   drawLine()
>>>>>>> - dealing with only rendering common points of intersection once
>>>>>>>
>>>>>>> To handle all of that complexity we have to involve a
>>>>>>> rasterizer that takes the entire collection of lines, analyzes
>>>>>>> the stroke attributes and interactions and computes a coverage
>>>>>>> mask for each pixel in the region.
>>>>>>> We do that in software currently for all pipelines.
>>>>>>>
>>>>>>> For the ES2 pipeline Line.v.Poly is dominated by pure GPU vs
>>>>>>> CPU path rasterization.
>>>>>>>
>>>>>>> For the SW pipeline, drawLine is a simplified case of
>>>>>>> drawPolygon and so the overhead of lots of calls to drawLine()
>>>>>>> dominates its performance.
>>>>>>>
>>>>>>> I would expect ES2 to blow the SW pipeline out of the water
>>>>>>> with drawLine() performance (as long as there are no additional
>>>>>>> rendering primitives interspersed in the set of lines).
>>>>>>>
>>>>>>> But, both should be on the same footing for the drawPolygon
>>>>>>> case. Does the ES2 pipeline compare similarly (hopefully better
>>>>>>> than) the SW pipeline for the polygon case?
>>>>>>>
>>>>>>> One thing I noticed is that we have no optimized case for
>>>>>>> drawLine() on the SW pipeline. It generates a path containing a
>>>>>>> single MOVETO and LINETO and feeds it to the generalized path
>>>>>>> rasterizer when it could instead compute the rounded/square
>>>>>>> rectangle and render it more directly. If we added that support
>>>>>>> then I'd expect the SW pipeline to perform the set of drawLine
>>>>>>> calls faster than drawPolygon as well...
>>>>>>>
>>>>>>> ...jim
>>>>>>>
>>>>>>>> On 3/28/15 3:22 AM, Chris Newland wrote:
>>>>>>>>
>>>>>>>> Hi Robert,
>>>>>>>>
>>>>>>>> I've not filed a Jira yet as I was hoping to find time to
>>>>>>>> investigate thoroughly but when I saw your question I thought
>>>>>>>> I'd better add my findings.
>>>>>>>>
>>>>>>>> I believe the issue is in the ES2Pipeline as if I run with
>>>>>>>> -Dprism.order=sw then strokePolygon outperforms the series of
>>>>>>>> strokeLine commands as expected:
>>>>>>>>
>>>>>>>> java -cp target/DemoFX.jar -Dprism.order=sw com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
>>>>>>>> Result: 44fps
>>>>>>>>
>>>>>>>> java -cp target/DemoFX.jar -Dprism.order=sw com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
>>>>>>>> Result: 60fps
>>>>>>>>
>>>>>>>> Will see if I can find the root cause as I've got plenty more
>>>>>>>> examples where ES2Pipeline performs horribly on my Mac which
>>>>>>>> should have no problem throwing around a few thousand polys.
>>>>>>>>
>>>>>>>> I realise there's a *lot* of indirection involved in making
>>>>>>>> JavaFX support such a wide range of underlying graphics systems
>>>>>>>> but I do think there's a bug here.
>>>>>>>>
>>>>>>>> Will file a Jira if I can contribute a bit more than "feels
>>>>>>>> slow" ;)
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>> On Sat, March 28, 2015 10:06, Robert Krüger wrote:
>>>>>>>>>
>>>>>>>>> This is consistent with what I am observing. Is this
>>>>>>>>> something that Oracle is aware of?
>>>>>>>>> Looking at Jira, I don't see that anyone is working on this:
>>>>>>>>>
>>>>>>>>> https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)
>>>>>>>>>
>>>>>>>>> Given that one of the main reasons to use JFX for me is to
>>>>>>>>> be able to develop with one code base for at least OSX and
>>>>>>>>> Windows and the official statement what JavaFX is for, i.e.
>>>>>>>>>
>>>>>>>>> "JavaFX is a set of graphics and media packages that
>>>>>>>>> enables developers to design, create, test, debug, and
>>>>>>>>> deploy rich client applications that operate consistently
>>>>>>>>> across diverse platforms"
>>>>>>>>>
>>>>>>>>> and the fact that this is clearly not the case currently
>>>>>>>>> (8u40) as soon as I do something else than simple forms, I
>>>>>>>>> run into performance/quality problems on the Mac, I am a bit
>>>>>>>>> unsure what to make of all that. Is Mac OSX a second-class
>>>>>>>>> citizen as far as dev resources are concerned?
>>>>>>>>>
>>>>>>>>> Tobi and Chris, have you filed Jira Issues on Mac graphics
>>>>>>>>> performance that can be tracked?
>>>>>>>>>
>>>>>>>>> I will file an issue with a simple test case and hope for
>>>>>>>>> the best.
>>>>>>>>>
>>>>>>>>> On Fri, Mar 27, 2015 at 11:08 PM, Chris Newland
>>>>>>>>> <cnewl...@chrisnewland.com> wrote:
>>>>>>>>>
>>>>>>>>>> Possibly related:
>>>>>>>>>>
>>>>>>>>>> I can reproduce a massive (90%) performance drop on OSX
>>>>>>>>>> between drawing a wireframe polygon on a Canvas using a
>>>>>>>>>> series of gc.strokeLine(double x1, double y1, double x2,
>>>>>>>>>> double y2) commands versus using a single
>>>>>>>>>> gc.strokePolygon(double[] xPoints, double[] yPoints, int
>>>>>>>>>> count) command.
>>>>>>>>>>
>>>>>>>>>> Creating the polygons manually with strokeLine() is
>>>>>>>>>> significantly faster using the ES2Pipeline on OSX.
>>>>>>>>>>
>>>>>>>>>> This is reproducible in a little GitHub JavaFX
>>>>>>>>>> benchmarking project I've created:
>>>>>>>>>> https://github.com/chriswhocodes/DemoFX
>>>>>>>>>>
>>>>>>>>>> Build with ant
>>>>>>>>>>
>>>>>>>>>> Run with:
>>>>>>>>>>
>>>>>>>>>> # use strokeLine
>>>>>>>>>> ./run.sh -c 5000 -m line
>>>>>>>>>> result: 60 (sixty) fps
>>>>>>>>>>
>>>>>>>>>> # use strokePolygon
>>>>>>>>>> ./run.sh -c 5000 -m poly
>>>>>>>>>> result: 6 (six) fps
>>>>>>>>>>
>>>>>>>>>> System is 2011 iMac 27" / Mavericks / 3.4GHz Core i7 /
>>>>>>>>>> 20GB RAM / Radeon 6970M 1024MB
>>>>>>>>>>
>>>>>>>>>> Looking at the code paths in
>>>>>>>>>> javafx.scene.canvas.GraphicsContext:
>>>>>>>>>>
>>>>>>>>>> gc.strokeLine() maps to writeOp4(x1, y1, x2, y2,
>>>>>>>>>> NGCanvas.STROKE_LINE)
>>>>>>>>>>
>>>>>>>>>> gc.strokePolygon() maps to writePoly(xPoints, yPoints,
>>>>>>>>>> nPoints, true, NGCanvas.STROKE_PATH) which involves
>>>>>>>>>> significantly more work with adding to and flushing a
>>>>>>>>>> GrowableDataBuffer.
>>>>>>>>>>
>>>>>>>>>> I've not had time to dig any deeper than this but it's
>>>>>>>>>> surely a bug when building a poly manually is 10x faster
>>>>>>>>>> than using the convenience method.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On Fri, March 27, 2015 21:26, Tobias Bley wrote:
>>>>>>>>>>
>>>>>>>>>>> In my opinion the whole graphics performance on MacOSX
>>>>>>>>>>> isn't good at all with JavaFX….
>>>>>>>>>>>
>>>>>>>>>>>> On 27.03.2015 at 22:10, Robert Krüger
>>>>>>>>>>>> <krue...@lesspain.de> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> The bad full screen performance is without the arcs.
>>>>>>>>>>>> It is just one call to fillRect, two to strokeOval and
>>>>>>>>>>>> one to fillOval, that's all. I will build a simple test
>>>>>>>>>>>> case and file an issue.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham
>>>>>>>>>>>> <james.gra...@oracle.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Robert,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please file a Jira issue with a simple test case.
>>>>>>>>>>>>> Arcs are handled as a generalized shape rather than
>>>>>>>>>>>>> via a predetermined shader, but it shouldn't be that
>>>>>>>>>>>>> slow. Something else may be going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Another test might be to replace the arcs with
>>>>>>>>>>>>> rectangles or ellipses and see if the performance
>>>>>>>>>>>>> changes...
>>>>>>>>>>>>>
>>>>>>>>>>>>> ...jim
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/27/15 1:52 PM, Robert Krüger wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have a super-simple animation implemented using
>>>>>>>>>>>>>> AnimationTimer and Canvas where the canvas just
>>>>>>>>>>>>>> performs a few draw operations, i.e. fills the
>>>>>>>>>>>>>> screen with a color and then draws and fills 2-3
>>>>>>>>>>>>>> circles and I have already observed that each
>>>>>>>>>>>>>> drawing operation I add, results in significant CPU
>>>>>>>>>>>>>> load (e.g. when I draw < 10 arcs in addition to the
>>>>>>>>>>>>>> circles, the CPU load goes up to 30-40% on a Mac
>>>>>>>>>>>>>> Book Pro for a Canvas size of 600x600(!).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now I tested the animation in full screen mode
>>>>>>>>>>>>>> (only with a few circles) and playback is unusable
>>>>>>>>>>>>>> for a serious application (very choppy). Is 2D
>>>>>>>>>>>>>> canvas performance known to be very bad on Mac or am
>>>>>>>>>>>>>> I doing something wrong? Are there workarounds for
>>>>>>>>>>>>>> this?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Robert
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Robert Krüger
>>>>>>>>>>>> Managing Partner
>>>>>>>>>>>> Lesspain GmbH & Co. KG
>>>>>>>>>>>>
>>>>>>>>>>>> www.lesspain-software.com
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Robert Krüger
>>>>>>>>> Managing Partner
>>>>>>>>> Lesspain GmbH & Co. KG
>>>>>>>>>
>>>>>>>>> www.lesspain-software.com
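
For anyone reading the archive: the "build the poly manually" workaround Chris describes in the quoted thread (a series of strokeLine calls in place of one strokePolygon call) might look roughly like the sketch below. This is only an illustration, not code from DemoFX. The names PolygonAsLines, LineSink and strokePolygonAsLines are made up for the example, and LineSink is a stand-in so the snippet runs without JavaFX on the classpath. As Jim points out in the thread, separate lines get no line joins, so corners may not render identically to a stroked polygon.

```java
import java.util.ArrayList;
import java.util.List;

public class PolygonAsLines {

    /** Hypothetical stand-in for a line-drawing call such as gc::strokeLine. */
    @FunctionalInterface
    public interface LineSink {
        void line(double x1, double y1, double x2, double y2);
    }

    /**
     * Emits one line segment per polygon edge, including the closing edge
     * from the last vertex back to the first. Does nothing for fewer than
     * three vertices, since that is not a polygon.
     */
    public static void strokePolygonAsLines(double[] xPoints, double[] yPoints,
                                            int nPoints, LineSink sink) {
        if (nPoints < 3) {
            return;
        }
        for (int i = 0; i < nPoints; i++) {
            int next = (i + 1) % nPoints; // wraps around to close the outline
            sink.line(xPoints[i], yPoints[i], xPoints[next], yPoints[next]);
        }
    }

    public static void main(String[] args) {
        double[] xs = {0, 100, 50};
        double[] ys = {0, 0, 80};
        List<String> segments = new ArrayList<>();
        // Collect the segments as text instead of drawing them.
        strokePolygonAsLines(xs, ys, 3, (x1, y1, x2, y2) ->
                segments.add(x1 + "," + y1 + " -> " + x2 + "," + y2));
        segments.forEach(System.out::println);
    }
}
```

In a JavaFX application the sink could presumably be the method reference gc::strokeLine on a GraphicsContext, since strokeLine takes the same four doubles quoted in the thread. Whether this is actually faster will depend on the pipeline (es2 vs sw) and the hardware, as the numbers in the thread show.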