Re: [OpenJDK 2D-Dev] How does antialiasing with the OpenGL pipeline work?

2007-10-01 Thread Jim Graham
One thing the bug report doesn't seem to mention is that the tiles are 
32x32 (it implies it by talking about the 1024 values being copied), and 
that there is a function which quickly tells you whether a tile is all 
0s or all 1s so the renderer can either skip or do quicker fills of 
regions that are all inside or all outside the path.


The sun.java2d.pipe.AATileGenerator interface currently in the public 
sources defines the methods in that workflow and shows a sample of how 
they work.  You can see it being used in a real world setting in the 
AAShapePipe class in the same package...
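
For illustration, here is a minimal sketch of that consumer loop, loosely 
modeled on the tile walk in AAShapePipe; fillOpaque() and fillWithMask() 
are hypothetical placeholders for the renderer's fast and general fill 
paths, and access to the sun.java2d.pipe internals is assumed:

import sun.java2d.pipe.AATileGenerator;

public class TileConsumerSketch {
    // Hypothetical stand-ins for the renderer's solid and mask fills.
    static void fillOpaque(int x, int y, int w, int h) { }
    static void fillWithMask(int x, int y, int w, int h, byte[] mask) { }

    // Walks the bounding box tile by tile; getTypicalAlpha() lets the
    // renderer skip all-0x00 tiles and fast-fill all-0xff tiles without
    // ever touching the per-pixel coverage values.
    static void renderTiles(AATileGenerator aatg,
                            int x0, int y0, int x1, int y1) {
        int tw = aatg.getTileWidth();    // 32 in the current implementation
        int th = aatg.getTileHeight();   // 32 in the current implementation
        byte[] alpha = new byte[tw * th];   // the 1024 coverage values
        for (int y = y0; y < y1; y += th) {
            for (int x = x0; x < x1; x += tw) {
                int a = aatg.getTypicalAlpha();
                if (a == 0x00) {
                    aatg.nextTile();             // entirely outside: skip
                } else if (a == 0xff) {
                    aatg.nextTile();             // entirely inside: quick fill
                    fillOpaque(x, y, tw, th);
                } else {
                    aatg.getAlpha(alpha, 0, tw); // partial: fetch the mask
                    fillWithMask(x, y, tw, th, alpha);
                }
            }
        }
        aatg.dispose();
    }
}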


...jim

Clemens Eisserer wrote:

Hi Roman,


Hey come on! I'd like to know the answer too. Give us a pointer to this
bug report! ;-)

Of course not ... here it is:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6529101 ;)

Although I don't understand / it does not mention all the details, it
seems they have those alpha masks produced by Ductus and indeed upload
them to VRAM, which seems to cause performance problems for simple
operations covering large areas.
What I still don't know is how these alpha masks look in detail and
whether the normal rendering is done. I guess the mask is used as a
mask and then painted with the color ;)

Btw. I read in your blog that you've done a rasterizer - what's the
state of it, or of the renderer implemented by Jim? I am just looking for
a simple one to learn/research a bit in this topic.

Thanks, lg Clemens


Re: [OpenJDK 2D-Dev] Why does the OpenGL pipeline not use setupBlitVector?

2008-06-05 Thread Jim Graham
This is a problem that often happens when you don't have a prototype 
for the function.  The default argument-passing semantics in C for 
floating point is to promote floats to doubles unless a prototype in 
scope says that the parameters are floats.


Did you get the prototype correct, and did you make sure it was included?
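
For reference, here is a hedged reconstruction of the Java-side declaration 
implied by the (JIZZZIFF[J[F)V descriptor in the crash log below; the 
parameter names are taken from the C code and are otherwise illustrative:

class X11TextRenderer {
    // Descriptor (JIZZZIFF[J[F)V: long, int, 3 booleans, int, 2 floats,
    // long[], float[].  The two float slots are the jfloat origin values;
    // calling the C function without a prototype in scope makes the caller
    // promote them to double, which garbles them and shifts every argument
    // that follows.
    native void doDrawGlyphList(long dstData, int totalGlyphs,
                                boolean usePositions, boolean subPixPos,
                                boolean rgbOrder, int lcdContrast,
                                float glyphListOrigX, float glyphListOrigY,
                                long[] images, float[] positions);
}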

...jim

Clemens Eisserer wrote:

Hi again,

I now inserted printf-statements to see the value of the parameters
before and after passing them:
In X11TextRenderer:  numGlpyhs:10, usePositions:0,
subpixPos:0, rgbOrder:0, lcdc:140, glypx:100.50,
glyphy:400.50, Images: -1376605040, NULL
In  X11TextRenderer_md: numGlpyhs:10, usePositions:0, subpixPos:0,
rgbOrder:0, lcdc:140, glypx:0.00, glyphy:3.392578, Images: 0
Positions:1081673728

So somehow it looks like passing the jfloats causes trouble; after
commenting them out, everything works as expected:
In X11TextRenderer: numGlpyhs:10, usePositions:0, subpixPos:0,
rgbOrder:0, lcdc:140, glypx:100.50, glyphy:400.5, Images:
-1376605792
In _md: numGlpyhs:10, usePositions:0, subpixPos:0, rgbOrder:0,
lcdc:140, glypx:-0.00, glyphy:-0.00, Images: -1376605792,
Positions:0

Any idea why passing the jfloats fails?

Thank you in advance, lg Clemens


2008/6/3 Clemens Eisserer [EMAIL PROTECTED]:

Hi Dmitri,


 Did you try to run it with -Xcheck:jni (preferably on a fastdebug
 build)? What does it say?

Thanks for the hint, it claims the array-handle is not valid.

I added printf-statements and Hotspot is of course right, the
original object handle was != NULL, but the one I passed is zero.
The strange thing is that if I call a method in the same shared library
(fontmanager.so) the handle is passed correctly, but when calling into
libmawt.so the array-handle is NULL.
I pass the array-handles by value, is this ok?

For now I could work around that by simply calling
GetPrimitiveArrayCritical in the first method, however that's somehow
strange and I would like to understand where my mistake is, so I can
learn from it ;)

Thanks a lot for your patience, Clemens

One thing that also puzzles me is how the compiler knows about
AWTXRDrawGlyphList.
There's no header file which declares it - does the compiler guess?
This is how the code looks:

1.) In libmawt.so the method which is called and crashes:
void AWTXRDrawGlyphList
    (JNIEnv *env, jobject self,
     jlong dstData, jint numGlyphs, jboolean usePositions,
     jboolean subPixPos, jboolean rgbOrder, jint lcdContrast,
     jfloat glyphListOrigX, jfloat glyphListOrigY,
     jlongArray imgArray, jfloatArray posArray)
{
    jlong *images;
    images = (jlong *) (*env)->GetPrimitiveArrayCritical(env, imgArray, NULL);
}

2.)The method called from JNI in libfontmanager.so and a test-dummy method:
JNIEXPORT void JNICALL Java_sun_font_X11TextRenderer_doDrawGlyphList
 //Method called from JNI
   (JNIEnv *env, jobject xtr,
jlong dstData, jint totalGlyphs, jboolean usePositions,
   jboolean subPixPos, jboolean rgbOrder, jint lcdContrast,
   jfloat glyphListOrigX, jfloat glyphListOrigY,
   jlongArray images, jfloatArray positions)
{
 TESTDrawGlyphList(env, xtr, dstData, totalGlyphs, usePositions,
 //Does not crash
subPixPos, rgbOrder, lcdContrast, glyphListOrigX,
glyphListOrigY, images, positions);

 AWTXRDrawGlyphList(env, xtr, dstData, totalGlyphs, usePositions,
//Does crash, because images == NULL
subPixPos, rgbOrder, lcdContrast, glyphListOrigX,
glyphListOrigY, images, positions);
}

JNIEXPORT void TESTDrawGlyphList //DummyTestMethod
   (JNIEnv *env, jobject self,
jlong dstData, jint numGlyphs, jboolean usePositions,
   jboolean subPixPos, jboolean rgbOrder, jint lcdContrast,
   jfloat glyphListOrigX, jfloat glyphListOrigY,
   jlongArray imgArray, jfloatArray posArray)
{
   jlong *images;
    images = (jlong *) (*env)->GetPrimitiveArrayCritical(env, imgArray, NULL);
}

These are the outputs when I run the code with -Xcheck:jni:
Pointer: -1388390664 - Value of imgArray-handle before passed to the
2nd method in libmawt.so
Pointer: 0 - Passed value
FATAL ERROR in native method: Non-array passed to JNI array operations

and this is without -Xcheck:jni:
V  [libjvm.so+0x24252b]
C  [libmawt.so+0x1fc81]  AWTXRDrawGlyphList+0xa1
C  [libfontmanager.so+0xb393]
Java_sun_font_X11TextRenderer_doDrawGlyphList+0x113
j  sun.font.X11TextRenderer.doDrawGlyphList(JIZZZIFF[J[F)V+0

So the first call succeeds (it basically does exactly the same thing, it's
just in another shared library), but the second time the parameter is
not passed :-/



Re: [OpenJDK 2D-Dev] Rotation per-pixel accuracy

2008-06-06 Thread Jim Graham

Good to hear it as it avoids some hard decisions.

Per-pixel consistency is always the desired goal, but sometimes you have 
to look at what is possible to accomplish with the APIs we depend on and 
make some hard decisions.  If it would take a workaround technique 
running 100x as long to get per-pixel accuracy, we start looking the 
other way; when it takes only 20% longer, we usually go for it; and 
somewhere in between is a line we've never really drawn very well.


And then there is always the possibility that a fresh approach will 
discover a new way to get per-pixel accuracy without any performance 
hits and so we sometimes stall on the hard question waiting for someone 
to have a flash of inspiration.  :-(


So, all in all, good to hear that we don't have to make a decision 
here... ;-)


...jim

Clemens Eisserer wrote:

Hi again,


Are there rules that rendering has to be per-pixel consistent, and with
what should I compare to see whether my implementation works correctly?

Sorry about the traffic ... it seems different rounding errors were
the cause of the different results:
the rotated text was drawn to the BI at position 0/20, whereas it
was rendered at 100/100 to the screen.

With adjusted positions I get consistent rendering with rotation also.

Sorry for the traffic :-/

Thanks, Clemens


Re: [OpenJDK 2D-Dev] Repository, Accelerating blits with EA

2008-07-21 Thread Jim Graham

Clemens Eisserer wrote:

Extra alpha has the same behavior for all AlphaComposite instances.  In a
nutshell, the extra alpha value gets logically multiplied with the source
before the actual compositing operation.  The AlphaComposite docs explain
this process in great detail (look for the A[sub]ac factor):
http://java.sun.com/javase/6/docs/api/java/awt/AlphaComposite.html
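
As a quick, self-contained illustration of that factor (a sketch, not code 
from the bug report):

import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class ExtraAlphaSketch {
    public static void main(String[] args) {
        BufferedImage dst = new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2d = dst.createGraphics();
        g2d.setColor(Color.WHITE);
        g2d.fillRect(0, 0, 1, 1);
        // The 0.5f extra alpha is multiplied into the source before the
        // SrcOver blend, exactly as if the source were half transparent.
        g2d.setComposite(AlphaComposite.getInstance(AlphaComposite.SRC_OVER, 0.5f));
        g2d.setColor(Color.BLACK);
        g2d.fillRect(0, 0, 1, 1);
        g2d.dispose();
        // Black at half coverage over opaque white: mid gray, roughly
        // 0xFF7F7F7F (the exact value depends on rounding).
        System.out.printf("0x%08X%n", dst.getRGB(0, 0));
    }
}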

Thanks for the explanation.
I guess the reason for the behaviour I see is an xserver bug ... I am
not really sure what to do about it yet.


One thing that might explain the difference is whether or not the opaque 
destinations are considered premultiplied.  I believe that we 
consider them non-premultiplied, in which case the extra alpha is 
multiplied in and the result is stored to the destination, which involves 
dividing the alpha back out == no change.  If the system treats the 
destinations as premultiplied then it multiplies the alpha into the 
color, then stores the multiplied (which looks faded) result into the 
destination...


...jim


Re: [OpenJDK 2D-Dev] Optimizing pixmap reads and writes

2008-11-04 Thread Jim Graham

Hi Roman,

This isn't well documented in SurfaceData.h, but there is a set of flags 
SD_LOCK_NEED_PIXELS which indicates if you need to do the read.  This 
is used in X11SurfaceData.c, but that file has so many ifdefs and 
alternate pixel access modes now that you really have to dig through it 
to see this.  It is also used in GDIWindowSurfaceData.cpp which is a 
little less complicated than the X11 counterpart, but not by much.


SD_LOCK_NEED_PIXELS is an OR of the READ and the PARTIAL flags. 
Primitives that know that they will be obliterating the entire rectangle 
that they are asking to lock will use SD_LOCK_WRITE without the PARTIAL 
flag.  If they cannot guarantee that they will obliterate the 
destination then they use the PARTIAL flag (in addition to WRITE) so 
that the read will be triggered.
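
A sketch of that logic, with illustrative bit values (the authoritative 
definitions live in SurfaceData.h):

public class LockFlagsSketch {
    // Bit values are illustrative; see SurfaceData.h for the real ones.
    static final int SD_LOCK_READ    = 1 << 0;
    static final int SD_LOCK_WRITE   = 1 << 1;
    static final int SD_LOCK_PARTIAL = 1 << 2;
    // A read is required if the primitive reads the destination (READ)
    // or writes only part of the locked rectangle (PARTIAL).
    static final int SD_LOCK_NEED_PIXELS = SD_LOCK_READ | SD_LOCK_PARTIAL;

    static boolean needPixels(int lockFlags) {
        return (lockFlags & SD_LOCK_NEED_PIXELS) != 0;
    }

    public static void main(String[] args) {
        // Obliterates the whole rectangle: no read needed.
        System.out.println(needPixels(SD_LOCK_WRITE));                   // false
        // Cannot guarantee full coverage: the read is triggered.
        System.out.println(needPixels(SD_LOCK_WRITE | SD_LOCK_PARTIAL)); // true
    }
}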


Hope that helps...

...jim

Roman Kennke wrote:

I'm currently implementing a SurfaceData for VxWorks/WindML.
Unfortunately, this graphics library (WindML) doesn't provide me direct
framebuffer access. Therefore I have to perform read and write
operations for rendering operations (at least, for images and likewise
non-primitives). I was thinking that in many cases I can avoid reading
the surface data (i.e. for rendering opaque images) or in some cases
writing (when transferring the surface data to another incompatible
surface). So I implemented my GetRasInfo like this (pseudocode):

malloc(pixels, size);
if (lockFlags & SD_LOCK_READ) {
  readPixelsFromSurface();
}

and the Release function:

if (lockFlags & SD_LOCK_WRITE) {
  writePixelsToSurface();
}
free(pixels);

Unfortunately, this doesn't work well. When rendering translucent or
bitmask images, it does NOT set the SD_LOCK_READ flag, and therefore
I don't read the surface pixels here, resulting in uninitialized
background for these images. This means it only renders correctly if I
ignore the SD_LOCK_READ flag and read every time, even if I wouldn't
need to.

My question is: did I get something wrong in my understanding of the
flags? Or is this a bug? Or just something that hasn't been
implemented/optimized yet, because it isn't needed on OpenJDK's primary
platforms? (Although, I would think that at least for non-DGA surfaces
it would be a nice little optimization, at least on X11.)

/Roman



Re: [OpenJDK 2D-Dev] [PATCH] 4494651: Wrong width and height in BufferedImage GraphicsConfiguration objects

2009-02-23 Thread Jim Graham

Torsten Landschoff wrote:
Hi Jim, 


On Tue, Feb 17, 2009 at 06:27:43PM -0800, Jim Graham wrote:
The width and height of a GraphicsConfig is essentially irrelevant  
information.  If you get the GraphicsConfig of a component, it doesn't  


Why is there a method then to query irrelevant information?


The size of a GC is very relevant for screen GCs - it defines the size 
of the device that the component is being rendered on.  Therefore the 
GC.getBounds method is somewhat useful for those GCs.  That was the use 
case for which the method was created.


The GCs from BufferedImages are another story.  There is no real 
device associated with a BufferedImage, and any virtual device that you 
can conceive for it doesn't necessarily have a size.  The closest 
thing to a size of the destination device would be as big as you have 
memory, or whatever limits (if any) the rendering engine's algorithms 
have.  In particular, that method is not defined as returning the size 
of the thing it came from; it is defined as returning the size of the 
universe in/on which that thing lives, and the size of the 
BufferedImage is not a relevant or analogous piece of information.


The return value of GC.getBounds() had a use in mind when it was created 
- it just doesn't happen to be the use that it looks like you want to 
make of it.


tell you how big that component was, so why should the GC of a BufImg  
bear any relation to the dimensions of the image?


Why would the graphics configuration contain the size of the component?


It doesn't contain it.  That is what I said and you quoted.

The GC from a component is not tied to the size of the component.

The GC from a BufferedImage should therefore not be tied to the size of 
the image.


The use case for me was to fix a drawing routine which made two passes 
over the input data to have the second pass paint over the first. This 
turned out to be quite slow with the main time going into loading the 
data. So I optimized it to paint in one pass, by using a BufferedImage 
to paint the overlay data and merge it later. The only object that 
could give me the size of the output Graphics device was - surprise -
the Graphics2D instance.

This works fine when drawing directly to the screen, but unfortunately 
not when double buffering is used. I think this is a very valid use case.


I think I see your point here, but I think it is a red herring. 
You say that the G2D tells you the size of the output screen device.  I 
suppose that is true, but that information does not tell you how big the 
component on that screen is.  So, if you are rendering full 
screen then a by-product of the GC returning the size of the device is 
that it also tells you the size of the thing you are rendering to; but 
if you are rendering to just a window or component, that information is 
not useful since you really need to know the bounds of the 
component/widget/window and it doesn't give you that.


So, a universally useful piece of information that tells you the size of 
the thing you are rendering to should come from somewhere other than 
the size of the GC in a Graphics2D.  Note that we sort of provide that 
information in another form.  When your paint(), or paintComponent() 
method is called, we have set the clip to the bounds of the component 
and so doing a Graphics.getClipBounds() will give you the information 
you want.  Unfortunately, I don't think the clip is set if you call 
Component.getGraphics() or Image.getGraphics() and there is no relevant 
and related Graphics.getDeviceBounds(), though maybe there should be?
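
A minimal sketch of that workaround (assuming a Swing component; the 
overlay sizing is hypothetical):

import java.awt.Graphics;
import java.awt.Rectangle;
import javax.swing.JComponent;

class OverlaySketch extends JComponent {
    @Override
    protected void paintComponent(Graphics g) {
        // When the toolkit invokes this, the clip has been set to the
        // component's bounds, so this reports the destination size even
        // when painting into a double buffer.  (On a partial repaint the
        // clip may be smaller, which is still the region to cover.)
        Rectangle r = g.getClipBounds();
        // ... size an offscreen overlay image to r.width x r.height ...
    }
}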


Historically, the bounds of a GC of a BufferedImage has never reliably 
returned similar information, and I don't think attempting to overload 
that method on GC is a good way to start providing this information.


Thus, I do not wish to pursue any further any attempts to make the GC a 
useful item for describing the dimensions of a particular Graphics2D's 
destination.


If anything, I'd fix it by having it report some fixed bogus (positive,  
large) dimensions rather than the dimensions of the first image that it  
was created from...


Why? Code calling it will only fall over with that, perhaps even worse 
than the status quo when expecting valid values. 


Why would such code fall over?  What use have developers been making of 
this, essentially random (random for BufferedImages) data in the past?


This information hasn't been useful in the past, so changing it will not 
break any correctly written programs.


The bounds information for a GC has not been semantically tied to the 
dimensions of a Graphics2D destination in the past, but it has had cases 
where it coincidentally returned bounds that happened to meet that need 
(i.e. full screen rendering).  The coincidence there was not by design 
and it cannot be extended to all cases so that accidental synergy of 
that one use case should not be turned into a specification.


Re: [OpenJDK 2D-Dev] More incompatibilities

2009-03-04 Thread Jim Graham

This is almost there.  A couple of points about the solution, though:

- If you skip the MOVETO then you need to make sure that you later emit 
an lsink.moveTo, otherwise the lsink object will complain about bad 
state.  If you look in ProcessPath.c you will see that a "skip" boolean 
is set whenever a moveto is skipped (bad name - I would have called it 
"movetoSkipped" or something) and that tells the other path cases to 
turn their point into a moveto if they need to.


- If you never got any path segments because you skipped them all, make 
sure that calling lsink.close() isn't a problem.  Note that if you get a 
CLOSE as the first thing in the path then it's OK to have lsink.close() 
throw an exception, but now that you are skipping coordinates, the 
burden is on you that if you skipped some coordinates, don't call 
close() in an invalid state because the incoming path state *was* 
correct - it's your edited path that got mangled down to just a close(), 
right?
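
A minimal sketch of that bookkeeping (lines and moves only, for brevity; 
quads and cubics follow the same pattern):

import java.awt.geom.Path2D;
import java.awt.geom.PathIterator;

public class FiniteFilterSketch {
    // Drops segments with non-finite coordinates; the first finite point
    // after a skipped run is re-emitted as a moveTo, and close() is only
    // forwarded while the subpath is in a valid state.
    static Path2D.Float filterNonFinite(PathIterator pi) {
        Path2D.Float out = new Path2D.Float(pi.getWindingRule());
        float[] c = new float[6];
        boolean skipped = false;      // the "skip" flag from ProcessPath.c
        boolean haveSubpath = false;
        for (; !pi.isDone(); pi.next()) {
            int type = pi.currentSegment(c);
            if (type == PathIterator.SEG_CLOSE) {
                if (haveSubpath && !skipped) {
                    out.closePath();          // only close a valid subpath
                }
            } else if (type == PathIterator.SEG_MOVETO ||
                       type == PathIterator.SEG_LINETO) {
                if (!Float.isFinite(c[0]) || !Float.isFinite(c[1])) {
                    skipped = true;           // drop this segment
                } else if (type == PathIterator.SEG_MOVETO
                           || skipped || !haveSubpath) {
                    out.moveTo(c[0], c[1]);   // restart the subpath cleanly
                    skipped = false;
                    haveSubpath = true;
                } else {
                    out.lineTo(c[0], c[1]);
                }
            } // quads/cubics omitted in this sketch
        }
        return out;
    }
}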


- Finally, we should consider what we should do with huge coordinates 
that aren't infinite.  Why reject infinity, but not maxfloat-1?  The 
Ductus and ProcessPath.c pipelines reject all coordinates outside of 
maxfloat/2 and maxfloat/4 respectively which are huge compared to the 
range of S15_16, but they aren't infinite.  Also, those pipelines can 
handle huge float values, but values greater than those limits have the 
potential of being driven to infinity by doing things like (a+b)/2 
where the result will be infinity if a and b are more than inf/2.  I'm 
not sure what the best strategy for our S15_16 code is, though, in 
the short term.  Perhaps long term we want to have all path processing 
done in float and only the inner loops done in fract, but we aren't 
there yet... :-(


...jim

Roman Kennke wrote:

Hi Jim,

I think the shape iterators used in the other pipelines (which should be 
visible as it was code that we wrote, even if it isn't used for Pisces) 
took a more flexible approach, testing each segment for NaN and overflow 
and ignoring individual segments until the shape became finite again. 
This happens somewhere in the src/share/classes/sun/java2d/pipe classes...


Ah cool. In this case we can implement it quite easily as in the
attached patch. Seems the output is 1:1 the same as with the non-free
JDK6.

/Roman


...jim

Roman Kennke wrote:

Hi again,


3. NotANumberTest: Double.NaN isn't handled gracefully.

The problem here is that the renderer in OpenJDK was originally written
for Java ME and uses fixed point arithmetic. I can't think of a quick fix,
because shapes are processed by iterating over them, this means, by the
time we hit the NaN, we might already have processed (==rendered) some
of the shape, but your test seems to suggest that you expect nothing to
be rendered in this case. The specification doesn't say anything about
this particular problem (at least I can't find anything). One solution
would be to pre-check all the incoming shapes for NaN or other invalid
values (infinity, etc) and not go into the iteration at all. But this
seems like quite a big overhead to me. We could also make the
floating-fixed conversion to throw an exception, that we would have to
catch higher up in the call tree and rollback what has already been
rendered (which doesn't seem easy either, because in the case of
strokeTo() this lies outside of the pisces renderer).

/Roman


Re: [OpenJDK 2D-Dev] How to determine if SurfaceType supports transparency?

2009-06-07 Thread Jim Graham
That isn't currently possible, but it sounds like a useful thing to add. 
 One problem is that there are some types that know that they have 
alpha, others that know that they do not, and still others which are too 
general and may have alpha or may not, so how do you encapsulate that 
information in a query?


For example, IntArgb does have alpha, IntRgb does not, but AnyIntDcm may 
or may not have alpha.


If we simply have ST implement Transparency then I suppose the types 
that are too general could simply return Translucent in order to avoid 
the promise of opacity.  It's the safe answer since Translucent 
doesn't guarantee that the pixels themselves will not be opaque - just 
that they have the option to be non-opaque...
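
A hypothetical sketch of that idea (not the real SurfaceType API; the 
enum and its names are illustrative):

import java.awt.Transparency;

enum SurfaceTypeTransparency implements Transparency {
    INT_ARGB(Transparency.TRANSLUCENT),     // known to have alpha
    INT_RGB(Transparency.OPAQUE),           // known to have no alpha
    ANY_INT_DCM(Transparency.TRANSLUCENT);  // may or may not: play it safe

    private final int transparency;

    SurfaceTypeTransparency(int t) {
        transparency = t;
    }

    @Override
    public int getTransparency() {
        return transparency;
    }
}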


...jim

Clemens Eisserer wrote:

Hello,

Is it possible to determine whether a SurfaceType supports transparency?
I know it's possible with SurfaceData, but I need to know at
loop-registration time, when I only have access to SurfaceTypes, of
course.

Thank you in advance, Clemens


Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-06-15 Thread Jim Graham

Hi Mario,

How are the drawGlyphList methods called when the loops is null?  I ask 
because they are only ever installed on the SunGraphics2D object by 
virtue of a call to validatePipe() and all calls to validatePipe() 
should set the loops.  So, there is a bug somewhere that causes these 
loops to be installed without a full validate process.


As I said in the email you quoted, all calls from pipelines 
(GlyphListLoopPipe and AATextRenderer are both pipeline objects) should 
be safe because all calls to validatePipe() should set the loops.


Having said that I note that there are some pipelines that do not 
require loops and it would be OK for a call to validate that only uses 
such non-loop-based pipelines to leave the loops field uninitialized, 
but all validate calls which use loop-based pipes must update the loops 
field.


Eliminating all of those uses of loops we then fall into the case where 
the usage in checkFontInfo is the only usage that does not occur from a 
pipeline...


...jim

Mario Torre wrote:

On Wed, 10/06/2009 at 03.02 -0700, Jim Graham wrote:
What is the need for this fix?  Is there a bug being fixed here other 
than you don't like the look of the code?


The reason I ask is that your fix requires a method call with a test for 
every graphics operation.  I'd prefer if we didn't add overhead unless 
it was necessary for correct function.


Also, the partially initialized state issue - while that technique can 
in general lead to issues, I don't think it is problematic here.  If 
that is the concern then we could target removing just that line.  Any 
references to the field from pipeline objects is safe since they won't 
be installed until a validate() is called which always sets the loops. 
The only suspect reference to loops would be the call from 
checkFontInfo() which might be invoked before the first validate() so it 
would need the protection against null, but almost no other piece of 
code you've modified can ever encounter a null loops field (or if it 
does then some validate() code forgot to set it)...


...jim


Hi Jim!

Thanks for the kind reply.

I was tracking a bug in our SDL backend when I put my eyes on this code,
but the bug itself is not related, just thought that this kind of code
leads to unexpected errors (I shoot myself in the foot already with this
stuff sometime ago...).

I noticed that the references to this variable are not always protected
by a call to validate though. If I don't set the loop in the
constructor, and don't check for null in the getter, I get NPE in
various places, for example:

sun.java2d.pipe.GlyphListLoopPipe.drawGlyphList
sun.java2d.pipe.AATextRenderer.drawGlyphList

I get those two with the Java2D demo, but there are other places that
blindly just use the loop (in fact, everywhere loops is referenced, it
is just used without checking for null), and they trust on the fact that
loops is just never null.

There are two solution to this in my mind, either check for null
everywhere loops is used (which is what I proposed with the getter) or
selectively check for null in places we know it may be null (for example
either in AATextRenderer.drawGlyphList, GlyphListLoopPipe.drawGlyphList,
or in general in the various calls inside SunGraphics2D that end up in
those methods).

The third solution, that works so far, is to protect checkFontInfo()
with a null check like you proposed, because in fact checkFontInfo is
called before validate, and initialise there a valid instance of the
RenderLoops, if they are null, which is the best option for
performances, too.

To be honest, those solutions looks a bit hacky to me, because we end up
checking in places where it may be used other than in places where it
is actually used, but for the sake of saving few bytecodes and a method
invocation, maybe we can go this path. Or, because it seems I opened a
can of worm, I understand if you don't want to fix this issue at all ;)

I have prepared a new webrew with the proposed fix, where I check for
null in checkFontInfo:

http://cr.openjdk.java.net/~neugens/100068/webrev.02/

I added a small comment to make clear that this guy may be null if not
set via validate, checkFontInfo or setLoops.

Hope this helps!

Cheers,
Mario


Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-06-16 Thread Jim Graham
My design goal was "doesn't corrupt anything if run from multiple 
threads" (in other words, don't require synchronization in any common 
places just to keep it functional, and don't require it to be used from a 
particular thread), but only correct behavior if run from 1 thread at a 
time.


In other words, it can be used by multiple threads in sequence, but not 
simultaneously.


...jim

Roman Kennke wrote:

Hi there,

first of all, let me say that - especially in the light of this thread - 
I support Mario's change (removal of this one line from the constructor) 
for the following reasons:

- It should not be necessary: as you say, this field should always be 
initialized before use by validatePipe().

- If it's not, it's most likely a bug.
- Therefore this line is only good for hiding bugs.

One thing is not quite clear to me:

- A race condition where thread A is validating the pipeline and 
installs the pipeline objects but hasn't yet reached the code to 
install the loops while thread B starts rendering using that SG2D thus 
invoking an operation on a partially initialized pipeline - in this 
case the NPE is appropriate and allowed since we don't support 
multi-threaded use of the Graphics2D objects.


I was always under the assumption that Java2D should be thread safe. 
And in many places we make sure it is, e.g. in the SurfaceData objects. 
But now you say it isn't. So what is the real story about Java2D and 
thread safety? Is it only supposed to be thread safe when each thread 
gets its own Graphics2D object? I think I remember back in the GNU 
Classpath days we had a test case that shared a Graphics (no Graphics2D 
back then..) object between threads and was supposed to be ok, but I 
might be wrong here... Can you enlighten me on the situation w/ 
Java2D and thread safety?


/ Roman



Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-06-16 Thread Jim Graham

Ah, I think I see the problem.  Phil can chime in here if I'm wrong.

Text now uses loops in most cases, but many of the validate methods 
assume that they don't need the loops.  I don't think that used to be 
the case, but it changed as a result of the LCD text loop work.  It used 
to be that AA used the alphafill field of SG2D and not the loops, and it 
still does for most rendering.  But now text rendering will almost 
always go through the loops, and it is only working because of that call 
to getRenderLoops() in the constructor (and the fact that we never set 
it back to null).


I'd like to see invalidate() set the loops to null and see how far we 
get - I'll bet that we'd get NPEs all over the place, especially with AA 
rendering.


In the long run, this is another straw on the camel's back as far as 
rethinking the validation framework.  It's been tweaked and hacked quite 
a lot over the past 10 years and so there are a lot of issues that arise 
like this that aren't readily obvious from the code.  I don't think the 
camel's back is broken yet, but it is becoming more and more confusing 
for new engineers to start tinkering with it without seeing things break 
from seemingly innocuous changes.  :-(


For now, I'm wary of removing that call without a lot of testing.  I 
think putting a loops=null line in invalidatePipe() might accelerate 
some of that testing, though.


And Phil might want to chime in with a description of the new 
requirements on loops in light of the post-LCD text work...  Phil?


...jim

Mario Torre wrote:
On Mon, 15/06/2009 at 13.37 -0700, Jim Graham wrote:

Hi Mario,

How are the drawGlyphList methods called when the loops is null?  I ask 
because they are only ever installed on the SunGraphics2D object by 
virtue of a call to validatePipe() and all calls to validatePipe() 
should set the loops.  So, there is a bug somewhere that causes these 
loops to be installed without a full validate process.


Hi Jim,

So, I spent some time today (sorry for the delay, I had to find a
free time slot for it and had to make a cake for my girlfriend, which
was the most difficult part :), and I debugged the Java2D demo with just
the non-initialised loops (so, no checks for null loops anywhere else in
the code).

The demo starts fine, but at some point I get the error attached to
this mail.

As I said in the email you quoted.  All calls from pipelines 
(GlyphListLoopPipe and AATextRenderer are both pipeline objects) should 
be safe because all calls to validatePipe() should set the loops.


I see that validatePipe is indeed called always, but sometimes the path
that is chosen doesn't validate the rendering loops. I've seen that
most of the time this is ok, because the loops are not used.

I guess this is the case for all the various text rendering (LCD or AA)
in swing components, for example (is this correct?).

Having said that I note that there are some pipelines that do not 
require loops and it would be OK for a call to validate that only uses 
such non-loop-based pipelines to leave the loops field uninitialized, 
but all validate calls which use loop-based pipes must update the loops 
field.


Exactly. I'm not deep into the code yet to distinguish when they are
needed or not, but...

Eliminating all of those uses of loops we then fall into the case where 
the usage in checkFontInfo is the only usage that does not occur from a 
pipeline...


...this is exactly the case, putting a null check here solves the NPE.

From what I can see, at some point some component needs to paint to an
offscreen surface (I don't think the offscreen surface is special,
I think it's the application code that drives the bug, but anyway...).

This is the background image from the Java2D Intro pane; I think the
other panes do something similar. Once the SunGraphics2D object is created,
some rendering hints are set. This is the application code:

g2.setRenderingHint(RenderingHints.KEY_ANTIALIASING, 
RenderingHints.VALUE_ANTIALIAS_ON);


That is, turn on antialiasing, then:

g2.clearRect(0, 0, d.width, d.height);

Now what? This of course goes through validatePipe:

The first two if/else statements fall through but not this
(SurfaceData, line 535 in OpenJDK):

} else if (sg2d.antialiasHint == SunHints.INTVAL_ANTIALIAS_ON) {

This guy sets the pipes but doesn't set the RenderLoops.

Then, again application code:

scene.render(d.width, d.height, g2);

Which after a few loops finally renders the string on screen,
which causes the crash.

So, in other words, everything goes fine until we draw text with
the AA rendering hints turned on.

Of course, before, it was not failing, because the rendering loops
were initialised in the constructor.

I'm not sure if we need to check for null loops at the beginning
of validatePipe or in checkFontInfo. Maybe we can save something if we
use checkFontInfo, but I'm not so sure.

I hope this helps,
Mario



Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-06-18 Thread Jim Graham
That makes a lot of sense, since I think the introduction of the 
GlyphListLoopPipe in its current form was where the original methodology 
of "always install loops if you plan to use loops" was violated for the 
first (and only) time.


I think I raised the issue at the time and Phil pointed out that, in 
practice, the loops never really change for a given surface type (as in, 
if a different compositing mode is set or if a non-color paint is set 
then we use entirely different mechanisms to render so the only loops 
that exist for these are solid color loops and those loops are always 
valid for all cases that use these pipes).


While that may mean we don't have a practical bug yet, the weakness of 
that argument in general, and the desire to resolve the issue from the 
original bug report may mean we should revisit this practice.


One solution would be to always set the loops for the validations that 
install one of these pipes.  That could have potential performance 
impact, but it would be no worse than the validation sequences that 
already set the loops every time so I don't think it would be serious, 
or even measurable.


Another solution could involve splitting these loops out into a separate 
structure that can be set once in the lifetime of an SG2D and then 
reused as appropriate.  But this is banking on the "only one version of 
these loops really exists anyway" constraint, and some day that 
constraint is likely to disappear and then we really will need to pick 
new loops on every validate.


As such, I think I'd prefer the first solution - to just pick new loops 
every time they are needed (i.e. unless the pipeline really doesn't use 
the loops directly, or indirectly through these text pipes)...
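
Concretely, the first solution is just a line like this in any 
validatePipe() branch that installs one of the text pipes (a fragment; 
sg2d.loops and getRenderLoops() are the real names, the surrounding 
lines are illustrative):

sg2d.textpipe = solidTextRenderer;   // a loop-based text pipe
sg2d.loops = getRenderLoops(sg2d);   // refreshed every time alongside it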


...jim

Mario Torre wrote:

On Thu, 18/06/2009 at 00.34 +0200, Mario Torre wrote:


But I'll debug it a little further and send you some updates asap.

Cheers,
Mario


Hello!

The pipelines that use the loops after being invalidated, without checking
whether they're valid (not null), are these so far:

LCDTextRenderer
GlyphListLoopPipe
AATextRenderer

Those all are subclasses of GlyphListLoopPipe.

The LCDTextRenderer never fails if I don't explicitly set the loops to
null in invalidatePipe; the same happens for instances of GlyphListLoopPipe,
which don't fail immediately as they do when I set the loops to null. So I
think some of those calls use loops that were initialised somehow from
validatePipe in some previous call to this method, but that should not
be valid anymore.

There are no other places where I get NPE, but I've tested only with the
Java2D demo so far.

Cheers,
Mario


Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-06-18 Thread Jim Graham

Whoops!  I abbreviated and sent you off on a wild goose chase.

There are many different loops!

There just happens to be (I think!?) only 1 kind of *TEXT* loop in the 
system for a given surface.  I left out that qualifier below and 
misdirected you.


Thus, if the text loops were split out then it may be that they can be 
created once per SurfaceData, but that is definitely *NOT* true of the 
other loops that reside in the RenderLoops, and it may not be true of 
the text loops in the future - it just happens to be probably true for 
text loops for now...


...jim

Roman Kennke wrote:

Hi Jim,

If only one instance of the loops is used during the lifetime of a 
Surface, then we can easily keep this instance there instead of in the 
SG2D; validating would not initialize any new objects, but only put 
together stuff that is already present in the SurfaceData anyway. Or did 
I get something wrong here?

/ Roman

Jim Graham wrote:
That makes a lot of sense, since I think the introduction of the 
GlyphListLoopPipe in its current form was where the original 
methodology of "always install loops if you plan to use loops" was 
violated for the first (and only) time.


I think I raised the issue at the time and Phil pointed out that, in 
practice, the loops never really change for a given surface type (as 
in, if a different compositing mode is set or if a non-color paint is 
set then we use entirely different mechanisms to render so the only 
loops that exist for these are solid color loops and those loops are 
always valid for all cases that use these pipes).


While that may mean we don't have a practical bug yet, the weakness of 
that argument in general, and the desire to resolve the issue from the 
original bug report may mean we should revisit this practice.


One solution would be to always set the loops for the validations that 
install one of these pipes.  That could have potential performance 
impact, but it would be no worse than the validation sequences that 
already set the loops every time so I don't think it would be serious, 
or even measurable.


Another solution could involve splitting these loops out into a 
separate structure that can be set once in the lifetime of an SG2D and 
then reused as appropriate.  But this is banking on the "only one 
version of these loops really exists anyway" constraint, and some day 
that constraint is likely to disappear and then we really will need to 
pick new loops on every validate.


As such, I think I'd prefer the first solution - to just pick new 
loops every time they are needed (i.e. unless the pipeline really 
doesn't use the loops directly, or indirectly through these text 
pipes)...


...jim

Mario Torre wrote:

On Thu, 18/06/2009 at 00.34 +0200, Mario Torre wrote:


But I'll debug it a little further and send you some updates asap.

Cheers,
Mario


Hello!

The pipelines that use the loops after being invalidated, without checking
whether they're valid (not null), are these so far:

LCDTextRenderer
GlyphListLoopPipe
AATextRenderer

Those all are subclasses of GlyphListLoopPipe.

The LCDTextRenderer never fails if I don't explicitly set the loops to
null in invalidatePipe; the same happens for instances of GlyphListLoopPipe,
which don't fail immediately as they do when I set the loops to null. So I
think some of those calls use loops that were initialised somehow from
validatePipe in some previous call to this method, but that should not
be valid anymore.

There are no other places where I get NPE, but I've tested only with the
Java2D demo so far.

Cheers,
Mario




Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-06-22 Thread Jim Graham

Hi Mario,

The changes look safe, but I wanted to bring up the following issues:

- There are a dozen or so fields in SG2D that are commonly accessed 
everywhere in the pipelines, including loops.  These changes introduce an 
accessor for loops, but that now looks out of place with no accessors 
for any of the other fields that get accessed directly.  Personally the 
lack of accessors hasn't bothered me, particularly since this is 
performance-sensitive code and accessors can sap performance by nickels 
and dimes if you don't implement them correctly - to wit:


- Accessors can be inlined if they are final, but these aren't.  As it 
turns out, SG2D itself is final and so I believe that is enough for them 
to be inlined, but I tend to make methods final as well to underscore 
that they are intended to be inlined, and also in case we eventually 
decide to make the class non-final.  On the other hand, we haven't 
really bothered with accessors in the first place in this code to avoid 
the code bloat and the potential points of heuristic failure.
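
For example (a sketch; today SG2D's fields are simply accessed directly):

public final class SG2DAccessorSketch {
    private Object loops;   // stands in for the RenderLoops field

    // final (in a final class) so there is no question the JIT can inline
    // the call; a plain virtual accessor risks paying a call per operation.
    public final Object getRenderLoops() {
        return loops;
    }
}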


- I agree that it would be nice to ask the pipes if they need loops; I'm 
not sure why your solution didn't work, but I prefer that to making them 
a side effect of getTextPipe(), since that makes it harder to know when 
the code you are editing needs to get the loops or not, and it makes it 
hard to see where they come from when editing the many validatePipe() 
methods.


...jim

Mario Torre wrote:

On 18/06/2009 21:52, Jim Graham wrote:


One solution would be to always set the loops for the validations that
install one of these pipes. That could have potential performance
impact, but it would be no worse than the validation sequences that
already set the loops every time so I don't think it would be serious,
or even measurable.


Hi Jim,

I'm trying this approach.

Basically, I just set the loops to null now in invalidate and validate 
them back in validatePipe:


http://cr.openjdk.java.net/~neugens/100068/webrev.03/

What I do is to get valid loops when we call getTextPipe, but other than 
that, the behavior is basically unchanged. I tried with various 
applications, including NetBeans, the Java2D demo and the FontTest app, 
playing around with the text configurations (AA, LCD, size, Glyphs etc., 
even output to a PNG file) and it looks good, but I may have missed 
something of course.


I moved to a 64 bit machine; I don't think this makes any difference in 
this case, but maybe it's a good thing to mention.


I would like to see a method in the pipelines that does something like:

boolean mustInitialiseLoopsBecauseWeReallyWantToUseThemGranted()

or so, that we could call at the end of the validatePipe method, but when 
I tried this I got a funny StackOverflowException. Maybe I did something 
wrong, but it looks like it's not so easy to change this code and still 
make it behave correctly, so I would like to go deeper into this issue 
even if you think we can close the bug (of course, if the proposed fix is 
evaluated as complete).


Cheers,
Mario


Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-07-21 Thread Jim Graham
It's odd that J2DBench showed a difference since nothing you did should 
have affected those benchmarks.  I don't think it has any benchmark 
which shows the impact of a pipeline validation, so your standalone test 
is the only one that I think addresses the issue.


Rather than setting up a machine for a 72 hour run (not sure a run of 
what, though - J2DBench?), I'd rather see either trying to do the check 
with a virtual method call (Pipe.needsLoops) or just going back to the 
old style of initializing them in the validation branches that we know 
need them and we can revisit this mechanism some other time when we have 
the time to really figure out how to make it cheap.  In particular, I 
have some ideas for how to make validation incredibly cheap at the cost 
of a few K of lookup tables per SurfaceData, but I think the scale of 
that is beyond us for now...


...jim

Mario Torre wrote:

On 15/07/2009 23:41, Jim Graham wrote:

Numbers that small aren't statistically significant.  Our J2DBench
benchmark calibrates each test to run a number of iterations that result
in at least 2.5 seconds of run time. Try upping your loop iterations by
a factor of 100 and you'll get numbers with better accuracy and
precision...

...jim
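
The calibration idea, as a standalone sketch (the 2.5 second threshold is 
from Jim's description of J2DBench, not taken from its actual code):

public class CalibrateSketch {
    // Doubles the batch size until one timed batch runs for at least
    // 2.5 seconds, then reports the per-operation average; tiny fixed
    // iteration counts drown in timer noise and JIT warmup.
    static double nanosPerOp(Runnable op) {
        long batch = 1;
        while (true) {
            long start = System.nanoTime();
            for (long i = 0; i < batch; i++) {
                op.run();
            }
            long elapsed = System.nanoTime() - start;
            if (elapsed >= 2_500_000_000L) {
                return (double) elapsed / batch;
            }
            batch *= 2;
        }
    }

    public static void main(String[] args) {
        double[] sink = new double[1];
        System.out.println(nanosPerOp(() -> sink[0] += Math.sqrt(sink[0] + 1)));
    }
}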


Hi Jim,

I multiplied the small test numbers by 100 and this is the result:

Patched JDK:

warmed up run time in ms: 3226
total time in ms: 9586

Clean JDK:

warmed up run time in ms: 3039
total time in ms: 9172

I also ran the more meaningful J2DBench. I had no time to run an 
extensive benchmark, and limited it to what I think is the very bare 
minimum:


http://cr.openjdk.java.net/~neugens/100068/comparision-100068-0.1/Summary_Report.html 



So, it looks like this approach does have an impact, although it is well 
within 10%.


Should I try the other approaches? I hardly see how a method call can be 
faster than an instanceof, but maybe I'm missing something obvious.


If you want, I can try to set up a machine at work to run a full 72 hour 
test, but this will take some time (> 72*2 hours :).


Cheers,
Mario


Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-08-04 Thread Jim Graham

Roman Kennke wrote:

http://cr.openjdk.java.net/~neugens/100068/webrev.06/


So the short story is the webrev.05 was actually better and we better
forget about webrev.06 at this point?


It also looks like the webrev.05 is better than a stock JDK - even more 
promising!



Regarding webrev.05, I find the 5x instanceof a bit ugly. I am not sure
how to avoid it though. Maybe put the pipe fields in a container. This
container could have a (marker) subclass that is instanceof'ed for.
Whenever a SD sets up loop based pipes, it uses the subclass. Otherwise
it uses the base class. Then you'd only need one instanceof check
against the container. But OTOH you would get double field access in a
couple of cases, of which I don't know if hotspot optimizes them away
somehow. And it's probably not worth thinking about any of this if the
impact is not noticeable already. Maybe the if cascade bails out at the
first check already? Maybe it's worth ordering the if cascade so that
the most likely case is the first one, etc?


It's only 5x instanceof in a single place in the code and it makes the 
entire business of loop validation much cleaner so I'm loving it in a 
global/general sense even if it is uglier at just that one line of code.


However, I would modify the code style for it to move the curly to the 
following line like this:


if (foopipe  instanceof blah ||
blahpipe instanceof blah ||
barpipe  instanceof blah)
{
sg2d.loops = ...;
}

This is almost completely in line with our code style guidelines (are 
those published on the OpenJDK site?) with the only minor variation 
being the open curly on the line by itself which is a personal 
preference (that is used through most of java2d) to make the break 
between multi-line conditionals and body more visible.  I(we?) find that 
otherwise the body looks like another line of conditional tests...


...jim



Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-09-17 Thread Jim Graham
I can't speak for the changes to the build files, but I only have a 
couple of very small suggestions for the code changes.


Our coding style for continued class declarations would suggest using 
the following indentation:


public class FooPipe extends SomeKindaPipe
implements LoopBasedPipe
{

with the { on a new line by its own after the implements clause, and 
indentations of 4 spaces (see LoopPipe).


Also, the interface was not represented in the webrevs and its name 
should be singular (not Pipes)...


...jim

Mario Torre wrote:

On 05/08/2009 10:37, Mario Torre wrote:

I'll send you an update shortly based on the latest OpenJDK checkout 
even.


So, "shortly" turned out to be more than one month.

I'm so sorry, I had quite a few problems at work and some personal stuff 
as well that kept me quite busy. Anyway, here is an 
update of the patch with the suggestions from the last comment:


http://cr.openjdk.java.net/~neugens/100068/webrev.07/

I hope it's fine this time :)

Cheers,
Mario


Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-09-17 Thread Jim Graham
If the NetBeans stuff is not due to changes I've made (i.e. it is due to 
their overzealous "I'm going to rewrite all your build files for you 
because you 'stupidly' chose to use a different version of NB than 
everyone else in your project hahaha"), then I just revert the files 
manually (either right click and revert on the 3 files in NB, unchecking 
the "save originals" checkbox), or using something like:


% hg status nbproject
{ make sure output is only the things I don't want }
% hg revert --no-backup nbproject
{ making sure there are no typos... ;-) }

As far as ignoring them in your webrev, you can do:

% hg status > webrev.files
{ edit webrev.files down to a single filename per line }
% webrev webrev.files

Also, if you have new files in the repository then doing an "hg add" 
will help include them in the webrevs, or if you use the webrev file 
list technique above, then the new files will appear in the file with a 
preceding "?" which you can just edit down to the name of the new file 
itself and webrev will include it...


...jim

Mario Torre wrote:

On 17/09/2009 15:22, Mario Torre wrote:

On 05/08/2009 10:37, Mario Torre wrote:


I'll send you an update shortly based on the latest OpenJDK checkout
even.


So, "shortly" turned out to be more than one month.

I'm so sorry, I had quite a few problems at work and some personal stuff
as well that kept me quite busy. Anyway, here is an
update of the patch with the suggestions from the last comment:

http://cr.openjdk.java.net/~neugens/100068/webrev.07/


Oops, of course - please ignore the NetBeans stuff, I forgot to remove 
it before creating the webrev (is there any way to ask webrev to 
ignore some files?)


Cheers,
Mario


Re: [OpenJDK 2D-Dev] Review Request for Bug 100068 - SunGraphics2D exposes a reference to itself while non fully initialised

2009-09-25 Thread Jim Graham
Sorry, after the flurry of emails here on other topics I just realized 
this was getting buried.


The latest changes look good to me.

I just did one last "grep through the sources" pass and noticed that 
X11SD and GDIWindowSD also call getRenderLoops.  They always call it, 
though, because they install text pipes that will likely need it. 
Perhaps we should check it for null before bothering with the call?  I 
looked quickly through and I can't totally vouch for the fact that the 
loops are always needed unconditionally, but it did look likely.  Also, 
I think we've likely always had an extra call to validate the loops in 
there so this is nothing new, but since we never were setting them to 
null before, we never had a way to check "have they been updated by 
someone?" before.


I think we could add a quick if (loops == null) {...} in those 2 
classes and maybe add a comment // assert(some pipe will always be a 
LoopBasedPipe) to indicate that we aren't checking the types of the 
pipes because we think it is always true in case that assumption changes 
in the future.  Does that make sense?
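
Spelled out, the suggestion is just:

if (sg2d.loops == null) {
    // assert(some pipe will always be a LoopBasedPipe)
    sg2d.loops = getRenderLoops(sg2d);
}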


(Again, this isn't a new problem - just an opportunity to fix up some 
old redundancy.)


...jim

Mario Torre wrote:

On 17/09/2009 22:27, Jim Graham wrote:

http://cr.openjdk.java.net/~neugens/100068/webrev.08/

:)

Mario


Re: [OpenJDK 2D-Dev] Review request for bug 100053

2009-10-01 Thread Jim Graham

I could go two ways on this.

It looks like the code is looking to drop arrays that have grown so that 
it doesn't waste memory.  Do we reuse these objects?  If not, then the 
code can be deleted.


If we do reuse them, then why not just set them to null and let them get 
recreated the next time rather than allocate something which may not get 
used for a while?


In either case, if the array was null, is there a need to allocate one? 
 Shouldn't it be if (array != null && array.length > DEFAULT)?


...jim

Dmitri Trembovetski wrote:


  Hi Roman,

Roman Kennke wrote:

Hi Dmitri,

   a comment about the test: would the bug reproduce if you just 
rendered into a BufferedImage? If so, no need for creating a frame 
and such.


Oh yes. Stupid me :-)

   Regarding the fix, it looks ok - but there are other places in the 
code where the 'crossings' is accessed - are those safe from an NPE?


I followed all usages of crossings and crossingsIndices and they all end
up in endRendering(). There is a loop at the beginning of which the
arrays get initialized using setCrossingsExtends(). The problem occurred
when this loop was never entered; in that case we hit
crossingListFinished() with (possibly) null arrays. There I added the
null checks, and they should cover all potential NPEs on these arrays.

The updated webrev is:

http://cr.openjdk.java.net/~rkennke/100053/webrev.01/

Ok now?


  Yes, looks fine.

  Thanks,
Dmitri




Thanks, Roman




Re: [OpenJDK 2D-Dev] AWT Dev [PATCH FOR APPROVAL]: Fix broken build on newer versions of X11 (libXext >= 1.1.0)

2009-11-03 Thread Jim Graham

Andrew John Hughes wrote:

There's http://lists.x.org/archives/xorg-devel/2009-June/001242.html
but I avoided posting this in the original mail because it seems to
have changed again between that commit and the final release,
presumably due to compatibility issues (XShm.h is back and it's now
shmproto.h as seen in the patch).  I've built the repo with this patch
here with the old version, and others have built it with the new
version; it does work for both.  The same patch is already in Gentoo's
ebuild and IcedTea, and a similar patch has been used for the Fedora
rawhide RPMs for some time.  It would be good to get it upstream as
well.


At first I was going to ask how the existing #include succeeds when the 
link says that XShm.h is going away, but now I see that you said they 
brought it back.  What is it now?  Just an empty include to prevent 
#include failures?  (I don't see how that works since the build will 
break anyway as soon as a missing constant is referenced...?)


(It seems odd that they bring it back to [not really] avoid build 
breakages, but then don't just have it include the new split files to 
finish the backwards compatibility story...?)


...jim



Re: [OpenJDK 2D-Dev] AWT Dev [PATCH FOR APPROVAL]: Fix broken build on newer versions of X11 (libXext >= 1.1.0)

2009-11-03 Thread Jim Graham
Yes, indeed, that all makes sense for your fix.  I wasn't intending to 
register an objection with the fix, I was just curious about the changes 
they made which, as you say, seem quite convoluted...


...jim

Andrew John Hughes wrote:

2009/11/3 Jim Graham jim.a.gra...@sun.com:

Andrew John Hughes wrote:

There's http://lists.x.org/archives/xorg-devel/2009-June/001242.html
but I avoided posting this in the original mail because it seems to
have changed again between that commit and the final release,
presumably due to compatibility issues (XShm.h is back and it's now
shmproto.h as seen in the patch).  I've built the repo with this patch
here with the old version, and others have built it with the new
version; it does work for both.  The same patch is already in Gentoo's
ebuild and IcedTea, and a similar patch has been used for the Fedora
rawhide RPMs for some time.  It would be good to get it upstream as
well.

At first I was going to ask how the existing #include succeeds when the link
says that XShm.h is going away, but now I see that you said they brought it
back.  What is it now?  Just an empty include to prevent #include failures?
 (I don't see how that works since the build will break anyway as soon as a
missing constant is referenced...?)

(It seems odd that they bring it back to [not really] avoid build breakages,
but then don't just have it include the new split files to finish the
backwards compatibility story...?)

   ...jim




It's quite convoluted, that's why I was just going to avoid posting
the link, as it makes things even more confusing.  I believe the
reinstated XShm.h does have content that was still needed.

The initial version I linked to did remove XShm.h, so the original fix
for Fedora 12 removed XShm.h, added the two additional headers and
defined some other stuff which I believe was in XShm.h originally.  It
was a pretty nasty patch, hence why it wasn't committed to IcedTea or
OpenJDK.  I gather now that XShm.h is back and has the additional
material in it.  I don't have a copy locally to check, but several
people have said this fix works and Fedora RPMs have been built with
the original fix.  More importantly, I have confirmed myself that it
doesn't break earlier versions, which are still used on the majority
of systems.  It's now several months on from our initial discovery of
the problem and more and more people are asking about this in e-mail
and on IRC, so a general fix is needed and this fits the bill.

Hope that makes some sense!

Thanks,


Re: [OpenJDK 2D-Dev] Font rendering issue

2010-05-18 Thread Jim Graham
The thing that bothers me about this fix is that the value being 
returned here is the raw computed value.  All of the values in this 
routine are returned in floating point, at maximum sub-pixel 
accuracy.  I don't see why *this* code needs to round this value.  If 
something that uses the data returned from this method needs an integer 
then it should be up to that code to do whatever rounding is 
appropriate; rounding at the most primitive level to fix a bug at a 
higher level is premature (IMHO)...
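
For context, FreeType's 26.6 fixed-point format keeps six fractional bits, 
so the raw metric carries 1/64-pixel accuracy that a ROUND here throws 
away.  A sketch of the conversion (the actual scaler code is C):

public class FixedPointSketch {
    // FreeType 26.6 fixed point: 6 fractional bits, units of 1/64 pixel.
    static float ft26Dot6ToFloat(long value) {
        return value / 64.0f;
    }

    public static void main(String[] args) {
        // 835/64 = 13.046875: the raw value keeps the fraction; rounding
        // at this level discards sub-pixel information a caller may need.
        System.out.println(ft26Dot6ToFloat(835));
    }
}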


...jim

Mario Torre wrote:

Il giorno dom, 09/05/2010 alle 14.17 +0200, Mario Torre ha scritto:

Il giorno dom, 09/05/2010 alle 12.14 +0200, Mario Torre ha scritto:


ly = (jfloat) ROUND(FT26Dot6ToFloat(
  scalerInfo->face->size->metrics.height +
  bmodifier) + ay - dy);


And here is the proposed webrev:

http://cr.openjdk.java.net/~neugens/100134/webrev.02/

As noted, this doesn't really fix all the bugs; it just fixes the
rounding for leading, which, by chance, works around the other issues and
appears to fix the rendering as well.

Cheers,
Mario


Any comment on that?

Cheers,
Mario


Re: [OpenJDK 2D-Dev] Font rendering issue

2010-05-18 Thread Jim Graham
Phil's message brings up another issue for me with the patch.  Why use 
ROUND instead of a ceiling operation?  Do we know what the best option 
is for the code above?


Again, I would strongly favor leaving these base calculations in the 
scaler alone and focus more on making sure the proper operations are 
done in the code above (whether, contextually, they may be a round or a 
ceiling or working directly with the raw float value)...


...jim

Phil Race wrote:

Too many emails with too many comments for me to address them all.

- Swing understands that there's no guarantee that all the pixels fit
  in the reported height of the line. I don't think you want to space
  out the text so much that you guarantee no glyph overlap. Actually,
  due to some perhaps questionable choices of the fields in the font
  which should be used AND the way that the height of logical fonts is
  assembled (the maximum leading of all components + the maximum
  descent of all components + the maximum ascent of all components),
  it's higher than I'd like. Much as I'd love to fix this, it's going
  to get someone upset, so I've steered well clear. This goes all the
  way back to 1.1, and I doubt we'll be able to go changing Swing
  either.

- Irritating as this may be, where there are consequences for upstream
  code it's not something I'd want to sign off on lightly unless it's
  clearly fixing a blatant bug. We have less compatibility history to
  maintain (in the behavioural sense) in the freetype code, so that's
  easier to change.

- Anything in the shared code, I'd want to actually try out. Any
  claimed errors in the closed code, I'd want to track down and see if
  it's actually so and what we can/should do about it.

- If I have things right, the most obvious problem Mario saw is a
  negative value for leading. That could be an incorrect interpretation
  of sign somewhere? Seems like rounding up to get rid of it isn't
  addressing the real problem.

-phil.

On 5/18/2010 2:39 PM, Mario Torre wrote:

Il giorno mar, 18/05/2010 alle 23.33 +0200, Mario Torre ha scritto:
  

Il giorno mar, 18/05/2010 alle 21.07 +0200, Roman Kennke ha scritto:


Hi Mario,

  

ly = (jfloat) ROUND(FT26Dot6ToFloat(
   scalerInfo->face->size->metrics.height +
   bmodifier) + ay - dy);

 
 

Just one little note.

Because this is not necessarily just a rounding problem, but a
hinting problem, it would be possible to construct a case where this fix
is not enough either.

I'm not so deep into this code (this area in general) to know by
intuition if this is the case or not, or if there's an obvious
alternative, this is another reason I'm pushing a bit for a reply, I
would like to get some insight from the people who wrote this code in
the first place ;)

And yes, I want this fixed in OpenJDK as soon as possible, now that I
know the cause and a possible solution, editing in NetBeans is starting
to be so annoying, makes me feel really bad!!!

Thanks,
Mario

   




Re: [OpenJDK 2D-Dev] [OpenJDK Rasterizer] Antialiased horizontal or vertical lines.

2010-06-18 Thread Jim Graham
The Ductus pipeline will do the same thing with STROKE_PURE.  It sounds 
like maybe the Pisces pipeline doesn't support STROKE_NORMALIZE yet?


...jim

Denis Lila wrote:

Hello.

I noticed that anti-aliased horizontal and vertical lines are 
not drawn properly. I've described the results (with pictures)

here: http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=509

In the patch, I try shifting line coordinates by half a pixel as 
soon as they're passed into Renderer.java:lineTo, and it seems to

fix it.

However, this seems a bit too easy, so if anyone can think of any
ways in which this fix breaks something, please let me know.

Thank you,
Denis.



Re: [OpenJDK 2D-Dev] Fix for uniformly scaled dashed lines.

2010-06-18 Thread Jim Graham

Hi Denis,

Here are my thoughts on it:

- Lines are affinely transformed into lines.  The slope may be different 
before and after the transform, but both have a single slope.


- The ratio of a line length to its transformed line length is a scale 
factor that depends solely on the angle of the line.  Thus, for 
determining dashing you can simply compute this scale factor once for a 
given line and then that single scale factor can be applied to every 
dash segment.


It appears that your setup code takes these factors into account, though 
I haven't done a grueling line by line analysis as to whether you got 
the math right.


One more optimization is that once you know the angle of the line then 
you have a factor for how the length of a segment of that line relates 
to its dx and dy.  Note that for horizontal and vertical lines one of 
those factors may be Infinity, but every line will have a non-zero and 
non-infinity factor for one of those two dimensions.


This means that you can calculate the dashing by simply looping along 
the major axis of the line and comparing either the dx, or the dy to 
scaled lengths that represent the lengths of the transformed dashes 
projected onto the major axis.


Finally, the other dx,dy can be computed from the dx,dy of the major 
axis with another scale.  I am pretty sure that this dx=>dy or dy=>dx 
scale factor might be zero, but it would never be infinite if you are 
calculating along the major axis of the transformed line, but I didn't 
write out a proof for it.


Taking both of these concepts into account - can that make the inner 
loop even simpler?
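
A sketch of that setup, just to make the once-per-line idea concrete
(the method and names here are illustrative, not the actual Dasher
fields):

    // The ratio of transformed to untransformed length depends only on
    // the angle of the line, so it can be computed once per line and
    // then applied to every dash segment.
    static double dashScale(double dx, double dy, double tdx, double tdy) {
        // dx,dy: untransformed line delta; tdx,tdy: its transformed image
        // (assumes a non-degenerate line, i.e. len != 0)
        double len  = Math.sqrt(dx * dx + dy * dy);
        double tlen = Math.sqrt(tdx * tdx + tdy * tdy);
        return tlen / len;
    }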


...jim

Denis Lila wrote:

Hello.

I think I have a fix for this bug: 
http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=504

The problem is caused by the symmetric variable in pisces/Dasher.java.
symmetric is set to (m00 == m11 && m10 == -m01), and never changed.

It is only used in one place (in lineTo) to simplify the computation of
the length of the line before an affine transformation A was applied to it.

This is why it causes a problem:
If A = [[a00, a01], [a10, a11]] and (x,y) is a point obtained by applying
A to some other point (x',y'), then what we want is the length of the vector
(x',y'), which is ||Ainv*(x,y)||. Ainv = (1/det(A)) * [[a11, -a01],[-a10, a00]],
so, after some calculations, ||Ainv*(x,y)|| ends up being equal to
sqrt(x^2*(a11^2 + a10^2) + y^2*(a00^2 + a01^2) - 2*x*y*(a11*a01 + a00*a10)) *
1/|det(A)|.
If symmetric==true, this simplifies to:
sqrt((a11^2 + a01^2) * (x^2 + y^2)) * 1/|det(A)|, and
|det(A)| = a11^2 + a01^2, so, the final answer is:
sqrt(x^2 + y^2) / sqrt(det(A)). Therefore the problem in Dasher.java
is that it divides by det(A), not sqrt(det(A)).
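
As a sketch, the corrected length computation from the derivation above
(a hypothetical helper, not the actual Dasher code):

    // Untransformed length of a transformed vector (x,y) under
    // A = [[a00,a01],[a10,a11]], i.e. ||Ainv*(x,y)||; assumes det != 0.
    static double untransformedLen(double x, double y,
                                   double a00, double a01,
                                   double a10, double a11) {
        double det = a00 * a11 - a01 * a10;
        double t = x * x * (a11 * a11 + a10 * a10)
                 + y * y * (a00 * a00 + a01 * a01)
                 - 2 * x * y * (a11 * a01 + a00 * a10);
        return Math.sqrt(t) / Math.abs(det);
    }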

My fix for this was to remove the symmetric special case. Another possible fix
would have been to introduce an instance variable sqrtldet and set it to
sqrt(det(A)), and divide by that instead of det(A). This didn't seem worth it,
because the only benefit we gain by having the symmetric variable is to save
3 multiplications and 1 division per iteration of the while(true) loop, at the
expense of making the code more complex, harder to read, introducing more
opportunity for bugs, and adding hundreds of operations of overhead (since
PiscesMath.sqrt would have to be called to initialize sqrtldet).

To make up for this slight performance loss I have moved the code that computes
the transformed dash vectors outside of the while loop, since they are constant
and they only need to be computed once for any one line.
Moreover, computing the constant dash vectors inside the loop causes
them to not really be constant (since they're computed by dividing numbers that
aren't constant). This can cause irregularities in dashes (see comment 14 in
http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=197).

I would very much appreciate any comments/suggestions.

Thank you,
Denis Lila.



Re: [OpenJDK 2D-Dev] Fix for drawing round endcaps on scaled lines.

2010-06-18 Thread Jim Graham
Sigh - that makes sense.  One issue is that the resulting paths it 
generates are much more verbose than they need to be.  This would 
generally mean that it takes far more storage than it would otherwise 
need - and it means that if the result needs to be transformed then it 
would take many more computations to transform each segment than the bezier.


So, perhaps it would be worth having it check the type of the output and 
do either a bezier or a bunch of lines depending on if it is a PC2D or a 
LineSink?
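
A rough sketch of that check (the PathSink interface here is
hypothetical at this point in the thread, though Denis later introduces
one along exactly these lines):

    interface LineSink { void lineTo(float x, float y); }
    interface PathSink extends LineSink {
        void curveTo(float x1, float y1, float x2, float y2,
                     float x3, float y3);
    }

    class CapEmitter {
        void emitRoundCap(LineSink out /* , cap geometry */) {
            if (out instanceof PathSink) {
                // output understands curves: emit two bezier quarter-circles
            } else {
                // plain LineSink: fall back to the sampled line segments
            }
        }
    }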


Also, it isn't really that difficult to for Renderer to include its own 
Cubic/Quadratic flattening code, but it might involve more calculations 
than the round-cap code since it would have to be written for arbitrary 
beziers whereas if you know it is a quarter circle then it is easier to 
know how far to subdivide...  :-(


...jim

Denis Lila wrote:

So, I have been thinking about this, and I can't see a good
way to do it that wouldn't involve heavy changes to Pisces.

In order for Stroker to generate Bezier quarter circles, it would
have to implement a curveTo method, which means Stroker should 
start implementing PathConsumer2D and instead of using a LineSink

output it would have to use a PathConsumer2D output (either that, or
LineSink should include a curveTo method, but then there won't really
be any difference between a LineSink and a PathConsumer2D. By the way,
LineSink doesn't have any implemented methods, so why is it an abstract
class as opposed to an interface?)

Stroker is used in 3 ways:
1. As an implementation of BasicStroke's createStrokedShape method. This
uses a Path2D object as output.
2. As a way of feeding a PathConsumer2D without calling createStrokedShape
to generate an intermediate Shape. This uses a PathConsumer2D output.
3. As a way of feeding lines to a Renderer object, which generates alpha
tiles used for anti-aliasing that are fed to a cache and extracted as needed
by an AATileGenerator. Obviously, Stroker's output here is a Renderer.

1 and 2 aren't problems, because the underlying output objects support
Bezier curves. 3, however, doesn't, and it seems like implementing a 
curveTo method for Renderer would be very difficult because the way it 
generates alpha tiles is by scanning the drawn edges with horizontal

scan lines, and for each scan line finding the x-intersections of the scan
lines and the edges. Then it determines the alpha values (I'm not too sure
how it does this).
In order to implement Bezier curves in Renderer, we would have to have
a quick way of computing, for each scan line, all its intersections with
however many Bezier curves are being drawn.

I haven't given much thought to how this could be done, as I am not very
familiar with Bezier curves, but it doesn't seem easy enough to justify
fixing such a small bug.
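
For reference, a minimal sketch of the scanline/crossings scheme
described above, for straight edges only (names and layout are
illustrative, not the Renderer internals):

    // For a scanline y, every edge spanning it contributes one x
    // crossing; the sorted crossings are then filled between in pairs.
    static float[] crossings(float[][] edges, float y) {
        // each edge is {x0, y0, x1, y1} with y0 < y1
        float[] xs = new float[edges.length];
        int n = 0;
        for (float[] e : edges) {
            if (e[1] <= y && y < e[3]) {   // half-open: no double counting
                xs[n++] = e[0] + (y - e[1]) * (e[2] - e[0]) / (e[3] - e[1]);
            }
        }
        float[] out = java.util.Arrays.copyOf(xs, n);
        java.util.Arrays.sort(out);
        return out;
    }

Curves would need their scanline intersections computed (or the curves
flattened) before they could feed such a loop, which is exactly the
difficulty being described.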

- Original Message -
From: Jim Graham james.gra...@oracle.com
To: Denis Lila dl...@redhat.com
Cc: 2d-dev@openjdk.java.net
Sent: Wednesday, June 9, 2010 7:42:33 PM GMT -05:00 US/Canada Eastern
Subject: Re: [OpenJDK 2D-Dev] Fix for drawing round endcaps on scaled lines.

I don't understand - why do we generate sample points based on the size 
of the cap?  Why not generate a pair of bezier quarter-circles and let 
the rasterizer deal with sampling?


...jim

Denis Lila wrote:

Hello.

I think I have a fix for this bug:
http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=506

Basically, the problem is that if there is a magnifying affine transformation set on the 
graphics object and one tries to draw a line with small thickness and round end caps, the 
end caps appear jagged. This is because the computation of the length of the array that 
contains the points on the pen with which the decoration is drawn does not 
take into account the size of the pen after the magnification of the affine 
transformation. So, for example, if the line length was set to 1, and the transformation 
was a scaling by 10, the resulting pen would have a diameter of 10, but only 3 pen points 
would be computed (pi*untransformedLineWidth), so the end cap looks like a triangle.

My fix computes an approximation of the circumference of the transformed pen 
(which is an ellipse) and uses that as the number of points on the pen. The 
approximation is crude, but it is simple, faster than alternatives 
(http://en.wikipedia.org/wiki/Ellipse#Circumference), and I can say from 
observations that it works fairly well.
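
The mail doesn't spell out which approximation is used; as one concrete
(assumed) possibility, Ramanujan's well-known formula gives a cheap
estimate of the circumference from the transformed pen's semi-axes:

    // C ~= pi * (3(a+b) - sqrt((3a+b)(a+3b))) for semi-axes a and b
    static double ellipseCircumference(double a, double b) {
        return Math.PI * (3 * (a + b)
                          - Math.sqrt((3 * a + b) * (a + 3 * b)));
    }

The number of pen points then comes out to roughly one per pixel of
circumference, which matches the ~31 points mentioned below for a
width-10 pen.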

There is also icing on the cake, in the form of slight improvements in 
performance when the scaling is a zooming out. Example: if the original line 
width was 100, but g2d.scale(0.1,0.1) was set, then the resulting line would 
have a width of 10, so only ~31 points are necessary for the decoration to look 
like a circle, but without this patch, about 314 points are computed (and a 
line is emitted to each one of them).

I appreciate any feedback.

Regards,
Denis Lila.



Re: [OpenJDK 2D-Dev] Fix for drawing round endcaps on scaled lines.

2010-07-07 Thread Jim Graham

Hi Denis,

We need a lot of upgrades in this area, but they will come with some 
engineering cost.


Note that the Ductus line widening code widens curves to curves, but 
many (most?) line wideners flatten everything to do widening so they 
only have to deal with algorithms that widen and join lines.  I haven't 
applied myself to the task of creating an algorithm that can do their 
direct curve widening technique yet so I don't have a canned solution 
for that, but it would make the results of 
BasicStroke.createStrokedShape() much less verbose.


Also, for a practical matter the existing down-stream rendering classes 
that take the output from these widening classes only deal with lines, 
but if they took the curves directly then they might be able to do a 
more efficient job of flattening the segments.  Until then, the only 
case where it would matter much for these classes to output curves would 
be for BS.createStrokedShape() and even that would probably require some 
more plumbing from the code that feeds these classes.


But, I want to raise visibility of these issues here because we should 
not be blindly accepting flattening as the law when we consider upgrades 
to this part of the code.  It is currently an easy way out and as we 
spend time on the code we should provide resistance to accepting that 
philosophy by default...


...jim

Denis Lila wrote:

That's true.

Well, if we're worried about the generated paths being verbose
and taking long to process then the problem extends beyond just
drawing round end caps. As far as I can see, whenever a path is
drawn that doesn't consist only of straight lines (e.g. an ellipse),
a flattening path iterator is being used to feed Stroker. So all
the bezier curves are still broken down into tiny straight lines,
just not by Stroker itself.

So, my question is, given a bezier curve C and a number w, is
there a way of quickly computing the control points of two bezier
curves C1, C2 such that the stuff between C1 and C2 is the widened
path?
More formally: compute the control points of C1, C2, where
C1 = {(x,y) + N(x,y)*(w/2)  | (x,y) in C}
C2 = {(x,y) - N(x,y)*(w/2)  | (x,y) in C}, where N(x,y) is the normal
of C at (x,y).

If we could do this easily, then we can just make a new class that
outputs bezier curves that is similar in purpose to Stroker, but that 
is used only when the output can handle bezier curves. This way, the

only use left for Stroker would be when anti-aliasing, and for
everything else we wouldn't have to use a flattening path iterator.

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

Consider the case of using BasicStroke.createStrokedShape().  How do 
you know how many pixels the resulting path will occupy?  You can't 
reduce to concrete samples if you don't know the transform.

So, for rendering, then you may be correct.  But for cases where the 
path is being asked for then beziers are the only responsible 
solution...

...jim

Denis Lila wrote:

Hello Jim.

I thought about checking the output and changing the behaviour
depending on whether the output is a PC2D or a LineSink, but I didn't
implement it because I thought the point was to get rid of the sampling
at this stage. However, if performance is the issue, then I guess I'll
start working on it.

Although, I wonder whether it is really worth it. I think most lines
drawn won't be wider than about 5 pixels, which means that the current
way will emit about 7 lines, so that's 14 coordinates. 2 bezier quarter
circles will require 12 coordinates. In terms of storage, there isn't
much difference, and for lines of width 4 or smaller the current method
is more efficient.

I'm also guessing that it's harder for the rasterizer to deal with
bezier curves than with straight lines, so is it possible that
replacing the 3.14*lineWidth/2 lines generated by the current method
with 2 bezier quarter circles isn't worth it (for small lineWidths)?

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Sigh - that makes sense.  One issue is that the resulting paths it
generates are much more verbose than they need to be.  This would
generally mean that it takes far more storage than it would otherwise
need - and it means that if the result needs to be transformed then it
would take many more computations to transform each segment than the
bezier.

So, perhaps it would be worth having it check the type of the output
and do either a bezier or a bunch of lines depending on if it is a PC2D
or a LineSink?

Also, it isn't really that difficult for Renderer to include its own
Cubic/Quadratic flattening code, but it might involve more calculations
than the round-cap code since it would have to be written for arbitrary
beziers whereas if you know it is a quarter circle then it is easier to
know how far to subdivide...  :-(


...jim

Denis Lila wrote:

So, I

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-07-07 Thread Jim Graham
For AA this is exactly what we do (round to nearest pixel centers for 
strokes).  Note that this is done before any line widening code is 
executed.


For non-AA we normalize coordinates to, I believe, the (0.25, 0.25) 
sub-pixel location.  This is so that the transitions between widening of 
lines occur evenly (particularly for horizontal and vertical wide 
lines).  If you round to pixel edges then you have the following 
progression.  (Note that the line width grows by half on either side of 
the original geometry, so you have to consider the line widths at which 
you encounter the pixel centers to your left and right (or above and 
below), which govern when that column (or row) of pixels first turns on.)


width 0.00 => 0.99  nothing drawn (except we kludge this)
width 1.00 => 1.00  1 pixel wide (col to left turns on)
width 1.01 => 2.99  2 pixels wide (col to right turns on)
width 3.00 => 3.00  3 pixels wide (etc.)
width 3.01 => 4.99  4 pixels wide

Note that it is nearly impossible to get an odd-width line.  You 
basically have to have exactly an integer width to get an odd-width 
line.  This is because at the odd widths you reach the half pixel 
locations on both sides of the line at the same time.  Due to the 
half-open insideness rules only one of the pixels will be chosen to be 
inside this path.  Just below these sizes and you fail to hit either 
pixel center.  Just at the integer size you reach both pixel centers at 
the same time.  Just slightly larger than that width and now you've 
fully enclosed both pixel centers and the line width has to increase by 
nearly 2.0 until you reach the next pixel centers.


(The kludge I talk about above is that we set a minimum pen width so 
that we never fail to draw a line even if the line width is set to 0.0, 
but the above table was a theoretical description of the absolute rules.)


If we rounded them to pixel centers, then the transitions look like this:

width 0.00 => 0.00  nothing drawn (modulo kludge)
width 0.01 => 1.99  1 pixel wide (column you are in turns on)
width 2.00 => 2.00  2 pixels wide (column to left turns on)
width 2.01 => 3.99  3 pixels wide (column to right turns on)
width 4.00 => 4.00  4 pixels wide (etc.)
width 4.01 => 5.99  5 pixels wide

We have a similar effect as above, but biased towards making even line 
widths harder.


So, by locating lines at the (0.25, 0.25) subpixel location we end up 
with a very even progression:

width 0.00 => 0.50  nothing drawn (modulo kludge)
width 0.51 => 1.50  1 pixel wide (column you are in turns on)
width 1.51 => 2.50  2 pixels wide (column to left gets added)
width 2.51 => 3.50  3 pixels wide (column to right gets added)
width 3.51 => 4.50  4 pixels wide (etc.)

This gives us nice even and gradual widening of the lines as we increase 
the line width by sub-pixel amounts and the line widths are fairly 
stable around integer widths.


Also, note that we don't say "when stroking" as you might want to 
normalize both strokes and fills so that they continue to match.  I 
believe that we normalize both strokes and fills for non-AA and we only 
normalize strokes for AA (and leave AA fills as pure).  AA is less 
problematic with respect to creating gaps if your stroke and fill 
normalization are not consistent.


The rounding equations are along the lines of:

v = Math.floor(v + rval) + aval;

For center of pixel you use (rval=0.0, aval=0.5)
For 0.25,0.25 rounding use  (rval=0.25, aval=0.25)
For edge of pixel you use   (rval=0.5, aval=0.0)
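
As a sketch, all three roundings are the one expression above with
different constants:

    static float normalize(float v, float rval, float aval) {
        return (float) Math.floor(v + rval) + aval;
    }
    // normalize(x, 0.0f,  0.5f)   rounds to pixel centers
    // normalize(x, 0.25f, 0.25f)  rounds to the (0.25, 0.25) location
    // normalize(x, 0.5f,  0.0f)   rounds to pixel edges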

Also, we came up with an interesting way of adjusting the control points 
of quads and cubics if we adjusted their end points, but I don't know if 
what we did was really the best idea.  For quads we adjust the control 
point by the average of the adjustments that we applied to its 2 end 
points.  For cubics, we move the first control point by the same amount 
as we moved the starting endpoint and the second control point by the 
amount we moved the final endpoint.  The jury is out on whether that is 
the most aesthetic technique...
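
A sketch of that adjustment rule (the deltas dxS,dyS and dxE,dyE are
whatever was added to the start and end points; the coordinate layout
is assumed):

    // quad: the control point moves by the average of the endpoint moves
    static void adjustQuadCtrl(float[] c, float dxS, float dyS,
                               float dxE, float dyE) {
        c[0] += (dxS + dxE) / 2;   // ctrl x
        c[1] += (dyS + dyE) / 2;   // ctrl y
    }

    // cubic: first ctrl point follows the start point, second the end
    static void adjustCubicCtrls(float[] c, float dxS, float dyS,
                                 float dxE, float dyE) {
        c[0] += dxS;  c[1] += dyS;
        c[2] += dxE;  c[3] += dyE;
    }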


...jim

Denis Lila wrote:

Regarding VALUE_STROKE_NORMALIZE the API says:
Stroke normalization control hint value -- geometry should
be normalized to improve uniformity or spacing of lines and
overall aesthetics. Note that different normalization 
algorithms may be more successful than others for given 
input paths. 


I can only think of one example where VALUE_STROKE_NORMALIZE makes a visible
difference between the closed source implementation and OpenJDK:
when drawing anti-aliased horizontal or vertical lines of width 1, Pisces
draws a 2 pixel wide line with half intensity (because integer coordinates
are between pixels). Sun's jdk with VALUE_STROKE_NORMALIZE turned on draws
a 1 pixel line with full intensity. This could be achieved by just
checking for normalization and rounding

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-07-07 Thread Jim Graham
The first part means that if the scale is uniform in X and Y 
(AffineTransform has some logic to determine this property in its 
getType() method) then we can use X11 to do line widening by just giving 
it a scaled line width.  Also, X11 is limited to integer line widths so 
we would only want to do this if the SPEED hint is specified (not 
QUALITY) or if the scaled line width was close to an integer.


If we are going to use a software rasterizer to widen the line and then 
send over spans to render, it may be faster to just give X11 the 
original path and a scaled line width and ask it to widen the line. 
Even if it uses a software renderer the reduction in protocol traffic is 
a win, and their rasterizer is probably optimized for integer polygons 
and may likely be faster than our more general curve-handling code.


But, I would rank this low on optimizations at this point...

...jim

Clemens Eisserer wrote:

Hi Denis,


In sun.java2d.x11.X11Renderer, line 340, it says:
   // REMIND: X11 can handle uniform scaled wide lines
   // and dashed lines itself if we set the appropriate
   // XGC attributes (TBD).


Also, it is a known issue that Pisces does not support the STROKE_CONTROL
hint.

I have been wanting to implement these two features, and I have a few questions:
Has anything been decided on the first issue? Do we still want to implement it?
If yes, can anyone give me some rough suggestions as to how I can get started?


It's just my personal opinion, but I would recommend not implementing it.
Xorg falls back to software anyway for anything more complex than
solid rectangles and blits, and those code-paths will only be triggered
for non-antialiased rendering with solid colors.

Implementing it in Pisces would help every backend OpenJDK supports :)

Just checked and I also ignore the STROKE_CONTROL stuff completely in
the cairo based Jules rasterizer.
Curious how that could be mapped to Cairo - do you know of any more
in-depth explanation of how it works, or examples of how it should
look?

Thanks, Clemens


Re: [OpenJDK 2D-Dev] Fix for uniformly scaled dashed lines.

2010-07-07 Thread Jim Graham
This looks good, but don't assert that the transform is non-singular. 
Transforms can frequently be singular and that isn't an exception.


Actually, a singular transform simply means that the entire coordinate 
space has collapsed to a line or a point and so nothing need be 
rendered.  If there is a way to shortcut the rendering to a NOP that 
would be the right reaction to a singular transform...
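
A sketch of that shortcut (where exactly it would live in the pipeline
is left open):

    // A singular transform collapses the whole coordinate space onto a
    // line or a point, so the result contributes no coverage at all.
    static boolean nothingToRender(java.awt.geom.AffineTransform at) {
        return at.getDeterminant() == 0.0;   // degenerate: NOP the render
    }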


...jim

Denis Lila wrote:

Hello Jim.

Here is a webrev for the patch: 
http://icedtea.classpath.org/~dlila/webrevs/dasherFix/webrev/

I have eliminated most of the setup code, like you suggested.

This is both more efficient, and far clearer than the original code
(and especially my version of the patch). It is also correct in cases
where the transform is [[n,0],[0,n]].


I still have my version of the patch (the one with dashesToCompute),
and as I mentioned in another e-mail, the allocation of the array
can be eliminated, since it can be turned into a field. So it should
have better performance in pretty much all cases.
If you would like me to send a webrev for that too, e-mail me.

Thank you,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

One request on your code - please don't use the variable name lowercase 
'L'.  On my screen with Courier font I cannot distinguish between the 
numeral 1 and the lowercase L character and so I cannot verify if your 
code is correct.

Also, by "inner loop" I meant the single loop.  I use the term to mean 
the loop that does all the work at the innermost level without regard 
to whether the case contains only 1 loop and is therefore a degenerate 
application of the term.

My comment about the major axis stuff was an optimization that is no 
longer needed.  I thought I saw calls to hypot() in the inner loop, but 
I just noticed that those were in deleted code and the new code has no 
such calls, so you can ignore it.  If it makes the comment clearer, 
"major axis" is either the X or Y axis depending on whether the line is 
more horizontal or vertical, but you can ignore it now.


I will note that the 2 arrays you compute are simply scaled versions of 
the dash array and so we could eliminate the extra allocations by simply 
computing the values inside the loop at the cost of a multiply per dash 
segment to offset the cost of an allocation per incoming line segment. 
Also, you would no longer need to compute the dashesToCompute value and 
the setup code would be much, much simpler (basically you just need to 
compute the untransformed length and the cx and cy values and then jump 
into the loop).

I'm leaning towards the multiplies in the loop to greatly simplify the 
code...

(One last comment - have you checked what happens with the code in the 
presence of a degenerate transform?  A non-invertible transform may run 
the risk of an infinite loop if you assume that you can reverse compute 
the line length and end up with a finite value...)


...jim

Denis Lila wrote:

Hello Jim.

Thank you for your reply. It seems my code did not fully take into
account your second point after all.
The dx's of the transformed dashes are di*newx/||(x,y)|| (where
di is the untransformed dash length, newx is the transformed x
coordinate, and ||(x,y)|| is the untransformed line length). Obviously,
newx/||(x,y)|| is constant for all dash segments, so it can be computed
outside of the loop, but I was computing t=di/||(x,y)|| inside the loop,
and then t*newx also inside the loop.

I have fixed this and I included an improved version of the patch.

However, I do not understand the second part of your e-mail
("One more optimization ..."). I am not sure what you mean by
"major axis", how one would loop along it, and what the "inner loop"
is. There are no nested loops in this method.

Also, the computation of the dxi and dyi of the transformed dash
segment dash[i] involves just 1 multiplication and 1 bit shift (along
with an overhead of 2 divisions and 2 bit shifts).
The computation of the actual endpoint of the dashes (done in the
while(true) loop) most of the time involves just 2 additions.
I am not sure how this can be made any simpler.

Thank you,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

Here are my thoughts on it:

- Lines are affinely transformed into lines.  The slope may be
different before and after the transform, but both have a single slope.

- The ratio of a line length to its transformed line length is a scale
factor that depends solely on the angle of the line.  Thus, for
determining dashing you can simply compute this scale factor once for a
given line and then that single scale factor can be applied to every
dash segment.

It appears that your setup code takes these factors into account,
though I haven't done a grueling line by line analysis as to whether
you got the math right.

One more optimization is that once you know the angle of the line then
you have a factor for how the length of a segment

Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-13 Thread Jim Graham

Hi Denis,

float operations are pretty fast and they are usually done in a separate 
part of the processor so the compiler can schedule a lot of bookkeeping 
instructions to run in parallel with waiting for the results of the FP 
instruction.  In the end, they often end up being free if there are 
enough bookkeeping instructions (branches, fetching data, etc.) in 
amongst the data.


Given how many alternate instructions you are pointing out for the fixed 
point code it would be very likely that this would be a win.


The main reason that Pisces is implemented heavily in fixed point is 
that it was originally written for the mobile platform where there are 
no FP instructions.  We don't have to worry about that on the desktop 
(any more).


I strongly support you converting things to fp if you want to do it...

...jim

On 7/12/2010 8:05 AM, Denis Lila wrote:

Hello.

Is it ok if I do this? I already started working on it on Friday.
I think I should be done by tomorrow.

But yes, I agree that we should convert to floating point. As for
performance, there's also the fact that right now we're trading
one double multiplication for 2 casts to long, 1 long multiplication,
1 bit shift of a long, and 1 cast back to an int. I don't know much
about how these are done in hardware, but it doesn't seem like they'd
be faster than the double multiplication.

As for large coordinates, there's a bug report about it (one not
reported by me :) ) here: https://bugzilla.redhat.com/show_bug.cgi?id=597227
I submitted a matching bug report on bugs.sun.com (ID 6967436), but I
can't find it when I search for it.

Denis.

- Jim Graham james.gra...@oracle.com wrote:


Sigh...

Pisces could really stand to upgrade to floats/doubles everywhere, for

several reasons:

- FPU ops are likely as fast if not faster on modern hardware due to
parallel execution of FP instructions alongside regular instructions.

- Don't have to worry about getting coordinates larger than 32K (I
don't think this is well tested, but I imagine that Pisces will not
deal with it very gracefully).

- Lots of code is greatly simplified not having to deal with the
semantics of how to do fixed point multiplies, divides, and
conversions.

I meant to do this during the original integration, but I wanted to get
the task done as quickly as possible so that we could have an open
source alternative to the closed Ductus code so I left that task for a
later update.  But, now that a lot of work is being done on the code to
fix it up, it might be better to do the float conversion now so that
the maintenance is easier and before we end up creating a lot of new
fixed point code.

My plate is full right now, but hopefully I can interest someone else
in doing a cleanup cycle?  (Donning my Tom Sawyer hat... ;-)

...jim


Re: [OpenJDK 2D-Dev] Fix for drawing round endcaps on scaled lines.

2010-07-13 Thread Jim Graham

Hi Denis,

In looking through this code, I can see where it emits the proper join 
for ccw turns, but it looks like it just emits a round join for cw 
turns.  Is that true?  Doesn't this mean that a cw path will always have 
round joins?  Or am I missing something?


My expectation was that the code would emit the proper joins for both 
ccw and cw turns, but it would either place it in the outgoing or the 
return path depending on whether the turn was ccw, no?  I don't see how 
that happens (but I haven't actually tested the code)...


...jim

Denis Lila wrote:

Hello Jim.

I'm terribly sorry about that.
I fixed it. Now the only highlighted lines in the webrev should be
lines I actually changed.
Please take another look at it.
(link is still the same: 
http://icedtea.classpath.org/~dlila/webrevs/bezierRoundJoins/webrev/)

Thanks,
Denis.

PS: I added a new method: emitReverse, that goes through the reverse list
and emits it. It's used in 2 methods which used to just do it themselves.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

You moved some code around without modifying it.  This makes it hard 
to see what changed and what didn't and verify that the changes are 
accurate.  Also, it adds diffs to the file that are unrelated to the 
fixing of the bug.  Was there a practical reason for why you moved the 
code around?  In particular I refer to:

- the setup code that deals with if (joinStyle == JOIN_MITER)

- the setOutput method - which not only moved, but lost its javadocs

- computeMiter() and drawMiter()

All of that code appears to be unchanged at first glance, but it is 
hard to tell from the webrevs.  Also, a couple of stylistic issues:

- you changed the declarations of isCCW which moved its arguments over, 
but the continuation lines aren't indented to match

- The two flavors of emitCurveTo() should probably be next to each 
other (i.e. not have emitLineTo() between them or fall between the two 
flavors of emitLineTo()).

In general I think the changes are OK, but I'm still reviewing them and 
the above issues sprung to mind on a first pass (and/or they are 
complicating the contextual review) so I thought I'd mention them 
earlier than later...


...jim

Denis Lila wrote:

Hello.

I just noticed that approximation of circle arcs by bezier curves
had already been implemented in ArcIterator by Jim Graham. 


It computes the same control points as my solution, but it does so
more straightforwardly, without any rotation, so it is faster and
clearer. I have updated my solution to include this.
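
For reference, the standard construction places the control points at
distance k = 4/3 * tan(t/4) along the tangents; here is a sketch for a
unit-radius arc of t radians starting at angle a (a hypothetical
helper, not the ArcIterator code itself):

    static double[] arcToCubic(double a, double t) {
        double k = 4.0 / 3.0 * Math.tan(t / 4);
        double c0 = Math.cos(a),     s0 = Math.sin(a);
        double c1 = Math.cos(a + t), s1 = Math.sin(a + t);
        return new double[] {
            c0 - k * s0, s0 + k * c0,   // P1 = P0 + k * tangent at P0
            c1 + k * s1, s1 - k * c1,   // P2 = P3 - k * tangent at P3
            c1,          s1             // P3  (P0 is (c0, s0))
        };
    }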

The link remains the same.

Thanks,
Denis.

- Denis Lila dl...@redhat.com wrote:


Hello.

I think I got this working. The webrev is at:


http://icedtea.classpath.org/~dlila/webrevs/bezierRoundJoins/webrev/

(NOTE: this is not a final version. I have included 2 versions
of 2 methods. Only one set should be kept. See below for more.)

My Changes:
---
1.
I've made LineSink into an interface, rather than an abstract class,
because all of its methods were abstract, so it makes more sense this
way.

2.
I've introduced a new interface that extends LineSink called PathSink,
which allows the curveTo method, so there have been no changes to
Stroker's public interface. When someone wants to create a Stroker
with a PathSink output, it simply passes its constructor a PathSink,
so the only changes outside of Stroker are in PiscesRenderingEngine,
where the methods that handle Path2D and PathConsumer2D objects
create nameless PathSinks instead of nameless LineSinks.

3. In Stroker:
I've introduced a method called drawBezRoundJoin, analogous to
computeRoundJoin. In drawRoundJoin I detect whether the output is
a PathSink. If it is, I call drawBezRoundJoin, otherwise everything
proceeds as it used to. drawBezRoundJoin uses computeBezierPoints to
compute the control points. computeBezierPoints computes the control
points for an arc of t radians, starting at angle a, with radius r,
by computing the control points of an arc of radius 1 of t radians
that starts at angle -t/2. This is done by solving the equations
resulting from the constraints that (P3-P2) and (P1-P0) must be
parallel to the arc's tangents at P3 and P0 respectively, and that
B(1/2)=(1,0). Then the points are scaled by r, and rotated counter
clockwise by a+t/2. Then drawBezRoundJoin emits the curve.
All this is done in a loop which is used to break up large arcs into
more than one bezier curve. Through the iterations, the computed
control points don't change - the only thing that changes is how
they're rotated.
So a good alternative approach would be to do the rotation outside of
computeBezierPoints, and call computeBezierPoints once outside of the
loop, so that the control points aren't recomputed unnecessarily.
I have included code for this in the methods computeBezierPoints2 and
drawBezRoundJoin2. This is my favoured approach, since

Re: [OpenJDK 2D-Dev] Fix for drawing round endcaps on scaled lines.

2010-07-13 Thread Jim Graham
Sorry, ignore this message.  I was misinterpreting the field 
internalJoins to mean joins on the inside of the path (!ccw).  It 
actually means implicit joins in the middle of a flattened curve and I 
can see why those are always done round, so the code makes sense now 
(naming of the variable notwithstanding)...


...jim

Jim Graham wrote:

Hi Denis,

In looking through this code, I can see where it emits the proper join 
for ccw turns, but it looks like it just emits a round join for cw 
turns.  Is that true?  Doesn't this mean that a cw path will always have 
round joins?  Or am I missing something?


My expectation was that the code would emit the proper joins for both 
ccw and cw turns, but it would either place it in the outgoing or the 
return path depending on whether the turn was ccw, no?  I don't see how 
that happens (but I haven't actually tested the code)...


...jim

Denis Lila wrote:

Hello Jim.

I'm terribly sorry about that.
I fixed it. Now the only highlighted lines in the webrev should be
lines I actually changed.
Please take another look at it.
(link is still the same: 
http://icedtea.classpath.org/~dlila/webrevs/bezierRoundJoins/webrev/)


Thanks,
Denis.

PS: I added a new method: emitReverse, that goes through the reverse list
and emits it. It's used in 2 methods which used to just do it themselves.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

You moved some code around without modifying it.  This makes it hard
to see what changed and what didn't and verify that the changes are 
accurate.  Also, it adds diffs to the file that are unrelated to the 
fixing of the bug.  Was there a practical reason for why you moved the 
code around?  In particular I refer to:

- the setup code that deals with if (joinStyle == JOIN_MITER)

- the setOutput method - which not only moved, but lost its javadocs

- computeMiter() and drawMiter()

All of that code appears to be unchanged at first glance, but it is 
hard to tell from the webrevs.  Also, a couple of stylistic issues:

- you changed the declarations of isCCW which moved its arguments 
over, but the continuation lines aren't indented to match

- The two flavors of emitCurveTo() should probably be next to each 
other (i.e. not have emitLineTo() between them or fall between the two 
flavors of emitLineTo()).

In general I think the changes are OK, but I'm still reviewing them 
and the above issues sprung to mind on a first pass (and/or they are 
complicating the contextual review) so I thought I'd mention them 
earlier than later...


...jim

Denis Lila wrote:

Hello.

I just noticed that approximation of circle arcs by bezier curves
had already been implemented in ArcIterator by Jim Graham.
It computes the same control points as my solution, but it does so
more straightforwardly, without any rotation, so it is faster and
clearer. I have updated my solution to include this.

The link remains the same.

Thanks,
Denis.

- Denis Lila dl...@redhat.com wrote:


Hello.

I think I got this working. The webrev is at:


http://icedtea.classpath.org/~dlila/webrevs/bezierRoundJoins/webrev/

(NOTE: this is not a final version. I have included 2 versions
of 2 methods. Only one set should be kept. See below for more.)

My Changes:
---
1.
I've made LineSink into an interface, rather than an abstract class,
because all of its methods were abstract, so it makes more sense this
way.

2.
I've introduced a new interface that extends LineSink called PathSink,
which allows the curveTo method, so there have been no changes to
Stroker's public interface. When someone wants to create a Stroker
with a PathSink output, it simply passes its constructor a PathSink,
so the only changes outside of Stroker are in PiscesRenderingEngine,
where the methods that handle Path2D and PathConsumer2D objects
create nameless PathSinks instead of nameless LineSinks.

3. In Stroker:
I've introduced a method called drawBezRoundJoin, analogous to
computeRoundJoin. In drawRoundJoin I detect whether the output is
a PathSink. If it is, I call drawBezRoundJoin, otherwise everything
proceeds as it used to. drawBezRoundJoin uses computeBezierPoints to
compute the control points. computeBezierPoints computes the control
points for an arc of t radians, starting at angle a, with radius r,
by computing the control points of an arc of radius 1 of t radians
that starts at angle -t/2. This is done by solving the equations
resulting from the constraints that (P3-P2) and (P1-P0) must be
parallel to the arc's tangents at P3 and P0 respectively, and that
B(1/2)=(1,0). Then the points are scaled by r, and rotated counter
clockwise by a+t/2. Then drawBezRoundJoin emits the curve.
All this is done in a loop which is used to break up large arcs into
more than one bezier curve. Through the iterations, the computed
control points don't change - the only thing that changes is how
they're rotated.
So

Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-20 Thread Jim Graham
 of these changes 
below, but in the end, 5 or 6 methods would need their math updated to 
make this work right and I'm not sure if I've described it well enough 
for you to make all the needed changes.  Perhaps this could be a 
follow-on project, but it would greatly simplify a lot of the code in here.

 End can of worms 

216 - if you save the sx0,sy0 before subpix'ing them then you don't have 
to unsubpix them here.  (Though you still need the subpix sy0 for line 
209 - or you could just call subpix on it for that line and at least 
you'd save 1 call to sub/unsub).


236,237 - here is what I was talking about above (can of worms).  minY 
and maxY should probably both be ceil'd and then maxY would be the first 
scanline not processed and then the problem above about double 
crossings at the scanlines wouldn't be an issue.


243 - minY >= maxY if the above is done (can of worms).

262 - y < maxY if the above is done (can of worms).

256,264 - casting to int is problematic.  It truncates towards 0 which 
means negatives are ceil'd and positives are floor'd.  It would be best 
to use floor here instead.  On the other hand, if negative numbers are 
always off the left side of the drawable then this is moot.


317,318 - blech (can of worms) - looks like more math designed for the 
minY up to and including maxY approach - I far prefer the minY up to, 
but not including maxY approach as mentioned above.  ;-) 
_endRendering(), setCrossingsExtents, computeCrossingsForEdge (already 
mentioned above), computeBounds, and renderStrip() would all need to be 
updated for that.  How comfortable do you feel with that conversion?


597 - you deleted the call to arraycopy?  D'oh!  (Actually, 
Arrays.newSize() is faster than allocate-and-copy so it couldn't hurt 
to switch to it here.)  Also, expanding by 10% seems a little 
conservative for large edge lists.  I'd rather see double, but never 
more than 10*INIT_EDGES or something like that.  This may be more 
"limited memory on mobile" thinking here.
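
A sketch of that growth policy (INIT_EDGES is hypothetical here):

    static final int INIT_EDGES = 1024;

    static float[] grow(float[] edges, int needed) {
        int cap = edges.length;
        while (cap < needed) {
            // double, but never grow by more than 10*INIT_EDGES at once
            cap = Math.min(cap * 2, cap + 10 * INIT_EDGES);
        }
        return java.util.Arrays.copyOf(edges, cap);
    }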


611 - (can of worms) ceil = ceil if the above scanline crossing 
change is done.  OK, enough for now on harping on the can of worms.


721 - Arrays.sort()

Stroker.java

- I didn't get to this file yet.  I'm going to send these comments along 
so you can take a look at them and then review Stroker tomorrow...


...jim

Denis Lila wrote:

Hello again.

I know I promised this last week, and I'm sorry it's late, but some nasty 
bugs popped up. The webrev is at: 
http://icedtea.classpath.org/~dlila/webrevs/floatingPoint/webrev/


I took this opportunity to make some changes that aren't related to floating
point conversion, since this effort wasn't really directed at solving any one
particular bug, but towards general refactoring and improving of Pisces 
(although
it does solve a certain bug).
1. 
I removed some write-only variables in Renderer and Stroker.


2. 
I removed Dasher and Stroker's ability for the same object to be used with

more than one output, transform, width, etc. Jim Graham said we should consider
removing the no argument constructor for Dasher in a previous e-mail (which
is equivalent to doing what I've done since you can't have this functionality
without a no argument constructor and methods like setParameters and setOutput).
I thought this was a good idea because this functionality was never being used,
it clutters things up (among other things, it necessitates making the clients 
call setOutput and setParameters before the object can be used. This sort of

thing is not a good idea since methods should be as self-contained as possible),
and even if it is used, all it does is save us the creation of some objects. 
Since
a huge number of these objects would never be needed and since they are not 
very expensive to create (Stroker is the biggest, and it has about 38 members), 
this is a premature optimization.


3.
(2) allowed me to declare a whole bunch of members final (things like 
output,
lineWidth, transformation matrix entries and anything that can't change).

4.
I changed the crossing sorting algorithm in Renderer from bubble sort to
insertion sort. Not a huge difference, but mine is shorter, it should perform
slightly better, and I think the algorithm is easier to understand (a sketch
follows below this list).

5.
I inserted comments on things which used to confuse me. Please check these -
when reading code, there is nothing worse than a wrong explanation in a comment.

6.
I removed the if(false ...) block in Renderer. I tried to make it work some
time ago, but it was complicated and more trouble than it was worth - it would
only be used when filling a rectangle, and not even a rectangle that was part of
some path. The rectangle had to be the only thing being rendered. This is fast
enough without being treated as a special case.
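
A sketch of the insertion sort over one scanline's crossings mentioned
in point 4 above (array name and bounds are illustrative):

    static void sortCrossings(int[] cr, int from, int to) {
        for (int i = from + 1; i < to; i++) {
            int v = cr[i];
            int j = i - 1;
            while (j >= from && cr[j] > v) {
                cr[j + 1] = cr[j];   // shift larger entries right
                j--;
            }
            cr[j + 1] = v;
        }
    }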

I think that's it...
As for testing: I tested this fairly thoroughly. I used dashed strokes, various 
affine transformations, lines longer than 2^16

Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-21 Thread Jim Graham

Denis Lila wrote:

Hello Jim.

Thank you very much for taking the time to read through this.

169 - if origLen reaches the end of the dash exactly (the == case)

You're right, I should. I can't just replace the '<' with '<=' though,
because the results will be the same: in the equal case origLen will
become 0, and on the next iteration, the (origLen < dash[idx]-phase)
check will be true, and we will do a goTo(x1,y1), which is what we just
did in the previous iteration (unless dash[idx] is 0, in which case the
results will be even worse). The best solution to this is to just do a
nested check for the == case.


Ah, right - because there is no break when origLen becomes zero. 
Sounds like you're on it.


171 - Aren't x0,y0 stored as subpix values?  You would then be comparing 
a subpix value to a non-subpix value.  Perhaps if the subpix calls are 
moved to the top of the function I think this should work OK?


That's true, they are. This is very puzzling. If a horizontal line is
added, when the crossings for it are being computed, dxBydy should be NaN, and
wouldn't an error be thrown when we try to cast to an int in the call to 
addCrossing?


I'm not sure - I didn't trace it through very far - I just noted that 
the values were likely in different resolutions.



194,197 - Shouldn't these be constants, or based on the SUB_POS_XY?


I suppose I should make a biasing constant. I don't think they should be 
based
on SUB_POS_XY though, because the biasing is done to subpixel coordinates so
there is no danger that if our supersampling is fine enough the biasing will 
make the coordinates jump over some scan line.


I'm guessing you punted on my can of worms suggestion then.  ;-)

216 - if you save the sx0,sy0 before subpix'ing them then you don't have 
to unsubpix them here.  (Though you still need the subpix sy0 for line 
209 - or you could just call subpix on it for that line and at least

you'd save 1 call to sub/unsub).


Ok. I'll just save subpixed and unsubpixed versions of sx0, sy0. That should
eliminate all sx0,sy0 related calls to tosubpix and topix except in moveTo.


and lineTo.  You may only need 3 of those values, though, if I remember 
my code reading well enough.


256,264 - casting to int is problematic.  It truncates towards 0 which 
means negatives are ceil'd and positives are floor'd.  It would be best 
to use floor here instead.  On the other hand, if negative numbers are 
always off the left side of the drawable then this is moot.


That's why I left it at int casting. Do you still think I should change it
to floor?


If you mean floor, I think it best to use floor.  Unless you can prove 
that negatives aren't really an issue and that the strange truncation on 
them won't be numerically a problem - but I don't think it is worth it 
for this code.


Speaking of which, is there a good way to edit and build openJDK from eclipse? 
Then this sort of embarrassing error could be avoided (including the printStats() call).


I don't use Eclipse, sorry.  :-(

As for Arrays.newSize()... I can't find it here: 
http://download.oracle.com/docs/cd/E17409_01/javase/6/docs/api/java/util/Arrays.html

Is this a new function added in 7?


Sorry, make that Arrays.copyOf(..., newSize).  I tried to type the name 
from memory and got it wrong.



721 - Arrays.sort()


I thought about using this, but I did some measurements, and it turns out that 
Arrays.sort() is a bit slower if the portion of the array being sorted has fewer

than about 70 elements.


I wonder what the typical number of elements is.  Is this sorting 
crossings per line?  Then simple primitives like circles should only 
have 2 per line, right?  Is it worth testing for really small numbers of 
elements (much lower than 70) and doing a manual sort?  Or am I 
misunderstanding what is being sorted there?


How comfortable do you feel with that conversion? 


I'll try to do it and include it in a new patch along with, hopefully, a better 
way
to iterate through strips, and especially crossings. Right now all the iteration
state is saved in global variables. This is... not good. I spent far too much 
time last
week on bugs caused by this sort of thing. Ideally, any members that can't be put at 
the top of a class (like our edge and crossing data) should be put in classes of their own.


That sounds good, but also consider doing it in separate stages to 
reduce churn in the code reviewing (and you then have revs to go back 
and test which stage caused a problem if we find a bug later):


- first get all on floats
- then change strip management
- then change to open coordinate intervals
- (or vice versa)


Do you have any ideas about how to iterate through edges in a strip without 
going through
every edge? I was thinking of maybe using some sort of tree to split the 
drawing surface,
but I haven't given it much thought.


If you look for something like the native code for 
sun/java2d/pipe/ShapeSpanIterator.c you will see the way I 

Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-27 Thread Jim Graham

Hi Denis,

I'll try to get through both versions and see if I can find anything 
that was hurting performance with your EdgeLists.  I'm guessing that 
this version was created because of the performance issues you found 
with the EdgeList version?  Does this perform more closely to the 
existing code than the EdgeList version?


...jim

Denis Lila wrote:

Hello again.

This attachment is a can of worms implementation without all the fancy (and slow)
iteration. It also includes all of the other suggestions you sent in your first
review of Dasher and Renderer last week (most importantly, the firstOrientation
issue, horizontal lines filtering, and adding prefixes to variable names to make
it clear whether they refer to pixels, or subpixels).

Regards,
Denis.

- Denis Lila dl...@redhat.com wrote:


Hello Jim.

I implemented your can of worms idea. It works, and it got rid of the
biasing.
I wasn't able to send a webrev, but there are many changes and a side
by side comparison would probably be useless, so I just attached the
file. I hope this is ok.

I also implemented a better iterating structure for the lines and the
strips and crossings. I think it is better in every way, except
performance.
The new file is more than 200 lines smaller than the old one. The only
members of Renderer are now the AA variables and the position variables
(sx*, sy*, x*, y*).
What I've done is I added an EdgeList class, which encapsulates all the
edge related variables in the old Renderer. At first, I had an Edge
class in addition to the EdgeList class, and while this was much nicer,
it turned out to be too expensive (see last paragraph).
I've also added a ScanLineIterator, so instead of _endRendering
iterating through strips, and then calling renderStrip() which iterates
through the scanlines in that strip, and then through the crossings in
that scanline, what happens now is that _endRendering uses the
Iterator<ScanLine> to iterate through each scanline, gets its crossings
and iterates through them to accumulate the alpha. By the way, a
ScanLine is a type defined by an interface which exports methods for
getting the y coord of the line, the number of crossings in it, the ith
crossing, and a method for sorting its crossings.
The class that implements ScanLine is ScanLineIterator itself. I made a
ScanLine class, but I was afraid performance would suffer because of all
the object creations (this turned out not to be an issue, after I
switched to the current way, and remeasured things). I did not switch
back because this is only slightly worse.
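
A sketch of the ScanLine interface as described (method names assumed):

    interface ScanLine {
        int y();                 // y coordinate of this scanline
        int crossingCount();     // number of crossings on it
        int crossing(int i);     // the i-th crossing
        void sortCrossings();    // order the crossings by x
    }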

As for performance: I wrote a simple program that tries to draw a
dashed path that consists of about 160 dashed lines of width 1 and
length 3, going from the centre of the frame to some point. On my
machine, this takes about 4.9 seconds in openjdk6, and 26 seconds using
the attached file. Back when I was using the Edge class it took about
39 seconds. Everything without hundreds of thousands of edges is not
much slower.
I have not changed any of the algorithms. ScanLineIterator still goes
through strips of the same size and computes crossings in every strip
using the same method as before, so I don't know why it's so slow. It
can't be because of anything happening in _endRendering, because there
are only about 9000 scanlines and for each of them I've just added a
few calls to one-line getters (which used to be direct accesses into
arrays).

Thanks,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Denis Lila wrote:

Hello Jim.

Thank you very much for taking the time to read through this.


169 - if origLen reaches the end of the dash exactly (the == case)

You're right, I should. I can't just replace <= with == though,
because the results will be the same: in the equal case origLen will
become 0, and on the next iteration, the (origLen < dash[idx]-phase)
will be true, and we will do a goTo(x1,y1), which is what we just did
in the previous iteration (unless dash[idx] is 0, in which case the
results will be even worse). The best solution to this is to just do a
nested check for the == case.

Ah, right - because there is no break when origLen becomes zero.
Sounds like you're on it.
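
(For illustration only, here is one possible shape of that nested check,
sketched as a method of a Dasher-like class. The field and method names
below - dash, idx, phase, dashOn, goTo, leftInThisDashSegment - are
assumptions borrowed from the discussion, not the actual Pisces code.)

    // Hypothetical sketch, NOT the real Pisces Dasher. Assumed fields:
    //   float[] dash; int idx; float phase; boolean dashOn;
    //   void goTo(float x, float y) { ... emit a dash endpoint ... }
    boolean spanEndsInCurrentDash(float origLen, float x1, float y1) {
        float leftInThisDashSegment = dash[idx] - phase;
        if (origLen > leftInThisDashSegment) {
            return false;  // caller advances to the dash boundary and loops
        }
        goTo(x1, y1);      // the span ends inside the current dash segment
        if (origLen == leftInThisDashSegment) {
            // reached the end of the dash exactly: the index (and on/off
            // state) must advance here too, or the next span misbehaves
            phase = 0.0f;
            idx = (idx + 1) % dash.length;
            dashOn = !dashOn;
        } else {
            phase += origLen;  // dash segment only partially consumed
        }
        return true;
    }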


171 - Aren't x0,y0 stored as subpix values?  You would then be comparing
a subpix value to a non-subpix value.  Perhaps if the subpix calls are
moved to the top of the function I think this should work OK?

That's true, they are. This is very puzzling. If a horizontal line is
added, when the crossings for it are being computed, dxBydy should be
NaN, and wouldn't an error be thrown when we try to cast to an int in
the call to addCrossing?

I'm not sure - I didn't trace it through very far - I just noted that
the values were likely in different resolutions.


194,197 - Shouldn't these be constants, or based on the SUB_POS_XY?

I suppose I should make a biasing constant. I don't think they should
be based on SUB_POS_XY though, because

Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-28 Thread Jim Graham

Woohoo, Denis!  I look forward to seeing the new version!

...jim

On 7/28/2010 5:51 AM, Denis Lila wrote:

Hello Jim.

This one performs almost identically to what is already there
in openjdk6 and 7, since it's exactly what I sent for review
last week, but with all the changes you suggested implemented.

I would actually like to ask you to not look at either one of
them. First of all, there is an ArrayIndexOutOfBoundsException
possible in emitRow.
And secondly, even if there wasn't, last night I implemented
your algorithm from ShapeSpanIterator.c to iterate through
the edges. I have yet to debug it, but it makes everything much,
much simpler, and it should make it far faster, so we get the best
of both worlds.

Thanks,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Hi Denis,

I'll try to get through both versions and see if I can find anything
that was hurting performance with your EdgeLists.  I'm guessing that
this version was created because of the performance issues you found
with the EdgeList version?  Does this perform more closely to the
existing code than the EdgeList version?

...jim

Denis Lila wrote:

Hello again.

This attachmet is a can of worms implementation without all the

fancy (and slow)

iteration. It also includes all of the other suggestions you sent in

your first

review of Dasher and Renderer last week (most importantly, the

firstOrientation

issue, horizontal lines filtering, and adding prefixes to variable

names to make

it clear whether they refer to pixels, or subpixels).

Regards,
Denis.

- Denis Liladl...@redhat.com  wrote:


Hello Jim.

I implemented your can of worms idea. It works, and it got rid

of

the biasing.
I wasn't able to send a webrev, but there are many changes and a

side

by side
comparison would probably be useless, so I just attached the file.

I

hope this is
ok.

I also implemented a better iterating structure for the lines

and

the
strips and crossings. I think it is better in every way, except
performance.
The new file is more than 200 lines smaller than the old one. The

only

members of Renderer are now the AA variables and the position
variables
(sx*, sy*, x*, y*).
What I've done is I added an EdgeList class, which encapsulates

all

the edge
related variables in the old Renderer. At first, I had an Edge

class

in addition
to the EdgeList class, and while this was much nicer, it turned out

to

be too
expensive (see last paragraph).
I've also added a ScanLineIterator, so instead of _endRendering
iterating
through strips, and then calling renderStrip() which iterates

through

the
scanlines in that strip, and then through the crossings in that
scanline,
what happens now is that _endRendering uses the IteratorScanLine

to

iterate through each scanline, get get its crossings and iterate
through them
to accumulate the alpha. By the way, a ScanLine is a type defined

by

an
interface which exports methods for getting the y coord of the

line,

the
number of crossings in it, the ith crossing, and a method for

sorting

its crossings.
The class that implements ScanLine is ScanLineIterator itself. I

made

a
ScanLine class, but I was afraid performance would suffer because

of

all the
object creations (this turned out not to be an issue, after I

switched

to the
current way, and remeasured things). I did not switch back because
this is
only slightly worse.

As for performance: I wrote a simple program that tries to draw a
dashed path
that consists of about 160 dashed lines of width 1 and length

3,

going
from the centre of the frame to some point. On my machine, this

takes

about 4.9
seconds in openjdk6, and 26 seconds using the attached file. Back

when

I was using
the Edge class it took about 39 seconds. Everything without hundres

of

thousands of
edges is not much slower
I have not changed any of the algorithms. ScanLineIterator still

goes

through
strips of the same size and computes crossings in every strip

using

the same
method as before, so I don't know why it's so slow. It can't be
because of anything
happening in _endRendering, because there are only about 9000
scanlines and for each
of them I've just added a few calls to one line getters (which used

to

be direct
accesses into arrays).

Thanks,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


Denis Lila wrote:

Hello Jim.

Thank you very much for taking the time to read through this.


169 - if origLen reaches the end of the dash exactly (the ==

case)

 You're right, I should. I can't just replace= with ==

though,

because the results will be the same: in the equal case origLen

will

become 0, and on the next iteration, the (origLen

dash[idx]-phase)

will

be true, and we will do a goTo(x1,y1), which is what we just did

in

the

previous iteration (unless dash[idx] is 0, in which case the

results

will be even worse). The best solution to this is to just do a

nested

check for the == case.

Ah, right - 

Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-28 Thread Jim Graham

Hi Denis,

Only some minor comments so far:

Stroker.java:

- Should det be precomputed and saved?  (You calculate it in the 
constructor anyway, but don't save it.)

- Should test for uniform scale be precomputed?
- (Is test for uniform scale too strict?  Can a rotated uniform scale 
use the same code as upright uniform scale?)
- Why are m00_2_m01_2 et al no longer precomputed (they only need to be 
precomputed if scale is not uniform which isn't common)?
- lineLength method is user-space length, isn't it?  A more 
descriptive name might help avoid confusion if someone is modifying code 
here later.

- line 614 - missing space

Dasher.java:

- Line 187 - use leftInThisDashSegment here?

I still have to look at Renderer.java in depth, but I thought I'd send 
these minor comments along while they are fresh in my email buffer...


...jim

Denis Lila wrote:

Hello.

And, here it is: http://icedtea.classpath.org/~dlila/webrevs/fpBetterAA/webrev/

Just as I thought, it's way faster than the original. I made a new
test, which is supposed to be more realistic than drawing length-3
lines. It consists of splitting a 1000x1000 frame in 100 10x10 squares
and in each square drawing 20 or so curves with random start/end/control
points in that square. In this case it's at least twice as fast (~0.85
versus ~1.95 seconds).

Unfortunately I've had to do a lot of the same optimizations that I was
trying to remove (i.e. making everything a global variable). I tried
writing a higher level version of it, but it completely negated the
performance gains of the new algorithm. Anyway, it's still not as bad
as before because the algorithm is inherently clearer and we only iterate
through scan lines instead of iterating through strips and then scanlines
in that strip, and then edges, all in the same method.

It's also better organized, and logically separate parts of the code don't
really touch each other's variables much. And it's definitely better
commented. 


I also made some minor changes outside of Renderer that did not appear in
the first version of this patch last week:
1. I fixed the issue Jim pointed out in Dasher, where if 
(origLen == leftInThisDashSegment) the dash index was not being incremented.

2. In dasher, I no longer copy the dash array.
3. I removed files PiscesMath.java and Transform4.java because they are no
longer needed.

Thanks,
Denis.


Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-29 Thread Jim Graham

Hi Denis,

The changes look fine and I'm moving on to Renderer.java...

I'd make as many methods private as you can.  It looks like all your new 
methods are private, but some old methods are still public even though I 
don't think they are accessed elsewhere (like addEdge()?).  I think 
private is enough for Hotspot to inline so that should help 
performance a bit.  I usually use final, but I think private does 
the same thing.


You should create constants for the indices into the struct to make the 
code more readable (and to simplify updating the EDGE layout):


X0_OFF = 0
Y0_OFF = 1
// etc.

I don't think you touch X0 and Y0 after initializing them and then using 
them for the first setCurY so you could just use them as the curX and 
curY and save 2 slots in the table.


You initialize edges twice - once when it is declared, and once in the 
constructor.  Note that the initialization where it is declared uses 
INIT_SIZE which is not a multiple of EDGE size anyway.


It looks like there is a missing statement in moveTo to initialize 
this.x0...?


I need some more time on it, but I thought I would send along these 
comments in the meantime...


...jim

Denis Lila wrote:

Hello Jim.

I made the changes you point out, except for your second point
(I don't have time to think about it right now).

I stopped precomputing m00_2_m01_2 because it was only being
used in one place. But I guess it's worth it to precompute it
if it's going to be used often (and computeOffset is called
for every line).

The new webrev is at 
http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/

Thanks,
Denis.


Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-30 Thread Jim Graham

Hi Denis,

More thoughts on Renderer.java.

-- Skipping gaps (minor optimization) --

If there is a gap in the edges in Y, say if a path consists of two 
subpaths, one that covers y=[0..10] and another that covers 
y=[1000..1010], then I think you will iterate over each y value from 10 
to 1000, but have no work to do on each scan line.  That could possibly 
waste a bit of time.


On the other hand, fixing that would have to take into account whether 
or not you are done with a given alpha row, so the NextY function 
can't simply skip - the skipping has to be done at the higher level in 
endRendering - or at least with the cooperation of endRendering.  Since 
it asks what the current Y is near the top of the loop, it could 
detect if the current Y jumped out of the given alpha row, emit it, and 
prepare for a new alpha row starting at the new Y...?


For now, it might be simpler to ignore this and revisit that later since 
it isn't going to be common to have large such jumps in the middle of 
most paths.


-- Done with skipping gaps --

-- Alpha accumulation opt --

Filling the alpha row.  I had an interesting optimization here that I 
never got around to.  Instead of filling the array entries with the 
alpha values, fill them with the deltas of the alpha values.  The inner 
loop in endRendering then becomes something like:


alpha[pix_x0  ] += NUM_POS - (x0 & MASK);
alpha[pix_x0+1] += (x0 & MASK);
alpha[pix_x1  ] -= NUM_POS - (x1 & MASK);
alpha[pix_x1+1] -= (x1 & MASK);

The [pix+1] lines were the gotcha that always confused me when I tried 
to do this in the past.  Basically if you enter an inside region at 1 
subpixel before the end of a pixel then you need to add one to the 
coverage for that pixel.  But, starting with the next pixel you want the 
total contribution of the interior region to be NUM_POS per pixel, but 
you've only added 1 so far - so you have to add the additional amount so 
that the total amount added for each enter crossing ends up being 
NUM_POS.  Similarly, when you decrement for the exit crossing you need 
to ensure that the total negative delta sums to NUM_POS across the pixel 
where the exit happens and the following pixel.  Does that make sense?


Note that you could still have the single pixel optimization which 
would look like:


alpha[pix_x0  ] += (x1 - x0);
alpha[pix_x0+1] -= (x1 - x0);

and is equivalent to the above 4 lines.

then when you do the RLE, you simply have to use an accumulation as you 
scan the alpha row:


byte nextVal = (byte) (startVal + alphaRow[i]);

So, for the cost of an add per pixel as you do the RLE you can avoid 
having to do the loop in the multiple pixel section when filling the 
alpha array.


I don't think I've ever quantified this optimization so it might be 
better to investigate it after the current code goes in and adopt it 
only if it shows a big savings.
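
(To make the bookkeeping concrete, here is a small self-contained sketch
of the delta scheme described above. NUM_POS, MASK and the method names
are illustrative, not the Pisces code; the alpha array needs two spare
slots past the last pixel so the pix+1 writes stay in range.)

    // Record one filled span [x0, x1) (subpixel x coords, same scanline)
    // as coverage deltas; a prefix sum later yields per-pixel coverage.
    static final int SUB_BITS = 3;
    static final int NUM_POS  = 1 << SUB_BITS; // 8 subpixel positions/pixel
    static final int MASK     = NUM_POS - 1;

    static void addSpan(int[] alpha, int x0, int x1) {
        int pix_x0 = x0 >> SUB_BITS, pix_x1 = x1 >> SUB_BITS;
        if (pix_x0 == pix_x1) {                 // span within a single pixel
            alpha[pix_x0    ] += (x1 - x0);
            alpha[pix_x0 + 1] -= (x1 - x0);
        } else {
            alpha[pix_x0    ] += NUM_POS - (x0 & MASK);
            alpha[pix_x0 + 1] += (x0 & MASK);   // enter deltas sum to NUM_POS
            alpha[pix_x1    ] -= NUM_POS - (x1 & MASK);
            alpha[pix_x1 + 1] -= (x1 & MASK);   // exit deltas sum to NUM_POS
        }
    }

    // The accumulation pass done while scanning the row for the RLE.
    static void deltasToCoverage(int[] alpha) {
        int sum = 0;
        for (int i = 0; i < alpha.length; i++) {
            sum += alpha[i];
            alpha[i] = sum;   // now actual coverage in [0, NUM_POS]
        }
    }

Running deltasToCoverage after all spans of a scanline have been added
reproduces exactly the per-pixel fill loop it replaces, at the cost of
one add per pixel during the RLE scan; interior pixels never have to be
written individually.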


-- Done with alpha accumulation opt --

Something about the "emit last row" code bothers me.  First, I would 
guess that the y == boundsMaxY - 1 test would have already flushed the 
row, right?  Also, why use the for loop?  Why not just flush the row if 
it isn't flushed?  I kind of feel like the y == boundsMaxY-1 test 
could be removed from inside the loop; simply test whether there is 
unflushed data in the alpha array and, if so, make a call to emitRow() 
with the appropriate values after the loop.


while (hasNext()) {
    ...
    if ((y & MASK) == MASK) {
        emitRow(...);
        pix_min,max = MAX,MIN;
    }
}
if (pix_min <= pix_max) {
    emitRow(...);
}

Am I missing something?

So, the upshot is that I didn't find anything wrong.  You can take or 
leave my suggestions for improvements as you see fit and maybe save some 
of them for a second round after this goes back...


...jim



Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-30 Thread Jim Graham
Just to clarify.  My second message that I just sent mostly contained 
some additional optimizations to consider for now or later, but this 
message below contained at least one (maybe 2) thing(s) that looked like 
a bug and a few maintenance issues that I think should be done before 
finalizing this set of changes...


...jim



Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-07-30 Thread Jim Graham

Hi Denis,

It looks fine.  Hopefully we can eventually figure out why the sorting 
on the fly didn't pan out.


Denis Lila wrote:
Hi Jim. 


Thanks for all your suggestions. I fixed the edge array indexing
issue, the moveTo bug (not assigning x0), and the double
initialization issue. I also improved the emission of the last row
to what you said. The link is the same as the last one I sent.


I'm guessing the test for y == boundsMaxY-1 at line 470 could probably 
also be deleted now (since it will be handled by the end test when y 
reaches maxY)?


But everything looks in order!

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-04 Thread Jim Graham

Hi Denis,

First, comments on the high level normalizer (Normalizing iterator):

- If there is no normalization going on, I would use the Shape's own 
flattening (i.e. getPathIterator(at, flat)).  The reason being that some 
shapes may know how to flatten themselves better, or faster, than a 
Flattening Iterator.  In particular, rectangles and polygons would 
simply ignore the argument and save themselves the cost of wrapping with 
an extra iterator.  This would probably only be a big issue for very 
long Polygons.


- Line 331 - the initializations to NaN aren't necessary as far as I can 
tell...?


- Rather than saving mode in the normalizing iterator, how about 
saving 2 constants: (0.0, 0.5) for AA and (0.25, 0.25) for non-AA and 
then simply add those constants in rather than having to have the 
conditional with the 2 different equations?
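
(A hypothetical illustration of that suggestion; only the idea of
replacing the mode conditional with two stored constants is Jim's. The
exact rounding equation, and which constant plays which role in it, are
guesses here.)

    // Sketch, not the actual normalizing iterator: fold the per-mode
    // rounding constants in once, so one equation serves both modes.
    final class CoordNormalizer {
        private final float lval, rval; // (0.0, 0.5) AA, (0.25, 0.25) non-AA

        CoordNormalizer(boolean antialiased) {
            this.lval = antialiased ? 0.0f : 0.25f;
            this.rval = antialiased ? 0.5f : 0.25f;
        }

        float normalize(float v) {
            // assumed shape of the rounding: floor(v + a) + b
            return (float) Math.floor(v + lval) + rval;
        }
    }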


...jim



Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-08-05 Thread Jim Graham

Hi Denis,

That's great!  I just did a last minute double-check of your last 
(final) webrevs to be sure.


Have you tested Java2Demo with these changes?  I'd also run any 
regression tests you can find with the changes.  If there are no 
problems there, then you are good to go to push it...


...jim

On 8/5/2010 8:08 AM, Denis Lila wrote:

Hello.


Are you a registered OpenJDK developer?

I am now.
Can I go ahead and push it?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-05 Thread Jim Graham

Hi Denis,

I'll wait for some clean webrevs once you get the float stuff in for a 
final review.  I did take a really quick look and thought that a better 
way to handle OFF would be to set rval to -1 and then check rval < 0 
as the (quicker) test for OFF in the currentSegment() method.  Does that 
make sense?


In any case, let's wait for cleaner webrevs to go further on this 
(hopefully in a day or so?)...


...jim

On 8/5/2010 8:06 AM, Denis Lila wrote:

Hi Jim.

I made all the suggested changes.
Links:
http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/
http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/

Thanks,
Denis.



Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-08-06 Thread Jim Graham

Hi Denis,

Well, I guess it's a good thing that Java2Demo had a path like that in 
it - not a very common case, so it's good we found it!


The fix looks fine.  It still seems like there is way more logic there 
than is needed - hopefully if we can get rid of flips at some point, 
much of it will go away.


Fixes look good to go to me...

...jim

On 8/5/2010 3:58 PM, Denis Lila wrote:

Hi Jim.

I didn't know about Java2Demo. If I did I would have run it sooner.
But I ran it a few hours ago, and everything looked fine (surprisingly
high fps too) until I got to the append test.

Apparently I introduced a bug when solving the 2 consecutive moveTos bug.
Basically, when there's a close() after a horizontal lineTo(), the lineTo
in close() won't be executed because it's inside the if (firstOrientation != 0)
test. So instead of going back to the starting point, close will stay where
it is, which will draw a triangle above the rectangle.

I fixed this by introducing a variable that keeps track of the last method
called (lineTo, moveTo, or close), and instead of checking for
(firstOrientation != 0) in close(), I check for (last == LINE_TO).

webrev (hopefully final):
http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/

I'm sorry about this. I wish I had known about Java2Demo sooner.

Thanks,
Denis.



Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

2010-08-06 Thread Jim Graham
To the end of getting rid of flips - all they are used for is to 
resize the crossings array, right?  This is not a huge array that costs 
a lot to resize - why not simply use a default array of, say, 30 
elements and then resize it if we ever have more crossings than that? 
Only very complicated paths would have more than 30 crossings to track. 
 The check for array length is only needed once per scanline since we 
know how many active edges are on each scan line (hi-lo) and you can 
only have 1 crossing per active edge so with one test per scanline we 
can keep the crossings array within range...
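
(Sketched below with illustrative names: lo and hi bound the active edges
for the current scanline, as in the description above. This is not the
Renderer code, just the once-per-scanline check being proposed.)

    // One capacity test per scanline: each active edge (indices lo..hi-1)
    // contributes at most one crossing, so numActive bounds the array use.
    static int[] ensureCrossingsCapacity(int[] crossings, int lo, int hi) {
        int numActive = hi - lo;
        if (crossings.length < numActive) {
            // grow generously so resizes stay rare even for complex paths
            crossings = new int[Math.max(numActive, 2 * crossings.length)];
        }
        return crossings;
    }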


...jim



Re: [OpenJDK 2D-Dev] Fwd: Various fixes to pisces stroke widening code

2010-08-09 Thread Jim Graham
What was the problem - I might have a guess as to the cause if I saw a 
picture.  You should file a bug against it to make sure it doesn't fall 
through the cracks.


But, the webrev you sent looks good to go.

If you want some more optimization comments, I'll always have more (I'm 
evil that way)...  ;-)


- Testing for y0==y1 in lineTo is redundant.  addEdge already ignores 
the line with a looser test.  It does take more processing to reject the 
edge in that case, though, but that test in lineTo is saving less and 
less work with every revision.


- pix_sxy0 aren't really needed.  close() could simply:
addEdge(x0, y0, sx0, sy0);
this.x0 = sx0;
this.y0 = sy0;
and bypass what little processing is left in lineTo.

- addEdge rejects horizontal edges.  It could also reject any of the 
following classes of edges:

- completely above clip
- completely below clip
- completely to the right of clip
since those edges will never contribute to the coverage.  The algorithm 
should skip the above and below edges reasonably quickly, but this would 
save storage for them.  The edges to the right would have to be updated 
every scanline and waste storage, but that isn't a huge hit.  This only 
helps for corner-case huge paths which aren't common.  At least you 
aren't iterating over miles and miles of irrelevant geometry which would 
be an important performance hit.
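
(A sketch of those rejection tests with hypothetical parameter names; the
real addEdge stores into the flat edges array rather than returning a
boolean, so only the tests themselves are the point here.)

    // Reject edges that can never change coverage inside the clip.
    // clipMinY/clipMaxY/clipMaxX are device-space clip bounds.
    static boolean rejectEdge(float x0, float y0, float x1, float y1,
                              float clipMinY, float clipMaxY, float clipMaxX) {
        if (y0 == y1)                     return true; // horizontal edge
        if (Math.max(y0, y1) <= clipMinY) return true; // entirely above clip
        if (Math.min(y0, y1) >= clipMaxY) return true; // entirely below clip
        if (Math.min(x0, x1) >= clipMaxX) return true; // entirely right of clip
        return false; // edges left of the clip must be kept: they still
                      // contribute crossings/winding for pixels inside
    }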


But, these 3 are really getting down to the nitty gritty.  I'd check it 
in before I drive you crazy... ;-)


...jim

Denis Lila wrote:

- Forwarded Message -
From: Denis Lila dl...@redhat.com
To: Jim Graham james.gra...@oracle.com
Sent: Monday, August 9, 2010 4:58:10 PM GMT -05:00 US/Canada Eastern
Subject: Re: [OpenJDK 2D-Dev] Various fixes to pisces stroke widening code

Hi Jim.

Good idea. I've implemented it. I also noticed the quicksort 
method wasn't very friendly to 0 length arrays. I fixed that.

I ran Java2Demo and a few of my tests, and everything looks good.
http://icedtea.classpath.org/~dlila/webrevs/fpBetterAAv2/webrev/

Everything, except the joins demo, that is. The interior of the
star's right arm (our left) and left leg is not drawn exactly like
in proprietary java or openjdk6. I am not sure if this is a bug
since I don't know exactly how the results of a bs.createStrokedShape
are supposed to look. However, I am almost certain that this
isn't my fault, since I can reproduce the behaviour with a barely
modified version of openjdk7 containing completely different
changes from what's in this patch. I'll run it next morning with
a fresh build of openjdk7 to be completely sure.

Can I push it?

Thanks,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-10 Thread Jim Graham

Hi Denis,

I think the first version is a better choice for now since you said that 
the performance difference isn't noticeable.  I think the lower level 
flattening might look a little different if we ever decide to upgrade 
the pipeline to deal with curves.  In particular, you are still 
flattening above the dashing/stroking code and I think the flattening 
should be done below that code (i.e. in Renderer).


So, I'd go with the first one with the following comments:

- You indent by 8 spaces in a few places.  Is that a tabs vs. spaces 
issue?  We try to stick to 4 space indentations with no tabs for 
consistency.


- I'd make the internal error message a little less personal. ;-) 
"normalization not needed in OFF mode" or something.


- lines 362,363 - you don't need to set cur_adjust variables here, they 
are already being set below.


Other than that, it looks good to go...

...jim

Denis Lila wrote:

Hi Jim.

So, I have the nicer webrevs.
FlatteningIterator version:
http://icedtea.classpath.org/~dlila/webrevs/fpWithStrokeControl/webrev/

Pisces flattening version:
http://icedtea.classpath.org/~dlila/webrevs/fpWithSCandPiscesFlattening/webrev/

I dealt with the issue of handling OFF by just not accepting it as an input.
After all, a normalizing iterator only needs to be created, and is only created
if the normalization mode is not OFF.

Thanks,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-11 Thread Jim Graham

Denis Lila wrote:

Hi Jim.

I think the first version is a better choice for now since you said that 
the performance difference isn't noticeable.  I think the lower level 
flattening might look a little different if we ever decide to upgrade 
the pipeline to deal with curves.  In particular, you are still 
flattening above the dashing/stroking code and I think the flattening 
should be done below that code (i.e. in Renderer).


Wouldn't we still need to flatten for dashing? Is there some way to
quickly compute the arc length of a bezier curve from t=0 to t=some_number?
As far as I can see the function for this computation is the integral of
sqrt(polynomial_of_degree_4), and that would be pretty nasty.
Or maybe we can get around this somehow?
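
(For reference, the integral Denis is describing: the arc length of a
parametric curve (x(t), y(t)) from t = 0 to t = s is

    L(s) = \int_0^{s} \sqrt{x'(t)^2 + y'(t)^2} \, dt

and for a cubic, x'(t) and y'(t) are quadratics, so the radicand is
indeed a polynomial of degree 4.)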


There should be.  Google turns up a few hits for "compute arc length for 
bezier curve" that should be enlightening.


BTW, have you looked at a widened dashed curved path with the closed 
JDK?  I'm pretty sure it outputs dashed curves which proves the point.


I have also computed these lengths for other projects (the shape 
morphing used in some JavaOne demos and Java FX) using the following 
process:


- Compute the length of the control polygon.
- Compute the length of the line between the endpoints.
- When they are within epsilon return the average as the arc length.
- Otherwise subdivide and try again.

I think you could also do something that looked at the relative angles 
of all of the control segments and if they are close enough to each 
other then you can compute the arc length using a simplified equation or 
simply empirically match this to the close enough rule as above.
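
(A minimal sketch of that subdivision scheme for a cubic, under the
stated close-enough rule; the de Casteljau split at t=0.5 is standard.
This is not the JavaOne/JavaFX code, just the idea.)

    // Arc length of a cubic Bezier by adaptive subdivision: when the
    // control-polygon length and the endpoint chord agree to within eps,
    // take their average as the length of that piece.
    static double cubicArcLength(double[] c, double eps) {
        // c = {x0,y0, x1,y1, x2,y2, x3,y3}
        double poly  = dist(c,0,2) + dist(c,2,4) + dist(c,4,6); // control polygon
        double chord = dist(c,0,6);                             // endpoint chord
        if (poly - chord < eps) {
            return (poly + chord) / 2;
        }
        double[] l = new double[8], r = new double[8];
        split(c, l, r);                                         // de Casteljau at t=0.5
        return cubicArcLength(l, eps / 2) + cubicArcLength(r, eps / 2);
    }

    static double dist(double[] c, int i, int j) {
        double dx = c[j] - c[i], dy = c[j + 1] - c[i + 1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    static void split(double[] c, double[] l, double[] r) {
        double mx01 = (c[0]+c[2])/2, my01 = (c[1]+c[3])/2;
        double mx12 = (c[2]+c[4])/2, my12 = (c[3]+c[5])/2;
        double mx23 = (c[4]+c[6])/2, my23 = (c[5]+c[7])/2;
        double ax = (mx01+mx12)/2,  ay = (my01+my12)/2;
        double bx = (mx12+mx23)/2,  by = (my12+my23)/2;
        double cx = (ax+bx)/2,      cy = (ay+by)/2;   // curve point at t=0.5
        l[0]=c[0]; l[1]=c[1]; l[2]=mx01; l[3]=my01; l[4]=ax; l[5]=ay; l[6]=cx; l[7]=cy;
        r[0]=cx;   r[1]=cy;   r[2]=bx;   r[3]=by;   r[4]=mx23; r[5]=my23; r[6]=c[6]; r[7]=c[7];
    }

The convergence test works because the true arc length always lies
between the chord length and the control-polygon length, so their gap
bounds the error of the averaged estimate.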


...jim



Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-24 Thread Jim Graham

Hi Denis,

On 8/23/2010 4:18 PM, Denis Lila wrote:

 To widen cubic curves, I use a cubic spline with a fixed number of curves 
for
each curve to be widened. This was meant to be temporary, until I could find a
better algorithm for determining the number of curves in the spline, but I
discovered today that that won't do it.
 For example, the curve p.moveTo(0,0),p.curveTo(84.0, 62.0, 32.0, 34.0, 
28.0, 5.0)
looks bad all the way up to ~200 curves. Obviously, this is unacceptable.

It would be great if anyone has any better ideas for how to go about this.
To me it seems like the problem is that in the webrev I chop up the curve to be
interpolated at equal intervals of the parameter.


I think a more dynamic approach that looked at how regular the curve 
was would do better.  Regular is hard to define, but for instance a 
bezier section of a circle could have parallel curves computed very 
easily without having to flatten or subdivide further.  Curves with 
inflections probably require subdividing to get an accurate parallel curve.


Perhaps looking at the rate of change of the slope (2nd and/or 3rd 
derivatives) would help to figure out when a given section of curve can 
be approximated with a parallel version?


I believe that the BasicStroke class of Apache Harmony returns widened 
curves, but when I tested it it produced a lot more curves than Ductus 
(still, probably a lot less than the lines that would be produced by 
flattening) and it had some numerical problems.  In the end I decided to 
leave our Ductus stuff in there until someone could come up with a more 
reliable Open Source replacement, but hopefully that code is close 
enough to be massaged into working order.


You can search the internet for parallel curves and find lots of 
literature on the subject.  I briefly looked through the web sites, but 
didn't have enough time to remember enough calculus and trigonometry to 
garner a solution out of it all with the time that I had.  Maybe you'll 
have better luck following the algorithms... ;-)


As far as flattening at the lowest level when doing scanline conversion, 
I like the idea of using forward differencing as it can create an 
algorithm that doesn't require all of the intermediate storage that a 
subdividing flattener requires.  One reference that describes the 
technique is on Dr. Dobbs site, though I imagine there are many others:


http://www.drdobbs.com/184403417;jsessionid=O5N5MDJRDMIKHQE1GHOSKH4ATMY32JVN

You can also look at the code in 
src/share/native/sun/java2d/pipe/ProcessPath.c for some examples of 
forward differencing in use (and that code also has similar techniques 
to what you did to first chop the path into vertical pieces).  BUT 
(*Nota Bene*), I must warn you that the geometry of the path is somewhat 
perturbed in that code since it tries to combine path normalization 
and rasterization into a single process.  It's not rendering pure 
geometry, it is rendering tweaked geometry which tries to make non-AA 
circles look round and other such aesthetics-targeted impurities.  While 
the code does have good examples of how to set up and evaluate forward 
differencing equations, don't copy too many of the details or you might 
end up copying some of the tweaks and the results will look strange 
under AA.  The Dr. Dobbs article should be your numerical reference and 
that reference code a practical, but incompatible, example...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-24 Thread Jim Graham

Hi Denis,

On 8/24/2010 3:35 PM, Jim Graham wrote:

As far as flattening at the lowest level when doing scanline conversion,
I like the idea of using forward differencing as it can create an
algorithm that doesn't require all of the intermediate storage that a
subdividing flattener requires. One reference that describes the
technique is on Dr. Dobbs site, though I imagine there are many others:

http://www.drdobbs.com/184403417;jsessionid=O5N5MDJRDMIKHQE1GHOSKH4ATMY32JVN


Just to provide a basic overview...

You can iterate a line with a constant delta-T using:

x += dx;
y += dy;

Similarly, you can iterate a quad curve with a constant delta-T using:

dx += ddx;
x  += dx;
dy += ddy;
y  += dy;

and a cubic using:

ddx += dddx;
dx  += ddx;
x   += dx;
ddy += dddy;
dy  += ddy;
y   += dy;

There are then techniques to apply to evaluate the dd[d]x and dd[d]y to 
see if the curve is flat enough for your particular needs.  If it 
isn't flat enough, then some simple math performed on the d* variables 
can double or halve the sampling rate for a localized portion of the 
curve.  Once you pass the curvy section, you can then reduce the 
sampling rate again by examining the d* variables.


Done right, this could probably be integrated at the innermost loop of 
the renderer to reduce its storage requirements for curves, but that 
would mean the inner loop would have to switch on the curve type to 
determine which sampling equations apply (though you could simply have 
quads have ddd[xy] = 0 and lines have dd[d][xy] = 0 and use a single set 
of code perhaps without too much performance impact).  Otherwise, this 
could simply be used to flatten and produce edges with less intermediate 
storage (and faster hopefully)...
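
(As a concrete illustration with a fixed delta-T: a self-contained sketch
that samples a cubic using exactly the update rules above. Seeding the
differences is the standard power-basis conversion; the adaptive
halving/doubling of the step that Jim describes is omitted. Note that
Jim's snippet increments ddx before dx - that ordering simply pairs with
differently seeded initial values.)

    // Flatten a cubic Bezier with plain forward differencing, n steps.
    static void forwardDiffCubic(double x0, double y0, double x1, double y1,
                                 double x2, double y2, double x3, double y3,
                                 int n) {
        double h = 1.0 / n;
        // power-basis coefficients: p(t) = a t^3 + b t^2 + c t + p0
        double ax = -x0 + 3*x1 - 3*x2 + x3, ay = -y0 + 3*y1 - 3*y2 + y3;
        double bx = 3*x0 - 6*x1 + 3*x2,     by = 3*y0 - 6*y1 + 3*y2;
        double cx = -3*x0 + 3*x1,           cy = -3*y0 + 3*y1;
        // initial forward differences at t = 0 for step size h
        double dx   = ax*h*h*h + bx*h*h + cx*h, dy   = ay*h*h*h + by*h*h + cy*h;
        double ddx  = 6*ax*h*h*h + 2*bx*h*h,    ddy  = 6*ay*h*h*h + 2*by*h*h;
        double dddx = 6*ax*h*h*h,               dddy = 6*ay*h*h*h;
        double x = x0, y = y0;
        for (int i = 0; i < n; i++) {
            x += dx;  dx += ddx;  ddx += dddx;  // only adds per sample
            y += dy;  dy += ddy;  ddy += dddy;
            System.out.printf("lineTo(%f, %f)%n", x, y);
        }
    }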


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-08-25 Thread Jim Graham

Hi Denis,

At the bottom-most rendering level monotonic curves can be cool to deal 
with, but I'm dubious that they help with widening.  For one things, I 
think you need more breaks than they would give you and also they might 
sometimes break a curve when it doesn't need it.


One way in which they may not break enough is that I think that 
inflections also need to be broken in order to find a parallel curve 
(though I suppose a very tiny inflection might still be approximated by 
a parallel curve easily) and inflections can happen at any angle without 
going horizontal or vertical.


Secondly, although a circle tends to be represented by quadrant sections 
which are monotonic, a circle rotated by 45 degrees would have 
horizontal and vertical sections in the middle of each quadrant and you 
would split those, but really they can be made parallel regardless of 
angle so these would be unnecessary splits.


My belief is that lengths and angles of the control polygon help 
determine if it is well-behaved and can be made parallel simply by 
offsetting.  Some formula involving those values would likely be happy 
with circle sections regardless of their angle of rotation.  I believe 
the Apache Harmony version of BasicStroke used those criteria...


...jim

On 8/25/2010 2:36 PM, Denis Lila wrote:

Hello Jim.


I think a more dynamic approach that looked at how regular the curve
was would do better.  Regular is hard to define, but for instance a
bezier section of a circle could have parallel curves computed very
easily without having to flatten or subdivide further.  Curves with
inflections probably require subdividing to get an accurate parallel
curve.


I'm not sure if you read it, but after the email with the webrev link
I sent another describing a different method of widening: split the
curve at every t where dx/dt == 0 and dy/dt == 0. This guarantees that
there will be no more than 5 curves per side, and since each curve will
be monotonic in both x and y the curve that interpolates its parallel
should do a pretty good job.

This is far better than interpolating at regular t intervals, but I'm
trying to find a better way. I don't like this because the split
depend not only on the curve itself, but also on the axes. The axes are
arbitrary, so this is not good. For example a curve like this
|
\_ would get widened by 1 curve per side (which is good and optimal), but
if we rotate this curve by, say, 30 degrees it would be widened by 2 curves
per side.
It also doesn't handle cusps and locations of high curvature very well (although
I think the latter is a numerical issue that could be made better by using
doubles).

Regards,
Denis.
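
(For illustration, a sketch of computing the split parameters Denis
describes: per axis, B'(t) of a cubic is a quadratic, so the candidate t
values are just quadratic roots kept to (0,1). Names are hypothetical,
not the webrev's.)

    // Collect t in (0,1) where dx/dt == 0 or dy/dt == 0 for a cubic
    // Bezier; splitting there yields x- and y-monotonic pieces.
    static int monotonicSplitTs(double[] coords /* x0,y0,...,x3,y3 */,
                                double[] ts) {
        int n = 0;
        for (int axis = 0; axis <= 1; axis++) {
            double p0 = coords[axis],     p1 = coords[2 + axis],
                   p2 = coords[4 + axis], p3 = coords[6 + axis];
            // B'(t) = a t^2 + b t + c in the power basis
            double a = 3 * (-p0 + 3*p1 - 3*p2 + p3);
            double b = 6 * (p0 - 2*p1 + p2);
            double c = 3 * (p1 - p0);
            double disc = b*b - 4*a*c;
            if (a == 0) {                      // derivative degenerates to linear
                if (b != 0) n = addRoot(ts, n, -c / b);
            } else if (disc >= 0) {
                double sq = Math.sqrt(disc);
                n = addRoot(ts, n, (-b + sq) / (2*a));
                n = addRoot(ts, n, (-b - sq) / (2*a));
            }
        }
        java.util.Arrays.sort(ts, 0, n);       // split points in order
        return n;
    }

    static int addRoot(double[] ts, int n, double t) {
        if (t > 0 && t < 1) ts[n++] = t;
        return n;
    }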


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-02 Thread Jim Graham
OK, I see.  You were doubting that the thing that came after Pisces 
could be that much different considering that Pisces is rendering many 
more sub-pixels.


Actually, embarrassingly I think it can.  It just means the non-AA 
renderer has some performance issues.  One thing I can think of is that 
the SpanShapeIterator uses a native method call per path segment and the 
cost of the context switches into native and back for each path segment 
dominate the performance of long paths.  It was something I was meaning 
to fix for a long time (when that code was first written native code was 
so much faster than Java and the native transition was quick - since 
then Hotspot came along, got a lot better, and the native transitions 
got much, much slower).


So, yes, this isn't out of the question...

...jim

On 9/2/2010 3:40 PM, Denis Lila wrote:

Use which?  The stroking code or the rendering code?
I believe that the way I set it up was that Pisces replaced both the
stroke widening/dashing code and the AA renderer - both were parts that
we relied on Ductus for.  But, the widening code would talk to one of
our other existing rasterizers for non-AA.  Look at
LoopPipe.draw(sg2d, s).  It (eventually) calls RenderEngine.strokeTo()
directed at a SpanShapeIterator...


I think there's a misunderstanding. All I meant was that, even when AA is off,
we do use pisces for widening, but it doesn't do any rasterization.


- Jim Grahamjames.gra...@oracle.com  wrote:

...jim

On 9/2/2010 3:20 PM, Denis Lila wrote:

Do we use Pisces for non-AA?  Pisces should clock in slower for AA than
non-AA, but I think we use one of the other pipes (not Ductus) for
non-AA and maybe it just isn't as good as Pisces?

We definitely use it for non-AA.
I traced it.

Denis.

- Jim Grahamjames.gra...@oracle.com   wrote:

On 9/2/2010 2:43 PM, Denis Lila wrote:

Actually, I had a question about the test I wrote which takes 20
seconds. When I turned antialiasing on, the test dropped from 20
seconds to 2.5. This is very puzzling, since antialiasing is a
generalization of non-antialiased rendering (a generalization where we
pretend there are 64 times more pixels than there actually are). Of
course, the paths followed after pisces for AA and non-AA are
completely different, but whatever came after pisces in the non-AA
case would have the same input as Renderer has in the AA case (input
gotten from Stroker). Can you take a guess as to what was causing such
a large difference?

I think Pisces was integrated only as a Ductus replacement which means
it was used only for AA, but check if I'm mistaken...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-03 Thread Jim Graham

On 9/3/2010 6:03 AM, Denis Lila wrote:

the cost of the context switches into native and back for each path
segment dominate the performance of long paths.


I see. That makes sense.


It was something I was meaning to fix for a long time (when that code
was first written native code was so much faster than Java and the
native transition was quick - since then Hotspot came along, got a
lot better, and the native transitions got much, much slower).


Do you think this will still be worth it after removing flattening?


That depends on the performance differential after your de-flattening 
fixes.  Are both now relatively close in performance?


Either way I imagine that performance will improve if we reduce the 
native interface transitions - it just may change in relative priority 
if your new widener is less abusive towards it...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-10 Thread Jim Graham

Hi Denis,

Things got really busy for me over the past week so I wasn't able to 
keep up with the discussion on this, but I will be looking more at it 
next week.  In the meantime it sounds like you are on the right track. 
I wish I'd have investigated it to the level you are at so I could be of 
more immediate help, but hopefully I'll get there when I review your 
various changes...


...jim

On 9/7/2010 2:11 PM, Denis Lila wrote:

Hello Jim.

So, I finally have a webrev for serious consideration:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/
There are still some printing statements I used for debugging, and
the whitespace is probably pretty bad (tell me if this poses a problem
when reading the code, and I'll clean it up), but I don't want to
waste time removing that stuff unless necessary, since this is
doubtlessly not the last version. I also included a Test.java
file that I found useful for testing and debugging. It has a main
method, and it allows pisces to run as a standalone project in
eclipse (as long as you set the JRE to be openjdk7 since it needs
to know about AATileGenerator and some other non public interfaces).

 From testing it, the only problem I noticed is that it doesn't do
very well with tight loops. So, a path like
p.moveTo(0,0);p.curveTo(1000, 1000, 400, 500, 0, -150);
isn't stroked very well when using the rotating algorithm. When using
just the make monotonic algorithm it is ok (right now, it is set to
use the latter - you can change this by uncommenting Stroker.java:1011
and commenting out Stroker.java:1012). This leads me to believe that
we need to detect and perhaps subdivide at loops in addition to the
current subdivision locations. However, I have not yet looked too deeply
into why the problem arises and how to fix it. I welcome suggestions.

Thanks,
Denis.


I figured out what the problem is. The problem isn't really tight loops.
The problem is cusps in the offset curves. These happen when the line width
is equal to the radius of curvature of the curve being processed (although,
this may be just a necessary condition and not sufficient, but this doesn't
matter).

It seems like we have to split at values of t where the above condition
holds. However, I can't see a way to do this without resorting to Newton's
method for finding the roots of RadiusOfCurvature(t) - lineWidth. It would be
really easy, however, if we had the arc length parametrization of the curve
in question, but this won't necessarily be a polynomial. A good way might be
to find a polynomial approximation to its inverse (this would make dashing
considerably easier too).

Regards,
Denis.
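
(For reference, the radius of curvature that Denis wants to compare
against the line width has a closed form in the curve's derivatives,

    R(t) = \frac{(x'(t)^2 + y'(t)^2)^{3/2}}{\lvert x'(t)\,y''(t) - y'(t)\,x''(t) \rvert}

so RadiusOfCurvature(t) - lineWidth is an explicit function whose roots
could indeed be hunted with Newton's method.)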

- Denis Liladl...@redhat.com  wrote:




- Jim Grahamjames.gra...@oracle.com  wrote:


OK, I see.  You were doubting that the thing that came after

Pisces


could be that much different considering that Pisces is rendering

many


more sub-pixels.

Actually, embarrassingly I think it can.  It just means the non-AA
renderer has some performance issues.  One thing I can think of is
that
the SpanShapeIterator uses a native method call per path segment and
the
cost of the context switches into native and back for each path
segment
dominate the performance of long paths.  It was something I was
meaning
to fix for a long time (when that code was first written native code
was
so much faster than Java and the native transition was quick - since
then Hotspot came along, got a lot better, and the native

transitions


got much, much slower).

So, yes, this isn't out of the question...

...jim

On 9/2/2010 3:40 PM, Denis Lila wrote:

Use which?  The stroking code or the rendering code?
I believe that the way I set it up was that Pisces replaced both

the

stroke widening/dashing code and the AA renderer - both were

parts

that

we relied on Ductus for.  But, the widening code would talk to

one

of

our other existing rasterizers for non-AA.  Look at
LoopPipe.draw(sg2d, s).  It (eventually) calls

RenderEngine.strokeTo()

directed at a SpanShapeIterator...


I think there's a misunderstanding. All I meant was that, even

when

AA is off,

we do use pisces for widening, but it doesn't do any

rasterization.



- Jim Grahamjames.gra...@oracle.com   wrote:


...jim

On 9/2/2010 3:20 PM, Denis Lila wrote:

Do we use Pisces for non-AA?  Pisces should clock in slower for AA than
non-AA, but I think we use one of the other pipes (not Ductus) for
non-AA and maybe it just isn't as good as Pisces?


We definitely use it for non-AA.
I traced it.

Denis.

- Jim Graham james.gra...@oracle.com  wrote:


On 9/2/2010 2:43 PM, Denis Lila wrote:

Actually, I had a question about the test I wrote which takes 20
seconds. When I turned antialiasing on, the test dropped from 20 seconds
to 2.5. This is very puzzling, since antialiasing is a generalization of
non-antialiased rendering (a generalization where we pretend there are
64 times more

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-09-27 Thread Jim Graham

Hi Denis,

On 9/27/2010 7:43 AM, Denis Lila wrote:

Hi Jim.

How much faster?  I'm worried about this, especially given our tiled
approach to requesting the data.  What was the bottleneck before?
(It's been a while since I visited the code - we weren't computing the
crossings for every curve in the path for every tile being generated
were we?)


 Not much faster. I'm working on something better.


Then hopefully we can get to something with better memory and CPU costs.


 I'm not sure about the bottleneck, but what we were doing before is:
1. Flatten (by subdividing) every curve so that we deal only with lines.
2. Add each line to a list sorted by y0. When end_rendering was called,
for each scanline we found the crossings of the scanline with every line
in our line list, which we used to compute the alpha for that scanline's
pixel row. All this would be put into RLE-encoded temporary storage and it
would be read back and converted into tile form by PiscesTileGenerator.
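
As an illustration only (the names mirror the description above; this is
not the actual Pisces code), one scanline's worth of that crossings
computation might look like:

    import java.util.Arrays;

    final class ScanlineSketch {
        // One flattened line segment, as in step 2 above.
        static final class Edge {
            float y0, y1;    // vertical extent, used to (de)activate edges
            float curx;      // x where the edge crosses the current scanline
            float slope;     // dx per scanline step
        }

        // Crossings of one scanline with the active edges, sorted by x.
        // Consecutive pairs bound the covered spans under the even-odd
        // rule; the nonzero rule would also need each edge's orientation.
        static float[] crossings(Edge[] active, int count) {
            float[] xs = new float[count];
            for (int i = 0; i < count; i++) {
                xs[i] = active[i].curx;
                active[i].curx += active[i].slope;  // step to next scanline
            }
            Arrays.sort(xs, 0, count);
            return xs;
        }
    }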

 Speaking of which, would it not be better to get rid of PiscesCache and
just keep a buffer with the current tile row in Renderer.java? This would
be possible because the specification for AATileGenerator says the iteration
is like: for (y...) for (x...);.
Why is PiscesCache there? It isn't being used as a cache at all. Could it be?
Also, why do we output tiles, instead of just pixel rows (which I guess would
just be nx1 tiles)? Is it because we would like to use getTypicalAlpha to
eliminate completely transparent or completely opaque regions as soon as
possible (and the longer a tile is the less of a chance it has at being
either of those two)?


That was basically cramming what we had into the interface's box.  The 
cache existed for something that was being done on mobile, but it 
doesn't have much of a place in our APIs so it was just reused for tile 
generation.  If we have a much more direct way of doing it then it would 
be great to get rid of it.


I think we can support ALL1s and ALL0s reasonably without the cache.
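
For reference, a sketch of how a consumer loop can use getTypicalAlpha()
to skip all-0s tiles and fast-fill all-1s tiles, in the spirit of
AAShapePipe; fillOpaque and fillWithMask are hypothetical placeholders
for the blit machinery, so treat the whole thing as illustrative:

    final class TileLoopSketch {
        void renderTiles(sun.java2d.pipe.AATileGenerator aatg,
                         int x, int y, int w, int h) {
            int tw = aatg.getTileWidth(), th = aatg.getTileHeight();
            byte[] alpha = new byte[tw * th];
            for (int ty = y; ty < y + h; ty += th) {
                for (int tx = x; tx < x + w; tx += tw) {
                    int a = aatg.getTypicalAlpha();
                    if (a == 0x00) {
                        aatg.nextTile();        // fully outside: skip
                    } else if (a == 0xff) {
                        aatg.nextTile();        // fully inside: no mask
                        fillOpaque(tx, ty, tw, th);
                    } else {
                        aatg.getAlpha(alpha, 0, tw);  // also advances tile
                        fillWithMask(tx, ty, tw, th, alpha);
                    }
                }
            }
            aatg.dispose();
        }
        // hypothetical fill callbacks standing in for the real blits
        void fillOpaque(int x, int y, int w, int h) { }
        void fillWithMask(int x, int y, int w, int h, byte[] mask) { }
    }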


I can see your points here.  I think there are solutions to avoid much
of the untransforming we can consider, but your solution works well so
let's get it in and then we can look at optimizations if we feel they
are causing a measurable problem later.


 I should say this isn't quite as bad as I might have made it seem. Firstly,
this IO handler class I made eliminates transformations when Dasher
communicates with Stroker. More importantly, no untransforming is done
when the transformation is just a translation, the identity, or singular;
and when STROKE_CONTROL is off, we only transform the output path. That's
because the most important reason for handling transforms the way I do now
is that we can't normalize untransformed paths, otherwise coordinate
adjustments might be magnified too much. So, we need to transform paths
before normalization. But we also can't do the stroking and widening
before the normalization. But if normalization is removed we can just pass
untransformed paths into Stroker, and transform its output (which is still
somewhat more expensive than only transforming the input path, since
Stroker produces 3-7 curves for each input curve).


Can the untransform be eliminated in the case of scaling?  (Whether just 
for uniform scaling, or maybe even for non-uniform scaling with no 
rotation or shearing?)



I'm not sure I understand the reasoning of the control point
calculation.  I'll have to look at the code to register an opinion.


 I'm sorry, my explanation wasn't very clear. I attached a picture that
will hopefully clarify things.
But, in a way, the computation I use is forced on us. Suppose we have a
quadratic curve B and we need to compute one of its offsets C. C'(0)
and C'(1) will be parallel to B'(0) and B'(1) so we need to make sure
our computed offset has this property too (or it would look weird around
the endpoints). Now, B'(0) and B'(1) are parallel to p2-p1 and p3-p2
where p1,p2,p3 are the 3 control points that define B, so if the control
points of C are q1, q2, q3 then q2-q1 and q3-q2 must be parallel to p2-p1
and p3-p2 respectively. At this point, we need more constraints, since
our system is underdetermined. We use the constraints that q1 = C(0)
and q3 = C(1) (so, the endpoints of the computed offset are equal to the
endpoints of the ideal offset). All we have left to compute is q2, but
we know the direction of q2-q1 and the direction of q3-q2, so q2 must
lie on the line through q1 with direction p2-p1 and on the line through
q3 with direction p3-p2, so q2 must be the intersection of these lines.
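
A minimal sketch of that construction, under assumed names (this is not
the Pisces code, and degenerate tangents are not handled):

    final class QuadOffsetSketch {
        // Offset control points q1, q2, q3 for one side of a quad
        // p1, p2, p3 at width w.  q2 is the intersection of the line
        // through q1 with direction p2-p1 and the line through q3 with
        // direction p3-p2 (Cramer's rule).
        static float[] offsetQuad(float p1x, float p1y, float p2x, float p2y,
                                  float p3x, float p3y, float w) {
            float d1x = p2x - p1x, d1y = p2y - p1y;   // direction of B'(0)
            float d2x = p3x - p2x, d2y = p3y - p2y;   // direction of B'(1)
            float l1 = (float) Math.hypot(d1x, d1y);
            float l2 = (float) Math.hypot(d2x, d2y);
            // endpoints shifted along the unit normals (-dy/l, dx/l)
            float q1x = p1x - w * d1y / l1, q1y = p1y + w * d1x / l1;
            float q3x = p3x - w * d2y / l2, q3y = p3y + w * d2x / l2;
            float den = d1x * d2y - d1y * d2x;
            float t = ((q3x - q1x) * d2y - (q3y - q1y) * d2x) / den;
            return new float[] { q1x, q1y,
                                 q1x + t * d1x, q1y + t * d1y,   // q2
                                 q3x, q3y };
        }
    }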


I agree that if you are creating a parallel curve then intersection is 
the way to go.  I guess what I was potentially confused about was 
whether there are cases where you need to subdivide at all?  Regardless 
of subdivision, when you get down to the final step of creating the 
parallel curves then I believe offsetting and finding the intersection 
is correct (though I reserve the possibility 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-12 Thread Jim Graham

Hi Denis,

On 10/12/2010 6:01 AM, Denis Lila wrote:

Hi Jim.


2. I changed how the alpha map is managed in PiscesTileGenerator to
something that's a bit clearer and uses less memory (the latter comes
from changing the +300 in the alpha tile allocation to +1. If there was
a good reason for using 300, please tell me).


Did I do that?  Wow.  I wish I knew.  There were probably some bugs in
the alpha accumulation at some point.  Since it was indexed by a byte, I
find it hard to believe that it would need 300 entries of padding.


 I don't know who did it. I didn't mean to imply I thought it was you.
In hindsight, the wording of "If there was a good reason for using 300,
please tell me" was pretty terrible. I only asked because 300 seemed like a
very out-of-place number and I thought it was a bugfix, but I couldn't see
for what bug, so I thought you might know since you've helped me out in
this sort of situation before (i.e. sx0, sy0 in Dasher).


It was most likely me since this code hasn't been touched much since I 
hacked it together.  I wasn't put out by your comment, I was simply 
making a public showing of confusion to cover my embarrassment.  ;-)



One thing - will we ever need more than one alpha map in practice?  I
don't believe we will since it depends on the maxalpha from the Renderer
which is a fixed value.  So, the hashmap cache is probably overkill
compared to just seeing if the existing one is the right size, no?


Right now, it's true that there will never be more than one alpha map, so
you might say the HashMap is overkill, but I don't think this is a problem
because performance-wise it costs nearly nothing and I think the code is easier
to read now.
But it's not a big deal, I can change it back if you want.


No, I think it's OK if it doesn't show up on any benchmarks...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-13 Thread Jim Graham

HI Denis,

I'm just now getting down to the nitty gritty of your webrevs (sigh).

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


PiscesRenderingEngine.java:
line 278 - the det calculation is missing b.

line 296 - is there an epsilon that we can use?  == with floating 
point often fails with calculations.


line 308 - miterlimit is a ratio of lengths and should not need to be 
scaled.


line 332 - I think you can use a null transform for the same result.

line 338 - null here too

TransformingPolyIOHelper should be in its own file - we consider more 
than one class per file to be bad form these days, especially if the 
class is used outside of that file.


I'm a little troubled by how the PolyIOHelper fits into the design. 
It's odd to talk to the same object for both input and output.  I have 
some ideas there, but I think I'll leave it for a followon email and effort.


Dasher.java:

lines 110,111 - shouldn't you check if there are any first segments 
before writing the extra move?


lines 150-152 - "starting" should be left true until you reach the end of 
the dash, no?  Otherwise you only hold back the starting segments up 
until the first piece of a curve.  Everything should be held back 
until you reach an "off" piece.  I don't think the logic for these 
variables is right yet.  Here is what I see:


   boolean needsMoveto;

in moveTo and pathDone:
    if (firstSegBuf is not empty) {
        output moveto(sx,sy)
        output firstSegs
    }
    needsMoveto = true;  // not needed in pathDone

in goTo() {
    if (ON) {
        if (starting) {
            store it in firstSegBuf
        } else {
            if (needsMoveto) {
                output moveto(x0,y0);
                needsMoveto = false;
            }
            send it to output
        }
    } else {
        starting = false;
        needsMoveto = true;
        // nothing goes to output
    }
}

and in ClosePath:
    lineToImpl(sx, sy, LINE);
    if (firstSegBuf is not empty) {
        if (!ON) {  // Or if (needsMoveto)
            output moveTo(sx, sy)
        }
        output firstSegs
    }

I don't see a need for firstDashOn or fullCurve

line 228 - Lazy allocate lc?  Polygons, rectangles, and lines won't need 
it to be dashed (though dashing is already expensive enough it might not 
be noticeable, still waste is waste and there is only one piece of code 
that uses lc so it is easy to protect with a lazy allocation)


line 235 - is there a reason to output a curve of 0 length?  lines of 0 
length are omitted...


line 257 - shouldn't the left and right indices always be at 0 and 
type-curCurveoff?  It looks like after the first time through this loop 
you are storing the right half on top of the left half (see line 262)? 
That would output odd values if any curve gets subdivided more than 
once, though, right?


line 289 - LenComputer always allocates maxcurves segements which is 
8*1024 words (32K) + object overhead * 1024 + 2 more arrays of 1025 
words.  That's a lot of memory for the worst case scenario.  It might be 
nice to come back to this and have it be more dynamic (and it is more of 
a push to have the lc variable be lazily allocated above).  Also, if 
you are clever enough, you never need storage for more than about 10 
(maybe 11) curves even if you produce 1024 t's and len's - due to the 
recursive nature of the subdivision that leaves one half dormant while 
the other half is explored.


line 347,352 - it might be cleaner to have the calling function keep 
track of how far into the curve you are and communicate this to the 
method so it doesn't have to clobber its data based on an assumption of 
how the calling function is structured.  But, this arrangement works 
fine for the current purposes and you do have a TODO comment in there 
about this.


Stroker.java:

line 129 - proof that miterLimit does not need to be scaled... ;-)

I'm going to send this buffer of comments off now and continue on with 
Stroker.java in a future email...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-13 Thread Jim Graham

Round 2

On 10/13/2010 3:40 PM, Jim Graham wrote:

HI Denis,

I'm just now getting down to the nitty gritty of your webrevs (sigh).

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


Stroker.java:

Are you happy with the current variable names?  You're doing the bulk of 
the work now so if they make sense to you now then it might be best to 
leave them alone, but they give me headaches trying to figure them out. 
 I think you are right that it might help to create some vector 
helper classes.  I eventually got used to the naming by the time I was 
done with the file, but yikes - this will hurt the next guy that comes 
along to maintain the code.

The sx0,sy0,sdx,sdy variables are (reasonably) well named.
The x0,y0,pdx,pdy variables aren't consistent.  Perhaps cx0,cy0,cdx,cdy 
for current would make more sense?
The mx0,my0,omx,omy variables are even further from the prior naming 
conventions, what about smx,smy,cmx,cmy?


I would combine the emit*To methods into just the one version that takes 
a boolean.  The number of times you call them without the boolean are 
few and far between and the versions that don't take the boolean are so 
few lines of code that inlining them into the boolean versions of the 
methods will still make for nice and tight code.


line 208 - isn't this just side = false since side is either 0 or 1?
Also, side is only ever 1 for an end cap in which case we need exactly 2 
90 degree beziers which are very simple to compute and could be hard 
coded.  Was there a reason not to just have a special roundCap 
function which would be 2 hardcoded and fast emitCurveTo calls?  The 
code would be something like:

curveTo(/*x+mx, y+my,*/
        x+mx-C*my, y+my+C*mx,
        x-my+C*mx, y+mx+C*my,
        x-my,      y+mx);
curveTo(/*x-my, y+mx,*/
        x-my-C*mx, y+mx-C*my,
        x-mx-C*my, y-my+C*mx,
        x-mx,      y-my);
where C = 0.5522847498307933;
// Computed btan constant for 90 degree arcs

(rest of drawRoundJoin method - it may take some doing, but eventually 
this method should simplify based on: there will only ever be 1 or 2 
curves, Math.sin(Math.atan2()) cancels out as does 
Math.cos(Math.atan2()) though to do so requires Math.hypot() which is a 
simpler calculation than any of the transcendentals.  So, if there was 
an easy test for acute/obtuse angle and a way to find the center of an 
angle (both I'm sure we could find on the net), then we could eliminate 
the transcendentals from this method).


line 283 - doesn't this simplify to?:
   t = x10p*(y0-y0p) - y10p*(x0-x0p)
(source: http://local.wasp.uwa.edu.au/~pbourke/geometry/lineline2d/)
then calculating:
   t = (...)/den;
can amortize the dividend from the following 2 calculations.

line 337 - shouldn't this just return?  I don't think that empty lines 
should modify the path at all.  If this is a case of moveto(x,y); 
lineto(x,y) then the finish() code should deal with the path that 
never went anywhere - i.e. drawing a dot, shouldn't it?  The only 
problem is that moveTo never set up omx,omy so finish will likely draw 
something random.  Perhaps if moveto (and closepath) simply set up 
omx,omy to w,0 (or 0,-w if you prefer) then finish would do a reasonable 
thing for empty paths?


line 374 - why is this moveto here?  Doesn't this break the joined path 
into 2 separate paths?  Should this be a lineto?  (Also, sx0==x0 and 
sy0==y0 at this point).


line 389 - The test here is different from closePath.  What if they were 
both prev == DRAWING_OP_TO?


line 394 - or prev = CLOSE to match the initial state?  (I guess it 
shouldn't really matter unless an upstream feeder has a bug.)


line 486 - this leaves the current point in a different place than line 
510, does that matter?


My head started spinning when evaluating the parallel curve methods so 
I'll stop here for now...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-14 Thread Jim Graham

Round 3...

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


I'm going to set the rest of Stroker.java aside for a moment and focus 
on other areas where I have some better knowledge...


Renderer.java:

lines 83, 91, 99: can't these be folded into the prior loops?  You can 
update their Y while searching for the [eqc]hi value.


lines 178,192: indent to the preceding "("?  (Also, I'm a big fan of 
moving the "{" to a line by itself if a conditional or argument list 
was wrapped to more than 1 line - the 2D team tends to use that style 
everywhere in the 2D code...)


line 193: add fieldForCmp here instead of every time in the loop?  (The 
compiler will probably/hopefully do that anyway)


line 238: If X0,Y0,XL,COUNT were bumped up by 1 then you could just 
reuse SLOPE from the linear indices - just a thought.


lines 521,527,533: Why are these executed twice?  You call these methods 
again after the initialize common fields code.  That seems like double 
the work just to maybe save 4 lines of shared code?  Maybe put the 4 
lines of shared code in a helper function that all of the init() methods 
call?


line 574: indentation?

line 566: shouldn't horizontal lines be ignored?  they don't affect 
rasterization.


line 612: delete?  Or will this be making a comeback sometime?

lines 624,626: indentation?

lines 724,725: doesn't the assert false omit these?  I usually throw an 
InternalError in cases like this with a description of what went wrong.


I've read up through the use of the cache in emitRow().  I'll continue 
with reviewing the cache in the next set, meanwhile I also took a look 
at the helper classes...


Helpers.java:

line 37: If it can't be instantiated, why does it take arguments?

getTransformedPoints isn't used?
getUntransformedPoints isn't used?
fillWithIndxes(float) isn't used?
evalQuad isn't used?  (Though it does provide symmetry with evalCubic 
which is used)

getFlatness* aren't used?
ptSegDistSq isn't used?

line 105: There is a closed form solution to cubic roots.  I 
unfortunately used a bad version in CubicCurve2D.solveCubic and I don't 
remember if I ever went back and fixed it (it may even have been 
Cardano's method for all I know).  There are versions out there which 
do work better.  The problem with the one in CC2D was that I copied it 
out of Numerical Recipes in C and apparently the author somehow assumed 
that all cubics would have 1 or 3 roots, but a cubic of the form 
(x-a)(x-a)(x-b) has 2 roots.  D'oh!  While I did find other 
implementations out there on the net, in the end fixing the closed form 
solution seemed fraught with issues since many of the tests would use 
radically different approaches depending on tiny changes in one of the 
intermediate results and so I worried about FP error even in doubles 
possibly skewing the results.  I think you should leave your code in 
there, but I wanted to fill you in on other possibities.


BezCurve.java:

Didn't you get a complaint that this class is defined in a file of the 
wrong name?  Maybe the compiler doesn't complain because the class isn't 
public, but one of the names should change to match.


line 59: I'd throw an internal error and the compiler would be appeased.

line 35: If you make this a create factory then it can leverage the 
code in the existing constructors - just a thought.


I'll stop here and hit send - not much left after this round...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-15 Thread Jim Graham

Round 4...

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


BezCurve.java:

I'd add some set() methods to BezCurve/Curve and then use a scratch 
instance in Renderer (and other places?) to reuse those calculations, 
such as:


Curve constructors (obviously)
Renderer.curveOrQuadTo()
Renderer.initQuad()
Renderer.initCurve()
Stroker.findSubdivPoints()

lines 179,182 - using your d* variables, wouldn't these be:
   a = 2*(dax*dax + day*day)
   b = 3*(dax*dbx + day*dby)
   c = 2*(dax*cx + day*cy) + dbx*dbx + dby*dby
   d = dbx*cx + dby*cy
It has fewer multiplies and more of the multipliers are powers of 2 
which are faster than non-power-of-2 multiplies.


line 206,210 - a nit - it didn't really confuse me, but halfway through 
reading this it occurs to me that these are really t0 and t1, right?


line 212 - if x0 (t0?) is 0 then you don't need to include it in the 
roots, no?


line 249,257 - these corrections are exponential compared to the sample 
code on the wikipedia page - was that the slight modification that you 
mentioned in the comments?


line 303 - isn't it enough to just look at the previous T value (or keep 
a running prevt variable) rather than update every T value in the 
array?  Isn't this enough?

float prevt = 0; /* field in Iterator */
next() {
    curt = Ts[next];
    split = (curt - prevt) / (1 - prevt);
    prevt = curt;
}

ROCsq - I looked at the wikipedia article and it wasn't clear how it 
directly related to your function since the article is dealing with the 
curvature of a function graphed against its own t, but you are dealing 
with 2 parametric equations combined and graphed against each other.  I 
think I'm going to have to just trust you on this one for now.  :-(


Done with BezCurve.java...

Stroker.java:

lines 558 (et al) - create a helper function for all of these 
(degenerates to a line) cases?


lines 621-626 and 643-646 - I'm not sure what the derivation of these 
lines is.  I tried to do my own equations, but I seem to be heading in 
a different direction than you used and it didn't seem like I was going 
to converge.  What equations did you set up to solve to get these 
calculations?  From what I can see we have:

  - new p1,p4 are calculated
  - new p(0.5) is calculated
  - the corresponding dx,dy at t=0,0.5,1 (gives slopes)
  - slopes at t=0, 0.5 and 1 should be the same for all curves
and what you are trying to compute are two scaling constants c1 and c2 
that represent how much to scale the dx1,dy1 and dx4,dy4 values to get a 
curve that interpolates both position and slope at t=0.5.  A comment 
might help here...  :-(


line 687 - dup?

line 856 - use a scratch Curve instance and set methods to reduce GC?

line 857 - this is true if the first vector is parallel to either axis, 
but the comment after it says only parallel to the x axis - is that a 
problem?  Also, if both are 0 then no parallel constraint is applied 
even if it does start out parallel.  I imagine that this is all OK 
because the "both 0" case should be rare/non-existent and the y-axis 
case will also benefit from the same optimization...?


lines 861-863:  sin/cos and atan2 cancel each other out.  It is faster 
to compute the hypotenuse and then divide the x and y by the hypotenuse 
to get the cos and sin.  (cos = x/hypot; sin = y/hypot;)
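
In other words (a trivial sketch, not the patch code):

    // equivalent to cos(atan2(my, mx)) and sin(atan2(my, mx)) without the
    // transcendentals; Math.hypot is an overflow-safe sqrt(mx*mx + my*my)
    static double[] cosSin(double mx, double my) {
        double h = Math.hypot(mx, my);
        return new double[] { mx / h, my / h };
    }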


Cache and TileGenerator look ok...

I think I'm done at this point except for not understanding the 
parallel cubic equations I mentioned above and the ROCsq method...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-18 Thread Jim Graham

Hi Denis,

Looks like some great new work here!  I'll try to keep the pie in the 
sky suggestions down now so we can get this in soon...


On 10/18/2010 2:19 PM, Denis Lila wrote:

Hi Jim.


I'm just now getting down to the nitty gritty of your webrevs (sigh).


Thanks. I hope it's not too bad.


The code was great - what sucked was all of the cobwebs on my trig and 
curve math neurons.



PiscesRenderingEngine.java:
line 296 - is there an epsilon that we can use?  == with floating
point often fails with calculations.


I was thinking maybe something more like the ULP stuff you did in one of 
the other files.  I don't think 2 non-equal fp values can be subtracted 
and produce a value that is as small as MIN_VALUE unless you are 
subtracting 2 extremely tiny numbers.



line 338 - null here too


If this is now line 341 you still use "at" which might be a non-null 
identity transform.  I'd just use null as some shapes might try to do 
some work if they get a non-null identity transform, but null pretty 
much tells them it's identity.



I turned LengthComputer into an iterator. I think it's much cleaner now.
There's no longer any of that "scale every t in the array so that they
become valid parameters of the right subdivided curve".
It also uses less memory - just limit+1 curves. I guess I am clever enough ;)
(though unfortunately not clever enough to have thought of the idea myself).


Interesting solution.  I like it.

line 248,251 - I thought it was a bug that you used 2 when I thought you 
should use 0, but it turns out that it doesn't matter because the last 
point of left is the first point of right.  So, I'm not sure why you use 
2, but it isn't a bug.  However...


You only need the array to be 8+6 if you take advantage of that shared 
point and store the 2 halves at 0..type and type-2 .. 2*type-2. 
Just a thought.  No real bug here.



I found a problem with Dashing though. Curves like moveTo(0,0);
curveTo(498,498,499,499,500,500); are not handled well at all.
http://icedtea.classpath.org/~dlila/webrevs/noflatten2/webrev/ is the link
with the new webrev. I have fixed this problem by doing binary search on
the results of the flattening. I really don't like this solution because
it does *a lot* more subdivisions than just flattening.


Ah, I get it now.  Hmmm.  We can leave it for now, but I'm pretty sure 
we can detect cases like this because the sides of the control polygon 
are not relatively equal and only do the recursion if the control 
polygon indicates some amount of acceleration is happening.  Leave it 
for now and make a mental note of this for later.  Also, if there is 
acceleration then I think you could just solve either the X or the Y 
cubic for the necessary point (xs = interp(x0,x1,len/leafLen), solve for 
xs).


One simplification to your binary search - since we know the length is 
relatively close to chord length, just compute the point on the curve at 
t and then use the distance formula to the start point to compute the 
arc length - no subdividing needed, just an eval and a linelen, and 
bsBuf goes away...
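
A sketch of that shortcut, assuming a cubic leaf (names are illustrative):

    final class ChordLenSketch {
        // Point on the cubic at t via the Bernstein form, then the
        // distance formula back to the start point; valid as an
        // arc-length estimate only when the leaf is already nearly linear.
        static double chordLenAt(double t,
                                 double x0, double y0, double x1, double y1,
                                 double x2, double y2, double x3, double y3) {
            double u = 1 - t;
            double bx = u*u*u*x0 + 3*u*u*t*x1 + 3*u*t*t*x2 + t*t*t*x3;
            double by = u*u*u*y0 + 3*u*u*t*y1 + 3*u*t*t*y2 + t*t*t*y3;
            return Math.hypot(bx - x0, by - y0);
        }
    }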


...jim


Regards,
Denis.

- Jim Grahamjames.gra...@oracle.com  wrote:


HI Denis,

On 10/6/2010 1:36 PM, Denis Lila wrote:


webrev:
http://icedtea.classpath.org/~dlila/webrevs/noflatten/webrev/


TransformingPolyIOHelper should be in its own file - we consider more
than one class per file to be bad form these days, especially if the
class is used outside of that file.

I'm a little troubled by how the PolyIOHelper fits into the design.
It's odd to talk to the same object for both input and output.  I have
some ideas there, but I think I'll leave it for a followon email and
effort.

Dasher.java:





 boolean needsMoveto;

in moveTo and pathDone:
  if (firstSegBuf is not empty) {
  output moveto(sx,sy)
  output firstSegs
  }
  needsMoveto = true;  // not needed in pathDone

in goTo() {
  if (ON) {
  if (starting) {
  store it in firstSegBuf
  } else {
  if (needsMoveto) {
  output moveto(x0,y0);
  needsMoveto = false;
  }
  send it to output
  }
  } else {
  starting = false;
  needsMoveto = true;
  // nothing goes to output
  }
}

and in ClosePath:
  lineToImpl(sx, sy, LINE);
  if (firstSegBuf is not empty) {
  if (!ON) {  // Or if (needsMoveto)
  output moveTo(sx, sy)
  }
  output firstSegs
  }

I don't see a need for firstDashOn or fullCurve


Stroker.java:

line 129 - proof that miterLimit does not need to be scaled... ;-)

I'm going to send this buffer of comments off now and continue on with

Stroker.java in a future email...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-18 Thread Jim Graham

Hi Denis,

On 10/18/2010 2:21 PM, Denis Lila wrote:

Are you happy with the current variable names?


Not really. The names you suggested are much better. I'm using them now.
As for making a vector class, I think we should push this and then decide.
Its absence has already done most of the damage it could possibly do, so
it's not an urgent matter. And besides, pushing a good version of this first
will make it easier to determine the performance impact of the vector class.


Woohoo!  Note comment needs updating at line 90.


I introduced a drawRoundCap method. This eliminated the side argument from
the round join drawing, which made it easier to eliminate the trig function
calls. I did this by using dot products to compute cosines (which was possible
because now Stroker takes only untransformed paths, and all lineWidths are the
same), and I used the double angle identities to compute any sines.
I came up with my own ways of detecting acute/obtuse angles and finding the
centres of angles ("my own" meaning I didn't look at any websites), and they
consist of:
1. if (omx * mx + omy * my) >= 0 then the angle is acute (line 200).
2. I explain this in a comment in the file (line 208).


Yay.  And I can't believe you got that much mileage out of that one 
change.  Cool!  I'll verify the math tomorrow (it doesn't look hard, but 
I'm almost out of here).



I was tempted to do this. I didn't because the boolean versions will need
absolute coordinates, while the non boolean ones require relative ones. So
if the non boolean versions need to be called and all we have are the boolean
ones, 2 dummy arguments need to be supplied. However, I've looked at the code
more closely, and it turns out that we only use the non boolean versions
from inside the boolean versions, so I've followed your suggestion (except
on emitLineTo, since the non boolean version of that is used quite a bit).


OK, no problem.


line 374 - why is this moveto here?  Doesn't this break the joined
path into 2 separate paths?  Should this be a lineto?


It does break it into 2 separate paths, but that's the correct behaviour
in this case. Mathematically speaking, the 2 offset curves are separate
curves (despite any intersections). This changes when we use caps, but
when using closePath(), caps aren't drawn so we *should* have 2 separate
paths. This is also the behaviour of Oracle's closed source Java (which
can be seen in one of the Java2Demo demos - the one that draws the offset
curves of a star with a vertical slider controlling the line width).


Oh, duh!  I get it.  I had been looking at Dasher all day before that 
and so I was thinking of this in terms of connecting the last dash to 
the first which would, of course, be one continuous path, but this is 
Stroker so if you get a close then it has an inner and outer path. 
Sorry for the distraction.



line 389 - The test here is different from closePath.  What if they
were both prev == DRAWING_OP_TO?


I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed to execute
finish() if no nonzero length lines have been fed to Stroker yet). In fact
I have removed the started variable since it's equivalent to 
prev==DRAWING_OP_TO.


Interesting.  I'll have to trace this later, but it sounds good.


line 337 - shouldn't this just return?  I don't think that empty lines
should modify the path at all.  If this is a case of moveto(x,y);
lineto(x,y) then the finish() code should deal with the path that
never went anywhere - i.e. drawing a dot, shouldn't it?  The only
problem is that moveTo never set up omx,omy so finish will likely draw
something random.  Perhaps if moveto (and closepath) simply set up
omx,omy to w,0 (or 0,-w if you prefer) then finish would do a
reasonable thing for empty paths?


The reason I made it the way it is is to match what proprietary java
does. If one tries to draw a path like moveTo(0,0);lineTo(100,-100);
lineTo(100,-100);lineTo(0,-200); instead of ignoring the second
lineTo(100,-100) it will instead behave as if it were something like
lineTo(100.1,-100.1), and it will draw the join. Of course, just because
proprietary java does it, it doesn't mean it's the right thing to do. So,
should I still make it ignore segments of 0 length?


No, let me think about this some more.  Compatible is a good default for 
now until we understand it fully so let's not derail for that.



line 283 - doesn't this simplify to?:
 t = x10p*(y0-y0p) - y10p*(x0-x0p)
(source: http://local.wasp.uwa.edu.au/~pbourke/geometry/lineline2d/)
then calculating:
 t = (...)/den;
can amortize the dividend from the following 2 calculations.


I am using this t calculation now. I don't see how what I had simplified
into this though. This makes me think we were using a wrong equation, which
is puzzling since I didn't notice any problems with drawing miters or
quadratic beziers. Well, maybe I just made some mistake in trying to show
they're equivalent. It doesn't matter.


No, actually they 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-19 Thread Jim Graham

On 10/19/2010 10:38 AM, Denis Lila wrote:

Hi Jim.

If I haven't replied to a suggestion, that means I've implemented it and
I thought it was a good idea, so I don't have anything to say about it.


That's mostly true too for me, but there are a couple that I might go 
back to - I'll let you know when I think I've reached a 100% coverage 
(getting close).



getTransformedPoints isn't used?
getUntransformedPoints isn't used?
fillWithIndxes(float) isn't used?
evalQuad isn't used?  (Though it does provide symmetry with evalCubic
which is used)
getFlatness* aren't used?
ptSegDistSq isn't used?


Should I get rid of these? I wanted to do it, but I wanted to wait until
just before pushing because I was afraid I'd start needing them again at
some point in the future.


OK, use your best judgment.  If they are small and they add to symmetry 
of services (like evalQuad) or they might be used later, then it isn't a 
big deal, but dead code in private APIs shouldn't be just left laying 
around if we can help it.


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-19 Thread Jim Graham

Hi Denis,

On 10/19/2010 10:40 AM, Denis Lila wrote:

ROCsq - I looked at the wikipedia article and it wasn't clear how it
directly related to your function since the article is dealing with
the curvature of a function graphed against its own t, but you are dealing
with 2 parametric equations combined and graphed against each other.
I think I'm going to have to just trust you on this one for now.  :-(


http://en.wikipedia.org/wiki/Radius_of_curvature_%28applications%29
Did you look at the above wikipedia article? When researching I came across
2 of them, and one of them only mentions natural parameterizations, but
the above has the general equation for an R->R^n function, then below that
they have the special case where n=2, x(t)=t, and y(t)=f(t). I used the
first equation on that page.
Actually, I wrote a simple program to make sure the radius of curvature
function was correct. I have attached it. It's not a proof, but I think
it is convincing. Just hold down the mouse button and move it horizontally.
This will change the t value on the curve and the circle drawn will have
radius equal to Math.sqrt(ROCsq). You can also change the control points of
the curve. There's a bug where when you run it the window is really tiny, so
you have to manually resize it and maximize it.


I actually did read that article, but I wasn't seeing the fact that it 
was a multiple parametric equation and that the || were distance 
calculations rather than simply absolute values.  Now I see it. 
Plugging those concepts in to the first equation the mapping is very direct.


One thing that confused me when I was proof-reading it was that the 
numerator seemed to be dx2dy2 squared when it should be cubed.  Then I 
spotted the final *dx2dy2 term at the end which makes it cubed.  I 
wasn't sure why you isolated that term out there instead of just 
grouping it with the rest of the numerator - is there a danger of 
overflow if you multiply it before you do the division?  If so, then 
that is fine since it doesn't actually affect the number of fp ops so it 
should be the same performance.
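
For the record, the shape of that computation (a sketch with assumed
derivative names, not the actual method):

    // ROCsq(t) = (dx^2 + dy^2)^3 / (dx*ddy - dy*ddx)^2, where (dx, dy)
    // and (ddx, ddy) are B'(t) and B''(t).  The last dx2dy2 factor is
    // applied after the division, as discussed, keeping the intermediate
    // smaller.  A straight segment makes the denominator 0 (infinite ROC).
    static double rocSq(double dx, double dy, double ddx, double ddy) {
        double dx2dy2 = dx * dx + dy * dy;
        double cross = dx * ddy - dy * ddx;
        return dx2dy2 * dx2dy2 / (cross * cross) * dx2dy2;
    }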



lines 621-626 and 643-646 - I'm not sure what the derivation of these
lines are.  I tried to do my own equations, but I seem to be heading
in a different direction than you used and it didn't seem like I was
going to converge.  What equations did you set up to solve to get these
calculations?  From what I can see we have:
- new p1,p4 are calculated
- new p(0.5) is calculated
- the corresponding dx,dy at t=0,0.5,1 (gives slopes)
- slopes at t=0, 0.5 and 1 should be the same for all curves
and what you are trying to compute are two scaling constants c1 and c2
that represent how much to scale the dx1,dy1 and dx4,dy4 values to get
a curve that interpolates both position and slope at t=0.5.  A comment
might help here...  :-(


I see how (dxm,dym) was confusing. The only reason for computing the slope
at t=0.5 is to get the point of the offset curve at t=0.5. We don't make
the computed curve and the input curve have equal slopes at t=0.5 because
this would give us an overdetermined system. What we're trying to do in this
function is to approximate an ideal offset curve (call it I) of the input
curve B using a bezier curve Bp. The constraints I use to get the equations are:


It does help *a lot*, though, so thank you for writing it up.  I would 
move it closer to the code in question since the function has such a 
long preamble that separates the comment from the code that implements 
it (also method comments are usually reserved for API documentation 
purposes).


lines 544,559 - I'd remove the line numbers from the comment.  They are 
already wrong and they won't survive any more edits any better.  ;-)


In #2, you have a bunch of I'() || B'() which I read as the slope of 
the derivative (i.e. acceleration) is equal, don't you really mean I() 
|| B() which would mean the original curves should be parallel? 
Otherwise you could say I'() == B'(), but I think you want to show 
parallels because that shows how you can use the dxy1,dxy4 values as the 
parallel equivalents.


I would rename det43 to invdet43 to indicate that it is the inverse of 
the determinant.  I kept looking at it and thinking "he has the 
determinant on the wrong side" until I noticed that it was in the 
denominator of det43 (which is hard to read in parenthesized C-math).


One side note.  At first glance I would have thought that the final 
equations would have subtracted the c2*dxy4 terms rather than adding 
them (since dxy4 represent p4-p3, not p3-p4 and so the linear 
interpolation equation looks backwards), but that isn't true because 
you did all of your math looking to find the c2 that belongs in this 
equation (as backwards as it seems) and so you got that answer. 
Interestingly if you look at the effect on the results if you calculate 
the dxy4 terms the other way around, they are simply negated and the 
impact would be that c1 would be unaffected (both num and 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham

Hi Denis,

One clarification:

On 10/20/10 7:11 AM, Denis Lila wrote:

When would the isCW test trigger?  Does it track rev?  What happens
at 180 degrees (is that test reliable for the randomization that might
happen when omxy are directly opposite mxy)?


isCw is used for computing the arc bisector by testing whether the computed
point is on the side it should be (and multiplying by -1 if not), it is used
to compute the sign of cv in drawBezApproxForArc, and for computing rev.


The only reason I ask is
because I think the sign of mmxy is probably controllable by
understanding the input conditions, but this test should be safe
(modulo if it really works at 180 degrees).  If it has failure modes at 180
degrees then reworking the math to produce the right sign in the first
place may be more robust for that case.  A test for this is to render
(0,0) -> (100,0) -> (0,0) with round caps and then rotate it through
360 degrees and see if the round caps invert at various angles.


I already did that. I drew 100 lines like the one you describe. I attached
the results. It never fails. It is still possible that there could be some
case where it fails, but this does prove that such a case would be very rare.


Also, line 256 - does that track rev?


It does. I changed the test to if(rev).


Cool, but above I was also asking the same question about line 231, and 
you provided a lot of information about line 231 (and a test to verify 
it), but didn't answer if the test in line 231 also tracks rev the same 
way...?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham

On 10/20/10 7:54 AM, Denis Lila wrote:

In #2, you have a bunch of I'() || B'() which I read as the slope
of the derivative (i.e. acceleration) is equal, don't you really mean
I() || B() which would mean the original curves should be parallel?
Otherwise you could say I'() == B'(), but I think you want to show
parallels because that shows how you can use the dxy1,dxy4 values as
the parallel equivalents.


Not really. I've updated the comment explaining what || does, and
it should be clearer now. Basically, A(t) || B(t) means that vectors
A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero c),
not that curves A and B are parallel at t.


I'm not sure we are on the same page here.

I'() is usually the symbol indicating the derivative of I().  My issue 
is not with the || operator, but with the fact that you are applying it 
to the I'() instead of I().


Also, how is A(t) and B(t) are parallel not the same as the curves A 
and B are parallel at t?


Also, A(t) = c*B(t) is always true for all A and B and all t if you take 
a sample in isolation.  Parallel means something like A(t) = c*B(t) 
with the same value of c for some interval around t, not that the 
values at t can be expressed as a multiple.


Again, I'() || B'() says to me that the derivative curves are parallel, 
not that the original curves are parallel...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham
Right, but it seemed to me that if omxy was the from vector and mxy 
was the to vector, that the computed mmxy should always be predictably 
on the same side of it, no?  If it was on the wrong side then it 
wouldn't be a random occurrence, it must be related to the input data. 
So either it is always on the right side, always on the wrong side (i.e. 
just reverse the rotation in the math), or always on the right/wrong 
side depending on the CWness of the join angle - which would be 
reflected in rev...  No?


...jim

On 10/20/10 10:29 AM, Denis Lila wrote:

Cool, but above I was also asking the same question about line 231,
and you provided a lot of information about line 231 (and a test to verify
it), but didn't answer if the test in line 231 also tracks rev the
same way...?


Oh, no, line 231 isn't meant to be related to rev at all. It just checks
to see on which side of the (omx,omy),(mx,my) chord the computed (mmx, mmy)
is.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com  wrote:


Hi Denis,

One clarification:

On 10/20/10 7:11 AM, Denis Lila wrote:

When would the isCW test trigger?  Does it track rev?  What happens
at 180 degrees (is that test reliable for the randomization that might
happen when omxy are directly opposite mxy)?


isCw is used for computing the arc bisector by testing whether the computed
point is on the side it should be (and multiplying by -1 if not), it is used
to compute the sign of cv in drawBezApproxForArc, and for computing rev.



The only reason I ask is
because I think the sign of mmxy is probably controllable by
understanding the input conditions, but this test should be safe
(modulo if it really works at 180 degrees).  If it has failure modes
at 180 degrees then reworking the math to produce the right sign in
the first place may be more robust for that case.  A test for this is
to render (0,0) -> (100,0) -> (0,0) with round caps and then rotate it
through 360 degrees and see if the round caps invert at various angles.


I already did that. I drew 100 lines like the one you describe. I attached
the results. It never fails. It is still possible that there could be some
case where it fails, but this does prove that such a case would be very rare.



Also, line 256 - does that track rev?


It does. I changed the test to if(rev).


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-20 Thread Jim Graham
OK, I can see how your terminology works now, but it seems odd to me.  I 
never consider re-expressing the coordinates on a curve as a vector and 
basing geometric properties on those constructed vectors.  I either 
consider the points on the curve, or its tangent or its normal - none of 
which is the value you are expressing.  You are, essentially, operating 
on tangent vectors, but you aren't calling them that, you are calling 
them something like the vector of the derivative which is a relative 
(direction only) version of the tangent vector (which has location and 
direction).  When one talks about curves and being parallel, my mind 
tends to think of the tangents of the curves being parallel and tangents 
are directed by the first derivative.


Also, if you are going to use your definition of vector then parallel 
is an odd term to use for values that originate from the same point 
(points considered as a vector are taken to originate from 0,0) - really 
you want those vectors to be collinear, not (just) parallel.


So, either || means the coordinates of the curves expressed as vectors 
are collinear or it means the curves (i.e. the tangents of the curve 
at the indicated point) are parallel.  Saying vector I() is parallel 
to vector B() didn't really have meaning to me based on the above biases.


So, I get your comment now and all of the math makes sense, but the 
terminology seemed foreign to me...


...jim

On 10/20/10 10:48 AM, Denis Lila wrote:

Also, how is A(t) and B(t) are parallel not the same as the curves A
and B are parallel at t?


Well, suppose A and B are lines with endpoints (0,0), (2,0) for A
and (0,1),(2,1) for B. Obviously, for all t, A and B are parallel at t.
However let t = 0.5. Then A(t) = (1,0) and B(t) = (1, 1). The vectors
(1,0) and (1,1) are not parallel, so saying A(t) || B(t) is the same
as saying that there exists c such that (1,0) = c*(1,1), which isn't true.

However, A'(t)=(2,0) and B'(t)=(2,0), and the vectors
(2,0) and (2,0) are parallel.

Does this make more sense?

Regards,
Denis.

- Jim Graham james.gra...@oracle.com  wrote:


On 10/20/10 7:54 AM, Denis Lila wrote:

In #2, you have a bunch of I'() || B'() which I read as the slope
of the derivative (i.e. acceleration) is equal, don't you really mean
I() || B() which would mean the original curves should be parallel?
Otherwise you could say I'() == B'(), but I think you want to show
parallels because that shows how you can use the dxy1,dxy4 values as
the parallel equivalents.


Not really. I've updated the comment explaining what || does, and
it should be clearer now. Basically, A(t) || B(t) means that vectors
A(t) and B(t) are parallel (i.e. A(t) = c*B(t), for some nonzero c),
not that curves A and B are parallel at t.


I'm not sure we are on the same page here.

I'() is usually the symbol indicating the derivative of I().  My issue
is not with the || operator, but with the fact that you are applying it
to the I'() instead of I().

Also, A(t) = c*B(t) is always true for all A and B and all t if you take
a sample in isolation.  Parallel means something like A(t) = c*B(t)
with the same value of c for some interval around t, not that the
values at t can be expressed as a multiple.

Again, I'() || B'() says to me that the derivative curves are parallel,
not that the original curves are parallel...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-21 Thread Jim Graham

Hi Denis,

I saw something in the latest webrev that reminded me of an earlier comment.

On 10/18/2010 2:21 PM, Denis Lila wrote:

line 389 - The test here is different from closePath.  What if they
were both prev == DRAWING_OP_TO?


I am now using prev!=DRAWING_OP_TO (not ==, since it is supposed to execute
finish() if no nonzero length lines have been fed to Stroker yet). In fact
I have removed the started variable since it's equivalent to 
prev==DRAWING_OP_TO.


It looks like closePath still uses a different test than moveTo and 
pathDone.  They all test for DRAWING_OP_TO, but closepath uses != 
whereas the others use ==.  Is that right?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Jim Graham

Hi Denis,

On 10/25/2010 7:34 AM, Denis Lila wrote:

(and I have some ideas on further optimizations to consider if you are still
game after this goes in)...


I'd love to hear what they are.


Here are my thoughts:

- Currently Renderer has more stages than we probably should have:
for (each full pixel row) {
    for (each subpixel row) {
        for (each curve on subpixel row) {
            convert curve into crossing
            crossing is (subpixelx:31 + dir:1)
        }
        sort crossings for subpixel row
        apply wind rule and convert crossings into alpha deltas
    }
    convert alpha deltas into run lengths
}
for (each tile) {
    convert run lengths into alphas
}
I'm thinking we should be able to get rid of a couple of those stages...

- One alternate process would be:
(all work is done in the tail end of the cache itself)
for (each full pixel row) {
    for (each curve on full pixel row) {
        convert curve into N subpixel crossings
        (subpixelx:28 + subpixelfracty:3 + dir:1)
    }
}
sort all crossings for entire pixel row
maybe collapse raw crossings into modified accumulated crossings
record start of row in index array
for (each tile) {
    convert raw or collapsed crossings directly into alphas
}
Note that we could simply leave the raw crossings in the cache and then 
process them in the tile generator, but that would require more storage 
in the cache since a given path would tend to have 8 different entries 
as it crossed every subpixel row.  If we collapse them, then we can sum 
the entries for a given x coordinate into a single entry and store:

(pixelx:25 + coveragedelta:7)
where the coverage delta is a signed quantity up to N_SUBPIX^2 so it 
uses the same storage we needed for the subpixel parts of both X and Y 
plus the direction bit.  It likely needs a paired value in the next x 
pixel location just like our current alpha deltas need as well.  If 
we wanted to optimize that then we could devote one more bit to "the 
next pixel will consume all of the remainder of the N^2 coverage", but 
there would be cases where that would not be true (such as if the pixel 
row has only partial vertical coverage by the path).  It's probably 
simpler to just have deltas for every pixel.
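
A sketch of that packed encoding; the field widths are the illustrative
values floated above, not committed ones:

    final class CrossingCodec {
        // pixel x in the high 25 bits, signed coverage delta in the low 7.
        // With 8x8 subpixels the delta can reach +/-64, which just
        // overflows a signed 7-bit field, so a real encoding would need
        // a bias or one more bit; that detail is glossed over here.
        static final int DELTA_BITS = 7;
        static final int DELTA_MASK = (1 << DELTA_BITS) - 1;

        static int pack(int pixelx, int coverageDelta) {
            return (pixelx << DELTA_BITS) | (coverageDelta & DELTA_MASK);
        }

        static int pixelX(int packed) {
            return packed >> DELTA_BITS;   // arithmetic shift keeps negative x
        }

        static int coverageDelta(int packed) {
            // sign-extend the low 7 bits
            return (packed << (32 - DELTA_BITS)) >> (32 - DELTA_BITS);
        }
    }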


The storage here would likely be similar to the storage used for the 
current cache since the current RLE cache uses 2 full ints to store a 
coverage and a count.  And in cases where we have one pixel taking up 
partial coverage and the following pixel taking up the remainder of the 
full coverage then we have 4 ints, but the crossing delta system would 
only have 2 ints.


Other thoughts...

- Create a curve class and store an array of those so you don't have to 
iterate 3 different arrays of values and use array accesses to grab the 
data (each array access is checked for index OOB exceptions).


- Or perform AFD on curves as they come into Renderer, but only save 
line segment edges in the edges array.  This would use more storage, but 
the iterations of the AFD would happen in a tight loop as the data comes 
in rather than having to store all of their AFD data back in the quads 
and curves arrays and then reload the data for every sub-pixel step. 
Renderer still takes curves, it just breaks them down immediately rather 
than on the fly.  If there are only a small number of edges per curve 
then the storage might not be that much worse because the quad and curve 
arrays already store more values than the edge array.


- Convert to native.  Note that we use a native version of the pisces 
that you started with to do some rendering in FX.  I tried porting to 
use your new (Java) renderer in FX and performance went down even though 
you show it to be faster than what was there before.  So, your Java 
renderer compares favorably to the old Java pisces, but both compare 
unfavorably to the old native pisces.  Maybe we should convert your code 
to native and see if that gives us a performance boost.  It's nice to 
use pure Java, but there is a lot of shoehorning of data going on here 
that could be done much more easily and naturally in native code.


How is that for food for thought?

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-25 Thread Jim Graham

Hi Denis,

Just to be certain - you are still planning on putting the existing 
stuff back and we're talking about future work, right?  I'd love to get 
a stake in the ground here.


On 10/25/2010 3:30 PM, Denis Lila wrote:

- Create a curve class and store an array of those so you don't have
to iterate 3 different arrays of values and use array accesses to grab
the data (each array access is checked for index OOB exceptions).


I actually implemented something like this in my first draft of the current
version of the renderer. I didn't stick with it because it was slower than
even what we have in openjdk6. Granted, my implementation was a bit more
high level than simply making 3 classes to represent lines quads and cubics,
but it seemed pretty hopeless, so I didn't spend any time figuring out
exactly what it was that made it slower.


Hmmm...  Perhaps object allocation overhead was biting us there.  In 
native code you could cobble this up with batch allocation of space and 
pseudo-object/struct allocation out of the batches.



- Or perform AFD on curves as they come into Renderer, but only save
line segment edges in the edges array.  This would use more storage,
but the iterations of the AFD would happen in a tight loop as the data
comes in rather than having to store all of their AFD data back in the quads


I considered this before abandoning the high level version I mention above.
I didn't go with it because, while I am not worried about the higher memory
consumption, I was afraid of the impact that having this many edges would
have on the qsort call and on lines 99-113 and 140-148 in next().


I can see worrying about qsort, but I think one qsort would be 
inherently faster than 3 qsorts which you have anyway so the difference 
would get lost in the noise.  Also, I'm not sure how the 99 to 113 and 
140 to 148 would be affected.  The path will have the same number of 
crossings per sample row regardless of whether the curves have been 
flattened or not.  You might be adding and deleting edges from the 
active list more often, though (in other words, 99 would dump more 
curves and 140 would take in more curves), but the number of edges or 
curves at any given point would not be affected by flattening.  Also, 
the way you've written the loops at 99, they have to copy every 
edge/quad/curve that *doesn't* expire so a skipped curve is actually 
less work for that loop.  The loops at 140 would have to occasionally do 
more processing, but it is made up for in the fact that 99 does less 
work and the nextY processing is simpler.



How about this: we change the format of the edges array to be an array of
sequences of edges. So, right now the edges array has this format:
E* where E represents 6 consecutive floats {ymin,ymax,curx,cury,or,slope}.
I am proposing we change it to {n,or}({ymin,ymax,curx,cury,slope})^n.
lineTo's would add an edge sequence with n=1 to the edges array. If a call
to quadTo or curveTo produced N curves, it would simply put N,or at the
end of the edges array, and then append the data for the N produced edges.
I think this would give us the best of both worlds, in that we can do all
the AFD iterations in a tight loop close to the input methods and it
doesn't present any problems with respect to sorting or managing the
active list. It can probably be implemented completely transparently with
respect to ScanlineIterator. The details of
the implementation involve an interesting speed/memory trade off, but we
can discuss that later.
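
A minimal sketch of appending one such run, under assumed names (the
growable-array bookkeeping is elided):

    final class EdgeRunSketch {
        // Append {n, or} followed by n {ymin, ymax, curx, cury, slope}
        // records.  `edges` stands in for the Renderer's growable float
        // array; `end` is the current end-of-data index and the new one
        // is returned.  A lineTo would pass a single 5-float record, a
        // flattened quad or cubic N of them.
        static int appendRun(float[] edges, int end, float or, float[][] run) {
            edges[end++] = run.length;   // n
            edges[end++] = or;           // one orientation for the whole run
            for (float[] e : run) {
                edges[end++] = e[0];     // ymin
                edges[end++] = e[1];     // ymax
                edges[end++] = e[2];     // curx
                edges[end++] = e[3];     // cury
                edges[end++] = e[4];     // slope
            }
            return end;
        }
    }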


I think this might be overkill since sorts tend to have logN behavior so 
doubling the number of edges would not double the time taken in the 
sort.  Also, I would think that the sort would be a small amount of time 
compared to the rest of the processing, wasn't it?


If we are really worried about the y-sort, then how about creating a 
bunch of buckets and doing a bucket sort of the edges?  As they are 
added to the list of segments, we accumulate their indices in a row list 
based on their startY so that each step of the next() simply moves to 
the next Y and adds the edges mentioned in the list there.  Some work 
would have to be done on how to manage the storage simply (like a 
"rownext" field in the edge structure so that the edges in a row form a 
linked list), but then there would be no qsort at all for the cost of a 
new int[N_ROWS] and an extra field in every edge.



- Convert to native.  Note that we use a native version of the pisces
that you started with to do some rendering in FX.  I tried porting FX to
use your new (Java) renderer and performance went down even
though you show it to be faster than what was there before.  So, your Java
renderer compares favorably to the old Java pisces, but both compare
unfavorably to the old native pisces.  Maybe we should convert your
code to native and see if that gives us a performance boost.  It's nice to
use pure Java, but there is a lot of shoehorning of data going on
here that could be done much more easily in native code...

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-27 Thread Jim Graham

Hi Denis,

On 10/26/2010 6:58 AM, Denis Lila wrote:

90% (guesstimate) of the time edges do not cross each other, thus if
you sort the crossings without reordering the active edges then you just
end up doing the same sorting work (same swaps) on the next scanline.  My
SpanShapeIterator code actually reordered the edges on every sample
line to match their current X coordinates in a way that involved 1 compare
per edge that was processed and only occasionally a swap of 2 edge
pointers.  It would basically eliminate the sort at line 149 at the
cost of only doing a compare in the nextY processing for the very very
common case.


I also implemented this some time ago. I didn't keep it because it slowed
things down a bit. However, I only tested it with very complex and large
paths, and in hindsight, I shouldn't have based my decision on that, so I
will re-implement it.


I tried using this new rasterizer in FX and I had a test case which had 
a few shapes that were essentially zig-zaggy shapes on the top and 
bottom and fills between the zigzags (kind of like a seismic chart with 
fills between the pen squiggles).


When I added a simple sort of the linear edges the performance nearly 
doubled in speed.  Here is the code I replaced your quad-next-edge loop 
with:


for (int i = elo; i < ehi; i++) {
    int ptr = edgePtrs[i];
    int cross = ((int) edges[ptr+CURX]) << 1;
    if (edges[ptr+OR] > 0) {
        cross |= 1;
    }
    edges[ptr+CURY] += 1;
    edges[ptr+CURX] += edges[ptr+SLOPE];
    int j = crossingIdx;
    while (--j >= 0) {
        int jcross = crossings[j];
        if (cross >= jcross) {
            break;
        }
        crossings[j+1] = jcross;
        edgePtrs[elo+j+1] = edgePtrs[elo+j];
    }
    crossings[j+1] = cross;
    edgePtrs[elo+j+1] = ptr;
    crossingIdx++;
}

I then did a conditional sort, moved to right after the qlo-qhi and 
clo-chi loops:


for (int i = qlo; i < qhi; i++) {
    // same stuff
}
for (int i = clo; i < chi; i++) {
    // same stuff
}
if (qhi > qlo || chi > clo) {
    Arrays.sort(crossings, 0, crossingIdx);
}

In the case of my test case where I only had a polygon to fill, the 
performance doubled.  Obviously I didn't do much for the case where 
there were curves because this was just a quick hack to see the value of 
sorting the edges.  If we moved to a Curve class or some other way to 
consolidate the 3 lists (may be easier in native code), this might win 
in more cases...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-10-28 Thread Jim Graham

Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:

If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this might win
in more cases...


Does that mean you no longer think we should flatten every curve as soon
as we get it?


No, I was just discussing the feasibility of that one suggestion in the 
context of all of the possibilities of whether or not we took the other 
choices.  If you think that flattening will pay off then we don't have 
to worry about the 3 lists.  It was just that when I hacked it in, I 
still had your 3 lists and so the pros and cons of horizontal edge 
sorting were relative to that version of the renderer...


...jim


Re: [OpenJDK 2D-Dev] Java's definition of the SRC operator

2010-10-29 Thread Jim Graham
SRC behaves like SRC, but AA is another part of the equation.  It works 
like this (for any rule):


blendresult = PORTER_DUFF(rule, rendercolor, dstcolor, extraalpha)
// For SRC, blendresult = rendercolor modulated by extra alpha
storedresult = INTERP(dstcolor, blendresult, aacoverage)
// For full aa coverage, storedresult = blendresult

The only part of this that could possibly be interpreted as behaving 
like SRC_OVER would be the second INTERP and it depends on the aa 
coverage, not on the alpha of the colors involved.  Is that what they 
were talking about?


But, the interior of shapes should all have full aa coverage and so 
should just store the blendresult (which, in the case of SRC is 
rendercolor)...


...jim

On 10/29/10 12:06 PM, Clemens Eisserer wrote:

Hi,

Some users reported problems with the IntelliJ IDEA editor when
running with xrender enabled.
It turned out that there are some differences between how Java and
xrender interpret the SRC operator.

Is the general rule that SRC behaves like SRC_OVER when antialiasing
is enabled?
Are there some special cases that need to be taken care of?

Thanks, Clemens


Re: [OpenJDK 2D-Dev] Strange text rendering with SRC + Extra Alpha

2010-10-29 Thread Jim Graham
Why is SRC with an extra alpha handled any differently than SrcNoEa with 
a color that has alpha?  The two cases are supposed to be folded 
together because it doesn't matter where the alpha comes from.


There is also a paintType indicator that indicates when the paint is 
opaque.  If you only register the loops for opaque paints then I think 
the state may not be enough as you say, but if the loops can handle 
alpha then they can handle Src with Ea...


...jim

On 10/29/10 12:36 PM, Phil Race wrote:

I didn't get time to look at this yesterday.
I think the problem is (sort of) in SunGraphics2D.setComposite() where
we have

}
} else if (newCompType == CompositeType.SrcNoEa ||
newCompType == CompositeType.Src ||
newCompType == CompositeType.Clear)
{

This then categorises the composite state SRC as COMP_ISCOPY
regardless of whether there is an extra alpha.

This code was introduced in
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6248957

I believe that there's some handling of the extra alpha for most
rendering, but
it's enough to mislead SurfaceData into thinking it can create and use
LCD glyphs.
I've not yet looked into what exactly happens after that, but it
seems like we end up in a loop that uses the LCD glyph
data as if it's a greyscale mask, and you'll get garbage.

If this is the case we can't rely on the compositeState variable and
will need to check the composite type.

-phil.

Denis Lila wrote:

Hello.

After noticing some weird things I made some modifications to
Clemens' test. I've attached the results and the tester.
The most interesting results are the first 18 lines in each
image, all of which are produced by rendering to INT_RGB type
buffered images. The first 6 lines have both AA_ON and subpixel aa.
Lines 7-12 have only subpixel AA, and lines 13-18 have only AA_ON.
In each group of 6, the only thing that changes is the order in
which the hints, the composite, and the colour are set.

Lines 1-6 and 13-18 don't look right on closed jdk (1.6.0_21-b06).
The only thing they have in common is that the AA hint is on.
Lines 7-12 look ok - they only have LCD AA. We can see that the order
doesn't matter.

On openJDK7 (with a build of a fresh clone of the 2d tree) lines
1-3 look like what Clemens is seeing, but the next 3 don't look bad,
even though all of the first 6 lines are supposed to be drawn with
a black colour, SRC compositing, and both AA_ON and LCD AA.
The same goes for lines 13-15. What lines 1-3 and 13-15 have in
common is that the colour is set after the composite.
Lines 7-12 still look ok, but they are a bit different from what closed
java produces, even though subpixel antialiasing is clearly used in both.

That's it for my observations.
Regards,
Denis.

- Clemens Eisserer linuxhi...@gmail.com wrote:


Hi again,

Attached is the sample program as well as the output I get on
jdk7b99,
when running the sample program attached.

- Clemens










Re: [OpenJDK 2D-Dev] Strange text rendering with SRC + Extra Alpha

2010-10-29 Thread Jim Graham
If you allow ALPHACOLOR (paintState <= ALPHACOLOR) then you should be 
able to handle Src with EA...


...jim

On 10/29/10 12:49 PM, Phil Race wrote:

This fixes it, although the same may need to be done to OGL and D3D
subclasses
of SurfaceData.java

-phil.

diff --git a/src/share/classes/sun/java2d/SurfaceData.java
b/src/share/classes/sun/java2d/SurfaceData.java
--- a/src/share/classes/sun/java2d/SurfaceData.java
+++ b/src/share/classes/sun/java2d/SurfaceData.java
@@ -450,6 +450,7 @@ public abstract class SurfaceData
 // For now the answer can only be true in the following cases:
 if (sg2d.compositeState <= SunGraphics2D.COMP_ISCOPY &&
     sg2d.paintState <= SunGraphics2D.PAINT_ALPHACOLOR &&
+    sg2d.imageComp != CompositeType.Src &&
     sg2d.clipState <= SunGraphics2D.CLIP_RECTANGULAR &&
     sg2d.surfaceData.getTransparency() == Transparency.OPAQUE)
 {


Phil Race wrote:

I didn't get time to look at this yesterday.
I think the problem is (sort of) in SunGraphics2D.setComposite() where
we have

}
} else if (newCompType == CompositeType.SrcNoEa ||
newCompType == CompositeType.Src ||
newCompType == CompositeType.Clear)
{

This then categorises the composite state SRC as COMP_ISCOPY
regardless of whether there is an extra alpha.

This code was introduced in
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6248957

I believe that there's some handling of the extra alpha for most
rendering, but
it's enough to mislead SurfaceData into thinking it can create and use
LCD glyphs.
I've not yet looked into what exactly happens after that, but it
seems like we end up in a loop that uses the LCD glyph
data as if it's a greyscale mask, and you'll get garbage.

If this is the case we can't rely on the compositeState variable and
will need to check the composite type.

-phil.

Denis Lila wrote:

Hello.

After noticing some weird things I made some modifications to
Clemens' test. I've attached the results and the tester.
The most interesting results are the first 18 lines in each
image, all of which are produced by rendering to INT_RGB type
buffered images. The first 6 lines have both AA_ON and subpixel aa.
Lines 7-12 have only subpixel AA, and lines 13-18 have only AA_ON.
In each group of 6, the only thing that changes is the order in
which the hints, the composite, and the colour are set.

Lines 1-6 and 13-18 don't look right on closed jdk (1.6.0_21-b06).
The only thing they have in common is that the AA hint is on.
Lines 7-12 look ok - they only have LCD AA. We can see that the order
doesn't matter.

On openJDK7 (with a build of a fresh clone of the 2d tree) lines
1-3 look like what Clemens is seeing, but the next 3 don't look bad,
even though all of the first 6 lines are supposed to be drawn with
a black colour, SRC compositing, and both AA_ON and LCD AA.
The same goes for lines 13-15. What lines 1-3 and 13-15 have in
common is that the colour is set after the composite.
Lines 7-12 still look ok, but they are a bit different from what closed
java produces, even though subpixel antialiasing is clearly used in
both.

That's it for my observations.
Regards,
Denis.

- Clemens Eisserer linuxhi...@gmail.com wrote:



Hi again,

Attached is the sample program as well as the output I get on
jdk7b99,
when running the sample program attached.

- Clemens













Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-01 Thread Jim Graham

Hi Denis,

A generic suggestion - make all of your classes final - that gives the 
compiler the maximum flexibility to inline any methods you write.


With respect to the algorithm choices:

I think the key is that the X sorting rarely has any work to do.  The 
first test of "does this edge need to be swapped with the next lower 
edge" is probably 99.999% guaranteed to be false.  Thus, trying to 
optimize anything else to simplify swapping is likely a step in the 
wrong direction.


The casting may be hurting a bit more, and it is hurting on every access 
to an edge.


I'm guessing the best model to use would be to write the code to assume 
no swapping is necessary at all.  Get that code as simple and as fast as 
can be.  Then add as little perturbation of that code to manage swapping 
when it is necessary, and that will likely be the optimal 
implementation.  I think you could probably even do some benchmarking on 
a path that is carefully tested to process lots of edges without any X 
sorting and get that as fast as you can without any swap code, and then 
add the swap code that impacts the performance of that operation as 
little as possible.  The key is that the swap code have as little impact 
on the performance of the case that never needs any swapping at all 
first and foremost - then make swapping faster within that constraint...


...jim

On 11/1/10 3:13 PM, Denis Lila wrote:

Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the scanlines
too. What happens is that on any iteration, the active list is the
doubly linked list buckets[nextY-boundsMinY]. I did this because I thought
less memory would need to be moved around compared to when we just kept
the active list pointers in an array. For example, with doubly linked
lists we can implement insertion sort with O(1) writes. With arrays we
have to use O(n) writes. This also allows us to get rid of the edgePtrs
array.

I ran some benchmarks, and unfortunately I was wrong somewhere. All the tests
are at least 10% slower than the insertion sort version of what we have now.
I can't immediately see why this is. The only thing I can think of is that
there are a lot of float -> int casts, but then again, I don't know how
slow this operation is. It would be good if it's because of the casts because
it would no longer be an issue when we convert to native.

I also made an unrelated change: I now find the orientation and update x0,y0
much earlier than we used to. The way I was doing it before was silly.

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

Good news!

On 10/28/2010 3:27 PM, Denis Lila wrote:

If we moved to a Curve class or some other way to
consolidate the 3 lists (may be easier in native code), this might

win

in more cases...


Does that mean you no longer think we should flatten every curve as

soon

as we get it?


No, I was just discussing the feasibility of that one suggestion in
the
context of all of the possibilities of whether or not we took the
other
choices.  If you think that flattening will pay off then we don't have

to worry about the 3 lists.  It was just that when I hacked it in, I
still had your 3 lists and so the pros and cons of horizontal edge
sorting were relative to that version of the renderer...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-02 Thread Jim Graham

Hi Denis,

I had a bit of luck with the following next() method:

private int next() {
    // TODO: make function that converts from y value to bucket idx?
    int bucket = nextY - boundsMinY;
    for (int ecur = edgeBuckets[bucket]; ecur != NULL;
         ecur = (int) edges[ecur+NEXT])
    {
        edgePtrs = LilaHelpers.widenArray(edgePtrs, edgeCount, 1);
        edgePtrs[edgeCount++] = ecur;
        // REMIND: Adjust start Y if necessary
    }
    int crossingCount = edgeCount;
    crossings = LilaHelpers.widenArray(crossings, 0, crossingCount);
    nextY++;
    for (int i = 0; i < edgeCount; i++) {
        int ecur = edgePtrs[i];
        float curx = edges[ecur+CURX];
        int cross = ((int) curx) << 1;
        edges[ecur+CURX] = curx + edges[ecur+SLOPE];
        if (edges[ecur+OR] > 0) {
            cross |= 1;
        }
        int j = i;
        while (--j >= 0) {
            int jcross = crossings[j];
            if (jcross <= cross) {
                break;
            }
            crossings[j+1] = jcross;
            edgePtrs[j+1] = edgePtrs[j];
        }
        crossings[j+1] = cross;
        edgePtrs[j+1] = ecur;
    }
    int newCount = 0;
    for (int i = 0; i < edgeCount; i++) {
        int ecur = edgePtrs[i];
        if (edges[ecur+YMAX] > nextY) {
            edgePtrs[newCount++] = ecur;
        }
    }
    edgeCount = newCount;
    return crossingCount;
}

This allowed me to:

- delete a lot of the bucket helper functions.
- get rid of the back pointers
- pare an edge down to 5 values (YMAX, CURX, OR, SLOPE, and NEXT)

I also used the following addLine() method:

private void addLine(float x1, float y1, float x2, float y2, int or) {
    final int firstCrossing = (int) Math.max(Math.ceil(y1), boundsMinY);
    if (firstCrossing >= boundsMaxY) {
        return;
    }
    final int ptr = numEdges * SIZEOF_EDGE;
    final float slope = (x2 - x1) / (y2 - y1);
    edges = LilaHelpers.widenArray(edges, ptr, SIZEOF_EDGE);
    numEdges++;
    edges[ptr+OR] = or;
    edges[ptr+CURX] = x1 + (firstCrossing - y1) * slope;
    edges[ptr+SLOPE] = slope;
    edges[ptr+YMAX] = y2;
    final int bucketIdx = firstCrossing - boundsMinY;
    addEdgeToBucket(ptr, bucketIdx);
}

How does that fare for you?

...jim

On 11/2/2010 4:10 PM, Denis Lila wrote:

Hi Jim.

I implemented a middle ground between what I sent yesterday and
what we have now: using the array of linked lists only to replace
the quicksort (I think this was your original suggestion).

Unfortunately, this didn't work out (at least according to the
benchmarks). Curves were no different than what we used to have,
while the performance on lines (especially horizontal ones) decreased
significantly.

It's not obvious to me why this happened, so I think now I will put
this type of optimization aside and convert to JNI, where profiling
will be easier (for me - I haven't been able to make OProfile work
for java yet).

Regards,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

A generic suggestion - make all of your classes final - that gives the

compiler the maximum flexibility to inline any methods you write.

With respect to the algorithm choices:

I think the key is that the X sorting rarely has any work to do.  The

first test of "does this edge need to be swapped with the next lower
edge" is probably 99.999% guaranteed to be false.  Thus, trying to
optimize anything else to simplify swapping is likely a step in the
wrong direction.

The casting may be hurting a bit more, and it is hurting on every
access
to an edge.

I'm guessing the best model to use would be to write the code to
assume
no swapping is necessary at all.  Get that code as simple and as fast
as
can be.  Then add as little perturbation of that code to manage
swapping
when it is necessary, and that will likely be the optimal
implementation.  I think you could probably even do some benchmarking
on
a path that is carefully tested to process lots of edges without any X

sorting and get that as fast as you can without any swap code, and
then
add the swap code that impacts the performance of that operation as
little as possible.  The key is that the swap code have as little
impact
on the performance of the case that never needs any swapping at all
first and foremost - then make swapping faster within that
constraint...

...jim

On 11/1/10 3:13 PM, Denis Lila wrote:

Hi Jim.

I implemented your bucket sort idea. I'm not just using the buckets
to remove the y-sort. I use them in the iteration through the

scanlines

too. What happens is that on any 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

On 11/8/2010 6:34 AM, Denis Lila wrote:

Hi Clemens.


I've only followed your discussion with Jim but skipped all the
in-depth discussion.
From my prior experiences, JNI is usually not worth the trouble: if
you don't have a serious reason why using native code would have
advantages (like the possibility of using SIMD or when value-types
would be a huge benefit), it has its own performance pitfalls
especially if the workload is small and things like Get*ArrayCritical
cause scalability problems because they have to lock the GC.


Well, Jim Graham said that a native version of the engine still runs
a lot faster than the version with all my changes. That's why I thought


Actually, that report is old.  I've now got the new Java version turning 
in double the frame rates of the old native version.



it would be a good idea. Also, when not doing antialiasing we usually
feed paths to native consumers, so I thought if pisces used JNI, we
could reduce the java-C transitions five fold. But then I realized that
with antialiasing the opposite would happen, so I'm not sure whether
JNI is a good idea.


That's a good point that the other rasterizers will end up using this 
stroking engine and they are native.  We can worry about cleaning that 
up later.  JNI might eventually be a good idea, but let's fix the 
algorithm first and then worry about whether it will help this renderer 
or if we can make the interface to the other renderers simpler.


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham
It's still a work in progress, but I've cleaned up a lot of logic and 
made it faster in a number of ways.  Note that I've abstracted out the 
cache stuff and created an AlphaConsumer interface which may evolve 
over time.


In FX we actually consume alphas in larger chunks than the code in JDK 
which was driven by Ductus's 32x32 mandate, so I would have had to make 
completely incompatible changes to emitRow - so I moved it behind an 
interface.  For the JDK code, if you want to integrate this version, I 
would have the cache implement the new interface and move your version 
of emitRow into the Cache object.  I'd send you the new code for my 
AlphaConsumer, but it is incompatible with what you need to cache so it 
won't help you.


You'll also need a bit of un-translation cleanup as we have mirrors of 
all of java.awt.geom with special new APIs that FX needs.


...jim

On 11/8/2010 6:40 AM, Denis Lila wrote:

Hi Jim.


Also, I've gotten another 20% improvement out of the design with a few
more tweaks.  (Though I measured the 20% in the stripped down version
that I'm prototyping with FX so I'm not sure how much of that 20%
would show up through the layers of the 2D code.  Overall, I've about
doubled the frame rates of the prototype since your first drop that you
checked in to the OpenJDK repository.)


Can I see your new version?


Attached.


How about looking more at the stroking end of the process and I'll dig
a little more into optimal rasterization code.  I have a lot of
experience with optimizing rasterizer code (and JNI if it comes to that), but
very little with the curve manipulations involved in stroking (math is so
*hard* at my age ;-)...


Sounds good. Have you implemented your idea of processing one pixel row at a
time, as opposed to processing subpixel rows? If not, I could do that.


Not yet.  Right now I've gotten a lot of mileage out of a few tweaks of 
the bookkeeping of the sample-row-at-a-time version.  I'm still mulling 
over exactly how to make that go faster.


...jim
package com.sun.openpisces;

/**
 * @author Flar
 */
public interface AlphaConsumer {
public int getOriginX();
public int getOriginY();
public int getWidth();
public int getHeight();
public void setMaxAlpha(int maxalpha);
public void setAndClearRelativeAlphas(int alphaDeltas[], int y,
  int firstdelta, int lastdelta);
}
package com.sun.openpisces;

import com.sun.javafx.geom.PathConsumer;
import com.sun.javafx.geom.PathIterator;
import com.sun.javafx.geom.Rectangle;
import java.util.Iterator;

/**
 * @author Flar
 */
public final class OpenPiscesRenderer implements PathConsumer {

public static void feedConsumer(PathIterator pi, PathConsumer pc) {
float[] coords = new float[6];
while (!pi.isDone()) {
int type = pi.currentSegment(coords);
switch (type) {
case PathIterator.SEG_MOVETO:
pc.moveTo(coords[0], coords[1]);
break;
case PathIterator.SEG_LINETO:
pc.lineTo(coords[0], coords[1]);
break;
case PathIterator.SEG_QUADTO:
pc.quadTo(coords[0], coords[1],
  coords[2], coords[3]);
break;
case PathIterator.SEG_CUBICTO:
pc.curveTo(coords[0], coords[1],
   coords[2], coords[3],
   coords[4], coords[5]);
break;
case PathIterator.SEG_CLOSE:
pc.closePath();
break;
}
pi.next();
}
pc.pathDone();
}

private final class ScanlineIterator {

private int[] crossings;
private int[] edgePtrs;
private int edgeCount;

// crossing bounds. The bounds are not necessarily tight (the scan line
// at minY, for example, might have no crossings). The x bounds will
// be accumulated as crossings are computed.
private final int maxY;
private int nextY;

private static final int INIT_CROSSINGS_SIZE = 10;

private ScanlineIterator() {
crossings = new int[INIT_CROSSINGS_SIZE];
edgePtrs = new int[INIT_CROSSINGS_SIZE];

// We don't care if we clip some of the line off with ceil, since
// no scan line crossings will be eliminated (in fact, the ceil is
// the y of the first scan line crossing).
nextY = getFirstScanLineCrossing();
maxY = getScanLineCrossingEnd()-1;
}

private int next() {
// TODO: make function that convert from y value to bucket idx?
int cury = nextY++;
int bucket = cury - boundsMinY;
int count = this.edgeCount;
int ptrs[] = this.edgePtrs;
 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-08 Thread Jim Graham

A couple of questions about the code that I haven't touched...

Is there some reason why the AFD for cubics doesn't have any tests for 
dddxy (the constants for its equations), but the AFD for quads is 
testing the ddxy on every loop?  I know that these values do change when 
the AFD variables are doubled or halved, but why does the cubic version 
get away with only testing out to the n-1th order differences but the 
quad version has to test out to the nth order differences?


Also, what is the value of breaking the pieces into monotonic segments 
prior to flattening?  Is it simply amortizing the cost of determining if 
the segment is up or down?  I guess this used to be done because we 
needed monotonic (in Y) curve segments since we did a top-down iteration 
across all segments, but now they aren't in the rasterization loop.  If 
it is simply a performance issue then I may experiment with eliminating 
that stage and seeing if I can make it go faster overall.


Finally, I discovered (while testing for other problems) that the curves 
are not truly monotonic after slicing them.  I realized this years ago 
when I was writing my Area code (see sun.awt.geom.Curve) and put in 
tweaking code to make them monotonic after they were split.  They are 
never off by more than a few bits, but you can't trust the curve 
splitting math to generate purely monotonic segments based on a t 
generated by some unrelated math.  Sometimes the truly horizontal or 
vertical t value requires more precision than a float (or even a double) 
can provide and splitting at the highest representable float less than 
the t value produces a pair of curves on one side of the hill and 
splitting at the next float value (which is greater than the true t 
value) produces curves on the other side of the hill.  Also, when the 
curve has been split a few times already, the t values lose accuracy 
with each split.  This will all be moot if I can eliminate the splitting 
code from the renderer, but it may also play a factor in the stroke/dash 
code...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-11-09 Thread Jim Graham

I ended up going with:

- get rid of edgeMxy in all methods but addLine()
- addLine computes min/max of first/lastScanline
- addLine also computes min/max of x1,x2 values

this turned out to be just about the same speed for my FX rendering 
version (which I believe is more sensitive than the way it is integrated 
into JDK, so it should be even less noticeable in JDK).  It also paved 
the way for a couple of other optimizations that ended up netting about 
1FPS for my current test case that I use so I'm happy for now.


The code is a lot simpler now...

...jim

On 11/9/2010 3:26 PM, Denis Lila wrote:

Hi again.

I just thought of this: if we're really concerned about the accuracy
of the edgeMinX/edgeMaxX variables, we could find the curves'
critical points and use them to compute the min/max X values. After all,
we're creating (or rather setting) the Curve objects anyway. This isn't
as fast as using the bounding boxes, but it's close and much more accurate.

Regards,
Denis.

- Denis Lila dl...@redhat.com wrote:


Hi Jim.


All lines generated from a given allegedly monotonic curve are
recorded with the same "or" (orientation) value.  But, if the curves
are not truly monotonic then it might be theoretically possible to
generate a line that is backwards with respect to the expected

orientation.  It

would then get recorded in the edges array with the wrong

orientation

and slope and then rasterization might unravel.


I see. In that case, I think it's a good idea if we don't make curves
monotonic. I already did this, by moving the edgeM[in|ax]X/Y handling
and orientation computations in addLine. This did make it slower
compared
to the file you sent me, but only by very, very little. Curves were
affected the most, and they were only 1% slower. I think we can handle
this, especially since lines were about 1% faster. The code is also 70
lines shorter.

The edgeM* members are used only so we don't have to iterate through
every
scanline if this is not necessary, and so that we can tell PiscesCache
that the bounding box is smaller than what Renderer is given. However,
now
that we keep the bucket list, I think it would be more efficient if we
got rid of edgeM[in|ax]Y and simply computed the y bounds by looking
at the
head and tail of the bucket list.
Also, perhaps we can keep track of edgeM[in|ax]X using the bounding
boxes
of curves, instead of the lines in the flattened curves. This would
not
be accurate, but I don't think it would affect rendering. It would
simply
result in a few more alpha boxes than necessary. I don't think these
would
be too bad, because mostly they're just going to be all 0 so they will
be skipped because getTypicalAlpha() is now implemented.
How do you think we should handle these 4 variables?

Thank you,
Denis.

- Jim Graham james.gra...@oracle.com wrote:


Hi Denis,

On 11/8/2010 2:39 PM, Denis Lila wrote:

Finally, I discovered (while testing for other problems) that the
curves are not truly monotonic after slicing them.  I realized

this

years ago

when I was writing my Area code (see sun.awt.geom.Curve) and put

in

tweaking code to make them monotonic after they were split.  They

are

never off by more than a few bits, but you can't trust the curve
splitting math to generate purely monotonic segments based on a t
generated by some unrelated math.  Sometimes the truly horizontal

or

vertical t value requires more precision than a float (or even a
double) can provide and splitting at the highest representable

float less than

the t value produces a pair of curves on one side of the hill and
splitting at the next float value (which is greater than the true

t

value) produces curves on the other side of the hill.  Also, when

the

curve has been split a few times already, the t values lose

accuracy

with each split.  This will all be moot if I can eliminate the
splitting code from the renderer, but it may also play a factor

in

the

stroke/dash
code...


Making curves monotonic is only used for optimization purposes,
so I can't see how it would affect rendering correctness.


Fortunately, the non-monotonicity is limited to a few bits of
precision
so this may never generate an errant edge in practice unless
flattening
gets really fine-grained...

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-06 Thread Jim Graham

Hi Denis,

On 12/6/2010 4:21 PM, Denis Lila wrote:

Hi Jim.


line 134 - what if numx or numy == 0 because only roots outside [0,1]
were found?


In this case lines 151-162 will execute, and nothing is wrong. The only
problem is when both numx and numy are 0. This is certainly possible in
the general case (but only with quadratic curves), but the way we're using
this function, the intermediate value theorem guarantees at least one root
will be found. Of course, finite precision math doesn't necessarily care
what calculus has to say, but in this case I can't see how the root computation
could fail. On the other hand, one could argue that this is exactly why we need
to defend against this case, so I've added some checks.


I'm sure you will likely find a root, but the method you are using is 
roots*inAB* which may throw the root out because it is out of range, no?


In looking at that method it looks like the cubic handling code tries 0 
and 1 in addition to the critical points it finds using a root, but the 
quadratic code that it chooses if a==0 will throw out all roots outside 
the 0,1 range and may end up with 0 answers.  The cubic code further can 
reject all of the points (if they are all non-zero and same sign) and 
also return no answers, but may have fewer cases where it would do that.


Still, my point was not that you might fail to find a root, but that the 
roots may get rejected and end up with no answers in range.



line 145 - what if d0 and d1 are both 0?  NaN results.  What if you
just used a simple "d0 < d1 ? tx : ty" - how far off would that be?


I tried that out on a curve with very high acceleration, and it makes a 
noticeable
difference. So, instead I'm using
if (w0 == Float.NaN) {
return tx;
}


Read the IEEE spec on NaN.  It's a special value that has this bizarre 
property that it is the only number that is not equal to itself. ;-)  In 
fact, the test for NaN is usually "if (x == x) notNaN else NaN".  If 
you want to be explicit and formal then you can use the static 
Float.isNaN() method (which is essentially that test - x != x).


Same thing on Dasher line 363 where you also test for NaN.


line 357 - another optimization would be to check the acceleration in
the range and if it is below a threshold then simply use the t from
line 348 as the t for the split


I like this. I tried implementing it. I haven't tested it yet though, and
I still have to pick a good cutoff acceleration. For now I'm using 1/leaflen.
We definitely don't want it to just be a constant, since the longer the leaf,
the worse it will be to allow acceleration, so for longer leaves we want to
skip the getTCloseTo call only if the acceleration is smaller.


A lot of the lines before you test MaxAcc are not needed unless you go 
into the if.  In particular, x,y,[xy][01] are only used if you call 
getTCloseTo().



Renderer.java:  Is this just a straight copy of what I was working on?


I'm pretty sure it is, yes.


Actually I think you've updated the AFD code so I should really take a 
look... :-(


;-)


TransformingPathConsumer2D:

Any thoughts on whether we need translation in the scale filter and
whether a 4-element non-translation filter would be useful?  I think
the current code that drives this passes in the full transform and its
inverse, but it could just as easily do delta transforms instead and
save the adding of the translation components...


I thought about this long ago, but I wasn't sure it wouldn't break anything.
Today, I got a bit more formal with the math, and I think it's ok
to eliminate the translation. I've implemented this (though, I haven't had
time to test it yet. That's for tomorrow).


Right now you have (note that the common terminology for a transform 
without translation is "delta transform"):

PathIterator
=> DeltaAT => Normalize
=> DeltaInverseAT => strokers
=> FullAT => renderer

The problem is that normalization needs proper sub-pixel positioning so 
you need to hand it the true device space coordinates with proper 
translation.  You need this:


PathIterator
=> FullAT => Normalize
=> DeltaInverseAT => strokers
=> DeltaAT => renderer

I would skip the creation of atNotTranslationPart and just invert the 
original transform (since I don't think the inversion is affected by 
translation - you can see this in the calculations in 
AT.createInverse()), and then have the transforming consumers apply a 
delta transform - i.e. create a TPC2D.deltaTransformConsumer() method 
which would apply just the non-translation parts of AT to the consumer.


If you want to get really fancy with optimizations you could have an 
inverseDeltaTransformConsumer() method that would calculate the 
inversions on the fly to avoid construction of a scratch AT.  Since it 
is just weird transpose with signs and divide by the determinant in 
the most general case and even simpler (invert Mxx 

Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham

On 12/8/2010 9:37 AM, Denis Lila wrote:

Shouldn't it be [A, B]?


I thought about this when implementing it, but I don't think it mattered
whether it was closed or half open, and the closed interval would have been
somewhat more awkward to implement.


I'm not sure how the closed interval is awkward.  Isn't it just proper 
choice of <= and >= vs. < and > in the testing method?



getMaxAcc functions - don't we want the furthest value from 0,
positive or negative?  You are looking for most positive value
and negative accelerations are equally problematic, aren't they?
If so then these functions need some work.


You're right about both, but there's a much more serious problem that I
didn't think of when writing them: the value I compute in the if
statement in Dasher:355 is not an upper bound on the acceleration of
the curve. The acceleration is:
C'(t).dot(C''(t))/len(C'(t)) which in terms of the parameter polynomials is
(x'(t)*x''(t) + y'(t)*y''(t))/sqrt(x'(t)^2 + y'(t)^2)
What those functions would compute if they were correct would be
max(abs(x''(t))) and max(abs(y''(t))), and the sum of these is not
closely related to the maximum absolute acceleration, which is what we
want.
Without the upper bound property, I don't think it's a very meaningful
test, and I think we should abandon this optimization. Do you agree?


How about if the 3 segments of the control polygon are all close to 
each other in length and angle, then the optimization applies.  Is that 
easy to test?


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-08 Thread Jim Graham

Hi Denis,

On 12/8/2010 12:04 PM, Denis Lila wrote:

I'm not sure how the closed interval is awkward.  Isn't it just proper
choice of <= and >= vs. < and > in the testing method?


In the filtering function, yes, but I was referring to cubicRootsInAB in
Helpers:122-133 where we iterate through intervals. For each interval,
we have the values of the function at the ends, and if the left one
is 0, we just add it as a zero and skip the call to CubicNewton. In order
to get roots in [A,B], we would either have to test both endpoints (which
would be more expensive and it would force us to find a way of avoiding
duplicate roots), or we would have to deal with the last interval as
a special case. The latter is not that bad, but it is more awkward than
what we have now.


Perhaps it would be better if RootsInAB would advertise that it is 
returning approximations of a high precision, but they won't be exact 
and roots near the endpoints may not be caught and so the caller should 
be prepared to evaluate the endpoints manually to see if they represent 
interesting values for the purposes of why the roots were requested.


And then do that in the functions that call it.


How about if the 3 segments of the control polygon are all close to
each other in length and angle, then the optimization applies.  Is
that easy to test?


Hmm, that would actually be extremely easy to test and it would cost almost
nothing. We already compute the control polygon lengths in onLeaf, and
we're already assuming that every leaf is flat enough, so we probably
don't even need to check the angles. Comparing lengths should be good enough.
I'll try this out.


Actually, even if the lengths aren't close the lengths may give you 
enough information about the acceleration along the curve that you can 
do a decent approximation of the accelerated T value.  The T could be 
biased by some formula that is weighted by the ratios of the control 
polygon lengths.  As a very crude example, say you assumed that if the 
scaled leaf length fell into the first polygon segment's length then t 
should be proportionally a value from 0 to 1/3, and if it fell between 
the first poly len and the second then it would be proportionally a 
value from 1/3 to 2/3, etc.  The code might look like this:


polylen = l01 + l12 + l23
linelen = l03
// If l01 == l12 == l23 then most of the following becomes
// a NOP and t = leaflen/linelen
polyleaflen = leaflen * polylen / linelen;
if polyleaflen < l01 then t = (polyleaflen/l01)/3
else if polyleaflen < l01+l12 then t = ((pll-l01)/l12 + 1)/3
else t = (((pll-l01-l12)/l23) + 2)/3

An even better approximation would involve some more math, but that 
might be better than simply using the linear interpolation along the 
segment connecting their endpoints.


(Also, note that in the original code we probably would have just been 
dashing along the flattened curve anyway and so we might have just been 
using the raw linear t in that case - so anything we do here is a 
refinement of what we used to do and icing on the cake to some extent)...


...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Jim Graham

On 12/10/2010 8:27 AM, Denis Lila wrote:

Hi Jim.

Yes. The improvement shown by the bench marks is substantial.


Then this is great news!


Indeed :-)


Woohoo!


How often do we end up needing getTCloseTo in practice?


It depends on the ratios of the lengths of the sides of the control
polygon. The closer they are to 1, the less we need it. I'm not sure
how to answer more precisely - for that I would need a representative
sample of curves drawn in practice.


I was thinking of, say, when applied to a circle.  Does that get by 
without needing getTCloseTo?



However, I did run dashing and stroking benchmarks. Stroking shows
25% speedup (because of the refinements to the transform pipeline)
and cubic dashing has improved by 80%. Of course, all this is useless
if I've done something to make things look horrible, so I'm going to
run the gfx tests again.


Good job!

...jim


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-10 Thread Jim Graham

Hi Denis,

The example I gave was intended to be very crude - I was simply 
describing the technique, but as I said it would require better math to 
really know what the right formula would be.


With respect to finding a cubic root, currently you are doing that in 2 
dimensions, but what if we converted to 1 dimension?


Consider that the control polygon is fairly linear.  What if we 
rotated our perspective so that it was horizontal and then squashed it 
flat?  Consider instead a 1 dimensional bezier with control values of:


(where |mn| is the length of the m-n control polygon of the original 
curve - sum of all segments from point m to point n)


0.0, |01|, |02|, |03|

Solve that 1 dimensional bezier for v=(leaflen*polylen)/linelen...

...jim

On 12/10/2010 8:23 AM, Denis Lila wrote:

Hi Jim.


Actually, even if the lengths aren't close the lengths may give you
enough information about the acceleration along the curve that you can
do a decent approximation of the accelerated T value.  The T could be
biased by some formula that is weighted by the ratios of the control
polygon lengths.  As a very crude example, say you assumed that if the
scaled leaf length fell into the first polygon segment's length then t
should be proportionally a value from 0 to 1/3, and if it fell between
the first poly len and the second then it would be proportionally a
value from 1/3 to 2/3, etc.  The code might look like this:


I implemented this, and I'm not sure how to use this new approximation.
I mean, currently there are really two t's. The first one is the parameter
along the line connecting the 2 endpoints of the curve and the second
is the result that we return. We can't use this new approximation to
replace the first t, because for a curve like
(0,0),(1000,0),(1000,0),(1000,0) and a desired length of 500, the t
would be 1/6, so the computed (x,y) would be (1000/6,0) instead of
(500,0), which would be right (and which is what we compute now).

The only sensible way to use this kind of approximation would be as a
direct replacement for getTCloseTo. I tried that, and its quality to
speed ratio compared to getTCloseTo is remarkably good, but it's not
really usable because the differences are very noticeable.
I'll try to implement a good cubic root finder this weekend, and maybe
then getTCloseTo will be much faster and we won't have to worry about
this.


(Also, note that in the original code we probably would have just been
dashing along the flattened curve anyway and so we might have just
been
using the raw linear t in that case - so anything we do here is a
refinement of what we used to do and icing on the cake to some
extent)...


I'd say the dashing precision is better than what we used to have. It's
only slightly better, but even that is impressive when you consider that
we were doing up to 1024 subdivisions before, and now it's only 16, I think.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Jim Graham
Very nice!  How does it compare to CubicCurve.solveCubic() (which I know 
has issues with the 2 root case, but I got it from a reliable source - 
some textbook on Numerical Recipes)?


Also, one area that I had issues with the version I used in CC2D was 
that it chose a hard cutoff to classify the number of points and 
floating point errors might cause a given cubic to jump over that point 
despite the fact that the equation was of the other form.  Hopefully 
that jumping between root counts only happens when two roots are very 
close to each other so that the choice is between listing "N" or 
"N+epsilon and N-epsilon" - I can live with that inaccuracy, but it 
seemed like the version in CC2D might choose between "a single root of 
4.25" and "three roots of -127.3, 23.5, and 42.6", and I would scratch 
my head and think - wow, what a difference a bit made!


Luckily I don't think we actually ever relied on CC2D.solveCubic for 
correctness in any of our code, but it would be nice to fix it if a more 
stable method is available.


...jim

On 12/13/2010 12:23 PM, Denis Lila wrote:

Hi again.

I found an implementation of a closed form cubic root solver
(from graphics gems):
http://read.pudn.com/downloads21/sourcecode/graph/71499/gems/Roots3And4.c__.htm

I did some micro benchmarks, and it's about 25% slower than the one I have.
I'm thinking we should use it anyway because it's much better in every
other way: it finds all roots, it doesn't make its callers give it a root
array that is longer than the total number of roots, and most importantly,
it doesn't fail because of an iteration upper bound. From my testing, I noticed
that this happens too frequently for comfort in my cubicRootsInAB. I haven't
noticed any rendering artifacts caused by this, but I also haven't tried
rendering every possible curve and it's better to prevent bugs than fix them.

What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-13 Thread Jim Graham

Hi Denis,

Those sound like just the kind of problems I believed existed in the 
CC2D algorithm.


You might want to submit it as a separate push and get credit for fixing 
4645692 (solveCubic doesn't return all answers), and maybe even the 
following failures in the containment methods (which could be closed as 
dups if this fixes the problems) as well:


4724552
4493128
4074742
4724556
(etc.  Those were just the bugs I found on the first 2 pages of a bug 
database search)


Double (triple, etc.) credit - woohoo!  ;-)

...jim

On 12/13/2010 2:30 PM, Denis Lila wrote:



Very nice!  How does it compare to CubicCurve.solveCubic() (which I
know
has issues with the 2 root case, but I got it from a reliable source
-
some textbook on Numerical Recipes)?


I wrote a test that generated 2559960 polynomials, and in 2493075 of
those, the computed roots were identical to within 1e-9. They were
different in the remaining 66885 cases, so that's 97.4% agreement.

I've looked through some of the differences, and in every case the
function from graphics gems is better in one of two ways:
1. the gg version will report more roots than the cc2d version, in
which case the polynomial has a double root and the cc2d version
completely misses it (example poly: a=19.00, b=-20.00,
c=-17.00, d=18.00, where cc2d misses the root at 1).

2. the gg version will report fewer roots than the cc2d version, in
which case there was a 0 root and the cc2d version incorrectly split
it into -1e-163 and 1e-163.

So, the graphics gems version seems to be much more stable. It
does have a problem where it can return NaN sometimes, because it
assumes that the polynomial is not a quadratic one, but that can
easily be fixed.

So, should I put this new cubic root finder in CubicCurve.solveCubic
and use that in pisces?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-14 Thread Jim Graham

Hi Denis,

That sounds like some very good ideas for making this method very accurate.

On the other hand, we're starting to get into the territory where an 
advanced math package should be catering to these requirements.  The 
solveCubic was an internal helper function for implementing the hit 
testing methods that I decided to make public back in 1.2 days.  There's 
a question on how much accuracy we should bother with.


Also, I wrote new hit testing code in jdk6 that used bezier recursion to 
compute the values and it ran way faster than any root-finding methods 
(mainly because all you have to worry about is subdividing enough that 
the curve can be classified as above, below, or to the left or right and 
you're done), so the containment methods could probably be fixed by 
simply using the new code in sun.awt.geom.Curve and this method could be 
updated with the new equations you found and left as an approximate 
method like it always has been?


...jim

On 12/14/2010 5:57 PM, Denis Lila wrote:

Hi Jim.


How big are these errors expressed as multiples of the ULP of the
coefficients?  Obviously 1e-17 is a lot smaller than 1e-4, but was
1e-17
representing just a couple of bits of error or was it still way off
with respect to the numbers being used? And were these fairly obscure
equations that were off?


The coefficients I used were eqn={-0.1, 0, 1, 1e-7} so compared to the ulps
of the coefficients, 1e-4 is pretty large.

I'm about to go now, but I would like to write this idea first:
it seems to me like the number of roots reported is much more
important than whether their accuracy is 1e-4 or 1e-17. So,
how about we solve for the roots of the derivative (which can be
done very quickly). Computing lim{x--+/-inf}f(x) is very easy
(just a test on the most significant coefficient). We can then
evaluate the polynomial on the critical points and this would
allow us to very quickly compute the exact number of roots. If
the number computed using the closed form formula does not
match the real number, we do some refining work.

If we really wanted to optimize, we could also record how close
constants like D and q are to 0, and if they're within a certain
threshold, we could mark the roots they spawn as suspicious,
and only do the test in the above paragraph (i.e. solving for
critical points) if there are suspicious roots. And if the
computed numbers of roots don't match up, we could concentrate
on refining only the suspicious roots.

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-17 Thread Jim Graham

Hi Denis,

Lines 1094-1096, they could also be NaN if any of the numerators were 
also zero and these tests might fail (but only for the case of all of 
them being zero I guess, otherwise one of the other divisions would 
result in infinity).  Are accidental infinities (caused by overflow 
rather than d==0.0) really a problem for the code to handle?


I don't have any recommendations on your comment about the code that 
tests q for zero.  You could probably check if 2*u and -u were distinct 
as an alternate test, but that would cost you a cbrt() call.  In 
general, though, I'm guessing it's a rare case so saving the call to 
cbrt() is probably not important.  I will note, though, that if a number 
is very close to 0, but not 0, then its cube root will be a larger 
number than the original.


I just noticed that the code you are replacing actually used to refine 
the roots so maybe you should do some of this magic.  However, it only 
bothered to refine the roots if there were 3 roots and they were near 0 
or 1 (because that might cause the caller to reject the root if it fell 
on the wrong side of 0 or 1).  Either way, look and see if fixRoots() 
and its friends are dead code now and/or if they should be replaced with 
your refinement techniques below.


I tried to review the code for correctness of formulae, but since I 
never really understood how the first version worked (I just copied it 
from Numerical Recipes), all I could do was to compare to the old 
version that it was similar to.  Frustratingly, the variable names 
conflicted and one of the values was calculated with reversed sign so it 
ended up being more frustrating than educational.  :-(


Since you apparently tested the new code extensively and got it from a 
source that had some amount of editorial review - I'll trust that 
process instead of my crossed eyes...


...jim

On 12/15/2010 7:13 AM, Denis Lila wrote:

Hi Jim.


Also, I wrote new hit testing code in jdk6 that used bezier recursion
to compute the values and it ran way faster than any root-finding methods
(mainly because all you have to worry about is subdividing enough that
the curve can be classified as above, below, or to the left or right
and you're done), so the containment methods could probably be fixed by
simply using the new code in sun.awt.geom.Curve and this method could
be updated with the new equations you found and left as an approximate
method like it always has been?


Well, I already have half the code I need to implement the ideas I
wrote, so... why not?

However, making solveCubic that good does not really seem to be a high
priority, so how about we quickly push just the new implementation I
found so we can fix most cases where an obvious root is missed, then
push this dashing/AA webrev (which will depend on the new solveCubic),
then I can fix the implementation of intersect using the recursive
subdivision you mentioned, and then I can take my time finishing
the implementation of the ideas from my last e-mail (these bugs/rfes
have waited 10+ years - they can wait 1-2 more months). Right now,
I would like to give priority to better pisces support of the new
parallelogram pipes and this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=662230

Here's the webrev for the new solveCubic implementation:
http://icedtea.classpath.org/~dlila/webrevs/cc2d/webrev/

Regards,
Denis.


Re: [OpenJDK 2D-Dev] AWT Dev 7002627 : JNI Critical Arrays should be released with the original (unmodified) pointer

2010-12-21 Thread Jim Graham

Hi Andrew,

Do you really need the "= NULL" on the declarations?  They are 
initialized on the following line; that should be good enough for any 
compiler or lint processing.


Other than that, the new fix looks good...

...jim

On 12/21/2010 2:57 AM, Steve Poole wrote:



Hi Andrew -  please feel free to change the patch :-)

Thanks

Steve




From:   Andrew Brygin andrew.bry...@oracle.com
To: Steve Poole/UK/i...@ibmgb
Cc: 2d-dev 2d-dev@openjdk.java.net
Date:   21/12/2010 10:42
Subject:    Re: [OpenJDK 2D-Dev] AWT Dev 7002627 : JNI Critical Arrays
 should be released with the original (unmodified) pointer



Hello Steve,

   the fix looks reasonable.  However, the pMask declaration on line 134
causes a compiler warning:
../../../src/share/native/sun/awt/../java2d/pipe/BufferedMaskBlit.c,
line 134: warning: declaration can not follow a statement

   Would you mind if I modify your fix a bit in order to avoid this warning?
   Please take a look at webrev:
   http://cr.openjdk.java.net/~bae/7002627/webrev/

Thanks,
Andrew

On 12/21/2010 12:23 PM, Steve Poole wrote:


Thanks Anthony.


Regards

Steve Poole




From:   Anthony Petrov <anthony.pet...@oracle.com>
To: Steve Poole/UK/i...@ibmgb
Cc: 2d-dev@openjdk.java.net
Date:   21/12/2010 09:01
Subject: Re: [OpenJDK 2D-Dev] AWT Dev 7002627 : JNI Critical Arrays
 should be released with the original (unmodified) pointer
Sent by: 2d-dev-boun...@openjdk.java.net



I'm adding the patch attached to the original message on the awt-dev@

list.


--
best regards,
Anthony

On 12/20/2010 3:57 PM, Anthony Petrov wrote:

Hi Steve,

This is a 2D issue, and as such I'm CC'ing 2d-dev@ and BCC'ing awt-...@.

--
best regards,
Anthony

On 12/20/2010 11:04 AM, Steve Poole wrote:

Hi all - please find attached a patch for your consideration. I've built
and tested the change on Linux and Solaris at head (which is to say I've
run the automatic jtreg tests) and the change doesn't seem to have broken
anything. It's fairly trivial anyway.


Regards

Steve Poole

(See attached file: 7002627.export)

[attachment 7002627.export deleted by Steve Poole/UK/IBM]







Re: [OpenJDK 2D-Dev] AWT Dev 7002627 : JNI Critical Arrays should be released with the original (unmodified) pointer

2010-12-22 Thread Jim Graham

Thanks Andrew, it looks good to go!

...jim

On 12/22/2010 8:15 AM, Andrew Brygin wrote:

Hi Jim,

I have updated the fix according to your comment:
http://cr.openjdk.java.net/~bae/7002627/webrev/

Thanks,
Andrew

On 22.12.2010 0:09, Jim Graham wrote:

Hi Andrew,

Do you really need the '= NULL' on the declarations? They are
initialized on the following line; that should be good enough for any
compiler or lint processing.

Other than that, the new fix looks good...

...jim




Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-23 Thread Jim Graham

Hi Denis,

Line 1099 - I decided to check out Cardano's method and noticed a 
discrepancy.  The comment here says we are calculating the p and q for 
this equation, but the values assigned to the p and q variables in lines 
1102,1103 are actually p/3 and q/2.  That's fine, because almost all of 
the values needed in the remaining logic of Cardano's method actually 
need those values instead of the original p and q, so it makes sense to 
calculate them up front.  Unfortunately, this means that the names, the 
values assigned to them, and the comment above them conflict.  If the 
variables could be named "p/3" and "q/2" then all would be clear, but I 
don't know how to do that naming very easily.  Perhaps the comment could 
simply be reworded:


// substitute blah blah blah
//     x^3 + Px + Q = 0
// Since we actually need P/3 and Q/2 for all of the
// calculations that follow, we will calculate
//     p = P/3
//     q = Q/2
// instead and use those values for simplicity of the code.
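
For the record, the substitution this refers to is just standard
Cardano; in these variables the key quantities are (my recollection of
the formulas, worth double-checking against the webrev):

\[
x^3 + Px + Q = 0,\qquad p = \frac{P}{3},\quad q = \frac{Q}{2},\qquad
D = q^2 + p^3,
\]
\[
x = \sqrt[3]{-q + \sqrt{D}} + \sqrt[3]{-q - \sqrt{D}} \quad (D \ge 0),
\]

which is why p and q in those units show up everywhere in the code that
follows.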

Line 1105 - single quotes in comments freak out my version of gnuemacs. 
 I usually try to avoid them, except in pairs, but there isn't a better 
way to word this comment.  :-(


Lines 1157-1163 - the old code used to copy the eqn before it got 
clobbered with roots.  Here it is too late.  You probably need to move 
this code up near line 1135 before the 3 roots are stuffed into the res 
array.  (Did you test the eqn==res case?)
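
(The eqn==res hazard, spelled out - a hypothetical check with
placeholder coefficients, not the actual test values; I believe the
CubicCurve2D.solveCubic javadoc allows passing the same array for both:)

import java.awt.geom.CubicCurve2D;

public class AliasCheck {
    public static void main(String[] args) {
        // (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6, constant term first
        double[] eqn = { -6.0, 11.0, -6.0, 1.0 };
        // Same array for coefficients and roots: the solver must
        // finish reading eqn before clobbering it with roots.
        int num = CubicCurve2D.solveCubic(eqn, eqn);
        for (int i = 0; i < num; i++) {
            System.out.println(eqn[i]);
        }
    }
}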


I noticed that the Casus irreducibilis case isn't in Cardano's method. 
 He only finds roots for the 1 and 2 root cases and punts on the 3 root 
case.  So, this is someone else's method.  It would be nice to figure 
out who or what and list a reference, even though Graphics Gems and the 
old code didn't.  The closest reference I can find is unattributed on 
Wikipedia, but you could include it in a comment for reference:


http://en.wikipedia.org/wiki/Cubic_function#Trigonometric_.28and_hyperbolic.29_method
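
For reference, a direct transcription of that trigonometric method into
the p = P/3, q = Q/2 variables used here might look like the following
(my own transcription of the Wikipedia formulas, not code from the
webrev, so the signs deserve a second look):

// Roots of x^3 + 3p*x + 2q = 0 when D = q*q + p*p*p < 0
// (which forces p < 0 and guarantees three real roots).
static int threeRealRoots(double p, double q, double[] res) {
    double phi = Math.acos(-q / Math.sqrt(-p * p * p)) / 3.0;
    double m = 2.0 * Math.sqrt(-p);
    res[0] = m * Math.cos(phi);
    res[1] = m * Math.cos(phi - 2.0 * Math.PI / 3.0);
    res[2] = m * Math.cos(phi - 4.0 * Math.PI / 3.0);
    return 3;
}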

Line 1133 - I don't understand why that term has -q in it.  The above 
link and the original code both computed essentially the arccos of this 
formula without the negation of q.  ???  Since acos(-v) == pi - acos(v) 
this would seem to negate the result and bias it by pi/3.  Negating it 
won't affect the eventual cosine, but the bias by pi/3 will.  Am I 
missing something?
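
Then again, the two forms may actually agree (my own algebra, worth
double-checking): in the three-root case P is necessarily negative, and
folding that sign into the radical flips the sign of q:

\[
\frac{3Q}{2P}\sqrt{-\frac{3}{P}}
= \frac{3\sqrt{3}\,Q}{2P\sqrt{-P}}
= -\frac{3\sqrt{3}\,Q}{2\sqrt{-P^3}}
= \frac{-q}{\sqrt{-p^3}},
\qquad p = \tfrac{P}{3},\ q = \tfrac{Q}{2},
\]

so the -q form could be computing the same cosine argument as the
reference formula after all.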


...jim

On 12/20/2010 9:36 AM, Denis Lila wrote:

Hi Jim.


Lines 1094-1096, they could also be NaN if any of the numerators were
also zero and these tests might fail (but only for the case of all of
them being zero I guess, otherwise one of the other divisions would
result in infinity). Are accidental infinities (caused by overflow
rather than d==0.0) really a problem for the code to handle?


I'm not sure if they're a problem, but I thought that case should have
been handled just for robustness. However, I've changed the test
to d==0 because testing for infinities should be done, but not there.
For example, if the constant term was huge and d==0.5 we could get
an infinity but that shouldn't really be handled as a quadratic
polynomial. I will deal better with these cases in a future webrev.


I just noticed that the code you are replacing actually used to refine
the roots so maybe you should do some of this magic.


I missed that in the original code. I changed it now.


Also, in the webrev you'll find five regression tests that I would like
to push to openjdk7. They test for various problems the rendering engine
used to have. They're all pretty simple and I would appreciate it if you
could take a quick look at them. They're in the same webrev as cc2d because
it was more convenient for me, but obviously when/if they're pushed they
will be a separate changeset.

One more thing: there is a regression test for the rendering engine
called TestNPE that I think is problematic because it doesn't
necessarily test the rendering engine. It just draws an antialiased
line, which could be handled in any number of ways, so it's not very
robust. In fact, after we're done with the parallelogram pipelines,
the code that used to throw the NPE won't even execute, making this
test useless. We should either discard it or change it to use the
rendering engine explicitly, like my tests do. What do you think?

Regards,
Denis.


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-23 Thread Jim Graham
The regression tests for this bug do not call the method directly.  They 
may exercise the function indirectly in some pipelines, but not all 
pipelines will use this method (the current version of Pisces in OpenJDK 
doesn't even use it until you integrate your other changes as far as I 
know).


I'd write a regression test for this bug that is directly applicable to 
the method being tested (find what values are being handed to the method 
by these test cases and then just call Cubic.solveCubic directly with 
those values and figure out if the answers are reasonable).
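
Something along these lines, as a rough sketch (assuming the method
under test is the public CubicCurve2D.solveCubic; the coefficients
below are placeholders, not the values from the failing cases):

import java.awt.geom.CubicCurve2D;

public class SolveCubicRegression {
    public static void main(String[] args) {
        // Placeholder equation: (x - 1)^3 = x^3 - 3x^2 + 3x - 1,
        // stored constant term first, as solveCubic expects.
        double[] eqn = { -1.0, 3.0, -3.0, 1.0 };
        double[] res = new double[3];
        int num = CubicCurve2D.solveCubic(eqn, res);
        // Every reported root should actually satisfy the cubic.
        for (int i = 0; i < num; i++) {
            double x = res[i];
            double v = ((eqn[3] * x + eqn[2]) * x + eqn[1]) * x + eqn[0];
            if (Math.abs(v) > 1e-6) {
                throw new RuntimeException("bad root " + x + ": " + v);
            }
        }
    }
}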


If you want to include these rendering tests as extra verification along 
with your other changes, then that is fine.


Also, I think we might have a script that forcibly checks the value of 
the @bug tag and ensures that it is a legal bug database number, so 
using a RedHat bug number won't work very well.  Is there an existing 
bug that this could be tagged with?


...jim



Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2010-12-28 Thread Jim Graham

Hi Denis,

I'm attaching a test program I wrote that compares the old and new 
algorithms.


Obviously the old one missed a bunch of solutions because it classified 
all solutions as 1 or 3, but the new one also sometimes misses a 
solution.  You might want to turn this into an automated test for the 
bug (and maybe use it as a stress test with a random number generator).


I think one problem might be that you use "is close to zero" to check 
whether you should use special processing.  I think any tests which say 
"do it this way and get fewer roots" should be conservative: if we are 
on the borderline and we can run the code that generates more solutions, 
then we should generate more, and then maybe refine the roots and 
eliminate duplicates.  That way we can be (more) sure not to leave any 
roots unsolved.


The test as it is has a test case (I just chose random numbers to check 
and got lucky - d'oh!) that generates 1 solution from the new code even 
though the equation had 2 distinct solutions that weren't even near each 
other...


...jim



(Attachment: CubicSolver.java)


Re: [OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

2011-01-04 Thread Jim Graham

Hi Denis,

What about logic like this:

boolean checkRoots = false;
if (D < 0) {
    // 3 solution form is possible, so use it
    checkRoots = (D > -TINY);  // Check them if we were borderline
    // compute 3 roots as before
} else {
    double u = ...;
    double v = ...;
    res[0] = u+v;  // should be 2*u if D is near zero
    if (u close to v) {  // Will be true for D near zero
        res[1] = -res[0]/2;  // should be -u if D is near zero
        checkRoots = true;  // Check them if we were borderline
        // Note that q=0 case ends up here as well...
    }
}
if (checkRoots) {
    if (num > 2 && (res[2] == res[1] || res[2] == res[0])) {
        num--;
    }
    if (num > 1 && res[1] == res[0]) {
        res[1] = res[--num];  // Copies res[2] to res[1] if needed
    }
    for (int i = num-1; i >= 0; i--) {
        res[i] = refine(res[i]);
        for (int j = i+1; j < num; j++) {
            if (res[i] == res[j]) {
                res[i] = res[--num];
                break;
            }
        }
    }
}
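
(refine() above is left undefined in the sketch; a minimal stand-in,
assuming one Newton step against the depressed cubic x^3 + P*x + Q -
my placeholder, not anything from the actual webrev:)

static double refine(double x, double P, double Q) {
    // One Newton-Raphson step: x' = x - f(x)/f'(x)
    // for f(x) = x^3 + P*x + Q.
    double f  = (x * x + P) * x + Q;
    double fp = 3.0 * x * x + P;
    return (fp == 0.0) ? x : x - f / fp;
}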

Note that we lose the optimization of calculating just 2*u and -u for 
the 2 root case, but that only happened in rare circumstances.  Also, if 
D is near zero and negative, then we generate 3 roots using 
transcendentals and potentially refine one away, but that should also be 
an uncommon situation, and, but for the grace of a tiny negative number, 
we would have gone that way anyway, so I think it is OK to take the long 
way to the answer.


Also, one could argue that if we used the transcendentals to calculate 
the 3 roots, it couldn't hurt to refine the answers anyway.  The other 
solutions should have higher precision, but the transcendental results 
will be much less accurate.


Finally, this lacks the "refine them anyway if any of them are near 0 or 
1" rule - the original only did that if the transcendentals were used, 
but it would be nice to do that for any of the cases.  It might make 
sense to have a variant that takes a boolean indicating whether to 
ensure higher accuracy around 0 and 1, but that would require an API 
change request...
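
(Hypothetically, something like the following wrapper - the name and
shape are my invention, since as noted an actual API change would need
a change request:)

// Same contract as CubicCurve2D.solveCubic, plus a flag asking for
// extra refinement of roots that land near 0 or 1.
public static int solveCubic(double[] eqn, double[] res,
                             boolean refineNearZeroAndOne) {
    int num = java.awt.geom.CubicCurve2D.solveCubic(eqn, res);
    if (refineNearZeroAndOne) {
        for (int i = 0; i < num; i++) {
            double x = res[i];
            if (Math.abs(x) < 1e-3 || Math.abs(x - 1.0) < 1e-3) {
                // one Newton step on eqn[3]x^3 + eqn[2]x^2 + eqn[1]x + eqn[0]
                double f  = ((eqn[3] * x + eqn[2]) * x + eqn[1]) * x + eqn[0];
                double fp = (3.0 * eqn[3] * x + 2.0 * eqn[2]) * x + eqn[1];
                if (fp != 0.0) {
                    res[i] = x - f / fp;
                }
            }
        }
    }
    return num;
}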


...jim

On 1/4/11 2:02 PM, Denis Lila wrote:

Hi Jim.


The test as it is has a test case (I just chose random numbers to check 
and got lucky - d'oh!) that generates 1 solution from the new code even 
though the equation had 2 distinct solutions that weren't even near 
each other...


I figured out why this happens. It's because of cancellation in the
computation of D (two large numbers are subtracted and the result is
supposed to be 0 or close to 0, but it's about 1e-7, which wasn't
enough to pass the "is zero" test). I've been working on this and I
came up with a couple of different approaches. They are in the attached
file (it's a modified version of your CubicSolver.java).
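
(The cancellation is easy to reproduce; a minimal sketch with made-up
numbers, not the actual failing case:)

public class CancellationDemo {
    public static void main(String[] args) {
        // Pick p, q so that q*q and -p*p*p are equal in exact
        // arithmetic; in doubles the subtraction typically leaves a
        // residue on the order of one ulp of the operands, not 0.0.
        double p = -1e5;
        double q = Math.sqrt(-p * p * p);
        double D = q * q + p * p * p;
        System.out.println(D); // typically a small nonzero residue
    }
}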

The first thing I did was to modify solveCubicOld. I tried to get
a bit fancy, and although I think I fixed the problems it had, the
end result is ugly and complicated, and it has small problems, like
returning 3 very close roots when there should be only one.

The other solution is to just check whether the roots of the derivative
are also roots of the cubic polynomial when only 1 root was computed
by the closed form algorithm. This doesn't have the numerical
accuracy of the first way (which used bisectRoots when things went wrong),
but it's much faster, doesn't have the multiple roots problem, and it's
much simpler. I called your trySolve function on a few hundred
polynomials with random roots in [-10, 10] and it never found fewer
roots than there actually were. Sometimes it finds 3 roots when there are
only 2, but I don't think that is a huge problem.
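
A rough sketch of that fallback, as I understand it (helper name and
tolerance are mine, not from the attached file):

// If the closed form reported only one root, any double root we
// missed must also be a root of the derivative 3x^2 + 2a*x + b,
// so test the derivative's (at most two) critical points against
// the cubic x^3 + a*x^2 + b*x + c.
static int addMissedDoubleRoots(double a, double b, double c,
                                double[] res, int num) {
    double disc = a * a - 3.0 * b;   // from roots (-a +/- sqrt(disc)) / 3
    if (disc >= 0.0) {
        double s = Math.sqrt(disc);
        double[] crit = { (-a + s) / 3.0, (-a - s) / 3.0 };
        for (double x : crit) {
            double f = ((x + a) * x + b) * x + c;
            if (Math.abs(f) < 1e-8) {
                res[num++] = x;      // found a missed (double) root
            }
        }
    }
    // (duplicate filtering against the existing root omitted for brevity)
    return num;
}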

I've attached what I have so far.

Regards,
Denis.


