https://issues.apache.org/bugzilla/show_bug.cgi?id=50852
--- Comment #21 from Vincent Hennebert <[email protected]> 2011-06-16 17:58:06 UTC --- Hi Martin, I finally got round to having a look at your patch. First, I’d like to thank you for having taken the time to create 20 smaller self-contained patches. This made the review much easier. So thanks for that! The new data structure is indeed much more efficient than the DOM that FOP manipulates at the moment. Unfortunately, this doesn’t solve (AFAIU) the fundamental problem that we recently discovered: empty content may be the cause for a wrong final structure tree. Take the following table: <fo:table width="100%" table-layout="fixed"> <fo:table-body> <fo:table-row> <fo:table-cell> <fo:block>Cell 1.1</fo:block> </fo:table-cell> <fo:table-cell> <fo:block>Cell 1.2</fo:block> </fo:table-cell> </fo:table-row> <fo:table-row> <fo:table-cell> <fo:block/> </fo:table-cell> <fo:table-cell> <fo:block>Cell 2.2</fo:block> </fo:table-cell> </fo:table-row> </fo:table-body> </fo:table> The content of the first cell in the second row is empty, which will result into a TR structure element having only one TD kid for the second cell; that TD element will be mistakenly interpreted by a screen reader as belonging to the first column. See also discussion here: http://markmail.org/message/mn7jdbxmjdq7ey52 To solve this we need to integrate the handling of the structure tree into the normal processing chain (FO tree -> Layout Engine -> Area tree) instead of bypassing it. That said, I have a few comments and questions relating to some of your specific patches: [PATCH 10/20] Avoid overhead of creating writers I can imagine that if there are a lot of PDF objects to stream, creating an instance of BufferedWriter and OutputStreamWriter for each of them may have quite some performance impact. However replacing them with calls to PDFObject.encode everywhere it is necessary is not really an option. This makes the code difficult to read and maintain, and is error-prone as it’s very easy to miss one call somewhere. I think the problem of encoding text into the output should be solved by defining a specialized PDFOutputStream that would be able to stream both String and bytes. That PDFOutputStream would be passed around to objects that then wouldn’t have to handle their own wrapper or make calls to encode. Does that make sense? [PATCH 11/20] Add support for clearing objects at write time I’m wondering why this is necessary? Isn’t it just possible to null out references to the objects and let the garbage collector do the work? [PATCH 12/20] Add support for lazy object number assignment Same here, what’s exactly the purpose of lazy object number assignment? Thanks, Vincent -- Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug.
