Re: DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly
On Sun, Apr 04, 2004 at 06:41:41AM -, [EMAIL PROTECTED] wrote: --- Additional Comments From [EMAIL PROTECTED] 2004-04-04 06:41 --- Anyway, please check the latest solution applied. Here I brought in the old solution to solve the extra spaces problem, while keeping the new solution for the more commonly occurring spaces at the front of fo:text. (As stated above, I suspect 10%), so more optimization may be needed here. (Perhaps an error in my logic of when RemoveA vs. RemoveB should be called.) Comments most welcome. The code seems OK to me. I have a remark on the logic of nextCharCalled. It is equal to curIndex - startIndex. Also, in remove() you should separately consider the case nextCharCalled == 0: it should not happen. I have looked at the output in the area tree (at output). It does not reproduce the error in the first block of my test fo, the one with the colored inline FOs. That may be an error of the PDF renderer. It does reproduce the error in the second block: The second line is longer than in the other blocks, and the last line is dropped. That may be an error in the layout managers. I saw no problem with the logic of RemoveA, RemoveB or RemoveC. RemoveB does handle trailing spaces except for the last one. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl
Re: DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly
Thanks for looking through the code, Simon. --- Simon Pepping [EMAIL PROTECTED] wrote: I have a remark on the logic of nextCharCalled. It is equal to curIndex - startIndex. Also, in remove() you should separately consider the case nextCharCalled == 0: it should not happen. nextCharCalled is a terribly named variable--but I haven't thought of anything better. Suggestions welcome for its renaming. The process of moving from RemoveA (where I can just nicely increment the startIndex, i.e., to remove leading spaces) to RemoveB (where I need to start doing arraycopies, e.g., to remove spaces between words) is activated when two or more consecutive nextChar() iterations occur *without* a remove() call in between. This would indicate that we've encountered a valid character, so we've hit a word and now any subsequent space removal has to occur via RemoveB method. Whenever remove() is called during RemoveA time, I just keep resetting it to zero for the logic above to occur. It may not be the best, however, as I mentioned earlier RemoveB appears to be called too soon too often. I'll get all this commented soon in the code. I have looked at the output in the area tree (at output). It does not reproduce the error in the first block of my test fo, the one with the colored inline FOs. That may be an error of the PDF renderer. I'm unsure if this is good news or bad news. ;-) But anyway, more work for us... Glen
[Fwd: DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly]
Comments most welcome. Glen, In general I like your solution. However, I did a bit of testing with inlines and the effects are very serious. Valid non-whitespace characters from the enclosing block are simply deleted if there is whitespace at the start of the fo:inline. Chris
Re: [Fwd: DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly]
Yes, this needs more work then. I did not consider fo:inlines when starting. Still, I'm surprised that fo:inlines are within the same FOText object (the fo:block is already parsed into several FOText objects, roughly FOText object per line in the source document)--the appearance of an fo:inline should precipitate a new FOText object, one would think. That could be one solution. Glen --- Chris Bowditch [EMAIL PROTECTED] wrote: Comments most welcome. Glen, In general I like your solution. However, I did a bit of testing with inlines and the effects are very serious. Valid non-whitespace characters from the enclosing block are simply deleted if there is whitespace at the start of the fo:inline. Chris
DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27901. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27901 [PATCH] TextCharIterator.remove() does not work properly --- Additional Comments From [EMAIL PROTECTED] 2004-04-03 20:01 --- I like the idea of the black list. But it is not necessary to make it explicit. It is possible to remove multiple white space characters in a single loop. In the attached patch I have implemented this idea in the constructor of FOText, so that FOText contains a text without multiple spaces. The attached test fo file demonstrates that it works well, except when the block is split up in several FOs. In that case text disappears. I have checked that that does not happen in the FOText constructor and that TextLayoutManager.textArray is also correct. In view of the fact that the results are different whether the inlines have color or not, it may also be an error in the renderer. The intermediate array caLong could be skipped if it were not necessary to truncate ca to its exact length. This could be done if every method and client would use node.length instead of node.ca.length.
DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27901. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27901 [PATCH] TextCharIterator.remove() does not work properly --- Additional Comments From [EMAIL PROTECTED] 2004-04-03 20:03 --- Created an attachment (id=4) my patch
DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27901. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27901 [PATCH] TextCharIterator.remove() does not work properly --- Additional Comments From [EMAIL PROTECTED] 2004-04-04 06:41 --- The BlackList method is not my favorite, because I suspect only a few percent of extra spaces found are between words--the extra complexity appears unwarranted. My debugging has indicated that 95%+ of extra spaces all appear at the beginning of the FOText object, so it is best to just increment a start index for those. For the few percent of errant extra spaces between words, then an arraycopy can be called, removing the spaces one-by-one. (Keep in mind, one month ago we were doing an arraycopy for *every* space, even those at the beginning of the string, so a 95% reduction in arraycopies is already a huge improvement.) Please advise if you've come to different conclusions--but the complexity doesn't appear necessary to me. HOWEVER/Note: I've only been working with fo:blocks, not with fo:inlines, so my above reasoning may not be valid when those are taken into consideration. Anyway, please check the latest solution applied. Here I brought in the old solution to solve the extra spaces problem, while keeping the new solution for the more commonly occurring spaces at the front of fo:text. The (commented-out) System.out.println's in the remove() method are useful for visualizing the space removal, and indicate the following: RemoveA: Newest method, that just increments the start index until a word is found. No array copies needed here, very speedy. RemoveB and RemoveC: The previous method, ideally just activated for the relatively rare occurrences of extra spaces between words. Unfortunately, RemoveB and RemoveC are called much more often (50% or so) than they should be (As stated above, I suspect 10%), so more optimization may be needed here. (Perhaps an error in my logic of when RemoveA vs. RemoveB should be called.) Comments most welcome. Glen
DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27901. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27901 [PATCH] TextCharIterator.remove() does not work properly [EMAIL PROTECTED] changed: What|Removed |Added Summary|TextCharIterator.remove() |[PATCH] |does not work properly |TextCharIterator.remove() ||does not work properly
DO NOT REPLY [Bug 27901] - [PATCH] TextCharIterator.remove() does not work properly
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=27901. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bug.cgi?id=27901 [PATCH] TextCharIterator.remove() does not work properly --- Additional Comments From [EMAIL PROTECTED] 2004-04-02 22:05 --- I hope to get back to this over the weekend--I was working the logging issue last weekend.