[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1616657543 Hi, @THauserr! The comment to the PDColor.toRGB method says "throws IllegalStateException if this color value is a pattern." I checked several color spaces and found that they either throw an UnsupportedOperationException or don't throw an exception. PDPattern color space throws UnsupportedOperationException too. Actually, I think, UnsupportedOperationException is better here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1606049748 > > Hi, @lehmi! Have a question about your commit PDFBOX-5551: replace malformed (sub)stream with an empty one, DRY git-svn-id: https://svn.apache.org/repos/asf/pdfbox/trunk@1910585 13f79535-47bb-0310-9956-ffa450edef68. PDPage.getContentsForRandomAccess method. Line 194. If substream is malformed and you wrote "it will be skipped" then why you didn't remove it from inputStreams list? > > I had two reasons in my mind. First of all in the case of a content stream which consists of several streams I didn't want to change the number of substreams so that the number of substreams is still in line with the number of COSArray elements. And in the case of a content streams which has only one stream I prefer to return an empty stream instead of returning null so that we don't have to change the code processing the stream It looks like, if the exception will be thrown then last item will not be added. Is it expected behaviour? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1606024808 > > Hi, @lehmi! Have a question about your commit PDFBOX-5551: replace malformed (sub)stream with an empty one, DRY git-svn-id: https://svn.apache.org/repos/asf/pdfbox/trunk@1910585 13f79535-47bb-0310-9956-ffa450edef68. PDPage.getContentsForRandomAccess method. Line 194. If substream is malformed and you wrote "it will be skipped" then why you didn't remove it from inputStreams list? > > I had two reasons in my mind. First of all in the case of a content stream which consists of several streams I didn't want to change the number of substreams so that the number of substreams is still in line with the number of COSArray elements. And in the case of a content streams which has only one stream I prefer to return an empty stream instead of returning null so that we don't have to change the code processing the stream Thanks, understood. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1605995006 Hi, @lehmi! Have a question about your commit PDFBOX-5551: replace malformed (sub)stream with an empty one, DRY git-svn-id: https://svn.apache.org/repos/asf/pdfbox/trunk@1910585 13f79535-47bb-0310-9956-ffa450edef68. PDPage.getContentsForRandomAccess method. Line 194. If substream is malformed and you wrote "it will be skipped" then why you didn't remove it from inputStreams list? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1546223251 The PDIndexed.toRGB method has an array length check. There is length larger then 1 check but why not one-equality check or large-then-zero check? The IndexOutOfBoundsException possible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1531222910 Hi, @THausherr ! These changes also available for line 159 (PDSeparation.toRGBImage). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1523556089 Hi, @THausherr! I found several issues for pdfbox. 1 PDColorspace and PDSpecialColorSpace have the same implementation of getCOSObject method (inheritance) 2 PageDrawer.getDashArray. Line 750: I not found PDFBOX-3360. Line 751: condition Float.isNaN(phase) is always false because phase is integer. Maybe you need something like Float.isNaN(Float.intBitsToFloat(phase)). 3 CMapParser.readDictionary. Second condition of while loop is always true because key is LiteralName and MARK_END_OF_DICTIONARY is String. It looks like you need compare to LiteralName.name field. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1493299810 @lehmi , as I see, > @valerybokov thanks for the pointer. I've fixed the buggy condition and replaced the and- with an or-operator including some brackets. Checking nextThreeBytes[0] two times was intented. I've added another case to the end of string detection, see comment. IMHO to split up those 3 cases in two statements makes it more readable. @lehmi , as I see, for both your versions if "nextThreeBytes[0] == ASCII_CR" then braces will be set to zero with no other checks. ![image](https://user-images.githubusercontent.com/67366451/229348848-94def155-2c90-40d3-8081-4cbc210da75f.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1493288760 Hi, @lehmi ! I saw your commit "PDFBOX-5178: add LF-combinations to checkForEndOfString..." (https://github.com/apache/pdfbox/commit/86b957ad44b4b23015def8c1319be49450370a3a). As I see, you added more conditions to find the end of string. There seems to be a bug in last 2 conditions (line 428: nextThreeBytes[2] == '/' && nextThreeBytes[2] == '>'). As you see nextThreeBytes[2] cannot have 2 values in the same time. It looks like the OR-operator needed. Additionally, line 424 looks suspect, because you are checking the "nextThreeBytes[0] == ASCII_CR" condition 2 times (427). Did you forget round brackets for 2 conditions in the line 424? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1465123288 Hi, @THausherr! The second parameter of the COSWriterObjectStream.writeObject method is of type Object. But you always send COSBase type (you can change Object type to COSBase for line 401 in your code). This means that the writeOperator inside writeObject is redundant. I think, there is some bug. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1464949026 ![image](https://user-images.githubusercontent.com/67366451/224496506-92e0f9a1-8310-4493-a1ac-6ad225c41b78.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1464946362 Hi, @THausherr! As I understand, java tries to find a method with the same parameter type or parental type. If you will put Object as parameter then java will use ContentStreamWriter.writeObject(Object) method. If you will put Operator type then java will use writeObject(Operator) method. This is a reason why I extracted writeObject(COSBase) method. I could extract writeObject(COSName) method because it uses inside a loop but, maybe, it will make code less readable. I found typo in FeatureTable.toString but i don't know can i change it to not break backward compatibility? I mean, can the string "lookupListIndiciesCount" be changed to "lookupListIndicesCount"? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1448831195 Hi, @THausherr ! Several possible performance improvements. Inside GlyphSubstitutionTable.readScriptList method we create a scriptTables variable. Then we fill it in loop. We then use it in the next loop. We can do this in the first loop and remove this variable. The same about the langSysRecords variable in the readScriptTable method. You able to instantiate a regular exprestion one time to not do it for each iteration in readFeatureList method (Pattern pattern = Pattern.compile("\\w{4}")). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1305398903 ![image](https://user-images.githubusercontent.com/67366451/200287554-7865b8f5-f614-4dee-9701-be24261f1ae8.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1297194157 A new test ScratchFileBufferTest.testBuffersClosed method. Should the scratchFileBuffer2 and scratchFileBuffer4 be closed after checks? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1257198654 > I don't really understand the last comment, but this may be related to changes in PDFBOX-5489. It is test in my fork. Yes, it looks like problem from PDFBOX-548. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1257182580 Hi, @THausherr! I got a bug in my version of library. Several months ago SplitterTest was added from me. The PDDocument.setHighestImportedObjectNumber throws with a NoSuchElementException because importedPage variable is empty and test fails. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1229397504 > ttfFont is still null if parse() fails. One could argue that `ff2Stream.getCOSObject().createView()` should be closed if parse() fails, but that wasn't your argument. :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1229394668 > Re 1: this is a known problem, a comment mentions PDFBOX-4963 Re 2: TTFParser does not have a close method It was ttfFont. ![image](https://user-images.githubusercontent.com/67366451/187061807-4851023f-b6bb-4d00-9493-a3973d872a22.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1229394240 > ![image](https://user-images.githubusercontent.com/67366451/187061704-fb94cf9f-7fac-41e9-9cb6-cb2ce55fa90c.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1229247301 Hi, @THausherr! I found two methods with possible memory leaks. 1 TrueTypeFont.finalize. If the super.finalize method will throw an exception then the close method will not be called. Maybe you will use try-finally. Additionally, it is a bit weird code when you call finalize and then close methods. 2 PDTrueTypeFont.PDTrueTypeFont(COSDictionary). The ttfParser instance will not be closed if the parse method will throw an exception. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1206351676 Hi, @THausherr ! The testPDFBox5484. Don't you need to free the doc variable? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1186576956 > Sorry, I completely missed that the original returns had been removed. I don't know how this happened because it's obvious. Your code example is fine too. As I understood, my commit has no bug and it is simpler then old version. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1186575091 > Heh! I believe, the De Morgan's law works. I tried to interpolate this code to a little test. Tell me, please, where is bug? import java.util.ArrayList; public class HelloWorld{ public static void main(String []args){ System.out.println(v1() + " " + v2()); } static boolean v1(){ ArrayListlist = new ArrayList(); // list.add(2); list.add(4); //list.add(5); if (!list.contains(2) && !list.contains(4) && !list.contains(5)) return false; return true; } static boolean v2(){ ArrayListlist = new ArrayList(); //list.add(2); list.add(4); //list.add(5); return list.contains(2) || list.contains(4) || list.contains(5); } } -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1186562664 > > re [85368a5](https://github.com/apache/pdfbox/pull/107/commits/85368a5312e69b3a863e9163f19d63c8945b6c3f) - this is good but shouldn't the returns be changed as well? > > re those 2 LGTM alerts - can you please revert the `getDelta()` change in your huge PR (I've already mentioned that I don't want to introduce that change because it makes the code less readable, but LGTM keeps mentioning it) and fix the `getSignatureFields()` change (it ignores "fields") > > Also, this optimization of `getSignatureFields()` isn't needed because this is just a null check against fuzzers etc, this shouldn't be null in production. "shouldn't the returns be changed as well?" I didn't get. What I see: Your version //if all not contains if (!true && !true && !true) return false //(then) one or more contains return true My version //if one or more contains else false return true || true || true So, logic is the same but the code has been simplified. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1186438627 ![image](https://user-images.githubusercontent.com/67366451/179389503-6b076c25-f880-405a-9233-979b2a567565.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1186435932 > > Hi, @THausherr! The few my observations: > > 1 AnnotationBorder.underline - unused field. > > 2 CCITTFactory.readshort - useless if-statement (wrong if body?). > > 3 PDCaretAppearanceHandler - line 73. Operation result is ignored (matrix.transformPoint(rd, rd);) > > 1. true, but I'll keep it. This is because PDLinkAppearanceHandler doesn't use AnnotationBorder for some reason. Maybe it should, but this isn't really that important. From what I see, this would require more changes, some of them breaking. > 2. I don't understand. TIFF files start with "II" or "MM". > 3. thanks, removed. I have no idea why I did this. About second item. As I understood, bitwise OR operation doesn't care about arguments order:) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1186432401 COSDocument.saveIncrementalForExternalSigning. If the getSignatureDictionaries method returns an empty collection (string...), you throw an IllegalStateException. You break the loop if any signature needs to be updated. If the collection is not empty and there is no updatable signature, the foundSignature variable will not be null and you will also do this incremental save. Is this the correct behavior, I mean do you need to do this incremental save on a non-empty collection? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1186421868 Hi, @THausherr! The few my observations: 1 AnnotationBorder.underline - unused field. 2 CCITTFactory.readshort - useless if-statement (wrong if body?). 3 PDCaretAppearanceHandler - line 73. Operation result is ignored (matrix.transformPoint(rd, rd);) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1180247890 > > > existing code is better IMHO. And whoever wrote this, intended to use an array instead of a list. > > > > > > You are converting an array to a list in the getXRefRanges method. You use more memory and time to do this. I just removed the redundant operation. You are used to using lists. Only on this occasion you say that the existing code is better. > > getXRefRanges() starts with a list when it calls `new ArrayList<>()`. The current code converts this to an array at the end with `return list.toArray(new Long[list.size()]);` and loses the list. Yes. Just I mean, IMHO, it is redurant convertion. I found only one reason to no do that is backward compatibility (getXRefRanges is protected). But you often use lists and rarely arrays. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1179682790 > existing code is better IMHO. And whoever wrote this, intended to use an array instead of a list. You are converting an array to a list in the getXRefRanges method. You use more memory and time to do this. I just removed the redundant operation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1158729508 ![image](https://user-images.githubusercontent.com/67366451/174279628-c75e2409-d639-4f58-8327-a82c3a546702.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1155610797 Hi, @THausherr ! You did commit https://github.com/apache/pdfbox/commit/53bd9bb6cdaf01300aad13fa5632d830444a. How about line 249? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1121512722 > > > > Hi, @THausherr ! 1 The method PDFPrintable.print has affine transform clonning (260). Is this operation really needed or is it old code? 2 the classes from org.apache.pdfbox.pdmodel.font.encoding packege have public constructors but they have field with comment "Singleton instance of this class". The same for a packege org.apache.fontbox.encoding. 3 The packages org.apache.fontbox.encoding and org.apache.pdfbox.pdmodel.font.encoding contain the MacRomanEncoding class. The encoding tables of these classes are not much different. Perhaps they were separated due to different encoding tables. > > > > > > > > > Thanks, I made two of the three proposed changes. I prefer not to tough the third one due to the difference. > > > > > > Hi, @THausherr! About PDFPrintable, line 260. I thought you would remove the AffineTransform instance from the code (remove those lines). I can't understand the whole logic of this code, but why do you need to set a transformation clone (check the graphics2D.getTransform java doc) on line 269? > > I have removed the clone, maybe you have an older version of the code. I kept the rest of the code. The purpose of this part is to restore the original transform when painting the border. Thank you for respond. I understood. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1120470783 > > Hi, @THausherr ! 1 The method PDFPrintable.print has affine transform clonning (260). Is this operation really needed or is it old code? 2 the classes from org.apache.pdfbox.pdmodel.font.encoding packege have public constructors but they have field with comment "Singleton instance of this class". The same for a packege org.apache.fontbox.encoding. 3 The packages org.apache.fontbox.encoding and org.apache.pdfbox.pdmodel.font.encoding contain the MacRomanEncoding class. The encoding tables of these classes are not much different. Perhaps they were separated due to different encoding tables. > > Thanks, I made two of the three proposed changes. I prefer not to tough the third one due to the difference. Hi, @THausherr! About PDFPrintable, line 260. I thought you would remove the AffineTransform instance from the code (remove those lines). I can't understand the whole logic of this code, but why do you need to set a transformation clone (check the graphics2D.getTransform java doc) on line 269? About encoding. You have changed classes from pdfbox but not from fontbox. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1120468091 > This makes the code less readable IMHO, and the possible performance gain doesn't matter, LZW compression isn't used much (mostly on very old PDFs, because it's not efficient). Re your other question "it is possible": yes, it is, that part is green in the jacoco report, see .../pdfbox/target/site/jacoco/org.apache.pdfbox.filter/LZWFilter.java.html okay -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1119851991 > PDFCloneUtility.checkForRecursion() check breaks support for a some existing PDFs. I've seen a lot of PDFs where the /Prev and /Next entries point to the parent element if there is no prev or next element. So this breaks support for a a lot of PDFs which could be processed before. Will be better if you will write it here: https://issues.apache.org/jira/projects/PDFBOX/issues/PDFBOX-5426?filter=allopenissues -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1113568288 > > Hi, @THausherr ! 1 The method PDFPrintable.print has affine transform clonning (260). Is this operation really needed or is it old code? 2 the classes from org.apache.pdfbox.pdmodel.font.encoding packege have public constructors but they have field with comment "Singleton instance of this class". The same for a packege org.apache.fontbox.encoding. 3 The packages org.apache.fontbox.encoding and org.apache.pdfbox.pdmodel.font.encoding contain the MacRomanEncoding class. The encoding tables of these classes are not much different. Perhaps they were separated due to different encoding tables. > > Thanks, I made two of the three proposed changes. I prefer not to tough the third one due to the difference. Okay. Third is warning. Just a strange code. How about my propositions? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1108301964 Proposition: use java.util.concurrent.ConcurrentHashMap for CMapManager .cMapCache -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1108294910 Proposition: move the method generateAppearanceStreams to PDAbstractAppearanceHandler -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1108249415 Hi, @THausherr ! 1 The method PDFPrintable.print has affine transform clonning (260). Is this operation really needed or is it old code? 2 the classes from org.apache.pdfbox.pdmodel.font.encoding packege have public constructors but they have field with comment "Singleton instance of this class". The same for a packege org.apache.fontbox.encoding. 3 The packages org.apache.fontbox.encoding and org.apache.pdfbox.pdmodel.font.encoding contain the MacRomanEncoding class. The encoding tables of these classes are not much different. Perhaps they were separated due to different encoding tables. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on PR #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1101249810 > This pull request **introduces 1 alert** when merging [d3c1021](https://github.com/apache/pdfbox/commit/d3c10211f5202cb1baa98cb4b365ff6b6a5334b0) into [9db582c](https://github.com/apache/pdfbox/commit/9db582c98f00619aae7d8db5708de3ad68fdeb7f) - [view on LGTM.com](https://lgtm.com/projects/g/apache/pdfbox/rev/pr-f2db0250a63562390f2dce579cf3c21fe6c83cf1) > > **new alerts:** > > * 1 for Boxed variable is never null Hi! The prevSum variable is int, the 'current.intValue()' is number. So, result can be int or Integer. Check next row please. We add this value to list. For list it always be Integer. I reduced redurant convertion, IMHO. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1077959801 > > Hi, @THausherr! I found a comment of SequenceRandomAccessRead.createView method was removed. > > Which one / when? Heh! It was mine. ![image](https://user-images.githubusercontent.com/67366451/159992284-3aae29ff-5f55-431d-a75a-a37cd1ee1cd0.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1077954790 If PDDocument.saveIncrementalForExternalSigning will be called several times then signingSupport instance will be createad again but released only once (PDDocument.close()) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1077915935 Hi, @THausherr! I found a comment of SequenceRandomAccessRead.createView method was removed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1073034788 Hi, @THausherr! There is changed COSParser.findString method with implemented KMP algorithm. Please, check it and commit [COSParser.txt](https://github.com/apache/pdfbox/files/8309587/COSParser.txt) . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1069430606 Hi, @THausherr! PDFCloneUtility.checkForRecursion. No need to convert a parent parameter to COSBase. It can be object for faster checking. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1064909170 Hi, @THausherr ! CosWriter.prepareIncrement. If cosObjectKey variable can be null getObjectFromPool method will throw NPE. A COSDocument.getObjectsByType method has no this check too. There is NPE or redurant null check? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1046265181 I have one little proposition. The method PDPage.getAnnotations(AnnotationFilter) returns instance of COSArrayList. The PageDrawer uses it inside drawPage method (reading). We able to create additional method to reduce memory usage for this case. Method: //The idea: don't use COSArrayList because it is expensive. Use ArrayList. /** * This will return a list of the annotations for this page. * * @param annotationFilter the annotation filter provided allowing to filter out specific annotations * @return List of the PDAnnotation objects, never null. * * @throws IOException If there is an error while creating the annotation list. */ public List getAnnotationsList(AnnotationFilter annotationFilter) throws IOException { COSArray annots = page.getCOSArray(COSName.ANNOTS); if (annots == null) { return Collections.EMPTY_LIST; } List actuals = new ArrayList<>(); for (int i = 0; i < annots.size(); i++) { COSBase item = annots.getObject(i); if (item != null) { PDAnnotation createdAnnotation = PDAnnotation.createAnnotation(item); if (annotationFilter.accept(createdAnnotation)) { actuals.add(createdAnnotation); } } } return actuals; } Maybe, this additional method will be better if you use it instead of PDPage.getAnnotations() for some cases. The code: public List getAnnotationsList() throws IOException { return getAnnotationsList(annotation -> true); } -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1045104764 Hi, @THausherr! Method PDTerminalField.importFDF (103). Can the getWidgets() method return an emply list for this case (probably an IndexOutOfBoundsException)? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1042632955 A method PDHighlightAppearanceHandler.generateNormalAppearance, line 77. Can rectangle be null? A method PDInkAppearanceHandler.generateNormalAppearance, line 90. The same question. No checks for several children of PDAbstractAppearanceHandler's. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-1042618054 Hi, @THausherr! You use Graphics type for method PageDrawer.drawPage(Graphics, PDRectangle) as first parameter. But inside you always convert it to Graphics2D. Maybe, you will use Graphics2D in parameter? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-997377146 > I agree that the method is also flawed because the offset isn't set to the starting point - 1 after a fail, but I doubt troubles would happen as long as we only search for %%EOF and startxref. > > ``` > RandomAccessRead source = new RandomAccessReadBuffer(new byte[0]); > PDFParser parser = new PDFParser(source, null, null, null, null); > byte [] buf = { '%', '%', '%', '%', 'E', 'O', 'F', 'O', 'F' }; > int result = parser.lastIndexOf(EOF_MARKER, buf, buf.length); > System.out.println("result: " + result); > ``` > > Can you come up with a failing test? I didn't find any matching string for lastIndexOf -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-997342867 > I agree that the method is also flawed because the offset isn't set to the starting point - 1 after a fail, but I doubt troubles would happen as long as we only search for %%EOF and startxref. > Can you come up with a failing test? I'll figure out how to make the failing test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-997280280 > Good find. I think this can be fixed by changing the code to be like this: > > ``` > else if (counter > 0) > { > counter = 0; > source.seek(position + 1); > position = -1L; > //continue; > } > ``` > > The code is used only when we do a brute force search on broken PDFs. If a PDF would contain "%%%EOF" then "%%EOF" wouldn't be found with the existing code. The perfect solution would be https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm but that would need more time. ok. Looks like for lastindexof method has the same problem. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-997239316 @THausherr , I found one case for COSParser.findString when this method returns wrong result. Maybe it is impossible because of string pattern in my test cant happen in real pdf. What can you say? private static long findString(char[] string, SRC source) { long position = -1L; int stringLength = string.length; int counter = 0; int readChar = source.read(); while (readChar != -1) { if (readChar == string[counter]) { if (counter == 0) { position = source.getPosition() - 1; } counter++; if (counter == stringLength) { return position; } } else if (counter > 0) { counter = 0; position = -1L; continue; } readChar = source.read(); } return position; } public static void main(String[]args) { System.out.println(findString("aba".toCharArray(), new SRC("aaaba"))); System.out.println("aaaba".contains("aba")); } -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-997238570 > My guess is that the comment is misleading. Should probably rather be a text like "should not happen, likely broken PDF". The code is related to https://issues.apache.org/jira/browse/PDFBOX-3891 which has bad PDFs. understood -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-997232062 PDAcroFormFromAnnotsTest.testFromAnnots3891ValidateFont (261) ![image](https://user-images.githubusercontent.com/67366451/146649656-ea4f9a2d-1ff9-41ea-aaee-43b561676446.png) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-992764988 > The first is a slight memory optimization but makes the code less readable. The second wouldn't work because the PDRectangle constructor takes width/height and not coordinates as parameters. Hi, @THausherr! Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-969903630 > Yeah maybe it should be done. But isValid is about big values that are outside of allowed range, so maybe they are equal to another value value in a different type, so maybe that is the reason it isn't checked. Yes may be but shoud > > Hi, @THausherrI found one strange method. COSInteger.equals. Why it has no isValid method checks? > > Yeah maybe it should be done. But isValid is about big values that are outside of allowed range, so maybe they are equal to another value value in a different type, so maybe that is the reason it isn't checked. I dont know the range of these values. Can be COSInteger be valid with max (or min) long value? If yes, IMHO, then we need isvalid check in the equals method. You are using the boolean field and long field. If max long and min long are valid you able to check to max and min values of the COSInteger instances (static fields). You will delete boolean field this way. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-969304770 Hi, @THausherr ! The classes JBIG2Filter (decode method) and COSWriter (getDataToSig method) uses a SequenceInputStream class with constructor with Vector creation. The Vector is syncronized collection and, you know, it is slow. Is it necessary to use this constructor but not another? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-968134524 Hm. You changed all 'this' to 'the' for PDType1Font and added a throw statement with same message with 'this'. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-964823754 @Hi, @THausherr! I found one strange method. COSInteger.equals. Why it has no isValid method checks? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-964561644 PDPageTree.get(int, COSDictionary, int). Should you check pageNum less then one to throw an IndexOutOfBoundsException (pageNum is 1-based)? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-964420992 > A COSStream is a COSDictionary that has a stream (assuming that you think it's suspicious that the check is inside a block that is for COSDictionaries) Thank, you. Now I see. COSStream is a COSDictionary. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-964415738 Hi @THausherr! Please check the COSDictionary.getDictionaryString method. It looks like there are error on line 1372 ("if (base instanceof COSStream)" in the wrong place). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-960491720 > > Hi, @THausherr! You have changed the PDColorSpace class. It looks like the ColorConvertOp field might be static? Why do you need many instances of ColorConvertOp for each PDColorSpace instance? (PDFBOX-5310: create ColorConvertOp only once) > > Yes this was changed for speed. The effects are amazing for files with many tiny images. There seems to be some caching going on, from looking at the source code. I assumed that this caching would be per entity, so each entity should keep its own colorspace. A common ColorConvertOp for all would mean synchronization and possibly loss of caching, even if it doesn't happen, then maybe in older versions. Understood. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-959920979 You have changed the PDColorSpace class. It looks like the ColorConvertOp field might be static? Why do you need many instances of ColorConvertOp for each PDColorSpace instance? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-959920979 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-959920979 You have changed the PDColorSpace class. It looks like the ColorConvertOp field might be static? Why do you need many instances of ColorConvertOp for each PDColorSpace instance? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-947862659 Hi, @THausherr ! I found one strange detail in a PageExtractor class. An interesting process is going on inside the extraction method. In the loop we calling the importPage method. Inside this method we changing the cropbox, mediabox and rotation for an imported page. We doeing the same in the PageExtractor.extract method. Maybe you will remove unneeded initialisations of the cropbox etc. Also, the importPage method notifies us about resources but we doeing right operation with resources in the PageExtractor.extract method (inherited resources of source document are not imported to...). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-937549360 PDSignature.getByteRange able to return empty array but getContents(...) methods have no checks. A comment in getContents() method says "@throws IOException if the pdfFile can't be read" but no message "@throws IndexOutOfBoundsException if ..." -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-937549360 PDSignature.getByteRange able to return empty array but getContents(...) methods have no checks. A comment in getContents() method says "@throws IOException if the pdfFile can't be read" but no message "@throws IndexOutOfBoundsException if ..." -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-913181285 > This one I won't do (except the comment, at a later time). The performance win is minimal, maybe none, but the code gets much longer. This method is called once on a PDF. There will usually be at least 16 bytes unless the PDF is broken. I changed this code not for speed but better memory usage. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-907833215 Hi, @THausherr! My two observations. 1 Is it not too math to enlarge ScratchFileBuffer.pageIndexes array length twice (ScratchFileBuffer.addPage)? 2 PDVisibleSigBuilder.createTemplate. If this method will be called twice then previous template (PDDocument) not be closed. You able to store link to that template in a PDVisibleSigBuilder. If this link will be same as template field then it will be closed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-906643827 > I don't see the problem - it retrieves the last element, which is a name. That is checked earlier. Yes. Now I see -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-906594590 PDColor.PDColor(COSArray, PDColorSpace). Line 58. Index is zero based (was -1 lost?). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-905238386 StandardSecurityHandler.prepareForDecryption. Line 153: encryption.get__String__FilterName()? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-904885292 > > WinAnsiEncoding > > > WinAnsiEncoding constructor. Check comment (276) and creation of index (277). > > So what's the problem? I don't see a contradiction. No problem. I thought means decimal systemin in comment -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-904818896 WinAnsiEncoding constructor. Check comment (276) and creation of index (277). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-904546337 FDFJavaScript.getDoc. This method returns null or empty map but not map with keys and values. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-904028504 PDNumberTreeNode.getUpperLimit. Line 284. It look like wrong index. Should be "1". -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-901352254 MapBackedScriptFeature.equals. Line 105. This is right fields comparing? The maps compares keys in the equals method and you compares keys with type List by reference but not value. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-890458381 > I'm aware that a developer would be able to change it. These types of fields have been flagged by Sonar. They shouldn't change the page size after creation, and instead set the right size. Fixing this problem would probably require specialized classes. It's not a security problem. One specialized class. ImmutablePDRectangle -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-890399522 > re 1, 5, 6, 7, 8, 9: fixed, thanks > re 3: for now > re 4: will wait until it happens. Not sure if this can ever happen. About 3. Programmer able to change rectangle using its setters then LETTER will not be LETTER -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-890298627 Hi, @THausherr! 1 Maybe this is not a problem and it will not be. I found that COSArray.setString can set null as a value. Some methods, such as COSArray.indexOf, use an objects collection with no null checks. 2 PDAnnotation.getBorder. If border.size() >= 3 don't need to "create a copy to avoid changing the PDF"? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-882031378 @THausherr , i have one more proposition. How about additional constructor for COSArray? private final List objects; //... field and nonparametrized constructor public COSArray(int initCapacity) { objects = new ArrayList<>(initCapacity); } -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-882028500 > This change makes code less readable and uses a boolean parameter which breaks the "do one thing" rule. (Yes we probably break this rule a lot, but there's no need to add more). It does save 3 or 4 bytes, but I doubt that this makes a difference. > https://twitter.com/unclebobmartin/status/1114137614377005057 Hi, @THausherr! I tried to allocate less memory. If you will purpose another way it will be good. I mean, if you will use public constructor (instead of private constructor and static method it will be simplier). As you see I got one second faster test results. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-869176791 > filter() only returns a new image if dest is null ("dest color converted from src or a new, converted BufferedImage if dest is null") so it doesn't matter. > > Btw please don't bother with preflight. This subproject will be removed in 4.0. The EU funded VeraPDF is also open source and has more features. ok -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-869171944 Hi, @THausherr! Why for last line PDColorSpace.toRGBImageAWT returns old image but not new (`return op.filter(src, dest);`)? Sometimes reference to destination image can be changed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-868950765 There is simplified version of COSOutputStream.close: ` @Override public void close() throws IOException { try { if (buffer != null) { try { // apply filters in reverse order for (int i = filters.size() - 1; i >= 0; i--) { try (InputStream unfilteredIn = new RandomAccessInputStream(buffer)) { if (i == 0) { /* * The last filter to run can encode directly to the enclosed output * stream. */ filters.get(i).encode(unfilteredIn, out, parameters, i); } else { RandomAccess filteredBuffer = scratchFile.createBuffer(); try (OutputStream filteredOut = new RandomAccessOutputStream(filteredBuffer)) { filters.get(i).encode(unfilteredIn, filteredOut, parameters, i); } finally { buffer.close(); buffer = filteredBuffer; } } } } } finally { buffer.close(); buffer = null; } } } finally { super.close(); } }` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-868308933 > Isn't `buffer` closed in the `finally` block? Same for `filteredBuffer`. Or did I misunderstand this? Yes, buffer will be closed in inner finally block. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-868306265 > RandomAccessInputStream and RandomAccessOutputStream are simple wrapper classes and close isn't needed at all. Maybe we should just add an empty implementation of close to document that. Right now try-with-resources avoids sonar warnings Yes, I've seen several cases where streams could not be closed. But why are you using the trick for COSOutputStream.close? You use fake try-with-resources statement to avoid sonar warnings, then you use swapping of streams to close right stream. Why not use try-with-resources with the correct stream and without RandomAccessOutputStream? I understand, maybe this code matches the documentation, but it is too confusing, IMO. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-867467534 COSOutputStream.close. Row 131. The buffer field will not be closed because a RandomAccessInputStream not overrides close method. Or don't you need try-with resources? COSOutputStream.close. Row 146. The buffer field will not be closed because a RandomAccessOutputStream not overrides close method. Or don't you need try-with resources? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-864370716 > Isn't the "put" line all that is needed? (and the "remove" line isn't?) Yes. The "put" is all that needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-863476067 > Please don't submit code style changes. The style is mostly relevant for new code. I changed only because licenses had too many stars in the comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-860922327 HorizontalMetricsTable. Line 56. Typo in the exception message? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-856119473 > Your code removes all after 0 and including this 0, but that is different to what the current code does. We don't need this at this time. Your code does the same: public static void main(String[]args) { String s=cutNullBytes(new byte[]{ 1,2,3,0,0,0,0,0,5,4,5,4,6,0,0,0},false); String s1=cutNullBytes(new byte[]{ 1,2,3,0,0,0,0,0,5,4,5,4,6,0,0,0},true); String s2=cutNullBytes(new byte[]{ 1,2,3,5,4,5,4,6,0,0,0,0},false); String s3=cutNullBytes(new byte[]{ 1,2,3,5,4,5,4,6,0,0,0,0},false); System.out.println(s.equals(s1)); System.out.println(s2.equals(s3)); } private static String cutNullBytes(byte[]bytes, boolean useLastIndexOf) { String label = new String(bytes); System.out.println("label length: " + label.length() + " label value: " + label); if (useLastIndexOf) { while (label.lastIndexOf(0) != -1) { label = label.substring(0, label.length() - 1); } } else { int index = label.indexOf(0); if (index > -1) label = label.substring(0, index); } System.out.println("label length: " + label.length() + " label value: " + label); return label; } -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-855732380 > Re PDPageLabels: my understanding of the code is that it removes the 0 bytes at the end, and only these, see https://issues.apache.org/jira/browse/PDFBOX-1047 . The problem string there is <3300> so there are three 0 bytes. public static void main(String[]args) { //row with null bytes in the center cutNullBytes(new byte[]{ 1,2,3,0,0,0,0,0,5,4,5,4,6,0,0,0}, false);//todo check and change false to true //row without null bytes in the center cutNullBytes(new byte[]{ 1,2,3,5,4,5,4,6,0,0,0}, false);//todo check and change false to true } private static void cutNullBytes(byte[]bytes, boolean useLastIndexOf) { String label = new String(bytes); System.out.println("label length: " + label.length() + " label value: " + label); if (useLastIndexOf) { //slow code while (label.lastIndexOf(0) != -1) { label = label.substring(0, label.length() - 1); } } else { //fast code int index = label.indexOf(0); if (index > -1) label = label.substring(0, index); } System.out.println("label length: " + label.length() + " label value: " + label); } -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[GitHub] [pdfbox] valerybokov commented on pull request #107: potential memory leaks and small performance improvements
valerybokov commented on pull request #107: URL: https://github.com/apache/pdfbox/pull/107#issuecomment-855398906 > COSParser.parseTrailer. If a nextLine variable will start with a space character then a nextLine.startsWith("trailer") return false. Сan the nextLine start with a space? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org