[jira] [Commented] (GEOMETRY-99) Inconsistent equals and hashCode
[ https://issues.apache.org/jira/browse/GEOMETRY-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151450#comment-17151450 ] Matt Juntunen commented on GEOMETRY-99: --- Fixed in fc9272b2366d6b5944e1c1a9ecf568ff89f95a30 > Inconsistent equals and hashCode > > > Key: GEOMETRY-99 > URL: https://issues.apache.org/jira/browse/GEOMETRY-99 > Project: Apache Commons Geometry > Issue Type: Bug >Reporter: Matt Juntunen >Priority: Major > Labels: beta1 > > The following classes have inconsistent behavior between {{equals}} and > {{hashCode}} in regard to +0.0 and -0.0: > * EpsilonDoublePrecisionContext > * Vector1D > * Vector2D > * Vector2D > * SphericalCoordinates > * PolarCoordinates > * AffineTransformMatrix2D > * AffineTransformMatrix3D > * AxisAngleSequence > This inconsistency comes from the use of == or {{Precision.equals}} to > compare doubles in the {{equals}} method (which consider +0.0 and -0.0 to be > equal) and {{Double.hashCode}} in the {{hashCode}} method (which returns > different hash codes for +0.0 and -0.0). Hence, instances could be considered > equal but have different hash codes. > We should standardize on the use of {{Double.compare}} in {{equals}} and > {{Double.hashCode}} in {{hashCode}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (GEOMETRY-99) Inconsistent equals and hashCode
[ https://issues.apache.org/jira/browse/GEOMETRY-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Juntunen resolved GEOMETRY-99. --- Resolution: Fixed > Inconsistent equals and hashCode > > > Key: GEOMETRY-99 > URL: https://issues.apache.org/jira/browse/GEOMETRY-99 > Project: Apache Commons Geometry > Issue Type: Bug >Reporter: Matt Juntunen >Priority: Major > Labels: beta1 > > The following classes have inconsistent behavior between {{equals}} and > {{hashCode}} in regard to +0.0 and -0.0: > * EpsilonDoublePrecisionContext > * Vector1D > * Vector2D > * Vector2D > * SphericalCoordinates > * PolarCoordinates > * AffineTransformMatrix2D > * AffineTransformMatrix3D > * AxisAngleSequence > This inconsistency comes from the use of == or {{Precision.equals}} to > compare doubles in the {{equals}} method (which consider +0.0 and -0.0 to be > equal) and {{Double.hashCode}} in the {{hashCode}} method (which returns > different hash codes for +0.0 and -0.0). Hence, instances could be considered > equal but have different hash codes. > We should standardize on the use of {{Double.compare}} in {{equals}} and > {{Double.hashCode}} in {{hashCode}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [commons-geometry] darkma773r closed pull request #94: Windows Build Fixes
darkma773r closed pull request #94: URL: https://github.com/apache/commons-geometry/pull/94 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #94: Windows Build Fixes
darkma773r commented on pull request #94: URL: https://github.com/apache/commons-geometry/pull/94#issuecomment-653830224 Merged in afde7adad1d9e7b893d37a8278812aaff6590313 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] asfgit merged pull request #93: GEOMETRY-99: Making equals and hashCode consistent
asfgit merged pull request #93: URL: https://github.com/apache/commons-geometry/pull/93 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r opened a new pull request #94: Windows Build Fixes
darkma773r opened a new pull request #94: URL: https://github.com/apache/commons-geometry/pull/94 - fixing checkstyle suppression on windows - adjusting unit test value for more consistent result sorting This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (TEXT-178) StringSubstitutor incorrectly removes some escape characters
[ https://issues.apache.org/jira/browse/TEXT-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary D. Gregory updated TEXT-178: - Summary: StringSubstitutor incorrectly removes some escape characters (was: StringSubstitutor incorrectly removes escape characters) > StringSubstitutor incorrectly removes some escape characters > > > Key: TEXT-178 > URL: https://issues.apache.org/jira/browse/TEXT-178 > Project: Commons Text > Issue Type: Bug >Reporter: Gary D. Gregory >Priority: Major > > The class {{StringSubstitutor}} incorrectly removes escape characters. > Note that these tests pass: > {code:java} > /** > * Tests interpolation with weird boundary patterns. > */ > @Test > public void testReplaceWeirdPattens() throws IOException { > doTestNoReplace(StringUtils.EMPTY); > doTestNoReplace(EMPTY_EXPR); > doTestNoReplace("${ }"); > doTestNoReplace("${\t}"); > doTestNoReplace("${\n}"); > doTestNoReplace("${\b}"); > doTestNoReplace("${"); > doTestNoReplace("$}"); > doTestNoReplace("$$}"); > doTestNoReplace("}"); > doTestNoReplace("${}$"); > doTestNoReplace("${}$$"); > doTestNoReplace("${${"); > doTestNoReplace("${${}}"); > doTestNoReplace("${$${}}"); > doTestNoReplace("${$$${}}"); > doTestNoReplace("${$$${$}}"); > doTestNoReplace("${${}}"); > doTestNoReplace("${${ }}"); > } > {code} > But these tests fail: > {code:java} > /** > * Tests interpolation with weird boundary patterns. > */ > @Test > @Disabled > public void testReplaceWeirdPattensJira() throws IOException { > doTestNoReplace("$${"); > doTestNoReplace("$${a"); > doTestNoReplace("$$${"); > doTestNoReplace("$$${a"); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (TEXT-178) StringSubstitutor incorrectly removes escape characters
Gary D. Gregory created TEXT-178: Summary: StringSubstitutor incorrectly removes escape characters Key: TEXT-178 URL: https://issues.apache.org/jira/browse/TEXT-178 Project: Commons Text Issue Type: Bug Reporter: Gary D. Gregory The class {{StringSubstitutor}} incorrectly removes escape characters. Note that these tests pass: {code:java} /** * Tests interpolation with weird boundary patterns. */ @Test public void testReplaceWeirdPattens() throws IOException { doTestNoReplace(StringUtils.EMPTY); doTestNoReplace(EMPTY_EXPR); doTestNoReplace("${ }"); doTestNoReplace("${\t}"); doTestNoReplace("${\n}"); doTestNoReplace("${\b}"); doTestNoReplace("${"); doTestNoReplace("$}"); doTestNoReplace("$$}"); doTestNoReplace("}"); doTestNoReplace("${}$"); doTestNoReplace("${}$$"); doTestNoReplace("${${"); doTestNoReplace("${${}}"); doTestNoReplace("${$${}}"); doTestNoReplace("${$$${}}"); doTestNoReplace("${$$${$}}"); doTestNoReplace("${${}}"); doTestNoReplace("${${ }}"); } {code} But these tests fail: {code:java} /** * Tests interpolation with weird boundary patterns. */ @Test @Disabled public void testReplaceWeirdPattensJira() throws IOException { doTestNoReplace("$${"); doTestNoReplace("$${a"); doTestNoReplace("$$${"); doTestNoReplace("$$${a"); } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [commons-geometry] darkma773r opened a new pull request #93: GEOMETRY-99: Making equals and hashCode consistent
darkma773r opened a new pull request #93: URL: https://github.com/apache/commons-geometry/pull/93 Using Double.compare in equals and Double.hashCode in hashCode This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (GEOMETRY-99) Inconsistent equals and hashCode
[ https://issues.apache.org/jira/browse/GEOMETRY-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Juntunen updated GEOMETRY-99: -- Labels: beta1 (was: ) > Inconsistent equals and hashCode > > > Key: GEOMETRY-99 > URL: https://issues.apache.org/jira/browse/GEOMETRY-99 > Project: Apache Commons Geometry > Issue Type: Bug >Reporter: Matt Juntunen >Priority: Major > Labels: beta1 > > The following classes have inconsistent behavior between {{equals}} and > {{hashCode}} in regard to +0.0 and -0.0: > * EpsilonDoublePrecisionContext > * Vector1D > * Vector2D > * Vector2D > * SphericalCoordinates > * PolarCoordinates > * AffineTransformMatrix2D > * AffineTransformMatrix3D > * AxisAngleSequence > This inconsistency comes from the use of == or {{Precision.equals}} to > compare doubles in the {{equals}} method (which consider +0.0 and -0.0 to be > equal) and {{Double.hashCode}} in the {{hashCode}} method (which returns > different hash codes for +0.0 and -0.0). Hence, instances could be considered > equal but have different hash codes. > We should standardize on the use of {{Double.compare}} in {{equals}} and > {{Double.hashCode}} in {{hashCode}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEOMETRY-99) Inconsistent equals and hashCode
Matt Juntunen created GEOMETRY-99: - Summary: Inconsistent equals and hashCode Key: GEOMETRY-99 URL: https://issues.apache.org/jira/browse/GEOMETRY-99 Project: Apache Commons Geometry Issue Type: Bug Reporter: Matt Juntunen The following classes have inconsistent behavior between {{equals}} and {{hashCode}} in regard to +0.0 and -0.0: * EpsilonDoublePrecisionContext * Vector1D * Vector2D * Vector2D * SphericalCoordinates * PolarCoordinates * AffineTransformMatrix2D * AffineTransformMatrix3D * AxisAngleSequence This inconsistency comes from the use of == or {{Precision.equals}} to compare doubles in the {{equals}} method (which consider +0.0 and -0.0 to be equal) and {{Double.hashCode}} in the {{hashCode}} method (which returns different hash codes for +0.0 and -0.0). Hence, instances could be considered equal but have different hash codes. We should standardize on the use of {{Double.compare}} in {{equals}} and {{Double.hashCode}} in {{hashCode}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GEOMETRY-95) CSG Examples
[ https://issues.apache.org/jira/browse/GEOMETRY-95?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151412#comment-17151412 ] Matt Juntunen commented on GEOMETRY-95: --- I've added a tutorial for BSP trees in general that will be in beta1. However, a full 3D CSG tutorial will wait for another release. > CSG Examples > > > Key: GEOMETRY-95 > URL: https://issues.apache.org/jira/browse/GEOMETRY-95 > Project: Apache Commons Geometry > Issue Type: New Feature >Reporter: Matt Juntunen >Priority: Major > > Adding Constructive Solid Geometry examples and userguide entries to help new > users to the library use these features. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEOMETRY-95) CSG Examples
[ https://issues.apache.org/jira/browse/GEOMETRY-95?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Juntunen updated GEOMETRY-95: -- Labels: (was: beta1) > CSG Examples > > > Key: GEOMETRY-95 > URL: https://issues.apache.org/jira/browse/GEOMETRY-95 > Project: Apache Commons Geometry > Issue Type: New Feature >Reporter: Matt Juntunen >Priority: Major > > Adding Constructive Solid Geometry examples and userguide entries to help new > users to the library use these features. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [commons-geometry] darkma773r commented on pull request #87: suggest use enhanced for when we can
darkma773r commented on pull request #87: URL: https://github.com/apache/commons-geometry/pull/87#issuecomment-653757469 Change implemented in 385c02a9b2cc924ebdaacc8ba02f93b76cac31cc This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #87: suggest use enhanced for when we can
darkma773r closed pull request #87: URL: https://github.com/apache/commons-geometry/pull/87 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #91: Change a class name from Error to Exception as it is an Exception not an Error
darkma773r closed pull request #91: URL: https://github.com/apache/commons-geometry/pull/91 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #91: Change a class name from Error to Exception as it is an Exception not an Error
darkma773r commented on pull request #91: URL: https://github.com/apache/commons-geometry/pull/91#issuecomment-653756123 Merged in commit 3c25e07d19e0d7c36073713c6a60d15adf4a5ecf This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #90: simplify if conditions
darkma773r commented on pull request #90: URL: https://github.com/apache/commons-geometry/pull/90#issuecomment-653756077 Merged in commit c2f929e2b182d070caf6ed3f605380d09cd1765b This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #90: simplify if conditions
darkma773r closed pull request #90: URL: https://github.com/apache/commons-geometry/pull/90 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #89: add break to a loop.
darkma773r commented on pull request #89: URL: https://github.com/apache/commons-geometry/pull/89#issuecomment-653756063 Merged in commit b06c457b98ff09f529c25fe9fb3f6e282320ad54 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #89: add break to a loop.
darkma773r closed pull request #89: URL: https://github.com/apache/commons-geometry/pull/89 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #88: change a field to local variable, as I didn't see anywhere using it.
darkma773r closed pull request #88: URL: https://github.com/apache/commons-geometry/pull/88 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #88: change a field to local variable, as I didn't see anywhere using it.
darkma773r commented on pull request #88: URL: https://github.com/apache/commons-geometry/pull/88#issuecomment-653756040 Merged in commit 6c90e34ff11fb9fa279d9b060abf70c14ce3cd2a This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #86: Fix readme in submodule
darkma773r closed pull request #86: URL: https://github.com/apache/commons-geometry/pull/86 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #86: Fix readme in submodule
darkma773r commented on pull request #86: URL: https://github.com/apache/commons-geometry/pull/86#issuecomment-653755976 Merged in commit bcfed9b1e161b70542f0356fb189a36f5eaea062 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #85: upgrade dependencies
darkma773r closed pull request #85: URL: https://github.com/apache/commons-geometry/pull/85 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #85: upgrade dependencies
darkma773r commented on pull request #85: URL: https://github.com/apache/commons-geometry/pull/85#issuecomment-653755919 Merged in commit 735ebe6b14f87b7fb0985249b81069ad8dee4840 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r commented on pull request #84: fix javadocs and typos.
darkma773r commented on pull request #84: URL: https://github.com/apache/commons-geometry/pull/84#issuecomment-653755842 Merged in commit 6bba60a617c720753021b25a53db6925bfe05189 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [commons-geometry] darkma773r closed pull request #84: fix javadocs and typos.
darkma773r closed pull request #84: URL: https://github.com/apache/commons-geometry/pull/84 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151287#comment-17151287 ] Stefan Bodewig commented on COMPRESS-539: - I'm sorry we crossed with our actions mid-flight [~peterlee] . I'm afraid I have to ask you to revert your change to {{IOUtils.skip}}. > TarArchiveInputStream allocates a lot of memory when iterating through an > archive > - > > Key: COMPRESS-539 > URL: https://issues.apache.org/jira/browse/COMPRESS-539 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.20 >Reporter: Robin Schimpf >Assignee: Peter Lee >Priority: Major > Attachments: Don't_call_InputStream#skip.patch, > Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, > image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png > > > I iterated through the linux source tar and noticed some unneeded > allocations happen without extracting any data. > Reproducing code > {code:java} > File tarFile = new File("linux-5.7.1.tar"); > try (TarArchiveInputStream in = new > TarArchiveInputStream(Files.newInputStream(tarFile.toPath( { > TarArchiveEntry entry; > while ((entry = in.getNextTarEntry()) != null) { > } > } > {code} > The measurement was done on Java 11.0.7 with the Java Flight Recorder. > Options used: > -XX:StartFlightRecording=settings=profile,filename=allocations.jfr > Baseline with the current master implementation: > Estimated TLAB allocation: 293MiB > !image-2020-06-21-10-58-07-917.png! > 1. IOUtils.skip -> input.skip(numToSkip) > This delegates in my test scenario to the InputStream.skip implementation > which allocates a new byte[] for every invocation. By simply commenting out > the while loop which calls the skip method the estimated TLAB allocation > drops to 164MiB (-129MiB). > !image-2020-06-21-10-58-43-255.png! > Commenting out the skip call does not seem to be the best solution but it > was quick for me to see how much memory can be saved. Also no unit tests > where failing for me. > 2. TarArchiveInputStream.readRecord > For every read of the record a new byte[] is created. Since the record size > does not change the byte[] can be reused and created when instantiating the > TarStream. This optimization is already present in the > TarArchiveOutputStream. Reusing the buffer reduces the estimated TLAB > allocations further to 128MiB (-36MiB). > !image-2020-06-21-10-59-10-825.png! > I attached the patches I used so the results can be verified. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151285#comment-17151285 ] Stefan Bodewig commented on COMPRESS-539: - COMPRESS-449 rather than 443, sorry. > TarArchiveInputStream allocates a lot of memory when iterating through an > archive > - > > Key: COMPRESS-539 > URL: https://issues.apache.org/jira/browse/COMPRESS-539 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.20 >Reporter: Robin Schimpf >Assignee: Peter Lee >Priority: Major > Attachments: Don't_call_InputStream#skip.patch, > Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, > image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png > > > I iterated through the linux source tar and noticed some unneeded > allocations happen without extracting any data. > Reproducing code > {code:java} > File tarFile = new File("linux-5.7.1.tar"); > try (TarArchiveInputStream in = new > TarArchiveInputStream(Files.newInputStream(tarFile.toPath( { > TarArchiveEntry entry; > while ((entry = in.getNextTarEntry()) != null) { > } > } > {code} > The measurement was done on Java 11.0.7 with the Java Flight Recorder. > Options used: > -XX:StartFlightRecording=settings=profile,filename=allocations.jfr > Baseline with the current master implementation: > Estimated TLAB allocation: 293MiB > !image-2020-06-21-10-58-07-917.png! > 1. IOUtils.skip -> input.skip(numToSkip) > This delegates in my test scenario to the InputStream.skip implementation > which allocates a new byte[] for every invocation. By simply commenting out > the while loop which calls the skip method the estimated TLAB allocation > drops to 164MiB (-129MiB). > !image-2020-06-21-10-58-43-255.png! > Commenting out the skip call does not seem to be the best solution but it > was quick for me to see how much memory can be saved. Also no unit tests > where failing for me. > 2. TarArchiveInputStream.readRecord > For every read of the record a new byte[] is created. Since the record size > does not change the byte[] can be reused and created when instantiating the > TarStream. This optimization is already present in the > TarArchiveOutputStream. Reusing the buffer reduces the estimated TLAB > allocations further to 128MiB (-36MiB). > !image-2020-06-21-10-59-10-825.png! > I attached the patches I used so the results can be verified. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151283#comment-17151283 ] Stefan Bodewig commented on COMPRESS-539: - not calling skip may reduce the amount of memory consumed but will slow things down considerably as skip is usually far more efficient than read. In your case I'd recommend wrapping your input stream into a {{SkipShieldingInputStream}} instead, which will have the same effect as commenting out the skip calls. Re-using the buffer in {{TarArchiveInputStream}} sounds right. No idea why this is different from TarArchiveOutputStream where it seems we've used a shared buffer since forever. [~peterlee] the team has had this discussion around COMPRESS-443 - a while before you joined - and back then explicitly decided we wanted to keep using {{skip}} for performance reasons. > TarArchiveInputStream allocates a lot of memory when iterating through an > archive > - > > Key: COMPRESS-539 > URL: https://issues.apache.org/jira/browse/COMPRESS-539 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.20 >Reporter: Robin Schimpf >Assignee: Peter Lee >Priority: Major > Attachments: Don't_call_InputStream#skip.patch, > Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, > image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png > > > I iterated through the linux source tar and noticed some unneeded > allocations happen without extracting any data. > Reproducing code > {code:java} > File tarFile = new File("linux-5.7.1.tar"); > try (TarArchiveInputStream in = new > TarArchiveInputStream(Files.newInputStream(tarFile.toPath( { > TarArchiveEntry entry; > while ((entry = in.getNextTarEntry()) != null) { > } > } > {code} > The measurement was done on Java 11.0.7 with the Java Flight Recorder. > Options used: > -XX:StartFlightRecording=settings=profile,filename=allocations.jfr > Baseline with the current master implementation: > Estimated TLAB allocation: 293MiB > !image-2020-06-21-10-58-07-917.png! > 1. IOUtils.skip -> input.skip(numToSkip) > This delegates in my test scenario to the InputStream.skip implementation > which allocates a new byte[] for every invocation. By simply commenting out > the while loop which calls the skip method the estimated TLAB allocation > drops to 164MiB (-129MiB). > !image-2020-06-21-10-58-43-255.png! > Commenting out the skip call does not seem to be the best solution but it > was quick for me to see how much memory can be saved. Also no unit tests > where failing for me. > 2. TarArchiveInputStream.readRecord > For every read of the record a new byte[] is created. Since the record size > does not change the byte[] can be reused and created when instantiating the > TarStream. This optimization is already present in the > TarArchiveOutputStream. Reusing the buffer reduces the estimated TLAB > allocations further to 128MiB (-36MiB). > !image-2020-06-21-10-59-10-825.png! > I attached the patches I used so the results can be verified. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151284#comment-17151284 ] Peter Lee commented on COMPRESS-539: Hey [~rschimpf] . Please have a look at commit 101137bfa0f3ae709c2a2771368b190ceb899ea0 if it helps. > TarArchiveInputStream allocates a lot of memory when iterating through an > archive > - > > Key: COMPRESS-539 > URL: https://issues.apache.org/jira/browse/COMPRESS-539 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.20 >Reporter: Robin Schimpf >Assignee: Peter Lee >Priority: Major > Attachments: Don't_call_InputStream#skip.patch, > Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, > image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png > > > I iterated through the linux source tar and noticed some unneeded > allocations happen without extracting any data. > Reproducing code > {code:java} > File tarFile = new File("linux-5.7.1.tar"); > try (TarArchiveInputStream in = new > TarArchiveInputStream(Files.newInputStream(tarFile.toPath( { > TarArchiveEntry entry; > while ((entry = in.getNextTarEntry()) != null) { > } > } > {code} > The measurement was done on Java 11.0.7 with the Java Flight Recorder. > Options used: > -XX:StartFlightRecording=settings=profile,filename=allocations.jfr > Baseline with the current master implementation: > Estimated TLAB allocation: 293MiB > !image-2020-06-21-10-58-07-917.png! > 1. IOUtils.skip -> input.skip(numToSkip) > This delegates in my test scenario to the InputStream.skip implementation > which allocates a new byte[] for every invocation. By simply commenting out > the while loop which calls the skip method the estimated TLAB allocation > drops to 164MiB (-129MiB). > !image-2020-06-21-10-58-43-255.png! > Commenting out the skip call does not seem to be the best solution but it > was quick for me to see how much memory can be saved. Also no unit tests > where failing for me. > 2. TarArchiveInputStream.readRecord > For every read of the record a new byte[] is created. Since the record size > does not change the byte[] can be reused and created when instantiating the > TarStream. This optimization is already present in the > TarArchiveOutputStream. Reusing the buffer reduces the estimated TLAB > allocations further to 128MiB (-36MiB). > !image-2020-06-21-10-59-10-825.png! > I attached the patches I used so the results can be verified. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Lee updated COMPRESS-539: --- Assignee: Peter Lee > TarArchiveInputStream allocates a lot of memory when iterating through an > archive > - > > Key: COMPRESS-539 > URL: https://issues.apache.org/jira/browse/COMPRESS-539 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.20 >Reporter: Robin Schimpf >Assignee: Peter Lee >Priority: Major > Attachments: Don't_call_InputStream#skip.patch, > Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, > image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png > > > I iterated through the linux source tar and noticed some unneeded > allocations happen without extracting any data. > Reproducing code > {code:java} > File tarFile = new File("linux-5.7.1.tar"); > try (TarArchiveInputStream in = new > TarArchiveInputStream(Files.newInputStream(tarFile.toPath( { > TarArchiveEntry entry; > while ((entry = in.getNextTarEntry()) != null) { > } > } > {code} > The measurement was done on Java 11.0.7 with the Java Flight Recorder. > Options used: > -XX:StartFlightRecording=settings=profile,filename=allocations.jfr > Baseline with the current master implementation: > Estimated TLAB allocation: 293MiB > !image-2020-06-21-10-58-07-917.png! > 1. IOUtils.skip -> input.skip(numToSkip) > This delegates in my test scenario to the InputStream.skip implementation > which allocates a new byte[] for every invocation. By simply commenting out > the while loop which calls the skip method the estimated TLAB allocation > drops to 164MiB (-129MiB). > !image-2020-06-21-10-58-43-255.png! > Commenting out the skip call does not seem to be the best solution but it > was quick for me to see how much memory can be saved. Also no unit tests > where failing for me. > 2. TarArchiveInputStream.readRecord > For every read of the record a new byte[] is created. Since the record size > does not change the byte[] can be reused and created when instantiating the > TarStream. This optimization is already present in the > TarArchiveOutputStream. Reusing the buffer reduces the estimated TLAB > allocations further to 128MiB (-36MiB). > !image-2020-06-21-10-59-10-825.png! > I attached the patches I used so the results can be verified. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151216#comment-17151216 ] Peter Lee edited comment on COMPRESS-539 at 7/4/20, 8:01 AM: - IIRC the IOUtils.skip was originally copied from Apache Commons IO(not a maven dependency to keep Compress as simple as we want it). I just checked the IOUtils in Apache Commons IO and found out the skip method looks like this: {code:java} /** * Skips characters from an input character stream. * This implementation guarantees that it will read as many characters * as possible before giving up; this may not always be the case for * skip() implementations in subclasses of {@link Reader}. * * Note that the implementation uses {@link Reader#read(char[], int, int)} rather * than delegating to {@link Reader#skip(long)}. * This means that the method may be considerably less efficient than using the actual skip implementation, * this is done to guarantee that the correct number of characters are skipped. * * * @param input character stream to skip * @param toSkip number of characters to skip. * @return number of characters actually skipped. * @throws IOException if there is a problem reading the file * @throws IllegalArgumentException if toSkip is negative * @see Reader#skip(long) * @see https://issues.apache.org/jira/browse/IO-203;>IO-203 - Add skipFully() method for InputStreams * @since 2.0 */ public static long skip(final Reader input, final long toSkip) throws IOException { if (toSkip < 0) { throw new IllegalArgumentException("Skip count must be non-negative, actual: " + toSkip); } /* * N.B. no need to synchronize this because: - we don't care if the buffer is created multiple times (the data * is ignored) - we always use the same size buffer, so if it it is recreated it will still be OK (if the buffer * size were variable, we would need to synch. to ensure some other thread did not create a smaller one) */ if (SKIP_CHAR_BUFFER == null) { SKIP_CHAR_BUFFER = new char[SKIP_BUFFER_SIZE]; } long remain = toSkip; while (remain > 0) { // See https://issues.apache.org/jira/browse/IO-203 for why we use read() rather than delegating to skip() final long n = input.read(SKIP_CHAR_BUFFER, 0, (int) Math.min(remain, SKIP_BUFFER_SIZE)); if (n < 0) { // EOF break; } remain -= n; } return toSkip - remain; } {code} Not calling inputstream.skip as you did in your patch. I believe this implemention would get the same memory benefit as your patch did. -BTW seems the IOUtils methods in Commons Compress are somehow out of date - maybe we can update them using reference from Commons IO.- Update : checked all the methods in IOUtils of Compress and it seems that only the skip method need to be updated. was (Author: peterlee): IIRC the IOUtils.skip was originally copied from Apache Commons IO(not a maven dependency to keep Compress as simple as we want it). I just checked the IOUtils in Apache Commons IO and found out the skip method looks like this: {code:java} /** * Skips characters from an input character stream. * This implementation guarantees that it will read as many characters * as possible before giving up; this may not always be the case for * skip() implementations in subclasses of {@link Reader}. * * Note that the implementation uses {@link Reader#read(char[], int, int)} rather * than delegating to {@link Reader#skip(long)}. * This means that the method may be considerably less efficient than using the actual skip implementation, * this is done to guarantee that the correct number of characters are skipped. * * * @param input character stream to skip * @param toSkip number of characters to skip. * @return number of characters actually skipped. * @throws IOException if there is a problem reading the file * @throws IllegalArgumentException if toSkip is negative * @see Reader#skip(long) * @see https://issues.apache.org/jira/browse/IO-203;>IO-203 - Add skipFully() method for InputStreams * @since 2.0 */ public static long skip(final Reader input, final long toSkip) throws IOException { if (toSkip < 0) { throw new IllegalArgumentException("Skip count must be non-negative, actual: " + toSkip); } /* * N.B. no need to synchronize this because: - we don't care if the buffer is created multiple times (the data * is ignored) - we always use the same size buffer, so if it it is recreated it will still be OK (if the buffer * size were variable, we would need to synch. to ensure some other thread did not create a smaller one) */ if (SKIP_CHAR_BUFFER == null) { SKIP_CHAR_BUFFER = new char[SKIP_BUFFER_SIZE]; } long remain = toSkip; while (remain > 0) { // See
[jira] [Commented] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151216#comment-17151216 ] Peter Lee commented on COMPRESS-539: IIRC the IOUtils.skip was originally copied from Apache Commons IO(not a maven dependency to keep Compress as simple as we want it). I just checked the IOUtils in Apache Commons IO and found out the skip method looks like this: {code:java} /** * Skips characters from an input character stream. * This implementation guarantees that it will read as many characters * as possible before giving up; this may not always be the case for * skip() implementations in subclasses of {@link Reader}. * * Note that the implementation uses {@link Reader#read(char[], int, int)} rather * than delegating to {@link Reader#skip(long)}. * This means that the method may be considerably less efficient than using the actual skip implementation, * this is done to guarantee that the correct number of characters are skipped. * * * @param input character stream to skip * @param toSkip number of characters to skip. * @return number of characters actually skipped. * @throws IOException if there is a problem reading the file * @throws IllegalArgumentException if toSkip is negative * @see Reader#skip(long) * @see https://issues.apache.org/jira/browse/IO-203;>IO-203 - Add skipFully() method for InputStreams * @since 2.0 */ public static long skip(final Reader input, final long toSkip) throws IOException { if (toSkip < 0) { throw new IllegalArgumentException("Skip count must be non-negative, actual: " + toSkip); } /* * N.B. no need to synchronize this because: - we don't care if the buffer is created multiple times (the data * is ignored) - we always use the same size buffer, so if it it is recreated it will still be OK (if the buffer * size were variable, we would need to synch. to ensure some other thread did not create a smaller one) */ if (SKIP_CHAR_BUFFER == null) { SKIP_CHAR_BUFFER = new char[SKIP_BUFFER_SIZE]; } long remain = toSkip; while (remain > 0) { // See https://issues.apache.org/jira/browse/IO-203 for why we use read() rather than delegating to skip() final long n = input.read(SKIP_CHAR_BUFFER, 0, (int) Math.min(remain, SKIP_BUFFER_SIZE)); if (n < 0) { // EOF break; } remain -= n; } return toSkip - remain; } {code} Not calling inputstream.skip as you did in your patch. I believe this implemention would get the same memory benefit as your patch did. BTW seems the IOUtils methods in Commons Compress are somehow out of date - maybe we can update them using reference from Commons IO. > TarArchiveInputStream allocates a lot of memory when iterating through an > archive > - > > Key: COMPRESS-539 > URL: https://issues.apache.org/jira/browse/COMPRESS-539 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.20 >Reporter: Robin Schimpf >Priority: Major > Attachments: Don't_call_InputStream#skip.patch, > Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, > image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png > > > I iterated through the linux source tar and noticed some unneeded > allocations happen without extracting any data. > Reproducing code > {code:java} > File tarFile = new File("linux-5.7.1.tar"); > try (TarArchiveInputStream in = new > TarArchiveInputStream(Files.newInputStream(tarFile.toPath( { > TarArchiveEntry entry; > while ((entry = in.getNextTarEntry()) != null) { > } > } > {code} > The measurement was done on Java 11.0.7 with the Java Flight Recorder. > Options used: > -XX:StartFlightRecording=settings=profile,filename=allocations.jfr > Baseline with the current master implementation: > Estimated TLAB allocation: 293MiB > !image-2020-06-21-10-58-07-917.png! > 1. IOUtils.skip -> input.skip(numToSkip) > This delegates in my test scenario to the InputStream.skip implementation > which allocates a new byte[] for every invocation. By simply commenting out > the while loop which calls the skip method the estimated TLAB allocation > drops to 164MiB (-129MiB). > !image-2020-06-21-10-58-43-255.png! > Commenting out the skip call does not seem to be the best solution but it > was quick for me to see how much memory can be saved. Also no unit tests > where failing for me. > 2. TarArchiveInputStream.readRecord > For every read of the record a new byte[] is created. Since the record size > does not change the byte[] can be reused and created when instantiating the > TarStream. This optimization is already present in the > TarArchiveOutputStream. Reusing the buffer
[jira] [Commented] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive
[ https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151207#comment-17151207 ] Peter Lee commented on COMPRESS-539: Looks great. Would you create a github PR so the tests could be applied? > TarArchiveInputStream allocates a lot of memory when iterating through an > archive > - > > Key: COMPRESS-539 > URL: https://issues.apache.org/jira/browse/COMPRESS-539 > Project: Commons Compress > Issue Type: Bug >Affects Versions: 1.20 >Reporter: Robin Schimpf >Priority: Major > Attachments: Don't_call_InputStream#skip.patch, > Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, > image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png > > > I iterated through the linux source tar and noticed some unneeded > allocations happen without extracting any data. > Reproducing code > {code:java} > File tarFile = new File("linux-5.7.1.tar"); > try (TarArchiveInputStream in = new > TarArchiveInputStream(Files.newInputStream(tarFile.toPath( { > TarArchiveEntry entry; > while ((entry = in.getNextTarEntry()) != null) { > } > } > {code} > The measurement was done on Java 11.0.7 with the Java Flight Recorder. > Options used: > -XX:StartFlightRecording=settings=profile,filename=allocations.jfr > Baseline with the current master implementation: > Estimated TLAB allocation: 293MiB > !image-2020-06-21-10-58-07-917.png! > 1. IOUtils.skip -> input.skip(numToSkip) > This delegates in my test scenario to the InputStream.skip implementation > which allocates a new byte[] for every invocation. By simply commenting out > the while loop which calls the skip method the estimated TLAB allocation > drops to 164MiB (-129MiB). > !image-2020-06-21-10-58-43-255.png! > Commenting out the skip call does not seem to be the best solution but it > was quick for me to see how much memory can be saved. Also no unit tests > where failing for me. > 2. TarArchiveInputStream.readRecord > For every read of the record a new byte[] is created. Since the record size > does not change the byte[] can be reused and created when instantiating the > TarStream. This optimization is already present in the > TarArchiveOutputStream. Reusing the buffer reduces the estimated TLAB > allocations further to 128MiB (-36MiB). > !image-2020-06-21-10-59-10-825.png! > I attached the patches I used so the results can be verified. -- This message was sent by Atlassian Jira (v8.3.4#803005)