[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507756#comment-15507756 ]

Zhiyuan Yang commented on TEZ-3163:
---

Patch looks good to me.

> Reuse and tune Inflaters and Deflaters to speed DME processing
> --
>
> Key: TEZ-3163
> URL: https://issues.apache.org/jira/browse/TEZ-3163
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-3163.1-branch-0.7.patch, TEZ-3163.1.patch, TEZ-3163.2.patch, TEZ-3163.PERF.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506002#comment-15506002 ]

Rajesh Balamohan commented on TEZ-3163:
---

[~jeagles] - Sorry about the delay. Patch LGTM. +1.
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504258#comment-15504258 ]

Hitesh Shah commented on TEZ-3163:
--

\cc [~harishjp]
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504112#comment-15504112 ]

Bikas Saha commented on TEZ-3163:
-

/cc [~hitesh] [~aplusplus]
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15503710#comment-15503710 ]

Jonathan Eagles commented on TEZ-3163:
--

[~gopalv], [~rajesh.balamohan], [~sseth], looking for a reviewer for this JIRA. Anyone with expertise for this patch review?
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475744#comment-15475744 ]

Jonathan Eagles commented on TEZ-3163:
--

[~gopalv], [~rajesh.balamohan], I rebased this patch for 0.9 on the master branch. At this point I haven't switched the compression level, since I think that may be better handled separately. This patch specifically allows for Inflater/Deflater reuse. Let me know what it will take to get this in.
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475053#comment-15475053 ]

TezQA commented on TEZ-3163:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12827637/TEZ-3163.2.patch
against master revision 495e6f0.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1960//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1960//console

This message is automatically generated.
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474828#comment-15474828 ]

Jonathan Eagles commented on TEZ-3163:
--

rebased patch for master
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194327#comment-15194327 ]

Jonathan Eagles commented on TEZ-3163:
--

I have also prototyped a version where the emptyPartitions bitmap is replaced by a roaring bitmap. It is much faster. I'm trying to get a sense of which approach, or combination of approaches, is best here.
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190639#comment-15190639 ]

Gopal V commented on TEZ-3163:
--

[~jeagles]: when compressing bitsets with large runs, I found Deflater.BEST_SPEED + 1 to be a good trade-off (looks for 16 byte sequences) instead of the BEST_COMPRESSION which is built to find 128 byte patterns (the reducer bit-set is usually small).
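
One way to try the level suggested in the comment above is a small comparison like the following. This is only a sketch under stated assumptions: the class `LevelComparisonSketch`, its methods, and the sample bit-set shape are hypothetical and are not taken from the TEZ-3163 patches. It compresses a `BitSet` with long runs at `Deflater.BEST_SPEED + 1` and lets the caller compare against any other level.

```java
import java.io.ByteArrayOutputStream;
import java.util.BitSet;
import java.util.zip.Deflater;

// Hypothetical helper for experimenting with compression levels; not code from the patch.
final class LevelComparisonSketch {
  // The level suggested in the thread: BEST_SPEED is 1, so this is level 2.
  static final int FAST_LEVEL = Deflater.BEST_SPEED + 1;

  // Compresses input as a raw DEFLATE stream (nowrap = true) at the given level.
  static byte[] deflate(byte[] input, int level) {
    Deflater deflater = new Deflater(level, true);
    try {
      deflater.setInput(input);
      deflater.finish();
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[512];
      while (!deflater.finished()) {
        out.write(buf, 0, deflater.deflate(buf));
      }
      return out.toByteArray();
    } finally {
      deflater.end(); // release native memory; this sketch does not reuse the Deflater
    }
  }

  // Builds an assumed "mostly empty partitions" bit-set: every bit set except a gap of three.
  static byte[] sampleBitSet(int partitions) {
    BitSet empty = new BitSet(partitions);
    empty.set(0, partitions / 2);
    empty.set(partitions / 2 + 3, partitions);
    return empty.toByteArray();
  }
}
```

Calling `deflate(sampleBitSet(100_000), FAST_LEVEL)` and `deflate(sampleBitSet(100_000), Deflater.BEST_COMPRESSION)` and comparing the lengths and timings gives a rough feel for the trade-off; real DME payloads may behave differently.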
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190632#comment-15190632 ]

TezQA commented on TEZ-3163:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12792652/TEZ-3163.1.patch
against master revision dbd763f.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1556//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1556//console

This message is automatically generated.
[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing
[ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190419#comment-15190419 ]

Jonathan Eagles commented on TEZ-3163:
--

In a task attempt with 16M DME events, I noticed that there were 35000 Inflater objects in memory (unreachable, finalizing). This patch attempts to reuse the Inflater/Deflater objects, which reduces GC and CPU overhead for large jobs. The PERF patch above has two tests showing the value of reusing the Inflaters/Deflaters. In addition, I added the NOWRAP flag to save 6 bytes per DME message (header and trailer).

References on improvements:
https://github.com/ning/jvm-compressor-benchmark/blob/master/src/main/java/com/ning/jcbm/gzip/JDKGzipDriver.java
http://stackoverflow.com/questions/13059533/how-to-use-java-deflateroutputstream
http://java-performance.info/performance-general-compression/
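
The reuse idea described in the comment above can be sketched roughly as follows. This is not the code from the attached patches: the class `ReusableCodecSketch`, the per-thread caching strategy, and the choice of `BEST_COMPRESSION` are assumptions made for illustration. It keeps one `Deflater` and one `Inflater` per thread, resets them between calls instead of allocating new ones, and constructs both with `nowrap = true` so the zlib header and checksum trailer are omitted.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper illustrating Inflater/Deflater reuse; not the TEZ-3163 implementation.
final class ReusableCodecSketch {
  // One instance per thread, created lazily; nowrap = true produces raw DEFLATE data.
  private static final ThreadLocal<Deflater> DEFLATER =
      ThreadLocal.withInitial(() -> new Deflater(Deflater.BEST_COMPRESSION, true));
  private static final ThreadLocal<Inflater> INFLATER =
      ThreadLocal.withInitial(() -> new Inflater(true));

  static byte[] compress(byte[] input) {
    Deflater deflater = DEFLATER.get();
    deflater.reset(); // clear state left over from the previous call; level and nowrap are kept
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    return out.toByteArray();
  }

  static byte[] decompress(byte[] input) {
    Inflater inflater = INFLATER.get();
    inflater.reset();
    // The Inflater Javadoc asks for one extra dummy byte of input when nowrap is used.
    inflater.setInput(Arrays.copyOf(input, input.length + 1));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buf);
        if (n == 0 && inflater.needsInput()) {
          throw new IllegalArgumentException("truncated DEFLATE stream");
        }
        out.write(buf, 0, n);
      }
    } catch (DataFormatException e) {
      throw new IllegalArgumentException("corrupt DEFLATE stream", e);
    }
    return out.toByteArray();
  }
}
```

Because the instances are reset rather than discarded, repeated calls to `compress` and `decompress` on the same thread do not leave short-lived `Inflater`/`Deflater` objects behind for finalization. A pool shared across threads would be another option; the thread-local form is used here only because it is the shortest to show.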