[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-20 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507756#comment-15507756
 ] 

Zhiyuan Yang commented on TEZ-3163:
---

Patch looks good to me.

> Reuse and tune Inflaters and Deflaters to speed DME processing
> --
>
> Key: TEZ-3163
> URL: https://issues.apache.org/jira/browse/TEZ-3163
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: TEZ-3163.1-branch-0.7.patch, TEZ-3163.1.patch, 
> TEZ-3163.2.patch, TEZ-3163.PERF.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-20 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506002#comment-15506002
 ] 

Rajesh Balamohan commented on TEZ-3163:
---

[~jeagles] - Sorry about the delay. 

Patch LGTM. +1.  



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-19 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504258#comment-15504258
 ] 

Hitesh Shah commented on TEZ-3163:
--

/cc [~harishjp]



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-19 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504112#comment-15504112
 ] 

Bikas Saha commented on TEZ-3163:
-

/cc [~hitesh] [~aplusplus]



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-19 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15503710#comment-15503710
 ] 

Jonathan Eagles commented on TEZ-3163:
--

[~gopalv], [~rajesh.balamohan], [~sseth], I am looking for a reviewer for this 
JIRA. Does anyone have the expertise to review this patch?



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-08 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475744#comment-15475744
 ] 

Jonathan Eagles commented on TEZ-3163:
--

[~gopalv], [~rajesh.balamohan], I rebased this patch for 0.9 on the master 
branch. At this point I haven't switched the compression level, since I think 
that may be better off as a separate change. This patch specifically allows 
for Inflater/Deflater reuse. Let me know what it will take to get this in.



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-08 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475053#comment-15475053
 ] 

TezQA commented on TEZ-3163:


+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12827637/TEZ-3163.2.patch
  against master revision 495e6f0.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test files.

+1 javac. The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc. There were no new javadoc warning messages.

+1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit. The applied patch does not increase the total number of 
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1960//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1960//console

This message is automatically generated.



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-09-08 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474828#comment-15474828
 ] 

Jonathan Eagles commented on TEZ-3163:
--

Rebased the patch for master.



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-03-14 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194327#comment-15194327
 ] 

Jonathan Eagles commented on TEZ-3163:
--

I have also prototyped a version where the emptyPartitions bitmap is replaced 
by a roaring bitmap. It is much faster. I am trying to get a sense of which 
approach, or combination of approaches, is best here.



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-03-11 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190639#comment-15190639
 ] 

Gopal V commented on TEZ-3163:
--

[~jeagles]: when compressing bitsets with large runs, I found 
Deflater.BEST_SPEED + 1 to be a good trade-off (looks for 16 byte sequences) 
instead of the BEST_COMPRESSION which is built to find 128 byte patterns (the 
reducer bit-set is usually small).
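
One way to check this trade-off locally is to compress the same run-heavy bitset 
at both levels and compare the output sizes. The following is only a sketch: the 
class name LevelComparison, the helper names, and the sample bitset shape are 
hypothetical and are not taken from the Tez patch. It uses only 
java.util.zip.Deflater and java.util.BitSet.

```java
import java.util.BitSet;
import java.util.zip.Deflater;

final class LevelComparison {
    // Compresses the input at the given level (raw DEFLATE, no zlib wrapper)
    // and returns only the number of compressed bytes produced.
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level, true);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[4096];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release native memory promptly
        }
    }

    // Hypothetical sample: a bitset that is set everywhere except for a
    // 10-bit gap in the middle, so the serialized bytes contain long runs.
    static byte[] sampleBitSet(int partitions) {
        BitSet bits = new BitSet(partitions);
        bits.set(0, partitions / 2);
        bits.set(partitions / 2 + 10, partitions);
        return bits.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bits = sampleBitSet(100_000);
        System.out.println("raw bytes:        " + bits.length);
        System.out.println("BEST_SPEED + 1:   "
                + compressedSize(bits, Deflater.BEST_SPEED + 1));
        System.out.println("BEST_COMPRESSION: "
                + compressedSize(bits, Deflater.BEST_COMPRESSION));
    }
}
```

Timing the two calls on representative DME payloads would be needed to confirm 
the speed side of the trade-off; this sketch only reports sizes.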



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-03-11 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190632#comment-15190632
 ] 

TezQA commented on TEZ-3163:


+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12792652/TEZ-3163.1.patch
  against master revision dbd763f.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test files.

+1 javac. The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc. There were no new javadoc warning messages.

+1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit. The applied patch does not increase the total number of 
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1556//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1556//console

This message is automatically generated.



[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

2016-03-10 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190419#comment-15190419
 ] 

Jonathan Eagles commented on TEZ-3163:
--

In a task attempt with 16M DME events, I noticed that there were 35000 Inflater 
objects in memory (unreachable, finalizing). This patch reuses the 
Inflater/Deflater objects, which reduces GC and CPU for large jobs. The PERF 
patch above has two tests showing the value of reusing the 
Inflaters/Deflaters. In addition, I added the NOWRAP flag to save 6 bytes per 
DME message (header and trailer).

References on improvements
https://github.com/ning/jvm-compressor-benchmark/blob/master/src/main/java/com/ning/jcbm/gzip/JDKGzipDriver.java
http://stackoverflow.com/questions/13059533/how-to-use-java-deflateroutputstream
http://java-performance.info/performance-general-compression/
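
The reuse idea described above can be sketched roughly as follows. This is not 
the code from the attached patches: the class name ReusableCodec and its methods 
are hypothetical, and the sketch assumes single-threaded use (one codec per 
thread). It relies only on java.util.zip.Deflater and java.util.zip.Inflater, 
calling reset() between messages instead of allocating new instances, and 
passing nowrap=true so the 2-byte zlib header and 4-byte Adler-32 trailer are 
omitted.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class ReusableCodec {
    // nowrap=true produces and consumes raw DEFLATE data without the
    // zlib header and checksum trailer.
    private final Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    private final Inflater inflater = new Inflater(true);
    private final byte[] buffer = new byte[4096];

    byte[] compress(byte[] input) {
        deflater.reset(); // reuse the same native state for every message
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    byte[] decompress(byte[] input) {
        inflater.reset();
        // The Inflater Javadoc asks for one extra "dummy" input byte when
        // nowrap is used; the padding byte is ignored after the stream ends.
        inflater.setInput(Arrays.copyOf(input, input.length + 1));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        }
        return out.toByteArray();
    }

    // Frees the native zlib memory once the codec is no longer needed.
    void close() {
        deflater.end();
        inflater.end();
    }
}
```

A caller would keep one ReusableCodec alive for the duration of event 
processing, call compress() or decompress() per message, and call close() at the 
end, rather than letting thousands of short-lived Inflater objects wait for 
finalization.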
