[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-05-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998181#comment-13998181
 ] 

Hudson commented on MAPREDUCE-5821:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1779 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1779/])
MAPREDUCE-5821. Avoid unintentional reallocation of byte arrays in segments
during merge. Contributed by Todd Lipcon (cdouglas: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594654)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java


 IFile merge allocates new byte array for every value
 

 Key: MAPREDUCE-5821
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5821
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 2.4.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 3.0.0, 2.5.0, 2.4.1

 Attachments: after-patch.png, before-patch.png, mapreduce-5821.txt, 
 mapreduce-5821.txt


 I wrote a standalone benchmark of the MapOutputBuffer and found that it did a 
 lot of allocations during the merge phase. After looking at an allocation 
 profile, I found that IFile.Reader.nextRawValue() would always allocate a new 
 byte array for every value, so the allocation rate goes way up during the 
 merge phase of the mapper. I imagine this also affects the reducer input, 
 though I didn't profile that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-05-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998219#comment-13998219
 ] 

Hudson commented on MAPREDUCE-5821:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1753/])
MAPREDUCE-5821. Avoid unintentional reallocation of byte arrays in segments
during merge. Contributed by Todd Lipcon (cdouglas: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594654)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java




[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-05-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999011#comment-13999011
 ] 

Hudson commented on MAPREDUCE-5821:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5605/])
MAPREDUCE-5821. Avoid unintentional reallocation of byte arrays in segments
during merge. Contributed by Todd Lipcon (cdouglas: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594654)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java




[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967652#comment-13967652
 ] 

Chris Douglas commented on MAPREDUCE-5821:
--

+1 This looks like the intended behavior from HADOOP-5494

Good catch



[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-08 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963651#comment-13963651
 ] 

Chris Douglas commented on MAPREDUCE-5821:
--

Sure, I can take a look later this week if it can wait.



[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-08 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963703#comment-13963703
 ] 

Todd Lipcon commented on MAPREDUCE-5821:


no rush. It's been there for years, so what's another week? :)



[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960190#comment-13960190
 ] 

Todd Lipcon commented on MAPREDUCE-5821:


The issue is that nextRawValue allocates a new array only if the input buffer 
doesn't have room for the value. But whenever the Merger calls nextRawValue on 
a disk file, it first resets the buffer to {{diskIFileValue.getData()}}, which 
is empty. So at the entry to that function the length is always 0, and the 
code path that reallocates is always taken.
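
To make the pattern concrete, here is a minimal standalone sketch. It does not 
use the actual Hadoop classes: {{Buf}}, {{ReallocSketch}}, 
{{mergeResettingToEmpty}} and {{mergeReusingScratch}} are hypothetical names 
invented for illustration, and the "read" is simulated with Arrays.fill. The 
second loop is only one possible way to keep the grown array and is not 
necessarily what the attached patch does.

```java
import java.util.Arrays;

class ReallocSketch {
    /** Hypothetical stand-in for a resettable input buffer (not Hadoop's DataInputBuffer). */
    static final class Buf {
        private byte[] data = new byte[0];
        private int length = 0;
        byte[] getData() { return data; }
        int getLength() { return length; }
        void reset(byte[] d, int len) { data = d; length = len; }
    }

    /** Counts how many byte arrays the simulated reader has allocated. */
    static int allocations = 0;

    /** Simulated reader: reuses the target's array if it is large enough, else allocates. */
    static void nextRawValue(Buf target, int valueLen) {
        byte[] bytes = target.getData();
        if (bytes.length < valueLen) {
            bytes = new byte[valueLen << 1]; // grow with some headroom
            allocations++;
        }
        Arrays.fill(bytes, 0, valueLen, (byte) 1); // pretend to read the value from disk
        target.reset(bytes, valueLen);
    }

    /** Pattern described above: the value buffer is reset to a scratch buffer that never grows. */
    static int mergeResettingToEmpty(int records, int valueLen) {
        allocations = 0;
        Buf diskScratch = new Buf(); // its array stays at length 0 for the whole loop
        Buf value = new Buf();
        for (int i = 0; i < records; i++) {
            value.reset(diskScratch.getData(), diskScratch.getLength());
            nextRawValue(value, valueLen); // always sees a zero-length array
        }
        return allocations;
    }

    /** Assumed alternative: read into the scratch buffer so that its grown array is retained. */
    static int mergeReusingScratch(int records, int valueLen) {
        allocations = 0;
        Buf diskScratch = new Buf();
        Buf value = new Buf();
        for (int i = 0; i < records; i++) {
            nextRawValue(diskScratch, valueLen); // grows once, then reuses the array
            value.reset(diskScratch.getData(), diskScratch.getLength());
        }
        return allocations;
    }

    public static void main(String[] args) {
        System.out.println("reset-to-empty allocations: " + mergeResettingToEmpty(1000, 64));
        System.out.println("reuse-scratch allocations:  " + mergeReusingScratch(1000, 64));
    }
}
```

With 1000 records of 64 bytes each, the first loop allocates one array per 
record, while the second allocates a single array on the first record and 
reuses it afterwards.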



[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960213#comment-13960213
 ] 

Hadoop QA commented on MAPREDUCE-5821:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12638732/after-patch.png
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4485//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-04 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960314#comment-13960314
 ] 

Karthik Kambatla commented on MAPREDUCE-5821:
-

Patch looks good to me. [~tlipcon] - mind updating the patch to apply against 
trunk? 



[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960824#comment-13960824
 ] 

Hadoop QA commented on MAPREDUCE-5821:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12638796/mapreduce-5821.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4487//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4487//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5821) IFile merge allocates new byte array for every value

2014-04-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960866#comment-13960866
 ] 

Todd Lipcon commented on MAPREDUCE-5821:


No new tests because this is a performance fix.

[~chris.douglas] - if you have a spare minute, want to take a look at this? I 
think you were the one who worked on this area of the code back in 2009.
