[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120715#comment-16120715 ]

TezQA commented on TEZ-3813:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12881081/TEZ-3813.006.patch
against master revision 8dcf8a1.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in :
org.apache.tez.dag.app.rm.TestTaskScheduler

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2609//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2609//console

This message is automatically generated.

> Reduce Object size of MemoryFetchedInput for large jobs
> ---
>
> Key: TEZ-3813
> URL: https://issues.apache.org/jira/browse/TEZ-3813
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Muhammad Samir Khan
> Assignee: Muhammad Samir Khan
> Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch,
> TEZ-3813.003.patch, TEZ-3813.004.patch, TEZ-3813.005.patch, TEZ-3813.006.patch
>
> Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a
> BoundedByteArrayOutputStream that is not used (only the underlying byte[] is
> used).

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120652#comment-16120652 ]

TezQA commented on TEZ-3813:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12881063/TEZ-3813.005.patch
against master revision 8dcf8a1.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The following test timeouts occurred in :
org.apache.tez.test.TestTezJobs

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2607//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2607//console

This message is automatically generated.
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120590#comment-16120590 ]

Jonathan Eagles commented on TEZ-3813:

A couple of minor nits.

It seems we can remove this commented-out code:
{code:title=FetchedInput.java}
+// public long getActualSize() {
+//return this.actualSize;
+// }
+//
+// public long getCompressedSize() {
+//return this.compressedSize;
+// }
{code}

We should add \@Override to this method and to the others that override getSize:
{code:title=MemoryFetchedInput.java}
public long getSize()
{code}
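The reasoning behind the \@Override nit can be shown with a minimal sketch. The class names below are hypothetical stand-ins, not the Tez classes; the only assumption carried over from the review is that a subclass overrides getSize().

```java
/** Hypothetical base class; only the getSize() contract matters for this sketch. */
abstract class FetchedInputSketch {
    abstract long getSize();
}

/** Hypothetical in-memory subclass that reports the length of its backing array. */
final class MemoryFetchedInputSketch extends FetchedInputSketch {
    private final byte[] bytes;

    MemoryFetchedInputSketch(int length) {
        this.bytes = new byte[length];
    }

    // With @Override, a misspelled name or a changed parameter list becomes a
    // compile error instead of silently declaring a new, unrelated method.
    @Override
    long getSize() {
        return bytes.length;
    }
}

final class OverrideDemo {
    /** Calls getSize() through the base type, so dispatch relies on the override. */
    static long sizeViaBaseType(int length) {
        FetchedInputSketch input = new MemoryFetchedInputSketch(length);
        return input.getSize();
    }

    public static void main(String[] args) {
        System.out.println(sizeViaBaseType(16));
    }
}
```

Without the annotation the code above behaves the same; the annotation only adds a compile-time check that the method really overrides something.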
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120166#comment-16120166 ]

Jonathan Eagles commented on TEZ-3813:

[~samirkhan], can we try removing the MemoryFetchedInput#size member? That would bring this object down by one more 8-byte alignment boundary. We will have to avoid the null pointer exception in SimpleFetchedInputAllocator#cleanup, perhaps by just moving byteArray = null; below the notifyFreedResource call.
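The suggestion above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch: the interface and class names are hypothetical, and the only ideas taken from the comment are that the size is derived from the byte array rather than stored in its own field, and that the array is nulled only after the freed-resource notification has run.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the allocator callback; not the Tez interface. */
interface FreedResourceListener {
    void notifyFreedResource(InMemoryInputSketch input);
}

/** Hypothetical in-memory input that keeps no separate size field. */
final class InMemoryInputSketch {
    private byte[] byteArray;
    private final FreedResourceListener listener;

    InMemoryInputSketch(int size, FreedResourceListener listener) {
        this.byteArray = new byte[size];
        this.listener = listener;
    }

    /** Derived from the array, so it is only valid while the array is still held. */
    long getSize() {
        return byteArray.length;
    }

    /** Notify first, because the listener reads getSize(); then drop the array. */
    void free() {
        listener.notifyFreedResource(this);
        byteArray = null;
    }

    boolean isReleased() {
        return byteArray == null;
    }
}

final class InMemoryInputDemo {
    /** Frees one input and returns the number of bytes the listener saw released. */
    static long freeAndReport(int size) {
        AtomicLong released = new AtomicLong();
        InMemoryInputSketch input =
            new InMemoryInputSketch(size, freed -> released.addAndGet(freed.getSize()));
        input.free();
        return released.get();
    }

    public static void main(String[] args) {
        System.out.println(freeAndReport(1024));
    }
}
```

If the two statements in free() were swapped, the listener's call to getSize() would dereference a null array, which is the null pointer exception the comment warns about.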
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118859#comment-16118859 ]

TezQA commented on TEZ-3813:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12880868/TEZ-3813.004.patch
against master revision 8dcf8a1.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in :
org.apache.tez.client.TestTezClient

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2604//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2604//console

This message is automatically generated.
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117457#comment-16117457 ]

TezQA commented on TEZ-3813:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12880711/TEZ-3813.003.patch
against master revision 8dcf8a1.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2603//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2603//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2603//console

This message is automatically generated.
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117354#comment-16117354 ]

TezQA commented on TEZ-3813:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12880696/TEZ-3813.002.patch
against master revision 8dcf8a1.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2602//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2602//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2602//console

This message is automatically generated.
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114738#comment-16114738 ]

Jonathan Eagles commented on TEZ-3813:

[~samirkhan], can you fix the findbugs warning? I'm not sure whether the exception is already present in the findbugs exception file or missing from it. There is some extra code, since getOutputStream is only used for Type.DISK and never for Type.MEMORY, but it does no harm. If you have time you can refactor that, but it is fine the way it is.
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113843#comment-16113843 ]

TezQA commented on TEZ-3813:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12880300/TEZ-3813.001.patch
against master revision 614937c.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2599//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2599//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2599//console

This message is automatically generated.
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113454#comment-16113454 ]

Muhammad Samir Khan commented on TEZ-3813:

Tested with filterLinesByWord and compared output before and after.
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113451#comment-16113451 ]

Muhammad Samir Khan commented on TEZ-3813:

*JOL Dump:*

+Before:+

Internals:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput object internals:
 OFFSET  SIZE  TYPE                                                               DESCRIPTION                         VALUE
      0     4                                                                     (object header)                     01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4                                                                     (object header)                     00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4                                                                     (object header)                     7a 12 01 f8 (01111010 00010010 00000001 11111000) (-134147462)
     12     4  int                                                                FetchedInput.id                     0
     16     8  long                                                               FetchedInput.actualSize             0
     24     8  long                                                               FetchedInput.compressedSize         0
     32     4  org.apache.tez.runtime.library.common.InputAttemptIdentifier       FetchedInput.inputAttemptIdentifier null
     36     4  org.apache.tez.runtime.library.common.shuffle.FetchedInput.Type    FetchedInput.type                   (object)
     40     4  org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback FetchedInput.callback               null
     44     4  org.apache.tez.runtime.library.common.shuffle.FetchedInput.State   FetchedInput.state                  (object)
     48     4  org.apache.hadoop.io.BoundedByteArrayOutputStream                  MemoryFetchedInput.byteStream       (object)
     52     4                                                                     (loss due to the next object alignment)
Instance size: 56 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
{code}

Footprint:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput@215be6bbd footprint:
     COUNT       AVG       SUM   DESCRIPTION
         1        16        16   [B
         2        32        64   [C
         2        24        48   java.lang.String
         1        32        32   org.apache.hadoop.io.BoundedByteArrayOutputStream
         1        24        24   org.apache.tez.runtime.library.common.shuffle.FetchedInput$State
         1        24        24   org.apache.tez.runtime.library.common.shuffle.FetchedInput$Type
         1        56        56   org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput
         9                 264   (total)
{code}

+After:+

Internals:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput object internals:
 OFFSET  SIZE  TYPE                                                               DESCRIPTION                         VALUE
      0     4                                                                     (object header)                     01 00 00 00 (00000001 00000000 00000000 00000000) (1)