[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629316#comment-13629316 ]

Hudson commented on HIVE-4078:

Integrated in Hive-trunk-hadoop2 #151 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/151/])
HIVE-4078 : Delay the serialize-deserialize pair in CommonJoinTaskDispatcher (Gopal V via Ashutosh Chauhan) (Revision 1466768)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466768
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java

Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

Key: HIVE-4078
URL: https://issues.apache.org/jira/browse/HIVE-4078
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gopal V
Assignee: Gopal V
Labels: client, perfomance
Fix For: 0.11.0
Attachments: HIVE-4078-20130305.2.patch, HIVE-4078-20130305.patch, HIVE-4078-20130406.patch, HIVE-4078-trunk-rebase.patch

CommonJoinProcessor tries to clone a MapredWork while attempting a conversion to a map-join:

{code}
// deep copy a new mapred work from xml
InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
MapredWork newWork = Utilities.deserializeMapRedWork(in, physicalContext.getConf());
{code}

This is a very heavy operation, both memory-wise and CPU-wise. It would be better to do this only if a conditional task is required, since only that case needs a copy of the task.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
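The deferral proposed in the HIVE-4078 description can be illustrated with a minimal Java sketch. It is not the committed patch: `Plan`, `expensiveDeepCopy`, and `convertToMapJoin` are hypothetical stand-ins for `MapredWork`, the serialize-deserialize round trip, and the conversion step. It assumes only what the issue states, namely that a copy is needed only when a conditional task must keep the original plan as a fallback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: pay for a deep copy only when the original plan must survive. */
public class DeferredCloneSketch {

    /** Stand-in for a query plan; this is not Hive's MapredWork. */
    static final class Plan {
        final List<String> operators;
        Plan(List<String> operators) { this.operators = operators; }
    }

    /** Counts deep copies so the saving is observable. */
    static int copies = 0;

    /** Stand-in for the expensive serialize-deserialize round trip. */
    static Plan expensiveDeepCopy(Plan original) {
        copies++;
        return new Plan(new ArrayList<>(original.operators));
    }

    /**
     * Rewrites JOIN operators to MAPJOIN. A non-conditional conversion edits
     * the original in place; only a conditional conversion works on a copy,
     * leaving the original untouched as the fallback.
     */
    static Plan convertToMapJoin(Plan original, boolean conditional) {
        Plan target = conditional ? expensiveDeepCopy(original) : original;
        target.operators.replaceAll(op -> op.equals("JOIN") ? "MAPJOIN" : op);
        return target;
    }

    public static void main(String[] args) {
        Plan a = new Plan(new ArrayList<>(Arrays.asList("TS", "JOIN", "FS")));
        Plan convertedA = convertToMapJoin(a, false);
        System.out.println("copies=" + copies + " sameObject=" + (convertedA == a));

        Plan b = new Plan(new ArrayList<>(Arrays.asList("TS", "JOIN", "FS")));
        Plan convertedB = convertToMapJoin(b, true);
        System.out.println("copies=" + copies + " original=" + b.operators
                + " converted=" + convertedB.operators);
    }
}
```

The design point is that the copy moves from "always, up front" to "only on the branch that needs reversibility", so the non-conditional path never pays for it.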
[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629362#comment-13629362 ]

Hudson commented on HIVE-4078:

Integrated in Hive-trunk-h0.21 #2056 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2056/])
HIVE-4078 : Delay the serialize-deserialize pair in CommonJoinTaskDispatcher (Gopal V via Ashutosh Chauhan) (Revision 1466768)

Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466768
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java
[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628018#comment-13628018 ]

Ashutosh Chauhan commented on HIVE-4078:

Proceeding with the commit, since it improves on the current state. If there is a better way of doing things, we can explore that in a new JIRA.
[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620685#comment-13620685 ]

Namit Jain commented on HIVE-4078:

[~gopalv], does cloneBean() perform a complete deep copy?
[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620884#comment-13620884 ]

Gopal V commented on HIVE-4078:

No, [~namit], cloneBean() does not perform a proper deep copy. I dropped that approach during the third iteration.

SerializationUtils.clone() from apache-commons does do a deep clone, as does uk.com.robust-it's cloning lib. But those do not work here, because some of the tree items don't implement Serializable or depend on the getter/setter actions.

I updated the patch to skip the serialize/deserialize when the tasks are non-conditional, since in that case the conversion doesn't need to be reversible. That speeds up query27 by avoiding that step, but the conditional map-joins still need to go through the slow serialize/deserialize pair inside the for loop.
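The limitation Gopal V describes for serialization-based cloning can be reproduced with plain Java serialization, which is the mechanism SerializationUtils.clone() relies on. The sketch below is an illustration only: `Node` and `OpaqueItem` are hypothetical stand-ins for plan tree items, not Hive classes, and `deepCopy` is a hand-written round trip rather than the commons-lang call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SerializationCloneSketch {

    /** Hypothetical plan node; Serializable itself, but its children may not be. */
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        List<Object> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Hypothetical tree item that does not implement Serializable. */
    static class OpaqueItem { }

    /** Deep copy through a Java serialization round trip. */
    static <T extends Serializable> T deepCopy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    /** Returns true when the round trip fails on a non-Serializable item. */
    static boolean failsToCopy(Node root) throws IOException, ClassNotFoundException {
        try {
            deepCopy(root);
            return false;
        } catch (NotSerializableException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Node root = new Node("root");
        Node child = new Node("child");
        root.children.add(child);

        // The copy owns a distinct child object, so it is a deep copy.
        Node copy = deepCopy(root);
        System.out.println("distinct child: " + (copy.children.get(0) != child));

        // One non-Serializable item anywhere in the tree breaks the whole copy.
        root.children.add(new OpaqueItem());
        System.out.println("copy fails: " + failsToCopy(root));
    }
}
```

A single non-Serializable item anywhere in the object graph aborts the whole copy, which is consistent with the comment's reason for keeping the existing serialize/deserialize pair for the conditional case.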
[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620992#comment-13620992 ]

Ashutosh Chauhan commented on HIVE-4078:

[~gopalv] Since the other patches on which this patch was based are not progressing, would you like to rebase this on trunk? The gains are impressive and the patch looks benign to me, so let's get this in.
[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621666#comment-13621666 ]

Ashutosh Chauhan commented on HIVE-4078:

All tests passed. [~namit] Would you want to take another look, or shall I proceed with the commit?