[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629316#comment-13629316
 ] 

Hudson commented on HIVE-4078:
--

Integrated in Hive-trunk-hadoop2 #151 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/151/])
HIVE-4078 : Delay the serialize-deserialize pair in 
CommonJoinTaskDispatcher (Gopal V via Ashutosh Chauhan) (Revision 1466768)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466768
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java


 Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
 

 Key: HIVE-4078
 URL: https://issues.apache.org/jira/browse/HIVE-4078
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Gopal V
Assignee: Gopal V
  Labels: client, perfomance
 Fix For: 0.11.0

 Attachments: HIVE-4078-20130305.2.patch, HIVE-4078-20130305.patch, 
 HIVE-4078-20130406.patch, HIVE-4078-trunk-rebase.patch


 CommonJoinProcessor tries to clone a MapredWork while attempting a conversion 
 to a map-join:
 {code}
   // deep copy a new mapred work from xml
   InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
   MapredWork newWork = Utilities.deserializeMapRedWork(in, 
 physicalContext.getConf());
 {code}
 which is a very heavy operation, both memory-wise and CPU-wise.
 It would be better to do this only if a conditional task is required, 
 resulting in a copy of the task.
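
The idea of copying only when a conditional task is required can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `Work`, `deepCopy` and `workToConvert` are hypothetical names, and the cheap list copy merely stands in for the expensive serialize/deserialize round trip described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LazyCloneSketch {
    /** Hypothetical stand-in for a plan object; this is not Hive's MapredWork. */
    static class Work {
        final List<String> aliases;
        Work(List<String> aliases) { this.aliases = aliases; }
    }

    /** Counts deep copies so that the saving is observable. */
    static int copies = 0;

    /** Stand-in (assumption) for the expensive serialize/deserialize pair. */
    static Work deepCopy(Work original) {
        copies++;
        return new Work(new ArrayList<>(original.aliases));
    }

    /**
     * Returns the work object that a conversion attempt may modify.
     * A conditional task needs its own copy of the plan; a non-conditional
     * conversion does not have to be reversible and can reuse the original.
     */
    static Work workToConvert(Work current, boolean conditional) {
        return conditional ? deepCopy(current) : current;
    }

    public static void main(String[] args) {
        Work plan = new Work(new ArrayList<>(Arrays.asList("a", "b")));
        Work inPlace = workToConvert(plan, false); // no copy made
        Work copied = workToConvert(plan, true);   // one copy made
        System.out.println("reused original: " + (inPlace == plan));
        System.out.println("separate copy:   " + (copied != plan));
        System.out.println("deep copies:     " + copies);
    }
}
```

In this sketch the cost of the copy is paid once per conditional conversion and never for a non-conditional one.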

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629362#comment-13629362
 ] 

Hudson commented on HIVE-4078:
--

Integrated in Hive-trunk-h0.21 #2056 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2056/])
HIVE-4078 : Delay the serialize-deserialize pair in 
CommonJoinTaskDispatcher (Gopal V via Ashutosh Chauhan) (Revision 1466768)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466768
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java



[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-04-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628018#comment-13628018
 ] 

Ashutosh Chauhan commented on HIVE-4078:


Proceeding with the commit, since it improves on the current state; if there 
is a better way of doing this, we can explore it in a new JIRA.


[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-04-03 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620685#comment-13620685
 ] 

Namit Jain commented on HIVE-4078:
--

[~gopalv], does cloneBean() perform a complete deep copy?


[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-04-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620884#comment-13620884
 ] 

Gopal V commented on HIVE-4078:
---

No, [~namit], cloneBean() does not perform a proper deep copy. I dropped 
that approach during the third iteration.

SerializationUtils.clone() from Apache Commons does perform a deep clone, as 
does uk.com.robust-it's cloning library. But neither works here, because some 
of the tree items don't implement Serializable or depend on the getter/setter 
actions.

I updated the patch to not do a serialize/deserialize when the tasks are 
non-conditional, since the conversion doesn't need to be reversible.

That speeds up query27 by avoiding that step, but the conditional map-joins 
still need to go through the slow serialize/deserialize pair inside the for 
loop.
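
The Serializable limitation mentioned above can be reproduced in isolation. The following is only a sketch using plain java.io serialization, which is the mechanism a serialization-based deep clone relies on; `PlanNode`, `OpaqueNode`, `deepClone` and `canDeepClone` are hypothetical names for illustration and are not Hive or library classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class SerializationCloneSketch {
    /** Hypothetical tree node that opts in to Java serialization. */
    static class PlanNode implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final List<Object> children = new ArrayList<>();
        PlanNode(String name) { this.name = name; }
    }

    /** Hypothetical tree item that does not implement Serializable. */
    static class OpaqueNode {
        final String name;
        OpaqueNode(String name) { this.name = name; }
    }

    /** Deep clone by writing the object graph to bytes and reading it back. */
    static <T extends Serializable> T deepClone(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    /** Returns false when some object in the graph is not Serializable. */
    static boolean canDeepClone(Serializable root) {
        try {
            deepClone(root);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PlanNode root = new PlanNode("root");
        root.children.add(new PlanNode("child"));
        System.out.println("all serializable: " + canDeepClone(root));
        root.children.add(new OpaqueNode("opaque"));
        System.out.println("with opaque node: " + canDeepClone(root));
    }
}
```

A single non-serializable item anywhere in the tree is enough to make the whole clone fail, which matches the reason given above for not using this approach.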


[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-04-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620992#comment-13620992
 ] 

Ashutosh Chauhan commented on HIVE-4078:


[~gopalv] Since the other patches on which this patch was based are not 
progressing, would you like to rebase this on trunk? The gains are impressive 
and the patch looks benign to me, so let's get this in.


[jira] [Commented] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-04-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621666#comment-13621666
 ] 

Ashutosh Chauhan commented on HIVE-4078:


All tests passed. [~namit] Would you want to take another look or shall I 
proceed with commit?
