[jira] [Updated] (HIVE-4885) Alternative object serialization for execution plan in hive testing
[ https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4885:
-----------------------------------

    Resolution: Fixed
        Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Xuefu!

Alternative object serialization for execution plan in hive testing
-------------------------------------------------------------------

                Key: HIVE-4885
                URL: https://issues.apache.org/jira/browse/HIVE-4885
            Project: Hive
         Issue Type: Improvement
         Components: CLI
  Affects Versions: 0.10.0, 0.11.0
           Reporter: Xuefu Zhang
           Assignee: Xuefu Zhang
            Fix For: 0.12.0
        Attachments: HIVE-4885.patch

Currently, many test cases compare execution plans, such as those in the TestParse suite. XMLEncoder is used to serialize the plan generated by Hive and store it in a file for diff-based comparison. However, XMLEncoder's output is tied to the JDK, whose implementation may change from version to version. Thus, upgrading the JDK can produce many spurious test failures. The following is an example of the diff generated when running Hive with JDK7:

{code}
Begin query: case_sensitivity.q
diff -a /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.out /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/parse/case_sensitivity.q.out
diff -a -b /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.xml /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/plan/case_sensitivity.q.xml
3c3
< <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask0">
---
> <object id="MapRedTask0" class="org.apache.hadoop.hive.ql.exec.MapRedTask">
12c12
< <object class="java.util.ArrayList" id="ArrayList0">
---
> <object id="ArrayList0" class="java.util.ArrayList">
14c14
< <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask0">
---
> <object id="MoveTask0" class="org.apache.hadoop.hive.ql.exec.MoveTask">
18c18
< <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask1">
---
> <object id="MoveTask1" class="org.apache.hadoop.hive.ql.exec.MoveTask">
22c22
< <object class="org.apache.hadoop.hive.ql.exec.StatsTask" id="StatsTask0">
---
> <object id="StatsTask0" class="org.apache.hadoop.hive.ql.exec.StatsTask">
60c60
< <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask1">
---
> <object id="MapRedTask1" class="org.apache.hadoop.hive.ql.exec.MapRedTask">
{code}

As can be seen, the only difference is the order of the attributes in the serialized XML document, yet it causes 50+ test failures in Hive. We need a better plan comparison, or a different object serialization, to improve the situation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
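
One way to picture the "better plan comparison" option mentioned in the issue description is to compare the two plan files as parsed XML rather than as text, since attribute order carries no meaning in XML. The following is a minimal sketch only, not the approach taken in HIVE-4885.patch. The class name PlanXmlCompare and its methods are hypothetical, and the sample strings are shortened stand-ins for real plan files. It relies on the standard DOM method Node.isEqualNode, which treats attributes as an unordered set.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Hive: compares two serialized plans
// structurally so that attribute order does not affect the result.
final class PlanXmlCompare {

    // Parses an XML string into a DOM tree and merges adjacent text nodes.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize();
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns true when both documents have equal element trees.
    // Attribute order is ignored; whitespace-only text nodes are NOT ignored,
    // so differently indented files would still compare as different.
    static boolean samePlan(String expectedXml, String actualXml) {
        return parse(expectedXml).getDocumentElement()
                .isEqualNode(parse(actualXml).getDocumentElement());
    }

    public static void main(String[] args) {
        String expected =
            "<object id=\"MapRedTask0\" class=\"org.apache.hadoop.hive.ql.exec.MapRedTask\"/>";
        String actual =
            "<object class=\"org.apache.hadoop.hive.ql.exec.MapRedTask\" id=\"MapRedTask0\"/>";
        System.out.println(samePlan(expected, actual)); // prints "true"
    }
}
```

A textual diff of the same two strings would report a change, which is the failure mode shown in the {code} block of the description. The trade-off is that a structural comparison gives up the readable line-by-line diff on failure, so a test harness would likely still want to print the raw diff for debugging.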
[jira] [Updated] (HIVE-4885) Alternative object serialization for execution plan in hive testing
[ https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-4885:
------------------------------

    Fix Version/s: 0.12.0
           Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-4885) Alternative object serialization for execution plan in hive testing
[ https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-4885:
------------------------------

    Attachment: HIVE-4885.patch