soumyakanti3578 commented on code in PR #5131:
URL: https://github.com/apache/hive/pull/5131#discussion_r1576566058
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:
##########
@@ -1745,9 +1748,72 @@ public RelNode apply(RelOptCluster cluster, RelOptSchema relOptSchema, SchemaPlu
       if (LOG.isDebugEnabled()) {
         LOG.debug("Plan after post-join transformations:\n" + RelOptUtil.toString(calcitePlan));
       }
+      perfLogger.perfLogEnd(this.getClass().getName(), PerfLogger.OPTIMIZER);
+
+      if (conf.getBoolVar(ConfVars.TEST_CBO_PLAN_SERIALIZATION_DESERIALIZATION_ENABLED)) {
+        calcitePlan = testSerializationAndDeserialization(perfLogger, calcitePlan);
+      }
+
+      return calcitePlan;
+    }
+
+    @Nullable
+    private RelNode testSerializationAndDeserialization(PerfLogger perfLogger, RelNode calcitePlan) {
+      if (!isSerializable(calcitePlan)) {
+        return calcitePlan;
+      }
+      perfLogger.perfLogBegin(this.getClass().getName(), "plan serializer");
+      String calcitePlanJson = serializePlan(calcitePlan);
+      perfLogger.perfLogEnd(this.getClass().getName(), "plan serializer");
+
+      if (stringSizeGreaterThan(calcitePlanJson, PLAN_SERIALIZATION_DESERIALIZATION_STR_SIZE_LIMIT)) {
+        return calcitePlan;
+      }
+
+      if (LOG.isDebugEnabled()) {
+        LOG.debug("Size of calcite plan: {}", calcitePlanJson.getBytes(Charset.defaultCharset()).length);
+        LOG.debug("JSON plan: \n{}", calcitePlanJson);
+      }
+
+      try {
+        perfLogger.perfLogBegin(this.getClass().getName(), "plan deserializer");
+        RelNode fromJson = deserializePlan(calcitePlan.getCluster(), calcitePlanJson);
+        perfLogger.perfLogEnd(this.getClass().getName(), "plan deserializer");
+
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Base plan: \n{}", RelOptUtil.toString(calcitePlan));
+          LOG.debug("Plan from JSON: \n{}", RelOptUtil.toString(fromJson));
+        }
+
+        calcitePlan = fromJson;
+      } catch (IOException e) {
+        throw new RuntimeException(e);
+      }
+
+
Review Comment:
The only reason the method `testSerializationAndDeserialization` lives here
rather than in a unit test is so that integration tests can exercise it with
just a property set from the qfile. My idea was to enable
`TEST_CBO_PLAN_SERIALIZATION_DESERIALIZATION_ENABLED` either in each
individual qfile or in `hive-site.xml` for a whole driver such as
`TestTezTPCDS30TBPerfCliDriver`.
That way the serialization round trip is tested against a larger and more
diverse set of queries, instead of only a few scenarios in unit tests.
Let me know if you would still like me to add a unit test for this and remove
the method from here, and I will do it. :)