[
https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064576#comment-14064576
]
Cheng Hao commented on SPARK-2523:
----------------------------------
I think the root cause is that altering a table to use a different SerDe does
not affect the existing partitions, but the table and all subsequent partitions
inherit the new SerDe, so partitions added afterward end up with a different
SerDe from the older ones.
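To make that inheritance behaviour concrete, here is a minimal toy model in Python. It is only a sketch of the metadata bookkeeping described above, not the Hive metastore API: the `Table` class, its methods, and the partition specs are hypothetical, and the SerDe names are used purely as labels.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Table:
    """Hypothetical stand-in for table metadata; not a Hive class."""
    serde: str
    # Maps a partition spec to the SerDe name recorded for that partition.
    partitions: Dict[str, str] = field(default_factory=dict)

    def add_partition(self, spec: str) -> None:
        # A new partition inherits whatever SerDe the table has right now.
        self.partitions[spec] = self.serde

    def alter_serde(self, new_serde: str) -> None:
        # Only the table-level SerDe changes; existing partitions are untouched.
        self.serde = new_serde


table = Table(serde="LazySimpleSerDe")
table.add_partition("ds=2014-07-01")   # created before the ALTER
table.alter_serde("ColumnarSerDe")     # models ALTER TABLE with a new SerDe
table.add_partition("ds=2014-07-02")   # created after the ALTER

for spec, serde in table.partitions.items():
    print(spec, serde)
```

In this model the older partition keeps its original SerDe while the table and the newer partition carry the new one, which is the mismatch the comment describes.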
The original implementation of HiveTableScan gets the ObjectInspectors from the
TableDesc, which is not correct for the existing partitions in this case.
This PR solves that by using the partition desc for SerDe instantiation and
converting each record into a Catalyst MutableRow directly while scanning the
partition. I think it's more straightforward, as we don't need the
ObjectInspector for the downstream operators. (In Shark we do have to provide a
uniform ObjectInspector for downstream operators, which is why we have to
serialize the row and then deserialize it again in
[PR1390|https://github.com/apache/spark/pull/1390].)
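The idea of choosing the deserializer per partition and emitting one common row type can be sketched as follows. This is an illustrative Python model under stated assumptions, not the code from the PR: the `csv`/`tsv` formats, the `DESERIALIZERS` registry, and the `scan` function are all hypothetical stand-ins for Hive SerDes and HiveTableScan.

```python
from typing import Callable, Dict, List, Tuple

# Common row type handed to downstream operators (stand-in for a Catalyst row).
Row = Tuple[str, int]


def parse_csv(raw: str) -> Row:
    name, count = raw.split(",")
    return (name, int(count))


def parse_tsv(raw: str) -> Row:
    name, count = raw.split("\t")
    return (name, int(count))


# Hypothetical registry standing in for SerDe instantiation by name.
DESERIALIZERS: Dict[str, Callable[[str], Row]] = {
    "csv": parse_csv,
    "tsv": parse_tsv,
}


def scan(partitions: List[Tuple[str, List[str]]]) -> List[Row]:
    """Deserialize each partition with its own format and return uniform rows."""
    rows: List[Row] = []
    for serde_name, raw_records in partitions:
        # The deserializer is looked up per partition, not once per table.
        deserialize = DESERIALIZERS[serde_name]
        rows.extend(deserialize(record) for record in raw_records)
    return rows


# An older partition written as CSV and a newer one written as TSV.
partitions = [("csv", ["a,1", "b,2"]), ("tsv", ["c\t3"])]
rows = scan(partitions)
```

Because every record is converted to the common row type during the scan, downstream code in this sketch never needs to know which format a partition used. Applying the table-level deserializer (`parse_tsv` here) to the older CSV records would instead raise an error, loosely analogous to the ClassCastException reported in the issue.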
> Potential Bugs if SerDe is not the identical among partitions and table
> -----------------------------------------------------------------------
>
> Key: SPARK-2523
> URL: https://issues.apache.org/jira/browse/SPARK-2523
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Cheng Hao
>
> In HiveTableScan.scala, an ObjectInspector was created for all of the
> partition-based records, which probably causes a ClassCastException if the
> object inspector is not identical across the table and its partitions.