mdayakar commented on code in PR #6229:
URL: https://github.com/apache/hive/pull/6229#discussion_r2606665288
##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java:
##########
@@ -143,7 +147,14 @@ public Writable serialize(final Object obj, final ObjectInspector objInspector)
}
parquetRow.value = obj;
- parquetRow.inspector= (StructObjectInspector)objInspector;
+ // The 'objInspector' coming from Operator may have different type infos than table column type infos which will lead to the issues like HIVE-26877
+ // so comparing the object inspector created during initialize phase of this SerDe class and the object inspector coming from Operator
+ // if they are different then using the object inspector created during initialize phase which is proper
+ if (!ObjectInspectorUtils.compareTypes(writableObjectInspector, objInspector)) {
+ parquetRow.inspector = (StructObjectInspector) writableObjectInspector;
+ } else {
+ parquetRow.inspector = (StructObjectInspector) objInspector;
+ }
Review Comment:
Yes, my initial commit (commit 1) actually had the same solution, but it
caused 53 test cases to fail. Analyzing those failures showed that when the
data comes from a TEXT format table, the ObjectInspector coming from the
Operator is of type `Lazy*ObjectInspector` (for some types such as Int and
String), whereas the ObjectInspector created during the initialization of the
ParquetHiveSerDe object is of type `Writable*ObjectInspector`.
For example, consider the string data type: the ObjectInspector coming from
the Operator is of type `LazyStringObjectInspector`, which maintains the
corresponding primitive Java object as `LazyString`, whereas
`WritableStringObjectInspector` maintains the primitive Java object as `Text`.
Reading a `LazyString` value through the `Writable*` inspector therefore
results in a ClassCastException while getting the actual data.
So we cannot always use the ObjectInspector created during the initialization
phase.
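The mismatch described above can be sketched in isolation. This is only an
illustration, not the Hive code: `TextLike`, `LazyStringLike`, and the two
`*InspectorLike` classes are hypothetical stand-ins that mimic how each
inspector casts the row value to the class it expects.

```java
// Simplified stand-ins, NOT the real Hive classes. They only mimic the
// mismatch between an inspector and the runtime class of the row value.
public class InspectorMismatchSketch {

    // Hypothetical stand-in for the writable string object (Text).
    static final class TextLike {
        final String value;
        TextLike(String value) { this.value = value; }
    }

    // Hypothetical stand-in for the lazy string object (LazyString).
    static final class LazyStringLike {
        final TextLike data;
        LazyStringLike(String value) { this.data = new TextLike(value); }
    }

    interface StringInspectorLike {
        String getPrimitiveJavaObject(Object o);
    }

    // Mimics a Writable* inspector: assumes the value is a TextLike.
    static final class WritableStringInspectorLike implements StringInspectorLike {
        @Override
        public String getPrimitiveJavaObject(Object o) {
            return ((TextLike) o).value;
        }
    }

    // Mimics a Lazy* inspector: assumes the value is a LazyStringLike.
    static final class LazyStringInspectorLike implements StringInspectorLike {
        @Override
        public String getPrimitiveJavaObject(Object o) {
            return ((LazyStringLike) o).data.value;
        }
    }

    // Reads the value through the given inspector and reports a failed cast.
    static String describe(StringInspectorLike inspector, Object rowValue) {
        try {
            return inspector.getPrimitiveJavaObject(rowValue);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // Row value as it would arrive from a TEXT format table.
        Object fromTextTable = new LazyStringLike("hello");

        // Matching inspector: prints "hello".
        System.out.println(describe(new LazyStringInspectorLike(), fromTextTable));

        // Inspector built at initialization, forced onto the same value:
        // prints "ClassCastException".
        System.out.println(describe(new WritableStringInspectorLike(), fromTextTable));
    }
}
```

In this sketch the inspector has to match the runtime class of the row value,
which is why the patch falls back to `writableObjectInspector` only when
`compareTypes` reports a type difference.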
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]