Github user alexbaretta commented on the pull request:
https://github.com/apache/spark/pull/4039#issuecomment-70019159
@squito This is not new functionality for which it would make sense to
write a unit test; it is a hotfix for a bug. I am unfamiliar with this code,
but I understand the core issue well enough: although Int is a subtype of Any,
MutableInt is not a subtype of MutableAny. Casting a val declared as Any to
Int is a type-unsafe operation that can fail hard, but it succeeds when the
payload of the Any val really is an Int. A cast from MutableAny to MutableInt,
by contrast, is simply impossible and will always fail, even when the payload
of the MutableAny is an Int. If you look at the JIRA you will see exactly this
as the cause of failure:
Caused by: java.lang.ClassCastException:
org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to
org.apache.spark.sql.catalyst.expressions.MutableInt
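
To make the distinction concrete, here is a minimal, self-contained Scala sketch. HypMutableValue, HypMutableAny, and HypMutableInt are hypothetical stand-ins for Catalyst's MutableAny/MutableInt (the real classes live in org.apache.spark.sql.catalyst.expressions and may differ in detail); the point is only that sibling classes cannot be cast to one another, while an Any holding an Int can be:

    // Hypothetical stand-ins: siblings under a common trait, not subtypes of each other.
    sealed trait HypMutableValue
    final class HypMutableAny(var value: Any) extends HypMutableValue
    final class HypMutableInt(var value: Int) extends HypMutableValue

    object CastSketch extends App {
      // Casting Any to Int can succeed, because the runtime payload really is an Int.
      val boxed: Any = 42
      println(boxed.asInstanceOf[Int] + 1) // prints 43

      // Casting HypMutableAny to HypMutableInt always fails, even though the
      // wrapped payload is an Int: the two classes are unrelated at runtime.
      val holder: HypMutableValue = new HypMutableAny(42)
      try {
        holder.asInstanceOf[HypMutableInt].value += 1
      } catch {
        case e: ClassCastException => println(s"always fails: $e")
      }
    }

This mirrors the stack trace above: the exception is thrown by the cast itself, regardless of what the MutableAny actually holds.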
Now, a question worth asking the author of this class is why SparkSQL
relies on this type-casting mechanism to parse Parquet files at all. I am
inclined to believe there is a deeper issue here. That said, my patch does
allow my SQL queries against my Parquet dataset to complete successfully
instead of failing.