Github user AndreSchumacher commented on the pull request:
https://github.com/apache/spark/pull/934#issuecomment-45002954
Parquet's `FIXED_LEN_BYTE_ARRAY` should not be as bad as arbitrary-length
arrays (since the length is the same for all "rows" and it does not require
nesting in the conversion), but yeah, I see your point.
I checked Hive's type conversion, and it turns out it maps its ints, shorts,
and bytes all to `INT32`. I could not immediately figure out what Impala does,
but it seems to use Thrift, and at least the parquet-thrift converters do the
same. So I will change it as you suggested.
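To make the change concrete, the mapping under discussion could be sketched roughly like this (a hedged Java illustration, not Spark's actual converter; the type names and the `toParquetPrimitive` helper are made up for the example). The point is that byte, short, and int all collapse onto Parquet's `INT32` primitive, matching what Hive and parquet-thrift do:

```java
import java.util.Map;

public class ParquetTypeMapping {
    // Hypothetical SQL-type-name -> Parquet primitive type table.
    // Parquet has no 8-bit or 16-bit primitives, so bytes and shorts
    // are widened to INT32, as in Hive's conversion.
    static final Map<String, String> SQL_TO_PARQUET = Map.of(
        "ByteType", "INT32",    // widened
        "ShortType", "INT32",   // widened
        "IntegerType", "INT32",
        "LongType", "INT64",
        "FloatType", "FLOAT",
        "DoubleType", "DOUBLE",
        "BooleanType", "BOOLEAN",
        "StringType", "BINARY"  // variable-length byte array
    );

    public static String toParquetPrimitive(String sqlType) {
        String parquetType = SQL_TO_PARQUET.get(sqlType);
        if (parquetType == null) {
            throw new IllegalArgumentException("Unsupported type: " + sqlType);
        }
        return parquetType;
    }

    public static void main(String[] args) {
        // Bytes, shorts, and ints all map to the same primitive.
        System.out.println(toParquetPrimitive("ByteType"));    // INT32
        System.out.println(toParquetPrimitive("ShortType"));   // INT32
        System.out.println(toParquetPrimitive("IntegerType")); // INT32
    }
}
```

Readers would then rely on Parquet's converted/logical types (rather than distinct primitives) to recover the original narrower width if they need it.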