Here’s all I can find related to this idea.
ParquetHiveSerDe is where the raw parquet data is unpacked into readable
POJOs. Everything started with this root array ObjectInspector.
Hi Thai,
Are there any links or examples for achieving this? I don't have much
experience with this.
On Thu, 30 Aug 2018 20:08 Thai Bui, wrote:
> Another option is to implement a custom ParquetInputFormat extending the
> current Hive MR Parquet format and handle schema coercion at the input
>
Another option is to implement a custom ParquetInputFormat extending the
current Hive MR Parquet format and handle schema coercion at the input
split/record reader level. This would be more involved, but guaranteed to
work if you can add auxiliary jars to your Hive cluster.
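To make the record-reader idea concrete: a minimal, self-contained sketch of the kind of coercion such a reader would apply when one file stores the column as a parquet DATE (int32 days since the Unix epoch) and another stores it as a UTF8 string. The class and method names below are illustrative, not Hive or Parquet APIs; only the DATE-as-int32 encoding is taken from the parquet spec.

```java
import java.time.LocalDate;

public class DateCoercion {
    // Parquet's DATE logical type is an int32 counting days since 1970-01-01.
    // Coerce it to the yyyy-MM-dd string form the UTF8-typed files use, so
    // both file variants surface the same Hive column type.
    static String coerceDateToString(int daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch).toString();
    }

    public static void main(String[] args) {
        System.out.println(coerceDateToString(0));     // 1970-01-01
        System.out.println(coerceDateToString(17775)); // 2018-09-01
    }
}
```

A real implementation would do this inside the record reader, after inspecting the per-file parquet footer schema and comparing it to the table schema.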
On Wed, Aug 29, 2018
> Because I believe string should be able to handle integer as well.
No, because it is not a lossless conversion. Comparisons are lost.
"9" > "11", but 9 < 11
Even float -> double breaks comparisons: the widening itself is exact, but
values parsed from the same decimal literal differ by an epsilon.
You can always apply the Hive workaround suggested, otherwise you might find
Hi,
> optional int32 action_date (DATE);
> optional binary action_date (UTF8);
Those two column types aren't implicitly convertible between each other,
which is probably the problem.
In the above statement, are you referring to date/utf-8 or int32/binary?
Because I believe string should be able to
Hi,
> on some days parquet was created by hive 2.1.1 and on some days it was
> created by using glue
…
> After some drill down i saw schema of columns inside both type of parquet
> file using parquet tool and found different data types for some column
...
> optional int32 action_date (DATE);
>
Hi All,
We have a use case where we have created a partitioned external table in Hive
2.3.3 pointing to a parquet location with date-level folders; on some days
the parquet was created by Hive 2.1.1 and on some days it was created by
using Glue. Now, when we are trying to read this data,