[
https://issues.apache.org/jira/browse/IMPALA-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316243#comment-17316243
]
Gabor Szadovszky commented on IMPALA-7427:
------------------------------------------
[~boroknagyz], as you correctly described, this field is useful when the
writer library (e.g. parquet-mr) differs from the component that actually
handles the data (e.g. Hive, Spark). We created this new field because we
ran into problems when trying to work around defects where the written
Parquet file was wrong only when a specific component had written the
data. So far, I am not aware of any such situation since we added this field.
Anyway, in the case of Impala, we can easily distinguish the files it creates
from those of any other library/component combination, so I am fine with Impala
writing only the created_by field and not this one.
> Write Impala version information to writer.model.name footer field of Parquet
> -----------------------------------------------------------------------------
>
> Key: IMPALA-7427
> URL: https://issues.apache.org/jira/browse/IMPALA-7427
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Zoltan Ivanfi
> Assignee: Amogh Margoor
> Priority: Minor
> Labels: newbie, parquet, ramp-up
>
> PARQUET-352 added support for the "writer.model.name" property in the Parquet
> metadata to identify the object model (application) that wrote the file.