aniketnanna commented on issue #7191:
URL: https://github.com/apache/hudi/issues/7191#issuecomment-1313453089
We have 3 issues to solve as mentioned above:
1. Missing data
2. DDL Change
3. Upgrade to a new version
- 1st issue is more specific to AWS with Hudi.
- 2nd and 3rd issues are more specific to Hudi.
### 1. Missing data:
a. This issue is related to Athena. We have contacted AWS Support about it.
b. The support engineer found an error: "can not create year partitions
from string".
This error occurred only for a few records in a few tables.
c. We added the following parameter from the Hudi documentation to the Glue job:
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
(reference: https://hudi.apache.org/docs/syncing_aws_glue_data_catalog/)
d. Current status:
For a few records, partitions are still not added to the table
even though the data is written to S3 under the respective partition paths.
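
One way to narrow down item d is to compare the partition paths that hold data in S3 with the partitions registered for the table. Below is a minimal sketch in Python, assuming Hive-style `year=.../month=...` paths. The function names, the sample layout and the 4-digit year check are hypothetical and not taken from our job. In practice the inputs would come from an S3 listing and from the catalog's partition list.

```python
from typing import Iterable, List, Set


def partition_of(key: str, table_prefix: str) -> str:
    """Return the partition path of an S3 object key, relative to the table prefix.

    "tbl/year=2022/month=11/a.parquet" with prefix "tbl/" gives "year=2022/month=11".
    Files sitting directly under the table prefix give "".
    """
    relative = key[len(table_prefix):] if key.startswith(table_prefix) else key
    parts = relative.split("/")[:-1]  # drop the file name
    return "/".join(parts)


def missing_partitions(s3_keys: Iterable[str], table_prefix: str,
                       registered: Set[str]) -> List[str]:
    """List partition paths that hold data in S3 but are not registered."""
    found = set()
    for key in s3_keys:
        part = partition_of(key, table_prefix)
        # Skip the Hudi metadata folder (.hoodie/...) and unpartitioned files.
        if part and not part.startswith(".hoodie"):
            found.add(part)
    return sorted(found - registered)


def suspicious_year_values(partitions: Iterable[str]) -> List[str]:
    """Flag partitions whose year value is not a 4-digit number (assumed format)."""
    flagged = []
    for part in partitions:
        for segment in part.split("/"):
            name, _, value = segment.partition("=")
            if name == "year" and not (value.isdigit() and len(value) == 4):
                flagged.append(part)
    return flagged
```

The last helper only illustrates the kind of value that could lead to an error like "can not create year partitions from string". Whether that is the actual cause here is not confirmed.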
### 2. DDL Change:
a. The Hudi 0.11.0 documentation explicitly states that DDL such as
ALTER TABLE can be run through Spark SQL starting with Hudi 0.11.0.
b. The Hudi 0.10.1 documentation does not explicitly state that DDL
such as ALTER TABLE can be run through Spark SQL.
However, it does provide a 'How To' documentation page on running ALTER
TABLE with Spark SQL queries.
c. It is therefore unclear whether Hudi 0.10.1 can perform
ALTER TABLE queries.
d. Adding a column through spark.sql threw an error.
The error screenshot is attached to the case above.
e. We need your guidance on two DDL changes in particular: Add Column
and Drop Column.
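
For reference, the two DDL changes in item e would look roughly like the statements built below. This is a minimal sketch, assuming the standard Spark SQL ALTER TABLE syntax. The table name, the column names and the helper functions are hypothetical. As far as I understand, DROP COLUMN also depends on the schema-on-read setting `hoodie.schema.on.read.enable` that came with Hudi 0.11.0, so it may not work on 0.10.1. Please correct me if that is wrong.

```python
from typing import List, Tuple


def add_columns_sql(table: str, columns: List[Tuple[str, str]]) -> str:
    """Build an ALTER TABLE ... ADD COLUMNS statement from (name, type) pairs."""
    specs = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return f"ALTER TABLE {table} ADD COLUMNS ({specs})"


def drop_columns_sql(table: str, columns: List[str]) -> str:
    """Build an ALTER TABLE ... DROP COLUMNS statement."""
    return f"ALTER TABLE {table} DROP COLUMNS ({', '.join(columns)})"


def run_ddl(spark, statements: List[str]) -> None:
    """Run each statement through an existing SparkSession (not created here)."""
    for statement in statements:
        spark.sql(statement)


# Hypothetical table and column names, for illustration only.
statements = [
    # Assumed to be required for DROP COLUMN (Hudi 0.11.0 and later).
    "SET hoodie.schema.on.read.enable=true",
    add_columns_sql("my_db.my_hudi_table", [("new_col", "string")]),
    drop_columns_sql("my_db.my_hudi_table", ["old_col"]),
]
```

In the Glue job, `run_ddl(spark, statements)` would then be called with the job's SparkSession.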
### 3. Upgrade to a new version:
a. If we upgrade to a newer version, reprocessing all the data is not
feasible.
b. We need your help upgrading the Hudi version without affecting the
existing Hudi table data, provided that a version upgrade alone can solve some
of these issues and AWS is compatible with Hudi versions above 0.10.1.
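
Before and after any upgrade, it may help to record the table version that Hudi stores in `.hoodie/hoodie.properties` under the key `hoodie.table.version`. Below is a minimal sketch in Python, assuming the file content has already been downloaded from S3 as text. The function names and sample content are hypothetical, and the parser only handles plain `key=value` lines.

```python
from typing import Dict, Optional


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def table_version(text: str) -> Optional[int]:
    """Return hoodie.table.version as an int, or None if absent or not numeric."""
    value = parse_properties(text).get("hoodie.table.version")
    return int(value) if value is not None and value.isdigit() else None
```

Comparing the value before and after the first write with the newer Hudi version would show whether the table metadata was upgraded in place.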