parisni commented on issue #11599:
URL: https://github.com/apache/hudi/issues/11599#issuecomment-2239542559

   Also, it turns out Glue sync likely does not update the catalog columns, only the `spark.sql.sources.schema` table property:
   
   ```
   "float_to_double","type":"double"
   vs
   float_to_double         float
   ```
   in the full table description:
   
   ```
   col_name,data_type,comment
   event_id                bigint
   int_to_long             int
   float_to_double         float
   _kafka_timestamp        bigint
   version                 string
   event_date              string
   event_hour              string
   
   # Partition Information
   # col_name              data_type               comment
   
   version                 string
   event_date              string
   event_hour              string
   
   Detailed Table Information    Table(tableName:hudi_promotion,
   dbName:identifying_leboncoin, owner:null, createTime:1721405676,
   lastAccessTime:1721405711, retention:0,
   sd:StorageDescriptor(cols:[FieldSchema(name:event_id, type:bigint, comment:),
   FieldSchema(name:int_to_long, type:int, comment:),
   FieldSchema(name:float_to_double, type:float, comment:),
   FieldSchema(name:_kafka_timestamp, type:bigint, comment:),
   FieldSchema(name:version, type:string, comment:), FieldSchema(name:event_date,
   type:string, comment:), FieldSchema(name:event_hour, type:string, comment:)],
   location:s3://bucket/test_promotion/hudi_promotion,
   inputFormat:org.apache.hudi.hadoop.HoodieParquetInputFormat,
   outputFormat:org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat,
   compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null,
   serializationLib:org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe,
   parameters:{hoodie.query.as.ro.table=false, serialization.format=1,
   path=s3a://bucket/test_promotion/hudi_promotion}), bucketCols:[], sortCols:[],
   parameters:{}, storedAsSubDirectories:false),
   partitionKeys:[FieldSchema(name:version, type:string, comment:),
   FieldSchema(name:event_date, type:string, comment:),
   FieldSchema(name:event_hour, type:string, comment:)],
   parameters:{EXTERNAL=TRUE, last_commit_time_sync=20240719161441010,
   spark.sql.sources.schema.numPartCols=3, hudi.metadata-listing-enabled=FALSE,
   spark.sql.sources.schema.part.0={"type":"struct","fields":[{"name":"event_id","type":"long","nullable":true,"metadata":{}},{"name":"int_to_long","type":"long","nullable":true,"metadata":{}},{"name":"float_to_double","type":"double","nullable":true,"metadata":{}},{"name":"_kafka_timestamp","type":"long","nullable":true,"metadata":{}},{"name":"version","type":"string","nullable":true,"metadata":{}},{"name":"event_date","type":"string","nullable":true,"metadata":{}},{"name":"event_hour","type":"string","nullable":true,"metadata":{}}]},
   last_commit_completion_time_sync=20240719161507000,
   spark.sql.sources.schema.partCol.0=version,
   spark.sql.sources.schema.partCol.1=event_date,
   spark.sql.sources.schema.partCol.2=event_hour,
   spark.sql.sources.schema.numParts=1, spark.sql.sources.provider=hudi},
   viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)
   
   ```
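
   To make the mismatch concrete, here is a minimal sketch (plain Python, no AWS calls) that diffs the catalog column types against the `spark.sql.sources.schema.part.0` JSON. The dictionaries are hand-copied from the output above and abbreviated to the non-partition fields; the Spark-to-Hive type mapping is an assumption covering only the types that appear here:

   ```python
   import json

   # Column types as the Glue/Hive catalog reports them (copied from the output above)
   catalog_cols = {
       "event_id": "bigint",
       "int_to_long": "int",
       "float_to_double": "float",
       "_kafka_timestamp": "bigint",
   }

   # spark.sql.sources.schema.part.0 table property, abbreviated to the same fields
   spark_schema = json.loads(
       '{"type":"struct","fields":['
       '{"name":"event_id","type":"long","nullable":true,"metadata":{}},'
       '{"name":"int_to_long","type":"long","nullable":true,"metadata":{}},'
       '{"name":"float_to_double","type":"double","nullable":true,"metadata":{}},'
       '{"name":"_kafka_timestamp","type":"long","nullable":true,"metadata":{}}]}'
   )

   # Map Spark SQL type names to their Hive equivalents before comparing
   spark_to_hive = {"long": "bigint", "double": "double", "int": "int"}

   # Fields whose catalog type disagrees with the Spark schema property
   mismatches = {
       f["name"]: (catalog_cols[f["name"]], spark_to_hive[f["type"]])
       for f in spark_schema["fields"]
       if spark_to_hive[f["type"]] != catalog_cols[f["name"]]
   }
   print(mismatches)
   # {'int_to_long': ('int', 'bigint'), 'float_to_double': ('float', 'double')}
   ```

   The two evolved columns show up as (catalog type, Spark schema type) pairs, matching what the comment observes: only the table property was updated by the sync.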

