Istvan Darvas created HUDI-4047:
-----------------------------------

             Summary: hoodie.avro.schema.validate error message refact
                 Key: HUDI-4047
                 URL: https://issues.apache.org/jira/browse/HUDI-4047
             Project: Apache Hudi
          Issue Type: New Feature
            Reporter: Istvan Darvas


Hi Guys!

 

I have just used the schema validation and it works like a charm, but :)

 

A few things would be very useful:

1.) After the error message "Failed schema compatibility check for", a fully 
JSON-compatible payload should follow: {FULL JSON payload}

2.) The JSON payload should contain "violations": [{item}, {item}]

  If going over all the violations is complex or not performant, then just 
print the first one:

   "violation": {"field_name": field_name, "writerSchema": writer_schema, 
"tableSchema": table_schema}

 

Why? ;) - If someone has a table with a lot of columns/features, it would be 
easier to find the discrepancy.

So the debug process would be: copy the full JSON into a JSON editor and check 
the nodes...

This would speed up debugging for me ;) but maybe I am not alone with this.
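
A minimal sketch of what the proposed payload could look like, written in Python for brevity (Hudi itself is Java). It only diffs top-level fields by name and type; `build_violations` and the `first_only` flag are hypothetical names, and this is a plain structural comparison, not Avro's actual compatibility check.

```python
import json


def build_violations(writer_schema, table_schema, first_only=False):
    """Compare top-level fields of two Avro record schemas (as dicts).

    Hypothetical helper: reports fields whose type differs or that exist
    on only one side (the missing side is reported as None / JSON null).
    This is a structural diff, not Avro's schema-resolution rules.
    """
    writer = {f["name"]: f["type"] for f in writer_schema["fields"]}
    table = {f["name"]: f["type"] for f in table_schema["fields"]}

    violations = []
    # Table fields first, then writer-only fields, keeping schema order.
    names = list(table) + [n for n in writer if n not in table]
    for name in names:
        w, t = writer.get(name), table.get(name)
        if w != t:
            violations.append(
                {"field_name": name, "writerSchema": w, "tableSchema": t}
            )
            if first_only:
                break
    return {"violations": violations}


# Small made-up schemas in the same style as the ones in this report.
writer = {
    "type": "record",
    "name": "w",
    "fields": [
        {"name": "correlation_id", "type": ["null", "string"], "default": None},
        {"name": "message_id", "type": "int"},
    ],
}
table = {
    "type": "record",
    "name": "t",
    "fields": [
        {"name": "correlation_id", "type": "string"},
        {"name": "message_id", "type": "int"},
        {"name": "year", "type": ["null", "int"], "default": None},
    ],
}

print(json.dumps(build_violations(writer, table), indent=2))
```

Passing `first_only=True` stops at the first mismatch, which matches the cheaper fallback suggested in point 2.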

 

Thanks,

 

For example, I got an exception like this, and I would like a nicer one, as 
explained above:

Caused by: org.apache.hudi.exception.HoodieException: Failed schema compatibility check for writerSchema :{"type":"record","name":"iot_raw__ingress_pkg_decoded_rep_record","namespace":"hoodie.iot_raw__ingress_pkg_decoded_rep","fields":[{"name":"_hoodie_commit_time","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_commit_seqno","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_record_key","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_partition_path","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_file_name","type":["null","string"],"doc":"","default":null},{"name":"correlation_id","type":["null","string"],"default":null},{"name":"iot_pkg_receive_time","type":["null",{"type":"long","logicalType":"timestamp-micros"}],"default":null},{"name":"parsing_time","type":["null",{"type":"long","logicalType":"timestamp-micros"}],"default":null},{"name":"receive_time","type":["null",{"type":"long","logicalType":"timestamp-micros"}],"default":null},{"name":"aggregate_id","type":["null","string"],"default":null},{"name":"message_id","type":["null","int"],"default":null},{"name":"message_type_name","type":["null","string"],"default":null},{"name":"message_type_id","type":["null","int"],"default":null},{"name":"report_id","type":["null","int"],"default":null},{"name":"report_type_name","type":["null","string"],"default":null},{"name":"report_type_id","type":["null","int"],"default":null},{"name":"report_dcd_payload","type":["null","string"],"default":null}]},
 table schema :{"type":"record","name":"hoodie_source","namespace":"hoodie.source","fields":[{"name":"_hoodie_commit_time","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_commit_seqno","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_record_key","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_partition_path","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_file_name","type":["null","string"],"doc":"","default":null},{"name":"correlation_id","type":"string"},{"name":"iot_pkg_receive_time","type":{"type":"long","logicalType":"timestamp-micros"}},{"name":"parsing_time","type":{"type":"long","logicalType":"timestamp-micros"}},{"name":"receive_time","type":{"type":"long","logicalType":"timestamp-micros"}},{"name":"aggregate_id","type":"string"},{"name":"message_id","type":"int"},{"name":"message_type_name","type":"string"},{"name":"message_type_id","type":"int"},{"name":"report_id","type":"int"},{"name":"report_type_name","type":"string"},{"name":"report_type_id","type":"int"},{"name":"report_dcd_payload","type":"string"},{"name":"year","type":["null","int"],"default":null},{"name":"month","type":["null","int"],"default":null},{"name":"day","type":["null","int"],"default":null}]},
 base path :s3://scgps-datalake/iot_raw/ingress_pkg_decoded_rep
        at org.apache.hudi.table.HoodieTable.validateSchema(HoodieTable.java:682)
        at org.apache.hudi.table.HoodieTable.validateUpsertSchema(HoodieTable.java:688)
        ... 42 more
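
Until the message changes, the two schemas can be pulled out of the existing text with a short script instead of by hand. This is a rough sketch that assumes the message keeps the `writerSchema :{...}`, ` table schema :{...}` layout seen above; `extract_schemas` is a hypothetical helper, not part of Hudi, and the sample message is a shortened, made-up one.

```python
import json


def extract_schemas(message):
    """Return (writer_schema, table_schema) parsed from the exception text.

    Assumes each schema follows its label as a JSON object, as in the
    message above. Backslash-escaped braces, which some mail renderings
    add, are unescaped first (this would also alter such sequences inside
    string values, which Avro schemas normally do not contain).
    """
    text = message.replace("\\{", "{").replace("\\}", "}")
    decoder = json.JSONDecoder()
    schemas = []
    for label in ("writerSchema", "table schema"):
        start = text.index("{", text.index(label))
        # raw_decode parses one JSON value and ignores whatever follows it.
        schema, _end = decoder.raw_decode(text, start)
        schemas.append(schema)
    return schemas[0], schemas[1]


# Shortened, made-up sample in the same layout; the base path is a placeholder.
sample = (
    'Failed schema compatibility check for writerSchema '
    ':{"type":"record","name":"w","fields":['
    '{"name":"message_id","type":["null","int"],"default":null}]},\n'
    ' table schema :{"type":"record","name":"t","fields":['
    '{"name":"message_id","type":"int"}]},\n'
    ' base path :s3://bucket/path'
)

writer, table = extract_schemas(sample)
print(json.dumps(writer["fields"], indent=2))
print(json.dumps(table["fields"], indent=2))
```

The parsed dictionaries can then be pretty-printed or compared field by field, which is the manual JSON-editor step described earlier.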

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
