[ https://issues.apache.org/jira/browse/HUDI-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-6028:
-----------------------------
    Sprint: Sprint 2023-04-10

> GCS incr source does not handle pubsub message properly
> -------------------------------------------------------
>
>                 Key: HUDI-6028
>                 URL: https://issues.apache.org/jira/browse/HUDI-6028
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: deltastreamer
>            Reporter: Raymond Xu
>            Priority: Major
>
> The GCS events source relies on Spark's schema converter, which does not
> handle hyphenated field names in nested columns. A sample message:
> {code:java}
> 23/04/03 19:23:45 DEBUG GcsEventsSource: msg: {
>   "kind": "storage#object",
>   "id": "",
>   "selfLink": "",
>   "name": "",
>   "bucket": "",
>   "generation": "1680505551370137",
>   "metageneration": "1",
>   "contentType": "application/octet-stream",
>   "timeCreated": "2023-04-03T07:05:51.373Z",
>   "updated": "2023-04-03T07:05:51.373Z",
>   "storageClass": "STANDARD",
>   "timeStorageClassUpdated": "2023-04-03T07:05:51.373Z",
>   "size": "6707",
>   "md5Hash": "",
>   "mediaLink": "",
>   "metadata": {
>     "goog-reserved-file-mtime": "1680503048"
>   },
>   "crc32c": "",
>   "etag": ""
> }
> {code}
> Processing this message throws:
> {code}
> Exception in thread "main" org.apache.avro.SchemaParseException: Illegal character in: goog-reserved-file-mtime
>       at org.apache.avro.Schema.validateName(Schema.java:1571)
>       at org.apache.avro.Schema.access$400(Schema.java:92)
>       at org.apache.avro.Schema$Field.<init>(Schema.java:549)
>       at org.apache.avro.SchemaBuilder$FieldBuilder.completeField(SchemaBuilder.java:2258)
>       at org.apache.avro.SchemaBuilder$FieldBuilder.completeField(SchemaBuilder.java:2254)
>       at org.apache.avro.SchemaBuilder$FieldBuilder.access$5100(SchemaBuilder.java:2150)
>       at org.apache.avro.SchemaBuilder$GenericDefault.noDefault(SchemaBuilder.java:2557)
>       at org.apache.hudi.org.apache.spark.sql.avro.SchemaConverters$.$anonfun$toAvroType$2(SchemaConverters.scala:205)
> {code}
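> The problem is in org.apache.spark.sql.avro.SchemaConverters#toAvroType:
> as the stack trace shows, it hands the nested field name
> "goog-reserved-file-mtime" to Avro's name validation, which rejects the hyphen.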
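
To illustrate why the name is rejected: the Avro specification requires names to start with a letter or underscore and contain only letters, digits, and underscores. The sketch below is not the Hudi fix and does not use any Hudi, Spark, or Avro API. The class `AvroNameSketch` and its methods are hypothetical names, introduced only to show the naming rule and one possible way to sanitize a field name before building a schema.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Hudi, Spark, or Avro.
final class AvroNameSketch {

    // Avro naming rule: first character [A-Za-z_], remaining characters [A-Za-z0-9_].
    private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Returns true when the name satisfies the Avro naming rule.
    static boolean isValidAvroName(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }

    // Assumed sanitizing policy: replace every disallowed character with '_'
    // and prefix '_' when the result would start with a digit.
    static String sanitize(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("field name must be non-empty");
        }
        String replaced = name.replaceAll("[^A-Za-z0-9_]", "_");
        if (Character.isDigit(replaced.charAt(0))) {
            replaced = "_" + replaced;
        }
        return replaced;
    }

    public static void main(String[] args) {
        String field = "goog-reserved-file-mtime"; // nested field from the sample message
        System.out.println(isValidAvroName(field)); // false
        System.out.println(sanitize(field));        // goog_reserved_file_mtime
    }
}
```

Note that this kind of replacement is lossy: two distinct source names such as `a-b` and `a_b` would map to the same sanitized name, so a real fix would need to account for collisions and for mapping the data back to the renamed fields.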



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
