Anthony Hsu created GOBBLIN-556:
-----------------------------------

             Summary: AvroUtils reads and writes UTF rather than chars
                 Key: GOBBLIN-556
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-556
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Anthony Hsu

In GOBBLIN-485 and GOBBLIN-514, AvroUtils was updated to use writeUTF and readUTF. This causes SchemaParseExceptions like:

{noformat}
org.apache.avro.SchemaParseException: org.codehaus.jackson.JsonParseException: Unexpected character ('Ë' (code 203)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: org.apache.hadoop.hdfs.client.HdfsDataInputStream@1d4fd4ee; line: 1, column: 2]
	at org.apache.avro.Schema$Parser.parse(Schema.java:1034)
	at org.apache.avro.Schema$Parser.parse(Schema.java:1004)
	at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.getSchemaFor(AvroSerdeUtils.java:295)
	at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.getSchemaFromFS(AvroSerdeUtils.java:166)
	at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:135)
	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.determineSchemaOrReturnErrorSchema(AvroSerDe.java:177)
	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:103)
	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:80)
	at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:520)
	at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:390)
	at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getFieldSchemas(HiveMetaStoreUtils.java:351)
	at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getStorageDescriptor(HiveMetaStoreUtils.java:209)
	at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getTable(HiveMetaStoreUtils.java:115)
	at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.registerPath(HiveMetaStoreBasedRegister.java:152)
	at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:113)
	at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:97)
	at org.apache.gobblin.util.executors.MDCPropagatingCallable.call(MDCPropagatingCallable.java:42)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.codehaus.jackson.JsonParseException: Unexpected character ('Ë' (code 203)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
{noformat}

We should switch back to using writeChars and readChar.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
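
For reference, a minimal sketch of how the two java.io.DataOutputStream methods named in this issue differ on the wire. It is not the AvroUtils code itself; the class name UtfVsCharsDemo, its helper methods, and the sample schema string are hypothetical and exist only for illustration. writeUTF prepends a two-byte length before the modified UTF-8 payload, while writeChars writes each char as two bytes with no prefix. A reader that expects the file to start with JSON text would see the length bytes first, which is consistent with the "Unexpected character" error above, although the exact bytes depend on the schema length.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of Gobblin.
public class UtfVsCharsDemo {

    // Serialize a string with DataOutputStream.writeUTF:
    // a 2-byte big-endian length, then modified UTF-8 bytes.
    static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Serialize a string with DataOutputStream.writeChars:
    // every char as 2 bytes (high byte first), no length prefix.
    static byte[] viaWriteChars(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeChars(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // Sample schema text, chosen only for the demonstration.
        String schemaJson = "{\"type\":\"string\"}";

        byte[] utf = viaWriteUTF(schemaJson);
        byte[] chars = viaWriteChars(schemaJson);

        // Print the leading bytes so the length prefix is visible.
        System.out.printf("writeUTF:   %d bytes, starts with 0x%02X 0x%02X%n",
                utf.length, utf[0] & 0xFF, utf[1] & 0xFF);
        System.out.printf("writeChars: %d bytes, starts with 0x%02X 0x%02X%n",
                chars.length, chars[0] & 0xFF, chars[1] & 0xFF);
    }
}
```

With writeUTF, the first two bytes encode the payload length rather than the opening '{' of the schema, so any consumer that parses the file directly as JSON text has to skip or understand that prefix.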