[ https://issues.apache.org/jira/browse/HUDI-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216728#comment-17216728 ]
liwei commented on HUDI-303:
----------------------------
I do not think this should be fixed, because Hive meta columns are case
insensitive. If we do not lowercase the field names, the Hive meta schema will
not match the Avro schema. For example, the column names returned by
hive_metastoreConstants.META_TABLE_COLUMNS are case insensitive:

Map<String, Field> schemaFieldsMap = HoodieRealtimeRecordReaderUtils.getNameToFieldMap(writerSchema);
hiveSchema = constructHiveOrderedSchema(writerSchema, schemaFieldsMap);

// Get all column names of hive table
String hiveColumnString = jobConf.get(hive_metastoreConstants.META_TABLE_COLUMNS);
LOG.info("Hive Columns : " + hiveColumnString);
String[] hiveColumns = hiveColumnString.split(",");
List<Field> hiveSchemaFields = new ArrayList<>();
for (String columnName : hiveColumns) {
  // Look the column up by its lower-cased name so it matches the Avro field map
  Field field = schemaFieldsMap.get(columnName.toLowerCase());
  if (field != null) {
    hiveSchemaFields.add(new Schema.Field(field.name(), field.schema(), field.doc(), field.defaultVal()));
  } else {
    // Hive has some extra virtual columns like BLOCK__OFFSET__INSIDE__FILE which do not exist in table schema.
    // They will get skipped as they won't be found in the original schema.
    LOG.debug("Skipping Hive Column => " + columnName);
  }
}

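To illustrate the concern, here is a minimal standalone sketch. It is not the Hudi code: the class HiveColumnMatchSketch, its methods, and the field names userId and ts are hypothetical, plain strings stand in for Avro Field objects, and it assumes the name-to-field map is keyed by lower-cased field names while Hive reports the column as userid. The lookup lower-cases each Hive column name as in the loop above; the only thing that varies is whether the map keys are lower-cased.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: plain strings stand in for Avro Schema.Field objects.
class HiveColumnMatchSketch {

  // Builds a name -> field map. With lowercaseKeys = true the keys are
  // lower-cased (the behaviour assumed for the real helper); the values keep
  // the original, case-preserved Avro field name.
  static Map<String, String> nameToFieldMap(List<String> avroFieldNames, boolean lowercaseKeys) {
    Map<String, String> map = new LinkedHashMap<>();
    for (String name : avroFieldNames) {
      String key = lowercaseKeys ? name.toLowerCase(Locale.ROOT) : name;
      map.put(key, name);
    }
    return map;
  }

  // Mirrors the loop in the comment: split the Hive column string, look each
  // column up by its lower-cased name, and skip columns that are not found
  // (for example Hive virtual columns).
  static List<String> matchColumns(Map<String, String> fieldsMap, String hiveColumnString) {
    List<String> matched = new ArrayList<>();
    for (String columnName : hiveColumnString.split(",")) {
      String field = fieldsMap.get(columnName.toLowerCase(Locale.ROOT));
      if (field != null) {
        matched.add(field);
      }
    }
    return matched;
  }

  public static void main(String[] args) {
    List<String> avroFields = Arrays.asList("userId", "ts");
    String hiveColumns = "userid,ts,BLOCK__OFFSET__INSIDE__FILE";

    // Lower-cased keys: both table columns are found, the virtual column is skipped.
    System.out.println(matchColumns(nameToFieldMap(avroFields, true), hiveColumns));
    // prints [userId, ts]

    // Case-preserved keys: "userid" no longer matches "userId".
    System.out.println(matchColumns(nameToFieldMap(avroFields, false), hiveColumns));
    // prints [ts]
  }
}
```

With case-preserved keys the userId column silently drops out of the Hive-ordered schema, which is the mismatch I am worried about if the lowercase conversion is removed.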
> Avro schema case sensitivity testing
> ------------------------------------
>
> Key: HUDI-303
> URL: https://issues.apache.org/jira/browse/HUDI-303
> Project: Apache Hudi
> Issue Type: Test
> Components: Spark Integration
> Reporter: Udit Mehrotra
> Assignee: Udit Mehrotra
> Priority: Minor
> Labels: bug-bash-0.6.0
>
> As a fallout of [PR 956|https://github.com/apache/incubator-hudi/pull/956] we
> would like to understand how Avro behaves with case sensitive column names.
> Couple of action items:
> * Test with different field names just differing in case.
> * *AbstractRealtimeRecordReader* is one of the classes where we are
> converting Avro Schema field names to lower case, to be able to verify them
> against column names from Hive. We can consider removing the *lowercase*
> conversion there if we verify it does not break anything.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)