[ https://issues.apache.org/jira/browse/HUDI-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216728#comment-17216728 ]
liwei commented on HUDI-303:
----------------------------

I do not think this should be fixed, because Hive metastore column names are case insensitive. If we do not *lowercase* the Avro field names, the Hive metastore schema will not match the Avro schema. For example, the column names read from hive_metastoreConstants.META_TABLE_COLUMNS are case insensitive:

Map<String, Field> schemaFieldsMap = HoodieRealtimeRecordReaderUtils.getNameToFieldMap(writerSchema);
hiveSchema = constructHiveOrderedSchema(writerSchema, schemaFieldsMap);

// Get all column names of hive table
String hiveColumnString = jobConf.get(hive_metastoreConstants.META_TABLE_COLUMNS);
LOG.info("Hive Columns : " + hiveColumnString);
String[] hiveColumns = hiveColumnString.split(",");
LOG.info("Hive Columns : " + hiveColumnString);
List<Field> hiveSchemaFields = new ArrayList<>();

for (String columnName : hiveColumns) {
  Field field = schemaFieldsMap.get(columnName.toLowerCase());

  if (field != null) {
    hiveSchemaFields.add(new Schema.Field(field.name(), field.schema(), field.doc(), field.defaultVal()));
  } else {
    // Hive has some extra virtual columns like BLOCK__OFFSET__INSIDE__FILE which do not exist in table schema.
    // They will get skipped as they won't be found in the original schema.
    LOG.debug("Skipping Hive Column => " + columnName);
  }
}

> Avro schema case sensitivity testing
> ------------------------------------
>
>                 Key: HUDI-303
>                 URL: https://issues.apache.org/jira/browse/HUDI-303
>             Project: Apache Hudi
>          Issue Type: Test
>          Components: Spark Integration
>            Reporter: Udit Mehrotra
>            Assignee: Udit Mehrotra
>            Priority: Minor
>              Labels: bug-bash-0.6.0
>
> As a fallout of [PR 956|https://github.com/apache/incubator-hudi/pull/956] we
> would like to understand how Avro behaves with case sensitive column names.
> Couple of action items:
> * Test with different field names just differing in case.
> * *AbstractRealtimeRecordReader* is one of the classes where we are
> converting Avro Schema field names to lower case, to be able to verify them
> against column names from Hive. We can consider removing the *lowercase*
> conversion there if we verify it does not break anything.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
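
The matching discussed in the comment and the first action item of the issue can be sketched without Avro or Hudi on the classpath. This is only an illustration under stated assumptions: plain strings stand in for Avro fields, and the names ColumnCaseSketch, lowerCaseIndex and orderByHiveColumns are hypothetical, not part of Hudi. The collision check is an addition for illustration; it is not claimed to be what getNameToFieldMap does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-alone sketch; strings stand in for Avro fields.
final class ColumnCaseSketch {

    // Builds a map from lower-cased field name to the original field name.
    // Throws if two field names differ only in case (assumed policy, for illustration).
    static Map<String, String> lowerCaseIndex(List<String> fieldNames) {
        Map<String, String> index = new LinkedHashMap<>();
        for (String name : fieldNames) {
            String previous = index.put(name.toLowerCase(Locale.ROOT), name);
            if (previous != null) {
                throw new IllegalArgumentException(
                    "Fields differ only in case: " + previous + ", " + name);
            }
        }
        return index;
    }

    // Returns the original field names in Hive column order.
    // Columns with no matching field (e.g. Hive virtual columns) are skipped.
    static List<String> orderByHiveColumns(Map<String, String> index, String hiveColumnString) {
        List<String> ordered = new ArrayList<>();
        for (String column : hiveColumnString.split(",")) {
            String field = index.get(column.toLowerCase(Locale.ROOT));
            if (field != null) {
                ordered.add(field);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, String> index = lowerCaseIndex(Arrays.asList("userId", "eventTime"));
        // Prints [eventTime, userId]
        System.out.println(orderByHiveColumns(index, "eventtime,userid,BLOCK__OFFSET__INSIDE__FILE"));
    }
}
```

With the lower-cased index, the mixed-case field names still line up with the column names passed in lower case, and the virtual column is skipped. Calling lowerCaseIndex(Arrays.asList("id", "ID")) throws, which is the situation the "field names just differing in case" test would need to cover.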