umehrot2 commented on a change in pull request #1559:
URL: https://github.com/apache/incubator-hudi/pull/1559#discussion_r417020636
##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/TableSchemaResolver.java
##########
@@ -145,23 +146,37 @@ public MessageType getDataSchema() throws Exception {
* @return Avro schema for this table
* @throws Exception
*/
- public Schema getTableSchema() throws Exception {
- return convertParquetSchemaToAvro(getDataSchema());
+ public Schema getTableSchemaInAvroFormat() throws Exception {
+ Option<Schema> schemaFromCommitMetadata = getTableSchemaFromCommitMetadata();
+ return schemaFromCommitMetadata.isPresent() ? schemaFromCommitMetadata.get() :
+ convertParquetSchemaToAvro(getDataSchema());
+ }
+
+ /**
+ * Gets the schema for a hoodie table in Parquet format.
+ *
+ * @return Parquet schema for the table
+ * @throws Exception
+ */
+ public MessageType getTableSchemaInParquetFormat() throws Exception {
+ Option<Schema> schemaFromCommitMetadata = getTableSchemaFromCommitMetadata();
+ return schemaFromCommitMetadata.isPresent() ? convertAvroSchemaToParquet(schemaFromCommitMetadata.get()) :
+ getDataSchema();
}
/**
* Gets the schema for a hoodie table in Avro format from the HoodieCommitMetadata of the last commit.
*
* @return Avro schema for this table
- * @throws Exception
*/
- public Schema getTableSchemaFromCommitMetadata() throws Exception {
+ private Option<Schema> getTableSchemaFromCommitMetadata() {
try {
HoodieTimeline timeline = metaClient.getActiveTimeline().getCommitsTimeline().filterCompletedInstants();
byte[] data = timeline.getInstantDetails(timeline.lastInstant().get()).get();
HoodieCommitMetadata metadata = HoodieCommitMetadata.fromBytes(data, HoodieCommitMetadata.class);
String existingSchemaStr = metadata.getMetadata(HoodieCommitMetadata.SCHEMA_KEY);
- return new Schema.Parser().parse(existingSchemaStr);
+ return StringUtils.isNullOrEmpty(existingSchemaStr) ? Option.empty() :
Review comment:
Agreed. This would be a good addition and make it cleaner.
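
For reference, the fallback pattern in the hunk above (prefer the schema stored in commit metadata, otherwise read it from the data files) can be sketched in isolation roughly as follows. This is only an illustration under assumptions: it uses `java.util.Optional` as a stand-in for Hudi's `Option`, plain strings instead of Avro `Schema` objects, and the class and method names (`SchemaFallbackSketch`, `schemaFromCommitMetadata`, `resolveSchema`) are hypothetical and not part of the PR.

```java
import java.util.Optional;
import java.util.function.Supplier;

public class SchemaFallbackSketch {

  // Hypothetical stand-in for reading the schema string from the last commit's metadata.
  // A null or empty value is treated as "no schema recorded", mirroring the
  // isNullOrEmpty check in the hunk.
  static Optional<String> schemaFromCommitMetadata(String storedSchema) {
    return (storedSchema == null || storedSchema.isEmpty())
        ? Optional.empty()
        : Optional.of(storedSchema);
  }

  // Prefer the commit-metadata schema; only invoke the (potentially expensive)
  // data-file lookup when the metadata has no schema.
  static String resolveSchema(String storedSchema, Supplier<String> dataFileSchema) {
    return schemaFromCommitMetadata(storedSchema).orElseGet(dataFileSchema);
  }

  public static void main(String[] args) {
    Supplier<String> fromDataFile = () -> "schema-from-data-file";
    System.out.println(resolveSchema("schema-from-commit", fromDataFile));
    System.out.println(resolveSchema("", fromDataFile));
    System.out.println(resolveSchema(null, fromDataFile));
  }
}
```

Passing the fallback as a `Supplier` keeps it lazy, which matches the ternary in the diff: the data files are only consulted when the commit metadata carries no schema.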