kbendick commented on a change in pull request #3435:
URL: https://github.com/apache/iceberg/pull/3435#discussion_r740417000
##########
File path:
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java
##########
@@ -292,16 +293,17 @@ private SparkTableUtil() {
* @param conf a serializable Hadoop conf
* @param metricsConfig a metrics conf
* @param mapping a name mapping
+ * @param schema schema of the target table
Review comment:
Nit: Consider adding a note that this is nullable (as it's only used in
Parquet).
Though if the schema is always available in places we use it, then maybe not
a concern and easier to require people to pass it through in case we add
support for other file formats.
##########
File path: data/src/main/java/org/apache/iceberg/data/TableMigrationUtil.java
##########
@@ -181,4 +195,20 @@ private TableMigrationUtil() {
throw new RuntimeException("Unable to list files in partition: " +
partitionUri, e);
}
}
+
+ /**
+ * @param importSchema the schema of file being imported
+ * @param tableSchema schema of the table to which file is to be imported
+ * @return if the files schema is compatible with table's schema
+ */
+ public static boolean canImportSchema(Schema importSchema, Schema
tableSchema) {
+ // Assigning ids by name look up, required for checking compatibility
+ Schema schemaWithIds = TypeUtil.assignFreshIds(importSchema, tableSchema,
new AtomicInteger(1000)::incrementAndGet);
+ List<String> errors =
CheckCompatibility.writeCompatibilityErrors(tableSchema, schemaWithIds, false);
Review comment:
Nit: Consider adding an in-line comment to the `false` to document what
the parameter is with `false /* ??? */`.
##########
File path: data/src/main/java/org/apache/iceberg/data/TableMigrationUtil.java
##########
@@ -73,11 +84,11 @@ private TableMigrationUtil() {
*/
public static List<DataFile> listPartition(Map<String, String> partition,
String uri, String format,
PartitionSpec spec, Configuration
conf, MetricsConfig metricsConfig,
- NameMapping mapping) {
+ NameMapping mapping, Schema
schema) {
Review comment:
Nit: Consider adding a comment to the javadoc saying schema is ignored
when not using parquet.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]