wypoon commented on code in PR #7744:
URL: https://github.com/apache/iceberg/pull/7744#discussion_r1237388661


##########
data/src/main/java/org/apache/iceberg/data/TableMigrationUtil.java:
##########
@@ -83,27 +83,47 @@ public static List<DataFile> listPartition(
     return listPartition(partition, uri, format, spec, conf, metricsConfig, mapping, 1);
   }
 
+  /**
+   * Returns the data files in a partition by listing the partition location using some number of
+   * threads.
+   *
+   * <p>For Parquet and ORC partitions, this will read metrics from the file footer. For Avro
+   * partitions, metrics are set to null.
+   *
+   * <p>Note: certain metrics, like NaN counts, that are only supported by Iceberg file writers but
+   * not file footers, will not be populated.
+   *
+   * @param partition map of partition columns to column values
+   * @param uri partition location URI
+   * @param format partition format, avro, parquet or orc
+   * @param spec a partition spec
+   * @param conf a Hadoop conf
+   * @param metricsConfig a metrics conf
+   * @param mapping a name mapping
+   * @param parallelism number of threads to use
+   * @return a List of DataFile
+   */
   public static List<DataFile> listPartition(
-      Map<String, String> partitionPath,
-      String partitionUri,
+      Map<String, String> partition,

Review Comment:
   Here I'm using the same parameter names as in the first `listPartition` 
method above, which is the original `listPartition` method. The one here was 
added later to introduce `parallelism`, so I renamed its parameters to keep 
both methods consistent.
   Admittedly, `partition` is not the best name for the parameter, but it does 
refer to a single partition, the one whose data files we want to list, so 
`partitions` would not be correct. I didn't call it `partitionValues` because 
I use that name later for a `List<String>`.
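
   To illustrate the naming point, here is a minimal sketch. It is not the actual `TableMigrationUtil` code: the class name `ListPartitionNaming`, the reduced parameter lists, and the `String` return values are assumptions made only to keep the example self-contained. It shows the original overload delegating with a parallelism of 1, both overloads sharing the names `partition` and `uri`, and the local `partitionValues` list that would clash if the map parameter took that name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the two listPartition overloads.
public class ListPartitionNaming {

  // Original overload: same parameter names, delegates with parallelism = 1.
  public static List<String> listPartition(Map<String, String> partition, String uri) {
    return listPartition(partition, uri, 1);
  }

  // Overload added later: identical leading parameter names, plus parallelism.
  public static List<String> listPartition(
      Map<String, String> partition, String uri, int parallelism) {
    // The name partitionValues is taken by this local List<String>,
    // which is why the map parameter is not called partitionValues.
    List<String> partitionValues = new ArrayList<>(partition.values());

    // Placeholder result; the real method would list data files under uri.
    List<String> result = new ArrayList<>();
    result.add(String.format("%s %s parallelism=%d", uri, partitionValues, parallelism));
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> partition = Map.of("dt", "2023-06-21");
    System.out.println(listPartition(partition, "file:/tmp/t"));
  }
}
```

   With matching names, a caller reading either signature sees the same contract, and the only difference between the overloads is the extra `parallelism` argument.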
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

