hsiang-c commented on code in PR #3247:
URL: https://github.com/apache/datafusion-comet/pull/3247#discussion_r2718918850


##########
spark/src/test/scala/org/apache/comet/IcebergReadFromS3Suite.scala:
##########
@@ -163,6 +163,41 @@ class IcebergReadFromS3Suite extends CometS3TestBase {
     }
   }
 
+  test("large scale partitioned table - 100 partitions with many files") {
+    assume(icebergAvailable, "Iceberg not available in classpath")
+
+    withSQLConf("spark.sql.files.maxRecordsPerFile" -> "50") {
+      spark.sql("""
+        CREATE TABLE s3_catalog.db.large_partitioned_test (
+          id INT,
+          data STRING,
+          partition_id INT
+        ) USING iceberg
+        PARTITIONED BY (partition_id)
+      """)
+
+      spark.sql("""
+        INSERT INTO s3_catalog.db.large_partitioned_test
+        SELECT
+          id,
+          CONCAT('data_', CAST(id AS STRING)) as data,
+          (id % 100) as partition_id
+        FROM range(500000)
+      """)
+
+      checkIcebergNativeScan(
+        "SELECT COUNT(DISTINCT id) FROM s3_catalog.db.large_partitioned_test")
+      checkIcebergNativeScan(
+        "SELECT * FROM s3_catalog.db.large_partitioned_test WHERE id < 10 
ORDER BY id")
+      checkIcebergNativeScan(
+        "SELECT COUNT(*) FROM s3_catalog.db.large_partitioned_test WHERE 
partition_id = 0")
+      checkIcebergNativeScan(
+        "SELECT COUNT(*) FROM s3_catalog.db.large_partitioned_test WHERE 
partition_id IN (0, 50, 99)")
+
+      spark.sql("DROP TABLE s3_catalog.db.large_partitioned_test")

Review Comment:
   (nit) You can try `DROP TABLE s3_catalog.db.large_partitioned_test PURGE` to remove the files on disk as well.
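   For illustration, a minimal sketch of what that could look like (table name taken from the test above; wrapping the assertions in try/finally is only a suggestion so the table is purged even when a check fails, since `PURGE` in Spark SQL asks Iceberg to delete the data and metadata files rather than only dropping the catalog entry):

   ```scala
   try {
     checkIcebergNativeScan(
       "SELECT COUNT(DISTINCT id) FROM s3_catalog.db.large_partitioned_test")
     // ... remaining checkIcebergNativeScan assertions ...
   } finally {
     // PURGE removes the table's data/metadata files from S3,
     // not just the catalog entry.
     spark.sql("DROP TABLE s3_catalog.db.large_partitioned_test PURGE")
   }
   ```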



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

