ReemaAlzaid commented on PR #11615:
URL: https://github.com/apache/incubator-gluten/pull/11615#issuecomment-3904356968

   Here are the relevant test cases I ran. The log below shows the behavior **before** this change: on an Iceberg table scan, `input_file_name()` returns empty strings and `input_file_block_start()` returns `-1` for every row, because Gluten falls back these plans (`fallback input file expression` in the `GlutenFallbackReporter` warnings).
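   `test_iceberg_simple.py` itself is not included in this comment, so here is a rough sketch of what it does, reconstructed from the log output (table name, row values, and the three metadata expressions all appear in the log; the query strings and function names below are illustrative, not the actual script):

   ```python
   # Hypothetical reconstruction of test_iceberg_simple.py.
   # Inferred from the log above; not the actual script from the PR.

   CREATE_TABLE = """
   CREATE TABLE local.default.test_table (id INT, name STRING) USING iceberg
   """

   INSERT_ROWS = """
   INSERT INTO local.default.test_table
   VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie')
   """

   # The three input-file metadata expressions exercised by the test,
   # matching the "Testing ... on Iceberg table" sections in the log.
   METADATA_QUERIES = {
       "input_file_name":
           "SELECT id, name, input_file_name() AS f FROM local.default.test_table",
       "input_file_block_start":
           "SELECT id, name, input_file_block_start() AS s FROM local.default.test_table",
       "input_file_block_length":
           "SELECT id, name, input_file_block_length() AS l FROM local.default.test_table",
   }


   def run_repro(spark):
       """Run the repro against an existing SparkSession that was launched
       with the Gluten plugin and an Iceberg catalog named 'local'
       (see the spark-submit flags in the log below)."""
       spark.sql(CREATE_TABLE)
       spark.sql(INSERT_ROWS)
       for name, query in METADATA_QUERIES.items():
           print(f"=== {name}() Results ===")
           for row in spark.sql(query).collect():
               print(row)
   ```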
   
   ### Before
   ```
   Last login: Sun Feb 15 15:10:56 on ttys088
   (3.13.3) ➜  incubator-gluten git:(main) ✗ export 
GLUTEN_JAR=/Users/reema/Desktop/OpenSource/incubator-gluten/package/target/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-SNAPSHOT.jar
   export ICEBERG_JAR=/tmp/iceberg.jar
   
   spark-submit \
     --jars "$GLUTEN_JAR,$ICEBERG_JAR" \
     --conf spark.plugins=org.apache.gluten.GlutenPlugin \
     --conf spark.gluten.sql.columnar.backend.lib=velox \
     --conf spark.gluten.enabled=true \
     --conf spark.driver.extraClassPath="$GLUTEN_JAR:$ICEBERG_JAR" \
     --conf spark.executor.extraClassPath="$GLUTEN_JAR:$ICEBERG_JAR" \
     test_iceberg_simple.py
   26/02/15 15:15:56 WARN Utils: Your hostname, Reemas-MacBook-Pro.local 
resolves to a loopback address: 127.0.0.1; using 192.168.100.32 instead (on 
interface en0)
   26/02/15 15:15:56 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to 
another address
   26/02/15 15:15:56 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
   26/02/15 15:15:57 INFO SparkContext: Running Spark version 3.5.5
   26/02/15 15:15:57 INFO SparkContext: OS info Mac OS X, 15.6, aarch64
   26/02/15 15:15:57 INFO SparkContext: Java version 17.0.18
   26/02/15 15:15:57 INFO ResourceUtils: 
==============================================================
   26/02/15 15:15:57 INFO ResourceUtils: No custom resources configured for 
spark.driver.
   26/02/15 15:15:57 INFO ResourceUtils: 
==============================================================
   26/02/15 15:15:57 INFO SparkContext: Submitted application: 
iceberg-input-file-metadata-test
   26/02/15 15:15:57 INFO ResourceProfile: Default ResourceProfile created, 
executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , 
memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: 
offHeap, amount: 2048, script: , vendor: ), task resources: Map(cpus -> name: 
cpus, amount: 1.0)
   26/02/15 15:15:57 INFO ResourceProfile: Limiting resource is cpu
   26/02/15 15:15:57 INFO ResourceProfileManager: Added ResourceProfile id: 0
   26/02/15 15:15:57 INFO SecurityManager: Changing view acls to: reema
   26/02/15 15:15:57 INFO SecurityManager: Changing modify acls to: reema
   26/02/15 15:15:57 INFO SecurityManager: Changing view acls groups to: 
   26/02/15 15:15:57 INFO SecurityManager: Changing modify acls groups to: 
   26/02/15 15:15:57 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: reema; groups with 
view permissions: EMPTY; users with modify permissions: reema; groups with 
modify permissions: EMPTY
   26/02/15 15:15:57 INFO Utils: Successfully started service 'sparkDriver' on 
port 49661.
   26/02/15 15:15:57 INFO SparkEnv: Registering MapOutputTracker
   26/02/15 15:15:57 INFO SparkEnv: Registering BlockManagerMaster
   26/02/15 15:15:57 INFO BlockManagerMasterEndpoint: Using 
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   26/02/15 15:15:57 INFO BlockManagerMasterEndpoint: 
BlockManagerMasterEndpoint up
   26/02/15 15:15:57 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
   26/02/15 15:15:57 INFO DiskBlockManager: Created local directory at 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/blockmgr-e6925b7a-9b9b-43a1-8861-1452ad6dda87
   26/02/15 15:15:57 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:15:57 INFO SparkEnv: Registering OutputCommitCoordinator
   26/02/15 15:15:57 INFO JettyUtils: Start Jetty 0.0.0.0:4040 for SparkUI
   26/02/15 15:15:57 INFO Utils: Successfully started service 'SparkUI' on port 
4040.
   26/02/15 15:15:57 INFO SparkContext: Added JAR 
file:///Users/reema/Desktop/OpenSource/incubator-gluten/package/target/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-SNAPSHOT.jar
 at 
spark://192.168.100.32:49661/jars/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-SNAPSHOT.jar
 with timestamp 1771157757097
   26/02/15 15:15:57 INFO SparkContext: Added JAR 
file:///private/tmp/iceberg.jar at 
spark://192.168.100.32:49661/jars/iceberg.jar with timestamp 1771157757097
   26/02/15 15:15:57 INFO Discovery: Start discovering components in the 
current classpath... 
   26/02/15 15:15:57 INFO Discovery: Discovered component files: 
org.apache.gluten.backendsapi.velox.VeloxBackend, 
org.apache.gluten.component.VeloxIcebergComponent. Duration: 8 ms.
   26/02/15 15:15:57 INFO package: Components registered within order: velox, 
velox-iceberg
   26/02/15 15:15:57 INFO GlutenDriverPlugin: Gluten components:
   ==============================================================
   Component velox
     velox_branch = HEAD
     velox_revision = f247a8e922c4802fd9b9cf7a626421bff9b803fd
     velox_revisionTime = 2026-02-07 14:11:45 +0000
   Component velox-iceberg
   ==============================================================
   26/02/15 15:15:57 INFO SubstraitBackend: Gluten build info:
   ==============================================================
   Gluten Version: 1.7.0-SNAPSHOT
   GCC Version: 
   Java Version: 17
   Scala Version: 2.12.15
   Spark Version: 3.5.5
   Hadoop Version: 2.7.4
   Gluten Branch: main
   Gluten Revision: be3eeea8c33ddfb5352a37ad7d169e326c4dc1ba
   Gluten Revision Time: 2026-02-13 22:47:03 +0000
   Gluten Build Time: 2026-02-15T12:07:38Z
   Gluten Repo URL: https://github.com/ReemaAlzaid/incubator-gluten.git
   ==============================================================
   26/02/15 15:15:57 INFO VeloxListenerApi: Memory overhead is not set. Setting 
it to 644245094 automatically. Gluten doesn't follow Spark's calculation on 
default value of this option because the actual required memory overhead will 
depend on off-heap usage than on on-heap usage.
   26/02/15 15:15:57 INFO SparkDirectoryUtil: Created local directory at 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b
   26/02/15 15:15:57 INFO JniWorkspace: Creating JNI workspace in root 
directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7
   26/02/15 15:15:57 INFO JniWorkspace: JNI workspace 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281
 created in root directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7
   26/02/15 15:15:57 INFO JniLibLoader: Read real path 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libgluten.dylib
 for libPath 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libgluten.dylib
   26/02/15 15:15:57 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libgluten.dylib
 has been loaded using path-loading method
   26/02/15 15:15:57 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libgluten.dylib
 has been loaded
   26/02/15 15:15:57 INFO JniLibLoader: Successfully loaded library 
darwin/aarch64/libgluten.dylib
   26/02/15 15:15:57 INFO JniLibLoader: Read real path 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libvelox.dylib
 for libPath 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libvelox.dylib
   26/02/15 15:15:57 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libvelox.dylib
 has been loaded using path-loading method
   26/02/15 15:15:57 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-27f14ba0-aa7c-45b8-8c18-2bf2e896d45b/jni/182e3935-f5c1-4a64-86fb-377b2af85cd7/gluten-13074889086958015281/darwin/aarch64/libvelox.dylib
 has been loaded
   26/02/15 15:15:57 INFO JniLibLoader: Successfully loaded library 
darwin/aarch64/libvelox.dylib
   W20260215 15:15:57.885989 14490670 MemoryArbitrator.cpp:84] Query memory 
capacity[460.50MB] is set for NOOP arbitrator which has no capacity enforcement
   26/02/15 15:15:57 INFO DriverPluginContainer: Initialized driver component 
for plugin org.apache.gluten.GlutenPlugin.
   26/02/15 15:15:57 INFO Executor: Starting executor ID driver on host 
192.168.100.32
   26/02/15 15:15:57 INFO Executor: OS info Mac OS X, 15.6, aarch64
   26/02/15 15:15:57 INFO Executor: Java version 17.0.18
   26/02/15 15:15:57 INFO Executor: Starting executor with user classpath 
(userClassPathFirst = false): 
'file:/Users/reema/Desktop/OpenSource/incubator-gluten/package/target/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-SNAPSHOT.jar,file:/tmp/iceberg.jar,file:/Users/reema/Desktop/OpenSource/incubator-gluten/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-SNAPSHOT.jar,file:/Users/reema/Desktop/OpenSource/incubator-gluten/iceberg.jar'
   26/02/15 15:15:57 INFO Executor: Created or updated repl class loader 
org.apache.spark.util.MutableURLClassLoader@3d5e1c01 for default.
   26/02/15 15:15:57 INFO CodedInputStreamClassInitializer: The 
defaultRecursionLimit in protobuf has been increased to 100000
   26/02/15 15:15:57 INFO VeloxListenerApi: Gluten is running with Spark local 
mode. Skip running static initializer for executor.
   26/02/15 15:15:57 INFO ExecutorPluginContainer: Initialized executor 
component for plugin org.apache.gluten.GlutenPlugin.
   26/02/15 15:15:57 INFO Utils: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 49662.
   26/02/15 15:15:57 INFO NettyBlockTransferService: Server created on 
192.168.100.32:49662
   26/02/15 15:15:57 INFO BlockManager: Using 
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication 
policy
   26/02/15 15:15:57 INFO BlockManagerMaster: Registering BlockManager 
BlockManagerId(driver, 192.168.100.32, 49662, None)
   26/02/15 15:15:57 INFO BlockManagerMasterEndpoint: Registering block manager 
192.168.100.32:49662 with 2.4 GiB RAM, BlockManagerId(driver, 192.168.100.32, 
49662, None)
   26/02/15 15:15:57 INFO BlockManagerMaster: Registered BlockManager 
BlockManagerId(driver, 192.168.100.32, 49662, None)
   26/02/15 15:15:57 INFO BlockManager: Initialized BlockManager: 
BlockManagerId(driver, 192.168.100.32, 49662, None)
   26/02/15 15:15:58 INFO VeloxBackend: Gluten SQL Tab has been attached.
   26/02/15 15:15:58 INFO SparkShimLoader: Loading Spark Shims for version: 
3.5.5
   26/02/15 15:15:58 INFO SparkShimLoader: Using Shim provider: 
List(org.apache.gluten.sql.shims.spark35.SparkShimProvider@4339652b)
   
================================================================================
   Creating Iceberg table...
   
================================================================================
   26/02/15 15:15:58 INFO SharedState: Setting hive.metastore.warehouse.dir 
('null') to the value of spark.sql.warehouse.dir.
   26/02/15 15:15:58 INFO SharedState: Warehouse path is 
'file:/Users/reema/Desktop/OpenSource/incubator-gluten/spark-warehouse'.
   26/02/15 15:15:58 INFO CatalogUtil: Loading custom FileIO implementation: 
org.apache.iceberg.hadoop.HadoopFileIO
   26/02/15 15:15:59 INFO BaseMetastoreCatalog: Table properties set at catalog 
level through catalog properties: {}
   26/02/15 15:15:59 INFO BaseMetastoreCatalog: Table properties enforced at 
catalog level through catalog properties: {}
   26/02/15 15:15:59 INFO HadoopTableOperations: Committed a new metadata file 
file:/tmp/iceberg_warehouse/default/test_table/metadata/v1.metadata.json
   26/02/15 15:15:59 WARN GlutenFallbackReporter: Validation failed for plan: 
AppendData[QueryId=1], due to: [FallbackByBackendSettings] Validation failed on 
node AppendData
   26/02/15 15:15:59 INFO CodeGenerator: Code generated in 120.717458 ms
   26/02/15 15:15:59 INFO MemoryStore: Block broadcast_0 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:15:59 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes 
in memory (estimated size 29.7 KiB, free 2.4 GiB)
   26/02/15 15:15:59 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory 
on 192.168.100.32:49662 (size: 29.7 KiB, free: 2.4 GiB)
   26/02/15 15:15:59 INFO SparkContext: Created broadcast 0 from broadcast at 
SparkWrite.java:195
   26/02/15 15:15:59 INFO AppendDataExec: Start processing data source write 
support: IcebergBatchWrite(table=local.default.test_table, format=PARQUET). The 
input RDD has 3 partitions.
   26/02/15 15:15:59 INFO SparkContext: Starting job: sql at 
NativeMethodAccessorImpl.java:0
   26/02/15 15:15:59 INFO DAGScheduler: Got job 0 (sql at 
NativeMethodAccessorImpl.java:0) with 3 output partitions
   26/02/15 15:15:59 INFO DAGScheduler: Final stage: ResultStage 0 (sql at 
NativeMethodAccessorImpl.java:0)
   26/02/15 15:15:59 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:15:59 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:15:59 INFO DAGScheduler: Submitting ResultStage 0 
(MapPartitionsRDD[1] at sql at NativeMethodAccessorImpl.java:0), which has no 
missing parents
   26/02/15 15:15:59 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 7.8 KiB, free 2.4 GiB)
   26/02/15 15:15:59 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes 
in memory (estimated size 4.4 KiB, free 2.4 GiB)
   26/02/15 15:15:59 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory 
on 192.168.100.32:49662 (size: 4.4 KiB, free: 2.4 GiB)
   26/02/15 15:15:59 INFO SparkContext: Created broadcast 1 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:15:59 INFO DAGScheduler: Submitting 3 missing tasks from 
ResultStage 0 (MapPartitionsRDD[1] at sql at NativeMethodAccessorImpl.java:0) 
(first 15 tasks are for partitions Vector(0, 1, 2))
   26/02/15 15:15:59 INFO TaskSchedulerImpl: Adding task set 0.0 with 3 tasks 
resource profile 0
   26/02/15 15:15:59 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 
0) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 9503 bytes) 
   26/02/15 15:15:59 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 
1) (192.168.100.32, executor driver, partition 1, PROCESS_LOCAL, 9503 bytes) 
   26/02/15 15:15:59 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 
2) (192.168.100.32, executor driver, partition 2, PROCESS_LOCAL, 9503 bytes) 
   26/02/15 15:15:59 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
   26/02/15 15:15:59 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
   26/02/15 15:15:59 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
   26/02/15 15:16:00 INFO CodecPool: Got brand-new compressor [.zstd]
   26/02/15 15:16:00 INFO CodecPool: Got brand-new compressor [.zstd]
   26/02/15 15:16:00 INFO CodecPool: Got brand-new compressor [.zstd]
   26/02/15 15:16:00 INFO DataWritingSparkTask: Writer for partition 0 is 
committing.
   26/02/15 15:16:00 INFO DataWritingSparkTask: Writer for partition 2 is 
committing.
   26/02/15 15:16:00 INFO DataWritingSparkTask: Writer for partition 1 is 
committing.
   26/02/15 15:16:00 INFO DataWritingSparkTask: Committed partition 1 (task 1, 
attempt 0, stage 0.0)
   26/02/15 15:16:00 INFO DataWritingSparkTask: Committed partition 0 (task 0, 
attempt 0, stage 0.0)
   26/02/15 15:16:00 INFO DataWritingSparkTask: Committed partition 2 (task 2, 
attempt 0, stage 0.0)
   26/02/15 15:16:00 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 
4118 bytes result sent to driver
   26/02/15 15:16:00 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 
4114 bytes result sent to driver
   26/02/15 15:16:00 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 
4110 bytes result sent to driver
   26/02/15 15:16:00 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 
2) in 405 ms on 192.168.100.32 (executor driver) (1/3)
   26/02/15 15:16:00 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 
0) in 429 ms on 192.168.100.32 (executor driver) (2/3)
   26/02/15 15:16:00 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 
1) in 406 ms on 192.168.100.32 (executor driver) (3/3)
   26/02/15 15:16:00 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:00 INFO DAGScheduler: ResultStage 0 (sql at 
NativeMethodAccessorImpl.java:0) finished in 0.476 s
   26/02/15 15:16:00 INFO DAGScheduler: Job 0 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:00 INFO TaskSchedulerImpl: Killing all running tasks in stage 
0: Stage finished
   26/02/15 15:16:00 INFO DAGScheduler: Job 0 finished: sql at 
NativeMethodAccessorImpl.java:0, took 0.513478 s
   26/02/15 15:16:00 INFO AppendDataExec: Data source write support 
IcebergBatchWrite(table=local.default.test_table, format=PARQUET) is committing.
   26/02/15 15:16:00 INFO SparkWrite: Committing append with 3 new data files 
to table local.default.test_table
   26/02/15 15:16:00 INFO HadoopTableOperations: Committed a new metadata file 
file:/tmp/iceberg_warehouse/default/test_table/metadata/v2.metadata.json
   26/02/15 15:16:00 INFO SnapshotProducer: Committed snapshot 
8759041077900200141 (MergeAppend)
   26/02/15 15:16:00 INFO LoggingMetricsReporter: Received metrics report: 
CommitReport{tableName=local.default.test_table, 
snapshotId=8759041077900200141, sequenceNumber=1, operation=append, 
commitMetrics=CommitMetricsResult{totalDuration=TimerResult{timeUnit=NANOSECONDS,
 totalDuration=PT0.302477167S, count=1}, attempts=CounterResult{unit=COUNT, 
value=1}, addedDataFiles=CounterResult{unit=COUNT, value=3}, 
removedDataFiles=null, totalDataFiles=CounterResult{unit=COUNT, value=3}, 
addedDeleteFiles=null, addedEqualityDeleteFiles=null, 
addedPositionalDeleteFiles=null, addedDVs=null, removedDeleteFiles=null, 
removedEqualityDeleteFiles=null, removedPositionalDeleteFiles=null, 
removedDVs=null, totalDeleteFiles=CounterResult{unit=COUNT, value=0}, 
addedRecords=CounterResult{unit=COUNT, value=3}, removedRecords=null, 
totalRecords=CounterResult{unit=COUNT, value=3}, 
addedFilesSizeInBytes=CounterResult{unit=BYTES, value=1920}, 
removedFilesSizeInBytes=null, totalFilesSizeInBytes=CounterResult{uni
 t=BYTES, value=1920}, addedPositionalDeletes=null, 
removedPositionalDeletes=null, totalPositionalDeletes=CounterResult{unit=COUNT, 
value=0}, addedEqualityDeletes=null, removedEqualityDeletes=null, 
totalEqualityDeletes=CounterResult{unit=COUNT, value=0}, manifestsCreated=null, 
manifestsReplaced=null, manifestsKept=null, manifestEntriesProcessed=null}, 
metadata={engine-version=3.5.5, app-id=local-1771157757901, engine-name=spark, 
iceberg-version=Apache Iceberg 1.10.0 (commit 
2114bf631e49af532d66e2ce148ee49dd1dd1f1f)}}
   26/02/15 15:16:00 INFO SparkWrite: Committed in 322 ms
   26/02/15 15:16:00 INFO AppendDataExec: Data source write support 
IcebergBatchWrite(table=local.default.test_table, format=PARQUET) committed.
   
   
================================================================================
   Testing input_file_name() on Iceberg table
   
================================================================================
   
   === input_file_name() Results ===
   26/02/15 15:16:00 INFO V2ScanRelationPushDown: 
   Output: id#7, name#8
            
   26/02/15 15:16:00 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 8759041077900200141 created at 2026-02-15T12:16:00.623+00:00 with 
filter true
   26/02/15 15:16:00 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:00 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:00 INFO MemoryStore: Block broadcast_2 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:00 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:00 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:00 INFO SparkContext: Created broadcast 2 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:00 INFO MemoryStore: Block broadcast_3 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:00 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:00 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:00 INFO SparkContext: Created broadcast 3 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:00 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=2], due to: fallback input file expression
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 8.318542 ms
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_4 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 4 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:01 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:01 INFO DAGScheduler: Got job 1 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56) 
with 1 output partitions
   26/02/15 15:16:01 INFO DAGScheduler: Final stage: ResultStage 1 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56)
   26/02/15 15:16:01 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:01 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:01 INFO DAGScheduler: Submitting ResultStage 1 
(MapPartitionsRDD[5] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56), 
which has no missing parents
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_5 stored as values in 
memory (estimated size 16.7 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes 
in memory (estimated size 7.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory 
on 192.168.100.32:49662 (size: 7.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 5 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:01 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 1 (MapPartitionsRDD[5] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:01 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 
3) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11471 bytes) 
   26/02/15 15:16:01 INFO Executor: Running task 0.0 in stage 1.0 (TID 3)
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 4.898333 ms
   26/02/15 15:16:01 INFO Executor: Finished task 0.0 in stage 1.0 (TID 3). 
7086 bytes result sent to driver
   26/02/15 15:16:01 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 
3) in 59 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:01 INFO DAGScheduler: ResultStage 1 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56) 
finished in 0.064 s
   26/02/15 15:16:01 INFO DAGScheduler: Job 1 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Killing all running tasks in stage 
1: Stage finished
   26/02/15 15:16:01 INFO DAGScheduler: Job 1 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56, 
took 0.067347 s
   ID: 1, Name: Alice, File: ''
   ID: 2, Name: Bob, File: ''
   ID: 3, Name: Charlie, File: ''
   
   ❌ BUG: 3/3 rows have EMPTY file paths!
   
   
================================================================================
   Testing input_file_block_start() on Iceberg table
   
================================================================================
   
   === input_file_block_start() Results ===
   26/02/15 15:16:01 INFO V2ScanRelationPushDown: 
   Output: id#23, name#24
            
   26/02/15 15:16:01 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 8759041077900200141 created at 2026-02-15T12:16:00.623+00:00 with 
filter true
   26/02/15 15:16:01 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:01 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_6 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 6 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_7 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_7_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 7 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=3], due to: fallback input file expression
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 4.856583 ms
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_8 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 8 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:01 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:01 INFO DAGScheduler: Got job 2 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82) 
with 1 output partitions
   26/02/15 15:16:01 INFO DAGScheduler: Final stage: ResultStage 2 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82)
   26/02/15 15:16:01 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:01 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:01 INFO DAGScheduler: Submitting ResultStage 2 
(MapPartitionsRDD[9] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82), 
which has no missing parents
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_9 stored as values in 
memory (estimated size 16.7 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes 
in memory (estimated size 7.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory 
on 192.168.100.32:49662 (size: 7.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 9 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:01 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 2 (MapPartitionsRDD[9] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:01 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 
4) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11473 bytes) 
   26/02/15 15:16:01 INFO Executor: Running task 0.0 in stage 2.0 (TID 4)
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 4.496625 ms
   26/02/15 15:16:01 INFO Executor: Finished task 0.0 in stage 2.0 (TID 4). 
7037 bytes result sent to driver
   26/02/15 15:16:01 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 
4) in 14 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:01 INFO DAGScheduler: ResultStage 2 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82) 
finished in 0.017 s
   26/02/15 15:16:01 INFO DAGScheduler: Job 2 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Killing all running tasks in stage 
2: Stage finished
   26/02/15 15:16:01 INFO DAGScheduler: Job 2 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82, 
took 0.018839 s
   ID: 1, Name: Alice, Block Start: -1
   ID: 2, Name: Bob, Block Start: -1
   ID: 3, Name: Charlie, Block Start: -1
   
   ❌ BUG: Some rows have invalid block start positions!
   
   
================================================================================
   Testing input_file_block_length() on Iceberg table
   
================================================================================
   
   === input_file_block_length() Results ===
   26/02/15 15:16:01 INFO V2ScanRelationPushDown: 
   Output: id#39, name#40
            
   26/02/15 15:16:01 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 8759041077900200141 created at 2026-02-15T12:16:00.623+00:00 with 
filter true
   26/02/15 15:16:01 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:01 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_10 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_10_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_10_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 10 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_11 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_11_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_11_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 11 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=4], due to: fallback input file expression
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 5.000667 ms
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_12 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_12_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_12_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 12 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:01 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:01 INFO DAGScheduler: Got job 3 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106) 
with 1 output partitions
   26/02/15 15:16:01 INFO DAGScheduler: Final stage: ResultStage 3 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106)
   26/02/15 15:16:01 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:01 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:01 INFO DAGScheduler: Submitting ResultStage 3 
(MapPartitionsRDD[13] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106), 
which has no missing parents
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_13 stored as values in 
memory (estimated size 16.7 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_13_piece0 stored as 
bytes in memory (estimated size 7.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_13_piece0 in memory 
on 192.168.100.32:49662 (size: 7.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 13 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:01 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 3 (MapPartitionsRDD[13] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:01 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 
5) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11473 bytes) 
   26/02/15 15:16:01 INFO Executor: Running task 0.0 in stage 3.0 (TID 5)
   26/02/15 15:16:01 INFO BlockManagerInfo: Removed broadcast_9_piece0 on 
192.168.100.32:49662 in memory (size: 7.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 4.27675 ms
   26/02/15 15:16:01 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 
192.168.100.32:49662 in memory (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 
192.168.100.32:49662 in memory (size: 4.4 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Removed broadcast_11_piece0 on 
192.168.100.32:49662 in memory (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO Executor: Finished task 0.0 in stage 3.0 (TID 5). 
7037 bytes result sent to driver
   26/02/15 15:16:01 INFO BlockManagerInfo: Removed broadcast_7_piece0 on 
192.168.100.32:49662 in memory (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 
5) in 12 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:01 INFO DAGScheduler: ResultStage 3 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106) 
finished in 0.020 s
   26/02/15 15:16:01 INFO DAGScheduler: Job 3 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Killing all running tasks in stage 
3: Stage finished
   26/02/15 15:16:01 INFO DAGScheduler: Job 3 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106, 
took 0.020872 s
   26/02/15 15:16:01 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
192.168.100.32:49662 in memory (size: 29.7 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Removed broadcast_5_piece0 on 
192.168.100.32:49662 in memory (size: 7.0 KiB, free: 2.4 GiB)
   ID: 1, Name: Alice, Block Length: -1
   ID: 2, Name: Bob, Block Length: -1
   ID: 3, Name: Charlie, Block Length: -1
   
   ❌ BUG: Some rows have invalid block lengths!
   
   
================================================================================
   Testing all three metadata functions together
   
================================================================================
   
   === All Metadata Functions Results ===
   26/02/15 15:16:01 INFO V2ScanRelationPushDown: 
   Output: id#57, name#58
            
   26/02/15 15:16:01 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 8759041077900200141 created at 2026-02-15T12:16:00.623+00:00 with 
filter true
   26/02/15 15:16:01 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:01 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_14 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_14_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_14_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 14 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_15 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_15_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_15_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 15 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:01 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=5], due to: fallback input file expression
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 7.722917 ms
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_16 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_16_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_16_piece0 in memory 
on 192.168.100.32:49662 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 16 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:01 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:01 INFO DAGScheduler: Got job 4 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132) 
with 1 output partitions
   26/02/15 15:16:01 INFO DAGScheduler: Final stage: ResultStage 4 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132)
   26/02/15 15:16:01 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:01 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:01 INFO DAGScheduler: Submitting ResultStage 4 
(MapPartitionsRDD[17] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132), 
which has no missing parents
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_17 stored as values in 
memory (estimated size 17.0 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO MemoryStore: Block broadcast_17_piece0 stored as 
bytes in memory (estimated size 7.1 KiB, free 2.4 GiB)
   26/02/15 15:16:01 INFO BlockManagerInfo: Added broadcast_17_piece0 in memory 
on 192.168.100.32:49662 (size: 7.1 KiB, free: 2.4 GiB)
   26/02/15 15:16:01 INFO SparkContext: Created broadcast 17 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:01 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 4 (MapPartitionsRDD[17] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Adding task set 4.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:01 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 
6) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11473 bytes) 
   26/02/15 15:16:01 INFO Executor: Running task 0.0 in stage 4.0 (TID 6)
   26/02/15 15:16:01 INFO CodeGenerator: Code generated in 4.799166 ms
   26/02/15 15:16:01 INFO Executor: Finished task 0.0 in stage 4.0 (TID 6). 
7049 bytes result sent to driver
   26/02/15 15:16:01 INFO TaskSetManager: Finished task 0.0 in stage 4.0 (TID 
6) in 23 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:01 INFO DAGScheduler: ResultStage 4 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132) 
finished in 0.027 s
   26/02/15 15:16:01 INFO DAGScheduler: Job 4 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:01 INFO TaskSchedulerImpl: Killing all running tasks in stage 
4: Stage finished
   26/02/15 15:16:01 INFO DAGScheduler: Job 4 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132, 
took 0.029916 s
   ID: 1, Name: Alice
     File: ''
     Block Start: -1
     Block Length: -1
   
   ID: 2, Name: Bob
     File: ''
     Block Start: -1
     Block Length: -1
   
   ID: 3, Name: Charlie
     File: ''
     Block Start: -1
     Block Length: -1
   
   ❌ SOME TESTS FAILED: Check the output above for details
   26/02/15 15:16:01 INFO SparkContext: SparkContext is stopping with exitCode 
0.
   26/02/15 15:16:01 INFO SparkUI: Stopped Spark web UI at 
http://192.168.100.32:4040
   26/02/15 15:16:01 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   26/02/15 15:16:01 INFO MemoryStore: MemoryStore cleared
   26/02/15 15:16:01 INFO BlockManager: BlockManager stopped
   26/02/15 15:16:01 INFO BlockManagerMaster: BlockManagerMaster stopped
   26/02/15 15:16:01 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   26/02/15 15:16:01 INFO SparkContext: Successfully stopped SparkContext
   26/02/15 15:16:02 INFO ShutdownHookManager: Shutdown hook called
   26/02/15 15:16:02 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/spark-63dfb5d6-fcec-4e7f-a9a3-af9007b62490
   26/02/15 15:16:02 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/spark-d0203fd8-92c9-42cb-a2b3-e3437e2b4a37
   26/02/15 15:16:02 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/spark-63dfb5d6-fcec-4e7f-a9a3-af9007b62490/pyspark-11640bfd-7a71-41d6-8134-9eca7086ac34
   
   [Process completed]
   ```
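
The ❌ failures in the run above all reduce to the same invariant: `input_file_name()` must return a real path and `input_file_block_start()`/`input_file_block_length()` must be non-negative once the scan populates file-split metadata. Here is a minimal, self-contained sketch of that pass/fail check (the function name and sample rows are illustrative only, not the actual contents of `test_iceberg_simple.py`):

```python
def validate_block_metadata(rows):
    """Return True only if every row carries real file-split metadata.

    Spark's input_file_name() returns '' and input_file_block_start()/
    input_file_block_length() return -1 when the metadata was never set
    for the scan -- exactly the symptom in the "Before" log above.
    """
    for file_name, block_start, block_length in rows:
        if file_name == "" or block_start < 0 or block_length <= 0:
            return False
    return True


# "Before" behaviour: all three rows come back with empty/-1 metadata.
before_rows = [("", -1, -1), ("", -1, -1), ("", -1, -1)]
print(validate_block_metadata(before_rows))  # False -> test reports the BUG

# Expected "After" behaviour: a real file path and valid offsets
# (path and length here are made-up placeholders).
after_rows = [("file:/tmp/iceberg_warehouse/default/test_table/data/part-0.parquet", 0, 643)]
print(validate_block_metadata(after_rows))   # True -> test passes
```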
   
   ### After
   ```
   Last login: Sun Feb 15 15:15:06 on ttys072
   (3.13.3) ➜  incubator-gluten git:(main) ✗ export 
GLUTEN_JAR=/Users/reema/Desktop/OpenSource/incubator-gluten/package/target/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-iceberg-fix.jar
   export ICEBERG_JAR=/tmp/iceberg.jar
   
   spark-submit \
     --jars "$GLUTEN_JAR,$ICEBERG_JAR" \
     --conf spark.plugins=org.apache.gluten.GlutenPlugin \
     --conf spark.gluten.sql.columnar.backend.lib=velox \
     --conf spark.gluten.enabled=true \
     --conf spark.driver.extraClassPath="$GLUTEN_JAR:$ICEBERG_JAR" \
     --conf spark.executor.extraClassPath="$GLUTEN_JAR:$ICEBERG_JAR" \
     test_iceberg_simple.py
   26/02/15 15:16:15 WARN Utils: Your hostname, Reemas-MacBook-Pro.local 
resolves to a loopback address: 127.0.0.1; using 192.168.100.32 instead (on 
interface en0)
   26/02/15 15:16:15 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to 
another address
   26/02/15 15:16:16 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
   26/02/15 15:16:16 INFO SparkContext: Running Spark version 3.5.5
   26/02/15 15:16:16 INFO SparkContext: OS info Mac OS X, 15.6, aarch64
   26/02/15 15:16:16 INFO SparkContext: Java version 17.0.18
   26/02/15 15:16:16 INFO ResourceUtils: 
==============================================================
   26/02/15 15:16:16 INFO ResourceUtils: No custom resources configured for 
spark.driver.
   26/02/15 15:16:16 INFO ResourceUtils: 
==============================================================
   26/02/15 15:16:16 INFO SparkContext: Submitted application: 
iceberg-input-file-metadata-test
   26/02/15 15:16:16 INFO ResourceProfile: Default ResourceProfile created, 
executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , 
memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: 
offHeap, amount: 2048, script: , vendor: ), task resources: Map(cpus -> name: 
cpus, amount: 1.0)
   26/02/15 15:16:16 INFO ResourceProfile: Limiting resource is cpu
   26/02/15 15:16:16 INFO ResourceProfileManager: Added ResourceProfile id: 0
   26/02/15 15:16:16 INFO SecurityManager: Changing view acls to: reema
   26/02/15 15:16:16 INFO SecurityManager: Changing modify acls to: reema
   26/02/15 15:16:16 INFO SecurityManager: Changing view acls groups to: 
   26/02/15 15:16:16 INFO SecurityManager: Changing modify acls groups to: 
   26/02/15 15:16:16 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: reema; groups with 
view permissions: EMPTY; users with modify permissions: reema; groups with 
modify permissions: EMPTY
   26/02/15 15:16:17 INFO Utils: Successfully started service 'sparkDriver' on 
port 49683.
   26/02/15 15:16:17 INFO SparkEnv: Registering MapOutputTracker
   26/02/15 15:16:17 INFO SparkEnv: Registering BlockManagerMaster
   26/02/15 15:16:17 INFO BlockManagerMasterEndpoint: Using 
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   26/02/15 15:16:17 INFO BlockManagerMasterEndpoint: 
BlockManagerMasterEndpoint up
   26/02/15 15:16:17 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
   26/02/15 15:16:17 INFO DiskBlockManager: Created local directory at 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/blockmgr-28fb4afe-55ae-4ae3-b6bb-b9ce02a8e490
   26/02/15 15:16:17 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:17 INFO SparkEnv: Registering OutputCommitCoordinator
   26/02/15 15:16:17 INFO JettyUtils: Start Jetty 0.0.0.0:4040 for SparkUI
   26/02/15 15:16:17 INFO Utils: Successfully started service 'SparkUI' on port 
4040.
   26/02/15 15:16:17 INFO SparkContext: Added JAR 
file:///Users/reema/Desktop/OpenSource/incubator-gluten/package/target/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-iceberg-fix.jar
 at 
spark://192.168.100.32:49683/jars/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-iceberg-fix.jar
 with timestamp 1771157776861
   26/02/15 15:16:17 INFO SparkContext: Added JAR 
file:///private/tmp/iceberg.jar at 
spark://192.168.100.32:49683/jars/iceberg.jar with timestamp 1771157776861
   26/02/15 15:16:17 INFO Discovery: Start discovering components in the 
current classpath... 
   26/02/15 15:16:17 INFO Discovery: Discovered component files: 
org.apache.gluten.backendsapi.velox.VeloxBackend, 
org.apache.gluten.component.VeloxIcebergComponent. Duration: 4 ms.
   26/02/15 15:16:17 INFO package: Components registered within order: velox, 
velox-iceberg
   26/02/15 15:16:17 INFO GlutenDriverPlugin: Gluten components:
   ==============================================================
   Component velox
     velox_branch = HEAD
     velox_revision = f247a8e922c4802fd9b9cf7a626421bff9b803fd
     velox_revisionTime = 2026-02-07 14:11:45 +0000
   Component velox-iceberg
   ==============================================================
   26/02/15 15:16:17 INFO SubstraitBackend: Gluten build info:
   ==============================================================
   Gluten Version: 1.7.0-SNAPSHOT
   GCC Version: 
   Java Version: 17
   Scala Version: 2.12.15
   Spark Version: 3.5.5
   Hadoop Version: 2.7.4
   Gluten Branch: iceberg-input-file
   Gluten Revision: bdb1f9117dc415d0c42c89fbd5533844bfa17b85
   Gluten Revision Time: 2026-02-15 01:29:56 +0300
   Gluten Build Time: 2026-02-15T11:32:58Z
   Gluten Repo URL: https://github.com/ReemaAlzaid/incubator-gluten.git
   ==============================================================
   26/02/15 15:16:17 INFO VeloxListenerApi: Memory overhead is not set. Setting 
it to 644245094 automatically. Gluten doesn't follow Spark's calculation on 
default value of this option because the actual required memory overhead will 
depend on off-heap usage than on on-heap usage.
   26/02/15 15:16:17 INFO SparkDirectoryUtil: Created local directory at 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e
   26/02/15 15:16:17 INFO JniWorkspace: Creating JNI workspace in root 
directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27
   26/02/15 15:16:17 INFO JniWorkspace: JNI workspace 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019
 created in root directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27
   26/02/15 15:16:17 INFO JniLibLoader: Read real path 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libgluten.dylib
 for libPath 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libgluten.dylib
   26/02/15 15:16:17 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libgluten.dylib
 has been loaded using path-loading method
   26/02/15 15:16:17 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libgluten.dylib
 has been loaded
   26/02/15 15:16:17 INFO JniLibLoader: Successfully loaded library 
darwin/aarch64/libgluten.dylib
   26/02/15 15:16:17 INFO JniLibLoader: Read real path 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libvelox.dylib
 for libPath 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libvelox.dylib
   26/02/15 15:16:17 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libvelox.dylib
 has been loaded using path-loading method
   26/02/15 15:16:17 INFO JniLibLoader: Library 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/gluten-87fb8dd4-e5f3-48f1-8ab9-480ad6d7553e/jni/a633327e-1f81-4a3f-8c1b-cf20ab0d3f27/gluten-7854201777374203019/darwin/aarch64/libvelox.dylib
 has been loaded
   26/02/15 15:16:17 INFO JniLibLoader: Successfully loaded library 
darwin/aarch64/libvelox.dylib
   W20260215 15:16:17.696556 14493069 MemoryArbitrator.cpp:84] Query memory 
capacity[460.50MB] is set for NOOP arbitrator which has no capacity enforcement
   26/02/15 15:16:17 INFO DriverPluginContainer: Initialized driver component 
for plugin org.apache.gluten.GlutenPlugin.
   26/02/15 15:16:17 INFO Executor: Starting executor ID driver on host 
192.168.100.32
   26/02/15 15:16:17 INFO Executor: OS info Mac OS X, 15.6, aarch64
   26/02/15 15:16:17 INFO Executor: Java version 17.0.18
   26/02/15 15:16:17 INFO Executor: Starting executor with user classpath 
(userClassPathFirst = false): 
'file:/Users/reema/Desktop/OpenSource/incubator-gluten/package/target/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-iceberg-fix.jar,file:/tmp/iceberg.jar,file:/Users/reema/Desktop/OpenSource/incubator-gluten/gluten-velox-bundle-spark3.5_2.12-darwin_aarch64-1.7.0-iceberg-fix.jar,file:/Users/reema/Desktop/OpenSource/incubator-gluten/iceberg.jar'
   26/02/15 15:16:17 INFO Executor: Created or updated repl class loader 
org.apache.spark.util.MutableURLClassLoader@3a7933da for default.
   26/02/15 15:16:17 INFO CodedInputStreamClassInitializer: The 
defaultRecursionLimit in protobuf has been increased to 100000
   26/02/15 15:16:17 INFO VeloxListenerApi: Gluten is running with Spark local 
mode. Skip running static initializer for executor.
   26/02/15 15:16:17 INFO ExecutorPluginContainer: Initialized executor 
component for plugin org.apache.gluten.GlutenPlugin.
   26/02/15 15:16:17 INFO Utils: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 49684.
   26/02/15 15:16:17 INFO NettyBlockTransferService: Server created on 
192.168.100.32:49684
   26/02/15 15:16:17 INFO BlockManager: Using 
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication 
policy
   26/02/15 15:16:17 INFO BlockManagerMaster: Registering BlockManager 
BlockManagerId(driver, 192.168.100.32, 49684, None)
   26/02/15 15:16:17 INFO BlockManagerMasterEndpoint: Registering block manager 
192.168.100.32:49684 with 2.4 GiB RAM, BlockManagerId(driver, 192.168.100.32, 
49684, None)
   26/02/15 15:16:17 INFO BlockManagerMaster: Registered BlockManager 
BlockManagerId(driver, 192.168.100.32, 49684, None)
   26/02/15 15:16:17 INFO BlockManager: Initialized BlockManager: 
BlockManagerId(driver, 192.168.100.32, 49684, None)
   26/02/15 15:16:17 INFO VeloxBackend: Gluten SQL Tab has been attached.
   26/02/15 15:16:17 INFO SparkShimLoader: Loading Spark Shims for version: 
3.5.5
   26/02/15 15:16:17 INFO SparkShimLoader: Using Shim provider: 
List(org.apache.gluten.sql.shims.spark35.SparkShimProvider@4d028882)
   
================================================================================
   Creating Iceberg table...
   
================================================================================
   26/02/15 15:16:17 INFO SharedState: Setting hive.metastore.warehouse.dir 
('null') to the value of spark.sql.warehouse.dir.
   26/02/15 15:16:17 INFO SharedState: Warehouse path is 
'file:/Users/reema/Desktop/OpenSource/incubator-gluten/spark-warehouse'.
   26/02/15 15:16:18 INFO CatalogUtil: Loading custom FileIO implementation: 
org.apache.iceberg.hadoop.HadoopFileIO
   26/02/15 15:16:18 INFO BaseMetastoreCatalog: Table properties set at catalog 
level through catalog properties: {}
   26/02/15 15:16:18 INFO BaseMetastoreCatalog: Table properties enforced at 
catalog level through catalog properties: {}
   26/02/15 15:16:18 INFO HadoopTableOperations: Committed a new metadata file 
file:/tmp/iceberg_warehouse/default/test_table/metadata/v1.metadata.json
   26/02/15 15:16:19 WARN GlutenFallbackReporter: Validation failed for plan: 
AppendData[QueryId=1], due to: [FallbackByBackendSettings] Validation failed on 
node AppendData
   26/02/15 15:16:19 INFO CodeGenerator: Code generated in 84.452875 ms
   26/02/15 15:16:19 INFO MemoryStore: Block broadcast_0 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:19 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes 
in memory (estimated size 29.7 KiB, free 2.4 GiB)
   26/02/15 15:16:19 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory 
on 192.168.100.32:49684 (size: 29.7 KiB, free: 2.4 GiB)
   26/02/15 15:16:19 INFO SparkContext: Created broadcast 0 from broadcast at 
SparkWrite.java:195
   26/02/15 15:16:19 INFO AppendDataExec: Start processing data source write 
support: IcebergBatchWrite(table=local.default.test_table, format=PARQUET). The 
input RDD has 3 partitions.
   26/02/15 15:16:19 INFO SparkContext: Starting job: sql at 
NativeMethodAccessorImpl.java:0
   26/02/15 15:16:19 INFO DAGScheduler: Got job 0 (sql at 
NativeMethodAccessorImpl.java:0) with 3 output partitions
   26/02/15 15:16:19 INFO DAGScheduler: Final stage: ResultStage 0 (sql at 
NativeMethodAccessorImpl.java:0)
   26/02/15 15:16:19 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:19 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:19 INFO DAGScheduler: Submitting ResultStage 0 
(MapPartitionsRDD[1] at sql at NativeMethodAccessorImpl.java:0), which has no 
missing parents
   26/02/15 15:16:19 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 7.8 KiB, free 2.4 GiB)
   26/02/15 15:16:19 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes 
in memory (estimated size 4.4 KiB, free 2.4 GiB)
   26/02/15 15:16:19 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory 
on 192.168.100.32:49684 (size: 4.4 KiB, free: 2.4 GiB)
   26/02/15 15:16:19 INFO SparkContext: Created broadcast 1 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:19 INFO DAGScheduler: Submitting 3 missing tasks from 
ResultStage 0 (MapPartitionsRDD[1] at sql at NativeMethodAccessorImpl.java:0) 
(first 15 tasks are for partitions Vector(0, 1, 2))
   26/02/15 15:16:19 INFO TaskSchedulerImpl: Adding task set 0.0 with 3 tasks 
resource profile 0
   26/02/15 15:16:19 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 
0) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 9506 bytes) 
   26/02/15 15:16:19 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 
1) (192.168.100.32, executor driver, partition 1, PROCESS_LOCAL, 9506 bytes) 
   26/02/15 15:16:19 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 
2) (192.168.100.32, executor driver, partition 2, PROCESS_LOCAL, 9506 bytes) 
   26/02/15 15:16:19 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
   26/02/15 15:16:19 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
   26/02/15 15:16:19 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
   26/02/15 15:16:19 INFO CodecPool: Got brand-new compressor [.zstd]
   26/02/15 15:16:19 INFO CodecPool: Got brand-new compressor [.zstd]
   26/02/15 15:16:19 INFO CodecPool: Got brand-new compressor [.zstd]
   26/02/15 15:16:19 INFO DataWritingSparkTask: Writer for partition 1 is 
committing.
   26/02/15 15:16:19 INFO DataWritingSparkTask: Writer for partition 0 is 
committing.
   26/02/15 15:16:19 INFO DataWritingSparkTask: Writer for partition 2 is 
committing.
   26/02/15 15:16:19 INFO DataWritingSparkTask: Committed partition 0 (task 0, 
attempt 0, stage 0.0)
   26/02/15 15:16:19 INFO DataWritingSparkTask: Committed partition 2 (task 2, 
attempt 0, stage 0.0)
   26/02/15 15:16:19 INFO DataWritingSparkTask: Committed partition 1 (task 1, 
attempt 0, stage 0.0)
   26/02/15 15:16:19 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 
4161 bytes result sent to driver
   26/02/15 15:16:19 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 
4157 bytes result sent to driver
   26/02/15 15:16:19 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 
4153 bytes result sent to driver
   26/02/15 15:16:19 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 
0) in 401 ms on 192.168.100.32 (executor driver) (1/3)
   26/02/15 15:16:19 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 
1) in 393 ms on 192.168.100.32 (executor driver) (2/3)
   26/02/15 15:16:19 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 
2) in 393 ms on 192.168.100.32 (executor driver) (3/3)
   26/02/15 15:16:19 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:19 INFO DAGScheduler: ResultStage 0 (sql at 
NativeMethodAccessorImpl.java:0) finished in 0.450 s
   26/02/15 15:16:19 INFO DAGScheduler: Job 0 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:19 INFO TaskSchedulerImpl: Killing all running tasks in stage 
0: Stage finished
   26/02/15 15:16:19 INFO DAGScheduler: Job 0 finished: sql at 
NativeMethodAccessorImpl.java:0, took 0.477432 s
   26/02/15 15:16:19 INFO AppendDataExec: Data source write support 
IcebergBatchWrite(table=local.default.test_table, format=PARQUET) is committing.
   26/02/15 15:16:19 INFO SparkWrite: Committing append with 3 new data files 
to table local.default.test_table
   26/02/15 15:16:20 INFO HadoopTableOperations: Committed a new metadata file 
file:/tmp/iceberg_warehouse/default/test_table/metadata/v2.metadata.json
   26/02/15 15:16:20 INFO SnapshotProducer: Committed snapshot 
7722039398521868759 (MergeAppend)
   26/02/15 15:16:20 INFO LoggingMetricsReporter: Received metrics report: 
CommitReport{tableName=local.default.test_table, 
snapshotId=7722039398521868759, sequenceNumber=1, operation=append, 
commitMetrics=CommitMetricsResult{totalDuration=TimerResult{timeUnit=NANOSECONDS,
 totalDuration=PT0.239004458S, count=1}, attempts=CounterResult{unit=COUNT, 
value=1}, addedDataFiles=CounterResult{unit=COUNT, value=3}, 
removedDataFiles=null, totalDataFiles=CounterResult{unit=COUNT, value=3}, 
addedDeleteFiles=null, addedEqualityDeleteFiles=null, 
addedPositionalDeleteFiles=null, addedDVs=null, removedDeleteFiles=null, 
removedEqualityDeleteFiles=null, removedPositionalDeleteFiles=null, 
removedDVs=null, totalDeleteFiles=CounterResult{unit=COUNT, value=0}, 
addedRecords=CounterResult{unit=COUNT, value=3}, removedRecords=null, 
totalRecords=CounterResult{unit=COUNT, value=3}, 
addedFilesSizeInBytes=CounterResult{unit=BYTES, value=1920}, 
removedFilesSizeInBytes=null, totalFilesSizeInBytes=CounterResult{unit=BYTES, 
value=1920}, addedPositionalDeletes=null, 
removedPositionalDeletes=null, totalPositionalDeletes=CounterResult{unit=COUNT, 
value=0}, addedEqualityDeletes=null, removedEqualityDeletes=null, 
totalEqualityDeletes=CounterResult{unit=COUNT, value=0}, manifestsCreated=null, 
manifestsReplaced=null, manifestsKept=null, manifestEntriesProcessed=null}, 
metadata={engine-version=3.5.5, app-id=local-1771157777707, engine-name=spark, 
iceberg-version=Apache Iceberg 1.10.0 (commit 
2114bf631e49af532d66e2ce148ee49dd1dd1f1f)}}
   26/02/15 15:16:20 INFO SparkWrite: Committed in 255 ms
   26/02/15 15:16:20 INFO AppendDataExec: Data source write support 
IcebergBatchWrite(table=local.default.test_table, format=PARQUET) committed.
   
   
================================================================================
   Testing input_file_name() on Iceberg table
   
================================================================================
   
   === input_file_name() Results ===
   26/02/15 15:16:20 INFO V2ScanRelationPushDown: 
   Output: id#7, name#8
            
   26/02/15 15:16:20 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 7722039398521868759 created at 2026-02-15T12:16:20.033+00:00 with 
filter true
   26/02/15 15:16:20 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:20 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_2 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes 
in memory (estimated size 30.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory 
on 192.168.100.32:49684 (size: 30.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 2 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_3 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory 
on 192.168.100.32:49684 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 3 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=2], due to: fallback input file expression
   26/02/15 15:16:20 INFO CodeGenerator: Code generated in 15.250084 ms
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_4 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes 
in memory (estimated size 30.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory 
on 192.168.100.32:49684 (size: 30.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 4 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:20 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 
192.168.100.32:49684 in memory (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 
192.168.100.32:49684 in memory (size: 4.4 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
192.168.100.32:49684 in memory (size: 29.7 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56
   26/02/15 15:16:20 INFO DAGScheduler: Got job 1 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56) 
with 1 output partitions
   26/02/15 15:16:20 INFO DAGScheduler: Final stage: ResultStage 1 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56)
   26/02/15 15:16:20 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:20 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:20 INFO DAGScheduler: Submitting ResultStage 1 
(MapPartitionsRDD[8] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56), 
which has no missing parents
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_5 stored as values in 
memory (estimated size 29.5 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes 
in memory (estimated size 11.7 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory 
on 192.168.100.32:49684 (size: 11.7 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 5 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:20 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 1 (MapPartitionsRDD[8] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:20 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 
3) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11766 bytes) 
   26/02/15 15:16:20 INFO Executor: Running task 0.0 in stage 1.0 (TID 3)
   26/02/15 15:16:20 INFO CodeGenerator: Code generated in 7.91075 ms
   26/02/15 15:16:20 INFO BaseAllocator: Debug mode disabled. Enable with the 
VM option -Darrow.memory.debug.allocator=true.
   26/02/15 15:16:20 INFO DefaultAllocationManagerOption: allocation manager 
type not specified, using netty as the default type
   26/02/15 15:16:20 INFO CheckAllocator: Using DefaultAllocationManager at 
memory/DefaultAllocationManagerFactory.class
   26/02/15 15:16:20 INFO CodeGenerator: Code generated in 4.909666 ms
   26/02/15 15:16:20 INFO Executor: Finished task 0.0 in stage 1.0 (TID 3). 
8350 bytes result sent to driver
   26/02/15 15:16:20 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 
3) in 118 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:20 INFO DAGScheduler: ResultStage 1 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56) 
finished in 0.122 s
   26/02/15 15:16:20 INFO DAGScheduler: Job 1 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Killing all running tasks in stage 
1: Stage finished
   26/02/15 15:16:20 INFO DAGScheduler: Job 1 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:56, 
took 0.123958 s
   ID: 1, Name: Alice, File: 
'file:/tmp/iceberg_warehouse/default/test_table/data/00000-0-2b0c5d04-98fb-4cda-bdf3-7dac021f8032-0-00001.parquet'
   ID: 2, Name: Bob, File: 
'file:/tmp/iceberg_warehouse/default/test_table/data/00001-1-2b0c5d04-98fb-4cda-bdf3-7dac021f8032-0-00001.parquet'
   ID: 3, Name: Charlie, File: 
'file:/tmp/iceberg_warehouse/default/test_table/data/00002-2-2b0c5d04-98fb-4cda-bdf3-7dac021f8032-0-00001.parquet'
   
   ✅ SUCCESS: All 3 rows have valid file paths
   
   
================================================================================
   Testing input_file_block_start() on Iceberg table
   
================================================================================
   
   === input_file_block_start() Results ===
   26/02/15 15:16:20 INFO V2ScanRelationPushDown: 
   Output: id#24, name#25
            
   26/02/15 15:16:20 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 7722039398521868759 created at 2026-02-15T12:16:20.033+00:00 with 
filter true
   26/02/15 15:16:20 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:20 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_6 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory 
on 192.168.100.32:49684 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 6 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_7 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_7_piece0 stored as bytes 
in memory (estimated size 30.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory 
on 192.168.100.32:49684 (size: 30.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 7 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO BlockManagerInfo: Removed broadcast_5_piece0 on 
192.168.100.32:49684 in memory (size: 11.7 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=3], due to: fallback input file expression
   26/02/15 15:16:20 INFO CodeGenerator: Code generated in 7.716333 ms
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_8 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes 
in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory 
on 192.168.100.32:49684 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 8 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:20 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82
   26/02/15 15:16:20 INFO DAGScheduler: Got job 2 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82) 
with 1 output partitions
   26/02/15 15:16:20 INFO DAGScheduler: Final stage: ResultStage 2 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82)
   26/02/15 15:16:20 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:20 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:20 INFO DAGScheduler: Submitting ResultStage 2 
(MapPartitionsRDD[15] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82), 
which has no missing parents
   26/02/15 15:16:20 INFO BlockManagerInfo: Removed broadcast_7_piece0 on 
192.168.100.32:49684 in memory (size: 30.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_9 stored as values in 
memory (estimated size 29.7 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes 
in memory (estimated size 11.8 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory 
on 192.168.100.32:49684 (size: 11.8 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 9 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:20 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 2 (MapPartitionsRDD[15] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:20 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 
4) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11786 bytes) 
   26/02/15 15:16:20 INFO Executor: Running task 0.0 in stage 2.0 (TID 4)
   26/02/15 15:16:20 INFO CodeGenerator: Code generated in 3.413875 ms
   26/02/15 15:16:20 INFO CodeGenerator: Code generated in 4.24975 ms
   26/02/15 15:16:20 INFO Executor: Finished task 0.0 in stage 2.0 (TID 4). 
8212 bytes result sent to driver
   26/02/15 15:16:20 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 
4) in 22 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:20 INFO DAGScheduler: ResultStage 2 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82) 
finished in 0.028 s
   26/02/15 15:16:20 INFO DAGScheduler: Job 2 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Killing all running tasks in stage 
2: Stage finished
   26/02/15 15:16:20 INFO DAGScheduler: Job 2 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:82, 
took 0.035285 s
   ID: 1, Name: Alice, Block Start: 4
   ID: 2, Name: Bob, Block Start: 4
   ID: 3, Name: Charlie, Block Start: 4
   
   ✅ SUCCESS: All 3 rows have valid block start positions
   
   
================================================================================
   Testing input_file_block_length() on Iceberg table
   
================================================================================
   
   === input_file_block_length() Results ===
   26/02/15 15:16:20 INFO V2ScanRelationPushDown: 
   Output: id#41, name#42
            
   26/02/15 15:16:20 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 7722039398521868759 created at 2026-02-15T12:16:20.033+00:00 with 
filter true
   26/02/15 15:16:20 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:20 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_10 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_10_piece0 stored as 
bytes in memory (estimated size 30.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_10_piece0 in memory 
on 192.168.100.32:49684 (size: 30.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 10 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_11 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_11_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_11_piece0 in memory 
on 192.168.100.32:49684 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Removed broadcast_9_piece0 on 
192.168.100.32:49684 in memory (size: 11.8 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 11 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:20 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=4], due to: fallback input file expression
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_12 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_12_piece0 stored as 
bytes in memory (estimated size 30.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_12_piece0 in memory 
on 192.168.100.32:49684 (size: 30.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 12 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:20 INFO BlockManagerInfo: Removed broadcast_11_piece0 on 
192.168.100.32:49684 in memory (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106
   26/02/15 15:16:20 INFO DAGScheduler: Got job 3 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106) 
with 1 output partitions
   26/02/15 15:16:20 INFO DAGScheduler: Final stage: ResultStage 3 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106)
   26/02/15 15:16:20 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:20 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:20 INFO DAGScheduler: Submitting ResultStage 3 
(MapPartitionsRDD[22] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106), 
which has no missing parents
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_13 stored as values in 
memory (estimated size 29.7 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_13_piece0 stored as 
bytes in memory (estimated size 11.8 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_13_piece0 in memory 
on 192.168.100.32:49684 (size: 11.8 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 13 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:20 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 3 (MapPartitionsRDD[22] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:20 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 
5) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11795 bytes) 
   26/02/15 15:16:20 INFO Executor: Running task 0.0 in stage 3.0 (TID 5)
   26/02/15 15:16:20 INFO Executor: Finished task 0.0 in stage 3.0 (TID 5). 
8220 bytes result sent to driver
   26/02/15 15:16:20 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 
5) in 17 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:20 INFO DAGScheduler: ResultStage 3 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106) 
finished in 0.019 s
   26/02/15 15:16:20 INFO DAGScheduler: Job 3 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:20 INFO TaskSchedulerImpl: Killing all running tasks in stage 
3: Stage finished
   26/02/15 15:16:20 INFO DAGScheduler: Job 3 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:106, 
took 0.020702 s
   ID: 1, Name: Alice, Block Length: 636
   ID: 2, Name: Bob, Block Length: 622
   ID: 3, Name: Charlie, Block Length: 650
   
   ✅ SUCCESS: All 3 rows have valid block lengths
   
   
================================================================================
   Testing all three metadata functions together
   
================================================================================
   
   === All Metadata Functions Results ===
   26/02/15 15:16:20 INFO V2ScanRelationPushDown: 
   Output: id#60, name#61
            
   26/02/15 15:16:20 INFO SnapshotScan: Scanning table local.default.test_table 
snapshot 7722039398521868759 created at 2026-02-15T12:16:20.033+00:00 with 
filter true
   26/02/15 15:16:20 INFO BaseDistributedDataScan: Planning file tasks locally 
for table local.default.test_table
   26/02/15 15:16:20 INFO SparkPartitioningAwareScan: Reporting 
UnknownPartitioning with 1 partition(s) for table local.default.test_table
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_14 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO MemoryStore: Block broadcast_14_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:20 INFO BlockManagerInfo: Added broadcast_14_piece0 in memory 
on 192.168.100.32:49684 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:20 INFO SparkContext: Created broadcast 14 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:21 INFO MemoryStore: Block broadcast_15 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:21 INFO MemoryStore: Block broadcast_15_piece0 stored as 
bytes in memory (estimated size 30.0 KiB, free 2.4 GiB)
   26/02/15 15:16:21 INFO BlockManagerInfo: Added broadcast_15_piece0 in memory 
on 192.168.100.32:49684 (size: 30.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:21 INFO SparkContext: Created broadcast 15 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore started with capacity 2.4 GiB
   26/02/15 15:16:21 WARN GlutenFallbackReporter: Validation failed for plan: 
Project[QueryId=5], due to: fallback input file expression
   26/02/15 15:16:21 INFO CodeGenerator: Code generated in 6.058208 ms
   26/02/15 15:16:21 INFO MemoryStore: Block broadcast_16 stored as values in 
memory (estimated size 32.0 KiB, free 2.4 GiB)
   26/02/15 15:16:21 INFO MemoryStore: Block broadcast_16_piece0 stored as 
bytes in memory (estimated size 29.9 KiB, free 2.4 GiB)
   26/02/15 15:16:21 INFO BlockManagerInfo: Added broadcast_16_piece0 in memory 
on 192.168.100.32:49684 (size: 29.9 KiB, free: 2.4 GiB)
   26/02/15 15:16:21 INFO SparkContext: Created broadcast 16 from collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:21 INFO SparkContext: Starting job: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132
   26/02/15 15:16:21 INFO DAGScheduler: Got job 4 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132) 
with 1 output partitions
   26/02/15 15:16:21 INFO DAGScheduler: Final stage: ResultStage 4 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132)
   26/02/15 15:16:21 INFO DAGScheduler: Parents of final stage: List()
   26/02/15 15:16:21 INFO DAGScheduler: Missing parents: List()
   26/02/15 15:16:21 INFO DAGScheduler: Submitting ResultStage 4 
(MapPartitionsRDD[29] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132), 
which has no missing parents
   26/02/15 15:16:21 INFO MemoryStore: Block broadcast_17 stored as values in 
memory (estimated size 30.2 KiB, free 2.4 GiB)
   26/02/15 15:16:21 INFO MemoryStore: Block broadcast_17_piece0 stored as 
bytes in memory (estimated size 12.0 KiB, free 2.4 GiB)
   26/02/15 15:16:21 INFO BlockManagerInfo: Added broadcast_17_piece0 in memory 
on 192.168.100.32:49684 (size: 12.0 KiB, free: 2.4 GiB)
   26/02/15 15:16:21 INFO SparkContext: Created broadcast 17 from broadcast at 
DAGScheduler.scala:1585
   26/02/15 15:16:21 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 4 (MapPartitionsRDD[29] at collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132) 
(first 15 tasks are for partitions Vector(0))
   26/02/15 15:16:21 INFO TaskSchedulerImpl: Adding task set 4.0 with 1 tasks 
resource profile 0
   26/02/15 15:16:21 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 
6) (192.168.100.32, executor driver, partition 0, PROCESS_LOCAL, 11989 bytes) 
   26/02/15 15:16:21 INFO Executor: Running task 0.0 in stage 4.0 (TID 6)
   26/02/15 15:16:21 INFO CodeGenerator: Code generated in 4.262917 ms
   26/02/15 15:16:21 INFO CodeGenerator: Code generated in 10.714417 ms
   26/02/15 15:16:21 INFO Executor: Finished task 0.0 in stage 4.0 (TID 6). 
8369 bytes result sent to driver
   26/02/15 15:16:21 INFO TaskSetManager: Finished task 0.0 in stage 4.0 (TID 
6) in 42 ms on 192.168.100.32 (executor driver) (1/1)
   26/02/15 15:16:21 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks 
have all completed, from pool 
   26/02/15 15:16:21 INFO DAGScheduler: ResultStage 4 (collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132) 
finished in 0.047 s
   26/02/15 15:16:21 INFO DAGScheduler: Job 4 is finished. Cancelling potential 
speculative or zombie tasks for this job
   26/02/15 15:16:21 INFO TaskSchedulerImpl: Killing all running tasks in stage 
4: Stage finished
   26/02/15 15:16:21 INFO DAGScheduler: Job 4 finished: collect at 
/Users/reema/Desktop/OpenSource/incubator-gluten/test_iceberg_simple.py:132, 
took 0.050441 s
   ID: 1, Name: Alice
     File: 
'file:/tmp/iceberg_warehouse/default/test_table/data/00000-0-2b0c5d04-98fb-4cda-bdf3-7dac021f8032-0-00001.parquet'
     Block Start: 4
     Block Length: 636
   
   ID: 2, Name: Bob
     File: 
'file:/tmp/iceberg_warehouse/default/test_table/data/00001-1-2b0c5d04-98fb-4cda-bdf3-7dac021f8032-0-00001.parquet'
     Block Start: 4
     Block Length: 622
   
   ID: 3, Name: Charlie
     File: 
'file:/tmp/iceberg_warehouse/default/test_table/data/00002-2-2b0c5d04-98fb-4cda-bdf3-7dac021f8032-0-00001.parquet'
     Block Start: 4
     Block Length: 650
   
   ✅ ALL TESTS PASSED: All metadata functions work correctly!
   26/02/15 15:16:21 INFO SparkContext: SparkContext is stopping with exitCode 
0.
   26/02/15 15:16:21 INFO SparkUI: Stopped Spark web UI at 
http://192.168.100.32:4040
   26/02/15 15:16:21 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   26/02/15 15:16:21 INFO MemoryStore: MemoryStore cleared
   26/02/15 15:16:21 INFO BlockManager: BlockManager stopped
   26/02/15 15:16:21 INFO BlockManagerMaster: BlockManagerMaster stopped
   26/02/15 15:16:21 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   26/02/15 15:16:21 INFO SparkContext: Successfully stopped SparkContext
   26/02/15 15:16:21 INFO ShutdownHookManager: Shutdown hook called
   26/02/15 15:16:21 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/spark-dbc8040a-b030-4017-b078-49a1acd7c001/pyspark-fe5f02ee-61ab-470d-a651-4a4eb74b901e
   26/02/15 15:16:21 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/spark-0482ef35-5e29-4648-b2b4-9fc9e31eccac
   26/02/15 15:16:21 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/5z/4mxhbysx1hb00rzxt6wj738m0000gn/T/spark-dbc8040a-b030-4017-b078-49a1acd7c001
   
   [Process completed]
   ```
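
   For context, the "SUCCESS" lines in the log come from per-row checks in `test_iceberg_simple.py` on the values returned by `input_file_name()`, `input_file_block_start()`, and `input_file_block_length()`. A minimal sketch of that validation logic (the helper name and row shape here are assumptions for illustration, not the actual script):

   ```python
   # Hypothetical sketch of the per-row validation behind the SUCCESS lines above.
   # The helper name and dict-based row shape are assumptions, not the real script.

   def validate_metadata_rows(rows):
       """Return True iff every row carries usable input-file metadata:
       a non-empty file path, a non-negative block start, and a positive
       block length (input_file_name() returns '' when metadata is lost)."""
       for row in rows:
           if not row.get("file"):            # empty path means metadata was dropped
               return False
           if row.get("block_start", -1) < 0:
               return False
           if row.get("block_length", 0) <= 0:
               return False
       return True

   # Mirrors the three rows printed in the log above (paths abbreviated).
   sample = [
       {"file": "file:/tmp/.../00000-...parquet", "block_start": 4, "block_length": 636},
       {"file": "file:/tmp/.../00001-...parquet", "block_start": 4, "block_length": 622},
       {"file": "file:/tmp/.../00002-...parquet", "block_start": 4, "block_length": 650},
   ]
   assert validate_metadata_rows(sample)
   ```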


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
