mgmarino commented on issue #12046:
URL: https://github.com/apache/iceberg/issues/12046#issuecomment-2609058634

   I tried to trace where the connection pool is being closed. Aside from calls 
stemming from finalizers on thread shutdown (which seem perfectly legitimate), 
I see:
   
   ```
   ERROR PoolingHttpClientConnectionManager: Shutting down Pool:
   java.lang.Exception: shutting down
        at org.apache.iceberg.aws.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.shutdown(PoolingHttpClientConnectionManager.java:410) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.http.apache.ApacheHttpClient.close(ApacheHttpClient.java:247) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.utils.AttributeMap.closeIfPossible(AttributeMap.java:678) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.utils.AttributeMap.access$1600(AttributeMap.java:49) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.utils.AttributeMap$DerivedValue.close(AttributeMap.java:632) ~[custom-jar-glue-job-680fb4e.jar:?]
        at java.util.HashMap$Values.forEach(HashMap.java:1065) ~[?:?]
        at software.amazon.awssdk.utils.AttributeMap.close(AttributeMap.java:107) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.core.client.config.SdkClientConfiguration.close(SdkClientConfiguration.java:118) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.core.internal.http.HttpClientDependencies.close(HttpClientDependencies.java:82) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient.close(AmazonSyncHttpClient.java:76) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.close(BaseSyncClientHandler.java:86) ~[custom-jar-glue-job-680fb4e.jar:?]
        at software.amazon.awssdk.services.s3.DefaultS3Client.close(DefaultS3Client.java:12477) ~[custom-jar-glue-job-680fb4e.jar:?]
        at org.apache.iceberg.aws.s3.S3FileIO.close(S3FileIO.java:417) ~[custom-jar-glue-job-680fb4e.jar:?]
        at org.apache.iceberg.spark.source.SerializableTableWithSize.close(SerializableTableWithSize.java:69) ~[custom-jar-glue-job-680fb4e.jar:?]
        at org.apache.spark.storage.memory.MemoryStore.$anonfun$freeMemoryEntry$1(MemoryStore.scala:410) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.memory.MemoryStore.$anonfun$freeMemoryEntry$1$adapted(MemoryStore.scala:407) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36) ~[scala-library-2.12.18.jar:?]
        at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33) ~[scala-library-2.12.18.jar:?]
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198) ~[scala-library-2.12.18.jar:?]
        at org.apache.spark.storage.memory.MemoryStore.freeMemoryEntry(MemoryStore.scala:407) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.memory.MemoryStore.remove(MemoryStore.scala:425) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:2012) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.memory.MemoryStore.dropBlock$1(MemoryStore.scala:503) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.memory.MemoryStore.$anonfun$evictBlocksToFreeSpace$4(MemoryStore.scala:529) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) ~[scala-library-2.12.18.jar:?]
        at org.apache.spark.storage.memory.MemoryStore.evictBlocksToFreeSpace(MemoryStore.scala:520) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:93) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:74) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.memory.UnifiedMemoryManager.acquireStorageMemory(UnifiedMemoryManager.scala:181) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.memory.MemoryStore.putBytes(MemoryStore.scala:151) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.BlockManager$BlockStoreUpdater.saveSerializedValuesToMemoryStore(BlockManager.scala:363) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.BlockManager$BlockStoreUpdater.$anonfun$save$1(BlockManager.scala:404) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1540) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.BlockManager$BlockStoreUpdater.save(BlockManager.scala:384) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.storage.BlockManager.putBytes(BlockManager.scala:1484) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBlocks$1(TorrentBroadcast.scala:240) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23) ~[scala-library-2.12.18.jar:?]
        at scala.collection.immutable.List.foreach(List.scala:431) ~[scala-library-2.12.18.jar:?]
        at org.apache.spark.broadcast.TorrentBroadcast.readBlocks(TorrentBroadcast.scala:212) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$4(TorrentBroadcast.scala:308) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.18.jar:?]
        at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$2(TorrentBroadcast.scala:284) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.util.KeyLock.withLock(KeyLock.scala:64) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$1(TorrentBroadcast.scala:279) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.util.SparkErrorUtils.tryOrIOException(SparkErrorUtils.scala:35) ~[spark-common-utils_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.util.SparkErrorUtils.tryOrIOException$(SparkErrorUtils.scala:33) ~[spark-common-utils_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:96) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:279) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:125) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:77) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:174) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.scheduler.Task.run(Task.scala:152) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:632) ~[spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) [spark-common-utils_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) [spark-common-utils_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:96) [spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:635) [spark-core_2.12-3.5.2-amzn-1.jar:3.5.2-amzn-1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:840) [?:?]
   
   ```
   
   The relevant line I would pick out is:
   
   `at org.apache.iceberg.spark.source.SerializableTableWithSize.close(SerializableTableWithSize.java:69)`
   
   
https://github.com/apache/iceberg/blob/77813609f1a28ac6080b29e03e2b3d018fd0f7c9/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SerializableTableWithSize.java#L69
   
   My suspicion is that this IO object (created, I believe, e.g. here: 
https://github.com/apache/iceberg/blob/84c8db40f9500aa804b0428baeb51b1041c64a94/core/src/main/java/org/apache/iceberg/SerializableTable.java#L123)
   is shared with the reader. I will try to do some more investigation.
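   If that suspicion holds, the failure mode would be the generic shared-closeable lifecycle bug: two holders reference the same client, and when Spark's `MemoryStore` evicts the cached `SerializableTableWithSize` and calls `close()`, the connection pool is shut down out from under the reader still using it. A minimal sketch of that pattern (all class and method names here are hypothetical stand-ins, not Iceberg or AWS SDK APIs):
   
   ```java
   import java.util.concurrent.atomic.AtomicBoolean;
   
   // Hypothetical stand-in for an HTTP-pool-backed S3 client.
   class SharedClient implements AutoCloseable {
       private final AtomicBoolean open = new AtomicBoolean(true);
   
       String get(String key) {
           if (!open.get()) {
               // Mirrors the "Shutting down Pool" / IllegalStateException
               // symptoms seen when the pool is already closed.
               throw new IllegalStateException("connection pool shut down");
           }
           return "value-for-" + key;
       }
   
       @Override
       public void close() {
           open.set(false);
       }
   }
   
   class SharedIoSketch {
       public static void main(String[] args) throws Exception {
           SharedClient client = new SharedClient();
   
           // Both the cached table copy and the active reader hold the
           // SAME client instance -- the suspected sharing.
           AutoCloseable cachedTableCopy = client;
           SharedClient reader = client;
   
           // Spark evicting the block (MemoryStore.freeMemoryEntry ->
           // SerializableTableWithSize.close -> S3FileIO.close) in this sketch:
           cachedTableCopy.close();
   
           // The reader, still mid-task, now fails on its next call.
           try {
               reader.get("s3://bucket/data.parquet");
               System.out.println("read succeeded");
           } catch (IllegalStateException e) {
               System.out.println("read failed: " + e.getMessage());
           }
       }
   }
   ```
   
   The usual fixes for this pattern are reference counting the shared resource or giving each holder an independently closeable handle; which (if either) applies here depends on what the investigation turns up.
   
   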
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]