[ https://issues.apache.org/jira/browse/SPARK-47622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831831#comment-17831831 ]

Srinivasu Majeti commented on SPARK-47622:
------------------------------------------

CCing [~vanzin] to take a look and advise on next steps. Thank you!

> Spark creates many tiny blocks for a single driver log file smaller than 
> dfs.blocksize
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47622
>                 URL: https://issues.apache.org/jira/browse/SPARK-47622
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, Spark Submit
>    Affects Versions: 3.3.2
>            Reporter: Srinivasu Majeti
>            Priority: Major
>
> Upon reviewing the Spark code, I found that files under /user/spark/driverLogs 
> are synced to HDFS with the hsync option, as shown below.
> {code:java}
>  hdfsStream.hsync(EnumSet.allOf(classOf[HdfsDataOutputStream.SyncFlag]))
> Ref: 
> https://github.com/apache/spark/blob/a3c04ec1145662e4227d57cd953bffce96b8aad7/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala{code}
> As a result, the current block is ended and a new one is started on every 
> 5-second sync, producing a lot of tiny blocks. For example, fsck on a small 
> HDFS driver log file shows 8 blocks:
> {code:java}
> [r...@ccycloud-3.smajeti.root.comops.site subdir0]# hdfs fsck 
> /user/spark/driverLogs/application_1710495774861_0002_driver.log
> Connecting to namenode via 
> https://ccycloud-3.smajeti.root.comops.site:20102/fsck?ugi=hdfs&path=%2Fuser%2Fspark%2FdriverLogs%2Fapplication_1710495774861_0002_driver.log
> FSCK started by hdfs (auth:KERBEROS_SSL) from /10.140.136.139 for path 
> /user/spark/driverLogs/application_1710495774861_0002_driver.log at Thu Mar 
> 28 06:37:29 UTC 2024
> Status: HEALTHY
>  Number of data-nodes:        4
>  Number of racks:             1
>  Total dirs:                  0
>  Total symlinks:              0
> Replicated Blocks:
>  Total size:  157574 B
>  Total files: 1
>  Total blocks (validated):    8 (avg. block size 19696 B)
>  Minimally replicated blocks: 8 (100.0 %) {code}
> HdfsDataOutputStream.SyncFlag includes two flags: UPDATE_LENGTH and END_BLOCK. 
> This has been the expected behavior for some time: the flags keep the visible 
> length of the HDFS driver log file up to date, and to achieve that the current 
> block is ended/closed on every 5-second sync, so each sync starts a new block 
> for the same HDFS driver log file. This hsync behavior was introduced 5 years 
> ago by the fix for SPARK-29105 (SHS may delete driver log file of in-progress 
> application).
> But this leaves the Namenode managing a lot of block metadata, which becomes 
> an overhead in large clusters.
> {code:java}
> public static enum SyncFlag {
>     UPDATE_LENGTH,
>     END_BLOCK;
>
>     private SyncFlag() {
>     }
> }
> {code}
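> One possible mitigation (an untested sketch, not current Spark behavior) would 
> be to pass only UPDATE_LENGTH to hsync, which keeps the visible file length up 
> to date without sealing the current block on every sync:
> {code:java}
> // Hypothetical change to the DriverLogger sync call: update the visible
> // length, but do not end the block on every 5-second sync.
> hdfsStream.hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH))
> {code}
> Whether dropping END_BLOCK reopens the SPARK-29105 race would need to be 
> verified.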
> I don't see any configurable option to avoid this, and simply dropping this 
> type of hsync may reintroduce the side effects fixed by SPARK-29105.
> We currently have only two workarounds, both of which need manual intervention:
> 1. Periodically clean up these driver logs.
> 2. Periodically merge these small-block files into files with full 128 MB 
> blocks.
> Can we provide a configurable option to merge these blocks when closing the 
> spark-shell, or when the driver log file is closed?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
