shuyouZZ created SPARK-43991:
--------------------------------

             Summary: Use the compression codec set by the spark config file
                 Key: SPARK-43991
                 URL: https://issues.apache.org/jira/browse/SPARK-43991
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, Web UI
    Affects Versions: 3.4.0
            Reporter: shuyouZZ


Currently, if enable rolling log in SHS, only {{originalFilePath}} is used to 
determine the path of compact file.
{code:java}
override val logPath: String = originalFilePath.toUri.toString + 
EventLogFileWriter.COMPACTED
{code}
If the user set {{spark.eventLog.compression.codec}} in spark conf, when the 
log compact logic is triggered, the old event log file will be compacted and 
use the compression codec set by the spark default config file.
{code:java}
protected val compressionCodec =
    if (shouldCompress) {
      Some(CompressionCodec.createCodec(sparkConf, 
sparkConf.get(EVENT_LOG_COMPRESSION_CODEC)))
    } else {
      None
    }

private[history] val compressionCodecName = compressionCodec.map { c =>
    CompressionCodec.getShortName(c.getClass.getName)
  }
{code}
However, The compression codec used by EventLogFileReader to read log is split 
from the log path, this will lead to EventLogFileReader can not read the 
compacted log file normally.
{code:java}
def codecName(log: Path): Option[String] = {
    // Compression codec is encoded as an extension, e.g. app_123.lzf
    // Since we sanitize the app ID to not include periods, it is safe to split 
on it
    val logName = log.getName.stripSuffix(COMPACTED).stripSuffix(IN_PROGRESS)
    logName.split("\\.").tail.lastOption
  }
{code}
So we should improve the {{logPath}} method in class 
CompactedEventLogFileWriter, use compression codec set by the spark default 
config.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to