[ https://issues.apache.org/jira/browse/SPARK-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Or updated SPARK-5932:
-----------------------------
    Description: 
This is SPARK-5931's sister issue.

The naming of existing byte configs is inconsistent. We currently have the following throughout the code base:
{code}
spark.reducer.maxMbInFlight                // megabytes
spark.kryoserializer.buffer.mb             // megabytes
spark.shuffle.file.buffer.kb               // kilobytes
spark.broadcast.blockSize                  // kilobytes
spark.executor.logs.rolling.size.maxBytes  // bytes
spark.io.compression.snappy.block.size     // bytes
{code}
Instead, my proposal is to simplify the config names themselves and make everything accept sizes in the following format: 500b, 2k, 100m, 46g, similar to what we currently use for our memory settings. For instance:
{code}
spark.reducer.maxSizeInFlight = 10m
spark.kryoserializer.buffer = 2m
spark.shuffle.file.buffer = 10k
spark.broadcast.blockSize = 20k
spark.executor.logs.rolling.maxSize = 500b
spark.io.compression.snappy.blockSize = 200b
{code}
All existing configs that are relevant will be deprecated in favor of the new ones. We should do this soon before we introduce more byte configs.

  was:
This is SPARK-5931's sister issue.

The naming of existing byte configs is inconsistent. We currently have the following throughout the code base:
{code}
spark.reducer.maxMbInFlight (mb)
spark.kryoserializer.buffer.mb (mb)
spark.shuffle.file.buffer.kb (kb)
spark.broadcast.blockSize (kb)
spark.executor.logs.rolling.size.maxBytes (bytes)
spark.io.compression.snappy.block.size (bytes)
{code}
Instead, my proposal is to simplify the config names themselves and make everything accept sizes in the following format: 500b, 2k, 100m, 46g, similar to what we currently use for our memory settings.
For instance:
{code}
spark.reducer.maxSizeInFlight = 10m
spark.kryoserializer.buffer = 2m
spark.shuffle.file.buffer = 10k
spark.broadcast.blockSize = 20k
spark.executor.logs.rolling.maxSize = 500b
spark.io.compression.snappy.blockSize = 200b
{code}
All existing configs that are relevant will be deprecated in favor of the new ones. We should do this soon before we introduce more byte configs.


> Use consistent naming for byte properties
> -----------------------------------------
>
>                 Key: SPARK-5932
>                 URL: https://issues.apache.org/jira/browse/SPARK-5932
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Andrew Or
>            Assignee: Andrew Or
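The proposed suffix format (500b, 2k, 100m, 46g) could be parsed along the following lines. This is a minimal Java sketch, not Spark's actual implementation; the class and method names are hypothetical, and it assumes binary multipliers (1k = 1024 bytes) with a bare number treated as bytes:

```java
import java.util.Locale;

public class ByteStringParser {
    // Hypothetical sketch: converts strings like "500b", "2k", "100m",
    // "46g" into a byte count. Suffixes are case-insensitive; a bare
    // number with no suffix is treated as bytes.
    public static long parse(String s) {
        String v = s.trim().toLowerCase(Locale.ROOT);
        long multiplier = 1L;
        char last = v.charAt(v.length() - 1);
        if (!Character.isDigit(last)) {
            switch (last) {
                case 'b': multiplier = 1L;       break; // bytes
                case 'k': multiplier = 1L << 10; break; // kibibytes
                case 'm': multiplier = 1L << 20; break; // mebibytes
                case 'g': multiplier = 1L << 30; break; // gibibytes
                default:
                    throw new IllegalArgumentException("Unknown size suffix: " + last);
            }
            v = v.substring(0, v.length() - 1);
        }
        return Long.parseLong(v) * multiplier;
    }

    public static void main(String[] args) {
        System.out.println(parse("500b")); // 500
        System.out.println(parse("2k"));   // 2048
        System.out.println(parse("100m")); // 104857600
    }
}
```

With a single parser like this, every byte config can drop its unit-bearing name (`.mb`, `.kb`, `.maxBytes`) and accept any of the suffixes uniformly.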
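The deprecation step could be sketched as an old-key-to-new-key lookup; again a hypothetical Java sketch, as Spark's real deprecation machinery may differ. Note that `spark.broadcast.blockSize` keeps its name (only the units it accepts change), so it has no entry in the table:

```java
import java.util.HashMap;
import java.util.Map;

public class DeprecatedConfigs {
    // Hypothetical rename table: deprecated key -> current key,
    // following the renames proposed in this issue.
    private static final Map<String, String> RENAMED = new HashMap<>();
    static {
        RENAMED.put("spark.reducer.maxMbInFlight", "spark.reducer.maxSizeInFlight");
        RENAMED.put("spark.kryoserializer.buffer.mb", "spark.kryoserializer.buffer");
        RENAMED.put("spark.shuffle.file.buffer.kb", "spark.shuffle.file.buffer");
        RENAMED.put("spark.executor.logs.rolling.size.maxBytes",
                    "spark.executor.logs.rolling.maxSize");
        RENAMED.put("spark.io.compression.snappy.block.size",
                    "spark.io.compression.snappy.blockSize");
    }

    // Resolve a possibly-deprecated key to its current name;
    // keys that were never renamed pass through unchanged.
    public static String resolve(String key) {
        return RENAMED.getOrDefault(key, key);
    }
}
```

A lookup of this shape lets old keys keep working (with a deprecation warning at read time) while all new code and documentation use only the consistent names.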
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org