Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang closed pull request #46193: [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms
URL: https://github.com/apache/spark/pull/46193

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on PR #46193: URL: https://github.com/apache/spark/pull/46193#issuecomment-2079790129

Thanks, merging to master
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on PR #46193: URL: https://github.com/apache/spark/pull/46193#issuecomment-2078352745

> @panbingkun Sorry, I have to merge #46192 since it was created before this one. Please resolve the conflicts and I will merge this one. Thanks as always.

Don't worry, I have resolved the conflicts. 😄 Let's run GA first.
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on PR #46193: URL: https://github.com/apache/spark/pull/46193#issuecomment-2078113658

@panbingkun Sorry, I have to merge https://github.com/apache/spark/pull/46192 since it was created before this one. Please resolve the conflicts and I will merge this one. Thanks as always.
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1580054051

## common/utils/src/test/scala/org/apache/spark/util/LogKeySuite.scala:

@@ -57,20 +57,20 @@ class LogKeySuite
   private val logKeyFilePath = getWorkspaceFilePath("common", "utils", "src", "main",
     "scala", "org", "apache", "spark", "internal", "LogKey.scala")

-  // regenerate the file `LogKey.scala` with its enumeration fields sorted alphabetically
+  // regenerate the file `LogKey.scala` with its members sorted alphabetically
   private def regenerateLogKeyFile(
-      originalKeys: Seq[LogKey], sortedKeys: Seq[LogKey]): Unit = {
+      originalKeys: Seq[String], sortedKeys: Seq[String]): Unit = {
     if (originalKeys != sortedKeys) {
       val logKeyFile = logKeyFilePath.toFile
-      logInfo(s"Regenerating LogKey file $logKeyFile")
+      logInfo(s"Regenerating the file $logKeyFile")
       val originalContents = FileUtils.readLines(logKeyFile, StandardCharsets.UTF_8)
       val sortedContents = new JList[String]()
       var firstMatch = false
       originalContents.asScala.foreach { line =>
-        if (line.trim.startsWith("val ") && line.trim.endsWith(" = Value")) {
+        if (line.trim.startsWith("case object ") && line.trim.endsWith(" extends LogKey")) {

Review Comment: Not directly related to this PR, but we need to think about what happens if there are code comments around the `LogKey` members.
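The alphabetical-order check discussed here can be sketched independently of Spark. A minimal, self-contained sketch (the object and helper names below are illustrative, not Spark's actual test code), assuming each key is declared on its own line as `case object NAME extends LogKey`:

```scala
// Sketch: extract member names from lines of the form
// `case object NAME extends LogKey` and compare against their sorted order.
object LogKeyOrderCheck {
  def extractKeys(lines: Seq[String]): Seq[String] =
    lines.map(_.trim)
      .filter(l => l.startsWith("case object ") && l.endsWith(" extends LogKey"))
      .map(l => l.stripPrefix("case object ").stripSuffix(" extends LogKey"))

  def isSorted(keys: Seq[String]): Boolean = keys == keys.sorted

  def main(args: Array[String]): Unit = {
    val source = Seq(
      "  case object APP_ID extends LogKey",
      "  case object ACCUMULATOR_ID extends LogKey")
    // APP_ID precedes ACCUMULATOR_ID in the source, so the order check fails.
    assert(!isSorted(extractKeys(source)))
  }
}
```

In the real suite a mismatch triggers regeneration of `LogKey.scala` rather than a failed assertion; the reviewer's point is that a line-prefix/suffix match like this silently skips keys that carry surrounding comments or span multiple lines.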
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578930058

## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala:

@@ -144,6 +146,22 @@ trait LoggingSuiteBase
+  // An enumeration value that is not defined in `LogKey`; it can be defined anywhere
+  private val externalLogKey = LogKey.VALUE
+  private val externalLog = log"${MDC(externalLogKey, "External log message.")}"
+  test("Logging with external LogKey") {

Review Comment: Yeah
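The point of the test above is that a log key no longer has to be an `Enumeration` value owned by Spark: once `LogKey` is a trait, any module can declare a key of its own. A self-contained sketch of the pattern (the `MyConnectorLogKeys` object and the simplified `MDC` class below are illustrative stand-ins, not Spark's actual API):

```scala
// Open extension point: Spark-internal keys and third-party keys share this trait.
trait LogKey {
  def name: String = toString.toLowerCase
}

// A third party can declare its own key anywhere, outside Spark's sources.
object MyConnectorLogKeys {
  case object CUSTOM_REASON extends LogKey
}

// Simplified stand-in for Spark's MDC wrapper: a key/value pair for structured logs.
final case class MDC(key: LogKey, value: String)

object ExternalKeyDemo {
  def main(args: Array[String]): Unit = {
    val mdc = MDC(MyConnectorLogKeys.CUSTOM_REASON, "External log message.")
    // The case-object name becomes the structured-log field name.
    assert(mdc.key.name == "custom_reason")
  }
}
```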
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
LuciferYang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578922427

## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala:

@@ -144,6 +146,22 @@ trait LoggingSuiteBase
+  // An enumeration value that is not defined in `LogKey`; it can be defined anywhere
+  private val externalLogKey = LogKey.VALUE
+  private val externalLog = log"${MDC(externalLogKey, "External log message.")}"
+  test("Logging with external LogKey") {

Review Comment:
> If necessary in the future, I think it would be easier to implement based on the current changes, for example:
>
> * `ILogKey`
> * `InternalLogKey` extends `ILogKey`
> * `LogKey` extends `InternalLogKey`
> * `ExternalLogKey` extends `ILogKey`
> * `ThirdPartyLogKey` extends `ExternalLogKey`

Yes, we can extend this further after this PR is completed. Personally, I think we could add a String field to the MDC to identify the namespace: the namespace of Spark's internal MDCs would always be `SPARK`, and third-party Spark ecosystem projects could define their own. The keys of `MessageWithContext` would then look like `SPARK.mdcKey` or `CUSTOM_NAMESPACE.mdcKey`. This may not be a perfect approach; we can discuss it further after this PR is completed.
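The namespace idea floated above is only a proposal, not something this PR implements. A hypothetical sketch of what it could look like (none of these names exist in Spark):

```scala
// Hypothetical: every key carries a namespace that prefixes its structured-log field.
trait LogKey {
  def namespace: String
}

// Spark-internal keys would fix the namespace to SPARK...
abstract class SparkLogKey extends LogKey {
  final val namespace = "SPARK"
}

// ...while ecosystem projects pick their own.
abstract class ExternalLogKey(val namespace: String) extends LogKey

case object TASK_ID extends SparkLogKey
case object CONNECTOR_STATE extends ExternalLogKey("MY_CONNECTOR")

object NamespaceDemo {
  // Qualified field name, e.g. SPARK.task_id or MY_CONNECTOR.connector_state.
  def qualified(k: LogKey): String = s"${k.namespace}.${k.toString.toLowerCase}"

  def main(args: Array[String]): Unit = {
    assert(qualified(TASK_ID) == "SPARK.task_id")
    assert(qualified(CONNECTOR_STATE) == "MY_CONNECTOR.connector_state")
  }
}
```

This matches the thread's intent: namespacing avoids field-name collisions between Spark's keys and third-party keys without Spark having to know the external keys in advance.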
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578903240

## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala:

@@ -16,399 +16,399 @@
 package org.apache.spark.internal

+trait LogKey
+
 /**
  * Various keys used for mapped diagnostic contexts(MDC) in logging.
  * All structured logging keys should be defined here for standardization.
  */
-object LogKey extends Enumeration {
-  val ACCUMULATOR_ID = Value
-  val ACTUAL_NUM_FILES = Value
-  ... (several hundred further `val <NAME> = Value` members; the listing and the review comment are truncated in the archive)
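The hunk above replaces the closed `object LogKey extends Enumeration` with an open `trait LogKey`. A minimal sketch of why that matters for the external ecosystem (the `ThirdPartyKeys` object is illustrative, not part of Spark):

```scala
// Before: an Enumeration is closed -- external code cannot add members to it.
object OldLogKey extends Enumeration {
  val APP_ID, TASK_ID = Value
}

// After: a trait is open -- any module, including third-party code, can
// declare new keys that type-check wherever a LogKey is expected.
trait LogKey

object LogKeys {
  case object APP_ID extends LogKey
  case object TASK_ID extends LogKey
}

object ThirdPartyKeys {
  case object SHARD_ID extends LogKey // defined outside Spark, still a LogKey
}

object MigrationDemo {
  def main(args: Array[String]): Unit = {
    // Internal and external keys mix freely under the common trait.
    val keys: Seq[LogKey] = Seq(LogKeys.APP_ID, ThirdPartyKeys.SHARD_ID)
    assert(keys.length == 2)
  }
}
```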
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578806516

## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala:

@@ -16,399 +16,399 @@
 package org.apache.spark.internal

+trait LogKey
+
 /**
  * Various keys used for mapped diagnostic contexts(MDC) in logging.
  * All structured logging keys should be defined here for standardization.
  */
-object LogKey extends Enumeration {
-  val ACCUMULATOR_ID = Value
-  val ACTUAL_NUM_FILES = Value
-  ... (same hunk as discussed elsewhere in the thread; the listing and the review comment are truncated in the archive)
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578797076

## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:

@@ -33,8 +33,8 @@
 import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.Constants._
 import org.apache.spark.deploy.k8s.KubernetesConf
 import org.apache.spark.deploy.k8s.KubernetesUtils.addOwnerReference
-import org.apache.spark.internal.{Logging, LogKey, MDC}
-import org.apache.spark.internal.LogKey.{COUNT, EXECUTOR_IDS, TIMEOUT}
+import org.apache.spark.internal.{Logging, LogKeys, MDC}
+import org.apache.spark.internal.LogKeys.{COUNT, EXECUTOR_IDS, TIMEOUT}

Review Comment: merge to `import org.apache.spark.internal.LogKeys`
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578791101

## core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:

@@ -33,8 +33,8 @@
 import org.apache.spark.InternalAccumulator.{input, shuffleRead}
 import org.apache.spark.TaskState.TaskState
 import org.apache.spark.errors.SparkCoreErrors
 import org.apache.spark.executor.ExecutorMetrics
-import org.apache.spark.internal.{config, Logging, LogKey, MDC}
-import org.apache.spark.internal.LogKey.{REASON, TASK_SET_NAME, TASK_STATE, TID}
+import org.apache.spark.internal.{config, Logging, LogKeys, MDC}
+import org.apache.spark.internal.LogKeys.{REASON, TASK_SET_NAME, TASK_STATE, TID}

Review Comment: merge to `import org.apache.spark.internal.LogKeys`
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
LuciferYang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578788755

## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala:

@@ -16,399 +16,399 @@
 package org.apache.spark.internal

+trait LogKey
+
 /**
  * Various keys used for mapped diagnostic contexts(MDC) in logging.
  * All structured logging keys should be defined here for standardization.
  */
-object LogKey extends Enumeration {
-  val ACCUMULATOR_ID = Value
-  val ACTUAL_NUM_FILES = Value
-  ... (same hunk as discussed elsewhere in the thread; the listing and the review comment are truncated in the archive)
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
LuciferYang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578786930

## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala:

@@ -16,399 +16,399 @@
 package org.apache.spark.internal

+trait LogKey

Review Comment: Relevant explanatory comments should be added for `LogKey`
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578780993

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:

@@ -20,15 +20,15 @@ package org.apache.spark.sql.catalyst.optimizer
 import scala.collection.mutable

 import org.apache.spark.SparkException
-import org.apache.spark.internal.LogKey._
+import org.apache.spark.internal.LogKeys._
 import org.apache.spark.internal.MDC
 import org.apache.spark.sql.catalyst.SQLConfHelper
 import org.apache.spark.sql.catalyst.analysis._
 import org.apache.spark.sql.catalyst.catalog.{InMemoryCatalog, SessionCatalog}
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.expressions.aggregate._
 import org.apache.spark.sql.catalyst.plans._
-import org.apache.spark.sql.catalyst.plans.logical.{RepartitionOperation, _}
+import org.apache.spark.sql.catalyst.plans.logical._

Review Comment: This change is not related to the `LogKey -> LogKeys` renaming, but it is more reasonable this way.
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578773809

## common/utils/src/main/scala/org/apache/spark/internal/README.md:

@@ -2,7 +2,7 @@
 ## LogKey

-LogKeys serve as identifiers for mapped diagnostic contexts (MDC) within logs. Follow these guidelines when adding new LogKeys:
+LogKey serve as identifiers for mapped diagnostic contexts (MDC) within logs. Follow these guidelines when adding new LogKey:

Review Comment:
```suggestion
`LogKey`s serve as identifiers for mapped diagnostic contexts (MDC) within logs. Follow these guidelines when adding a new LogKey:
```
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578772766

## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala:

@@ -21,12 +21,12 @@
 import org.json4s.{Formats, NoTypeHints}
 import org.json4s.jackson.Serialization

 import org.apache.spark.internal.{Logging, MDC}
-import org.apache.spark.internal.LogKey.{CONFIG, DEFAULT_VALUE, NEW_VALUE, OLD_VALUE, TIP}
+import org.apache.spark.internal.LogKeys.{CONFIG, DEFAULT_VALUE, NEW_VALUE, OLD_VALUE, TIP}
 import org.apache.spark.io.CompressionCodec
 import org.apache.spark.sql.RuntimeConfig
 import org.apache.spark.sql.connector.read.streaming.{Offset => OffsetV2, SparkDataStream}
 import org.apache.spark.sql.execution.streaming.state.{FlatMapGroupsWithStateExecHelper, StreamingAggregationStateManager, SymmetricHashJoinStateManager}
-import org.apache.spark.sql.internal.SQLConf.{FLATMAPGROUPSWITHSTATE_STATE_FORMAT_VERSION, _}
+import org.apache.spark.sql.internal.SQLConf._

Review Comment: This change is not related to the `LogKey -> LogKeys` renaming, but it is more reasonable this way.
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578769212 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,21 @@ trait LoggingSuiteBase } } + private val externalSystemCustomLog = Review Comment: Add a UT to test how an external third party can use `LogKey`.
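The shape of such a test can be sketched outside Spark's codebase as follows. This is a hedged mock-up, not Spark's actual `Logging`/`MDC` API: the simplified `LogKey`, `MDC`, and `render` below are stand-ins that only illustrate how a key defined in a third-party module can flow through the same structured-logging path as a built-in one.

```scala
// Hedged sketch: simplified stand-ins for Spark's structured-logging types,
// showing how an externally defined key could participate in MDC rendering.
abstract class LogKey(val name: String)

// Keys defined inside Spark would live in a central object...
object SparkKeys {
  case object APP_ID extends LogKey("app_id")
}

// ...while a third-party project defines its own keys the same way.
object CustomKeys {
  case object CUSTOM_LOG_KEY extends LogKey("custom_log_key")
}

final case class MDC(key: LogKey, value: String)

// Render the MDC pairs the way a pattern layout might.
def render(mdcs: Seq[MDC]): String =
  mdcs.map(m => s"${m.key.name}=${m.value}").mkString(" ")

val line = render(Seq(MDC(SparkKeys.APP_ID, "app-1"), MDC(CustomKeys.CUSTOM_LOG_KEY, "external")))
println(line) // app_id=app-1 custom_log_key=external
```

Because `CUSTOM_LOG_KEY` subclasses the same base type, a UT only needs to assert that the rendered context contains the externally defined key alongside the built-in ones.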
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578768196 ## common/utils/src/main/scala/org/apache/spark/internal/README.md: ## @@ -2,7 +2,7 @@ ## LogKey -LogKeys serve as identifiers for mapped diagnostic contexts (MDC) within logs. Follow these guidelines when adding new LogKeys: +LogKey serve as identifiers for mapped diagnostic contexts (MDC) within logs. Follow these guidelines when adding new LogKey: Review Comment: I'm not sure whether it's more accurate to call it `LogKeys` or `LogKey`.
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578641891 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -16,393 +16,393 @@ */ package org.apache.spark.internal +trait ILogKey + /** * Various keys used for mapped diagnostic contexts(MDC) in logging. * All structured logging keys should be defined here for standardization. */ -object LogKey extends Enumeration { - val ACCUMULATOR_ID = Value - val ACTUAL_NUM_FILES = Value - val ACTUAL_PARTITION_COLUMN = Value - val ALPHA = Value - val ANALYSIS_ERROR = Value - val APP_ATTEMPT_ID = Value - val APP_DESC = Value - val APP_ID = Value - val APP_NAME = Value - val APP_STATE = Value - val ARGS = Value - val BACKUP_FILE = Value - val BATCH_ID = Value - val BLOCK_ID = Value - val BLOCK_MANAGER_ID = Value - val BROADCAST_ID = Value - val BUCKET = Value - val BYTECODE_SIZE = Value - val CACHED_TABLE_PARTITION_METADATA_SIZE = Value - val CACHE_AUTO_REMOVED_SIZE = Value - val CACHE_UNTIL_HIGHEST_CONSUMED_SIZE = Value - val CACHE_UNTIL_LAST_PRODUCED_SIZE = Value - val CALL_SITE_LONG_FORM = Value - val CATALOG_NAME = Value - val CATEGORICAL_FEATURES = Value - val CHECKPOINT_FILE = Value - val CHECKPOINT_TIME = Value - val CHECKSUM_FILE_NUM = Value - val CLASS_LOADER = Value - val CLASS_NAME = Value - val CLUSTER_CENTROIDS = Value - val CLUSTER_ID = Value - val CLUSTER_LABEL = Value - val CLUSTER_LEVEL = Value - val CLUSTER_WEIGHT = Value - val CODEC_LEVEL = Value - val CODEC_NAME = Value - val CODEGEN_STAGE_ID = Value - val COLUMN_DATA_TYPE_SOURCE = Value - val COLUMN_DATA_TYPE_TARGET = Value - val COLUMN_DEFAULT_VALUE = Value - val COLUMN_NAME = Value - val COMMAND = Value - val COMMAND_OUTPUT = Value - val COMPONENT = Value - val CONFIG = Value - val CONFIG2 = Value - val CONFIG3 = Value - val CONFIG4 = Value - val CONFIG5 = Value - val CONSUMER = Value - val CONTAINER = Value - val CONTAINER_ID = Value - val CONTAINER_STATE = Value - val COST = 
Value - val COUNT = Value - val CROSS_VALIDATION_METRIC = Value - val CROSS_VALIDATION_METRICS = Value - val CSV_HEADER_COLUMN_NAME = Value - val CSV_HEADER_COLUMN_NAMES = Value - val CSV_HEADER_LENGTH = Value - val CSV_SCHEMA_FIELD_NAME = Value - val CSV_SCHEMA_FIELD_NAMES = Value - val CSV_SOURCE = Value - val CURRENT_PATH = Value - val DATA = Value - val DATABASE_NAME = Value - val DATAFRAME_CACHE_ENTRY = Value - val DATAFRAME_ID = Value - val DATA_FILE_NUM = Value - val DATA_SOURCE = Value - val DATA_SOURCES = Value - val DATA_SOURCE_PROVIDER = Value - val DEFAULT_ISOLATION_LEVEL = Value - val DEFAULT_VALUE = Value - val DELAY = Value - val DELEGATE = Value - val DELTA = Value - val DESCRIPTION = Value - val DESIRED_PARTITIONS_SIZE = Value - val DIFF_DELTA = Value - val DIVISIBLE_CLUSTER_INDICES_SIZE = Value - val DRIVER_ID = Value - val DROPPED_PARTITIONS = Value - val DURATION = Value - val EFFECTIVE_STORAGE_LEVEL = Value - val ELAPSED_TIME = Value - val ENCODING = Value - val END_INDEX = Value - val END_POINT = Value - val ENGINE = Value - val ERROR = Value - val ESTIMATOR_PARAMETER_MAP = Value - val EVENT_LOOP = Value - val EVENT_QUEUE = Value - val EXECUTE_INFO = Value - val EXECUTE_KEY = Value - val EXECUTION_PLAN_LEAVES = Value - val EXECUTOR_DESIRED_COUNT = Value - val EXECUTOR_ENVS = Value - val EXECUTOR_ENV_REGEX = Value - val EXECUTOR_ID = Value - val EXECUTOR_IDS = Value - val EXECUTOR_LAUNCH_COMMANDS = Value - val EXECUTOR_LAUNCH_COUNT = Value - val EXECUTOR_RESOURCES = Value - val EXECUTOR_STATE = Value - val EXECUTOR_TARGET_COUNT = Value - val EXIT_CODE = Value - val EXPECTED_NUM_FILES = Value - val EXPECTED_PARTITION_COLUMN = Value - val EXPIRY_TIMESTAMP = Value - val EXPR = Value - val EXPR_TERMS = Value - val EXTENDED_EXPLAIN_GENERATOR = Value - val FAILURES = Value - val FALLBACK_VERSION = Value - val FEATURE_COLUMN = Value - val FEATURE_DIMENSION = Value - val FIELD_NAME = Value - val FILE_ABSOLUTE_PATH = Value - val FILE_FORMAT = Value - 
val FILE_FORMAT2 = Value - val FILE_NAME = Value - val FILE_VERSION = Value - val FINISH_TRIGGER_DURATION = Value - val FROM_OFFSET = Value - val FROM_TIME = Value - val FUNCTION_NAME = Value - val FUNCTION_PARAMETER = Value - val GROUP_ID = Value - val HADOOP_VERSION = Value - val HASH_JOIN_KEYS = Value - val HEARTBEAT_INTERVAL = Value - val HISTORY_DIR = Value - val HIVE_CLIENT_VERSION = Value - val HIVE_METASTORE_VERSION = Value - val HIVE_OPERATION_STATE = Value - val HIVE_OPERATION_TYPE = Value - val HOST = Value - val HOST_PORT = Value - val INCOMPATIBLE_TYPES = Value - val INDEX = Value - val INDEX_FILE_NUM = Value - val INDEX_NAME = Value - val INFERENCE_MODE = Value - val INITIAL_CAPACITY = Value - val INITIAL_HEARTBEAT_INTERVAL = Value - val INIT_MODE = Value - val INTERVAL = Value - val ISOLATION_L
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578640785 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## (quoting the same `LogKey -> LogKeys` diff hunk shown above)
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578640764 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## (quoting the same `LogKey -> LogKeys` diff hunk shown above)
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578638859 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## (quoting the same `LogKey -> LogKeys` diff hunk shown above)
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578624819 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An enumeration value that is not defined in `LogKey`, you can define it anywhere + private val externalLogKey = LogKey.VALUE + private val externalLog = log"${MDC(externalLogKey, "External log message.")}" + test("Logging with external LogKey") { Review Comment: Yeah, creating a trait seems better. I have some preferences about the naming; see my other comments.
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578624300 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## (quoting the same `LogKey -> LogKeys` diff hunk shown above)
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578623991 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -16,393 +16,393 @@ */ package org.apache.spark.internal +trait ILogKey Review Comment: Rename it as `LogKey`?
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578623558 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An enumeration value that is not defined in `LogKey`, you can define it anywhere + private val externalLogKey = LogKey.VALUE + private val externalLog = log"${MDC(externalLogKey, "External log message.")}" + test("Logging with external LogKey") { Review Comment: @gengliangwang What do you think of the current modification approach? If we define it as a string, it feels like there is no binding requirement for `LogKey` at all. Perhaps we need more eyes to review the changes above?
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578608646 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An enumeration value that is not defined in `LogKey`, you can define it anywhere + private val externalLogKey = LogKey.VALUE + private val externalLog = log"${MDC(externalLogKey, "External log message.")}" + test("Logging with external LogKey") { Review Comment: We should definitely allow external libraries to extend the log keys, but I don't have a good suggestion for this one right now. Either making it a trait or making the keys strings is OK.
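For comparison, the string-key alternative mentioned here can be sketched as follows. This is an illustrative assumption, not what was merged: plain string keys are maximally open to external code, but nothing binds a call site to a key registry, which is exactly the "no binding requirement" concern.

```scala
// Hedged sketch: string-typed MDC keys. Any call site can mint a key ad hoc,
// so typos like "appid" vs "app_id" compile without complaint.
final case class StringMDC(key: String, value: String)

val ok   = StringMDC("app_id", "app-1")
val typo = StringMDC("appid", "app-1") // compiles fine; no binding to any registry

// A minimal mitigation: validate keys against a known registry at runtime.
val registry = Set("app_id", "executor_id")
def isRegistered(m: StringMDC): Boolean = registry.contains(m.key)

println(isRegistered(ok))   // true
println(isRegistered(typo)) // false
```

The trait approach moves this check from runtime to the type system, which is the trade-off being weighed in the thread.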
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1577686240 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An enumeration value that is not defined in `LogKey`, you can define it anywhere + private val externalLogKey = LogKey.VALUE + private val externalLog = log"${MDC(externalLogKey, "External log message.")}" + test("Logging with external LogKey") { Review Comment: If necessary in the future, I think it would be easier to implement based on the current changes, such as:
- `ILogKey`
- `InternalLogKey` extends `ILogKey`
- `LogKey` extends `InternalLogKey`
- `ExternalLogKey` extends `ILogKey`
- `ThirdPartyLogKey` extends `ExternalLogKey`
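The proposed hierarchy can be sketched directly. Everything below simply mirrors the names listed in the comment; they are illustrative, not necessarily what the PR ultimately merged.

```scala
// Illustrative sketch of the proposed key hierarchy.
trait ILogKey { def name: String }
trait InternalLogKey extends ILogKey // keys defined inside Spark
trait ExternalLogKey extends ILogKey // extension point for third parties

object LogKeys {
  case object APP_ID extends InternalLogKey { val name = "app_id" }
}

object ThirdPartyLogKeys {
  case object CUSTOM_METRIC extends ExternalLogKey { val name = "custom_metric" }
}

// Both kinds of key can be used wherever an ILogKey is expected, while the
// marker traits still let callers tell the two origins apart.
def isSparkInternal(key: ILogKey): Boolean = key.isInstanceOf[InternalLogKey]
```

One design note: making `MDC` accept the root trait keeps the logging API uniform, while the intermediate marker traits preserve the ability to enforce Spark-internal policies (e.g. "all internal keys must be registered centrally") without constraining external users.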
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
LuciferYang commented on code in PR #46193:
URL: https://github.com/apache/spark/pull/46193#discussion_r1577602188

## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala:

```diff
@@ -144,6 +146,22 @@ trait LoggingSuiteBase
     }
   }
+  // An enumeration value that is not defined in `LogKey`, you can define it anywhere
+  private val externalLogKey = LogKey.VALUE
+  private val externalLog = log"${MDC(externalLogKey, "External log message.")}"
+  test("Logging with external LogKey") {
```

Review Comment:
   Perhaps we also need to consider namespace isolation. Currently, we can define keys with the same name in multiple places, but we cannot distinguish from the `MessageWithContext` whether they are defined internally in Spark or externally.
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
LuciferYang commented on code in PR #46193:
URL: https://github.com/apache/spark/pull/46193#discussion_r1577167593

## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala:

```diff
@@ -144,6 +146,22 @@ trait LoggingSuiteBase
     }
   }
+  // An enumeration value that is not defined in `LogKey`, you can define it anywhere
+  private val externalLogKey = LogKey.VALUE
+  private val externalLog = log"${MDC(externalLogKey, "External log message.")}"
+  test("Logging with external LogKey") {
```

Review Comment:
   In https://github.com/apache/spark/pull/46186/files, I tried relaxing the type of `MDC.key` to `Enumeration#Value`, which allows other enumeration types to be used as `MDC.key` and adds a bit of extensibility.

   ![image](https://github.com/apache/spark/assets/1475305/60b2fd34-61de-4f92-8f10-cba983bf6718)
Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]
panbingkun commented on code in PR #46193:
URL: https://github.com/apache/spark/pull/46193#discussion_r1577397848

## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala:

```diff
@@ -144,6 +146,22 @@ trait LoggingSuiteBase
     }
   }
+  // An enumeration value that is not defined in `LogKey`, you can define it anywhere
+  private val externalLogKey = LogKey.VALUE
+  private val externalLog = log"${MDC(externalLogKey, "External log message.")}"
+  test("Logging with external LogKey") {
```

Review Comment:
   @LuciferYang @gengliangwang I have updated the change to remove the `enumeration`. Projects in the external Spark ecosystem can then use the structured logging mechanism by declaring keys in the following form:
   ```scala
   case object CUSTOM_LOG_KEY extends ILogKey
   ```
   This form is more in line with the traditional Spark style, and it lets us constrain the class of the MDC `key` parameter.
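As a rough illustration of how an external project might pick up this case-object pattern, here is a self-contained sketch. The `ILogKey` trait and `MDC` class below are simplified stand-ins written for this example, not the real `org.apache.spark.internal` classes, and `CUSTOM_LOG_KEY` is a hypothetical key name:

```scala
// Hedged sketch: ILogKey and MDC are simplified stand-ins for this example,
// not the actual Spark Logging API; CUSTOM_LOG_KEY is a hypothetical key.
object ExternalLogKeySketch {
  // Marker trait that external key definitions would extend.
  trait ILogKey {
    // Derive the structured-log field name from the case object's name.
    def name: String = this.toString.toLowerCase
  }

  // A hypothetical key defined by an external library.
  case object CUSTOM_LOG_KEY extends ILogKey

  // Simplified MDC stand-in: binds a key to a value for structured output.
  final case class MDC(key: ILogKey, value: Any) {
    def render: String = s"${key.name}=${value}"
  }

  def main(args: Array[String]): Unit = {
    val mdc = MDC(CUSTOM_LOG_KEY, "External log message.")
    println(mdc.render) // custom_log_key=External log message.
  }
}
```

Because MDC accepts any `ILogKey`, Spark-internal and third-party keys share one type at the call site while still being defined in separate namespaces.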