pan3793 commented on PR #46947: URL: https://github.com/apache/spark/pull/46947#issuecomment-2164258527
@gengliangwang thanks for the summary.

> To address concerns about the potential overload of variables for users ...

The major concerns come from public API exposure. Spark is conservative about deleting deprecated public APIs; for example, `HiveContext` has been marked as deprecated since 2.0 and it still exists. Nearly a thousand `LogKeys` were added in a short time and exposed as public API.

> The primary reason for migrating all variables in our log messages is to avoid ongoing debates about which specific keys should be included as log keys.

IMO the debates are valuable; I don't think all variables are suitable for the `LogKey` concept. Additionally, we may need to tune the original logs to adapt to the new structured logging framework. For example, exposing `TASK_ID` in all task-specific logs makes it easy to filter for each task's logs.
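To illustrate the filtering point, here is a minimal sketch that assumes logs are emitted as one JSON object per line. The field names `task_id`, `level`, and `msg`, the helper `filter_by_task`, and the sample records are all hypothetical and are not Spark's actual output schema; the sketch only shows why a consistently present task key makes per-task filtering trivial.

```python
import json


def filter_by_task(lines, task_id):
    """Return parsed records whose (assumed) `task_id` field equals task_id."""
    matched = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured JSON
        if isinstance(record, dict) and record.get("task_id") == task_id:
            matched.append(record)
    return matched


# Hypothetical log lines; a real structured log would carry more fields.
sample_logs = [
    '{"level": "INFO", "msg": "Running task", "task_id": "7"}',
    '{"level": "INFO", "msg": "Finished task", "task_id": "8"}',
    'plain text line without structure',
    '{"level": "WARN", "msg": "Spilling to disk", "task_id": "7"}',
]

task_7 = filter_by_task(sample_logs, "7")
for record in task_7:
    print(record["level"], record["msg"])
```

If the key were attached to only some task-specific messages, the same filter would silently miss the rest, which is why the original log statements may need tuning.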
