Hi community,

Related issue: https://github.com/apache/dolphinscheduler/issues/13017
Currently, DS only supports writing task logs to the worker's local file system, so this issue discusses the design of a remote logging feature.

# Why remote logging?

* Avoid losing task logs after a worker is torn down
* Make it easier to obtain logs and troubleshoot once they are aggregated in remote storage
* Improve cloud-native support for DS

# Feature Design

## Connect to different remote targets

DS should support a variety of common remote storage services and stay extensible enough to add other types of remote storage later (see sketch 1 at the end of this post for a possible pluggable interface):

* S3
* OSS
* ElasticSearch
* Azure Blob Storage
* Google Cloud Storage
* ...

## When to write logs to remote storage

Like Airflow, DS writes the task log to remote storage after the task completes (success or failure); see sketch 2 at the end of this post.

## How to read logs

Since the task log may exist both on the worker's local disk and in remote storage, the `api-server` needs a reading strategy when it fetches the log of a task instance. Airflow first tries to read the remotely stored log and falls back to the local log if that fails. I would prefer the opposite: try the local log first, and read the remote log only if the local log file does not exist (see sketch 3 below). We could discuss this further.

## Log retention strategy

For example, a maximum capacity could be set for the remote storage, and old logs could be deleted on a rolling basis (see sketch 4 below).

# Sub-tasks

WIP

Any comments or suggestions are welcome.

Best Regards,
Rick Cheng
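---

Below are some rough, illustrative sketches of how these pieces could fit together. All class, method, and key names are made up for discussion and are not existing DS APIs.

Sketch 1: a minimal SPI-style interface for pluggable remote log storage. Each remote target (S3, OSS, ElasticSearch, Azure Blob Storage, Google Cloud Storage, ...) could provide its own implementation, selected by configuration on the worker and `api-server`:

```java
import java.io.IOException;
import java.nio.file.Path;

/**
 * Hypothetical SPI for remote log storage. Each remote target would provide
 * its own implementation, chosen by configuration.
 */
public interface RemoteLogHandler {

    /** Upload a finished task's local log file to remote storage under the given key. */
    void sendRemoteLog(Path localLogFile, String remoteKey) throws IOException;

    /** Download a task log from remote storage into a local file. */
    void getRemoteLog(String remoteKey, Path localTarget) throws IOException;

    /** Whether a log object with the given key exists in remote storage. */
    boolean exists(String remoteKey) throws IOException;
}
```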
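Sketch 2: a worker-side hook that uploads the local log file once the task instance reaches a terminal state. The hook point (`onTaskFinished`) and the remote key layout are assumptions for illustration only, and the `RemoteLogHandler` interface is the one from sketch 1:

```java
import java.io.IOException;
import java.nio.file.Path;

/**
 * Hypothetical worker-side hook: once a task instance finishes (success or
 * failure), upload its local log file to the configured remote storage.
 */
public class RemoteLogUploader {

    private final RemoteLogHandler remoteLogHandler; // see sketch 1

    public RemoteLogUploader(RemoteLogHandler remoteLogHandler) {
        this.remoteLogHandler = remoteLogHandler;
    }

    /** Called once when the task reaches a terminal state. */
    public void onTaskFinished(int taskInstanceId, Path localLogFile) {
        // Assumed key layout: one remote object per task instance.
        String remoteKey = "logs/task-instance-" + taskInstanceId + ".log";
        try {
            remoteLogHandler.sendRemoteLog(localLogFile, remoteKey);
        } catch (IOException e) {
            // Uploading should be best-effort: a failed upload must not fail the task.
            // A real implementation would log the error and possibly retry.
        }
    }
}
```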
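Sketch 3: the "local first, remote fallback" read path I prefer. For simplicity, reading "locally" is shown as reading a file path directly; in DS the `api-server` would presumably go through its existing mechanism for reading the worker-local log first, then fall back to the remote copy:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical read path: prefer the local log file; fall back to the remote
 * copy only when the local file is missing (e.g. the worker has been torn down).
 */
public class TaskLogReader {

    private final RemoteLogHandler remoteLogHandler; // see sketch 1

    public TaskLogReader(RemoteLogHandler remoteLogHandler) {
        this.remoteLogHandler = remoteLogHandler;
    }

    public String readTaskLog(Path localLogFile, String remoteKey) throws IOException {
        if (Files.exists(localLogFile)) {
            // The local log is still on disk: read it directly.
            return new String(Files.readAllBytes(localLogFile), StandardCharsets.UTF_8);
        }
        // The local log is gone: fetch the remote copy into a temporary file.
        Path tmp = Files.createTempFile("remote-task-log-", ".log");
        try {
            remoteLogHandler.getRemoteLog(remoteKey, tmp);
            return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```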
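Sketch 4: rolling retention by total size. Whether DS enforces retention itself or delegates it to the storage's native lifecycle rules (e.g. S3/OSS lifecycle policies) is an open question; the listing/deletion interface here is purely illustrative:

```java
import java.io.IOException;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Hypothetical rolling-retention job: when the total size of the stored log
 * objects exceeds a configured cap, delete the oldest objects first until the
 * total size fits under the cap again.
 */
public class RemoteLogRetentionJob {

    /** Minimal view of a stored log object (illustrative only). */
    public static class RemoteLogObject {
        final String key;
        final long sizeBytes;
        final Instant lastModified;

        RemoteLogObject(String key, long sizeBytes, Instant lastModified) {
            this.key = key;
            this.sizeBytes = sizeBytes;
            this.lastModified = lastModified;
        }
    }

    /** Listing/deletion operations the job needs from the storage layer. */
    public interface RemoteLogStore {
        List<RemoteLogObject> listLogObjects() throws IOException;
        void delete(String key) throws IOException;
    }

    private final RemoteLogStore store;
    private final long maxTotalBytes;

    public RemoteLogRetentionJob(RemoteLogStore store, long maxTotalBytes) {
        this.store = store;
        this.maxTotalBytes = maxTotalBytes;
    }

    /** Intended to run periodically, e.g. from a scheduled thread. */
    public void runOnce() throws IOException {
        List<RemoteLogObject> oldestFirst = store.listLogObjects().stream()
                .sorted(Comparator.comparing((RemoteLogObject o) -> o.lastModified))
                .collect(Collectors.toList());

        long totalBytes = oldestFirst.stream().mapToLong(o -> o.sizeBytes).sum();
        for (RemoteLogObject obj : oldestFirst) {
            if (totalBytes <= maxTotalBytes) {
                break;
            }
            store.delete(obj.key); // delete the oldest logs first
            totalBytes -= obj.sizeBytes;
        }
    }
}
```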
