jieguangzhou opened a new issue, #10372: URL: https://github.com/apache/dolphinscheduler/issues/10372
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.

### Description

In MLOps scenarios, data versioning is an important module: it helps data scientists track versions of their data so that machine learning experiments, as well as data sharing and management within a team, can be carried out more clearly.

[DVC](https://github.com/iterative/dvc) is an open-source version control system for machine learning projects. I think a DVC task plugin in DolphinScheduler could help users manage data versions through visual operations.

### Use case

- [ ] Easily add or update data in a data repository and tag the data version
- [ ] Easily download a specific version of data from a data repository
- [ ] Manage large files; as the DVC project puts it: "Switching to a different version of a 100Gb file in less than a second with a git checkout"

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
