jieguangzhou opened a new issue, #10372:
URL: https://github.com/apache/dolphinscheduler/issues/10372

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar feature requirement.
   
   
   ### Description
   
   In MLOps, data versioning is an important capability. It lets data 
scientists track versions of their data so that machine learning experiments, 
team data sharing, and data management are easier to follow.
   
   [DVC](https://github.com/iterative/dvc) is an open-source version control 
system for machine learning projects. I think a DVC task plugin in 
DolphinScheduler could help users manage data versions through visual 
operations.
   
   ### Use case
   
   - [ ] Easily add or update data in a data repository and tag the data 
version
   - [ ] Easily download a specific version of data from a data repository
   - [ ] Manage large files. As the DVC project puts it: "Switching to a 
different version of a 100Gb file in less than a second with a git checkout"
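
To make the first two use cases concrete, here is a minimal sketch of the DVC and Git commands such a task might run. It is not the plugin's implementation: the function names `upload_commands` and `download_command` and all of their parameters are hypothetical, and the functions only build the command lines rather than executing them. It assumes the data repository is a Git repository already initialized with DVC and configured with a DVC remote.

```python
import posixpath
from typing import List


def upload_commands(data_path: str, version: str, message: str) -> List[List[str]]:
    """Build the commands that track data_path with DVC and tag it in Git.

    Hypothetical helper: assumes the working directory is a Git repository
    initialized with DVC and that a default DVC remote is configured.
    """
    # `dvc add` writes <data_path>.dvc and a .gitignore next to the data.
    dvc_file = f"{data_path}.dvc"
    gitignore = posixpath.join(posixpath.dirname(data_path), ".gitignore")
    return [
        ["dvc", "add", data_path],
        ["git", "add", dvc_file, gitignore],
        ["git", "commit", "-m", message],
        ["git", "tag", "-a", version, "-m", message],
        ["dvc", "push"],  # upload the data itself to the DVC remote
        ["git", "push", "--follow-tags"],  # publish the commit and its tag
    ]


def download_command(repo_url: str, data_path: str, version: str, out_path: str) -> List[str]:
    """Build the command that fetches one tagged version of the data."""
    return ["dvc", "get", repo_url, data_path, "--rev", version, "-o", out_path]


if __name__ == "__main__":
    for cmd in upload_commands("data/train.csv", "v1.0", "add training data v1.0"):
        print(" ".join(cmd))
    print(" ".join(download_command(
        "https://example.com/data-repo.git",  # placeholder URL
        "data/train.csv", "v1.0", "train.csv")))
```

Each list could be passed to `subprocess.run(cmd, check=True)` inside the repository. Returning the commands as data keeps the sketch easy to inspect and test without DVC installed.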
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: 
[email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
