narrowizard opened a new issue, #8216:
URL: https://github.com/apache/incubator-devlake/issues/8216

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and 
found no similar feature requirement.
   
   
   ### Use case
   
   As a DevLake user leveraging the Customize plugin to upload issues and 
issue_repo_commits data for further analysis, I need the ability to perform 
incremental CSV uploads. This would allow me to append new data to existing 
records without overwriting or replacing the entire dataset.
   
   ### Description
   
   Currently, the Customize plugin in DevLake only supports full data uploads, 
which replace all existing data with the new data from the uploaded CSV file. 
While this functionality works for initial data loads, it poses significant 
challenges as the dataset grows over time:
   
   1. Data Integrity Risks: Full uploads may inadvertently overwrite or lose 
historical data, compromising the dataset's accuracy and completeness.
   2. File Maintenance Overhead: CSV files become increasingly large as time 
progresses, making them cumbersome to maintain and manage.
   
   To address these challenges, I propose adding incremental upload support to 
the Customize plugin. This feature would enable users to append new records 
from CSV files to the existing dataset without requiring a complete overwrite.
   
   Benefits:
   
   - Enhanced Data Integrity: Ensures existing data remains untouched while 
appending new entries.
   - Improved Scalability: Reduces the need to maintain and manage increasingly 
large CSV files.
   - Better User Experience: Simplifies data upload workflows for users.
   
   I envision this feature functioning as follows:
   
   1. Users upload a new CSV file containing only new data entries.
   2. The Customize plugin compares the uploaded data with existing records.
   3. New records are appended to the domain layer, while existing records 
remain unchanged.
   
   This functionality would greatly improve the usability of the Customize 
plugin and make it more suitable for long-term data collection and analysis 
workflows.
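   
   A minimal sketch of the three steps above, to make the intended behavior 
concrete. It is not the Customize plugin's actual API: the `incremental_append` 
helper, the in-memory `existing` dict standing in for the stored table, and the 
use of an `id` column as the record key are all assumptions for illustration.

```python
import csv
import io


def incremental_append(existing, csv_text, key="id"):
    """Append rows from csv_text whose key is not already present.

    existing: dict mapping key value -> row dict. It is a hypothetical
    stand-in for the stored table, not a real DevLake structure.
    Returns the list of newly appended rows.
    """
    appended = []
    # Step 1: read the uploaded CSV, which should contain only new entries.
    for row in csv.DictReader(io.StringIO(csv_text)):
        row_key = row[key]
        # Step 2: compare the uploaded row with the existing records.
        if row_key in existing:
            # Step 3a: the record already exists, so leave it unchanged.
            continue
        # Step 3b: the record is new, so append it.
        existing[row_key] = row
        appended.append(row)
    return appended


# Sample data (made up for this sketch).
existing = {"1": {"id": "1", "title": "Login fails"}}
upload = "id,title\n1,Login fails (edited)\n2,Add dark mode\n"
new_rows = incremental_append(existing, upload)
```

   With this sample data, only the row with `id` 2 is appended, and the stored 
row with `id` 1 keeps its original title even though the upload carries a 
different one.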
   
   Let me know if additional details or clarifications are needed!
   
   ### Related issues
   
   No
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
