klesh opened a new issue, #2538: URL: https://github.com/apache/incubator-devlake/issues/2538
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and found no similar feature requirement.

### Description

# Why

Currently, users can choose between MySQL and PostgreSQL as the database system to store and analyze collected data. This works fine in most cases, but it becomes a limitation when the dataset grows too large to handle. We have had to shut down our demo server twice due to storage/memory overflow, so we can reasonably expect that large-scale analysis would not be practical with a traditional RDBMS (MySQL/PostgreSQL).

# What

I propose that we investigate the possibility of adopting a distributed columnar database system, such as StarRocks, as our storage.

Pros:
- It supports the MySQL protocol.
- It is distributed, so storage, memory, and computation can scale out across nodes.
- It is columnar, which is more efficient for storage and search.

Cons:
- It supports only a subset of standard SQL statements, so some queries might not be possible.
- It is weak at updates and may not support inserting rows in the middle of a table (to be confirmed), so we might have to redesign our data collect-update-convert logic.

# How

By introducing StarRocks, we would no longer be bound by the storage and memory of a single server, which makes large-scale analysis possible. It may also let us support other kinds of big data.

I propose that we proceed with the following steps:
1. Assign a veteran developer to investigate StarRocks and evaluate the feasibility of adopting it.
2. A report should be submitted to the community within 5 workdays.
3. The PPMC members should evaluate the report and make a decision within 3 workdays, while all committers can share their thoughts.
4. We will schedule the implementation afterward.

### Use case

Users may run large-scale analysis in Apache DevLake.

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
