decster opened a new issue #3382: URL: https://github.com/apache/incubator-doris/issues/3382
Currently, the underlying storage engine in Doris supports only append-style batch writes; update, upsert, and delete by primary key (as in a traditional DBMS or KV store) are not supported. As Doris becomes more popular, many use cases are starting to request a `realtime update` capability, e.g.

* eCommerce: real-time analytics on order and inventory status
* Finance: account/balance/transfer checking
* Social media: user/post status updates and statistics

Also, updating a Doris rollup table with a new data batch can be viewed as a set of `counter` updates, which may also benefit from this storage engine.

Currently, Doris uses a special `REPLACE` column property to `simulate` upsert semantics. This is essentially merge-on-read: when scanning a tablet, Doris automatically merges the versions stored under the same key and keeps only the latest one. In real-time use cases with a high ingestion frequency, many segments have to be merged on every scan, which becomes a performance bottleneck.

We propose a new memory-optimized column storage engine to support the `realtime` + `frequent update` use case, borrowing some ideas from Kudu.

The design doc can be found [here](https://decster.github.io/docs/choco.pdf).
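To make the merge-on-read cost concrete, here is a minimal Python sketch of the idea described in the issue. It is an illustration only, not Doris code: the `(key, version, value)` row layout, the `merge_on_read` name, and the assumption that each segment is already sorted by `(key, version)` are all hypothetical.

```python
import heapq
from typing import Iterator, List, Tuple

# Hypothetical row layout for illustration: (key, version, value).
Row = Tuple[str, int, str]


def merge_on_read(segments: List[List[Row]]) -> Iterator[Tuple[str, str]]:
    """Merge segments at scan time and yield the latest value per key.

    Assumes every segment is already sorted by (key, version).
    """
    # k-way merge across all segments, ordered by key and then version.
    merged = heapq.merge(*segments, key=lambda row: (row[0], row[1]))

    current_key = None
    latest_value = None
    for key, _version, value in merged:
        if current_key is not None and key != current_key:
            # Key changed: the last value seen had the highest version.
            yield current_key, latest_value
        current_key, latest_value = key, value

    if current_key is not None:
        yield current_key, latest_value


if __name__ == "__main__":
    segments = [
        [("a", 1, "x"), ("c", 1, "z")],  # oldest batch
        [("a", 2, "y"), ("b", 2, "q")],  # newer batch overwrites "a"
        [("c", 3, "w")],                 # newest batch overwrites "c"
    ]
    for key, value in merge_on_read(segments):
        print(key, value)
```

In this sketch every scan has to read and merge all segments, so the work per scan grows with the number of ingested batches, which is the bottleneck the issue describes for high-frequency ingestion.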
