[ https://issues.apache.org/jira/browse/GSOC-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Solodovnik updated GSOC-253:
----------------------------------
    Labels: Doris full-time gsoc2024  (was: full-time gsoc2024)

> [GSoC][Doris]Support UPDATE for Doris Duplicate Key Table
> ---------------------------------------------------------
>
>                 Key: GSOC-253
>                 URL: https://issues.apache.org/jira/browse/GSOC-253
>             Project: Comdev GSOC
>          Issue Type: New Feature
>            Reporter: Calvin Kirs
>            Priority: Major
>              Labels: Doris, full-time, gsoc2024
>
> h2. Objectives
> Support UPDATE for Doris Duplicate Key Table
> Currently, Doris supports three data models: Duplicate Key, Aggregate Key,
> and Unique Key. Of these, Unique Key has full data update support
> (including the UPDATE statement). As Doris becomes more widely used, users
> ask more of it. For example, some users need to perform ETL processing
> inside Doris, but they use Duplicate Key tables and hope that Duplicate Key
> tables can also support UPDATE. For Duplicate Key, there is no primary key
> that can help us locate one specific row, so UPDATE is inefficient. The
> usual practice is to rewrite the data: even if the user updates only one
> field of a single row, at least the segment file containing that row must
> be rewritten.
> Another, potentially more efficient, solution is to implement Duplicate Key
> by combining Unique Key's Merge-on-Write (MoW) with an auto_increment
> column. That is, change the underlying implementation of Duplicate Key to
> use Unique Key MoW and add a hidden auto_increment column to the primary
> key, so that no two keys written by the user to the Unique Key MoW table
> are duplicates. This preserves the semantics of Duplicate Key, and since
> each row has a unique primary key, we can reuse the UPDATE capability of
> Unique Key to support UPDATE on Duplicate Key tables.
> We would like participants to help design and implement the solution, run
> performance tests for comparison, and carry out performance optimization.
> h2. Recommended Skills
> Familiar with C++ programming
> Familiar with the storage layer of Doris
> h2. Mentor
> Mentor: Chen Zhang, Apache Doris Committer, chzhang1...@gmail.com
> Mentor: Guolei Yi, Apache Doris PMC Member, yiguo...@gmail.com
> Mailing List: d...@doris.apache.org
> Website: [https://doris.apache.org|https://doris.apache.org/]
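
As a rough illustration of the idea under Objectives (a Duplicate Key table built on a unique-key store plus a hidden auto_increment column), here is a minimal in-memory sketch in Python. It is not Doris code: the class `DuplicateKeyTable` and its methods are hypothetical names invented for this example, and it does not model segment files or the Merge-on-Write machinery itself. It only shows why appending a hidden, ever-increasing id to the key keeps duplicate rows distinct while still letting an update address each row individually.

```python
from itertools import count


class DuplicateKeyTable:
    """Toy model: duplicate-key semantics on top of a unique-key store.

    Hypothetical illustration only; this is not how Doris is implemented.
    """

    def __init__(self):
        # Stands in for the hidden auto_increment column.
        self._next_id = count(1)
        # Unique-key store: (user_key, hidden_id) -> value columns.
        self._rows = {}

    def insert(self, user_key, values):
        # The hidden id makes every primary key distinct, so rows with
        # identical user keys never overwrite each other.
        hidden_id = next(self._next_id)
        self._rows[(user_key, hidden_id)] = dict(values)
        return hidden_id

    def update(self, predicate, changes):
        # Each matching row is addressed through its unique primary key,
        # so only matching rows are touched.
        updated = 0
        for pk, row in list(self._rows.items()):
            if predicate(pk[0], row):
                self._rows[pk] = {**row, **changes}
                updated += 1
        return updated

    def scan(self):
        # The hidden column is not exposed in the user-visible result.
        return [(pk[0], row) for pk, row in self._rows.items()]


table = DuplicateKeyTable()
table.insert("user_1", {"city": "Beijing", "clicks": 1})
table.insert("user_1", {"city": "Beijing", "clicks": 1})  # exact duplicate is kept
table.insert("user_2", {"city": "Shanghai", "clicks": 5})

changed = table.update(lambda key, row: key == "user_1", {"clicks": 2})
print(changed)  # 2
for user_key, row in table.scan():
    print(user_key, row)
```

Both `user_1` rows survive the insert and both are updated, while the `user_2` row is left alone. In a real storage engine the interesting questions are the ones the proposal asks participants to measure, such as the cost of maintaining the extra key column and how update performance compares with rewriting segments.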