Hi Aniket Adnaik,

This is a great design document for update/delete, which is a very useful feature for CarbonData.
For the solution you proposed, I think the most difficult challenge is compaction. Without careful attention, rewriting data over and over can lead to serious network and disk over-subscription. In other words, compaction trades some disk I/O now for fewer seeks later; HBase and LevelDB face the same issue.

The following compaction strategies from LevelDB/HBase could serve as references for the detailed design. FYI:

- FIFO Compaction (HBASE-14468 <https://issues.apache.org/jira/browse/HBASE-14468>)
- Tier-Based Compaction (HBASE-7055 <https://issues.apache.org/jira/browse/HBASE-7055>, HBASE-14477 <https://issues.apache.org/jira/browse/HBASE-14477>)
- Level Compaction (LevelDB Implementation notes <https://rawgit.com/google/leveldb/master/doc/impl.html>) / Stripe Compaction (HBASE-7667 <https://issues.apache.org/jira/browse/HBASE-7667>)

Please correct me if I am wrong.

Regards,
He Xiaoqiao

On Sun, Nov 20, 2016 at 11:54 PM, Aniket Adnaik <[email protected]> wrote:

> Hi All,
>
> Please find a design doc for Update/Delete support in CarbonData.
>
> https://drive.google.com/file/d/0B71_EuXTdDi8S2dxVjN6Z1RhWlU/view?
> usp=sharing
>
> Best Regards,
> Aniket
>
