Hi, all:
Here is a summary of my opinion, together with a new proposal, Option 4.

Option 4:
4) Extend the existing SQL syntax of major and minor compaction, based on the
syntax of delete segment:
    ALTER TABLE tablename COMPACT 'MAJOR' WHERE SEGMENT.ID IN (1,2,3,4)
    ALTER TABLE tablename COMPACT 'MINOR' WHERE SEGMENT.ID IN (1,2,3,4)
    ALTER TABLE tablename COMPACT 'MAJOR' WHERE SEGMENT.STARTTIME BEFORE
        '2017-06-01 12:05:06' AND SEGMENT.STARTTIME AFTER '2017-05-01 12:05:06'
    ALTER TABLE tablename COMPACT 'MINOR' WHERE SEGMENT.STARTTIME BEFORE
        '2017-06-01 12:05:06' AND SEGMENT.STARTTIME AFTER '2017-05-01 12:05:06'
  Note: this syntax is slightly different from that of Option 1.

The previous major compaction (the default, without a condition) is size
based: carbondata chooses the segments by size. For the new major compaction
(with a condition), we specify the segments and let carbondata merge them into
one large segment.
In effect, the previous compaction statement looks like this:
    ALTER TABLE tablename COMPACT 'MAJOR' WHERE SEGMENT_SIZE > XXMB
The condition part 'WHERE SEGMENT_SIZE > XXMB' is implicit, whereas the
condition part in the new compaction statement is explicit.
The two do not differ in purpose: both merge some segments into a larger one.
They differ only in how the segments are selected: by segment size or by
condition. So we don't need another compaction type.
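
To illustrate the point, here is a minimal Python sketch. It is not carbondata
code: Segment, size_above, id_in, started_between and select_segments are
hypothetical names made up for this example, and the sample segments are
invented. It only shows that the implicit size condition and the explicit
conditions of Option 4 can be treated as different predicates feeding the same
selection step.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a segment's metadata."""
    id: int
    size_mb: int
    start_time: datetime


# A selector is a predicate that decides whether a segment takes part
# in the compaction.
Selector = Callable[[Segment], bool]


def size_above(threshold_mb: int) -> Selector:
    # Mirrors the implicit "WHERE SEGMENT_SIZE > XXMB" as written in this mail.
    return lambda s: s.size_mb > threshold_mb


def id_in(ids: Iterable[int]) -> Selector:
    # Mirrors "WHERE SEGMENT.ID IN (...)".
    wanted = set(ids)
    return lambda s: s.id in wanted


def started_between(after: datetime, before: datetime) -> Selector:
    # Mirrors "SEGMENT.STARTTIME BEFORE ... AND SEGMENT.STARTTIME AFTER ...".
    return lambda s: after < s.start_time < before


def select_segments(segments: List[Segment], selector: Selector) -> List[int]:
    # Same selection step for every kind of condition; only the
    # predicate differs. Returns the ids of the segments to merge.
    return [s.id for s in segments if selector(s)]


# Invented sample data, for illustration only.
segments = [
    Segment(1, 100, datetime(2017, 4, 20, 8, 0, 0)),
    Segment(2, 300, datetime(2017, 5, 10, 8, 0, 0)),
    Segment(3, 50, datetime(2017, 5, 20, 8, 0, 0)),
    Segment(4, 700, datetime(2017, 6, 2, 8, 0, 0)),
]

by_size = select_segments(segments, size_above(200))
by_id = select_segments(segments, id_in([1, 2, 3, 4]))
by_time = select_segments(
    segments,
    started_between(datetime(2017, 5, 1, 12, 5, 6),
                    datetime(2017, 6, 1, 12, 5, 6)),
)
```

In this sketch the merge step never needs to know which kind of condition
produced the list of segments, which is the reason a separate compaction type
looks unnecessary to me.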


