After studying the source code further, I realized that the retention algorithm is based on the ready segments kept within Kylin. As long as the oldest and newest segments are within the retention range of each other, no purge is done. It is not based on the current date.
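
Here is a rough sketch of that logic as I understand it. It is simplified and is not Kylin's actual code: the class, the `Segment` type, and `selectExpired` are hypothetical names, and I am assuming the retention range is given in milliseconds and that the ready segments are sorted oldest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types for illustration; these are not Kylin classes.
class RetentionSketch {

    static final long DAY_MS = 24L * 60 * 60 * 1000;

    /** A ready segment covering the time range [startMs, endMs). */
    static final class Segment {
        final String name;
        final long startMs;
        final long endMs;

        Segment(String name, long startMs, long endMs) {
            this.name = name;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    /**
     * Returns the segments that lie entirely outside the retention window.
     * The window is anchored at the end of the newest ready segment,
     * not at the current date. Assumes readySegments is sorted oldest first.
     */
    static List<Segment> selectExpired(List<Segment> readySegments, long retentionMs) {
        List<Segment> expired = new ArrayList<>();
        if (retentionMs <= 0 || readySegments.isEmpty()) {
            return expired; // retention disabled, or nothing to check
        }
        long tail = readySegments.get(readySegments.size() - 1).endMs;
        long head = tail - retentionMs;
        for (Segment seg : readySegments) {
            // Only segments that end at or before the window start are selected;
            // a segment that is partially inside the window is kept.
            if (seg.endMs <= head) {
                expired.add(seg);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        // Two old daily segments; the times are arbitrary offsets.
        List<Segment> old = Arrays.asList(
                new Segment("day0", 0, DAY_MS),
                new Segment("day1", DAY_MS, 2 * DAY_MS));
        // Nothing newer has been built, so nothing is purged,
        // however long ago these days were.
        System.out.println(selectExpired(old, 30 * DAY_MS).size()); // prints 0

        // Once a segment 40 days later becomes ready, the old ones fall out.
        List<Segment> withNew = new ArrayList<>(old);
        withNew.add(new Segment("day40", 40 * DAY_MS, 41 * DAY_MS));
        System.out.println(selectExpired(withNew, 30 * DAY_MS).size()); // prints 2
    }
}
```

This would explain why my 20180209 segments are still there: no newer segment far enough ahead of them has become ready, so the window never moved past them.
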
Thanks.
Kang-sen

From: ShaoFeng Shi <[email protected]>
Sent: Tuesday, December 10, 2019 3:31 AM
To: user <[email protected]>
Subject: Re: how does cube retention range work

I checked the source code; there is no detailed log. At the moment I have no idea what is wrong. Many users already use the auto-merge feature, so I am not sure what could block it, and we cannot guess. You may need to debug it.

Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Email: [email protected]
Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]

Lu, Kang-Sen <[email protected]> wrote on Tuesday, December 10, 2019 at 2:56 AM:

Hi, Shaofeng:

Just to be sure about this sentence: “it will be dropped from the segment list first”. Does it mean that if we examine the cube storage, the old segment will not show? My experience does not match this description. I had several cube segments built for, say, 20180209. Now we are in 2019, and those segments are still in Kylin; I can even query their data.

Kang-sen

From: ShaoFeng Shi <[email protected]>
Sent: Friday, December 6, 2019 10:40 PM
To: user <[email protected]>
Subject: Re: how does cube retention range work

Hi Kangsen,

It is triggered when a new segment is built; see CubeService.updateOnNewSegmentReady(), line 637. If a cube segment's entire date range is older than the retention days (say, entirely more than 30 days old; a segment that is only partially older will not be selected), it is first dropped from the segment list. The data (HDFS, HBase) cleanup is deferred until StorageCleanupJob runs.

Best regards,
Shaofeng Shi 史少锋

Lu, Kang-Sen <[email protected]> wrote on Friday, December 6, 2019 at 12:01 AM:

I am running Kylin 2.6.3. In the Kylin GUI, when configuring a cube, the “Refresh Setting” step lets us specify a “Retention Threshold”, say 30 (days). How does Kylin automatically remove cube segments that are older than 30 days? I searched the Kylin source code; it seems that Kylin does save “retentionRange” with each CubeDesc, but no other source code refers to that retentionRange.

Thanks.
Kang-sen
