As an aside, if you can't wait for HBASE-4536 and trunk to get RC'd, you can add a 'Delete' column that serves as a logical, client-side delete marker instead of issuing the actual delete, and then have an MR job both extract the deleted data and handle the actual server-side delete.
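
For what it's worth, here is a rough sketch of that marker-column idea. It is plain Python over an in-memory dict, not the HBase client API. The column name meta:Delete and the helpers mark_deleted, is_live, and sweep are made up for illustration, and sweep only stands in for what the MR job would do.

```python
# In-memory stand-in for a table: row key -> {column: value}.
# Nothing here is an HBase API; all names are hypothetical.
DELETE_MARKER = "meta:Delete"  # assumed name for the marker column


def mark_deleted(table, row_key, timestamp):
    """Client-side 'delete': write a marker column instead of a real delete."""
    table[row_key][DELETE_MARKER] = timestamp


def is_live(row):
    """Readers treat a row as deleted once the marker column is present."""
    return DELETE_MARKER not in row


def sweep(table):
    """Stand-in for the MR job: extract marked rows, then really remove them."""
    extracted = {key: dict(row) for key, row in table.items()
                 if DELETE_MARKER in row}
    for key in extracted:
        del table[key]  # the actual (server-side) delete happens here
    return extracted


table = {
    "row1": {"cf:a": "1"},
    "row2": {"cf:a": "2"},
}

mark_deleted(table, "row2", 1327612260)  # arbitrary example timestamp
live = [key for key, row in table.items() if is_live(row)]
archived = sweep(table)
```

The point of the design is that the data stays fully readable until the sweep runs, so compaction timing no longer matters for the extraction. The cost is that every reader has to filter on the marker column.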

On 1/26/12 4:11 PM, "lars hofhansl" <[email protected]> wrote:

>Unless you have HBASE-4536 (only in trunk, though) or are parsing the
>HFiles yourself you have no way of actually getting to the deleted data.
>
>-- Lars
>
>----- Original Message -----
>From: yonghu <[email protected]>
>To: [email protected]
>Cc:
>Sent: Thursday, January 26, 2012 1:00 PM
>Subject: Re: the occasion of the major compact?
>
>Nicolas,
>
>In my use case, I want to extract the deleted data. Hence, if I
>disable the major compaction, I can prevent the hbase to actually
>delete the data. After extracting the deleted data, I can issue major
>compact by myself.
>
>Regards
>
>Yong
>
>On Thu, Jan 26, 2012 at 8:02 PM, Nicolas Spiegelberg
><[email protected]> wrote:
>> Yong,
>>
>> Can you please explain why you want to disable major compactions? What
>> are the problems that you're currently seeing or what are you worried
>> will happen if a major compaction is allowed to occur? Right now, there
>> are only an extremely small subset of cases where you must explicitly
>> disable compactions. These use cases I know of are very complicated and
>> require building StoreFile analysis tools underneath HBase, that I'm
>> pretty sure you're not needing this.
>>
>> Please also read my follow up commentary to explaining major compaction
>> logic:
>> http://search-hadoop.com/m/JR9sK1xnbj21
>> http://search-hadoop.com/m/X7W7q1xnbj21
>>
>> The vast majority of users need features completely unrelated to
>> compactions. The compaction algorithm is an easy target to worry about.
>>
>> On 1/26/12 7:06 AM, "yonghu" <[email protected]> wrote:
>>
>>>Hello Mikael,
>>>
>>>I think disabling the major compaction in the timed and client-issued
>>>situation is not a problem. The problem is the size-based. From the
>>>mailing list, it only talks about the situation of minor compaction
>>>not major compaction, if I understand right. So, I want to know if
>>>someone can tell me how to close the major compaction in size-based
>>>situation.
>>>
>>>Thanks
>>>
>>>Yong
>>>I saw the description which indicating the size of store file can also
>>>trigger major compaction.
>>>
>>>On Thu, Jan 26, 2012 at 3:54 PM, Mikael Sitruk <[email protected]>
>>>wrote:
>>>> Yong hi
>>>>
>>>> As far as i know setting hbase.hregion.majorcompaction to 0 will
>>>> disable the time based trigger only.
>>>> Client are always able to invoke the major compact, no matter what is
>>>> the value of the hbase.hregion.majorcompaction.
>>>>
>>>> Perhaps client invocation of compaction can be disabled with the
>>>> security package.
>>>>
>>>> Anyway i'm digging into 0.92, I hope to get those insight soon.
>>>>
>>>> Mikael.S
>>>>
>>>> On Thu, Jan 26, 2012 at 4:39 PM, yonghu <[email protected]> wrote:
>>>>
>>>>> Thanks for your response.
>>>>>
>>>>> I knew that major compact can be triggered based on client, time and
>>>>> size. In my situation, I have to close the functionality of major
>>>>> compact. So, if I set the 'hbase.hregion.majorcompaction' into 0, it
>>>>> will close all the three situations or I have to set it separately
>>>>> for each case. BTW, my hbase version is 0.92.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Yong
>>>>>
>>>>> On Thu, Jan 26, 2012 at 3:09 PM, Mikael Sitruk
>>>>> <[email protected]> wrote:
>>>>> > look at the thread http://search-hadoop.com/m/GHUWQ1xnbj21, it
>>>>> > explain a lot on major compaction and enhancement over versions
>>>>> >
>>>>> > Mikael.S
>>>>> >
>>>>> > On Thu, Jan 26, 2012 at 3:51 PM, Damien Hardy <[email protected]>
>>>>> > wrote:
>>>>> >
>>>>> >> On 26/01/2012 14:43, yonghu wrote:
>>>>> >> > Hello,
>>>>> >> >
>>>>> >> > I read this blog http://outerthought.org/blog/465-ot.html. It
>>>>> >> > mentions that every 24 hours the major compaction will occur.
>>>>> >> > My question is that if there are any other conditions which can
>>>>> >> > trigger major compaction happening? For example, when the size
>>>>> >> > of store file reaches the threshold (I think this will cause
>>>>> >> > minor compaction or region file split, not major compaction,
>>>>> >> > but not quite sure).
>>>>> >> >
>>>>> >> > Thanks!
>>>>> >> >
>>>>> >> > Yong
>>>>> >>
>>>>> >> Hello,
>>>>> >> I think when there is massive delete on the table or change table
>>>>> >> attribute like TTL (that is susceptible of remove a lot of
>>>>> >> versions/rows) or COMPRESSION which gain a lot of disk space on
>>>>> >> each region.
>>>>> >>
>>>>> >> Cheers,
>>>>> >>
>>>>> >> --
>>>>> >> Damien
>>>>> >
>>>>> > --
>>>>> > Mikael.S
>>>>
>>>> --
>>>> Mikael.S
