>>And I still don't understand how the store files resulting after memstore
>>flushes are having a size of 40MB. Does it have something to do with the
>>memstore upper limit and these 42MB are the result of forcing the memstore
>>to be flushed? The problem is that all the new store files added to HDFS
>>are starting with this size (42MB). I did not mention that my CF is
>>in-memory.

It's due to Java object overhead: the memstore size is measured on the heap,
so a ratio of about 3x is normal (128MB in memory -> 42MB on disk).

Another aspect to take into account: a flush can happen not only when we
reach the memstore size limit; there are other triggers as well:

1. the maximum number of WAL files is reached (hbase.regionserver.maxlogs)
2. the periodic memstore flusher (once an hour) can trigger flushes as well

-Vlad

On Tue, Jan 5, 2016 at 9:37 AM, Ted Yu <[email protected]> wrote:

> For #1,
> bq. would this minor which becomes major take care of deleted rows
>
> Yes.
>
> For #2, please consider the following guide:
>
> dfs.blocksize (value: ${propdata["dfs.blocksize"]}) * 0.95 *
> hbase.regionserver.maxlogs (value:
> ${propdata["hbase.regionserver.maxlogs"]}) should be greater than
> hbase.regionserver.global.memstore.upperLimit * HBASE_HEAPSIZE (the value
> for -Xmx)
>
> Cheers
>
> On Tue, Jan 5, 2016 at 8:39 AM, Mehdi Ben Haj Abbes <[email protected]>
> wrote:
>
> > Thanks Ted for the clarification about the major compactions. So if I
> > understood well, when a minor compaction is triggered and the policy
> > selects all the store files, this compaction becomes a major one. But
> > would this minor which becomes major take care of deleted rows as a major
> > one would do, or at the end is it just a minor that happened and selected
> > all the store files?
> >
> > About disabling splitting, I already have hbase.hregion.max.filesize set
> > to 10GB; besides, I pre-split my table.
> >
> > And I still don't understand how the store files resulting after memstore
> > flushes are having a size of 40MB. Does it have something to do with the
> > memstore upper limit and these 42MB are the result of forcing the memstore
> > to be flushed? The problem is that all the new store files added to HDFS
> > are starting with this size (42MB). I did not mention that my CF is
> > in-memory.
> >
> > Best regards,
> >
> > On Tue, Jan 5, 2016 at 4:04 PM, Ted Yu <[email protected]> wrote:
> >
> > > For #1, when all store files are selected for compaction, the
> > > compaction becomes major
> > >
> > > see 'Determine the Optimal Number of Pre-Split Regions' under:
> > > http://hbase.apache.org/book.html#disable.splitting
> > >
> > > See also http://hbase.apache.org/book.html#managed.compactions
> > >
> > > Cheers
> > >
> > > On Tue, Jan 5, 2016 at 6:52 AM, Mehdi Ben Haj Abbes <[email protected]>
> > > wrote:
> > >
> > > > Hi folks,
> > > >
> > > > I'm using hbase 0.98. I have a heavy write workload. I'm writing to
> > > > one table with one CF compressed with GZ. My table is pre-split into
> > > > 27 regions. As I start writing to this table I start seeing HFiles of
> > > > 2-4 MB across the regions. I have the default hbase configuration for
> > > > compaction properties. The compactions start as soon as I start
> > > > writing to HBase, but many of these compactions are major ones. I can
> > > > see this through the HBase master UI on the table details view. So I
> > > > wanted to understand when a compaction becomes major.
> > > >
> > > > Another question: if I'm not wrong we have a memstore per region, so
> > > > when a memstore is flushed I should have an HFile of 128MB, but I only
> > > > see files of 42MB (without compression, and 2.5MB when compressed with
> > > > GZ).
> > > >
> > > > Any explanation?
> > > >
> > > > Thanks in advance.
> > > >
> > > > --
> > > > Mehdi BEN HAJ ABBES
> > > >
> > >
> >
> > --
> > Mehdi BEN HAJ ABBES
> >
>
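
For anyone who wants to plug their own numbers into the two rules of thumb from this thread (the roughly 3x heap-to-disk ratio and Ted's WAL sizing guideline), here is a minimal Python sketch. The function names are made up for illustration, and the sample values in the main block (128MB block size, 32 WAL files, 8GB heap, 0.4 upper limit) are assumptions, not settings reported by anyone in the thread. The 3x factor is only the approximate ratio mentioned above, not a measured constant.

```python
# Back-of-the-envelope helpers for the two rules of thumb in this thread.
# All names and sample values are hypothetical; substitute your own settings.

MB = 1024 * 1024
GB = 1024 * MB


def estimated_flush_file_mb(memstore_flush_mb, overhead_factor=3.0):
    """Rough uncompressed HFile size for a memstore flushed at the given
    heap size, assuming the ~3x Java object overhead quoted in the thread."""
    return memstore_flush_mb / overhead_factor


def wal_capacity_bytes(dfs_blocksize, maxlogs, roll_fraction=0.95):
    """Left-hand side of the guideline:
    dfs.blocksize * 0.95 * hbase.regionserver.maxlogs."""
    return dfs_blocksize * roll_fraction * maxlogs


def global_memstore_bytes(heap_bytes, upper_limit):
    """Right-hand side of the guideline:
    hbase.regionserver.global.memstore.upperLimit * heap size (-Xmx)."""
    return heap_bytes * upper_limit


def guideline_holds(dfs_blocksize, maxlogs, heap_bytes, upper_limit):
    """True when the WAL capacity exceeds the global memstore limit, i.e.
    the WAL count should not be what forces flushes first."""
    return (wal_capacity_bytes(dfs_blocksize, maxlogs)
            > global_memstore_bytes(heap_bytes, upper_limit))


if __name__ == "__main__":
    # Assumed sample settings, for illustration only.
    print("estimated flush file (MB):", round(estimated_flush_file_mb(128), 1))
    print("guideline holds:", guideline_holds(128 * MB, 32, 8 * GB, 0.4))
```

If the check comes out false for your settings, the maxlogs trigger can force flushes before memstores reach their flush size, which would show up as store files smaller than the estimate.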
