On Mon, May 2, 2016 at 10:06 AM, Shushant Arora <[email protected]> wrote:
> Thanks Stack
>
> For point 2:
> I am concerned with downtime of HBase for read and write.
> If the write lock is held just for the time while we move aside the
> current MemStore, then when a write happens to a key, will it update
> only the memstore? The snapshot does not have that update, so when the
> snapshot is dumped to the HFile, won't we lose the update?
>

No. The update is in the new, currently active MemStore. The update will be
included in the next flush, added to a new hfile.

St.Ack

> On Mon, May 2, 2016 at 9:06 PM, Stack <[email protected]> wrote:
>
> > On Mon, May 2, 2016 at 1:25 AM, Shushant Arora <[email protected]>
> > wrote:
> >
> > > Thanks!
> > >
> > > A few doubts:
> > >
> > > 1. An LSM tree comprises two tree-like
> > > <https://en.wikipedia.org/wiki/Tree_(data_structure)> structures,
> > > called C0 and C1. If an insertion causes the C0 component to exceed
> > > a certain size threshold, a contiguous segment of entries is removed
> > > from C0 and merged into C1 on disk.
> > >
> > > But in HBase, when C0 (which is the memstore, I guess?) exceeds the
> > > threshold size, it is dumped onto HDFS as an HFile (C1, I guess?).
> > > Is compaction then the process that merges C0 and C1?
> > >
> >
> > The 'merge' in the quoted high-level description may just mean that the
> > dumped hfile is 'merged' with the others at read time. Or it may be as
> > stated, that the 'merge' happens at flush time. Some LSM tree
> > implementations do it this way -- Bigtable, and it calls the merge of
> > memstore and a file-on-disk a form of compaction -- but this is not what
> > HBase does; it just dumps the memstore as a flushed hfile. Later, we'll
> > run a compaction process to merge hfiles in background.
> >
> > > 2. Moves current, active Map aside as a snapshot (while a write lock
> > > is held for a short period of time), and then creates a new CSLS
> > > instance.
> > >
> > > In background, the snapshot is then dumped to disk. We get an
> > > Iterator on CSLS. We write a block at a time. When we exceed
> > > configured block size, we start a new one.
> > >
> > > -- Is the write lock held until the complete CSLS is dumped to disk?
> > >
> >
> > No. Just while we move aside the current MemStore.
> >
> > What is your concern/objective? Are you studying LSM trees generally or
> > are you worried that HBase is offline for periods of time for read and
> > write?
> >
> > Thanks,
> > St.Ack
> >
> > > And reads are allowed using the snapshot.
> > >
> > > Thanks!
> > >
> > > On Mon, May 2, 2016 at 11:39 AM, Stack <[email protected]> wrote:
> > >
> > > > On Sun, May 1, 2016 at 3:36 AM, Shushant Arora
> > > > <[email protected]> wrote:
> > > >
> > > > > 1. Does HBase use ConcurrentSkipListMap (CSLM) to store data in
> > > > > the memstore?
> > > > >
> > > >
> > > > Yes (We use a CSLS but this is implemented over a CSLM).
> > > >
> > > > > 2. When the memstore is flushed to HDFS, does it dump the
> > > > > memstore ConcurrentSkipList as Hfile2? Then how does it calculate
> > > > > blocks out of the CSLM and dump them in HDFS?
> > > > >
> > > >
> > > > Moves current, active Map aside as a snapshot (while a write lock
> > > > is held for a short period of time), and then creates a new CSLS
> > > > instance.
> > > >
> > > > In background, the snapshot is then dumped to disk. We get an
> > > > Iterator on CSLS. We write a block at a time. When we exceed
> > > > configured block size, we start a new one.
> > > >
> > > > > 3. After dumping the in-memory CSLM of the memstore to an HFile,
> > > > > is the memstore content discarded?
> > > > >
> > > >
> > > > Yes
> > > >
> > > > > And if any read request comes while the memstore is being dumped,
> > > > > will it be answered from a copy of the memstore, or will the
> > > > > discard of the memstore be blocked until the read request is
> > > > > completed?
> > > > >
> > > >
> > > > We will respond using the snapshot until it has been successfully
> > > > dumped. Once dumped, we'll respond using the hfile.
> > > >
> > > > No blocking (other than for the short period during which the
> > > > snapshot is made and the file is swapped into the read path).
> > > >
> > > > > 4. When a read request comes, does it look in the in-memory CSLM
> > > > > and then in the HFile?
> > > > >
> > > >
> > > > Generally, yes.
> > > >
> > > > > And what is a LogStructuredMerge tree and its usage in HBase?
> > > > >
> > > >
> > > > Suggest you read up on LSM Trees (
> > > > https://en.wikipedia.org/wiki/Log-structured_merge-tree) and if you
> > > > still can't see the LSM tree in the HBase forest, ask specific
> > > > questions and we'll help you out.
> > > >
> > > > St.Ack
> > > >
> > > > > Thanks!
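
For readers following the thread, the flush sequence described above (swap the active map aside under a brief write lock, dump the snapshot a block at a time, keep serving reads from the snapshot until the file is swapped in) can be sketched roughly as below. This is only a toy model: the class `MemStoreSketch`, its methods, the use of a plain `ConcurrentSkipListMap` instead of a CSLS, the string keys and values, and the in-memory "files" are all assumptions made for illustration, not HBase's actual code. It also assumes only one flush runs at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model of the flush path discussed in this thread. All names are
// hypothetical; this is not HBase's MemStore or HFile implementation.
class MemStoreSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    private ConcurrentSkipListMap<String, String> snapshot = new ConcurrentSkipListMap<>();
    // Each flushed "file" is a list of sorted blocks; the newest file is last.
    private final List<List<TreeMap<String, String>>> files = new ArrayList<>();
    private final int blockSize; // approximate block size, counted in chars

    MemStoreSketch(int blockSize) { this.blockSize = blockSize; }

    // Writers share the read lock, so they only wait during the brief swaps.
    void put(String key, String value) {
        lock.readLock().lock();
        try { active.put(key, value); } finally { lock.readLock().unlock(); }
    }

    // Move the active map aside; the write lock is held only for the swap.
    void snapshot() {
        lock.writeLock().lock();
        try {
            if (!snapshot.isEmpty()) throw new IllegalStateException("previous snapshot not flushed");
            snapshot = active;
            active = new ConcurrentSkipListMap<>();
        } finally { lock.writeLock().unlock(); }
    }

    // Dump the snapshot block by block without blocking puts or gets.
    void flushSnapshot() {
        ConcurrentSkipListMap<String, String> toFlush;
        lock.readLock().lock();
        try { toFlush = snapshot; } finally { lock.readLock().unlock(); }

        List<TreeMap<String, String>> file = new ArrayList<>();
        TreeMap<String, String> block = new TreeMap<>();
        int size = 0;
        for (Map.Entry<String, String> e : toFlush.entrySet()) {
            block.put(e.getKey(), e.getValue());
            size += e.getKey().length() + e.getValue().length();
            if (size > blockSize) { // block exceeded the configured size: start a new one
                file.add(block);
                block = new TreeMap<>();
                size = 0;
            }
        }
        if (!block.isEmpty()) file.add(block);

        // Second brief write lock: swap the file into the read path, drop the snapshot.
        lock.writeLock().lock();
        try {
            files.add(file);
            snapshot = new ConcurrentSkipListMap<>();
        } finally { lock.writeLock().unlock(); }
    }

    // Reads consult the active map, then the snapshot, then files newest-first.
    String get(String key) {
        lock.readLock().lock();
        try {
            String v = active.get(key);
            if (v == null) v = snapshot.get(key);
            for (int i = files.size() - 1; v == null && i >= 0; i--) {
                for (TreeMap<String, String> block : files.get(i)) {
                    if (v == null) v = block.get(key);
                }
            }
            return v;
        } finally { lock.readLock().unlock(); }
    }

    int fileCount() {
        lock.readLock().lock();
        try { return files.size(); } finally { lock.readLock().unlock(); }
    }
}
```

In this sketch, a `put` that arrives after `snapshot()` lands in the new active map, so it is not lost; it simply ends up in the file produced by the next flush, which matches the answer given at the top of the thread. Merging the flushed files (compaction) is left out entirely.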
