Thanks Stack.

1. So at any time there will be two references: (1) the active memstore and
(2) the snapshot memstore? At flush time the snapshot is initialised from the
active memstore under a momentary lock, the old active reference is then
replaced, reads are served from the snapshot, and writes go to a new active
memstore. Is that right?
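
To make my understanding of point 1 concrete, here is a minimal Java sketch.
It is not HBase's actual code: MemStoreSketch, takeSnapshot() and
clearSnapshot() are names I made up, String keys and values stand in for
KeyValues, and a plain ReentrantReadWriteLock is only my assumption about how
the short "move aside" lock could work.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch, not HBase code: every name here is invented.
final class MemStoreSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Both references are only touched while holding the lock.
    private ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    private ConcurrentSkipListMap<String, String> snapshot = new ConcurrentSkipListMap<>();

    // Writes always go to the currently active map. The shared (read) lock
    // lets many writers proceed at once; only the swap excludes them.
    void put(String key, String value) {
        lock.readLock().lock();
        try {
            active.put(key, value);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Reads check the active map first, then the snapshot being flushed.
    String get(String key) {
        lock.readLock().lock();
        try {
            String v = active.get(key);
            return v != null ? v : snapshot.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The exclusive lock is held only for the reference swap, not for the
    // time it takes to write the snapshot out.
    ConcurrentSkipListMap<String, String> takeSnapshot() {
        lock.writeLock().lock();
        try {
            snapshot = active;
            active = new ConcurrentSkipListMap<>();
            return snapshot;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Called once the snapshot has been written to a file.
    void clearSnapshot() {
        lock.writeLock().lock();
        try {
            snapshot = new ConcurrentSkipListMap<>();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

In this sketch a put() issued after takeSnapshot() lands only in the new
active map, so it is not lost; it simply waits for the next flush. After
clearSnapshot() the flushed keys are gone from memory, and in the real system
they would be read from the hfile instead (that part is not modelled here).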

2. The key of the CSLS is a KeyValue. Which part of the KeyValue is used to
sort the set: the whole KeyValue or just the row key? Does an HFile have a
separate entry for each KeyValue, and are KeyValues with the same row key
always stored contiguously in the HFile, though possibly not in the same block?
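
Here is a sketch of what I assume is happening for point 2; please correct me
if it is wrong. I am assuming the whole key is compared (row, then column
family, then qualifier, then timestamp with newest first) and that a block is
closed as soon as its size exceeds the configured block size. CellOrderSketch,
Cell, compare() and cutBlocks() are names I made up, Strings stand in for byte
arrays, and the size calculation is a rough placeholder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase code: every name here is invented.
final class CellOrderSketch {
    // Simplified stand-in for a KeyValue.
    static final class Cell {
        final String row, family, qualifier, value;
        final long timestamp;

        Cell(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }

        // Rough size: key parts plus 8 bytes of timestamp plus the value.
        int size() {
            return row.length() + family.length() + qualifier.length() + 8 + value.length();
        }
    }

    // Assumed ordering on the whole key, not just the row key.
    static int compare(Cell a, Cell b) {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.family.compareTo(b.family);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.timestamp, a.timestamp); // newest version first
    }

    // Walks the sorted cells and starts a new block once the current one
    // exceeds blockSize. Each cell is its own entry in some block.
    static List<List<Cell>> cutBlocks(Iterable<Cell> sorted, int blockSize) {
        List<List<Cell>> blocks = new ArrayList<>();
        List<Cell> current = new ArrayList<>();
        int bytes = 0;
        for (Cell cell : sorted) {
            current.add(cell);
            bytes += cell.size();
            if (bytes > blockSize) {
                blocks.add(current);
                current = new ArrayList<>();
                bytes = 0;
            }
        }
        if (!current.isEmpty()) {
            blocks.add(current);
        }
        return blocks;
    }
}
```

If the cells are kept in a ConcurrentSkipListSet built with
CellOrderSketch::compare and its iterator is passed to cutBlocks(), all cells
of one row come out next to each other, but nothing in the loop stops a block
boundary from falling in the middle of a row.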

On Tue, May 3, 2016 at 12:05 AM, Stack <st...@duboce.net> wrote:

> On Mon, May 2, 2016 at 10:06 AM, Shushant Arora <shushantaror...@gmail.com>
> wrote:
>
> > Thanks Stack
> >
> > for point 2 :
> > I am concerned with downtime of Hbase for read and write.
> > If the write lock is held just for the time while we move aside the current
> > MemStore, then when a write happens to a key, will it update only the
> > memstore while the snapshot does not have that update? And when the
> > snapshot is dumped to an HFile, won't we lose the update?
> >
> >
> >
> No. The update is in the new currently active MemStore. The update will be
> included in the next flush added to a new hfile.
>
> St.Ack
>
>
>
>
>
> > On Mon, May 2, 2016 at 9:06 PM, Stack <st...@duboce.net> wrote:
> >
> > > On Mon, May 2, 2016 at 1:25 AM, Shushant Arora <shushantaror...@gmail.com>
> > > wrote:
> > >
> > > > Thanks!
> > > >
> > > > Few doubts;
> > > >
> > > > 1. An LSM tree comprises two tree-like
> > > > <https://en.wikipedia.org/wiki/Tree_(data_structure)> structures, called
> > > > C0 and C1. If the insertion causes the C0 component to exceed a certain
> > > > size threshold, a contiguous segment of entries is removed from C0 and
> > > > merged into C1 on disk.
> > > >
> > > > But in HBase, when C0 (which is the memstore, I guess?) exceeds the
> > > > threshold size, it is dumped onto HDFS as an HFile (C1, I guess?). Is
> > > > compaction the process that corresponds here to the merging of C0 and C1?
> > > >
> > > >
> > > The 'merge' in the quoted high-level description may just mean that the
> > > dumped hfile is 'merged' with the others at read time. Or it may be as
> > > stated, that the 'merge' happens at flush time. Some LSM tree
> > > implementations do it this way -- Bigtable, and it calls the merge of
> > > memstore and a file-on-disk a form of compaction -- but this is not what
> > > HBase does; it just dumps the memstore as a flushed hfile. Later, we'll run
> > > a compaction process to merge hfiles in background.
> > >
> > >
> > >
> > > > 2. Moves current, active Map aside as a snapshot (while a write lock is
> > > > held for a short period of time), and then creates a new CSLS instance.
> > > >
> > > > In background, the snapshot is then dumped to disk. We get an Iterator on
> > > > CSLS. We write a block at a time. When we exceed configured block size, we
> > > > start a new one.
> > > >
> > > > -- Is the write lock held until the complete CSLS has been dumped to
> > > > disk?
> > >
> > >
> > >
> > > No. Just while we move aside the current MemStore.
> > >
> > > What is your concern/objective? Are you studying LSM trees generally or are
> > > you worried that HBase is offline for periods of time for read and write?
> > >
> > > Thanks,
> > > St.Ack
> > >
> > >
> > >
> > > > And read is allowed using snapshot.
> > > >
> > > >
> > >
> > >
> > >
> > > > Thanks!
> > > >
> > > >
> > > >
> > > > On Mon, May 2, 2016 at 11:39 AM, Stack <st...@duboce.net> wrote:
> > > >
> > > > > > On Sun, May 1, 2016 at 3:36 AM, Shushant Arora <shushantaror...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > > 1. Does HBase use ConcurrentSkipListMap (CSLM) to store data in the
> > > > > > > memstore?
> > > > > > >
> > > > > > Yes (We use a CSLS but this is implemented over a CSLM).
> > > > >
> > > > >
> > > > > > > 2. When the memstore is flushed to HDFS, does it dump the memstore
> > > > > > > ConcurrentSkipList as Hfile2? Then how does it calculate blocks out
> > > > > > > of the CSLM and dump them in HDFS?
> > > > > >
> > > > > >
> > > > > > Moves current, active Map aside as a snapshot (while a write lock is
> > > > > > held for a short period of time), and then creates a new CSLS instance.
> > > > > >
> > > > > > In background, the snapshot is then dumped to disk. We get an Iterator on
> > > > > > CSLS. We write a block at a time. When we exceed configured block size, we
> > > > > > start a new one.
> > > > >
> > > > >
> > > > > > > 3. After dumping the in-memory CSLM of the memstore to an HFile, is
> > > > > > > the memstore content discarded
> > > > >
> > > > >
> > > > > Yes
> > > > >
> > > > >
> > > > >
> > > > > > > and if any read request comes while the memstore is being dumped,
> > > > > > > will it be answered from a copy of the memstore, or will the discard
> > > > > > > of the memstore be blocked until the read request is completed?
> > > > > > >
> > > > > > We will respond using the snapshot until it has been successfully dumped.
> > > > > > Once dumped, we'll respond using the hfile.
> > > > >
> > > > > > No blocking (other than for the short period during which the snapshot
> > > > > > is made and the file is swapped into the read path).
> > > > >
> > > > >
> > > > >
> > > > > > > 4. When a read request comes, does it look in the in-memory CSLM and
> > > > > > > then in the HFile?
> > > > >
> > > > >
> > > > > Generally, yes.
> > > > >
> > > > >
> > > > >
> > > > > > > And what is a log-structured merge tree, and how is it used in HBase?
> > > > > >
> > > > > >
> > > > > > Suggest you read up on LSM Trees (
> > > > > > https://en.wikipedia.org/wiki/Log-structured_merge-tree) and if you still
> > > > > > can't see the LSM tree in the HBase forest, ask specific questions and
> > > > > > we'll help you out.
> > > > >
> > > > > St.Ack
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > Thanks!
> > > > > >
> > > > >
> > > >
> > >
> >
>
