On Mon, May 9, 2016 at 4:27 AM, Shushant Arora <[email protected]> wrote:
> Thanks!
>
> 1. Will a write take a lock on all the column families, or just on the
> column family affected by the write?
>

High-level, the lock is taken for the row at a level above where we begin
to access column families. (Recently the row lock went read/write, but that
is another topic.)

> 2. How is eviction in LruBlockCache implemented for the in-memory and
> multi-access priorities? Say all elements of the in-memory priority area
> (25%) are more recently used than those in the single- and multi-access
> areas. Now if a new in-memory row comes in, will it evict from the
> in-memory area or from the single-access area?
>

I'd have to read the code. Suggest you cut out the middleman and read the
code yourself (come back here and ask again if you can't figure it out).

> 3. Why is there a single block cache per regionserver? Why not one per
> region?
>

Accounting. It is easier to do accounting on a single mass of memory rather
than many. What advantage do you see to having a cache per region?

St.Ack

> On Sun, May 8, 2016 at 11:43 PM, Stack <[email protected]> wrote:
>
> > On Sun, May 8, 2016 at 6:12 AM, Shushant Arora <[email protected]>
> > wrote:
> >
> > > Thanks!
> > >
> > > One doubt regarding locking in the memstore:
> > >
> > > HBase uses an implicit row lock while applying a put operation on a
> > > row.
> > >
> > > put(byte[] rowkey).
> > >
> > > When htable.put(p) is fired, the regionserver will lock the row, but
> > > get operations will not lock the row and will return the row state as
> > > it was before the put took the lock.
> > >
> > > The memstore is implemented as a CSLM, so how does it return the row
> > > state from before the put lock when a get is fired before the put has
> > > finished?
> > >
> >
> > Multiversion Concurrency Control. This is the core class:
> >
> > http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/MultiVersionConcurrencyControl.html
> >
> > See how it is used in the codebase.
> >
> > Ask more questions if not clear.
> > St.Ack
> >
> > > On Tue, May 3, 2016 at 7:41 AM, Stack <[email protected]> wrote:
> > >
> > > > On Mon, May 2, 2016 at 5:34 PM, Shushant Arora <[email protected]>
> > > > wrote:
> > > >
> > > > > Thanks Stack.
> > > > >
> > > > > 1. So at any time there will be two references: 1. the active
> > > > > memstore and 2. the snapshot memstore? The snapshot will be
> > > > > initialised at flush time from the active memstore under a
> > > > > momentary lock, and then the active one will be discarded; reads
> > > > > will be served using the snapshot and writes will go to a new
> > > > > active memstore.
> > > > >
> > > >
> > > > Yes
> > > >
> > > > > 2. The key of the CSLS is the KeyValue. Which part of the KeyValue
> > > > > is used while sorting the set? Is it the whole KeyValue or just the
> > > > > row key? Does an HFile have a separate entry for each KeyValue, and
> > > > > are KeyValues of the same row key always stored contiguously in the
> > > > > HFile, though maybe not in the same block?
> > > > >
> > > >
> > > > Just the key: row, then column family, qualifier, timestamp (newest
> > > > first) and type. Value is not considered in the sort.
> > > >
> > > > Yes, HFile has a separate entry for each KeyValue (or 'Cell' in
> > > > hbase-speak).
> > > >
> > > > Cells in HFile are sorted. Those of the same or near 'Cell'
> > > > coordinates will be sorted together and may therefore appear inside
> > > > the same block.
> > > >
> > > > St.Ack
> > > >
> > > > > On Tue, May 3, 2016 at 12:05 AM, Stack <[email protected]> wrote:
> > > > >
> > > > > > On Mon, May 2, 2016 at 10:06 AM, Shushant Arora <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Stack
> > > > > > >
> > > > > > > for point 2 :
> > > > > > > I am concerned with downtime of HBase for read and write.
> > > > > > > If the write lock is held just for the time while we move aside
> > > > > > > the current MemStore.
> > > > > > > Then when a write happens to a key, will it update only the
> > > > > > > memstore, so that the snapshot does not have that update, and
> > > > > > > when the snapshot is dumped to an HFile, won't we lose the
> > > > > > > update?
> > > > > > >
> > > > > >
> > > > > > No. The update is in the new currently active MemStore. The
> > > > > > update will be included in the next flush, added to a new hfile.
> > > > > >
> > > > > > St.Ack
> > > > > >
> > > > > > > On Mon, May 2, 2016 at 9:06 PM, Stack <[email protected]> wrote:
> > > > > > >
> > > > > > > > On Mon, May 2, 2016 at 1:25 AM, Shushant Arora <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > Few doubts;
> > > > > > > > >
> > > > > > > > > 1. An LSM tree comprises two tree-like
> > > > > > > > > <https://en.wikipedia.org/wiki/Tree_(data_structure)>
> > > > > > > > > structures, called C0 and C1. If an insertion causes the C0
> > > > > > > > > component to exceed a certain size threshold, a contiguous
> > > > > > > > > segment of entries is removed from C0 and merged into C1 on
> > > > > > > > > disk.
> > > > > > > > >
> > > > > > > > > But in HBase, when C0 (the memstore, I guess?) exceeds the
> > > > > > > > > threshold size, it is dumped onto HDFS as an HFile (C1, I
> > > > > > > > > guess?). Is compaction then the process that merges C0 and
> > > > > > > > > C1?
> > > > > > > > >
> > > > > > > >
> > > > > > > > The 'merge' in the quoted high-level description may just mean
> > > > > > > > that the dumped hfile is 'merged' with the others at read time.
> > > > > > > > Or it may be as stated, that the 'merge' happens at flush time.
> > > > > > > > Some LSM tree implementations do it this way (Bigtable does,
> > > > > > > > and it calls the merge of memstore and a file-on-disk a form
> > > > > > > > of compaction), but this is not what HBase does; it just dumps
> > > > > > > > the memstore as a flushed hfile. Later, we'll run a compaction
> > > > > > > > process to merge hfiles in the background.
> > > > > > > >
> > > > > > > > > 2. Moves current, active Map aside as a snapshot (while a
> > > > > > > > > write lock is held for a short period of time), and then
> > > > > > > > > creates a new CSLS instance.
> > > > > > > > >
> > > > > > > > > In background, the snapshot is then dumped to disk. We get
> > > > > > > > > an Iterator on CSLS. We write a block at a time. When we
> > > > > > > > > exceed configured block size, we start a new one.
> > > > > > > > >
> > > > > > > > > -- Is the write lock held until the complete CSLS has been
> > > > > > > > > dumped to disk?
> > > > > > > > >
> > > > > > > >
> > > > > > > > No. Just while we move aside the current MemStore.
> > > > > > > >
> > > > > > > > What is your concern/objective? Are you studying LSM trees
> > > > > > > > generally or are you worried that HBase is offline for periods
> > > > > > > > of time for read and write?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > St.Ack
> > > > > > > >
> > > > > > > > > And are reads allowed using the snapshot?
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > On Mon, May 2, 2016 at 11:39 AM, Stack <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > On Sun, May 1, 2016 at 3:36 AM, Shushant Arora <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > 1. Does HBase use ConcurrentSkipListMap (CSLM) to store
> > > > > > > > > > > data in the memstore?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes (We use a CSLS but this is implemented over a CSLM).
> > > > > > > > > >
> > > > > > > > > > > 2. When the memstore is flushed to HDFS, does it dump the
> > > > > > > > > > > memstore ConcurrentSkipList as HFile2? Then how does it
> > > > > > > > > > > calculate blocks out of the CSLM and dump them in HDFS?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Moves current, active Map aside as a snapshot (while a
> > > > > > > > > > write lock is held for a short period of time), and then
> > > > > > > > > > creates a new CSLS instance.
> > > > > > > > > >
> > > > > > > > > > In background, the snapshot is then dumped to disk. We get
> > > > > > > > > > an Iterator on CSLS. We write a block at a time. When we
> > > > > > > > > > exceed configured block size, we start a new one.
> > > > > > > > > >
> > > > > > > > > > > 3. After dumping the in-memory CSLM of the memstore to an
> > > > > > > > > > > HFile, is the memstore content discarded?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes
> > > > > > > > > >
> > > > > > > > > > > And if a read request comes in while the memstore is
> > > > > > > > > > > being dumped, will it be answered from a copy of the
> > > > > > > > > > > memstore, or will discarding the memstore be blocked
> > > > > > > > > > > until the read request has completed?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > We will respond using the snapshot until it has been
> > > > > > > > > > successfully dumped. Once dumped, we'll respond using the
> > > > > > > > > > hfile.
> > > > > > > > > >
> > > > > > > > > > No blocking (other than for the short period during which
> > > > > > > > > > the snapshot is made and the file is swapped into the read
> > > > > > > > > > path).
> > > > > > > > > >
> > > > > > > > > > > 4. When a read request comes in, does it look in the
> > > > > > > > > > > in-memory CSLM and then in the HFile?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Generally, yes.
> > > > > > > > > >
> > > > > > > > > > > And what is a Log-Structured Merge tree, and how is it
> > > > > > > > > > > used in HBase?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Suggest you read up on LSM Trees
> > > > > > > > > > (https://en.wikipedia.org/wiki/Log-structured_merge-tree)
> > > > > > > > > > and if you still can't see the LSM tree in the HBase
> > > > > > > > > > forest, ask specific questions and we'll help you out.
> > > > > > > > > >
> > > > > > > > > > St.Ack
> > > > > > > > > >
> > > > > > > > > > > Thanks!
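
For anyone following along, the flush sequence described in this thread
(move the active map aside under a short write lock, dump the snapshot a
block at a time in the background, then swap the file into the read path)
can be sketched roughly as below. This is a toy under stated assumptions,
not HBase code: ToyMemStore, its methods, and the entries-per-block cutoff
(standing in for the configured block size in bytes) are made up for
illustration, and the "files" are plain in-memory lists rather than hfiles
on HDFS.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model of the flush sequence; hypothetical names, not HBase classes.
class ToyMemStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    private ConcurrentSkipListMap<String, String> snapshot = new ConcurrentSkipListMap<>();
    // Stand-in for hfiles: each file is a list of blocks, each block a sorted run of entries.
    final List<List<List<Map.Entry<String, String>>>> files = new ArrayList<>();

    void put(String key, String value) {
        lock.readLock().lock(); // shared lock: puts only wait for the brief swaps
        try {
            active.put(key, value);
        } finally {
            lock.readLock().unlock();
        }
    }

    String get(String key) {
        lock.readLock().lock();
        try {
            String v = active.get(key);            // newest data first
            if (v == null) v = snapshot.get(key);  // then the snapshot being flushed
            if (v != null) return v;
            for (int i = files.size() - 1; i >= 0; i--) { // then files, newest first
                for (List<Map.Entry<String, String>> block : files.get(i)) {
                    for (Map.Entry<String, String> e : block) {
                        if (e.getKey().equals(key)) return e.getValue();
                    }
                }
            }
            return null;
        } finally {
            lock.readLock().unlock();
        }
    }

    void flush(int maxEntriesPerBlock) {
        ConcurrentSkipListMap<String, String> toFlush;
        lock.writeLock().lock(); // held only while the active map is moved aside
        try {
            snapshot = active;
            active = new ConcurrentSkipListMap<>();
            toFlush = snapshot;
        } finally {
            lock.writeLock().unlock();
        }

        // No lock held here: iterate the sorted snapshot and cut it into blocks.
        List<List<Map.Entry<String, String>>> file = new ArrayList<>();
        List<Map.Entry<String, String>> block = new ArrayList<>();
        for (Map.Entry<String, String> e : toFlush.entrySet()) {
            block.add(new SimpleImmutableEntry<>(e.getKey(), e.getValue()));
            if (block.size() >= maxEntriesPerBlock) {
                file.add(block);
                block = new ArrayList<>();
            }
        }
        if (!block.isEmpty()) file.add(block);

        lock.writeLock().lock(); // short lock again to swap the file into the read path
        try {
            files.add(file);
            snapshot = new ConcurrentSkipListMap<>();
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ToyMemStore m = new ToyMemStore();
        m.put("c", "3");
        m.put("a", "1");
        m.put("b", "2");
        m.flush(2);       // writes the sorted entries out in blocks of at most 2
        m.put("a", "9");  // lands in the new active map, not in the flushed file
        System.out.println(m.get("a") + " " + m.get("b") + " " + m.files.get(0).size());
    }
}
```

Note that a put arriving after the swap goes to the new active map, so it is
not lost; it simply goes out with the next flush, as answered above.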
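
On the earlier question of how a get can see the row state from before an
in-flight put: the idea behind MVCC can be sketched as below. A minimal toy,
assuming writes complete in the order they begin (the real thing has to cope
with out-of-order completion). ToyMvccStore, beginPut and completePut are
hypothetical names and do not mirror the API of the
MultiVersionConcurrencyControl class linked above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Toy illustration of read points; hypothetical names, not HBase's API.
class ToyMvccStore {
    // row -> (write number -> value); both levels are sorted skip lists.
    private final ConcurrentSkipListMap<String, ConcurrentSkipListMap<Long, String>> rows =
            new ConcurrentSkipListMap<>();
    private final AtomicLong nextWrite = new AtomicLong(0);
    private volatile long readPoint = 0; // highest write number visible to readers

    // Adds the new version to the map right away, but readers cannot see it yet.
    long beginPut(String row, String value) {
        long writeNumber = nextWrite.incrementAndGet();
        rows.computeIfAbsent(row, r -> new ConcurrentSkipListMap<>()).put(writeNumber, value);
        return writeNumber;
    }

    // Publishes the write by advancing the read point (assumes in-order completion).
    synchronized void completePut(long writeNumber) {
        if (writeNumber > readPoint) readPoint = writeNumber;
    }

    // Returns the newest version at or below the read point; takes no row lock.
    String get(String row) {
        long rp = readPoint;
        ConcurrentSkipListMap<Long, String> versions = rows.get(row);
        if (versions == null) return null;
        Map.Entry<Long, String> e = versions.floorEntry(rp);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        ToyMvccStore s = new ToyMvccStore();
        long w1 = s.beginPut("row1", "v1");
        s.completePut(w1);
        long w2 = s.beginPut("row1", "v2"); // in flight: present in the map, not yet visible
        System.out.println(s.get("row1"));
        s.completePut(w2);
        System.out.println(s.get("row1"));
    }
}
```

The point of the sketch is only that the in-flight version can sit in the
skip list alongside the old one; the reader filters by read point instead of
waiting on the writer's row lock.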
