When aborting, the region server skips flushing the memstore.

We're aborting because we lost our session with ZooKeeper.  When
aborting we try to exit swiftly, without doing anything that could
interfere with the recovery process that has probably already started.
Since the recovery process will recover our memstore edits from the
WAL, there isn't much point in completing a flush.
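
To make that concrete, here's a toy sketch of the close/flush decision
(not the actual HBase source; the class and names below are made up for
illustration).  Every edit is synced to the WAL before it sits in the
memstore, so an aborting region server can safely skip the flush and let
log replay during recovery rebuild whatever was only in memory:

    // Toy model of the close/flush decision; NOT the real HRegion code.
    import java.util.ArrayList;
    import java.util.List;

    class ToyRegion {
      private final List<String> wal = new ArrayList<String>();        // durable write-ahead log
      private final List<String> memstore = new ArrayList<String>();   // in-memory edits
      private final List<String> storeFiles = new ArrayList<String>(); // flushed, on-disk data

      void put(String edit) {
        wal.add(edit);       // appended (and synced) to the WAL first
        memstore.add(edit);  // then buffered in memory until a flush
      }

      void close(boolean abort) {
        if (!abort) {
          // Clean shutdown: flush so nothing lives only in memory.
          storeFiles.addAll(memstore);
        }
        // On abort, skip the flush and get out of the way: log splitting
        // during recovery replays the WAL and restores these edits anyway.
        memstore.clear();
      }
    }

The real logic lives in HRegion's close path; the link J-D posted further
down the thread points at the 0.89 version of that code.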

St.Ack

On Fri, Sep 24, 2010 at 5:13 PM, Ted Yu <[email protected]> wrote:
> I think that explains why memstore wasn't flushed.
> See snippet from region server log:
>
> 2010-09-24 11:21:18,603 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes:
> Total=6.5512085MB (6869440), Free=791.7863MB (830248128), Max=798.3375MB
> (837117568), Counts: Blocks=1, Access=384, Hit=383, Miss=1, Evictions=0,
> Evicted=0, Ratios: Hit Ratio=99.73958134651184%, Miss
> Ratio=0.2604166744276881%, Evicted/Run=NaN
> 2010-09-24 11:21:20,459 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12b44afa1f50001 to sun.nio.ch.selectionkeyi...@3d3cdaa
> java.io.IOException: TIMED OUT
>    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
> 2010-09-24 11:21:21,512 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
> state: Disconnected, type: None, path: null
> 2010-09-24 11:21:22,849 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server sjc9-flash-grid01.carrieriq.com/10.32.56.155:2181
> 2010-09-24 11:21:22,849 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 10.32.56.156:45455 remote=sjc9-flash-grid01.carrieriq.com/10.32.56.155:2181]
> 2010-09-24 11:21:22,873 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2010-09-24 11:21:23,065 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12b44afa1f50001 to sun.nio.ch.selectionkeyi...@383de71c
> java.io.IOException: Session Expired
>    at
> org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:589)
>    at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:709)
>    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
> 2010-09-24 11:21:23,065 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
> state: Expired, type: None, path: null
> 2010-09-24 11:21:23,065 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session
> expired
> 2010-09-24 11:21:23,101 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0.0, regions=45, stores=119, storefiles=5, storefileIndexSize=0,
> memstoreSize=7, compactionQueueSize=0, usedHeap=61, maxHeap=3991,
> blockCacheSize=6869440, blockCacheFree=830248128, blockCacheCount=1,
> blockCacheHitRatio=99, fsReadLatency=0, fsWriteLatency=0, fsSyncLatency=0
> 2010-09-24 11:21:24,457 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60020
>
> Obviously this method in HRegionServer triggered abort():
>  public void process(WatchedEvent event) {
>
> In this situation, why should internalFlushcache() be governed by abort ?
>
> Thanks
>
> On Fri, Sep 24, 2010 at 4:54 PM, Jean-Daniel Cryans <[email protected]> wrote:
>
>> Like that:
>>
>> http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/HRegion.html#519
>>
>> (this is a link to the 0.89 code, but 0.20 works exactly the same way)
>>
>> J-D
>>
>> On Fri, Sep 24, 2010 at 4:49 PM, Ted Yu <[email protected]> wrote:
>> > When HBase cluster shuts down, how is memstore flushing triggered ?
>> >
>> > Thanks
>> >
>> > On Fri, Sep 24, 2010 at 4:01 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >
>> >> > StripedHBaseTable is just a wrapper around HTable.
>> >>
>> >> Ok.
>> >>
>> >> > QA installed another application which queries HBase. When QA didn't see
>> >> > meaningful data from the application.
>> >> > Why didn't HBase serve data from Memstore ?
>> >>
>> >> I cannot answer this question, as far as I know 0.20.6 doesn't have a
>> >> bug that prevents you from writing data (and many people use it in
>> >> production environments). My guess is, the data never got into HBase
>> >> or it was written with wrong timestamps and is hidden by delete
>> >> markers or something "time-travelly" like that. If they do play with
>> >> timestamps, ask them not to or verify they do the right thing.
>> >>
>> >> > QA used stop-hbase.sh to stop HBase. Would Memstore contents be flushed to
>> >> > Storefile at that moment ?
>> >>
>> >> Yep, if you don't see any store file then the data never made it into
>> >> hbase.
>> >>
>> >> >
>> >> > QA servers were running JVM version 1.6.0_18
>> >>
>> >> From
>> >>
>> >> http://hbase.apache.org/docs/r0.20.6/api/overview-summary.html#requirements
>> >>
>> >> "Java 1.6.x, preferably from Sun. Use the latest version available
>> >> except u18 (u19 is fine)."
>> >>
>> >> J-D
>> >>
>> >
>>
>
