Don't expect much in the way of performance when running HBase on VMs, by definition, although exactly how bad it gets will depend on the VM host environment, allocations per container, and such.
As for your questions:

> When is HFile(StoreFile) being loaded as a region into region server's memory?

Stargate is just another HBase client from the perspective of the RegionServers. So, the RegionServers will service requests from the REST gateway as needed, reading store files on demand.

> Does a region stay in region server's memory afterward? When is it being freed?

Depends, although the REST gateway pessimistically sets Scan#setCacheBlocks(false), so scans from REST are unlikely to result in HFile block caching if those blocks are not in cache already. If they happen to be in cache from another request from another type of client, then the RegionServers will use the cached blocks and update usage counts for those blocks for LRU, etc.

> When Stargate uses a scan instance to obtain data, does it communicate with region server with another connection overhead?

It looks like this:

    REST client <--> Stargate <--> RegionServers

On Tue, Apr 1, 2014 at 9:18 AM, yglin <[email protected]> wrote:

> Hi~
>
> I would like to know how data flows when you query it from HBase or
> Stargate, especially in I/O perspective.
> Please point me some directions to study.
> That means questions like below:
> When is HFile(StoreFile) being loaded as a region into region server's
> memory?
> Does a region stay in region server's memory afterward? When is it being
> freed?
> When Stargate uses a scan instance to obtain data, does it communicate with
> region server with another connection overhead?
>
> Actually I'm asking these because I'm experimenting Toad for Cloud Database
> on HBase.
> And I got a performance issue of querying 400K data rows in about 5
> minutes, kind of a awkward number.
> I installed HBase/HDFS on 7 VMs,
> 1 ResourceManager, 1 as NameNode and HMaster, 5 as DataNodes and
> RegionServers
> Barely change any configuration for performance tuning.
> I drew myself a very simple chart trying to find where are the bottlenecks.
> <http://apache-hbase.679495.n3.nabble.com/file/n4057719/Toad_Read_HBase_Process.png>
>
> I know I could miss many details in this simple chart
> Please give me some clues
> Much appreciate
>
> yglin
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HBase-Stargate-dataflow-in-I-O-perspective-tp4057719.html
> Sent from the HBase User mailing list archive at Nabble.com.

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
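
P.S. The caching behavior described in the second answer (a scan with setCacheBlocks(false) still uses blocks that are already cached, but does not add new ones) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyBlockCache, readBlock, and the block ids are hypothetical names made up for illustration, and the LRU policy here is a plain java.util.LinkedHashMap in access order, not HBase's actual block cache implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BlockCacheSketch {

    /** Toy LRU cache keyed by block id. Hypothetical; NOT HBase's real block cache. */
    public static class ToyBlockCache {
        private final LinkedHashMap<String, byte[]> blocks;
        private int diskReads = 0;

        public ToyBlockCache(final int capacity) {
            // accessOrder=true: every get() moves the entry to the most-recently-used end.
            this.blocks = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                    return size() > capacity; // evict the least recently used block
                }
            };
        }

        /**
         * Reads one block. The cacheBlocks flag plays the role of the value passed
         * to Scan#setCacheBlocks: it only controls whether a block loaded on a miss
         * is added to the cache. Already-cached blocks are used either way.
         */
        public byte[] readBlock(String blockId, boolean cacheBlocks) {
            byte[] cached = blocks.get(blockId); // a hit also refreshes LRU position
            if (cached != null) {
                return cached;
            }
            diskReads++; // stand-in for reading the block from a store file
            byte[] loaded = blockId.getBytes();
            if (cacheBlocks) {
                blocks.put(blockId, loaded);
            }
            return loaded;
        }

        public boolean isCached(String blockId) {
            return blocks.containsKey(blockId); // containsKey does not touch LRU order
        }

        public int getDiskReads() {
            return diskReads;
        }
    }

    public static void main(String[] args) {
        ToyBlockCache cache = new ToyBlockCache(2);

        // REST-style scan: block is read from "disk" but not cached.
        cache.readBlock("block-1", false);
        System.out.println("cached after REST-style scan: " + cache.isCached("block-1"));

        // Another client type reads the same block with caching enabled.
        cache.readBlock("block-1", true);

        // A later REST-style scan is now served from cache, with no extra disk read.
        int before = cache.getDiskReads();
        cache.readBlock("block-1", false);
        System.out.println("extra disk reads: " + (cache.getDiskReads() - before));
    }
}
```

In this model a REST-only workload never warms the cache, so every scan pays for store file reads again.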
