How to avoid stop-the-world GC for HBase Region Server under big heap size

2012-08-23 Thread Gen Liu
Hi,

We are running the Region Server on big-memory machines (70G) and set Xmx=64G.
Most of the heap is used as block cache for random reads.
Stop-the-world GC is killing the region server, but using a smaller heap (16G)
doesn't utilize our machines well.

Is there a concurrent or parallel GC option that won't block all threads?

Any thoughts are appreciated. Thanks.

Gen Liu



Re: How to avoid stop-the-world GC for HBase Region Server under big heap size

2012-08-23 Thread J Mohamed Zahoor
Slab cache might help
http://www.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/
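
If I remember the post correctly, enabling the slab cache is roughly a matter
of giving the JVM direct memory and pointing HBase at it, something like this
(values illustrative -- please verify the exact keys against the blog post):

  # hbase-env.sh
  export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=40g"

  <!-- hbase-site.xml -->
  <property>
    <name>hbase.offheapcache.percentage</name>
    <!-- fraction of MaxDirectMemorySize given to the slab cache -->
    <value>0.8</value>
  </property>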

./zahoor

On Thu, Aug 23, 2012 at 11:36 AM, Gen Liu ge...@zynga.com wrote:

 Hi,

 We are running Region Server on big memory machine (70G) and set Xmx=64G.
 Most heap is used as block cache for random read.
 Stop-the-world GC is killing the region server, but using less heap (16G)
 doesn't utilize our machines well.

 Is there a concurrent or parallel GC option that won't block all threads?

 Any thought is appreciated. Thanks.

 Gen Liu




client cache for all region server information?

2012-08-23 Thread Lin Ma
Hello HBase masters,

I am wondering whether, in the current implementation, each HBase client
caches all region server information, for example, where each region server
is (the physical machine hosting it), and also the row-key range
managed by each region server. If so, two more questions:

- will there be too much overhead (e.g. memory footprint) for each client?
- when is such information downloaded and cached at the client side, and when
is it refreshed (is that only triggered by a region server
change and a failure to fetch data using the cached information -- e.g. when
the client uses its cache to access machine A for region B, but finds nothing,
so the client needs to refresh its cache to see which machine owns
region B)?

regards,
Lin


Re: client cache for all region server information?

2012-08-23 Thread Pamecha, Abhishek
I think for the refresh case, the client first uses the older region server derived
from its cache; it then connects to that older region server, which responds
with a failure code. The client then talks to ZooKeeper and then the meta
node server to find the new region server for that key. The client then
reissues the original request to the new region server.

Btw, the client only caches information as needed for its queries and not
necessarily for 'all' region servers.

Abhishek


i Sent from my iPad with iMstakes 

On Aug 22, 2012, at 23:31, Lin Ma lin...@gmail.com wrote:

 Hello HBase masters,
 
 I am wondering whether in current implementation, each client of HBase
 cache all information of region server, for example, where is region server
 (physical hosting machine of region server), and also cache row-key range
 managed by the region server. If so, two more questions,
 
 - will there be too much overhead (e.g. memory footprint) of each client?
 - when such information is downloaded and cached at client side, and when
 the information is refreshed (does it only triggered by region server
 change and failure to fetch such information from client -- e.g. when
 client use cache to access machine A for region B, but find nothing, so the
 client needs to refresh the information in cache to see which machine owns
 region B)?
 
 regards,
 Lin


Re: How to avoid stop-the-world GC for HBase Region Server under big heap size

2012-08-23 Thread N Keywal
Hi,

For a possible future, there is as well this to monitor:
http://docs.oracle.com/javase/7/docs/technotes/guides/vm/G1.html
More or less requires JDK 1.7
See HBASE-2039
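
If you want to experiment, switching a test region server to G1 is just a
flag change in hbase-env.sh, along these lines (the pause target is
illustrative, not a recommendation):

  export HBASE_OPTS="$HBASE_OPTS -XX:+UseG1GC -XX:MaxGCPauseMillis=100"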

Cheers,

N.

On Thu, Aug 23, 2012 at 8:16 AM, J Mohamed Zahoor jmo...@gmail.com wrote:
 Slab cache might help
 http://www.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/

 ./zahoor

 On Thu, Aug 23, 2012 at 11:36 AM, Gen Liu ge...@zynga.com wrote:

 Hi,

 We are running Region Server on big memory machine (70G) and set Xmx=64G.
 Most heap is used as block cache for random read.
 Stop-the-world GC is killing the region server, but using less heap (16G)
 doesn't utilize our machines well.

 Is there a concurrent or parallel GC option that won't block all threads?

 Any thought is appreciated. Thanks.

 Gen Liu




Re: How to query by rowKey-infix

2012-08-23 Thread Christian Schäfer
Hi Anil,

to restrict data to a certain time window I also set a timerange for the scan.

I'm slightly shocked by a processing time of more than 2 mins to return
225 rows.
I would actually need a response in 5-10 sec.
In your timestamp-based filtering, do you check the timestamp as part of the
row key or do you use the put timestamp (as I do)?
How many rows are scanned/touched by your timestamp-based filtering?

Is it a full table scan where each row's key is checked against a given
timestamp/timerange?


My use case of obtaining data by a substring comparator operates on the row key.
It can't be replaced by setting the time range in my case, really.

Btw. the scan is additionally restricted to a certain timerange to increase
skipping of irrelevant files and thus improve performance.
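
For reference, the kind of scan I mean looks roughly like this (the infix
string and the time window are placeholders; table is an HTable, the filter
classes come from org.apache.hadoop.hbase.filter):

  Scan scan = new Scan();
  scan.setTimeRange(startTs, endTs);  // lets HBase skip store files outside the window
  scan.setFilter(new RowFilter(CompareFilter.CompareOp.EQUAL,
      new SubstringComparator("some-infix")));
  ResultScanner scanner = table.getScanner(scan);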

 
regards,
Christian



- Original Message -
From: anil gupta anilgupt...@gmail.com
To: user@hbase.apache.org; Christian Schäfer syrious3...@yahoo.de
CC: 
Sent: 20:42 Wednesday, 22 August 2012
Subject: Re: How to query by rowKey-infix

Hi Christian,

I had similar requirements to yours. So, till now I have used
timestamps for filtering the data, and I would say the performance is
satisfactory. Here are the results of timestamp-based filtering:
the table has 34 million records (average row size is 1.21 KB); in 136
seconds I get the entire result of a query which returned 225 rows.
I am running an HBase 0.92, 8-node cluster on VMware hypervisor. Each node
has 3.2 GB of memory and 500 GB of HDFS space. Each hard drive in my set-up
hosts 2 slave instances (2 VMs running Datanode,
NodeManager, RegionServer). I have only allocated 1200MB for the RS's. I haven't
modified the block size of HDFS or HBase. Considering the
below-par hardware configuration of the cluster I feel the performance is OK,
and IMO it'll be better than a substring comparator on column values, since with
a substring comparator filter you are essentially doing a FULL TABLE scan,
whereas with a timerange-based scan you can *Skip Store Files*.

On a side note, Alex created a JIRA for enhancing the current
FuzzyRowFilter to do range based filtering also. Here is the link:
https://issues.apache.org/jira/browse/HBASE-6618 . You are more than
welcome if you would like to chime in.
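
For completeness, a minimal FuzzyRowFilter sketch (the filter class comes from
HBASE-6509, or copy it into your code as Alex suggests; Arrays is
java.util.Arrays, Pair is org.apache.hadoop.hbase.util.Pair; the 8-byte key
layout below -- a 4-byte user id followed by a 4-byte timestamp -- is made up):

  // in the mask, 0 = the byte must match, 1 = any byte is fine
  byte[] rowKey = new byte[] {0, 0, 0, 0, 0x01, 0x02, 0x03, 0x04};  // last 4 bytes: wanted timestamp
  byte[] mask   = new byte[] {1, 1, 1, 1, 0, 0, 0, 0};
  Scan scan = new Scan();
  scan.setFilter(new FuzzyRowFilter(
      Arrays.asList(new Pair<byte[], byte[]>(rowKey, mask))));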

HTH,
Anil Gupta


On Thu, Aug 9, 2012 at 1:55 PM, Christian Schäfer syrious3...@yahoo.de wrote:

 Nice. Thanks Alex for sharing your experiences with that custom filter
 implementation.


 Currently I'm still using key filter with substring comparator.
 As soon as I've got a good amount of test data, I will measure the performance
 of that naive substring filter in comparison to your fuzzy row filter.

 regards,
 Christian



 
 From: Alex Baranau alex.barano...@gmail.com
 To: user@hbase.apache.org; Christian Schäfer syrious3...@yahoo.de
 Sent: 22:18 Thursday, 9 August 2012
 Subject: Re: How to query by rowKey-infix


 jfyi: documented FuzzyRowFilter usage here: http://bit.ly/OXVdbg. Will
 add documentation to HBase book very soon [1]

 Alex Baranau
 --
 Sematext :: http://sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr

 [1] https://issues.apache.org/jira/browse/HBASE-6526

 On Fri, Aug 3, 2012 at 6:14 PM, Alex Baranau alex.barano...@gmail.com
 wrote:

 Good!
 
 
 Submitted initial patch of fuzzy row key filter at
 https://issues.apache.org/jira/browse/HBASE-6509. You can just copy the
 filter class and include it in your code and use it in your setup as any
 other custom filter (no need to patch HBase).
 
 
 Please let me know if you try it out (or post your comments at
 HBASE-6509).
 
 
 Alex Baranau
 --
 Sematext :: http://sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr
 
 
 On Fri, Aug 3, 2012 at 5:23 AM, Christian Schäfer syrious3...@yahoo.de
 wrote:
 
 Hi Alex,
 
 thanks a lot for the hint about setting the timestamp of the put.
 I didn't know that this would be possible but that's solving the problem
 (first test was successful).
 So I'm really glad that I don't need to apply a filter to extract the
 time and so on for every row.
 
 Nevertheless I would like to see your custom filter implementation.
 Would be nice if you could provide it helping me to get a bit into it.
 
 And yes that helped :)
 
 regards
 Chris
 
 
 
 
  From: Alex Baranau alex.barano...@gmail.com
  To: user@hbase.apache.org; Christian Schäfer syrious3...@yahoo.de
  Sent: 0:57 Friday, 3 August 2012
  
  Subject: Re: How to query by rowKey-infix
 
 
 Hi Christian!
  Setting aside secondary indexes and assuming you are going with heavy
  scans, you can try the two following things to make it much faster. If this is
  appropriate to your situation, of course.
 
 1.
 
  Is there a more elegant way to collect rows within time range X?
  (Unfortunately, the date attribute is not equal to the timestamp that
 is stored by hbase automatically.)
 
 Can you set timestamp of the Puts to the one you have in row key?
 Instead of relying 

Re: Hbase Shell: UnsatisfiedLinkError

2012-08-23 Thread o brbrs
2012/8/22 Stack st...@duboce.net

 On Wed, Aug 22, 2012 at 4:39 AM, o brbrs obr...@gmail.com wrote:
  Thanks for your reply. I sent this issue to the user mailing list, but I
  haven't got any reply.
  I have installed JDK 1.6 and HBase 0.94, and have made the configuration
  changes described in http://hbase.apache.org/book.html#configuration. But the
  error continues.
 

 Suggest you go googling for an answer.  This is general jruby jffi
 dependency issue -- our shell is jruby -- unsatisfied in your
 environment (For example, this link has a user running ibm's jvm which
 could be the cause of the missing link:
 http://www.digipedia.pl/usenet/thread/13899/1438/).

 St.Ack


Thanks for your reply. I fixed the problem by replacing the jffi and ffi folders
in jruby-complete-1.0.6.jar with the ones from
jruby-complete-1.0.7.jar and repacking jruby-complete-1.0.6.jar. It
works.


-- 
...
Obrbrs


HTable batch execution order

2012-08-23 Thread Shagun Agarwal
Hi,

I have a question about the HTable.batch(List<? extends Row> actions, Object[]
results) API. According to the javadoc, "The ordering of execution of the actions
is not defined. Meaning if you do a Put and a Get in the same batch call, you
will not necessarily be guaranteed that the Get returns what the Put had put."
However, my question is: if I don't mix up the actions and only provide Get
actions, do I get the results in the same order in which the Gets were provided?
e.g. if I provide 3 Gets with row keys [r1, r2, r3], will I get [result1,
result2, result3]?

Thanks
Shagun Agarwal


Re: client cache for all region server information?

2012-08-23 Thread Lin Ma
Thank you Abhishek,

Two more comments,

-- "Client only caches information as needed for its queries and not
necessarily for 'all' region servers." -- how does the client know which region
server information needs to be cached, in the current HBase
implementation?

-- When does the client load region server information for the first time? Does
the client persistently cache region server information at the client
side?

regards,
Lin

On Thu, Aug 23, 2012 at 2:47 PM, Pamecha, Abhishek apame...@x.com wrote:

 I think for the refresh case, client first uses the older region server
 derived from its cache  it then connects to that older  region server which
 responds with a failure code.  and then client talks to the zookeeper and
 then the meta node server to find the new region server for that key. The
 client then reissues the original request to the new region server.

 Btw,Client only caches information as needed for its queries and not
 necessarily for 'all' region servers.

 Abhishek


 i Sent from my iPad with iMstakes

 On Aug 22, 2012, at 23:31, Lin Ma lin...@gmail.com wrote:

  Hello HBase masters,
 
  I am wondering whether in current implementation, each client of HBase
  cache all information of region server, for example, where is region
 server
  (physical hosting machine of region server), and also cache row-key range
  managed by the region server. If so, two more questions,
 
  - will there be too much overhead (e.g. memory footprint) of each client?
  - when such information is downloaded and cached at client side, and when
  the information is refreshed (does it only triggered by region server
  change and failure to fetch such information from client -- e.g. when
  client use cache to access machine A for region B, but find nothing, so
 the
  client needs to refresh the information in cache to see which machine
 owns
  region B)?
 
  regards,
  Lin



Re: backup strategies

2012-08-23 Thread Rita
Let's say I have a huge table and I want to back it up onto a system with a
lot of disk space. Would this work: take all the keys and export the
database in chunks by selectively picking ranges? For instance, if the keys
run from 0-100000, I would back up keys 0-50000 into backup_dir_A and
50001-100000 into backup_dir_B. Would that be feasible?
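
Concretely, what I have in mind is feeding ranged scans to whatever does the
export, something like this (ignoring that string row keys sort
lexicographically rather than numerically -- the key values are just from my
example; Bytes and HConstants are the usual HBase utility classes):

  Scan chunkA = new Scan(Bytes.toBytes("0"), Bytes.toBytes("50001"));        // keys 0-50000
  Scan chunkB = new Scan(Bytes.toBytes("50001"), HConstants.EMPTY_END_ROW);  // keys 50001-100000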



On Wed, Aug 22, 2012 at 6:48 AM, Rita rmorgan...@gmail.com wrote:

 What is the typical conversion process? My biggest worry is moving from a
 higher version of HBase to a lower version of HBase, say CDH4 to CDH3U1.



 On Thu, Aug 16, 2012 at 7:53 AM, Paul Mackles pmack...@adobe.com wrote:

 Hi Rita

 By default, the export that ships with hbase writes KeyValue objects to a
 sequence file. It is a very simple app, and it wouldn't be hard to roll
 your own export program to write to whatever format you want. You can use
 the current export program as a basis and just
 change the output of the mapper.

 I will say that I spent a lot of time thinking about backups and DR and I
 didn't really worry much about hbase versions. The file formats for hbase
 don't change that often and when they do, there is usually a pretty
 straight-forward conversion process. Also, if you are doing something like
 full daily backups then I am having trouble imagining a scenario where you
 would need to restore from anything but the most recent backup.

 Depending on which version of hbase you are using, there are probably much
 bigger issues with using export for backups that you should worry about,
 like being able to restore in a timely fashion, preserving deletes, and the
 impact of the backup process on your SLA.

 Paul


 On 8/16/12 7:31 AM, Rita rmorgan...@gmail.com wrote:

 I am sure this topic has been visited many times, but I thought I'd ask to
 see
 if anything has changed.
 
 We are using hbase with close to 40b rows and backing up the data is
 non-trivial. We can use export table to another Hadoop/HDFS filesystem
 but
 I am not aware of any guaranteed way of preserving data from one version
 of
 Hbase to another (specifically if it's very old). Is there a program
 which
 will serialize the data into JSON/XML and dump it on a Unix filesystem?
 Once I get the data we can compress it whatever we like and back it up
 using our internal software.
 
 
 
 
 --
 --- Get your facts first, then you can distort them as you please.--




 --
 --- Get your facts first, then you can distort them as you please.--




-- 
--- Get your facts first, then you can distort them as you please.--


Re: how client location a region/tablet?

2012-08-23 Thread Doug Meil

For further information about the catalog tables and region-regionserver
assignment, see this:

http://hbase.apache.org/book.html#arch.catalog






On 8/19/12 7:36 AM, Lin Ma lin...@gmail.com wrote:

Thank you Stack, especially for the smart 6 round trip guess for the
puzzle. :-)

1. Yeah, we client cache's locations, not the data. -- does it mean for
each client, it will cache all location information of a HBase cluster,
i.e. which physical server owns which region? Supposing each region has
128M bytes, for a big cluster (P-bytes level), total data size / 128M is
not a trivial number, not sure if any overhead to client?
2. A bit confused by what do you mean not the data? For the client
cached
location information, it should be the data in table METADATA, which is
region / physical server mapping data. Why you say not data (do you mean
real content in each region)?

regards,
Lin

On Sun, Aug 19, 2012 at 12:40 PM, Stack st...@duboce.net wrote:

 On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
  Hello guys,
 
  I am referencing the Big Table paper about how a client locates a
tablet.
  In section 5.1 Tablet location, it is mentioned that client will cache
 all
  tablet locations, I think it means client will cache root tablet in
  METADATA table, and all other tablets in METADATA table (which means
 client
  cache the whole METADATA table?). My question is, whether HBase
 implements
  in the same or similar way? My concern or confusion is, supposing each
  tablet or region file is 128M bytes, it will be very huge space (i.e.
  memory footprint) for each client to cache all tablets or region
files of
  METADATA table. Is it doable or feasible in real HBase clusters?
Thanks.
 

 Yeah, we client cache's locations, not the data.


  BTW: another confusion from me is in the paper of Big Table section
5.1
Tablet location, it is mentioned that If the client's cache is stale,
 the
  location algorithm could take up to six round-trips, because stale
cache
  entries are only discovered upon misses (assuming that METADATA
tablets
 do
  not move very frequently)., I do not know how the 6 times round trip
 time
  is calculated, if anyone could answer this puzzle, it will be great.
:-)
 

 I'm not sure what the 6 is about either.  Here is a guesstimate:

 1. Go to cached location for a server for a particular user region,
 but server says that it does not have a region, the client location is
 stale
 2. Go back to client cached meta region that holds user region w/ row
 we want, but its location is stale.
 3. Go to root location, to find new location of meta, but the root
 location has moved; what the client has is stale
 4. Find new root location and do lookup of meta region location
 5. Go to meta region location to find new user region
 6. Go to server w/ user region

 St.Ack





Re: Choose the location of a record

2012-08-23 Thread Ian Varley
Blaise,

Generally speaking, no. The distribution of row keys over regions is handled by 
HBase. This is as you would want, so that the failure of any given server is 
transparent to your application. 

There are ways to hack around this, but generally you shouldn't design in such 
a way as to require that. 

What's the requirement motivating your question?

Ian

On Aug 23, 2012, at 7:57 AM, Blaise NGONMANG kaledjebla...@yahoo.fr wrote:

 
 Hi
 
 I just want to know if it is possible to select the server were we want to
 insert a record.
 
 Regards
 Blaise
 -- 
 View this message in context: 
 http://old.nabble.com/Choose-the-location-of-a-record-tp34339260p34339260.html
 Sent from the HBase User mailing list archive at Nabble.com.
 


Re: client cache for all region server information?

2012-08-23 Thread Harsh J
Hi Lin,

On Thu, Aug 23, 2012 at 4:31 PM, Lin Ma lin...@gmail.com wrote:
 Thank you Abhishek,

 Two more comments,

 -- Client only caches information as needed for its queries and not
 necessarily for 'all' region servers. -- how did client know which region
 server information is necessary to be cached in current HBase
 implementation?

What Abhishek meant here is that it caches only the needed table's
rows from META. It also only caches the specific region required for
the row you're looking up/operating on, AFAICT.

 -- When the client loads region server information for the first time? Did
 client persistent cache information at client side about region server
 information?

The client loads up regionserver information for a table, when it is
requested to perform an operation on that table (on a specific row or
the whole). It does not immediately, upon initialization, cache the
whole of META's contents.

Your question makes sense though, that it does seem to be such that a
client *may* use quite a bit of memory space in trying to cache the
META entries locally, but practically we've not had this cause issues
for users yet. The amount of memory cached for META far outweighs the
other items it caches (scan results, etc.). At least I have not seen
any reports of excessive client memory usage just due to region
locations of tables being cached.

I think there's more benefits storing/caching it than not doing so,
and so far we've not needed the extra complexity of persisting the
cache to a local or non-RAM storage than keeping it in memory.

-- 
Harsh J


Re: HTable batch execution order

2012-08-23 Thread Harsh J
Hi Shagun,

The original ordering index is still maintained.

Yes, you will have them back in order. Don't be confused by that
javadoc statement. The result list is ordered in the same way as the
actions list, but the order in which the actions are executed depends on
variable things, hence the statement that the Get may not return what
the Put, in the same batch, had put.
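
A quick illustration (a sketch; table is an HTable for your table, Bytes is
org.apache.hadoop.hbase.util.Bytes):

  List<Row> actions = new ArrayList<Row>();
  actions.add(new Get(Bytes.toBytes("r1")));
  actions.add(new Get(Bytes.toBytes("r2")));
  actions.add(new Get(Bytes.toBytes("r3")));
  Object[] results = new Object[actions.size()];
  table.batch(actions, results);
  // results[0] is for r1, results[1] for r2, results[2] for r3,
  // no matter which Get the servers happened to execute first.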

On Thu, Aug 23, 2012 at 2:49 PM, Shagun Agarwal sha...@yahoo-inc.com wrote:
 Hi,

 I have a question about HTable.batch(List? extends Row actions,Object[] 
 results) API, according to java doc -The ordering of execution of the actions 
 is not defined. Meaning if you do a Put and a Get in the same batch call, you 
 will not necessarily be guaranteed that the Get returns what the Put had put.
 however my question is if I don't mix up the actions  only provide Get 
 action, do I get the result in same order in which Get was provided.
 e.g if I provide 3 Get with row keys [r1, r2, r3], will I get [result1, 
 result2, result3]?

 Thanks
 Shagun Agarwal



-- 
Harsh J


Re: client cache for all region server information?

2012-08-23 Thread Lin Ma
Harsh, thanks for the detailed information.

Two more comments,

1. I want to confirm my understanding is correct. At the beginning the client
cache has nothing; when the client issues a request for a table, if the region server
location is not known, it will go through the root and META regions to get the region
server information step by step, then cache the region server information.
If the cache already contains the requested region information, it is used
directly from the cache. In this way, the cache grows on cache misses for requested
region information;
2. "far outweighs the other items it caches (scan results, etc.)" -- do you mean
the GET API of HBase caches results? Sorry, I was not aware of this feature
before. How are the results cached, and can we control it
(supposing a client is doing a random read pattern, we do not want to cache
information since each read may be a unique row-key access)? I'd appreciate it if
you could point me to some more detailed information.

regards,
Lin

On Thu, Aug 23, 2012 at 9:35 PM, Harsh J ha...@cloudera.com wrote:

 Hi Lin,

 On Thu, Aug 23, 2012 at 4:31 PM, Lin Ma lin...@gmail.com wrote:
  Thank you Abhishek,
 
  Two more comments,
 
  -- Client only caches information as needed for its queries and not
  necessarily for 'all' region servers. -- how did client know which
 region
  server information is necessary to be cached in current HBase
  implementation?

 What Abhishek meant here is that it caches only the needed table's
 rows from META. It also only caches the specific region required for
 the row you're looking up/operating on, AFAICT.

  -- When the client loads region server information for the first time?
 Did
  client persistent cache information at client side about region server
  information?

 The client loads up regionserver information for a table, when it is
 requested to perform an operation on that table (on a specific row or
 the whole). It does not immediately, upon initialization, cache the
 whole of META's contents.

 Your question makes sense though, that it does seem to be such that a
 client *may* use quite a bit of memory space in trying to cache the
 META entries locally, but practically we've not had this cause issues
 for users yet. The amount of memory cached for META far outweighs the
 other items it caches (scan results, etc.). At least I have not seen
 any reports of excessive client memory usage just due to region
 locations of tables being cached.

 I think there's more benefits storing/caching it than not doing so,
 and so far we've not needed the extra complexity of persisting the
 cache to a local or non-RAM storage than keeping it in memory.

 --
 Harsh J



Re: how client location a region/tablet?

2012-08-23 Thread Lin Ma
Doug, very informative document. Thanks a lot!

I read through it and have some thoughts,

- Suppose at the beginning the client-side cache for region information is
empty, and the client wants to GET row-key 123 from table ABC;
- The client will read from the ROOT table first. But unfortunately, the ROOT
table only contains region information for the META table (please correct me if
I am wrong), not region information for the real data tables (e.g. table
ABC);
- Does the client have to call each META region server one by one, in order
to find which META region contains the information for the region owning row-key
123 of data table ABC?

BTW: I think if there were a way to expose what range of
tables/regions each META region contains via the .META. region key, it would
save the time of iterating over META region servers one by one. Please feel
free to correct me if I am wrong.

regards,
Lin

On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil doug.m...@explorysmedical.com wrote:


 For further information about the catalog tables and region-regionserver
 assignment, see this:

 http://hbase.apache.org/book.html#arch.catalog






 On 8/19/12 7:36 AM, Lin Ma lin...@gmail.com wrote:

 Thank you Stack, especially for the smart 6 round trip guess for the
 puzzle. :-)
 
 1. Yeah, we client cache's locations, not the data. -- does it mean for
 each client, it will cache all location information of a HBase cluster,
 i.e. which physical server owns which region? Supposing each region has
 128M bytes, for a big cluster (P-bytes level), total data size / 128M is
 not a trivial number, not sure if any overhead to client?
 2. A bit confused by what do you mean not the data? For the client
 cached
 location information, it should be the data in table METADATA, which is
 region / physical server mapping data. Why you say not data (do you mean
 real content in each region)?
 
 regards,
 Lin
 
 On Sun, Aug 19, 2012 at 12:40 PM, Stack st...@duboce.net wrote:
 
  On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
   Hello guys,
  
   I am referencing the Big Table paper about how a client locates a
 tablet.
   In section 5.1 Tablet location, it is mentioned that client will cache
  all
   tablet locations, I think it means client will cache root tablet in
   METADATA table, and all other tablets in METADATA table (which means
  client
   cache the whole METADATA table?). My question is, whether HBase
  implements
   in the same or similar way? My concern or confusion is, supposing each
   tablet or region file is 128M bytes, it will be very huge space (i.e.
   memory footprint) for each client to cache all tablets or region
 files of
   METADATA table. Is it doable or feasible in real HBase clusters?
 Thanks.
  
 
  Yeah, we client cache's locations, not the data.
 
 
   BTW: another confusion from me is in the paper of Big Table section
 5.1
   Tablet location, it is mentioned that If the client's cache is stale,
  the
   location algorithm could take up to six round-trips, because stale
 cache
   entries are only discovered upon misses (assuming that METADATA
 tablets
  do
   not move very frequently)., I do not know how the 6 times round trip
  time
   is calculated, if anyone could answer this puzzle, it will be great.
 :-)
  
 
  I'm not sure what the 6 is about either.  Here is a guesstimate:
 
  1. Go to cached location for a server for a particular user region,
  but server says that it does not have a region, the client location is
  stale
  2. Go back to client cached meta region that holds user region w/ row
  we want, but its location is stale.
  3. Go to root location, to find new location of meta, but the root
  location has moved what the client has is stale
  4. Find new root location and do lookup of meta region location
  5. Go to meta region location to find new user region
  6. Go to server w/ user region
 
  St.Ack
 





Re: how client location a region/tablet?

2012-08-23 Thread Lin Ma
Doug,

Some more thoughts, after reading the data structure of HRegionInfo =
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html:
the start key and end key look informative, and we might be able to leverage them.

- I am not sure if we could leverage this information (stored as part of the
value in the ROOT table) to find which META region may contain the region server
information for row-key 123 of data table ABC;
- But I think unfortunately the information is stored in the value of the
ROOT table, rather than in its key field, so we would have to iterate over each
row in the ROOT table one by one to figure out which META region server to
access.

Not sure if I got the point right. Please feel free to correct me.

regards,
Lin

On Thu, Aug 23, 2012 at 11:15 PM, Lin Ma lin...@gmail.com wrote:

 Doug, very informative document. Thanks a lot!

 I read through it and have some thoughts,

 - Supposing at the beginning, client side cache for region information is
 empty, and the client wants to GET row-key 123 from table ABC;
 - The client will read from ROOT table at first. But unfortunately, ROOT
 table only contains region information for META table (please correct me if
 I am wrong), but not region information for real data table (e.g. table
 ABC);
 - Does the client have to call each META region server one by one, in
 order to find which META region contains information for region owner of
 row-key 123 of data table ABC?

 BTW: I think if there is a way to expose information about what range of
 table/region each META region contains from .META. region key, it will be
 better to save time to iterate META region server one by one. Please feel
 free to correct me if I am wrong.

 regards,
 Lin


 On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil 
 doug.m...@explorysmedical.com wrote:


 For further information about the catalog tables and region-regionserver
 assignment, see this:

 http://hbase.apache.org/book.html#arch.catalog






 On 8/19/12 7:36 AM, Lin Ma lin...@gmail.com wrote:

 Thank you Stack, especially for the smart 6 round trip guess for the
 puzzle. :-)
 
 1. Yeah, we client cache's locations, not the data. -- does it mean for
 each client, it will cache all location information of a HBase cluster,
 i.e. which physical server owns which region? Supposing each region has
 128M bytes, for a big cluster (P-bytes level), total data size / 128M is
 not a trivial number, not sure if any overhead to client?
 2. A bit confused by what do you mean not the data? For the client
 cached
 location information, it should be the data in table METADATA, which is
 region / physical server mapping data. Why you say not data (do you mean
 real content in each region)?
 
 regards,
 Lin
 
 On Sun, Aug 19, 2012 at 12:40 PM, Stack st...@duboce.net wrote:
 
  On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
   Hello guys,
  
   I am referencing the Big Table paper about how a client locates a
 tablet.
   In section 5.1 Tablet location, it is mentioned that client will
 cache
  all
   tablet locations, I think it means client will cache root tablet in
   METADATA table, and all other tablets in METADATA table (which means
  client
   cache the whole METADATA table?). My question is, whether HBase
  implements
   in the same or similar way? My concern or confusion is, supposing
 each
   tablet or region file is 128M bytes, it will be very huge space (i.e.
   memory footprint) for each client to cache all tablets or region
 files of
   METADATA table. Is it doable or feasible in real HBase clusters?
 Thanks.
  
 
  Yeah, we client cache's locations, not the data.
 
 
   BTW: another confusion from me is in the paper of Big Table section
 5.1
   Tablet location, it is mentioned that If the client's cache is
 stale,
  the
   location algorithm could take up to six round-trips, because stale
 cache
   entries are only discovered upon misses (assuming that METADATA
 tablets
  do
   not move very frequently)., I do not know how the 6 times round trip
  time
   is calculated, if anyone could answer this puzzle, it will be great.
 :-)
  
 
  I'm not sure what the 6 is about either.  Here is a guesstimate:
 
  1. Go to cached location for a server for a particular user region,
  but server says that it does not have a region, the client location is
  stale
  2. Go back to client cached meta region that holds user region w/ row
  we want, but its location is stale.
  3. Go to root location, to find new location of meta, but the root
  location has moved what the client has is stale
  4. Find new root location and do lookup of meta region location
  5. Go to meta region location to find new user region
  6. Go to server w/ user region
 
  St.Ack
 






Re: client cache for all region server information?

2012-08-23 Thread Harsh J
Hi Lin,

On Thu, Aug 23, 2012 at 7:56 PM, Lin Ma lin...@gmail.com wrote:
 Harsh, thanks for the detailed information.

 Two more comments,

 1. I want to confirm my understanding is correct. At the beginning client
 cache has nothing, when it issue request for a table, if the region server
 location is not known, it will request from root META region to get region
 server information step by step, then cache the region server information.
 If cache already contain the requested region information, it will use
 directly from cache. In this way, cache grows when cache miss for requested
 region information;

You have it correct now. Region locations are looked up and cached only when
they are not already in the cache. And they are cached on a need basis, not all at once.
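
You can see this from the client API as well; something like the following
(the table name and row key are made up) triggers a META lookup on the first
call and is answered from the client-side cache afterwards:

  HTable table = new HTable(conf, "ABC");
  HRegionLocation loc = table.getRegionLocation(Bytes.toBytes("123"));
  System.out.println(loc);  // region info plus the server hosting it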

 2. far outweighs the other items it caches (scan results, etc.), you mean
 GET API of HBase cache results? Sorry I am not aware of this feature before.
 How the results are cached, and whether we can control it (supposing a
 client is doing random read pattern, we do not want to cache information
 since each read may be unique row-key access)? Appreciate if you could point
 me to some more detailed information.

Am speaking of Scanner value caching, not Gets exactly. See more about
Scanner (client) caching at
http://hbase.apache.org/book.html#perf.hbase.client.caching
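
For example:

  Scan scan = new Scan();
  scan.setCaching(100);  // rows fetched per RPC round-trip to the region server
  ResultScanner rs = table.getScanner(scan);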

 regards,
 Lin


 On Thu, Aug 23, 2012 at 9:35 PM, Harsh J ha...@cloudera.com wrote:

 Hi Lin,

 On Thu, Aug 23, 2012 at 4:31 PM, Lin Ma lin...@gmail.com wrote:
  Thank you Abhishek,
 
  Two more comments,
 
  -- Client only caches information as needed for its queries and not
  necessarily for 'all' region servers. -- how did client know which
  region
  server information is necessary to be cached in current HBase
  implementation?

 What Abhishek meant here is that it caches only the needed table's
 rows from META. It also only caches the specific region required for
 the row you're looking up/operating on, AFAICT.

  -- When the client loads region server information for the first time?
  Did
  client persistent cache information at client side about region server
  information?

 The client loads up regionserver information for a table, when it is
 requested to perform an operation on that table (on a specific row or
 the whole). It does not immediately, upon initialization, cache the
 whole of META's contents.

 Your question makes sense though, that it does seem to be such that a
 client *may* use quite a bit of memory space in trying to cache the
 META entries locally, but practically we've not had this cause issues
 for users yet. The amount of memory cached for META far outweighs the
 other items it caches (scan results, etc.). At least I have not seen
 any reports of excessive client memory usage just due to region
 locations of tables being cached.

 I think there's more benefits storing/caching it than not doing so,
 and so far we've not needed the extra complexity of persisting the
 cache to a local or non-RAM storage than keeping it in memory.

 --
 Harsh J





-- 
Harsh J


Re: how client location a region/tablet?

2012-08-23 Thread Harsh J
HBase currently keeps a single META region (Doesn't split it). ROOT
holds META region location, and META has a few rows in it, a few of
them for each table. See also the class MetaScanner.

On Thu, Aug 23, 2012 at 9:00 PM, Lin Ma lin...@gmail.com wrote:
 Dong,

 Some more thoughts, after reading data structure for HRegionInfo =
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html,
 start key and end key looks informative which we could leverage,

 - I am not sure if we could leverage this information (stored as part of
 value in table ROOT) to find which META region may contains region server
 information for row-key 123 of data table ABC;
 - But I think unfortunately the information is stored in value of table
 ROOT, other than key field of table ROOT, so that we have to iterate each
 row in ROOT table one by one to figure out which META region server to
 access.

 Not sure if I get the points. Please feel free to correct me.

 regards,
 Lin

 On Thu, Aug 23, 2012 at 11:15 PM, Lin Ma lin...@gmail.com wrote:

 Doug, very informative document. Thanks a lot!

 I read through it and have some thoughts,

 - Supposing at the beginning, client side cache for region information is
 empty, and the client wants to GET row-key 123 from table ABC;
 - The client will read from ROOT table at first. But unfortunately, ROOT
 table only contains region information for META table (please correct me if
 I am wrong), but not region information for real data table (e.g. table
 ABC);
 - Does the client have to call each META region server one by one, in
 order to find which META region contains information for region owner of
 row-key 123 of data table ABC?

 BTW: I think if there is a way to expose information about what range of
 table/region each META region contains from .META. region key, it will be
 better to save time to iterate META region server one by one. Please feel
 free to correct me if I am wrong.

 regards,
 Lin


 On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil 
 doug.m...@explorysmedical.com wrote:


 For further information about the catalog tables and region-regionserver
 assignment, see this:

 http://hbase.apache.org/book.html#arch.catalog






 On 8/19/12 7:36 AM, Lin Ma lin...@gmail.com wrote:

 Thank you Stack, especially for the smart 6 round trip guess for the
 puzzle. :-)
 
 1. Yeah, we client cache's locations, not the data. -- does it mean for
 each client, it will cache all location information of a HBase cluster,
 i.e. which physical server owns which region? Supposing each region has
 128M bytes, for a big cluster (P-bytes level), total data size / 128M is
 not a trivial number, not sure if any overhead to client?
 2. A bit confused by what do you mean not the data? For the client
 cached
 location information, it should be the data in table METADATA, which is
 region / physical server mapping data. Why you say not data (do you mean
 real content in each region)?
 
 regards,
 Lin
 
 On Sun, Aug 19, 2012 at 12:40 PM, Stack st...@duboce.net wrote:
 
  On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
   Hello guys,
  
   I am referencing the Big Table paper about how a client locates a
 tablet.
   In section 5.1 Tablet location, it is mentioned that client will
 cache
  all
   tablet locations, I think it means client will cache root tablet in
   METADATA table, and all other tablets in METADATA table (which means
  client
   cache the whole METADATA table?). My question is, whether HBase
  implements
   in the same or similar way? My concern or confusion is, supposing
 each
   tablet or region file is 128M bytes, it will be very huge space (i.e.
   memory footprint) for each client to cache all tablets or region
 files of
   METADATA table. Is it doable or feasible in real HBase clusters?
 Thanks.
  
 
  Yeah, we client cache's locations, not the data.
 
 
   BTW: another confusion from me is in the paper of Big Table section
 5.1
   Tablet location, it is mentioned that If the client's cache is
 stale,
  the
   location algorithm could take up to six round-trips, because stale
 cache
   entries are only discovered upon misses (assuming that METADATA
 tablets
  do
   not move very frequently)., I do not know how the 6 times round trip
  time
   is calculated, if anyone could answer this puzzle, it will be great.
 :-)
  
 
  I'm not sure what the 6 is about either.  Here is a guesstimate:
 
  1. Go to cached location for a server for a particular user region,
  but server says that it does not have a region, the client location is
  stale
  2. Go back to client cached meta region that holds user region w/ row
  we want, but its location is stale.
  3. Go to root location, to find new location of meta, but the root
  location has moved what the client has is stale
  4. Find new root location and do lookup of meta region location
  5. Go to meta region location to find new user region
  6. Go to server w/ user region
 
  St.Ack
 







-- 
Harsh J


Re: How to avoid stop-the-world GC for HBase Region Server under big heap size

2012-08-23 Thread Stack
On Wed, Aug 22, 2012 at 11:06 PM, Gen Liu ge...@zynga.com wrote:
 Hi,

 We are running Region Server on big memory machine (70G) and set Xmx=64G.
 Most heap is used as block cache for random read.
 Stop-the-world GC is killing the region server, but using less heap (16G)
 doesn't utilize our machines well.

 Is there a concurrent or parallel GC option that won't block all threads?

 Any thought is appreciated. Thanks.


Have you tried tuning the JVM at all?  What are the options that you
are running with?  You have GC logs enabled?   Post a few up on
pastebin?  As Mohamed asks, you've the slab allocator enabled?   What
are your configs like?  How many regions per server?  What size are
they?
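
For reference, a common starting point in hbase-env.sh for CMS looks
something like this (numbers are illustrative, not tuned for a 64G heap):

  export HBASE_OPTS="$HBASE_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
    -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled \
    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
    -Xloggc:/path/to/gc-hbase.log"

That at least gets you GC logs to post.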

St.Ack


Re: Thrift2 interface

2012-08-23 Thread Karthik Ranganathan
Hey Joe,

We have tried a few different things wrt the C++ clients and thrift. Just
putting out some of our thoughts here.

First, we used the existing Thrift proxy as a separate tier (Thrift proxy
tier). The issue there was that we just didn't get enough throughput (for
various reasons). Independently, adoption of HBase from C++ was increasing
- so we thought it made sense to write a native client.

So we wrote the native C++ client and embedded the thrift proxy into the
region server (embedded thrift proxy). Cutting the redirect from the
client was one gain (as the native client is a smart client), but the real
advantage came from short-circuiting the flow. In the thrift proxy tier
case, the Thrift client would talk to the proxy using Thrift
serialization, proxy would deserialize the Thrift call and re-serialize it
into the Java client format, then send it to the region server which would
deserialize the java formatted buffers again. But in the embedded proxy +
native client, we can short-circuit on the embedded proxy and make a
function call to the region server which is running in the same JVM (which
helps cut one round of serialization and deserialization).

The issues, however, with the thrift based approach are that the Java
objects (Htable, scan, get, put, etc) are not thrift definitions, so they
need to be updated as a separate (and often very different) set of api's
every time there is an enhancement to the Java side of things. The proxy
tier has to be separately configured/tuned/bug fixed from the region
server to make sure it is as performant as the region server - as the
overall system will perform like the slowest component in the stack.

The ideal solution (IMHO) is to have a C++ client which has a compatible
protocol with the Java client, so that there are no significant perf
differences between the two approaches, and there is no separate proxy to
tune. Just a thought of course, might be hard to achieve. Of course we have
just talked about this :) but with the move to protocol buffers in trunk,
this should be easier.

Out of curiosity, why thrift2 - do you specifically need thrift api's to
region servers? Why not an efficient C/C++ client for HBase?

Thanks
Karthik



On 8/22/12 4:06 PM, Joe Pallas joseph.pal...@oracle.com wrote:


On Aug 21, 2012, at 9:29 AM, Stack wrote:

 On Mon, Aug 20, 2012 at 6:18 PM, Joe Pallas joseph.pal...@oracle.com
wrote:
 Anyone out there actively using the thrift2 interface in 0.94?  Thrift
bindings for C++ don't seem to handle optional arguments too well (that
is to say, it seems that optional arguments are not optional).
Unfortunately, checkAndPut uses an optional argument for value to
distinguish between the two cases (value must match vs no cell with
that column qualifier).  Any clues on how to work around that
difficulty would be welcome.
 
 
 If you make a patch, we'll commit it Joe.

Well, I think the patch really needs to be in Thrift; the only workaround
I can see is to restructure the hbase.thrift interface file to avoid
having routines with optional arguments.  It seems a shame to break
compatibility with existing clients for that, and I am not sure if there
is a way to do it without breaking compatibility.  (On the other hand,
we're talking about thrift2, so it isn't like there are many existing
clients.)

The state of Thrift documentation is lamentable.  The original white
paper is the most detailed information I can find about compatibility
rules.  It has enough information to tell me that Thrift doesn't support
overloading of routine names within a service, because the names are the
identifiers used to identify the routines.  I think that means it isn't
possible to make a compatible change that would only affect the client
side.
 
 Have you seen this?
 https://github.com/facebook/native-cpp-hbase-client  Would it help?

The native client stuff is certainly interesting, but, as near as I can
tell, it expects the in-region-server Thrift server, which I would like
to give a chance to mature a bit before playing with.  I'm also puzzled
by the hbase.thrift file in that repository.  It seems to be based on the
older HBase Thrift interface, but it adds some functions.  I can't see
how a client could use them, though, since there are no HBase-side
patches.

Anyone involved with FB's native client efforts care to enlighten me?

joe




Re: how client location a region/tablet?

2012-08-23 Thread Harsh J
Lin,

On Thu, Aug 23, 2012 at 10:10 PM, Lin Ma lin...@gmail.com wrote:
 Thanks, Harsh!

 - HBase currently keeps a single META region (Doesn't split it).  -- does
 it mean there is only one row in ROOT table, which points the only one META
 region?

Yes, currently this is the case. We disabled multiple META regions at
some point, I am unsure about why exactly but perhaps it was complex
to maintain that.

 - In Big Table, it seems they have multiple META regions (tablets), is it an
 advantage over HBase? :-)

Well, it depends. A single META region hasn't proven to be a scalability
bottleneck for anyone yet. A single META region can easily serve
millions of rows if needed, like any other region, and I've usually
not seen the META table grow that big in deployments.

-- 
Harsh J


RE: how client location a region/tablet?

2012-08-23 Thread Pamecha, Abhishek
I too thought there were multiple META regions, whereas just one ROOT. Maybe I
am mixing up Big Table and HBase.

Thanks,
Abhishek


-Original Message-
From: Lin Ma [mailto:lin...@gmail.com] 
Sent: Thursday, August 23, 2012 9:41 AM
To: user@hbase.apache.org; ha...@cloudera.com
Cc: doug.m...@explorysmedical.com
Subject: Re: how client location a region/tablet?

Thanks, Harsh!

- HBase currently keeps a single META region (Doesn't split it).  -- does it
mean there is only one row in the ROOT table, which points to the single META region?
- In Big Table, it seems they have multiple META regions (tablets), is it an 
advantage over HBase? :-)

regards,
Lin
On Thu, Aug 23, 2012 at 11:48 PM, Harsh J ha...@cloudera.com wrote:

 HBase currently keeps a single META region (Doesn't split it). ROOT 
 holds META region location, and META has a few rows in it, a few of 
 them for each table. See also the class MetaScanner.

 On Thu, Aug 23, 2012 at 9:00 PM, Lin Ma lin...@gmail.com wrote:
  Dong,
 
  Some more thoughts, after reading data structure for HRegionInfo = 
  http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.
  html
 ,
  start key and end key looks informative which we could leverage,
 
  - I am not sure if we could leverage this information (stored as 
  part of value in table ROOT) to find which META region may contains 
  region server information for row-key 123 of data table ABC;
  - But I think unfortunately the information is stored in value of 
  table ROOT, other than key field of table ROOT, so that we have to 
  iterate each row in ROOT table one by one to figure out which META 
  region server to access.
 
  Not sure if I get the points. Please feel free to correct me.
 
  regards,
  Lin
 
  On Thu, Aug 23, 2012 at 11:15 PM, Lin Ma lin...@gmail.com wrote:
 
  Doug, very informative document. Thanks a lot!
 
  I read through it and have some thoughts,
 
  - Supposing at the beginning, client side cache for region 
  information
 is
  empty, and the client wants to GET row-key 123 from table ABC;
  - The client will read from ROOT table at first. But unfortunately, 
  ROOT table only contains region information for META table (please 
  correct
 me if
  I am wrong), but not region information for real data table (e.g. 
  table ABC);
  - Does the client have to call each META region server one by one, 
  in order to find which META region contains information for region 
  owner of row-key 123 of data table ABC?
 
  BTW: I think if there is a way to expose information about what 
  range of table/region each META region contains from .META. region 
  key, it will
 be
  better to save time to iterate META region server one by one. 
  Please
 feel
  free to correct me if I am wrong.
 
  regards,
  Lin
 
 
  On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil 
 doug.m...@explorysmedical.comwrote:
 
 
  For further information about the catalog tables and
 region-regionserver
  assignment, see this:
 
  http://hbase.apache.org/book.html#arch.catalog
 
 
 
 
 
 
  On 8/19/12 7:36 AM, Lin Ma lin...@gmail.com wrote:
 
  Thank you Stack, especially for the smart 6 round trip guess for 
  the puzzle. :-)
  
  1. Yeah, we client cache's locations, not the data. -- does it 
  mean
 for
  each client, it will cache all location information of a HBase
 cluster,
  i.e. which physical server owns which region? Supposing each 
  region
 has
  128M bytes, for a big cluster (P-bytes level), total data size / 
  128M
 is
  not a trivial number, not sure if any overhead to client?
  2. A bit confused by what do you mean not the data? For the 
  client cached location information, it should be the data in 
  table METADATA, which
 is
  region / physical server mapping data. Why you say not data (do 
  you
 mean
  real content in each region)?
  
  regards,
  Lin
  
  On Sun, Aug 19, 2012 at 12:40 PM, Stack st...@duboce.net wrote:
  
   On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
Hello guys,
   
I am referencing the Big Table paper about how a client 
locates a
  tablet.
In section 5.1 Tablet location, it is mentioned that client 
will
  cache
   all
tablet locations, I think it means client will cache root 
tablet
 in
METADATA table, and all other tablets in METADATA table 
(which
 means
   client
cache the whole METADATA table?). My question is, whether 
HBase
   implements
in the same or similar way? My concern or confusion is, 
supposing
  each
tablet or region file is 128M bytes, it will be very huge 
space
 (i.e.
memory footprint) for each client to cache all tablets or 
region
  files of
METADATA table. Is it doable or feasible in real HBase clusters?
  Thanks.
   
  
   Yeah, we client cache's locations, not the data.
  
  
BTW: another confusion from me is in the paper of Big Table
 section
  5.1
 Tablet location, it is mentioned that If the client's cache
is
  stale,
   the
location 

Client receives SocketTimeoutException (CallerDisconnected on RS)

2012-08-23 Thread Adrien Mogenet
Hi there,

While I'm performing read-intensive benchmarks, I'm seeing a storm of
CallerDisconnectedExceptions on certain RegionServers. At the same time, as
the documentation says it would, my client receives a SocketTimeoutException
(60000ms etc...).
It's always happening, and I get very poor read performance (from 10
to 5000 reads/sec) in a 10-node cluster.

My benchmark consists in several iterations launching 10, 100 and 1000
Get requests on a given random rowkey with a single CF/qualifier.
I'm using HBase 0.94.1 (a few commits before the official stable
release) with Hadoop 1.0.3.
Bloom filters have been enabled (at the rowkey level).

I cannot find very clear information about these exceptions. From the
reference guide :
  (...) you should consider digging in a bit more if you aren't doing
something to trigger them.

Well... could you help me digging? :-)

-- 
AM.


Re: Client receives SocketTimeoutException (CallerDisconnected on RS)

2012-08-23 Thread Jean-Daniel Cryans
Hi Adrien,

I would love to see the region server side of the logs while those
socket timeouts happen, also check the GC log, but one thing people
often hit while doing pure random read workloads with tons of clients
is running out of sockets because they are all stuck in CLOSE_WAIT.
You can check that by using lsof. There are other discussion on this
mailing list about it.

J-D

On Thu, Aug 23, 2012 at 10:24 AM, Adrien Mogenet
adrien.moge...@gmail.com wrote:
 Hi there,

 While I'm performing read-intensive benchmarks, I'm seeing storm of
 CallerDisconnectedException in certain RegionServers. As the
 documentation says, my client received a SocketTimeoutException
 (6ms etc...) at the same time.
 It's always happening and I get very poor read-performances (from 10
 to 5000 reads/sc) in a 10 nodes cluster.

 My benchmark consists in several iterations launching 10, 100 and 1000
 Get requests on a given random rowkey with a single CF/qualifier.
 I'm using HBase 0.94.1 (a few commits before the official stable
 release) with Hadoop 1.0.3.
 Bloom filters have been enabled (at the rowkey level).

 I do not find very clear informations about these exceptions. From the
 reference guide :
   (...) you should consider digging in a bit more if you aren't doing
 something to trigger them.

 Well... could you help me digging? :-)

 --
 AM.


Re: Client receives SocketTimeoutException (CallerDisconnected on RS)

2012-08-23 Thread N Keywal
Hi Adrien,

As well, if you can share the client code (number of threads, regions,
is it a set of single get, or are they multi gets, this kind of
stuff).

Cheers,

N.


On Thu, Aug 23, 2012 at 7:40 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:
 Hi Adrien,

 I would love to see the region server side of the logs while those
 socket timeouts happen, also check the GC log, but one thing people
 often hit while doing pure random read workloads with tons of clients
 is running out of sockets because they are all stuck in CLOSE_WAIT.
 You can check that by using lsof. There are other discussion on this
 mailing list about it.

 J-D

 On Thu, Aug 23, 2012 at 10:24 AM, Adrien Mogenet
 adrien.moge...@gmail.com wrote:
 Hi there,

 While I'm performing read-intensive benchmarks, I'm seeing storm of
 CallerDisconnectedException in certain RegionServers. As the
 documentation says, my client received a SocketTimeoutException
 (6ms etc...) at the same time.
 It's always happening and I get very poor read-performances (from 10
 to 5000 reads/sc) in a 10 nodes cluster.

 My benchmark consists in several iterations launching 10, 100 and 1000
 Get requests on a given random rowkey with a single CF/qualifier.
 I'm using HBase 0.94.1 (a few commits before the official stable
 release) with Hadoop 1.0.3.
 Bloom filters have been enabled (at the rowkey level).

 I do not find very clear informations about these exceptions. From the
 reference guide :
   (...) you should consider digging in a bit more if you aren't doing
 something to trigger them.

 Well... could you help me digging? :-)

 --
 AM.


Re: Client receives SocketTimeoutException (CallerDisconnected on RS)

2012-08-23 Thread Adrien Mogenet
Hi guys,

1/ I quickly checked the GC logs and saw nothing. Since I need very
fast lookups, I set the zookeeper.session.timeout parameter to 10s so
that an RS is considered dead after very short pauses, and that did not
occur.

2/ I did not check, but I don't think I ran out of sockets since the
ulimit has been set very high; I'll check!

3/ The benchmark can launch several R/W threads, but even the simplest
program leads to my issue (roughly, minus the imports from
org.apache.hadoop.conf and org.apache.hadoop.hbase.client):

Configuration config = HBaseConfiguration.create();
HTable table = new HTable(config, "test");
List<Get> getsList = new ArrayList<Get>();
for (int i = 0; i < n; i++) {         // n = 1, 10, 100 or 1000
  getsList.add(new Get(randomKey())); // randomKey(): some random existing rowkey
}
Result[] results = table.get(getsList);
table.close();

4/ I will share more logs tomorrow to dig deeper; I personally need a
long STW pause :-)

Cheers,

On Thu, Aug 23, 2012 at 7:49 PM, N Keywal nkey...@gmail.com wrote:
 Hi Adrien,

 As well, it would help if you could share the client code (number of
 threads, regions, whether it is a set of single gets or multi gets,
 this kind of stuff).

 Cheers,

 N.


 On Thu, Aug 23, 2012 at 7:40 PM, Jean-Daniel Cryans jdcry...@apache.org 
 wrote:
 Hi Adrien,

 I would love to see the region server side of the logs while those
 socket timeouts happen; also check the GC log. One thing people often
 hit while doing pure random-read workloads with tons of clients is
 running out of sockets because they are all stuck in CLOSE_WAIT. You
 can check that by using lsof. There are other discussions on this
 mailing list about it.

 J-D

 On Thu, Aug 23, 2012 at 10:24 AM, Adrien Mogenet
 adrien.moge...@gmail.com wrote:
 Hi there,

 While I'm performing read-intensive benchmarks, I'm seeing a storm of
 CallerDisconnectedExceptions in certain RegionServers. As the
 documentation says, my client received a SocketTimeoutException
 (6ms etc...) at the same time.
 It's always happening and I get very poor read performance (from 10
 to 5000 reads/sec) in a 10-node cluster.

 My benchmark consists of several iterations launching 10, 100 and 1000
 Get requests on a given random rowkey with a single CF/qualifier.
 I'm using HBase 0.94.1 (a few commits before the official stable
 release) with Hadoop 1.0.3.
 Bloom filters have been enabled (at the rowkey level).

 I do not find very clear information about these exceptions. From the
 reference guide:
   (...) you should consider digging in a bit more if you aren't doing
 something to trigger them.

 Well... could you help me dig? :-)

-- 
AM


Re: HBase row level cache for random read

2012-08-23 Thread Gen Liu


On 8/18/12 12:33 PM, Stack st...@duboce.net wrote:

On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu ge...@zynga.com wrote:
 I assume the block cache stores compressed data,

Generally it's not, not unless you use block encoding.
Can you be more specific on this? Are you talking about
https://issues.apache.org/jira/browse/HBASE-4218
So this is only available in 0.94 then? Thanks.

 one block can hold 6 rows, but with random reads maybe only 1 row is ever
accessed, so 5/6 of the cache space is wasted.
 Is there a better way of caching for random reads? Lowering the block size
to 32k or even 16k might be a choice.


We don't seem to list this as an option in this section,
http://hbase.apache.org/book.html#perf.reading, but yes, with lots of
random reads a smaller block size could make a difference.

St.Ack



Re: HBase row level cache for random read

2012-08-23 Thread Stack
On Thu, Aug 23, 2012 at 12:06 PM, Gen Liu ge...@zynga.com wrote:


 On 8/18/12 12:33 PM, Stack st...@duboce.net wrote:

On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu ge...@zynga.com wrote:
 I assume the block cache stores compressed data,

Generally it's not, not unless you use block encoding.
 Can you be more specific on this? Are you talking about
 https://issues.apache.org/jira/browse/HBASE-4218
 So this is only available in 0.94 then? Thanks.

 one block can hold 6 rows, but with random reads maybe only 1 row is ever
accessed, so 5/6 of the cache space is wasted.
 Is there a better way of caching for random reads? Lowering the block size
to 32k or even 16k might be a choice.


We don't seem to list this as an option in this section,
http://hbase.apache.org/book.html#perf.reading, but yes, with lots of
random reads a smaller block size could make a difference.


See release note in https://issues.apache.org/jira/browse/HBASE-4218
St.Ack
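
For concreteness, a minimal sketch of the two knobs discussed above -- a
smaller block size and HBASE-4218 block encoding -- via the 0.94 Java
admin API. Table "test" and family "cf" are assumed names, and in
practice you would fetch and modify the table's existing
HColumnDescriptor rather than build a fresh one (imports omitted:
HBaseConfiguration, HBaseAdmin, HColumnDescriptor, DataBlockEncoding):

Configuration config = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(config);
admin.disableTable("test");            // schema changes need the table offline
HColumnDescriptor cf = new HColumnDescriptor("cf");
cf.setBlocksize(16 * 1024);            // 16k blocks for random-read workloads
cf.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF); // an HBASE-4218 encoding, 0.94+
admin.modifyColumn("test", cf);
admin.enableTable("test");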


limit on number of blocks per HFile and files per region

2012-08-23 Thread Pamecha, Abhishek
Hi
I have a few questions on blocks/file and file/region.


1. Can there be multiple row keys per block and then per HFile? Or is a
block or HFile dedicated to a single row key?

I have a scenario where, for the same column family, some rowkeys will have
very wide rows, say rowkey W, and some rowkeys will have very narrow rows, say
rowkey N. In my case, puts for rowkeys W and N are interleaved with a ratio of
say 90 rowkeyW puts vs 10 rowkeyN puts. On the get side, my app works on
getting data for a single rowkey at a time.

Will that mean that for a rowkeyN, the entries will be scattered across regions
on that same region server, given there are interleaved puts? Or is there a way
I can enforce contiguous writes to a region/HFile reserved for rowkey N? This
way, I can leverage the block cache and have the entire/most of rowkeyN fit in
there for that session.

2. Is there a limit on the number of HFiles that can exist per region?
Basically, on what criteria does a rowkey's data get split across two regions
[on the same region server]? I am assuming there can be many regions per region
server, and multiple regions for the same table can belong to the same region
server.

3. Also, is there a limit on the number of blocks that are created per
HFile? What determines whether a split is required?

Thanks,
Abhishek



Re: limit on number of blocks per HFile and files per region

2012-08-23 Thread Jean-Daniel Cryans
Inline. In general I'd recommend you read the documentation more
closely and/or get the book.

J-D

On Thu, Aug 23, 2012 at 4:21 PM, Pamecha, Abhishek apame...@x.com wrote:
 1. Can there be multiple row keys per block and then per HFile? Or is
 a block or HFile dedicated to a single row key?

Multiple row keys per HFile block. Read
http://hbase.apache.org/book.html#hfilev2

 I have a scenario where, for the same column family, some rowkeys will have
 very wide rows, say rowkey W, and some rowkeys will have very narrow rows,
 say rowkey N. In my case, puts for rowkeys W and N are interleaved with a
 ratio of say 90 rowkeyW puts vs 10 rowkeyN puts. On the get side, my app
 works on getting data for a single rowkey at a time.
 Will that mean that for a rowkeyN, the entries will be scattered across
 regions on that same region server, given there are interleaved puts? Or is
 there a way I can enforce contiguous writes to a region/HFile reserved for
 rowkey N? This way, I can leverage the block cache and have the entire/most
 of rowkeyN fit in there for that session.

The row keys are sorted according to their lexicographical order. See
http://hbase.apache.org/book.html#row

If you don't want the big rows coexisting with the small rows, put
them in different column families or different tables.

 2.   Is there a limit on number of HFiles that can exist per region?

I think a slightly wrong understanding of HFiles prompted you to ask
this; my previous answers probably mean you don't need this one
anymore, but here it is just in case:

The HFiles are compacted when reaching
hbase.hstore.compactionThreshold (default of 3) per family, and you
can have no more than hbase.hstore.blockingStoreFiles (default of 7).

 Basically, on what criteria does a rowkey's data get split across two
regions [on the same region server]? I am assuming there can be many
regions per region server, and multiple regions for the same table can
belong to the same region server.

A row key only lives in a single region since the regions are split
based on row keys.

 3.   Also, is there a limit on the number of blocks that are created per 
 HFile?

No.

 What determines whether a split is required?

hbase.hregion.max.filesize, also see
http://hbase.apache.org/book.html#disable.splitting if you want to
change that.
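
To make those three knobs concrete, a hedged sketch of overriding them.
They normally live in hbase-site.xml on the cluster nodes; the
Configuration calls below are just the programmatic equivalent, and the
values shown are examples, not recommendations:

Configuration conf = HBaseConfiguration.create();  // also reads hbase-site.xml
// minor compactions kick in at this many store files per family:
conf.setInt("hbase.hstore.compactionThreshold", 3);
// writes to a region block once a store reaches this many files:
conf.setInt("hbase.hstore.blockingStoreFiles", 7);
// a region splits once a store file grows past this size (here 1 GB):
conf.setLong("hbase.hregion.max.filesize", 1024L * 1024 * 1024);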


Re: how client location a region/tablet?

2012-08-23 Thread Lin Ma
Thank you Harsh. You answered my question. I like the current architecture
of HBase, which is designed to be extensible for the future -- we have a
two-layer index structure, and we can take advantage of it when specific
problems call for it. It's like buying a four-bedroom house but only living
in one room until more children arrive. :-)

regards,
Lin

On Fri, Aug 24, 2012 at 12:46 AM, Harsh J ha...@cloudera.com wrote:

 Lin,

 On Thu, Aug 23, 2012 at 10:10 PM, Lin Ma lin...@gmail.com wrote:
  Thanks, Harsh!
 
  - HBase currently keeps a single META region (Doesn't split it). --
  does it mean there is only one row in the ROOT table, which points to
  the only META region?

 Yes, currently this is the case. We disabled multiple META regions at
 some point; I am unsure exactly why, but perhaps it was too complex
 to maintain.

  - In Bigtable, it seems they have multiple META regions (tablets); is
  that an advantage over HBase? :-)

 Well, it depends. A single META region hasn't proven to be a scalability
 bottleneck for anyone yet. A single META region can easily serve
 millions of rows if needed, like any other region, and I've usually
 not seen the META table grow that big in deployments.

 --
 Harsh J
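
As a concrete aside, the 0.94 client API will do this lookup for you; a
minimal sketch, with table ABC and row-key 123 borrowed from this thread
as the running example (imports omitted: HBaseConfiguration, HTable,
HRegionLocation, Bytes):

Configuration config = HBaseConfiguration.create();
HTable table = new HTable(config, "ABC");
// goes ROOT -> META behind the scenes; the answer is cached client-side:
HRegionLocation loc = table.getRegionLocation(Bytes.toBytes("123"));
System.out.println(loc.getRegionInfo().getRegionNameAsString()
    + " is served by " + loc.getHostname() + ":" + loc.getPort());
table.close();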



Re: how client location a region/tablet?

2012-08-23 Thread Lin Ma
Me too, Abhishek -- you are not alone. But it is good to learn and discuss
here to understand the various design choices.

regards,
Lin

On Fri, Aug 24, 2012 at 1:06 AM, Pamecha, Abhishek apame...@x.com wrote:

 I too thought there were multiple META regions whereas just one ROOT. Maybe
 I am mixing up Bigtable and HBase.

 Thanks,
 Abhishek


 -Original Message-
 From: Lin Ma [mailto:lin...@gmail.com]
 Sent: Thursday, August 23, 2012 9:41 AM
 To: user@hbase.apache.org; ha...@cloudera.com
 Cc: doug.m...@explorysmedical.com
 Subject: Re: how client location a region/tablet?

 Thanks, Harsh!

 - HBase currently keeps a single META region (Doesn't split it). --
 does it mean there is only one row in the ROOT table, which points to the
 only META region?
 - In Bigtable, it seems they have multiple META regions (tablets); is
 that an advantage over HBase? :-)

 regards,
 Lin
 On Thu, Aug 23, 2012 at 11:48 PM, Harsh J ha...@cloudera.com wrote:

  HBase currently keeps a single META region (Doesn't split it). ROOT
  holds META region location, and META has a few rows in it, a few of
  them for each table. See also the class MetaScanner.
 
  On Thu, Aug 23, 2012 at 9:00 PM, Lin Ma lin...@gmail.com wrote:
   Doug,

   Some more thoughts: after reading the data structure for HRegionInfo
   (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html),
   the start key and end key look informative and something we could leverage,
  
    - I am not sure if we could leverage this information (stored as
    part of the value in the ROOT table) to find which META region may
    contain region server information for row-key 123 of data table ABC;
    - But I think, unfortunately, the information is stored in the value
    of the ROOT table rather than in its key field, so that we have to
    iterate over each row of the ROOT table one by one to figure out
    which META region server to access.
  
   Not sure if I get the points. Please feel free to correct me.
  
   regards,
   Lin
  
   On Thu, Aug 23, 2012 at 11:15 PM, Lin Ma lin...@gmail.com wrote:
  
   Doug, very informative document. Thanks a lot!
  
   I read through it and have some thoughts,
  
    - Supposing that at the beginning the client-side cache for region
    information is empty, and the client wants to GET row-key 123 from
    table ABC;
    - The client will read from the ROOT table first. But unfortunately,
    the ROOT table only contains region information for the META table
    (please correct me if I am wrong), not region information for real
    data tables (e.g. table ABC);
    - Does the client have to call each META region server one by one,
    in order to find which META region contains information for the
    region owner of row-key 123 of data table ABC?
  
    BTW: I think if there were a way to expose what range of
    tables/regions each META region contains from the .META. region key,
    it would be better, saving the time of iterating over the META
    region servers one by one. Please feel free to correct me if I am
    wrong.
  
   regards,
   Lin
  
  
    On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil
    doug.m...@explorysmedical.com wrote:


    For further information about the catalog tables and
    region-regionserver assignment, see this...
  
   http://hbase.apache.org/book.html#arch.catalog
  
  
  
  
  
  
   On 8/19/12 7:36 AM, Lin Ma lin...@gmail.com wrote:
  
    Thank you Stack, especially for the smart 6-round-trip guess for
    the puzzle. :-)

    1. Yeah, the client caches locations, not the data. -- does it
    mean each client will cache all location information of an HBase
    cluster, i.e. which physical server owns which region? Supposing
    each region has 128M bytes, for a big cluster (P-bytes level)
    total data size / 128M is not a trivial number; is that not some
    overhead for the client?
    2. I am a bit confused by what you mean by not the data. For the
    client-cached location information, it should be the data in the
    METADATA table, which is region / physical server mapping data.
    Why do you say not the data (do you mean the real content of each
    region)?
   
   regards,
   Lin
   
   On Sun, Aug 19, 2012 at 12:40 PM, Stack st...@duboce.net wrote:
   
     On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
      Hello guys,

      I am referencing the Big Table paper about how a client locates a
      tablet. In section 5.1 Tablet location, it is mentioned that the
      client will cache all tablet locations; I think it means the
      client will cache the root tablet of the METADATA table, and all
      other tablets in the METADATA table (which means the client
      caches the whole METADATA table?). My question is whether HBase
      implements it in the same or a similar way? My concern or
      confusion is, supposing each tablet or region file is 128M bytes,
      it will be a very huge space (i.e. memory footprint) for each
      client to cache all tablets or region files of METADATA 

HBase/JRuby update wiki page

2012-08-23 Thread Russell Jurney
The wiki page at http://wiki.apache.org/hadoop/Hbase/JRuby is out of date.
I have updated the code so that it works with the late-model APIs here:
https://github.com/rjurney/enron-jruby-sinatra-hbase-pig/blob/master/hbase_example.rb

Can someone please give me edit access on the HBase wiki, so I can
fix/update the documentation?

Thanks!

-- 
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com


hbase many-to-many design

2012-08-23 Thread jing wang
Hi 'user',

 This is a many-to-many question; I also referred to the HBase design FAQ,
http://wiki.apache.org/hadoop/Hbase/FAQ_Design.
 What I want to do is design a 'user' table, including the user's basic
information (columnFamily1) and the team names the user has joined
(columnFamily2).
 When a user joins a new team, I want to update the 'user' table to
add a column to 'columnFamily2', so that when getting the user, I get all the
team names the user has joined.
 Yet I don't want to put duplicate records, known as multi-versions;
each user should have only one record.
 What should I do?

Any advice will be appreciated!


Thanks & Best Regards
Mike
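
For what it's worth, a minimal sketch of the table Mike describes, in
the 0.94 Java API; the family names come from the thread, and setting
max versions to 1 is one way to keep a single stored record per cell
(imports omitted: HBaseConfiguration, HBaseAdmin, HTableDescriptor,
HColumnDescriptor):

HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
HTableDescriptor user = new HTableDescriptor("user");
user.addFamily(new HColumnDescriptor("columnFamily1"));  // basic user info
HColumnDescriptor teams = new HColumnDescriptor("columnFamily2");
teams.setMaxVersions(1);   // keep only the latest value per team column
user.addFamily(teams);
admin.createTable(user);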


Re: hbase many-to-many design

2012-08-23 Thread Sonal Goyal
If you are adding a new column to the team column family, I don't think
multi-versioning comes into the picture. Multi-versioning saves copies of
the values of a particular cell, but you are creating a new cell within the
same row.


Best Regards,
Sonal
Crux: Reporting for HBase https://github.com/sonalgoyal/crux
Nube Technologies http://www.nubetech.co

http://in.linkedin.com/in/sonalgoyal





On Fri, Aug 24, 2012 at 8:07 AM, jing wang happygodwithw...@gmail.comwrote:

 Hi 'user',

  This is a many-to-many question; I also referred to the HBase design FAQ,
 http://wiki.apache.org/hadoop/Hbase/FAQ_Design.
  What I want to do is design a 'user' table, including the user's basic
 information (columnFamily1) and the team names the user has joined
 (columnFamily2).
  When a user joins a new team, I want to update the 'user' table to
 add a column to 'columnFamily2', so that when getting the user, I get all
 the team names the user has joined.
  Yet I don't want to put duplicate records, known as multi-versions;
 each user should have only one record.
  What should I do?

 Any advice will be appreciated!


 Thanks & Best Regards
 Mike



Re: hbase many-to-many design

2012-08-23 Thread jing wang
Hi Sonal,

   Thanks for your reply.
   How do I add a new column to an existing column family? The method I want
to try uses 3 steps: first get the record, construct a new Put using the
previously fetched record's columnFamily2, then delete the old record in
HBase, and finally put the newly constructed Put into HBase. I really don't
think this is a good way.
   If another Put including a new column is written to HBase, is this an
'update' action or another version?
   Would you please give me some reference for adding a column to a row?

Thanks & Best Regards
Mike


2012/8/24 Sonal Goyal sonalgoy...@gmail.com

 If you are adding a new column to the team column family, I don't think
 multi-versioning comes into the picture. Multi-versioning saves copies of
 the values of a particular cell, but you are creating a new cell within the
 same row.


 Best Regards,
 Sonal
 Crux: Reporting for HBase https://github.com/sonalgoyal/crux
 Nube Technologies http://www.nubetech.co

 http://in.linkedin.com/in/sonalgoyal





 On Fri, Aug 24, 2012 at 8:07 AM, jing wang happygodwithw...@gmail.com
 wrote:

  Hi 'user',

   This is a many-to-many question; I also referred to the HBase design FAQ,
  http://wiki.apache.org/hadoop/Hbase/FAQ_Design.
   What I want to do is design a 'user' table, including the user's basic
  information (columnFamily1) and the team names the user has joined
  (columnFamily2).
   When a user joins a new team, I want to update the 'user' table to
  add a column to 'columnFamily2', so that when getting the user, I get all
  the team names the user has joined.
   Yet I don't want to put duplicate records, known as multi-versions;
  each user should have only one record.
   What should I do?

  Any advice will be appreciated!


  Thanks & Best Regards
  Mike
 



Re: hbase many-to-many design

2012-08-23 Thread Sonal Goyal
Sorry, is this what you want? I created a table with two column families. I
added one row with a cell in column family cf1, qualifier team1. Then I
added a new team to cf1 with qualifier team2.

hbase(main):001:0> create 'multi','cf1','cf2'
0 row(s) in 1.6240 seconds

hbase(main):002:0> put 'multi','row1','cf1:team1','firstTeam'
0 row(s) in 0.0880 seconds

hbase(main):003:0> scan 'multi'
ROW                  COLUMN+CELL
 row1                column=cf1:team1, timestamp=1345783824219, value=firstTeam
1 row(s) in 0.0540 seconds

hbase(main):004:0> put 'multi','row1','cf1:team2','secondTeam'
0 row(s) in 0.0060 seconds

hbase(main):005:0> scan 'multi'
ROW                  COLUMN+CELL
 row1                column=cf1:team1, timestamp=1345783824219, value=firstTeam
 row1                column=cf1:team2, timestamp=1345783846821, value=secondTeam
1 row(s) in 0.0250 seconds


Best Regards,
Sonal
Crux: Reporting for HBase https://github.com/sonalgoyal/crux
Nube Technologies http://www.nubetech.co

http://in.linkedin.com/in/sonalgoyal





On Fri, Aug 24, 2012 at 10:03 AM, jing wang happygodwithw...@gmail.comwrote:

 Hi Sonal,

  Thanks for your reply.
  How do I add a new column to an existing column family? The method I want
  to try uses 3 steps: first get the record, construct a new Put using the
  previously fetched record's columnFamily2, then delete the old record in
  HBase, and finally put the newly constructed Put into HBase. I really don't
  think this is a good way.
  If another Put including a new column is written to HBase, is this an
  'update' action or another version?
  Would you please give me some reference for adding a column to a row?

 Thanks & Best Regards
 Mike


 2012/8/24 Sonal Goyal sonalgoy...@gmail.com

  If you are adding a new column to the team column family, I don't think
  multi-versioning comes into the picture. Multi-versioning saves copies of
  the values of a particular cell, but you are creating a new cell within
  the same row.
 
 
  Best Regards,
  Sonal
  Crux: Reporting for HBase https://github.com/sonalgoyal/crux
  Nube Technologies http://www.nubetech.co
 
  http://in.linkedin.com/in/sonalgoyal
 
 
 
 
 
  On Fri, Aug 24, 2012 at 8:07 AM, jing wang happygodwithw...@gmail.com
  wrote:
 
    Hi 'user',

     This is a many-to-many question; I also referred to the HBase design
    FAQ, http://wiki.apache.org/hadoop/Hbase/FAQ_Design.
     What I want to do is design a 'user' table, including the user's
    basic information (columnFamily1) and the team names the user has
    joined (columnFamily2).
     When a user joins a new team, I want to update the 'user' table to
    add a column to 'columnFamily2', so that when getting the user, I get
    all the team names the user has joined.
     Yet I don't want to put duplicate records, known as multi-versions;
    each user should have only one record.
     What should I do?

    Any advice will be appreciated!


    Thanks & Best Regards
    Mike
  
 



Re: hbase many-to-many design

2012-08-23 Thread Pamecha, Abhishek
Hi Jing,

You can add a new column unannounced. This means your current put does not
have to know which other columns are already present in the row or, for that
matter, in the table. You just issue a put command as if it were your first
one, and the column will be added.

Unlike an RDBMS, there are no update or alter table commands you need to
execute to add a new column.

If the column you are adding already exists, then a new version of the value
you put is stored.
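
A minimal sketch of both cases in the 0.94 Java client; the table and
family names come from this thread, while the rowkey, qualifier and
values are made up (imports omitted: HBaseConfiguration, HTable, Put,
Bytes):

Configuration config = HBaseConfiguration.create();
HTable table = new HTable(config, "user");
Put put = new Put(Bytes.toBytes("user123"));          // hypothetical rowkey
// a new qualifier in columnFamily2 -> a brand-new cell, added unannounced:
put.add(Bytes.toBytes("columnFamily2"), Bytes.toBytes("teamRed"),
        Bytes.toBytes("2012-08-24"));
table.put(put);
// the same qualifier again -> not a new cell, but a new version of its value:
Put again = new Put(Bytes.toBytes("user123"));
again.add(Bytes.toBytes("columnFamily2"), Bytes.toBytes("teamRed"),
        Bytes.toBytes("2012-08-25"));
table.put(again);
table.close();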

Thanks
Abhishek


i Sent from my iPad with iMstakes 

On Aug 23, 2012, at 21:34, jing wang happygodwithw...@gmail.com wrote:

 Hi Sonal,
 
   Thanks for your reply.
   How do I add a new column to an existing column family? The method I want
 to try uses 3 steps: first get the record, construct a new Put using the
 previously fetched record's columnFamily2, then delete the old record in
 HBase, and finally put the newly constructed Put into HBase. I really don't
 think this is a good way.
   If another Put including a new column is written to HBase, is this an
 'update' action or another version?
   Would you please give me some reference for adding a column to a row?

 Thanks & Best Regards
 Mike
 
 
 2012/8/24 Sonal Goyal sonalgoy...@gmail.com
 
 If you are adding a new column to the team column family, I don't think
 multi-versioning comes into the picture. Multi-versioning saves copies of
 the values of a particular cell, but you are creating a new cell within the
 same row.
 
 
 Best Regards,
 Sonal
 Crux: Reporting for HBase https://github.com/sonalgoyal/crux
 Nube Technologies http://www.nubetech.co
 
 http://in.linkedin.com/in/sonalgoyal
 
 
 
 
 
 On Fri, Aug 24, 2012 at 8:07 AM, jing wang happygodwithw...@gmail.com
 wrote:
 
 Hi 'user',

 This is a many-to-many question; I also referred to the HBase design FAQ,
 http://wiki.apache.org/hadoop/Hbase/FAQ_Design.
 What I want to do is design a 'user' table, including the user's basic
 information (columnFamily1) and the team names the user has joined
 (columnFamily2).
 When a user joins a new team, I want to update the 'user' table to
 add a column to 'columnFamily2', so that when getting the user, I get all
 the team names the user has joined.
 Yet I don't want to put duplicate records, known as multi-versions;
 each user should have only one record.
 What should I do?

 Any advice will be appreciated!


 Thanks & Best Regards
 Mike
 
 


Re: limit on number of blocks per HFile and files per region

2012-08-23 Thread Pamecha, Abhishek
Thanks Jean-Daniel. I did go through the documentation, but there was no clear
answer about interleaving puts from two or more row keys, or whether there was
a way to reserve contiguous blocks per rowkey. I made some deductions but
clearly I was incorrect in some of them, as you pointed out too. The questions
were partly validation and partly doubt-riddance. :)

Thanks
Abhishek 

i Sent from my iPad with iMstakes 

On Aug 23, 2012, at 17:19, Jean-Daniel Cryans jdcry...@apache.org wrote:

 Inline. In general I'd recommend you read the documentation more
 closely and/or get the book.
 
 J-D
 
 On Thu, Aug 23, 2012 at 4:21 PM, Pamecha, Abhishek apame...@x.com wrote:
 1. Can there be multiple row keys per block and then per HFile? Or is
 a block or HFile dedicated to a single row key?
 
 Multiple row keys per HFile block. Read
 http://hbase.apache.org/book.html#hfilev2
 
 I have a scenario where, for the same column family, some rowkeys will have
 very wide rows, say rowkey W, and some rowkeys will have very narrow rows,
 say rowkey N. In my case, puts for rowkeys W and N are interleaved with a
 ratio of say 90 rowkeyW puts vs 10 rowkeyN puts. On the get side, my app
 works on getting data for a single rowkey at a time.
 Will that mean that for a rowkeyN, the entries will be scattered across
 regions on that same region server, given there are interleaved puts? Or is
 there a way I can enforce contiguous writes to a region/HFile reserved for
 rowkey N? This way, I can leverage the block cache and have the entire/most
 of rowkeyN fit in there for that session.
 
 The row keys are sorted according to their lexicographical order. See
 http://hbase.apache.org/book.html#row
 
 If you don't want the big rows coexisting with the small rows, put
 them in different column families or different tables.
 
 2.   Is there a limit on number of HFiles that can exist per region?
 
 I think a slightly wrong understanding of HFiles prompted you to ask
 this; my previous answers probably mean you don't need this one
 anymore, but here it is just in case:
 
 The HFiles are compacted when reaching
 hbase.hstore.compactionThreshold (default of 3) per family, and you
 can have no more than hbase.hstore.blockingStoreFiles (default of 7).
 
  Basically, on what criteria does a rowkey's data get split across two
 regions [on the same region server]? I am assuming there can be many
 regions per region server, and multiple regions for the same table can
 belong to the same region server.
 
 A row key only lives in a single region since the regions are split
 based on row keys.
 
 3.   Also, is there a limit on the number of blocks that are created per 
 HFile?
 
 No.
 
 What determines whether a split is required?
 
 hbase.hregion.max.filesize, also see
 http://hbase.apache.org/book.html#disable.splitting if you want to
 change that.


Re: hbase many-to-many design

2012-08-23 Thread jing wang
Hi Abhishek,

   Got it. Thank you very much.


Thanks & Best Regards
Mike


2012/8/24 Pamecha, Abhishek apame...@x.com

 Hi Jing,

 You can add a new column unannounced. This means your current put does
 not have to know which other columns are already present in the row or,
 for that matter, in the table. You just issue a put command as if it were
 your first one, and the column will be added.

 Unlike an RDBMS, there are no update or alter table commands you need to
 execute to add a new column.

 If the column you are adding already exists, then a new version of the
 value you put is stored.

 Thanks
 Abhishek


 i Sent from my iPad with iMstakes

 On Aug 23, 2012, at 21:34, jing wang happygodwithw...@gmail.com wrote:

  Hi Sonal,
 
    Thanks for your reply.
    How do I add a new column to an existing column family? The method I want
  to try uses 3 steps: first get the record, construct a new Put using the
  previously fetched record's columnFamily2, then delete the old record in
  HBase, and finally put the newly constructed Put into HBase. I really don't
  think this is a good way.
    If another Put including a new column is written to HBase, is this an
  'update' action or another version?
    Would you please give me some reference for adding a column to a row?

  Thanks & Best Regards
  Mike
 
 
  2012/8/24 Sonal Goyal sonalgoy...@gmail.com
 
  If you are adding a new column to the team column family, I don't think
  multi-versioning comes into the picture. Multi-versioning saves copies of
  the values of a particular cell, but you are creating a new cell within
  the same row.
 
 
  Best Regards,
  Sonal
  Crux: Reporting for HBase https://github.com/sonalgoyal/crux
  Nube Technologies http://www.nubetech.co
 
  http://in.linkedin.com/in/sonalgoyal
 
 
 
 
 
  On Fri, Aug 24, 2012 at 8:07 AM, jing wang happygodwithw...@gmail.com
  wrote:
 
  Hi 'user',
 
   This is a many-to-many question; I also referred to the HBase design
   FAQ, http://wiki.apache.org/hadoop/Hbase/FAQ_Design.
   What I want to do is design a 'user' table, including the user's
   basic information (columnFamily1) and the team names the user has
   joined (columnFamily2).
   When a user joins a new team, I want to update the 'user' table to
   add a column to 'columnFamily2', so that when getting the user, I get
   all the team names the user has joined.
   Yet I don't want to put duplicate records, known as multi-versions;
   each user should have only one record.
   What should I do?

  Any advice will be appreciated!


   Thanks & Best Regards
   Mike
 
 



Re: hbase many-to-many design

2012-08-23 Thread jing wang
Hi Sonal,

   Thanks again. I had a misunderstanding of column-oriented HBase.
   As Abhishek said, this solves my problem:

"You can add a new column unannounced. This means your current put does
not have to know which other columns are already present in the row or,
for that matter, in the table. You just issue a put command as if it were
your first one, and the column will be added.

Unlike an RDBMS, there are no update or alter table commands you need to
execute to add a new column.

If the column you are adding already exists, then a new version of the
value you put is stored."


Thanks,
Mike

2012/8/24 Sonal Goyal sonalgoy...@gmail.com

 Sorry, is this what you want? I created a table with two column families. I
 added one row with a cell in column family cf1, qualifier team1. Then I
 added a new team to cf1 with qualifier team2.

 hbase(main):001:0> create 'multi','cf1','cf2'
 0 row(s) in 1.6240 seconds

 hbase(main):002:0> put 'multi','row1','cf1:team1','firstTeam'
 0 row(s) in 0.0880 seconds

 hbase(main):003:0> scan 'multi'
 ROW                  COLUMN+CELL
  row1                column=cf1:team1, timestamp=1345783824219, value=firstTeam
 1 row(s) in 0.0540 seconds

 hbase(main):004:0> put 'multi','row1','cf1:team2','secondTeam'
 0 row(s) in 0.0060 seconds

 hbase(main):005:0> scan 'multi'
 ROW                  COLUMN+CELL
  row1                column=cf1:team1, timestamp=1345783824219, value=firstTeam
  row1                column=cf1:team2, timestamp=1345783846821, value=secondTeam
 1 row(s) in 0.0250 seconds


 Best Regards,
 Sonal
 Crux: Reporting for HBase https://github.com/sonalgoyal/crux
 Nube Technologies http://www.nubetech.co

 http://in.linkedin.com/in/sonalgoyal





 On Fri, Aug 24, 2012 at 10:03 AM, jing wang happygodwithw...@gmail.com
 wrote:

  Hi Sonal,
 
  Thanks for your reply.
  How do I add a new column to an existing column family? The method I want
  to try uses 3 steps: first get the record, construct a new Put using the
  previously fetched record's columnFamily2, then delete the old record in
  HBase, and finally put the newly constructed Put into HBase. I really don't
  think this is a good way.
  If another Put including a new column is written to HBase, is this an
  'update' action or another version?
  Would you please give me some reference for adding a column to a row?

  Thanks & Best Regards
  Mike
 
 
  2012/8/24 Sonal Goyal sonalgoy...@gmail.com
 
   If you are adding a new column to the team column family, I don't think
   multi-versioning comes into the picture. Multi-versioning saves copies of
   the values of a particular cell, but you are creating a new cell within
   the same row.
  
  
   Best Regards,
   Sonal
   Crux: Reporting for HBase https://github.com/sonalgoyal/crux
   Nube Technologies http://www.nubetech.co
  
   http://in.linkedin.com/in/sonalgoyal
  
  
  
  
  
   On Fri, Aug 24, 2012 at 8:07 AM, jing wang happygodwithw...@gmail.com
   wrote:
  
Hi 'user',
   
     This is a many-to-many question; I also referred to the HBase design
    FAQ, http://wiki.apache.org/hadoop/Hbase/FAQ_Design.
     What I want to do is design a 'user' table, including the user's
    basic information (columnFamily1) and the team names the user has
    joined (columnFamily2).
     When a user joins a new team, I want to update the 'user' table to
    add a column to 'columnFamily2', so that when getting the user, I get
    all the team names the user has joined.
     Yet I don't want to put duplicate records, known as multi-versions;
    each user should have only one record.
     What should I do?

    Any advice will be appreciated!


    Thanks & Best Regards
    Mike