Thanks, Jonathan. This question arose while I was exploring the design and
workings of HBase. The use of the ZooKeeper session wasn't clear from the
paper or the source code, hence the confusion.
Thanks,
Naresh.
On 10/03/2010 12:14 AM, Jonathan Gray wrote:
From the client's point of view, ZooKeeper is currently used to locate the
active master, in addition to the current root region location (as you
describe).
This could be a "lookup" but any connection to ZK requires opening a session
(that's my understanding at least). Rather than establishing a new connection/session on
every lookup, it is kept open by the client. There should not be any measurable overhead
to this idle connection, or at least I've never seen or heard of one. And there will
only be one connection per process (well, per Configuration) so it would take a lot of
client processes to cause any problems.
And by retaining the connection, we actually get events in the client to point
to the new master/root region when they do change, which is one benefit of
keeping a session and watches open to ZK.
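
To make the difference between a one-off lookup and a retained session with watches concrete, here is a minimal toy sketch. It does not use the real ZooKeeper or HBase client APIs: `ToyCoordinator`, `ToyClient`, and the `/master` path are all hypothetical names made up for illustration. The only behaviour it borrows from ZooKeeper is that a watch fires once and has to be re-registered on the next read.

```python
class ToyCoordinator:
    """Hypothetical in-memory stand-in for a ZooKeeper-like service."""

    def __init__(self):
        self.data = {}      # path -> value
        self.watches = {}   # path -> list of one-shot callbacks

    def get(self, path, watch=None):
        # Reading with a watch registers a one-shot callback for that path.
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data.get(path)

    def set(self, path, value):
        self.data[path] = value
        # Remove the registered watches first, then fire each exactly once.
        for callback in self.watches.pop(path, []):
            callback(path)


class ToyClient:
    """Keeps one long-lived 'session' and a cached location for a path."""

    def __init__(self, coordinator, path):
        self.coordinator = coordinator
        self.path = path
        self.reads = 0
        self.location = None
        self._refresh(path)

    def _refresh(self, path):
        # Re-read the value and re-register the watch in the same call.
        self.reads += 1
        self.location = self.coordinator.get(self.path, watch=self._refresh)


zk = ToyCoordinator()
zk.set("/master", "host1:60000")       # illustrative path and address
client = ToyClient(zk, "/master")
zk.set("/master", "host2:60000")       # simulated master failover
print(client.location, client.reads)
```

In this model the client reads the location once at startup and once per change, rather than once per request, which is the benefit of keeping the session open described above.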
The new master design in trunk moves region transitions into ZK, so it would
actually be possible to send clients invalidations / updated region locations.
There are also some who have thought about putting all assignment information
into ZK at some point, but this is just a (contentious) idea for now.
What exactly are your concerns about ZK sessions in the client?
JG
-----Original Message-----
From: Naresh Rapolu [mailto:[email protected]]
Sent: Friday, October 01, 2010 7:52 PM
To: [email protected]
Subject: Re: Client's cache invalidation
If ZooKeeper isn't invalidating the client's cache, then what is the
actual use of a ZooKeeper session? Is it needed only to look up the
region location of the "root tablet" of the META table, which is later used
to recursively scan the entire META table?
Can't that just be a lookup query to ZooKeeper instead of a session?
I've looked into the Bigtable paper; the above use case seems to be the
only one mentioned. Am I missing something?
Thanks,
Naresh.
On 10/01/2010 09:40 PM, Jonathan Gray wrote:
Yes. RegionServers will throw a NotServingRegionException. This, in
turn, will cause the client to grab the location from META again.
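
The retry behaviour described above can be sketched with a small toy model. This is not the HBase client code: `NotServingRegionError` stands in for `NotServingRegionException`, and `RegionServer`, `Meta`, and `Client` are hypothetical classes written only to show the flow of "use cached location, get rejected, drop the entry, ask META again".

```python
class NotServingRegionError(Exception):
    """Stand-in for HBase's NotServingRegionException."""


class RegionServer:
    def __init__(self):
        self.regions = {}   # (start_key, end_key) -> {row: value}

    def get(self, region, row):
        # A server rejects requests for regions it no longer serves.
        if region not in self.regions:
            raise NotServingRegionError(region)
        return self.regions[region][row]


class Meta:
    """Toy catalog: maps key ranges to the server holding them."""

    def __init__(self):
        self.entries = []   # list of ((start_key, end_key), server)
        self.lookups = 0

    def locate(self, row):
        self.lookups += 1
        for (start, end), server in self.entries:
            if start <= row < end:
                return (start, end), server
        raise KeyError(row)


class Client:
    def __init__(self, meta):
        self.meta = meta
        self.cache = {}     # row -> (region, server)

    def get(self, row, max_retries=3):
        for _ in range(max_retries):
            if row not in self.cache:
                self.cache[row] = self.meta.locate(row)
            region, server = self.cache[row]
            try:
                return server.get(region, row)
            except NotServingRegionError:
                # Stale entry: drop it so the next attempt asks META again.
                del self.cache[row]
        raise RuntimeError("location still stale after retries")


rs1, rs2, meta = RegionServer(), RegionServer(), Meta()
rs1.regions[("a", "z")] = {"b": 1, "m": 2}
meta.entries = [(("a", "z"), rs1)]
client = Client(meta)
first = client.get("m")                 # fills the client cache

# Simulated split: the parent region goes away, daughters are registered.
del rs1.regions[("a", "z")]
rs1.regions[("a", "k")] = {"b": 1}
rs2.regions[("k", "z")] = {"m": 2}
meta.entries = [(("a", "k"), rs1), (("k", "z"), rs2)]
second = client.get("m")                # rejected once, then relocated
```

Nothing notifies the client of the split in this model. The stale cache entry is only discovered when the old server rejects the request, which matches the answer given above.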
-----Original Message-----
From: Naresh Rapolu [mailto:[email protected]]
Sent: Friday, October 01, 2010 5:35 PM
To: [email protected]
Subject: Client's cache invalidation
Hello,
How does the client's cache of "region-location" (.META table) get
invalidated when a region server splits regions? Does ZooKeeper abort
the client session or inform it of staleness? How is consistency
ensured in the time interval between splits being registered in the
.META table and the client cache being refreshed? I'm guessing the
region server would reject operations on rows it isn't responsible for.
Am I correct?
Thanks,
Naresh.