On Thu, Dec 12, 2013 at 2:44 AM, Gagan Juneja <[email protected]>wrote:
> Hi Aaron, > Thanks! this address most of my queries. I have some more queries > > 1. How we handle replication of shards like if some shard server is > down the how we can address requests for shards this Shard server was > holding. > Basically each server is given a portion of the shards to serve. If that server goes down, the layout of the table is re-calculated and of the remaining servers each get some shard(s) to serve. The logic behind the layout calculation is in the "org.apache.blur.manager.indexserver.MasterBasedDistributedLayoutFactory" class. ZooKeeper is used by the shard servers to know when a server goes offline. > 2. Do we use HDFS replication any where? Means do we use HDFS > replicated indexes any where? > All indexes are stored in HDFS (for a normal install) and HDFS transparently replicates all the block of a given file 3 times (by default). So we make use of it, but it's a normal function in HDFS. Aaron > > Regards, > Gagan > > On Wed, Dec 11, 2013 at 9:48 PM, Aaron McCurry <[email protected]> wrote: > > On Wed, Dec 11, 2013 at 8:53 AM, Gagan Juneja <[email protected] > >wrote: > > > >> Hi Team, > >> I just wanted to know how shards are created on shard servers, how the > >> server is selected for a query and how replication comes into picture > >> once any shard server goes down. It would be better if you can take > >> simple analogy of any example table and describe this. > >> > > > > This is a great question and we should put the answer(s) into the > > documentation. > > > > I will walk through a basic setup of creating a table, adding some data, > > then querying that data. > > > > CREATE TABLE: > > > > When a client issues a create table command to a controller the following > > happens. > > > > Controller: Receives the message > > Controller: Creates the ZooKeeper node that describes the table. > Typically > > in /blur/clusters/default/tables/<tablename> > > Controller: Controller then begins polling the shard servers using the > > shardServerLayoutState waiting on the shards to be OPEN > > > > While that is happening on the controller side the shard servers react > > accordingly. > > > > Shard: Sees the table get created through a ZooKeeper watcher. And adds > > the table to the online table list. > > Shard: Then it calculates the layout that the shard server should be > > serving by using the MasterBasedDistributedLayoutFactory class. > > Shard: Then once every 10 seconds the warmup thread fires and opens any > > shards that should be open on the given shard server. > > NOTE: If the shard tries to access the table the before the warmup > fires, > > the missing shards are opened then. > > Shard: Once the shards are open the state for shard will change to OPEN > and > > the controller will no longer block and it will return to the client. > > > > MUTATE: > > > > Client submits a mutation command to the controller. > > The controller reads the rowid from the mutation and figures out (through > > hashing the rowid) what shard and the server that it's being served from > > and forwards the mutation to that shard server. > > The shard server receives the mutation and figures out what shard it is > > meant for and applies the mutation to the given index. > > > > QUERY: > > > > Client submits a query request to the controller. > > The controller then sprays all the query to the shard servers. > > The shard servers then issues the query to all the shards within the > server > > (each one in it's own thread) and the top N number of hits from each > shard > > are returned to the request thread. > > Then the top N number of hits are returned the controller. > > Then the controller receives responses from all the shards servers and > > figures out the top N hits from all the servers. > > The data from the top N hits are then fetch in parallel from the > controller > > back to the shard servers for the top N hits. > > After the data is received back to the controller the response is then > sent > > back to the client. > > > > Hope this helps to explain at a high level wants going on. Let me know > if > > you have anymore questions. > > > > Aaron > > > > > > > > > >> > >> > >> Regards, > >> Gagan > >> >
