> 1) Consistency guarantees for reads in HBase:
>        What happens when you issue a direct bulk incremental update without 
> using the API?
>   Say a new storefile is created in a region through the bulk tool. Existing
> scanners will not see the new updates, but new scanners will. Is this
> correct?
>    And what happens to the block cache? Are its blocks marked dirty after
> the new upload?

I don't have ops experience with the bulk uploader, so I can't say.

>
> 2) Regionserver failures:
>     I know that when a region server is running properly but is unreachable
> for some time (a few minutes), ZooKeeper will change its state to expired.
> And when the RS is reachable again, it will check its ZooKeeper state, learn
> that it is considered dead, and throw an exception. Can you let me know
> if I am correct?
>    What if a regionserver is unreachable for a longer time (say an hour) and 
> then is again reachable? Does it have the same effect as the previous case?

When the region server comes back after a GC pause (the most likely cause
of an apparent partition, IMO), it will try to heartbeat to the master and
the ZooKeeper ensemble. Both attempts will fail with an exception, and the
region server will do an emergency shutdown. If the machine is partitioned
from the rest of the cluster, the process will also shut down after a
few retries. In any case, you will need to restart the process or have
a cluster management tool that does it for you.
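
To make the sequence concrete, here is a rough toy model of the behaviour
described above. It is not HBase or ZooKeeper code: the class names, the
60-second timeout, and the lazy expiry check are all assumptions made for
illustration (the real ensemble expires sessions on its own clock). It only
shows that once the silence exceeds the session timeout, the length of the
outage no longer matters.

```python
class SessionExpired(Exception):
    """Raised when a server heartbeats on a session already declared dead."""


class Coordinator:
    """Toy stand-in for the ZooKeeper ensemble (hypothetical, not a real API)."""

    def __init__(self, session_timeout):
        self.session_timeout = session_timeout  # seconds; illustrative value
        self.last_seen = {}
        self.expired = set()

    def register(self, server, now):
        self.last_seen[server] = now
        self.expired.discard(server)

    def heartbeat(self, server, now):
        # Expiry is permanent: once the silence exceeds the timeout, the
        # session stays dead regardless of how long the server was away.
        silent_for = now - self.last_seen[server]
        if server in self.expired or silent_for > self.session_timeout:
            self.expired.add(server)
            raise SessionExpired(server)
        self.last_seen[server] = now


class RegionServer:
    """Toy region server that aborts when its session has expired."""

    def __init__(self, name, coordinator, now=0):
        self.name = name
        self.coordinator = coordinator
        self.running = True
        coordinator.register(name, now)

    def tick(self, now):
        """Send one heartbeat; shut down if the session is gone."""
        if not self.running:
            return
        try:
            self.coordinator.heartbeat(self.name, now)
        except SessionExpired:
            # Emergency shutdown: only an external restart brings it back.
            self.running = False


def survives_pause(pause_seconds, session_timeout=60):
    """Return True if the server is still running after the given silence."""
    coordinator = Coordinator(session_timeout)
    server = RegionServer("rs1", coordinator, now=0)
    server.tick(now=pause_seconds)
    return server.running
```

In this model, survives_pause(10) keeps the server alive, while
survives_pause(300) and survives_pause(3600) both end with the server shut
down and waiting for a restart.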

J-D
