Re: Question on production readiness, deployment, data of BookKeeper / Hedwig

2010-10-08 Thread Jake Mannix
Hi Ben,

  To follow up on this question, which seems to be asking primarily about
Hedwig (and I guess the answer is: it's not in production yet, anywhere),
with one more about BookKeeper: is BookKeeper used in production as a WAL
(or for any other use) anywhere?  If so, for what uses?

  Any info (even anecdotal) would be great!

  -jake


Re: Question on production readiness, deployment, data of BookKeeper / Hedwig

2010-10-08 Thread Benjamin Reed
your guess is correct :) for bookkeeper and hedwig we released early
to do the development in public. originally we developed bookkeeper as a
distributed write ahead log for the NameNode in HDFS, but while we were
able to get a proof of concept going, the structure of the NameNode code
makes it difficult to integrate well. we are currently working on fixing
the write ahead layer of the NameNode, which is taking a lot of time. in
the meantime we applied bookkeeper to pub/sub and came up with hedwig,
which is where most of our efforts are focused while the slow process of
pushing changes to the NameNode proceeds.


ben

On 10/08/2010 02:32 PM, Jake Mannix wrote:

Hi Ben,

   To follow up on this question, which seems to be asking primarily about
Hedwig (and I guess the answer is: it's not in production yet, anywhere),
with one more about BookKeeper: is BookKeeper used in production as a WAL
(or for any other use) anywhere?  If so, for what uses?

   Any info (even anecdotal) would be great!

   -jake


Re: Question on production readiness, deployment, data of BookKeeper / Hedwig

2010-10-07 Thread Benjamin Reed

 hi amit,

sorry for the late response. this week has been crunch time for a lot of 
different things.


here are your answers:

production

1. it is still in prototype phase. we are evaluating different aspects,
but there is still some work to do to make it production ready. we also
need to get an engineering team to sign up to stand behind it.


2. it's a generic pub/sub message bus. in some sense it is really a 
datacenter solution with extensions for multi-data center operation, so 
it is perfectly reasonable to use it in a single datacenter setting.


3. yeah, we have removed the hw.bash script. it had some hardcoded
assumptions and was a swiss army knife on steroids. we have been
breaking it up into simpler scripts.


4. session expiry really represents a fundamental connectivity problem,
so both bk and hedwig restart the component that gets the expired
session error.
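
A minimal sketch of the restart-on-expiry behaviour described in answer 4, not taken from the BookKeeper or Hedwig code. In ZooKeeper an expired session cannot be resumed, so the client has to open a new one; the sketch assumes that the simplest safe reaction is to throw away the whole component together with its in-memory state and build a fresh one. `ToySession`, `Component`, and `Supervisor` are hypothetical names used only for illustration.

```python
import itertools


class SessionExpired(Exception):
    """Raised by the toy session once it has been marked as expired."""


class ToySession:
    """Stand-in for a coordination-service session (hypothetical, not the ZooKeeper API)."""

    _ids = itertools.count(1)

    def __init__(self):
        self.session_id = next(ToySession._ids)
        self.expired = False

    def check(self):
        # An expired session is never revived; every later call fails.
        if self.expired:
            raise SessionExpired(self.session_id)


class Component:
    """A unit of the server whose in-memory state is tied to one session."""

    def __init__(self):
        self.session = ToySession()
        self.state = {}

    def do_work(self, key, value):
        self.session.check()
        self.state[key] = value


class Supervisor:
    """Replaces the component wholesale when it reports an expired session."""

    def __init__(self, factory):
        self.factory = factory
        self.component = factory()
        self.restarts = 0

    def run(self, key, value):
        """Return True if the work was done, False if the component was restarted."""
        try:
            self.component.do_work(key, value)
            return True
        except SessionExpired:
            # Assumption: state built under the old session is not trusted,
            # so a new component starts with a new session and empty state.
            self.component = self.factory()
            self.restarts += 1
            return False
```

The supervisor deliberately does not retry the failed operation; whether a caller should retry after a restart is not something the thread specifies.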


data

1. yes.

2. once all subscribers have consumed a message there is a background 
process that cleans it up.


3. yes there is a replication factor and we ensure replication on writes
and there is a recovery tool to recover bookies that fail. we don't have
to worry about conflicts because there is only a single writer for a
given ledger. because of this we do not need to do quorum reads.
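
A toy model of data answers 2 and 3, offered as a sketch rather than as the actual BookKeeper or Hedwig implementation. It assumes an in-memory ledger whose single writer assigns each entry id exactly once and, as a simplification, copies every entry to every replica. Under those assumptions two replicas can never hold different payloads for the same id, so a read can be served by any one replica that has the entry. Cleanup removes only entries that every registered subscriber has consumed. All class and method names are hypothetical.

```python
class ToyBookie:
    """In-memory stand-in for one storage node (not the BookKeeper API)."""

    def __init__(self):
        self.entries = {}  # entry id -> payload
        self.alive = True


class ToyLedger:
    """Append-only log with exactly one writer and a fixed set of replicas."""

    def __init__(self, bookies):
        self.bookies = bookies
        self.next_id = 0    # advanced only by the single writer
        self.first_id = 0   # lowest entry id not yet cleaned up
        self.consumed = {}  # subscriber -> highest entry id consumed

    def add_entry(self, payload):
        # Simplification: the write succeeds only if every replica is up.
        if not all(b.alive for b in self.bookies):
            raise RuntimeError("a replica is down; the toy writer gives up")
        entry_id = self.next_id
        for bookie in self.bookies:
            bookie.entries[entry_id] = payload
        self.next_id += 1
        return entry_id

    def read_entry(self, entry_id):
        # No quorum read: the first live replica holding the entry is enough,
        # because the single writer never wrote a conflicting value.
        for bookie in self.bookies:
            if bookie.alive and entry_id in bookie.entries:
                return bookie.entries[entry_id]
        raise KeyError(entry_id)

    def subscribe(self, subscriber):
        # A new subscriber has consumed nothing, which blocks all cleanup.
        self.consumed.setdefault(subscriber, -1)

    def consume(self, subscriber, entry_id):
        self.consumed[subscriber] = max(entry_id, self.consumed.get(subscriber, -1))

    def cleanup(self):
        """Delete entries consumed by all subscribers; return how many were removed."""
        if not self.consumed:
            return 0  # toy choice: with no subscribers, keep everything
        low = min(self.consumed.values())
        new_first = max(self.first_id, low + 1)
        for entry_id in range(self.first_id, new_first):
            for bookie in self.bookies:
                bookie.entries.pop(entry_id, None)
        removed = new_first - self.first_id
        self.first_id = new_first
        return removed
```

In the real system the cleanup runs as a background process and failed bookies are repaired by a recovery tool; neither is modelled here.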


documentation

yes, this is something we need to work on. i'll see if i can push out
some of our hello world applications. we'd also like to put a JMS API on
top so that the API is more familiar (and documented :). i don't want to
delay the answers to your other questions, so let me answer that
HedwigSubscriber is the class for clients. the other classes are
internal. (for cross data center operation, hubs use a special kind of
subscription to do cross data center updates.)


ben

On 10/05/2010 10:32 PM, amit jaiswal wrote:

Hi,

In the Hedwig talk (http://vimeo.com/13282102), it was mentioned that the primary
use case for Hedwig comes from the distributed key-value store PNUTS in Yahoo!,
but it was also said that the work is new.

Could you please comment on the following:

Production readiness / Deployment
1. What is the production readiness of Hedwig / BookKeeper? Is it being used
anywhere (like in PNUTS)?
2. Is Hedwig designed to be used as a generic message bus or only for
multi-datacenter operations?
3. Hedwig installation and deployment is done through a script hw.bash, but that
is difficult to use, especially in a production environment. Are there any other
packages available that can simplify the deployment of Hedwig?
4. How does BK/Hedwig handle ZooKeeper session expiry?

Data Deletion, Handling data loss, Quorum
1. Does BookKeeper support deletion of old log entries which have been consumed?
2. How does Hedwig handle the case when all subscribers have consumed all the
messages? In the talk, it was said that a subscriber can come back after hours,
days or weeks. Is there any data retention / expiration policy for the data that
is published?
3. How does Hedwig handle data loss? There is a replication factor, and a write
operation must be accepted by a majority of the bookies, but how are data
conflicts handled? Is there any possibility of data conflict at all? Is the
replication only for recovery? When the hub is reading data from bookies, does
it read from all the bookies to satisfy a quorum read?

Code
What is the difference between PubSubServer, HedwigSubscriber, and
HedwigHubSubscriber? Is there any HelloWorld program that simply illustrates how
to instantiate a Hedwig client and publish/consume messages? (The HedwigBenchmark
class is helpful, but I was looking for something like API documentation.)

-regards
Amit