Re: General Question about Zookeeper

2009-06-25 Thread Harold Lim

Hi Henry,

Actually, I'm currently working on a research project. Basically, we're 
interested in the feasibility of decentralized location-based services. All 
commercially available mobile location-based services (including social 
networks) currently have a "central" server that stores all of the users' 
information. Our main motivation is the security/privacy implications of this.

Our approach is to still have some infrastructure, but to give it only a 
limited role (we definitely don't want the servers to store users' personal 
information). 


-Harold





--- On Thu, 6/25/09, Henry Robinson  wrote:

> From: Henry Robinson 
> Subject: Re: General Question about Zookeeper
> To: zookeeper-user@hadoop.apache.org
> Date: Thursday, June 25, 2009, 2:40 PM
> What else do you want to use ZK for - just leader election? It doesn't
> require so much a centralised server (which implies kind of a single point
> of failure) as a small amount of fixed infrastructure. If you have a highly
> dynamic network - an ad-hoc network like a social net - ZK will likely not
> be appropriate. There are leader election algorithms that work better in
> totally ad-hoc networks, and other co-ordination models that are better
> suited. In particular, you may not want persistence in the sense that later
> instances of a consensus algorithm might not need to see the results of
> previous ones, removing the need to keep logs synchronised.
> 
> However, if you have five or so servers that you can dedicate to
> coordination, ZooKeeper should work very well. I'm really curious about your
> use case - is there more you can explain?
> 
> Henry
> 





Re: General Question about Zookeeper

2009-06-25 Thread Henry Robinson
What else do you want to use ZK for - just leader election? It doesn't
require so much a centralised server (which implies kind of a single point
of failure) as a small amount of fixed infrastructure. If you have a highly
dynamic network - an ad-hoc network like a social net - ZK will likely not
be appropriate. There are leader election algorithms that work better in
totally ad-hoc networks, and other co-ordination models that are better
suited. In particular, you may not want persistence in the sense that later
instances of a consensus algorithm might not need to see the results of
previous ones, removing the need to keep logs synchronised.

However, if you have five or so servers that you can dedicate to
coordination, ZooKeeper should work very well. I'm really curious about your
use case - is there more you can explain?

Henry

On Thu, Jun 25, 2009 at 7:16 PM, Harold Lim  wrote:

>
> Hi Gustavo,
>
> Actually, in my case, we have a fully decentralized service. Something like
> where you have users in a social network. Originally, we were thinking of
> using a distributed consensus algorithm (e.g., Paxos) to perform some
> functionalities (e.g., leader election).
>
> Then, I read about ZooKeeper and was thinking of using ZooKeeper for leader
> election instead. However, that means that we're introducing a "central"
> server/service to the architecture.
>
> Currently, I'm just thinking of some of the original functionalities and
> how much of these functionalities I can offload to ZooKeeper, without
> breaking the original privacy/security motivation.
>
>
> -Harold
>
>
>
>
>
>
>
>


Re: General Question about Zookeeper

2009-06-25 Thread Harold Lim

Thanks. That makes sense.


-Harold

--- On Thu, 6/25/09, Mahadev Konar  wrote:

> From: Mahadev Konar 
> Subject: Re: General Question about Zookeeper
> To: zookeeper-user@hadoop.apache.org
> Date: Thursday, June 25, 2009, 2:29 PM
> Hi Harold,
>   Let me explain the whole concept of ZooKeeper ACLs.
> 
> 1) ZooKeeper servers are run under some user id, say X.
> 2) ZooKeeper clients use the ZooKeeper client library to create znodes
> on the ZooKeeper servers. They could be running as user id C. They can supply
> ACLs when creating such nodes to restrict access to them. These ACLs
> have NOTHING to do with user id X or user id C. The access controls are
> independent of any user id the client or the server is running as.
> 3) User X can obviously recreate the ZooKeeper database, since he has access to
> the local filesystem data that ZooKeeper writes its snapshots/txns into.
> 
> 
> Hope this helps.
> Mahadev
>  
> 





Re: General Question about Zookeeper

2009-06-25 Thread Mahadev Konar
Hi Harold,
  Let me explain the whole concept of ZooKeeper ACLs.

1) ZooKeeper servers are run under some user id, say X.
2) ZooKeeper clients use the ZooKeeper client library to create znodes
on the ZooKeeper servers. They could be running as user id C. They can supply
ACLs when creating such nodes to restrict access to them. These ACLs
have NOTHING to do with user id X or user id C. The access controls are
independent of any user id the client or the server is running as.
3) User X can obviously recreate the ZooKeeper database, since he has access to
the local filesystem data that ZooKeeper writes its snapshots/txns into.


Hope this helps.
Mahadev
 
On 6/25/09 11:20 AM, "Harold Lim"  wrote:

> 
> Hi Henry,
> 
> Does that mean for example, if I own the Zookeeper server and physical machine
> and have lots of clients using this Zookeeper server, I can simply look at the
> logfiles and snapshot files and see all of the information created by those
> clients?
> 
> 
> Thanks,
> Harold
> 
> 
> 
>   
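
Mahadev's point that ZooKeeper ACLs are unrelated to operating-system user ids
can be illustrated with ZooKeeper's built-in "digest" ACL scheme, in which an
identity is the string "username:base64(sha1(username:password))". The sketch
below only computes such an identity string; the function name and the
credentials are hypothetical and chosen for illustration, and nothing in it
talks to a real server.

```python
import base64
import hashlib


def digest_acl_id(username: str, password: str) -> str:
    """Build the identity string used by ZooKeeper's "digest" ACL scheme.

    The id has the form "<username>:<base64(sha1('<username>:<password>'))>".
    Nothing here depends on the OS user running the client or the server.
    """
    raw = f"{username}:{password}".encode("utf-8")
    hashed = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{username}:{hashed}"


# Hypothetical credentials, for illustration only.
acl_id = digest_acl_id("alice", "s3cret")
print(acl_id)
```

The same credentials always yield the same identity, whichever machine or
OS account the client happens to run under.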



Re: General Question about Zookeeper

2009-06-25 Thread Mahadev Konar
Hi Harold,
 As Henry mentioned, what ACLs give you is a way to prevent access to znodes.
If someone has access to ZooKeeper's data stored on the ZooKeeper server
machines, they should be able to reconstruct the data and read it (using
ZooKeeper's deserialization code).

I am not sure what kind of security model you are interested in, but for
ZooKeeper we expect the server-side data stored on local disks to be
inaccessible to normal users and accessible only to admins.

Hope this helps.
Thanks
mahadev

On 6/25/09 11:01 AM, "Henry Robinson"  wrote:

> Hi Harold,
> 
> Each ZooKeeper server stores updates to znodes in logfiles, and periodic
> snapshots of the state of the datatree in snapshot files.
> 
> A user who has the same permissions as the server will be able to read these
> files, and can therefore recover the state of the datatree without the ZK
> server intervening. ACLs are applied only by the server; there is no
> filesystem-level representation of them.
> 
> Henry
> 
> 
> 
> On Thu, Jun 25, 2009 at 6:48 PM, Harold Lim  wrote:
> 
>> 
>> Hi All,
>> 
>> How does ZooKeeper store data/files?
>> From reading the docs, clients can put ACLs on files/znodes to limit
>> read/write/create access by other clients. However, I was wondering how these
>> znodes are stored on the ZooKeeper servers.
>> 
>> I am interested in a security aspect of ZooKeeper, where the clients and
>> the servers don't necessarily belong to the same "group". If a client
>> creates a znode in ZooKeeper, can the person who owns the ZooKeeper
>> server simply look at its filesystem and read the data (out-of-band, not
>> using a client, simply browsing the file system of the machine hosting the
>> ZooKeeper server)?
>> 
>> 
>> Thanks,
>> Harold
>> 
>> 
>> 
>> 



Re: General Question about Zookeeper

2009-06-25 Thread Ted Dunning
Yes.

But if you are using ZK only for leader election, then all that this person
gets to know is who the leader is.  Presumably, that is public or at least
widespread knowledge in any case.

On Thu, Jun 25, 2009 at 11:20 AM, Harold Lim  wrote:

> Does that mean for example, if I own the Zookeeper server and physical
> machine and have lots of clients using this Zookeeper server, I can simply
> look at the logfiles and snapshot files and see all of the information
> created by those clients?
>
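
Ted's observation can be made concrete with a small in-memory simulation of
the commonly documented ZooKeeper election recipe, in which each candidate
creates an ephemeral sequential znode and the lowest sequence number leads.
This is a sketch under stated assumptions, not client code: the class, the
"n_" name prefix, and the candidate ids are hypothetical, and no ZooKeeper
server is involved.

```python
import itertools


class ToyElection:
    """In-memory stand-in for the ephemeral-sequential-znode election recipe."""

    def __init__(self):
        self._seq = itertools.count()
        self.candidates = {}  # znode name -> candidate id

    def join(self, candidate_id):
        # Mimics creating an ephemeral sequential znode such as "n_0000000003".
        name = f"n_{next(self._seq):010d}"
        self.candidates[name] = candidate_id
        return name

    def leave(self, name):
        # Mimics the ephemeral znode vanishing when its session ends.
        self.candidates.pop(name, None)

    def leader(self):
        # The candidate holding the lowest sequence number leads.
        if not self.candidates:
            return None
        return self.candidates[min(self.candidates)]


election = ToyElection()
a = election.join("node-a")
b = election.join("node-b")
c = election.join("node-c")
first = election.leader()
election.leave(a)
second = election.leader()

# Everything the server operator could see: znode names and candidate ids.
visible = dict(election.candidates)
print(first, second, visible)
```

In this model the only state held by the coordination service is the list of
candidates and their order, which is the information Ted describes as being
visible to whoever runs the server.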


Re: General Question about Zookeeper

2009-06-25 Thread Harold Lim

Hi Henry,

Does that mean, for example, that if I own the ZooKeeper server and physical 
machine and have lots of clients using this ZooKeeper server, I can simply look 
at the logfiles and snapshot files and see all of the information created by 
those clients?


Thanks,
Harold

--- On Thu, 6/25/09, Henry Robinson  wrote:

> From: Henry Robinson 
> Subject: Re: General Question about Zookeeper
> To: zookeeper-user@hadoop.apache.org
> Date: Thursday, June 25, 2009, 2:01 PM
> Hi Harold,
> 
> Each ZooKeeper server stores updates to znodes in logfiles, and periodic
> snapshots of the state of the datatree in snapshot files.
> 
> A user who has the same permissions as the server will be able to read these
> files, and can therefore recover the state of the datatree without the ZK
> server intervening. ACLs are applied only by the server; there is no
> filesystem-level representation of them.
> 
> Henry
> 
> 
> 
> 


  


Re: General Question about Zookeeper

2009-06-25 Thread Harold Lim

Hi Gustavo,

Actually, in my case, we have a fully decentralized service, something like 
users in a social network. Originally, we were thinking of using a distributed 
consensus algorithm (e.g., Paxos) to implement some functionality (e.g., leader 
election). 

Then I read about ZooKeeper and considered using it for leader election 
instead. However, that means introducing a "central" server/service into the 
architecture. 

Currently, I'm working out how much of the original functionality I can 
offload to ZooKeeper without breaking the original privacy/security 
motivation.


-Harold




--- On Thu, 6/25/09, Gustavo Niemeyer  wrote:

> From: Gustavo Niemeyer 
> Subject: Re: General Question about Zookeeper
> To: zookeeper-user@hadoop.apache.org
> Date: Thursday, June 25, 2009, 1:59 PM
> Hey Harold,
> 
> > I am interested in a security aspect of ZooKeeper, where the clients and
> > the servers don't necessarily belong to the same "group". If a client
> > creates a znode in ZooKeeper, can the person who owns the ZooKeeper
> > server simply look at its filesystem and read the data (out-of-band, not
> > using a client, simply browsing the file system of the machine hosting the
> > ZooKeeper server)?
> 
> Yes, absolutely.  You could certainly encrypt the data that goes
> through the ZooKeeper server, but since ZooKeeper is supposed to be
> doing coordination work, I think that if you don't trust the server,
> the whole situation might get a bit awkward.  I'm curious about your
> use case, since I'm pondering doing something where clients
> don't necessarily trust other clients or machines in the same network
> (or even different users on the same machine), and thus might require
> additional tightening up, but if you don't trust the server itself, that
> may be tricky.  Please note that ZooKeeper isn't meant to be used just
> as a distributed filesystem for storage, but that's probably not your
> intention anyway.
> 
> -- 
> Gustavo Niemeyer
> http://niemeyer.net
> 





Re: General Question about Zookeeper

2009-06-25 Thread Henry Robinson
Hi Harold,

Each ZooKeeper server stores updates to znodes in logfiles, and periodic
snapshots of the state of the datatree in snapshot files.

A user who has the same permissions as the server will be able to read these
files, and can therefore recover the state of the datatree without the ZK
server intervening. ACLs are applied only by the server; there is no
filesystem-level representation of them.

Henry



On Thu, Jun 25, 2009 at 6:48 PM, Harold Lim  wrote:

>
> Hi All,
>
> How does ZooKeeper store data/files?
> From reading the docs, clients can put ACLs on files/znodes to limit
> read/write/create access by other clients. However, I was wondering how these
> znodes are stored on the ZooKeeper servers.
>
> I am interested in a security aspect of ZooKeeper, where the clients and
> the servers don't necessarily belong to the same "group". If a client
> creates a znode in ZooKeeper, can the person who owns the ZooKeeper
> server simply look at its filesystem and read the data (out-of-band, not
> using a client, simply browsing the file system of the machine hosting the
> ZooKeeper server)?
>
>
> Thanks,
> Harold
>
>
>
>
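
Henry's point, that ACLs are enforced only by the server process and have no
filesystem-level representation, can be sketched with a toy model. Everything
here is an assumption made for illustration: the ToyServer class, the JSON
"snapshot" file, and the znode path are hypothetical, and real ZooKeeper
snapshots use its own binary serialization rather than JSON.

```python
import json
import os
import tempfile


class ToyServer:
    """Toy coordination server: ACLs are checked in code, data sits in a plain file."""

    def __init__(self, path):
        self.path = path
        self.nodes = {}

    def create(self, znode, data, allowed_readers):
        self.nodes[znode] = {"data": data, "acl": sorted(allowed_readers)}
        # Persist a "snapshot"; the ACL is just more data in the file.
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.nodes, f)

    def read(self, znode, client_id):
        node = self.nodes[znode]
        if client_id not in node["acl"]:
            raise PermissionError(f"{client_id} may not read {znode}")
        return node["data"]


snapshot_path = os.path.join(tempfile.mkdtemp(), "snapshot.json")
server = ToyServer(snapshot_path)
server.create("/users/alice/location", "42.39,-72.52", allowed_readers={"alice"})

# Through the server, another client is refused.
try:
    server.read("/users/alice/location", client_id="mallory")
    refused = False
except PermissionError:
    refused = True

# Out of band, anyone with filesystem access to the snapshot reads everything.
with open(snapshot_path, encoding="utf-8") as f:
    recovered = json.load(f)["/users/alice/location"]["data"]

print(refused, recovered)
```

The ACL check lives entirely in the read() path, so bypassing the server and
opening the file directly skips it, which is the situation described in the
message above.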


Re: General Question about Zookeeper

2009-06-25 Thread Gustavo Niemeyer
Hey Harold,

> I am interested in a security aspect of ZooKeeper, where the clients and the 
> servers don't necessarily belong to the same "group". If a client creates a 
> znode in ZooKeeper, can the person who owns the ZooKeeper server simply 
> look at its filesystem and read the data (out-of-band, not using a client, 
> simply browsing the file system of the machine hosting the ZooKeeper server)?

Yes, absolutely.  You could certainly encrypt the data that goes
through the ZooKeeper server, but since ZooKeeper is supposed to be
doing coordination work, I think that if you don't trust the server,
the whole situation might get a bit awkward.  I'm curious about your
use case, since I'm pondering doing something where clients
don't necessarily trust other clients or machines in the same network
(or even different users on the same machine), and thus might require
additional tightening up, but if you don't trust the server itself, that
may be tricky.  Please note that ZooKeeper isn't meant to be used just
as a distributed filesystem for storage, but that's probably not your
intention anyway.

-- 
Gustavo Niemeyer
http://niemeyer.net