Re: General Question about Zookeeper
Hi Henry,

Actually, I'm currently working on a research project. Basically, we're interested in the feasibility of decentralized location-based services. All commercially available mobile location-based services (including social networks) currently have a "central" server that stores all of the users' information; our main motivation is the security/privacy implications of this. Our approach is to still have maybe some infrastructure, but to give it only a limited role (we definitely don't want the servers to store users' personal information).

-Harold

--- On Thursday, June 25, 2009, 2:40 PM, Henry Robinson wrote:
> I'm really curious about your use case - is there more you can explain?
Re: General Question about Zookeeper
What else do you want to use ZK for - just leader election? It requires not so much a centralised server (which implies a kind of single point of failure) as a small amount of fixed infrastructure. If you have a highly dynamic network - an ad-hoc network like a social net - ZK will likely not be appropriate. There are leader election algorithms that work better in totally ad-hoc networks, and other co-ordination models that are better suited. In particular, you may not want persistence, in the sense that later instances of a consensus algorithm might not need to see the results of previous ones, which removes the need to keep logs synchronised.

However, if you have five or so servers that you can dedicate to coordination, ZooKeeper should work very well. I'm really curious about your use case - is there more you can explain?

Henry

On Thu, Jun 25, 2009 at 7:16 PM, Harold Lim wrote:
> Then, I read about ZooKeeper and was thinking of using ZooKeeper for leader
> election instead. However, that means that we're introducing a "central"
> server/service to the architecture.
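On Henry's figure of "five or so servers": a ZooKeeper ensemble keeps working as long as a strict majority of its servers are up and can reach each other. The following is only an arithmetic sketch of that rule in Python, not ZooKeeper code; the function names are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size and tolerated failures for a few sizes.
for n in (3, 5, 6):
    print(n, quorum_size(n), tolerated_failures(n))
```

With five servers the quorum is three, so two servers can fail. A sixth server does not raise that number, which is one reason odd ensemble sizes are the usual choice.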
Re: General Question about Zookeeper
Thanks. That makes sense.

-Harold

--- On Thursday, June 25, 2009, 2:29 PM, Mahadev Konar wrote:
> The access controls are independent of any user id the client is running
> with or the server is running with.
Re: General Question about Zookeeper
Hi Harold,

Let me explain the whole concept of ZooKeeper ACLs.

1) ZooKeeper servers are run under some user id, say X.
2) ZooKeeper clients use the ZooKeeper client library to create znodes on the ZooKeeper servers. They could be running as user id C. They can provide ACLs when creating such nodes to restrict who can access them. These ACLs have NOTHING to do with user id X or user id C. The access controls are independent of any user id the client is running with or the server is running with.
3) User X can obviously reconstruct the ZooKeeper database, since he has access to the local filesystem data that ZooKeeper writes its snapshots/txns into.

Hope this helps.
Mahadev

On 6/25/09 11:20 AM, "Harold Lim" wrote:
> Does that mean for example, if I own the Zookeeper server and physical machine
> and have lots of clients using this Zookeeper server, I can simply look at the
> logfiles and snapshot files and see all of the information created by those
> clients?
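To illustrate point 2 above (ACL identities are unrelated to operating-system user ids): with ZooKeeper's built-in "digest" scheme, the identity stored in an ACL is, as far as I know, the username followed by a base64-encoded SHA-1 hash of "username:password". The sketch below computes such an identity with the Python standard library only. The username and password are hypothetical, and newer ZooKeeper releases may allow a different hash algorithm to be configured.

```python
import base64
import hashlib


def digest_acl_id(username: str, password: str) -> str:
    """Build a 'digest'-scheme identity: username:base64(sha1(username:password))."""
    raw = f"{username}:{password}".encode("utf-8")
    hashed = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{username}:{hashed}"


# Hypothetical credentials, for illustration only.
print(digest_acl_id("harold", "example-password"))
```

Nothing in this identity refers to user id X or user id C; it only protects access through the server, which matches point 3: whoever can read the server's files bypasses it.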
Re: General Question about Zookeeper
Hi Harold,

As Henry mentioned, what ACLs provide is a way to prevent access to znodes. If someone has access to ZooKeeper's data stored on the ZooKeeper server machines, they should be able to reconstruct the data and read it (using the ZooKeeper deserialization code). I am not sure what kind of security model you are interested in, but for ZooKeeper we expect the server-side data stored on local disks to be inaccessible to normal users and accessible only to admins.

Hope this helps.

Thanks
mahadev

On 6/25/09 11:01 AM, "Henry Robinson" wrote:
> ACLs are applied only by the server; there is no
> filesystem-level representation of them.
Re: General Question about Zookeeper
Yes. But if you are using ZK only for leader election, then all that this person gets to know is who the leader is. Presumably, that is public or at least widespread knowledge in any case.

On Thu, Jun 25, 2009 at 11:20 AM, Harold Lim wrote:
> Does that mean for example, if I own the Zookeeper server and physical
> machine and have lots of clients using this Zookeeper server, I can simply
> look at the logfiles and snapshot files and see all of the information
> created by those clients?
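To show what the server would actually hold in the leader-election-only case, here is a toy in-memory model of the usual election recipe (each candidate creates an ephemeral sequential znode; the lowest sequence number leads). It does not talk to a real ZooKeeper server; the class, method, and candidate names are hypothetical.

```python
import itertools


class ToyElection:
    """In-memory stand-in for one election znode and its children."""

    def __init__(self):
        self._counter = itertools.count()
        self.children = {}  # child znode name -> candidate id

    def join(self, candidate_id: str) -> str:
        """Mimic creating an ephemeral sequential child; return its name."""
        name = f"n_{next(self._counter):010d}"
        self.children[name] = candidate_id
        return name

    def leave(self, name: str) -> None:
        """Mimic the ephemeral znode vanishing when a session ends."""
        self.children.pop(name, None)

    def leader(self):
        """The candidate holding the lowest sequence number leads."""
        if not self.children:
            return None
        return self.children[min(self.children)]


election = ToyElection()
a = election.join("client-a")
b = election.join("client-b")
print(election.leader())   # client-a
election.leave(a)
print(election.leader())   # client-b
print(election.children)   # everything a server operator could see
```

In this model the only stored data are the candidate identifiers and their order, so what leaks to the server owner depends entirely on what the clients choose to write into their election znodes.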
Re: General Question about Zookeeper
Hi Henry,

Does that mean, for example, that if I own the ZooKeeper server and the physical machine, and have lots of clients using this ZooKeeper server, I can simply look at the logfiles and snapshot files and see all of the information created by those clients?

Thanks,
Harold

--- On Thursday, June 25, 2009, 2:01 PM, Henry Robinson wrote:
> A user who has the same permissions as the server will be able to read these
> files, and can therefore recover the state of the datatree without the ZK
> server intervening.
Re: General Question about Zookeeper
Hi Gustavo,

Actually, in my case, we have a fully decentralized service, something like users in a social network. Originally, we were thinking of using a distributed consensus algorithm (e.g., Paxos) to implement some functionality (e.g., leader election).

Then I read about ZooKeeper and was thinking of using ZooKeeper for leader election instead. However, that means that we're introducing a "central" server/service to the architecture.

Currently, I'm just thinking about the original functionality and how much of it I can offload to ZooKeeper without breaking the original privacy/security motivation.

-Harold

--- On Thursday, June 25, 2009, 1:59 PM, Gustavo Niemeyer wrote:
> You could certainly encrypt the data that goes through the ZooKeeper server,
> but since ZooKeeper is supposed to be doing coordination work, I think that
> if you don't trust the server, the whole situation might get a bit awkward.
Re: General Question about Zookeeper
Hi Harold,

Each ZooKeeper server stores updates to znodes in logfiles, and periodic snapshots of the state of the datatree in snapshot files.

A user who has the same permissions as the server will be able to read these files, and can therefore recover the state of the datatree without the ZK server intervening. ACLs are applied only by the server; there is no filesystem-level representation of them.

Henry

On Thu, Jun 25, 2009 at 6:48 PM, Harold Lim wrote:
> Can the person who owns the zookeeper server simply look at its filesystem
> and read the data (out-of-band, not using a client, simply browsing the file
> system of the machine hosting the zookeeper server)?
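The point that ACLs are applied only by the server can be shown with a toy model. The sketch below is not ZooKeeper and does not use ZooKeeper's real log or snapshot format (a JSON file stands in for it); the class, function, and identity names are hypothetical. It shows an ACL check that holds through the server API but means nothing to someone who can open the persisted file.

```python
import json
import os
import tempfile


class ToyServer:
    """Toy store: enforces ACLs in its API, persists everything to disk."""

    def __init__(self, path: str):
        self.path = path
        self.nodes = {}

    def create(self, znode: str, data: str, allowed: set) -> None:
        """Store the data with its ACL and write the whole tree to disk."""
        self.nodes[znode] = {"data": data, "acl": sorted(allowed)}
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.nodes, f)

    def read(self, znode: str, identity: str) -> str:
        """Return the data only if the caller's identity is in the ACL."""
        node = self.nodes[znode]
        if identity not in node["acl"]:
            raise PermissionError(f"{identity} may not read {znode}")
        return node["data"]


def read_out_of_band(path: str, znode: str) -> str:
    """Read the persisted file directly; no ACL check is involved."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)[znode]["data"]


snapshot = os.path.join(tempfile.mkdtemp(), "toy-snapshot.json")
server = ToyServer(snapshot)
server.create("/app/secret", "alice's location", {"alice"})

try:
    server.read("/app/secret", "mallory")
except PermissionError as exc:
    print("via server:", exc)

print("via disk:", read_out_of_band(snapshot, "/app/secret"))
```

The second read succeeds for anyone with filesystem access to the file, which is the situation Harold is asking about.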
Re: General Question about Zookeeper
Hey Harold,

> I am interested in a security aspect of zookeeper, where the clients and the
> servers don't necessarily belong to the same "group". If a client creates a
> znode in the zookeeper? Can the person, who owns the zookeeper server, simply
> look at its filesystem and read the data (out-of-band, not using a client,
> simply browsing the file system of the machine hosting the zookeeper server)?

Yes, absolutely. You could certainly encrypt the data that goes through the ZooKeeper server, but since ZooKeeper is supposed to be doing coordination work, I think that if you don't trust the server, the whole situation might get a bit awkward. I'm curious about your use case, since I'm pondering doing something where clients don't necessarily trust other clients or machines in the same network (or even different users on the same machine), and which thus might require additional tightening up. But if you don't trust the server itself, that may be tricky. Please note that ZooKeeper isn't meant to be used just as a distributed filesystem for storage, but that's probably not your intention anyway.

--
Gustavo Niemeyer
http://niemeyer.net
General Question about Zookeeper
Hi All,

How does ZooKeeper store data/files? From reading the doc, clients can put ACLs on files/znodes to limit read/write/create by other clients. However, I was wondering how these znodes are stored on the ZooKeeper servers.

I am interested in a security aspect of ZooKeeper, where the clients and the servers don't necessarily belong to the same "group". If a client creates a znode in ZooKeeper, can the person who owns the ZooKeeper server simply look at its filesystem and read the data (out-of-band, not using a client, simply browsing the file system of the machine hosting the ZooKeeper server)?

Thanks,
Harold