Re: Improving Data-At-Rest encryption in Ceph

2015-12-21 Thread Adam Kupczyk
On Wed, Dec 16, 2015 at 11:33 PM, Sage Weil  wrote:
> On Wed, 16 Dec 2015, Adam Kupczyk wrote:
>> On Tue, Dec 15, 2015 at 3:23 PM, Lars Marowsky-Bree  wrote:
>> > On 2015-12-14T14:17:08, Radoslaw Zarzynski  wrote:
>> >
>> > Hi all,
>> >
>> > great to see this revived.
>> >
>> > However, I have come to see some concerns with handling the encryption
>> > within Ceph itself.
>> >
>> > The key part to any such approach is formulating the threat scenario.
>> > For the use cases we have seen, the data-at-rest encryption matters so
>> > they can confidently throw away disks without leaking data. It's not
>> > meant as a defense against an online attacker. There usually is no
>> > problem with "a few" disks being privileged, or one or two nodes that
>> > need an admin intervention for booting (to enter some master encryption
>> > key somehow, somewhere).
>> >
>> > However, that requires *all* data on the OSDs to be encrypted.
>> >
>> > Crucially, that includes not just the file system meta data (so not just
>> > the data), but also the root and especially the swap partition. Those
>> > potentially include swapped out data, coredumps, logs, etc.
>> >
>> > (As an optional feature, it'd be cool if an OSD could be moved to a
>> > different chassis and continue operating there, to speed up recovery.
>> > Another optional feature would be to eventually be able, for those
>> > customers that trust them ;-), to supply the key to the on-disk encryption
>> > (OPAL et al).)
>> >
>> > The proposal that Joshua posted a while ago essentially remained based
>> > on dm-crypt, but put in simple hooks to retrieve the keys from some
>> > "secured" server via sftp/ftps instead of loading them from the root fs.
>> > Similar to deo, that ties the key to being on the network and knowing
>> > the OSD UUID.
>> >
>> > This would then also be somewhat easily extensible to utilize the same
>> > key management server via initrd/dracut.
>> >
>> > Yes, this means that each OSD disk is separately encrypted, but given
>> > modern CPUs, this is less of a problem. It does have the benefit of
>> > being completely transparent to Ceph, and actually covering the whole
>> > node.
>> Agreed, if encryption were infinitely fast, dm-crypt would be the best solution.
>> Below is a short analysis of the encryption burden for dm-crypt and
>> OSD-based encryption when using replicated pools.
>>
>> Summary:
>> OSD encryption requires 2.6 times fewer crypto operations than dm-crypt.
>
> Yeah, I believe that, but
>
>> Crypto ops are the bottleneck.
>
> is this really true?  I don't think we've tried to measure performance
> with dm-crypt, but I also have never heard anyone complain about the
> additional CPU utilization or performance impact.  Have you observed this?
I ran tests, mostly on my i7-4910MQ 2.9GHz (4 cores) with an SSD.
The write results were appallingly low, I suspect due to kernel
problems with multi-CPU kcrypto [1]. I will not quote them here, as
they would only obfuscate the discussion; newer kernels (>4.0.2) fix
the issue.

The read results were 350MB/s, but CPU utilization was 44% in the
kcrypto kernel worker (a single core). This effectively means 11% of
total crypto capacity, because the Intel AES-NI instructions are used
almost every cycle, making hyperthreading useless.

[1] 
http://unix.stackexchange.com/questions/203677/abysmal-general-dm-crypt-luks-write-performance
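
The 11% figure can be reproduced from the numbers above (a
back-of-envelope sketch only; the core count is taken from the
i7-4910MQ mentioned earlier, and hyperthreading is discounted as
argued above):

```python
cores = 4            # i7-4910MQ: 4 physical cores
worker_util = 0.44   # CPU use of the single kcrypto worker on its core

# AES-NI keeps the core's execution units busy nearly every cycle,
# so hyperthreading adds ~nothing and only physical cores count:
total = worker_util / cores
print(round(total * 100))  # -> 11 (% of total crypto capacity used)
```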
>
> sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Improving Data-At-Rest encryption in Ceph

2015-12-16 Thread Adam Kupczyk
On Tue, Dec 15, 2015 at 3:23 PM, Lars Marowsky-Bree  wrote:
> On 2015-12-14T14:17:08, Radoslaw Zarzynski  wrote:
>
> Hi all,
>
> great to see this revived.
>
> However, I have come to see some concerns with handling the encryption
> within Ceph itself.
>
> The key part to any such approach is formulating the threat scenario.
> For the use cases we have seen, the data-at-rest encryption matters so
> they can confidently throw away disks without leaking data. It's not
> meant as a defense against an online attacker. There usually is no
> problem with "a few" disks being privileged, or one or two nodes that
> need an admin intervention for booting (to enter some master encryption
> key somehow, somewhere).
>
> However, that requires *all* data on the OSDs to be encrypted.
>
> Crucially, that includes not just the file system meta data (so not just
> the data), but also the root and especially the swap partition. Those
> potentially include swapped out data, coredumps, logs, etc.
>
> (As an optional feature, it'd be cool if an OSD could be moved to a
> different chassis and continue operating there, to speed up recovery.
> Another optional feature would be to eventually be able, for those
> customers that trust them ;-), to supply the key to the on-disk encryption
> (OPAL et al).)
>
> The proposal that Joshua posted a while ago essentially remained based
> on dm-crypt, but put in simple hooks to retrieve the keys from some
> "secured" server via sftp/ftps instead of loading them from the root fs.
> Similar to deo, that ties the key to being on the network and knowing
> the OSD UUID.
>
> This would then also be somewhat easily extensible to utilize the same
> key management server via initrd/dracut.
>
> Yes, this means that each OSD disk is separately encrypted, but given
> modern CPUs, this is less of a problem. It does have the benefit of
> being completely transparent to Ceph, and actually covering the whole
> node.
Agreed, if encryption were infinitely fast, dm-crypt would be the best solution.
Below is a short analysis of the encryption burden for dm-crypt and
OSD-based encryption when using replicated pools.

Summary:
OSD encryption requires 2.6 times fewer crypto operations than dm-crypt.
Crypto ops are the bottleneck.
Possible solutions:
- make fewer crypto-ops (OSD based encryption can help)
- take crypto ops off CPU (H/W accelerators; not all are integrated
with kcrypto)

Calculations and explanations:
A) DM-CRYPT
When we use dm-crypt, all data and metadata are encrypted. In a typical
deployment the journal is located on a different disk, but it is also
encrypted.
On write, the data path is:
1) encrypt when writing to the journal
2) decrypt when reading the journal
3) encrypt when writing to storage
So for each byte, 2-3 crypto operations are performed (2 can be skipped
if the kernel page allocated in 1 has not been evicted). Let's assume
2.5.
On read, the data path is:
4) decrypt when reading from storage

The balance between reads and writes depends on the deployment. Assume
75% reads, 25% writes, and replication factor 3.
This gives us 1*0.75 + 2.5*0.25*3 = 2.625 bytes of crypto operations
per byte of disk I/O.

B) CRYPTO INSIDE OSD
When we do encryption in the OSD, fewer bytes are encrypted (dm-crypt
has to encrypt entire disk sectors); we round it to 1 anyway.
A read requires 1 byte of crypto ops per byte (when data goes to the client).
A write requires 1 byte of crypto ops per byte (when data comes from the client).
This gives us 1*0.75 + 1*0.25 = 1 byte of crypto ops per byte of disk I/O.

C) OSD I/O performance calculation
Let's assume an encryption speed of 600MB/s per CPU core (using AES-NI on Haswell [1]).
This gives us 600/2.625 = ~229MB/s for dm-crypt and 600MB/s for OSD-located crypto.
Usually there are a few disks per CPU core in storage nodes. Let's say 6:
6xHDD = ~600MB/s
6xSSD = ~6000MB/s

It is clear that crypto limits the speed.

https://software.intel.com/en-us/articles/intel-aes-ni-performance-enhancements-hytrust-datacontrol-case-study
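
The calculations in A)-C) can be condensed into a small model (a sketch
only; the 75/25 read/write split, 2.5 journal crypto ops, replication
factor 3 and the 600MB/s AES-NI rate are the assumptions stated above,
not measurements):

```python
def crypto_bytes_per_io_byte(read_frac, write_ops, replicas):
    """Bytes of crypto work per byte of client I/O.

    read_frac - fraction of I/O that is reads (one decryption each)
    write_ops - crypto ops per written byte on one replica
                (dm-crypt: journal enc + journal dec + store enc ~= 2.5)
    replicas  - how many times a write is encrypted
                (dm-crypt: once per replica; OSD crypto: once in total)
    """
    return read_frac * 1.0 + (1.0 - read_frac) * write_ops * replicas

dm_crypt = crypto_bytes_per_io_byte(0.75, 2.5, 3)  # 2.625
osd      = crypto_bytes_per_io_byte(0.75, 1.0, 1)  # 1.0
aes_ni = 600.0  # assumed MB/s per core, AES-NI on Haswell [1]
print(aes_ni / dm_crypt)  # ~228.6 MB/s per core with dm-crypt
print(aes_ni / osd)       # 600.0 MB/s per core with OSD-level crypto
```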
>
> Of course, one of the key issues is always the key server.
> Putting/retrieving/deleting keys is reasonably simple, but the question
> of how to ensure HA for it is a bit tricky. But doable; people have been
> building HA ftp/http servers for a while ;-) Also, a single key server
> setup could theoretically serve multiple Ceph clusters.
>
> It's not yet perfect, but I think the approach is superior to being
> implemented in Ceph natively. If there's any encryption that should be
> implemented in Ceph, I believe it'd be the on-the-wire encryption to
> protect against eavesdroppers.
>
> Other scenarios would require client-side encryption.
>
>> Current data at rest encryption is achieved through dm-crypt placed
>> under OSD’s filestore. This solution is a generic one and cannot
>> leverage Ceph-specific characteristics. The best example is that
>> encryption is done multiple times - one time for each replica. Another
>> issue is lack of granularity - either OSD encrypts nothing, or OSD
>> encrypts everything (with dm-crypt on).

Re: Improving Data-At-Rest encryption in Ceph

2015-12-16 Thread Sage Weil
On Wed, 16 Dec 2015, Adam Kupczyk wrote:
> On Tue, Dec 15, 2015 at 3:23 PM, Lars Marowsky-Bree  wrote:
> > On 2015-12-14T14:17:08, Radoslaw Zarzynski  wrote:
> >
> > Hi all,
> >
> > great to see this revived.
> >
> > However, I have come to see some concerns with handling the encryption
> > within Ceph itself.
> >
> > The key part to any such approach is formulating the threat scenario.
> > For the use cases we have seen, the data-at-rest encryption matters so
> > they can confidently throw away disks without leaking data. It's not
> > meant as a defense against an online attacker. There usually is no
> > problem with "a few" disks being privileged, or one or two nodes that
> > need an admin intervention for booting (to enter some master encryption
> > key somehow, somewhere).
> >
> > However, that requires *all* data on the OSDs to be encrypted.
> >
> > Crucially, that includes not just the file system meta data (so not just
> > the data), but also the root and especially the swap partition. Those
> > potentially include swapped out data, coredumps, logs, etc.
> >
> > (As an optional feature, it'd be cool if an OSD could be moved to a
> > different chassis and continue operating there, to speed up recovery.
> > Another optional feature would be to eventually be able, for those
> > customers that trust them ;-), to supply the key to the on-disk encryption
> > (OPAL et al).)
> >
> > The proposal that Joshua posted a while ago essentially remained based
> > on dm-crypt, but put in simple hooks to retrieve the keys from some
> > "secured" server via sftp/ftps instead of loading them from the root fs.
> > Similar to deo, that ties the key to being on the network and knowing
> > the OSD UUID.
> >
> > This would then also be somewhat easily extensible to utilize the same
> > key management server via initrd/dracut.
> >
> > Yes, this means that each OSD disk is separately encrypted, but given
> > modern CPUs, this is less of a problem. It does have the benefit of
> > being completely transparent to Ceph, and actually covering the whole
> > node.
> Agreed, if encryption were infinitely fast, dm-crypt would be the best solution.
> Below is a short analysis of the encryption burden for dm-crypt and
> OSD-based encryption when using replicated pools.
> 
> Summary:
> OSD encryption requires 2.6 times fewer crypto operations than dm-crypt.

Yeah, I believe that, but

> Crypto ops are the bottleneck.

is this really true?  I don't think we've tried to measure performance 
with dm-crypt, but I also have never heard anyone complain about the 
additional CPU utilization or performance impact.  Have you observed this?

sage


Re: Improving Data-At-Rest encryption in Ceph

2015-12-16 Thread Adam Kupczyk
On Tue, Dec 15, 2015 at 10:04 PM, Gregory Farnum  wrote:
> On Tue, Dec 15, 2015 at 1:58 AM, Adam Kupczyk  wrote:
>>
>>
>> On Mon, Dec 14, 2015 at 9:28 PM, Gregory Farnum  wrote:
>>>
>>> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
>>>  wrote:
>>> > Hello Folks,
>>> >
>>> > I would like to publish a proposal regarding improvements to Ceph
>>> > data-at-rest encryption mechanism. Adam Kupczyk and I worked
>>> > on that in last weeks.
>>> >
>>> > Initially we considered several architectural approaches and made
>>> > several iterations of discussions with Intel storage group. The proposal
>>> > is condensed description of the solution we see as the most promising
>>> > one.
>>> >
>>> > We are open to any comments and questions.
>>> >
>>> > Regards,
>>> > Adam Kupczyk
>>> > Radoslaw Zarzynski
>>> >
>>> >
>>> > ===
>>> > Summary
>>> > ===
>>> >
>>> > Data-at-rest encryption is a mechanism that protects the data center
>>> > operator from disclosing the content of physical storage media.
>>> >
>>> > Ceph already implements a form of at-rest encryption. It is performed
>>> > through dm-crypt as an intermediary layer between an OSD and its physical
>>> > storage. The proposed at-rest encryption mechanism will be orthogonal
>>> > and, in some ways, superior to the already existing solution.
>>> >
>>> > ===
>>> > Owners
>>> > ===
>>> >
>>> > * Radoslaw Zarzynski (Mirantis)
>>> > * Adam Kupczyk (Mirantis)
>>> >
>>> > ===
>>> > Interested Parties
>>> > ===
>>> >
>>> > If you are interested in contributing to this blueprint, or want to be
>>> > a "speaker" during the Summit session, list your name here.
>>> >
>>> > Name (Affiliation)
>>> > Name (Affiliation)
>>> > Name
>>> >
>>> > ===
>>> > Current Status
>>> > ===
>>> >
>>> > Current data at rest encryption is achieved through dm-crypt placed
>>> > under OSD’s filestore. This solution is a generic one and cannot
>>> > leverage Ceph-specific characteristics. The best example is that
>>> > encryption is done multiple times - one time for each replica. Another
>>> > issue is lack of granularity - either OSD encrypts nothing, or OSD
>>> > encrypts everything (with dm-crypt on).
>>> >
>>> > Cryptographic keys are stored on filesystem of storage node that hosts
>>> > OSDs. Changing them requires redeploying the OSDs.
>>> >
>>> > The best way to address those issues seems to be introducing
>>> > encryption into Ceph OSD.
>>> >
>>> > ===
>>> > Detailed Description
>>> > ===
>>> >
>>> > In addition to the currently available solution, Ceph OSD would
>>> > accommodate encryption component placed in the replication mechanisms.
>>> >
>>> > Data incoming from Ceph clients would be encrypted by primary OSD. It
>>> > would replicate ciphertext to non-primary members of an acting set.
>>> > Data sent to Ceph client would be decrypted by OSD handling read
>>> > operation. This allows us to:
>>> > * perform only one encryption per write,
>>> > * achieve per-pool key granulation for both key and encryption itself.
>>> >
>>> > Unfortunately, having always and everywhere the same key for a given
>>> > pool is unacceptable - it would make cluster migration and key changes
>>> > an extremely burdensome process. To address those issues, crypto key
>>> > versioning would be introduced. All RADOS objects inside a single
>>> > placement group stored on a given OSD would use the same crypto key
>>> > version. The same PG on another replica may use a different version of
>>> > the same per-pool key.
>>> >
>>> > In the typical case, ciphertext transferred from OSD to OSD can be
>>> > used without change. This is when both OSDs have the same crypto key
>>> > version for a given placement group. In rare cases when the crypto keys
>>> > differ (key change or transition period), the receiving OSD will recrypt
>>> > with its local key version.
>>>
>>> I don't understand this part at all. Do you plan to read and rewrite
>>> the entire PG whenever you change the "key version"? How often do you
>>> plan to change these keys? What is even the point of changing them,
>>> since anybody who can control an OSD can grab the entire current key
>>> set?
>>
>> We envision that key change will happen very infrequently. Usually in
>> reaction to some possible security breach.
>> After key version is incremented, nothing happens automatically. Old key is
>> used for as long as  PG is not empty. When first RADOS object is created,
>> the current key version is locked to PG.
>> There is no solution when someone gets control over OSD - either by running
>> custom OSD binary or extracting data by impersonating client. It is outside
>> of scope of at-rest-encryption. We only addressed cases when media storage
>> somehow leaves datacenter premises. Ability to change 

Re: Improving Data-At-Rest encryption in Ceph

2015-12-16 Thread Radoslaw Zarzynski
On Tue, Dec 15, 2015 at 10:04 PM, Gregory Farnum  wrote:
> On Tue, Dec 15, 2015 at 1:58 AM, Adam Kupczyk  wrote:
>>
>>
>> On Mon, Dec 14, 2015 at 9:28 PM, Gregory Farnum  wrote:
>>>
>>> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
>>>  wrote:
>>> > Hello Folks,
>>> >
>>> > I would like to publish a proposal regarding improvements to Ceph
>>> > data-at-rest encryption mechanism. Adam Kupczyk and I worked
>>> > on that in last weeks.
>>> >
>>> > Initially we considered several architectural approaches and made
>>> > several iterations of discussions with Intel storage group. The proposal
>>> > is condensed description of the solution we see as the most promising
>>> > one.
>>> >
>>> > We are open to any comments and questions.
>>> >
>>> > Regards,
>>> > Adam Kupczyk
>>> > Radoslaw Zarzynski
>>> >
>>> >
>>> > ===
>>> > Summary
>>> > ===
>>> >
>>> > Data-at-rest encryption is a mechanism that protects the data center
>>> > operator from disclosing the content of physical storage media.
>>> >
>>> > Ceph already implements a form of at-rest encryption. It is performed
>>> > through dm-crypt as an intermediary layer between an OSD and its physical
>>> > storage. The proposed at-rest encryption mechanism will be orthogonal
>>> > and, in some ways, superior to the already existing solution.
>>> >
>>> > ===
>>> > Owners
>>> > ===
>>> >
>>> > * Radoslaw Zarzynski (Mirantis)
>>> > * Adam Kupczyk (Mirantis)
>>> >
>>> > ===
>>> > Interested Parties
>>> > ===
>>> >
>>> > If you are interested in contributing to this blueprint, or want to be
>>> > a "speaker" during the Summit session, list your name here.
>>> >
>>> > Name (Affiliation)
>>> > Name (Affiliation)
>>> > Name
>>> >
>>> > ===
>>> > Current Status
>>> > ===
>>> >
>>> > Current data at rest encryption is achieved through dm-crypt placed
>>> > under OSD’s filestore. This solution is a generic one and cannot
>>> > leverage Ceph-specific characteristics. The best example is that
>>> > encryption is done multiple times - one time for each replica. Another
>>> > issue is lack of granularity - either OSD encrypts nothing, or OSD
>>> > encrypts everything (with dm-crypt on).
>>> >
>>> > Cryptographic keys are stored on filesystem of storage node that hosts
>>> > OSDs. Changing them requires redeploying the OSDs.
>>> >
>>> > The best way to address those issues seems to be introducing
>>> > encryption into Ceph OSD.
>>> >
>>> > ===
>>> > Detailed Description
>>> > ===
>>> >
>>> > In addition to the currently available solution, Ceph OSD would
>>> > accommodate encryption component placed in the replication mechanisms.
>>> >
>>> > Data incoming from Ceph clients would be encrypted by primary OSD. It
>>> > would replicate ciphertext to non-primary members of an acting set.
>>> > Data sent to Ceph client would be decrypted by OSD handling read
>>> > operation. This allows us to:
>>> > * perform only one encryption per write,
>>> > * achieve per-pool key granulation for both key and encryption itself.
>>> >
>>> > Unfortunately, having always and everywhere the same key for a given
>>> > pool is unacceptable - it would make cluster migration and key changes
>>> > an extremely burdensome process. To address those issues, crypto key
>>> > versioning would be introduced. All RADOS objects inside a single
>>> > placement group stored on a given OSD would use the same crypto key
>>> > version. The same PG on another replica may use a different version of
>>> > the same per-pool key.
>>> >
>>> > In the typical case, ciphertext transferred from OSD to OSD can be
>>> > used without change. This is when both OSDs have the same crypto key
>>> > version for a given placement group. In rare cases when the crypto keys
>>> > differ (key change or transition period), the receiving OSD will recrypt
>>> > with its local key version.
>>>
>>> I don't understand this part at all. Do you plan to read and rewrite
>>> the entire PG whenever you change the "key version"? How often do you
>>> plan to change these keys? What is even the point of changing them,
>>> since anybody who can control an OSD can grab the entire current key
>>> set?
>>
>> We envision that key change will happen very infrequently. Usually in
>> reaction to some possible security breach.
>> After key version is incremented, nothing happens automatically. Old key is
>> used for as long as  PG is not empty. When first RADOS object is created,
>> the current key version is locked to PG.
>> There is no solution when someone gets control over OSD - either by running
>> custom OSD binary or extracting data by impersonating client. It is outside
>> of scope of at-rest-encryption. We only addressed cases when media storage
>> somehow leaves datacenter premises. Ability to change 

Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Adam Kupczyk
On Mon, Dec 14, 2015 at 9:28 PM, Gregory Farnum  wrote:
>
> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
>  wrote:
> > Hello Folks,
> >
> > I would like to publish a proposal regarding improvements to Ceph
> > data-at-rest encryption mechanism. Adam Kupczyk and I worked
> > on that in last weeks.
> >
> > Initially we considered several architectural approaches and made
> > several iterations of discussions with Intel storage group. The proposal
> > is condensed description of the solution we see as the most promising
> > one.
> >
> > We are open to any comments and questions.
> >
> > Regards,
> > Adam Kupczyk
> > Radoslaw Zarzynski
> >
> >
> > ===
> > Summary
> > ===
> >
> > Data-at-rest encryption is a mechanism that protects the data center
> > operator from disclosing the content of physical storage media.
> >
> > Ceph already implements a form of at-rest encryption. It is performed
> > through dm-crypt as an intermediary layer between an OSD and its physical
> > storage. The proposed at-rest encryption mechanism will be orthogonal
> > and, in some ways, superior to the already existing solution.
> >
> > ===
> > Owners
> > ===
> >
> > * Radoslaw Zarzynski (Mirantis)
> > * Adam Kupczyk (Mirantis)
> >
> > ===
> > Interested Parties
> > ===
> >
> > If you are interested in contributing to this blueprint, or want to be
> > a "speaker" during the Summit session, list your name here.
> >
> > Name (Affiliation)
> > Name (Affiliation)
> > Name
> >
> > ===
> > Current Status
> > ===
> >
> > Current data at rest encryption is achieved through dm-crypt placed
> > under OSD’s filestore. This solution is a generic one and cannot
> > leverage Ceph-specific characteristics. The best example is that
> > encryption is done multiple times - one time for each replica. Another
> > issue is lack of granularity - either OSD encrypts nothing, or OSD
> > encrypts everything (with dm-crypt on).
> >
> > Cryptographic keys are stored on filesystem of storage node that hosts
> > OSDs. Changing them requires redeploying the OSDs.
> >
> > The best way to address those issues seems to be introducing
> > encryption into Ceph OSD.
> >
> > ===
> > Detailed Description
> > ===
> >
> > In addition to the currently available solution, Ceph OSD would
> > accommodate encryption component placed in the replication mechanisms.
> >
> > Data incoming from Ceph clients would be encrypted by primary OSD. It
> > would replicate ciphertext to non-primary members of an acting set.
> > Data sent to Ceph client would be decrypted by OSD handling read
> > operation. This allows us to:
> > * perform only one encryption per write,
> > * achieve per-pool key granulation for both key and encryption itself.
> >
> > Unfortunately, having always and everywhere the same key for a given
> > pool is unacceptable - it would make cluster migration and key changes
> > an extremely burdensome process. To address those issues, crypto key
> > versioning would be introduced. All RADOS objects inside a single
> > placement group stored on a given OSD would use the same crypto key
> > version. The same PG on another replica may use a different version of
> > the same per-pool key.
> >
> > In the typical case, ciphertext transferred from OSD to OSD can be
> > used without change. This is when both OSDs have the same crypto key
> > version for a given placement group. In rare cases when the crypto keys
> > differ (key change or transition period), the receiving OSD will recrypt
> > with its local key version.
>
> I don't understand this part at all. Do you plan to read and rewrite
> the entire PG whenever you change the "key version"? How often do you
> plan to change these keys? What is even the point of changing them,
> since anybody who can control an OSD can grab the entire current key
> set?
We envision that key changes will happen very infrequently, usually in
reaction to some possible security breach.
After the key version is incremented, nothing happens automatically. The
old key is used for as long as the PG is not empty. When the first RADOS
object is created, the current key version is locked to the PG.
There is no solution for when someone gets control over an OSD - either
by running a custom OSD binary or by extracting data while impersonating
a client. That is outside the scope of at-rest encryption. We only
addressed cases where the storage media somehow leave the datacenter
premises. The ability to change keys is necessary, since we need a
procedure to recover data security after keys are compromised.
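
A toy sketch of the key-version locking described above (illustrative
only; the PoolKeys/PG names and the XOR stand-in cipher are invented
for this example, not part of the proposal): a PG locks the pool key
version current at its first write, and a receiving OSD recrypts only
when the sender's version differs.

```python
import os

def xor_crypt(data, key):
    """Toy stand-in for a real cipher (symmetric: crypt == decrypt)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class PoolKeys:
    """Per-pool key with versions; the list index is the version."""
    def __init__(self):
        self._v = {}
    def add_version(self, pool):
        self._v.setdefault(pool, []).append(os.urandom(16))
        return len(self._v[pool]) - 1        # the new current version
    def key(self, pool, ver):
        return self._v[pool][ver]
    def current(self, pool):
        return len(self._v[pool]) - 1

class PG:
    """A placement group that locks a key version at its first write."""
    def __init__(self, pool, keys):
        self.pool, self.keys = pool, keys
        self.locked_ver = None
        self.objects = {}
    def receive(self, name, ciphertext, sender_ver):
        if self.locked_ver is None:          # first RADOS object in PG:
            self.locked_ver = self.keys.current(self.pool)
        if sender_ver != self.locked_ver:    # rare path: recrypt locally
            plain = xor_crypt(ciphertext, self.keys.key(self.pool, sender_ver))
            ciphertext = xor_crypt(plain, self.keys.key(self.pool, self.locked_ver))
        self.objects[name] = ciphertext
```

In the common case sender_ver matches and the ciphertext is stored
as-is; only after a key bump does a receiving PG pay the recrypt cost.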
>
> > For compression to be effective it must be done before encryption. Due
> > to that encryption may be applied differently for replication pools
> > and EC pools. Replicated pools do not implement compression; for those
> > pools encryption is applied right after data enters OSD. For EC pools
> > 

Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Adam Kupczyk
On Mon, Dec 14, 2015 at 11:02 PM, Martin Millnert  wrote:
> On Mon, 2015-12-14 at 12:28 -0800, Gregory Farnum wrote:
>> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
> 
>> > In typical case ciphertext data transferred from OSD to OSD can be
>> > used without change. This is when both OSDs have the same crypto key
>> > version for given placement group. In rare cases when crypto keys are
>> > different (key change or transition period) receiving OSD will recrypt
>> > with local key versions.
>>
>> I don't understand this part at all. Do you plan to read and rewrite
>> the entire PG whenever you change the "key version"? How often do you
>> plan to change these keys? What is even the point of changing them,
>> since anybody who can control an OSD can grab the entire current key
>> set?
>
> You may have leaked keys without having leaked ciphertext.
> The typical use case for FDE/SED is IMO being able to RMA drives.
> Nothing more than that.
>
>> > For compression to be effective it must be done before encryption. Due
>> > to that encryption may be applied differently for replication pools
>> > and EC pools. Replicated pools do not implement compression; for those
>> > pools encryption is applied right after data enters OSD. For EC pools
>> > encryption is applied after compressing. When compression will be
>> > implemented for replicated pools, it must be placed before encryption.
>>
>> So this means you'll be encrypting the object data, but not the omap
>> nor xattrs, and not the file names on disk. Is that acceptable to
>> people? It's probably fine for a lot of rbd use cases, but not for
>> RGW, CephFS, nor raw RADOS where meaningful metadata (and even *data*)
>> is stored in those regions. I'd rather a solution worked on the full
>> data set. :/
>> -Greg
>
> This is indeed the largest weakness of the proposal.
>
> I'm a bit unclear on the motivation: what problem does this solution
> solve that a dm-crypt-based solution doesn't address? When, except for
> snooping, is it a desired design not to encrypt all the things?
1) With dm-crypt, encryption is performed separately for each replica.
With the OSD solution it is possible to encrypt only once and distribute
the ciphertext.
2) It is best to encrypt everything. There are just some things we are
unable to encrypt - in practice, the omap names.
>
> I guess one could say: "ciphertext would be transferred on the network".
> But, it's incomplete. Internal transport encryption (and better auth)
> for Ceph is a different problem.
>
> I'd probably rather dm-crypt key management processes were refined and
> improved (saying this without knowing the state of any current
> implementations for Ceph), and have a solid FDE solution than a solution
> that doesn't encrypt metadata. Only encrypting data but not metadata
> isn't sufficient anymore...
>
> /M
>


Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Radoslaw Zarzynski
On Mon, Dec 14, 2015 at 10:52 PM, Martin Millnert  wrote:
> On Mon, 2015-12-14 at 14:17 +0100, Radoslaw Zarzynski wrote:
>> Hello Folks,
>>
>> I would like to publish a proposal regarding improvements to Ceph
>> data-at-rest encryption mechanism. Adam Kupczyk and I worked
>> on that in last weeks.
>>
>> Initially we considered several architectural approaches and made
>> several iterations of discussions with Intel storage group. The proposal
>> is condensed description of the solution we see as the most promising
>> one.
>>
>> We are open to any comments and questions.
>>
>> Regards,
>> Adam Kupczyk
>> Radoslaw Zarzynski
>>
>>
>> ===
>> Summary
>> ===
>>
>> Data-at-rest encryption is a mechanism that protects the data center
>> operator from disclosing the content of physical storage media.
>>
>> Ceph already implements a form of at-rest encryption. It is performed
>> through dm-crypt as an intermediary layer between an OSD and its physical
>> storage. The proposed at-rest encryption mechanism will be orthogonal
>> and, in some ways, superior to the already existing solution.
>>
>> ===
>> Owners
>> ===
>>
>> * Radoslaw Zarzynski (Mirantis)
>> * Adam Kupczyk (Mirantis)
>>
>> ===
>> Interested Parties
>> ===
>>
>> If you are interested in contributing to this blueprint, or want to be
>> a "speaker" during the Summit session, list your name here.
>>
>> Name (Affiliation)
>> Name (Affiliation)
>> Name
>>
>> ===
>> Current Status
>> ===
>>
>> Current data at rest encryption is achieved through dm-crypt placed
>> under OSD’s filestore. This solution is a generic one and cannot
>> leverage Ceph-specific characteristics. The best example is that
>> encryption is done multiple times - one time for each replica. Another
>> issue is lack of granularity - either OSD encrypts nothing, or OSD
>> encrypts everything (with dm-crypt on).
>
> All or nothing is sometimes a desired function of encryption.
> "In-betweens" are tricky.
>
> Additionally, dm-crypt is AFAICT fairly performant, since at least
> there's no need to context switch per crypto-op: it sits in the dm
> IO path within the kernel.

Hello Martin,

I cannot agree about dm-crypt performance in comparison to the OSD
solution. Each BIO handled by dm-crypt must go through at least one
kernel workqueue (kcryptd) [1]. Some of them have to pass through an
additional one (kcryptd_io) [2]. Those workqueues are served by a
dedicated set of kthreads, so context switches are present here.
Moreover, the whole BIO is split into small, 512-byte chunks before
being passed to ablkcipher [3]. IMO that's far less than ideal.

In the case of application-layer encryption you would operate much
closer to the data. You may encrypt in much larger chunks. The costs of
context switches and the op setup phase (important for hw accelerators)
would be negligible, providing much better performance. Leveraging
some Ceph-specific characteristics (encrypting only selected pools;
constant complexity with respect to replica count) multiplies the gain
even further.

Regards,
Radoslaw

[1] http://lxr.free-electrons.com/source/drivers/md/dm-crypt.c?v=3.19#L1350
[2] http://lxr.free-electrons.com/source/drivers/md/dm-crypt.c?v=3.19#L1355
[3] http://lxr.free-electrons.com/source/drivers/md/dm-crypt.c?v=3.19#L864
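To put rough numbers on the chunk-size argument, here is a self-contained toy sketch (plain Python; the XOR/SHA-256 "cipher" and the names ops_for/toy_encrypt are invented for illustration, nothing here is dm-crypt or Ceph code). It only shows the arithmetic: at a 512-byte sector granularity a single 4 MiB write pays the per-operation setup cost 8192 times, versus 32 times with a 128 KiB application-layer chunk.

```python
# Toy illustration: how many discrete crypto operations -- each paying
# some setup/context-switch overhead -- a single 4 MiB write needs at
# different chunk sizes. 512 B mirrors dm-crypt's per-sector granularity;
# 128 KiB is an arbitrary application-layer chunk size.

import hashlib

def ops_for(total_bytes: int, chunk_bytes: int) -> int:
    """Number of independent cipher invocations for one write."""
    return (total_bytes + chunk_bytes - 1) // chunk_bytes

def toy_encrypt(data: bytes, key: bytes, chunk: int) -> bytes:
    """Stand-in cipher: XOR with a per-chunk SHA-256 keystream.
    Each chunk triggers a fresh operation, as dm-crypt does per sector."""
    out = bytearray()
    for off in range(0, len(data), chunk):
        block = data[off:off + chunk]
        # keystream derived per chunk -- this models the per-op setup cost
        stream = hashlib.sha256(key + off.to_bytes(8, "big")).digest()
        stream = (stream * (len(block) // 32 + 1))[:len(block)]
        out.extend(b ^ s for b, s in zip(block, stream))
    return bytes(out)

WRITE = 4 * 1024 * 1024
print(ops_for(WRITE, 512))         # 8192 ops at dm-crypt's sector size
print(ops_for(WRITE, 128 * 1024))  # 32 ops at a larger chunk
```

Whether the per-operation cost is a context switch or a hardware-accelerator setup, the ratio between those two counts is the headroom the application-layer approach is pointing at.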

>
> These two points are not necessarily a critique of your proposal.
>
>> Cryptographic keys are stored on the filesystem of the storage node that
>> hosts the OSDs. Changing them requires redeploying the OSDs.
>
> Not very familiar with what deployment technique of dm-crypt you refer
> to (I don't use ceph-deploy personally). But the LUKS FDE suite does
> allow for separating the encryption key from the activation key (or
> whatever it is called).
>
>> The best way to address those issues seems to be introducing
>> encryption into Ceph OSD.
>>
>> ===
>> Detailed Description
>> ===
>>
>> In addition to the currently available solution, the Ceph OSD would
>> accommodate an encryption component placed in the replication mechanisms.
>>
>> Data incoming from Ceph clients would be encrypted by the primary OSD.
>> It would replicate ciphertext to the non-primary members of an acting
>> set. Data sent to a Ceph client would be decrypted by the OSD handling
>> the read operation. This allows us to:
>> * perform only one encryption per write,
>> * achieve per-pool granularity for both the key and the encryption itself.
>
> I.e. the primary OSD's key for the PG in question would be the one used
> for all replicas of the data, per acting set. I.e. a granularity of
> actually one key per acting set, controlled by the primary OSD?
>
>> Unfortunately, having always and everywhere the same key for a given
>> pool is unacceptable - it would make cluster migration and key change
>> an extremely burdensome process. To address those issues, crypto key
>> versioning would be introduced. All RADOS objects 

Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Matt Benjamin
Hi,

Thanks for this detailed response.

- Original Message -
> From: "Lars Marowsky-Bree" <l...@suse.com>
> To: "Ceph Development" <ceph-devel@vger.kernel.org>
> Sent: Tuesday, December 15, 2015 9:23:04 AM
> Subject: Re: Improving Data-At-Rest encryption in Ceph

> 
> It's not yet perfect, but I think the approach is superior to being
> implemented in Ceph natively. If there's any encryption that should be
> implemented in Ceph, I believe it'd be the on-the-wire encryption to
> protect against eavesdroppers.

++

> 
> Other scenarios would require client-side encryption.

++

> 
> > Cryptographic keys are stored on filesystem of storage node that hosts
> > OSDs. Changing them require redeploying the OSDs.
> 
> This is solvable by storing the key on an external key server.

++

> 
> Changing the key is only necessary if the key has been exposed. And with
> dm-crypt, that's still possible - it's not the actual encryption key
> that's stored, but the secret that is needed to unlock it, and that can
> be re-encrypted quite fast. (In theory; it's not implemented yet for
> the Ceph OSDs.)
> 
> 
> > Data incoming from Ceph clients would be encrypted by primary OSD. It
> > would replicate ciphertext to non-primary members of an acting set.
> 
> This still exposes data in coredumps or on swap on the primary OSD, and
> metadata on the secondaries.
> 
> 
> Regards,
> Lars
> 
> --
> Architect Storage/HA
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB
> 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> 


-- 
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-707-0660
fax.  734-769-8938
cel.  734-216-5309
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Gregory Farnum
On Tue, Dec 15, 2015 at 1:58 AM, Adam Kupczyk  wrote:
>
>
> On Mon, Dec 14, 2015 at 9:28 PM, Gregory Farnum  wrote:
>>
>> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
>>  wrote:
>> > Hello Folks,
>> >
>> > I would like to publish a proposal regarding improvements to Ceph
>> > data-at-rest encryption mechanism. Adam Kupczyk and I worked
>> > on that in recent weeks.
>> >
>> > Initially we considered several architectural approaches and made
>> > several iterations of discussions with the Intel storage group. The
>> > proposal is a condensed description of the solution we see as the most
>> > promising one.
>> >
>> > We are open to any comments and questions.
>> >
>> > Regards,
>> > Adam Kupczyk
>> > Radoslaw Zarzynski
>> >
>> >
>> > ===
>> > Summary
>> > ===
>> >
>> > Data-at-rest encryption is a mechanism for protecting the data center
>> > operator from revealing the content of physical carriers.
>> >
>> > Ceph already implements a form of at-rest encryption. It is performed
>> > through dm-crypt as an intermediary layer between an OSD and its
>> > physical storage. The proposed at-rest encryption mechanism will be
>> > orthogonal and, in some ways, superior to the already existing solution.
>> >
>> > ===
>> > Owners
>> > ===
>> >
>> > * Radoslaw Zarzynski (Mirantis)
>> > * Adam Kupczyk (Mirantis)
>> >
>> > ===
>> > Interested Parties
>> > ===
>> >
>> > If you are interested in contributing to this blueprint, or want to be
>> > a "speaker" during the Summit session, list your name here.
>> >
>> > Name (Affiliation)
>> > Name (Affiliation)
>> > Name
>> >
>> > ===
>> > Current Status
>> > ===
>> >
>> > Current data-at-rest encryption is achieved through dm-crypt placed
>> > under the OSD’s filestore. This solution is a generic one and cannot
>> > leverage Ceph-specific characteristics. The best example is that
>> > encryption is done multiple times - once for each replica. Another
>> > issue is the lack of granularity - either the OSD encrypts nothing,
>> > or the OSD encrypts everything (with dm-crypt on).
>> >
>> > Cryptographic keys are stored on the filesystem of the storage node
>> > that hosts the OSDs. Changing them requires redeploying the OSDs.
>> >
>> > The best way to address those issues seems to be introducing
>> > encryption into Ceph OSD.
>> >
>> > ===
>> > Detailed Description
>> > ===
>> >
>> > In addition to the currently available solution, the Ceph OSD would
>> > accommodate an encryption component placed in the replication
>> > mechanisms.
>> >
>> > Data incoming from Ceph clients would be encrypted by the primary
>> > OSD. It would replicate ciphertext to the non-primary members of an
>> > acting set. Data sent to a Ceph client would be decrypted by the OSD
>> > handling the read operation. This allows us to:
>> > * perform only one encryption per write,
>> > * achieve per-pool granularity for both the key and the encryption
>> >   itself.
>> >
>> > Unfortunately, having always and everywhere the same key for a given
>> > pool is unacceptable - it would make cluster migration and key change
>> > an extremely burdensome process. To address those issues, crypto key
>> > versioning would be introduced. All RADOS objects inside a single
>> > placement group stored on a given OSD would use the same crypto key
>> > version. The same PG on another replica may use a different version
>> > of the same, per-pool-granulated key.
>> >
>> > In the typical case, ciphertext data transferred from OSD to OSD can
>> > be used without change. This is when both OSDs have the same crypto
>> > key version for a given placement group. In rare cases when the
>> > crypto keys are different (key change or transition period) the
>> > receiving OSD will recrypt with its local key version.
>>
>> I don't understand this part at all. Do you plan to read and rewrite
>> the entire PG whenever you change the "key version"? How often do you
>> plan to change these keys? What is even the point of changing them,
>> since anybody who can control an OSD can grab the entire current key
>> set?
>
> We envision that key change will happen very infrequently, usually in
> reaction to some possible security breach.
> After the key version is incremented, nothing happens automatically. The
> old key is used for as long as the PG is not empty. When the first RADOS
> object is created, the current key version is locked to the PG.
> There is no solution when someone gets control over an OSD - either by
> running a custom OSD binary or by extracting data while impersonating a
> client. That is outside the scope of at-rest encryption. We only addressed
> cases where storage media somehow leave the datacenter premises. The
> ability to change the key is necessary, since we need a procedure to
> recover data security after keys are compromised.
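The lifecycle described above can be sketched in a few lines (hypothetical names, plain Python, not actual Ceph code): rotating the per-pool key only bumps a version counter, and a PG pins whichever version is current when its first object is written.

```python
# Sketch of the key-version lifecycle: rotation does nothing by itself;
# a PG locks the current version at its first object write and keeps it
# until the PG is emptied.

class PoolKeys:
    def __init__(self):
        self.version = 1          # current per-pool key version

    def rotate(self):
        self.version += 1         # no data is rewritten here

class PG:
    def __init__(self, pool_keys: PoolKeys):
        self.pool_keys = pool_keys
        self.locked_version = None
        self.objects = set()

    def write(self, name: str):
        if not self.objects:      # first object locks the key version
            self.locked_version = self.pool_keys.version
        self.objects.add(name)

    def delete(self, name: str):
        self.objects.discard(name)
        if not self.objects:      # empty PG may pick up a newer key later
            self.locked_version = None

keys = PoolKeys()
pg = PG(keys)
pg.write("obj1")
keys.rotate()                     # version 2 exists, but PG still uses 1
print(pg.locked_version)          # 1
pg.delete("obj1")
pg.write("obj2")                  # emptied PG now locks the new version
print(pg.locked_version)          # 2
```

Nothing is rewritten on rotation; the new version only takes effect for PGs that are (or become) empty.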
>>
>>
>> > For compression to be effective it must be done before encryption. Due
>> > to that 

Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Lars Marowsky-Bree
On 2015-12-14T14:17:08, Radoslaw Zarzynski  wrote:

Hi all,

great to see this revived.

However, I have come to see some concerns with handling the encryption
within Ceph itself.

The key part to any such approach is formulating the threat scenario.
For the use cases we have seen, the data-at-rest encryption matters so
they can confidently throw away disks without leaking data. It's not
meant as a defense against an online attacker. There usually is no
problem with "a few" disks being privileged, or one or two nodes that
need an admin intervention for booting (to enter some master encryption
key somehow, somewhere).

However, that requires *all* data on the OSDs to be encrypted.

Crucially, that includes not just the file system meta data (so not just
the data), but also the root and especially the swap partition. Those
potentially include swapped out data, coredumps, logs, etc.

(As an optional feature, it'd be cool if an OSD could be moved to a
different chassis and continue operating there, to speed up recovery.
Another optional feature would be to eventually be able, for those
customers that trust them ;-), supply the key to the on-disk encryption
(OPAL et al).)

The proposal that Joshua posted a while ago essentially remained based
on dm-crypt, but put in simple hooks to retrieve the keys from some
"secured" server via sftp/ftps instead of loading them from the root fs.
Similar to deo, that ties the key to being on the network and knowing
the OSD UUID.

This would then also be somewhat easily extensible to utilize the same
key management server via initrd/dracut.

Yes, this means that each OSD disk is separately encrypted, but given
modern CPUs, this is less of a problem. It does have the benefit of
being completely transparent to Ceph, and actually covering the whole
node.

Of course, one of the key issues is always the key server.
Putting/retrieving/deleting keys is reasonably simple, but the question
of how to ensure HA for it is a bit tricky. But doable; people have been
building HA ftp/http servers for a while ;-) Also, a single key server
setup could theoretically serve multiple Ceph clusters.

It's not yet perfect, but I think the approach is superior to being
implemented in Ceph natively. If there's any encryption that should be
implemented in Ceph, I believe it'd be the on-the-wire encryption to
protect against eavesdroppers.

Other scenarios would require client-side encryption.

> Current data-at-rest encryption is achieved through dm-crypt placed
> under the OSD’s filestore. This solution is a generic one and cannot
> leverage Ceph-specific characteristics. The best example is that
> encryption is done multiple times - once for each replica. Another
> issue is the lack of granularity - either the OSD encrypts nothing, or
> the OSD encrypts everything (with dm-crypt on).

True. But for the threat scenario, a holistic approach to encryption
seems actually required.

> Cryptographic keys are stored on the filesystem of the storage node that
> hosts the OSDs. Changing them requires redeploying the OSDs.

This is solvable by storing the key on an external key server.

Changing the key is only necessary if the key has been exposed. And with
dm-crypt, that's still possible - it's not the actual encryption key
that's stored, but the secret that is needed to unlock it, and that can
be re-encrypted quite fast. (In theory; it's not implemented yet for
the Ceph OSDs.)


> Data incoming from Ceph clients would be encrypted by the primary OSD.
> It would replicate ciphertext to the non-primary members of an acting set.

This still exposes data in coredumps or on swap on the primary OSD, and
metadata on the secondaries.


Regards,
Lars

-- 
Architect Storage/HA
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Sage Weil
I agree with Lars's concerns: the main problems with the current dm-crypt 
approach are that there isn't any key management integration yet and the 
root volume and swap aren't encrypted. Those are easy to solve (and I'm 
hoping we'll be able to address them in time for Jewel).

On the other hand, implementing encryption within RADOS will be complex, 
and I don't see what the benefits are over whole-disk encryption.  Can 
someone summarize what per-pool encryption keys and the ability to rotate 
keys give us?  If the threat is an attacker who is on the storage network 
and has compromised an OSD, the game is pretty much up...

At a high level, I think almost anything beyond at-rest encryption (that 
is aimed at throwing out disks or physically walking a server out of the 
data center) turns into a key management and threat mitigation design 
nightmare (with few, if any, compelling solutions) until you give up and 
have clients encrypt their data and don't trust the cluster with the keys 
at all...

sage



Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Andrew Bartlett
On Mon, 2015-12-14 at 14:32 -0800, Gregory Farnum wrote:
> On Mon, Dec 14, 2015 at 2:02 PM, Martin Millnert 
> wrote:
> > On Mon, 2015-12-14 at 12:28 -0800, Gregory Farnum wrote:
> > > On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
> > 
> > > > In the typical case, ciphertext data transferred from OSD to OSD
> > > > can be used without change. This is when both OSDs have the same
> > > > crypto key version for a given placement group. In rare cases when
> > > > the crypto keys are different (key change or transition period) the
> > > > receiving OSD will recrypt with its local key version.
> > > 
> > > I don't understand this part at all. Do you plan to read and rewrite
> > > the entire PG whenever you change the "key version"? How often do you
> > > plan to change these keys? What is even the point of changing them,
> > > since anybody who can control an OSD can grab the entire current key
> > > set?
> > 
> > You may have leaked keys without having leaked ciphertext.
> > The typical use case for FDE/SED is IMO being able to RMA drives.
> > Nothing more than that.
> 
> Yeah, but you necessarily need to let people keep using the old key
> *and* give them the new one on-demand if they've got access to the
> system, in order to allow switching to the new key. You need to wait
> for all the data to actually be rewritten with the new key before you
> can consider it secure again, and that'll take a long time. I'm not
> saying there isn't threat mitigation here, just that I'm not sure it's
> useful against somebody who's already obtained access to your
> encryption keys — if they've gotten those it's unlikely they won't
> have gotten OSD keys as well, and if they've got network access they
> can impersonate an OSD and get access to whatever data they like.
> 
> I guess that still protects against an external database hack from
> somebody who gets access to your old hard drives, but...*shrug*

An important part of why we moved to LUKS for dm-crypt keys is that LUKS
does some useful things to allow a form of key rotation.

The master key is never changed (except at reformat), but it also is
never disclosed beyond the host's kernel.  What is stored on the disks
and/or on the key servers is a key-encryption-key.  

The process for rotating the key encryption key is pretty sensible,
given the constraints, because they go to good lengths to rewrite the
blocks where the old KEK encrypted the master key. 
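That property can be modeled in a few lines (toy XOR wrapping keyed by SHA-256; purely illustrative, not LUKS code): the master key that actually encrypted the data never changes, so rotating the key-encryption-key rewrites only the small wrapped-key slot, never the bulk data.

```python
# Toy model of KEK rotation: the master key is fixed at format time;
# only the wrapped copy in the key slot is rewritten under a new KEK.

import os, hashlib

def wrap(master: bytes, kek: bytes) -> bytes:
    """Toy key wrapping: XOR the master key with a SHA-256 pad of the KEK."""
    pad = hashlib.sha256(kek).digest()[:len(master)]
    return bytes(m ^ p for m, p in zip(master, pad))

unwrap = wrap   # XOR wrapping is its own inverse

master = os.urandom(16)                   # never changes after format
slot = wrap(master, b"old-passphrase")    # what is actually stored on disk

# KEK rotation: unwrap with the old KEK, re-wrap with the new one.
# Only this small slot is rewritten; data encrypted under `master` stays put.
slot = wrap(unwrap(slot, b"old-passphrase"), b"new-passphrase")

print(unwrap(slot, b"new-passphrase") == master)   # True
print(unwrap(slot, b"old-passphrase") == master)   # False: old KEK is dead
```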

> Yeah, I'd rather see dm-crypt get done well rather than in-Ceph
> encryption like this. If we want to protect data I think that's a lot
> more secure (and will *stay* that way since encryption is all that
> project does), and adding TLS or similar to the messenger code would
> give us on-the-wire protection from the clients to the disk.
> -Greg

The good reason to use dm-crypt is that novel cryptography is NOT a
good thing.  The dm-crypt stuff is well used and well understood, and
any potential attacks against it are likely to be widely reported and
properly analysed.

Andrew Bartlett

-- 
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team https://samba.org
Samba Development and Support, Catalyst IT   
https://catalyst.net.nz/services/samba








Re: Improving Data-At-Rest encryption in Ceph

2015-12-14 Thread Martin Millnert
On Mon, 2015-12-14 at 14:17 +0100, Radoslaw Zarzynski wrote:
> Hello Folks,
> 
> I would like to publish a proposal regarding improvements to Ceph
> data-at-rest encryption mechanism. Adam Kupczyk and I worked
> on that in recent weeks.
> 
> Initially we considered several architectural approaches and made
> several iterations of discussions with the Intel storage group. The
> proposal is a condensed description of the solution we see as the most
> promising one.
> 
> We are open to any comments and questions.
> 
> Regards,
> Adam Kupczyk
> Radoslaw Zarzynski
> 
> 
> ===
> Summary
> ===
> 
> Data-at-rest encryption is a mechanism for protecting the data center
> operator from revealing the content of physical carriers.
> 
> Ceph already implements a form of at-rest encryption. It is performed
> through dm-crypt as an intermediary layer between an OSD and its physical
> storage. The proposed at-rest encryption mechanism will be orthogonal
> and, in some ways, superior to the already existing solution.
> 
> ===
> Owners
> ===
> 
> * Radoslaw Zarzynski (Mirantis)
> * Adam Kupczyk (Mirantis)
> 
> ===
> Interested Parties
> ===
> 
> If you are interested in contributing to this blueprint, or want to be
> a "speaker" during the Summit session, list your name here.
> 
> Name (Affiliation)
> Name (Affiliation)
> Name
> 
> ===
> Current Status
> ===
> 
> Current data-at-rest encryption is achieved through dm-crypt placed
> under the OSD’s filestore. This solution is a generic one and cannot
> leverage Ceph-specific characteristics. The best example is that
> encryption is done multiple times - once for each replica. Another
> issue is the lack of granularity - either the OSD encrypts nothing, or
> the OSD encrypts everything (with dm-crypt on).

All or nothing is sometimes a desired function of encryption.
"In-betweens" are tricky.

Additionally, dm-crypt is AFAICT fairly performant, since at least
there's no need to context-switch per crypto-op: it sits in the dm
IO path within the kernel.

These two points are not necessarily a critique of your proposal.

> Cryptographic keys are stored on the filesystem of the storage node that
> hosts the OSDs. Changing them requires redeploying the OSDs.

Not very familiar with what deployment technique of dm-crypt you refer
to (I don't use ceph-deploy personally). But the LUKS FDE suite does
allow for separating the encryption key from the activation key (or
whatever it is called).

> The best way to address those issues seems to be introducing
> encryption into Ceph OSD.
> 
> ===
> Detailed Description
> ===
> 
> In addition to the currently available solution, the Ceph OSD would
> accommodate an encryption component placed in the replication mechanisms.
> 
> Data incoming from Ceph clients would be encrypted by the primary OSD.
> It would replicate ciphertext to the non-primary members of an acting
> set. Data sent to a Ceph client would be decrypted by the OSD handling
> the read operation. This allows us to:
> * perform only one encryption per write,
> * achieve per-pool granularity for both the key and the encryption itself.

I.e. the primary OSD's key for the PG in question would be the one used
for all replicas of the data, per acting set. I.e. a granularity of
actually one key per acting set, controlled by the primary OSD?

> Unfortunately, having always and everywhere the same key for a given
> pool is unacceptable - it would make cluster migration and key change
> an extremely burdensome process. To address those issues, crypto key
> versioning would be introduced. All RADOS objects inside a single
> placement group stored on a given OSD would use the same crypto key
> version.

This seems to add key versioning on the primary OSD.

> The same PG on another replica may use a different version of the
> same, per-pool-granulated key.

Attempt to rewrite to see if I parsed correctly: Within a PG's acting
set, a non-primary OSD can use another version of the per-pool key.
That seems fair, to support asynchronous key roll forward/backward.

> In the typical case, ciphertext data transferred from OSD to OSD can be
> used without change. This is when both OSDs have the same crypto key
> version for a given placement group. In rare cases when the crypto keys
> are different (key change or transition period) the receiving OSD will
> recrypt with its local key version.

Doesn't this presume the receiving OSD always has a more up-to-date set
of keys than the sending OSD?
What if the sending OSD has a newer key than the receiving OSD?
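As a sketch of the quoted recrypt path (invented names, toy XOR keystream, plain Python; not Ceph code): the receiving OSD stores ciphertext unchanged when the key versions match, and re-encrypts with its local version when they differ.

```python
# Minimal sketch of the proposed replication path: forward ciphertext
# as-is when key versions match; recrypt under the local version otherwise.

import hashlib

KEYS = {1: b"pool-key-v1", 2: b"pool-key-v2"}   # per-pool key versions

def xor(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher: XOR against a SHA-256-derived pad."""
    pad = hashlib.sha256(key).digest()
    pad = (pad * (len(data) // 32 + 1))[:len(data)]
    return bytes(d ^ p for d, p in zip(data, pad))

def receive(ciphertext: bytes, sender_ver: int, local_ver: int) -> bytes:
    if sender_ver == local_ver:
        return ciphertext                       # typical case: store unchanged
    plain = xor(ciphertext, KEYS[sender_ver])   # key change in progress:
    return xor(plain, KEYS[local_ver])          # recrypt with local version

data = b"rados object payload"
ct1 = xor(data, KEYS[1])
stored = receive(ct1, sender_ver=1, local_ver=2)
print(xor(stored, KEYS[2]) == data)   # True: readable with the local key
```

Note the sketch dodges the question above only by handing the receiver both key versions; in a real design the receiving OSD would need a way to fetch the sender's (possibly newer) version as well.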

> For compression to be effective it must be done before encryption. Due
> to that, encryption may be applied differently for replicated pools
> and EC pools. Replicated pools do not implement compression; for those
> pools encryption is applied right after data enters the OSD. For EC
> pools encryption is applied after compressing. When compression is
> implemented for 

Re: Improving Data-At-Rest encryption in Ceph

2015-12-14 Thread Gregory Farnum
On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
 wrote:
> Hello Folks,
>
> I would like to publish a proposal regarding improvements to Ceph
> data-at-rest encryption mechanism. Adam Kupczyk and I worked
> on that in recent weeks.
>
> Initially we considered several architectural approaches and made
> several iterations of discussions with the Intel storage group. The
> proposal is a condensed description of the solution we see as the most
> promising one.
>
> We are open to any comments and questions.
>
> Regards,
> Adam Kupczyk
> Radoslaw Zarzynski
>
>
> ===
> Summary
> ===
>
> Data-at-rest encryption is a mechanism for protecting the data center
> operator from revealing the content of physical carriers.
>
> Ceph already implements a form of at-rest encryption. It is performed
> through dm-crypt as an intermediary layer between an OSD and its physical
> storage. The proposed at-rest encryption mechanism will be orthogonal
> and, in some ways, superior to the already existing solution.
>
> ===
> Owners
> ===
>
> * Radoslaw Zarzynski (Mirantis)
> * Adam Kupczyk (Mirantis)
>
> ===
> Interested Parties
> ===
>
> If you are interested in contributing to this blueprint, or want to be
> a "speaker" during the Summit session, list your name here.
>
> Name (Affiliation)
> Name (Affiliation)
> Name
>
> ===
> Current Status
> ===
>
> Current data-at-rest encryption is achieved through dm-crypt placed
> under the OSD’s filestore. This solution is a generic one and cannot
> leverage Ceph-specific characteristics. The best example is that
> encryption is done multiple times - once for each replica. Another
> issue is the lack of granularity - either the OSD encrypts nothing, or
> the OSD encrypts everything (with dm-crypt on).
>
> Cryptographic keys are stored on the filesystem of the storage node that
> hosts the OSDs. Changing them requires redeploying the OSDs.
>
> The best way to address those issues seems to be introducing
> encryption into Ceph OSD.
>
> ===
> Detailed Description
> ===
>
> In addition to the currently available solution, the Ceph OSD would
> accommodate an encryption component placed in the replication mechanisms.
>
> Data incoming from Ceph clients would be encrypted by the primary OSD.
> It would replicate ciphertext to the non-primary members of an acting
> set. Data sent to a Ceph client would be decrypted by the OSD handling
> the read operation. This allows us to:
> * perform only one encryption per write,
> * achieve per-pool granularity for both the key and the encryption itself.
>
> Unfortunately, having always and everywhere the same key for a given
> pool is unacceptable - it would make cluster migration and key change
> an extremely burdensome process. To address those issues, crypto key
> versioning would be introduced. All RADOS objects inside a single
> placement group stored on a given OSD would use the same crypto key
> version. The same PG on another replica may use a different version of
> the same, per-pool-granulated key.
>
> In the typical case, ciphertext data transferred from OSD to OSD can be
> used without change. This is when both OSDs have the same crypto key
> version for a given placement group. In rare cases when the crypto keys
> are different (key change or transition period) the receiving OSD will
> recrypt with its local key version.

I don't understand this part at all. Do you plan to read and rewrite
the entire PG whenever you change the "key version"? How often do you
plan to change these keys? What is even the point of changing them,
since anybody who can control an OSD can grab the entire current key
set?

> For compression to be effective it must be done before encryption. Due
> to that, encryption may be applied differently for replicated pools
> and EC pools. Replicated pools do not implement compression; for those
> pools encryption is applied right after data enters the OSD. For EC
> pools encryption is applied after compressing. When compression is
> implemented for replicated pools, it must be placed before encryption.

So this means you'll be encrypting the object data, but not the omap
nor xattrs, and not the file names on disk. Is that acceptable to
people? It's probably fine for a lot of rbd use cases, but not for
RGW, CephFS, nor raw RADOS where meaningful metadata (and even *data*)
is stored in those regions. I'd rather a solution worked on the full
data set. :/
-Greg
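The compress-before-encrypt constraint from the quoted paragraph is easy to demonstrate (zlib plus a toy counter-mode XOR keystream; illustrative only, not real crypto): ciphertext from even a toy stream cipher looks random, so compressing after encryption recovers nothing, while compressing first shrinks redundant data dramatically.

```python
# Illustration of why compression must precede encryption: encrypted
# output is (pseudo)random and therefore incompressible.

import zlib, hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Counter-based SHA-256 keystream (toy; not a vetted cipher)."""
    out = bytearray()
    ctr = 0
    while len(out) < n:                      # fresh block per counter value
        out.extend(hashlib.sha256(key + ctr.to_bytes(8, "big")).digest())
        ctr += 1
    return bytes(out[:n])

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

payload = b"highly repetitive rbd block " * 200   # 5600 bytes, very redundant
key = b"pool-key"

compress_then_encrypt = toy_encrypt(zlib.compress(payload), key)
encrypt_then_compress = zlib.compress(toy_encrypt(payload, key))

print(len(compress_then_encrypt) < len(payload) // 10)   # True: shrinks a lot
print(len(encrypt_then_compress) >= len(payload))        # True: incompressible
```

This is also why the EC-pool write path in the proposal applies encryption only after the compression step.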

>
> Ceph currently has a thin abstraction layer over block ciphers
> (CryptoHandler, CryptoKeyHandler). We want to extend this API to
> introduce initialization vectors, chaining modes and asynchronous
> operations. The implementation of this API may be based on the AF_ALG
> kernel interface. This assures the ability to use the hardware
> accelerations already implemented in the Linux kernel. Moreover, due to
> working on bigger chunks (dm-crypt operates on 512-byte sectors) 

Re: Improving Data-At-Rest encryption in Ceph

2015-12-14 Thread Gregory Farnum
On Mon, Dec 14, 2015 at 2:02 PM, Martin Millnert  wrote:
> On Mon, 2015-12-14 at 12:28 -0800, Gregory Farnum wrote:
>> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski
> 
>> > In the typical case, ciphertext data transferred from OSD to OSD can
>> > be used without change. This is when both OSDs have the same crypto
>> > key version for a given placement group. In rare cases when the crypto
>> > keys are different (key change or transition period) the receiving OSD
>> > will recrypt with its local key version.
>>
>> I don't understand this part at all. Do you plan to read and rewrite
>> the entire PG whenever you change the "key version"? How often do you
>> plan to change these keys? What is even the point of changing them,
>> since anybody who can control an OSD can grab the entire current key
>> set?
>
> You may have leaked keys without having leaked ciphertext.
> The typical use case for FDE/SED is IMO being able to RMA drives.
> Nothing more than that.

Yeah, but you necessarily need to let people keep using the old key
*and* give them the new one on-demand if they've got access to the
system, in order to allow switching to the new key. You need to wait
for all the data to actually be rewritten with the new key before you
can consider it secure again, and that'll take a long time. I'm not
saying there isn't threat mitigation here, just that I'm not sure it's
useful against somebody who's already obtained access to your
encryption keys — if they've gotten those it's unlikely they won't
have gotten OSD keys as well, and if they've got network access they
can impersonate an OSD and get access to whatever data they like.

I guess that still protects against an external database hack from
somebody who gets access to your old hard drives, but...*shrug*
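For what it's worth, the key-version mechanics being debated can be sketched in a few lines (purely illustrative: the names and the XOR stand-in cipher are invented, not Ceph code):

```python
# Hypothetical per-PG key ring: version -> key. After a rotation the OSD
# holds both the old and the current key.
keys = {1: b"old-pg-key", 2: b"new-pg-key"}
current_version = 2

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for a real block cipher; XOR is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def recrypt_if_stale(version: int, blob: bytes):
    """What a receiving OSD would do with replicated ciphertext."""
    if version == current_version:
        return current_version, blob               # common case: pass through
    plain = xor_cipher(keys[version], blob)        # needs the *old* key online
    return current_version, xor_cipher(keys[current_version], plain)
```

Note how the recrypt path requires the old key to stay retrievable until every object has been rewritten, which is exactly the exposure window Greg is pointing at.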

>
>> > For compression to be effective it must be done before encryption.
>> > Because of that, encryption may be applied differently for replicated
>> > pools and EC pools. Replicated pools do not implement compression; for
>> > those pools encryption is applied right after data enters the OSD. For
>> > EC pools encryption is applied after compression. When compression is
>> > implemented for replicated pools, it must be placed before encryption.
>>
>> So this means you'll be encrypting the object data, but not the omap
>> nor xattrs, and not the file names on disk. Is that acceptable to
>> people? It's probably fine for a lot of rbd use cases, but not for
>> RGW, CephFS, nor raw RADOS where meaningful metadata (and even *data*)
>> is stored in those regions. I'd rather a solution worked on the full
>> data set. :/
>> -Greg
>
> This is indeed the largest weakness of the proposal.
>
> I'm missing some of the motivation: what problem does this solution
> solve that a dm-crypt-based solution doesn't address? When, except
> against snooping, is it desirable by design not to encrypt everything?
>
> One could say: "ciphertext would be transferred on the network". But
> that argument is incomplete; internal transport encryption (and better
> auth) for Ceph is a different problem.
>
> I'd probably rather see the dm-crypt key management processes refined
> and improved (saying this without knowing the state of any current
> implementations for Ceph), giving a solid FDE solution, than accept a
> solution that doesn't encrypt metadata. Encrypting the data but not the
> metadata isn't sufficient anymore...

Yeah, I'd rather see dm-crypt done well than in-Ceph encryption like
this. If we want to protect data I think that's a lot more secure (and
will *stay* that way, since encryption is all that project does), and
adding TLS or similar to the messenger code would give us on-the-wire
protection from the clients to the disk.
-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

