[ 
https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136783#comment-17136783
 ] 

Bruno Roustant edited comment on LUCENE-9379 at 6/23/20, 12:51 PM:
-------------------------------------------------------------------

So I plan to implement an EncryptingDirectory extending FilterDirectory.

 

+Encryption method:+

AES CTR (counter)
 * This mode is approved by NIST. 
([https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Counter_.28CTR.29])
 * In CTR mode the AES ciphertext has the same size as the original clear text 
(no padding). So we can use the same file pointers.
 * CTR mode allows random access to encrypted blocks (128-bit blocks).
 * The IV (initialisation vector) must be random, and is stored at the beginning 
of the encrypted file because it can be public. No need to repeat the IV for 
each block (less disk impact compared to CBC mode).
 * It is appropriate to encrypt streams.
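
To illustrate the size and random-access properties listed above, here is a minimal sketch using only the JDK (javax.crypto). The class and method names are hypothetical and not part of any planned Lucene API; it assumes the counter for block n is (IV + n) mod 2^128, which matches how the JDK's default provider advances the counter in "AES/CTR/NoPadding".

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only, not the EncryptingDirectory implementation.
final class AesCtrRandomAccessSketch {
  static final int BLOCK_SIZE = 16; // AES block size in bytes (128 bits)

  /** Counter block for a block index (assumed >= 0): (iv + blockIndex) mod 2^128, big-endian. */
  static byte[] counterFor(byte[] iv, long blockIndex) {
    byte[] raw = new BigInteger(1, iv).add(BigInteger.valueOf(blockIndex)).toByteArray();
    byte[] out = new byte[BLOCK_SIZE];
    int len = Math.min(raw.length, BLOCK_SIZE);
    // Keep only the low 16 bytes, right-aligned.
    System.arraycopy(raw, raw.length - len, out, BLOCK_SIZE - len, len);
    return out;
  }

  /**
   * Encrypts or decrypts (the same operation in CTR mode) data located at the
   * given offset of the clear text, without processing the preceding bytes.
   */
  static byte[] crypt(byte[] key, byte[] iv, long offset, byte[] data) {
    try {
      Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
          new IvParameterSpec(counterFor(iv, offset / BLOCK_SIZE)));
      // Align on the block boundary: process 'skip' dummy bytes, then drop them.
      int skip = (int) (offset % BLOCK_SIZE);
      byte[] aligned = new byte[skip + data.length];
      System.arraycopy(data, 0, aligned, skip, data.length);
      byte[] out = cipher.doFinal(aligned);
      return Arrays.copyOfRange(out, skip, out.length);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    SecureRandom random = new SecureRandom();
    byte[] key = new byte[16];
    byte[] iv = new byte[BLOCK_SIZE];
    random.nextBytes(key);
    random.nextBytes(iv); // random IV, can be stored in clear with the file

    byte[] clear = "some index bytes that do not fill whole blocks"
        .getBytes(StandardCharsets.UTF_8);
    byte[] encrypted = crypt(key, iv, 0, clear);
    System.out.println("same size: " + (encrypted.length == clear.length));

    // Decrypt only the bytes from offset 21 onwards.
    int offset = 21;
    byte[] tail = crypt(key, iv, offset,
        Arrays.copyOfRange(encrypted, offset, encrypted.length));
    System.out.println("tail: " + new String(tail, StandardCharsets.UTF_8));
  }
}
```

Because no padding is involved, an offset in the clear text maps directly to the same offset in the encrypted data (plus the length of the stored IV, if it is written at the start of the file).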

 

+API:+ 

I don’t anticipate any API change.

 

+How to provide encryption keys:+

EncryptingDirectory would require a delegate Directory, an encryption key 
supplier, and a Cipher pool (for performance).
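
As a rough idea of what the Cipher pool could look like, here is a minimal JDK-only sketch. The class name and its acquire/release methods are assumptions made for illustration; the point is only that Cipher.getInstance() is relatively costly, so instances are reused and re-initialized by the caller with the key and IV of each file.

```java
import java.security.GeneralSecurityException;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.crypto.Cipher;

// Hypothetical illustration only: a thread-safe pool of reusable AES/CTR ciphers.
final class CipherPoolSketch {
  private final ConcurrentLinkedQueue<Cipher> idle = new ConcurrentLinkedQueue<>();

  /** Returns an idle Cipher if one is available, otherwise creates a new one. */
  Cipher acquire() {
    Cipher cipher = idle.poll();
    if (cipher != null) {
      return cipher;
    }
    try {
      return Cipher.getInstance("AES/CTR/NoPadding");
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Gives the Cipher back; the next user must call init() with its own key and IV. */
  void release(Cipher cipher) {
    idle.offer(cipher);
  }

  public static void main(String[] args) {
    CipherPoolSketch pool = new CipherPoolSketch();
    Cipher first = pool.acquire();
    pool.release(first);
    System.out.println("reused: " + (pool.acquire() == first));
  }
}
```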

For the callers to pass the encryption keys, I see two ways:

1- In Solr, declare a DirectoryFactory in solrconfig.xml that creates 
EncryptingDirectory. This factory is able to determine the encryption key per 
file based on the path. It is the responsibility of this factory to access the 
keys (e.g. stored in a secure DB, received with an admin handler, read from 
properties, etc.). The Cipher pool is held by the DirectoryFactory.
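
The per-file key lookup in option 1 could be sketched as follows. This is a hypothetical, JDK-only illustration: the class name, the prefix-matching rule, the example paths and the convention that a null key means "not encrypted" are all assumptions, and a real factory would obtain the keys from a secure source rather than from a map filled in by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: resolves an encryption key from a file path.
final class PathKeySupplierSketch {
  // Keys live in memory only; they are never written to disk by this class.
  private final Map<String, byte[]> keysByPathPrefix = new LinkedHashMap<>();

  /** Associates a key with every file whose path starts with the given prefix. */
  void register(String pathPrefix, byte[] key) {
    keysByPathPrefix.put(pathPrefix, key.clone());
  }

  /**
   * Returns the key of the first registered prefix matching the path, or null
   * when no prefix matches (assumed here to mean the file is left unencrypted).
   */
  byte[] keyFor(String path) {
    for (Map.Entry<String, byte[]> entry : keysByPathPrefix.entrySet()) {
      if (path.startsWith(entry.getKey())) {
        return entry.getValue().clone();
      }
    }
    return null;
  }

  public static void main(String[] args) {
    PathKeySupplierSketch supplier = new PathKeySupplierSketch();
    supplier.register("/var/solr/data/collectionA/", new byte[16]); // dummy key
    System.out.println(
        supplier.keyFor("/var/solr/data/collectionA/index/_0.cfs") != null);
    System.out.println(supplier.keyFor("/var/solr/data/other/index/_0.cfs") != null);
  }
}
```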

2- More generally the EncryptingDirectory can be created to wrap a Directory 
when opening a segment (e.g. in PostingsFormat/DocValuesFormat 
fieldsConsumer()/fieldsProducer(), in StoredFieldFormat 
fieldsReader()/fieldsWriter(), etc). In this case the 
PostingsFormat/DocValuesFormat/StoredFieldFormat extension determines the 
encryption key based on the SegmentInfo. A custom Codec can be created to 
handle encrypting formats. The Cipher pool is held either in the Codec or in 
the Format.

 

+Code:+

I will take inspiration from Apache commons-crypto CtrCryptoOutputStream, 
although not using it directly because it is an OutputStream while we need an 
IndexOutput. And we can probably simplify it, since our use-case is narrower 
than the general-purpose usage this library has to support.


> Directory based approach for index encryption
> ---------------------------------------------
>
>                 Key: LUCENE-9379
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9379
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Bruno Roustant
>            Assignee: Bruno Roustant
>            Priority: Major
>
> The goal is to provide optional encryption of the index, with a scope limited 
> to an encryptable Lucene Directory wrapper.
> Encryption is at rest on disk, not in memory.
> This simple approach should fit any Codec as it would be orthogonal, without 
> modifying APIs as much as possible.
> Use a standard encryption method. Limit perf/memory impact as much as 
> possible.
> Determine how callers provide encryption keys. They must not be stored on 
> disk.


