I've also observed that HDFS supports client-provided encryption... or so I
recall from when I looked many months ago.  Someone ought to do a blog/write-up
on that.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Mar 15, 2023 at 5:28 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Btw, +1 to the initiative. I've heard of clients using encrypted HDFS for
> these use cases. Direct support at the Lucene/Solr level is much better.
>
> On Wed, 15 Mar, 2023, 2:52 pm Ishan Chattopadhyaya, <
> ichattopadhy...@gmail.com> wrote:
>
> > Does it need to be a first party project?
> >
> > On Wed, 15 Mar, 2023, 2:46 pm Bruno Roustant, <broust...@apache.org>
> > wrote:
> >
> >> Hi,
> >>
> >> I pushed a PR <https://github.com/apache/solr-sandbox/pull/51> in
> >> solr-sandbox <https://github.com/apache/solr-sandbox> to propose a
> >> Java-level encryption for Solr.
> >> This work is the follow up of LUCENE-9379
> >> <https://issues.apache.org/jira/projects/LUCENE/issues/LUCENE-9379>.
> >>
> >> To give some details, here is the overview section of the ENCRYPTION.md
> >> <https://github.com/apache/solr-sandbox/blob/e422e3dd4febab54ba9a8d965189b38217552b46/ENCRYPTION.md>
> >> file in this PR:
> >>
> >> This solution provides the encryption of the Lucene index files at the
> >> Java level.
> >> It encrypts all (or some of) the files in a given index with a provided
> >> encryption key.
> >> It stores the id of the encryption key in the commit metadata (and
> >> obviously the key secret is never stored). It is possible to define a
> >> different key per Solr Core.
> >> This module also provides an EncryptionRequestHandler so that a client
> >> can trigger the (re)encryption of a Solr Core index. The (re)encryption
> >> is done concurrently while the Solr Core can continue to serve update
> >> and query requests.
> >>
> >> Compared with OS-level encryption:
> >>
> >> - OS-level encryption [1][2] is more performant and better suited to
> >> letting Lucene leverage the OS memory cache. It can manage encryption at
> >> the block or filesystem level in the OS. This makes it possible to
> >> encrypt with different keys per directory, making multi-tenant use cases
> >> possible.
> >> If you can use OS-level encryption, prefer it and skip this Java-level
> >> encryption.
> >>
> >> - Java-level encryption can be used when the OS-level encryption
> >> management is not possible (e.g. host machine managed by a cloud
> >> provider). It has an impact on performance: expect -20% on most queries,
> >> -60% on multi-term queries.
> >>
> >> [1] https://wiki.archlinux.org/title/Fscrypt
> >> [2] https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html
> >>
> >> - Bruno
> >>
> >
>
