[ 
https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356487#comment-17356487
 ] 

Fabio Germann edited comment on LUCENE-9379 at 6/3/21, 2:52 PM:
----------------------------------------------------------------

Thanks [~broustant]/[~bruno.roustant], this is also something that I was 
looking for!

As for [~rcmuir]'s comment(s): I think the important distinction is what the 
encryption is meant to achieve and which guarantees you need.

If one needs tenant-based encryption at rest, OS-level encryption is a valid 
way to go. Likewise, if one needs maximum performance and wants to squeeze 
every last drop out of their NVMe drives, OS-level encryption (or no 
encryption) would probably be best.

BUT: In today's world, some things are sometimes more important (or pose a 
greater risk) to a project or a company, namely user privacy and data 
protection. In such cases decreased performance is certainly acceptable (if not 
already anticipated).

Many of the above arguments against this contribution can be addressed one way 
or another. What can NOT be addressed (and why [~bruno.roustant]'s contribution 
is valuable) is:
 * It allows for the stored content to only be accessible to Lucene (the 
process/thread), for the exact duration that Lucene needs to process the data, 
without any dependency on a downstream component.
 * It allows for platform interoperability/independence. Example: the solution 
can be deployed to Linux systems while being developed on macOS/Windows. 
(Sidenote: This is very important if large teams are building solutions on top 
of this.)
 * It can even offer protection from passive privileged users, meaning that 
the file on the filesystem is not readable by a privileged user. With OS-level 
encryption, by contrast, such protection would be more complex.
 * It allows for simple deployment in container technologies (which would be 
tricky with the alternatives proposed by [~rcmuir])

 

Maybe the increased interest in this topic signals that there is something to 
be done?

Also, recent research has taken note. For example (from the abstract): 
 "[...] However, currently deployed IR technologies, e.g., Apache Lucene - 
open-source search software, are insufficient when the information is 
protected or deemed to be private [...]"
 (Source: https://www.computer.org/csdl/journal/tq/5555/01/08954811/1gs4XOshKHC)


> Directory based approach for index encryption
> ---------------------------------------------
>
>                 Key: LUCENE-9379
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9379
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Bruno Roustant
>            Assignee: Bruno Roustant
>            Priority: Major
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> +Important+: This Lucene Directory wrapper approach is to be considered only 
> if OS-level encryption is not possible. OS-level encryption better fits 
> Lucene's use of the OS cache, and is thus more performant.
> But there are some use cases where OS-level encryption is not possible. This 
> Jira issue was created to address those.
> ____________________________________________
>  
> The goal is to provide optional encryption of the index, with a scope limited 
> to an encryptable Lucene Directory wrapper.
> Encryption is at rest on disk, not in memory.
> This simple approach should fit any Codec as it would be orthogonal, 
> modifying APIs as little as possible.
> Use a standard encryption method. Limit perf/memory impact as much as 
> possible.
> Determine how callers provide encryption keys. They must not be stored on 
> disk.
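
To make the "encrypted at rest, keys supplied by the caller" idea concrete, 
here is a minimal sketch. It is not the patch attached to this issue: it 
assumes AES in CTR mode (the description only says "a standard encryption 
method"), and the names EncryptedAtRestSketch and crypt are hypothetical. It 
uses only the JDK's javax.crypto API, not Lucene, and tries to show why a 
counter mode could suit a Directory wrapper: any byte range can be decrypted 
without reading the file from the start.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical sketch, not part of Lucene or of the LUCENE-9379 patch.
class EncryptedAtRestSketch {
    static final int BLOCK = 16; // AES block size in bytes

    // Encrypts or decrypts `data` as the bytes located at `offset` (>= 0) in the file.
    // In CTR mode both directions are the same XOR with the key stream.
    static byte[] crypt(byte[] key, byte[] iv, long offset, byte[] data) {
        // Add the number of skipped whole blocks to the 128-bit big-endian counter.
        byte[] counter = iv.clone();
        long blocks = offset / BLOCK;
        int carry = 0;
        for (int i = BLOCK - 1; i >= 0; i--) {
            int sum = (counter[i] & 0xFF) + (int) (blocks & 0xFF) + carry;
            counter[i] = (byte) sum;
            carry = sum >>> 8;
            blocks >>>= 8;
        }
        // Pad with the bytes skipped inside the first block, then drop them again.
        int skip = (int) (offset % BLOCK);
        byte[] padded = new byte[skip + data.length];
        System.arraycopy(data, 0, padded, skip, data.length);
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(counter));
            byte[] out = cipher.doFinal(padded);
            return Arrays.copyOfRange(out, skip, out.length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];   // supplied by the caller, never written to disk
        byte[] iv = new byte[BLOCK]; // not secret, could be stored in a file header
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] plain = "term dictionary, postings, stored fields ..."
                .getBytes(StandardCharsets.UTF_8);
        byte[] onDisk = crypt(key, iv, 0, plain);

        // Random access: decrypt bytes 17..29 without touching the rest.
        byte[] slice = crypt(key, iv, 17, Arrays.copyOfRange(onDisk, 17, 30));
        System.out.println(new String(slice, StandardCharsets.UTF_8));
    }
}
```

In this sketch the key exists only in memory and is passed in by the caller. 
A real Directory wrapper would additionally need a fresh IV per file, since 
reusing a key/IV pair in CTR mode leaks information about the plaintext.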


