> The problem with encrypted file systems is that if someone gets access to
> the file system (not the disk, the file system, e.g. via ssh), it is wide
> open to it. It's like my work laptop's disk is encrypted, but after I've
> entered my password, all files are readable to me. However, files that are
> password protected, aren't, and that's what security experts want - that
> even if an attacker stole the machine and has all the passwords and the
> time in the world, without the public/private key of the encrypted index,
> he won't be able to read it. I'm not justifying it, just repeating what I
> was told. Even though I think it's silly - if someone managed to get a hold
> of the machine, the login password, root access... what are the chances he
> doesn't already have the other keys?
>

I was rather assuming an encrypted filesystem (a partition if you like)
that is only available to the specific system user that our application
runs as. This filesystem would only hold the Lucene indexes; it would not
be a general-purpose system boot filesystem as you are describing.


> Anyway, we're here to solve the technical problem, and we obviously aren't
> the ones making these decisions, and it's futile attempting to argue with
> security folks, so let's address the question of how to achieve encryption.
>

I'm not a security person myself, though some of the responders might be.
I am just trying to deliver a requirement, and have been told by the client
that the suggested encrypted filesystem approach is not good enough.


> I wouldn't go with a Codec, personally, to achieve encryption. It's over
> complicated IMO. Rather an encrypted Directory is a simpler solution. You
> will need to implement an EncryptingIndexOutput and a matching
> DecryptingIndexInput, but that's more or less it. The encryption/decryption
> happens in buffers, so you will want to extend the respective BufferedIO
> classes. The issues mentioned above should give you a head start, even
> though the patches are old and likely don't compile against new versions,
> but they contain the gist of it.
>

Thanks, I will take a look. At the moment I am mainly just trying to
understand whether it is even possible. It is unlikely that the client will
sign off on any real development work on this until the New Year; if they
do sign off, expect some more questions to the list from me :-p
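
To check my understanding of "encryption/decryption happens in buffers",
here is a minimal sketch using only javax.crypto. It deliberately does not
touch the Lucene classes, since their signatures differ between versions.
The class name CtrBufferCipher and its apply() method are hypothetical, and
AES in CTR mode is my own assumption: I picked it because any buffer can be
decrypted on its own from its file offset, which a seekable index input
would need.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.math.BigInteger;
import java.security.GeneralSecurityException;
import java.util.Arrays;

/** Hypothetical helper (not a Lucene class): AES-CTR over one buffer at a file offset. */
final class CtrBufferCipher {
    private static final int BLOCK = 16; // AES block size in bytes

    private final SecretKeySpec key;
    private final byte[] baseIv; // 16-byte initial counter; must be unique per file

    CtrBufferCipher(byte[] rawKey, byte[] baseIv) {
        this.key = new SecretKeySpec(rawKey, "AES");
        this.baseIv = baseIv.clone();
    }

    /** CTR is symmetric, so this one call both encrypts and decrypts. */
    byte[] apply(byte[] data, long fileOffset) {
        long blockIndex = fileOffset / BLOCK;  // AES block containing the offset
        int skip = (int) (fileOffset % BLOCK); // position inside that block
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key,
                    new IvParameterSpec(counterFor(blockIndex)));
            // Prepend 'skip' dummy bytes so the keystream lines up with the offset.
            byte[] padded = new byte[skip + data.length];
            System.arraycopy(data, 0, padded, skip, data.length);
            byte[] out = cipher.doFinal(padded);
            return Arrays.copyOfRange(out, skip, out.length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-CTR unavailable or bad key/IV", e);
        }
    }

    /** baseIv + blockIndex as a 128-bit big-endian counter (wraps on overflow). */
    private byte[] counterFor(long blockIndex) {
        BigInteger ctr = new BigInteger(1, baseIv).add(BigInteger.valueOf(blockIndex));
        byte[] raw = ctr.toByteArray();
        byte[] iv = new byte[BLOCK];
        int len = Math.min(raw.length, BLOCK);
        System.arraycopy(raw, raw.length - len, iv, BLOCK - len, len);
        return iv;
    }
}
```

If I have this right, the output side would call apply() on each buffer
before flushing it, and the input side would call it on each buffer after
reading, passing the buffer's start position in the file. Note that CTR on
its own gives no integrity protection, and reusing an IV across files under
the same key would leak information.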


> Just make sure your application, or actually the process running Lucene,
> receives the public/private key in a non-obvious way, so that if someone
> does get a hold of the machine, he can't obtain that information!
>
OK, of course I will try to protect my app and the paths to and from it.
However, I assume that if someone gets root access to the server, they can
just dump the server's RAM to a disk file and get at whatever keys happen to
be in RAM anyway, and I can't really protect against that.

> Also, as for encrypting the terms themselves, beyond the problems
> mentioned above about wildcard queries, there is the risk of someone
> guessing the terms based on their statistics. If the attacker knows the
> corpus domain, I assume it shouldn't be hard for him to guess that a
> certain word with a high DF and TF is probably "the" and proceed from there.
>

Given that my client doesn't seem to understand that this is probably not a
good idea, I think the risk that someone might use statistical analysis to
guess terms and potentially decrypt the index will be of little concern to
them (even if I explain it).
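
For when I do have to explain it, here is a small sketch of the statistical
problem. It assumes the terms are encrypted deterministically (the same
term always yields the same ciphertext), which exact-match lookups in the
term dictionary would seem to require. AES/ECB stands in for whatever
scheme would actually be used, and the class and method names are made up
for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical demo: deterministic term encryption keeps document frequencies intact. */
final class TermFrequencyLeak {

    /** Deterministic encryption of one term (assumption: AES/ECB as a stand-in). */
    static String encryptTerm(byte[] key, String term) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
            byte[] ct = cipher.doFinal(term.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(ct);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Document frequency per encrypted term: what an attacker could count from the index. */
    static Map<String, Integer> documentFrequencies(byte[] key, List<String> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (String doc : docs) {
            Set<String> seen = new HashSet<>();
            for (String term : doc.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!term.isEmpty()) {
                    seen.add(encryptTerm(key, term));
                }
            }
            for (String token : seen) {
                df.merge(token, 1, Integer::sum); // count each document once per term
            }
        }
        return df;
    }

    /** The encrypted term with the highest document frequency. */
    static String mostFrequent(Map<String, Integer> df) {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

The attacker never needs the key for the counting step: the frequencies can
be read straight off the encrypted index. On an English corpus the
top-ranked token is then a strong candidate for "the", and the guessing
proceeds from there.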


> Again, I'm no security expert and I've learned it's sometimes futile
> trying to argue with them. If you can convince them though that the system
> as a whole is protected enough, and if breached an encrypted index is
> likely already breached too, you can avoid the complexity. From my
> experience, encryption hurts performance, but you can improve that by e.g.
> buffering parts unencrypted, but then you also need to prove your program's
> memory is protected...
>
Mostly understood, but can you elaborate on "prove your program's memory is
protected"?


Thanks

-- 
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk
