Provision of encryption/decryption services API to support Field.Store.Encrypted
--------------------------------------------------------------------------------

                 Key: LUCENE-737
                 URL: http://issues.apache.org/jira/browse/LUCENE-737
             Project: Lucene - Java
          Issue Type: New Feature
          Components: Index, Search, Store
    Affects Versions: 2.0.0, 2.0.1
            Reporter: victor negrin


Attached are proposed modifications to Lucene 2.0 to support 
Field.Store.Encrypted.
The rationale behind this proposal is simple. Since Lucene can store data in the 
index, it effectively makes the data portable. It is conceivable that some of 
the data may be sensitive in nature, hence the option to encrypt it. Both the 
data and its index are encrypted in this implementation.
This is only an initial implementation. It has the following restrictions, all 
of which can be resolved if required, albeit with some effort and more changes 
to Lucene:
1) Binary and compressed fields cannot also be encrypted (plaintext, once 
encrypted, becomes binary).
2) Field.Store.Encrypted implies Field.Store.YES.
This makes sense, but it forces one to store the data in the same index where 
the tokens are stored. It may be preferable at times to have two indexes, one 
for tokens and the other for the data.
3) As implemented, it uses RC4 encryption from BouncyCastle. This is an open 
source package that is very simple to use. RC4, being a stream cipher, has the 
advantage of guaranteeing that the length of the encrypted field is the same as 
that of the original plaintext. As of Java 1.5 (5.0), Sun provides an RC4 
equivalent in its Java Cryptography Extension, but unfortunately not in Java 
1.4.
The BouncyCastle RC4 is not the only algorithm available; others that do not 
depend on third-party code can be used, but it was the simplest to implement 
for this first attempt.
4) The attachments are modifications in diff form based on an early (I think 
August or September '06) repository snapshot of Lucene 2.0, subsequently 
updated from the Lucene repository on 29/11/06. They may need some additional 
work to merge with the latest version in the Lucene repository. They also 
include a couple of JUnit test programs which explain, as well as test, the 
usage. You will need the BouncyCastle .jar (bcprov-jdk14-134.jar) to run them. 
I did not attach it, to minimize the size of the attachments, but it can be 
downloaded free from:
 http://www.bouncycastle.org/latest_releases.html
 
5) Searching an encrypted field is restricted to single terms. Phrase and 
boolean searches are not allowed yet, and the term has to be encrypted by the 
application before searching for it (see the attached JUnit test programs).
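
To illustrate restrictions 3 and 5, here is a minimal sketch that is not taken 
from the attached patch. It assumes the JCE "ARCFOUR" cipher (the RC4 
equivalent available from Java 1.5 on) in place of the BouncyCastle engine, 
and it uses a plain HashMap as a stand-in for the index, so no Lucene API is 
involved. The class name, the helper names rc4 and encryptTerm, the Base64 
encoding of terms and the demo key are all hypothetical. It shows that the 
ciphertext has the same length as the plaintext, and that a single term can be 
looked up only after the application has encrypted it with the same key.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; none of these names exist in the patch.
final class EncryptedTermSketch {

    // Applies RC4 with the given key. RC4 is a stream cipher, so the same
    // call both encrypts and decrypts, and the output length equals the
    // input length.
    static byte[] rc4(byte[] key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("ARCFOUR");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "ARCFOUR"));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("ARCFOUR cipher unavailable", e);
        }
    }

    // Encrypts a single term and encodes it as text so that it can be used
    // as a lookup key (the text encoding is an assumption of this sketch).
    static String encryptTerm(byte[] key, String term) {
        byte[] encrypted = rc4(key, term.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(encrypted);
    }

    public static void main(String[] args) {
        byte[] key = "demo-key-16bytes".getBytes(StandardCharsets.UTF_8);

        // Restriction 3: ciphertext and plaintext have the same length.
        byte[] plain = "sensitive value".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = rc4(key, plain);
        System.out.println("same length: " + (encrypted.length == plain.length));
        System.out.println("round trip:  " + Arrays.equals(rc4(key, encrypted), plain));

        // Restriction 5: the "index" holds only encrypted terms, so the
        // application must encrypt the query term before the lookup.
        Map<String, String> index = new HashMap<>();
        index.put(encryptTerm(key, "lucene"), "doc-1");

        System.out.println("plaintext lookup: " + index.get("lucene"));
        System.out.println("encrypted lookup: " + index.get(encryptTerm(key, "lucene")));
    }
}
```

Because the same key is reused for every term in this sketch, equal terms 
always encrypt to equal values. That is what makes the exact-match lookup 
possible, and it is also why only single-term matching falls out naturally 
here.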

To the extent that I have tested it, the code works as intended and does not 
appear to introduce any regression problems, but more testing by others would 
be desirable.
I don't propose at this stage to do any further work on these API extensions 
unless there is some expression of interest and direction from the Lucene 
developers team. I have an application ready to roll which uses the proposed 
Lucene encryption API additions (please see http://www.kbforge.com/index.html). 
The application is not yet available for downloading simply because I am not 
sure if the Lucene licence allows me to do so. I would appreciate your advice 
in this regard. My application is free but its source code is not available 
(yet). I should add that encryption does not have to be an integral part of 
Lucene; it can be just part of the end application. Still, it seems to me that 
Field.Store.Encrypted belongs in the same category as compression and binary 
values.
I would be happy to receive your feedback.

victor negrin 
