I think adding encryption support to Lucene fields is a bad idea, for the
same reasons that adding compression was a bad idea (see the concluding
comments at the tail of this issue:
http://issues.apache.org/jira/browse/LUCENE-648?page=all). Users can already
achieve this with binary fields. A contrib module with utility methods might
be a compromise that preserves this work and makes it accessible to others;
alternatively, a FAQ entry with the sample code, or references to it, would
do.
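
As a rough sketch of the binary-field route: the application encrypts the
value itself and hands Lucene only the resulting bytes. The class name
FieldCrypto, the choice of AES/GCM and the IV-prefix layout are assumptions
made for illustration (they are not taken from the attached patch), and the
code uses a current JDK rather than Java 1.4.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: encrypts a field value outside Lucene so that the
// result can be stored as an ordinary binary field.
final class FieldCrypto {
    private static final int IV_BYTES = 12;   // assumed IV size for GCM
    private static final int TAG_BITS = 128;  // assumed authentication tag size
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh 128-bit AES key (key management is out of scope here).
    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns IV || ciphertext+tag, ready to be stored as a byte[] value.
    static byte[] encrypt(String plaintext, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.allocate(iv.length + body.length).put(iv).put(body).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(): reads the IV from the front, then decrypts the rest.
    static String decrypt(byte[] stored, SecretKey key) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] stored = encrypt("sensitive value", key);
        // In Lucene 2.0 the bytes should be storable with something like
        //   new Field("secret", stored, Field.Store.YES)
        // and readable via Document.getBinaryValue("secret"); those calls
        // are not exercised in this sketch.
        System.out.println(decrypt(stored, key));
    }
}
```

A value stored this way is not searchable, which is the trade-off of keeping
encryption entirely outside the index.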
Luke

On 11/29/06, negrinv <[EMAIL PROTECTED]> wrote:


Attached are proposed modifications to Lucene 2.0 to support
Field.Store.Encrypted.
The rationale behind this proposal is simple. Since Lucene can store data in
the index, it effectively makes the data portable. Some of that data may be
sensitive in nature, hence the option to encrypt it.
Both the data and its index are encrypted in this implementation.
This is only an initial implementation. It has the following restrictions,
all of which can be resolved if required, albeit with some effort and more
changes to Lucene:
1) Binary and compressed fields cannot also be encrypted (plaintext, once
encrypted, becomes binary).
2) Field.Store.Encrypted implies Field.Store.YES.
This makes sense, but it forces one to store the data in the same index where
the tokens are stored. At times it may be preferable to have two indexes,
one for the tokens and the other for the data.
3) As implemented, it uses RC4 encryption from BouncyCastle, an open source
package that is very simple to use. RC4 has the advantage of guaranteeing
that the length of the encrypted field is the same as that of the original
plaintext. As of Java 1.5 (5.0) Sun provides an RC4 equivalent in its Java
Cryptography Extension, but unfortunately not in Java 1.4.
The BouncyCastle RC4 is not the only algorithm available; others that do not
depend on third-party code can be used, but it was the simplest to implement
for this first attempt.
4) The attachments are modifications in diff form, based on an early (I
think August or September '06) repository snapshot of Lucene 2.0 that was
subsequently updated from the Lucene repository on 29/11/06. They may need
some additional work to merge with the latest version in the Lucene
repository. They also include a couple of JUnit test programs which explain,
as well as test, the usage. You will need the BouncyCastle .jar
(bcprov-jdk14-134.jar) to run them. I did not attach it, to minimize the size
of the attachments, but it can be downloaded free from:
http://www.bouncycastle.org/latest_releases.html

5) Searching an encrypted field is restricted to single terms (no phrase or
boolean searches are allowed yet), and the application has to encrypt the
term before searching for it (see the attached JUnit test programs).
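
Restrictions 3) and 5) can be illustrated with the JCE cipher mentioned
above instead of BouncyCastle. This is only a sketch under stated
assumptions: the class name Rc4TermDemo, the fixed demo key, and the idea
that each term is encrypted on its own with the same key are illustrative
guesses, not a description of the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo of two properties the proposal relies on:
// RC4 output has the same length as its input, and encrypting the same
// term with the same key always gives the same bytes, so an encrypted
// query term can be compared with an encrypted indexed term.
final class Rc4TermDemo {

    // Runs RC4 over the input. RC4 is symmetric, so the same call also decrypts.
    static byte[] rc4(byte[] key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("ARCFOUR"); // JCE name for RC4
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "ARCFOUR"));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts a single search term (assumed to be encoded as UTF-8).
    static byte[] encryptTerm(byte[] key, String term) {
        return rc4(key, term.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Demo key only; a real application would manage its key securely.
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);

        byte[] indexed = encryptTerm(key, "lucene"); // done at indexing time
        byte[] queried = encryptTerm(key, "lucene"); // done by the application at search time

        System.out.println("same length as plaintext: " + (indexed.length == "lucene".length()));
        System.out.println("query matches index:      " + Arrays.equals(indexed, queried));
    }
}
```

Because the key stream restarts for every term in this sketch, equal
plaintext terms yield equal ciphertext, which is what makes single-term
lookup possible and also why phrase and boolean queries need extra work.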

To the extent that I have tested it, the code works as intended and does not
appear to introduce any regressions, but more testing by others would be
desirable.
I don't propose to do any further work on these API extensions at this stage
unless there is some expression of interest and direction from the Lucene
developers team. I have an application ready to roll which uses the proposed
Lucene encryption API additions (please see
http://www.kbforge.com/index.html). The application is not yet available for
download, simply because I am not sure whether the Lucene licence allows me
to release it. I would appreciate your advice in this regard. My application
is free, but its source code is not available (yet). I should add that
encryption does not have to be an integral part of Lucene; it can be just
part of the end application. Still, it seems to me that
Field.Store.Encrypted belongs in the same category as compression and binary
values.
I would be happy to receive your feedback.

victor negrin

http://www.nabble.com/file/4376/luceneDiff2.txt luceneDiff2.txt
http://www.nabble.com/file/4377/TestEncryptedDocument.java TestEncryptedDocument.java
http://www.nabble.com/file/4378/TestDocument.java TestDocument.java
--
View this message in context:
http://www.nabble.com/Attached-proposed-modifications-to-Lucene-2.0-to-support-Field.Store.Encrypted-tf2727614.html#a7607415
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

