On Friday 1 December 2006 01:33, negrinv wrote:
> Thank you Robert for your comments. I am inclined to agree with you, but I
> would like to establish first of all if simplicity of implementation is the
> overriding consideration. But before I dwell on that let me say that I have
> discovered that I am not a master of DIFF file creation with Eclipse. The
> diff file attachment to my original posting is absurdly large and not
> correct. I have therefore attached a zip file containing the complete
> source code of the classes I modified. I leave it to others to extract the
> diffs properly.
> Back to the issue. So far the implementation has not been difficult
> considering that I knew nothing about Lucene internals before I started.
> The reason is that Lucene is very well structured and the changes just
> fitted nicely by adding some code in the right place with minimal changes
> to the existing code. But I admit that the proposed implementation so far
> is not complete and more work is required to overcome some of its
> restrictions. While I like your idea I believe that it imposes too large a
> granularity on the encrypted data: all fields with all kinds of data will
> be encrypted, including images and others which normally would be left
> alone, thus adding to the performance penalty due to encryption.

I don't agree with you here. In Lucene, you will encrypt the field data, the
field names, and the tokens: I would say that this represents at least 2/3 of
the index size. Then, with the implementation you suggest, I think (sorry, I
didn't take the time to look at your patch) that every time Lucene data needs
to be read, it is decrypted again. With an encrypted FS, your kernel will
maintain a cache in RAM for you, so it won't hurt so much. It needs some
benchmarking to see which is actually best, but I doubt that your solution
will be faster.

Nicolas.

> Many
> hardware devices and most operating systems already provide directory or
> file system encryption, therefore that level of encryption appears to me an
> unnecessary addition to Lucene. Encryption at field level however is not
> provided by anything I know. The key in my opinion is to decide what is
> best from the end user point of view, but perhaps we need more discussion
> on this.
> Victor
>
> http://www.nabble.com/file/4390/LuceneEncryptionMods.zip
> LuceneEncryptionMods.zip
>
> Robert Engels wrote:
> > I think a simpler solution would be to create an EncryptedDirectory
> > implementation of Directory, which requires a password to open/modify the
> > directory.
> >
> > Far simpler, and if you are using encryption to begin with, you are
> > probably encrypting most of the data anyway.
> >
> > -----Original Message-----
> >
> >>From: negrinv <[EMAIL PROTECTED]>
> >>Sent: Nov 29, 2006 9:45 PM
> >>To: java-dev@lucene.apache.org
> >>Subject: Re: Attached proposed modifications to Lucene 2.0 to support
> >>Field.Store.Encrypted
> >>
> >>Thank you Luke for your comments and the references you supplied. I read
> >>through them and reached the following conclusions. There seems to be a
> >>philosophical issue about the boundary between a user application and the
> >>Lucene API: where should one start and the other stop?
> >>The other issue is the significant difference between compression and
> >>encryption.
> >>As far as the first issue is concerned it is really a matter of personal
> >>choice and preference. My feeling is that as long as adding functionality
> >>does not impair the performance of the API as a whole, it makes sense to
> >>add it to Lucene and thus simplify the task of the application developer.
> >>After all, application developers do not have to use all the features of
> >>the API and always have the option of subclassing, writing a better
> >>version of it if they can, or writing the functionality as part of the
> >>application, even if the API provides that functionality already. The API
> >>is there to make life easier for those developers who want to use it,
> >>nobody "has" to use it.
> >>The second issue is more technical. Compression simply compresses the
> >>stored data to save storage. The index itself is not compressed therefore
> >>searching proceeds as normal. With encryption however you must encrypt
> >>the index as well as the stored data otherwise one could reconstruct the
> >>source document from the index and thus defeat the purpose of encryption.
> >>Correct me if I am wrong, but I think that encrypting the Lucene index is
> >>not easy to achieve from outside of Lucene, it implies re-writing as part
> >>of the application much code now part of Lucene (see issue number one
> >>above), hence my preference for including it as part of the Lucene API
> >>rather than as part of the application.
> >>Victor
> >>
> >>Luke Nezda wrote:
> >>> I think that adding encryption support to Lucene fields is a bad idea
> >>> for the same reasons adding compression was a bad idea (conclusive
> >>> comments on the tail of this issue
> >>> http://issues.apache.org/jira/browse/LUCENE-648?page=all). Binary
> >>> fields can be used by users to achieve this end.
> >>> Maybe a contrib with utility methods would be a compromise to preserve
> >>> this work and make it accessible to others, or alternatively just a
> >>> FAQ entry with the sample code or references to it.
> >>> Luke
> >>>
> >>> On 11/29/06, negrinv <[EMAIL PROTECTED]> wrote:
> >>>> Attached are proposed modifications to Lucene 2.0 to support
> >>>> Field.Store.Encrypted.
> >>>> The rationale behind this proposal is simple. Since Lucene can store
> >>>> data in the index, it effectively makes the data portable. It is
> >>>> conceivable that some of the data may be sensitive in nature, hence
> >>>> the option to encrypt it.
> >>>> Both the data and its index are encrypted in this implementation.
> >>>> This is only an initial implementation. It has the following
> >>>> restrictions, all of which can be resolved if required, albeit with
> >>>> some effort and more changes to Lucene:
> >>>> 1) Binary and compressed fields cannot be encrypted as well (a
> >>>> plaintext once encrypted becomes binary).
> >>>> 2) Field.Store.Encrypted implies Field.Store.Yes.
> >>>> This makes sense but it forces one to store the data in the same index
> >>>> where the tokens are stored. It may be preferable at times to have two
> >>>> indices, one for tokens, the other for the data.
> >>>> 3) As implemented, it uses RC4 encryption from BouncyCastle. This is
> >>>> an open source package, very simple to use, which has the advantage of
> >>>> guaranteeing that the length of the encrypted field is the same as the
> >>>> original plaintext. As of Java 1.5 (5.0) Sun provides an RC4
> >>>> equivalent in its Java Cryptography Extension, but unfortunately not
> >>>> in Java 1.4.
> >>>> The BouncyCastle RC4 is not the only algorithm available; others not
> >>>> depending on third party code can be used, but it was just the
> >>>> simplest to implement for this first attempt.
> >>>> 4) The attachments are modifications in diff form based on an early
> >>>> (I think August or September '06) repository snapshot of Lucene 2.0
> >>>> subsequently updated from the Lucene repository on 29/11/06. They may
> >>>> need some additional work to merge with the latest version in the
> >>>> Lucene repository. They also include a couple of JUnit test programs
> >>>> which explain, as well as test, the usage. You will need the
> >>>> BouncyCastle .jar (bcprov-jdk14-134.jar) to run them. I did not attach
> >>>> it to minimize the size of the attachments, but it can be downloaded
> >>>> free from:
> >>>> http://www.bouncycastle.org/latest_releases.html
> >>>>
> >>>> 5) Searching an encrypted field is restricted to single terms, no
> >>>> phrase or boolean searches allowed yet, and the term has to be
> >>>> encrypted by the application before searching it (ref. attached JUnit
> >>>> test programs).
> >>>>
> >>>> To the extent that I have tested it, the code works as intended and
> >>>> does not appear to introduce any regression problems, but more testing
> >>>> by others would be desirable.
> >>>> I don't propose at this stage to do any further work with these API
> >>>> extensions unless there is some expression of interest and direction
> >>>> from the Lucene Developers team. I have an application ready to roll
> >>>> which uses the proposed Lucene encryption API additions (please see
> >>>> http://www.kbforge.com/index.html). The application is not yet
> >>>> available for downloading simply because I am not sure if the Lucene
> >>>> licence allows me to do so. I would appreciate your advice in this
> >>>> regard. My application is free but its source code is not available
> >>>> (yet).
> >>>> I should add that encryption does not have to be an integral part of
> >>>> Lucene, it can be just part of the end application, but somehow it
> >>>> seems to me that Field.Store.Encrypted belongs in the same category
> >>>> as compression and binary values.
> >>>> I would be happy to receive your feedback.
> >>>>
> >>>> victor negrin
> >>>>
> >>>> http://www.nabble.com/file/4376/luceneDiff2.txt luceneDiff2.txt
> >>>> http://www.nabble.com/file/4377/TestEncryptedDocument.java
> >>>> TestEncryptedDocument.java
> >>>> http://www.nabble.com/file/4378/TestDocument.java TestDocument.java
> >>>> --
> >>>> View this message in context:
> >>>> http://www.nabble.com/Attached-proposed-modifications-to-Lucene-2.0-to-support-Field.Store.Encrypted-tf2727614.html#a7607415
> >>>> Sent from the Lucene - Java Developer mailing list archive at
> >>>> Nabble.com.
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >>>> For additional commands, e-mail: [EMAIL PROTECTED]
> >>
> >>--
> >>View this message in context:
> >>http://www.nabble.com/Attached-proposed-modifications-to-Lucene-2.0-to-support-Field.Store.Encrypted-tf2727614.html#a7613046
> >>Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

--
Nicolas LALEVÉE
Solutions & Technologies
ANYWARE TECHNOLOGIES
Tel : +33 (0)5 61 00 52 90
Fax : +33 (0)5 61 00 51 46
http://www.anyware-tech.com
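
To make restrictions 3) and 5) of the quoted proposal concrete, here is a
minimal sketch. It assumes the JCE "ARCFOUR" cipher (the RC4 equivalent that
ships with Java 5 and later) rather than the BouncyCastle class used in the
patch, and it needs Java 7 or later for StandardCharsets. The class and method
names (EncryptedTermSketch, encryptTerm, decryptTerm) are hypothetical and do
not come from Lucene or from the attached code. It only shows two things: an
RC4-encrypted term has the same length as its plaintext, and encrypting the
query term with the same key yields the same bytes that were indexed, which is
what would make a single-term lookup possible.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration; not part of Lucene or of the patch.
final class EncryptedTermSketch {

    // Encrypts one term with RC4 (named "ARCFOUR" in the JCE).
    // A fresh Cipher is created per call, so the keystream always restarts
    // at position 0 and equal terms produce equal ciphertext.
    static byte[] encryptTerm(byte[] key, String term) throws GeneralSecurityException {
        Cipher rc4 = Cipher.getInstance("ARCFOUR");
        rc4.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "ARCFOUR"));
        return rc4.doFinal(term.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encryptTerm with the same key.
    static String decryptTerm(byte[] key, byte[] data) throws GeneralSecurityException {
        Cipher rc4 = Cipher.getInstance("ARCFOUR");
        rc4.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "ARCFOUR"));
        return new String(rc4.doFinal(data), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Assumed 16-byte demo key; a real application would manage keys itself.
        byte[] key = "demo-key-16bytes".getBytes(StandardCharsets.UTF_8);

        byte[] indexed = encryptTerm(key, "lucene"); // what would be stored at index time
        byte[] queried = encryptTerm(key, "lucene"); // what the application sends at search time

        System.out.println("length preserved: " + (indexed.length == "lucene".length()));
        System.out.println("query matches index: " + Arrays.equals(indexed, queried));
        System.out.println("round trip: " + decryptTerm(key, indexed));
    }
}
```

The exact match depends on restarting the keystream for every term. That is
also the weak point of the approach: equal terms are visibly equal in the
index, and reusing an RC4 keystream across different plaintexts is generally
considered unsafe, so this is an illustration of the mechanism rather than a
recommendation.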