If it's not the data that's being searched, you can always encode it before 
inserting it. You then have to either further encode it to base64 to make it 
printable before storing it, or use a binary field.
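
As a rough illustration of the "encode, then base64" route, here is a minimal 
sketch in plain Java. It assumes that "encode" means encrypting with AES-GCM 
and that the result goes into an ordinary stored string field. FieldCodec, 
newKey, encode and decode are names I made up for the example; nothing here is 
a Solr or Lucene API, and key storage is left out entirely.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
class FieldCodec {
    private static final int IV_BYTES = 12;   // IV length commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh 128-bit AES key. A real setup would load a persistent
    // key from somewhere safe instead (assumption: that is handled elsewhere).
    static SecretKey newKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Encrypts the field value, prepends the random IV, and base64-encodes
    // the result so it is printable and can go into a plain string field.
    static String encode(String plain, SecretKey key) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
        byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
        return Base64.getEncoder().encodeToString(out);
    }

    // Reverses encode(): base64-decodes, reads the IV back from the first
    // IV_BYTES bytes, then decrypts and verifies the rest.
    static String decode(String stored, SecretKey key) throws Exception {
        byte[] in = Base64.getDecoder().decode(stored);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
        byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
        return new String(plain, StandardCharsets.UTF_8);
    }
}
```

The application would call FieldCodec.encode(value, key) before adding the 
document and FieldCodec.decode(stored, key) after reading it back. Because a 
random IV is used, the same value encodes differently each time, so a field 
treated this way can be stored and displayed but not searched.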

You could probably also set up an external process that cycles through every 
document in the index, encodes the fields in question and reinserts the 
document. The time and horsepower to do that might be better spent 
regenerating the index from scratch with the newly encoded documents.

You might even be able to modify something in Solr/Lucene to do the encoding 
automatically using Java. Java ships with encryption libraries (the 
javax.crypto package), like most other languages.

I don't know Solr/Lucene well enough to say, but the data that's in the 
searchable fields must be visible as well, in some manner. I don't know how 
understandable it is after being tokenized. Someone else would have to comment 
on that.

Dennis Gearon

--- On Sun, 7/25/10, Girish Pandit <pandit.gir...@gmail.com> wrote:

> From: Girish Pandit <pandit.gir...@gmail.com>
> Subject: how to Protect data
> To: solr-user@lucene.apache.org
> Date: Sunday, July 25, 2010, 5:12 PM
> Hi,
> 
> I was being ask about protecting data, means that the
> search index data is stored in the some indexed files and
> when you open those indexed files, I can clearly see the
> data, means some texts, e.g. name, address, postal code
> etc.
> 
> is there anyway I can hide the data? means some kind of
> data encoding to not even see any text raw data.
> 
> -Girish
> 
> 
