Hi Rahul!

2011/8/18 Asharudeen <asharud...@gmail.com>:
> However, I believe, in Lucene the indexed data would only be 1% to 10% of
> the original file data. Plz correct me, if I am wrong.

I once read that it is 30% to 40% of the original size. But this also
depends on how the data is tokenized and indexed. With a suitable analyzer
you can even get the opposite, e.g. if you use an analyzer that adds
additional data, which is then inserted into the index.

> So, I want to check if i would be able to use 'CLucene' project in the client
> side, and generate only the analysed data that needs to be stored in an
> index. Then, i would transfer this data to the server (through socket or
> curl upload), and index the analysed content on the server side. So, with
> this approach, i want to avoid transfering entire files and transfer only the
> indexable portion of the content as input to the server. Then on the server
> side, i want to perform the necessary processing to create the index with
> this input data. Is there any way/api to achieve these steps on both the
> client and server side using CLucene. Or any way to achieve this by
> digging into the CLucene codes/project ?

Yes, Lucene is capable of merging indexes, and so is CLucene. What I don't
know is whether the different versions will be a problem. If I am not
wrong, Solr is based on Lucene 3.3 and CLucene on the code base of
Lucene 2.3.2. But in the past I was able to open an index created by CLucene
with Luke. When I optimized the index, I got an index in the up-to-date
Lucene format. So maybe, if a direct merge isn't possible, an optimization
will convert the index to the current format. Then the indexes can be
merged.

Kind regards,
Veit

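As a rough illustration of what merging two indexes involves, here is a
minimal Python sketch of a toy inverted index. It is not CLucene or Lucene
code and uses none of their APIs: the names build_index and merge_indexes,
the whitespace "analyzer" and the dictionary layout are all assumptions made
up for this example. It only shows the basic idea that the document numbers
of the second index are shifted past those of the first when the posting
lists are combined.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> sorted list of local doc ids."""
    postings = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Toy "analyzer" (assumption): lowercase and split on whitespace.
        for term in sorted(set(text.lower().split())):
            postings[term].append(doc_id)
    return {"doc_count": len(docs), "postings": dict(postings)}


def merge_indexes(a, b):
    """Return a new index holding a's documents followed by b's."""
    offset = a["doc_count"]  # b's doc ids are renumbered to start after a's
    merged = {term: list(ids) for term, ids in a["postings"].items()}
    for term, ids in b["postings"].items():
        merged.setdefault(term, []).extend(i + offset for i in ids)
    return {"doc_count": a["doc_count"] + b["doc_count"], "postings": merged}


# Hypothetical data: one index built on the client, one on the server.
client = build_index(["CLucene on the client", "analysed on the client"])
server = build_index(["Lucene on the server"])
merged = merge_indexes(server, client)
```

In this sketch the client ships its small index instead of the original
files, and the server only has to combine posting lists. Whether two real
indexes can be merged directly still depends on their on-disk formats being
compatible, which is the open question about the version difference.
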
_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers