Re: Post content to be indexed to Solr

2011-08-13 Thread Erick Erickson
I don't think this is really doable as stated. The only
thing that comes to mind (assuming you're using Tika to
handle the file eventually) is to run the document through
Tika on the client and build a SolrJ document from just
the parts you care about. This would give you substantial
savings on data transmitted for some file types but not
for others: text would be pretty much unaffected, but
pictures or video would be radically reduced.
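
A rough sketch of the shape this could take, using only the JDK so it
runs as-is. Everything here is an assumption for illustration: the
class name, the toy extractor, and the field names "id", "file_name"
and "text" are hypothetical. In a real client the extractor would be
Tika (for example the Tika facade's parseToString) and the map would be
a SolrJ SolrInputDocument filled with addField and sent with the
server's add call.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ClientSideExtraction {

    // Toy stand-in for a real extractor (hypothetical): treats everything
    // before the first zero byte as UTF-8 text and ignores the rest.
    // With Tika on the classpath, Tika would do this job instead.
    public static String toyExtract(byte[] raw) {
        int end = 0;
        while (end < raw.length && raw[end] != 0) {
            end++;
        }
        return new String(raw, 0, end, StandardCharsets.UTF_8);
    }

    // Collects only the fields worth indexing. A plain Map stands in for
    // SolrJ's SolrInputDocument; the field names are assumptions.
    public static Map<String, Object> buildDocument(String id, String fileName, byte[] raw,
                                                    Function<byte[], String> extractor) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("id", id);
        fields.put("file_name", fileName);
        fields.put("text", extractor.apply(raw));
        return fields;
    }

    // Rough size of what would go over the wire: UTF-8 bytes of all values.
    public static int payloadBytes(Map<String, Object> fields) {
        int total = 0;
        for (Object value : fields.values()) {
            total += String.valueOf(value).getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Fake "image": a short caption followed by zero bytes as binary filler.
        byte[] raw = new byte[100_000];
        byte[] caption = "Sunset over the harbour".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(caption, 0, raw, 0, caption.length);

        Map<String, Object> doc =
                buildDocument("doc-1", "sunset.jpg", raw, ClientSideExtraction::toyExtract);

        System.out.println("raw file bytes:  " + raw.length);
        System.out.println("extracted bytes: " + payloadBytes(doc));
    }
}
```

For a plain text file the two numbers would come out nearly equal,
which is the "not for others" case: the saving depends entirely on how
much of the file is extractable text.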

Best
Erick

On Fri, Aug 12, 2011 at 7:47 AM, rahul  wrote:
> Hi,
>
> Currently I am indexing documents with SolrJ, either by adding files
> directly with 'req.addFile(fi);' or by sending the file content with
> 'req.addContentStream(stream);'.
>
> If the SolrJ client and the Solr server are on different networks (i.e.,
> the Solr server is in a remote location), I need to transfer the entire
> file content to Solr. I believe the indexed content of a file should be
> smaller than the actual file.
>
> Hence, is there a way to extract the content to be indexed on the client
> side, using any Lucene API, and then post only that content to the remote
> server, instead of sending the entire file? I believe the content to be
> indexed should be 1 to 10% of the original file; please correct me if I
> am wrong.
>
> Is there any way to achieve this? Please let me know if I have
> misunderstood anything.
>
> Thanks in advance.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Post-content-to-be-indexed-to-Solr-tp3249009p3249009.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Post content to be indexed to Solr

2011-08-12 Thread rahul
Hi,

Currently I am indexing documents with SolrJ, either by adding files
directly with 'req.addFile(fi);' or by sending the file content with
'req.addContentStream(stream);'.

If the SolrJ client and the Solr server are on different networks (i.e.,
the Solr server is in a remote location), I need to transfer the entire
file content to Solr. I believe the indexed content of a file should be
smaller than the actual file.

Hence, is there a way to extract the content to be indexed on the client
side, using any Lucene API, and then post only that content to the remote
server, instead of sending the entire file? I believe the content to be
indexed should be 1 to 10% of the original file; please correct me if I
am wrong.

Is there any way to achieve this? Please let me know if I have
misunderstood anything.

Thanks in advance.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Post-content-to-be-indexed-to-Solr-tp3249009p3249009.html
Sent from the Solr - User mailing list archive at Nabble.com.