Did you try https://github.com/dadoonet/fsriver?
I have never tested it with that many documents, but maybe it could help you here.

If you have already generated JSON files on a server, then I would recommend 
trying Logstash to send them to Elasticsearch.
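
If you would rather push the documents yourself, another option is to have Server 2 send the JSON over HTTP to the Elasticsearch bulk endpoint (`_bulk`) on Server 3. Below is only a minimal sketch of that idea, not a tested setup. The function names, the index name, and the host URL are assumptions for illustration. Older Elasticsearch versions also expect a `_type` field in each action line, and the `Content-Type` header shown here is what recent versions expect.

```python
import json
import urllib.request


def build_bulk_body(index_name, docs):
    """Build a newline-delimited payload for the _bulk endpoint.

    `docs` is an iterable of (doc_id, document_dict) pairs. Each document
    becomes two lines: an action line and the document source itself.
    """
    lines = []
    for doc_id, doc in docs:
        # Action line: index this document under the given index and id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the JSON document prepared on Server 2.
        lines.append(json.dumps(doc))
    # The bulk payload must end with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(base_url, body):
    """POST a bulk payload to Elasticsearch and return the parsed response.

    `base_url` is a hypothetical address such as "http://server3:9200".
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

On Server 2 you would call `build_bulk_body("docs", batch)` for each batch of converted documents and pass the result to `send_bulk`. Sending batches of a few hundred to a few thousand documents, rather than one request per document, keeps the HTTP overhead low, but the right batch size depends on your document size and hardware.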

My 2 cents

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 28 January 2014 at 16:46:06, ZenMaster80 ([email protected]) wrote:

I would like to get your perspective on how to load JSON into the index server in my 
scenario.
We have about 15 million documents in HTML/PDF/... on Server 1.
I would like to process the data and convert it to JSON on Server 2.
I would like the indexer to index the JSON on a separate machine, Server 3.

Ideally, on Server 2, as I prepare the JSON and have it ready in memory, I 
could feed it to the indexer. But since data processing is CPU-intensive, I want 
indexing to be done on a separate machine.
How do you guys deal with this, since I can no longer feed in-memory JSON to the 
indexer on a separate machine? Do I just grab the files from Server 2 and index them 
then?
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/05b977ac-00d0-45c0-9e58-8df523e6978c%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

