Hi,

First off, I recommend using the native integration (aka the Java/Scala APIs) instead of MapReduce. The latter works, but the former performs better and is more flexible.
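
For illustration, here is a minimal sketch of the same read done through the native Spark integration, reusing the index and query from the question quoted below. It assumes the elasticsearch-spark jar is on the classpath and that an Elasticsearch node is reachable at localhost:9200; the application name, the local master and the object name are placeholders, not something taken from the original setup.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // brings the esRDD method into scope on SparkContext

object NativeEsRead {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("es-native-read")        // placeholder name
      .setMaster("local[*]")               // assumption: local test run
      .set("es.nodes", "localhost:9200")   // assumption: where ES is listening

    val sc = new SparkContext(conf)

    // Each element is a (document id, document fields) pair.
    val artists = sc.esRDD("radio/artists", "?q=me*")

    println(s"documents: ${artists.count()}")
    sc.stop()
  }
}
```

Compared with the Hadoop InputFormat route, there is no Writable conversion to deal with: the documents come back as regular Scala maps.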

ES works in a similar fashion to the HDFS store: the data doesn't go through the master; rather, each task has its own partition and works on its own set of data. Behind the scenes we map each worker to an index shard (if there aren't enough workers, then some will work across multiple shards).
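
To make the shard-to-worker idea concrete, here is a toy model in plain Scala. It is only a sketch of the concept, not the connector's actual scheduling code: the round-robin rule, the object name and the shard/worker counts are all assumptions chosen for illustration.

```scala
// Toy model only: shows how shards can be spread over fewer workers.
// This is NOT how the connector or Spark actually schedules tasks.
object ShardAssignment {
  // Assign shard ids to workers round-robin; each worker reads only its own shards.
  def assign(shards: Seq[Int], workers: Int): Map[Int, Seq[Int]] = {
    require(workers > 0, "need at least one worker")
    shards.groupBy(shard => shard % workers)
  }

  def main(args: Array[String]): Unit = {
    // 5 shards, 2 workers: some workers must cover several shards.
    val plan = assign(0 until 5, 2)
    plan.toSeq.sortBy(_._1).foreach { case (worker, owned) =>
      println(s"worker $worker reads shards ${owned.mkString(", ")}")
    }
  }
}
```

With five shards and two workers, worker 0 ends up with shards 0, 2 and 4 and worker 1 with shards 1 and 3; in no case does the data pass through a central node first.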


On 12/8/14 4:59 PM, Mohamed Lrhazi wrote:
I am trying to understand how Spark and ES work together. Could someone please 
help me answer this question?

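import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{MapWritable, Text}
import org.elasticsearch.hadoop.mr.EsInputFormat

// sc is an existing SparkContext
val conf = new Configuration()
conf.set("es.resource", "radio/artists")   // index/type to read from
conf.set("es.query", "?q=me*")             // URI query applied on the ES side
val esRDD = sc.newAPIHadoopRDD(conf, classOf[EsInputFormat[Text, MapWritable]],
                               classOf[Text], classOf[MapWritable])
val docCount = esRDD.count()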


When and where is data being transferred from ES? Is it all collected on the 
Spark master node, then partitioned and sent to the worker nodes? Or is each 
worker node talking to ES to somehow get a partition of the data?

How does this effectively work?

Thanks a lot,
Mohamed.

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAEU_gmf9Nt0xn_0NbzDn_moRWUT96uWYf4cicJdZik3r0Zz8XA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

--
Costin
