Edge Node in Spark

2017-06-05 Thread Ashok Kumar
Hi, I am a bit confused between Edge node, Edge server and gateway node in Spark. Do these mean the same thing? How does one set up an Edge node to be used in Spark? Is this different from Edge node for Hadoop please? Thanks

Re: Edge Node in Spark

2017-06-06 Thread Ashok Kumar
referring to GraphX? Thank You, Irving Duran On Mon, Jun 5, 2017 at 3:45 PM, Ashok Kumar wrote: Hi, I am a bit confused between Edge node, Edge server and gateway node in Spark. Do these mean the same thing? How does one set up an Edge node to be used in Spark? Is this different from Edge node

how many topics spark streaming can handle

2017-06-19 Thread Ashok Kumar
Hi Gurus, Within one Spark streaming process how many topics can be handled? I have not tried more than one topic. Thanks

Re: how many topics spark streaming can handle

2017-06-19 Thread Ashok Kumar
f processing you are trying to do. On Mon, Jun 19, 2017 at 12:00 PM, Ashok Kumar wrote: Hi Gurus, Within one Spark streaming process how many topics can be handled? I have not tried more than one topic. Thanks

RDD and DataFrame persistent memory usage

2017-06-25 Thread Ashok Kumar
Gurus, I understand that when we create an RDD in Spark it is immutable. So I have a few points please: - When an RDD is created, that is just a pointer. Like most Spark operations it is lazy, not consumed until a collection operation is done that affects the RDD? - When a DF is created from RDD does that res

how does spark handle compressed files

2017-07-19 Thread Ashok Kumar
Hi, How does Spark handle compressed files? Are they optimizable in terms of using multiple RDDs against the file, or does one need to uncompress them beforehand, say bz type files? Thanks

The parameter spark.yarn.executor.memoryOverhead

2017-10-30 Thread Ashok Kumar
Hi Gurus, The parameter spark.yarn.executor.memoryOverhead is explained as below: spark.yarn.executor.memoryOverhead (default: executorMemory * 0.10, with minimum of 384): The amount of off-heap memory (in megabytes) to be allocated per executor. This is memory that accounts for things like VM overheads, i
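
The default quoted above (executorMemory * 0.10, with a minimum of 384 MB) can be worked through as plain arithmetic. The following is a minimal sketch in Python, not Spark's own code: the function names are hypothetical, and truncating the 10% figure to a whole number of megabytes is an assumption. It also assumes the per-executor YARN container request is roughly executor memory plus this overhead.

```python
def default_memory_overhead_mb(executor_memory_mb: int,
                               factor: float = 0.10,
                               minimum_mb: int = 384) -> int:
    """Default overhead as described in the quoted text:
    executorMemory * 0.10, with a minimum of 384 (megabytes).

    Hypothetical helper; truncation to whole megabytes is an assumption.
    """
    return max(int(executor_memory_mb * factor), minimum_mb)


def container_request_mb(executor_memory_mb: int) -> int:
    """Assumed per-executor container size: heap plus default overhead."""
    return executor_memory_mb + default_memory_overhead_mb(executor_memory_mb)


if __name__ == "__main__":
    # Small executors are dominated by the 384 MB floor,
    # larger ones by the 10% factor.
    for heap_mb in (1024, 2048, 8192):
        print(heap_mb,
              default_memory_overhead_mb(heap_mb),
              container_request_mb(heap_mb))
```

Under these assumptions the 384 MB floor applies to any executor with less than about 3.75 GB of heap, so the 10% factor only starts to matter for larger executors.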
