RE: Difference between textFile Vs hadoopFile (textInoutFormat) on HDFS data

2015-04-08 Thread Puneet Kumar Ojha
Thanks

From: Nick Pentreath [mailto:nick.pentre...@gmail.com]
Sent: Tuesday, April 07, 2015 5:52 PM
To: Puneet Kumar Ojha
Cc: user@spark.apache.org
Subject: Re: Difference between textFile Vs hadoopFile (textInoutFormat) on HDFS data

There is no difference - textFile calls hadoopFile
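To illustrate the point in the reply, here is a minimal PySpark sketch of the two read paths. It assumes an already-created SparkContext passed in as `sc` and a caller-supplied `path`; the function names are hypothetical and used only for illustration. The class names are the standard old-API Hadoop ones, which is roughly how textFile is built on hadoopFile: read (byte offset, line) pairs with TextInputFormat and keep only the line text.

```python
# Sketch only: `sc` is assumed to be an existing pyspark SparkContext,
# and `path` any text file location (for example an hdfs:// URI).
TEXT_INPUT_FORMAT = "org.apache.hadoop.mapred.TextInputFormat"
LONG_WRITABLE = "org.apache.hadoop.io.LongWritable"
TEXT = "org.apache.hadoop.io.Text"


def lines_via_hadoop_file(sc, path):
    """Read `path` through hadoopFile and drop the byte-offset keys."""
    # Each record is a (byte offset, line text) pair.
    pairs = sc.hadoopFile(path, TEXT_INPUT_FORMAT, LONG_WRITABLE, TEXT)
    # Keeping only the values leaves an RDD of lines.
    return pairs.values()


def lines_via_text_file(sc, path):
    """Read `path` through the textFile convenience method."""
    return sc.textFile(path)
```

Both functions should yield the same lines for the same input, so no performance difference is expected from choosing one over the other.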

Difference between textFile Vs hadoopFile (textInoutFormat) on HDFS data

2015-04-07 Thread Puneet Kumar Ojha
Hi,

Is there any difference between textFile and hadoopFile (TextInputFormat) when the data is present in HDFS? Can any performance gain be observed?

Puneet Kumar Ojha
Data Architect | PubMatic
http://www.pubmatic.com/

Spark Web UI Doesn't Open in Yarn-Client Mode

2015-02-14 Thread Puneet Kumar Ojha
Hi, I am running a 3-node Spark cluster on EMR. While running a job I see only 1 executor running. Does that mean only 1 of the nodes is being used? (From the Spark documentation, it seems the default mode is LOCAL.) When I switch to yarn-client mode, the Spark Web UI doesn't open. How to view the job running

RE: Tuning number of partitions per CPU

2015-02-13 Thread Puneet Kumar Ojha
Use the configuration below if you are using version 1.2:

SET spark.shuffle.consolidateFiles=true;
SET spark.rdd.compress=true;
SET spark.default.parallelism=1000;
SET spark.deploy.defaultCores=54;

Thanks
Puneet.

-Original Message-
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Friday,
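For readers who set properties in application code or on the command line rather than with SET statements, the four settings in the reply above can be sketched as follows. The values are copied from the message and are not a general recommendation; the names `SHUFFLE_TUNING`, `build_conf`, and `as_submit_args` are hypothetical helpers introduced for this example.

```python
# Property values copied verbatim from the message; tune them for your cluster.
SHUFFLE_TUNING = {
    "spark.shuffle.consolidateFiles": "true",
    "spark.rdd.compress": "true",
    "spark.default.parallelism": "1000",
    "spark.deploy.defaultCores": "54",
}


def build_conf(settings=SHUFFLE_TUNING):
    """Return a SparkConf carrying the given settings (requires pyspark)."""
    from pyspark import SparkConf  # imported lazily so the module loads without Spark
    return SparkConf().setAll(list(settings.items()))


def as_submit_args(settings=SHUFFLE_TUNING):
    """Render the settings as spark-submit `--conf key=value` arguments."""
    args = []
    for key in sorted(settings):
        args.extend(["--conf", "{}={}".format(key, settings[key])])
    return args
```

A SparkContext created from `build_conf()` would pick up these properties at startup; `as_submit_args()` produces the equivalent flags for a `spark-submit` invocation.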