The rest is mostly jars, files, and the app name. It runs in yarn-client mode.

Thanks,
Pradeep

> On Jul 26, 2016, at 7:10 AM, Jacek Laskowski <ja...@japila.pl> wrote:
> 
> Hi,
> 
> What's "<all other stuff>"? What master URL do you use?
> 
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
> 
> 
>> On Tue, Jul 26, 2016 at 2:18 AM, Mail.com <pradeep.mi...@mail.com> wrote:
>> Hi All,
>> 
>> I have a directory which has 12 files. I want to read each file in full, so I 
>> am reading them with wholeTextFiles(dirpath, numPartitions).
>> 
>> I run spark-submit as <all other stuff> --num-executors 12 --executor-cores 
>> 1, with numPartitions set to 12.
>> 
>> However, when I run the job I see that the stage which reads the directory 
>> has only 8 tasks. So some tasks read more than one file and take twice the 
>> time.
>> 
>> What can I do so that the files are read by 12 tasks, i.e. one file per task?
>> 
>> Thanks,
>> Pradeep
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 
> 
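One possible explanation for seeing 8 tasks instead of 12, offered as a rough sketch rather than a confirmed diagnosis: as far as I understand it, wholeTextFiles treats the partition count only as a minimum. It derives a maximum split size of roughly ceil(total bytes / minPartitions) and then packs whole files into a split until that size is reached, so several small files can end up in one task. The plain-Python model below imitates that packing. The function names and the file sizes are hypothetical (the sizes are chosen only to reproduce the 8-task count), and the model ignores node and rack locality, which the real Hadoop combine logic also takes into account.

```python
import math


def max_split_size(file_sizes, min_partitions):
    """Approximate split-size cap: ceil(total bytes / min_partitions)."""
    total = sum(file_sizes)
    return math.ceil(total / max(min_partitions, 1))


def pack_files(file_sizes, min_partitions):
    """Greedily pack whole files into splits until each split reaches the cap.

    Simplified model only: files are never split, and locality is ignored.
    Returns a list of splits, each a list of file indexes.
    """
    limit = max_split_size(file_sizes, min_partitions)
    splits, current, current_size = [], [], 0
    for index, size in enumerate(file_sizes):
        current.append(index)
        current_size += size
        if current_size >= limit:
            # This split is full; start a new one.
            splits.append(current)
            current, current_size = [], 0
    if current:
        # Leftover files that never reached the cap still form a split.
        splits.append(current)
    return splits


# Hypothetical sizes in bytes, not taken from the thread.
equal_files = [100] * 12
uneven_files = [200] * 4 + [50] * 8

print(len(pack_files(equal_files, 12)))   # 12
print(len(pack_files(uneven_files, 12)))  # 8
```

If the 12 files differ noticeably in size, this would account for some tasks reading two files. In that case one workaround to consider is listing the 12 paths yourself and distributing them with sc.parallelize(paths, 12), then reading each file inside its task, which guarantees one file per task at the cost of handling the file I/O manually.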

