Hi,

When running Spark in standalone cluster mode, is there a way to
configure the number of splits for the input file(s)? It seems to be
approximately 32 MB per split by default. Is that correct? For example,
in my cluster there are two workers, each running on a machine with two
cores. For an input file of size 500 MB, Spark schedules 16 tasks for the
initial map stage (500/32 ≈ 16).
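
To illustrate the arithmetic behind my observation, here is a small sketch (the 32 MB split size is my assumption inferred from the task count, not a value I have confirmed in the Spark configuration):

```python
import math

file_size_mb = 500
split_size_mb = 32  # assumed default split size, inferred from observed task count

# Number of initial map tasks = number of input splits
tasks = math.ceil(file_size_mb / split_size_mb)
print(tasks)  # 16 tasks observed, matching ceil(500 / 32)
```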

thanks!
Umar
