Makes sense. Thanks!
Ognen

On Thu, Jan 16, 2014 at 12:54 AM, Tathagata Das <[email protected]> wrote:

> If you are running a distributed Spark cluster over the nodes, then the
> reading should be done in a distributed manner. If you give sc.textFile() a
> "local path" to a directory in the shared file system, then each worker
> should read a subset of the files in directory by accessing them locally.
> Nothing should be read on the master.
>
> TD
>
> On Wed, Jan 15, 2014 at 3:56 PM, Ognen Duzlevski <[email protected]> wrote:
>
>> On a cluster where the nodes and the master all have access to a shared
>> filesystem/files - does spark read a file (like one resulting from
>> sc.textFile()) in parallel/different sections on each node? Or is the file
>> read on master in sequence and chunks processed on the nodes afterwards?
>>
>> Thanks!
>> Ognen

--
"Le secret des grandes fortunes sans cause apparente est un crime oublié, parce qu'il a été proprement fait" - Honore de Balzac
