Makes sense. Thanks!
Ognen

On Thu, Jan 16, 2014 at 12:54 AM, Tathagata Das <[email protected]> wrote:

> If you are running a distributed Spark cluster across those nodes, the read
> is distributed as well. If you give sc.textFile() a "local path" to a
> directory on the shared file system, each worker should read a subset of the
> files in that directory by accessing them locally. Nothing should be read on
> the master.
>
> TD
>
>
> On Wed, Jan 15, 2014 at 3:56 PM, Ognen Duzlevski <[email protected]> wrote:
>
>> On a cluster where the master and all the nodes have access to a shared
>> filesystem, does Spark read a file (such as one loaded with sc.textFile())
>> in parallel, with each node reading a different section? Or is the file
>> read sequentially on the master, with chunks handed to the nodes afterwards?
>>
>> Thanks!
>> Ognen
>>
>
>
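
For readers of the archive, the behaviour TD describes can be pictured with a
small toy model. This is only a sketch in plain Python, not Spark code and not
how Spark's scheduler is implemented: the function names (assign_files,
read_subset, read_distributed) are hypothetical, threads stand in for workers,
and the round-robin assignment is an assumption made for illustration. The
point it shows is that the coordinator only hands out file names, while every
"worker" opens its own subset through the same shared path.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def assign_files(directory, num_workers):
    """Split the sorted file names in `directory` round-robin across workers.

    Hypothetical helper: real Spark derives input splits differently.
    """
    names = sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )
    return [names[i::num_workers] for i in range(num_workers)]


def read_subset(directory, names):
    """What one simulated worker does: open only its own files."""
    lines = []
    for name in names:
        with open(os.path.join(directory, name), encoding="utf-8") as f:
            lines.extend(line.rstrip("\n") for line in f)
    return lines


def read_distributed(directory, num_workers):
    """Run one read_subset call per simulated worker, in parallel threads.

    The coordinator passes around file names only; file contents are
    read inside the workers.
    """
    subsets = assign_files(directory, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(lambda s: read_subset(directory, s), subsets))
```

Calling read_distributed("/shared/data", 4) (the path is a made-up example)
returns one list of lines per simulated worker. In Spark itself the unit of
work is an input split rather than a whole file, so a single large file can
also be divided among several workers.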


-- 
"Le secret des grandes fortunes sans cause apparente est un crime oublié,
parce qu'il a été proprement fait" - Honore de Balzac
