Hi All,
I have already found a solution to this problem. Please ignore my question.
Thanks
Best regards,
Henry
From: MA33 YTHung1
Sent: Friday, February 6, 2015 4:34 PM
To: user@spark.apache.org
Subject: how to process a file in a Spark standalone cluster without distributed
storage (i.e. HDFS)
Hi All,
sc.textFile will not work because the file is not distributed to the other workers,
so I tried reading the file first with FileUtils.readLines and then calling
sc.parallelize, but readLines failed with an OOM error (the file is large).
Is there a way to split a local file and upload those partitions to each worker?