Hi Rahul,
I'll copy-paste your question here for context, and reply
afterwards.
-
Can I write the RDD data to an Excel file, along with the mapping, in
apache-spark? Is that the correct way? Wouldn't the writing be a
local function that can't be passed over the cluster?
Below is my code:
Hello there,
On Fri, May 30, 2014 at 9:36 AM, Marcelo Vanzin van...@cloudera.com wrote:
workbook = xlsxwriter.Workbook('output_excel.xlsx')
worksheet = workbook.add_worksheet()
data = sc.textFile('xyz.txt')
# xyz.txt is a file, each line of which contains strings delimited by spaces
row = 0
def
Hi Rahul,
Marcelo's explanation is correct. Here's a possible approach to your
program, in pseudo-Python:
# connect to Spark cluster
sc = SparkContext(...)
# load input data
input_data = load_xls(file('input.xls'))
input_rows = input_data['Sheet1'].rows
# create RDD on cluster
input_rdd =
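The pseudo-Python above could be fleshed out along these lines. This is only a sketch: load_xls in the pseudo-code is not a real function, so this version reads the input with sc.textFile, as in Rahul's snippet, and the helper names parse_line, write_rows, and main are mine. The point it illustrates is the one Marcelo made: Excel I/O with xlsxwriter happens only on the driver, while the cluster only ever sees plain Python data.

```python
def parse_line(line):
    # Split one space-delimited input line into a list of fields.
    # This function is shipped to the executors, so it must not touch
    # any local resource such as an open workbook.
    return line.strip().split(' ')

def write_rows(rows, path):
    # Runs only on the driver, after collect(); xlsxwriter objects
    # never cross the cluster boundary.
    import xlsxwriter
    workbook = xlsxwriter.Workbook(path)
    worksheet = workbook.add_worksheet()
    for r, fields in enumerate(rows):
        for c, value in enumerate(fields):
            worksheet.write(r, c, value)
    workbook.close()

def main():
    # Submit with spark-submit; pyspark is imported lazily so the
    # helpers above stay importable without a Spark installation.
    from pyspark import SparkContext
    sc = SparkContext(appName='excel-export')
    # Distributed part: read and parse on the cluster...
    rows = sc.textFile('xyz.txt').map(parse_line).collect()
    # ...then write locally on the driver.
    write_rows(rows, 'output_excel.xlsx')
    sc.stop()
```

The key design choice is collect(): it pulls the parsed rows back to the driver, which is the only place a local, non-serializable writer like an xlsxwriter Workbook can be used. This is fine for output that fits in driver memory; for large results you would write a distributed format instead.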
Thanks Marcelo,
It actually made my few concepts clear. (y).
On Fri, May 30, 2014 at 10:14 PM, Marcelo Vanzin van...@cloudera.com
wrote:
Thanks Jey,
It was helpful.
On Sat, May 31, 2014 at 12:45 AM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote: