Hi Spark user,
I am new to Spark, so forgive me for asking a basic question. I'm trying to
import my TSV file into Spark. Each line of the file has a key and a value
separated by a \t. I want to import this file as a dictionary of key-value
pairs in Spark.
I came across this code that does the same for a CSV file:
import csv
import StringIO
...
def loadRecord(line):
    """Parse a CSV line"""
    input = StringIO.StringIO(line)
    reader = csv.DictReader(input, fieldnames=["name", "favouriteAnimal"])
    return reader.next()
input = sc.textFile(inputFile).map(loadRecord)
Can you point out the changes required to parse a TSV file?
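My guess at the TSV version, as a minimal sketch: csv.DictReader accepts a delimiter argument, so passing delimiter="\t" should be the main change. This assumes Python 3 (io.StringIO and next(reader) in place of StringIO.StringIO and reader.next()), and the field names "key" and "value" are placeholders I made up, not names from my file.

```python
import csv
import io

def load_record(line):
    """Parse one tab-separated line into a dict keyed by the field names."""
    reader = csv.DictReader(
        io.StringIO(line),
        fieldnames=["key", "value"],  # assumed column names, adjust to the data
        delimiter="\t",               # the only change needed for TSV
    )
    return next(reader)

# With a live SparkContext `sc` this would be:
#     records = sc.textFile(inputFile).map(load_record)
sample = load_record("cat\tmeow")
```

Each element of the resulting RDD would then be a small dict per line, not one dictionary for the whole file.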
After the following operation:
split_lines = lines.map(_.split("\t"))
what should I do to read the key-value pairs into a dictionary?
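A rough sketch of what I imagine this step could look like in Python (the _.split syntax above is Scala; in PySpark it would be a lambda or a named function). It runs on a plain list so it can be tried without a cluster. The names to_pair and lookup and the sample lines are hypothetical. On an RDD, map(to_pair) followed by collectAsMap() should return the pairs as a dict on the driver, assuming the data fits in driver memory.

```python
def to_pair(line):
    """Split a 'key<TAB>value' line into a (key, value) tuple."""
    # Split on the first tab only, so a value containing tabs stays intact.
    key, value = line.rstrip("\n").split("\t", 1)
    return key, value

lines = ["apple\tred", "banana\tyellow"]   # stand-in for the lines of the TSV file
pairs = [to_pair(line) for line in lines]  # on an RDD: lines.map(to_pair)
lookup = dict(pairs)                       # on an RDD: lines.map(to_pair).collectAsMap()
```

If a key appears on more than one line, the dict keeps only the last value seen for it.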
Thanks
Ravikant