Use this package:
https://github.com/databricks/spark-csv
and change the delimiter to a tab.
The documentation is pretty straightforward; you'll get a DataFrame back
from the parser.
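
A minimal sketch of how that might look from PySpark, in case it helps. It
assumes Spark 1.4 or later, that the spark-csv package is already on the
classpath, and that the file has no header row. The names load_tsv,
row_to_edge and TSV_OPTIONS are only illustrative and are not part of the
package, and "edges.tsv" is a made-up path.

```python
# Reader options for spark-csv: tab as the field delimiter, no header row.
TSV_OPTIONS = {"delimiter": "\t", "header": "false"}


def load_tsv(sql_context, path):
    """Load a tab-separated file into a DataFrame through spark-csv.

    Assumption: sql_context is an existing SQLContext (Spark 1.4+) and the
    spark-csv package is available to the running Spark application.
    """
    return (sql_context.read
            .format("com.databricks.spark.csv")
            .options(**TSV_OPTIONS)
            .load(path))


def row_to_edge(row):
    """Turn a two-column row into a (source, sink) key-value pair."""
    return (row[0], row[1])


# Hypothetical usage, given a live SQLContext named sqlContext:
#   edges = load_tsv(sqlContext, "edges.tsv").rdd.map(row_to_edge)
```

The result of the map would be an RDD of (source, sink) pairs rather than a
single dictionary. Since a source vertex can appear on more than one line,
grouping the pairs by key is probably closer to what you want than
collapsing them into one value per key.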
-Don
On Thu, Jun 25, 2015 at 4:39 AM, Ravikant Dindokar ravikant.i...@gmail.com
wrote:
So I have a file
Can you be more specific, or can you provide a sample file?
On Thu, Jun 25, 2015 at 11:00 AM, Ravikant Dindokar ravikant.i...@gmail.com
wrote:
Hi Spark user,
I am new to Spark, so forgive me for asking a basic question. I'm trying to
import my TSV file into Spark. This file has a key and a value on each line.
So I have a file where each line represents an edge in the graph and has two
values separated by a tab. Both values are vertex IDs (source and sink). I
want to parse this file as a dictionary in a Spark RDD.
So my question is: how do I get these values in the form of a dictionary in
an RDD?
Sample file:
12
15
Hi Spark user,
I am new to Spark, so forgive me for asking a basic question. I'm trying to
import my TSV file into Spark. This file has a key and a value separated by
a \t on each line. I want to import this file as a dictionary of key-value
pairs in Spark.
I came across this code to do the same for CSV