Re: Parsing a tsv file with key value pairs

2015-06-25 Thread Don Drake
Use this package: https://github.com/databricks/spark-csv and change the delimiter to a tab. The documentation is pretty straightforward; you'll get a DataFrame back from the parser. -Don On Thu, Jun 25, 2015 at 4:39 AM, Ravikant Dindokar ravikant.i...@gmail.com wrote: So I have a file
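For readers following along, the spark-csv suggestion above might look roughly like this in PySpark. This is only a sketch: it assumes Spark 1.4 or later, that the spark-csv package is already on the classpath, and that an existing SQLContext is passed in. The helper name read_tsv and the file name in the usage note are hypothetical.

```python
def read_tsv(sql_context, path):
    """Load a tab-separated file as a DataFrame through spark-csv.

    Assumptions: sql_context is an existing SQLContext and the
    spark-csv package has been added to the job.
    """
    return (
        sql_context.read
        .format("com.databricks.spark.csv")  # data source name used by spark-csv
        .option("delimiter", "\t")           # split columns on tabs, not commas
        .option("header", "false")           # assume the file has no header row
        .load(path)
    )
```

Something like read_tsv(sqlContext, "edges.tsv") would then return the DataFrame mentioned in the reply.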

Re: Parsing a tsv file with key value pairs

2015-06-25 Thread anshu shukla
Can you be more specific, or can you provide a sample file? On Thu, Jun 25, 2015 at 11:00 AM, Ravikant Dindokar ravikant.i...@gmail.com wrote: Hi Spark user, I am new to spark so forgive me for asking a basic question. I'm trying to import my tsv file into spark. This file has key and value

Re: Parsing a tsv file with key value pairs

2015-06-25 Thread Ravikant Dindokar
So I have a file where each line represents an edge in the graph and has two values separated by a tab. Both values are vertex IDs (source and sink). I want to parse this file as a dictionary in a Spark RDD. So my question is: how do I get these values in the form of a dictionary in an RDD? Sample file: 12 15
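One possible way to approach the question above without any extra package is to split each line on the tab and keep the result as a pair RDD. The sketch below assumes the vertex IDs are integers (as in the sample line) and that an existing SparkContext is passed in as sc. The function names parse_edge, load_edges, and adjacency are hypothetical helpers, not part of Spark.

```python
def parse_edge(line):
    """Turn one 'source<TAB>sink' line into a (source, sink) tuple.

    Assumption: both vertex IDs are integers.
    """
    source, sink = line.rstrip("\n").split("\t")
    return int(source), int(sink)


def load_edges(sc, path):
    """Read the edge file into an RDD of (source, sink) pairs.

    Assumption: sc is an existing SparkContext.
    """
    return sc.textFile(path).map(parse_edge)


def adjacency(edges_rdd):
    """Collect the edges to the driver as {source: [sink, ...]}.

    Grouping first keeps every sink of a source; collecting the raw
    pairs as a map would keep only one sink per source.
    """
    return edges_rdd.groupByKey().mapValues(list).collectAsMap()
```

A pair RDD already behaves like a distributed set of key-value pairs, so collecting it into a driver-side dictionary is only sensible when the graph fits in driver memory.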

Parsing a tsv file with key value pairs

2015-06-24 Thread Ravikant Dindokar
Hi Spark user, I am new to Spark, so forgive me for asking a basic question. I'm trying to import my TSV file into Spark. This file has a key and a value separated by a \t on each line. I want to import this file as a dictionary of key-value pairs in Spark. I came across this code to do the same for csv