Re: question regarding pyspark

2017-04-21 Thread Pushkar.Gujar
Hi Afshin, If you need to associate the header information from the second file with the first one, you can do that by specifying a custom schema. Below is an example from the spark-csv package. As you can guess, you will have to do some preprocessing to create customSchema by first reading the second file. val
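
For PySpark, the same custom-schema idea might look roughly like the sketch below. It is not the spark-csv example referred to above. It assumes Spark 2.x or later (where spark.read.csv is built in), that the header file is a single comma-separated line, and that reading every column as a string is acceptable. The helper names read_header_names and load_csv_with_header_file, and the file paths in the usage note, are hypothetical.

```python
import csv


def read_header_names(header_path):
    """Return column names from a one-line, comma-separated header file."""
    with open(header_path, newline="") as f:
        # Take the first row only and trim stray whitespace around each name.
        return [name.strip() for name in next(csv.reader(f))]


def load_csv_with_header_file(spark, data_path, header_path):
    """Load a headerless CSV, taking column names from a separate file."""
    # Imported here so read_header_names can be used without Spark installed.
    from pyspark.sql.types import StringType, StructField, StructType

    names = read_header_names(header_path)
    # Assumption: every column is read as a nullable string; cast later as needed.
    schema = StructType([StructField(n, StringType(), True) for n in names])
    return spark.read.csv(data_path, schema=schema, header=False)
```

With an existing SparkSession named spark, this could be called as load_csv_with_header_file(spark, "data.csv", "headers.csv"). If types do not matter up front, another option is to load the file without a schema and rename the columns afterwards with toDF(*names).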

question regarding pyspark

2017-04-21 Thread Afshin, Bardia
I’m ingesting a CSV with hundreds of columns, and the original CSV file itself doesn’t have a header. I do have a separate file that contains just the headers. Is there a way to pass this information to the Spark API when loading the CSV file, or do I have to do some preprocessing first?