Hi, does anyone know a simple way to load multiple CSV files (same schema) that live in different paths? Wildcards are not a solution, since I want to load specific CSV files from different folders.
I came across a solution (https://stackoverflow.com/questions/37639956/how-to-import-multiple-csv-files-in-a-single-load) that suggests something like:

    spark.read.format("csv")
      .option("header", "false")
      .schema(custom_schema)
      .option("delimiter", "\t")
      .option("mode", "DROPMALFORMED")
      .load(paths.split(','))

However, even though it mentions that this approach works in Spark 2.x, I can't find an implementation of load that accepts an Array[String] as an input parameter.

Thanks in advance for your help.

Didac Gil de la Iglesia
PhD in Computer Science
didacg...@gmail.com
Spain: +34 696 285 544
Sweden: +46 (0)730229737
Skype: didac.gil.de.la.iglesia
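[Editor's note: for reference, the Spark 2.x Scala API declares DataFrameReader.load with a varargs signature, load(paths: String*), rather than one taking Array[String]; a Seq or Array can be expanded into it with ": _*". A minimal sketch under that assumption (the schema, column names, and paths below are made up for illustration):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.types.{StructType, StringType}

    val spark = SparkSession.builder.appName("multi-csv").getOrCreate()

    // Hypothetical schema; replace with your own.
    val customSchema = new StructType()
      .add("id", StringType)
      .add("value", StringType)

    // Hypothetical explicit paths in different folders.
    val paths = Seq("/data/folderA/a.csv", "/data/folderB/b.csv")

    // load(paths: String*) is varargs, so expand the Seq with ": _*".
    val df = spark.read
      .format("csv")
      .option("header", "false")
      .schema(customSchema)
      .option("delimiter", "\t")
      .option("mode", "DROPMALFORMED")
      .load(paths: _*)

The same varargs expansion should work with spark.read.csv(paths: _*).]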