Hi Rex,

If the CSV files are in the same folder and there are no other files, specifying the directory to sc.textFile() (or equivalent) will pull in all the files. If there are other files, you can pass in a glob pattern that captures just the two files you care about (if that's possible). If neither of these works for you, you can create an individual RDD for each file and union them.
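A rough sketch of all three options (assuming PySpark with an existing SparkContext `sc`; the paths and file names are hypothetical, and this needs a Spark installation to actually run):

```python
# Hedged sketch -- assumes a running SparkContext `sc`; paths are made up.

# Option 1: point textFile() at the directory; it reads every file inside.
all_lines = sc.textFile("/path/to/csv_dir")

# Option 2: use a glob pattern to select only the files you care about.
two_files = sc.textFile("/path/to/csv_dir/{file1,file2}.csv")

# Option 3: one RDD per file, then union them.
rdd1 = sc.textFile("/path/to/csv_dir/file1.csv")
rdd2 = sc.textFile("/path/to/csv_dir/file2.csv")
combined = rdd1.union(rdd2)
```

Note that textFile() also accepts a comma-separated list of paths, e.g. sc.textFile("/path/a.csv,/path/b.csv"), which avoids the explicit union when you already know the file names.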
-sujit

On Fri, Jun 26, 2015 at 11:00 AM, Rex X <dnsr...@gmail.com> wrote:
> With Python Pandas, it is easy to do concatenation of dataframes
> by combining pandas.concat
> <http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html>
> and pandas.read_csv
>
>     pd.concat([pd.read_csv(os.path.join(Path_to_csv_files, f)) for f in csvfiles])
>
> where "csvfiles" is the list of csv files
>
> How can we do this in Spark?