With fileStream you are free to plug in any InputFormat; in your case, you can easily plug in ParquetInputFormat. Here are some parquet-hadoop examples: <https://github.com/Parquet/parquet-mr/tree/master/parquet-hadoop/src/main/java/parquet/hadoop/example>.
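
As a rough, untested sketch of what that could look like with the stock fileStream API (not our custom build): it assumes the ExampleInputFormat and Group classes from the parquet-hadoop example package linked above, and it assumes your data files end in ".parquet". The object name, the isParquetFile helper, and the 30-second batch interval are only placeholders for illustration. Stock fileStream watches a single directory; the recursive 4th argument exists only in our custom build.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import parquet.example.data.Group
import parquet.hadoop.example.ExampleInputFormat

object ParquetFileStreamSketch {
  // Hypothetical helper: accept only files that look like Parquet data,
  // so marker files such as _SUCCESS are skipped.
  // Assumption: data files carry a ".parquet" suffix.
  def isParquetFile(path: Path): Boolean = path.getName.endsWith(".parquet")

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("ParquetFileStreamSketch")
    val ssc = new StreamingContext(conf, Seconds(30)) // placeholder interval

    // ParquetInputFormat uses Void keys; ExampleInputFormat yields Group values.
    // The third argument (newFilesOnly = true) ignores files already present.
    val records = ssc.fileStream[Void, Group, ExampleInputFormat](
      "/datadumps/", isParquetFile _, true)

    // Group is not Serializable, so convert each record to a String
    // before anything is sent back to the driver.
    records.map { case (_, group) => group.toString }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If you need typed rows rather than Group objects, you would have to supply your own ReadSupport; I have not tried that here.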

Thanks
Best Regards

On Thu, Mar 12, 2015 at 5:51 PM, Masf <masfwo...@gmail.com> wrote:

> Hi.
>
> Thanks for your answers, but is it true that to read parquet files it is
> necessary to use the parquetFile method in org.apache.spark.sql.SQLContext?
>
> How can I combine your solution with a call to this method?
>
> Thanks!!
> Regards
>
> On Thu, Mar 12, 2015 at 8:34 AM, Yijie Shen <henry.yijies...@gmail.com>
> wrote:
>
>> org.apache.spark.deploy.SparkHadoopUtil has a method:
>>
>> /**
>>  * Get [[FileStatus]] objects for all leaf children (files) under the
>>  * given base path. If the given path points to a file, return a
>>  * single-element collection containing [[FileStatus]] of that file.
>>  */
>> def listLeafStatuses(fs: FileSystem, basePath: Path): Seq[FileStatus] = {
>>   def recurse(path: Path) = {
>>     val (directories, leaves) = fs.listStatus(path).partition(_.isDir)
>>     leaves ++ directories.flatMap(f => listLeafStatuses(fs, f.getPath))
>>   }
>>
>>   val baseStatus = fs.getFileStatus(basePath)
>>   if (baseStatus.isDir) recurse(basePath) else Array(baseStatus)
>> }
>>
>> --
>> Best Regards!
>> Yijie Shen
>>
>> On March 12, 2015 at 2:35:49 PM, Akhil Das (ak...@sigmoidanalytics.com)
>> wrote:
>>
>> Hi
>>
>> We have a custom build to read directories recursively. Currently we use
>> it with fileStream like:
>>
>> val lines = ssc.fileStream[LongWritable, Text, TextInputFormat](
>>   "/datadumps/", (t: Path) => true, true, true)
>>
>> Setting the 4th argument to true makes it read recursively.
>>
>> You could give it a try:
>> https://s3.amazonaws.com/sigmoidanalytics-builds/spark-1.2.0-bin-spark-1.2.0-hadoop2.4.0.tgz
>>
>> Thanks
>> Best Regards
>>
>> On Wed, Mar 11, 2015 at 9:45 PM, Masf <masfwo...@gmail.com> wrote:
>>
>>> Hi all
>>>
>>> Is it possible to read folders recursively to read parquet files?
>>>
>>> Thanks.
>>>
>>> --
>>>
>>> Saludos.
>>> Miguel Ángel
>>>
>>
>
> --
>
> Saludos.
> Miguel Ángel