Hi all,

Thanks for the reply. I'm using parquetFile as input; is that a problem? With hadoop fs -ls, the path (hdfs://domain/user/jianshuang/data/parquet/table/month=2014*) lists all the files.
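
One thing that may be worth checking: a pattern like month=2014* matches the month=... partition directories themselves rather than the files inside them. Below is a minimal local sketch of that behaviour, using Python's standard glob module as a stand-in. It is not Hadoop's globStatus (whose semantics can differ in details), and the directory layout and file names are made up for illustration.

```python
import glob
import os
import tempfile


def make_partitions(root, months):
    """Create hypothetical month=YYYYMM partition directories, one file each."""
    for month in months:
        part_dir = os.path.join(root, "table", f"month={month}")
        os.makedirs(part_dir)
        # Empty placeholder standing in for a Parquet part file (assumed name).
        open(os.path.join(part_dir, "part-00000.parquet"), "w").close()


def match_partitions(root, pattern):
    """Return the sorted entry names under root/table that match the glob."""
    # Escape the base path so only `pattern` is interpreted as a glob.
    base = glob.escape(os.path.join(root, "table"))
    hits = glob.glob(os.path.join(base, pattern))
    return sorted(os.path.basename(h) for h in hits)


with tempfile.TemporaryDirectory() as root:
    make_partitions(root, ["201312", "201401", "201402"])
    matched = match_partitions(root, "month=2014*")
    # Every match is a directory, not a data file.
    all_dirs = all(os.path.isdir(os.path.join(root, "table", m)) for m in matched)
    print(matched)
    print(all_dirs)
```

On this made-up layout the pattern selects the two 2014 directories and skips month=201312. Whether parquetFile in Spark 1.0.0 expands such a pattern the same way is not something this sketch can confirm.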
I'll test it again.

Jianshi

On Wed, Jun 18, 2014 at 2:23 PM, Jianshi Huang <jianshi.hu...@gmail.com> wrote:

> Hi Andrew,
>
> Strangely, my Spark (1.0.0, compiled against Hadoop 2.4.0) log says
> file not found. I'll try again.
>
> Jianshi
>
> On Wed, Jun 18, 2014 at 12:36 PM, Andrew Ash <and...@andrewash.com> wrote:
>
>> In Spark you can use the normal globs supported by Hadoop's FileSystem,
>> which are documented here:
>> http://hadoop.apache.org/docs/r2.3.0/api/org/apache/hadoop/fs/FileSystem.html#globStatus(org.apache.hadoop.fs.Path)
>>
>> On Wed, Jun 18, 2014 at 12:09 AM, MEETHU MATHEW <meethu2...@yahoo.co.in> wrote:
>>
>>> Hi Jianshi,
>>>
>>> I have used wildcard characters (*) in my program and it worked.
>>> My code was like this:
>>> b = sc.textFile("hdfs:///path to file/data_file_2013SEP01*")
>>>
>>> Thanks & Regards,
>>> Meethu M
>>>
>>> On Wednesday, 18 June 2014 9:29 AM, Jianshi Huang <jianshi.hu...@gmail.com> wrote:
>>>
>>> It would be convenient if Spark's textFile, parquetFile, etc. could
>>> support paths with wildcards, such as:
>>>
>>> hdfs://domain/user/jianshuang/data/parquet/table/month=2014*
>>>
>>> Or is there already a way to do it now?
>>>
>>> Jianshi
>>>
>>> --
>>> Jianshi Huang
>>>
>>> LinkedIn: jianshi
>>> Twitter: @jshuang
>>> Github & Blog: http://huangjs.github.com/