I just found a possible answer: http://themodernlife.github.io/scala/spark/hadoop/hdfs/2014/09/28/spark-input-filename/
Will give it a try. It is a bit troublesome, but if it works it will give me what I want. Sorry for bothering everyone here.

Regards,
Shuai

On Sun, Dec 21, 2014 at 4:43 PM, Shuai Zheng <szheng.c...@gmail.com> wrote:

> Hi All,
>
> When I load a folder into RDDs, is there any way to find the input file
> name of a particular partition, so I can track which file each partition
> came from?
>
> In Hadoop, I can find this information with the following code:
>
> FileSplit fileSplit = (FileSplit) context.getInputSplit();
> String strFilename = fileSplit.getPath().getName();
>
> But how can I do this in Spark?
>
> Regards,
>
> Shuai
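For anyone landing on this thread later: the technique in the linked post can be sketched roughly as below. It drops down to HadoopRDD so each partition can inspect its own FileSplit, directly mirroring the Hadoop snippet in the original question. The input path here is a placeholder assumption; adjust it for your data.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{FileSplit, TextInputFormat}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.HadoopRDD

object InputFileNames {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("input-file-names").setMaster("local[*]"))

    // Load the folder through the old-style Hadoop API so the underlying
    // RDD is a HadoopRDD, which exposes each partition's InputSplit.
    // "/data/input" is a placeholder path.
    val rdd = sc.hadoopFile[LongWritable, Text, TextInputFormat]("/data/input")

    // Cast to HadoopRDD and pair every line with the name of the file
    // its split came from, just like FileSplit.getPath().getName() in Hadoop.
    val linesWithFile = rdd.asInstanceOf[HadoopRDD[LongWritable, Text]]
      .mapPartitionsWithInputSplit { (split, iter) =>
        val fileName = split.asInstanceOf[FileSplit].getPath.getName
        iter.map { case (_, line) => (fileName, line.toString) }
      }

    linesWithFile.take(5).foreach(println)
    sc.stop()
  }
}
```

If you only need (file name, file contents) pairs and the files are small, `sc.wholeTextFiles("/data/input")` is a simpler alternative, at the cost of reading each file as a single record rather than line by line.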