Re: possible issues with listing objects in the HadoopFSrelation

2015-08-12 Thread Cheng Lian
Hi Gil, Sorry for the late reply, and thanks for raising this question. The file listing logic in HadoopFsRelation is intentionally different from that of Hadoop's FileInputFormat. Here are the reasons: 1. Efficiency: when computing RDD partitions, FileInputFormat.listStatus() is called on the

Re: possible issues with listing objects in the HadoopFSrelation

2015-08-12 Thread Gil Vernik
@IBMIL, Dev dev@spark.apache.org Date: 12/08/2015 10:51 Subject: Re: possible issues with listing objects in the HadoopFSrelation Hi Gil, Sorry for the late reply and thanks for raising this question. The file listing logic in HadoopFsRelation is intentionally made different from