Hi, I am experiencing some crashes when using Spark over local files (mainly for testing). Some operations fail with:
java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "/bin/ls": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:206)
    at org.apache.hadoop.util.Shell.run(Shell.java:188)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:467)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:450)
    at org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:593)
    at org.apache.hadoop.fs.RawLocalFileSystem.access$100(RawLocalFileSystem.java:51)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:514)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getPermission(RawLocalFileSystem.java:489)
    at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$buildScan$1$$anon$1$$anonfun$12.apply(newParquet.scala:292)
    etcetera...

This seems to be related to Shell.java in org.apache.hadoop.util, which runs "ls -ld" to figure out file permissions (via RawLocalFileSystem.loadPermissionInfo). The problem is that instead of just calling ls, Shell.java calls /bin/ls, which is usually available but in certain circumstances might not be. Regardless of the reasons for not having ls in /bin, hardcoding the directory prevents users from using the standard mechanism their systems provide for deciding which binaries to run (in this case, $PATH). So I wonder whether there is a particular reason that path was hardcoded as an absolute path instead of being left as something resolvable via $PATH. Or in other words: is this a bug or a feature?

Best

--
Samuel
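P.S. For illustration, here is a minimal sketch of the $PATH-based lookup I have in mind. This is hypothetical code, not Hadoop's actual implementation; the class and method names are mine:

```java
import java.io.File;

public class PathResolve {

    // Hypothetical helper: resolve a command name against $PATH
    // instead of hardcoding an absolute location such as /bin/ls.
    public static String resolve(String cmd) {
        String path = System.getenv("PATH");
        if (path == null) {
            return null;
        }
        // Walk each directory on $PATH and return the first match
        // that exists and is executable, mirroring what a shell does.
        for (String dir : path.split(File.pathSeparator)) {
            File candidate = new File(dir, cmd);
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate.getAbsolutePath();
            }
        }
        return null; // not found anywhere on $PATH
    }

    public static void main(String[] args) {
        // On a typical Unix system this would find ls wherever it
        // actually lives (/bin, /usr/bin, ...), not only in /bin.
        System.out.println(resolve("ls"));
    }
}
```

With something like this, Shell.java could pass the resolved path (or simply the bare command name, letting ProcessBuilder consult $PATH itself) instead of the fixed "/bin/ls".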