Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1083#discussion_r160685002
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java ---
    @@ -250,20 +250,12 @@ private boolean metaDataFileExists(FileSystem fs, FileStatus dir) throws IOExcep
         }
     
         boolean isDirReadable(DrillFileSystem fs, FileStatus dir) {
    -      Path p = new Path(dir.getPath(), ParquetFileWriter.PARQUET_METADATA_FILE);
           try {
    -        if (fs.exists(p)) {
    -          return true;
    -        } else {
    -
    -          if (metaDataFileExists(fs, dir)) {
    -            return true;
    -          }
    -          List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
    -          return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
    -        }
    +        // There should be at least one file, which is readable by Drill
    +        List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
    +        return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
    --- End diff --
    
    I did it on purpose. With the old logic of the isDirReadable() method, a
    directory that contains Parquet metadata files but no data files is processed
    by ParquetGroupScan as a Parquet table. That leads to an exception:
    
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java#L878
    
    To process such a table with SchemalessScan, the isReadable method should
    return false in that case. In other words, it shouldn't check for the presence
    of metadata cache files, only for files that Drill can actually read.
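
    A minimal, self-contained sketch of the difference, not Drill's actual code.
    It models a directory as a list of file names. The helpers listVisibleFiles
    and isFileReadable are hypothetical stand-ins: the first assumes the listing
    skips names starting with '.' or '_', and the second treats a ".parquet"
    suffix as "readable". The metadata file names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DirReadableSketch {

  // Hypothetical stand-in for the directory listing: assumes that names
  // starting with '.' or '_' (e.g. metadata files) are not returned.
  static List<String> listVisibleFiles(List<String> names) {
    return names.stream()
        .filter(n -> !n.startsWith(".") && !n.startsWith("_"))
        .collect(Collectors.toList());
  }

  // Hypothetical stand-in for the readability check on a single file.
  static boolean isFileReadable(String name) {
    return name.endsWith(".parquet");
  }

  // Old behaviour: the presence of a metadata file alone makes the directory "readable".
  static boolean oldIsDirReadable(List<String> names) {
    if (names.contains("_metadata") || names.contains(".drill.parquet_metadata")) {
      return true;
    }
    List<String> files = listVisibleFiles(names);
    return !files.isEmpty() && isFileReadable(files.get(0));
  }

  // New behaviour: there must be at least one visible file, and the first one must be readable.
  static boolean newIsDirReadable(List<String> names) {
    List<String> files = listVisibleFiles(names);
    return !files.isEmpty() && isFileReadable(files.get(0));
  }

  public static void main(String[] args) {
    List<String> onlyMetadata = Arrays.asList(".drill.parquet_metadata");
    List<String> withData = Arrays.asList(".drill.parquet_metadata", "0_0_0.parquet");

    System.out.println("old, metadata only: " + oldIsDirReadable(onlyMetadata));
    System.out.println("new, metadata only: " + newIsDirReadable(onlyMetadata));
    System.out.println("old, with data:     " + oldIsDirReadable(withData));
    System.out.println("new, with data:     " + newIsDirReadable(withData));
  }
}
```

    Under these assumptions only the metadata-only directory changes outcome:
    the old check accepts it, the new one rejects it, so such a directory can
    fall through to SchemalessScan instead of ParquetGroupScan.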

