Aman Sinha created DRILL-1834:
---------------------------------

             Summary: Misleading error message when querying an empty Parquet file
                 Key: DRILL-1834
                 URL: https://issues.apache.org/jira/browse/DRILL-1834
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 0.7.0
            Reporter: Aman Sinha


A failed CTAS (CREATE TABLE AS) can leave behind an empty Parquet file. 
Querying that file fails with a misleading planner error that hides the 
original IOException (the log file does contain the original exception): 

{code:sql}
0: jdbc:drill:zk=local> select count(*) from dfs.`/tmp/empty.parquet`;
Query failed: Query failed: Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#77:ProjectRel.NONE.ANY([]).[](child=rel#76:Subset#0.ENUMERABLE.ANY([]).[],$f0=0), rel#68:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[dfs, /tmp/empty.parquet])]
{code}
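
The error above reports only the outermost failure. As a rough illustration of the behavior this issue asks for, the sketch below walks an exception's cause chain and appends the innermost message to the outer one. The class and method names (`RootCause`, `rootCause`, `describe`) are hypothetical and are not part of Drill; this is not how Drill builds its error messages.

```java
// Hypothetical helper, not Drill code: surfaces the innermost cause of a failure.
class RootCause {

    // Follow getCause() until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Build a message that keeps the outer text but also names the root cause.
    static String describe(Throwable t) {
        Throwable root = rootCause(t);
        if (root == t) {
            return String.valueOf(t.getMessage());
        }
        return t.getMessage() + " (root cause: "
                + root.getClass().getName() + ": " + root.getMessage() + ")";
    }
}
```

With a chain like the one in this report, `describe` would keep the planner text and append the "not a Parquet file (too small)" message instead of dropping it.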

The cause of the exception is in the logs: 
Caused by: java.io.IOException: Could not read footer: java.lang.RuntimeException: file:/tmp/empty.parquet is not a Parquet file (too small)
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:195) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.7.0-SNAPSHOT]
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:208) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.7.0-SNAPSHOT]
        at parquet.hadoop.ParquetFileReader.readFooters(ParquetFileReader.java:224) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.7.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.readFooter(ParquetGroupScan.java:208) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffer.jar:0.7.0-SNAPSHOT]
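
For context on the "too small" message: a Parquet file starts with the 4-byte magic `PAR1` and ends with a 4-byte footer length followed by `PAR1` again, so any file shorter than 12 bytes cannot be valid. The sketch below shows a pre-check along those lines that returns a direct, user-facing description of the problem. `ParquetSanityCheck` and `describeProblem` are hypothetical names used only for illustration; this is not the check that parquet-hadoop or Drill actually performs.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical pre-check, not Drill or parquet-hadoop code.
class ParquetSanityCheck {
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    // Leading magic + 4-byte footer length + trailing magic.
    static final int MIN_LENGTH = MAGIC.length + 4 + MAGIC.length;

    // Pure check on the file length and its last four bytes.
    // Returns null when nothing looks wrong, otherwise a readable description.
    static String describeProblem(String name, long length, byte[] tail) {
        if (length < MIN_LENGTH) {
            return name + " is not a Parquet file (too small: " + length + " bytes)";
        }
        if (!Arrays.equals(MAGIC, tail)) {
            return name + " is not a Parquet file (missing trailing PAR1 magic)";
        }
        return null;
    }

    // Convenience wrapper that reads the length and tail from a local file.
    static String describeProblem(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            byte[] tail = new byte[MAGIC.length];
            if (length >= MIN_LENGTH) {
                raf.seek(length - MAGIC.length);
                raf.readFully(tail);
            }
            return describeProblem(path, length, tail);
        }
    }
}
```

A check of this kind only inspects the file's size and trailing magic; it does not validate the footer metadata itself.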




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
