The data itself is fine: the exact same Parquet directory works when read
from the local drive and only fails when read from HDFS.  I tried casting
as you suggested, but that ended up with the exact same problem.

On Tue, Jan 6, 2015 at 9:49 AM, MapR <[email protected]> wrote:

> Please try casting the column data type. Also please verify that all the
> column data satisfies your data type.
>
> Sudhakar Thota
>
> > On Jan 5, 2015, at 5:56 AM, Adam Gilmore <[email protected]> wrote:
> >
> > The actual stack trace is:
> >
> > 2015-01-05 13:48:27,356 [2b5569d5-3771-748d-1390-3a8930d02002:frag:1:12] ERROR o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
> > org.apache.drill.common.exceptions.DrillRuntimeException: java.io.IOException: can not read class parquet.format.PageHeader: don't know what type: 13
> >        at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:427) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:158) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.buildSchema(StreamingAggBatch.java:83) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:130) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:97) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:114) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
> >        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
> >        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> > Caused by: java.io.IOException: can not read class parquet.format.PageHeader: don't know what type: 13
> >        at parquet.format.Util.read(Util.java:50) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at parquet.format.Util.readPageHeader(Util.java:26) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at org.apache.drill.exec.store.parquet.ColumnDataReader.readPageHeader(ColumnDataReader.java:47) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.store.parquet.columnreaders.PageReader.next(PageReader.java:169) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.processPages(NullableColumnReader.java:76) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.readAllFixedFields(ParquetRecordReader.java:366) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:409) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
> >        ... 15 common frames omitted
> > Caused by: parquet.org.apache.thrift.protocol.TProtocolException: don't know what type: 13
> >        at parquet.org.apache.thrift.protocol.TCompactProtocol.getTType(TCompactProtocol.java:806) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at parquet.org.apache.thrift.protocol.TCompactProtocol.readListBegin(TCompactProtocol.java:536) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at parquet.org.apache.thrift.protocol.TCompactProtocol.readSetBegin(TCompactProtocol.java:547) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:128) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at parquet.format.PageHeader.read(PageHeader.java:897) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        at parquet.format.Util.read(Util.java:47) ~[parquet-format-2.1.1-drill-r1.jar:na]
> >        ... 21 common frames omitted
> >
> >
> >> On Mon, Jan 5, 2015 at 6:26 PM, Adam Gilmore <[email protected]> wrote:
> >>
> >> Hi all,
> >>
> >> I'm trying to do a really simple query on a parquet directory on HDFS.
> >>
> >> This works fine:
> >>
> >> select count(*) from hdfs.warehouse.saleparquet
> >>
> >> However, this fails:
> >>
> >> 0: jdbc:drill:local> select sum(sellprice) from hdfs.warehouse.saleparquet;
> >> Query failed: Query failed: Failure while running fragment., You tried to
> >> do a batch data read operation when you were in a state of STOP.  You can
> >> only do this type of operation when you are in a state of OK or
> >> OK_NEW_SCHEMA. [ 92fc8807-220b-466c-bbac-1f524d4251cb on
> >> ip-10-8-1-154.ap-southeast-2.compute.internal:31010 ]
> >> [ 92fc8807-220b-466c-bbac-1f524d4251cb on
> >> ip-10-8-1-154.ap-southeast-2.compute.internal:31010 ]
> >>
> >>
> >> Error: exception while executing query: Failure while executing query.
> >> (state=,code=0)
> >>
> >> Seems like a very simple query.
> >>
> >> Funnily enough, if I copy it off HDFS to the local system and run the same
> >> query against the local file, it works fine.  Just purely something to do
> >> with HDFS.
> >>
> >> Any ideas?  I'm running 0.7.
> >>
>
