So the reader being used by default in Drill 1.6 is the one from the
standard Parquet library, then? (Not the special Drill reader?)
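
For reference, the current setting can be checked from sys.options, e.g.:

  SELECT * FROM sys.options WHERE name = 'store.parquet.use_new_reader';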

On Sat, May 28, 2016 at 7:22 AM, Abdel Hakim Deneche <[email protected]>
wrote:

> The new Parquet reader, the complex reader, is disabled by default. You can
> enable it by setting the following option to true:
>
> store.parquet.use_new_reader
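>
> For example, to enable it for the current session only (ALTER SYSTEM
> would apply it system-wide):
>
>   ALTER SESSION SET `store.parquet.use_new_reader` = true;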
>
> On Sat, May 28, 2016 at 4:56 AM, John Omernik <[email protected]> wrote:
>
> > I remember reading that Drill uses two readers, one for certain cases (I
> > think flat structures) and the other for complex structures. A. Am I
> > remembering correctly? B. If so, can I determine via the plan or something
> > which one is being used? And C. Can I force Drill to try the other reader?
> >
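> > (For B, I'd expect the plan text from something like
> >
> >   EXPLAIN PLAN FOR SELECT * FROM dfs.`/path/to/file.parquet`;
> >
> > to name the reader, though I'm guessing; the path is just a placeholder.)
> >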
> > On Saturday, May 28, 2016, Ted Dunning <[email protected]> wrote:
> >
> > > The Parquet user/dev mailing list might be helpful here. They have a real
> > > stake in making sure that all readers/writers can work together. The
> > > problem here really does sound like there is a borderline case that isn't
> > > handled as well in Drill's special-purpose Parquet reader as in the
> > > normal readers.
> > >
> > > On Fri, May 27, 2016 at 7:23 PM, John Omernik <[email protected]> wrote:
> > >
> > > > So, working with MapR support, we tried that with Impala, but it didn't
> > > > produce the desired results because the output file worked fine in
> > > > Drill. Theory: the evil file is created in MapReduce, using a different
> > > > writer than Impala uses. Impala can read the evil file, but when it
> > > > writes, it uses its own writer, "fixing" the issue on the fly. Thus,
> > > > Drill can't read the evil file, but if we try to reduce it with Impala,
> > > > the file is no longer evil; consider it... chaotic neutral ... (For all
> > > > you D&D fans.)
> > > >
> > > > I'd ideally love to extract into badness, but I'm on the phone now with
> > > > MapR support to figure out HOW; that is the question at hand.
> > > >
> > > > John
> > > >
> > > > On Fri, May 27, 2016 at 10:09 AM, Ted Dunning <[email protected]> wrote:
> > > >
> > > > > On Thu, May 26, 2016 at 8:50 PM, John Omernik <[email protected]> wrote:
> > > > >
> > > > > > So, if we have a known "bad" Parquet file (I use quotes because,
> > > > > > remember, Impala queries this file just fine) created in MapReduce,
> > > > > > with a BIGINT-typed column causing Array Index Out of Bounds
> > > > > > problems: what would your next steps be to troubleshoot?
> > > > > >
> > > > >
> > > > > I would start reducing the size of the evil file.
> > > > >
> > > > > If you have a tool that can query the bad Parquet file and write a
> > > > > new one (it sounds like Impala might do here), then selecting just
> > > > > the evil column is a good first step.
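> > > > >
> > > > > In Impala, that might look something like this (table and column
> > > > > names here are hypothetical):
> > > > >
> > > > >   CREATE TABLE evil_col STORED AS PARQUET
> > > > >   AS SELECT bad_bigint_col FROM evil_table;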
> > > > >
> > > > > After that, I would start bisecting to find a small range that still
> > > > > causes the problem. There may not be such a range, but it is a good
> > > > > thing to try.
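> > > > >
> > > > > For example, by repeatedly halving a value range on the column itself
> > > > > (the bounds here are purely illustrative):
> > > > >
> > > > >   CREATE TABLE evil_slice STORED AS PARQUET
> > > > >   AS SELECT bad_bigint_col FROM evil_col
> > > > >   WHERE bad_bigint_col BETWEEN 0 AND 1000000;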
> > > > >
> > > > > At that point, you could easily have the problem down to a few
> > > > > kilobytes of data that can be used in a unit test.
> > > > >
> > > >
> > >
> >
> >
> > --
> > Sent from my iThing
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
