Thanks Jacques, I tried it with the latest nightly and it works.

My query also worked on 1.4 (MapR) once I disabled the union type; the
first line of DRILL-4410 tipped me off to that. I'd been running queries
for a long time in a session where I'd enabled the union type some time
ago, and my failing queries didn't actually need it.
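
For anyone else who hits this: the union type is a session/system option, so
(assuming I'm remembering the option name correctly) turning it off is just:

  ALTER SESSION SET `exec.enable_union_type` = false;

and you can check the current value with:

  SELECT * FROM sys.options WHERE name = 'exec.enable_union_type';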



 ----
 Vince Gonzalez
 Systems Engineer
 212.694.3879

 mapr.com

On Mon, Mar 7, 2016 at 12:07 PM, Jacques Nadeau <[email protected]> wrote:

> I think this is likely related to:
>
> https://issues.apache.org/jira/browse/DRILL-4410
>
> A fix has been merged for this. Can you try from tip of master and see if
> this is resolved?
>
> thanks,
> Jacques
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Mar 7, 2016 at 6:04 AM, Vince Gonzalez <[email protected]> wrote:
>
> > Hanifi,
> >
> > I just bumped into this as well.
> >
> > Error: SYSTEM ERROR: OversizedAllocationException: Unable to expand the
> > buffer. Max allowed buffer size is reached.
> >
> > Earlier in the thread you say:
> >
> > > You see this exception because one of the columns in your dataset is
> > > larger than an individual DrillBuf could store. The hard limit
> > > is Integer.MAX_VALUE bytes.
> >
> >
> > So is it right to say the maximum buffer size is 2GB? I'm getting this
> > exception on a data set whose *total* size is less than 1GB - as reported
> > by "du -sh" on the top level directory I am querying. So I'm confused.
> >
> > I have a guess as to which column in my dataset is causing the problem.
> > It's likely a substantial JSON document that comes from a file, and the
> > size of that file varies widely. I process the file into a dictionary in
> > Python before writing it to my workspace in a format that works but for
> > this issue. The largest of these documents weighs in at only 320KB.
> >
> > I could go down the path of reshaping the large document so that Drill
> > sees multiple columns, but I don't see how I can be sure that will work
> > since all of the columns in my data are so far below Integer.MAX_VALUE
> > bytes.
> >
> > Is there any other recommendation you can make, apart from further ETLing
> > the data?
> >
> > --vince
> >
> >
> >  ----
> >  Vince Gonzalez
> >  Systems Engineer
> >  212.694.3879
> >
> >  mapr.com
> >
> > On Mon, Feb 8, 2016 at 2:05 PM, Hanifi Gunes <[email protected]> wrote:
> >
> > > Thanks for the feedback. Yep, my answer was much more dev-focused than
> > > user-focused.
> > >
> > > The error is a manifestation of extremely wide columns in your dataset. I
> > > would recommend splitting the list if that's an option.
> > >
> > > Assuming the problem column is a list of integers as below
> > >
> > > {
> > > "wide": [1,2,.....N]
> > > }
> > >
> > > after splitting it should look like
> > >
> > > {
> > > "wide0": [1,2,.....X],
> > > "wide1": [Y,.......Z]
> > > ...
> > > "wideN": [T,.......N]
> > > }
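> > >
> > > If you end up doing the split during an ETL step, a rough sketch of one
> > > way to do it (in Python, with a purely hypothetical chunk size and field
> > > names) might be:
> > >
> > > # sketch only: split one wide list field into several narrower ones
> > > doc = {"wide": list(range(5000))}   # stand-in for your record
> > > CHUNK = 1000                        # hypothetical max values per column
> > > wide = doc.pop("wide")
> > > for i, start in enumerate(range(0, len(wide), CHUNK)):
> > >     doc["wide%d" % i] = wide[start:start + CHUNK]
> > > # doc now holds "wide0".."wideN" instead of one oversized "wide" list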
> > >
> > > Sounds like a good idea to enhance the error reporting with the file &
> > > column name. I filed [1] to track this.
> > >
> > > Thanks.
> > >
> > > 1: https://issues.apache.org/jira/browse/DRILL-4371
> > >
> > >
> > > On Fri, Feb 5, 2016 at 6:28 PM, John Omernik <[email protected]> wrote:
> > >
> > > > Excuse my basic questions: when you say "we", are you referring to the
> > > > Drill developers? So what is Integer.MAX_VALUE bytes? Is that a
> > > > query-time setting? A drillbit setting? Is it editable? How does that
> > > > value get interpreted for complex data types (objects and arrays)?
> > > >
> > > > Not only would the column be helpful, but the source file as well (if
> > > > this is an individual-record issue... or is this a cumulative error
> > > > where the summed length of multiple records in a column is what hits
> > > > the limit?).
> > > >
> > > >
> > > > Thoughts on how as a user I could address this in my dataset?
> > > >
> > > > Thanks!
> > > >
> > > > On Friday, February 5, 2016, Hanifi Gunes <[email protected]> wrote:
> > > >
> > > > > You see this exception because one of the columns in your dataset is
> > > > > larger than an individual DrillBuf could store. The hard limit is
> > > > > Integer.MAX_VALUE bytes. When we try to expand one of the buffers, we
> > > > > notice the allocation request is oversized and fail the query. It
> > > > > would be nice if the error message contained the column that raised
> > > > > this issue, though.
> > > > >
> > > > > On Fri, Feb 5, 2016 at 1:39 PM, John Omernik <[email protected]> wrote:
> > > > >
> > > > > > Any thoughts on how to troubleshoot this? (I have some fat JSON data
> > > > > > going into the buffers, apparently.) It's not huge data, just
> > > > > > wide/complex (total size is 1.4 GB). Any thoughts on how to
> > > > > > troubleshoot, or settings I can use to work through these errors?
> > > > > >
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > >
> > > > > > John
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Error: SYSTEM ERROR: OversizedAllocationException: Unable to expand
> > > > > > the buffer. Max allowed buffer size is reached.
> > > > > >
> > > > > > Fragment 1:11
> > > > > >
> > > > > > [Error Id: db21dea0-ddd7-4fcf-9fea-b5031e358dad on node1
> > > > > >
> > > > > >   (org.apache.drill.exec.exception.OversizedAllocationException)
> > > > > > Unable to expand the buffer. Max allowed buffer size is reached.
> > > > > >     org.apache.drill.exec.vector.UInt1Vector.reAlloc():214
> > > > > >     org.apache.drill.exec.vector.UInt1Vector$Mutator.setValueCount():469
> > > > > >     org.apache.drill.exec.vector.complex.ListVector$Mutator.setValueCount():324
> > > > > >     org.apache.drill.exec.physical.impl.ScanBatch.next():247
> > > > > >     org.apache.drill.exec.record.AbstractRecordBatch.next():119
> > > > > >     org.apache.drill.exec.record.AbstractRecordBatch.next():109
> > > > > >     org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> > > > > >     org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():132
> > > > > >     org.apache.drill.exec.record.AbstractRecordBatch.next():162
> > > > > >     org.apache.drill.exec.record.AbstractRecordBatch.next():119
> > > > > >     org.apache.drill.exec.test.generated.StreamingAggregatorGen1931.doWork():172
> > > > > >     org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():167
> > > > > >     org.apache.drill.exec.record.AbstractRecordBatch.next():162
> > > > > >     org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> > > > > >     org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
> > > > > >     org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> > > > > >     org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():256
> > > > > >     org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():250
> > > > > >     java.security.AccessController.doPrivileged():-2
> > > > > >     javax.security.auth.Subject.doAs():415
> > > > > >     org.apache.hadoop.security.UserGroupInformation.doAs():1595
> > > > > >     org.apache.drill.exec.work.fragment.FragmentExecutor.run():250
> > > > > >     org.apache.drill.common.SelfCleaningRunnable.run():38
> > > > > >     java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> > > > > >     java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> > > > > >     java.lang.Thread.run():745 (state=,code=0)
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Sent from my iThing
> > > >
> > >
> >
>
