@Akif, right. Although this issue manifests more than one problem, the
major cause of this behavior seems to be that copying nested data per row
overflows the underlying buffer, a problem that is known with flatten. We
should be able to fix this soon.
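
To illustrate the failure mode (a minimal sketch, not Drill's actual
allocator code): when an int capacity keeps doubling past Integer.MAX_VALUE
it wraps to a negative value, which is exactly the
"initialCapacity: -2147483648" in the error reported below.

    // Sketch only: int capacity arithmetic wrapping past Integer.MAX_VALUE.
    public class NegativeCapacitySketch {
      public static void main(String[] args) {
        int capacity = 1 << 30;   // 1,073,741,824
        capacity *= 2;            // wraps to -2,147,483,648 (Integer.MIN_VALUE)
        if (capacity < 0) {
          // An allocator that validates its argument rejects the request,
          // much like the IllegalArgumentException reported below.
          throw new IllegalArgumentException(
              "initialCapacity: " + capacity + " (expected: 0+)");
        }
      }
    }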

On Mon, Jun 22, 2015 at 9:49 AM, Andries Engelbrecht <
[email protected]> wrote:

> I was not able to read the JSON doc in the link with either VisualJSON or
> some online JSON editors.
> Are you sure the JSON document is structured properly?
>
> —Andries
>
>
> On Jun 20, 2015, at 2:37 AM, Akif Khan <[email protected]> wrote:
>
> > Hi
> >
> > I found out that Drill's flatten fails when the nesting is too large.
> > You can find the JSON on which it fails here:
> > https://gist.github.com/anonymous/d18a5da201a995084c1b
> >
> > When I ran the query
> >   select flatten(campaign['funders'])['user_id'] from `crowd/xal2.json`;
> > it failed, while it works perfectly on smaller nested JSON.
> >
> >
> >
> >> On Sat, Jun 20, 2015 at 12:51 AM, Jason Altekruse <[email protected]>
> >> wrote:
> >
> >> The allocation that is failing is not for the data actually required by the
> >> flatten operation, but for an unneeded copy of all of the lists. If we
> >> remove this copy from the plan, a lot more flatten queries will execute
> >> successfully. We still don't have a solution for a single list that does
> >> not fit in the max allocation size for a buffer, but that is a larger
> >> issue that needs to be addressed with some additional design work.
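> >>
> >> As a rough, back-of-envelope illustration (assuming a per-buffer cap on
> >> the order of Integer.MAX_VALUE bytes and an 8-byte value type; the exact
> >> limits in Drill may differ):
> >>
> >>   // Back-of-envelope sketch: when does a single flattened list exceed a
> >>   // buffer capped at Integer.MAX_VALUE bytes? (cap assumed for illustration)
> >>   public class SingleListSizeSketch {
> >>     public static void main(String[] args) {
> >>       long values = 300_000_000L;   // values in one flattened list (hypothetical)
> >>       long bytesPerValue = 8L;      // e.g. an 8-byte numeric value vector
> >>       long needed = values * bytesPerValue;   // 2,400,000,000 bytes
> >>       System.out.println("needs " + needed + " bytes, fits in one buffer: "
> >>           + (needed <= Integer.MAX_VALUE));   // prints false
> >>     }
> >>   }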
> >>
> >> On Fri, Jun 19, 2015 at 11:57 AM, Hanifi Gunes <[email protected]>
> >> wrote:
> >>
> >>> Jason pointed out a possible infinite-loop problem where the requested
> >>> allocation size is greater than the max allowed, so we will have to
> >>> address that before checking it in.
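> >>>
> >>> To make that concern concrete, here is a minimal sketch (not the actual
> >>> allocator code) of how a doubling reallocation that is clamped at a max
> >>> can spin forever once the requested size exceeds that max:
> >>>
> >>>   // Sketch only: a doubling realloc clamped at MAX_ALLOC never reaches a
> >>>   // target above MAX_ALLOC, so this loop never terminates.
> >>>   public class ReallocLoopSketch {
> >>>     public static void main(String[] args) {
> >>>       final int MAX_ALLOC = 1 << 20;   // hypothetical cap, for illustration
> >>>       long needed = 4L * MAX_ALLOC;    // requested size exceeds the cap
> >>>       int capacity = 4096;
> >>>       while (capacity < needed) {
> >>>         capacity = Math.min(capacity * 2, MAX_ALLOC);  // stuck at MAX_ALLOC
> >>>       }                                // never exits: the bug being illustrated
> >>>     }
> >>>   }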
> >>>
> >>> It is not entirely clear to me from the description of DRILL-3323 what
> >>> the problem and the proposal are. Is the issue solely targeting the
> >>> redundant vector copy? And also, how does that contribute to the
> >>> manifestation of the original problem?
> >>>
> >>> -Hanifi
> >>>
> >>> On Fri, Jun 19, 2015 at 10:17 AM, Jason Altekruse <[email protected]>
> >>> wrote:
> >>>
> >>>> The patch is currently in review, but I don't think it is necessarily
> >>>> going to fix this issue. I have been looking into issues with flatten,
> >>>> and I just opened a new JIRA that I think will actually address your
> >>>> issue. This is a bit of a low-level issue with how the flatten is
> >>>> currently being planned.
> >>>>
> >>>> https://issues.apache.org/jira/browse/DRILL-3323
> >>>>
> >>>> Are the lists that you are trying to flatten very large? If so, the
> >>>> failure is likely caused by the problem I just filed this JIRA against.
> >>>> I hope that we can get a fix for this issue into the 1.1 release.
> >>>>
> >>>> On Fri, Jun 19, 2015 at 1:41 AM, Akif Khan <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Hi All,
> >>>>>
> >>>>> Thanks for the response. @Hanifi Gunes, I wanted to ask whether the
> >>>>> patch is still being worked on or has already been released; I couldn't
> >>>>> see any patch on the JIRA dashboard.
> >>>>>
> >>>>> On Fri, Jun 19, 2015 at 1:26 AM, Hanifi Gunes <[email protected]>
> >>>> wrote:
> >>>>>
> >>>>>> The patch is in progress and should be checked in soon. It would be
> >>>>>> great if you could apply it and battle-test it.
> >>>>>>
> >>>>>> -Hanifi
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jun 18, 2015 at 9:18 AM, Abdel Hakim Deneche <
> >>>>>> [email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hey Akif,
> >>>>>>>
> >>>>>>> There is a known issue that looks similar to the error you reported:
> >>>>>>>
> >>>>>>> DRILL-2851 <https://issues.apache.org/jira/browse/DRILL-2851>
> >>>>>>>
> >>>>>>> There is already a patch in review to fix this, and it may fix your
> >>>>>>> issue or at the very least give you a more meaningful error message.
> >>>>>>> You could either wait until the patch is merged into master or try it
> >>>>>>> yourself and see if the issue has been fixed.
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>> On Thu, Jun 18, 2015 at 5:35 AM, Akif Khan <[email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi
> >>>>>>>>
> >>>>>>>> I am re-posting my query as there weren't any responses earlier.
> >>>>>>>> Please tell me why this error happens and whether it can be avoided,
> >>>>>>>> or is it due to bad data?
> >>>>>>>>
> >>>>>>>> I wrote the query mentioned below and got this error. I have an
> >>>>>>>> Amazon AWS setup with four nodes having 32 GB RAM and 8 cores,
> >>>>>>>> running Ubuntu with Hadoop FS and ZooKeeper installed:
> >>>>>>>>
> >>>>>>>> *Query*: select flatten(campaign['funders'])['user_id'] from
> >>>>>>>> `new_crowdfunding`;
> >>>>>>>>
> >>>>>>>> The *structure of the new_crowdfunding table* is as follows:
> >>>>>>>> https://gist.github.com/akifkhan/d864ad9dcf5be712ff24
> >>>>>>>>
> >>>>>>>> *Error after running for 40 seconds and printing various user_ids*:
> >>>>>>>>
> >>>>>>>> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR:
> >>>>>>>> java.lang.IllegalArgumentException: initialCapacity: -2147483648
> >>>>>>>> (expectd: 0+)
> >>>>>>>>
> >>>>>>>> Fragment 0:0
> >>>>>>>>
> >>>>>>>> [Error Id: 4fa13e31-ad84-42c6-aa50-c80c92ab026d on hadoop-slave1:31010]
> >>>>>>>> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> >>>>>>>> at
> >>>>>>>> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
> >>>>>>>> at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
> >>>>>>>> at sqlline.SqlLine.print(SqlLine.java:1583)
> >>>>>>>> at sqlline.Commands.execute(Commands.java:852)
> >>>>>>>> at sqlline.Commands.sql(Commands.java:751)
> >>>>>>>> at sqlline.SqlLine.dispatch(SqlLine.java:738)
> >>>>>>>> at sqlline.SqlLine.begin(SqlLine.java:612)
> >>>>>>>> at sqlline.SqlLine.start(SqlLine.java:366)
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>>
> >>>>>>> Abdelhakim Deneche
> >>>>>>>
> >>>>>>> Software Engineer
> >>>>>>>
> >>>>>>>  <http://www.mapr.com/>
> >>>>>>>
> >>>>>>>
> >>>>>>> Now Available - Free Hadoop On-Demand Training
> >>>>>>> <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Regards
> >>>>>
> >>>>> *Akif Khan*
> >>>>> *InnovAccer Inc.*
> >>>>> *www.innovaccer.com <http://www.innovaccer.com>*
> >>>>> *+91 8802290360*
> >>>>>
> >>>>
> >>>
> >>
> >
> >
> >
> > --
> > Regards
> >
> > *Akif Khan*
> > *InnovAccer Inc.*
> > *www.innovaccer.com <http://www.innovaccer.com>*
> > *+91 8802290360*
>
>
