Hey Steven, I will look into that.  Based on your understanding of the
problem, would DRILL-4006 still apply given these conditions?

1. When I query a directory of JSON files, it fails and flags a specific
JSON file as the culprit. When I remove that file, the query works, and
when I query only the culprit JSON file, it works as well.
2. When the error occurs, if I restart my drillbits and run the query
again, it seems to work. (This one baffles me.)
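
For what it's worth, one rough way to compare the culprit file against the
rest of the directory is to tally which JSON type each top-level field takes
across records. This is only a sketch in Python, assuming one JSON object per
line; the function names are my own, and the record number here is just the
1-based line number, which may or may not match what Drill reports as
"Record". It can point out nulls, empty lists, or mixed types, but it does
not confirm that DRILL-4006 is the cause.

```python
import json
from collections import defaultdict


def json_type(value):
    """Return a coarse JSON type name; empty lists are reported separately."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "empty_list" if not value else "list"
    return "object"


def field_types(lines):
    """Map each top-level field to {type name: first line number seen}.

    Assumes one JSON object per line; blank lines and non-object
    records are skipped.
    """
    seen = defaultdict(dict)
    for record_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        if not isinstance(record, dict):
            continue
        for field, value in record.items():
            # Remember only the first line where this type appeared.
            seen[field].setdefault(json_type(value), record_no)
    return dict(seen)


def mixed_fields(lines):
    """Return only the fields that take more than one type."""
    return {f: t for f, t in field_types(lines).items() if len(t) > 1}
```

Calling mixed_fields(open(path)) on the culprit file, and then on the other
files in the same directory, would show whether a field that is a list in one
file is null or an empty list in another.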

I will try the 1.3 release. I am using the 1.2.1 release from MapR, so
I may have to wait until they roll a package for easy install (I want to
include their MapRDB support).

MapR Team: If you have a current release with the DRILL-4006 fix
incorporated, and possibly the JDBC Storage Plugin fixes rolled in for
testing, I'd love to give it a shot (non-supported, of course).



On Thu, Nov 5, 2015 at 12:48 AM, Steven Phillips <[email protected]> wrote:

> This looks like DRILL-4006, a fix for which just went in.
>
> https://issues.apache.org/jira/browse/DRILL-4006
>
>
> On Wed, Nov 4, 2015 at 12:16 PM, John Omernik <[email protected]> wrote:
>
> > I am on MapR's 1.2.1 Package.
> >
> >
> >
> >
> > On Wed, Nov 4, 2015 at 2:14 PM, Abdel Hakim Deneche <[email protected]> wrote:
> >
> > > One last thing: what version of Drill do you have installed?
> > >
> > > On Wed, Nov 4, 2015 at 11:04 AM, John Omernik <[email protected]> wrote:
> > >
> > > > No, I don't think so.  I am running Drill in Marathon on Mesos, so my
> > > > startup settings are all very static. In addition, the only session
> > > > variable I changed was the JSON-as-text option at the session level,
> > > > and I was setting it in both the pre-reboot and the post-reboot
> > > > drillbit sessions (I need that to query the data).
> > > >
> > > > On Wed, Nov 4, 2015 at 12:46 PM, Abdel Hakim Deneche <[email protected]> wrote:
> > > >
> > > > > This is strange indeed. The error message you reported earlier
> > > > > doesn't suggest a memory leak issue but rather a bug when reading a
> > > > > specific set of data.
> > > > > Could it be that you changed some session options and forgot to set
> > > > > them again after you restarted the drillbits?
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Wed, Nov 4, 2015 at 10:37 AM, John Omernik <[email protected]> wrote:
> > > > >
> > > > > > So I pulled out the files that seemed to be causing this issue (I
> > > > > > was up to two) and loaded my data.  (See my other posts on how I
> > > > > > did that by loading into a folder prefixed by .)
> > > > > >
> > > > > > Anywho, my Drill cluster became unstable in general, and I was not
> > > > > > able to run any queries until I bounced my drillbits.
> > > > > >
> > > > > > I did that, got my process working again, and went to try
> > > > > > troubleshooting this problem again, and everything appears to be
> > > > > > working well now.  I am stumped.  Could a memory leak have caused
> > > > > > that error only on some files?  I am monitoring now to determine if
> > > > > > the problem starts again, but that is REALLY strange to me. This
> > > > > > seems out of character for Drill, both in my use of it and in how
> > > > > > its memory handling has been explained to me.  If I get the error
> > > > > > again, I'll make sure to set that option to get a full stack trace.
> > > > > >
> > > > > > John
> > > > > >
> > > > > > On Wed, Nov 4, 2015 at 12:13 PM, Abdel Hakim Deneche <[email protected]> wrote:
> > > > > >
> > > > > > > The error message "index: 9604, length: 4 (expected: range(0,
> > > > > > > 8192))" suggests an error happened when Drill tried to access a
> > > > > > > memory buffer (most likely while writing an int or float value).
> > > > > > > This may be a bug actually exposed by that particular data record.
> > > > > > >
> > > > > > > You can try enabling verbose error logging before running the query
> > > > > > > again:
> > > > > > >
> > > > > > > set `exec.errors.verbose`=true;
> > > > > > >
> > > > > > > This should give us a nice stack trace about this error.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Wed, Nov 4, 2015 at 7:29 AM, John Omernik <[email protected]> wrote:
> > > > > > >
> > > > > > > > There are multiple fields in that record, including two lists.
> > > > > > > > Both lists have data in them. (I am now running with JSON text
> > > > > > > > mode because at times the first value is a JSON null, but in
> > > > > > > > these cases that should be turned into "null" as a string, if I
> > > > > > > > am understanding things correctly, and shouldn't be causing a
> > > > > > > > problem.)
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Nov 4, 2015 at 9:21 AM, Hsuan Yi Chu <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > What is the data type for that record in line 2402? A list?
> > > > > > > > >
> > > > > > > > > Do you think it could be similar to this issue ?
> > > > > > > > >
> > > > > > > > > https://issues.apache.org/jira/browse/DRILL-4006
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Nov 4, 2015 at 6:48 AM, John Omernik <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hey all,
> > > > > > > > > >
> > > > > > > > > > I am working with JSON that is, on the whole, fairly clean.  I am
> > > > > > > > > > trying to load it into Parquet files. The previous day's worth of
> > > > > > > > > > data worked just fine, but today's data has something wrong with
> > > > > > > > > > it and I can't figure out what it is. Unfortunately, I can't post
> > > > > > > > > > the data, which I know makes this hard for the community to
> > > > > > > > > > troubleshoot. Hopefully I can provide some info here, get some
> > > > > > > > > > pointers on where to look, and then report back on how we could
> > > > > > > > > > potentially improve the error messages.
> > > > > > > > > >
> > > > > > > > > > The error is below.
> > > > > > > > > >
> > > > > > > > > > Given the information reported, I am trying to figure out where
> > > > > > > > > > I'd look to troubleshoot this. Obviously the file
> > > > > > > > > > 02ffc306e877_my_load_1446640931.json is where I am looking to start.
> > > > > > > > > >
> > > > > > > > > > This file has 3000 lines (records of data), so it's somewhere in
> > > > > > > > > > between.
> > > > > > > > > >
> > > > > > > > > > The index/length/expected range don't mean anything to me. I could
> > > > > > > > > > use some help there, because I am not even sure what I am looking
> > > > > > > > > > for.
> > > > > > > > > >
> > > > > > > > > > The Record and/or Fragment values: do those help me dig in?
> > > > > > > > > >
> > > > > > > > > > Since this is one record per line, I went to line 2402, but that
> > > > > > > > > > record looks completely normal to me (like all the other ones).
> > > > > > > > > > Since this is dense text, I am obviously missing something, but is
> > > > > > > > > > the record the line number?
> > > > > > > > > >
> > > > > > > > > > Any other pointers I can use to troubleshoot this?
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > Error:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Caused by: org.apache.drill.common.exceptions.UserRemoteException:
> > > > > > > > > > DATA_READ ERROR: Error parsing JSON - index: 9604, length: 4
> > > > > > > > > > (expected: range(0, 8192))
> > > > > > > > > >
> > > > > > > > > > File  /etl/dev/my-metadata/mysqspull/loads/2015-11-04/02ffc306e877_my_load_1446640931.json
> > > > > > > > > >
> > > > > > > > > > Record  2402
> > > > > > > > > >
> > > > > > > > > > Fragment 1:5
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Abdelhakim Deneche
> > > > > > >
> > > > > > > Software Engineer
> > > > > > >
> > > > > > >   <http://www.mapr.com/>
> > > > > > >
> > > > > > >
> > > > > > > Now Available - Free Hadoop On-Demand Training
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
>
