Abdel, so these binaries weren't built with MapRFS support, as I am getting the "Can't find file system for Scheme MapRFS" error. I'll be eagerly awaiting the MapR package! Thanks
John T

On Thu, Nov 5, 2015 at 12:27 PM, John Omernik <[email protected]> wrote:

> Abdel -
>
> Thank you, I do understand it's a challenge for troubleshooting, and I apologize to that end. I see you have a @maprtech email; are the binaries in the release built with MapRDB support? I need that for my MapR cluster, which is why I am waiting for a MapR build of 1.3.0.
>
> On Thu, Nov 5, 2015 at 11:44 AM, Abdel Hakim Deneche <[email protected]> wrote:
>
>> Hey John,
>>
>> If you want to, you can download the binaries for the 1.3 release candidate from [1] and see if you can reproduce the error. You just need to unzip the folder and run "bin/drill-embedded".
>>
>> Without some data to reproduce the issue, it's really hard to come up with an explanation.
>>
>> Thanks
>>
>> [1] http://people.apache.org/~jacques/apache-drill-1.3.0.rc0/
>>
>> On Thu, Nov 5, 2015 at 5:12 AM, John Omernik <[email protected]> wrote:
>>
>> > Hey Steven, I will look into that. Based on your understanding of the problem, would DRILL-4006 still apply given these conditions?
>> >
>> > 1. When I query a directory of JSON files, it fails, signaling a specific JSON file as the culprit. When I remove that file, it works, and when I run a query only on that culprit JSON file, it works as well.
>> > 2. When the error occurs, if I restart my drillbits and run the query again, it seems to work. (This one baffles me.)
>> >
>> > I will look to try the 1.3 release. I am using the 1.2.1 release from MapR, so I may have to wait until they roll a package for easy install (I want to include their MapRDB support).
>> >
>> > MapR Team: If you have a current release with DRILL-4006 incorporated and possibly the JDBC Storage Plugin fixes rolled in for testing, I'd love to give it a shot (non-supported, of course).
>> >
>> > On Thu, Nov 5, 2015 at 12:48 AM, Steven Phillips <[email protected]> wrote:
>> >
>> > > This looks like DRILL-4006, a fix for which just went in.
>> > >
>> > > https://issues.apache.org/jira/browse/DRILL-4006
>> > >
>> > > On Wed, Nov 4, 2015 at 12:16 PM, John Omernik <[email protected]> wrote:
>> > >
>> > > > I am on MapR's 1.2.1 package.
>> > > >
>> > > > On Wed, Nov 4, 2015 at 2:14 PM, Abdel Hakim Deneche <[email protected]> wrote:
>> > > >
>> > > > > One last thing, what version of Drill do you have installed?
>> > > > >
>> > > > > On Wed, Nov 4, 2015 at 11:04 AM, John Omernik <[email protected]> wrote:
>> > > > >
>> > > > > > No, I don't think so. I am running Drill in Marathon on Mesos, so my startup settings are all very static. In addition, the only session variable I changed was the JSON-as-text option at the session level, and I was setting it in both the pre-drillbit-reboot and the post-drillbit-reboot sessions (I need that to query the data).
>> > > > > >
>> > > > > > On Wed, Nov 4, 2015 at 12:46 PM, Abdel Hakim Deneche <[email protected]> wrote:
>> > > > > >
>> > > > > > > This is strange indeed. The error message you reported earlier doesn't suggest a memory leak issue but rather a bug when reading a specific set of data.
>> > > > > > > Could it be that you changed some session options and forgot to set them again after you restarted the drillbits?
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > >
>> > > > > > > On Wed, Nov 4, 2015 at 10:37 AM, John Omernik <[email protected]> wrote:
>> > > > > > >
>> > > > > > > > So I pulled out the files (I was up to two) that seemed to be causing this issue, and loaded my data. (See my other posts on how I did that by loading into a folder prefixed by ".")
>> > > > > > > >
>> > > > > > > > Anywho, my Drill cluster became unstable in general, and I was not able to run any queries until I bounced my drillbits.
>> > > > > > > >
>> > > > > > > > I did that, got my process working again, and went to try troubleshooting this problem again, and everything appears to be working well now. I am stumped. Could a memory leak have caused that error only on some files? I am monitoring now to determine if the problem starts again, but that is REALLY strange to me. This seems out of character for Drill, both in my use of it and in how its memory handling has been explained to me. If I get the error again, I'll make sure I set that option to get a full stack trace.
>> > > > > > > >
>> > > > > > > > John
>> > > > > > > >
>> > > > > > > > On Wed, Nov 4, 2015 at 12:13 PM, Abdel Hakim Deneche <[email protected]> wrote:
>> > > > > > > >
>> > > > > > > > > The error message "index: 9604, length: 4 (expected: range(0, 8192))" suggests an error happened when Drill tried to access a memory buffer (most likely while writing an int or float value).
>> > > > > > > > > This may be a bug actually exposed by that particular data record.
>> > > > > > > > >
>> > > > > > > > > You can try enabling verbose error logging before running the query again:
>> > > > > > > > >
>> > > > > > > > > set `exec.errors.verbose`=true;
>> > > > > > > > >
>> > > > > > > > > This should give us a nice stack trace about this error.
>> > > > > > > > >
>> > > > > > > > > Thanks
>> > > > > > > > >
>> > > > > > > > > On Wed, Nov 4, 2015 at 7:29 AM, John Omernik <[email protected]> wrote:
>> > > > > > > > >
>> > > > > > > > > > There are multiple fields in that record, including two lists. Both lists have data in them. (I am now running with JSON text mode because at times the first value is a JSON null, but in these cases that should be turned into "null" as a string, if I am understanding things correctly, and shouldn't be causing a problem.)
>> > > > > > > > > >
>> > > > > > > > > > On Wed, Nov 4, 2015 at 9:21 AM, Hsuan Yi Chu <[email protected]> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > What is the data type for that record in line 2402? A list?
>> > > > > > > > > > >
>> > > > > > > > > > > Do you think it could be similar to this issue?
>> > > > > > > > > > >
>> > > > > > > > > > > https://issues.apache.org/jira/browse/DRILL-4006
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, Nov 4, 2015 at 6:48 AM, John Omernik <[email protected]> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hey all,
>> > > > > > > > > > > >
>> > > > > > > > > > > > I am working with JSON that is on the whole fairly clean. I am trying to load it into Parquet files, and the previous day's worth of data worked just fine, but today's data has something wrong with it and I can't figure out what it is. Unfortunately, I can't post the data, which I know makes this hard to troubleshoot for the community. Hopefully I can provide some info here, get some pointers on where to look, and then report back on how we could potentially improve the error messages.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The error is below.
>> > > > > > > > > > > >
>> > > > > > > > > > > > I am looking to figure out, given the information reported, where I'd look to troubleshoot this. Obviously the file 02ffc306e877_my_load_1446640931.json is where I am looking to start.
>> > > > > > > > > > > >
>> > > > > > > > > > > > This file has 3000 lines (records of data), so it's somewhere in between.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The index/length/expected range don't mean anything to me. I could use some help there, because I am not even sure what I am looking for.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The record and/or Fragment... do those help me dig in?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Since this is one record per line, I went to line 2402, but that record looks completely normal to me (like all the other ones). Since this is dense text, I am obviously missing something, but is the record the line number?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Any other pointers I can use to troubleshoot this?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks!
>> > > > > > > > > > > >
>> > > > > > > > > > > > Error:
>> > > > > > > > > > > >
>> > > > > > > > > > > > Caused by: org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: Error parsing JSON - index: 9604, length: 4 (expected: range(0, 8192))
>> > > > > > > > > > > >
>> > > > > > > > > > > > File /etl/dev/my-metadata/mysqspull/loads/2015-11-04/02ffc306e877_my_load_1446640931.json
>> > > > > > > > > > > >
>> > > > > > > > > > > > Record 2402
>> > > > > > > > > > > >
>> > > > > > > > > > > > Fragment 1:5
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > > Abdelhakim Deneche
>> > > > > > > > > Software Engineer
>> > > > > > > > > <http://www.mapr.com/>
>> > > > > > > > >
>> > > > > > > > > Now Available - Free Hadoop On-Demand Training
>> > > > > > > > > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
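On the "is the record the line number?" question from the original post: the thread never settles it, so a cautious approach is to look at the reported record together with its neighbours. The sketch below is only an illustration and not part of Drill. The function name `inspect_records`, the 1-based numbering, and the window size are all assumptions. It parses each nearby line with Python's standard `json` module and reports either the parse error or the type of every top-level field, which makes a stray null or a list that changes type easier to spot in dense text.

```python
import json


def inspect_records(lines, record_number, window=1):
    """Summarise the JSON lines around a reported record number.

    Assumptions (not confirmed by the thread): one JSON record per line,
    and record numbers counted from 1. Because the real numbering is
    unknown, neighbouring lines are inspected as well.
    Returns a list of (line_no, status, detail) tuples.
    """
    first = max(1, record_number - window)
    last = record_number + window
    results = []
    for line_no, line in enumerate(lines, start=1):
        if line_no < first:
            continue
        if line_no > last:
            break
        try:
            record = json.loads(line)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError
            results.append((line_no, "invalid", str(exc)))
            continue
        if isinstance(record, dict):
            # Map each top-level field to the name of its Python type.
            detail = {key: type(value).__name__ for key, value in record.items()}
        else:
            detail = type(record).__name__
        results.append((line_no, "ok", detail))
    return results
```

To use it on the file from the error message, something like `inspect_records(open(path), 2402, window=2)` would print the field types for lines 2400 through 2404, where `path` is the local copy of the JSON file.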

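For anyone puzzled by "index: 9604, length: 4 (expected: range(0, 8192))": as Abdel describes it, this is a bounds check failing on a memory buffer, most likely a 4-byte write (an int or float) at offset 9604 into a buffer that only holds 8192 bytes. The sketch below is a plain-Python illustration of that kind of check, not Drill's actual code. `check_index` and `write_int` are hypothetical names, and the message format is simply copied from the error in this thread.

```python
def check_index(index, length, capacity):
    """Raise if the byte range [index, index + length) does not fit in capacity."""
    if index < 0 or length < 0 or index + length > capacity:
        raise IndexError(
            f"index: {index}, length: {length} (expected: range(0, {capacity}))"
        )


def write_int(buffer, index, value):
    """Write a 4-byte signed integer into a bytearray at the given offset."""
    check_index(index, 4, len(buffer))
    buffer[index:index + 4] = value.to_bytes(4, "little", signed=True)


if __name__ == "__main__":
    buf = bytearray(8192)
    write_int(buf, 0, 42)          # fits: bytes 0..3
    try:
        write_int(buf, 9604, 42)   # does not fit: 9604 + 4 > 8192
    except IndexError as exc:
        print(exc)
```

Read this way, the numbers in the message describe the writer's internal buffer rather than anything in the JSON text, which would be consistent with record 2402 looking normal on inspection.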