Hi Andries, Ted, thanks for the quick replies. Yes, I'm using the latest official build of 0.8.
I did some investigation of possible issues and also found a way to hide the sensitive data. Please see the issue I filed about this [1]. In the process I found one strange behavior which I assume leads to this issue (if dataset files are missing, they are still uploading).

[1] https://issues.apache.org/jira/browse/DRILL-2677

On Wed, Apr 1, 2015 at 7:46 PM, Ted Dunning <[email protected]> wrote:
> One idea is to post a log-synth [1] schema that generates data the same
> shape as your real data. If you can generate fake data that causes the
> same problem, you give developers a huge head start in solving your problem.
>
> For the record, are you using the recently announced 0.8 version of Drill?
>
>
> [1] https://github.com/tdunning/log-synth
>
>
> On Wed, Apr 1, 2015 at 3:29 AM, Alexander Reshetov <[email protected]> wrote:
>
>> Hello all,
>>
>> I have an 80GB dataset of JSONs with many nested arrays.
>> I'm trying to flatten it and make some calculations, but I get
>> exceptions after reading about 2/3 of the file.
>>
>> I could (and want to) post an issue in Jira, but I cannot attach my dataset
>> because it has sensitive data and is also too large.
>>
>> Is there any way to help investigate issues without posting my dataset?
>>
>> To give a hint about the issue, I've attached a file with the exception text.
>>
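
For reference, a query of the shape described in the original mail would look roughly like the sketch below. The file path and column names are placeholders, not the real dataset:

    -- flatten one of the nested arrays (path and column names are hypothetical)
    SELECT t.id, FLATTEN(t.events) AS event
    FROM dfs.`/data/sample.json` t;

    -- then run the calculations over the flattened rows, e.g. a simple count
    SELECT COUNT(*)
    FROM (SELECT FLATTEN(t.events) AS event
          FROM dfs.`/data/sample.json` t) e;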
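
And as a starting point for Ted's suggestion, a minimal log-synth schema sketch might look like the following. The field names and sampler choices here are only examples; the idea is to pick samplers that mimic the shape of the real data (see the log-synth README for the full list of samplers, including ones for nested and array-valued fields):

    [
      {"name": "id",     "class": "id"},
      {"name": "amount", "class": "int", "min": 1, "max": 100000},
      {"name": "status", "class": "string", "dist": {"OK": 0.9, "FAIL": 0.1}}
    ]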
