One idea is to post a log-synth [1] schema that generates data with the same shape as your real data. If you can generate fake data that causes the same problem, you give developers a huge head start in solving it.
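
As a rough illustration of the idea (this is plain Python, not a log-synth schema), a small script along these lines could emit newline-delimited JSON with nested arrays. The field names `id`, `items`, `sku`, and `tags` and the helper names are hypothetical; you would replace them with the structure of the real records.

```python
import json
import random


def fake_record(rng, max_items=3, max_tags=4):
    """Build one synthetic record with arrays nested two levels deep.

    All field names are placeholders, not taken from the real dataset.
    """
    return {
        "id": rng.randrange(10**9),
        "items": [
            {
                "sku": "sku-%05d" % rng.randrange(100000),
                "tags": [
                    rng.choice(["a", "b", "c"])
                    for _ in range(rng.randint(0, max_tags))
                ],
            }
            for _ in range(rng.randint(0, max_items))
        ],
    }


def write_fake_dataset(path, n_records, seed=0):
    """Write n_records synthetic records to path, one JSON object per line."""
    rng = random.Random(seed)  # fixed seed so others can regenerate the same file
    with open(path, "w") as out:
        for _ in range(n_records):
            out.write(json.dumps(fake_record(rng)) + "\n")
```

Calling `write_fake_dataset("fake.json", 1000000)` and raising the record count until the same exception appears would give a reproducible file, free of sensitive data, that can be described in a Jira issue by its seed and size.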

For the record, are you using the recently announced 0.8 version of Drill?

[1] https://github.com/tdunning/log-synth

On Wed, Apr 1, 2015 at 3:29 AM, Alexander Reshetov <[email protected]> wrote:

> Hello all,
>
> I have 80GB dataset of JSONs which have many nested arrays.
> I'm trying to flatten it and make some calculations, but I got
> exceptions after reading about 2/3 of file.
>
> I could (and want) to post an issue in Jira, but I cannot attach my dataset
> because it has sensitive data and also it's too large.
>
> Is there any way to help to investigate issues without posting my dataset?
>
> To give a hint about issue I've attached file with exception text.
>
