The other JSON format is known as JSON Lines (JSONL). Could the next version of
Drill include jsonl in the json format's extensions by default in Storage Plugins?

http://jsonlines.org/

From:

    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },

To:

    "json": {
      "type": "json",
      "extensions": [
        "json", "jsonl"
      ]
    },

After working with both JSON and JSONL, I find JSONL much easier to work with 
using other tools and programming languages.

A simple Linux grep command can find data in a JSONL file, but trying to grep a 
JSON file with no line breaks just returns a wall of text.
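The same line-at-a-time property helps in code as well. Below is a minimal Python sketch, not anything Drill ships: the function name filter_jsonl and the sample records are made up for illustration. It assumes one JSON object per line and parses each line on its own, so the whole file never has to be held in memory.

```python
import io
import json


def filter_jsonl(lines, key, value):
    """Yield each record whose `key` equals `value`, parsing one line at a time."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)  # each line is a complete JSON object
        if record.get(key) == value:
            yield record


# Hypothetical sample data standing in for a .jsonl file on disk.
sample = io.StringIO(
    '{"var1": "foo", "var2": "bar"}\n'
    '{"var1": "fo", "var2": "baz"}\n'
)
matches = list(filter_jsonl(sample, "var2", "baz"))
```

Any iterable of lines works here, so an open file handle can be passed in place of the in-memory sample.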


-----Original Message-----
From: Paul Rogers [mailto:[email protected]] 
Sent: Monday, August 27, 2018 5:47 PM
To: [email protected]
Subject: Re: RE: Error: DATA_READ ERROR: Error parsing JSON - Cannot read from 
the middle of a record

Hi David,

JSON files are never splittable: there is no single-character way to find the 
start of a JSON record within a file.

Drill is supposed to support two JSON formats: the array format from the 
earlier post, and the non-JSON (but very common) list of objects format in this 
example.

Thanks,
- Paul



    On Monday, August 27, 2018, 5:38:32 PM PDT, Lee, David 
<[email protected]> wrote:

 Get rid of the opening and closing brackets and see if you can turn the commas 
between records into newlines. I think the file needs to be splittable to reduce 
memory overhead compared with parsing one giant string.

{"var1": "foo", "var2":"bar"}
{"var1": "fo", "var2": "baz"}
{"var1": "f2o", "var2": "baz2"}
{"var1": "f3o", "var2": "baz3"}
{"var1": "f4o", "var2": "baz4"}
{"var1": "f5o", "var2": "baz5"}
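One way to do that conversion without hand-editing is sketched below in Python. This is an illustration under assumptions, not a Drill feature: the helper name array_to_jsonl is hypothetical, and it loads the whole array into memory once, so it only suits files that fit in RAM. Going through a JSON parser, rather than replacing commas in the text, avoids breaking the commas that sit inside each record.

```python
import json


def array_to_jsonl(array_text):
    """Convert the text of a top-level JSON array into JSON Lines text."""
    records = json.loads(array_text)  # parses the whole array at once
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # Write one compact JSON document per line, each ending in a newline.
    return "".join(json.dumps(record) + "\n" for record in records)


# The array-format example from the original question.
source = '[ { "var1": "foo", "var2": "bar"},{"var1": "fo", "var2": "baz"}]'
converted = array_to_jsonl(source)
```

The resulting text can then be written to a file that Drill reads as a list of objects, one per line.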

-----Original Message-----
From: scott [mailto:[email protected]]
Sent: Monday, August 27, 2018 4:59 PM
To: [email protected]
Subject: Error: DATA_READ ERROR: Error parsing JSON - Cannot read from the 
middle of a record

Hi All,
I'm getting an error querying some of my JSON files.
The error is: Error: DATA_READ ERROR: Error parsing JSON - Cannot 
read from the middle of a record. Current token was START_ARRAY

The JSON files are in array format, like
[ { "var1": "foo", "var2": "bar"},{"var1": "fo", "var2": "baz"}]

I found a ticket that indicates this format is not supported by Drill yet,
DRILL-1755 <https://jira.apache.org/jira/browse/DRILL-1755>, but I find it hard 
to believe there is no workaround or solution, since this was reported 4 years 
back. Does anyone have a solution or workaround to this problem?

Thanks,
Scott

