Neeraja created DRILL-3202:
------------------------------
Summary: Count(*) fails on JSON wrapped up in single array - JSON parsing error
Key: DRILL-3202
URL: https://issues.apache.org/jira/browse/DRILL-3202
Project: Apache Drill
Issue Type: Bug
Components: Storage - JSON
Affects Versions: 1.0.0
Reporter: Neeraja
Assignee: Steven Phillips
I have a JSON document as follows.
[
  {
    "Category": "1,2",
    "Comments": "Total sites: 20, RV sites: 20, Elec sites: 20, Water at site, RV Dump, Showers, Flush Toilets, RV Fee: $14, Tent Fee: $14, Elev: 545', Tel: 256-577-9619, Nearest town: Muscle Shoals",
    "Latitude": "34.800446",
    "Longitude": "-87.498242",
    "Name": "Alloys Co Park",
    "State": "AL",
    "Type": "cp",
    "URL": "http://www.campingroadtrip.com/campgrounds/campground/campground/23478/alabama/colbert-county-alloys-park-campground"
  }
]
Drill is able to unwrap the top-level array (without the user specifying it)
and perform some SQL operations on it. However, count(*) specifically fails on
these documents.
0: jdbc:drill:zk=local> select * from dfs.`default`.`/Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json` limit 10;
+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------+-------+--------+-------+------+
| Category | Comments | Latitude | Longitude | Name | State | Type | URL |
+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------+-------+--------+-------+------+
| 1,2 | Total sites: 20, RV sites: 20, Elec sites: 20, Water at site, RV Dump, Showers, Flush Toilets, RV Fee: $14, Tent Fee: $14, Elev: 545', Tel: 256-577-9619, Nearest town: Muscle Shoals | 34.800446 | -87.498242 | Alloys Co Park | AL | cp | http://www.campingroadtrip.com/campgrounds/campground/campground/23478/alabama/colbert-county-alloys-park-campground |
+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------+-------+--------+-------+------+
1 row selected (0.197 seconds)
0: jdbc:drill:zk=local> select distinct type from dfs.`default`.`/Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json` limit 10;
+-------+
| type |
+-------+
| cp |
+-------+
1 row selected (0.193 seconds)
0: jdbc:drill:zk=local>
0: jdbc:drill:zk=local> select count(*) from dfs.`default`.`/Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json` limit 10;
Error: DATA_READ ERROR: Error parsing JSON - Cannot read from the middle of a record. Current token was START_ARRAY
File /Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json
Record 1
Fragment 0:0
[Error Id: 4742f738-1d43-4fef-af48-110065c9dd83 on 172.16.1.82:31010] (state=,code=0)
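As a possible stopgap until this is fixed, a file like this could be rewritten with one JSON object per line, so that the top-level array is gone before Drill parses it. The following is only a minimal Python sketch, assuming the whole file fits in memory. The function name unwrap_json_array is hypothetical, and whether the rewritten file avoids the count(*) error has not been verified against Drill 1.0.0.

```python
import json
from pathlib import Path


def unwrap_json_array(src, dst):
    """Rewrite a JSON file whose top level is a single array as one object per line.

    Hypothetical helper for illustration; returns the number of records written.
    """
    # Load the whole document; assumes it is small enough to fit in memory.
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst, "w", encoding="utf-8") as out:
        for record in records:
            # json.dumps emits each record on a single line by default.
            out.write(json.dumps(record, ensure_ascii=False))
            out.write("\n")
    return len(records)
```

The count(*) query would then be pointed at the rewritten file instead of the original one.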