Abhishek Girish created DRILL-2208:
--------------------------------------
Summary: Error message must be updated when query contains
operations on a flattened column
Key: DRILL-2208
URL: https://issues.apache.org/jira/browse/DRILL-2208
Project: Apache Drill
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Abhishek Girish
Assignee: Jason Altekruse
Currently i observe that if there is a flatten/kvgen operation applied on a
column, no further operations can be performed on the said column unless it is
wrapped inside a nested query.
Consider a simple flatten/kvgen operation on a complex JSON file :
> select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f limit 1;
+------------+
| p |
+------------+
| {"key":"languages","value":{"text":"Mandarin Chinese 12.44%, Spanish 4.85%,
English 4.83%, Arabic 3.25%, Hindi 2.68%, Bengali 2.66%, Portuguese 2.62%,
Russian 2.12%, Japanese 1.8%, Standard German 1.33%, Javanese 1.25% (2009
est.)","note_1":"percents are for \"first language\" speakers only; the six UN
languages - Arabic, Chinese (Mandarin), English, French, Russian, and Spanish
(Castilian) - are the mother tongue or second language of about half of the
world's population, and are the official languages in more than half the states
in the world; some 150 to 200 languages have more than a million
speakers","note_2":"all told, there are an estimated 7,100 languages spoken in
the world; aproximately 80% of these languages are spoken by less than 100,000
people; about 50 languages are spoken by only 1 person; communities that are
isolated from each other in mountainous regions often develop multiple
languages; Papua New Guinea, for example, boasts about 836 separate
languages","note_3":"approximately 2,300 languages are spoken in Asia, 2,150,
in Africa, 1,311 in the Pacific, 1,060 in the Americas, and 280 in Europe"}} |
| {"key":"religions","value":{"text":"Christian 33.39% (of which Roman Catholic
16.85%, Protestant 6.15%, Orthodox 3.96%, Anglican 1.26%), Muslim 22.74%, Hindu
13.8%, Buddhist 6.77%, Sikh 0.35%, Jewish 0.22%, Baha'i 0.11%, other religions
10.95%, non-religious 9.66%, atheists 2.01% (2010 est.)"}} |
| {"key":"population","value":{"text":"7,095,217,980 (July 2013
est.)","top_ten_most_populous_countries_in_millions":"China 1,349.59; India
1,220.80; United States 316.67; Indonesia 251.16; Brazil 201.01; Pakistan
193.24; Nigeria 174.51; Bangladesh 163.65; Russia 142.50; Japan 127.25"}} |
| {"key":"age_structure","value":{"0_14_years":"26% (male 953,496,513/female
890,372,474)","15_24_years":"16.8% (male 614,574,389/female
579,810,490)","25_54_years":"40.6% (male 1,454,831,900/female
1,426,721,773)","55_64_years":"8.4% (male 291,435,881/female
305,185,398)","65_years_and_over":"8.2% (male 257,035,416/female 321,753,746)
(2013 est.)"}} |
| {"key":"dependency_ratios","value":{"total_dependency_ratio":"52
%","youth_dependency_ratio":"39.9 %","elderly_dependency_ratio":"12.1
%","potential_support_ratio":"8.3 (2013)"}} |
+------------+
*Adding a WHERE clause with conditions on this column fails:*
> select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f where
> f.p.`key` = 'languages';
Query failed: RemoteRpcException: Failure while running fragment., languages [
686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
[ 686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query.
(state=,code=0)
Logs indicate a NumberFormat Exception in the above case.
*And query fails to parse in the below case*
> select flatten(kvgen(f.`people`)).`value` as p from `factbook/world.json` f
> limit 5;
Query failed: ParseException: Encountered "." at line 1, column 34.
Was expecting one of:
"FROM" ...
"," ...
"AS" ...
....
....
"OVER" ...
Error: exception while executing query: Failure while executing query.
(state=,code=0)
Rewriting using an inner query succeeds:
select g.p.`value`.`note_3` from (select flatten(kvgen(f.`people`)) as p from
`factbook/world.json` f) g where g.p.`key`='languages';
+------------+
| EXPR$0 |
+------------+
| approximately 2,300 languages are spoken in Asia, 2,150, in Africa, 1,311 in
the Pacific, 1,060 in the Americas, and 280 in Europe |
+------------+
In both the failure cases the error message needs to be updated to indicate
that the operation is not supported. The current error message and logs are not
clear for an end user.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)