[
https://issues.apache.org/jira/browse/DRILL-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aman Sinha updated DRILL-2208:
------------------------------
Fix Version/s: 0.9.0
> Error message must be updated when query contains operations on a flattened
> column
> ----------------------------------------------------------------------------------
>
> Key: DRILL-2208
> URL: https://issues.apache.org/jira/browse/DRILL-2208
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 0.8.0
> Reporter: Abhishek Girish
> Assignee: Jason Altekruse
> Priority: Minor
> Fix For: 0.9.0
>
> Attachments: drillbit_flatten.log
>
>
> Currently i observe that if there is a flatten/kvgen operation applied on a
> column, no further operations can be performed on the said column unless it
> is wrapped inside a nested query.
> Consider a simple flatten/kvgen operation on a complex JSON file :
> > select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f limit 1;
> +------------+
> | p |
> +------------+
> | {"key":"languages","value":{"text":"Mandarin Chinese 12.44%, Spanish 4.85%,
> English 4.83%, Arabic 3.25%, Hindi 2.68%, Bengali 2.66%, Portuguese 2.62%,
> Russian 2.12%, Japanese 1.8%, Standard German 1.33%, Javanese 1.25% (2009
> est.)","note_1":"percents are for \"first language\" speakers only; the six
> UN languages - Arabic, Chinese (Mandarin), English, French, Russian, and
> Spanish (Castilian) - are the mother tongue or second language of about half
> of the world's population, and are the official languages in more than half
> the states in the world; some 150 to 200 languages have more than a million
> speakers","note_2":"all told, there are an estimated 7,100 languages spoken
> in the world; aproximately 80% of these languages are spoken by less than
> 100,000 people; about 50 languages are spoken by only 1 person; communities
> that are isolated from each other in mountainous regions often develop
> multiple languages; Papua New Guinea, for example, boasts about 836 separate
> languages","note_3":"approximately 2,300 languages are spoken in Asia, 2,150,
> in Africa, 1,311 in the Pacific, 1,060 in the Americas, and 280 in Europe"}} |
> | {"key":"religions","value":{"text":"Christian 33.39% (of which Roman
> Catholic 16.85%, Protestant 6.15%, Orthodox 3.96%, Anglican 1.26%), Muslim
> 22.74%, Hindu 13.8%, Buddhist 6.77%, Sikh 0.35%, Jewish 0.22%, Baha'i 0.11%,
> other religions 10.95%, non-religious 9.66%, atheists 2.01% (2010 est.)"}} |
> | {"key":"population","value":{"text":"7,095,217,980 (July 2013
> est.)","top_ten_most_populous_countries_in_millions":"China 1,349.59; India
> 1,220.80; United States 316.67; Indonesia 251.16; Brazil 201.01; Pakistan
> 193.24; Nigeria 174.51; Bangladesh 163.65; Russia 142.50; Japan 127.25"}} |
> | {"key":"age_structure","value":{"0_14_years":"26% (male 953,496,513/female
> 890,372,474)","15_24_years":"16.8% (male 614,574,389/female
> 579,810,490)","25_54_years":"40.6% (male 1,454,831,900/female
> 1,426,721,773)","55_64_years":"8.4% (male 291,435,881/female
> 305,185,398)","65_years_and_over":"8.2% (male 257,035,416/female 321,753,746)
> (2013 est.)"}} |
> | {"key":"dependency_ratios","value":{"total_dependency_ratio":"52
> %","youth_dependency_ratio":"39.9 %","elderly_dependency_ratio":"12.1
> %","potential_support_ratio":"8.3 (2013)"}} |
> +------------+
> *Adding a WHERE clause with conditions on this column fails:*
> > select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f where
> > f.p.`key` = 'languages';
> Query failed: RemoteRpcException: Failure while running fragment., languages
> [ 686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
> [ 686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
> Logs indicate a NumberFormat Exception in the above case.
> *And query fails to parse in the below case*
> > select flatten(kvgen(f.`people`)).`value` as p from `factbook/world.json` f
> > limit 5;
> Query failed: ParseException: Encountered "." at line 1, column 34.
> Was expecting one of:
> "FROM" ...
> "," ...
> "AS" ...
> ....
> ....
> "OVER" ...
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
> Rewriting using an inner query succeeds:
> select g.p.`value`.`note_3` from (select flatten(kvgen(f.`people`)) as p from
> `factbook/world.json` f) g where g.p.`key`='languages';
> +------------+
> | EXPR$0 |
> +------------+
> | approximately 2,300 languages are spoken in Asia, 2,150, in Africa, 1,311
> in the Pacific, 1,060 in the Americas, and 280 in Europe |
> +------------+
> *In both the failure cases the error message needs to be updated to indicate
> that the operation is not supported. The current error message and logs are
> not clear for an end user. *
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)