[jira] [Work started] (IMPALA-7400) "SQL Statements to Remove or Adapt" is out of date

Alex Rodoni (JIRA) Mon, 06 Aug 2018 16:23:18 -0700


     [ 
https://issues.apache.org/jira/browse/IMPALA-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Work on IMPALA-7400 started by Alex Rodoni.
-------------------------------------------
> "SQL Statements to Remove or Adapt" is out of date
> --------------------------------------------------
>
>                 Key: IMPALA-7400
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7400
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Docs
>    Affects Versions: Impala 3.0
>            Reporter: Tim Armstrong
>            Assignee: Alex Rodoni
>            Priority: Major
>              Labels: docs
>
> "Impala has no DELETE statement." and "Impala has no UPDATE statement." are 
> not entirely true - Impala has those statements, but only for Kudu tables.
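> As a minimal sketch of the Kudu-only behavior (kudu_demo is a hypothetical 
> table invented for illustration, and a cluster with Kudu is assumed):
> {code}
> -- Hypothetical Kudu table; the name and columns are illustrative only.
> create table kudu_demo (id bigint primary key, name string)
>   partition by hash (id) partitions 2 stored as kudu;
> insert into kudu_demo values (1, 'a'), (2, 'b');
> -- UPDATE and DELETE are accepted because the target is a Kudu table.
> update kudu_demo set name = 'c' where id = 1;
> delete from kudu_demo where id = 2;
> {code}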
> "For example, Impala does not support natural joins or anti-joins," - Impala 
> does support anti-joins via NOT IN/NOT EXISTS, or even explicitly:
> {code}
> select * from functional.alltypes a1 left anti join functional.alltypestiny 
> a2 on a1.id = a2.id;
> {code}
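> For comparison, the same anti-join can be sketched with the subquery forms 
> mentioned above. The NOT IN form matches the other two only if the subquery 
> returns no NULL ids, which is assumed here:
> {code}
> -- Anti-join written as NOT EXISTS with a correlated subquery.
> select * from functional.alltypes a1
> where not exists
>   (select 1 from functional.alltypestiny a2 where a1.id = a2.id);
> -- Anti-join written as NOT IN with an uncorrelated subquery.
> select * from functional.alltypes
> where id not in (select id from functional.alltypestiny);
> {code}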
> "Within queries, Impala requires query aliases for any subqueries:" - this is 
> only true for subqueries used as inline views in the FROM clause. E.g. the 
> following works:
> {code}
> select * from functional.alltypes where id = (select min(id) from 
> functional.alltypes);
> {code}
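> By contrast, a subquery used as an inline view in the FROM clause is the 
> case where an alias is needed (v is an arbitrary alias chosen for 
> illustration):
> {code}
> -- Inline view in the FROM clause; the alias v is required here.
> select v.id from (select id from functional.alltypes) v;
> {code}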
> " Impala .. requires the CROSS JOIN operator for Cartesian products." - 
> untrue, this works:
> {code}
> select * from functional.alltypes t1, functional.alltypes t2;
> {code}
> "Have you run the COMPUTE STATS statement on each table involved in join 
> queries". This isn't specific to queries with joins, although it may have 
> more impact there. We recommend that users run COMPUTE STATS on all tables.
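> A short sketch of that recommendation, using the functional.alltypes table 
> from the examples above:
> {code}
> -- Gather table and column statistics, then inspect what was recorded.
> compute stats functional.alltypes;
> show table stats functional.alltypes;
> show column stats functional.alltypes;
> {code}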
> "A CREATE TABLE statement with no PARTITIONED BY clause stores all the data 
> files in the same physical location," - unpartitioned tables with multiple 
> files can have files residing in different locations (and there are already 3 
> replicas per file by default, so the statement is a little misleading even if 
> there's a single file). I think the latest statement about "Have you 
> partitioned at the right granularity so that there is enough data in each 
> partition to parallelize the work for each query?" is also misleading for the 
> same reason.
> "The INSERT ... VALUES syntax is suitable for setting up toy tables with a 
> few rows for functional testing, but because each such statement creates a 
> separate tiny file in HDFS". This advice only applies to HDFS tables. INSERT 
> ... VALUES should work fine for Kudu tables, although the INSERT statements 
> are not particularly fast.
> "The number of expressions allowed in an Impala query might be smaller than 
> for some other database systems, causing failures for very complicated 
> queries" - this doesn't seem right; I don't know why the queries would fail. 
> Also, the codegen time isn't really specific to expressions or WHERE clauses. 
> There seems to be a point buried in there, but maybe it's essentially just 
> that "Complex queries may have high codegen time".
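> One rough way to see how much codegen contributes, sketched here on the 
> assumption that the statements are run from impala-shell, is to rerun a 
> query with the DISABLE_CODEGEN query option set and compare the timings 
> (the query itself is only a placeholder):
> {code}
> -- Run once with codegen disabled, then restore the default and rerun.
> set disable_codegen=true;
> select count(*) from functional.alltypes t1, functional.alltypes t2
> where t1.id = t2.id;
> set disable_codegen=false;
> select count(*) from functional.alltypes t1, functional.alltypes t2
> where t1.id = t2.id;
> {code}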



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

