[
https://issues.apache.org/jira/browse/IMPALA-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on IMPALA-7400 started by Alex Rodoni.
-------------------------------------------
> "SQL Statements to Remove or Adapt" is out of date
> --------------------------------------------------
>
> Key: IMPALA-7400
> URL: https://issues.apache.org/jira/browse/IMPALA-7400
> Project: IMPALA
> Issue Type: Bug
> Components: Docs
> Affects Versions: Impala 3.0
> Reporter: Tim Armstrong
> Assignee: Alex Rodoni
> Priority: Major
> Labels: docs
>
> "Impala has no DELETE statement." and "Impala has no UPDATE statement. " are
> not entirely true: Impala has those statements, but only for Kudu tables.
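> A minimal sketch of the Kudu-only statements, assuming a hypothetical Kudu
> table my_kudu_table with columns id and name (neither name is from the docs):
> {code}
> -- Both statements are accepted only when the target is a Kudu table.
> update my_kudu_table set name = 'renamed' where id = 1;
> delete from my_kudu_table where id = 2;
> {code}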
> "For example, Impala does not support natural joins or anti-joins," - Impala
> does support anti-joins, via NOT IN/NOT EXISTS or even explicitly, like:
> {code}
> select * from functional.alltypes a1 left anti join functional.alltypestiny
> a2 on a1.id = a2.id;
> {code}
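> The NOT EXISTS form of the same anti-join can be sketched against the same
> tables as:
> {code}
> -- Returns the rows of alltypes that have no matching id in alltypestiny.
> select * from functional.alltypes a1 where not exists
> (select 1 from functional.alltypestiny a2 where a1.id = a2.id);
> {code}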
> "Within queries, Impala requires query aliases for any subqueries:" - this is
> only true for subqueries used as inline views in the FROM clause. E.g. the
> following works:
> {code}
> select * from functional.alltypes where id = (select min(id) from
> functional.alltypes);
> {code}
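> By contrast, an inline view in the FROM clause does need an alias. A sketch,
> where the alias v is an arbitrary name chosen for illustration:
> {code}
> -- The subquery acts as an inline view, so it is given the alias v.
> select v.id from (select id from functional.alltypes) v;
> {code}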
> " Impala .. requires the CROSS JOIN operator for Cartesian products." -
> untrue, this works:
> {code}
> select * from functional.alltypes t1, functional.alltypes t2;
> {code}
> "Have you run the COMPUTE STATS statement on each table involved in join
> queries". This isn't specific to queries with joins, although stats may have
> more impact there. We recommend that users run COMPUTE STATS on all tables.
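> For instance, a sketch using a table from the examples above:
> {code}
> -- Gather table and column statistics, then inspect the table-level result.
> compute stats functional.alltypes;
> show table stats functional.alltypes;
> {code}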
> "A CREATE TABLE statement with no PARTITIONED BY clause stores all the data
> files in the same physical location," - unpartitioned tables with multiple
> files can have files residing in different locations (and there are already 3
> replicas per file by default, so the statement is a little misleading even if
> there's a single file). I think the later statement about "Have you
> partitioned at the right granularity so that there is enough data in each
> partition to parallelize the work for each query?" is also misleading for the
> same reason.
> "The INSERT ... VALUES syntax is suitable for setting up toy tables with a
> few rows for functional testing, but because each such statement creates a
> separate tiny file in HDFS". This advice only applies to HDFS. The same
> statements should work fine for Kudu tables, although the INSERT statements
> are not particularly fast.
> "The number of expressions allowed in an Impala query might be smaller than
> for some other database systems, causing failures for very complicated
> queries" - this doesn't seem right - I don't know why the queries would fail.
> Also, codegen time isn't really specific to expressions or WHERE clauses.
> There seems to be a point buried in there, but maybe it's essentially just
> that "Complex queries may have high codegen time".