Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9980 )

Change subject: IMPALA-6723: [DOCS] Hints for CTAS
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9980/2/docs/topics/impala_hints.xml
File docs/topics/impala_hints.xml:

http://gerrit.cloudera.org:8080/#/c/9980/2/docs/topics/impala_hints.xml@59
PS2, Line 59:         Inserting into partitioned Parquet tables, where many memory buffers could be allocated on each host to
            :         hold intermediate results for each partition.
            :       </li>
            :       <li>
            :         Creating a table based on column definitions from another table and
            :         copying data from the source table,
These two use cases are practically the same, since CTAS is essentially a
CREATE followed by an INSERT...SELECT. I would merge them into one paragraph,
or remove the CTAS version from this example section.


http://gerrit.cloudera.org:8080/#/c/9980/2/docs/topics/impala_hints.xml@287
PS2, Line 287: does not add exchange node before
             :             inserting to partitioned tables and disables re-partitioning.
I think it would be clearer to explain this in the SHUFFLE paragraph, for
example: "SHUFFLE adds an exchange node which re-partitions the result of the
SELECT based on the partitioning columns of the target table. This causes each
partition to be written by only a single node, ..."
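
To illustrate the point for the docs, here is a toy model of the difference,
not Impala's actual planner or exchange code. The function name
writers_per_partition, the byte-sum hash, and the sample data are all
hypothetical. It only counts how many nodes end up writing each partition with
and without the re-partitioning step.

```python
from collections import defaultdict


def writers_per_partition(rows_by_node, shuffle, num_nodes):
    """Return {partition_key: set of node ids that write that partition}.

    Toy model only: rows are represented by their partition key alone.
    """
    writers = defaultdict(set)
    for scan_node, rows in rows_by_node.items():
        for partition_key in rows:
            if shuffle:
                # Exchange step: route the row to a node derived from the
                # partition key, so equal keys always land on the same node.
                # (Hypothetical hash; a byte sum keeps the result deterministic.)
                target = sum(partition_key.encode()) % num_nodes
            else:
                # No exchange: the node that produced the row writes it.
                target = scan_node
            writers[partition_key].add(target)
    return dict(writers)


# Hypothetical SELECT output: partition keys (years) produced on each node.
rows_by_node = {
    0: ["2016", "2017", "2018"],
    1: ["2017", "2018"],
    2: ["2016", "2018"],
}

noshuffle = writers_per_partition(rows_by_node, shuffle=False, num_nodes=3)
shuffled = writers_per_partition(rows_by_node, shuffle=True, num_nodes=3)

print(max(len(w) for w in noshuffle.values()))  # 3: "2018" is written by every node
print(max(len(w) for w in shuffled.values()))   # 1: one writer per partition
```

In this model the NOSHUFFLE case has up to three nodes writing the same
partition, while the SHUFFLE case has exactly one writer per partition, which
is the behavior the SHUFFLE paragraph should describe.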


http://gerrit.cloudera.org:8080/#/c/9980/2/docs/topics/impala_hints.xml@319
PS2, Line 319: partition
I think that "by each partitioning column" would be clearer.


http://gerrit.cloudera.org:8080/#/c/9980/2/docs/topics/impala_hints.xml@328
PS2, Line 328:             <codeph>/* +NOCLUSTERED */</codeph> does not sort by primary key
             :             before insert. Use this hint when inserting to Kudu tables. This
             :             hint has no effect on HDFS tables currently as this is the default
             :             behavior. This hint is available in <keyword keyref="impala28_full"
             :             /> or higher. </li>
Since IMPALA-5293 (Impala 3.x only), CLUSTERED is the default behavior for HDFS
tables, so NOCLUSTERED does have an effect on them.



--
To view, visit http://gerrit.cloudera.org:8080/9980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I01057d59127d6193bdb69a2a181a386a039e08f5
Gerrit-Change-Number: 9980
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni <arod...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Comment-Date: Wed, 11 Apr 2018 14:42:51 +0000
Gerrit-HasComments: Yes
