John Russell has posted comments on this change.

Change subject: IMPALA-3398: Rework Impala documentation to be 

Patch Set 2:


Because I'm advocating re-doing the automated portions of this change at a 
later time, I don't want to promote this patch set as-is. Is it practical to 
say "abandon" for the code review but still come back and pick out bits and 
pieces (e.g. the changes in impala_jdbc.xml) to become part of a later code 
Commit Message:

PS2, Line 16:     sed -i 's/ rev=""//g' $FILENAME
Let's leave this step out, for a couple of reasons:
- There's no harm to have rev="" with an empty attribute value.
- There are some cases where rev="CDH-xyz" could also be annotated with the 
IMPALA- number, so I think it would be useful to leave behind the empty 
attribute as a reminder that this was a change due to bug fix or feature 
- Sometimes for new feature docs, I add rev="" when I don't have the JIRA 
number handy and come back and fill it in later.
File docs/impala_html.ditaval:

PS2, Line 6: <prop att="audience" val="Hidden" action="exclude"/>
That's fine. We will need to make the same addition in the Cloudera doc build 
so that the attribute name works transparently in both contexts.
File docs/topics/impala_alter_table.xml:

PS2, Line 25:       <indexterm audience="Hidden">ALTER TABLE 
For the <indexterm> tags, maybe we could change those to audience="HTML" so 
they were visible in the PDF. The Cloudera doc team has always been resistant 
to using index entries but I do have them (just hidden) all through the doc 
source. We should do a trial and see if enabling them produces a decent-looking 
index in the PDF.

PS2, Line 220:     <p rev="2.3.0 IMPALA-1568">
The reason I'm so liberal with release numbers and JIRA numbers with this rev= 
attribute is in hopes of automating an "errata" mechanism in future. I.e. "what 
changed in such-and-such a release, what changed due to such-and-such an 
issue". I have an 80% working prototype that I should demo for you guys, to 
judge should that be something that becomes part of the upstream docs or 
continue development on it as a Cloudera-only extension.

PS2, Line 406:     <p>
Here's a case where perhaps there is a corresponding IMPALA- JIRA number we 
should dig up and include in the rev= attribute. That's why I would say to 
leave an empty rev="" behind, so as not to lose that aspect of the history.
File docs/topics/impala_conversion_functions.xml:

PS2, Line 30:       Although in <keyword keyref="impala23_full"/>, the 
<codeph>SHOW FUNCTIONS</codeph> output for
Just checking that this was a change introduced by Jim. This is a case where we 
could re-eveluate whether to take out that note, if the situation it refers to 
is no longer applicable. (I haven't confirmed in this particular case yet.)

PS2, Line 443: <dlentry rev="2.3.0" id="casttosmallint" audience="Hidden">
These castto*() functions might be intended to be permanently for internal 
Impala use only, in which case we could remove the corresponding info entirely.

PS2, Line 619: <dlentry rev="2.3.0" id="typeof">
Hmm, this might have a corresponding IMPALA- JIRA number that should go in the 
File docs/topics/impala_jdbc.xml:

PS2, Line 82:         In Impala 2.0 and later, you must use the Hive 0.13 JDBC 
driver.  If you are
            :         already using JDBC applications with an earlier Impala 
release, you must update
            :         your JDBC driver, because the Hive 0.12 driver that was 
formerly the only choice
            :         is not compatible with Impala
Urgh. "Must" seems like too strong a statement. I wonder if the right thing to 
do here is to say you definitely can use the Hive driver, and maybe there's a 
vendor-specific driver. Because so many times in the past we've had to warn 
some people away from the Hive driver for performance reasons, and direct other 
people to the Hive driver to work around some bug in the Cloudera driver.

PS2, Line 120:           To get the JAR files, install the Hive JDBC driver on 
each host in the cluster that will run
It's worth checking if there's a generic Hadoop or Hive URL we could refer to 

PS2, Line 139:   hive-common-X.XX.X.jar
             :   hive-jdbc-X.XX.X.jar
             :   hive-metastore-X.XX.X.jar
             :   hive-servi
We should confirm that these filenames are accurate if someone goes the generic 
route when installing the Hive driver.

PS2, Line 161:               <xref 
scope="external" format="html">this
Ironic that there's a reference to "Cloudera Impala" that perhaps we could 
never actually get rid of.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3f63fb309e0617d7fe014231bb0ab0ad67c8474
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple <>
Gerrit-Reviewer: Anonymous Coward #250
Gerrit-Reviewer: Greg Rahn <>
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: John Russell <>
Gerrit-HasComments: Yes

Reply via email to