Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-4006: impala-config.sh contains dangerous rm -rf 
statements
......................................................................


Patch Set 1:

(20 comments)

> (12 comments)
 > 
 > Thanks for cleaning things up. Not sure if there is a easier way to
 > fix this other than adding quote everywhere ?

I was deeply motivated to fix this after one of the scripts deleted my whole 
git checkout including uncommited changes... Fortunately only a few. :)

Unfortunately there is no better fix to these problems than adding quotes. 
Please also see 
http://www.dwheeler.com/essays/filenames-in-shell.html#doublequote

http://gerrit.cloudera.org:8080/#/c/4078/1/bin/clean.sh
File bin/clean.sh:

> It doesn't look like you quoted any instances of CLUSTER_DIR/IMPALA_HOME et
Done


Line 48: rm -rf "${IMPALA_LOGS_DIR}"/*
> Is this much safer if $IMPALA_LOGS_DIR is empty?
Thanks for spotting this. This is actually even easier to trigger than the one 
that hit me.


PS1, Line 49: $IMPALA_ALL_LOGS_DIRS
> Won't this line suffer from the same problem ?
Yes and no. On one hand, this is supposed to contain multiple log, so having 
multiple space-separated values here is part of the normal workflow. On the 
other, that also means that any extra space inside the directory names becomes 
undistinguishable from the separators. The problem is that there is simply no 
safe way of storing multiple filenames in a single variable.

There are many other places in these scripts as well that do not handle special 
chars correctly in filenames and some of them are very hard to fix. Refactoring 
may not be worth the effort at this point, I just wanted to fix the most 
dangerous unquoted variable usages and while doing so I also handled some other 
ones that I noticed and that were easy to fix.


PS1, Line 53: pushd $IMPALA_HOME/be
> Same problem ?
Done


PS1, Line 59: pushd $IMPALA_HOME/shell
> Same problem ?
Done


PS1, Line 71: rm -f $IMPALA_HOME/llvm-ir/impala*.ll
            : rm -f $IMPALA_HOME/be/generated-sources/impala-ir/*
> Are these lines also unsafe ?
Done


http://gerrit.cloudera.org:8080/#/c/4078/1/bin/impala-config.sh
File bin/impala-config.sh:

Line 39:     export IMPALA_HOME=$(dirname "$(cd $(dirname ${(%):-%x}) && pwd)")
> Why redirect for bash but not zsh?
Done


Line 120:   if [[ -z "$KUDU_BUILD_DIR" ]]; then
> Why quote KUDU_BUILD_DIR here but not above?
Done


http://gerrit.cloudera.org:8080/#/c/4078/1/bin/impala-python
File bin/impala-python:

Line 2: source "$(dirname "$0")/impala-python-common.sh"
> The nested quoting here seems strange - does this work as intended?
Yes, it does, because the quotes inside $() are not matched agains the quotes 
outside:

$ bash -x /home/ifi/git/Impala\ ASF/bin/impala-python 
++ dirname '/home/ifi/git/Impala ASF/bin/impala-python'
+ source '/home/ifi/git/Impala ASF/bin/impala-python-common.sh'


http://gerrit.cloudera.org:8080/#/c/4078/1/bin/run-all-tests.sh
File bin/run-all-tests.sh:

PS1, Line 101: LOG_DIR="${IMPALA_EE_TEST_LOGS_DIR}"
> Is this variable still used ?
Done


http://gerrit.cloudera.org:8080/#/c/4078/1/buildall.sh
File buildall.sh:

Line 248:     "$IMPALA_HOME/bin/clean.sh"
> Feel free to ignore but it should be sufficient to just have the quote arou
It's a matter of style, I usually include everything in the quotes unless there 
are wildcards involved. Let me know if you prefer the other way.


Line 279: if [ -e $HADOOP_LZO/build/native/Linux-*-*/lib/libgplcompression.so ]
> Did you deliberately omit quoting HADOOP_LZO here?
Done


Line 281:   cp $HADOOP_LZO/build/native/Linux-*-*/lib/libgplcompression.* 
$HADOOP_HOME/lib/native
> What about HADOOP_HOME?
Done


PS1, Line 312: METASTORE_SNAPSHOT_FILE
> Quote this path too?
Done


PS1, Line 321: $IMPALA_LZO
> Same problem ?
Done


PS1, Line 323: $IMPALA_LZO
> Same problem ?
Done


PS1, Line 323: IMPALA_LZO
> Quote IMPALA_LZO?
Done


Line 400:   if [[ $SNAPSHOT_FILE  && $METASTORE_SNAPSHOT_FILE ]]; then
> Quote the SNAPSHOT_FILE paths?
Done


PS1, Line 407:  ${CREATE_LOAD_DATA_ARGS}
> I guess this will just expand the paths anyway so there may not be any poin
We can't quote it here, because it contains arguments, separated by spaces. 
This is another example of a bash variable usage that can not be made safe. The 
script is doomed to fail at this point with spec chars in the filenames and 
refactoring is both non-trivial and error-prone.


Line 429:   echo "   . ${IMPALA_HOME}/bin/impala-config.sh"
> quote IMPALA_HOME?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Ivanfi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to