-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11106/#review20512
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42276>

    please add javadoc comment for purpose of class



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42278>

    put comment for why this is here



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42280>

    explain more clearly what this function does



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42281>

    if you use a negative start index -n, the existing Hive code 
(non-vectorized) seems to take the tail end total n characters. e.g. 
substr("foo", -2) is "oo". If you use -n and n is greater than the string 
length, the output is the empty string. Please handle this case with the same 
behavior as non vectorized hive.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42282>

    please run ant checkstyle and follow suggestions, e.g. there is no blank 
after if before (



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42283>

    also set output value to empty string if output is null
    
    outV.noNulls needs to get set for every case and doesn't get set here



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42289>

    len[0] - offset could be negative.
    
    Do you need to use len[0] - (start[0] - offset)?
    
    Make sure you have unit tests for case where start != 0.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
<https://reviews.apache.org/r/11106/#comment42296>

    I've heard that if the common case is the first case, things can run 
faster. You could reverse these and make the test for offset <> -1.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42305>

    need to handle negative start index case. It appears your code could get 
array out of bounds in that case.
    
    Also, for substrLength <= 0, result should be empty string



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42308>

    should set isRepeating to false for default case



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42306>

    need to check noNulls first. If noNulls then you can't look into isNull 
array or you could see invalid data



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42309>

    Is this output supposed to be null or empty string?
    
    My add-hoc tests seemed to show it was empty string. You should double 
check.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42310>

    need to set outV.isNull[I] to true or false always, unless you are setting 
outV.noNulls to true.



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42311>

    second argument to VEctorizedRowBatch constructor should not be use (it 
defaults to correct value)



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42314>

    argument is not needed -- use default constructor with no args



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42312>

    you need to test for some data with multi-byte characters. There is an 
example of that someplace else in the tests.



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42315>

    need to try to test the case where data start position is not 0



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42313>

    need to verify the other rows besides 0 are always set to not null. The 
isNull entries for them could have been true by chance from a previous use of 
the batch


- Eric Hanson


On May 13, 2013, 9:54 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11106/
> -----------------------------------------------------------
> 
> (Updated May 13, 2013, 9:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Description
> -------
> 
> Add Vectorized Substr
> 
> 
> This addresses bug HIVE-4495.
>     https://issues.apache.org/jira/browse/HIVE-4495
> 
> 
> Diffs
> -----
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStart.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
>  6e26412 
> 
> Diff: https://reviews.apache.org/r/11106/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>

Reply via email to