> On May 4, 2014, 9:24 p.m., Aditya Kishore wrote:
> > exec/java-exec/src/test/resources/functions/string/testSubstr.json, line 37
> > <https://reviews.apache.org/r/21058/diff/1/?file=574172#file574172line37>
> >
> >     Could you please add a test case with non-English string, for example 
> > Hindi or Chinese.
> 
> Yash Sharma wrote:
>     The characters like Hindi are not being handled currently. 
>     I am getting IndexOutOfBoundsException since Hindi UTF-8 takes around 
> triple size as compared to English text. On adding Hindi strings in test case 
> I get the below exception. 
>     Any tips on solving this?
>     
> -----------------------------------------------------------------------------------------------------------------
>     Physical plan input ????? ?????' :
>     
>     { ref: "col12", expr: "substring('????? ?????', 3,10)"},
>     { ref: "col12", expr: "substring('????? ?????', 3)"}
>     
> -----------------------------------------------------------------------------------------------------------------
>     Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.973 sec 
> <<< FAILURE! - in org.apache.drill.exec.physical.impl.TestStringFunctions
>     testSubstr(org.apache.drill.exec.physical.impl.TestStringFunctions)  Time 
> elapsed: 3.517 sec  <<< ERROR!
>     java.lang.IndexOutOfBoundsException: index: 0, length: 31 (expected: 
> range(0, 11))
>       at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1130)
>       at 
> io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:341)
>       at io.netty.buffer.AbstractByteBuf.setBytes(AbstractByteBuf.java:502)
>       at io.netty.buffer.SwappedByteBuf.setBytes(SwappedByteBuf.java:396)
>       at 
> org.apache.drill.exec.vector.ValueHolderHelper.getVarCharHolder(ValueHolderHelper.java:49)
>       at 
> org.apache.drill.exec.test.generated.ProjectorGen0.doSetup(ProjectorTemplate.java:997)
>       at 
> org.apache.drill.exec.test.generated.ProjectorGen0.setup(ProjectorTemplate.java:90)
>       at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:175)
>       at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:53)
>       at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:111)
>       at 
> org.apache.drill.exec.physical.impl.SimpleRootExec.next(SimpleRootExec.java:71)
>       at 
> org.apache.drill.exec.physical.impl.TestStringFunctions.runTest(TestStringFunctions.java:99)
>       at 
> org.apache.drill.exec.physical.impl.TestStringFunctions.testSubstr(TestStringFunctions.java:204)
>     
> -----------------------------------------------------------------------------------------------------------------
>

Fixed!
-------------------------------------------------------
EXPECTED        ACTUAL
-------------------------------------------------------
abc             abc
bcd             bcd
bcdef           bcdef
bcdef           bcdef           
????                         ????
????            ????             ????           ????
cdef            cdef            
?????                        ?????
-------------------------------------------------------


- Yash


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21058/#review42105
-----------------------------------------------------------


On May 5, 2014, 10:40 a.m., Yash Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21058/
> -----------------------------------------------------------
> 
> (Updated May 5, 2014, 10:40 a.m.)
> 
> 
> Review request for drill, Aditya Kishore, Jacques Nadeau, Jinfeng Ni, and 
> Mehant Baid.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Adding substr(expression, start) to improve string substring function.
> This is also a bug fix for https://issues.apache.org/jira/browse/DRILL-542.
> 
> 
> Diffs
> -----
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctions.java
>  aca5933 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestStringFunctions.java
>  09d1361 
>   exec/java-exec/src/test/resources/functions/string/testSubstr.json e885381 
> 
> Diff: https://reviews.apache.org/r/21058/diff/
> 
> 
> Testing
> -------
> 
> Yes.
> ----------------------------------------------------------------------------------------
> JUnit Test Case:
> ----------------------------------------------------------------------------------------
> 
>  $mvn test -Dtest=TestStringFunctions#testSubstr
> 
> Results :
> 
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Total time: 53.030 s
> [INFO] Finished at: 2014-05-04T16:08:26+05:30
> [INFO] Final Memory: 44M/711M
> [INFO] 
> ------------------------------------------------------------------------
> 
> 
> ----------------------------------------------------------------------------------------
> SQLLINE Test
> ----------------------------------------------------------------------------------------
> 
> 0: jdbc:drill:zk=local> SELECT employee_id, first_name, substring(first_name, 
> 3) sub_str FROM cp.`employee.json` limit 20; 
> +-------------+------------+------------+
> | employee_id | first_name |  sub_str   |
> +-------------+------------+------------+
> | 1           | Sheri      | eri        |
> | 2           | Derrick    | rrick      |
> | 4           | Michael    | chael      |
> | 5           | Maya       | ya         |
> | 6           | Roberta    | berta      |
> | 7           | Rebecca    | becca      |
> | 8           | Kim        | m          |
> | 9           | Brenda     | enda       |
> | 10          | Darren     | rren       |
> | 11          | Jonathan   | nathan     |
> | 12          | Jewel      | wel        |
> | 13          | Peggy      | ggy        |
> | 14          | Bryan      | yan        |
> | 15          | Walter     | lter       |
> | 16          | Peggy      | ggy        |
> | 17          | Brenda     | enda       |
> | 18          | Daniel     | niel       |
> | 19          | Dianne     | anne       |
> | 20          | Beverly    | verly      |
> | 21          | Pedro      | dro        |
> +-------------+------------+------------+
> 
> 
> Thanks,
> 
> Yash Sharma
> 
>

Reply via email to