> On May 4, 2014, 9:24 p.m., Aditya Kishore wrote:
> > exec/java-exec/src/test/resources/functions/string/testSubstr.json, line 37
> > <https://reviews.apache.org/r/21058/diff/1/?file=574172#file574172line37>
> >
> > Could you please add a test case with a non-English string, for example
> > Hindi or Chinese?
>
> Yash Sharma wrote:
>     Characters such as Hindi are not handled currently.
>     I am getting an IndexOutOfBoundsException because Hindi text in UTF-8 takes
>     roughly three times as many bytes as English text. When I add Hindi strings
>     to the test case, I get the exception below.
>     Any tips on solving this?
>
>     -----------------------------------------------------------------------------------------------------------------
>     Physical plan input ????? ?????' :
>
>     { ref: "col12", expr: "substring('????? ?????', 3,10)"},
>     { ref: "col12", expr: "substring('????? ?????', 3)"}
>
>     -----------------------------------------------------------------------------------------------------------------
>     Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.973 sec
>     <<< FAILURE! - in org.apache.drill.exec.physical.impl.TestStringFunctions
>     testSubstr(org.apache.drill.exec.physical.impl.TestStringFunctions) Time elapsed: 3.517 sec <<< ERROR!
>     java.lang.IndexOutOfBoundsException: index: 0, length: 31 (expected: range(0, 11))
>         at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1130)
>         at io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:341)
>         at io.netty.buffer.AbstractByteBuf.setBytes(AbstractByteBuf.java:502)
>         at io.netty.buffer.SwappedByteBuf.setBytes(SwappedByteBuf.java:396)
>         at org.apache.drill.exec.vector.ValueHolderHelper.getVarCharHolder(ValueHolderHelper.java:49)
>         at org.apache.drill.exec.test.generated.ProjectorGen0.doSetup(ProjectorTemplate.java:997)
>         at org.apache.drill.exec.test.generated.ProjectorGen0.setup(ProjectorTemplate.java:90)
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:175)
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:53)
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:111)
>         at org.apache.drill.exec.physical.impl.SimpleRootExec.next(SimpleRootExec.java:71)
>         at org.apache.drill.exec.physical.impl.TestStringFunctions.runTest(TestStringFunctions.java:99)
>         at org.apache.drill.exec.physical.impl.TestStringFunctions.testSubstr(TestStringFunctions.java:204)
>
>     -----------------------------------------------------------------------------------------------------------------
>
Fixed!

-------------------------------------------------------
EXPECTED        ACTUAL
-------------------------------------------------------
abc             abc
bcd             bcd
bcdef           bcdef
bcdef           bcdef
????            ????
????            ????
????            ????
cdef            cdef
?????           ?????
-------------------------------------------------------


- Yash


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21058/#review42105
-----------------------------------------------------------


On May 5, 2014, 10:40 a.m., Yash Sharma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21058/
> -----------------------------------------------------------
>
> (Updated May 5, 2014, 10:40 a.m.)
>
>
> Review request for drill, Aditya Kishore, Jacques Nadeau, Jinfeng Ni, and Mehant Baid.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Adding substr(expression, start) to improve string substring function.
> This is also a bug fix for https://issues.apache.org/jira/browse/DRILL-542.
>
>
> Diffs
> -----
>
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctions.java aca5933
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestStringFunctions.java 09d1361
>   exec/java-exec/src/test/resources/functions/string/testSubstr.json e885381
>
> Diff: https://reviews.apache.org/r/21058/diff/
>
>
> Testing
> -------
>
> Yes.
> ----------------------------------------------------------------------------------------
> JUnit Test Case:
> ----------------------------------------------------------------------------------------
>
> $mvn test -Dtest=TestStringFunctions#testSubstr
>
> Results :
>
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
>
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 53.030 s
> [INFO] Finished at: 2014-05-04T16:08:26+05:30
> [INFO] Final Memory: 44M/711M
> [INFO] ------------------------------------------------------------------------
>
>
> ----------------------------------------------------------------------------------------
> SQLLINE Test
> ----------------------------------------------------------------------------------------
>
> 0: jdbc:drill:zk=local> SELECT employee_id, first_name, substring(first_name, 3) sub_str FROM cp.`employee.json` limit 20;
> +-------------+------------+------------+
> | employee_id | first_name | sub_str    |
> +-------------+------------+------------+
> | 1           | Sheri      | eri        |
> | 2           | Derrick    | rrick      |
> | 4           | Michael    | chael      |
> | 5           | Maya       | ya         |
> | 6           | Roberta    | berta      |
> | 7           | Rebecca    | becca      |
> | 8           | Kim        | m          |
> | 9           | Brenda     | enda       |
> | 10          | Darren     | rren       |
> | 11          | Jonathan   | nathan     |
> | 12          | Jewel      | wel        |
> | 13          | Peggy      | ggy        |
> | 14          | Bryan      | yan        |
> | 15          | Walter     | lter       |
> | 16          | Peggy      | ggy        |
> | 17          | Brenda     | enda       |
> | 18          | Daniel     | niel       |
> | 19          | Dianne     | anne       |
> | 20          | Beverly    | verly      |
> | 21          | Pedro      | dro        |
> +-------------+------------+------------+
>
>
> Thanks,
>
> Yash Sharma
>
>
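
A note on the IndexOutOfBoundsException quoted earlier in this thread: its numbers are consistent with a buffer sized by character count but filled with UTF-8 bytes. An 11-character literal made of ten three-byte Devanagari characters and one space encodes to 31 bytes, which would match "length: 31 (expected: range(0, 11))". The sketch below only illustrates that arithmetic and a character-oriented substring over UTF-8 bytes in plain Java. It is not the code from the patch: the class name, the sample string, and the helper substrUtf8 are hypothetical, and the original test literal cannot be recovered from the garbled output above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8SubstrSketch {
    // Hypothetical sample: two 5-character Devanagari words separated by a space,
    // written with Unicode escapes so the source file encoding does not matter.
    static final String SAMPLE =
        "\u0939\u093F\u0902\u0926\u0940 \u0939\u093F\u0902\u0926\u0940";

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Character-oriented substring over UTF-8 bytes.
    // start is 1-based and counts characters (code points), not bytes.
    static byte[] substrUtf8(byte[] utf8, int start, int length) {
        if (start < 1 || length <= 0) {
            return new byte[0];
        }
        long end = (long) start + length; // first character NOT in the result
        int charIndex = 0;                // characters seen so far
        int from = utf8.length;           // byte offset where the result begins
        int to = utf8.length;             // byte offset where the result ends (exclusive)
        for (int i = 0; i < utf8.length; i++) {
            // Any byte that is not a continuation byte (10xxxxxx) starts a new character.
            if ((utf8[i] & 0xC0) != 0x80) {
                charIndex++;
                if (charIndex == start) {
                    from = i;
                }
                if (charIndex == end) {
                    to = i;
                    break;
                }
            }
        }
        return Arrays.copyOfRange(utf8, from, to);
    }

    public static void main(String[] args) {
        byte[] bytes = SAMPLE.getBytes(StandardCharsets.UTF_8);
        System.out.println("chars = " + SAMPLE.length());      // 11
        System.out.println("bytes = " + utf8Length(SAMPLE));   // 31
        byte[] sub = substrUtf8(bytes, 3, 10);
        System.out.println("substring bytes = " + sub.length);
    }
}
```

The point of the sketch is that the destination buffer has to be allocated from the encoded byte length (31 here), and that start and length have to be translated from character positions to byte offsets before copying.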
