I created a JIRA ticket for this:
https://issues.apache.org/jira/browse/KYLIN-1104

2015-10-27 11:50 GMT+08:00 yu feng <[email protected]>:

> Hi all, I get an error in the "Build Base Cuboid Data" step when I build a
> new cube. After modifying the source code and checking the error log, I
> found this stack trace:
> java.lang.ArrayIndexOutOfBoundsException
> at org.apache.kylin.common.util.BytesSplitter.split(BytesSplitter.java:68)
> at org.apache.kylin.job.hadoop.cube.BaseCuboidMapper.map(BaseCuboidMapper.java:212)
> at org.apache.kylin.job.hadoop.cube.BaseCuboidMapper.map(BaseCuboidMapper.java:55)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> split 0, value length 4096, real length 4876
>
> The last line is debug output that I added myself. I also swapped these two
> lines in the BaseCuboidMapper.handleErrorRecord function:
>     ex.printStackTrace(System.err);
>     System.err.println("Insane record: " + bytesSplitter);
>
> With that information I found the root cause: in setup, the job creates
> bytesSplitter = new BytesSplitter(200, 4096). Once the length of a dimension
> value exceeds 4096, BytesSplitter.split throws an
> ArrayIndexOutOfBoundsException. The map function catches this exception (I
> guess Kylin either treats such a row as invalid or did not consider this
> situation) and then calls handleErrorRecord. However, that function prints
> the splits like this:
> System.err.println("Insane record: " + bytesSplitter);
>
> which calls bytesSplitter.toString():
> public String toString() {
>     StringBuilder buf = new StringBuilder();
>     buf.append("[");
>     for (int i = 0; i < bufferSize; i++) {
>         if (i > 0)
>             buf.append(", ");
>
>         buf.append(Bytes.toString(splitBuffers[i].value, 0, splitBuffers[i].length));
>     }
>     return buf.toString();
> }
>
> This function converts the bytes of each split to a string and appends it to
> a StringBuilder. But the conversion reads splitBuffers[i] using its length
> field, which holds my column value's length (4876 in my example; the length
> is set before the data is copied in BytesSplitter.split), while the array
> was allocated with only 4096 bytes. That causes a second
> ArrayIndexOutOfBoundsException, which makes the job fail.
>
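
To make the two-stage failure above easier to see, here is a minimal, self-contained sketch. It does not use Kylin's classes: SplitBufferSketch and InsaneRecordDemo are hypothetical stand-ins that only mirror the order of operations described above (length recorded before the copy, then a string conversion that trusts that length). The exact exception subclass may differ from what Kylin throws, so the sketch catches the common superclass IndexOutOfBoundsException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for one split buffer; not Kylin's actual class.
class SplitBufferSketch {
    final byte[] value;
    int length;

    SplitBufferSketch(int capacity) {
        this.value = new byte[capacity];
    }

    // Assumption: mirrors the described order, length is recorded before the copy.
    void fill(byte[] src) {
        this.length = src.length;
        // Throws when src.length is larger than value.length.
        System.arraycopy(src, 0, value, 0, src.length);
    }

    // Mirrors the toString() path: trusts length even if it exceeds the array.
    String render() {
        return new String(value, 0, length, StandardCharsets.UTF_8);
    }
}

class InsaneRecordDemo {
    // Returns {splitFailed, renderFailed} for a buffer of the given capacity.
    static boolean[] reproduce(int capacity, int valueLength) {
        SplitBufferSketch buf = new SplitBufferSketch(capacity);
        boolean splitFailed = false;
        boolean renderFailed = false;
        try {
            buf.fill(new byte[valueLength]);
        } catch (IndexOutOfBoundsException e) {
            splitFailed = true; // first failure: the copy overflows the buffer
        }
        try {
            buf.render();
        } catch (IndexOutOfBoundsException e) {
            renderFailed = true; // second failure: stale length exceeds the array
        }
        return new boolean[] { splitFailed, renderFailed };
    }

    public static void main(String[] args) {
        boolean[] r = reproduce(4096, 4876);
        System.out.println("split failed: " + r[0] + ", toString failed: " + r[1]);
    }
}
```

Calling reproduce(4096, 4876) uses the sizes from my debug line; a value that fits, such as reproduce(4096, 100), goes through both stages without an exception.
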
> I think 4096 is the maximum dimension value length. Should it be made a
> configuration property? We should also catch the
> ArrayIndexOutOfBoundsException; otherwise I cannot go on with my cube
> build.
>
>
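
One possible direction for the fix, as a rough sketch only and not a patch against Kylin: read the initial capacity from a setting, grow the buffer when a value does not fit, record the length only after the copy succeeds, and clamp the length in toString(). BoundedSplitBuffer, BoundedSplitBufferDemo, and the property name sketch.splitter.max.value.length are all hypothetical names made up for this example; none of them exist in Kylin.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical replacement for one split buffer; not Kylin's actual class.
class BoundedSplitBuffer {
    private byte[] value;
    private int length;

    BoundedSplitBuffer(int initialCapacity) {
        this.value = new byte[initialCapacity];
    }

    // Grows the buffer when needed, so length never exceeds value.length.
    void fill(byte[] src, int offset, int len) {
        if (len > value.length) {
            value = new byte[len];
        }
        System.arraycopy(src, offset, value, 0, len);
        length = len; // recorded only after a successful copy
    }

    int capacity() {
        return value.length;
    }

    @Override
    public String toString() {
        // Defensive clamp: printing an error record must never throw.
        int safe = Math.min(length, value.length);
        return new String(value, 0, safe, StandardCharsets.UTF_8);
    }
}

class BoundedSplitBufferDemo {
    // Hypothetical property name, used only for this sketch.
    static final String MAX_LENGTH_PROPERTY = "sketch.splitter.max.value.length";

    // Falls back to 4096 when the system property is not set.
    static int configuredCapacity() {
        return Integer.getInteger(MAX_LENGTH_PROPERTY, 4096);
    }

    public static void main(String[] args) {
        BoundedSplitBuffer buf = new BoundedSplitBuffer(configuredCapacity());
        byte[] longValue = new byte[4876];
        java.util.Arrays.fill(longValue, (byte) 'a');
        buf.fill(longValue, 0, longValue.length);
        System.out.println("capacity now " + buf.capacity()
                + ", printed length " + buf.toString().length());
    }
}
```

Growing the buffer is only one option. Rejecting or truncating over-long values would also avoid the crash, and which behavior is right for cube building is a design decision for the Kylin developers.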
