[
https://issues.apache.org/jira/browse/KYLIN-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894835#comment-16894835
]
ASF subversion and git services commented on KYLIN-4106:
--------------------------------------------------------
Commit d2970d821149428e103277b7dd968cee28863da6 in kylin's branch
refs/heads/master from langdamao
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=d2970d8 ]
KYLIN-4106 fix Illegal partition for SelfDefineSortableKey
Signed-off-by: langdamao <[email protected]>
> Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct
> Columns”
> --------------------------------------------------------------------------------------
>
> Key: KYLIN-4106
> URL: https://issues.apache.org/jira/browse/KYLIN-4106
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.6.1, v2.6.2
> Reporter: langdamao
> Assignee: langdamao
> Priority: Critical
> Labels: easyfix
> Fix For: v2.6.4
>
>
> We got this error in the “Extract Fact Table Distinct Columns” step on Kylin 2.6.1:
>
> {code:java}
> Error: java.io.IOException: Illegal partition for org.apache.kylin.engine.mr.steps.SelfDefineSortableKey@6b69761b (254)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1096)
> at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
> at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.writeFieldValue(FactDistinctColumnsMapper.java:281)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.doMap(FactDistinctColumnsMapper.java:186)
> at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}
> I've found the problem in the following code in
> *FactDistinctColumnsReducerMapping.java – engine-mr*:
> {code:java}
> public int getReducerIdForCol(int colId, Object fieldValue) {
>     int begin = colIdToReducerBeginId[colId];
>     int span = colIdToReducerBeginId[colId + 1] - begin;
>
>     if (span == 1)
>         return begin;
>
>     int hash = fieldValue == null ? 0 : fieldValue.hashCode();
>     return begin + Math.abs(hash) % span;
> }
> {code}
> For the failing rowkey we have begin=1 and span=5, and the hash is -2147483648.
> Math.abs(-2147483648) returns -2147483648, so -2147483648 % 5 is -3 and the
> code above returns 1 + (-3) = -2 (which is 254 when read as an unsigned byte).
> The same bug also causes the problem below when getReducerIdForCol returns
> -1 (begin=1, span=3, hash=-2147483648), because the value written to a rowkey
> reducer is empty text, but reducer No. -1 expects value text:
> {code:java}
> Error: java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:135)
> at org.apache.kylin.measure.hllc.HLLCounter.readRegisters(HLLCounter.java:327)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:145)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:60)
> ...
> {code}
>
>
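The arithmetic described in the issue can be reproduced outside Kylin. The following is a minimal stand-alone sketch, not the Kylin class and not necessarily what commit d2970d8 does: the class name ReducerIdSketch and both method names are hypothetical, and Math.floorMod is shown only as one possible way to keep the result inside [begin, begin + span).

```java
// Hypothetical stand-alone sketch; names are illustrative, not Kylin code.
final class ReducerIdSketch {

    // Same arithmetic as the snippet quoted in the issue.
    // Math.abs(Integer.MIN_VALUE) overflows and stays negative.
    static int buggyReducerId(int begin, int span, int hash) {
        return begin + Math.abs(hash) % span;
    }

    // Assumed alternative: floorMod is never negative when span > 0.
    static int safeReducerId(int begin, int span, int hash) {
        return begin + Math.floorMod(hash, span);
    }

    public static void main(String[] args) {
        int hash = Integer.MIN_VALUE;                          // -2147483648
        System.out.println(Math.abs(hash));                    // -2147483648
        System.out.println(buggyReducerId(1, 5, hash));        // -2
        System.out.println(buggyReducerId(1, 5, hash) & 0xFF); // 254
        System.out.println(buggyReducerId(1, 3, hash));        // -1
        System.out.println(safeReducerId(1, 5, hash));         // 3
        System.out.println(safeReducerId(1, 3, hash));         // 2
    }
}
```

Note that floorMod maps other negative hashes to different reducers than Math.abs(hash) % span does, so it changes the distribution of values across reducers; that is harmless only if nothing else depends on the old mapping.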
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)