[
https://issues.apache.org/jira/browse/KYLIN-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365245#comment-15365245
]
Richard Calaba edited comment on KYLIN-1834 at 7/6/16 10:53 PM:
----------------------------------------------------------------
To further debug the issue I have modified TrieDictionary.java to add
additional log info to method getIdFromValueBytesImpl:
    @Override
    protected int getIdFromValueBytesImpl(byte[] value, int offset, int len, int roundingFlag) {
        int seq = lookupSeqNoFromValue(headSize, value, offset, offset + len, roundingFlag);
        int id = calcIdFromSeqNo(seq);
        if (id < 0) {
            logger.error("Not a valid value: " + bytesConvert.convertFromBytes(value, offset, len));
            logger.error("Seq (=" + seq + ") returned by lookupSeqNoFromValue (headSize=" + headSize
                    + ", value=" + value + ", offset=" + offset + ", len=" + len
                    + ", roundingFlag=" + roundingFlag);
            logger.error("Id (=" + id + ") returned by calcIdFromSeqNo(seq) with nValues=" + nValues
                    + ", baseId=" + baseId);
        }
        return id;
    }
Now I see this in kylin log:
2016-07-06 16:57:16,912 ERROR [pool-2-thread-7] dict.TrieDictionary:174 : Not a valid value: -2857007631392161431
2016-07-06 16:57:16,912 ERROR [pool-2-thread-7] dict.TrieDictionary:175 : Seq (=-1) returned by lookupSeqNoFromValue (headSize=64, value=[B@12647ae0, offset=0, len=20, roundingFlag=0
2016-07-06 16:57:16,912 ERROR [pool-2-thread-7] dict.TrieDictionary:176 : Id (=-1) returned by calcIdFromSeqNo(seq) with nValues=44703717, baseId=0
2016-07-06 16:57:16,917 ERROR [pool-2-thread-7] execution.AbstractExecutable:62 : error execute HadoopShellExecutable{id=21521c0a-c06f-4ee9-b682-2c468bfaf526-03, name=Build Dimension Dictionary, state=RUNNING}
java.lang.IllegalArgumentException: Value not exists!
    at org.apache.kylin.dimension.Dictionary.getIdFromValueBytes(Dictionary.java:160
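A note on the `value=[B@12647ae0` field in the log above: concatenating a `byte[]` into a string prints only the array's identity, not its contents. Below is a minimal sketch of a hex-dump helper that the logging call could use instead. `ByteDump`, `toHex` and `utf8` are hypothetical names for illustration and are not part of Kylin. The `utf8` helper is only there to check that the logged value, taken as UTF-8 text, is 20 bytes long, which would be consistent with `len=20` if the dictionary is handed the value in string form (an assumption).

```java
import java.nio.charset.StandardCharsets;

final class ByteDump {
    /** Renders bytes [offset, offset + len) as two-digit hex pairs separated by spaces. */
    static String toHex(byte[] value, int offset, int len) {
        StringBuilder sb = new StringBuilder(len * 3);
        for (int i = offset; i < offset + len; i++) {
            if (i > offset) {
                sb.append(' ');
            }
            // Mask with 0xff so negative bytes are printed as 80..ff, not as sign-extended ints.
            sb.append(String.format("%02x", value[i] & 0xff));
        }
        return sb.toString();
    }

    /** Encodes a string as UTF-8 bytes, e.g. to compare its length with a logged len. */
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

In the logging call, `"value=" + ByteDump.toHex(value, offset, len)` would then show the exact bytes that were looked up.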
So the method lookupSeqNoFromValue definitely fails (it returns -1) while trying to encode the value.
I am not sure where nValues=44703717 comes from. It matches none of the cardinalities I know of:
- # of distinct ids (customer_id) in the fact table is 10 873 977
- # of distinct ids (customer_id) in the lookup table is 13 645 863
- # of distinct ids (transaction_id), another high-cardinality dimension without a lookup table, is 115 732 839
- # of distinct date / customer_id combinations in the fact table (2nd lookup table in the model using the high-cardinality dimension) is 31 663 787
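To see how far nValues can matter for this failure, here is a minimal sketch of the seq-to-id mapping. The body of calcIdFromSeqNo is not shown in this comment, so the range check below is an assumption about what it does, and `IdFromSeqSketch` is a hypothetical name. Under that assumption, the logged seq of -1 maps to id -1 whatever nValues is, so the failing step would be the lookup itself rather than the id calculation.

```java
final class IdFromSeqSketch {
    /**
     * Assumed mapping (not taken from the Kylin source):
     * a seq outside [0, nValues) yields -1, otherwise baseId + seq.
     */
    static int calcIdFromSeqNo(int seq, int nValues, int baseId) {
        if (seq < 0 || seq >= nValues) {
            return -1;
        }
        return baseId + seq;
    }
}
```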
Method lookupSeqNoFromValue source:
================================
    private int lookupSeqNoFromValue(int n, byte[] inp, int o, int inpEnd, int roundingFlag) {
        if (o == inpEnd) // special 'empty' value
            return checkFlag(headSize, BIT_IS_END_OF_VALUE) ? 0 : roundSeqNo(roundingFlag, -1, -1, 0);

        int seq = 0; // the sequence no under track

        while (true) {
            // match the current node, note [0] of node's value has been matched
            // when this node is selected by its parent
            int p = n + firstByteOffset; // start of node's value
            int end = p + BytesUtil.readUnsigned(trieBytes, p - 1, 1); // end of node's value
            for (p++; p < end && o < inpEnd; p++, o++) { // note matching start from [1]
                if (trieBytes[p] != inp[o]) {
                    int comp = BytesUtil.compareByteUnsigned(trieBytes[p], inp[o]);
                    if (comp < 0) {
                        seq += BytesUtil.readUnsigned(trieBytes, n + sizeChildOffset, sizeNoValuesBeneath);
                    }
                    return roundSeqNo(roundingFlag, seq - 1, -1, seq); // mismatch
                }
            }

            // node completely matched, is input all consumed?
            boolean isEndOfValue = checkFlag(n, BIT_IS_END_OF_VALUE);
            if (o == inpEnd) {
                return p == end && isEndOfValue ? seq : roundSeqNo(roundingFlag, seq - 1, -1, seq); // input all matched
            }
            if (isEndOfValue)
                seq++;

            // find a child to continue
            int c = headSize + (BytesUtil.readUnsigned(trieBytes, n, sizeChildOffset) & childOffsetMask);
            if (c == headSize) // has no children
                return roundSeqNo(roundingFlag, seq - 1, -1, seq); // input only partially matched
            byte inpByte = inp[o];
            int comp;
            while (true) {
                p = c + firstByteOffset;
                comp = BytesUtil.compareByteUnsigned(trieBytes[p], inpByte);
                if (comp == 0) { // continue in the matching child, reset n and loop again
                    n = c;
                    o++;
                    break;
                } else if (comp < 0) { // try next child
                    seq += BytesUtil.readUnsigned(trieBytes, c + sizeChildOffset, sizeNoValuesBeneath);
                    if (checkFlag(c, BIT_IS_LAST_CHILD))
                        return roundSeqNo(roundingFlag, seq - 1, -1, seq); // no child can match the next byte of input
                    c = p + BytesUtil.readUnsigned(trieBytes, p - 1, 1);
                } else { // children are ordered by their first value byte
                    return roundSeqNo(roundingFlag, seq - 1, -1, seq); // no child can match the next byte of input
                }
            }
        }
    }
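Every miss in the method above ends in roundSeqNo(roundingFlag, seq - 1, -1, seq). The body of roundSeqNo is not shown here, so the following is only a sketch of the semantics that this call pattern suggests: -1 when roundingFlag is 0, the closest smaller sequence number when it is negative, and the closest bigger one when it is positive. It uses a sorted array and binary search in place of the trie, and `RoundingLookupSketch` is a hypothetical name, not Kylin code.

```java
import java.util.Arrays;

final class RoundingLookupSketch {
    /**
     * Returns the index of value in sortedValues. On a miss:
     *   roundingFlag == 0 -> -1 (no rounding)
     *   roundingFlag <  0 -> index of the closest smaller value (-1 if there is none)
     *   roundingFlag >  0 -> index of the closest bigger value (sortedValues.length if there is none)
     */
    static int lookupSeqNo(String[] sortedValues, String value, int roundingFlag) {
        int r = Arrays.binarySearch(sortedValues, value);
        if (r >= 0) {
            return r; // exact match
        }
        int insertionPoint = -(r + 1); // index of the first element greater than value
        if (roundingFlag == 0) {
            return -1;
        }
        return roundingFlag < 0 ? insertionPoint - 1 : insertionPoint;
    }
}
```

With roundingFlag=0, as in the log above, any miss yields -1, so a seq of -1 simply means that the exact byte sequence was not found in the dictionary.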
> java.lang.IllegalArgumentException: Value not exists! - in Step 4 - Build Dimension Dictionary
> ----------------------------------------------------------------------------------------------
>
> Key: KYLIN-1834
> URL: https://issues.apache.org/jira/browse/KYLIN-1834
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v1.5.2, v1.5.2.1
> Reporter: Richard Calaba
> Priority: Blocker
> Attachments: job_2016_06_28_09_59_12-value-not-found.zip
>
>
> Getting exception in Step 4 - Build Dimension Dictionary:
> java.lang.IllegalArgumentException: Value not exists!
>     at org.apache.kylin.dimension.Dictionary.getIdFromValueBytes(Dictionary.java:160)
>     at org.apache.kylin.dict.TrieDictionary.getIdFromValueImpl(TrieDictionary.java:158)
>     at org.apache.kylin.dimension.Dictionary.getIdFromValue(Dictionary.java:96)
>     at org.apache.kylin.dimension.Dictionary.getIdFromValue(Dictionary.java:76)
>     at org.apache.kylin.dict.lookup.SnapshotTable.takeSnapshot(SnapshotTable.java:96)
>     at org.apache.kylin.dict.lookup.SnapshotManager.buildSnapshot(SnapshotManager.java:106)
>     at org.apache.kylin.cube.CubeManager.buildSnapshotTable(CubeManager.java:215)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:59)
>     at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
>     at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:60)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> result code:2
> The code which generates the exception is:
> org.apache.kylin.dimension.Dictionary.java:
>     /**
>      * A lower level API, return ID integer from raw value bytes. In case of not found
>      * <p>
>      * - if roundingFlag=0, throw IllegalArgumentException; <br>
>      * - if roundingFlag<0, the closest smaller ID integer if exist; <br>
>      * - if roundingFlag>0, the closest bigger ID integer if exist. <br>
>      * <p>
>      * Bypassing the cache layer, this could be significantly slower than getIdFromValue(T value).
>      *
>      * @throws IllegalArgumentException
>      *             if value is not found in dictionary and rounding is off;
>      *             or if rounding cannot find a smaller or bigger ID
>      */
>     final public int getIdFromValueBytes(byte[] value, int offset, int len, int roundingFlag) throws IllegalArgumentException {
>         if (isNullByteForm(value, offset, len))
>             return nullId();
>         else {
>             int id = getIdFromValueBytesImpl(value, offset, len, roundingFlag);
>             if (id < 0)
>                 throw new IllegalArgumentException("Value not exists!");
>             return id;
>         }
>     }
> ==========================================================
> The Cube is big: the fact table has 110 million rows and the largest
> dimension (customer) has 10 million rows. I have increased the JVM -Xmx to
> 16gb and set kylin.table.snapshot.max_mb=2048 in kylin.properties to make
> sure the Cube build doesn't fail (previously we were getting an exception
> complaining about the 300MB limit for the Dimension dictionary size; we
> need approx. 700MB).
> ==========================================================
> Before that we were getting an exception complaining about the Dictionary
> encoding - "Too high cardinality is not suitable for dictionary --
> cardinality: 10873977". We resolved this by changing the Encoding of the
> affected dimension/row key from "dict" to "int; length=8" in the Advanced
> Settings of the Cube.
> ==========================================================
> We have 2 high-cardinality fields (one from the fact table and one from the
> big customer dimension, see above) that we need to use in count_distinct
> measures for our calculations. I wonder whether the "Value not exists!"
> exception is somehow related. One of those count_distinct measures is
> defined with return type "bitmap" (exact precision, only for int columns)
> and the other with return type "hllc16" (error rate <= 1.22 %).
> ==========================================================
> I am looking for any clues to debug the cause of this error and for a way
> to circumvent it ...
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)