[
https://issues.apache.org/jira/browse/KYLIN-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129938#comment-17129938
]
ASF GitHub Bot commented on KYLIN-4547:
---------------------------------------
qin7972 commented on pull request #1245:
URL: https://github.com/apache/kylin/pull/1245#issuecomment-641669948
@shaofengshi
The crc file comes from HDFS. The path is:
${yarn.nodemanager.local-dirs}/usercache/root/appcache/application_1591713180468_0013/container_1591713180468_0013_01_000001/meta
This error does not appear in Kylin 2.5.1.
In 2.5.1, when CachedCrudAssist loads resources from the directory, it first
applies the json suffix rule to the file names, so files with the crc suffix
are filtered out. Only the files that match are then deserialized with
JsonSerializer, so no error occurs.
In Kylin 3.0.0, however, CachedCrudAssist scans all files in the directory
and deserializes each one during the scan. The json suffix rule is applied to
the file names only after the scan is complete, so the file with the crc
suffix is passed to JsonSerializer, which reports this error.
Our PR follows the Kylin 2.5.1 behavior: it adds a Filter to sink and filters
out the files with the crc suffix.
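To illustrate the idea of filtering by suffix before deserializing, here is a
minimal sketch. It assumes plain java.nio rather than Kylin's ResourceStore
API, and the class and method names (SuffixFilterSketch, isJsonResource,
listJsonResources) are hypothetical; it is not the code in the pull request.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not Kylin's actual code: it only illustrates
// "filter by suffix first, deserialize afterwards".
public class SuffixFilterSketch {

    static final String JSON_SUFFIX = ".json";

    // True only for names ending in ".json". A checksum side file such as
    // ".kylin_sales_cube.json.crc" ends in ".crc" and is rejected.
    static boolean isJsonResource(String fileName) {
        return fileName.endsWith(JSON_SUFFIX);
    }

    // Lists the regular files in dir that pass the suffix check, sorted by
    // name. Only these names would be handed to a JSON deserializer.
    static List<String> listJsonResources(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .filter(SuffixFilterSketch::isJsonResource)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory that mimics the "meta" folder layout.
        Path dir = Files.createTempDirectory("meta");
        Files.write(dir.resolve("kylin_sales_cube.json"), "{}".getBytes());
        Files.write(dir.resolve(".kylin_sales_cube.json.crc"),
                new byte[] {0x63, 0x72, 0x63, (byte) 0xd2});
        System.out.println(listJsonResources(dir));
    }
}
```

Because the check runs on the file name alone, the binary crc content is
never opened or parsed, which is the property the 2.5.1 behavior relied on.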
----------------------------------------------------------------
> parse crc file error when building cube with mapreduce
> ------------------------------------------------------
>
> Key: KYLIN-4547
> URL: https://issues.apache.org/jira/browse/KYLIN-4547
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v3.0.0
> Environment: hadoop version: CDH-5.12.2-1.cdh5.12.2
> Reporter: steven-qin
> Assignee: steven-qin
> Priority: Major
>
> Kylin cannot parse the crc file when building a cube with MapReduce.
> Here is the exception log:
> {code:java}
> 2020-06-08 10:08:23,821 INFO [main] org.apache.kylin.common.KylinConfig: Creating new manager instance of class org.apache.kylin.cube.CubeManager
> 2020-06-08 10:08:23,844 INFO [main] org.apache.kylin.cube.CubeManager: Initializing CubeManager with config kylin_metadata30@ifile,path=/yarn/nm/usercache/kylin/appcache/application_1590134125851_1782/container_e55_1590134125851_1782_01_000003/meta
> 2020-06-08 10:08:23,847 INFO [main] org.apache.kylin.common.persistence.ResourceStore: Using metadata url kylin_metadata30@ifile,path=/yarn/nm/usercache/kylin/appcache/application_1590134125851_1782/container_e55_1590134125851_1782_01_000003/meta for resource store
> 2020-06-08 10:08:24,303 ERROR [main] org.apache.kylin.common.persistence.ResourceStore: Error reading resource /cube/.kylin_sales_cube.json.crc
> org.apache.kylin.job.shaded.com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 middle byte 0xd2
> at [Source: (DataInputStream); line: 1, column: 11]
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:663)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidOther(UTF8StreamJsonParser.java:3543)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._decodeCharForError(UTF8StreamJsonParser.java:3288)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3514)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2621)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:826)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:723)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:4129)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3988)
> at org.apache.kylin.job.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3058)
> at org.apache.kylin.common.util.JsonUtil.readValue(JsonUtil.java:73)
> at org.apache.kylin.common.persistence.JsonSerializer.deserialize(JsonSerializer.java:46)
> at org.apache.kylin.common.persistence.ContentReader.readContent(ContentReader.java:40)
> at org.apache.kylin.common.persistence.ResourceStore$3.visit(ResourceStore.java:259)
> at org.apache.kylin.common.persistence.FileResourceStore.visitFolderImpl(FileResourceStore.java:87)
> at org.apache.kylin.common.persistence.ResourceStore.visitFolderInner(ResourceStore.java:766)
> at org.apache.kylin.common.persistence.ResourceStore.visitFolderAndContent(ResourceStore.java:751)
> at org.apache.kylin.common.persistence.ResourceStore.lambda$getAllResourcesMap$0(ResourceStore.java:255)
> at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
> at org.apache.kylin.common.persistence.ResourceStore.getAllResourcesMap(ResourceStore.java:253)
> at org.apache.kylin.metadata.cachesync.CachedCrudAssist.reloadAll(CachedCrudAssist.java:127)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:152)
> at org.apache.kylin.cube.CubeManager.newInstance(CubeManager.java:109)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.kylin.common.KylinConfig.getManager(KylinConfig.java:478)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:104)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnPartitioner.setConf(FactDistinctColumnPartitioner.java:56)
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:707)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:776)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {code}