[jira] [Commented] (KYLIN-921) Dimension with all nulls cause BuildDimensionDictionary failed due to FileNotFoundException

hongbin ma (JIRA) Sun, 02 Aug 2015 23:55:30 -0700

    [ 
https://issues.apache.org/jira/browse/KYLIN-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651521#comment-14651521
 ]


hongbin ma commented on KYLIN-921:
----------------------------------

Hi Dayue,

I reviewed your patch and applied it on 0.7-staging. Thank you for the 
high-quality patch.

Let me know:
1. Whether you have time to apply the changes on the 0.8 branch too.
2. Whether you have any problems setting up the CI environment for 0.7-staging 
or 0.8. Since 0.8, unit tests and integration tests are separated: a test case 
is categorized as an integration test if it needs to interact with the HBase 
sandbox/minicluster. You can run "mvn test" to run unit tests only and 
"mvn verify" to run all test cases. For 0.7 versions this separation is not 
available, so you'll have to follow 
http://kylin.incubator.apache.org/docs/development/dev_env.html to run all the 
test cases whenever you make any changes.



> Dimension with all nulls cause BuildDimensionDictionary failed due to 
> FileNotFoundException
> -------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-921
>                 URL: https://issues.apache.org/jira/browse/KYLIN-921
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v0.7.2
>            Reporter: Dayue Gao
>            Assignee: ZhouQianhao
>             Fix For: v0.7.3
>
>         Attachments: KYLIN-921.patch
>
>
> From mailing list
> ----------------------
> {noformat}
> I am building a cube with some lookup table in between and getting
> exception at third step of cube build i.e Build Dimension Dictionary with
> exception saying
> java.io.FileNotFoundException: File does not exist:
> /tmp/kylin-5a2ea405-24a2-45ed-958e-2a7fddd8cc97/sc_o2s_metrics_verified123455/fact_distinct_columns/SC
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
> at org.apache.kylin.dict.lookup.FileTable.getSignature(FileTable.java:62)
> at
> org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
> at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
> at
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
> at
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
> at
> org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at
> org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
> at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
> at
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
> at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
> at
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The problem is that FactDistinctColumnsMapper's map method skips null values. 
> As a result, if all values of dimension 'x' are null, 
> FactDistinctColumnsReducer does not create a file for 'x', and the 
> subsequent job then throws FileNotFoundException.
> {code:title=FactDistinctColumnsMapper.java|borderStyle=solid}
> public void map(KEYIN key, HCatRecord record, Context context) throws IOException, InterruptedException {
>     try {
>         // code omitted ...
>         for (int i : factDictCols) {
>             outputKey.set((short) i);
>             fieldSchema = schema.get(flatTableIndexes[i]);
>             Object fieldValue = record.get(fieldSchema.getName(), schema);
>             // NULL VALUE IS SKIPPED
>             if (fieldValue == null)
>                 continue;
>             // code omitted ...
>         }
>     } catch (Exception ex) {
>         handleErrorRecord(record, ex);
>     }
> }
> {code}
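
The failure mode described above can be reproduced in a minimal, self-contained sketch. It assumes the reducer produces one output file per column key it receives, and it models those files as map entries. It does not use any Kylin or Hadoop classes: NullDimensionSketch, distinctColumns and distinctColumnsWithPlaceholders are hypothetical names, and the placeholder step is only one possible mitigation, not necessarily what KYLIN-921.patch does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class NullDimensionSketch {

    // Stand-in for the map + reduce phases: collect distinct non-null values
    // per column. A column only gets an entry (standing in for an output file)
    // if at least one non-null value was emitted for it.
    public static Map<Integer, Set<String>> distinctColumns(List<String[]> rows, int[] dictCols) {
        Map<Integer, Set<String>> files = new TreeMap<>();
        for (String[] row : rows) {
            for (int col : dictCols) {
                String value = row[col];
                if (value == null) {
                    continue; // same skip as in the quoted mapper
                }
                files.computeIfAbsent(col, k -> new TreeSet<>()).add(value);
            }
        }
        return files;
    }

    // Hypothetical mitigation (an assumption, not the actual patch): make sure
    // every dictionary column has an entry, even if it is empty.
    public static Map<Integer, Set<String>> distinctColumnsWithPlaceholders(List<String[]> rows, int[] dictCols) {
        Map<Integer, Set<String>> files = distinctColumns(rows, dictCols);
        for (int col : dictCols) {
            files.putIfAbsent(col, new TreeSet<>());
        }
        return files;
    }

    public static void main(String[] args) {
        // Column 1 is null in every row, like dimension 'x' in the report.
        List<String[]> rows = Arrays.asList(
                new String[] {"a", null},
                new String[] {"b", null});
        int[] dictCols = {0, 1};

        System.out.println(distinctColumns(rows, dictCols).keySet());                 // prints [0]
        System.out.println(distinctColumnsWithPlaceholders(rows, dictCols).keySet()); // prints [0, 1]
    }
}
```

In the first call, column 1 has no entry at all, which corresponds to the missing file that the dictionary-building step later tries to read.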



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
