If yes, please attach the code snippet.

On Mon, Aug 3, 2015 at 3:08 PM, hongbin ma <[email protected]> wrote:

> KYLIN-921 is unrelated to empty segments. It deals with cases where some
> dimensions are always null (while other dimensions have non-null
> values). Will https://issues.apache.org/jira/browse/KYLIN-863 handle that
> case?
>
> On Mon, Aug 3, 2015 at 3:04 PM, Shaofeng SHI (JIRA) <[email protected]>
> wrote:
>
>>
>>     [
>> https://issues.apache.org/jira/browse/KYLIN-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651526#comment-14651526
>> ]
>>
>> Shaofeng SHI commented on KYLIN-921:
>> ------------------------------------
>>
>> Hi Dayue and hongbin, in 0.7.3 I already made a change to allow empty
>> cube segments, which means that even if a dimension is null, the job will
>> not fail in the third step. The change was included in KYLIN-863 and made
>> in both 0.7 and 0.8.
>>
>> > Dimension with all nulls cause BuildDimensionDictionary failed due to
>> FileNotFoundException
>> >
>> -------------------------------------------------------------------------------------------
>> >
>> >                 Key: KYLIN-921
>> >                 URL: https://issues.apache.org/jira/browse/KYLIN-921
>> >             Project: Kylin
>> >          Issue Type: Bug
>> >          Components: Job Engine
>> >    Affects Versions: v0.7.2
>> >            Reporter: Dayue Gao
>> >            Assignee: ZhouQianhao
>> >             Fix For: v0.7.3
>> >
>> >         Attachments: KYLIN-921.patch
>> >
>> >
>> > From mailing list
>> > ----------------------
>> > {noformat}
>> > I am building a cube with some lookup table in between and getting
>> > exception at third step of cube build i.e Build Dimension Dictionary with
>> > exception saying
>> > java.io.FileNotFoundException: File does not exist: /tmp/kylin-5a2ea405-24a2-45ed-958e-2a7fddd8cc97/sc_o2s_metrics_verified123455/fact_distinct_columns/SC
>> > at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
>> > at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
>> > at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> > at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
>> > at org.apache.kylin.dict.lookup.FileTable.getSignature(FileTable.java:62)
>> > at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:164)
>> > at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
>> > at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
>> > at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
>> > at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
>> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>> > at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
>> > at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>> > at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>> > at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>> > at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> > at java.lang.Thread.run(Thread.java:745)
>> > {noformat}
>> > The problem is that FactDistinctColumnsMapper's map method skips null
>> values. As a result, if all values of dimension 'x' are null,
>> FactDistinctColumnsReducer will not create a file for 'x', and the
>> following job then throws FileNotFoundException.
>> > {code:title=FactDistinctColumnsMapper.java|borderStyle=solid}
>> > public void map(KEYIN key, HCatRecord record, Context context) throws IOException, InterruptedException {
>> >     try {
>> >         // code omitted ...
>> >         for (int i : factDictCols) {
>> >             outputKey.set((short) i);
>> >             fieldSchema = schema.get(flatTableIndexes[i]);
>> >             Object fieldValue = record.get(fieldSchema.getName(), schema);
>> >             // NULL VALUE IS SKIPPED
>> >             if (fieldValue == null)
>> >                 continue;
>> >             // code omitted ...
>> >         }
>> >     } catch (Exception ex) {
>> >         handleErrorRecord(record, ex);
>> >     }
>> > }
>> > {code}
>>
>>
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.3.4#6332)
>>
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
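
To make the failure mode in the quoted report concrete, here is a minimal,
self-contained sketch. It is not Kylin code and not the attached patch: the
class NullDimensionSketch, its methods, and the ensureAllColumns flag are
hypothetical names used only for illustration. An in-memory map stands in for
the per-column files under fact_distinct_columns, and a missing map entry
stands in for the missing file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in Kylin.
class NullDimensionSketch {

    // Stand-in for the mapper/reducer pair: collect distinct non-null values
    // per column index. With ensureAllColumns == false, a column whose values
    // are all null never gets an entry (analogous to no file being written).
    static Map<Integer, Set<String>> distinctValues(
            List<String[]> rows, int columnCount, boolean ensureAllColumns) {
        Map<Integer, Set<String>> result = new TreeMap<>();
        if (ensureAllColumns) {
            // Assumed mitigation: register every column up front so that an
            // all-null column still produces an (empty) output.
            for (int i = 0; i < columnCount; i++) {
                result.put(i, new TreeSet<>());
            }
        }
        for (String[] row : rows) {
            for (int i = 0; i < columnCount; i++) {
                String value = row[i];
                if (value == null) {
                    continue; // null is skipped, as in the quoted map()
                }
                result.computeIfAbsent(i, k -> new TreeSet<>()).add(value);
            }
        }
        return result;
    }

    // Stand-in for the dictionary-building step: fails when a column has no
    // output at all, similar in spirit to the FileNotFoundException above.
    static int dictionarySize(Map<Integer, Set<String>> distinct, int column) {
        Set<String> values = distinct.get(column);
        if (values == null) {
            throw new IllegalStateException(
                    "No distinct-values output for column " + column);
        }
        return values.size();
    }

    public static void main(String[] args) {
        // Column 0 has real values; column 1 is null in every row.
        List<String[]> rows = Arrays.asList(
                new String[] {"a", null},
                new String[] {"b", null});

        Map<Integer, Set<String>> skipped = distinctValues(rows, 2, false);
        try {
            dictionarySize(skipped, 1);
        } catch (IllegalStateException e) {
            System.out.println("Without the guard: " + e.getMessage());
        }

        Map<Integer, Set<String>> ensured = distinctValues(rows, 2, true);
        System.out.println("With the guard, column 1 size: "
                + dictionarySize(ensured, 1));
    }
}
```

Whether the real fix should emit an empty output for such a column or make the
dictionary step tolerate a missing file is a design choice this sketch does not
settle; it only shows why skipping nulls alone leaves the downstream step with
nothing to read.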



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
