Hi Joel, Kylin supports using a view as the fact table; today many users build cubes from views. There is a known issue with views: Kylin cannot calculate the columns' cardinality for a view. The root cause is that Hive HCatalog (which Kylin uses to read the source) does not support views so far; the error you got is exactly this issue.
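For reference, the workaround you described is the usual one: define a view that projects only the simple-typed columns, so the array<string> columns never reach Kylin's table sync. A minimal Hive sketch, with hypothetical table and column names:

```sql
-- Hypothetical names: fact_orders is the ORC fact table;
-- tags ARRAY<STRING> is one of the complex columns Kylin cannot sync.
CREATE VIEW IF NOT EXISTS fact_orders_flat AS
SELECT
  order_id,    -- keep only the simple-typed columns
  order_date,  -- partition column, still usable for cube partitioning
  amount
FROM fact_orders;
```

Note that on 1.5.2 the cardinality job against such a view will still hit the HCatalog NullPointerException below; the cube build itself can proceed, and the view issue is what the upcoming hot-fix addresses.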
Kylin 1.5.2 has another issue with using a view as the fact table (see KYLIN-1758). The hot-fix version 1.5.2.1 will be released soon; you can try it.

2016-06-06 19:30 GMT+08:00 Joel Victor <[email protected]>:

> Hi,
>
> I am using Kylin 1.5.2 with HDP 2.2.
> Currently my fact table contains multiple columns of type array<string>.
> Kylin won't allow me to sync this table since it has complex data types.
> I don't need these complex data types in my cube builds, but I do require
> them for other jobs.
>
> The table is partitioned on date, has 2 buckets, and is stored in ORC
> format.
>
> I tried creating a view over it, but it seems Kylin doesn't support views
> as a fact table.
>
> Another approach that I came up with is moving all the columns with
> complex data types from the original table to a separate table and using
> the original table as my fact table for building cubes.
>
> Is there any other way to go about this scenario?
>
> I get the following error when I sync the view:
>
> java.lang.RuntimeException: java.io.IOException: java.lang.NullPointerException
>     at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:86)
>     at org.apache.kylin.source.hive.cardinality.HiveColumnCardinalityJob.run(HiveColumnCardinalityJob.java:89)
>     at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:91)
>     at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:121)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.lang.NullPointerException
>     at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>     at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>     at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:81)
>     ... 10 more
> Caused by: java.lang.NullPointerException
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:191)
>     at org.apache.hive.hcatalog.mapreduce.FosterStorageHandler.<init>(FosterStorageHandler.java:59)
>     at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:417)
>     at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:380)
>     at org.apache.hive.hcatalog.mapreduce.InitializeInput.extractPartInfo(InitializeInput.java:158)
>     at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:137)
>     at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
>     at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
>
> Thanks,
> Joel

--
Best regards,

Shaofeng Shi
