I think my issue affects only certain Hive tables (I'm not sure why), and
the affected tables show nothing under "Cardinality" in Kylin.
Below is the error in case it is helpful (it did not seem insightful to
me). I am on .72 of Kylin, so I will try upgrading.

pool-7-thread-10]:[2015-09-17 08:33:16,440][ERROR][org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:65)] - error execute HadoopShellExecutable{id=989dcad2-8ba9-4be1-bb1c-22cb0696007a-02, name=Build Dimension Dictionary, state=RUNNING}
java.io.FileNotFoundException: File does not exist: /tmp/kylin-989dcad2-8ba9-4be1-bb1c-22cb0696007a/hierarchy_test/fact_distinct_columns/CLIENT_ID
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1132)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
    at org.apache.kylin.dict.lookup.FileTable.getSignature(FileTable.java:71)
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:177)
    at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:154)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:53)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
    at org.apache.kylin.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:53)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
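
In case it is useful to anyone digging through kylin.log for the same kind of
failure, here is a rough Python sketch that pulls out the ERROR entries and any
HDFS paths they report as missing. It rests on assumptions: that in the real
log each entry starts with a single header line shaped like the one in the
trace above (the leading "[" appears to be cut off in the paste, so the pattern
treats it as optional), and that the exception and its frames follow on
separate lines. The names extract_errors and missing_paths are made up for
illustration; they are not part of Kylin.

```python
import re

# Assumed header shape, based only on the single log line in this thread:
#   [thread]:[timestamp][LEVEL][source location] - message
# The leading "[" is optional because it is missing from the pasted line.
HEADER = re.compile(
    r"^\[?(?P<thread>[^\]\[]+)\]:\[(?P<time>[^\]]+)\]"
    r"\[(?P<level>[A-Z]+)\]\[(?P<source>[^\]]+)\] - (?P<message>.*)$"
)

# Message text taken from the FileNotFoundException in the trace.
MISSING = re.compile(r"File does not exist: (\S+)")


def extract_errors(lines):
    """Group each ERROR header with the non-blank lines that follow it."""
    errors = []
    current = None  # the ERROR entry currently being collected, if any
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # A new header ends the previous entry, whatever its level.
            current = None
            if match.group("level") == "ERROR":
                current = {
                    "time": match.group("time"),
                    "message": match.group("message"),
                    "trace": [],
                }
                errors.append(current)
        elif current is not None and line.strip():
            # Exception line or stack frame belonging to the current ERROR.
            current["trace"].append(line.strip())
    return errors


def missing_paths(errors):
    """Return every path named in a 'File does not exist' line."""
    paths = []
    for err in errors:
        for line in err["trace"]:
            found = MISSING.search(line)
            if found:
                paths.append(found.group(1))
    return paths
```

To use it, open $KYLIN_HOME/tomcat/logs/kylin.log, pass the file object to
extract_errors, and hand the result to missing_paths. Each reported path can
then be checked on HDFS to see whether the earlier step wrote output for that
column at all.
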
On Wed, Sep 16, 2015 at 9:18 PM, ShaoFeng Shi <[email protected]> wrote:
> Which version of Kylin are you running? The web UI had a bug in showing
> error messages, but it should have been fixed in v1.0. If you're using a
> browser-side debug tool like Firebug, you can check the content of the HTTP
> response body. Usually, though, we check the log file directly:
> $KYLIN_HOME/tomcat/logs/kylin.log. Please check it there, and copy/paste
> the detailed error trace here for troubleshooting.
>
>
>
> 2015-09-17 1:53 GMT+08:00 Patrick McAnneny <[email protected]>:
>
> > Luke,
> >
> > Thanks for the quick reply. The hierarchy dimension is hopefully just what
> > we need then.
> >
> > However, testing this out gives me errors in the cube build step.
> > If I include 2 columns in the hierarchy dimension (clicking "+new
> > hierarchy" twice), the build job is scheduled, but fails on the "Build
> > Dimension Dictionary" step.
> > If I include a single hierarchy column for said dimension, scheduling the
> > cube build fails with the error: "Can't find column <DB NAME>_<TABLE
> > NAME>_<COLUMN NAME>". It seems there are underscores here in place of
> > dots; shouldn't it be "<DB NAME>.<TABLE NAME>.<COLUMN NAME>"?
> >
> >
> > For the build job that schedules but fails, I cannot access the logs via
> > the UI; the "loading" spinner just spins indefinitely. Would the logs be
> > helpful to look at, and if so, where would they reside?
> >
> >
> >
> >
> >
> > On Wed, Sep 16, 2015 at 11:51 AM, Luke Han <[email protected]> wrote:
> >
> > > If they have a one-to-many relationship, it will be easy to roll up to
> > > the parent level; the hierarchy dimension is designed for that.
> > >
> > >
> > > Best Regards!
> > > ---------------------
> > >
> > > Luke Han
> > >
> > > On Wed, Sep 16, 2015 at 10:15 PM, Patrick McAnneny <
> > > [email protected]> wrote:
> > >
> > > > If I had a single large fact table with a row for every event I am
> > > > tracking, and some rows have parent or child events (within this same
> > > > table), could Kylin handle these relationships accordingly? Can the
> > > > cube be configured to roll up measures based on parent rows?
> > > >
> > > > Thanks
> > > >
> > >
> >
>