Hi,

I work on a project where we build a cube multiple times a day using Kylin.
We were using Kylin 1.6 and upgraded this week to Kylin 2.0.

Since the upgrade I have noticed that HDFS usage increases every time we
rebuild the cube, and the space is not cleaned up. This is despite running
both the StorageCleanupJob and the metastore clean command as described here
and here.
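For reference, these are the cleanup commands we run (assuming a standard
$KYLIN_HOME layout; I believe the StorageCleanupJob class moved packages
between 1.6 and 2.0, so the class name below is my understanding of the
2.0 location):

```shell
# Storage cleanup: dry run first, then with --delete true.
# Class name assumes Kylin 2.x; on 1.6 it lived under
# org.apache.kylin.storage.hbase.util.StorageCleanupJob.
${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false
${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true

# Metastore cleanup of dangling resources: dry run, then delete.
${KYLIN_HOME}/bin/metastore.sh clean
${KYLIN_HOME}/bin/metastore.sh clean --delete true
```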

When looking into HDFS to see where the increase comes from, I see that the
accumulated data is under: /kylin/kylin_metadata/
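This is how I measured the growth (standard Hadoop FS shell, using the path
above):

```shell
# Size of each job folder under the Kylin working directory
hadoop fs -du -h /kylin/kylin_metadata/

# Total usage of the working directory
hadoop fs -du -s -h /kylin/kylin_metadata/
```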

It looks like every job gets a new folder inside that directory, and its
size is at least the size of the cube. It seems some of these folders were
not cleaned up even for very old jobs, but since the upgrade to 2.0 none of
the job folders have been cleaned up at all. I deleted some of the older
folders and it didn't affect the cube. I also created a test cube, deleted
the folder that was created for it, and could still query the cube. Is it
safe to delete these folders manually? Is it correct to assume that once a
job is done, all the data that needs to be kept is in HBase (where I can
find the cube and the metadata)?
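In case it is relevant, this is roughly how I deleted the old folders (the
folder name is a placeholder -- I substituted the actual job folders I saw
under /kylin/kylin_metadata/):

```shell
# Remove one job's working folder; -skipTrash frees the space immediately
# rather than moving it to the HDFS trash.
hadoop fs -rm -r -skipTrash /kylin/kylin_metadata/<job-folder>
```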


Many thanks,

Itay

-----
Itay Shwartz

StructureIt
6th Floor
Aldgate Tower
2 Leman Street
London
E1 8FA

direct line: +44 (0)20 3286 9902
mobile: +44 (0)74 1123 6614
www.structureit.net
