Hi ShaoFeng,

A very clear explanation. Now I understand the use of the Kylin working dir. Thanks!

From: ShaoFeng Shi [mailto:[email protected]]
Sent: Wednesday, May 9, 2018 3:39
To: user <[email protected]>
Subject: Re: Doubts about the hdfs working dir

Hi Roberto,

The data in hdfs-working-dir includes intermediate files (which will be garbage collected) and Cuboid data (which won't be). The Cuboid data is kept for later segment merges, as Kylin can't merge from HBase. If you're sure those segments won't be merged, you can move them to other storage.

Please pay attention to the "resources" sub-folder under hdfs-working-dir, which persists some big metadata files such as dictionaries and snapshots. They shouldn't be moved.

2018-05-09 0:56 GMT+08:00 <[email protected]>:

Hi,

I have some doubts about the use of kylin.env.hdfs-working-dir. I understand the working dir is needed to store data about RUNNING or STOPPED jobs. However, is it necessary to store data from finished jobs? Although we often run the Kylin storage cleanup command, our working dir is now about 300 GB, which looks like a lot of data for historical jobs:

1. Can we delete old data manually? I tried stopping all jobs and Kylin, then changing the working dir. After the change, Kylin worked well for new jobs. Is there any drawback to performing this deletion manually? I did not experience any problems with the working dir change.
2. Does Kylin 2.3.1 have any advantages over Kylin 2.2 regarding working dir cleaning?

Kind regards,

Roberto Tardío Olmos
Senior Big Data & Business Intelligence Consultant
Avenida de Brasil, 17, Planta 16, 28020 Madrid
Landline: 91.788.34.10
http://bigdata.stratebi.com/
http://www.stratebi.com

--
Best regards,

Shaofeng Shi 史少锋
