Hi,

I think you should use "Lookup Refresh" in the cube's manage list (see the attached screenshot). With Lookup Refresh, Kylin can serve queries against the latest dimension data without rebuilding the whole cube. You may have to wait a moment for the snapshot rebuild to succeed; kylin.log will tell you when it has finished.
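If you would rather script this than click through the web UI, the same action can be triggered over Kylin's REST API. Below is a minimal Python sketch; it assumes the "refresh_lookup" endpoint that I believe ships with the lookup snapshot management feature (Kylin 2.5+), the default ADMIN account, and the cube and table names from my log below. The endpoint name and payload may differ in your version, so please double-check the REST API docs for your release.

import base64
import json
import urllib.request

# Port 7193 is taken from my log below; adjust host, port, and credentials
# for your own cluster. ADMIN/KYLIN is only Kylin's default account.
KYLIN_API = "http://localhost:7193/kylin/api"
AUTH = "Basic " + base64.b64encode(b"ADMIN:KYLIN").decode()

def refresh_lookup(cube, table):
    """Submit a lookup-snapshot rebuild job for one lookup table of a cube.

    NOTE: the 'refresh_lookup' endpoint and this payload are my assumption
    based on the lookup snapshot management feature; verify them against
    the REST API documentation of your Kylin version.
    """
    payload = json.dumps({"lookupTableName": table}).encode()
    req = urllib.request.Request(
        "%s/cubes/%s/refresh_lookup" % (KYLIN_API, cube),
        data=payload,
        method="PUT",
        headers={"Authorization": AUTH, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # the submitted job, including its uuid

job = refresh_lookup("M1C2", "LACUS.KYLIN_ACCOUNT")
print("submitted lookup snapshot job:", job.get("uuid"))

The response should contain the submitted job, whose id you can then track in kylin.log or in the Monitor page.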
[cid:[email protected]] -------- Following is log of related job. 2019-08-23 00:24:02,072 INFO [FetcherRunner 198474142-43] threadpool.FetcherRunner:62 : LookupSnapshotBuildJob{id=780d706d-72ba-81bc-4442-6151a91ebab1, name=Lookup CUBE - M1C2 - TABLE - LACUS.KYLIN_ACCOUNT - CST 2019-08-23 00:23:36, state=READY} prepare to schedule and its priority is 30 2019-08-23 00:24:02,073 INFO [FetcherRunner 198474142-43] threadpool.FetcherRunner:66 : LookupSnapshotBuildJob{id=780d706d-72ba-81bc-4442-6151a91ebab1, name=Lookup CUBE - M1C2 - TABLE - LACUS.KYLIN_ACCOUNT - CST 2019-08-23 00:23:36, state=READY} scheduled 2019-08-23 00:24:02,073 INFO [FetcherRunner 198474142-43] threadpool.DefaultFetcherRunner:85 : Job Fetcher: 0 should running, 1 actual running, 0 stopped, 1 ready, 6 already succeed, 0 error, 0 discarded, 0 others 2019-08-23 00:24:02,073 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] execution.AbstractExecutable:162 : Executing AbstractExecutable (Lookup CUBE - M1C2 - TABLE - LACUS.KYLIN_ACCOUNT - CST 2019-08-23 00:23:36) 2019-08-23 00:24:02,077 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] execution.ExecutableManager:471 : job id:780d706d-72ba-81bc-4442-6151a91ebab1 from READY to RUNNING 2019-08-23 00:24:02,077 DEBUG [pool-6-thread-1] cachesync.Broadcaster:116 : Servers in the cluster: [localhost:7193] 2019-08-23 00:24:02,077 DEBUG [pool-6-thread-1] cachesync.Broadcaster:126 : Announcing new broadcast to all: BroadcastEvent{entity=execute_output, event=update, cacheKey=780d706d-72ba-81bc-4442-6151a91ebab1} 2019-08-23 00:24:02,079 DEBUG [pool-6-thread-1] cachesync.Broadcaster:116 : Servers in the cluster: [localhost:7193] 2019-08-23 00:24:02,079 DEBUG [pool-6-thread-1] cachesync.Broadcaster:126 : Announcing new broadcast to all: BroadcastEvent{entity=execute_output, event=update, cacheKey=780d706d-72ba-81bc-4442-6151a91ebab1} 2019-08-23 00:24:02,080 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] execution.AbstractExecutable:162 : Executing AbstractExecutable (Take Snapshot to Metadata Store) 2019-08-23 00:24:02,081 DEBUG [http-bio-7193-exec-9] cachesync.Broadcaster:246 : Broadcasting UPDATE, execute_output, 780d706d-72ba-81bc-4442-6151a91ebab1 2019-08-23 00:24:02,082 DEBUG [http-bio-7193-exec-9] cachesync.Broadcaster:280 : Done broadcasting UPDATE, execute_output, 780d706d-72ba-81bc-4442-6151a91ebab1 2019-08-23 00:24:02,084 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] execution.ExecutableManager:471 : job id:780d706d-72ba-81bc-4442-6151a91ebab1-00 from READY to RUNNING 2019-08-23 00:24:02,086 DEBUG [http-bio-7193-exec-9] cachesync.Broadcaster:246 : Broadcasting UPDATE, execute_output, 780d706d-72ba-81bc-4442-6151a91ebab1 2019-08-23 00:24:02,086 DEBUG [http-bio-7193-exec-9] cachesync.Broadcaster:280 : Done broadcasting UPDATE, execute_output, 780d706d-72ba-81bc-4442-6151a91ebab1 2019-08-23 00:24:02,189 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] hive.metastore:385 : Trying to connect to metastore with URI thrift://cdh-master:9083 2019-08-23 00:24:02,191 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] hive.metastore:430 : Opened a connection to metastore, current connections: 3 2019-08-23 00:24:02,202 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] hive.metastore:482 : Connected to metastore. 
2019-08-23 00:24:02,350 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] lookup.LookupSnapshotToMetaStoreStep:65 : take snapshot for table:LACUS.KYLIN_ACCOUNT 2019-08-23 00:24:02,382 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] lookup.SnapshotManager:244 : Loading snapshotTable from /table_snapshot/LACUS.KYLIN_ACCOUNT/7b38cfc3-9e01-f456-a87f-d01403c9ac77.snapshot, with loadData: false 2019-08-23 00:24:02,452 DEBUG [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] hbase.HBaseConnection:181 : Using the working dir FS for HBase: hdfs://cdh-master:8020 2019-08-23 00:24:02,706 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] hive.metastore:385 : Trying to connect to metastore with URI thrift://cdh-master:9083 2019-08-23 00:24:02,707 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] hive.metastore:430 : Opened a connection to metastore, current connections: 4 2019-08-23 00:24:02,708 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] hive.metastore:482 : Connected to metastore. 2019-08-23 00:24:03,041 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] Configuration.deprecation:1174 : mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 2019-08-23 00:24:03,066 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] mapred.FileInputFormat:249 : Total input paths to process : 1 2019-08-23 00:24:03,099 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] mapreduce.InternalUtil:155 : Initializing org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties {name=LACUS.KYLIN_ACCOUNT, numFiles=1, field.delim=,, columns.types=bigint,int,int,string,string, serialization.format=,, columns=account_id,account_buyer_level,account_seller_level,account_country,account_contact, rawDataSize=0, numRows=0, serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, COLUMN_STATS_ACCURATE=true, totalSize=200000, serialization.null.format=\N, transient_lastDdlTime=1561880270} 2019-08-23 00:24:03,548 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] mapred.FileInputFormat:249 : Total input paths to process : 1 2019-08-23 00:24:03,558 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] mapreduce.InternalUtil:155 : Initializing org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties {name=LACUS.KYLIN_ACCOUNT, numFiles=1, field.delim=,, columns.types=bigint,int,int,string,string, serialization.format=,, columns=account_id,account_buyer_level,account_seller_level,account_country,account_contact, rawDataSize=0, numRows=0, serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, COLUMN_STATS_ACCURATE=true, totalSize=200000, serialization.null.format=\N, transient_lastDdlTime=1561880270} 2019-08-23 00:24:03,704 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] lookup.SnapshotManager:244 : Loading snapshotTable from /table_snapshot/LACUS.KYLIN_ACCOUNT/7b38cfc3-9e01-f456-a87f-d01403c9ac77.snapshot, with loadData: true 2019-08-23 00:24:03,773 DEBUG [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] lookup.SnapshotManager:251 : Loaded snapshot at /table_snapshot/LACUS.KYLIN_ACCOUNT/7b38cfc3-9e01-f456-a87f-d01403c9ac77.snapshot 2019-08-23 00:24:03,779 DEBUG [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] persistence.HDFSResourceStore:98 : Writing pushdown file 
/kylin/kylin_verify_timeout/resources/table_snapshot/LACUS.KYLIN_ACCOUNT/b40c9e0d-b758-1898-7b01-27510a29443b.snapshot.temp.-1395423725 2019-08-23 00:24:03,939 DEBUG [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] persistence.HDFSResourceStore:117 : Move /kylin/kylin_verify_timeout/resources/table_snapshot/LACUS.KYLIN_ACCOUNT/b40c9e0d-b758-1898-7b01-27510a29443b.snapshot.temp.-1395423725 to /kylin/kylin_verify_timeout/resources/table_snapshot/LACUS.KYLIN_ACCOUNT/b40c9e0d-b758-1898-7b01-27510a29443b.snapshot 2019-08-23 00:24:03,946 DEBUG [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] persistence.HDFSResourceStore:65 : Writing marker for big resource /table_snapshot/LACUS.KYLIN_ACCOUNT/b40c9e0d-b758-1898-7b01-27510a29443b.snapshot 2019-08-23 00:24:04,001 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] lookup.LookupSnapshotToMetaStoreStep:68 : update snapshot path to cube metadata 2019-08-23 00:24:04,002 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] cube.CubeManager:372 : Updating cube instance 'M1C2' 2019-08-23 00:24:04,003 DEBUG [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] cachesync.CachedCrudAssist:198 : Saving CubeInstance at /cube/M1C2.json 2019-08-23 00:24:04,005 DEBUG [pool-6-thread-1] cachesync.Broadcaster:116 : Servers in the cluster: [localhost:7193] 2019-08-23 00:24:04,006 DEBUG [pool-6-thread-1] cachesync.Broadcaster:126 : Announcing new broadcast to all: BroadcastEvent{entity=cube, event=update, cacheKey=M1C2} 2019-08-23 00:24:04,010 DEBUG [http-bio-7193-exec-4] cachesync.Broadcaster:246 : Broadcasting UPDATE, cube, M1C2 2019-08-23 00:24:04,011 DEBUG [http-bio-7193-exec-4] cachesync.Broadcaster:246 : Broadcasting UPDATE, project_data, VerifyTimeout 2019-08-23 00:24:04,012 INFO [http-bio-7193-exec-4] service.CacheService:123 : cleaning cache for project VerifyTimeout (currently remove all entries) 2019-08-23 00:24:04,012 DEBUG [http-bio-7193-exec-4] cachesync.Broadcaster:280 : Done broadcasting UPDATE, project_data, VerifyTimeout 2019-08-23 00:24:04,012 DEBUG [http-bio-7193-exec-4] cachesync.Broadcaster:280 : Done broadcasting UPDATE, cube, M1C2 2019-08-23 00:24:04,017 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] execution.ExecutableManager:471 : job id:780d706d-72ba-81bc-4442-6151a91ebab1-00 from RUNNING to SUCCEED 2019-08-23 00:24:04,021 DEBUG [pool-6-thread-1] cachesync.Broadcaster:116 : Servers in the cluster: [localhost:7193] 2019-08-23 00:24:04,021 DEBUG [pool-6-thread-1] cachesync.Broadcaster:126 : Announcing new broadcast to all: BroadcastEvent{entity=execute_output, event=update, cacheKey=780d706d-72ba-81bc-4442-6151a91ebab1} 2019-08-23 00:24:04,024 INFO [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] execution.ExecutableManager:471 : job id:780d706d-72ba-81bc-4442-6151a91ebab1 from RUNNING to SUCCEED 2019-08-23 00:24:04,024 DEBUG [pool-6-thread-1] cachesync.Broadcaster:116 : Servers in the cluster: [localhost:7193] 2019-08-23 00:24:04,024 DEBUG [Scheduler 384496282 Job 780d706d-72ba-81bc-4442-6151a91ebab1-166] execution.AbstractExecutable:332 : no need to send email, user list is empty 2019-08-23 00:24:04,024 DEBUG [pool-6-thread-1] cachesync.Broadcaster:126 : Announcing new broadcast to all: BroadcastEvent{entity=execute_output, event=update, cacheKey=780d706d-72ba-81bc-4442-6151a91ebab1} ---------------- Best wishes, Xiaoxiang Yu 发件人: 青椒肉丝 <[email protected]> 答复: "[email protected]" <[email protected]> 日期: 
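As the log above shows, the job is done when its id moves "from RUNNING to SUCCEED". If you script the refresh, you can wait for that transition instead of watching the log by hand. A small sketch; the log path is an assumption for a typical installation, and the job id is the one from my log above:

import re
import time
from pathlib import Path

LOG = Path("/opt/kylin/logs/kylin.log")          # adjust to your installation
JOB_ID = "780d706d-72ba-81bc-4442-6151a91ebab1"  # id from the log above

def wait_for_success(job_id, timeout_s=600):
    """Poll kylin.log until the job reports 'from RUNNING to SUCCEED'."""
    pattern = re.compile(
        r"job id:%s from RUNNING to SUCCEED" % re.escape(job_id))
    deadline = time.time() + timeout_s
    pos = 0
    while time.time() < deadline:
        # Read only the bytes appended since the last poll.
        with LOG.open(errors="ignore") as f:
            f.seek(pos)
            chunk = f.read()
            pos = f.tell()
        if pattern.search(chunk):
            return True
        time.sleep(5)  # poll every few seconds
    return False

print("snapshot rebuilt" if wait_for_success(JOB_ID)
      else "timed out waiting for job")

Polling the job status through the jobs REST API would work just as well; the log-tailing version above is simply the closest match to "kylin.log will tell you".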
----------------
Best wishes,
Xiaoxiang Yu

From: 青椒肉丝 <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, August 22, 2019, 13:47
To: user <[email protected]>
Subject: How to build model and cube for dimension tables that update data frequently

Hi guys,

I have a dimension table in which the name field changes frequently, while the ID field does not change; ID and name correspond one to one. I have put the name column into the cube as a derived dimension. But whenever the value of the name field changes, I have to rebuild the cube to get the latest name value. So once my cube has accumulated two years of data, rebuilding it becomes expensive. Can a cube in Kylin query the latest dimension data? If it is feasible, how should I build the model and cube?

David
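For reference: once the lookup snapshot has been refreshed as suggested above, an ordinary query should already return the latest name, with no cube rebuild. A minimal sketch against Kylin's standard query endpoint (POST /kylin/api/query); the project, table, and column names here are made-up placeholders for the model described in the question, not names from a real deployment:

import base64
import json
import urllib.request

KYLIN_API = "http://localhost:7070/kylin/api"  # default port; adjust as needed
AUTH = "Basic " + base64.b64encode(b"ADMIN:KYLIN").decode()

def query(sql, project):
    """Run SQL through Kylin's query endpoint and return the result rows."""
    payload = json.dumps({"sql": sql, "project": project,
                          "limit": 10, "acceptPartial": False}).encode()
    req = urllib.request.Request(
        KYLIN_API + "/query", data=payload, method="POST",
        headers={"Authorization": AUTH, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]

# DIM_NAME, ID, and NAME are hypothetical stand-ins for the frequently
# changing dimension table from the question; MY_PROJECT likewise.
rows = query("SELECT ID, NAME FROM DIM_NAME WHERE ID = 42", "MY_PROJECT")
print(rows)  # should show the refreshed name after the snapshot job succeeds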
