hit-lacus edited a comment on pull request #1351: URL: https://github.com/apache/kylin/pull/1351#issuecomment-671059337
Before we scale up the topic partitions, we should do the following steps:

1. Disable the cube, so that all consumption tasks are cancelled.
2. Use the REST API `http://${KYLIN_INSTANCE_IP}7236/kylin/cubes/view/${CUBE_NAME}/instancejson` to check the `CubeInstance.json`. You will find that each READY segment has a property named `stream_source_checkpoint`. Here is part of its content:

   ```json
   {
     "uuid": "aab58181-29f7-0593-d7f8-ca9d9b97d49b",
     "name": "20200809200000_20200809210000",
     "storage_location_identifier": "APACHE:REALTIME_OLAP_ISH2P3HM30",
     "date_range_start": 1597003200000,
     "date_range_end": 1597006800000,
     "source_offset_start": 0,
     "source_offset_end": 0,
     "status": "READY",
     "size_kb": 102984,
     "is_merged": false,
     "estimate_ratio": null,
     "input_records": 118216,
     "input_records_size": 0,
     "last_build_time": 1596982095554,
     "last_build_job_id": "8f3660b8-72f7-7aac-e3bd-1e06ea30d8e2",
     "create_time_utc": 1596981735548,
     "cuboid_shard_nums": {},
     "total_shards": 1,
     "blackout_cuboids": [],
     "binary_signature": null,
     "dictionaries": {
       "USERACTIONSTREAM.DEVICE_BRAND": "/dict/APACHE.USERACTIONSTREAM/DEVICE_BRAND/afb80e63-4fef-d575-3483-1e0314bf4bef.dict",
       "USERACTIONSTREAM.DEVIDE_TYPE": "/dict/APACHE.USERACTIONSTREAM/DEVIDE_TYPE/61bf9051-3cdd-ec82-bc0f-1bb3226bb411.dict",
       "USERACTIONSTREAM.LOCATION_CITY": "/dict/APACHE.USERACTIONSTREAM/LOCATION_CITY/292ce446-62a1-0b12-e4d0-13bc249b0dbe.dict",
       "USERACTIONSTREAM.PAGE_ID": "/dict/APACHE.USERACTIONSTREAM/PAGE_ID/12fa188b-0f89-db3d-560a-8aea5b970349.dict",
       "USERACTIONSTREAM.NETWORK_TYPE": "/dict/APACHE.USERACTIONSTREAM/NETWORK_TYPE/99d38dea-25ef-c03d-73b7-19a6fda2ce4c.dict",
       "USERACTIONSTREAM.STR_MINUTE_SECOND": "/dict/APACHE.USERACTIONSTREAM/STR_MINUTE_SECOND/2b23e9aa-dee0-b88f-7e3e-0a3d74d89f89.dict",
       "USERACTIONSTREAM.ACT_TYPE": "/dict/APACHE.USERACTIONSTREAM/ACT_TYPE/e77f008d-6bd8-bc1f-ba94-ff62764c3e14.dict",
       "USERACTIONSTREAM.UID": "/dict/APACHE.USERACTIONSTREAM/UID/665546b1-424a-fc42-a35b-1a58fcd1fb5f.dict"
     },
     "snapshots": null,
     "rowkey_stats": [
       [ "ACT_TYPE", 10, 1 ],
       [ "NETWORK_TYPE", 4, 1 ],
       [ "LOCATION_CITY", 7, 1 ],
       [ "STR_MINUTE_SECOND", 3600, 2 ],
       [ "PAGE_ID", 50, 1 ],
       [ "DEVICE_BRAND", 5, 1 ],
       [ "DEVIDE_TYPE", 60, 1 ],
       [ "UID", 19543, 2 ]
     ],
     "stream_source_checkpoint": "{\"0\":363171,\"1\":363198,\"2\":363249,\"3\":363171,\"4\":363199,\"5\":363250,\"6\":363170,\"7\":363196,\"8\":363250,\"9\":363170}"
   }
   ```

3. Check `${KYLIN_RECEIVER_HOME}/logs/kylin_streaming_receiver.log`. You will find output like the following:

   ```
   2020-08-09 22:34:05,381 INFO [UserAnalysisCube_channel] storage.StreamingSegmentManager:645 : Print check point for cube UserAnalysisCube ,CheckPoint{sourceConsumePosition='{"0":381733,"1":381763,"2":381820,"3":381733,"4":381764,"5":381820,"6":381732,"7":381760,"8":381821,"9":381733}', persistedIndexes={1597006800000=13, 1597010400000=8}, longLatencyInfo=LongLatencyInfo{longLatencyEventCnts={20200808000000_20200808010000=3, 20200808060000_20200808070000=2, 20200809000000_20200809010000=2, 20200809060000_20200809070000=2}, totalLongLatencyEventCnt=9}, segmentSourceStartPosition={1597006800000={"0":363171,"1":363198,"2":363249,"3":363171,"4":363199,"5":363250,"6":363170,"7":363196,"8":363250,"9":363170}, 1597010400000={"0":375013,"1":375040,"2":375099,"3":375013,"4":375041,"5":375099,"6":375012,"7":375038,"8":375100,"9":375013}}, checkPointTime=1596983645381, totalCount=3817689, checkPointCount=5801}
   ```

   These logs indicate that the data ingested and indexed on the receiver side is checkpointed at position:

   ```json
   { "0":375013, "1":375040, "2":375099, "3":375013, "4":375041, "5":375099, "6":375012, "7":375038, "8":375100, "9":375013 }
   ```

4. When the cube is disabled, the data ingested and indexed on the receiver side is removed. So after scaling up, we expect the receiver to continue its consumption from the following position, which is the `stream_source_checkpoint` of the READY segment shown in step 2:

   ```json
   { "0":363171, "1":363198, "2":363249, "3":363171, "4":363199, "5":363250, "6":363170, "7":363196, "8":363250, "9":363170 }
   ```

5. So let's check whether this is correct.
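The comparison described in the steps above can be scripted. The following is a minimal Python sketch, not part of Kylin: the helper names (`parse_checkpoint`, `parse_consume_position`, `diff_positions`) are hypothetical, and it assumes only what the comment shows, namely that `stream_source_checkpoint` is a JSON-encoded string mapping partition to offset and that the receiver log prints `sourceConsumePosition='{...}'`.

```python
import json
import re


def parse_checkpoint(segment):
    """Decode a segment's JSON-encoded `stream_source_checkpoint` into {partition: offset}."""
    raw = segment.get("stream_source_checkpoint")
    if not raw:
        return {}
    return {int(p): int(o) for p, o in json.loads(raw).items()}


def parse_consume_position(log_line):
    """Extract sourceConsumePosition from a receiver checkpoint log line (format as quoted above)."""
    match = re.search(r"sourceConsumePosition='(\{[^']*\})'", log_line)
    if match is None:
        return {}
    return {int(p): int(o) for p, o in json.loads(match.group(1)).items()}


def diff_positions(expected, actual):
    """Return {partition: (expected, actual)} for every partition whose offsets differ."""
    partitions = sorted(set(expected) | set(actual))
    return {
        p: (expected.get(p), actual.get(p))
        for p in partitions
        if expected.get(p) != actual.get(p)
    }


# Segment trimmed to the fields used here; values copied from the comment.
segment = {
    "name": "20200809200000_20200809210000",
    "status": "READY",
    "stream_source_checkpoint": (
        '{"0":363171,"1":363198,"2":363249,"3":363171,"4":363199,'
        '"5":363250,"6":363170,"7":363196,"8":363250,"9":363170}'
    ),
}

# Shortened version of the log line quoted in step 3.
log_line = (
    "Print check point for cube UserAnalysisCube ,CheckPoint{sourceConsumePosition="
    "'{\"0\":381733,\"1\":381763,\"2\":381820,\"3\":381733,\"4\":381764,"
    "\"5\":381820,\"6\":381732,\"7\":381760,\"8\":381821,\"9\":381733}', "
    "persistedIndexes={1597006800000=13, 1597010400000=8}}"
)

expected = parse_checkpoint(segment)
consumed = parse_consume_position(log_line)

# Partitions where the receiver had consumed past the segment checkpoint
# before the cube was disabled.
for partition, (exp, act) in diff_positions(expected, consumed).items():
    print(f"partition {partition}: segment checkpoint {exp}, receiver position {act}")
```

After re-enabling the cube, the same `diff_positions` call can be run against the position the receiver reports when it resumes. Partitions added by the scale-up have no entry in the old checkpoint, so they show up with `None` as the expected offset.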
