hit-lacus edited a comment on pull request #1351:
URL: https://github.com/apache/kylin/pull/1351#issuecomment-671059337


   Before we scale up the topic partitions, we should do the following steps:
   1. Disable the cube, so that all consumption tasks are cancelled.
   2. Use the REST API 
`http://${KYLIN_INSTANCE_IP}:${KYLIN_INSTANCE_PORT}/kylin/cubes/view/${CUBE_NAME}/instancejson`
 to check the `CubeInstance.json`. You will find that each READY segment has a 
property named `stream_source_checkpoint`. Here is part of its content: 
   ```json
   {
         "uuid": "aab58181-29f7-0593-d7f8-ca9d9b97d49b",
         "name": "20200809200000_20200809210000",
         "storage_location_identifier": "APACHE:REALTIME_OLAP_ISH2P3HM30",
         "date_range_start": 1597003200000,
         "date_range_end": 1597006800000,
         "source_offset_start": 0,
         "source_offset_end": 0,
         "status": "READY",
         "size_kb": 102984,
         "is_merged": false,
         "estimate_ratio": null,
         "input_records": 118216,
         "input_records_size": 0,
         "last_build_time": 1596982095554,
         "last_build_job_id": "8f3660b8-72f7-7aac-e3bd-1e06ea30d8e2",
         "create_time_utc": 1596981735548,
         "cuboid_shard_nums": {},
         "total_shards": 1,
         "blackout_cuboids": [],
         "binary_signature": null,
         "dictionaries": {
           "USERACTIONSTREAM.DEVICE_BRAND": 
"/dict/APACHE.USERACTIONSTREAM/DEVICE_BRAND/afb80e63-4fef-d575-3483-1e0314bf4bef.dict",
           "USERACTIONSTREAM.DEVIDE_TYPE": 
"/dict/APACHE.USERACTIONSTREAM/DEVIDE_TYPE/61bf9051-3cdd-ec82-bc0f-1bb3226bb411.dict",
           "USERACTIONSTREAM.LOCATION_CITY": 
"/dict/APACHE.USERACTIONSTREAM/LOCATION_CITY/292ce446-62a1-0b12-e4d0-13bc249b0dbe.dict",
           "USERACTIONSTREAM.PAGE_ID": 
"/dict/APACHE.USERACTIONSTREAM/PAGE_ID/12fa188b-0f89-db3d-560a-8aea5b970349.dict",
           "USERACTIONSTREAM.NETWORK_TYPE": 
"/dict/APACHE.USERACTIONSTREAM/NETWORK_TYPE/99d38dea-25ef-c03d-73b7-19a6fda2ce4c.dict",
           "USERACTIONSTREAM.STR_MINUTE_SECOND": 
"/dict/APACHE.USERACTIONSTREAM/STR_MINUTE_SECOND/2b23e9aa-dee0-b88f-7e3e-0a3d74d89f89.dict",
           "USERACTIONSTREAM.ACT_TYPE": 
"/dict/APACHE.USERACTIONSTREAM/ACT_TYPE/e77f008d-6bd8-bc1f-ba94-ff62764c3e14.dict",
           "USERACTIONSTREAM.UID": 
"/dict/APACHE.USERACTIONSTREAM/UID/665546b1-424a-fc42-a35b-1a58fcd1fb5f.dict"
         },
         "snapshots": null,
         "rowkey_stats": [
           [
             "ACT_TYPE",
             10,
             1
           ],
           [
             "NETWORK_TYPE",
             4,
             1
           ],
           [
             "LOCATION_CITY",
             7,
             1
           ],
           [
             "STR_MINUTE_SECOND",
             3600,
             2
           ],
           [
             "PAGE_ID",
             50,
             1
           ],
           [
             "DEVICE_BRAND",
             5,
             1
           ],
           [
             "DEVIDE_TYPE",
             60,
             1
           ],
           [
             "UID",
             19543,
             2
           ]
         ],
         "stream_source_checkpoint": 
"{\"0\":363171,\"1\":363198,\"2\":363249,\"3\":363171,\"4\":363199,\"5\":363250,\"6\":363170,\"7\":363196,\"8\":363250,\"9\":363170}"
       }
   ```
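
To make step 2 less manual, the checkpoint of every READY segment could be extracted with a few lines of Python. This is only a sketch: the helper names `instance_json_url` and `ready_checkpoints` are hypothetical, it assumes the response carries the segments in a top-level `segments` array, and the sample below is a truncated copy of the segment shown above. Fetching the URL (and any authentication it needs) is left out.

```python
import json


def instance_json_url(host: str, port: int, cube_name: str) -> str:
    """Build the instancejson URL quoted in step 2 (hypothetical helper)."""
    return f"http://{host}:{port}/kylin/cubes/view/{cube_name}/instancejson"


def ready_checkpoints(cube_instance: dict) -> dict:
    """Map each READY segment name to its checkpoint as {partition: offset}.

    Assumes the segments sit under a top-level "segments" key.
    """
    result = {}
    for seg in cube_instance.get("segments", []):
        raw = seg.get("stream_source_checkpoint")
        if seg.get("status") != "READY" or not raw:
            continue
        # The property is a JSON document stored inside a string, so it
        # has to be decoded a second time.
        result[seg["name"]] = {int(p): int(o) for p, o in json.loads(raw).items()}
    return result


# Truncated sample based on the segment above (only three partitions kept).
sample = {
    "segments": [
        {
            "name": "20200809200000_20200809210000",
            "status": "READY",
            "stream_source_checkpoint": "{\"0\":363171,\"1\":363198,\"2\":363249}",
        }
    ]
}
checkpoints = ready_checkpoints(sample)
print(checkpoints)
```

The partition keys are converted to integers so that they can be compared directly with positions taken from other sources.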
   
   3. Check `${KYLIN_RECEIVER_HOME}/logs/kylin_streaming_receiver.log`; you 
will find output like the following:
   
   ```
   2020-08-09 22:34:05,381 INFO  [UserAnalysisCube_channel] 
storage.StreamingSegmentManager:645 : Print check point for cube 
UserAnalysisCube 
,CheckPoint{sourceConsumePosition='{"0":381733,"1":381763,"2":381820,"3":381733,"4":381764,"5":381820,"6":381732,"7":381760,"8":381821,"9":381733}',
 persistedIndexes={1597006800000=13, 1597010400000=8}, 
longLatencyInfo=LongLatencyInfo{longLatencyEventCnts={20200808000000_20200808010000=3,
 20200808060000_20200808070000=2, 20200809000000_20200809010000=2, 
20200809060000_20200809070000=2}, totalLongLatencyEventCnt=9}, 
segmentSourceStartPosition={1597006800000={"0":363171,"1":363198,"2":363249,"3":363171,"4":363199,"5":363250,"6":363170,"7":363196,"8":363250,"9":363170},
 
1597010400000={"0":375013,"1":375040,"2":375099,"3":375013,"4":375041,"5":375099,"6":375012,"7":375038,"8":375100,"9":375013}},
 checkPointTime=1596983645381, totalCount=3817689, checkPointCount=5801}
   ```
   
   This log shows that the data ingested and indexed on the receiver side is 
checkpointed at the following position (the `segmentSourceStartPosition` of 
segment `1597010400000`): 
   
   ```json
   {
       "0":375013,
       "1":375040,
       "2":375099,
       "3":375013,
       "4":375041,
       "5":375099,
       "6":375012,
       "7":375038,
       "8":375100,
       "9":375013
   }
   ```
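
The positions in a `Print check point` line can also be pulled out programmatically rather than read by eye. The sketch below is an assumption-laden example, not part of Kylin: `parse_checkpoint_line` is a hypothetical helper, it expects the whole log entry on a single line (unlike the wrapped copy above), and the sample line is a shortened version of that entry with only two partitions.

```python
import json
import re


def _to_int_keys(raw: str) -> dict:
    """Decode a {"partition": offset} JSON object into {int: int}."""
    return {int(p): int(o) for p, o in json.loads(raw).items()}


def parse_checkpoint_line(line: str) -> dict:
    """Extract the consumed position and per-segment start positions."""
    consumed = re.search(r"sourceConsumePosition='(\{[^{}]*\})'", line)
    # Only look at the text after segmentSourceStartPosition= so that
    # other fields such as persistedIndexes cannot match.
    _, _, tail = line.partition("segmentSourceStartPosition=")
    starts = {
        int(segment): _to_int_keys(position)
        for segment, position in re.findall(r"(\d+)=(\{[^{}]*\})", tail)
    }
    return {
        "consumed": _to_int_keys(consumed.group(1)) if consumed else {},
        "segment_starts": starts,
    }


# Shortened sample of the log entry above (two partitions only).
line = (
    "CheckPoint{sourceConsumePosition='{\"0\":381733,\"1\":381763}', "
    "persistedIndexes={1597006800000=13, 1597010400000=8}, "
    "segmentSourceStartPosition={1597006800000={\"0\":363171,\"1\":363198}, "
    "1597010400000={\"0\":375013,\"1\":375040}}, "
    "checkPointTime=1596983645381}"
)
parsed = parse_checkpoint_line(line)
print(parsed["segment_starts"][1597010400000])
```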
   
   4. When the cube is disabled, the data ingested and indexed on the receiver 
side is removed, so after scaling up we expect the receiver to continue its 
consumption after the following position (the `stream_source_checkpoint` from 
step 2):
   ```json
   {
       "0":363171,
       "1":363198,
       "2":363249,
       "3":363171,
       "4":363199,
       "5":363250,
       "6":363170,
       "7":363196,
       "8":363250,
       "9":363170
   }
   ```
   
   5. So let's check whether this is correct.
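
One way to do the check in step 5 is to compare the expected position with the position the receiver actually resumes from after the scale-up. The following is a minimal sketch under stated assumptions: `compare_positions` is a hypothetical helper, and the `actual` values are made up for illustration, not taken from a real run. Which offset a newly added partition should start from is not covered here, so new partitions are only reported.

```python
def compare_positions(expected: dict, actual: dict) -> dict:
    """Compare two {partition: offset} maps.

    Returns the partitions whose offsets differ, the partitions missing
    from `actual`, and the partitions that exist only in `actual`
    (for example those added by the scale-up).
    """
    mismatched = {
        p: (expected[p], actual[p])
        for p in expected
        if p in actual and actual[p] != expected[p]
    }
    missing = sorted(p for p in expected if p not in actual)
    added = sorted(p for p in actual if p not in expected)
    return {"mismatched": mismatched, "missing": missing, "added": added}


# Expected offsets are the first two partitions of the checkpoint above;
# the "actual" offsets are hypothetical.
expected = {0: 363171, 1: 363198}
actual = {0: 363171, 1: 363200, 2: 0}
report = compare_positions(expected, actual)
print(report)
```

An empty `mismatched` and `missing` would mean that every original partition resumes from the checkpointed offset.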


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

