PeterZh6 opened a new issue, #11129:
URL: https://github.com/apache/inlong/issues/11129

   ### Description
   
   **Parent Issue:** [[Feature][Umbrella] Tencent Rhino-bird: Sort metric 
monitoring and reporting #10961](https://github.com/apache/inlong/issues/10961)
   
   **Description:**
   
   This feature adds metric instrumentation to improve observability in the 
InLong Sort Flink connectors, specifically the Postgres-CDC connector. The new 
metrics in `org.apache.inlong.sort.base.metric.SourceExactlyMetric` cover 
deserialization, snapshot state, and checkpoint completion.
   
   ### Key Metric Categories:
   1. **Serialization/Deserialization Metrics:**
      - **Success/Error Counters:** Track successful and failed deserialization 
attempts (`numDeserializeSuccess`, `numDeserializeError`).
      - **Latency Gauges:** Measure the time taken for both serialization and 
deserialization (`deserializeTimeLag`, `serializeTimeLag`).
   
   2. **SnapshotState Metrics:**
      - **Creation/Error Counters:** Monitor the number of snapshots created 
and errors encountered during snapshot operations (`numSnapshotCreate`, 
`numSnapshotError`).
   
   3. **NotifyComplete Metrics:**
      - **Completed Snapshots Counter:** Track the number of completed 
checkpoints (`numCompletedSnapshots`).
      - **Snapshot-to-Checkpoint Latency:** Record the time between snapshot 
creation and checkpoint completion (`snapshotToCheckpointTimeLag`).
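
   As a rough illustration of how the counters and gauges listed above could 
be held together, here is a minimal plain-Java sketch. The class name 
`SourceMetricSketch`, the two `record...` helpers, and the millisecond unit are 
assumptions made for this example; only the field names come from the list 
above, and this is not the actual `SourceExactlyMetric` class.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for illustration; not the real SourceExactlyMetric.
class SourceMetricSketch {
    // Counters: values only ever increase.
    final AtomicLong numDeserializeSuccess = new AtomicLong();
    final AtomicLong numDeserializeError = new AtomicLong();
    final AtomicLong numSnapshotCreate = new AtomicLong();
    final AtomicLong numSnapshotError = new AtomicLong();
    final AtomicLong numCompletedSnapshots = new AtomicLong();

    // Gauges: hold the most recent observation (milliseconds assumed).
    volatile long deserializeTimeLag;
    volatile long serializeTimeLag;
    volatile long snapshotToCheckpointTimeLag;

    // Hypothetical helper: count one successful deserialization and
    // remember how long it took.
    void recordDeserializeSuccess(long elapsedMillis) {
        numDeserializeSuccess.incrementAndGet();
        deserializeTimeLag = elapsedMillis;
    }

    // Hypothetical helper: count one failed deserialization.
    void recordDeserializeError() {
        numDeserializeError.incrementAndGet();
    }
}
```

   Counters are modelled as `AtomicLong` so concurrent increments are not lost, 
while gauges only keep the latest value.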
   
   ### Implementation Details:
   The metrics are integrated into the Postgres-CDC connector (located in 
`inlong-sort/sort-flink/sort-flink-v1.15/sort-connectors/postgres-cdc`) and can 
be adapted for use in other connectors. The changes instrument key methods such 
as `deserialize()`, `snapshotState()`, and `notifyCheckpointComplete()` to 
collect latency and error data.
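
   One way the snapshot and checkpoint hooks might feed 
`snapshotToCheckpointTimeLag` is sketched below. The class 
`CheckpointLagSketch`, the injected clock, and the per-checkpoint map are 
assumptions for illustration; the method names only mirror the hooks mentioned 
above, and the class does not implement any Flink interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical tracker for illustration; not the connector's actual code.
class CheckpointLagSketch {
    private final LongSupplier clock; // returns the current time in milliseconds
    private final Map<Long, Long> snapshotStartMillis = new ConcurrentHashMap<>();

    final AtomicLong numSnapshotCreate = new AtomicLong();
    final AtomicLong numCompletedSnapshots = new AtomicLong();
    volatile long snapshotToCheckpointTimeLag;

    CheckpointLagSketch(LongSupplier clock) {
        this.clock = clock;
    }

    // Called when a snapshot is taken: remember when it started.
    void snapshotState(long checkpointId) {
        snapshotStartMillis.put(checkpointId, clock.getAsLong());
        numSnapshotCreate.incrementAndGet();
    }

    // Called when the checkpoint is confirmed: compute the lag since the
    // matching snapshot. Unknown checkpoint ids are ignored in this sketch.
    void notifyCheckpointComplete(long checkpointId) {
        Long start = snapshotStartMillis.remove(checkpointId);
        if (start != null) {
            snapshotToCheckpointTimeLag = clock.getAsLong() - start;
            numCompletedSnapshots.incrementAndGet();
        }
    }
}
```

   In production the clock would typically be `System::currentTimeMillis`; 
injecting it keeps the sketch easy to test with fixed timestamps.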
   
   Together, these metrics expose serialization and deserialization latency, 
error counts, and checkpoint progress for the connector.
   
   
   
   ### Use case
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes, I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

