vernedeng commented on code in PR #1049:
URL: https://github.com/apache/inlong-website/pull/1049#discussion_r1798752457


##########
docs/modules/sort/log_report.md:
##########
@@ -0,0 +1,229 @@
+---
+title: OpenTelemetry Log Report
+sidebar_position: 6
+---
+
+## Overview
+
+As `InLong Sort` runs on different `Task Manager` nodes of `Apache Flink`, each node stores its logs independently, and viewing the logs node by node is inefficient. To solve this, a centralized log management solution based on [OpenTelemetry](https://opentelemetry.io/) is provided, which allows users to manage Flink logs efficiently.
+
+InLong Sort can integrate the log reporting function into every `Connector`. The log processing flow is shown in the figure below. The logs are reported through [OpenTelemetry](https://opentelemetry.io/), collected and processed by [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/), and then sent to [Grafana Loki](https://grafana.com/oss/loki/) for centralized management.
+
+
+
+
+
+## Integrating Log Reporting for Connector
+
+InLong Sort wraps the [OpenTelemetryLogger](https://github.com/apache/inlong/blob/6e78dd2de8e917b9fc17a18d5e990b43089bb804/inlong-sort/sort-flink/base/src/main/java/org/apache/inlong/sort/base/util/OpenTelemetryLogger.java) class, which provides a `Builder` to help users quickly configure an `OpenTelemetryLogger` and can enable or disable log reporting by calling its `install` or `uninstall` functions. With the help of `OpenTelemetryLogger`, the connector can report logs more easily. The following steps describe how to use the `OpenTelemetryLogger` class to integrate log reporting for a connector based on the [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface#FLIP27:RefactorSourceInterface-Motivation) standard:
+
+1. Construct an `OpenTelemetryLogger` object using `OpenTelemetryLogger.Builder()` in the constructor of the connector's `SourceReader` class.
+2. Call the `install()` method of the `OpenTelemetryLogger` object in the `start()` function of `SourceReader`.
+3. Call the `uninstall()` method of the `OpenTelemetryLogger` object in the `close()` function of `SourceReader`.
+
+The example is:
+```java
+import org.apache.inlong.sort.base.util.OpenTelemetryLogger;
+
+public class XXXSourceReader<T>
+{
+
+    private static final Logger LOG = LoggerFactory.getLogger(XXXSourceReader.class);
+
+    private final OpenTelemetryLogger openTelemetryLogger;
+
+    public XXXSourceReader() {
+        ...
+        // initialize OpenTelemetryLogger
+        this.openTelemetryLogger = new OpenTelemetryLogger.Builder()
+                .setServiceName(this.getClass().getSimpleName())
+                .setLocalHostIp(this.context.getLocalHostName()).build();
+    }
+
+    @Override
+    public void start() {
+        openTelemetryLogger.install(); // start log reporting
+        ...
+    }
+
+    @Override
+    public void close() throws Exception {
+        openTelemetryLogger.uninstall(); // stop log reporting
+        super.close();
+    }
+
+    ...
+}
+```
+The `OpenTelemetryLogger` currently provides the following configuration items:
+
+| Configuration | Description | Default value |
+| ----------- | -------------------- | ------------- |
+| `endpoint` | `OpenTelemetry Collector` address. If not specified, it is read from the `OTEL_EXPORTER_ENDPOINT` environment variable; if the environment variable is not configured either, the default value is used. | `localhost:4317` |
+| `serviceName` | `OpenTelemetry`'s service name, which can be used to distinguish between different connectors. | `unnamed_service` |
+| `layout` | `Log4j2`'s log format, which is an instance of the `PatternLayout` class | `%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n` |
+| `logLevel` | Log level | `Level.INFO` |
+| `localHostIp` | IP of the `Flink` node, available in `SourceReader` via `this.context.getLocalHostName()`. | `null` |
+
+## Usage
+
+In addition to integrating the log reporting function for the Connector, you also need to add three Docker containers (`opentelemetry-collector`, `grafana loki`, `grafana`), and configure the `OTEL_EXPORTER_ENDPOINT` environment variable for the `Flink` container. The `docker-compose.yml` file is shown below:
+
+
+```yml
+# flink jobmanager
+jobmanager:
+  image: apache/flink:1.15-scala_2.12
+  container_name: jobmanager
+  environment:
+    - |
+      FLINK_PROPERTIES=
+      jobmanager.rpc.address: jobmanager
+    - OTEL_EXPORTER_ENDPOINT=logcollector:4317
+  ports:
+    - "8081:8081"
+  command: jobmanager
+
+# flink taskmanager
+taskmanager:
+  image: apache/flink:1.15-scala_2.12
+  container_name: taskmanager
+  environment:
+    - |
+      FLINK_PROPERTIES=
+      jobmanager.rpc.address: jobmanager
+      taskmanager.numberOfTaskSlots: 2
+    - OTEL_EXPORTER_ENDPOINT=logcollector:4317
+  command: taskmanager
+
+# opentelemetry collector
+logcollector:
+  image: otel/opentelemetry-collector-contrib:0.110.0
+  container_name: logcollector
+  volumes:
+    - ./log-system/otel-config.yaml:/otel-config.yaml
+  command: [ "--config=/otel-config.yaml"]
+  ports:
+    - "4317:4317"
+
+# grafana loki
+loki:
+  image: grafana/loki:3.0.0
+  ports:
+    - "3100:3100"
+  volumes:
+    - ./log-system/loki.yaml:/etc/loki/local-config.yaml
+  command: -config.file=/etc/loki/local-config.yaml
+
+# grafana
+grafana:
+  environment:
+    - GF_PATHS_PROVISIONING=/etc/grafana/provisioning
+    - GF_AUTH_ANONYMOUS_ENABLED=true
+    - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
+  entrypoint:
+    - sh
+    - -euc
+    - |
+      mkdir -p /etc/grafana/provisioning/datasources
+      cat <<EOF > /etc/grafana/provisioning/datasources/ds.yaml

Review Comment:
   cat /etc/grafana/provisioning/datasources/ds.yaml



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
