844700118 opened a new issue #8038: URL: https://github.com/apache/skywalking/issues/8038
### Search before asking - [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar feature requirement. ### Description **1. The sub-module "cluster" "node" under the k8s module of the dashboard has data, which is normal, but the sub-module "service" has no data displayed. It may be a problem with the OpenTelemetry Collector configuration, but I don't know where the problem is. Ask for help.** **2.Server error log [root@k8s-master ~/apache-skywalking-apm-bin-es7]#tail -f logs/skywalking-oap-server.log** ...... 2021-10-27 19:00:32,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Pod-1] INFO [] - class io.kubernetes.client.openapi.models.V1Pod#Start listing and watching... 2021-10-27 19:00:32,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Service-1] INFO [] - class io.kubernetes.client.openapi.models.V1Service#Start listing and watching... 2021-10-27 19:00:33,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Pod-1] INFO [] - class io.kubernetes.client.openapi.models.V1Pod#Start listing and watching... 2021-10-27 19:00:33,988 - io.kubernetes.client.informer.cache.ReflectorRunnable - 79 [controller-reflector-io.kubernetes.client.openapi.models.V1Service-1] INFO [] - class io.kubernetes.client.openapi.models.V1Service#Start listing and watching... 2021-10-27 19:00:34,463 - org.apache.skywalking.oap.meter.analyzer.dsl.Expression - 88 [grpcServerPool-1-thread-17] ERROR [] - failed to run "(100 - ((node_memory_SwapFree_bytes * 100) / node_memory_SwapTotal_bytes)).tag({tags -> tags.node_identifier_host_name = 'vm::' + tags.node_identifier_host_name}).service(['node_identifier_host_name'])" java.lang.IllegalArgumentException: null at com.google.common.base.Preconditions.checkArgument(Preconditions.java:128) ~[guava-28.1-jre.jar:?] at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.build(SampleFamily.java:78) ~[meter-analyzer-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.newValue(SampleFamily.java:487) ~[meter-analyzer-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily.div(SampleFamily.java:193) ~[meter-analyzer-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.meter.analyzer.dsl.SampleFamily$div$9.call(Unknown Source) ~[?:?] at Script1.run(Script1.groovy:1) ~[?:?] at org.apache.skywalking.oap.meter.analyzer.dsl.Expression.run(Expression.java:77) ~[meter-analyzer-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.meter.analyzer.Analyzer.analyse(Analyzer.java:115) ~[meter-analyzer-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.meter.analyzer.MetricConvert.toMeter(MetricConvert.java:73) ~[meter-analyzer-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.meter.analyzer.prometheus.PrometheusMetricConverter.toMeter(PrometheusMetricConverter.java:84) ~[meter-analyzer-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.lambda$onNext$6(OCMetricHandler.java:79) ~[otel-receiver-plugin-8.7.0.jar:8.7.0] at java.util.ArrayList.forEach(ArrayList.java:1259) [?:1.8.0_262] at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.onNext(OCMetricHandler.java:79) [otel-receiver-plugin-8.7.0.jar:8.7.0] at org.apache.skywalking.oap.server.receiver.otel.oc.OCMetricHandler$1.onNext(OCMetricHandler.java:61) [otel-receiver-plugin-8.7.0.jar:8.7.0] at io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:249) [grpc-stub-1.32.1.jar:1.32.1] at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309) [grpc-core-1.32.1.jar:1.32.1] at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292) [grpc-core-1.32.1.jar:1.32.1] at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782) [grpc-core-1.32.1.jar:1.32.1] at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [grpc-core-1.32.1.jar:1.32.1] at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [grpc-core-1.32.1.jar:1.32.1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_262] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_262] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262] **3.k8s indicator monitoring is normal [root@master131 ~]# kubectl logs -f -n kube-system kube-state-metrics-0** I1027 10:01:11.984341 1 main.go:106] Using default resources I1027 10:01:12.128159 1 main.go:118] Using all namespace I1027 10:01:12.128166 1 main.go:139] metric allow-denylisting: Excluding the following lists that were on denylist: W1027 10:01:12.128948 1 client_config.go:615] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work. I1027 10:01:12.212866 1 main.go:241] Testing communication with server I1027 10:01:12.303482 1 main.go:246] Running with Kubernetes cluster version: v1.20. git version: v1.20.2. git tree state: clean. commit: faecb196815e248d3ecfb03c680a4507229c2a56. platform: linux/amd64 I1027 10:01:12.303518 1 main.go:248] Communication with server successful I1027 10:01:12.303837 1 main.go:204] Starting metrics server: [::]:8080 I1027 10:01:12.303864 1 metrics_handler.go:102] Autosharding enabled with pod=kube-state-metrics-0 pod_namespace=kube-system I1027 10:01:12.303886 1 metrics_handler.go:103] Auto detecting sharding settings. I1027 10:01:12.303881 1 main.go:193] Starting kube-state-metrics self metrics server: [::]:8081 I1027 10:01:12.304116 1 main.go:64] levelinfomsgTLS is disabled.http2false I1027 10:01:12.304203 1 main.go:64] levelinfomsgTLS is disabled.http2false I1027 10:01:12.363206 1 builder.go:190] Active resources: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments **4. Data collection OpenTelemetry is normal [root@master131 ~]# kubectl logs -f otel-collector-7bb5b98564-stvdg** 2021-10-27T11:34:43.650Z info service/collector.go:262 Starting otelcol... {"Version": "v0.29.0", "NumCPU": 28} 2021-10-27T11:34:43.657Z info service/collector.go:322 Using memory ballast {"MiBs": 683} 2021-10-27T11:34:43.657Z info service/collector.go:170 Setting up own telemetry... 2021-10-27T11:34:43.659Z info service/telemetry.go:99 Serving Prometheus metrics {"address": ":8888", "level": 0, "service.instance.id": "9903e31e-d72f-4222-a2a8-32c94a0836db"} 2021-10-27T11:34:43.659Z info service/collector.go:205 Loading configuration... 2021-10-27T11:34:43.662Z info service/collector.go:221 Applying configuration... 2021-10-27T11:34:43.662Z info builder/exporters_builder.go:274 Exporter was built. {"kind": "exporter", "exporter": "opencensus"} 2021-10-27T11:34:43.662Z info builder/exporters_builder.go:274 Exporter was built. {"kind": "exporter", "exporter": "logging"} 2021-10-27T11:34:43.662Z info builder/pipelines_builder.go:204 Pipeline was built. {"pipeline_name": "metrics", "pipeline_datatype": "metrics"} 2021-10-27T11:34:43.662Z info builder/receivers_builder.go:230 Receiver was built. {"kind": "receiver", "name": "prometheus", "datatype": "metrics"} 2021-10-27T11:34:43.662Z info service/service.go:137 Starting extensions... 2021-10-27T11:34:43.662Z info builder/extensions_builder.go:53 Extension is starting... {"kind": "extension", "name": "health_check"} 2021-10-27T11:34:43.662Z info healthcheckextension/healthcheckextension.go:41 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Port":0,"TCPAddr":{"Endpoint":"0.0.0.0:13133"}}} 2021-10-27T11:34:43.662Z info builder/extensions_builder.go:59 Extension started. {"kind": "extension", "name": "health_check"} 2021-10-27T11:34:43.662Z info builder/extensions_builder.go:53 Extension is starting... {"kind": "extension", "name": "zpages"} 2021-10-27T11:34:43.662Z info zpagesextension/zpagesextension.go:42 Register Host's zPages {"kind": "extension", "name": "zpages"} 2021-10-27T11:34:43.662Z info zpagesextension/zpagesextension.go:55 Starting zPages extension {"kind": "extension", "name": "zpages", "config": {"TCPAddr":{"Endpoint":"localhost:55679"}}} 2021-10-27T11:34:43.662Z info builder/extensions_builder.go:59 Extension started. {"kind": "extension", "name": "zpages"} 2021-10-27T11:34:43.662Z info service/service.go:182 Starting exporters... 2021-10-27T11:34:43.662Z info builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "opencensus"} 2021-10-27T11:34:43.662Z info builder/exporters_builder.go:97 Exporter started. {"kind": "exporter", "name": "opencensus"} 2021-10-27T11:34:43.662Z info builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "logging"} 2021-10-27T11:34:43.662Z info builder/exporters_builder.go:97 Exporter started. {"kind": "exporter", "name": "logging"} 2021-10-27T11:34:43.662Z info service/service.go:187 Starting processors... 2021-10-27T11:34:43.662Z info builder/pipelines_builder.go:51 Pipeline is starting... {"pipeline_name": "metrics", "pipeline_datatype": "metrics"} 2021-10-27T11:34:43.662Z info builder/pipelines_builder.go:62 Pipeline is started. {"pipeline_name": "metrics", "pipeline_datatype": "metrics"} 2021-10-27T11:34:43.662Z info service/service.go:192 Starting receivers... 2021-10-27T11:34:43.662Z info builder/receivers_builder.go:70 Receiver is starting... {"kind": "receiver", "name": "prometheus"} 2021-10-27T11:34:43.663Z info kubernetes/kubernetes.go:282 Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "level": "info", "discovery": "kubernetes"} 2021-10-27T11:34:43.679Z info kubernetes/kubernetes.go:282 Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "level": "info", "discovery": "kubernetes"} 2021-10-27T11:34:43.680Z info discovery/manager.go:195 Starting provider {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "static/0", "subs": "[jvm-node-exporter]"} 2021-10-27T11:34:43.680Z info discovery/manager.go:195 Starting provider {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "kubernetes/1", "subs": "[kubernetes-cadvisor]"} 2021-10-27T11:34:43.680Z info discovery/manager.go:195 Starting provider {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "kubernetes/2", "subs": "[kube-state-metrics]"} 2021-10-27T11:34:43.680Z info builder/receivers_builder.go:75 Receiver started. {"kind": "receiver", "name": "prometheus"} 2021-10-27T11:34:43.680Z info discovery/manager.go:213 Discoverer channel closed {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "static/0"} 2021-10-27T11:34:43.680Z info healthcheck/handler.go:129 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"} 2021-10-27T11:34:43.680Z info service/collector.go:182 Everything is ready. Begin running and processing data. 2021-10-27T11:34:50.493Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 170} 2021-10-27T11:34:50.493Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 170} 2021-10-27T11:34:50.708Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 70} 2021-10-27T11:34:51.930Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 46} 2021-10-27T11:34:52.944Z INFO loggingexporter/logging_exporter.go:56 MetricsExporter {"#metrics": 70} **5. I am not sure if the OpenTelemetry configuration is correct [root@master131 ~]# vi ./otel-collector-config.yaml** apiVersion: v1 kind: ConfigMap metadata: name: otel-collector-conf labels: app: opentelemetry component: otel-collector-conf namespace: default data: otel-collector-config: | #1. Data export receivers: prometheus: config: global: scrape_interval: 5s evaluation_interval: 5s scrape_configs: #Collect jvm - job_name: 'jvm-node-exporter' static_configs: - targets: ['192.168.1.131:9110'] #Collect k8s - job_name: 'kubernetes-cadvisor' scheme: https tls_config: ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - action: labelmap regex: __meta_kubernetes_node_label_(.+) - source_labels: [] # relabel the cluster name target_label: cluster replacement: k8s-131 - target_label: __address__ replacement: kubernetes.default.svc:443 - source_labels: [__meta_kubernetes_node_name] regex: (.+) target_label: __metrics_path__ replacement: /api/v1/nodes/$${1}/proxy/metrics/cadvisor - source_labels: [instance] # relabel the node name separator: ; regex: (.+) target_label: node replacement: $$1 action: replace - job_name: kube-state-metrics kubernetes_sd_configs: - role: endpoints relabel_configs: - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name] regex: kube-state-metrics replacement: $$1 action: keep - action: labelmap regex: __meta_kubernetes_service_label_(.+) - source_labels: [] # relabel the cluster name target_label: cluster replacement: k8s-131 #2.Workflow, preprocessing work done before exporting the data source processors: batch: #Self-health check extensions: health_check: {} zpages: {} #3.data import exporters: opencensus: endpoint: "192.168.1.214:11800" insecure: true logging: logLevel: info service: extensions: [health_check, zpages] pipelines: metrics: receivers: [prometheus] processors: [batch] exporters: [opencensus,logging] --- apiVersion: v1 kind: Service metadata: name: otel-collector labels: app: opentelemetry component: otel-collector namespace: default spec: type: NodePort ports: - name: metrics port: 8888 targetPort: 8888 nodePort: 58888 selector: component: otel-collector --- apiVersion: apps/v1 kind: Deployment metadata: name: otel-collector labels: app: opentelemetry component: otel-collector namespace: default spec: selector: matchLabels: app: opentelemetry component: otel-collector minReadySeconds: 5 progressDeadlineSeconds: 120 replicas: 1 template: metadata: labels: app: opentelemetry component: otel-collector spec: serviceAccountName: prometheus containers: - command: - "/otelcol" - "--config=/conf/otel-collector-config.yaml" - "--log-level=info" - "--mem-ballast-size-mib=683" image: otel/opentelemetry-collector:0.29.0 name: otel-collector resources: limits: cpu: 1 memory: 2Gi requests: cpu: 200m memory: 400Mi ports: - containerPort: 55679 # ZPages endpoint - containerPort: 55680 # ZPages endpoint - containerPort: 4317 # OpenTelemetry receiver - containerPort: 8888 # querying metrics volumeMounts: - name: otel-collector-config-vol mountPath: /conf volumes: - configMap: name: otel-collector-conf items: - key: otel-collector-config path: otel-collector-config.yaml name: otel-collector-config-vol ### Use case _No response_ ### Related issues _No response_ ### Are you willing to submit a PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
