I think you are using an old release. After 8.7.0, many things were changed to improve performance, and far fewer resources are required.
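For context on the bulk-size tuning mentioned in the quoted thread below: the OAP batches its writes to ElasticSearch, and the batching is configured in the storage section of application.yml, while the 413 itself is ElasticSearch rejecting any HTTP request larger than its http.max_content_length (100mb by default). The snippet below is only a sketch assuming the 8.x elasticsearch7 layout; key names and default values differ between releases, so check the application.yml shipped with your version.

    # OAP application.yml (sketch of the 8.x elasticsearch7 layout; keys and defaults may differ in your release)
    storage:
      elasticsearch7:
        bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:1000}            # flush the bulk after this many records
        flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10}          # or after this many seconds, whichever comes first
        concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # concurrent in-flight bulk requests

    # elasticsearch.yml: the limit behind "413 Request Entity Too Large"
    http.max_content_length: 100mb

Shrinking bulkActions keeps each bulk request under the ES limit, but, as the thread below notes, ES then has to keep up with the gRPC receive rate through many small requests instead of a few large ones, so the two sides need to be balanced rather than pushed to either extreme.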
dafang <13240156...@163.com> wrote on Monday, September 13, 2021 at 7:18 PM:

> OK, I think I have found the reason; let me share it.
> I found that if I set the ES bulk size to 5, the ES "request too large"
> error never appears. But at the same time the gRPC server starts throwing
> errors such as "cancelled before receiving half close", which leaves the
> sw-agent unable to send any data (trace or JVM) to the server. It seems a
> balance point has to be found between the gRPC receive speed and the ES
> write speed.
>
> On 2021-09-13 17:45:40, "Sheng Wu" <wu.sheng.841...@gmail.com> wrote:
> > Unknown means unknown, I am afraid.
> > I can't explain it. It could be a firewall, a proxy, a security policy,
> > or something else.
> >
> > Sheng Wu 吴晟
> > Twitter, wusheng1108
> >
> > dafang <13240156...@163.com> wrote on Monday, September 13, 2021 at 4:37 PM:
> > >
> > > Hello Wu. After checking, I found some errors in my skywalking-agent
> > > logs, such as "Send UpstreamSegment to collector fail with a grpc
> > > internal exception.
> > > org.apache.skywalking.apm.dependencies.io.grpc.StatusRuntimeException:
> > > UNAVAILABLE: Network closed for unknown reason"
> > > How should I interpret it?
> > >
> > > At 2021-09-13 15:05:24, "Sheng Wu" <wu.sheng.841...@gmail.com> wrote:
> > > > (1) All data in that bulk (an ElasticSearch concept, read their
> > > > docs) will be lost, yes.
> > > > (2) This only means your agent got disconnected from the server
> > > > unexpectedly. It does not tell you why.
> > > >
> > > > About what you described in Chinese: first of all, it is better to
> > > > keep the Chinese and English consistent; don't put more information
> > > > on one side, it is confusing.
> > > > Why the agent stays disconnected forever can't be told from what
> > > > you have provided.
> > > > Auto reconnecting works normally AFAIK.
> > > >
> > > > Sheng Wu 吴晟
> > > > Twitter, wusheng1108
> > > >
> > > > dafang <13240156...@163.com> wrote on Monday, September 13, 2021 at 2:58 PM:
> > > > >
> > > > > Now I have two questions:
> > > > > 1. If this error occurs, will all trace and JVM metrics be lost?
> > > > > 2. If the server logs contain messages like
> > > > > "org.apache.skywalking.oap.server.receiver.trace.provider.handler.v8.grpc.TraceSegmentReportServiceHandler
> > > > > - 86 [grpcServerPool-1-thread-7] ERROR [] - CANCELLED: cancelled
> > > > > before receiving half close
> > > > > io.grpc.StatusRuntimeException: CANCELLED: cancelled before
> > > > > receiving half close"
> > > > > will that cause trace or JVM metrics to be lost?
> > > > >
> > > > > To explain in Chinese: we now have 100+ machines in production.
> > > > > It often happens that an instance itself is healthy, but once its
> > > > > trace metrics or JVM metrics are lost they never come back unless
> > > > > the service is restarted. Could the two situations listed above
> > > > > cause what I am seeing?
> > > > >
> > > > > On 2021-09-13 14:50:14, "Sheng Wu" <wu.sheng.841...@gmail.com> wrote:
> > > > > > That error does matter. "HTTP too large" will make
> > > > > > ElasticSearch reject your bulk insert, which causes data loss.
> > > > > >
> > > > > > Sheng Wu 吴晟
> > > > > > Twitter, wusheng1108
> > > > > >
> > > > > > dafang <13240156...@163.com> wrote on Monday, September 13, 2021 at 2:23 PM:
> > > > > > >
> > > > > > > Hi SkyWalking dev team:
> > > > > > > In our prod env, I found that trace and JVM metrics get lost
> > > > > > > after some services start, while the agent logs show no error
> > > > > > > info. Only the server log shows "Es 413 request too large".
> > > > > > > Will this problem cause complete data loss?
> > > > > > >
> > > > > > > Let me describe it again in Chinese: our production cluster
> > > > > > > originally had 15 machines. After integrating SkyWalking, on
> > > > > > > some of them (about 5-6) the trace metrics, the JVM metrics,
> > > > > > > or both disappear after a while. The service itself keeps
> > > > > > > running and serving requests; only the monitoring data is
> > > > > > > gone. After investigating, we found no error info at all in
> > > > > > > the agent log, and only some "413 request too large" ES
> > > > > > > errors in the server-side log. My question is: once this
> > > > > > > problem causes trace or JVM metrics to fail to be stored,
> > > > > > > will they never be collected and stored again?
> > > > > > >
> > > > > > > Waiting for your help.
> > > > > > > Yours,
> > > > > > > 大方
> > > > > > > 2021.09.13

-- 
Sheng Wu 吴晟
Apache SkyWalking
Apache Incubator
Apache ShardingSphere, ECharts, DolphinScheduler podlings
Zipkin
Twitter, wusheng1108