Re: Progress Report for IoTDB-Skywalking Adapter

Sheng Wu Wed, 04 Aug 2021 01:38:27 -0700

The resources required depend on the scale and traffic.

I can share our benchmark results for an Elasticsearch cluster
(hosted cloud service):
```
The total resource cost is 30U/30G for a 5-node OAP cluster under a 92k
ingress load. Thousands of metrics are generated, and error logs are
sampled at up to 100 per second.
```
The Elasticsearch cluster is a 3-node cluster with 3G memory, 12m metrics,
3.1G disk usage, and 184 shards.
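
To make these figures easier to compare with another deployment, here is a rough per-node breakdown in Python. It is only a sketch: it assumes "30U" means 30 CPU cores and "30G" means 30 GB of memory, it treats the 92k ingress load as a plain number because its unit is not stated, and it assumes load is spread evenly across nodes. The variable names are illustrative only.

```python
# Figures quoted in the benchmark above.
OAP_NODES = 5
OAP_TOTAL_CPU = 30          # "30U" -- assumed to mean CPU cores
OAP_TOTAL_MEM_GB = 30       # "30G" -- assumed to mean GB of memory
INGRESS_LOAD = 92_000       # "92k" -- unit not stated in the benchmark

ES_METRICS = 12_000_000     # "12m metrics"
ES_DISK_GB = 3.1            # "3.1G disk cost"

# Even split across the OAP nodes (an assumption, not a measured value).
cpu_per_node = OAP_TOTAL_CPU / OAP_NODES
mem_per_node_gb = OAP_TOTAL_MEM_GB / OAP_NODES
ingress_per_node = INGRESS_LOAD / OAP_NODES

# Disk used per million stored metrics in the Elasticsearch cluster.
disk_gb_per_million_metrics = ES_DISK_GB / (ES_METRICS / 1_000_000)

print(f"OAP per node: {cpu_per_node:g} CPU, {mem_per_node_gb:g} GB, "
      f"{ingress_per_node:g} ingress")
print(f"ES disk per million metrics: {disk_gb_per_million_metrics:.3f} GB")
```

These ratios only restate the benchmark; they should not be read as a linear scaling rule for other workloads.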



Sheng Wu 吴晟
Twitter, wusheng1108

On Wed, Aug 4, 2021 at 4:03 PM, Xiangdong Huang <[email protected]> wrote:

>
> > the cardinality of metrics could easily reach millions, even without
> > considering the time bucket.
>
> I think the time bucket can probably be ignored because we have a precise
> timestamp (which is indexed).
>
> So, in general, to manage so many metrics, what should the specification
> of the server that the DB is deployed on be?
> (e.g., the memory? the disk? the CPU?)
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
>  黄向东
> 清华大学 软件学院
>
>
> On Wed, Aug 4, 2021 at 3:11 PM, Sheng Wu <[email protected]> wrote:
>
> > Yes, the cardinality of metrics could easily reach millions, even
> > without considering the time bucket.
> > Trace IDs in traces and logs number in the billions, limited only by
> > the TTL (the period after which data is deleted).
> >
> > APM data formats are more complex than those of a metric monitoring
> > system :)
> > And we have more complex scenarios to cover, such as logs, tracing,
> > and topology.
> >
> > Sheng Wu 吴晟
> > Twitter, wusheng1108
> >
> > On Wed, Aug 4, 2021 at 3:07 PM, Xiangdong Huang <[email protected]> wrote:
> > >
> > > Hi Sheng,
> > >
> > > > Metrics are a million-level dataset, scaling with the number of
> > > > service instances and endpoints.
> > > > Logs and traces are billion-level datasets.
> > >
> > > Do you mean that the cardinality of the indexed attributes of Metrics
> > > is million-level?
> > > i.e., the distinct values of the _time_bucket and _entity_id attributes
> > > (and some other attributes, except the "value" attributes such as sum
> > > and count)?
> > >
> > > Best,
> > > -----------------------------------
> > > Xiangdong Huang
> > > School of Software, Tsinghua University
> > >
> > >  黄向东
> > > 清华大学 软件学院
> > >
> > >
> > > On Wed, Aug 4, 2021 at 2:41 PM, 刘威 <[email protected]> wrote:
> > >
> > > > Thank you. I will contact you in a few days through QQ. (I have joined
> > > > the QQ group of Skywalking.)
> > > >
> > > > Wei Liu 刘威
> > > >
> > > > On 2021/08/04 06:12:50, Sheng Wu <[email protected]> wrote:
> > > > > If you have questions (I saw those in the README), you could set up
> > > > > an online meeting; I am glad to help.
> > > > >
> > > > > Sheng Wu 吴晟
> > > > > Twitter, wusheng1108
> > > > >
> > > > > On Wed, Aug 4, 2021 at 2:06 PM, Sheng Wu <[email protected]> wrote:
> > > > > >
> > > > > > Hi
> > > > > >
> > > > > > I can see you provided an unindexed solution, which could be used
> > > > > > as a POC implementation.
> > > > > > If you want end users to really adopt this, we need to make sure
> > > > > > of the performance.
> > > > > > Metrics are a million-level dataset, scaling with the number of
> > > > > > service instances and endpoints.
> > > > > > Logs and traces are billion-level datasets.
> > > > > >
> > > > > > Expect N K write ops/s for the metrics.
> > > > > > The scale of traces and logs is 10 * number of cluster services *
> > > > > > traffic, or even more.
> > > > > >
> > > > > > Sheng Wu 吴晟
> > > > > > Twitter, wusheng1108
> > > > > >
> > > > > > On Wed, Aug 4, 2021 at 12:54 PM, 刘威 <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi, I'm a participant in the Summer 2021 Open Source Promotion Plan. My name is Liu Wei.
> > > > > > > My project is to create an IoTDB-Adapter for Skywalking. (Project Link: https://summer.iscas.ac.cn/#/org/prodetail/210070771)
> > > > > > >
> > > > > > > I have pushed my progress to my GitHub fork (https://github.com/LIU-WEI-git/skywalking) and the official GitLab (https://gitlab.summer-ospp.ac.cn/summer2021/210070771).
> > > > > > > My current solution is similar to a relational database. It is described in README.md at:
> > > > > > > https://github.com/LIU-WEI-git/skywalking/tree/iotdb-adapter/oap-server/server-storage-plugin/storage-iotdb-plugin
> > > > > > > I also describe another solution, with an index, in README.md and will adopt it later.
> > > > > > >
> > > > > > > I have implemented all the insert, update, and delete methods and part of the query methods. I will push them together when I finish the rest.
> > > > > > > If you have any suggestions, please let me know. Thanks in advance.
> > > > > > >
> > > > > > > I have found quite different query size limits in the other storage plugins (in application.yml). Does this relate to the upper-layer applications?
> > > > > > > How many records should be queried by default from those models, such as Segment, ProfileTask, ProfileTaskLog, ProfileThreadSnapshot, NetworkAddressAlias and MetaData?
> > > > > > > Should I set the same query size limit for the IoTDB storage plugin?
> > > > >
> >
