Hi,
Thanks for the reply. I am sure that I am not violating anything from an IP law perspective. I only read their source code, and my implementation is totally different.
I agree to add `inspired by RocketMQ` to the main entrance class comment, and I will do that.

About the code: I will add the code to the `apm-datacarrier` module and import `protostuff` as a dependency. The new class implements the QueueBuffer interface: when a message is added, it is serialized and saved to a file. When the agent shuts down, the files are deleted. I don’t want to implement file recovery, because it is useless in k8s.

But there is one question: how can I use logging in this module? Logging is in the `agent-core` module, and `agent-core` depends on `data-carrier`.

Any questions, reply to this message for details.
Thanks.

Dao Jun 道君
Alibaba-inc, tjm

> On July 14, 2020, at 5:23 PM, Sheng Wu <[email protected]> wrote:
>
> 道君 <[email protected]> wrote on Tue, Jul 14, 2020 at 5:19 PM:
>
>> Hi
>> Thanks for the reply.
>>
>> First, what `inspired` means:
>> It means I read their source code, so I know how an MQ file storage
>> works. But I think their design is suitable for a broker, not for an
>> application agent, because it uses very large direct buffers and is too
>> complex. So I redesigned and simplified it and wrote the code on my own.
>> No source code was copied; it is totally different from theirs, I just
>> learned from them.
>>
>
> This is good to know. But please add `inspired by RocketMQ` on the main
> entrance class comment. We should show respect to the original author, even
> if we don't copy anything from them.
> For MetaQ, we can't say it, as it is open to you only, and we can't know
> what happens inside. Just make sure you don't violate anything from the IP
> law perspective, as you are from the Alibaba team, so this would be good
> for both of us.
>
>>
>> Second, the limited disk perf in k8s:
>> I flush the channel buffer to disk asynchronously, so it will not affect
>> writes, only reads (in channel mode).
>>
>
> My point is, this feature is optional, and the default is OFF.
>
>>
>> Third, reading data from DataCarrier:
>> I have not tested this case. The file storage doesn’t care about the
>> class instance, only bytes. I think we can choose a high-perf object
>> serialization util to serialize objects to bytes. I think protostuff is
>> good.
>>
>
> This is about where you plan to add the code in the agent. Could you
> explain this more clearly?
>
>>
>> Any questions, reply mail for detail.
>> Thanks.
>>
>> Dao Jun 道君
>> Alibaba-inc, tjm
>>
>>> On July 14, 2020, at 4:44 PM, Sheng Wu <[email protected]> wrote:
>>>
>>> Hi
>>>
>>> Inline
>>>
>>> xkz <[email protected]> wrote on Tue, Jul 14, 2020 at 3:32 PM:
>>>
>>>> Hi
>>>> Thanks for Sheng Wu’s reply. By saying 'because we need to keep memory
>>>> safe', do you mean avoiding unlimited memory growth?
>>>>
>>>> Since we cannot use Google’s products in China, I will try to describe
>>>> my design in detail in this email. Sorry about this.
>>>> The file storage I mentioned was inspired by RocketMQ and MetaQ (an
>>>> Alibaba-inc internal project).
>>>>
>>>
>>> Please define `Inspired`. Because
>>> 1. If you have copied some code from RocketMQ, we need to indicate it
>>> and update LICENSE to describe that we did.
>>> 2. At the same time, if code is from MetaQ, we need an Alibaba SGA for
>>> that code, because it is owned by a company and not open-sourced.
>>> This is very important to us. Please make sure there is no IP issue.
>>>
>>>>
>>>> First, the read/write mode:
>>>> I will use a direct buffer pool, by default 100M per buffer and a pool
>>>> size of 2. When a file is created (called StoreFile), it takes a direct
>>>> buffer from the pool and sets it as an instance field. Messages are
>>>> written into the buffer, the buffer is written to the file channel
>>>> asynchronously, and the buffer is flushed to disk asynchronously.
>>>> Reading messages has two modes: from the channel or from the direct
>>>> buffer. When creating a new file, if taking a buffer from the buffer
>>>> pool fails (the buffer pool returns null), the previous file’s buffer
>>>> is returned to the buffer pool. So the current files (as many as the
>>>> buffer pool size) read messages from the buffer, and previous files
>>>> read messages from the channel.
>>>> This design could save direct memory when one physical machine runs
>>>> many application instances.
>>>>
>>>> Second, when and how to activate this feature:
>>>> I think this feature is very suitable when tracing data is produced
>>>> fast but the OAP server or STORAGE consumes it slowly, so file storage
>>>> should be very important, because we don’t need to worry about data
>>>> loss.
>>>> I want to add a config key to the agent profile. If the config value is
>>>> configured as FILE_CACHE, or defaults to FILE_CACHE, the feature is
>>>> activated.
>>>>
>>>
>>> This looks good to me, and this should be OFF by default, as today, in
>>> many k8s deployments, there is very limited perf of local disk.
>>>
>>>>
>>>> Third, the performance:
>>>> I tested it on my PC (MacBook Pro 2016, 2 cores, 8g RAM, -Xms1g -Xmx1g)
>>>> with only one thread. Putting 1000_000 messages (2000 bytes per
>>>> message) to file costs about 20 seconds (generating the random strings
>>>> and saving to disk cost 20 seconds in total), which is 50_000 messages
>>>> per second.
>>>>
>>>
>>> 50k/s seems fine, but do you read the data from the DataCarrier, then
>>> write to the file? Or does the TracingContext access the file buffer
>>> directly? Those are different scenarios and have different performance
>>> requirements.
>>>
>>>>
>>>> Any questions, reply mail for detail.
>>>> Thanks.
>>>>
>>>> Dao Jun 道君
>>>> Alibaba-inc, tjm
>>>>
>>>>> On July 14, 2020, at 10:34 AM, Sheng Wu <[email protected]> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>>> I have noticed that skywalking use heap buffer to cache tracing data.
>>>>>
>>>>> It could cause data loss, but that is intentional, because we need to
>>>>> keep memory safe.
>>>>>
>>>>> Back to what you're asking, if you want to build a local file system
>>>>> based cache, I think you should submit a design, including
>>>>> 1. What is the file write/read model
>>>>> 2. When should this feature be activated, and how
>>>>> 3. What is the performance? And do you have an available benchmark
>>>>> result comparing memory write and file write in a high concurrency
>>>>> situation.
>>>>> Add more if you think you need to.
>>>>>
>>>>> You could use this[1] as a design doc template. Looking forward to
>>>>> your details.
>>>>>
>>>>> [1]
>>>>> https://docs.google.com/document/d/1biRE3Bc0cTbs7qnBozUuAxCmeP5n8y0JKJAyzqitLnM/edit
>>>>>
>>>>> Sheng Wu 吴晟
>>>>> Twitter, wusheng1108
>>>>>
>>>>> Aries <[email protected]> wrote on Tue, Jul 14, 2020 at 10:13 AM:
>>>>>
>>>>>> Hi all: I have noticed that skywalking uses a heap buffer to cache
>>>>>> tracing data. It usually causes data loss. Because of this problem, I
>>>>>> want to add a high-performance file storage to skywalking, so that
>>>>>> tracing data can be saved to disk. If tracing data is saved to file,
>>>>>> skywalking will have a strong ability to accumulate data, and we do
>>>>>> not have to care about how much tracing data is produced or whether
>>>>>> the OAP server is working, since the data has been saved.
>>>>>> Do we need this feature? Any suggestions? Thanks
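
P.S. To make the plan at the top of this message more concrete, here is a minimal sketch of a file-backed buffer. It is only an illustration under assumptions, not the final code: `SimpleQueueBuffer` is a hypothetical, simplified stand-in for the real QueueBuffer interface, `FileQueueBuffer` is a made-up class name, and the two `Function` parameters are placeholders for whatever protostuff-based encoder and decoder is finally chosen. Each message is stored as a 4-byte length followed by the payload, and the file is deleted on close, with no recovery.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified stand-in for the data-carrier QueueBuffer interface.
interface SimpleQueueBuffer<T> {
    boolean save(T data);
    void obtain(List<T> consumeList);
}

// File-backed buffer: every record is [4-byte length][payload bytes].
final class FileQueueBuffer<T> implements SimpleQueueBuffer<T>, AutoCloseable {
    private final Path file;
    private final FileChannel channel;
    private final Function<T, byte[]> serializer;   // placeholder for a real codec
    private final Function<byte[], T> deserializer; // placeholder for a real codec
    private long readPosition = 0;

    FileQueueBuffer(Function<T, byte[]> serializer,
                    Function<byte[], T> deserializer) throws IOException {
        this.file = Files.createTempFile("datacarrier-", ".store");
        this.channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        this.serializer = serializer;
        this.deserializer = deserializer;
    }

    Path file() {
        return file;
    }

    // Serialize the message and append it to the end of the file.
    @Override
    public synchronized boolean save(T data) {
        byte[] payload = serializer.apply(data);
        ByteBuffer record = ByteBuffer.allocate(Integer.BYTES + payload.length);
        record.putInt(payload.length).put(payload).flip();
        try {
            long writePosition = channel.size();
            while (record.hasRemaining()) {
                writePosition += channel.write(record, writePosition);
            }
            return true;
        } catch (IOException e) {
            return false; // report failure instead of throwing into the caller
        }
    }

    // Read every record written since the last call and hand it to the consumer list.
    @Override
    public synchronized void obtain(List<T> consumeList) {
        try {
            long end = channel.size();
            while (readPosition < end) {
                ByteBuffer header = ByteBuffer.allocate(Integer.BYTES);
                readFully(header, readPosition);
                int length = header.getInt();
                ByteBuffer payload = ByteBuffer.allocate(length);
                readFully(payload, readPosition + Integer.BYTES);
                consumeList.add(deserializer.apply(payload.array()));
                readPosition += Integer.BYTES + length;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void readFully(ByteBuffer target, long position) throws IOException {
        while (target.hasRemaining()) {
            int n = channel.read(target, position);
            if (n < 0) {
                throw new IOException("unexpected end of store file");
            }
            position += n;
        }
        target.flip();
    }

    // On shutdown the file is simply deleted; nothing is recovered.
    @Override
    public synchronized void close() throws IOException {
        channel.close();
        Files.deleteIfExists(file);
    }
}
```

For example, with String messages the two functions could be `s -> s.getBytes(StandardCharsets.UTF_8)` and `b -> new String(b, StandardCharsets.UTF_8)`. This sketch writes straight to the file channel and leaves out the direct buffer pool and the async flush described in the quoted design.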
