道君 <[email protected]> wrote on Tue, Jul 14, 2020 at 7:21 PM:

> Hi
> Thanks for the reply.
>
> I am sure that I don’t violate any IP law. I only read the source code,
> and my code is totally different.
>

Thanks for making this clear.

>
> I agree to add `inspired by RocketMQ` to the comment of the main entrance
> class, and I will do that.
>
> About the code:
> I will add the code to the `apm-datacarrier` module and import the
> `protostuff` dependency. It implements the QueueBuffer interface: when a
> message is added, it is serialized and saved to a file.
> When the agent shuts down, the files will be deleted. I don’t want to
> implement file recovery, because it’s useless in k8s.
>

Good to me.

>
> But there is one question:
> How could I use logging in the module? Logging is in the `agent-core`
> module, and `agent-core` depends on `data-carrier`.
>

DataCarrier#setLoggerFactory would be a good idea. Then you could inject the
agent logger factory into it.

> If you have any questions, reply to this message for details.
> Thanks.
>
> Dao Jun 道君
> Alibaba-inc, tjm
>
>
> > On Jul 14, 2020, at 5:23 PM, Sheng Wu <[email protected]> wrote:
> >
> > 道君 <[email protected]> wrote on Tue, Jul 14, 2020 at 5:19 PM:
> >
> >> Hi
> >> Thanks for the reply.
> >>
> >> First, what `inspired` means:
> >> It means I read their source code, so I know how MQ file storage
> >> works. But I think their design is suitable for a broker, not for an
> >> application agent, because it uses a very large direct buffer and is
> >> too complex. So I redesigned and simplified it and wrote the code on my
> >> own. No source code was copied; it is totally different from theirs. I
> >> just learned from them.
> >>
> >
> > This is good to know. But please add `inspired by RocketMQ` to the main
> > entrance class comment. We should show respect to the original author,
> > even if we don't copy anything from them.
> > For MetaQ, we can't say it, as it is open to you only, and we can't know
> > what happens inside. Just make sure you don't violate anything from the
> > IP law perspective, as you are from the Alibaba team, so this would be
> > good for both of us.
> >
> >
> >>
> >> Second, the limited disk performance in k8s:
> >> I flush the channel buffer to disk asynchronously, so it will not
> >> affect writes; it only affects reads (in channel mode).
> >>
> >
> > My point is, this feature is optional, and the default is OFF.
> >
> >
> >>
> >> Third, reading data from DataCarrier:
> >> I haven’t tested this case. The file storage doesn’t care about the
> >> class instance, only bytes. I think we can choose a high-performance
> >> object serialization utility to serialize objects to bytes. I think
> >> protostuff is good.
> >>
> >
> > This is about where you plan to add the code in the agent. Could you
> > explain this more clearly?
> >
> >
> >>
> >> If you have any questions, reply to this mail for details.
> >> Thanks.
> >>
> >> Dao Jun 道君
> >> Alibaba-inc, tjm
> >>
> >>
> >>> On Jul 14, 2020, at 4:44 PM, Sheng Wu <[email protected]> wrote:
> >>>
> >>> Hi
> >>>
> >>> Inline
> >>>
> >>>
> >>> xkz <[email protected]> wrote on Tue, Jul 14, 2020 at 3:32 PM:
> >>>
> >>>> Hi
> >>>> Thanks for Sheng Wu’s reply. By 'because we need to keep memory
> >>>> safe', do you mean avoiding unlimited memory growth?
> >>>>
> >>>> Since we cannot use Google’s products in China, I will try to
> >>>> describe my design in detail in this email. Sorry about this.
> >>>> The file storage I mentioned was inspired by RocketMQ and MetaQ (an
> >>>> Alibaba-inc internal project).
> >>>>
> >>>
> >>> Please define `Inspired`. Because
> >>> 1. If you have copied some code from RocketMQ, we need to indicate it
> >>> and update LICENSE to describe that we did.
> >>> 2. At the same time, if code is from MetaQ, we need an Alibaba SGA for
> >>> that code, because it is owned by a company and not open-sourced.
> >>> This is very important to us. Please make sure there is no IP issue.
> >>>
> >>>
> >>>>
> >>>> First, the read/write mode:
> >>>> I will use a direct buffer pool; by default each buffer is 100M and
> >>>> the pool size is 2.
> >>>> When a file (called a StoreFile) is created, it takes a direct buffer
> >>>> from the pool and keeps it as an instance field. Messages are written
> >>>> into the buffer, the buffer is written to the file channel
> >>>> asynchronously, and it is flushed to disk asynchronously.
> >>>> Reading a message has two modes: from the channel or from the direct
> >>>> buffer. When creating a new file, if taking a buffer from the pool
> >>>> fails (the pool returns null), the previous file’s buffer is returned
> >>>> to the pool. So the current files (how many depends on the pool size)
> >>>> read messages from the buffer, and previous files read messages from
> >>>> the channel.
> >>>> This design could save direct memory on a physical machine running
> >>>> many application instances.
> >>>>
> >>>> Second, when and how to activate this feature:
> >>>> I think this feature is very suitable when tracing data is produced
> >>>> fast but the OAP server or STORAGE consumes it slowly. In that case
> >>>> file storage is very important, because we don’t need to worry about
> >>>> data loss.
> >>>> I want to add a config key to the agent profile. If the value is
> >>>> configured as FILE_CACHE, or defaults to FILE_CACHE, the feature is
> >>>> activated.
> >>>>
> >>>
> >>> This looks good to me, and this should be OFF by default, as today, in
> >>> many k8s deployments, local disk performance is very limited.
> >>>
> >>>
> >>>>
> >>>> Third, what the performance is:
> >>>> I tested it on my PC (MacBook Pro 2016, 2 cores, 8 GB RAM, -Xms1g
> >>>> -Xmx1g) with only one thread. Putting 1,000,000 messages (2000 bytes
> >>>> per message) into the file takes about 20 seconds (generating the
> >>>> random strings and saving them to disk takes 20 seconds in total),
> >>>> which is 50,000 messages per second.
> >>>>
> >>>
> >>> 50k/s seems fine, but do you read the data from the DataCarrier and
> >>> then write it to the file? Or does the TracingContext access the file
> >>> buffer directly? Those are different scenarios and have different
> >>> performance requirements.
> >>>
> >>>
> >>>>
> >>>> If you have any questions, reply to this mail for details.
> >>>> Thanks.
> >>>>
> >>>> Dao Jun 道君
> >>>> Alibaba-inc, tjm
> >>>>
> >>>>> On Jul 14, 2020, at 10:34 AM, Sheng Wu <[email protected]> wrote:
> >>>>>
> >>>>> Hi
> >>>>>
> >>>>>> I have noticed that skywalking use heap buffer to cache tracing
> >>>>>> data.
> >>>>>
> >>>>> It could cause data loss, but that is intentional, because we need
> >>>>> to keep memory safe.
> >>>>>
> >>>>> Back to what you're asking: if you want to build a local file system
> >>>>> based cache, I think you should submit a design, including
> >>>>> 1. What is the file write/read model?
> >>>>> 2. When should this feature be activated, and how?
> >>>>> 3. What is the performance? Do you have benchmark results comparing
> >>>>> memory writes and file writes under high concurrency?
> >>>>> More, if you think there is more to say.
> >>>>>
> >>>>> You could use this[1] as a design doc template. I look forward to
> >>>>> your details.
> >>>>>
> >>>>> [1]
> >>>>> https://docs.google.com/document/d/1biRE3Bc0cTbs7qnBozUuAxCmeP5n8y0JKJAyzqitLnM/edit
> >>>>>
> >>>>> Sheng Wu 吴晟
> >>>>> Twitter, wusheng1108
> >>>>>
> >>>>>
> >>>>> Aries <[email protected]> wrote on Tue, Jul 14, 2020 at 10:13 AM:
> >>>>>
> >>>>>> Hi all: I have noticed that skywalking uses a heap buffer to cache
> >>>>>> tracing data. It usually causes data loss. Because of this problem,
> >>>>>> I want to add a high-performance file storage to skywalking, so
> >>>>>> that tracing data can be saved to disk. If tracing data is saved to
> >>>>>> file, skywalking will have a strong ability to accumulate data, and
> >>>>>> we do not have to care about how much tracing data is produced or
> >>>>>> whether the OAP server is working, because the data has been saved.
> >>>>>> Do we need this feature? Any suggestions? Thanks
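
For readers following the thread, here is a minimal sketch of the write path 道君 describes for `apm-datacarrier`: serialize a message, append it to a file, and delete the file when the agent shuts down, with no recovery. Everything in it is an assumption made for illustration. The class name `FileBackedQueueSketch`, the length-prefixed record layout, and the temp-file location are hypothetical; it does not implement SkyWalking's real QueueBuffer interface, and plain UTF-8 bytes stand in for protostuff serialization. The direct buffer pool and the async flush from the design are left out.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch only; not the SkyWalking QueueBuffer interface. */
final class FileBackedQueueSketch implements AutoCloseable {
    private final Path file;
    private final FileChannel channel;
    private long readPosition = 0; // offset of the next unread record

    FileBackedQueueSketch() throws IOException {
        // Assumption: a temp file is good enough, since no recovery is planned.
        this.file = Files.createTempFile("trace-cache-", ".store");
        this.channel = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
    }

    /** Appends one record as [4-byte length][payload bytes]. */
    synchronized void save(String message) throws IOException {
        // Stand-in for a real serializer such as protostuff.
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer record = ByteBuffer.allocate(4 + payload.length);
        record.putInt(payload.length).put(payload).flip();
        long position = channel.size();
        while (record.hasRemaining()) {
            position += channel.write(record, position);
        }
    }

    /** Returns the next unread record, or null when everything was consumed. */
    synchronized String poll() throws IOException {
        if (readPosition >= channel.size()) {
            return null;
        }
        ByteBuffer header = ByteBuffer.allocate(4);
        readFully(header, readPosition);
        int length = header.getInt(0);
        ByteBuffer payload = ByteBuffer.allocate(length);
        readFully(payload, readPosition + 4);
        readPosition += 4 + length;
        return new String(payload.array(), StandardCharsets.UTF_8);
    }

    private void readFully(ByteBuffer target, long position) throws IOException {
        while (target.hasRemaining()) {
            int n = channel.read(target, position);
            if (n < 0) {
                throw new IOException("unexpected end of file");
            }
            position += n;
        }
    }

    Path file() {
        return file;
    }

    /** Mirrors "when the agent shuts down, the files will be deleted". */
    @Override
    public synchronized void close() throws IOException {
        channel.close();
        Files.deleteIfExists(file);
    }
}
```

A caller would `save` each serialized segment from the producer side, `poll` from the consumer side, and call `close` from a shutdown hook. Synchronous channel writes keep the sketch short; the thread's design writes into a direct buffer first and flushes asynchronously.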

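
The `DataCarrier#setLoggerFactory` suggestion can be sketched as below. The idea is that the lower-level module owns a tiny logging interface and the higher-level module injects an implementation at boot, so `data-carrier` never needs to depend on `agent-core`. This is only an illustration under assumptions: `CarrierLogger`, `CarrierLoggerFactory`, `DataCarrierSketch`, and `AgentBootSketch` are hypothetical names, not SkyWalking classes, and the "agent logger" here just appends to a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal logging abstraction owned by the lower-level module (hypothetical). */
interface CarrierLogger {
    void warn(String message);
}

/** Factory the higher-level module is expected to implement (hypothetical). */
interface CarrierLoggerFactory {
    CarrierLogger getLogger(Class<?> owner);
}

/** Stand-in for the DataCarrier entry point; only the injection hook is shown. */
final class DataCarrierSketch {
    // Default is a no-op logger, so the module works before anything is injected.
    private static volatile CarrierLoggerFactory loggerFactory = owner -> message -> { };

    static void setLoggerFactory(CarrierLoggerFactory factory) {
        loggerFactory = factory;
    }

    static CarrierLogger logger(Class<?> owner) {
        return loggerFactory.getLogger(owner);
    }
}

/** What the higher-level module (agent-core in the thread) might do at boot. */
final class AgentBootSketch {
    // Assumption: a list stands in for the real agent logging backend.
    static final List<String> SINK = new ArrayList<>();

    static void wireLogging() {
        DataCarrierSketch.setLoggerFactory(
            owner -> message -> SINK.add(owner.getSimpleName() + ": " + message));
    }
}
```

Code inside the lower-level module would then call `DataCarrierSketch.logger(SomeClass.class).warn(...)` and get whatever the agent injected, or the no-op default when nothing was injected.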