Maybe I didn't express myself well. Yes, DolphinScheduler doesn't need to do this storage itself. We only need to provide a log plug-in mechanism so that users can implement their own log storage, just as we do for alert plug-ins.
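To make the idea concrete, here is a minimal Java sketch of what such a log plug-in extension point might look like. All names (`TaskLogStorage`, `InMemoryTaskLogStorage`, `TaskLogStorageRegistry`) are hypothetical and are not existing DolphinScheduler APIs; the in-memory backend only stands in for a user-provided database or third-party store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extension point; NOT an existing DolphinScheduler interface.
interface TaskLogStorage {
    /** Short name used to select this backend, e.g. in configuration. */
    String name();

    /** Appends one log line for the given task instance. */
    void append(String taskInstanceId, String line);

    /** Returns all stored lines for the task instance, oldest first. */
    List<String> read(String taskInstanceId);
}

// Stand-in for a user-supplied backend (database, object store, ...).
class InMemoryTaskLogStorage implements TaskLogStorage {
    private final Map<String, List<String>> lines = new HashMap<>();

    @Override
    public String name() {
        return "memory";
    }

    @Override
    public synchronized void append(String taskInstanceId, String line) {
        lines.computeIfAbsent(taskInstanceId, k -> new ArrayList<>()).add(line);
    }

    @Override
    public synchronized List<String> read(String taskInstanceId) {
        // Return a copy so callers cannot modify the stored lines.
        List<String> stored = lines.getOrDefault(taskInstanceId, Collections.emptyList());
        return new ArrayList<>(stored);
    }
}

// Hypothetical registry: master/worker code would look up a backend by name
// instead of writing to a local file directly.
class TaskLogStorageRegistry {
    private final Map<String, TaskLogStorage> backends = new HashMap<>();

    void register(TaskLogStorage storage) {
        backends.put(storage.name(), storage);
    }

    TaskLogStorage get(String name) {
        TaskLogStorage storage = backends.get(name);
        if (storage == null) {
            throw new IllegalArgumentException("No log storage plug-in named: " + name);
        }
        return storage;
    }
}

class LogPluginDemo {
    public static void main(String[] args) {
        TaskLogStorageRegistry registry = new TaskLogStorageRegistry();
        registry.register(new InMemoryTaskLogStorage());

        TaskLogStorage storage = registry.get("memory");
        storage.append("task-1", "task started");
        storage.append("task-1", "task finished");
        System.out.println(storage.read("task-1"));
    }
}
```

With something like this, DolphinScheduler itself would only own the interface and the lookup, and each user would own the backend. How backends are discovered (for example with Java's standard `java.util.ServiceLoader`) and how large logs are paged on read are open questions that this sketch does not settle.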

boyi <[email protected]> wrote on Tuesday, November 3, 2020 at 10:37 PM:

> Hi,
>
> I agree with CalvinKirs.
>
> Generally, such files are handled by a unified log collection system,
> such as ELK, especially when deployed in a Docker environment. We can
> consider focusing on more important things.
>
> --------------------------------------
> BoYi Zhang
> E-mail: [email protected]
>
> On 11/3/2020 22:23, CalvinKirs <[email protected]> wrote:
> I understand what you mean, but what I want to ask is whether it would
> be better for users to do this part of the work. Generally, companies
> have their own log collection system. If we were to do this part of the
> storage, the workload would be large and the benefit small.
>
> Best wishes!
> CalvinKirs
>
> On 11/3/2020 22:09, leon bao <[email protected]> wrote:
> @CalvinKirs
>
> About the log SPI: we now have a requirement for scalable services
> (master/worker). In this scenario the task log cannot be stored on a
> single worker/master but needs to be stored in a third-party place,
> which may be a database or other storage. Therefore, if DolphinScheduler
> can provide this plug-in function, users can read and write logs
> according to their own needs.
>
> CalvinKirs <[email protected]> wrote on Tuesday, November 3, 2020 at 6:40 PM:
>
> Great planning!
> But I have a small question: what are the specific requirements of the
> log SPI? They are not very clear to me at present. Are we only
> implementing an SPI for data storage? If so, is it necessary? I think
> users can use logagent (or other technologies) for related
> implementations.
> Different users have different needs. Some users may also need
> aggregation or calculation, or operate at a different scale, and may use
> additional components. Therefore, if we store the raw data for this
> part ourselves, a lot of redundant data may be generated.
>
> Best wishes!
> CalvinKirs
>
> On 11/3/2020 11:47, leon bao <[email protected]> wrote:
> Hello everyone,
>
> DS has good horizontal scalability thanks to its decentralized
> architecture, which attracts many developers. With more and more users,
> the demands on scheduling are becoming more and more complex. At the
> same time, the functional design of DS is required to be more
> extensible, for example the plug-in support for alert methods.
> So we can discuss which parts of DS can become plug-ins at present, and
> then refactor DolphinScheduler according to the results of this
> discussion.
> At present, there is demand for the following parts:
>
> - alert plug-in (running):
>   refer to:
>   https://github.com/apache/incubator-dolphinscheduler/issues/3049
>
> - task plug-in:
>   refer to:
>   https://github.com/apache/incubator-dolphinscheduler/issues/2869
>
> - registry center:
>   refer to:
>   https://lists.apache.org/thread.html/r755a57e3b859563de2dddf8aa2f336fcf28934e7bbb2c3f97fe5fe3d%40%3Cdev.dolphinscheduler.apache.org%3E
>   https://github.com/apache/incubator-dolphinscheduler/issues/3961
>
> - log plug-in:
>   Logs are currently recorded by writing local files on the server.
>   Can we make this a plug-in, so that users can extend the ways logs are
>   read and written, such as writing to a database or other third-party
>   systems?
>
> - global task queue:
>   At present, tasks are stored in the in-memory queue of the master, so
>   the priority of a task only takes effect within the scope of a single
>   master.
>   To make the priority of a task effective globally, we need a global
>   queue.
>   (In version 1.2 we used ZooKeeper as the global queue, which was
>   removed because of the latency of ZK operations.)
>
> Implementation details can be discussed within each topic. Here, we only
> discuss the requirements.
> We would very much appreciate more opinions.
>
> --
> DolphinScheduler(Incubator) PPMC
> BaoLiang 鲍亮
> [email protected]

--
DolphinScheduler(Incubator) PPMC
BaoLiang 鲍亮
[email protected]
