ic4y commented on issue #2279:
URL: https://github.com/apache/incubator-seatunnel/issues/2279#issuecomment-1197947780

   @mosence 
   
   My understanding is exactly the opposite.
   I think that in the offline task scenario, the job is divided into many Tasks. Executing these Tasks in batches with a fixed number of threads is the most efficient approach, because each Task is executed to completion by a single thread, with no thread scheduling, switching, or waiting during execution.
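   A minimal sketch of that batch model, assuming nothing about SeaTunnel's actual classes: a fixed-size thread pool drains a queue of bounded tasks, and each task runs on exactly one pool thread from start to finish.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.concurrent.Future;
   import java.util.concurrent.atomic.AtomicInteger;

   // Hypothetical illustration: an offline job split into many small tasks,
   // executed in batches by a fixed number of threads. Each task occupies one
   // thread until it completes, so there is no mid-task hand-off between threads.
   public class FixedPoolBatchDemo {
       public static int runBatch(int taskCount, int threads) throws Exception {
           ExecutorService pool = Executors.newFixedThreadPool(threads);
           AtomicInteger completed = new AtomicInteger();
           List<Future<?>> futures = new ArrayList<>();
           for (int i = 0; i < taskCount; i++) {
               futures.add(pool.submit(() -> {
                   // Simulated bounded (offline) unit of work.
                   completed.incrementAndGet();
               }));
           }
           for (Future<?> f : futures) {
               f.get(); // wait for the whole batch to finish
           }
           pool.shutdown();
           return completed.get();
       }

       public static void main(String[] args) throws Exception {
           System.out.println(runBatch(100, 4)); // prints 100
       }
   }
   ```

   With 100 tasks and 4 threads the pool simply works through the backlog; no task ever waits on another thread once it has started running.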
   
   In real-time synchronization scenarios (unbounded data processing), shared threads do bring an optimization. For example, when synchronizing a large number of small tables in real time (thousands to tens of thousands), the data volume is not very large, but the number of tables is, which results in a very large number of Tasks. In that case it is clearly not cost-effective to start one thread per Task, and since Java does not have a good coroutine implementation, sharing threads is the way to optimize.
   
   You mentioned that there must be at least one process that continuously ingests data. I don't think this is necessary, because a Task is designed to be driven by repeated invocations of its call() method, and a single thread can be responsible for calling the call() methods of multiple Tasks.
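   The call()-driven model above can be sketched as follows. The `Task` interface, its states, and `CountingTask` are illustrative assumptions, not SeaTunnel's actual API: each call() does a small slice of work and returns control, so one thread can drive many Tasks round-robin.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical sketch: cooperative scheduling where a single thread
   // repeatedly invokes call() on many Tasks, instead of dedicating one
   // thread per Task.
   public class SharedThreadDemo {
       enum State { RUNNING, FINISHED }

       interface Task {
           State call(); // perform one small slice of work, then return control
       }

       // A toy task that finishes after a fixed number of call() invocations.
       static class CountingTask implements Task {
           private int remaining;
           CountingTask(int slices) { this.remaining = slices; }
           public State call() {
               return --remaining <= 0 ? State.FINISHED : State.RUNNING;
           }
       }

       // One thread drives all tasks round-robin until every task finishes;
       // returns the number of rounds it took.
       public static int drive(List<Task> tasks) {
           int rounds = 0;
           List<Task> active = new ArrayList<>(tasks);
           while (!active.isEmpty()) {
               rounds++;
               active.removeIf(t -> t.call() == State.FINISHED);
           }
           return rounds;
       }

       public static void main(String[] args) {
           List<Task> tasks = new ArrayList<>();
           tasks.add(new CountingTask(3));
           tasks.add(new CountingTask(5));
           System.out.println(drive(tasks)); // prints 5
       }
   }
   ```

   Because no call() blocks waiting for input, the driving thread is never pinned to one Task; this is what makes thousands of small-table Tasks affordable without one thread each.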
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to