VishalMCF commented on issue #4635:
URL: https://github.com/apache/eventmesh/issues/4635#issuecomment-1853817079

   @pandaapo @HarshSawarkar I have a theoretical question about the File 
Source connector, and I would appreciate any suggestions or pointers to 
resources I can study. Because we do not implement the commit() method, every 
time a file changes our connector reads the entire file and pushes it to the 
event broker. This is inefficient, but maybe we can ignore that for now. 
   But suppose we want to implement commit(): how would we manage the offset? 
For each file, we would need to store the offset persistently somewhere. From 
what I have read about offsets for file sources, the offset can be the position 
of the last byte read or the line number. What happens if the file is edited at 
a position before the offset? For example, the offset is at line 91 but the 
user changes line 22. I am not sure my understanding of the file source 
connector is correct, but I am trying to refine it. I hope my question is clear 
enough. Thanks in advance!
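
   To make the question concrete, here is a minimal sketch of one possible 
approach, written in Python for brevity. It is not EventMesh code. The names 
`OffsetStore`, `poll`, and the JSON checkpoint file are hypothetical. The idea 
is to persist a byte offset per file together with a SHA-256 digest of the bytes 
already consumed. On the next poll, if the file is shorter than the offset or 
the digest of that prefix no longer matches, the file was edited before the 
offset, and the sketch falls back to re-reading from the start.

```python
import hashlib
import json
import os


class OffsetStore:
    """Hypothetical helper: keeps one checkpoint per file path in a JSON file."""

    def __init__(self, store_path):
        self.store_path = store_path
        self.state = {}
        if os.path.exists(store_path):
            with open(store_path, "r", encoding="utf-8") as f:
                self.state = json.load(f)

    def get(self, path):
        # Default checkpoint: nothing consumed yet, digest of zero bytes.
        empty = hashlib.sha256(b"").hexdigest()
        return self.state.get(path, {"offset": 0, "digest": empty})

    def commit(self, path, offset, digest):
        # Write to a temp file, then rename, so a crash cannot leave
        # a half-written checkpoint file behind.
        self.state[path] = {"offset": offset, "digest": digest}
        tmp = self.store_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.store_path)


def digest_prefix(path, length):
    """SHA-256 of the first `length` bytes of the file."""
    h = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(65536, remaining))
            if not chunk:
                break
            h.update(chunk)
            remaining -= len(chunk)
    return h.hexdigest()


def poll(path, store):
    """Return (new_bytes, new_offset, new_digest) without committing."""
    ckpt = store.get(path)
    offset = ckpt["offset"]
    truncated = os.path.getsize(path) < offset
    if truncated or digest_prefix(path, offset) != ckpt["digest"]:
        # Consumed region changed (e.g. an edit at line 22): start over.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    new_offset = offset + len(data)
    return data, new_offset, digest_prefix(path, new_offset)
```

   In this sketch the caller would invoke `store.commit(path, new_offset, 
new_digest)` only after the broker has accepted the data, which is roughly the 
role I imagine commit() playing. Re-reading from the start after an earlier 
edit still produces duplicates, so this only detects the situation; whether to 
resend everything or compute a diff is a separate design choice.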
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

