Dear Wiśniowski Piotr,

Thank you for your detailed and insightful response.

While I'm still working on building the basic structure of the MailIO 
connector, your information will be very helpful in setting the direction for 
future development. I particularly found your insights about the complexity of 
write operations and the features needed in enterprise environments very useful.

Thank you for pointing out the challenges with distributed throttling and 
exactly-once processing. I will research these aspects more thoroughly and look 
for potential solutions.
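
As a rough starting point, the two ideas can be sketched in plain Python (this 
is not Beam code and not the MailIO implementation). It assumes a fixed number 
of shard keys with one token bucket per key, and replaces the suggested double 
locking with a simpler idempotency check on a message id. All names here 
(NUM_SHARDS, shard_for, TokenBucket, IdempotentSender) are hypothetical, and 
the in-memory set stands in for a durable, shared store.

```python
import hashlib
import time

NUM_SHARDS = 4  # hypothetical fixed parallelism: one throttle per key


def shard_for(recipient: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a recipient to one of a fixed number of keys, stably across workers."""
    digest = hashlib.sha256(recipient.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class TokenBucket:
    """Per-shard rate limiter; the global rate is roughly num_shards * rate."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def try_acquire(self) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class IdempotentSender:
    """Skips message ids that were already sent, so a retry does not resend.

    Assumption: sent_ids would live in a durable store in a real system. The
    check-then-send here is not atomic, so a crash between send_fn and the
    bookkeeping could still cause a duplicate.
    """

    def __init__(self, send_fn, bucket: TokenBucket):
        self.send_fn = send_fn
        self.bucket = bucket
        self.sent_ids = set()

    def send(self, message_id: str, recipient: str, body: str) -> str:
        if message_id in self.sent_ids:
            return "duplicate"
        if not self.bucket.try_acquire():
            return "throttled"  # caller is expected to retry later
        self.send_fn(recipient, body)
        self.sent_ids.add(message_id)
        return "sent"
```

In a pipeline, each message would first be routed by shard_for(recipient) so 
that all messages for one key pass through the same bucket. The sketch leaves 
the remaining gap visible: making the send and the bookkeeping atomic is the 
part that is hard to combine with throttling.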

Thank you again for taking the time to provide such a detailed response. I hope 
we can continue to share progress and exchange ideas as the project moves 
forward.

Best regards,
LDesire


> On Nov 12, 2024, at 9:41 PM, Piotr Wiśniowski 
> <contact.wisniowskipi...@gmail.com> wrote:
> 
> Hi,
> I had a theoretical PoC project in the past that needed quite similar 
> functionality.
> Bounded read makes sense, and can be treated as a special case of unbounded 
> read. The second use I could imagine is doing the same (reading emails for 
> downstream processing, such as logic triggers or ML categorization, and then 
> sending them to different departments).
> From my perspective, write is far more complicated, and I am not sure whether 
> Beam/streaming applications are the best pick for this task. I see two 
> potential problems. The first is that sending emails needs distributed 
> throttling out of the box. This can be done by using fixed parallelism (for 
> example, a fixed number of keys) and adaptive throttling (there is already 
> some out-of-the-box code for that). The second problem is that even the 
> exactly-once processing options in runners (Dataflow/Flink) do not guarantee 
> that sending will be executed only once in all cases (they only guarantee 
> that a single output will be seen downstream). To get around that, double 
> locking would probably be required, but combining it with throttling might 
> be challenging.
> Regarding potential use cases for write: definitely distributed notification 
> systems. I have already seen ideas for such projects in at least 3 
> corporations. Some features they required (as far as I remember):
> - templating messages for output (Jinja-like), but this could technically be 
> pushed upstream
> - priority queue - so that if there is a more urgent message in the priority 
> queue, it is sent before messages in the normal queue, while still respecting 
> throttling.
> - single destination throttling - so that a single email address gets at most 
> x msgs per week.
> - channel configuration - so that a user receiving notifications can configure 
> which channel they want to get msgs on (email, Slack, mobile push, SMS, etc.).
> But the above are typical requirements for whole notification apps, not only 
> for the mail IO, but I guess you could extract some use cases from this.
> 
> For the unbounded read, emails could definitely be used as a kind of 
> interface that users could use to trigger asynchronous tasks (GDPR data 
> deletion, for example). Having a dedicated mail IO read would avoid the need 
> for a separate backend app to fetch the emails or additional broker 
> configuration for email systems (sometimes this is not possible because of 
> security policies in corporations).
> 
> Let me know if this is helpful. Happy to see such initiatives 🙂
> Best Wiśniowski Piotr
> 
> 
> On Tue, 12 Nov 2024, 13:03, LDesire <two_som...@icloud.com> wrote:
>> Hello,
>> 
>> I am currently working on developing a MailIO connector for Apache Beam.
>> 
>> While I have made progress implementing bounded read functionality, I'm 
>> somewhat uncertain about the practical use cases where users would need the 
>> MailIO connector.
>> 
>> The use cases I've considered are:
>> 
>> - Bounded Read:
>> Email folder archiving - For example, archiving all messages from specific 
>> folders to storage systems like GCS, HDFS, or S3.
>> 
>> - Write:
>> Integrating with messaging systems like Pub/Sub to collect user behavior 
>> data, generating AI-powered messages based on these behaviors, and then 
>> using MailIO.write to compose and send emails.
>> 
>> I haven't considered implementing Unbounded Read yet.
>> 
>> I'm wondering if there might be other valuable use cases that I haven't 
>> thought of?
>> 
>> Thank you.