Hi,

It should be possible to deploy a single Flink cluster across geo-distributed nodes, but Flink currently offers no optimizations for that use case.

AFAIK, the general pattern for dealing with geographically distributed data sources right now is to replicate the data across clusters so that it all ends up in the same central destination before a processing framework such as Flink handles it. Good designs for multi-cluster data replication across, say, multiple Kafka clusters would be out of scope for this mailing list, though. Some quick googling led me to slides such as [1], but I'm sure there are more resources out there.

Cheers,
Gordon

[1] https://www.slideshare.net/ConfluentInc/common-patterns-of-multi-datacenter-architectures-with-apache-kafka

On Wed, Jun 27, 2018 at 3:08 AM Stephen <bahuash...@gmail.com> wrote:

> Hi,
> Can Flink be deployed in a geo-distributed environment instead of being in
> local clusters?
> As far as I know, raw data should be moved to local cloud environment or
> local clusters before Flink handle it. Consider this situation where data
> sources are on different areas which might be cross different countries
> that moving data with wlan is slow and expensive. How to solve this
> problem? Is there solution for this now?
>
> Thanks.
>