The "FAQ" page has been changed by ChrisRiccomini:
https://wiki.apache.org/samza/FAQ?action=diff&rev1=4&rev2=5

  Note that the first two configurations also have system-level settings (i.e., systems.<your system>.samza.reset.offset and systems.<your system>.samza.offset.default).

+ == How should Samza be run on AWS?
+
+ From Gian Merlino:
+
+
+ We've been using Samza in production on AWS for a little over a month.
+ We're just using the YARN runner on a mostly stock Hadoop 2.4.0 cluster
+ (not EMR). Our experience is that c3s work well for the YARN instances and
+ i2s work well for the Kafka instances. Things have been pretty solid with
+ that setup.
+
+ For scaling up and scaling down YARN, we just terminate instances or add
+ instances, and this works pretty well. It can take a few minutes for the
+ cluster to realize a node has gone and respawn containers elsewhere.
+
+ We have a separate Kafka cluster just for Samza's use, different from our
+ main Kafka cluster. The main reason is that we wanted to isolate off the
+ disk and network load of state compactions and restores (we don't use
+ compacted topics in our main Kafka cluster, but we do use them with Samza,
+ and the extra load on Kafka can be substantial).
+
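
The separate-cluster setup described in the last paragraph could be expressed in a job's properties file roughly as follows. This is only a sketch, not the configuration Gian describes: the system names (kafka-main, kafka-samza), the hostnames, the store name (my-store), and the topic names are all hypothetical, and the exact consumer and producer connection keys depend on the Samza and Kafka versions in use. It is a fragment; a real job also needs the store's factory and serde settings.

```properties
# Hypothetical "kafka-main" system: the main Kafka cluster, used only for input.
systems.kafka-main.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka-main.consumer.zookeeper.connect=zk-main.example.com:2181

# Hypothetical "kafka-samza" system: the dedicated cluster that takes the
# changelog (compacted topic) and checkpoint traffic.
systems.kafka-samza.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka-samza.consumer.zookeeper.connect=zk-samza.example.com:2181
# Assumption: newer producer key; older releases used a different broker-list key.
systems.kafka-samza.producer.bootstrap.servers=kafka-samza.example.com:9092

# Read input from the main cluster.
task.inputs=kafka-main.page-views

# Write checkpoints and the store's changelog to the dedicated cluster, so
# state compaction and restore load stays off the main cluster.
task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=kafka-samza
stores.my-store.changelog=kafka-samza.my-store-changelog
```

With a layout like this, restoring a store after a container is respawned reads the changelog topic from the dedicated cluster only, which is the isolation the paragraph above is after.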
