Kafka could be a good fit for transporting large documents, provided that you use compression and tune the consumer fetch buffer correctly. Since Kafka does not stream-read large messages, you need at least n MB of memory available to read an n MB message. In practice you will need more than that, since multiple consumer threads typically run on the same machine. To protect the consumer from running out of memory, we are adding a server-side config that controls the largest message size the server will accept.
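For reference, the knobs involved might look something like the following sketch. The property names here follow later Kafka releases (the broker-side limit described above eventually landed as message.max.bytes); the exact names and defaults vary by version, so check the docs for the release you run:

```properties
# Broker: largest message the server will accept (example limit of 10 MB)
message.max.bytes=10485760

# Consumer: fetch buffer size; must be at least as large as the largest
# message, or the consumer cannot make progress past it
fetch.message.max.bytes=10485760

# Producer: compress payloads so large documents are smaller on the wire
compression.codec=gzip
```

The key invariant is that the consumer fetch buffer is never smaller than the broker's maximum accepted message size.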
Thanks,
Neha

On Tue, Oct 16, 2012 at 7:51 AM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> We're considering using Kafka for transport of potentially large
> "documents" (think documents for full-text indexing, docs that can be as
> small as tweets or as large as PDF files, say 5MB).
>
> I'm wondering if Kafka is suitable for transporting potentially large
> documents or if there is something inherent in Kafka that makes it a poor
> choice for this use case?
>
> Thanks,
> Otis
> ----
> Performance Monitoring for Solr / ElasticSearch / HBase -
> http://sematext.com/spm