[ https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082004#comment-15082004 ]
xingang edited comment on KAFKA-3062 at 1/4/16 11:30 PM:
---------------------------------------------------------

Yes! Example:

A huge volume of data is produced to more than 60 partitions, and 15 consumers work on this data.

10 of them are latency-sensitive and do nearly real-time processing. It is better for them to read the data from the page cache. They can sometimes even tolerate a little data loss, since they only display results in real time.

5 of them generate reports from the data. It is fine for them to run hours behind, or even as daily jobs, since they do not need to show results quickly.

Now consider what happens when the 5 report consumers fall behind. They read from disk and fill the page cache with historical data, because they consume history N times faster than the producing rate. The 10 latency-sensitive consumers then suffer: once they fall even slightly behind, their reads always miss the page cache.

Thanks for your quick response!
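The cache-pollution effect described in this comment can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Kafka code: the page cache is modeled as a plain LRU over abstract "pages" (the real OS page cache uses a more elaborate policy), and the names `LRUCache` and `realtime_hit_rate`, the cache size, the lag values, and the catch-up rate are all hypothetical values chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Simplified stand-in for the OS page cache (assumption: pure LRU)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def touch(self, page):
        """Access a page. Return True on a hit; on a miss, load the page
        and evict the least recently used one if the cache is full."""
        hit = page in self.pages
        if hit:
            self.pages.move_to_end(page)
        else:
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)
        return hit


def realtime_hit_rate(ticks, cache_pages, realtime_lag, batch_pages_per_tick,
                      backlog=100_000):
    """Fraction of the real-time consumer's reads served from the cache.

    Pages 0..backlog-1 are old data that is only on disk. On each tick the
    producer appends one new page, the real-time consumer reads the page
    written `realtime_lag` ticks earlier, and a lagging report consumer
    reads `batch_pages_per_tick` old pages from the backlog.
    """
    cache = LRUCache(cache_pages)
    batch_pos = 0
    hits = reads = 0
    for t in range(ticks):
        head = backlog + t
        cache.touch(head)  # a freshly produced page lands in the cache
        if t >= realtime_lag:
            reads += 1
            hits += cache.touch(head - realtime_lag)
        for _ in range(batch_pages_per_tick):
            if batch_pos < backlog:  # still catching up on history
                cache.touch(batch_pos)
                batch_pos += 1
    return hits / reads


# No lagging report consumers: the real-time reader always hits the cache.
print(realtime_hit_rate(1000, 100, realtime_lag=5, batch_pages_per_tick=0))   # 1.0
# Report consumers catching up at 50 pages per tick: every read misses.
print(realtime_hit_rate(1000, 100, realtime_lag=5, batch_pages_per_tick=50))  # 0.0
# Same catch-up load, but the real-time reader is only 1 page behind.
print(realtime_hit_rate(1000, 100, realtime_lag=1, batch_pages_per_tick=50))  # 1.0
```

In this model the real-time reader is unaffected while it stays right at the head of the log, but as soon as it falls a few pages behind, the catch-up traffic has already evicted the pages it needs.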
> Read from kafka replication to get data likely Version based
> ------------------------------------------------------------
>
>                 Key: KAFKA-3062
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3062
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: xingang
>
> Kafka requires all reads to go to the leader for consistency.
> If reads could also be served from a replica, then for data with many
> consumers, the consumers that are not latency-sensitive but are sensitive
> to data loss could fetch their data from a replica. In that case they would
> not pollute the page cache for the other consumers, which are latency-sensitive.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)