[ https://issues.apache.org/jira/browse/SPARK-11698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003478#comment-15003478 ]

Liang-Chi Hsieh edited comment on SPARK-11698 at 11/13/15 3:17 AM:
-------------------------------------------------------------------

Yes, but it is intentional. We don't want to increase data latency due to heavy 
data loading, so we need to ignore some data in each iteration and keep 
consuming the latest data. Because this data is not very important to us, we 
can drop part of it without a problem.
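To make the intent concrete, here is a minimal sketch of the offset-clamping 
idea; the function and parameter names below are illustrative assumptions, not 
identifiers from the actual patch:

{code:scala}
// Sketch only: clamp the batch's starting offset so that at most
// `maxMessages` of the newest records are read and older backlog is skipped.
// `lastConsumed`, `latestAvailable` and `maxMessages` are illustrative names.
def clampedFromOffset(lastConsumed: Long,
                      latestAvailable: Long,
                      maxMessages: Long): Long = {
  // Normally the next batch would start at `lastConsumed`; if the backlog is
  // larger than the rate limit allows, jump ahead and drop the excess.
  math.max(lastConsumed, latestAvailable - maxMessages)
}

// Example: 10,000 messages behind but the limit only allows 1,000 per batch,
// so 9,000 old messages are intentionally ignored.
val from = clampedFromOffset(lastConsumed = 0L,
                             latestAvailable = 10000L,
                             maxMessages = 1000L)  // from == 9000
{code}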


was (Author: viirya):
Yes, but it is intentional. We don't want to increase data latency due to heavy 
data loading, so we need to ignore some data in each iteration and keep 
consuming the latest data.

> Add option to ignore kafka messages that are out of limit rate
> --------------------------------------------------------------
>
>                 Key: SPARK-11698
>                 URL: https://issues.apache.org/jira/browse/SPARK-11698
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Liang-Chi Hsieh
>
> With spark.streaming.kafka.maxRatePerPartition, we can control the max rate 
> limit. However, we cannot ignore the messages that exceed the limit; they 
> will be consumed in the next iteration. We have a use case in which we need 
> to ignore these messages and process the latest messages in the next 
> iteration.
> In other words, we simply want to consume part of the messages in each 
> iteration and ignore the remaining messages that are not consumed.
> We add an option for this purpose.
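
As a rough illustration of how this would be used from the driver, the first 
key below is the existing rate-limit setting; the second key is purely 
hypothetical, since the proposed option is not named in this ticket and the 
real name, if merged, may differ:

{code:scala}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("kafka-rate-limit-example")
  // Existing setting: cap the number of messages read per Kafka partition
  // per second in each batch.
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
  // Hypothetical key standing in for the option proposed in this ticket
  // (drop messages beyond the rate limit instead of deferring them to the
  // next batch); not an actual Spark configuration property.
  .set("spark.streaming.kafka.ignoreMessagesOverRateLimit", "true")
{code}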



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
