[ https://issues.apache.org/jira/browse/SPARK-17813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573806#comment-15573806 ]
Cody Koeninger commented on SPARK-17813:
----------------------------------------

So issues to be worked out here (assuming we're still ignoring compacted topics):

maxOffsetsPerTrigger - how are these maximums distributed among partitions? What about skewed topics / partitions?

maxOffsetsPerTopicPartitionPerTrigger - (this isn't just hypothetical, e.g. SPARK-17510) If we do this, how is this configuration communicated?

{noformat}
option("maxOffsetsPerTopicPartitionPerTrigger", """{"topicFoo":{"0":600}, "topicBar":{"0":300, "1": 600}}""")
{noformat}

{noformat}
option("maxOffsetsPerTopicPerTrigger", """{"topicFoo": 600, "topicBar": 300}""")
{noformat}

> Maximum data per trigger
> ------------------------
>
>                 Key: SPARK-17813
>                 URL: https://issues.apache.org/jira/browse/SPARK-17813
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Michael Armbrust
>
> At any given point in a streaming query execution, we process all available
> data. This maximizes throughput at the cost of latency. We should add
> something similar to the {{maxFilesPerTrigger}} option available for files.
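
To make the two open questions in the comment concrete, here is a minimal Python sketch. It is illustrative only and is not how Spark implements (or will implement) these options. It assumes the per-partition option value is the JSON shape shown in the comment, and it picks one possible answer to the skew question: splitting a global cap in proportion to each partition's backlog. The helper names `parse_partition_caps` and `split_global_cap` are hypothetical.

```python
import json


def parse_partition_caps(option_value):
    """Parse a JSON option string shaped like {"topic": {"partition": cap}}
    into a flat {(topic, partition): cap} dict. Hypothetical helper."""
    raw = json.loads(option_value)
    return {
        (topic, int(partition)): int(cap)
        for topic, partitions in raw.items()
        for partition, cap in partitions.items()
    }


def split_global_cap(max_offsets, backlog):
    """Split a single global cap across partitions in proportion to backlog.

    backlog maps (topic, partition) -> number of unread offsets.
    Assumption: proportional sharing is just one possible policy.
    Floor division means the shares may sum to slightly less than
    max_offsets, and a partition with a tiny backlog can receive 0.
    """
    total = sum(backlog.values())
    if total <= max_offsets:
        # Everything fits within the cap, so no throttling is needed.
        return dict(backlog)
    return {tp: max_offsets * lag // total for tp, lag in backlog.items()}


caps = parse_partition_caps(
    '{"topicFoo":{"0":600}, "topicBar":{"0":300, "1": 600}}'
)
shares = split_global_cap(
    1000, {("topicFoo", 0): 9000, ("topicBar", 0): 1000}
)
```

With a proportional split, a skewed partition gets most of the budget and lightly loaded partitions are not starved of it entirely; an even split would instead let the skewed partition fall further behind on every trigger.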