[ 
https://issues.apache.org/jira/browse/SAMZA-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900544#comment-13900544
 ] 

Jakob Homan commented on SAMZA-146:
-----------------------------------

In our production jobs, which consume roughly 2,500 topic partitions across 
four containers, we see throughput drop by an order of magnitude.  The drop is 
most evident when most of the topics are low volume compared to a few 
high-volume topics.  Specifically,
{code:title=SystemConsumers.scala|borderStyle=solid}
  def choose = {
    // ... contents elided
    refresh
    envelopeFromChooser
  }

  private def refresh {
    debug("Refreshing chooser with new messages.")

    // Poll every system for new messages.
    consumers.keys.foreach(poll(_))

    // Update the chooser.
    neededByChooser.foreach(systemStreamPartition =>
      // If we have messages for a stream that the chooser needs, then update.
      if (fetchMap(systemStreamPartition).intValue < maxMsgsPerStreamPartition) {
        chooser.update(unprocessedMessages(systemStreamPartition).dequeue)
        updateFetchMap(systemStreamPartition)
        neededByChooser -= systemStreamPartition
      })
  }
{code}
Having every consumer poll for new messages is expensive relative to the 
single call to choose (and the subsequent delivery of one message to 
process).  Instrumenting these calls shows that more than 99% of the calls to 
poll return nothing.
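
To illustrate how an empty-poll ratio like the one above could be measured, here is a minimal sketch.  It is not the instrumentation used in the production jobs: PollStats, PollStatsDemo, and the simulated workload (100 partitions, only one of which ever has data) are hypothetical names and numbers chosen for illustration, and nothing here is Samza API.

```scala
// Hypothetical counter for poll outcomes; not part of Samza.
final class PollStats {
  private var total = 0L
  private var empty = 0L

  // Record one poll call and how many messages it returned.
  def record(messagesReturned: Int): Unit = {
    total += 1
    if (messagesReturned == 0) empty += 1
  }

  def totalPolls: Long = total
  def emptyPolls: Long = empty

  // Fraction of polls that returned nothing; 0.0 if nothing was recorded.
  def emptyRatio: Double =
    if (total == 0) 0.0 else empty.toDouble / total
}

object PollStatsDemo {
  // Wrap any poll function so that every call is counted.
  def instrumented[K](stats: PollStats)(poll: K => Seq[String]): K => Seq[String] =
    key => {
      val messages = poll(key)
      stats.record(messages.size)
      messages
    }

  // Simulated workload (an assumption): 100 partitions, only partition 0
  // ever has a message waiting, and every partition is polled each round.
  def run(rounds: Int = 10, partitions: Int = 100): PollStats = {
    val stats = new PollStats
    val rawPoll: Int => Seq[String] =
      partition => if (partition == 0) Seq("message") else Seq.empty
    val poll = instrumented(stats)(rawPoll)

    for (_ <- 0 until rounds; partition <- 0 until partitions) poll(partition)
    stats
  }
}
```

With the default arguments, PollStatsDemo.run() records 1000 polls of which 990 are empty.  Polling every partition on every round makes the wasted work grow with the partition count even though the useful work stays constant.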

> Throughput degrades unreasonably as the number of topic-partitions increases
> ----------------------------------------------------------------------------
>
>                 Key: SAMZA-146
>                 URL: https://issues.apache.org/jira/browse/SAMZA-146
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.7.0
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>             Fix For: 0.7.0
>
>
> Currently we poll each stream partition in need of messages before each call 
> to process.  For jobs with a large number of partitions, and particularly 
> when those partitions are low volume, this dramatically impacts performance.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
