Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/896#issuecomment-120302065
  
    > Looks like this change breaks the YARN integration. The YARN WordCount no 
longer works.
    
    Should be working again now.
    
    > It would be good if the accumulator update interval was configurable.
    > Edit: Is that the same value as the heartbeats?
    
    Yes, that was a design rationale to keep the message count low. We could 
only send the accumulators in every Nth heartbeat and let it be configurable.
    
    > The is a potential modification conflict: Drawing a snapshot for 
serialization and registering a new accumulator can lead to a 
ConcurrentModificationException in the drawing of the snapshot.
    
    I conducted tests with concurrent insertions and deletions and found that 
only concurrent removals cause ConcurrentModificationExceptions. Removals are 
not allowed for accumulators. Anyways, we could switch to a synchronized or 
copy on write hash map. If we do I would opt for the latter.
    
    > The naming of the accumulators refers sometimes to "flink vs. 
user-defined", and sometimes to "internal vs. external". Can we make this 
consistent? I actually like the "flink vs. user-defined" naming better.
    
    Then let's stick to the "flink vs. user-defined" naming scheme.
    
    > I think the code would be simpler is the registry simply always had a 
created map for internal and external accumulators. Also, a "reporter" object 
would help. 
    
    I agree that would be a nicer way of dealing with the API.
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to