Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135398622
  
    Okay, looking at the "zipWithIndex" code, here is what the actual problem is:
    
    Each function actually modifies the list in place by sorting it. The
    solution proposed here fixes that by making sure every function has its own
    copy of the list. That, btw, would have worked with a plain ArrayList as
    well. CopyOnWriteList seems a bit overkill.
    
    A nicer way to solve this is, IMHO, to use a broadcast variable
    initializer, which would guarantee that the list is sorted once (by the
    first task that accesses it), after which everyone shares the same sorted
    list.
      - Less memory consumption (not super critical, as we are talking about 
small lists)
      - Less work, since only one sort happens per TaskManager, rather than one 
sort per task.
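
    A rough sketch of what such an initializer could look like, written as
    self-contained Java so it runs without Flink on the classpath. The
    interface below is a local stand-in that mirrors the shape of Flink's
    BroadcastVariableInitializer; the class names, the (subtask index, count)
    element type, and the startOffset helper are assumptions for illustration,
    not the code from this pull request. In a real job the initializer would be
    passed to getRuntimeContext().getBroadcastVariableWithInitializer(...).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class SortedCountsSketch {

    /** Local stand-in mirroring the shape of Flink's BroadcastVariableInitializer. */
    interface BroadcastVariableInitializer<T, C> {
        C initializeBroadcastVariable(Iterable<T> data);
    }

    /**
     * Hypothetical initializer: copies the broadcast (subtaskIndex, count)
     * pairs into a fresh list and sorts that copy once by subtask index.
     * The result is read-only, so sharing it between tasks is safe.
     */
    static final class SortedCountsInitializer implements
            BroadcastVariableInitializer<Map.Entry<Integer, Long>, List<Map.Entry<Integer, Long>>> {

        @Override
        public List<Map.Entry<Integer, Long>> initializeBroadcastVariable(
                Iterable<Map.Entry<Integer, Long>> data) {
            List<Map.Entry<Integer, Long>> sorted = new ArrayList<>();
            for (Map.Entry<Integer, Long> entry : data) {
                sorted.add(entry);
            }
            sorted.sort(Map.Entry.comparingByKey());
            return Collections.unmodifiableList(sorted);
        }
    }

    /** Sums the counts of all subtasks with a smaller index than the given one. */
    static long startOffset(List<Map.Entry<Integer, Long>> sortedCounts, int subtaskIndex) {
        long offset = 0;
        for (Map.Entry<Integer, Long> entry : sortedCounts) {
            if (entry.getKey() >= subtaskIndex) {
                break;
            }
            offset += entry.getValue();
        }
        return offset;
    }

    public static void main(String[] args) {
        // Unsorted per-subtask element counts, as they might arrive via broadcast.
        List<Map.Entry<Integer, Long>> raw = new ArrayList<>();
        raw.add(new SimpleEntry<>(2, 5L));
        raw.add(new SimpleEntry<>(0, 3L));
        raw.add(new SimpleEntry<>(1, 4L));

        List<Map.Entry<Integer, Long>> sorted =
                new SortedCountsInitializer().initializeBroadcastVariable(raw);

        for (int subtask = 0; subtask < 3; subtask++) {
            System.out.println("subtask " + subtask + " starts at " + startOffset(sorted, subtask));
        }
    }
}
```

    Because the initializer returns a new, unmodifiable list, no task ever
    sorts or mutates the shared data, which is the property the per-task copy
    was trying to achieve.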

