Micah Whitacre created CRUNCH-479:
-------------------------------------

             Summary: Writing to target with WriteMode.APPEND merges values into PCollection
                 Key: CRUNCH-479
                 URL: https://issues.apache.org/jira/browse/CRUNCH-479
             Project: Crunch
          Issue Type: Bug
          Components: Core
            Reporter: Micah Whitacre
            Assignee: Josh Wills


This was mentioned as part of CDK-617[1].  When a PCollection containing a set 
of values is written to a target with WriteMode.APPEND and then materialized, 
iterating over it returns not only the new values that were appended but also 
the values that already existed in the target.  This is surprising, as most 
users would expect the PCollection to contain only its original values.  One 
use case this breaks is a pipeline that wants to process only the new values 
rather than all of the existing data.
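
The difference between the observed and the expected behavior can be sketched 
with a small self-contained toy model.  This is not Crunch code: ToyTarget, 
ToyCollection, and their methods are hypothetical names invented for the 
illustration, and the idea that iteration is served from the whole target 
after the write is an assumption used only to reproduce the symptom described 
above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AppendMaterializeSketch {

    /** Hypothetical stand-in for an output location that already holds data. */
    static final class ToyTarget {
        final List<String> stored = new ArrayList<>();

        ToyTarget(List<String> existing) {
            stored.addAll(existing);
        }

        /** Models an append-style write: existing data is kept, new data is added. */
        void append(List<String> values) {
            stored.addAll(values);
        }
    }

    /** Hypothetical stand-in for a collection; not the Crunch PCollection API. */
    static final class ToyCollection {
        private final List<String> values;
        private ToyTarget writtenTo; // set once the collection has been written

        ToyCollection(List<String> values) {
            this.values = new ArrayList<>(values);
        }

        void writeAppend(ToyTarget target) {
            target.append(values);
            writtenTo = target;
        }

        /** Symptom from the report: after the write, everything in the target comes back. */
        List<String> materializeObserved() {
            return writtenTo == null
                    ? new ArrayList<>(values)
                    : new ArrayList<>(writtenTo.stored);
        }

        /** What most users would expect: only the collection's own values. */
        List<String> materializeExpected() {
            return new ArrayList<>(values);
        }
    }

    static List<String> observed() {
        ToyTarget target = new ToyTarget(Arrays.asList("old-1", "old-2"));
        ToyCollection fresh = new ToyCollection(Arrays.asList("new-1"));
        fresh.writeAppend(target);
        return fresh.materializeObserved();
    }

    static List<String> expected() {
        ToyTarget target = new ToyTarget(Arrays.asList("old-1", "old-2"));
        ToyCollection fresh = new ToyCollection(Arrays.asList("new-1"));
        fresh.writeAppend(target);
        return fresh.materializeExpected();
    }

    public static void main(String[] args) {
        System.out.println("observed: " + observed());
        System.out.println("expected: " + expected());
    }
}
```

In this model the observed list holds both the pre-existing and the appended 
values, while the expected list holds only the appended ones, which is the 
gap this issue asks to close.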

[1] - https://issues.cloudera.org/browse/CDK-671



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
