[ https://issues.apache.org/jira/browse/PIG-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620710#comment-16620710 ]
Koji Noguchi commented on PIG-5357:
-----------------------------------

{code:title=DistinctDataBag.java}
+    public DistinctDataBag(Set<Tuple> tuples) {
+        mContents = tuples;
{code}
I wasn't sure if this would work given the way we hardcode the HashSet type later at
{code:title=DistinctDataBag.java}
        // If this is the first read, we need to sort the data.
        synchronized (mContents) {
            if (mContents instanceof HashSet) {
{code}
As to whether we want to touch BagFactory at this point, I'll defer it to [~rohini] / [~daijy].

> BagFactory interface should support creating a distinct bag from a set
> ----------------------------------------------------------------------
>
>                 Key: PIG-5357
>                 URL: https://issues.apache.org/jira/browse/PIG-5357
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jacob Tolar
>            Priority: Minor
>         Attachments: PIG-5357-1.patch
>
>
> It would be nice if BagFactory supported creating a distinct bag from a set
> of tuples, similar to:
> {code:java}
> newDefaultBag(List<Tuple> listOfTuples);
> {code}
> [https://github.com/apache/pig/blob/trunk/src/org/apache/pig/data/BagFactory.java]
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
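
To make the concern in the comment about the `instanceof HashSet` check concrete, here is a minimal, self-contained sketch. It is not Pig code: `DistinctBagSketch`, `prepareForRead()`, and the `copyIntoHashSet` flag are hypothetical names, and the first-read step is simplified to a plain in-memory sort. It only illustrates that a constructor which stores an arbitrary `Set` as-is may skip a branch gated on the concrete `HashSet` type, and that copying into a `HashSet` is one possible way to avoid that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a distinct bag; NOT Pig's DistinctDataBag.
class DistinctBagSketch<T extends Comparable<T>> {
    private Collection<T> contents;

    // copyIntoHashSet == false: keep the caller's set as-is, as in the patch excerpt.
    // copyIntoHashSet == true: defensive copy, so the concrete type is always HashSet.
    DistinctBagSketch(Set<T> tuples, boolean copyIntoHashSet) {
        this.contents = copyIntoHashSet ? new HashSet<>(tuples) : tuples;
    }

    // Simplified first-read step, gated on the concrete HashSet type.
    // Returns true if the sort branch ran, false if it was skipped.
    synchronized boolean prepareForRead() {
        if (contents instanceof HashSet) {
            List<T> sorted = new ArrayList<>(contents);
            Collections.sort(sorted);
            contents = sorted;
            return true;
        }
        return false;
    }

    Collection<T> contents() {
        return contents;
    }

    public static void main(String[] args) {
        // An unmodifiable wrapper is a Set, but not an instance of HashSet.
        Set<Integer> wrapped =
                Collections.unmodifiableSet(new HashSet<>(Arrays.asList(3, 1, 2)));

        DistinctBagSketch<Integer> asIs = new DistinctBagSketch<>(wrapped, false);
        System.out.println("stored as-is, sort branch ran: " + asIs.prepareForRead());

        DistinctBagSketch<Integer> copied = new DistinctBagSketch<>(wrapped, true);
        System.out.println("copied to HashSet, sort branch ran: " + copied.prepareForRead());
    }
}
```

Note that `LinkedHashSet` extends `HashSet`, so it would pass the check; sets such as `TreeSet` or the unmodifiable wrapper above would not. Whether the real class should copy the set or relax the type check is a design decision for the committers.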