[ 
https://issues.apache.org/jira/browse/BEAM-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2699:
----------------------------------

This Jira ticket has a pull request attached to it, but is still open. Did the 
pull request resolve the issue? If so, could you please mark it resolved? This 
will help the project have a clear view of its open issues.

> AppliedPTransform is used as a key in hashmaps but PTransform is not 
> hashable/equality-comparable
> -------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-2699
>                 URL: https://issues.apache.org/jira/browse/BEAM-2699
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-core
>            Reporter: Eugene Kirpichov
>            Priority: P3
>
> There's plenty of occurrences in runners-core of Map or BiMap where the key 
> is an AppliedPTransform.
> However, PTransform does not advertise that it is required to implement 
> equals/hashCode, and some transforms can't do it properly anyway - for 
> example, transforms that capture a ValueProvider which is also not 
> hashable/eq-comparable. I'm surprised that things aren't already very broken 
> because of this.
> Fundamentally, I don't see why we should ever compare two PTransform's for 
> equality.
> I looked at the code and wondered "can AppliedPTransform simply be 
> identity-hashable", but right now the answer is no because we can create an 
> AppliedPTransform for the same transform applied to the same thing multiple 
> times.
> Fixing that appears to be not very easy, but definitely possible. Ideally 
> TransformHierarchy.Node would just know its AppliedPTransform, however a Node 
> can be constructed when there's yet no Pipeline. Suppose there's gotta be 
> some way to propagate a Pipeline into Node.finishSpecifying() (which should 
> be called exactly once on the Node, and this should be enforced), and have 
> finishSpecifying() return the AppliedPTransform, and have the caller use that 
> instead of potentially repeatedly calling .toAppliedPTransform() on the same 
> Node.
> [~kenn] is on vacation but perhaps [~tgroh] can help with this meanwhile?
> CC: [~reuvenlax]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to