I think we cannot change (documented) semantics without any notice. That
could cause issues for upgraded pipelines (without proper testing, but
can we rely on that?). I'd therefore suggest to:
a) deprecate computeIfAbsent()
b) create compute() as a new method and temporary replacement for
computeIfAbsent()
c) after several (3, 4?) releases remove computeIfAbsent()
d) after next several releases return it back with changed semantics
This way the unexpected change would affect only users skipping many
releases.
Jan
On 2/17/22 19:08, Kenneth Knowles wrote:
I think this is a design & documentation bug. The behavior does not
make sense. Deviating from Map#computeIfAbsent cannot really be fixed
by documentation in bold, even though that is definitely better than
nothing. Fixing this should not be considered a breaking change. It
should be identical to Map#computeIfAbsent but deferred, hence a
ReadableState.
Kenn
On Wed, Feb 16, 2022 at 2:12 PM Luke Cwik <[email protected]> wrote:
The least we could do is add documentation to the current
MapState.computeIfAbsent API in bold that this differs from Java's
Map#computeIfAbsent since it returns the prior value and not the
new value.
You can add a computeIfAbsent2(...) and deprecate the old one
specifying the difference in the API.
Adding compute() would also be nice but would add complexity for
users who only want the computeIfAbsent2() implementation.
On Wed, Feb 16, 2022 at 1:01 AM Jan Lukavský <[email protected]> wrote:
That behavior is documented, so if that is a bug, I suppose it
is a design bug [1]. I'm not sure how to fix that, because we
cannot simply change the semantics without changing API, that
could result in user-code bugs. If I understand it correctly,
the only reason for not returning the actual value, but a
ReadableState is to enable calling readLater() to batch
multiple gets to single call. That makes sense and is sort of
consistent with the other usages of state in general. But
makes it impossible to return the actual value V instead of
ReadableState<V>.
Maybe we could deprecate 'computeIfAbsent()' and introduce
'compute()' with the JDK-compliant semantics? It is not the
same, though, but should be more practical than the current
version of 'computeIfAbsent', which enforces various hacks to
actually retrieve the computed (that is the current!) value.
WDYT?
Jan
[1]
https://beam.apache.org/releases/javadoc/2.36.0/org/apache/beam/sdk/state/MapState.html#computeIfAbsent-K-java.util.function.Function-
On 2/16/22 02:10, Reuven Lax wrote:
Sounds like a bug, probably related to the bug mentioned in
that thread.
On Fri, Feb 11, 2022 at 8:58 AM <[email protected]> wrote:
Yes, exactly, it returns null. The Java contract is to
return the existing value or the new (computed value)
instead.
Dne 11. 2. 2022 16:24 napsal uživatel Reuven Lax
<[email protected]>:
I'm not sure I understand what you mean. Do you mean
it always returns null? Since computeIfAbsent doesn't
modify the value unless it is not present, I'm not
sure what the "old value" means, unless you mean that
it returns null.
On Fri, Feb 11, 2022 at 4:54 AM Jan Lukavský
<[email protected]> wrote:
Hi,
I just found another weirdness in the MapState
API - method
computeIfAbsent returns the _old_ value
associated with the key. That is
inconsistent with Java API and frankly with what
users should expect
when using this method. The semantics should be
"retrieve current value,
if not present associate the key with this one
and return me the current
value". Is this behavior intentional? What is the
motivation? In
combination with the previous thread [1] that
states, that the value is
not updated until read() of the state returned
from computeIfAbsent is
called this seems to force users to either fetch
the state twice or do
various workarounds.
Jan
[1]
https://lists.apache.org/thread/mp979twlgm76gjg014f5r0k2p0nlcj2q