[ https://issues.apache.org/jira/browse/BEAM-5928?focusedWorklogId=162169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-162169 ]
ASF GitHub Bot logged work on BEAM-5928:
----------------------------------------
Author: ASF GitHub Bot
Created on: 03/Nov/18 01:06
Start Date: 03/Nov/18 01:06
Worklog Time Spent: 10m
Work Description: reuvenlax commented on a change in pull request #6927:
[BEAM-5928] Change hash map to concurrent map.
URL: https://github.com/apache/beam/pull/6927#discussion_r230540464
##########
File path:
sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java
##########
@@ -101,12 +101,12 @@
private static final Map<TypeName, StackManipulation> CODER_MAP;
// Cache for Coder class that are already generated.
- private static Map<UUID, Coder<Row>> generatedCoders = Maps.newHashMap();
+ private static Map<UUID, Coder<Row>> generatedCoders = Maps.newConcurrentMap();
Review comment:
In my experience, even in these cases it's best to use a lock or a
thread-safe data structure. If not, weird things can still go wrong (e.g. since
no memory fence is set, the compiler or the processor is free to reorder the
read and the write in such a way that breaks your assumptions).
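
As a rough illustration of the thread-safe alternative, here is a minimal sketch of a lazily populated cache built on `ConcurrentHashMap.computeIfAbsent`. It is not the code in this PR: `LazyCacheSketch`, `LazyCache`, `getOrCreate`, and `countFactoryCalls` are hypothetical names, and a `String` stands in for the generated `Coder<Row>` so the example runs without Beam on the classpath.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class LazyCacheSketch {
  // Hypothetical stand-in for the static generatedCoders cache; V would be Coder<Row>.
  static final class LazyCache<V> {
    private final Map<UUID, V> cache = new ConcurrentHashMap<>();

    V getOrCreate(UUID key, Function<UUID, V> factory) {
      // computeIfAbsent is atomic on ConcurrentHashMap: the factory runs at most once per key,
      // and every caller sees the fully constructed value.
      return cache.computeIfAbsent(key, factory);
    }
  }

  // Starts `threads` threads that all request the same key at once and
  // returns how many times the factory was actually invoked.
  static int countFactoryCalls(int threads) throws InterruptedException {
    LazyCache<String> cache = new LazyCache<>();
    UUID schemaId = UUID.randomUUID();
    AtomicInteger factoryCalls = new AtomicInteger();
    CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          start.await(); // hold all workers until released together
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        cache.getOrCreate(schemaId, id -> {
          factoryCalls.incrementAndGet();
          return "coder-for-" + id; // placeholder for an expensive generated coder
        });
      });
      workers[i].start();
    }
    start.countDown();
    for (Thread worker : workers) {
      worker.join();
    }
    return factoryCalls.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("factory calls: " + countFactoryCalls(8));
  }
}
```

With a plain `HashMap`, the same concurrent access pattern has no such guarantee: the map can be corrupted or throw, and readers may observe a partially published value.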
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 162169)
Time Spent: 0.5h (was: 20m)
> ConcurrentModificationException from RowCoderGenerator lazy caching
> -------------------------------------------------------------------
>
> Key: BEAM-5928
> URL: https://issues.apache.org/jira/browse/BEAM-5928
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Benson Tucker
> Assignee: Reuven Lax
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> h3. Summary:
> RowCoderGenerator caches a delegate Coder<Row> once encode or decode is
> exercised, but there is no API for caching this delegate eagerly.
> h3. Use Case:
> When creating several PCollections to perform distinct reads with the same
> schema, you might create one RowCoder.of(schema) before creating the list of
> PCollections / PCollectionList. However, once the pipeline begins and rows
> arrive for encoding, these pipelines will simultaneously try to cache a
> delegate coder for the row's schema.
> h3. Workaround:
> You can force the eager caching of the coder by exercising encode in the main
> application before creating PCollections using the RowCoder:
> {code:java}
> try {
>   myRowCoder.encode(null, null);
> } catch (IOException | NullPointerException e) {
>   // Ignored: this call exists only to force the delegate coder to be cached.
> }
> {code}
> h3. Context:
> I've only encountered this during development with the direct runner.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)