[ 
https://issues.apache.org/jira/browse/BEAM-12754?focusedWorklogId=637548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-637548
 ]

ASF GitHub Bot logged work on BEAM-12754:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Aug/21 21:11
            Start Date: 12/Aug/21 21:11
    Worklog Time Spent: 10m 
      Work Description: steveniemitz commented on a change in pull request #15327:
URL: https://github.com/apache/beam/pull/15327#discussion_r688087440



##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java
##########
@@ -316,27 +318,43 @@ static void encodeDelegate(
 
       // Encode the field count. This allows us to handle compatible schema changes.
       VAR_INT_CODER.encode(value.getFieldCount(), outputStream);
-      // Encode a bitmap for the null fields to save having to encode a bunch of nulls.
-      NULL_LIST_CODER.encode(scanNullFields(value, hasNullableFields), outputStream);
-      for (int encodingPos = 0; encodingPos < value.getFieldCount(); ++encodingPos) {
-        @Nullable Object fieldValue = value.getValue(encodingPosToIndex[encodingPos]);
-        if (fieldValue != null) {
-          coders[encodingPos].encode(fieldValue, outputStream);
+
+      if (hasNullableFields) {
+        // If the row has null fields, extract the values out once so that both scanNullFields and
+        // the encoding can share it and avoid having to extract them twice.
+
+        List<Object> fieldValues = value.getValues();
+        // Encode a bitmap for the null fields to save having to encode a bunch of nulls.
+        NULL_LIST_CODER.encode(scanNullFields(fieldValues), outputStream);
+        for (int encodingPos = 0; encodingPos < fieldValues.size(); ++encodingPos) {
+          @Nullable Object fieldValue = fieldValues.get(encodingPosToIndex[encodingPos]);

Review comment:
       Ugh, well, that's very confusing. I'll change this to copy the values into an 
array using `getValue` instead.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 637548)
    Time Spent: 0.5h  (was: 20m)

> RowCoderGenerator calls getValue multiple times
> -----------------------------------------------
>
>                 Key: BEAM-12754
>                 URL: https://issues.apache.org/jira/browse/BEAM-12754
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>    Affects Versions: 2.31.0
>            Reporter: Steve Niemitz
>            Assignee: Steve Niemitz
>            Priority: P2
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> RowCoderGenerator.encodeDelegate calls getValue twice for each field on a 
> row: once in scanNullFields to check whether it is null, and once to get 
> the value to be encoded. 
> If getValue is expensive (for example, it has to recursively adapt a type to 
> a Beam Row), this causes unnecessary extra work.
> Instead, we could call value.getValues to get all values once, then pass them 
> to scanNullFields and reuse them when encoding the values.
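
The idea in the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not Beam's actual code: `CountingRow`, `encodeCallingGetValueTwice`, and `encodeWithSharedValues` are hypothetical stand-ins, the "encoding" is just string concatenation, and encoding positions are assumed to equal field indexes.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Illustrative sketch only; none of these names come from Beam. */
final class SingleExtractionSketch {

  /** Hypothetical stand-in for a row whose per-field access is expensive. */
  static final class CountingRow {
    private final Object[] fields;
    int conversions = 0; // how many times a field was materialized

    CountingRow(Object... fields) {
      this.fields = fields;
    }

    int getFieldCount() {
      return fields.length;
    }

    Object getValue(int index) {
      conversions++; // pretend this is a costly recursive conversion
      return fields[index];
    }

    List<Object> getValues() {
      List<Object> values = new ArrayList<>(fields.length);
      for (int i = 0; i < fields.length; i++) {
        values.add(getValue(i)); // each field is materialized exactly once
      }
      return values;
    }
  }

  /** Before: one getValue call for the null scan, another for encoding. */
  static String encodeCallingGetValueTwice(CountingRow row) {
    BitSet nulls = new BitSet(row.getFieldCount());
    for (int i = 0; i < row.getFieldCount(); i++) {
      if (row.getValue(i) == null) {
        nulls.set(i);
      }
    }
    StringBuilder out = new StringBuilder(nulls.toString());
    for (int i = 0; i < row.getFieldCount(); i++) {
      Object value = row.getValue(i);
      if (value != null) {
        out.append('|').append(value);
      }
    }
    return out.toString();
  }

  /** After: extract the values once and share them between both passes. */
  static String encodeWithSharedValues(CountingRow row) {
    List<Object> values = row.getValues();
    BitSet nulls = new BitSet(values.size());
    for (int i = 0; i < values.size(); i++) {
      if (values.get(i) == null) {
        nulls.set(i);
      }
    }
    StringBuilder out = new StringBuilder(nulls.toString());
    for (Object value : values) {
      if (value != null) {
        out.append('|').append(value);
      }
    }
    return out.toString();
  }
}
```

Both methods produce the same output for the same row; the difference is that the second materializes each field once instead of twice. Whether `getValues` on a real Row returns the same objects as per-field `getValue` calls is a separate question that this sketch does not model.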



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
