[jira] [Work logged] (BEAM-6756) Support lazy iterables in schemas

ASF GitHub Bot (Jira) Mon, 18 Nov 2019 07:15:26 -0800


     [ 
https://issues.apache.org/jira/browse/BEAM-6756?focusedWorklogId=345343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-345343
 ]


ASF GitHub Bot logged work on BEAM-6756:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Nov/19 15:14
            Start Date: 18/Nov/19 15:14
    Worklog Time Spent: 10m 
      Work Description: TheNeuralBit commented on pull request #10003: 
[BEAM-6756] Create Iterable type for Schema
URL: https://github.com/apache/beam/pull/10003#discussion_r347424284
 
 

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java
 ##########
 @@ -450,28 +469,28 @@ static int deepHashCodeForMap(
       return h;
     }
 
-    static boolean deepEqualsForList(List<Object> a, List<Object> b, Schema.FieldType elementType) {
+    static boolean deepEqualsForIterable(
+        Iterable<Object> a, Iterable<Object> b, Schema.FieldType elementType) {
       if (a == b) {
         return true;
       }
 
-      if (a.size() != b.size()) {
-        return false;
-      }
 
 Review comment:
   It would be nice to keep this constant-time short circuit for the array 
case. Maybe you could still have a `deepEqualsForList` that does the size check 
and then defers to `deepEqualsForIterable`?
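
   A minimal sketch of the suggestion above, not the code from the pull 
request. To stay self-contained it drops the `Schema.FieldType elementType` 
parameter and assumes `Objects.deepEquals` as a stand-in for the schema-aware 
element comparison; `DeepEqualsSketch` is a hypothetical class name.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

final class DeepEqualsSketch {

  // List case: size() is constant time for typical lists (e.g. ArrayList),
  // so a mismatch can be rejected before any element is compared.
  static boolean deepEqualsForList(List<Object> a, List<Object> b) {
    if (a == b) {
      return true;
    }
    if (a.size() != b.size()) {
      return false;
    }
    return deepEqualsForIterable(a, b);
  }

  // General case: a lazy iterable has no size(), so walk both in lockstep.
  static boolean deepEqualsForIterable(Iterable<Object> a, Iterable<Object> b) {
    if (a == b) {
      return true;
    }
    Iterator<Object> bIter = b.iterator();
    for (Object currentA : a) {
      if (!bIter.hasNext()) {
        return false; // b ran out first
      }
      // Assumption: Objects.deepEquals replaces the per-field-type comparison.
      if (!Objects.deepEquals(currentA, bIter.next())) {
        return false;
      }
    }
    return !bIter.hasNext(); // equal only if b is exhausted as well
  }
}
```

   With this split, callers that hold a `List` keep the early exit on a size 
mismatch, while lazy iterables take the element-by-element path only.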
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 345343)
    Time Spent: 1.5h  (was: 1h 20m)

> Support lazy iterables in schemas
> ---------------------------------
>
>                 Key: BEAM-6756
>                 URL: https://issues.apache.org/jira/browse/BEAM-6756
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-java-core
>            Reporter: Reuven Lax
>            Assignee: Reuven Lax
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The iterables returned by GroupByKey and CoGroupByKey are lazy; this allows a 
> runner to page data into memory if the full iterable is too large. We 
> currently don't support this in Schemas, so the Schema Group and CoGroup 
> transforms materialize all data into memory. We should add support for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
