[ 
https://issues.apache.org/jira/browse/BEAM-14213?focusedWorklogId=753056&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-753056
 ]

ASF GitHub Bot logged work on BEAM-14213:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Apr/22 18:55
            Start Date: 05/Apr/22 18:55
    Worklog Time Spent: 10m 
      Work Description: TheNeuralBit commented on code in PR #17253:
URL: https://github.com/apache/beam/pull/17253#discussion_r843159960


##########
sdks/python/apache_beam/typehints/batch_test.py:
##########
@@ -0,0 +1,111 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Unit tests for the batched type-hint objects."""
+
+import unittest
+
+import numpy as np
+import pandas as pd
+from parameterized import parameterized
+from parameterized import parameterized_class
+
+from apache_beam.typehints import row_type
+from apache_beam.typehints.batch import BatchConverter
+from apache_beam.typehints.batch import N
+from apache_beam.typehints.batch import NumpyArray
+from apache_beam.typehints.typehints import check_constraint
+from apache_beam.typehints.typehints import validate_composite_type_param
+
+
+@parameterized_class(
+    [{
+        'batch_typehint': np.ndarray,
+        'element_typehint': np.int32,
+        'batch': np.array(range(100), np.int32)
+    },
+     {
+         'batch_typehint': NumpyArray[np.int64, (N, 10)],
+         'element_typehint': NumpyArray[np.int64, (10, )],
+         'batch': np.array([list(range(i, i + 10)) for i in range(100)],
+                           np.int64),
+     },
+     {
+         'batch_typehint': pd.DataFrame,
+         'element_typehint': row_type.RowTypeConstraint([
+             ('f_str', str), ('f_int64', np.int64), ('f_int32', np.int32)
+         ]),
+         'batch': pd.DataFrame({
+             'f_str': pd.Series(map(str, range(100)), dtype=pd.StringDtype()),
+             'f_int64': pd.Series(range(100), dtype=np.int64),
+             'f_int32': pd.Series(range(100), dtype=np.int32)
+         }),
+     }])
+class BatchTest(unittest.TestCase):
+  def setUp(self):
+    self.utils = BatchConverter.from_typehints(
+        element_type=self.element_typehint, batch_type=self.batch_typehint)
+
+  def equality_check(self, left, right):
+    if isinstance(left, np.ndarray) and isinstance(right, np.ndarray):
+      return np.array_equal(left, right)
+    elif isinstance(left, pd.DataFrame) and isinstance(right, pd.DataFrame):
+      return left.equals(right)

Review Comment:
   I backed out the pandas DataFrame BatchConverter (f909d92). The branch 
https://github.com/TheNeuralBit/beam/tree/batched-dofn-pandas has this change 
added back.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 753056)
    Time Spent: 50m  (was: 40m)

> Add support for Batched DoFns in the Python SDK
> -----------------------------------------------
>
>                 Key: BEAM-14213
>                 URL: https://issues.apache.org/jira/browse/BEAM-14213
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>            Priority: P2
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Add an implementation for https://s.apache.org/batched-dofns to the Python 
> SDK.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
