Github user HeartSaVioR commented on a diff in the pull request:
https://github.com/apache/spark/pull/21733#discussion_r206791325
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StatefulOperatorsHelperSuite.scala ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.sql.catalyst.expressions.{Attribute, SpecificInternalRow, UnsafeProjection, UnsafeRow}
+import org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection
+import org.apache.spark.sql.execution.streaming.StatefulOperatorsHelper.StreamingAggregationStateManager
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.sql.types.{IntegerType, StructField, StructType}
+
+class StatefulOperatorsHelperSuite extends StreamTest {
+ import TestMaterial._
+
+ test("StateManager v1 - get, put, iter") {
+ val stateManager = newStateManager(KEYS_ATTRIBUTES, OUTPUT_ATTRIBUTES, 1)
+
+ // in V1, input row is stored as value
+ testGetPutIterOnStateManager(stateManager, OUTPUT_ATTRIBUTES, TEST_ROW, TEST_KEY_ROW, TEST_ROW)
+ }
+
+ // ============================ StateManagerImplV2 ============================
+ test("StateManager v2 - get, put, iter") {
+ val stateManager = newStateManager(KEYS_ATTRIBUTES, OUTPUT_ATTRIBUTES, 2)
+
+ // in V2, row for values itself (excluding keys from input row) is stored as value
+ // so that stored value doesn't have key part, but state manager V2 will provide same output
+ // as V1 when getting row for key
+ testGetPutIterOnStateManager(stateManager, VALUES_ATTRIBUTES, TEST_ROW, TEST_KEY_ROW,
+ TEST_VALUE_ROW)
+ }
+
+ private def newStateManager(
+ keysAttributes: Seq[Attribute],
+ outputAttributes: Seq[Attribute],
--- End diff --
Yes. For StateManager, the `input row attributes` and `output attributes` are
actually the same, given how the StateStore*Exec operators work, so I just
picked one of the two names. I'm happy to rename the parameter if
`inputRowAttributes` makes it clearer which schema should be passed.
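
To make the point concrete, here is a rough sketch in plain Scala (no Spark
types) of why a single schema can describe both sides. Everything in it is a
hypothetical stand-in rather than code from this PR: `Row`, `SketchV1`,
`SketchV2` and `storedValue` are made up for illustration, and it assumes the
key columns come first in the input row. It only mirrors the behaviour
described in the test comments above: V1 stores the whole input row as the
value, V2 stores only the non-key columns, and both return the same row on
`get`.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration only; these are NOT the
// StreamingAggregationStateManager implementations from this PR.
object StateLayoutSketch {
  // A row is modelled as a plain vector of ints instead of an UnsafeRow.
  type Row = Vector[Int]

  // Assumption: the first `numKeys` columns of an input row are the grouping keys.
  val numKeys = 1

  def keyOf(input: Row): Row = input.take(numKeys)

  // V1-style layout: the whole input row is stored as the value.
  final class SketchV1 {
    private val store = mutable.Map.empty[Row, Row]
    def put(input: Row): Unit = store(keyOf(input)) = input
    def get(key: Row): Option[Row] = store.get(key)
    // Exposes what is physically stored, for comparison with V2.
    def storedValue(key: Row): Option[Row] = store.get(key)
  }

  // V2-style layout: only the non-key columns are stored, and the key is
  // joined back on read, so callers see the same row shape as with V1.
  final class SketchV2 {
    private val store = mutable.Map.empty[Row, Row]
    def put(input: Row): Unit = store(keyOf(input)) = input.drop(numKeys)
    def get(key: Row): Option[Row] = store.get(key).map(key ++ _)
    def storedValue(key: Row): Option[Row] = store.get(key)
  }
}
```

In this sketch the row passed to `put` and the row returned by `get` have the
same shape in both versions; only `storedValue` differs. That is the sense in
which either name describes the schema the caller has to pass.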
---