Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r209039505
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/ContinuousReadSupport.java
---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2.reader.streaming;
+
+import org.apache.spark.annotation.InterfaceStability;
+import org.apache.spark.sql.execution.streaming.BaseStreamingSource;
+import org.apache.spark.sql.sources.v2.reader.InputPartition;
+import org.apache.spark.sql.sources.v2.reader.ScanConfig;
+import org.apache.spark.sql.sources.v2.reader.ScanConfigBuilder;
+
+/**
+ * An interface that defines how to scan data from a data source for
+ * continuous streaming processing.
+ *
+ * The execution engine will create an instance of this interface at the
+ * start of a streaming query, then call {@link #newScanConfigBuilder(Offset)}
+ * and create an instance of {@link ScanConfig} for the duration of the
+ * streaming query or until {@link #needsReconfiguration(ScanConfig)} is true.
+ * The {@link ScanConfig} will be used to create input partitions and a reader
+ * factory to process data for its duration. At the end, {@link #stop()} will
+ * be called when the streaming execution is completed. Note that a single
+ * query may have multiple executions due to restarts or failure recovery.
+ */
+@InterfaceStability.Evolving
+public interface ContinuousReadSupport extends StreamingReadSupport, BaseStreamingSource {
+
+ /**
+ * Returns a builder of {@link ScanConfig}. The builder can take some
+ * query-specific information to do operator pushdown, take streaming
+ * offsets, etc., and keep this information in the created
+ * {@link ScanConfig}.
+ *
+ * This is the first step of the data scan. All other methods in
+ * {@link ContinuousReadSupport} need to take {@link ScanConfig} as an input.
+ *
+ * If this method fails (by throwing an exception), the action will fail
+ * and no Spark job will be submitted.
+ */
+ ScanConfigBuilder newScanConfigBuilder(Offset start);
+
+ /**
+ * Returns a factory to produce {@link ContinuousPartitionReader}s for
+ * {@link InputPartition}s.
--- End diff --
Nit: There are still several cases of this javadoc style problem.
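For reference, the lifecycle the javadoc describes (build a `ScanConfig` at query start, rebuild it whenever `needsReconfiguration` returns true, call `stop()` at the end) can be sketched with toy stand-ins. Everything below — `Offset`, `ScanConfig`, `ScanConfigBuilder`, `ToyContinuousSource` — is a hypothetical simplification so the sketch compiles on its own, not the real Spark classes:

```java
// Hypothetical stand-ins for the v2 streaming interfaces in the diff,
// kept minimal so this sketch is self-contained.
interface Offset {}
interface ScanConfig { Offset start(); }
interface ScanConfigBuilder { ScanConfig build(); }

// A toy source illustrating the lifecycle from the javadoc:
// newScanConfigBuilder(start) -> build() -> scan until
// needsReconfiguration(config) is true -> rebuild -> ... -> stop().
class ToyContinuousSource {
    private boolean stopped = false;
    private int epoch = 0;

    ScanConfigBuilder newScanConfigBuilder(Offset start) {
        // The builder just captures the start offset here; a real source
        // would also carry pushed-down operators, schema, etc.
        return () -> new ScanConfig() {
            public Offset start() { return start; }
        };
    }

    // Pretend the source wants a fresh ScanConfig every third epoch.
    boolean needsReconfiguration(ScanConfig config) {
        return ++epoch % 3 == 0;
    }

    void stop() { stopped = true; }
    boolean isStopped() { return stopped; }
}

public class LifecycleSketch {
    public static void main(String[] args) {
        ToyContinuousSource source = new ToyContinuousSource();
        ScanConfig config = source.newScanConfigBuilder(new Offset() {}).build();
        int reconfigurations = 0;
        // Simulate six epochs of the engine's driving loop.
        for (int i = 0; i < 6; i++) {
            if (source.needsReconfiguration(config)) {
                config = source.newScanConfigBuilder(config.start()).build();
                reconfigurations++;
            }
        }
        source.stop();
        System.out.println("reconfigurations=" + reconfigurations); // prints reconfigurations=2
        System.out.println("stopped=" + source.isStopped());        // prints stopped=true
    }
}
```

The point of the sketch is only the ordering of calls: all scan-related methods take the `ScanConfig` produced by the builder, which is why the javadoc calls `newScanConfigBuilder` the first step of the scan.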
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]