VeronicaWasson commented on code in PR #31476:
URL: https://github.com/apache/beam/pull/31476#discussion_r1630177864
##########
sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/SolaceIO.java:
##########
@@ -0,0 +1,551 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.solace;
+
+import static org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkNotNull;
+import static org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState;
+
+import com.google.auto.value.AutoValue;
+import com.solacesystems.jcsmp.BytesXMLMessage;
+import com.solacesystems.jcsmp.Destination;
+import com.solacesystems.jcsmp.JCSMPFactory;
+import com.solacesystems.jcsmp.Queue;
+import com.solacesystems.jcsmp.Topic;
+import java.io.IOException;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.annotations.Internal;
+import org.apache.beam.sdk.coders.CannotProvideCoderException;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.io.solace.broker.SempClientFactory;
+import org.apache.beam.sdk.io.solace.broker.SessionService;
+import org.apache.beam.sdk.io.solace.broker.SessionServiceFactory;
+import org.apache.beam.sdk.io.solace.data.Solace;
+import org.apache.beam.sdk.io.solace.data.Solace.SolaceRecordMapper;
+import org.apache.beam.sdk.io.solace.data.SolaceRecordCoder;
+import org.apache.beam.sdk.io.solace.read.UnboundedSolaceSource;
+import org.apache.beam.sdk.schemas.NoSuchSchemaException;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A {@link PTransform} to read and write from/to Solace event broker.
+ *
+ * <p>Note: this API is beta and subject to change.
+ *
+ * <h2>Reading</h2>
+ *
+ * To read from Solace, use the {@link SolaceIO#read()} or {@link SolaceIO#read(TypeDescriptor,
+ * SerializableFunction, SerializableFunction)}.
+ *
+ * <h3>No-arg {@link SolaceIO#read()} top-level method</h3>
+ *
+ * <p>This method returns a PCollection of {@link Solace.Record} objects. It uses a default mapper
+ * ({@link SolaceRecordMapper#map(BytesXMLMessage)}) to map from the received {@link
+ * BytesXMLMessage} from Solace, to the {@link Solace.Record} objects.
+ *
+ * <p>By default, it also uses a {@link BytesXMLMessage#getSenderTimestamp()} for watermark
+ * estimation. This {@link SerializableFunction} can be overridden with {@link
+ * Read#withTimestampFn(SerializableFunction)} method.
+ *
+ * <p>When using this method, the Coders are inferred automatically.
+ *
+ * <h3>Advanced {@link SolaceIO#read(TypeDescriptor, SerializableFunction, SerializableFunction)}
+ * top-level method</h3>
+ *
+ * <p>With this method, the user can:
+ *
+ * <ul>
+ *   <li>specify custom output type of the PTransform (for example their own class consisting only
+ *       of the relevant fields, optimized for their use-case),
+ *   <li>create a custom mapping between {@link BytesXMLMessage} and their output type and
+ *   <li>specify what field to use for watermark estimation from their mapped field (for example, in
+ *       this method the user can use a field which is encoded in the payload as a timestamp, which
+ *       cannot be done with the {@link SolaceIO#read()} method.
+ * </ul>
+ *
+ * <h3>Reading from a queue ({@link Read#from(Solace.Queue)}} or a topic ({@link
+ * Read#from(Solace.Topic)})</h3>
+ *
+ * <p>Regardless of the top-level read method choice, the user can specify whether to read from a
+ * Queue - {@link Read#from(Solace.Queue)}, or a Topic {@link Read#from(Solace.Topic)}.
+ *
+ * <p>Note: when a user specifies to read from a Topic, the connector will create a matching Queue
+ * and a Subscription. The user must ensure that the SEMP API is reachable from the driver program
+ * and must provide credentials that have `write` permission to the <a
+ * href="https://docs.solace.com/Admin/SEMP/Using-SEMP.htm">SEMP Config API</a>. The created Queue
+ * will be non-exclusive. The Queue will not be deleted when the pipeline is terminated.
+ *
+ * <p>Note: If the user specifies to read from a Queue, <a
+ * href="https://beam.apache.org/documentation/programming-guide/#overview">the driver program</a>
+ * will execute a call to the SEMP API to check if the Queue is `exclusive` or `non-exclusive`. The
+ * user must ensure that the SEMP API is reachable from the driver program and provide credentials
+ * with `read` permission to the {@link Read#withSempClientFactory(SempClientFactory)}.
+ *
+ * <h3>Usage example</h3>
+ *
+ * <h4>The no-arg {@link SolaceIO#read()} method</h4>
+ *
+ * <p>The minimal example - reading from an existing Queue, using the no-arg {@link SolaceIO#read()}
+ * method, with all the default configuration options.
+ *
+ * <pre>{@code
+ * PCollection<Solace.Record> events =
+ *   pipeline.apply(
+ *     SolaceIO.read()
+ *         .from(Queue.fromName(options.getSolaceReadQueue()))
+ *         .withSempClientFactory(
+ *             BasicAuthSempClientFactory.builder()
+ *                 .host("http://" + options.getSolaceHost() + ":8080")
+ *                 .username(options.getSolaceUsername())
+ *                 .password(options.getSolacePassword())
+ *                 .vpnName(options.getSolaceVpnName())
+ *                 .build())
+ *         .withSessionServiceFactory(
+ *             BasicAuthJcsmpSessionServiceFactory.builder()
+ *                 .host(options.getSolaceHost())
+ *                 .username(options.getSolaceUsername())
+ *                 .password(options.getSolacePassword())
+ *                 .vpnName(options.getSolaceVpnName())
+ *                 .build()));
+ * }</pre>
+ *
+ * <h4>The advanced {@link SolaceIO#read(TypeDescriptor, SerializableFunction,
+ * SerializableFunction)} method</h4>
+ *
+ * <p>When using this method we can specify a custom output PCollection type and a custom timestamp

Review Comment:
   "you can specify" or "the user can specify"
