One distinction here is the difference between the URN for a provider /
transform, and the class name in Java.

We should have a standard for both, but they are distinct.
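
To make the distinction concrete, here is a rough sketch that keeps the two names side by side and validates each against its own convention. Everything in it is illustrative: the ProviderNames class, the example URN, and both regular expressions are assumptions of mine. The URN pattern assumes the beam:transform:<org>:<functionality>:<version> shape suggested in the programming guide, and the class-name pattern is just the proposal quoted below.

```java
import java.util.regex.Pattern;

/** Hypothetical holder pairing the two distinct names of one provider. */
final class ProviderNames {
    // Assumed URN shape: beam:transform:<org>:<functionality>:<version>.
    private static final Pattern URN =
        Pattern.compile("beam:transform:[a-z][a-z0-9_.]*:[a-z][a-z0-9_]*:v[0-9]+");

    // Class-name shape taken from the proposal under discussion.
    private static final Pattern CLASS_NAME =
        Pattern.compile("[A-Z][A-Za-z0-9]*(Read|Write)SchemaTransformProvider");

    final String urn;           // language-agnostic identifier
    final String javaClassName; // Java-only implementation name

    ProviderNames(String urn, String javaClassName) {
        if (!isValidUrn(urn)) {
            throw new IllegalArgumentException("unexpected URN: " + urn);
        }
        if (!isValidClassName(javaClassName)) {
            throw new IllegalArgumentException("unexpected class name: " + javaClassName);
        }
        this.urn = urn;
        this.javaClassName = javaClassName;
    }

    static boolean isValidUrn(String candidate) {
        return URN.matcher(candidate).matches();
    }

    static boolean isValidClassName(String candidate) {
        return CLASS_NAME.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // The URN below is a made-up example, not an identifier Beam registers.
        ProviderNames names = new ProviderNames(
            "beam:transform:org.apache.beam:bigquery_write:v1",
            "BigQueryWriteSchemaTransformProvider");
        System.out.println(names.urn + " -> " + names.javaClassName);
    }
}
```

The point of keeping two separate fields is that either convention can change without touching the other.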

On Tue, Nov 15, 2022 at 3:39 PM Chamikara Jayalath via dev <
dev@beam.apache.org> wrote:

>
>
> On Tue, Nov 15, 2022 at 11:50 AM Damon Douglas via dev <
> dev@beam.apache.org> wrote:
>
>> Hello Everyone,
>>
>> Do we like the following Java class naming convention for
>> SchemaTransformProviders [1]?  The proposal is:
>>
>> <IOName>(Read|Write)SchemaTransformProvider
>>
>>
>> *For those new to Beam, even if this is your first day, consider
>> yourself a welcome contributor to this conversation.  Below are
>> definitions/references and a suggested learning guide to help you
>> understand this email.*
>>
>> Explanation
>>
>> The <IOName> identifies the Beam I/O [2] and Read or Write identifies a
>> read or write PTransform, respectively.
>>
>
> Schema-aware transforms are not restricted to I/Os. An arbitrary transform
> can be a SchemaTransform.  Also, the Read/Write designation does not map
> onto an arbitrary transform. Should we try to make this more generic?
>
> Also, what is probably more important is that the identifier of the
> SchemaTransformProvider is unique, not the class name (the latter is
> guaranteed to be unique if we follow the Java package naming guidelines).
>
> FWIW, we came up with a similar generic URN naming scheme for
> cross-language transforms:
> https://beam.apache.org/documentation/programming-guide/#1314-defining-a-urn
>
> Thanks,
> Cham
>
>
>> For example, a SchemaTransformProvider [1] for BigQueryIO.Write[7]
>> would be named:
>>
>> BigQueryWriteSchemaTransformProvider
>>
>>
>> And a SchemaTransformProvider for PubsubIO.Read[8] would be named:
>>
>> PubsubReadSchemaTransformProvider
>>
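
The two examples above follow mechanically from the proposed pattern. As a small sketch of that rule only, the helper below composes a class name from an I/O name and a direction. ProviderClassNames, Direction, and providerClassName are hypothetical names made up for illustration and are not part of Beam.

```java
/** Hypothetical helper that applies the proposed naming pattern. */
final class ProviderClassNames {
    /** The two directions the proposal allows. */
    enum Direction {
        READ("Read"),
        WRITE("Write");

        final String label;

        Direction(String label) {
            this.label = label;
        }
    }

    /** Builds <IOName>(Read|Write)SchemaTransformProvider. */
    static String providerClassName(String ioName, Direction direction) {
        if (ioName == null || ioName.isEmpty()
                || !Character.isUpperCase(ioName.charAt(0))) {
            throw new IllegalArgumentException(
                "I/O name must start with an upper-case letter: " + ioName);
        }
        return ioName + direction.label + "SchemaTransformProvider";
    }

    public static void main(String[] args) {
        System.out.println(providerClassName("BigQuery", Direction.WRITE));
        System.out.println(providerClassName("Pubsub", Direction.READ));
    }
}
```

Note that the helper has no way to name a transform that is neither a read nor a write, which is the limitation raised earlier in this thread.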
>>
>> Definitions/References
>>
>> [1] *SchemaTransformProvider*: A way for us to instantiate Beam IO
>> transforms using a language-agnostic configuration.
>> SchemaTransformProvider builds a SchemaTransform[3] from a Beam Row[4] that
>> functions as the configuration of that SchemaTransform.
>>
>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/transforms/SchemaTransformProvider.html
>>
>> [2] *Beam I/O*: PTransform for reading from or writing to sources and
>> sinks.
>> https://beam.apache.org/documentation/programming-guide/#pipeline-io
>>
>> [3] *SchemaTransform*: An interface containing a buildTransform method
>> that returns a PCollectionRowTuple[6] to PCollectionRowTuple PTransform.
>>
>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/transforms/SchemaTransform.html
>>
>> [4] *Row*: A Beam Row is a generic element of data whose properties are
>> defined by a Schema[5].
>>
>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/values/Row.html
>>
>> [5] *Schema*: A description of expected field names and their data types.
>>
>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/Schema.html
>>
>> [6] *PCollectionRowTuple*: A grouping of PCollections of Beam Rows[4]
>> into a single PInput or POutput, each tagged by a String name.
>>
>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/values/PCollectionRowTuple.html
>>
>> [7] *BigQueryIO.Write*: A PTransform for writing Beam elements to a
>> BigQuery table.
>>
>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.Write.html
>>
>> [8] *PubsubIO.Read*: A PTransform for reading from Pub/Sub and emitting
>> message payloads into a PCollection.
>>
>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.Read.html
>>
>> Suggested Learning/Reading to understand this email
>>
>> 1. https://beam.apache.org/documentation/programming-guide/#overview
>> 2. https://beam.apache.org/documentation/programming-guide/#transforms
>> (Up to 4.1)
>> 3. https://beam.apache.org/documentation/programming-guide/#pipeline-io
>> 4. https://beam.apache.org/documentation/programming-guide/#schemas
>>
>
