TheNeuralBit commented on code in PR #23224:
URL: https://github.com/apache/beam/pull/23224#discussion_r1007390340
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -3749,39 +3749,89 @@ the user ids from a `PCollection` of purchases one
would write (using the `Selec
purchases.apply(Select.fieldNames("userId"));
{{< /highlight >}}
+{{< highlight py >}}
+input_pc = ... # {"user_id": ...,"bank": ..., "purchase_amount": ...}
+output_pc = input_pc | beam.Select("user_id")
+{{< /highlight >}}
+
##### **Nested fields**
+{{< paragraph class="language-py" >}}
+Support for Nested fields hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for Nested fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for nested fields hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for nested fields hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -3749,39 +3749,89 @@ the user ids from a `PCollection` of purchases one
would write (using the `Selec
purchases.apply(Select.fieldNames("userId"));
{{< /highlight >}}
+{{< highlight py >}}
+input_pc = ... # {"user_id": ...,"bank": ..., "purchase_amount": ...}
+output_pc = input_pc | beam.Select("user_id")
+{{< /highlight >}}
+
##### **Nested fields**
+{{< paragraph class="language-py" >}}
+Support for Nested fields hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for Nested fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-java" >}}
Individual nested fields can be specified using the dot operator. For example,
to select just the postal code from the
shipping address one would write
+{{< /paragraph >}}
{{< highlight java >}}
purchases.apply(Select.fieldNames("shippingAddress.postCode"));
{{< /highlight >}}
-
+
+<!-- {{< highlight py >}}
+input_pc = ... # {"user_id": ..., "shipping_address": "post_code": ...,
"bank": ..., "purchase_amount": ...}
+output_pc = input_pc | beam.Select(post_code=lambda item:
str(item["shipping_address.post_code"]))
+{{< /highlight >}} -->
##### **Wildcards**
+{{< paragraph class="language-py" >}}
+Support for wildcards hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for wildcards hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for wildcards hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for wildcards hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -3913,6 +3987,15 @@ selected field will appear as its own array field. For
example
purchases.apply(Select.fieldNames( "transactions.bank",
"transactions.purchaseAmount"));
{{< /highlight >}}
+{{< paragraph class="language-py" >}}
+Support for nested fields hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for nested hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for nested fields hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for nested fields hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -4092,13 +4224,22 @@ that are likely associated with that transaction (both
the user and product matc
"natural join" - one in which the same field names are used on both the
left-hand and right-hand sides of the join -
and is specified with the `using` keyword:
+{{< paragraph class="language-py" >}}
+Support for joins hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for joins hasn't been developed for GO SDK yet
+{{< /paragraph >}}
+
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for joins hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for joins hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -4145,6 +4295,14 @@ can optionally be expanded - providing individual joined
records, as in the `Joi
processed in unexpanded format - providing the join key along with Iterables
of all records from each input that matched
that key.
+{{< paragraph class="language-py" >}}
+Support for joins hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for joins hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for joins hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for joins hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -3867,6 +3933,14 @@ The same is true for wildcard selections. The following
purchases.apply(Select.fieldNames("userId", "shippingAddress.*"));
{{< /highlight >}}
+{{< paragraph class="language-py" >}}
+Support for Wildcards hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for Wildcards hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for wildcards hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for wildcards hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -3749,39 +3749,89 @@ the user ids from a `PCollection` of purchases one
would write (using the `Selec
purchases.apply(Select.fieldNames("userId"));
{{< /highlight >}}
+{{< highlight py >}}
+input_pc = ... # {"user_id": ...,"bank": ..., "purchase_amount": ...}
+output_pc = input_pc | beam.Select("user_id")
+{{< /highlight >}}
+
##### **Nested fields**
+{{< paragraph class="language-py" >}}
+Support for Nested fields hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for Nested fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-java" >}}
Individual nested fields can be specified using the dot operator. For example,
to select just the postal code from the
shipping address one would write
+{{< /paragraph >}}
{{< highlight java >}}
purchases.apply(Select.fieldNames("shippingAddress.postCode"));
{{< /highlight >}}
-
+
+<!-- {{< highlight py >}}
+input_pc = ... # {"user_id": ..., "shipping_address": "post_code": ...,
"bank": ..., "purchase_amount": ...}
+output_pc = input_pc | beam.Select(post_code=lambda item:
str(item["shipping_address.post_code"]))
+{{< /highlight >}} -->
##### **Wildcards**
+{{< paragraph class="language-py" >}}
+Support for wildcards hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for wildcards hasn't been developed for GO SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-java" >}}
The * operator can be specified at any nesting level to represent all fields
at that level. For example, to select all
shipping-address fields one would write
+{{< /paragraph >}}
{{< highlight java >}}
purchases.apply(Select.fieldNames("shippingAddress.*"));
{{< /highlight >}}
+<!--
+{{< highlight py >}}
+#TODO(https://github.com/apache/beam/issues/23275): Add support for projecting
nested fields
+input_pc = ... # {"user_id": ..., "shipping_address": "post_code": ...,
"bank": ..., "purchase_amount": ...}
+output_pc = input_pc | beam.Select("shipping_address.*"))
+{{< /highlight >}} -->
##### **Arrays**
+{{< paragraph class="language-java" >}}
An array field, where the array element type is a row, can also have subfields
of the element type addressed. When
selected, the result is an array of the selected subfield type. For example
+{{< /paragraph >}}
+
+{{< paragraph class="language-py" >}}
+Support for Array fields hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for Array fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for array fields hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for array fields hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -4002,21 +4095,48 @@ Will result in the following schema
</tbody>
</table>
<br/>
+{{< /paragraph >}}
##### **Grouping aggregations**
+{{< paragraph class="language-java" >}}
The `Group` transform allows simply grouping data by any number of fields in
the input schema, applying aggregations to
those groupings, and storing the result of those aggregations in a new schema
field. The output of the `Group` transform
has a schema with one field corresponding to each aggregation performed.
+{{< /paragraph >}}
+
+{{< paragraph class="language-py" >}}
+The `GroupBy` transform allows simply grouping data by any number of fields in
the input schema, applying aggregations to
+those groupings, and storing the result of those aggregations in a new schema
field. The output of the `GroupBy` transform
+has a schema with one field corresponding to each aggregation performed.
+{{< /paragraph >}}
+{{< paragraph class="language-java" >}}
The simplest usage of `Group` specifies no aggregations, in which case all
inputs matching the provided set of fields
are grouped together into an `ITERABLE` field. For example
+{{< /paragraph >}}
+
+{{< paragraph class="language-py" >}}
+The simplest usage of `GroupBy` specifies no aggregations, in which case all
inputs matching the provided set of fields
+are grouped together into an `ITERABLE` field. For example
+{{< /paragraph >}}
{{< highlight java >}}
-purchases.apply(Group.byFieldNames("userId", "shippingAddress.streetAddress"));
+purchases.apply(Group.byFieldNames("userId", "bank"));
{{< /highlight >}}
+{{< highlight py >}}
+input_pc = ... # {"user_id": ...,"bank": ..., "purchase_amount": ...}
+output_pc = input_pc | beam.GroupBy('user_id','bank')
+{{< /highlight >}}
+
+{{< paragraph class="language-go" >}}
+Support for nested fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-go" >}}
Support for schema-aware grouping hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
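For context on the semantics this hunk documents, here is a minimal plain-Python sketch (not Beam code) of grouping with no aggregations: each distinct (`user_id`, `bank`) key maps to the collection of rows that share it, which is roughly what the guide describes as an `ITERABLE` field. The helper `group_by_fields` and the sample rows are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def group_by_fields(rows, *field_names):
    """Group dict rows by the named fields (hypothetical helper, not a Beam API)."""
    groups = defaultdict(list)
    for row in rows:
        # The grouping key is the tuple of the selected field values.
        key = tuple(row[name] for name in field_names)
        groups[key].append(row)
    return dict(groups)


# Made-up sample data shaped like the comment in the PR's Python snippet.
purchases = [
    {"user_id": "u1", "bank": "A", "purchase_amount": 10},
    {"user_id": "u1", "bank": "A", "purchase_amount": 5},
    {"user_id": "u2", "bank": "B", "purchase_amount": 7},
]

grouped = group_by_fields(purchases, "user_id", "bank")
for key, members in grouped.items():
    print(key, [m["purchase_amount"] for m in members])
```

This only mirrors the grouping behaviour described in the prose; it says nothing about how the Beam transform is implemented.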
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -3839,6 +3897,14 @@ could select only the userId and streetAddress fields as
follows
purchases.apply(Select.fieldNames("userId", "shippingAddress.streetAddress"));
{{< /highlight >}}
+{{< paragraph class="language-py" >}}
+Support for Nested fields hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for Nested fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for nested fields hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for nested fields hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -4061,6 +4181,18 @@ purchases.apply(Group.byFieldNames("userId")
.aggregateField("costCents", Top.<Long>largestLongsFn(10),
"topPurchases"));
{{< /highlight >}}
+{{< highlight py >}}
+input_pc = ... # {"user_id": ...,"item_Id": ..., "cost_cents": ...}
+output_pc = input_pc | beam.GroupBy("user_id")
+ .aggregate_field("item_id",CountCombineFn,"num_purchases")
+ .aggregate_field("cost_cents",sum,"total_spendcents")
+ .aggregate_field("cost_cents",TopCombineFn,"top_purchases")
Review Comment:
```suggestion
input_pc = ... # {"user_id": ..., "item_id": ..., "cost_cents": ...}
output_pc = input_pc | (beam.GroupBy("user_id")
    .aggregate_field("item_id", CountCombineFn(), "num_purchases")
    .aggregate_field("cost_cents", sum, "total_spendcents")
    .aggregate_field("cost_cents", TopCombineFn(10), "top_purchases"))
```
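To make the intended result of the chained `aggregate_field` calls concrete, here is a minimal plain-Python sketch (not Beam code) of the same three aggregations: a count, a sum, and the ten largest values per user. The helper `group_and_aggregate` and the sample rows are hypothetical; `len`, `sum`, and `heapq.nlargest` stand in for the combiners named in the snippet.

```python
import heapq
from collections import defaultdict


def group_and_aggregate(rows, key_field, aggregations):
    """Group rows by key_field and apply (field, fn, dest) aggregations.

    Hypothetical helper for illustration only; not a Beam API.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key_field]].append(row)

    result = {}
    for key, members in grouped.items():
        # One output field per aggregation, named by its destination.
        result[key] = {
            dest: fn([m[field] for m in members])
            for field, fn, dest in aggregations
        }
    return result


# Made-up sample data shaped like the comment in the PR's Python snippet.
purchases = [
    {"user_id": "u1", "item_id": "a", "cost_cents": 300},
    {"user_id": "u1", "item_id": "b", "cost_cents": 150},
    {"user_id": "u2", "item_id": "a", "cost_cents": 700},
]

summary = group_and_aggregate(
    purchases,
    "user_id",
    [
        ("item_id", len, "num_purchases"),
        ("cost_cents", sum, "total_spendcents"),
        ("cost_cents", lambda values: heapq.nlargest(10, values), "top_purchases"),
    ],
)
print(summary)
```

The output has one entry per user with one field per aggregation, matching the guide's statement that the result schema has one field for each aggregation performed.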
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -3950,6 +4034,15 @@ Another use of the Select transform is to flatten a
nested schema into a single
purchases.apply(Select.flattenedSchema());
{{< /highlight >}}
+{{< paragraph class="language-py" >}}
+Support for nested fields hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for nested fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for nested fields hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for nested fields hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -4061,6 +4181,18 @@ purchases.apply(Group.byFieldNames("userId")
.aggregateField("costCents", Top.<Long>largestLongsFn(10),
"topPurchases"));
{{< /highlight >}}
+{{< highlight py >}}
+input_pc = ... # {"user_id": ...,"item_Id": ..., "cost_cents": ...}
+output_pc = input_pc | beam.GroupBy("user_id")
+ .aggregate_field("item_id",CountCombineFn,"num_purchases")
+ .aggregate_field("cost_cents",sum,"total_spendcents")
+ .aggregate_field("cost_cents",TopCombineFn,"top_purchases")
+{{< /highlight >}}
+
+{{< paragraph class="language-go" >}}
+Support for nested fields hasn't been developed for GO SDK yet
+{{< /paragraph >}}
Review Comment:
```suggestion
{{< paragraph class="language-go" >}}
Support for schema-aware grouping hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -4119,12 +4260,21 @@ The resulting schema is the following:
</tbody>
</table>
<br/>
+{{< /paragraph >}}
Each resulting row contains one Transaction and one Review that matched the
join condition.
If the fields to match in the two schemas have different names, then the on
function can be used. For example, if the
Review schema named those fields differently than the Transaction schema, then
we could write the following:
+{{< paragraph class="language-py" >}}
+Support for joins hasn't been developed for python SDK yet
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Support for joins hasn't been developed for GO SDK yet
+{{< /paragraph >}}
+
Review Comment:
```suggestion
{{< paragraph class="language-py" >}}
Support for joins hasn't been developed for the Python SDK yet.
{{< /paragraph >}}
{{< paragraph class="language-go" >}}
Support for joins hasn't been developed for the Go SDK yet.
{{< /paragraph >}}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.