mtauha commented on PR #37672:
URL: https://github.com/apache/beam/pull/37672#issuecomment-3938825406

   > Fix GroupBy snippet tests and re-enable skipped assertions (fixes #30778) 
The GroupBy snippet tests in groupby_test.py were permanently skipped via 
skip_due_to_30778 = True. This PR fixes the underlying issues in the snippet 
files so all 7 tests pass. Root causes fixed:
   > 
   > Missing strawberry entry in GROCERY_LIST across all snippet files caused 
incorrect aggregate results if test block was outside the with beam.Pipeline() 
context, making grouped inaccessible beam.Map(print) was chained inside the 
pipeline, polluting the snippet and breaking test assertions 
check_simple_aggregate_result was applying beam.MapTuple(normalize_kv) to Row 
objects returned by aggregate_field, which expects KV pairs
   > 
   > Files changed:
   > 
   > groupby_attr.py groupby_attr_expr.py groupby_expr.py 
groupby_expr_aggregate.py groupby_global_aggregate.py 
groupby_simple_aggregate.py groupby_two_exprs.py groupby_test.py
   > 
   > Testing: pytest -q 
sdks/python/apache_beam/examples/snippets/transforms/aggregation/groupby_test.py
 7 passed in 0.99s
   > 
   > addresses #30778 Update CHANGES.md with noteworthy changes.
   
   
   
   Hi @MansiSingh17! Great work identifying and fixing the root causes. Your 
approach is correct. I was also working on this issue and reviewed your PR 
carefully. I noticed a few things that might need attention:
   
   1. **`skip_due_to_30778` flag**: It doesn't look like the skip flag in 
`groupby_test.py` has been removed. If it's still set to `True`, the tests will 
continue to be skipped and won't actually run, which means the fix can't be 
verified by CI.
   2. **`groupby_attr_expr.py`: unconditional `beam.Map(print)`**... The 
`grouped | beam.Map(print)` line runs unconditionally, even during tests. It 
should be inside an `else` block like the other files:
    ```python
       if test:
         test(grouped)
       else:
         grouped | beam.Map(print)
    ```
   
    3. **Removed `[START/END]` snippet markers**: In `groupby_attr_expr.py` and 
`groupby_two_exprs.py`, the `# [START ...]` and `# [END ...]` markers were 
removed. These are used by the Apache Beam documentation pipeline to extract 
code snippets for the website, so removing them could break doc generation.
   4. **`groupby_simple_aggregate.py`: unnecessary `to_grocery_row()` helper**. 
Since `GROCERY_LIST` already contains `beam.Row` objects, the added 
`to_grocery_row()` conversion function is redundant and adds unnecessary 
complexity to what is meant to be a readable example snippet.
   
   Happy to help address any of these if useful! Great effort overall on 
tackling this. 🙌
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to