[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=328689&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328689 ]
ASF GitHub Bot logged work on BEAM-7389:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Oct/19 17:18
            Start Date: 15/Oct/19 17:18
    Worklog Time Spent: 10m
      Work Description: davidcavazos commented on pull request #9790: [BEAM-7389] Show code snippet outputs as stdout
URL: https://github.com/apache/beam/pull/9790#discussion_r335076939

 ##########
 File path: sdks/python/apache_beam/examples/snippets/transforms/elementwise/filter_test.py
 ##########
 @@ -31,31 +31,26 @@ def check_perennials(actual):
-  # [START perennials]
-  perennials = [
-      {'icon': '🍓', 'name': 'Strawberry', 'duration': 'perennial'},
-      {'icon': '🍆', 'name': 'Eggplant', 'duration': 'perennial'},
-      {'icon': '🥔', 'name': 'Potato', 'duration': 'perennial'},
-  ]
-  # [END perennials]
-  assert_that(actual, equal_to(perennials))
+  expected = '''[START perennials]
+{'icon': '🍓', 'name': 'Strawberry', 'duration': 'perennial'}
+{'icon': '🍆', 'name': 'Eggplant', 'duration': 'perennial'}
+{'icon': '🥔', 'name': 'Potato', 'duration': 'perennial'}
+[END perennials]'''.splitlines()[1:-1]

 Review comment:
 If they run the code in the Colab notebook, or they copy/paste the code snippet into a file and run it, they will see the outputs of the `print` statements in `stdout`.
 The docs are currently like this:

 ```py
 import apache_beam as beam

 with beam.Pipeline() as pipeline:
   perennials = (
       pipeline
       | 'Gardening plants' >> beam.Create([
           {'icon': '🍓', 'name': 'Strawberry', 'duration': 'perennial'},
           {'icon': '🥕', 'name': 'Carrot', 'duration': 'biennial'},
           {'icon': '🍆', 'name': 'Eggplant', 'duration': 'perennial'},
           {'icon': '🍅', 'name': 'Tomato', 'duration': 'annual'},
           {'icon': '🥔', 'name': 'Potato', 'duration': 'perennial'},
       ])
       | 'Filter perennials' >> beam.Filter(
           lambda plant: plant['duration'] == 'perennial')
       | beam.Map(print)
   )
 ```

 Outputs:

 ```
 perennials = [
     {'icon': '🍓', 'name': 'Strawberry', 'duration': 'perennial'},
     {'icon': '🍆', 'name': 'Eggplant', 'duration': 'perennial'},
     {'icon': '🥔', 'name': 'Potato', 'duration': 'perennial'},
 ]
 ```

 But if they run the code, they'll actually see this:

 ```
 {'icon': '🍓', 'name': 'Strawberry', 'duration': 'perennial'}
 {'icon': '🍆', 'name': 'Eggplant', 'duration': 'perennial'}
 {'icon': '🥔', 'name': 'Potato', 'duration': 'perennial'}
 ```

 That's exactly the same output people will see in the docs with these changes.

 It's not much of a difference content-wise, but the format implies that it results in a list, when it in fact results in a PCollection, where we're only printing the values. Also, the variable `perennials` by the end of the snippet is just a `PCollection` of `None` values since that's the output of `print`, so it's also misleading that way. Furthermore, running the code in Colab gives the second results, while we see the first results in the docs, making them inconsistent.

 I know it's a bit more confusing to read in the code, but for the snippets goal I think it makes sense to sacrifice a little readability in the *tests* to have more readability in the *public docs*.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 328689)
    Time Spent: 68h 40m  (was: 68.5h)

> Colab examples for element-wise transforms (Python)
> ---------------------------------------------------
>
>                 Key: BEAM-7389
>                 URL: https://issues.apache.org/jira/browse/BEAM-7389
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Rose Nguyen
>            Assignee: David Cavazos
>            Priority: Minor
>          Time Spent: 68h 40m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
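
The two points made in the review comment (the expected output written as marker-delimited stdout lines, and `print` producing `None` values) can be illustrated with a minimal sketch in plain Python. This is not the code from the pull request and does not use Beam: the built-in `filter` and `map` stand in for `beam.Filter` and `beam.Map`, and the names `PLANTS`, `EXPECTED`, `run_snippet`, and `capture_stdout` are hypothetical, introduced only for illustration.

```python
import contextlib
import io

# Hypothetical stand-in for the pipeline input: plain dicts, no Beam required.
PLANTS = [
    {'icon': '🍓', 'name': 'Strawberry', 'duration': 'perennial'},
    {'icon': '🥕', 'name': 'Carrot', 'duration': 'biennial'},
    {'icon': '🍆', 'name': 'Eggplant', 'duration': 'perennial'},
    {'icon': '🍅', 'name': 'Tomato', 'duration': 'annual'},
    {'icon': '🥔', 'name': 'Potato', 'duration': 'perennial'},
]

# Expected stdout wrapped in [START]/[END] marker lines, as in the diff.
# splitlines()[1:-1] drops the two marker lines and keeps only the output.
EXPECTED = '''[START perennials]
{'icon': '🍓', 'name': 'Strawberry', 'duration': 'perennial'}
{'icon': '🍆', 'name': 'Eggplant', 'duration': 'perennial'}
{'icon': '🥔', 'name': 'Potato', 'duration': 'perennial'}
[END perennials]'''.splitlines()[1:-1]


def run_snippet():
    """Filter the perennials and print each one.

    Returns what mapping `print` over the elements yields, which mirrors
    the value bound to `perennials` in the docs snippet.
    """
    perennials = filter(lambda plant: plant['duration'] == 'perennial', PLANTS)
    return list(map(print, perennials))


def capture_stdout(fn):
    """Run fn with stdout redirected; return (result, printed lines)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = fn()
    return result, buffer.getvalue().splitlines()


results, printed = capture_stdout(run_snippet)

# What a reader sees on stdout is one dict per line, not a Python list literal.
assert printed == EXPECTED

# `print` returns None, so the mapped result holds only None values.
assert results == [None, None, None]
```

The plain-Python version prints in input order. A real Beam pipeline makes no ordering guarantee for the elements of a `PCollection`, so a test against an actual pipeline would need to compare the lines without regard to order.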