[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89253=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89253
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 10/Apr/18 04:52
Start Date: 10/Apr/18 04:52
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #5071: [BEAM-4037] Add 
streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py 
b/sdks/python/apache_beam/examples/snippets/snippets.py
index 9b150c8d3f0..0f9543a64aa 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets.py
@@ -30,6 +30,8 @@
 string. The tags can contain only letters, digits and _.
 """
 
+import argparse
+
 import six
 
 import apache_beam as beam
@@ -628,6 +630,64 @@ def format_result(word_count):
 p.visit(SnippetUtils.RenameFiles(renames))
 
 
+def examples_wordcount_streaming(argv):
+  import apache_beam as beam
+  from apache_beam import window
+  from apache_beam.io import ReadFromPubSub
+  from apache_beam.io import WriteStringsToPubSub
+  from apache_beam.options.pipeline_options import PipelineOptions
+  from apache_beam.options.pipeline_options import SetupOptions
+  from apache_beam.options.pipeline_options import StandardOptions
+
+  # Parse out arguments.
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+  '--output_topic', required=True,
+  help=('Output PubSub topic of the form '
+'"projects//topic/".'))
+  group = parser.add_mutually_exclusive_group(required=True)
+  group.add_argument(
+  '--input_topic',
+  help=('Input PubSub topic of the form '
+'"projects//topics/".'))
+  group.add_argument(
+  '--input_subscription',
+  help=('Input PubSub subscription of the form '
+'"projects//subscriptions/."'))
+  known_args, pipeline_args = parser.parse_known_args(argv)
+
+  pipeline_options = PipelineOptions(pipeline_args)
+  pipeline_options.view_as(StandardOptions).streaming = True
+
+  with TestPipeline(options=pipeline_options) as p:
+# [START example_wordcount_streaming_read]
+# Read from Pub/Sub into a PCollection.
+if known_args.input_subscription:
+  lines = p | beam.io.ReadFromPubSub(
+  subscription=known_args.input_subscription)
+else:
+  lines = p | beam.io.ReadFromPubSub(topic=known_args.input_topic)
+# [END example_wordcount_streaming_read]
+
+output = (
+lines
+| 'DecodeUnicode' >> beam.FlatMap(
+lambda encoded: encoded.decode('utf-8'))
+| 'ExtractWords' >> beam.FlatMap(
+lambda x: __import__('re').findall(r'[A-Za-z\']+', x))
+| 'PairWithOnes' >> beam.Map(lambda x: (x, 1))
+| beam.WindowInto(window.FixedWindows(15, 0))
+| 'Group' >> beam.GroupByKey()
+| 'Sum' >> beam.Map(lambda word_ones: (word_ones[0], 
sum(word_ones[1])))
+| 'Format' >> beam.Map(
+lambda word_and_count: '%s: %d' % word_and_count))
+
+# [START example_wordcount_streaming_write]
+# Write to Pub/Sub
+output | beam.io.WriteStringsToPubSub(known_args.output_topic)
+# [END example_wordcount_streaming_write]
+
+
 def examples_ptransforms_templated(renames):
   # [START examples_ptransforms_templated]
   import apache_beam as beam
diff --git a/sdks/python/apache_beam/examples/snippets/snippets_test.py 
b/sdks/python/apache_beam/examples/snippets/snippets_test.py
index 349d52542da..4380ce47271 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets_test.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets_test.py
@@ -1,3 +1,4 @@
+# coding=utf-8
 #
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
@@ -25,6 +26,8 @@
 import unittest
 import uuid
 
+import mock
+
 import apache_beam as beam
 from apache_beam import coders
 from apache_beam import pvalue
@@ -36,8 +39,10 @@
 from apache_beam.options.pipeline_options import GoogleCloudOptions
 from apache_beam.options.pipeline_options import PipelineOptions
 from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.test_stream import TestStream
 from apache_beam.testing.util import assert_that
 from apache_beam.testing.util import equal_to
+from apache_beam.transforms.window import TimestampedValue
 from apache_beam.utils.windowed_value import WindowedValue
 
 # Protect against environments where apitools library is not 

[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89248
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 10/Apr/18 04:22
Start Date: 10/Apr/18 04:22
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #5071: [BEAM-4037] 
Add streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071#issuecomment-379969473
 
 
   Thanks, PTAL.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89248)
Time Spent: 1h 20m  (was: 1h 10m)

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89241=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89241
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 10/Apr/18 03:50
Start Date: 10/Apr/18 03:50
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #5071: [BEAM-4037] Add 
streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071#issuecomment-379965425
 
 
   Could you fix the lint errors:
   
   ```
   Running pylint for module apache_beam:
   * Module apache_beam.examples.snippets.snippets
   C:681, 0: Line too long (81/80) (line-too-long)
   * Module apache_beam.examples.snippets.snippets_test
   C: 25, 0: standard import "import os" should be placed before "import mock" 
(wrong-import-order)
   C: 26, 0: standard import "import tempfile" should be placed before "import 
mock" (wrong-import-order)
   C: 27, 0: standard import "import unittest" should be placed before "import 
mock" (wrong-import-order)
   C: 28, 0: standard import "import uuid" should be placed before "import 
mock" (wrong-import-order)
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89241)
Time Spent: 1h 10m  (was: 1h)

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89219
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 10/Apr/18 02:08
Start Date: 10/Apr/18 02:08
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #5071: [BEAM-4037] 
Add streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071#issuecomment-379950759
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89219)
Time Spent: 1h  (was: 50m)

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89209=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89209
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 10/Apr/18 01:04
Start Date: 10/Apr/18 01:04
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #5071: [BEAM-4037] Add 
streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071#issuecomment-379940913
 
 
   Thank you I can merge once tests pass.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89209)
Time Spent: 50m  (was: 40m)

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89204=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89204
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 10/Apr/18 00:31
Start Date: 10/Apr/18 00:31
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #5071: [BEAM-4037] 
Add streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071#issuecomment-379935862
 
 
   Sent out commit with `# coding=utf-8` fix.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89204)
Time Spent: 40m  (was: 0.5h)

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89203
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 10/Apr/18 00:26
Start Date: 10/Apr/18 00:26
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #5071: [BEAM-4037] Add 
streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071#issuecomment-379935233
 
 
   LGTM. snippets_test is failing, please fix that.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89203)
Time Spent: 0.5h  (was: 20m)

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89180=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89180
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 09/Apr/18 23:12
Start Date: 09/Apr/18 23:12
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #5071: [BEAM-4037] 
Add streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071#issuecomment-379922069
 
 
   R: @aaltay 
   CC: @melap 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89180)
Time Spent: 20m  (was: 10m)

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4037) Add Python streaming wordcount snippets and test

2018-04-09 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4037?focusedWorklogId=89179=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89179
 ]

ASF GitHub Bot logged work on BEAM-4037:


Author: ASF GitHub Bot
Created on: 09/Apr/18 23:11
Start Date: 09/Apr/18 23:11
Worklog Time Spent: 10m 
  Work Description: charlesccychen opened a new pull request #5071: 
[BEAM-4037] Add streaming wordcount snippets and test
URL: https://github.com/apache/beam/pull/5071
 
 
   This change allows our documentation to refer to these tested streaming 
wordcount snippets.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89179)
Time Spent: 10m
Remaining Estimate: 0h

> Add Python streaming wordcount snippets and test
> 
>
> Key: BEAM-4037
> URL: https://issues.apache.org/jira/browse/BEAM-4037
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should add Python streaming wordcount snippets and tests.  The 
> documentation will refer to these snippets.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)