[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=334951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-334951
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 28/Oct/19 13:54
Start Date: 28/Oct/19 13:54
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9268: [BEAM-7738] Add external 
transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-546956867
 
 
   Congrats @chadrik! :) Also thanks for your help @chamikaramj and 
@TheNeuralBit.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 334951)
Time Spent: 9h 50m  (was: 9h 40m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=334639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-334639
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 27/Oct/19 16:18
Start Date: 27/Oct/19 16:18
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-546710045
 
 
   Whoohoo!
   
   On Fri, Oct 25, 2019 at 11:00 AM Chamikara Jayalath <
   notificati...@github.com> wrote:
   
   > Merged #9268  into master.
   
 



Issue Time Tracking
---

Worklog Id: (was: 334639)
Time Spent: 9h 40m  (was: 9.5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=334262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-334262
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 25/Oct/19 17:59
Start Date: 25/Oct/19 17:59
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #9268: 
[BEAM-7738] Add external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 334262)
Time Spent: 9.5h  (was: 9h 20m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=334257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-334257
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 25/Oct/19 17:58
Start Date: 25/Oct/19 17:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-546451970
 
 
   LGTM. Thanks.
   
   Looks like Max already approved so I'll merge.
 



Issue Time Tracking
---

Worklog Id: (was: 334257)
Time Spent: 9h 20m  (was: 9h 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333643
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 18:24
Start Date: 24/Oct/19 18:24
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-546043127
 
 
   @TheNeuralBit let's move the conversation about row coders over to 
https://jira.apache.org/jira/browse/BEAM-7870
   
   This PR is ready to merge!
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 333643)
Time Spent: 9h 10m  (was: 9h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333641=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333641
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 18:23
Start Date: 24/Oct/19 18:23
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9268: [BEAM-7738] 
Add external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r338724117
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,168 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import typing
+
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam.io.gcp import pubsub
+from apache_beam.transforms import Map
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+
+ReadFromPubsubSchema = typing.NamedTuple(
 
 Review comment:
   I don't love that idea.  The two sets of xforms do not have the same 
arguments, in part because of the need for the expansion service endpoint, but 
also because the Java versions are not guaranteed to be the same as the python 
versions (though I would like that to be the case).  Even if the expansion 
service argument could somehow be passed through the options, the two sets of 
xform classes may have different base classes:  ideally the ones in 
`io.external` inherit from `ExternalTransform` (they currently do not, but they 
should be able to once we resolve the issue we're working on with 
@TheNeuralBit )
   
 



Issue Time Tracking
---

Worklog Id: (was: 333641)
Time Spent: 9h  (was: 8h 50m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333630
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 18:14
Start Date: 24/Oct/19 18:14
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-546039008
 
 
   > Would we actually need the PubsubMessage protobuf conversion methods for 
the external transform version of pubsubio/kafkaio
   
   We would not need to use PubsubMessage _protobuf_ type any more (and thus we 
wouldn't need the conversion methods), but we may still want the beam 
PubsubMessage class.  The latter is yielded by  `io.pubsub.ReadFromPubSub`, so 
if we want `io.external.pubsub.ReadFromPubSub` to be a drop-in replacement 
(which I am in favor of) then that means keeping 
`io.external.pubsub.PubsubMessage` and adding the ability to convert between 
this and a row-type.
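
As a rough illustration of that conversion (a sketch only, not code from this PR): `PubsubMessageRow` and `MessageStandIn` below are hypothetical names, and the three row fields are assumed from the payload/attributes/message-id shape discussed elsewhere in this thread. Nothing here imports Beam.

```python
import typing

# Row-like type; the field names are assumptions made for illustration.
PubsubMessageRow = typing.NamedTuple(
    'PubsubMessageRow',
    [
        ('payload', bytes),
        ('attributes', typing.Dict[str, str]),
        ('message_id', typing.Optional[str]),
    ])


class MessageStandIn(object):
  """Hypothetical stand-in for the Beam PubsubMessage class."""

  def __init__(self, data, attributes, message_id=None):
    self.data = data
    self.attributes = attributes
    self.message_id = message_id

  def to_row(self):
    # Copy the attributes so the row does not share state with the message.
    return PubsubMessageRow(self.data, dict(self.attributes), self.message_id)

  @classmethod
  def from_row(cls, row):
    return cls(row.payload, dict(row.attributes), row.message_id)
```

With something like this, a drop-in `ReadFromPubSub` could decode rows coming from the other SDK and still yield message objects to user code.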
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 333630)
Time Spent: 8h 50m  (was: 8h 40m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333627=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333627
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 18:07
Start Date: 24/Oct/19 18:07
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9268: [BEAM-7738] 
Add external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r338716845
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,168 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import typing
+
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam.io.gcp import pubsub
+from apache_beam.transforms import Map
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+
+ReadFromPubsubSchema = typing.NamedTuple(
+'ReadFromPubsubSchema',
+[
+('topic', typing.Optional[unicode]),
+('subscription', typing.Optional[unicode]),
+('id_label', typing.Optional[unicode]),
+('with_attributes', bool),
+('timestamp_attribute', typing.Optional[unicode]),
+]
+)
+
+
+class ReadFromPubSub(beam.PTransform):
 
 Review comment:
   I think it's a feature that they are named the same.  I can take a pipeline 
designed to run on dataflow, and simply change the imports to get it to run on 
flink. 
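
To make the "same name" point concrete, here is a small sketch that does not import Beam. The two classes and the two `SimpleNamespace` objects are hypothetical stand-ins for the transforms and modules under discussion; only the shared name `ReadFromPubSub` comes from this PR.

```python
from types import SimpleNamespace


class _NativeRead(object):
  """Hypothetical stand-in for the Python-native read transform."""

  def __init__(self, topic):
    self.topic = topic

  def describe(self):
    return 'native read of %s' % self.topic


class _ExternalRead(object):
  """Hypothetical stand-in for the externally expanded read transform."""

  def __init__(self, topic):
    self.topic = topic

  def describe(self):
    return 'external read of %s' % self.topic


# Two stand-in "modules" that expose the same public name.
native_pubsub = SimpleNamespace(ReadFromPubSub=_NativeRead)
external_pubsub = SimpleNamespace(ReadFromPubSub=_ExternalRead)


def build_source(pubsub_module, topic):
  """Pipeline code refers only to the shared name, not to a specific module."""
  return pubsub_module.ReadFromPubSub(topic=topic)
```

Because `build_source` only touches the shared name, switching implementations is a matter of which module gets imported.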
 



Issue Time Tracking
---

Worklog Id: (was: 333627)
Time Spent: 8h 40m  (was: 8.5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333613=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333613
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 17:46
Start Date: 24/Oct/19 17:46
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #9268: 
[BEAM-7738] Add external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r338707254
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,168 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import typing
+
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam.io.gcp import pubsub
+from apache_beam.transforms import Map
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+
+ReadFromPubsubSchema = typing.NamedTuple(
+'ReadFromPubsubSchema',
+[
+('topic', typing.Optional[unicode]),
+('subscription', typing.Optional[unicode]),
+('id_label', typing.Optional[unicode]),
+('with_attributes', bool),
+('timestamp_attribute', typing.Optional[unicode]),
+]
+)
+
+
+class ReadFromPubSub(beam.PTransform):
 
 Review comment:
   Should we rename to avoid conflicts with ReadFromPubSub/WriteToPubSub in 
io/gcp/pubsub.py ?
 



Issue Time Tracking
---

Worklog Id: (was: 333613)
Time Spent: 8.5h  (was: 8h 20m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333612
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 17:45
Start Date: 24/Oct/19 17:45
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #9268: 
[BEAM-7738] Add external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r338706683
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,168 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import typing
+
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam.io.gcp import pubsub
+from apache_beam.transforms import Map
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+
+ReadFromPubsubSchema = typing.NamedTuple(
 
 Review comment:
   Doesn't have to be done in this PR, but in the future it will be great if we 
just move external PubSub and Kafka transforms to io/gcp/pubsub.py and 
io/kafka.py and configure/branch based on pipeline options.
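
A minimal sketch of what branching on options could look like, assuming a plain dict in place of real pipeline options. The class names and the `expansion_service` option key are hypothetical and are not existing Beam flags.

```python
class NativePubsubRead(object):
  """Hypothetical stand-in for the Python-native implementation."""

  def __init__(self, topic):
    self.topic = topic


class ExternalPubsubRead(object):
  """Hypothetical stand-in for the implementation expanded by another SDK."""

  def __init__(self, topic, expansion_service):
    self.topic = topic
    self.expansion_service = expansion_service


def read_from_pubsub(topic, options):
  """Choose an implementation from a plain dict of options.

  'expansion_service' is an assumed option key used only in this sketch.
  """
  endpoint = options.get('expansion_service')
  if endpoint:
    return ExternalPubsubRead(topic, endpoint)
  return NativePubsubRead(topic)
```

The single entry point hides the choice from pipeline authors, at the cost of the two implementations having to agree on their remaining arguments.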
 



Issue Time Tracking
---

Worklog Id: (was: 333612)
Time Spent: 8h 20m  (was: 8h 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333610
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 17:35
Start Date: 24/Oct/19 17:35
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-546023515
 
 
   > Will there be a similar system in python for converting between a row type 
to a custom type
   
   It hasn't been designed, but that's definitely something I'd like to 
support. Would we actually need the `PubsubMessage` protobuf conversion methods 
for the external transform version of pubsubio/kafkaio, or are you just 
thinking it would be good to use the same type for consistency? I'm just 
wondering if we can get away with a simple `TypedDict` sub-class for now
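
For reference, a `TypedDict` sub-class along those lines might look like the sketch below. It assumes Python 3.8+ for `typing.TypedDict`; the class name and the field names are hypothetical.

```python
import typing


class PubsubMessageDict(typing.TypedDict):
  """Hypothetical dict-shaped message; the field names are assumptions."""
  payload: bytes
  attributes: typing.Dict[str, str]
  message_id: typing.Optional[str]


# At runtime a TypedDict instance is an ordinary dict; the field types are
# only enforced by a static type checker.
message = PubsubMessageDict(
    payload=b'hello', attributes={'origin': 'test'}, message_id=None)
```

This keeps the element type simple, but it carries no methods, so any protobuf conversion would have to live elsewhere.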
 



Issue Time Tracking
---

Worklog Id: (was: 333610)
Time Spent: 8h 10m  (was: 8h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333536
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 15:58
Start Date: 24/Oct/19 15:58
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545985919
 
 
   @TheNeuralBit  On the python side, as with the Java SDK, there is a custom 
`PubsubMessage` class (take a look in `apache_beam.io.gcp.pubsub`).  The main 
thing it provides is methods for converting to/from protobuf.  
   
   In your example above, you registered the `TypedDict` sub-class: will there 
be a similar system in python for converting between a row type and a custom 
type, like `apache_beam.io.gcp.pubsub.PubsubMessage`? 
 



Issue Time Tracking
---

Worklog Id: (was: 333536)
Time Spent: 8h  (was: 7h 50m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=333523=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333523
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Oct/19 15:39
Start Date: 24/Oct/19 15:39
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545977302
 
 
   @mxm I marked kafka and pubsub external transforms as external.  I think 
this is ready to merge.
 



Issue Time Tracking
---

Worklog Id: (was: 333523)
Time Spent: 7h 50m  (was: 7h 40m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332302
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 23:59
Start Date: 22/Oct/19 23:59
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545204985
 
 
   D'oh my bad, we _do_ have control over `PubsubMessage` :man_facepalming: I 
assumed it was part of the pubsub client library. Yeah I vote we use 
`DefaultSchema` with either `JavaBeanSchema` or `JavaFieldSchema`, whichever 
works with fewer changes.
 



Issue Time Tracking
---

Worklog Id: (was: 332302)
Time Spent: 7h 40m  (was: 7.5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332255
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 21:54
Start Date: 22/Oct/19 21:54
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545172864
 
 
   >  If it were a class that we had control over we could use the 
DefaultSchema annotation, as long as one of the included SchemaProvider 
implementations would work (I think JavaBeanSchema is the closest but wouldn't 
work because PubsubMessage doesn't have setters).
   
   It may reduce our overall technical debt if we just implement the full set 
of setters and getters on `PubsubMessage` and use `DefaultSchema`.  It doesn't 
seem like it would be a bad thing.  Who would have an opinion on that?
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 332255)
Time Spent: 7.5h  (was: 7h 20m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332181=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332181
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 20:19
Start Date: 22/Oct/19 20:19
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545135957
 
 
   > Right now I'm mostly thinking about the latter
   
   Agreed, that's what I'm thinking about too.
   
   > Maybe I'm thinking about this wrong, but I think the PubsubMessage is 
structured:
   
   Ah ok, fair. I was referring specifically to the structure (or lack thereof) 
of the byte array payload, but you're right: the (Python SDK) user can handle 
creating a byte array themselves, and the row coder can just encode `{byte[] 
payload, Map<String, String> attributes, String messageId}`.
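
To make the three-field idea concrete, here is a minimal Python sketch of a message flattened to a row-like tuple and rebuilt from it. The `PubsubMessageRow` name and the `to_row`/`from_row` helpers are hypothetical and only illustrate the shape of the data; this is not Beam's `RowCoder`.

```python
import typing


class PubsubMessageRow(typing.NamedTuple):
  """Hypothetical stand-in for the three fields named above."""
  payload: bytes
  attributes: typing.Mapping[str, str]
  message_id: typing.Optional[str]


def to_row(msg):
  # A "row" here is just the field values in declaration order.
  return tuple(msg)


def from_row(row):
  # Rebuild the message from positional field values.
  return PubsubMessageRow(*row)


msg = PubsubMessageRow(b'hello', {'origin': 'test'}, None)
restored = from_row(to_row(msg))
```

The payload stays an opaque byte string throughout, which is the point made above: only the envelope needs a schema.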
   
   > What are the requirements for registering a Row converter?
   
   ### Java
   There are a variety of ways to do it. If it were a class that we had control 
over we could use the 
[`DefaultSchema`](https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/sdk/schemas/DefaultSchema.html)
 annotation, as long as one of the included SchemaProvider implementations 
would work (I think JavaBeanSchema is the closest but wouldn't work because 
PubsubMessage doesn't have setters). I think what we'd want to do here is just 
implement a 
[`SchemaProvider`](https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/sdk/schemas/SchemaProvider.html)
 and a 
[`SchemaProviderRegistrar`](https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/sdk/schemas/SchemaProviderRegistrar.html)
 for `PubsubMessage` and include it in Beam.
   
   @reuvenlax may have a better suggestion.
   
   ### Python
   With my PR I think it could look like:
   ```python
   # this is py3 syntax for clarity, but we'd probably
   # need to use the TypedDict('PubsubMessage', ...) version
   class PubsubMessage(TypedDict):
     message: ByteString
     attributes: Mapping[unicode, unicode]
     messageId: unicode

   coders.registry.register_coder(PubsubMessage, coders.RowCoder)

   (pcoll
    | 'make some messages' >> beam.Map(makeMessage).with_output_types(PubsubMessage)
    | 'write to pubsub' >> beam.io.WriteToPubsub(project, topic))  # or something
   ```
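
For reference, the functional `TypedDict('PubsubMessage', ...)` form mentioned in the comment might look like the sketch below. It assumes Python 3.8+ (where `typing.TypedDict` exists) and substitutes `bytes` and `str` for `ByteString` and `unicode`; the field names are taken from the snippet above, and nothing here touches Beam's coder registry.

```python
from typing import Mapping, TypedDict

# Functional form: usable where class-body annotations are unavailable.
PubsubMessage = TypedDict(
    'PubsubMessage',
    {'message': bytes, 'attributes': Mapping[str, str], 'messageId': str})

# At runtime a TypedDict instance is a plain dict; the types are only hints.
msg = PubsubMessage(message=b'hello', attributes={'k': 'v'}, messageId='1')
```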
 



Issue Time Tracking
---

Worklog Id: (was: 332181)
Time Spent: 7h 20m  (was: 7h 10m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332151=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332151
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 18:54
Start Date: 22/Oct/19 18:54
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545103766
 
 
   > Are you thinking you'd use beam:coder:row:v1 as the interface for the 
external transform, and the Java ExternalTransform implementations would handle 
the conversion of Row to/from PubsubMessage? 
   
   There are two places where I see `beam:coder:row:v1` being useful:
   
   1. as a way to declare the construction interface of an external transform, 
and encode its values.  A schema coder would replace the `configuration` 
mapping in `pipeline.ExternalConfiguration.ExternalConfigurationPayload` proto.
   2. as a coder for structured elements that are exchanged between sdks
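
As a rough sketch of the first use, the snippet below derives a typed configuration mapping from a `NamedTuple` of constructor parameters. `ReadConfig` and `to_configuration` are hypothetical names, and the `(type, value)` pairs are only a stand-in for whatever a schema coder would actually put on the wire.

```python
import typing


class ReadConfig(typing.NamedTuple):
  """Hypothetical construction parameters for an external read."""
  topic: typing.Optional[str]
  subscription: typing.Optional[str]
  with_attributes: bool


def to_configuration(params):
  # Map each field name to (declared type, value), dropping unset optionals.
  hints = typing.get_type_hints(type(params))
  return {
      name: (hints[name], value)
      for name, value in params._asdict().items()
      if value is not None
  }


config = to_configuration(
    ReadConfig(topic='projects/p/topics/t', subscription=None,
               with_attributes=True))
```

Declaring the parameters once as a typed tuple means the field names and types come from a single place.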
   
   Right now I'm mostly thinking about the latter, which is when 
`PubsubMessage` comes into play.
   
   > There's no trivial way to register a converter between Row and 
PubsubMessage since the latter isn't structured, 
   
   Maybe I'm thinking about this wrong, but I think the `PubsubMessage` _is_ 
structured:
   
   ```java
   import java.util.Map;
   import javax.annotation.Nullable;

   // Excerpt: constructors and other members omitted.
   public class PubsubMessage {

     private byte[] message;
     private Map<String, String> attributes;
     private String messageId;

     /** Returns the main PubSub message. */
     public byte[] getPayload() {
       return message;
     }

     /** Returns the full map of attributes. This is an unmodifiable map. */
     public Map<String, String> getAttributeMap() {
       return attributes;
     }

     /** Returns the messageId of the message populated by Cloud Pub/Sub. */
     @Nullable
     public String getMessageId() {
       return messageId;
     }
   }
   ```
   
   I'm not a Java expert by any means, but this seems like a type that would 
work with AutoValue; we would just need to rename `message` to `payload` and 
`attributes` to `attributeMap`.
   
   What are the requirements for registering a Row converter?
   
   > but of course on the Java side we could have code to serialize the Row to 
a variety of formats to put in the PubsubMessage payload: Avro, JSON, or the 
row serialization format itself (although I'm not sure we'd want to encourage 
using that outside of Beam), would be pretty simple to add. 
   
   I think the payload is not a concern when it comes to portability of 
external transforms:  it gets encoded/decoded by another transform, not 
PubsubRead/Write.  We can just assume that's a byte array.
   
   My grasp on the Java side is a bit tenuous, so I'd like for @mxm to confirm 
or deny what I've written here.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 332151)
Time Spent: 7h 10m  (was: 7h)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332137
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 18:26
Start Date: 22/Oct/19 18:26
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545092832
 
 
   > How close are we to having schema coders ready to use? IIUC, we should be 
able to register a converter between Row -> PubsubMessage, right? What's the 
mechanism for that? It would be great to put that to use here as soon as the 
schema coders are ready.
   
   @robertwb gave #9188 a LGTM, I just need to get CI passing and I think we 
can merge it. I'll work on that now.
   
   Are you thinking you'd use beam:coder:row:v1 as the interface for the 
external transform, and the Java ExternalTransform implementations would handle 
the conversion of Row to/from PubsubMessage? There's no trivial way to register 
a converter between Row and PubsubMessage since the latter isn't structured, 
but on the Java side we could have code to serialize the Row to a variety of 
formats to put in the PubsubMessage payload: Avro, JSON, or the row 
serialization format itself (although I'm not sure we'd want to encourage using 
that outside of Beam). That would be pretty simple to add. Maybe the format to 
use could be part of the external transform payload.
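
A minimal sketch of what "format as part of the payload" could mean, written in Python for illustration only: the `serialize_row`/`deserialize_row` helpers and the `fmt` parameter are hypothetical, only JSON is implemented, and the Java-side code discussed above would look different.

```python
import json


def serialize_row(row, fmt='json'):
  """Encode a dict-like row into bytes for use as a message payload."""
  if fmt == 'json':
    # sort_keys keeps the encoded bytes deterministic.
    return json.dumps(row, sort_keys=True).encode('utf-8')
  raise ValueError('unsupported format: %s' % fmt)


def deserialize_row(payload, fmt='json'):
  """Decode payload bytes back into a dict using the same format name."""
  if fmt == 'json':
    return json.loads(payload.decode('utf-8'))
  raise ValueError('unsupported format: %s' % fmt)


encoded = serialize_row({'b': 1, 'a': 'x'})
decoded = deserialize_row(encoded)
```

Other formats such as Avro would slot in as additional branches keyed on the same format name.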
 



Issue Time Tracking
---

Worklog Id: (was: 332137)
Time Spent: 7h  (was: 6h 50m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332094=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332094
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 17:08
Start Date: 22/Oct/19 17:08
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-545061420
 
 
   > Perhaps we should even add the additional commits as part of Beam, at 
least until we have worked around the coders problems?
   
   I think it would be great for this to work out of the box, but I'm not sure 
what the ramifications of adding those commits would be.   They don't just 
modify the standard coders, they also change the standard beam java docker 
containers by adding additional dependencies.  Is it safe to add those even if 
people aren't using pubsub/kafka?   Is the only downside larger container sizes?
   
   @TheNeuralBit How close are we to having schema coders ready to use?  IIUC, 
we should be able to register a converter between Row -> PubsubMessage, right?  
 What's the mechanism for that?  It would be great to put that to use here as 
soon as the schema coders are ready. 
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 332094)
Time Spent: 6h 50m  (was: 6h 40m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332062=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332062
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 16:17
Start Date: 22/Oct/19 16:17
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r337614947
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,160 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import typing
+
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam.io.gcp import pubsub
+from apache_beam.transforms import Map
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+
+ReadFromPubsubSchema = typing.NamedTuple(
+    'ReadFromPubsubSchema',
+    [
+        ('topic', typing.Optional[unicode]),
+        ('subscription', typing.Optional[unicode]),
+        ('id_label', typing.Optional[unicode]),
+        ('with_attributes', bool),
+        ('timestamp_attribute', typing.Optional[unicode]),
+    ]
+)
+
+
+class ReadFromPubSub(beam.PTransform):
+  """An external ``PTransform`` for reading from Cloud Pub/Sub."""
+
+  URN = 'beam:external:java:pubsub:read:v1'
+
+  def __init__(self, topic=None, subscription=None, id_label=None,
+               with_attributes=False, timestamp_attribute=None,
+               expansion_service=None):
+    """Initializes ``ReadFromPubSub``.
+
+    Args:
+      topic: Cloud Pub/Sub topic in the form
+        "projects/<project>/topics/<topic>". If provided, subscription must be
+        None.
+      subscription: Existing Cloud Pub/Sub subscription to use in the
+        form "projects/<project>/subscriptions/<subscription>". If not
+        specified, a temporary subscription will be created from the specified
+        topic. If provided, topic must be None.
+      id_label: The attribute on incoming Pub/Sub messages to use as a unique
+        record identifier. When specified, the value of this attribute (which
+        can be any string that uniquely identifies the record) will be used for
+        deduplication of messages. If not provided, we cannot guarantee
+        that no duplicate data will be delivered on the Pub/Sub stream. In this
+        case, deduplication of the stream will be strictly best effort.
+      with_attributes:
+        True - output elements will be
+        :class:`~apache_beam.io.gcp.pubsub.PubsubMessage` objects.
+        False - output elements will be of type ``bytes`` (message
+        data only).
+      timestamp_attribute: Message value to use as element timestamp. If None,
+        uses message publishing time as the timestamp.
+
+        Timestamp values should be in one of two formats:
+
+        - A numerical value representing the number of milliseconds since the
+          Unix epoch.
+        - A string in RFC 3339 format, UTC timezone. Example:
+          ``2015-10-29T23:41:41.123Z``. The sub-second component of the
+          timestamp is optional, and digits beyond the first three (i.e., time
+          units smaller than milliseconds) may be ignored.
+    """
+    self.params = ReadFromPubsubSchema(
+        topic=topic,
+        subscription=subscription,
+        id_label=id_label,
+        with_attributes=with_attributes,
+        timestamp_attribute=timestamp_attribute)
+    self.expansion_service = expansion_service
+
+  def expand(self, pbegin):
+    pcoll = pbegin.apply(
+        ExternalTransform(
+            self.URN, NamedTupleBasedPayloadBuilder(self.params),
+            self.expansion_service))
+
+    if self.params.with_attributes:
+      pcoll = pcoll | 'FromProto' >> Map(pubsub.PubsubMessage._from_proto_str)
+      pcoll.element_type = pubsub.PubsubMessage
+    else:
+      pcoll.element_type = bytes
+    return pcoll
+
+
+WriteToPubsubSchema = typing.NamedTuple(
+    'WriteToPubsubSchema',
+    [
+        ('topic', 

[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332064=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332064
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 16:18
Start Date: 22/Oct/19 16:18
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r337616849
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,160 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import typing
+
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam.io.gcp import pubsub
+from apache_beam.transforms import Map
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+
+ReadFromPubsubSchema = typing.NamedTuple(
+    'ReadFromPubsubSchema',
+    [
+        ('topic', typing.Optional[unicode]),
+        ('subscription', typing.Optional[unicode]),
+        ('id_label', typing.Optional[unicode]),
+        ('with_attributes', bool),
+        ('timestamp_attribute', typing.Optional[unicode]),
+    ]
+)
+
+
+class ReadFromPubSub(beam.PTransform):
+  """An external ``PTransform`` for reading from Cloud Pub/Sub."""
 
 Review comment:
   Can be done in a follow-up.
 



Issue Time Tracking
---

Worklog Id: (was: 332064)
Time Spent: 6h 40m  (was: 6.5h)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=332063=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-332063
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 22/Oct/19 16:17
Start Date: 22/Oct/19 16:17
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r337614878
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,160 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import typing
+
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam.io.gcp import pubsub
+from apache_beam.transforms import Map
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+
+ReadFromPubsubSchema = typing.NamedTuple(
+    'ReadFromPubsubSchema',
+    [
+        ('topic', typing.Optional[unicode]),
+        ('subscription', typing.Optional[unicode]),
+        ('id_label', typing.Optional[unicode]),
+        ('with_attributes', bool),
+        ('timestamp_attribute', typing.Optional[unicode]),
+    ]
+)
+
+
+class ReadFromPubSub(beam.PTransform):
+  """An external ``PTransform`` for reading from Cloud Pub/Sub."""
 
 Review comment:
   Should we mention here or in the module comments that this is experimental 
and requires the non-standard coders to be in place?
 



Issue Time Tracking
---

Worklog Id: (was: 332063)
Time Spent: 6.5h  (was: 6h 20m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=331611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331611
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 21/Oct/19 19:41
Start Date: 21/Oct/19 19:41
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-544674970
 
 
   @mxm This is tested on-prem and ready to go! (With the caveat that the 
special commit to add dependencies is still required for end users, as is the 
case with Kafka: chadrik@d12b990.)
 



Issue Time Tracking
---

Worklog Id: (was: 331611)
Time Spent: 6h 20m  (was: 6h 10m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=331472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331472
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 21/Oct/19 16:20
Start Date: 21/Oct/19 16:20
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-544592670
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 331472)
Time Spent: 6h 10m  (was: 6h)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=331151=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331151
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 21/Oct/19 02:44
Start Date: 21/Oct/19 02:44
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-544327461
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 331151)
Time Spent: 6h  (was: 5h 50m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=331129=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331129
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 21/Oct/19 00:36
Start Date: 21/Oct/19 00:36
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-544309820
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 331129)
Time Spent: 5h 50m  (was: 5h 40m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.





[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=331116=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331116
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 20/Oct/19 22:48
Start Date: 20/Oct/19 22:48
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-544300982
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 331116)
Time Spent: 5h 40m  (was: 5.5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=329217=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329217
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 16/Oct/19 15:40
Start Date: 16/Oct/19 15:40
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-542763520
 
 
   We are doing some final testing in production today and then I’ll hopefully
   give you the thumbs up.
   
   -chad
   
   
   
   On Wed, Oct 16, 2019 at 3:13 AM Maximilian Michels wrote:
   
   > Any update here?
 



Issue Time Tracking
---

Worklog Id: (was: 329217)
Time Spent: 5.5h  (was: 5h 20m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=329092=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329092
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 16/Oct/19 10:13
Start Date: 16/Oct/19 10:13
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9268: [BEAM-7738] Add external 
transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-542631173
 
 
   Any update here?
 



Issue Time Tracking
---

Worklog Id: (was: 329092)
Time Spent: 5h 20m  (was: 5h 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=325393=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325393
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 09/Oct/19 00:09
Start Date: 09/Oct/19 00:09
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-539755630
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 325393)
Time Spent: 5h  (was: 4h 50m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=325394=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325394
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 09/Oct/19 00:09
Start Date: 09/Oct/19 00:09
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-539755717
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 325394)
Time Spent: 5h 10m  (was: 5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-10-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=322355=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-322355
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 02/Oct/19 23:20
Start Date: 02/Oct/19 23:20
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-537721270
 
 
   This is now waiting on the BooleanCoder in Python
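
For readers unfamiliar with what such a coder involves, here is a minimal sketch, assuming a single-byte wire format (`0x01` for true, `0x00` for false). The class name `SingleByteBooleanCoder` is hypothetical and this is not the Beam implementation; the comment above does not say which boolean fields the transform needs, so no specific option is implied.

```python
class SingleByteBooleanCoder:
    # Hypothetical illustration: encodes a bool as exactly one byte.
    # Assumed wire format: True -> 0x01, False -> 0x00.

    def encode(self, value: bool) -> bytes:
        if not isinstance(value, bool):
            raise TypeError("expected a bool, got %r" % type(value))
        return b"\x01" if value else b"\x00"

    def decode(self, encoded: bytes) -> bool:
        if encoded == b"\x01":
            return True
        if encoded == b"\x00":
            return False
        # Reject anything else so that mismatched formats fail loudly.
        raise ValueError("expected a single 0x00 or 0x01 byte, got %r" % encoded)


coder = SingleByteBooleanCoder()
round_tripped = coder.decode(coder.encode(True))
```

The point of a fixed format like this is that both SDKs can agree on the bytes without sharing any code.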
 



Issue Time Tracking
---

Worklog Id: (was: 322355)
Time Spent: 4h 50m  (was: 4h 40m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=319133=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319133
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 26/Sep/19 18:28
Start Date: 26/Sep/19 18:28
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-535629521
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 319133)
Time Spent: 4.5h  (was: 4h 20m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=319132=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319132
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 26/Sep/19 18:28
Start Date: 26/Sep/19 18:28
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-535629486
 
 
   Python failure seems to be a docs issue (`:sdks:python:test-suites:tox:py2:docs`).
   
   Could be a flake. Retrying.
 



Issue Time Tracking
---

Worklog Id: (was: 319132)
Time Spent: 4h 20m  (was: 4h 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=319129=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319129
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 26/Sep/19 18:23
Start Date: 26/Sep/19 18:23
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-535627722
 
 
   Seems like the Java PreCommit failure is for the new test?
   
   Stacktrace is:
   org.apache.beam.sdk.io.gcp.pubsub.PubsubIOExternalTest > testConstructPubsubRead FAILED
   java.lang.RuntimeException at PubsubIOExternalTest.java:97
   Caused by: java.lang.RuntimeException at PubsubIOExternalTest.java:97
   Caused by: java.lang.reflect.InvocationTargetException at PubsubIOExternalTest.java:97
   Caused by: java.lang.IllegalStateException at PubsubIOExternalTest.java:97
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 319129)
Time Spent: 4h 10m  (was: 4h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=318411=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-318411
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 25/Sep/19 15:53
Start Date: 25/Sep/19 15:53
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9268: [BEAM-7738] Add external 
transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-535089028
 
 
   Will take a look ASAP, on a low time budget at the moment :)
 



Issue Time Tracking
---

Worklog Id: (was: 318411)
Time Spent: 4h  (was: 3h 50m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=317650=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-317650
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Sep/19 16:48
Start Date: 24/Sep/19 16:48
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-534646103
 
 
   > Does this have to be rebased now that #9098 has been merged?
   
   It will work as is, but it should be converted to the new API to provide more real-world examples. I'll do that today.
   
   If you could help me diagnose the Java test failure it would be much appreciated!
   
 



Issue Time Tracking
---

Worklog Id: (was: 317650)
Time Spent: 3h 50m  (was: 3h 40m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=317648=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-317648
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Sep/19 16:45
Start Date: 24/Sep/19 16:45
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9268: [BEAM-7738] Add external 
transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-534645003
 
 
   @chadrik Does this have to be rebased now that #9098 has been merged?
 



Issue Time Tracking
---

Worklog Id: (was: 317648)
Time Spent: 3h 40m  (was: 3.5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=317646=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-317646
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Sep/19 16:44
Start Date: 24/Sep/19 16:44
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9268: [BEAM-7738] Add external 
transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-534644725
 
 
   Retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 317646)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=317645=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-317645
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 24/Sep/19 16:44
Start Date: 24/Sep/19 16:44
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9268: [BEAM-7738] Add external 
transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-534644688
 
 
   Build results have expired: https://builds.apache.org/job/beam_PreCommit_Java_Commit/7551/
 



Issue Time Tracking
---

Worklog Id: (was: 317645)
Time Spent: 3h 20m  (was: 3h 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=309452=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-309452
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 10/Sep/19 01:37
Start Date: 10/Sep/19 01:37
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-529730980
 
 
   Hi.  Still hoping for a little help bringing this to a close. 
   
 



Issue Time Tracking
---

Worklog Id: (was: 309452)
Time Spent: 3h 10m  (was: 3h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=307440=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307440
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 05/Sep/19 21:07
Start Date: 05/Sep/19 21:07
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-528582296
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 307440)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=307331=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307331
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 05/Sep/19 16:59
Start Date: 05/Sep/19 16:59
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-528461878
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 307331)
Time Spent: 2h 50m  (was: 2h 40m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=307330=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307330
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 05/Sep/19 16:59
Start Date: 05/Sep/19 16:59
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-528461794
 
 
   I think this PR is ready to merge, but I could use a little help understanding how to resolve the test failures.
   
   Java PreCommit has no test failures, so I'm not sure why it's marked as failed here. Is it because of the warnings? Jenkins seems to imply that anything over 12 warnings is an error, but I don't see any warnings in files that I changed: https://builds.apache.org/job/beam_PreCommit_Java_Commit/7551/
   
   Likewise, Python PreCommit lists no failures [in Jenkins](https://builds.apache.org/job/beam_PreCommit_Python_Commit/8356/testReport/), but is marked as failed here.
   
   And Portable_Python PreCommit just seems to have timed out.
   
 



Issue Time Tracking
---

Worklog Id: (was: 307330)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=304550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304550
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:38
Start Date: 30/Aug/19 19:38
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-526723440
 
 
   PubsubIO is officially working on Flink!
   
   Some notes:
   - In order for this to work from Python, this PR needs to be merged: https://github.com/apache/beam/pull/9426
   - This PR is _not_ based on my pending "user-friendly external transform" PR: https://github.com/apache/beam/pull/9098. There are still some issues that I need to work out with @mxm on that one, so I think this should go first and I'll update that one afterward.
   
   There's one giant caveat: this does not actually work unless you have this special commit: https://github.com/chadrik/beam/commit/d12b99084809ec34fcf0be616e94301d3aca4870. It turns out the same is currently true for KafkaIO external transform support, which came as quite a surprise. The Jira issue for this problem is here: https://jira.apache.org/jira/browse/BEAM-7870. I'd love to see some movement on this. Happy to help where I can!
   
 



Issue Time Tracking
---

Worklog Id: (was: 304550)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296816=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296816
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 17/Aug/19 18:11
Start Date: 17/Aug/19 18:11
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522258539
 
 
   Ok, I now know how the `PubsubMessageWithAttributesCoder` is getting swapped 
with a `BytesCoder`, but it turns out it's intentional and I don't know why.
   
   In `FlinkStreamingPortablePipelineTranslator.translateRead`, when the coder 
is instantiated, it is passed through 
`LengthPrefixUnknownCoders.addLengthPrefixedCoder`, which silently replaces all 
non-model coders with `LengthPrefixCoder(ByteArrayCoder)`.  This _seems_ like 
the sort of thing that should print a warning, since I assume that a broken 
pipeline is a likely outcome.
   
   I'm unclear why this coder swap is necessary.  The Java part of this 
pipeline is a `Read -> ParDo -> ParDo`.  Shouldn't this segment be able to 
use Java-only coders (i.e. non-model coders)?
   
   What's the proper solution here?  The last Java ParDo is the one that 
ensures we have a byte array for sending to Python, but evidently this needs 
to happen in the Read itself?
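
The coder swap described in this comment can be modeled roughly as follows. This is a toy sketch in plain Python, not Beam's actual code: `Message`, `MessageCoder`, the `is_model_coder` flag and `add_length_prefixed_coder` are hypothetical names made up for illustration, and the 4-byte length prefix is a simplification. It only shows why a source that still emits raw message objects fails once its declared coder has been replaced by a length-prefixed bytes coder.

```python
import struct


class BytesCoder(object):
    """Toy bytes coder: it accepts bytes and nothing else."""
    is_model_coder = True

    def encode(self, value):
        if not isinstance(value, bytes):
            raise TypeError(
                '%s cannot be encoded as bytes' % type(value).__name__)
        return value


class LengthPrefixCoder(object):
    """Toy wrapper that prepends a fixed 4-byte length (simplified)."""
    is_model_coder = True

    def __init__(self, inner):
        self.inner = inner

    def encode(self, value):
        payload = self.inner.encode(value)
        return struct.pack('>I', len(payload)) + payload


class Message(object):
    """Hypothetical stand-in for an SDK-specific element type."""

    def __init__(self, data):
        self.data = data


class MessageCoder(object):
    """Hypothetical SDK-specific coder that the runner does not know."""
    is_model_coder = False

    def encode(self, message):
        return message.data


def add_length_prefixed_coder(coder):
    """Silently replace any non-model coder with length-prefixed bytes."""
    if coder.is_model_coder:
        return coder
    return LengthPrefixCoder(BytesCoder())


wire = add_length_prefixed_coder(MessageCoder())

# Elements that are already bytes pass through the replacement coder.
prefixed = wire.encode(b'hi')

# A source that still emits raw Message objects does not.
try:
    wire.encode(Message(b'hi'))
    error = None
except TypeError as exc:
    error = str(exc)
```

In this model the failure only goes away if the element is already a byte string by the time it reaches the replaced coder, which is the same question the comment raises about where the conversion has to happen.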
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 296816)
Time Spent: 2h 20m  (was: 2h 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296813=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296813
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 17/Aug/19 17:58
Start Date: 17/Aug/19 17:58
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522258539
 
 
   Ok, I now know how the `PubsubMessageWithAttributesCoder` is getting swapped 
with a `BytesCoder`, but it turns out it's intentional and I don't know why.
   
   In `FlinkStreamingPortablePipelineTranslator.translateRead`, when the coder 
is instantiated, it is passed through 
`LengthPrefixUnknownCoders.addLengthPrefixedCoder`, which silently replaces all 
non-model coders with `LengthPrefixCoder(BytesCoder)`.  This _seems_ like the 
sort of thing that should print a warning, since I assume that a broken 
pipeline is a likely outcome.
   
   I'm unclear why this coder swap is necessary.  The Java part of this 
pipeline is a `Read -> ParDo -> ParDo`.  Shouldn't this segment be able to 
use Java-only coders (i.e. non-model coders)?
   
   What's the proper solution here?  The last Java ParDo is the one that 
ensures we have a byte array for sending to Python, but evidently this needs 
to happen in the Read itself?
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 296813)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296664=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296664
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 16/Aug/19 23:43
Start Date: 16/Aug/19 23:43
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522181883
 
 
   Here's some info on where I am with this.  I could really use some help to 
push this over the finish line.
   
   The expansion service runs correctly and sends the expanded transforms back 
to Python, but the job fails inside Java on Flink because it's trying to use 
the incorrect serializer.  There's a good chance that I'm overlooking something 
very obvious.
   
   Here's the stack trace:
   
   ```
   2019-08-16 16:04:58,297 INFO  org.apache.flink.runtime.taskmanager.Task - Source: PubSubInflow/ExternalTransform(beam:external:java:pubsub:read:v1)/PubsubUnboundedSource/Read(PubsubSource) (1/1) (93e578d8ae737c7502685cd978413a98) switched from RUNNING to FAILED.
   java.lang.RuntimeException: org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage cannot be cast to [B
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:110)
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89)
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45)
       at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
       at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
       at org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollect(StreamSourceContexts.java:305)
       at org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collect(StreamSourceContexts.java:394)
       at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.emitElement(UnboundedSourceWrapper.java:342)
       at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:268)
       at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
       at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
       at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
       at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:45)
       at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
       at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
       at java.lang.Thread.run(Thread.java:745)
   Caused by: java.lang.ClassCastException: org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage cannot be cast to [B
       at org.apache.beam.sdk.coders.ByteArrayCoder.encode(ByteArrayCoder.java:41)
       at org.apache.beam.sdk.coders.LengthPrefixCoder.encode(LengthPrefixCoder.java:56)
       at org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:105)
       at org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:81)
       at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:578)
       at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:569)
       at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:529)
       at org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.serialize(CoderTypeSerializer.java:87)
       at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:175)
       at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:46)
       at org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:54)
       at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.serializeRecord(SpanningRecordSerializer.java:78)
       at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:160)
       at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:128)
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107)
       ... 15 more
   ```
   
   Here's the serializer that 

[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296665=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296665
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 16/Aug/19 23:43
Start Date: 16/Aug/19 23:43
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522181883
 
 

[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296663=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296663
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 16/Aug/19 23:40
Start Date: 16/Aug/19 23:40
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522181883
 
 

[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296658=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296658
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 16/Aug/19 23:39
Start Date: 16/Aug/19 23:39
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522181883
 
 
   Here's some info on where I am with this.  I could really use some help to 
push this over the finish line.
   
   The expansion service runs correctly and sends the expanded transforms back 
to Python, but the job fails inside Java because it's trying to use the 
incorrect serializer.  There's a good chance that I'm overlooking something 
very obvious.
   
   Here's the stack trace:
   
   ```
   2019-08-16 16:04:58,297 INFO  org.apache.flink.runtime.taskmanager.Task - Source: PubSubInflow/ExternalTransform(beam:external:java:pubsub:read:v1)/PubsubUnboundedSource/Read(PubsubSource) (1/1) (93e578d8ae737c7502685cd978413a98) switched from RUNNING to FAILED.
   java.lang.RuntimeException: org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage cannot be cast to [B
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:110)
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89)
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45)
       at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
       at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
       at org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollect(StreamSourceContexts.java:305)
       at org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collect(StreamSourceContexts.java:394)
       at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.emitElement(UnboundedSourceWrapper.java:342)
       at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:268)
       at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
       at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
       at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
       at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:45)
       at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
       at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
       at java.lang.Thread.run(Thread.java:745)
   Caused by: java.lang.ClassCastException: org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage cannot be cast to [B
       at org.apache.beam.sdk.coders.ByteArrayCoder.encode(ByteArrayCoder.java:41)
       at org.apache.beam.sdk.coders.LengthPrefixCoder.encode(LengthPrefixCoder.java:56)
       at org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:105)
       at org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:81)
       at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:578)
       at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:569)
       at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:529)
       at org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.serialize(CoderTypeSerializer.java:87)
       at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:175)
       at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:46)
       at org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:54)
       at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.serializeRecord(SpanningRecordSerializer.java:78)
       at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:160)
       at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:128)
       at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107)
       ... 15 more
   ```
   
   Here's the serializer that `CoderTypeSerializer.serialize` 

[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=294042=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294042
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 13/Aug/19 17:25
Start Date: 13/Aug/19 17:25
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-520928591
 
 
   Thanks @mxm.  This is actually not working yet, and I'm getting some 
mysterious errors deep within Flink wrt serialization.  I'm working on writing 
tests and gathering as much info as I can so that I can report back here.  So 
far I'm pretty perplexed, but I'm new to Beam, Flink, and Java, so not a big 
surprise!  Once I have more info, it would be great to have your input.
   
 



Issue Time Tracking
---

Worklog Id: (was: 294042)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=294013=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294013
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 13/Aug/19 16:40
Start Date: 13/Aug/19 16:40
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9268: [BEAM-7738] Add external 
transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-520912285
 
 
   @chadrik This is awesome. Thanks a lot for your work! Let's rebase this once 
#9098 is in. I'll have a look at this very soon.
 



Issue Time Tracking
---

Worklog Id: (was: 294013)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=290210=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-290210
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 07/Aug/19 04:29
Start Date: 07/Aug/19 04:29
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9268: [BEAM-7738] 
Add external transform support to PubSubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r311364462
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
 ##
 @@ -674,6 +680,85 @@ public String toString() {
       abstract Builder<T> setClock(@Nullable Clock clock);
 
       abstract Read<T> build();
+
+      @Override
+      public PTransform<PBegin, PCollection<T>> buildExternal(External.Configuration config) {
+        if (config.topic != null) {
+          StaticValueProvider<String> topic = StaticValueProvider.of(utf8String(config.topic));
+          setTopicProvider(NestedValueProvider.of(topic, new TopicTranslator()));
+        }
+        if (config.subscription != null) {
+          StaticValueProvider<String> subscription =
+              StaticValueProvider.of(utf8String(config.subscription));
+          setSubscriptionProvider(
+              NestedValueProvider.of(subscription, new SubscriptionTranslator()));
+        }
+        if (config.idAttribute != null) {
+          String idAttribute = utf8String(config.idAttribute);
+          setIdAttribute(idAttribute);
+        }
+        if (config.timestampAttribute != null) {
+          String timestampAttribute = utf8String(config.timestampAttribute);
+          setTimestampAttribute(timestampAttribute);
+        }
+        setNeedsAttributes(config.needsAttributes);
+        setPubsubClientFactory(FACTORY);
+        if (config.needsAttributes) {
+          SimpleFunction<PubsubMessage, T> parseFn =
+              (SimpleFunction<PubsubMessage, T>) new IdentityMessageFn();
+          setParseFn(parseFn);
+          // FIXME: call setCoder(). need to use PubsubMessage proto to be compatible with python
 
 Review comment:
   I serialized the `PubsubMessage` using protobufs.  Since there's no 
cross-language coder for `PubsubMessage`, and I assumed it would be overreach 
to add one, I used the bytes coder and then handled converting to and from 
protobufs in code that lives close to the transforms. 
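
The approach described in this review comment (keep a plain bytes coder on the wire and convert the structured message to and from bytes next to the transforms) might look roughly like the sketch below. It is illustrative only: JSON with base64 stands in for the protobuf serialization so that the example runs with the standard library alone, and `Message`, `message_to_bytes` and `message_from_bytes` are hypothetical names, not part of the PR.

```python
import base64
import json


class Message(object):
    """Hypothetical stand-in for a Pub/Sub message with attributes."""

    def __init__(self, data, attributes=None):
        self.data = data
        self.attributes = dict(attributes or {})


def message_to_bytes(message):
    """Serialize a message so that a plain bytes coder can carry it."""
    payload = {
        # The payload is arbitrary binary, so base64 makes it JSON-safe.
        'data': base64.b64encode(message.data).decode('ascii'),
        'attributes': message.attributes,
    }
    return json.dumps(payload, sort_keys=True).encode('utf-8')


def message_from_bytes(encoded):
    """Rebuild the message on the other side of the language boundary."""
    payload = json.loads(encoded.decode('utf-8'))
    return Message(base64.b64decode(payload['data']), payload['attributes'])


original = Message(b'\x00\xffpayload', {'key': 'value'})
encoded = message_to_bytes(original)
decoded = message_from_bytes(encoded)
```

The point of the design is that only byte strings cross the boundary, so no cross-language coder for the message type is required; both sides just have to agree on the serialization format.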
 



Issue Time Tracking
---

Worklog Id: (was: 290210)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=290208=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-290208
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 07/Aug/19 04:23
Start Date: 07/Aug/19 04:23
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9268: [BEAM-7738] 
Add external transform support to PubSubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r311363606
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,131 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+from apache_beam import ExternalTransform
+from apache_beam import pvalue
+from apache_beam.coders import BytesCoder
+from apache_beam.coders import FastPrimitivesCoder
+from apache_beam.coders.coders import LengthPrefixCoder
+from apache_beam.portability.api.external_transforms_pb2 import ConfigValue
+from apache_beam.portability.api.external_transforms_pb2 import ExternalConfigurationPayload
+from apache_beam.transforms import ptransform
+
+
+class ReadFromPubSub(ptransform.PTransform):
+  """An external ``PTransform`` for reading from Cloud Pub/Sub."""
+
+  _urn = 'beam:external:java:pubsub:read:v1'
+
+  def __init__(self, topic=None, subscription=None, id_label=None,
+               with_attributes=False, timestamp_attribute=None,
+               expansion_service='localhost:8097'):
+    super(ReadFromPubSub, self).__init__()
+    self.topic = topic
+    self.subscription = subscription
+    self.id_label = id_label
+    self.with_attributes = with_attributes
+    self.timestamp_attribute = timestamp_attribute
+    self.expansion_service = expansion_service
+
+  def expand(self, pbegin):
+    if not isinstance(pbegin, pvalue.PBegin):
+      raise Exception("ReadFromPubSub must be a root transform")
+
+    args = {}
+
+    if self.topic is not None:
+      args['topic'] = _encode_str(self.topic)
+
+    if self.subscription is not None:
+      args['subscription'] = _encode_str(self.subscription)
+
+    if self.id_label is not None:
+      args['id_label'] = _encode_str(self.id_label)
+
+    # FIXME: how do we encode a bool so that Java can decode it?
+    # args['with_attributes'] = _encode_bool(self.with_attributes)
 
 Review comment:
   I encoded it as an int and handled the cast in Java.
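
   The approach described in this comment could look roughly like the sketch below. It is not the code from the PR: `encode_varint`, `encode_bool_as_int`, and `decode_bool_from_int` are hypothetical names, and the base-128 varint format is an assumption about how the integer is put on the wire. The idea is only that the bool is cast to 0 or 1 on the sending side and cast back on the receiving side.

```python
def encode_varint(value):
  """Encode a non-negative integer as a base-128 varint (assumed wire format)."""
  if value < 0:
    raise ValueError("this sketch only handles non-negative integers")
  out = bytearray()
  while True:
    bits = value & 0x7F
    value >>= 7
    if value:
      # More bytes follow, so set the continuation bit.
      out.append(bits | 0x80)
    else:
      out.append(bits)
      return bytes(out)


def encode_bool_as_int(flag):
  """Hypothetical helper: cast the bool to 0 or 1, then encode it as an int."""
  return encode_varint(int(bool(flag)))


def decode_bool_from_int(payload):
  """Stand-in for the receiving side: read the int and cast it back to a bool."""
  value = 0
  shift = 0
  for byte in bytearray(payload):
    value |= (byte & 0x7F) << shift
    if not byte & 0x80:
      break
    shift += 7
  return value != 0
```

   With this scheme `with_attributes=True` travels as a single byte, and the receiver only needs to compare the decoded integer against zero.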
 



Issue Time Tracking
---

Worklog Id: (was: 290208)
Time Spent: 50m  (was: 40m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=289233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289233
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 05/Aug/19 22:24
Start Date: 05/Aug/19 22:24
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubSubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-518424158
 
 
   R: @robertwb
   R: @mxm
   R: @lukecwik
   R: @chamikaramj
 



Issue Time Tracking
---

Worklog Id: (was: 289233)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=289232&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289232
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 05/Aug/19 22:22
Start Date: 05/Aug/19 22:22
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9268: [BEAM-7738] 
Add external transform support to PubSubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r310815552
 
 

 ##
 File path: sdks/python/apache_beam/io/external/gcp/pubsub.py
 ##
 @@ -0,0 +1,131 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+from apache_beam import ExternalTransform
+from apache_beam import pvalue
+from apache_beam.coders import BytesCoder
+from apache_beam.coders import FastPrimitivesCoder
+from apache_beam.coders.coders import LengthPrefixCoder
+from apache_beam.portability.api.external_transforms_pb2 import ConfigValue
+from apache_beam.portability.api.external_transforms_pb2 import ExternalConfigurationPayload
+from apache_beam.transforms import ptransform
+
+
+class ReadFromPubSub(ptransform.PTransform):
+  """An external ``PTransform`` for reading from Cloud Pub/Sub."""
+
+  _urn = 'beam:external:java:pubsub:read:v1'
+
+  def __init__(self, topic=None, subscription=None, id_label=None,
+               with_attributes=False, timestamp_attribute=None,
+               expansion_service='localhost:8097'):
+    super(ReadFromPubSub, self).__init__()
+    self.topic = topic
+    self.subscription = subscription
+    self.id_label = id_label
+    self.with_attributes = with_attributes
+    self.timestamp_attribute = timestamp_attribute
+    self.expansion_service = expansion_service
+
+  def expand(self, pbegin):
+    if not isinstance(pbegin, pvalue.PBegin):
+      raise Exception("ReadFromPubSub must be a root transform")
+
+    args = {}
+
+    if self.topic is not None:
+      args['topic'] = _encode_str(self.topic)
+
+    if self.subscription is not None:
+      args['subscription'] = _encode_str(self.subscription)
+
+    if self.id_label is not None:
+      args['id_label'] = _encode_str(self.id_label)
+
+    # FIXME: how do we encode a bool so that Java can decode it?
+    # args['with_attributes'] = _encode_bool(self.with_attributes)
 
 Review comment:
   What's the right way to encode a bool so that Java can decode it correctly?
   
 



Issue Time Tracking
---

Worklog Id: (was: 289232)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=289231&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289231
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 05/Aug/19 22:21
Start Date: 05/Aug/19 22:21
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9268: [BEAM-7738] 
Add external transform support to PubSubIO
URL: https://github.com/apache/beam/pull/9268#discussion_r310815296
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
 ##
 @@ -674,6 +680,85 @@ public String toString() {
   abstract Builder setClock(@Nullable Clock clock);
 
   abstract Read build();
+
+  @Override
+  public PTransform<PBegin, PCollection<T>> buildExternal(External.Configuration config) {
+    if (config.topic != null) {
+      StaticValueProvider<String> topic = StaticValueProvider.of(utf8String(config.topic));
+      setTopicProvider(NestedValueProvider.of(topic, new TopicTranslator()));
+    }
+    if (config.subscription != null) {
+      StaticValueProvider<String> subscription =
+          StaticValueProvider.of(utf8String(config.subscription));
+      setSubscriptionProvider(
+          NestedValueProvider.of(subscription, new SubscriptionTranslator()));
+    }
+    if (config.idAttribute != null) {
+      String idAttribute = utf8String(config.idAttribute);
+      setIdAttribute(idAttribute);
+    }
+    if (config.timestampAttribute != null) {
+      String timestampAttribute = utf8String(config.timestampAttribute);
+      setTimestampAttribute(timestampAttribute);
+    }
+    setNeedsAttributes(config.needsAttributes);
+    setPubsubClientFactory(FACTORY);
+    if (config.needsAttributes) {
+      SimpleFunction<PubsubMessage, T> parseFn =
+          (SimpleFunction<PubsubMessage, T>) new IdentityMessageFn();
+      setParseFn(parseFn);
+      // FIXME: call setCoder(). need to use PubsubMessage proto to be compatible with python
 
 Review comment:
   When an external SDK has requested `needsAttributes`, this transform needs
   to produce `PubsubMessage` instances rather than byte arrays. There's a
   wrinkle here: Python deserializes `PubsubMessage` using protobufs, whereas
   Java encodes with `PubsubMessageWithAttributesCoder`.
   
   I believe we want to use protobufs from Java. Can I get confirmation of
   that, please?
   
 



Issue Time Tracking
---

Worklog Id: (was: 289231)
Time Spent: 20m  (was: 10m)



[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=289230&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289230
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 05/Aug/19 22:17
Start Date: 05/Aug/19 22:17
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9268: [BEAM-7738] 
Add external transform support to PubSubIO
URL: https://github.com/apache/beam/pull/9268
 
 
   Add support for using PubSubIO as an external transform, so that it can be
   used from Python on portable runners like Flink.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build