[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=389169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389169
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 18/Feb/20 23:20
Start Date: 18/Feb/20 23:20
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #10712: 
[BEAM-7246] Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 389169)
Time Spent: 22.5h  (was: 22h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 22.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
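The issue above tracks a Spanner write transform for the Beam Python SDK. Cloud Spanner limits how much a single commit can carry, so a write sink typically groups mutations into size-bounded batches before committing. The sketch below is a hypothetical, stand-alone illustration of that batching idea only; the function name, the tuple representation of a mutation, and the cell budget are assumptions for illustration and are not the PR's actual code.

```python
# Hypothetical sketch of batching Spanner mutations by cell count.
# Cloud Spanner caps mutations per commit; the budget and names here
# are illustrative, not the transform's real API.

def batch_mutations(mutations, max_cells_per_batch=20000):
    """Group mutations into batches that stay under a cell budget.

    Each mutation is represented as (table, columns, rows); its
    "cell count" is len(columns) * len(rows).
    """
    batches, current, current_cells = [], [], 0
    for mutation in mutations:
        _, columns, rows = mutation
        cells = len(columns) * len(rows)
        # Flush the current batch if adding this mutation would exceed the budget.
        if current and current_cells + cells > max_cells_per_batch:
            batches.append(current)
            current, current_cells = [], 0
        current.append(mutation)
        current_cells += cells
    if current:
        batches.append(current)
    return batches

# Five 2-cell mutations with a 4-cell budget pack into 3 batches.
muts = [("users", ["id", "name"], [(i, "n")]) for i in range(5)]
print(len(batch_mutations(muts, max_cells_per_batch=4)))  # → 3
```

A real sink would additionally serialize each batch into a single Spanner commit; the point here is only the greedy grouping under a per-commit budget.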


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=389124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389124
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:57
Start Date: 18/Feb/20 21:57
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-587923073
 
 
   LGTM. Thanks.
 



Issue Time Tracking
---

Worklog Id: (was: 389124)
Time Spent: 22h 20m  (was: 22h 10m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=389122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389122
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:53
Start Date: 18/Feb/20 21:53
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-587916297
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 389122)
Time Spent: 22h 10m  (was: 22h)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=389121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389121
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:52
Start Date: 18/Feb/20 21:52
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-587914900
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 389121)
Time Spent: 22h  (was: 21h 50m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=389016&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389016
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 18/Feb/20 19:18
Start Date: 18/Feb/20 19:18
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-587691085
 
 
   Thanks @nielm. @chamikaramj, is there anything you would like to add, or are we ready to merge?
 



Issue Time Tracking
---

Worklog Id: (was: 389016)
Time Spent: 21h 50m  (was: 21h 40m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=387971&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387971
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 15/Feb/20 18:37
Start Date: 15/Feb/20 18:37
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-586629057
 
 
   @chamikaramj @nielm could you please verify the changes?
 



Issue Time Tracking
---

Worklog Id: (was: 387971)
Time Spent: 21h 40m  (was: 21.5h)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=386741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386741
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 13/Feb/20 16:49
Start Date: 13/Feb/20 16:49
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-585856253
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 386741)
Time Spent: 21.5h  (was: 21h 20m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=386738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386738
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 13/Feb/20 16:44
Start Date: 13/Feb/20 16:44
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-585853706
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 386738)
Time Spent: 21h 20m  (was: 21h 10m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=386419&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386419
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 13/Feb/20 07:23
Start Date: 13/Feb/20 07:23
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-585588148
 
 
   @aaltay I've rebased my branch; could you please trigger the tests? Thanks.
 



Issue Time Tracking
---

Worklog Id: (was: 386419)
Time Spent: 21h 10m  (was: 21h)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=386233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386233
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 12/Feb/20 21:04
Start Date: 12/Feb/20 21:04
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-585416739
 
 
   I believe this is fixed with https://github.com/apache/beam/pull/10844, you 
may need to rebase.
 



Issue Time Tracking
---

Worklog Id: (was: 386233)
Time Spent: 21h  (was: 20h 50m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=386155&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386155
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 12/Feb/20 19:18
Start Date: 12/Feb/20 19:18
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-585371923
 
 
   @aaltay All three tests failed due to `ImportError: No module named
'pycodestyle'` in the `avro-python3` package.
   
   Portable_Python PreCommit
   
https://scans.gradle.com/s/3qf5sqnettmmq/console-log?task=:sdks:python:test-suites:portable:py35:installGcpTest#L299
   
   PreCommit
   
https://scans.gradle.com/s/c5ncivj7k2pko/console-log?task=:sdks:python:test-suites:dataflow:py37:installGcpTest#L658
   
   PythonFormatter PreCommit
   
https://scans.gradle.com/s/i6nvgyym5tfqk/console-log?task=:sdks:python:test-suites:tox:py37:formatter#L267
   
   Could you please rerun these tests? A rerun may fix the issue.
 



Issue Time Tracking
---

Worklog Id: (was: 386155)
Time Spent: 20h 50m  (was: 20h 40m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=386105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386105
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 12/Feb/20 18:02
Start Date: 12/Feb/20 18:02
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-585337286
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 386105)
Time Spent: 20h 40m  (was: 20.5h)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385953
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 12/Feb/20 14:47
Start Date: 12/Feb/20 14:47
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-585239918
 
 
   @aaltay @markflyhigh It seems the jobs were triggered and completed
successfully, but they show no activity on GitHub.
   
   https://builds.apache.org/job/beam_PreCommit_Python_Commit/11080/
   https://builds.apache.org/job/beam_PreCommit_PythonFormatter_Commit/67/
   
 



Issue Time Tracking
---

Worklog Id: (was: 385953)
Time Spent: 20.5h  (was: 20h 20m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385333
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 11/Feb/20 18:16
Start Date: 11/Feb/20 18:16
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584776785
 
 
   /cc @markflyhigh - Tests seem not to be triggering?
 



Issue Time Tracking
---

Worklog Id: (was: 385333)
Time Spent: 20h 20m  (was: 20h 10m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385325&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385325
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 11/Feb/20 18:11
Start Date: 11/Feb/20 18:11
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584774835
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 385325)
Time Spent: 20h 10m  (was: 20h)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385064
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 11/Feb/20 10:24
Start Date: 11/Feb/20 10:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584566111
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 385064)
Time Spent: 19.5h  (was: 19h 20m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385065&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385065
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 11/Feb/20 10:24
Start Date: 11/Feb/20 10:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584566278
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 385065)
Time Spent: 19h 40m  (was: 19.5h)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385066
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 11/Feb/20 10:24
Start Date: 11/Feb/20 10:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584566071
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 385066)
Time Spent: 19h 50m  (was: 19h 40m)



[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385067
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 11/Feb/20 10:24
Start Date: 11/Feb/20 10:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584566111
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 385067)
Time Spent: 20h  (was: 19h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 20h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=385063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385063
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 11/Feb/20 10:23
Start Date: 11/Feb/20 10:23
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584566071
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 385063)
Time Spent: 19h 20m  (was: 19h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 19h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=384649&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384649
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 10/Feb/20 18:47
Start Date: 10/Feb/20 18:47
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584290424
 
 
   Trigger tests.
 



Issue Time Tracking
---

Worklog Id: (was: 384649)
Time Spent: 19h 10m  (was: 19h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 19h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=384640&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384640
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 10/Feb/20 18:40
Start Date: 10/Feb/20 18:40
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584287258
 
 
   ping for test
 



Issue Time Tracking
---

Worklog Id: (was: 384640)
Time Spent: 19h  (was: 18h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 19h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=384637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384637
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 10/Feb/20 18:32
Start Date: 10/Feb/20 18:32
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584283428
 
 
   > Could you also add a new feature note in 
https://github.com/apache/beam/blob/master/CHANGES.md
   
   Done. :)
 



Issue Time Tracking
---

Worklog Id: (was: 384637)
Time Spent: 18h 50m  (was: 18h 40m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 18h 50m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=384623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384623
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 10/Feb/20 18:14
Start Date: 10/Feb/20 18:14
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584258200
 
 
   Could you also add a new feature note in 
https://github.com/apache/beam/blob/master/CHANGES.md
 



Issue Time Tracking
---

Worklog Id: (was: 384623)
Time Spent: 18h 40m  (was: 18.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 18h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=384621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384621
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 10/Feb/20 18:14
Start Date: 10/Feb/20 18:14
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584257913
 
 
   Trigger tests.
 



Issue Time Tracking
---

Worklog Id: (was: 384621)
Time Spent: 18.5h  (was: 18h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 18.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=384620&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384620
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 10/Feb/20 18:13
Start Date: 10/Feb/20 18:13
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-584257624
 
 
   ping for test
 



Issue Time Tracking
---

Worklog Id: (was: 384620)
Time Spent: 18h 20m  (was: 18h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 18h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382976
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:36
Start Date: 06/Feb/20 16:36
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375945920
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write to Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner Mutations for writes. Users can provide
+the operation via the constructor or via static methods.
+
+Note: When passing the operation via the constructor, only one operation
+may be provided at a time. For example, passing a table name in the
+`insert` parameter together with a value for the `update` parameter
+will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382975
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:35
Start Date: 06/Feb/20 16:35
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375945920
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write to Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner Mutations for writes. Users can provide
+the operation via the constructor or via static methods.
+
+Note: When passing the operation via the constructor, only one operation
+may be provided at a time. For example, passing a table name in the
+`insert` parameter together with a value for the `update` parameter
+will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382974&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382974
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:35
Start Date: 06/Feb/20 16:35
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375945920
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write to Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner Mutations for writes. Users can provide
+the operation via the constructor or via static methods.
+
+Note: When passing the operation via the constructor, only one operation
+may be provided at a time. For example, passing a table name in the
+`insert` parameter together with a value for the `update` parameter
+will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 
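The "one operation at a time" constraint described in the WriteMutation docstring above can be sketched as a small standalone check. This is a hypothetical helper written for illustration; the name `pick_operation` is invented here and is not part of the spannerio API:

```python
def pick_operation(insert=None, update=None, insert_or_update=None,
                   replace=None, delete=None):
    """Return (operation_name, table) for the single operation provided.

    Mirrors the docstring's rule: exactly one of the five operation
    parameters may carry a table name; anything else raises ValueError.
    (Hypothetical sketch, not the spannerio implementation.)
    """
    provided = {name: table for name, table in (
        ("insert", insert),
        ("update", update),
        ("insert_or_update", insert_or_update),
        ("replace", replace),
        ("delete", delete)) if table is not None}
    if len(provided) != 1:
        raise ValueError(
            "exactly one of insert/update/insert_or_update/replace/delete "
            "must be given, got %d" % len(provided))
    # Single entry by construction; return it as a (name, table) pair.
    return next(iter(provided.items()))
```

For example, `pick_operation(insert='users')` returns `('insert', 'users')`, while passing both `insert` and `update` raises `ValueError`, matching the behavior the docstring warns about.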

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382973
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:35
Start Date: 06/Feb/20 16:35
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375945639
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -109,20 +111,74 @@
 
ReadFromSpanner takes this transform in the constructor and passes it to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+The WriteToSpanner transform relies on WriteMutation objects, which are
+exposed by the SpannerIO API. WriteMutation has five static methods (insert,
+update, insert_or_update, replace, delete). These methods return an instance
+of the _Mutator object, which contains the mutation type and the Spanner
+Mutation object. For more details, review the docs of the SpannerIO.WriteMutation class.
+For example:::
+
+  mutations = [
+WriteMutation.insert(table='user', columns=('name', 'email'),
+values=[('sara', 's...@dev.com')])
+  ]
+  _ = (p
+   | beam.Create(mutations)
+   | WriteToSpanner(
+  project_id=SPANNER_PROJECT_ID,
+  instance_id=SPANNER_INSTANCE_ID,
+  database_id=SPANNER_DATABASE_NAME)
+)
+
+You can also create WriteMutation via calling its constructor. For example:::
+
+  mutations = [
+  WriteMutation(insert='users', columns=('name', 'email'),
+values=[('sara', 's...@example.com')])
+  ]
+
+For more information, review the docs available on WriteMutation class.
+
+The WriteToSpanner transform also takes a 'max_batch_size_bytes' param, which
+is set to 1 MB (1048576 bytes) by default. This parameter is used to reduce the number of
 
 Review comment:
   Thanks. I'll update the code!
 



Issue Time Tracking
---

Worklog Id: (was: 382973)
Time Spent: 17h 40m  (was: 17.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 17h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382962
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:10
Start Date: 06/Feb/20 16:10
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375930092
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -109,20 +111,74 @@
 
ReadFromSpanner takes this transform in the constructor and passes it to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+The WriteToSpanner transform relies on WriteMutation objects, which are
+exposed by the SpannerIO API. WriteMutation has five static methods (insert,
+update, insert_or_update, replace, delete). These methods return an instance
+of the _Mutator object, which contains the mutation type and the Spanner
+Mutation object. For more details, review the docs of the SpannerIO.WriteMutation class.
+For example:::
+
+  mutations = [
+WriteMutation.insert(table='user', columns=('name', 'email'),
+values=[('sara', 's...@dev.com')])
+  ]
+  _ = (p
+   | beam.Create(mutations)
+   | WriteToSpanner(
+  project_id=SPANNER_PROJECT_ID,
+  instance_id=SPANNER_INSTANCE_ID,
+  database_id=SPANNER_DATABASE_NAME)
+)
+
+You can also create WriteMutation via calling its constructor. For example:::
+
+  mutations = [
+  WriteMutation(insert='users', columns=('name', 'email'),
+values=[('sara", 's...@example.com')])
+  ]
+
+For more information, review the docs available on WriteMutation class.
+
+WriteToSpanner transform also takes 'max_batch_size_bytes' param which is set
+to 1MB (1048576 bytes) by default. This parameter used to reduce the number of
 
 Review comment:
   Also, given that in this transform the batches are not pre-sorted, I would 
make the defaults a lot smaller than the Java equivalent: say max 500 cells per 
batch, and max 50 rows. 
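To make the suggested limits concrete, here is a minimal, hypothetical sketch of a batching policy that caps each batch by total bytes, total cells, and total rows. The function `batch_mutations` and its `(byte_size, cell_count)` tuple inputs are invented for illustration; this is not the Beam SpannerIO implementation.

```python
# Hypothetical sketch of the batching limits suggested above: cap each batch
# by total bytes, total cells, and total rows. Illustrative only; not the
# actual Beam SpannerIO implementation.


def batch_mutations(rows, max_bytes=1048576, max_cells=500, max_rows=50):
    """Greedily group (byte_size, cell_count) pairs into bounded batches."""
    batches = []
    current, total_bytes, total_cells = [], 0, 0
    for byte_size, cell_count in rows:
        # Flush the current batch if adding this row would exceed any cap.
        full = current and (total_bytes + byte_size > max_bytes
                            or total_cells + cell_count > max_cells
                            or len(current) >= max_rows)
        if full:
            batches.append(current)
            current, total_bytes, total_cells = [], 0, 0
        current.append((byte_size, cell_count))
        total_bytes += byte_size
        total_cells += cell_count
    if current:
        batches.append(current)
    return batches


# 120 small rows: the 50-row cap binds, giving batches of 50, 50 and 20.
print([len(b) for b in batch_mutations([(100, 2)] * 120)])
```

Whichever cap is hit first ends the batch, so small defaults like 500 cells and 50 rows bound the work per Spanner transaction even for wide rows.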
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 382962)
Time Spent: 17.5h  (was: 17h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 17.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382958=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382958
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:06
Start Date: 06/Feb/20 16:06
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375926873
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -109,20 +111,74 @@
 
 ReadFromSpanner takes this transform in the constructor and passes it to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+The WriteToSpanner transform relies on the WriteMutation objects exposed
+by the SpannerIO API. WriteMutation has five static methods (insert, update,
+insert_or_update, replace, delete). These methods return an instance of the
+_Mutator object, which contains the mutation type and the Spanner Mutation
+object. For more details, review the docs of the class SpannerIO.WriteMutation.
+For example:::
+
+  mutations = [
+WriteMutation.insert(table='user', columns=('name', 'email'),
+values=[('sara', 's...@dev.com')])
+  ]
+  _ = (p
+   | beam.Create(mutations)
+   | WriteToSpanner(
+  project_id=SPANNER_PROJECT_ID,
+  instance_id=SPANNER_INSTANCE_ID,
+  database_id=SPANNER_DATABASE_NAME)
+)
+
+You can also create a WriteMutation by calling its constructor. For example:::
+
+  mutations = [
+  WriteMutation(insert='users', columns=('name', 'email'),
+values=[('sara', 's...@example.com')])
+  ]
+
+For more information, review the docs of the WriteMutation class.
+
+The WriteToSpanner transform also takes a 'max_batch_size_bytes' param, which is
+set to 1MB (1048576 bytes) by default. This parameter is used to reduce the number of
 
 Review comment:
   Yes, you are correct. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 382958)
Time Spent: 17h 20m  (was: 17h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 17h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382957=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382957
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:05
Start Date: 06/Feb/20 16:05
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375926873
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -109,20 +111,74 @@
 
 ReadFromSpanner takes this transform in the constructor and passes it to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+The WriteToSpanner transform relies on the WriteMutation objects exposed
+by the SpannerIO API. WriteMutation has five static methods (insert, update,
+insert_or_update, replace, delete). These methods return an instance of the
+_Mutator object, which contains the mutation type and the Spanner Mutation
+object. For more details, review the docs of the class SpannerIO.WriteMutation.
+For example:::
+
+  mutations = [
+WriteMutation.insert(table='user', columns=('name', 'email'),
+values=[('sara', 's...@dev.com')])
+  ]
+  _ = (p
+   | beam.Create(mutations)
+   | WriteToSpanner(
+  project_id=SPANNER_PROJECT_ID,
+  instance_id=SPANNER_INSTANCE_ID,
+  database_id=SPANNER_DATABASE_NAME)
+)
+
+You can also create a WriteMutation by calling its constructor. For example:::
+
+  mutations = [
+  WriteMutation(insert='users', columns=('name', 'email'),
+values=[('sara', 's...@example.com')])
+  ]
+
+For more information, review the docs of the WriteMutation class.
+
+The WriteToSpanner transform also takes a 'max_batch_size_bytes' param, which is
+set to 1MB (1048576 bytes) by default. This parameter is used to reduce the number of
 
 Review comment:
   No, you are correct. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 382957)
Time Spent: 17h 10m  (was: 17h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 17h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382956=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382956
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 16:04
Start Date: 06/Feb/20 16:04
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375926449
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write onto Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner mutations for writes. Users can
+provide the operation via the constructor or via static methods.
+
+Note: The constructor accepts only one operation at a time. For example,
+passing a table name in the `insert` parameter together with a value for
+the `update` parameter will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382939=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382939
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 15:14
Start Date: 06/Feb/20 15:14
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-582952187
 
 
   > Just to let you know that we've just introduced a Python autoformatter. Your 
merge conflict might be a result of this.
   > Here you can find instructions on how to run the autoformatter: 
https://cwiki.apache.org/confluence/display/BEAM/Python+Tips, section 
Formatting.
   > Sorry for the inconvenience.
   
   No worries @kamilwu, I'll resolve the conflicts in my next commit. Thanks 
for the heads-up :) 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 382939)
Time Spent: 16h 50m  (was: 16h 40m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 16h 50m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382938=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382938
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 15:13
Start Date: 06/Feb/20 15:13
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375892684
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -109,20 +111,74 @@
 
 ReadFromSpanner takes this transform in the constructor and passes it to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+The WriteToSpanner transform relies on the WriteMutation objects exposed
+by the SpannerIO API. WriteMutation has five static methods (insert, update,
+insert_or_update, replace, delete). These methods return an instance of the
+_Mutator object, which contains the mutation type and the Spanner Mutation
+object. For more details, review the docs of the class SpannerIO.WriteMutation.
+For example:::
+
+  mutations = [
+WriteMutation.insert(table='user', columns=('name', 'email'),
+values=[('sara', 's...@dev.com')])
+  ]
+  _ = (p
+   | beam.Create(mutations)
+   | WriteToSpanner(
+  project_id=SPANNER_PROJECT_ID,
+  instance_id=SPANNER_INSTANCE_ID,
+  database_id=SPANNER_DATABASE_NAME)
+)
+
+You can also create a WriteMutation by calling its constructor. For example:::
+
+  mutations = [
+  WriteMutation(insert='users', columns=('name', 'email'),
+values=[('sara', 's...@example.com')])
+  ]
+
+For more information, review the docs of the WriteMutation class.
+
+The WriteToSpanner transform also takes a 'max_batch_size_bytes' param, which is
+set to 1MB (1048576 bytes) by default. This parameter is used to reduce the number of
 
 Review comment:
   For my understanding, `maximum_number_cells` would be (total number of 
columns * total number of rows).
   
   For Example:
   ```
   WriteMutation.insert("roles", ("key", "rolename"), [('abc1', "test-1"), 
('abc2', "test-2"), ('abc3', "test-3")])
   ```
   in this case... the max_number_cells would be `2 * 3 = 6`
   
   
   And the max_rows_number would be 2 in the case below.
   
   ```
   MutationGroup([
               WriteMutation.insert("roles", ("key", "rolename"),
                                    [('abc1', "test1")]),
               WriteMutation.insert("roles", ("key", "rolename"),
                                    [('abc2', "test2")])
           ])
   ```
   Please correct me if I am mistaken!
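The arithmetic above can be checked with a one-line helper; `cell_count` is a name invented here for illustration and is not part of the SpannerIO API.

```python
# Illustration of the cell arithmetic discussed above: a mutation's cell count
# is (number of columns) * (number of value rows). `cell_count` is a name
# invented for this sketch, not a SpannerIO API.


def cell_count(columns, values):
    """Every value tuple is one row; each row contributes len(columns) cells."""
    return len(columns) * len(values)


# Two columns, three value rows -> 6 cells, matching the 2 * 3 = 6 above.
print(cell_count(("key", "rolename"),
                 [('abc1', "test-1"), ('abc2', "test-2"), ('abc3', "test-3")]))
```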
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 382938)
Time Spent: 16h 40m  (was: 16.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382937=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382937
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 15:13
Start Date: 06/Feb/20 15:13
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375892462
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write onto Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner mutations for writes. Users can
+provide the operation via the constructor or via static methods.
+
+Note: The constructor accepts only one operation at a time. For example,
+passing a table name in the `insert` parameter together with a value for
+the `update` parameter will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382868=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382868
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Feb/20 12:34
Start Date: 06/Feb/20 12:34
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-582885733
 
 
   Just to let you know that we've just introduced a Python autoformatter. Your 
merge conflict might be a result of this.
   Here you can find instructions on how to run the autoformatter: 
https://cwiki.apache.org/confluence/display/BEAM/Python+Tips, section 
Formatting.
   Sorry for the inconvenience.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 382868)
Time Spent: 16h 20m  (was: 16h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 16h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=381717=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381717
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 04/Feb/20 17:06
Start Date: 04/Feb/20 17:06
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r374802530
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write onto Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner mutations for writes. Users can
+provide the operation via the constructor or via static methods.
+
+Note: The constructor accepts only one operation at a time. For example,
+passing a table name in the `insert` parameter together with a value for
+the `update` parameter will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=381716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381716
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 04/Feb/20 17:02
Start Date: 04/Feb/20 17:02
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r374800163
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -109,20 +111,74 @@
 
 ReadFromSpanner takes this transform in the constructor and pass this to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection a input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+WriteToSpanner transform relies on the WriteMutation objects which is exposed
+by the SpannerIO API. WriteMutation have five static methods (insert, update,
+insert_or_update, replace, delete). These methods returns the instance of the
+_Mutator object which contains the mutation type and the Spanner Mutation
+object. For more details, review the docs of the class SpannerIO.WriteMutation.
+For example:::
+
+  mutations = [
+WriteMutation.insert(table='user', columns=('name', 'email'),
+values=[('sara'. 's...@dev.com')])
+  ]
+  _ = (p
+   | beam.Create(mutations)
+   | WriteToSpanner(
+  project_id=SPANNER_PROJECT_ID,
+  instance_id=SPANNER_INSTANCE_ID,
+  database_id=SPANNER_DATABASE_NAME)
+)
+
+You can also create WriteMutation via calling its constructor. For example:::
+
+  mutations = [
+  WriteMutation(insert='users', columns=('name', 'email'),
+values=[('sara", 's...@example.com')])
+  ]
+
+For more information, review the docs available on WriteMutation class.
+
+The WriteToSpanner transform also takes a 'max_batch_size_bytes' param, which
+defaults to 1 MB (1048576 bytes). This parameter is used to reduce the number
+of transactions sent to Spanner by grouping mutations into batches. Set it to
+a smaller value, or to zero, to disable batching.
+
 
 Review comment:
   Please add a note to the following
   
   > Unlike the Java connector, this connector  _does not_ create batches of 
transactions sorted by table and primary key. 
   
  This can be a feature added later; I would not let it block this 
PR. 
   See 
https://medium.com/google-cloud/cloud-spanner-maximizing-data-load-throughput-23a0fc064b6d
 for more info. 
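For readers following along, the table-sorted batching described above can be sketched as follows (an illustrative Python snippet, not part of the PR; the `group_by_table` helper and the dict-shaped mutations are hypothetical stand-ins for the connector's internal `_Mutator` objects):

```python
from collections import defaultdict

def group_by_table(mutations):
    # Group mutations by target table so each Spanner batch touches a
    # single table, similar in spirit to the Java connector's sorted
    # batching. A full implementation would also sort each group by
    # primary key before batching.
    groups = defaultdict(list)
    for m in mutations:
        groups[m['table']].append(m)
    return dict(groups)

muts = [
    {'table': 'users', 'row': ('sara',)},
    {'table': 'orders', 'row': (1,)},
    {'table': 'users', 'row': ('ali',)},
]
batches = group_by_table(muts)  # two groups: 'users' (2 rows), 'orders' (1 row)
```

Sorting each group by primary key before committing is what lets Spanner apply a batch with less splitting and lock contention, which is the throughput win the linked article describes.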
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 381716)
Time Spent: 16h  (was: 15h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 16h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=381710=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381710
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 04/Feb/20 16:54
Start Date: 04/Feb/20 16:54
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r374795643
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -109,20 +111,74 @@
 
ReadFromSpanner takes this transform in the constructor and passes it to the
read pipeline as a singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+The WriteToSpanner transform relies on the WriteMutation objects exposed by
+the SpannerIO API. WriteMutation has five static methods (insert, update,
+insert_or_update, replace, delete). Each returns an instance of the _Mutator
+object, which contains the mutation type and the Spanner Mutation object.
+For more details, review the docs of the SpannerIO.WriteMutation class.
+For example:::
+
+  mutations = [
+WriteMutation.insert(table='user', columns=('name', 'email'),
+values=[('sara', 's...@dev.com')])
+  ]
+  _ = (p
+   | beam.Create(mutations)
+   | WriteToSpanner(
+  project_id=SPANNER_PROJECT_ID,
+  instance_id=SPANNER_INSTANCE_ID,
+  database_id=SPANNER_DATABASE_NAME)
+)
+
+You can also create a WriteMutation by calling its constructor. For example:::
+
+  mutations = [
+  WriteMutation(insert='users', columns=('name', 'email'),
+values=[('sara', 's...@example.com')])
+  ]
+
+For more information, review the docs of the WriteMutation class.
+
+The WriteToSpanner transform also takes a 'max_batch_size_bytes' param, which
+defaults to 1 MB (1048576 bytes). This parameter is used to reduce the number of
 
 Review comment:
   There is one other batching parameter which is important -- the maximum 
number of cells being mutated. Spanner has a hard 20K limit here, so a batch 
must have less than 20K mutated cells, including cells being mutated in 
indexes. 
   
   Java version sets this to 5K by default. 
   
   A third parameter max_number_rows was also added recently to java, limiting 
the total number of rows in a batch.
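A cell-count cap like the one described could be enforced with logic along these lines (an illustrative sketch, not the actual Java or Python implementation; `count_cells` uses a simplified model that ignores index columns, and the dict-shaped mutations are hypothetical):

```python
def count_cells(mutation):
    # Simplified model: each mutated column counts as one cell. A real
    # implementation would also count cells mutated in secondary indexes.
    return len(mutation['columns'])

def batch_by_cells(mutations, max_cells=5000):
    # Split mutations into batches whose total mutated-cell count stays
    # at or under max_cells (the Java connector defaults to 5000, well
    # below Spanner's hard 20,000-mutation-per-commit limit).
    batches, current, cells = [], [], 0
    for m in mutations:
        c = count_cells(m)
        if current and cells + c > max_cells:
            batches.append(current)
            current, cells = [], 0
        current.append(m)
        cells += c
    if current:
        batches.append(current)
    return batches

# Five rows of two columns each, capped at 4 cells per batch.
example = [{'columns': ['name', 'email']}] * 5
split = batch_by_cells(example, max_cells=4)  # three batches of 2, 2, and 1 rows
```

The same loop structure extends naturally to the byte-size and row-count caps: track all three running totals and flush the batch when any one of them would be exceeded.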
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 381710)
Time Spent: 15h 50m  (was: 15h 40m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 15h 50m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-02-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=381709=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-381709
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 04/Feb/20 16:46
Start Date: 04/Feb/20 16:46
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-582003286
 
 
   cc: @nithinsujir and @nielm 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 381709)
Time Spent: 15h 40m  (was: 15.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379494=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379494
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 17:19
Start Date: 30/Jan/20 17:19
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-580360833
 
 
   ping for test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 379494)
Time Spent: 15.5h  (was: 15h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 15.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379375
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 13:43
Start Date: 30/Jan/20 13:43
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-580259196
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 379375)
Time Spent: 15h 20m  (was: 15h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 15h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379319=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379319
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 10:57
Start Date: 30/Jan/20 10:57
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r372845556
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write to Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner Mutations for writes. The operation
+can be provided via the constructor or via the static methods.
+
+Note: Only one operation is accepted at a time. For example, passing a
+table name in the `insert` parameter while also passing a value for the
+`update` parameter will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379316
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 10:57
Start Date: 30/Jan/20 10:57
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r372842745
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write to Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner Mutations for writes. The operation
+can be provided via the constructor or via the static methods.
+
+Note: Only one operation is accepted at a time. For example, passing a
+table name in the `insert` parameter while also passing a value for the
+`update` parameter will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379318=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379318
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 10:57
Start Date: 30/Jan/20 10:57
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r372831190
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write to Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner Mutations for writes. The operation
+can be provided via the constructor or via the static methods.
+
+Note: Only one operation is accepted at a time. For example, passing a
+table name in the `insert` parameter while also passing a value for the
+`update` parameter will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379315=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379315
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 10:57
Start Date: 30/Jan/20 10:57
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r372828128
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -581,3 +644,369 @@ def display_data(self):
label='transaction')
 
 return res
+
+
+@experimental(extra_message="No backwards-compatibility guarantees.")
+class WriteToSpanner(PTransform):
+
+  def __init__(self, project_id, instance_id, database_id, pool=None,
+   credentials=None, max_batch_size_bytes=1048576):
+"""
+A PTransform to write to Google Cloud Spanner.
+
+Args:
+  project_id: Cloud Spanner project id. Be sure to use the Project ID,
+not the Project Number.
+  instance_id: Cloud Spanner instance id.
+  database_id: Cloud Spanner database id.
+  max_batch_size_bytes: (optional) Split the mutations into batches to
+reduce the number of transactions sent to Spanner. By default it is
+set to 1 MB (1048576 bytes).
+"""
+self._configuration = _BeamSpannerConfiguration(
+project=project_id, instance=instance_id, database=database_id,
+credentials=credentials, pool=pool, snapshot_read_timestamp=None,
+snapshot_exact_staleness=None
+)
+self._max_batch_size_bytes = max_batch_size_bytes
+self._database_id = database_id
+self._project_id = project_id
+self._instance_id = instance_id
+self._pool = pool
+
+  def display_data(self):
+res = {
+'project_id': DisplayDataItem(self._project_id, label='Project Id'),
+'instance_id': DisplayDataItem(self._instance_id, label='Instance Id'),
+'pool': DisplayDataItem(str(self._pool), label='Pool'),
+'database': DisplayDataItem(self._database_id, label='Database'),
+'batch_size': DisplayDataItem(self._max_batch_size_bytes,
+  label="Batch Size"),
+}
+return res
+
+  def expand(self, pcoll):
+return (pcoll
+| "make batches" >>
+_WriteGroup(max_batch_size_bytes=self._max_batch_size_bytes)
+| 'Writing to spanner' >> ParDo(
+_WriteToSpannerDoFn(self._configuration)))
+
+
+class _Mutator(namedtuple('_Mutator', ["mutation", "operation", "kwargs"])):
+  __slots__ = ()
+
+  @property
+  def byte_size(self):
+return self.mutation.ByteSize()
+
+
+class MutationGroup(deque):
+  """
+  A Bundle of Spanner Mutations (_Mutator).
+  """
+
+  @property
+  def byte_size(self):
+s = 0
+for m in self.__iter__():
+  s += m.byte_size
+return s
+
+  def primary(self):
+return next(self.__iter__())
+
+
+class WriteMutation(object):
+
+  _OPERATION_DELETE = "delete"
+  _OPERATION_INSERT = "insert"
+  _OPERATION_INSERT_OR_UPDATE = "insert_or_update"
+  _OPERATION_REPLACE = "replace"
+  _OPERATION_UPDATE = "update"
+
+  def __init__(self,
+   insert=None,
+   update=None,
+   insert_or_update=None,
+   replace=None,
+   delete=None,
+   columns=None,
+   values=None,
+   keyset=None):
+"""
+A convenience class to create Spanner Mutations for writes. The operation
+can be provided via the constructor or via the static methods.
+
+Note: Only one operation is accepted at a time. For example, passing a
+table name in the `insert` parameter while also passing a value for the
+`update` parameter will cause an error.
+
+Args:
+  insert: (Optional) Name of the table in which rows will be inserted.
+  update: (Optional) Name of the table in which existing rows will be
+updated.
+  insert_or_update: (Optional) Table name in which rows will be written.
+Like insert, except that if the row already exists, then its column
+values are overwritten with the ones provided. Any column values not
+explicitly written are preserved.
+  replace: (Optional) Table name in which rows will be replaced. Like
+insert, except that if the row already exists, it is deleted, and the
+column values provided are inserted instead. Unlike `insert_or_update`,
+this means any values not explicitly written become `NULL`.
+  delete: (Optional) Table name from which rows will be deleted. Succeeds
+whether or not the named rows were present.
+  columns: The 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379317=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379317
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 10:57
Start Date: 30/Jan/20 10:57
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r372833189
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -131,13 +187,18 @@
 try:
   from google.cloud.spanner import Client
   from google.cloud.spanner import KeySet
+  from google.cloud.spanner_v1 import batch
   from google.cloud.spanner_v1.database import BatchSnapshot
+  from google.cloud.spanner_v1.proto.mutation_pb2 import Mutation
 
 Review comment:
  Since the spanner package doesn't expose the `Mutation` and `batch` 
objects, this is the only way to import them.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 379317)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379286=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379286
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 10:22
Start Date: 30/Jan/20 10:22
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-580185708
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 379286)
Time Spent: 14h 20m  (was: 14h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 14h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379288
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 10:22
Start Date: 30/Jan/20 10:22
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-580185862
 
 
   Run Python Precommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 379288)
Time Spent: 14.5h  (was: 14h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=379218&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379218
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 30/Jan/20 07:37
Start Date: 30/Jan/20 07:37
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-580122678
 
 
   R: @chamikaramj 
   R: @aaltay 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 379218)
Time Spent: 14h 10m  (was: 14h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 14h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378965
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 29/Jan/20 18:28
Start Date: 29/Jan/20 18:28
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10712: [BEAM-7246] Added 
Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#issuecomment-579895160
 
 
   ping for tests
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378965)
Time Spent: 14h  (was: 13h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 14h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378956&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378956
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 29/Jan/20 18:08
Start Date: 29/Jan/20 18:08
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)
 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378561
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 28/Jan/20 23:26
Start Date: 28/Jan/20 23:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #10706: 
[BEAM-7246] Fix Spanner auth endpoints
URL: https://github.com/apache/beam/pull/10706
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378561)
Time Spent: 13h 40m  (was: 13.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378524&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378524
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 28/Jan/20 22:19
Start Date: 28/Jan/20 22:19
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10706: [BEAM-7246] Fix 
Spanner auth endpoints
URL: https://github.com/apache/beam/pull/10706#issuecomment-579487896
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378524)
Time Spent: 13h 20m  (was: 13h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378525&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378525
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 28/Jan/20 22:19
Start Date: 28/Jan/20 22:19
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10706: [BEAM-7246] Fix 
Spanner auth endpoints
URL: https://github.com/apache/beam/pull/10706#issuecomment-579487946
 
 
   Thanks!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378525)
Time Spent: 13.5h  (was: 13h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378523&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378523
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 28/Jan/20 22:18
Start Date: 28/Jan/20 22:18
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #10706: [BEAM-7246] Fix Spanner 
auth endpoints
URL: https://github.com/apache/beam/pull/10706#issuecomment-579487332
 
 
   LGTM
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378523)
Time Spent: 13h 10m  (was: 13h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378494
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 28/Jan/20 21:49
Start Date: 28/Jan/20 21:49
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10706: [BEAM-7246] Fix 
Spanner auth endpoints
URL: https://github.com/apache/beam/pull/10706#issuecomment-579473549
 
 
   cc: @mszb
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378494)
Time Spent: 13h  (was: 12h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=378459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378459
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 28/Jan/20 20:59
Start Date: 28/Jan/20 20:59
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #10706: 
[BEAM-7246] Fix Spanner auth endpoints
URL: https://github.com/apache/beam/pull/10706
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)
 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373868&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373868
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 20:55
Start Date: 17/Jan/20 20:55
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #9606: 
[BEAM-7246] Add Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373868)
Time Spent: 12.5h  (was: 12h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373869&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373869
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 20:55
Start Date: 17/Jan/20 20:55
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575791689
 
 
   Thank you. 
   Let's get integration tests in so that we can move this out of experimental 
:)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373869)
Time Spent: 12h 40m  (was: 12.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373866&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373866
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 20:36
Start Date: 17/Jan/20 20:36
Worklog Time Spent: 10m 
  Work Description: shehzaadn-vd commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575785491
 
 
   Thanks @chamikaramj for your support! @aaltay looks like the tests are 
passing. Would you be able to merge this?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373866)
Time Spent: 12h 20m  (was: 12h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373821&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373821
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 19:38
Start Date: 17/Jan/20 19:38
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575766434
 
 
   LGTM. Thanks.
   
   We can get this in when tests pass.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373821)
Time Spent: 12h 10m  (was: 12h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373814
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 19:29
Start Date: 17/Jan/20 19:29
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575763166
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373814)
Time Spent: 11h 50m  (was: 11h 40m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373815
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 19:29
Start Date: 17/Jan/20 19:29
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575763362
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373815)
Time Spent: 12h  (was: 11h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373602
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 13:41
Start Date: 17/Jan/20 13:41
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575630131
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373602)
Time Spent: 11h 20m  (was: 11h 10m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373603&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373603
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 13:41
Start Date: 17/Jan/20 13:41
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575630301
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373603)
Time Spent: 11.5h  (was: 11h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373604&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373604
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 13:41
Start Date: 17/Jan/20 13:41
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575630131
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373604)
Time Spent: 11h 40m  (was: 11.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373547
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 11:13
Start Date: 17/Jan/20 11:13
Worklog Time Spent: 10m 
  Work Description: mszb commented on issue #9606: [BEAM-7246] Add Google 
Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575584369
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373547)
Time Spent: 11h 10m  (was: 11h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=373384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373384
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 17/Jan/20 03:51
Start Date: 17/Jan/20 03:51
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-575452714
 
 
   Any updates ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 373384)
Time Spent: 11h  (was: 10h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=367172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367172
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 07/Jan/20 02:19
Start Date: 07/Jan/20 02:19
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-571402431
 
 
   Thanks. Mostly looks good.
   
   Added a few more comments.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 367172)
Time Spent: 10h 50m  (was: 10h 40m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=367170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367170
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 07/Jan/20 02:19
Start Date: 07/Jan/20 02:19
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #9606: 
[BEAM-7246] Add Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r363567042
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio_test.py
 ##
 @@ -0,0 +1,271 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import datetime
+import logging
+import random
+import string
+import unittest
+
+import mock
+
+import apache_beam as beam
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+
+# Protect against environments where spanner library is not available.
+# pylint: disable=wrong-import-order, wrong-import-position, ungrouped-imports
+try:
+  from google.cloud import spanner
+  from apache_beam.io.gcp.experimental.spannerio import (create_transaction,
+ ReadOperation,
+ ReadFromSpanner)  # pylint: disable=unused-import
+  # disable=unused-import
+except ImportError:
+  spanner = None
+# pylint: enable=wrong-import-order, wrong-import-position, ungrouped-imports
+
+
+MAX_DB_NAME_LENGTH = 30
+TEST_PROJECT_ID = 'apache-beam-testing'
+TEST_INSTANCE_ID = 'beam-test'
+TEST_DATABASE_PREFIX = 'spanner-testdb-'
+# TEST_TABLE = 'users'
+# TEST_COLUMNS = ['Key', 'Value']
+FAKE_ROWS = [[1, 'Alice'], [2, 'Bob'], [3, 'Carl'], [4, 'Dan'], [5, 'Evan'],
+ [6, 'Floyd']]
+
+
+def _generate_database_name():
+  mask = string.ascii_lowercase + string.digits
+  length = MAX_DB_NAME_LENGTH - 1 - len(TEST_DATABASE_PREFIX)
+  return TEST_DATABASE_PREFIX + ''.join(random.choice(mask) for i in range(
+  length))
+
+
+def _generate_test_data():
+  mask = string.ascii_lowercase + string.digits
+  length = 100
+  return [('users', ['Key', 'Value'], [(x, ''.join(
+  random.choice(mask) for _ in range(length))) for x in range(1, 5)])]
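The two helper functions above are self-contained; reproduced verbatim with the same constants (taken from the quoted test file, not re-verified against current Spanner documentation), they make it easy to check that generated database names stay within the 30-character name limit the test assumes:

```python
import random
import string

# Constants as in the quoted test file.
MAX_DB_NAME_LENGTH = 30
TEST_DATABASE_PREFIX = 'spanner-testdb-'


def _generate_database_name():
    # Random lowercase/digit suffix, leaving one character of headroom
    # under MAX_DB_NAME_LENGTH.
    mask = string.ascii_lowercase + string.digits
    length = MAX_DB_NAME_LENGTH - 1 - len(TEST_DATABASE_PREFIX)
    return TEST_DATABASE_PREFIX + ''.join(
        random.choice(mask) for _ in range(length))


name = _generate_database_name()
```

With the 15-character prefix the random suffix is 14 characters, so every generated name is exactly 29 characters long.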
+
+
+@unittest.skipIf(spanner is None, 'GCP dependencies are not installed.')
+@mock.patch('apache_beam.io.gcp.experimental.spannerio.Client')
+@mock.patch('apache_beam.io.gcp.experimental.spannerio.BatchSnapshot')
+class SpannerReadTest(unittest.TestCase):
+
+  def test_read_with_query_batch(self, mock_batch_snapshot_class,
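The quoted hunk breaks off above, but the class-level `mock.patch` decorators it shows follow the usual pattern: each test method receives mock replacements for `spannerio.Client` and `spannerio.BatchSnapshot`, so no real Spanner service is contacted. A minimal standalone illustration of that pattern, using a hypothetical stand-in class rather than the real spannerio module:

```python
from unittest import mock


class Client:
    """Hypothetical stand-in for the patched spanner Client."""

    def database_exists(self):
        return True  # imagine this call would hit the network


def check_database():
    # Code under test: constructs a Client and calls it.
    return Client().database_exists()


# Patch the method so no real service is contacted, analogous to how the
# quoted test class patches spannerio.Client and spannerio.BatchSnapshot.
with mock.patch.object(Client, 'database_exists', return_value=False):
    patched_result = check_database()

unpatched_result = check_database()
```

The patch lasts only for the patched scope; the decorated test class works the same way, with the patch targets being module attributes looked up by name.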
 
 Review comment:
   How about runReadUsingIndex ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 367170)
Time Spent: 10h 40m  (was: 10.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=367169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367169
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 07/Jan/20 02:18
Start Date: 07/Jan/20 02:18
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #9606: 
[BEAM-7246] Add Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r363566806
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##
 @@ -0,0 +1,565 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+Experimental; no backwards-compatibility guarantees.
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+ReadFromSpanner relies on the ReadOperation objects exposed by the
+SpannerIO API. A ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
+convenient programming.
+
+ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor or 'table' name with 'columns' as list. For example:::
+
+  records = (pipeline
+| ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of ReadOperations
+to the ReadFromSpanner transform constructor. ReadOperation exposes two static
+methods. Use 'query' to perform SQL-based reads and 'table' to read from a
+table by name. For example:::
+
+  read_operations = [
+  ReadOperation.table(table='customers', columns=['name',
+  'email']),
+  ReadOperation.table(table='vendors', columns=['name',
+  'email']),
+]
+  all_users = pipeline | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  ReadOperation.query(sql='Select name, email from
+  customers'),
+  ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner.param_types`
+  all_users = pipeline | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class ReadOperation.
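The factory-method design described above can be sketched with the standard library alone. The field and argument names below are assumptions for illustration only, not the actual apache_beam implementation:

```python
from collections import namedtuple

# Illustrative shape only -- a hypothetical stand-in, not the real class.
_OpFields = namedtuple('_OpFields', ['is_sql', 'is_table', 'kwargs'])


class ReadOperationSketch(_OpFields):
    """Immutable description of one Spanner read (illustrative)."""

    @classmethod
    def query(cls, sql, params=None, params_type=None):
        # SQL-based read; query params must travel with their Spanner types.
        if params is not None and params_type is None:
            raise ValueError('params_type is required when params is given')
        return cls(is_sql=True, is_table=False,
                   kwargs={'sql': sql, 'params': params,
                           'param_types': params_type})

    @classmethod
    def table(cls, table, columns, index=''):
        # Table/columns-based read using the Spanner Read API.
        return cls(is_sql=False, is_table=True,
                   kwargs={'table': table, 'columns': columns,
                           'index': index})


op = ReadOperationSketch.query(
    sql='Select * from users where id <= @user_id',
    params={'user_id': 100},
    params_type={'user_id': 'INT64'})
```

Each instance records whether it is a query- or table-style read plus the keyword arguments that read would need, mirroring the 'query'/'table' split in the docstring.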
+
+Users can also provide the ReadOperation objects in the form of a PCollection
+via the pipeline. For example:::
+
+  users = (pipeline
+   | beam.Create([ReadOperation...])
+   | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction via the transform called
+`create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a
+PTransform which is later passed to the `ReadFromSpanner` constructor.
+`ReadFromSpanner` passes this transaction PTransform as a singleton side input
+to the `_NaiveSpannerReadDoFn`, containing the 'session_id' and
+'transaction_id'.
+For example:::
+
+  transaction = (pipeline | 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=367157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367157
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 07/Jan/20 01:58
Start Date: 07/Jan/20 01:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #9606: 
[BEAM-7246] Add Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r363563133
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on the _ReadOperation objects exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
+convenient programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor or 'table' name with 'columns' as list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two
+static methods. Use 'query' to perform SQL-based reads and 'table' to read
+from a table by name. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class _ReadOperation.
+
+Users can also provide the _ReadOperation objects in the form of a PCollection
+via the pipeline. For example:::
+
+  users = (pipeline
+   | beam.Create([_ReadOperation...])
+   | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction via the transform called
+`_create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. This `_create_transaction`
+PTransform is later passed to the constructor of _ReadFromSpanner. For
+example:::
+
+  transaction = (pipeline | _create_transaction(TEST_PROJECT_ID,
+  TEST_INSTANCE_ID,
+  DB_NAME))
+
+  users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=366843&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-366843
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 06/Jan/20 18:44
Start Date: 06/Jan/20 18:44
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-571259347
 
 
   Still reviewing the latest round of updates. Thanks.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 366843)
Time Spent: 10h 10m  (was: 10h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=365803&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365803
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 03/Jan/20 10:44
Start Date: 03/Jan/20 10:44
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361167119
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,559 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+ReadFromSpanner relies on the ReadOperation objects exposed by the
+SpannerIO API. A ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
 
 Review comment:
   Yes, you are right. In the naive reads transform we do not use the Spanner partitioning query.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 365803)
Time Spent: 10h  (was: 9h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=365802&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365802
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 03/Jan/20 10:43
Start Date: 03/Jan/20 10:43
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361167119
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,559 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+ReadFromSpanner relies on ReadOperation objects, which are exposed by the
+SpannerIO API. A ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
 
 Review comment:
  Yes, you are right. In the naive reads transform, we do not use the Spanner
partitioning query.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 365802)
Time Spent: 9h 50m  (was: 9h 40m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2020-01-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=365549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365549
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 03/Jan/20 00:09
Start Date: 03/Jan/20 00:09
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9606: [BEAM-7246] Add Google 
Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#issuecomment-570409375
 
 
   @chamikaramj is this ready to be merged? Are all the open comments resolved?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 365549)
Time Spent: 9h 40m  (was: 9.5h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363651&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363651
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 16:30
Start Date: 26/Dec/19 16:30
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361489358
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
+convenient programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor or 'table' name with 'columns' as list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two static
+methods. Use 'query' to perform SQL-based reads and 'table' to read from a
+table by name. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class _ReadOperation.
+
+Users can also provide the _ReadOperation in the form of a PCollection via
+the pipeline. For example:::
+
+  users = (pipeline
+   | beam.Create([_ReadOperation...])
+   | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction via the transform called
+`_create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. This `_create_transaction`
 
 Review comment:
  Yes, it's doing the same thing... updated the docs with some more details.
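As a rough illustration of the handle described in the quoted docstring (function names below are hypothetical; the real transform builds this from a Spanner session), the `_create_transaction` result is just a small dict that downstream reads carry along:

```python
# Hypothetical sketch: per the docstring, _create_transaction's output is a
# dict with 'session_id' and 'transaction_id' that downstream reads use to
# pin themselves to one consistent snapshot.

def make_transaction_handle(session_id, transaction_id):
    return {"session_id": session_id, "transaction_id": transaction_id}

def read_with_transaction(handle, sql):
    # A real read would attach the handle to the Spanner request; here we
    # just show that both identifiers travel together with the query.
    return {"sql": sql, **handle}

handle = make_transaction_handle("session-abc", "txn-123")
request = read_with_transaction(handle, "SELECT * FROM users")
```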
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363650&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363650
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 16:28
Start Date: 26/Dec/19 16:28
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361489188
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
+convenient programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor or 'table' name with 'columns' as list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two static
+methods. Use 'query' to perform SQL-based reads and 'table' to read from a
+table by name. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class _ReadOperation.
+
+Users can also provide the _ReadOperation in the form of a PCollection via
+the pipeline. For example:::
+
+  users = (pipeline
+   | beam.Create([_ReadOperation...])
+   | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction via the transform called
+`_create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. This `_create_transaction`
+PTransform is later passed to the constructor of _ReadFromSpanner. For example:::
+
+  transaction = (pipeline | _create_transaction(TEST_PROJECT_ID,
+  TEST_INSTANCE_ID,
+  DB_NAME))
+
+  users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363636&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363636
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 16:07
Start Date: 26/Dec/19 16:07
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361485151
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 363636)
Time Spent: 9h 10m  (was: 9h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363634&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363634
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 16:04
Start Date: 26/Dec/19 16:04
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361484541
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
+convenient programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor or 'table' name with 'columns' as list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two static
+methods. Use 'query' to perform SQL-based reads and 'table' to read from a
+table by name. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
 
 Review comment:
  I've updated the code. Now it references `google.cloud.spanner.param_types`.
  But there is one import (`google.cloud.spanner_v1.database.BatchSnapshot`)
that we need in our pipeline. Unfortunately, the Spanner SDK does not set an
alias for it in the package, so the only option we have is to import it via the
version-specific path.
  
https://github.com/googleapis/google-cloud-python/blob/master/spanner/google/cloud/spanner.py
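The general fallback pattern being described here (prefer a stable alias, import via the version-pinned path only when no alias exists) can be sketched generically; `json_v1` and `no_such_alias_module` below are made-up module names used only to demonstrate the fallback:

```python
import importlib

def import_first(*candidates):
    # Try each candidate module path in order and return the first one that
    # imports successfully; raise if none do. This mirrors preferring a
    # stable alias over a version-specific module path.
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (candidates,))

# The stdlib 'json' module stands in for a stable alias; 'json_v1' is a
# made-up versioned path that does not exist.
mod = import_first("json", "json_v1")
# When the preferred name is missing, the versioned fallback is used.
fallback = import_first("no_such_alias_module", "math")
```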
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 363634)
Time Spent: 9h  (was: 8h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363628&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363628
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:51
Start Date: 26/Dec/19 15:51
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361482053
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio_test.py
 ##
 @@ -0,0 +1,267 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import datetime
+import logging
+import random
+import string
+import unittest
+
+import mock
+
+import apache_beam as beam
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
 
 Review comment:
  Yes, I took the references from
`org.apache.beam.sdk.io.gcp.spanner.SpannerIOReadTest` and implemented them in
a Pythonic way.
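One Pythonic way to port such Java read tests is to drive the read path through a mocked client, so no Spanner service is needed; `fetch_rows` and the call chain below are illustrative assumptions, not the actual test code from the PR:

```python
from unittest import mock

def fetch_rows(database, sql):
    # Illustrative helper standing in for the transform's read step: open a
    # snapshot and run one query on it.
    with database.snapshot() as snapshot:
        return list(snapshot.execute_sql(sql))

# Mock the whole client chain; MagicMock supports the context-manager
# protocol, so database.snapshot() works inside the `with` statement.
database = mock.MagicMock()
snapshot = database.snapshot.return_value.__enter__.return_value
snapshot.execute_sql.return_value = iter([(1, "alice"), (2, "bob")])

rows = fetch_rows(database, "SELECT id, name FROM users")
```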
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 363628)
Time Spent: 8.5h  (was: 8h 20m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Clound Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363629&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363629
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:51
Start Date: 26/Dec/19 15:51
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361482073
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
+convenient programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor or 'table' name with 'columns' as list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two static
+methods. Use 'query' to perform SQL-based reads and 'table' to read from a
+table by name. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class _ReadOperation.
+
+Users can also provide the _ReadOperation in the form of a PCollection via
+the pipeline. For example:::
+
+  users = (pipeline
+   | beam.Create([_ReadOperation...])
+   | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction via the transform called
+`_create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. This `_create_transaction`
+PTransform is later passed to the constructor of _ReadFromSpanner. For example:::
+
+  transaction = (pipeline | _create_transaction(TEST_PROJECT_ID,
+  TEST_INSTANCE_ID,
+  DB_NAME))
+
+  users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363630
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:51
Start Date: 26/Dec/19 15:51
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361482117
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data needed to
+execute batch and naive reads on Cloud Spanner. This is done for more
+convenient programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor or 'table' name with 'columns' as list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two static
+methods. Use 'query' to perform SQL-based reads and 'table' to read from a
+table by name. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class _ReadOperation.
+
+Users can also provide the _ReadOperation in the form of a PCollection via
+the pipeline. For example:::
+
+  users = (pipeline
+   | beam.Create([_ReadOperation...])
+   | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction via the transform called
+`_create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. This `_create_transaction`
+PTransform is later passed to the constructor of _ReadFromSpanner. For example:::
+
+  transaction = (pipeline | _create_transaction(TEST_PROJECT_ID,
+  TEST_INSTANCE_ID,
+  DB_NAME))
+
+  users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * 
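Per the description above, the handle produced by `_create_transaction` is a plain dict; a minimal shape check, with placeholder values rather than real Spanner identifiers:

```python
# Placeholder values; real ones come from the Cloud Spanner session and
# read-only transaction created by the `_create_transaction` transform.
transaction = {
    "session_id": "projects/p/instances/i/databases/d/sessions/s1",
    "transaction_id": "txn-1",
}

# The two keys named in the docstring are the whole contract.
assert set(transaction) == {"session_id", "transaction_id"}
```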

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363627=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363627
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:50
Start Date: 26/Dec/19 15:50
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361481918
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data required to execute
+batch and naive reads on Cloud Spanner. This is done for more convenient
+programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor, or a 'table' name with 'columns' as a list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two
+static methods: use 'query' to perform SQL-based reads and 'table' to read
+from a table. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class _ReadOperation.
+
+Users can also provide _ReadOperations in the form of a PCollection via the
+pipeline. For example:::
+
+  users = (pipeline
+   | beam.Create([_ReadOperation...])
+   | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction using the transform called
+`_create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read-only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. This `_create_transaction`
+PTransform is later passed to the constructor of _ReadFromSpanner. For example:::
+
+  transaction = (pipeline | _create_transaction(TEST_PROJECT_ID,
+  TEST_INSTANCE_ID,
+  DB_NAME))
+
+  users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363625=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363625
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:46
Start Date: 26/Dec/19 15:46
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361481198
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data required to execute
+batch and naive reads on Cloud Spanner. This is done for more convenient
+programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor, or a 'table' name with 'columns' as a list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 363625)
Time Spent: 8h  (was: 7h 50m)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363626
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:46
Start Date: 26/Dec/19 15:46
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361481286
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner apply _ReadFromSpanner transformation. It will
+return a PCollection, where each element represents an individual row returned
+from the read operation. Both Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data required to execute
+batch and naive reads on Cloud Spanner. This is done for more convenient
+programming.
+
+_ReadFromSpanner reads from Cloud Spanner by providing either an 'sql' param
+in the constructor, or a 'table' name with 'columns' as a list. For example:::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two
+static methods: use 'query' to perform SQL-based reads and 'table' to read
+from a table. For example:::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 363626)
Time Spent: 8h 10m  (was: 8h)

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363624=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363624
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:45
Start Date: 26/Dec/19 15:45
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361481076
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363622=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363622
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:45
Start Date: 26/Dec/19 15:45
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361481011
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363621=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363621
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:45
Start Date: 26/Dec/19 15:45
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361480936
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363620=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363620
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:44
Start Date: 26/Dec/19 15:44
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361480838
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363618
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:43
Start Date: 26/Dec/19 15:43
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361480704
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner, apply the _ReadFromSpanner transform. It
+returns a PCollection, where each element represents an individual row
+returned by the read operation. Both the Query and Read APIs are supported.
+
+_ReadFromSpanner relies on _ReadOperation objects, which are exposed by the
+SpannerIO API. A _ReadOperation holds the immutable data needed to execute
+batch and naive reads on Cloud Spanner. This is done for more convenient
+programming.
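As an aside from the quoted docstring: an immutable container with `query` and `table` factory methods, as described above, could be sketched roughly as follows. This is a hypothetical illustration, not the module's actual internals; all class, field, and method names here are invented.

```python
from collections import namedtuple

# Hypothetical sketch of an immutable read-operation holder; the class and
# field names are illustrative, not the real _ReadOperation implementation.
_Fields = namedtuple('_Fields', ['is_sql', 'kwargs'])

class ReadOperationSketch(_Fields):
    @classmethod
    def query(cls, sql, params=None, params_type=None):
        # SQL-based read: the statement and its parameters travel together.
        return cls(is_sql=True,
                   kwargs={'sql': sql, 'params': params,
                           'param_types': params_type})

    @classmethod
    def table(cls, table, columns):
        # Table-based read over an explicit column list.
        return cls(is_sql=False, kwargs={'table': table, 'columns': columns})

op = ReadOperationSketch.table('users', ['id', 'name'])
```

Because the namedtuple is immutable, a list of such operations can be built once and safely fanned out to workers, which is the "more convenient programming" the docstring alludes to.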
+
+_ReadFromSpanner reads from Cloud Spanner given either an 'sql' param in the
+constructor, or a 'table' name together with a list of 'columns'. For
+example::
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of _ReadOperations
+to the _ReadFromSpanner transform constructor. _ReadOperation exposes two
+static methods: use 'query' to perform SQL-based reads and 'table' to read
+from a table. For example::
+
+  read_operations = [
+  _ReadOperation.table('customers', ['name', 'email']),
+  _ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  _ReadOperation.query('Select name, email from customers'),
+  _ReadOperation.query(
+sql='Select * from users where id <= @user_id',
+params={'user_id': 100},
+params_type={'user_id': param_types.INT64}
+  ),
+]
+  # `params_type` values are instances of `google.cloud.spanner_v1.param_types`
+  all_users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class _ReadOperation.
+
+Users can also provide the _ReadOperations in the form of a PCollection via
+the pipeline. For example::
+
+  users = (pipeline
+   | beam.Create([_ReadOperation...])
+   | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction with the transform called
+`_create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read-only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
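As an illustration of the two staleness controls named in the quoted docstring: the parameter names (`read_timestamp`, `exact_staleness`) come from the text above, while the concrete values below are arbitrary examples.

```python
import datetime

# Illustrative staleness values for a read-only snapshot; only one of the
# two would be passed to the read transform's constructor.
exact_staleness = datetime.timedelta(seconds=15)      # rows at least 15s old
read_timestamp = datetime.datetime(2020, 1, 1, 0, 0)  # or pin an exact time
```

`exact_staleness` bounds how stale the snapshot may be relative to now, whereas `read_timestamp` pins the snapshot to a fixed point in time, which is useful when several reads must agree exactly.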
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. The `_create_transaction`
+PTransform is later passed to the constructor of _ReadFromSpanner. For
+example::
+
+  transaction = (pipeline | _create_transaction(TEST_PROJECT_ID,
+  TEST_INSTANCE_ID,
+  DB_NAME))
+
+  users = pipeline | _ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * 

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363615=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363615
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:42
Start Date: 26/Dec/19 15:42
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361480373
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363616=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363616
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:42
Start Date: 26/Dec/19 15:42
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361480373
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363614=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363614
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:41
Start Date: 26/Dec/19 15:41
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361480318
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363613=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363613
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:39
Start Date: 26/Dec/19 15:39
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361479748
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363612
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:37
Start Date: 26/Dec/19 15:37
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361479506
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,558 @@

[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

2019-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=363610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-363610
 ]

ASF GitHub Bot logged work on BEAM-7246:


Author: ASF GitHub Bot
Created on: 26/Dec/19 15:36
Start Date: 26/Dec/19 15:36
Worklog Time Spent: 10m 
  Work Description: mszb commented on pull request #9606: [BEAM-7246] Add 
Google Spanner IO Read on Python SDK
URL: https://github.com/apache/beam/pull/9606#discussion_r361479242
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/spannerio.py
 ##
 @@ -0,0 +1,531 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Google Cloud Spanner IO
+
+This is an experimental module for reading and writing data from Google Cloud
+Spanner. Visit: https://cloud.google.com/spanner for more details.
+
+To read from Cloud Spanner, apply the ReadFromSpanner transform. It returns
+a list, where each element represents an individual row returned from the
+read operation. Both the Query and Read APIs are supported.
+
+ReadFromSpanner relies on ReadOperation objects, which are exposed by the
+SpannerIO API. A ReadOperation holds the immutable data needed to execute
+batch and naive reads on Cloud Spanner. This is done for more convenient
+programming.
+
+ReadFromSpanner reads from Cloud Spanner given either an 'sql' param in the
+constructor, or a 'table' name together with a list of 'columns'. For
+example::
+
+  records = (pipeline
+| ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users'))
+
+  records = (pipeline
+| ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+table='users', columns=['id', 'name', 'email']))
+
+You can also perform multiple reads by providing a list of ReadOperations to
+the ReadFromSpanner transform constructor. ReadOperation exposes two static
+methods: use 'sql' to perform SQL-based reads and 'table' to read from a
+table. For example::
+
+  read_operations = [
+  ReadOperation.table('customers', ['name', 'email']),
+  ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+  ...OR...
+
+  read_operations = [
+  ReadOperation.sql('Select name, email from customers'),
+  ReadOperation.table('vendors', ['name', 'email']),
+]
+  all_users = pipeline | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+read_operations=read_operations)
+
+For more information, please review the docs on class ReadOperation.
+
+Users can also provide the ReadOperations in the form of a PCollection via
+the pipeline. For example::
+
+  users = (pipeline
+   | beam.Create([ReadOperation...])
+   | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME))
+
+Users may also create a Cloud Spanner transaction with the transform called
+`create_transaction`, which is available in the SpannerIO API.
+
+The transform is guaranteed to be executed on a consistent snapshot of data,
+utilizing the power of read-only transactions. Staleness of data can be
+controlled by providing the `read_timestamp` or `exact_staleness` param values
+in the constructor.
+
+This transform requires the root of the pipeline (PBegin) and returns a dict
+containing 'session_id' and 'transaction_id'. The `create_transaction`
+PTransform is later passed to the constructor of ReadFromSpanner. For
+example::
+
+  transaction = (pipeline | create_transaction(TEST_PROJECT_ID,
+  TEST_INSTANCE_ID,
+  DB_NAME))
+
+  users = pipeline | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from users', transaction=transaction)
+
+  tweets = pipeline | ReadFromSpanner(PROJECT_ID, INSTANCE_ID, DB_NAME,
+sql='Select * from tweets', transaction=transaction)
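To make the shape of the transaction handle described above concrete: a minimal stand-in sketch, where the two dict keys mirror the docstring ('session_id', 'transaction_id') and the values are invented placeholders.

```python
# Hypothetical stand-in for the dict returned by the transaction-creating
# transform; keys mirror the docstring, values are made-up placeholders.
transaction = {
    'session_id': 'projects/p/instances/i/databases/d/sessions/s1',
    'transaction_id': b'txn-1',
}

# Reusing the same handle across several reads is what pins them all to a
# single consistent snapshot.
queries = ['Select * from users', 'Select * from tweets']
planned_reads = [(sql, transaction['transaction_id']) for sql in queries]
```

Sharing one handle this way is the reason the transform runs once at the pipeline root: every downstream read observes the same snapshot.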
+
+For further details of this transform, please review the docs on the
+`create_transaction` method available in the 
