[GitHub] beam pull request #3484: [BEAM-2545] Upgrade bigtable client to 1.0.0-pre1

2017-10-05 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3484


---


[GitHub] beam pull request #3604: [BEAM-2141] Update jenkins job for JDBCIOIT

2017-10-05 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3604


---


[GitHub] beam pull request #3604: [BEAM-2141] Update jenkins job for JDBCIOIT

2017-07-20 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3604

[BEAM-2141] Update jenkins job for JDBCIOIT

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
cc @jasonkuster 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jenkins-jdbc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3604.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3604


commit 3d83475a851dcdfd471ba1a68828058ee6a8ffbc
Author: Stephen Sisk 
Date:   2017-07-20T17:45:47Z

Update jenkins job for JDBCIOIT




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3588: [BEAM-1598] Add Maven support for invoking perfkit ...

2017-07-18 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3588

[BEAM-1598] Add Maven support for invoking perfkit benchmarker to run IO ITs

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-1234] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-1234` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

You can find documentation on how users will invoke this at 
https://docs.google.com/document/d/153J9jPQhMCNi_eBzJfhAg-NprQ7vbf1jNVRgdqeEE8I/edit?usp=sharing,
 which will be moved to the online documentation.

Full design doc for this feature is up at 
https://docs.google.com/document/d/1fISxgeq4Cbr-YRJQDgpnHxfTiQiHv8zQgb47dSvvJ78/edit?usp=sharing

R: @davorbonaci 
cc @iemejia 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam mvn-perfkit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3588.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3588


commit ff9c09f465c23ef4128baa8fccf493b82fa3fb66
Author: Stephen Sisk 
Date:   2017-06-14T16:57:35Z

Add maven support for invoking perfkit benchmarker to run IO ITs




---


[GitHub] beam pull request #3484: [BEAM-2545] Upgrade bigtable client to 1.0.0-pre1

2017-06-30 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3484

[BEAM-2545] Upgrade bigtable client to 1.0.0-pre1

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
cc @kennknowles @jbonofre we believe this will reduce the likelihood of seeing 
the "UNKNOWN: Stale requests. Error mutating row" errors described in BEAM-2545

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam bigtable_1_0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3484.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3484


commit 2a850a8fdfb8543eadebd3a9bdaf6fd9ebfabddc
Author: Stephen Sisk 
Date:   2017-06-30T22:35:48Z

Upgrade bigtable client to 1.0.0-pre1




---


[GitHub] beam pull request #3465: [BEAM-2537] GCP IO ITs now all use --project option

2017-06-28 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3465

[BEAM-2537] GCP IO ITs now all use --project option

Up until now, some IO ITs used --projectId and others used --project

This mixing meant that running all the tests in one test run was
impossible.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Today, the GCP IO ITs use a mix of --project and --projectId pipeline 
options. This change fixes it so all the GCP IO ITs use --project.
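
One way such a mix can make a single test run impossible is sketched below. This is only an illustration, not Beam's option parser: it assumes each test parses the shared command line strictly (unknown options are errors), and `make_strict_parser` and `accepts` are hypothetical helpers built on Python's `argparse`.

```python
import argparse


def make_strict_parser(project_flag):
    """Build a parser that knows exactly one spelling of the project option.

    Hypothetical stand-in for a test's option parsing; abbreviations are
    disabled so that --project is not treated as a prefix of --projectId.
    """
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument(project_flag, dest="project", required=True)
    return parser


def accepts(parser, argv):
    """Return True if the parser accepts argv, False if it rejects it."""
    try:
        parser.parse_args(argv)
        return True
    except SystemExit:
        # argparse reports unknown or missing options by exiting.
        return False


if __name__ == "__main__":
    for argv in (["--project=demo"], ["--project=demo", "--projectId=demo"]):
        for flag in ("--project", "--projectId"):
            print(flag, argv, accepts(make_strict_parser(flag), argv))
```

Under that assumption, no single argument list satisfies both kinds of test, which is why settling on one option name matters.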

cc @mairbek 
R: @jasonkuster @kennknowles 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam io_project_option

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3465.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3465


commit 6fe0e4fda2f688b118bf2a8c7d5e41720c9d0214
Author: Stephen Sisk 
Date:   2017-06-28T22:34:45Z

GCP IO ITs now all use --project option

Up until now, some IO ITs used --projectId and others used --project

This mixing meant that running all the tests in one test run was
impossible.




---


[GitHub] beam pull request #3464: [BEAM-2533] Upgrade beam bigtable client dependency...

2017-06-28 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3464

[BEAM-2533] Upgrade beam bigtable client dependency to 0.9.7.1

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Switching to use the latest version of the bigtable client library.

R: @jkff 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam bt_client_ver

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3464.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3464


commit ed815be8f4999aad6b02ae16574d8dbe1edc1c36
Author: Stephen Sisk 
Date:   2017-06-28T22:30:26Z

Upgrade beam bigtable client dependency to 0.9.7.1




---


[GitHub] beam pull request #3382: [BEAM-2458] Move HashingFn from test -> main

2017-06-19 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3382


---


[GitHub] beam pull request #3382: [BEAM-2458] Move HashingFn from test -> main

2017-06-16 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3382

[BEAM-2458] Move HashingFn from test -> main

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam move-hashingfn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3382.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3382


commit 321b328135eda37ab84cf2aeb0653a24d28ae999
Author: Stephen Sisk 
Date:   2017-06-16T18:03:31Z

Move HashingFn from test -> main




---


[GitHub] beam pull request #3213: [BEAM-2141] Fix postgres address in jdbc jenkins jd...

2017-05-24 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3213


---


[GitHub] beam pull request #3213: [BEAM-2141] Fix postgres address in jdbc jenkins jd...

2017-05-23 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3213

[BEAM-2141] Fix postgres address in jdbc jenkins jdbc, re-enable test

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

The postgres instance had moved servers. This fixes the IP address and 
re-enables the test.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam pg-jenkins-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3213


commit 486b230369ce3a3ce2450e4a668d1429fd16f5c9
Author: Stephen Sisk 
Date:   2017-05-23T21:51:27Z

Fix postgres address in jdbc jenkins jdbc, re-enable test




---


[GitHub] beam pull request #3196: [BEAM-2342] Cleanup k8s scripts naming & don't crea...

2017-05-22 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3196

[BEAM-2342] Cleanup k8s scripts naming & don't create insecure svc by 
default

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

These scripts set up an internet-accessible service by default, which is
insecure since we rely on firewalls to secure the data stores.

R: @tgroh 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam secure-k8s-es

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3196.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3196






---


[GitHub] beam pull request #2681: [BEAM-2033] Add HadoopResourceId

2017-04-26 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/2681


---


[GitHub] beam pull request #2681: [BEAM-2033] Add HadoopResourceId (attempt 2)

2017-04-25 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2681

[BEAM-2033] Add HadoopResourceId (attempt 2)

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Based on https://github.com/apache/beam/pull/2671 - now with more licenses! 
Also a serializable HadoopResourceId.

We'd like this to go into a new feature branch. 

R: @tgroh 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam hfs-pre-req

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2681.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2681


commit 7379db04fbdbed3f8460d991b7789b08d8334745
Author: Stephen Sisk 
Date:   2017-04-22T00:23:55Z

Add HadoopResourceId

commit 4dd919a35e09430973a1aabe0a77cc9df4485d32
Author: Stephen Sisk 
Date:   2017-04-25T17:36:30Z

fixup! Add HadoopResourceId

commit c1e6cbcf2360c43b93a33acd65f45204e3c34869
Author: Stephen Sisk 
Date:   2017-04-25T20:49:05Z

fixup! fixup! Add HadoopResourceId




---


[GitHub] beam pull request #2671: [BEAM-2033] Add HadoopResourceId

2017-04-24 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2671

[BEAM-2033] Add HadoopResourceId

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
merging person: I believe we'll want to do the implementation of this in a 
feature branch - this'd be a good time to create that feature branch. 

R: @tgroh 

It's worth noting that GcsResourceId spends a lot of time on worrying about 
slashes - that's not a thing we'll need for this since Hadoop's Path does a 
good job of cleaning it up, so all that's handled at a lower level.
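
As a rough illustration of what "handled at a lower level" means, the sketch below uses Python's `posixpath` as a stand-in for a path type that normalizes slashes itself. It is not Hadoop's `Path` API, and `normalize` and `child` are hypothetical helpers.

```python
import posixpath


def normalize(path):
    """Collapse repeated slashes and drop a trailing slash."""
    return posixpath.normpath(path)


def child(parent, name):
    """Resolve a child name without the caller worrying about slashes."""
    return normalize(posixpath.join(parent, name))


if __name__ == "__main__":
    print(normalize("/data//2017/04/"))
    print(child("/data/", "part-0"))
    print(child("/data", "part-0"))
```

Because the path type cleans up its own input, code built on top of it does not need separate slash handling.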

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam hfs-pre-req

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2671.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2671


commit 405cc1471c4fb0668c30fe6ffa232bd21c31c4ee
Author: Stephen Sisk 
Date:   2017-04-22T00:23:55Z

Add HadoopResourceId




---


[GitHub] beam pull request #2663: [BEAM-2066] TextIO & AvroIO no longer validate sche...

2017-04-24 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2663

[BEAM-2066] TextIO & AvroIO no longer validate schemas against 
IOChannelFactory

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

IOChannelFactory is no longer relevant in the new Beam FileSystem world, so 
using it to validate is not useful. We also expect that validation at expand 
time will no longer have access to pipeline options, so manually specified 
validation will not be useful in the future either.

R: @dhalperi 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam remove-validation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2663.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2663


commit cf96ac4e01ee1294550a3800d2f67f259e82f403
Author: Stephen Sisk 
Date:   2017-04-24T19:48:41Z

TextIO & AvroIO no longer validate schemas against IOChannelFactory




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2520: [BEAM-1166] Update docs for getDefaultOutputCoder

2017-04-20 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/2520


---


[GitHub] beam pull request #2546: [BEAM-1972] Only compile HIFIO ITs when compiling w...

2017-04-14 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2546

[BEAM-1972] Only compile HIFIO ITs when compiling with java 8.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
The HIFIO jdk1.8-tests directory contains dependencies that require java8 
(notably cassandra.) 

The enforcer rules in jdk1.8-test mean that enforcer won't treat that as a 
problem, however the module itself was still compiling in that scenario.  This 
change makes it so that the module is skipped.

I've verified this fix works by doing mvn clean verify on a machine with 
java8 and a machine with only java7 (which previously failed mvn clean verify) 
- I also verified that the jdk1.8-tests module actually compiled on the java8 
machine and that we can still run the integration tests.

mvn is a tricky one, so let me know if there's something I haven't thought 
of/etc...

Presumably, we will need java8-only modules in the future, so I'm guessing 
this will be a pattern that may be followed elsewhere (eg. in the upcoming 
cassandra module - cc @jbonofre )

R @dhalperi 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam enforcer-j7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2546.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2546


commit 06c8cf865e4030312b4563e690156c8b44636191
Author: Stephen Sisk 
Date:   2017-04-15T00:27:22Z

Only compile HIFIO ITs when compiling with java 8.




---


[GitHub] beam pull request #2525: Clarify BQ instructions in user_score.py

2017-04-13 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2525

Clarify BQ instructions in user_score.py

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam patch-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2525.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2525


commit 810646f2d9deea2fa8bf188cfa11a445d4644fa1
Author: Stephen Sisk 
Date:   2017-04-13T18:23:03Z

Update user_score.py




---


[GitHub] beam pull request #2520: [BEAM-1166] Update docs for getDefaultOutputCoder

2017-04-12 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2520

[BEAM-1166] Update docs for getDefaultOutputCoder

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

I may not have the docs entirely correct - let me know if there's a better 
way to phrase this.

R @jkff 
cc @dhalperi 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam source-docs-fixup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2520.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2520


commit 2241474191117910e4d285d027291ef6b5a2759b
Author: Stephen Sisk 
Date:   2017-04-13T00:26:48Z

Update docs for getDefaultOutputCoder




---


[GitHub] beam pull request #2507: [BEAM-1799] JdbcIOIT now uses writeThenRead style

2017-04-11 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2507

[BEAM-1799] JdbcIOIT now uses writeThenRead style

Removes JdbcTestDataSet's main, since it is no longer necessary for
loading data.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
I've now switched JdbcIOIT to do a write then a read. This code:
* creates a table during test class setup (this runs on the client)
* during the write test, generates data and populates data into that table
* reads the data from the table and verifies it via HashingFn + count
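
The write-then-read flow above can be sketched roughly as follows. This is not the JdbcIOIT code: it assumes an in-memory `sqlite3` database in place of the Postgres instance, a hypothetical table name `beam_test`, and a hypothetical `content_hash` helper standing in for HashingFn (an order-independent digest of the row contents).

```python
import hashlib
import sqlite3


def generate_rows(n):
    """Deterministically generate (id, name) rows so results are reproducible."""
    return [(i, f"Testval{i}") for i in range(n)]


def content_hash(names):
    """Order-independent digest: hash each value, sort the digests, hash again."""
    digests = sorted(hashlib.sha1(n.encode("utf-8")).hexdigest() for n in names)
    return hashlib.sha1("".join(digests).encode("utf-8")).hexdigest()


def write_then_read(n):
    """Create a table, write n generated rows, read them back.

    Returns the row count and the content digest of what was read.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Setup: create the table (runs on the client).
        conn.execute("CREATE TABLE beam_test (id INTEGER, name TEXT)")
        # Write: populate the table with generated data.
        conn.executemany("INSERT INTO beam_test VALUES (?, ?)", generate_rows(n))
        conn.commit()
        # Read: fetch everything back for verification.
        read_back = conn.execute("SELECT id, name FROM beam_test").fetchall()
    finally:
        conn.close()
    return len(read_back), content_hash(name for _, name in read_back)


if __name__ == "__main__":
    count, digest = write_then_read(1000)
    expected = content_hash(name for _, name in generate_rows(1000))
    print(count == 1000 and digest == expected)
```

Checking a count plus a digest keeps the verification small: the test compares two values rather than every row, and the digest does not depend on read order.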

I really like the data generation - feels very clean, and I love the fact 
that using CountingSource means this part can be parallelized for larger data 
sets. I suspect that we may choose to refactor the "generate data" portion out 
into a helper method alongside HashingFn in io-common.

R @dhalperi 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jdbc-it-writeThenRead

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2507.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2507


commit 26fb99021b1edd48f6660b95a106ecd9116c6f16
Author: Stephen Sisk 
Date:   2017-04-04T22:46:19Z

JdbcIOIT now uses writeThenRead style

Removes JdbcTestDataSet's main, since it is no longer necessary for
loading data.




---


[GitHub] beam pull request #2499: HIFIO Cassandra tests were failing if run twice in ...

2017-04-11 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2499

HIFIO Cassandra tests were failing if run twice in a row without a clean

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
The embedded cassandra tests work correctly the first time they are run 
after a mvn clean, but on any later test-phase execution of the jdk1.8-tests 
directory (i.e., any time you run these unit tests again) they will fail. Thus, 
this passes our "mvn clean verify" runs and is not breaking CI, but will likely 
break any devs running tests in this directory locally without clean-ing (aka, 
me).

R @dhalperi 
R @tgroh 
...whoever gets to this first

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam fix-cassandra-uts

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2499.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2499


commit e72b13577d9d836aa7f0bc494725b1607aa8eb83
Author: Stephen Sisk 
Date:   2017-04-11T21:36:05Z

HIFIO Cassandra tests were failing if run twice in a row without a clean




---


[GitHub] beam pull request #2492: Fix build breaks caused by overlaps between b615013...

2017-04-11 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2492

Fix build breaks caused by overlaps between b615013 and c08b7b1

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


There were conflicting changes in the cassandra unit tests & hashingfn 
changes

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam fix-cassandra-hash

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2492.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2492


commit ccae1267442a9d8b176a531aa886437d5ad5c9ca
Author: Stephen Sisk 
Date:   2017-04-11T18:07:48Z

Fix build breaks caused by overlaps between b615013 and c08b7b1




---


[GitHub] beam pull request #2466: [BEAM-1644] Move travis & jenkins into shared test-...

2017-04-07 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2466

[BEAM-1644] Move travis & jenkins into shared test-infra dir, and move k8s 
scripts there

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Reasoning is discussed in the bug, but generally, we want to have a 
language-neutral place for these kubernetes scripts to live, and also don't 
want a million top-level directories for test-infra.

cc @jasonkuster 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam create-k8s-home

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2466.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2466


commit 617297662e2649f5e38f5c4ae9047f7e9bcecbe8
Author: Stephen Sisk 
Date:   2017-04-07T22:57:33Z

Move travis/jenkins folders in a test-infra folder

commit 4cd666c5026ef91121a5cf7c6786c21c8d39d904
Author: Stephen Sisk 
Date:   2017-04-07T23:06:15Z

Move jdbc's postgres k8s scripts into shared k8s dir

commit 3468cec566bd24164922c0c063067d90e1c43849
Author: Stephen Sisk 
Date:   2017-04-07T23:11:19Z

Move HIFIO k8s scripts into shared dir




---


[GitHub] beam pull request #2463: [BEAM-1799] Move HashingFn to io/common, switch to ...

2017-04-07 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2463

[BEAM-1799] Move HashingFn to io/common, switch to better hash

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
HadoopInputFormatIO has a hashing function that can be used to easily 
verify reads - this moves it to a common place so other IOs can use it. 

It also switches from SHA1 -> murmur since that's a good, fast hash (which 
is all we need). That meant I had to update the hashes in the unit tests.
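
A minimal sketch of why swapping the hash function forces the expected values in the tests to change, assuming the combiner hashes each element and merges the per-element hashes in an order-insensitive way (the description above does not spell out the combining step). `HashingCombiner` and its method names are hypothetical, and since Python's hashlib has no murmur implementation, blake2b stands in for the new hash.

```python
import hashlib

MASK = 2**128 - 1  # keep accumulators at 128 bits


def element_hash(value, algorithm):
    """Hash one element and keep the first 16 bytes as an integer."""
    digest = hashlib.new(algorithm, value.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


class HashingCombiner:
    """Combine-style hashing: accumulators can be merged in any order."""

    def __init__(self, algorithm):
        self.algorithm = algorithm

    def create_accumulator(self):
        return 0

    def add_input(self, acc, value):
        return (acc + element_hash(value, self.algorithm)) & MASK

    def merge_accumulators(self, accs):
        return sum(accs) & MASK

    def extract_output(self, acc):
        return format(acc, "032x")


def digest_of(values, algorithm):
    """Hash a whole collection in one pass."""
    combiner = HashingCombiner(algorithm)
    acc = combiner.create_accumulator()
    for value in values:
        acc = combiner.add_input(acc, value)
    return combiner.extract_output(acc)


# Two bundles hashed separately, then merged, as parallel workers would do.
combiner = HashingCombiner("blake2b")
partials = []
for bundle in [["row1", "row2"], ["row3"]]:
    acc = combiner.create_accumulator()
    for value in bundle:
        acc = combiner.add_input(acc, value)
    partials.append(acc)
merged = combiner.extract_output(combiner.merge_accumulators(partials))
```

The merged digest matches a single pass over the same elements in any order, but it is tied to the hash function: the same rows hashed with "sha1" give a different value, which is why the stored expected hashes had to be regenerated.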

cc @dhalperi please take a look
cc @diptikul this will change some of the code you're working on, wanted to 
make sure you're aware. Should be an easy merge, but wanted to give you a heads 
up. Also, I had thought that the Guava 19 dependency in HIFIO was a hard 
requirement, but I seem to be able to mvn verify & run the HIFIO ITs without 
it. Is there something I should be doing to cause the problem? (perhaps it was 
in the removed cassandra-unit dependency?)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam common-hash

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2463.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2463


commit 11de90abec49cb13da0147a27fc1cfcdd969a786
Author: Stephen Sisk 
Date:   2017-04-07T19:59:28Z

Move HashingFn to io/common, switch to better hash




---


[GitHub] beam-site pull request #205: Add to in-progress IO list and alphabetize

2017-04-07 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam-site/pull/205

Add to in-progress IO list and alphabetize



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam-site io-in-progress

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/205.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #205


commit 732378da1dbc66d3ae8b72ffb6893230350390fe
Author: Stephen Sisk 
Date:   2017-04-07T20:11:51Z

Add to in-progress IO list and alphabetize




---


[GitHub] beam-site pull request #203: [BEAM-1852] Fix ICLA link

2017-04-07 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam-site/pull/203


---


[GitHub] beam-site pull request #203: [BEAM-1852] Fix ICLA link

2017-04-06 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam-site/pull/203

[BEAM-1852] Fix ICLA link

The link at https://www.apache.org/licenses/ under "CONTRIBUTOR LICENSE 
AGREEMENTS" now links to the pdf version, and this commit changed the link in 
core beam:

https://github.com/apache/beam/commit/2a95b8fcb142c392fa3e48c4e80c7abc3b96a500

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam-site fix-icla-link

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/203.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #203


commit d7d4fc93bc58b434ccec9360f315a8ce5273a4fc
Author: Stephen Sisk 
Date:   2017-04-06T15:20:31Z

Fix ICLA link




---


[GitHub] beam-site pull request #202: [BEAM-1896] Add "In-Progress" info to the Built...

2017-04-06 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam-site/pull/202

[BEAM-1896] Add "In-Progress" info to the Built-in I/O Transforms page

Also added the Apache Hadoop InputFormat to the existing transforms list.

cc @melap 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam-site io-in-progress

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/202.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #202


commit d96138bf920bfc4c9cc6226173453593581a68c9
Author: Stephen Sisk 
Date:   2017-04-06T15:07:11Z

Add "In-Progress" info to the Built-in I/O Transforms page




---


[GitHub] beam pull request #2431: [BEAM-1882] Update postgres k8 scripts & add script...

2017-04-04 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2431

[BEAM-1882] Update postgres k8 scripts & add scripts for running local dev 
test

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

Currently, the postgres instance created is a pod - that means that if it 
crashes or anything else happens to it, it won't be re-created. We should switch 
to a ReplicationController, which will ensure that it is automatically created 
again.

This change:
* consolidates the normal pod creation + service down to one file
* switches from pods -> replication controllers
* adds a k8s script that exposes a public IP - this can be used when doing 
local development.
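
For illustration, the shape of such a setup might look like the sketch below, which builds the Kubernetes objects as Python dictionaries and serializes them to JSON (kubectl accepts JSON as well as YAML). This is not the content of the scripts in this PR: the object names, the `name: postgres` label, the image, and the `postgres_manifest` helper are all assumptions.

```python
import json


def postgres_manifest(public=False):
    """Build a ReplicationController plus Service for a single postgres instance."""
    labels = {"name": "postgres"}  # hypothetical label shared by selector and pod
    replication_controller = {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": "postgres"},
        "spec": {
            # The controller re-creates the pod if it crashes or is deleted.
            "replicas": 1,
            "selector": labels,
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "postgres",
                            "image": "postgres",
                            "ports": [{"containerPort": 5432}],
                        }
                    ]
                },
            },
        },
    }
    service_spec = {
        "selector": labels,
        "ports": [{"port": 5432, "targetPort": 5432}],
    }
    if public:
        # Request an externally reachable IP, e.g. for local development.
        service_spec["type"] = "LoadBalancer"
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "postgres"},
        "spec": service_spec,
    }
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [replication_controller, service],
    }


internal = postgres_manifest()
for_local_dev = postgres_manifest(public=True)
manifest_json = json.dumps(internal, indent=2)
```

Writing `manifest_json` to a file and passing it to `kubectl create -f` would create both objects from one file; the `public=True` variant only changes the service type.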

cc @jbonofre @jasonkuster 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jdbc-it-k8-scripts

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2431.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2431


commit 01e00ccb8b4c7ab9c80fc2a4f2c0205b1af0d5bc
Author: Stephen Sisk 
Date:   2017-03-17T22:09:00Z

Update postgres k8 scripts & add scripts for running local dev test




---


[GitHub] beam-site pull request #196: [BEAM-1666] Read Transform content for Authorin...

2017-03-28 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam-site/pull/196

[BEAM-1666] Read Transform content for Authoring I/O Transforms - Overview

This adds the Read Transform content for the I/O Authoring overview page. 

It is more on the barebones side of things, but I'd prefer to get *some* 
content up that everyone can agree on. The write content is still mostly TODO - 
I realized that I don't have a lot of useful info there, so I'll leave that to 
folks who have experience with that side of things.

I moved some content that I initially thought was going to be in the 
overview page into the authoring-java page. Note that the authoring-java page 
is still not visible to users, so they won't care that the snippet is there, 
but I can remove that if it bothers us.

cc @melap for initial review
optional @jkff @dhalperi - this is based on an outline Eugene and I 
discussed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam-site authoring-overview

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/196.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #196


commit 95dd38e7715f4445fb49ca96a98f8fc74743ace4
Author: Stephen Sisk 
Date:   2017-03-28T22:09:28Z

Add Read Transform content for Authoring I/O Transforms - Overview




---


[GitHub] beam pull request #2253: [BEAM-1644] Move PipelineOptions for IO ITs into sh...

2017-03-15 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2253

[BEAM-1644] Move PipelineOptions for IO ITs into shared location.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
* creates a common directory for IO 
* moves the current pipeline options (which are spread out into separate 
directories) into that common directory.

This is useful/necessary because:
1. In order to run all the IO ITs in one run, they have to have a shared 
set of pipeline options - options on the command line but not present in the 
PipelineOptions being used to read them will cause an error
2. Data stores may be accessed in different IO modules, but should share 
common command line options - having them in a common directory makes that 
easier.
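
The first point can be illustrated with a small sketch. It uses Python's argparse as a stand-in for a strict options parser; the option names `postgresServerName` and `cassandraHost` and the helper functions are hypothetical, not the actual options defined in this PR.

```python
import argparse


def make_parser(option_names):
    """Build a parser that knows only the given options."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    for name in option_names:
        parser.add_argument(f"--{name}")
    return parser


def parse_strict(parser, argv):
    """Reject any command-line option the parser does not declare."""
    options, unknown = parser.parse_known_args(argv)
    if unknown:
        raise ValueError(f"unrecognized options: {unknown}")
    return options


JDBC_OPTIONS = ["postgresServerName"]   # hypothetical per-module options
CASSANDRA_OPTIONS = ["cassandraHost"]   # hypothetical per-module options
SHARED_OPTIONS = JDBC_OPTIONS + CASSANDRA_OPTIONS

# One command line for a run that covers every IO integration test.
argv = ["--postgresServerName=10.0.0.1", "--cassandraHost=10.0.0.2"]

# A module-specific option set fails on the other module's flags...
try:
    parse_strict(make_parser(JDBC_OPTIONS), argv)
    jdbc_only_error = None
except ValueError as error:
    jdbc_only_error = str(error)

# ...while the shared option set accepts the whole command line.
shared = parse_strict(make_parser(SHARED_OPTIONS), argv)
```

With the options split per module, every test would reject the flags intended for the others; a single shared definition lets one command line drive all of them.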

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam io-test-pipeline-options

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2253.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2253


commit c5568417876ed6c8ea2e64c6caa22098f4494d41
Author: Stephen Sisk 
Date:   2017-03-15T21:31:49Z

Move PipelineOptions for IO ITs into shared location.




---


[GitHub] beam-site pull request #173: [BEAM-1665] Add Pipeline I/O section to website...

2017-03-08 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam-site/pull/173

[BEAM-1665] Add Pipeline I/O section to website - outline + move some 
existing txt

This PR has a fair number of TODOs in the website content, but I have PRs 
queued up with more content from the IO authoring guide and none of the TODOs 
are for content already on the website. 

It has very little actual new content - mostly, it creates structure and 
moves existing content around.

* I did not go with a single page for all this content b/c both java and 
python have enough unique content that they deserve their own separate sections 
(i.e., just tabs on the code aren't enough). The "click to the next page" 
model currently implemented allows the user to pick java vs python, but then 
after reading those pages, the next page for both points at the same place - 
the users mostly follow the same path, but for java vs python specific content, 
they will diverge then converge again.
* I moved the "list of built-in I/O" content over to its own separate page 
since it'd be nice to have more content there (e.g. a capabilities matrix), and 
it felt special enough to pull out of the programming guide.
* We decided not to put all of this content in the contribute section of 
the site since we don't expect all users to contribute 
their IO transforms, so we want most of the docs to just be about writing IO 
transforms, and then lay out the expectations in the contribute part of the IO 
section.

R @melap 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam-site io-guide

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/173.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #173


commit b0e0862042dd189dfcf8d1994d482fca99853a3b
Author: Stephen Sisk 
Date:   2017-03-09T01:49:37Z

Add Pipeline I/O section to website - outline + move some existing content

* I did not go with a single page for all this content b/c both java and 
python have enough unique content that they deserve their own separate sections 
(i.e., just tabs on the code aren't enough). The "click to the next page" 
model currently implemented allows the user to pick java vs python, but then 
after reading those pages, the next page for both points at the same place - 
the users mostly follow the same path, but for java vs python specific content, 
they will diverge then converge again.
* I moved the "list of built-in I/O" content over to its own separate page 
since it'd be nice to have more content there (e.g. a capabilities matrix), and 
it felt special enough to pull out of the programming guide.
* We decided not to put all of this content in the contribute section of 
the site since we don't expect all users to contribute 
their IO transforms, so we want most of the docs to just be about writing IO 
transforms, and then lay out the expectations in the contribute part of the IO 
section.




---


[GitHub] beam pull request #2174: [BEAM-1310] updates to JdbcIO k8 scripts & data loa...

2017-03-06 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2174

[BEAM-1310] updates to JdbcIO k8 scripts & data loading

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

We have started setting up the k8 cluster & postgres instances, and I had a 
few small updates as I got things working. 

R @jbonofre 
cc @jasonkuster 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jdbc-it-profiles

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2174.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2174


commit a59aebad8481044714e0ed8532812de263793843
Author: Stephen Sisk 
Date:   2017-03-06T23:59:31Z

Jdbc k8 & data loading: add teardown and update names/docs

commit 156018bbdc7769de6095425504d38703a8e81762
Author: Stephen Sisk 
Date:   2017-03-07T00:01:26Z

Jdbc k8 script: postgres data store only accessible inside test project




---


[GitHub] beam-site pull request #166: Add contrib guide comments about using squash &...

2017-02-24 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam-site/pull/166

Add contrib guide comments about using squash & force push.

I encountered some issues while starting to use 

The contrib guide contains pretty detailed git commands, but:
1. For pushing to your branch, the git commands don't actually work when 
you try to use them in real life (you need to force push)
2. For squashing, we tell contributors that they *can* squash, but we don't 
actually say that we expect them to.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam-site squash-info

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/166.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #166


commit 841c2344640a6e482ea8affc8e5159a2e78f0802
Author: Stephen Sisk 
Date:   2017-02-24T22:44:09Z

Add contrib guide comments about using squash & force push.




---


[GitHub] beam pull request #2090: [BEAM-1310] Jdbc IO module now can run ITs against ...

2017-02-23 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/2090

[BEAM-1310] Jdbc IO module now can run ITs against spark & dataflow runners

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

I have added some changes that will allow us to start running the JdbcIO 
ITs with spark & dataflow.

These 2 commits are deliberately separate since they're doing different 
things. I don't think it's worth separating them out into different pull 
requests since both are moving towards the same goal of enabling JdbcIOIT to 
actually run in jenkins, and can be logically reviewed together.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jdbc-it-profiles

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2090.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2090






---


[GitHub] beam pull request #1841: BEAM-1310 Add integration tests for JdbcIO

2017-01-24 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/1841

BEAM-1310 Add integration tests for JdbcIO

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Hi @tgroh can you please take a look? (2nd time!)

This adds an integration test & data loading script for Jdbc IO

You'll note there are TODOs for verifying the contents of the rows in the 
tests - I will get to those, but wanted to get a first, useful test in so that 
we can continue working on the end to end integration test infrastructure.

This IT demonstrates:
* Having a separate load script for read tests
* Using TestPipelineOptions that are not test based, but rather data source 
based.
* Writing kubernetes scripts for instantiating an instance of the data 
source.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jdbc-it

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1841


commit e6007397e7d5508a4dfe39cf95ad4e1ceaaa2d6e
Author: Stephen Sisk 
Date:   2017-01-25T01:56:35Z

Add pg IT, load script and k8 script for JdbcIO




---


[GitHub] beam pull request #1840: BEAM-1310 Add integration tests for JdbcIO

2017-01-24 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/1840


---


[GitHub] beam pull request #1840: BEAM-1310 Add integration tests for JdbcIO

2017-01-24 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/1840

BEAM-1310 Add integration tests for JdbcIO

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Hi @tgroh, can you please take a look?

This adds an integration test and a data loading script for JdbcIO.

You'll note there are TODOs for verifying the contents of the rows in the
tests. I will get to those, but wanted to get a first, useful test in so that
we can continue working on the end-to-end integration test infrastructure.

This IT demonstrates:
* Having a separate load script for read tests.
* Using TestPipelineOptions that are defined per data source rather than
per test.
* Writing Kubernetes scripts for instantiating an instance of the data
source.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam io-testing

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1840.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1840


commit 21561ce64db15e2ba6b13186cc0092ba419b253e
Author: Stephen Sisk 
Date:   2016-12-23T01:35:40Z

Add an example IT that relies on an external data store.

commit 83fcafcec6ababa664bbb2098fb440d86873ba20
Author: Stephen Sisk 
Date:   2017-01-11T19:21:47Z

Update JdbcIOIT example so it uses real pipeline options & demonstrates a 
bug in JdbcIO

commit ba57d38e92fc6b79a192738f9f40925d2f3ae6f9
Author: Stephen Sisk 
Date:   2017-01-12T19:48:37Z

JdbcIOIT now runs successfully. Fixed errors in Serialization & tableName
It is intentional that we use static inner classes rather than anonymous
classes - anonymous classes pull in their containing class, and thus the
containing class must be serializable.

There was also a small error in using an outdated table name in verification.

Also updated a few comments.
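
The serialization point in the commit message above can be illustrated with a small, self-contained sketch. It is not code from this pull request: `SerializationSketch`, `RowFormatter`, `EnclosingTest`, and `StaticFormatter` are hypothetical names, and plain Java serialization stands in for whatever the pipeline actually serializes. The anonymous class reads a field of its enclosing instance, so it holds a reference to that instance, and serializing it fails because the enclosing class is not Serializable. The static nested class carries only its own state and serializes cleanly.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {

  /** Hypothetical stand-in for a serializable function object, e.g. a row mapper. */
  interface RowFormatter extends Serializable {
    String format(int id);
  }

  /** Hypothetical test class. Note that it does NOT implement Serializable. */
  static class EnclosingTest {
    private final String tableName;

    EnclosingTest(String tableName) {
      this.tableName = tableName;
    }

    /** Anonymous class: reads the outer field, so it captures the EnclosingTest instance. */
    RowFormatter anonymousFormatter() {
      return new RowFormatter() {
        @Override
        public String format(int id) {
          return tableName + ":" + id;
        }
      };
    }

    /** Static nested class: receives the state it needs through its constructor. */
    RowFormatter staticFormatter() {
      return new StaticFormatter(tableName);
    }

    static class StaticFormatter implements RowFormatter {
      private final String tableName;

      StaticFormatter(String tableName) {
        this.tableName = tableName;
      }

      @Override
      public String format(int id) {
        return tableName + ":" + id;
      }
    }
  }

  /** Returns true if Java serialization accepts the object, false on NotSerializableException. */
  static boolean canSerialize(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return true;
    } catch (NotSerializableException e) {
      return false;
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    EnclosingTest test = new EnclosingTest("example_table");
    System.out.println("anonymous class serializable: " + canSerialize(test.anonymousFormatter()));
    System.out.println("static class serializable:    " + canSerialize(test.staticFormatter()));
  }
}
```

Making the enclosing class Serializable would also let the anonymous version pass, but it would drag the whole test class into the serialized form, which is the situation the commit message says it avoids.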

commit 82b35cf42387f39380145c77b9f15725fe9fff06
Author: Stephen Sisk 
Date:   2016-12-23T01:35:40Z

Add an example IT that relies on an external data store.

commit b3456996ad2363de6d27181b03cdd08634a32aad
Author: Stephen Sisk 
Date:   2017-01-11T19:21:47Z

Update JdbcIOIT example so it uses real pipeline options & demonstrates a 
bug in JdbcIO

commit 12963c6d657821b50491138b577f9ae0f7286eb6
Author: Stephen Sisk 
Date:   2017-01-12T19:48:37Z

JdbcIOIT now runs successfully. Fixed errors in Serialization & tableName
It is intentional that we use static inner classes rather than anonymous
classes - anonymous classes pull in their containing class, and thus the
containing class must be serializable.

There was also a small error in using an outdated table name in verification.

Also updated a few comments.

commit e59949ae41ea60698490bf4c836382bcd1e05bcc
Author: Stephen Sisk 
Date:   2017-01-13T01:27:17Z

Merge branch 'io-testing' of github.com:ssisk/beam into io-testing

commit 1e95f9253ad9cd6295f1377850f064edcd317032
Author: Stephen Sisk 
Date:   2016-12-23T01:35:40Z

Add an example IT that relies on an external data store.

commit 760413fe328b0f7feed637e39bfe2990168b9cbf
Author: Stephen Sisk 
Date:   2017-01-11T19:21:47Z

Update JdbcIOIT example so it uses real pipeline options & demonstrates a 
bug in JdbcIO

commit 8720379930b6fdc772d7e32facb55c9ded5b55ae
Author: Stephen Sisk 
Date:   2017-01-12T19:48:37Z

JdbcIOIT now runs successfully. Fixed errors in Serialization & tableName
It is intentional that we use static inner classes rather than anonymous
classes - anonymous classes pull in their containing class, and thus the
containing class must be serializable.

There was also a small error in using an outdated table name in verification.

Also updated a few comments.

commit 8c364ff6cfbc1d34f6f2b283be6ee7beff61c773
Author: Stephen Sisk 
Date:   2017-01-13T02:16:40Z

Switch JdbcTestOptions -> PostgresTestOptions & add comments explaining use.

commit 35ee94279b37e9d0d9c21dd6799200b07ca8817f
Author: Stephen Sisk 
Date:   2017-01-18T00:16:16Z

Add example kubernetes & mesos scripts for setting up postgres instance.

commit 170881085f0b5f0098dd9087a04462