[jira] [Created] (BEAM-1164) Allow a DoFn to opt in to mutating its input

2016-12-15 Thread Frances Perry (JIRA)
Frances Perry created BEAM-1164:
---

 Summary: Allow a DoFn to opt in to mutating its input
 Key: BEAM-1164
 URL: https://issues.apache.org/jira/browse/BEAM-1164
 Project: Beam
  Issue Type: Bug
  Components: beam-model
Reporter: Frances Perry
Priority: Minor


Runners generally can't tell whether a DoFn mutates its inputs, and assuming by 
default that it does would carry a significant performance cost from unnecessary 
copying (around sibling fusion, etc). So instead the model prohibits mutating 
inputs, and the Direct Runner validates this behavior. (See: 
http://beam.incubator.apache.org/contribute/design-principles/#make-efficient-things-easy-rather-than-make-easy-things-efficient)
 

However, if users are processing a small number of large records by making 
incremental changes (for example, genomics use cases), the cost of the 
immutability requirement can be very large. As a workaround, users sometimes do 
suboptimal things (fusing ParDos by hand) or undefined things when they believe 
the immutability requirement is unnecessarily strict (adding no-op coders in 
places they hope the runner won't be materializing things, mutating things 
anyway when they don't expect sibling fusion to happen, etc).

We should consider adding a signal (MutatingDoFn?) that users explicitly opt in 
to, indicating that their code may mutate inputs. The runner can then use this 
signal either to prevent optimizations that would break in the face of mutation 
or to insert additional copies as needed so that optimizations preserve semantics.
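
The following is a minimal sketch of the idea in plain Python, not Beam code. It assumes a hypothetical `mutating` marker and a hypothetical `run_fused_siblings` helper standing in for a runner that fuses sibling functions over the same element. Only functions that opt in receive a defensive copy, so the other siblings keep sharing the original element at no extra cost.

```python
import copy


def mutating(fn):
    """Hypothetical opt-in marker: declares that fn may mutate its input."""
    fn.mutates_input = True
    return fn


def run_fused_siblings(element, fns):
    """Apply sibling functions to one element, copying only for opted-in functions."""
    outputs = []
    for fn in fns:
        if getattr(fn, "mutates_input", False):
            # The function declared it may mutate, so give it a private copy.
            arg = copy.deepcopy(element)
        else:
            # Non-mutating functions can safely share the original element.
            arg = element
        outputs.append(fn(arg))
    return outputs


@mutating
def append_marker(record):
    record.append("seen")
    return record


def length(record):
    return len(record)


record = ["a", "b"]
results = run_fused_siblings(record, [append_marker, length])
```

Here `length` still sees the original two-item record even though its sibling appended to its own copy. A real runner could equally choose to disable fusion for opted-in functions instead of copying.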

See this related user@ discussion:
https://lists.apache.org/thread.html/f39689f54147117f3fc54c498eff1a20fa73f1be5b5cad5b6f816fd3@%3Cuser.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-1070) Service Account Based Authentication Broken

2016-12-01 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-1070:

Assignee: Ahmet Altay  (was: Frances Perry)

> Service Account Based Authentication Broken
> ---
>
> Key: BEAM-1070
> URL: https://issues.apache.org/jira/browse/BEAM-1070
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
> Environment: CentOS Linux release 7.1.1503 (Core) 
> Python 2.7.5
>Reporter: Stephen Reichling
>Assignee: Ahmet Altay
>Priority: Critical
>
> {{sdks/python/apache_beam/internal/auth.py}} calls into the 
> {{oauth2client.service_account.ServiceAccountCredentials.from_p12_keyfile}} 
> method with invalid and incorrectly-ordered parameters. Compare the [function 
> signature of 
> ServiceAccountCredentials.from_p12_keyfile|https://github.com/google/oauth2client/blob/ae73312942d3cf0e98f097dfbb40f136c2a7c463/oauth2client/service_account.py#L300-L303]
>  with [how it is 
> invoked|https://github.com/apache/incubator-beam/blob/9ded359daefc6040d61a1f33c77563474fcb09b6/sdks/python/apache_beam/internal/auth.py#L150-L154].
>  This causes a runtime error when one attempts to use a service account to 
> authenticate with the Google Dataflow APIs.
> The specific problems are:
>  - the {{client_scopes}} variable (a list) is passed as a positional 
> parameter where the function signature expects the {{private_key_password}} 
> parameter (a string).
>  - a keyed parameter, {{user_agent}}, is passed but no such parameter is 
> defined in the function signature.
>  - no value is provided for {{private_key_password}}. All p12 key files for 
> service accounts issued by Google Cloud have the password {{notasecret}} as 
> documented 
> [here|https://support.google.com/cloud/answer/6158849?hl=en#serviceaccounts], 
> so it's currently not possible to use a Google-issued p12 key file with this 
> implementation. 
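
The argument mix-up described above can be illustrated with a small self-contained sketch. The `from_p12_keyfile` function below is a hypothetical stand-in that only mirrors the parameter order described in the issue (password before scopes); it is not the oauth2client implementation. The email address and scope value are example placeholders.

```python
def from_p12_keyfile(service_account_email, filename,
                     private_key_password=None, scopes=''):
    """Hypothetical stand-in with the parameter order described in the issue."""
    if not isinstance(private_key_password, str):
        raise TypeError('private_key_password must be a string')
    return {
        'email': service_account_email,
        'filename': filename,
        'password': private_key_password,
        'scopes': scopes,
    }


client_scopes = ['https://www.googleapis.com/auth/cloud-platform']

# Reported pattern: the scopes list lands in the private_key_password slot.
try:
    from_p12_keyfile('svc@example.com', '/some/path/keyfile.p12', client_scopes)
    broken_call_failed = False
except TypeError:
    broken_call_failed = True

# Keyword arguments make the intent explicit and supply the password.
credentials = from_p12_keyfile(
    'svc@example.com',
    '/some/path/keyfile.p12',
    private_key_password='notasecret',
    scopes=client_scopes,
)
```

Passing the optional parameters by keyword avoids depending on their position in the signature.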





[jira] [Updated] (BEAM-1068) Service Account Credentials File Specified via Pipeline Option Ignored

2016-12-01 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-1068:

Assignee: Ahmet Altay  (was: Frances Perry)

> Service Account Credentials File Specified via Pipeline Option Ignored
> --
>
> Key: BEAM-1068
> URL: https://issues.apache.org/jira/browse/BEAM-1068
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
> Environment: CentOS Linux release 7.1.1503 (Core)
> Python 2.7.5
>Reporter: Stephen Reichling
>Assignee: Ahmet Altay
>Priority: Minor
>
> When writing a pipeline that authenticates with Google Dataflow APIs using a 
> service account, specifying the path to that service account's credentials 
> file in the {{PipelineOptions}} object passed in to the pipeline does not 
> work; it only works when passed as a command-line flag.
> For example, if I write code like so:
> {code}
> pipelineOptions = options.PipelineOptions()
> gcOptions = pipelineOptions.view_as(options.GoogleCloudOptions)
> gcOptions.service_account_name = 'My Service Account Name'
> gcOptions.service_account_key_file = '/some/path/keyfile.p12'
> pipeline = beam.Pipeline(options=pipelineOptions)
> # ... add stages to the pipeline
> pipeline.run()
> {code}
> and execute it like so:
> {{python ./my_pipeline.py}}
> ...the service account I specify will not be used.
> Only if I were to execute the code like so:
> {{python ./my_pipeline.py --service_account_name 'My Service Account Name' 
> --service_account_key_file /some/path/keyfile.p12}}
> ...does it actually use the service account.
> The problem appears to be rooted in {{auth.py}}, which reconstructs the 
> {{PipelineOptions}} object directly from {{sys.argv}} rather than using the 
> instance passed in to the pipeline: 
> https://github.com/apache/incubator-beam/blob/9ded359daefc6040d61a1f33c77563474fcb09b6/sdks/python/apache_beam/internal/auth.py#L129-L130
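
The difference between the two code paths can be sketched without Beam. The example below uses `argparse` as a stand-in for the options machinery; `credentials_from_argv` and `credentials_from_options` are hypothetical helpers, with the first imitating the reported behavior of re-parsing command-line flags and the second using the options instance the caller actually built.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--service_account_name', default=None)
    parser.add_argument('--service_account_key_file', default=None)
    return parser


def credentials_from_argv(argv):
    """Imitates the reported behavior: re-parse flags, ignoring programmatic options."""
    opts, _ = build_parser().parse_known_args(argv)
    return opts.service_account_key_file


def credentials_from_options(options):
    """Uses the options instance that was passed in to the pipeline."""
    return options.service_account_key_file


# Options set in code, as in the snippet from the issue.
programmatic = argparse.Namespace(
    service_account_name='My Service Account Name',
    service_account_key_file='/some/path/keyfile.p12',
)

# Equivalent of running `python ./my_pipeline.py` with no flags.
from_flags = credentials_from_argv([])
from_instance = credentials_from_options(programmatic)
```

With no flags on the command line, the first path finds no key file while the second returns the one set in code, which matches the symptom in the report.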





[jira] [Assigned] (BEAM-666) Add accurate "How to Run" instructions for each of the WC examples

2016-11-10 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-666:
--

Assignee: Frances Perry  (was: Hadar Hod)

> Add accurate "How to Run" instructions for each of the WC examples
> --
>
> Key: BEAM-666
> URL: https://issues.apache.org/jira/browse/BEAM-666
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Hadar Hod
>Assignee: Frances Perry
>






[jira] [Assigned] (BEAM-667) Include code snippets from real examples

2016-11-10 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-667:
--

Assignee: Frances Perry  (was: Hadar Hod)

> Include code snippets from real examples
> 
>
> Key: BEAM-667
> URL: https://issues.apache.org/jira/browse/BEAM-667
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Hadar Hod
>Assignee: Frances Perry
>






[jira] [Commented] (BEAM-667) Include code snippets from real examples

2016-11-10 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654554#comment-15654554
 ] 

Frances Perry commented on BEAM-667:


These need to be redone given https://github.com/apache/incubator-beam/pull/1315

> Include code snippets from real examples
> 
>
> Key: BEAM-667
> URL: https://issues.apache.org/jira/browse/BEAM-667
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Hadar Hod
>Assignee: Hadar Hod
>






[jira] [Commented] (BEAM-900) Spark quickstart instructions

2016-11-05 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15641199#comment-15641199
 ] 

Frances Perry commented on BEAM-900:


Amit, could you help find this an owner? Thanks!

> Spark quickstart instructions
> -
>
> Key: BEAM-900
> URL: https://issues.apache.org/jira/browse/BEAM-900
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: Amit Sela
>
> After initial quickstart structure is pushed, add commandlines for Spark 
> execution to quickstart.md and detailed Spark setup instructions to 
> learn/runners/spark.md





[jira] [Updated] (BEAM-900) Spark quickstart instructions

2016-11-05 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-900:
---
Assignee: Amit Sela  (was: James Malone)

> Spark quickstart instructions
> -
>
> Key: BEAM-900
> URL: https://issues.apache.org/jira/browse/BEAM-900
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: Amit Sela
>
> After initial quickstart structure is pushed, add commandlines for Spark 
> execution to quickstart.md and detailed Spark setup instructions to 
> learn/runners/spark.md





[jira] [Comment Edited] (BEAM-899) Flink quickstart instructions

2016-11-05 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15641198#comment-15641198
 ] 

Frances Perry edited comment on BEAM-899 at 11/6/16 5:34 AM:
-

Aljoscha, could you help make sure this finds an owner? Thanks!


was (Author: frances):
Aljosha, could you help make sure this finds an owner? Thanks!

> Flink quickstart instructions
> -
>
> Key: BEAM-899
> URL: https://issues.apache.org/jira/browse/BEAM-899
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: Aljoscha Krettek
>
> After initial quickstart structure is pushed, add commandlines for Flink 
> execution to quickstart.md and detailed Flink setup instructions to 
> learn/runners/flink.md





[jira] [Updated] (BEAM-899) Flink quickstart instructions

2016-11-05 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-899:
---
Assignee: Aljoscha Krettek  (was: James Malone)

> Flink quickstart instructions
> -
>
> Key: BEAM-899
> URL: https://issues.apache.org/jira/browse/BEAM-899
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: Aljoscha Krettek
>
> After initial quickstart structure is pushed, add commandlines for Flink 
> execution to quickstart.md and detailed Flink setup instructions to 
> learn/runners/flink.md





[jira] [Commented] (BEAM-899) Flink quickstart instructions

2016-11-05 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15641198#comment-15641198
 ] 

Frances Perry commented on BEAM-899:


Aljosha, could you help make sure this finds an owner? Thanks!

> Flink quickstart instructions
> -
>
> Key: BEAM-899
> URL: https://issues.apache.org/jira/browse/BEAM-899
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: Aljoscha Krettek
>
> After initial quickstart structure is pushed, add commandlines for Flink 
> execution to quickstart.md and detailed Flink setup instructions to 
> learn/runners/flink.md





[jira] [Created] (BEAM-919) Remove remaining old use/learn links from website src

2016-11-05 Thread Frances Perry (JIRA)
Frances Perry created BEAM-919:
--

 Summary: Remove remaining old use/learn links from website src
 Key: BEAM-919
 URL: https://issues.apache.org/jira/browse/BEAM-919
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone
Priority: Minor


We still have old links lingering after the website refactoring.

For example, the release guide 
(https://github.com/apache/incubator-beam-site/blob/asf-site/src/contribute/release-guide.md)
 still links to "/use/..." in a bunch of places. 

impact: links still work because of redirects, but it's tech debt we should fix.





[jira] [Updated] (BEAM-905) Archetype pom needs to generalize dependencies

2016-11-03 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-905:
---
Affects Version/s: 0.4.0-incubating

> Archetype pom needs to generalize dependencies
> --
>
> Key: BEAM-905
> URL: https://issues.apache.org/jira/browse/BEAM-905
> Project: Beam
>  Issue Type: Bug
>Affects Versions: 0.4.0-incubating
> Environment: Currently the archetype pom includes the direct runner 
> and the dataflow one, but not the others. It should do the same magic as the 
> main examples.
>Reporter: Frances Perry
>Assignee: Pei He
>






[jira] [Commented] (BEAM-909) Starter archetype's pom doesn't include the right dependencies

2016-11-03 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635305#comment-15635305
 ] 

Frances Perry commented on BEAM-909:


Whoops, typo. Meant BEAM-905, which is about the example archetype.

> Starter archetype's pom doesn't include the right dependencies
> --
>
> Key: BEAM-909
> URL: https://issues.apache.org/jira/browse/BEAM-909
> Project: Beam
>  Issue Type: Bug
>Affects Versions: 0.4.0-incubating
>Reporter: Frances Perry
>
> Repro:
> $ mvn archetype:generate -DarchetypeGroupId=org.apache.beam 
> -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-starter 
> -DarchetypeVersion=LATEST -DgroupId=org.example 
> -DartifactId=beam-starter -Dversion="0.1" -DinteractiveMode=false
> The resulting pom doesn't seem to have dependencies on any runners or a 
> profile for enabling them.





[jira] [Created] (BEAM-909) Starter archetype's pom doesn't include the right dependencies

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-909:
--

 Summary: Starter archetype's pom doesn't include the right 
dependencies
 Key: BEAM-909
 URL: https://issues.apache.org/jira/browse/BEAM-909
 Project: Beam
  Issue Type: Bug
Affects Versions: 0.4.0-incubating
Reporter: Frances Perry


Repro:

$ mvn archetype:generate -DarchetypeGroupId=org.apache.beam 
-DarchetypeArtifactId=beam-sdks-java-maven-archetypes-starter 
-DarchetypeVersion=LATEST -DgroupId=org.example 
-DartifactId=beam-starter -Dversion="0.1" -DinteractiveMode=false


The resulting pom doesn't seem to have dependencies on any runners or a profile 
for enabling them.





[jira] [Commented] (BEAM-909) Starter archetype's pom doesn't include the right dependencies

2016-11-03 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635282#comment-15635282
 ] 

Frances Perry commented on BEAM-909:


(Related to BEAM-904, although that one includes only Direct and Dataflow.)

> Starter archetype's pom doesn't include the right dependencies
> --
>
> Key: BEAM-909
> URL: https://issues.apache.org/jira/browse/BEAM-909
> Project: Beam
>  Issue Type: Bug
>Affects Versions: 0.4.0-incubating
>Reporter: Frances Perry
>
> Repro:
> $ mvn archetype:generate -DarchetypeGroupId=org.apache.beam 
> -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-starter 
> -DarchetypeVersion=LATEST -DgroupId=org.example 
> -DartifactId=beam-starter -Dversion="0.1" -DinteractiveMode=false
> The resulting pom doesn't seem to have dependencies on any runners or a 
> profile for enabling them.





[jira] [Commented] (BEAM-899) Flink quickstart instructions

2016-11-03 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634565#comment-15634565
 ] 

Frances Perry commented on BEAM-899:


The quickstart uses archetypes, so unfortunately the instructions will need a 
step to hand edit the pom until BEAM-905 is fixed and released.

> Flink quickstart instructions
> -
>
> Key: BEAM-899
> URL: https://issues.apache.org/jira/browse/BEAM-899
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: James Malone
>
> After initial quickstart structure is pushed, add commandlines for Flink 
> execution to quickstart.md and detailed Flink setup instructions to 
> learn/runners/flink.md





[jira] [Commented] (BEAM-900) Spark quickstart instructions

2016-11-03 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634564#comment-15634564
 ] 

Frances Perry commented on BEAM-900:


The quickstart uses archetypes, so unfortunately the instructions will need a 
step to hand edit the pom until BEAM-905 is fixed and released.

> Spark quickstart instructions
> -
>
> Key: BEAM-900
> URL: https://issues.apache.org/jira/browse/BEAM-900
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: James Malone
>
> After initial quickstart structure is pushed, add commandlines for Spark 
> execution to quickstart.md and detailed Spark setup instructions to 
> learn/runners/spark.md



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-904) Dataflow setup instructions

2016-11-03 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634567#comment-15634567
 ] 

Frances Perry commented on BEAM-904:


If BEAM-899 and BEAM-900 are fixed before this is released, they will likely 
include a hack that will need to be removed.

> Dataflow setup instructions
> ---
>
> Key: BEAM-904
> URL: https://issues.apache.org/jira/browse/BEAM-904
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Frances Perry
>Assignee: Melissa Pashniak
>
> As you are working on the Dataflow Runner page, please include the getting 
> started instructions, as I'm linking there from the quickstart. Thanks!





[jira] [Created] (BEAM-905) Archetype pom needs to generalize dependencies

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-905:
--

 Summary: Archetype pom needs to generalize dependencies
 Key: BEAM-905
 URL: https://issues.apache.org/jira/browse/BEAM-905
 Project: Beam
  Issue Type: Bug
 Environment: Currently the archetype pom includes the direct runner 
and the dataflow one, but not the others. It should do the same magic as the 
main examples.
Reporter: Frances Perry
Assignee: Pei He








[jira] [Created] (BEAM-904) Dataflow setup instructions

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-904:
--

 Summary: Dataflow setup instructions
 Key: BEAM-904
 URL: https://issues.apache.org/jira/browse/BEAM-904
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Frances Perry
Assignee: Melissa Pashniak


As you are working on the Dataflow Runner page, please include the getting 
started instructions, as I'm linking there from the quickstart. Thanks!





[jira] [Created] (BEAM-902) Add runner toggles

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-902:
--

 Summary: Add runner toggles
 Key: BEAM-902
 URL: https://issues.apache.org/jira/browse/BEAM-902
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Frances Perry
Assignee: Abdullah Bashir


As discussed on pull/752, extend the language toggle support to be able to 
toggle commandlines between different runners.





[jira] [Created] (BEAM-900) Spark quickstart instructions

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-900:
--

 Summary: Spark quickstart instructions
 Key: BEAM-900
 URL: https://issues.apache.org/jira/browse/BEAM-900
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Frances Perry
Assignee: James Malone


After initial quickstart structure is pushed, add commandlines for Spark 
execution to quickstart.md and detailed Spark setup instructions to 
learn/runners/spark.md





[jira] [Created] (BEAM-899) Flink quickstart instructions

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-899:
--

 Summary: Flink quickstart instructions
 Key: BEAM-899
 URL: https://issues.apache.org/jira/browse/BEAM-899
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Frances Perry
Assignee: James Malone


After initial quickstart structure is pushed, add commandlines for Flink 
execution to quickstart.md and detailed Flink setup instructions to 
learn/runners/flink.md





[jira] [Created] (BEAM-895) Transport.newStorageClient requires credentials

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-895:
--

 Summary: Transport.newStorageClient requires credentials
 Key: BEAM-895
 URL: https://issues.apache.org/jira/browse/BEAM-895
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Frances Perry
Assignee: Davor Bonaci
 Fix For: 0.4.0-incubating


Transport.newStorageClient requires credentials, even if those aren't needed.

Impact: Examples use publicly accessible files on Google Cloud Storage, but 
reading those files still requires the user to authenticate with Google Cloud 
Storage.

java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Unable to get application default credentials. Please see https://developers.google.com/accounts/docs/application-default-credentials for details on how to specify credentials. This version of the SDK is dependent on the gcloud core component version 2015.02.05 or newer to be able to get credentials from the currently authorized user via gcloud auth.
    at org.apache.beam.sdk.util.Credentials.getCredential(Credentials.java:123)
    at org.apache.beam.sdk.util.GcpCredentialFactory.getCredential(GcpCredentialFactory.java:43)
    at org.apache.beam.sdk.options.GcpOptions$GcpUserCredentialsFactory.create(GcpOptions.java:264)
    at org.apache.beam.sdk.options.GcpOptions$GcpUserCredentialsFactory.create(GcpOptions.java:254)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.returnDefaultHelper(ProxyInvocationHandler.java:549)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.getDefault(ProxyInvocationHandler.java:490)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:152)
    at com.sun.proxy.$Proxy52.getGcpCredential(Unknown Source)
    at org.apache.beam.sdk.util.Transport.newStorageClient(Transport.java:148)
    at org.apache.beam.sdk.util.GcsUtil$GcsUtilFactory.create(GcsUtil.java:96)
    at org.apache.beam.sdk.util.GcsUtil$GcsUtilFactory.create(GcsUtil.java:84)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.returnDefaultHelper(ProxyInvocationHandler.java:549)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.getDefault(ProxyInvocationHandler.java:490)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:152)
    at com.sun.proxy.$Proxy52.getGcsUtil(Unknown Source)
    at org.apache.beam.sdk.util.GcsIOChannelFactory.match(GcsIOChannelFactory.java:43)
    at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:283)
    at org.apache.beam.sdk.io.TextIO$Read$Bound.apply(TextIO.java:195)
    at org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
    at org.apache.beam.runners.direct.DirectRunner.apply(DirectRunner.java:226)
    at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:400)
    at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:323)
    at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:58)
    at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:173)
    at org.apache.beam.examples.WordCount.main(WordCount.java:195)
    ... 6 more
Caused by: java.io.IOException: The Application Default Credentials are not available. They are available if running on Google App Engine, Google Compute Engine, or Google Cloud Shell. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
    at com.google.api.client.googleapis.auth.oauth2.DefaultCredentialProvider.getDefaultCredential(DefaultCredentialProvider.java:98)
    at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.getApplicationDefault(GoogleCredential.java:213)
    at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.getApplicationDefault(GoogleCredential.java:191)
    at org.apache.beam.sdk.util.Credentials.getCredential(Credentials.java:121)
    ... 30 more






[jira] [Assigned] (BEAM-892) revamp quickstart

2016-11-03 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-892:
--

Assignee: Frances Perry  (was: James Malone)

> revamp quickstart
> -
>
> Key: BEAM-892
> URL: https://issues.apache.org/jira/browse/BEAM-892
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
>
> We need to make this quickstart actually a quickstart!





[jira] [Created] (BEAM-892) revamp quickstart

2016-11-03 Thread Frances Perry (JIRA)
Frances Perry created BEAM-892:
--

 Summary: revamp quickstart
 Key: BEAM-892
 URL: https://issues.apache.org/jira/browse/BEAM-892
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


We need to make this quickstart actually a quickstart!





[jira] [Updated] (BEAM-752) infrastructure for toggling code snippets in documentation

2016-11-01 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-752:
---
Assignee: Abdullah Bashir  (was: James Malone)

> infrastructure for toggling code snippets in documentation
> --
>
> Key: BEAM-752
> URL: https://issues.apache.org/jira/browse/BEAM-752
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Abdullah Bashir
>  Labels: starter
>
> Once the python sdk gets merged to the master branch, a lot of our 
> documentation (programming guide, walkthroughs, etc) will need to support 
> multiple languages.
> The hope is that the vast bulk of the prose can be written about Beam 
> concepts in a language independent way. But for code snippets it would be 
> great to be able to toggle languages.
> Goals:
> * Support tabbed language toggles for both code and small sections of text.
> * Support easily changing the default per-user-visit so that the entire file 
> (or even better entire site) defaults to showing a specific language





[jira] [Updated] (BEAM-842) dependency.py: package not found when running on Windows

2016-10-30 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-842:
---
Assignee: Ahmet Altay  (was: Frances Perry)

> dependency.py: package not found when running on Windows
> 
>
> Key: BEAM-842
> URL: https://issues.apache.org/jira/browse/BEAM-842
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 0.4.0-incubating
> Environment: Windows 10, Python 2.7.11
>Reporter: Matthias Baetens
>Assignee: Ahmet Altay
>Priority: Minor
>  Labels: newbie
>
> When splitting your pipeline into multiple files and configuring your 
> project according to the Juliaset example 
> (https://cloud.google.com/dataflow/pipelines/dependencies-python#multiple-file-dependencies),
>  the pipeline still crashes on Windows.
> This happens because setuptools defaults to producing a .zip on Windows, while 
> the current Beam code looks for a .tar.gz (dependency.py, line 400). Changing 
> this line to: output_files = glob.glob(os.path.join(temp_dir, '*.zip')) makes 
> it work. 
> Suggestion: checking the OS would probably solve this issue. 
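
As an alternative to branching on the OS, a lookup could accept either archive format. The sketch below is only an illustration of that approach, not the code in dependency.py; `find_sdist_archives` and the file name `my_package-0.1.zip` are hypothetical.

```python
import glob
import os
import tempfile


def find_sdist_archives(temp_dir):
    """Return source-distribution archives in temp_dir, whichever format was built."""
    matches = []
    for pattern in ('*.tar.gz', '*.zip'):
        matches.extend(glob.glob(os.path.join(temp_dir, pattern)))
    return sorted(matches)


# Simulate a build that produced only a .zip archive.
with tempfile.TemporaryDirectory() as temp_dir:
    open(os.path.join(temp_dir, 'my_package-0.1.zip'), 'w').close()
    found = [os.path.basename(path) for path in find_sdist_archives(temp_dir)]
```

Matching on the output rather than on the platform also covers the case where the archive format is overridden in the setuptools configuration.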





[jira] [Created] (BEAM-835) add IntelliJ instructions to the contribution guide

2016-10-25 Thread Frances Perry (JIRA)
Frances Perry created BEAM-835:
--

 Summary: add IntelliJ instructions to the contribution guide
 Key: BEAM-835
 URL: https://issues.apache.org/jira/browse/BEAM-835
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Priority: Minor


Add IntelliJ-specific instructions to the contribution guide, to go alongside 
the Eclipse instructions.





[jira] [Updated] (BEAM-193) Port existing Dataflow SDK documentation to Beam Programming Guide

2016-10-25 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-193:
---
Assignee: Melissa Pashniak  (was: Devin Donnelly)

> Port existing Dataflow SDK documentation to Beam Programming Guide
> --
>
> Key: BEAM-193
> URL: https://issues.apache.org/jira/browse/BEAM-193
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Devin Donnelly
>Assignee: Melissa Pashniak
>
> There is an extensive amount of documentation on the Dataflow SDK programming 
> model and classes. Port this documentation over as a new Beam Programming 
> Guide covering the following major topics:
> - Programming model overview
> - Pipeline structure
> - PCollections
> - Transforms
> - I/O



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-505) Fill in the learn/runners/direct portion of the website

2016-10-25 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-505:
---
Assignee: Melissa Pashniak  (was: James Malone)

> Fill in the learn/runners/direct portion of the website
> ---
>
> Key: BEAM-505
> URL: https://issues.apache.org/jira/browse/BEAM-505
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Frances Perry
>Assignee: Melissa Pashniak
>
> As per 
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
> Should be a landing page for the Direct runner



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-508) Fill in the learn/runners/dataflow portion of the website

2016-10-25 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-508:
---
Assignee: Melissa Pashniak  (was: James Malone)

> Fill in the learn/runners/dataflow portion of the website
> -
>
> Key: BEAM-508
> URL: https://issues.apache.org/jira/browse/BEAM-508
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Frances Perry
>Assignee: Melissa Pashniak
>
> As per 
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
> Should be a landing page for Dataflow-runner-specific content



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-749) Syntax highlight on website

2016-10-20 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593317#comment-15593317
 ] 

Frances Perry commented on BEAM-749:


Confirmed James isn't currently working on this. Reassigning to myself.

> Syntax highlight on website
> ---
>
> Key: BEAM-749
> URL: https://issues.apache.org/jira/browse/BEAM-749
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: James Malone
>
> We should be able to enable Rouge on the website in order to get syntax 
> highlighting in the programming guide, walkthroughs, etc.
> https://jekyllrb.com/docs/templates/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (BEAM-749) Syntax highlight on website

2016-10-20 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-749:
--

Assignee: Frances Perry  (was: James Malone)

> Syntax highlight on website
> ---
>
> Key: BEAM-749
> URL: https://issues.apache.org/jira/browse/BEAM-749
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
>
> We should be able to enable Rouge on the website in order to get syntax 
> highlighting in the programming guide, walkthroughs, etc.
> https://jekyllrb.com/docs/templates/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-602) make feature branches more discoverable

2016-10-20 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry resolved BEAM-602.

   Resolution: Fixed
Fix Version/s: Not applicable

> make feature branches more discoverable
> ---
>
> Key: BEAM-602
> URL: https://issues.apache.org/jira/browse/BEAM-602
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
> Fix For: Not applicable
>
>
> We have great things happening on feature branches, but they are a bit hidden.
> - update the contribution guide to add instructions for working on branches
> - add a page under contribute/ that lists the feature branches, links to 
> their JIRAs, etc.
> - add a quick link from pages in use/ and learn/ to help make this 
> discoverable for adventurous users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-721) Travis CI fails to run Python tox tests on Mac

2016-10-16 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry resolved BEAM-721.

   Resolution: Fixed
Fix Version/s: Not applicable

> Travis CI fails to run Python tox tests on Mac
> --
>
> Key: BEAM-721
> URL: https://issues.apache.org/jira/browse/BEAM-721
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
> Environment: Mac
>Reporter: Pablo Estrada
>Assignee: Frances Perry
> Fix For: Not applicable
>
>
> Some Travis CI runs on Mac are failing because the test script cannot find 
> tox.
> See: https://travis-ci.org/apache/incubator-beam/jobs/165306424#L86
> The travis.yml file does attempt to install tox (See: 
> https://github.com/apache/incubator-beam/blob/python-sdk/.travis.yml#L66)
> Looking at the logs, it seems that tox is available in a different directory 
> (/usr/local), and TOX_HOME is set to $HOME/Library/Python/2.7/bin.
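
One way to avoid depending on a hard-coded TOX_HOME is to resolve the executable from PATH first and only then fall back to known directories. The following is a minimal sketch under stated assumptions, not the fix that was applied to .travis.yml: it assumes Python 3 (the Travis job above used Python 2.7), and the helper name `find_executable` and the `/usr/local/bin` candidate directory are hypothetical.

```python
import os
import shutil


def find_executable(name, extra_dirs=()):
    """Return the full path to `name`, or None if it cannot be found.

    Searches PATH first, then each directory in `extra_dirs` in order.
    """
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        candidate = os.path.join(os.path.expanduser(directory), name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical candidates based on the two locations mentioned in the report.
    print(find_executable("tox", ["/usr/local/bin", "~/Library/Python/2.7/bin"]))
```

A build script could call such a helper once and fail early with a clear message when it returns None, rather than failing later when tox is invoked.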



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (BEAM-753) Travis failure (cannot import name locked_file)

2016-10-16 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reopened BEAM-753:


(doh, wrong tab ;-) )

> Travis failure (cannot import name locked_file)
> ---
>
> Key: BEAM-753
> URL: https://issues.apache.org/jira/browse/BEAM-753
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
> Fix For: Not applicable
>
>
> ERROR: Failure: ImportError (cannot import name locked_file)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/nose-1.3.7-py2.7.egg/nose/loader.py", line 418, in loadTestsFromName
>     addr.filename, addr.module)
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/nose-1.3.7-py2.7.egg/nose/importer.py", line 47, in importFromPath
>     return self.importFromDir(dir_path, fqname)
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/nose-1.3.7-py2.7.egg/nose/importer.py", line 94, in importFromDir
>     mod = load_module(part_fqname, fh, filename, desc)
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/__init__.py", line 78, in <module>
>     from apache_beam import io
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/__init__.py", line 21, in <module>
>     from apache_beam.io.avroio import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/avroio.py", line 29, in <module>
>     from apache_beam.io import filebasedsource
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/filebasedsource.py", line 31, in <module>
>     from apache_beam.io import concat_source
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/concat_source.py", line 24, in <module>
>     from apache_beam.io import iobase
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/iobase.py", line 818, in <module>
>     from apache_beam.runners.dataflow.native_io.iobase import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/runners/__init__.py", line 23, in <module>
>     from apache_beam.runners.dataflow_runner import DataflowPipelineRunner
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/runners/dataflow_runner.py", line 43, in <module>
>     from apache_beam.internal.clients import dataflow as dataflow_api
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/internal/clients/dataflow/__init__.py", line 23, in <module>
>     from apitools.base.py import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/.tox/py27/local/lib/python2.7/site-packages/apitools/base/py/__init__.py", line 22, in <module>
>     from apitools.base.py.credentials_lib import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/.tox/py27/local/lib/python2.7/site-packages/apitools/base/py/credentials_lib.py", line 50, in <module>
>     from oauth2client import locked_file
> ImportError: cannot import name locked_file
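
The last frame of the traceback shows apitools importing `locked_file` from oauth2client, which the installed oauth2client does not provide. A small check like the one below can confirm whether an installed package exposes a given name before the test run reaches it. This is only a sketch assuming Python 3; the helper name `module_provides` is hypothetical and is not part of apitools, oauth2client, or Beam.

```python
import importlib


def module_provides(module_name, attr):
    """Return True if `from module_name import attr` would succeed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    if hasattr(module, attr):
        return True
    # `attr` may be a submodule that has not been imported yet.
    try:
        importlib.import_module(module_name + "." + attr)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Reports whether the import that failed in the traceback would work here.
    print(module_provides("oauth2client", "locked_file"))
```

Running the check inside the tox virtualenv, rather than the system interpreter, tests the same package versions that the failing run picked up.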



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-753) Travis failure (cannot import name locked_file)

2016-10-16 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry resolved BEAM-753.

   Resolution: Fixed
Fix Version/s: Not applicable

> Travis failure (cannot import name locked_file)
> ---
>
> Key: BEAM-753
> URL: https://issues.apache.org/jira/browse/BEAM-753
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
> Fix For: Not applicable
>
>
> ERROR: Failure: ImportError (cannot import name locked_file)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/nose-1.3.7-py2.7.egg/nose/loader.py", line 418, in loadTestsFromName
>     addr.filename, addr.module)
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/nose-1.3.7-py2.7.egg/nose/importer.py", line 47, in importFromPath
>     return self.importFromDir(dir_path, fqname)
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/nose-1.3.7-py2.7.egg/nose/importer.py", line 94, in importFromDir
>     mod = load_module(part_fqname, fh, filename, desc)
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/__init__.py", line 78, in <module>
>     from apache_beam import io
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/__init__.py", line 21, in <module>
>     from apache_beam.io.avroio import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/avroio.py", line 29, in <module>
>     from apache_beam.io import filebasedsource
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/filebasedsource.py", line 31, in <module>
>     from apache_beam.io import concat_source
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/concat_source.py", line 24, in <module>
>     from apache_beam.io import iobase
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/io/iobase.py", line 818, in <module>
>     from apache_beam.runners.dataflow.native_io.iobase import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/runners/__init__.py", line 23, in <module>
>     from apache_beam.runners.dataflow_runner import DataflowPipelineRunner
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/runners/dataflow_runner.py", line 43, in <module>
>     from apache_beam.internal.clients import dataflow as dataflow_api
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/apache_beam/internal/clients/dataflow/__init__.py", line 23, in <module>
>     from apitools.base.py import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/.tox/py27/local/lib/python2.7/site-packages/apitools/base/py/__init__.py", line 22, in <module>
>     from apitools.base.py.credentials_lib import *
>   File "/usr/local/google/home/altay/Desktop/beam/test/incubator-beam/sdks/python/.tox/py27/local/lib/python2.7/site-packages/apitools/base/py/credentials_lib.py", line 50, in <module>
>     from oauth2client import locked_file
> ImportError: cannot import name locked_file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-751) infrastructure for extracting code snippets into documentation

2016-10-14 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-751:
---
Issue Type: Improvement  (was: Bug)

> infrastructure for extracting code snippets into documentation
> --
>
> Key: BEAM-751
> URL: https://issues.apache.org/jira/browse/BEAM-751
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: James Malone
>  Labels: starter
>
> As we fill in more and more documentation, the number of code snippets is 
> going to drastically increase, and we should ensure the quality of those 
> snippets by automatically extracting them from code that is regularly 
> compiled and tested.  
> Goals:
> * automatically extract code snippets from incubator-beam for use in the beam 
> website documentation
> * use stable references so folks editing the code can clearly tell what 
> documentation changes this will result in (good: specially formatted comment, 
> bad: line number)
> * freshness (is live possible? or at least during the general 'jekyll build' 
> phase?)
> The best we've found so far is using jekyll-gist with gist-it, but that would 
> rely on fragile line numbers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-749) Syntax highlight on website

2016-10-14 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-749:
---
Issue Type: Improvement  (was: Bug)

> Syntax highlight on website
> ---
>
> Key: BEAM-749
> URL: https://issues.apache.org/jira/browse/BEAM-749
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: James Malone
>
> We should be able to enable Rouge on the website in order to get syntax 
> highlighting in the programming guide, walkthroughs, etc.
> https://jekyllrb.com/docs/templates/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-752) infrastructure for toggling code snippets in documentation

2016-10-14 Thread Frances Perry (JIRA)
Frances Perry created BEAM-752:
--

 Summary: infrastructure for toggling code snippets in documentation
 Key: BEAM-752
 URL: https://issues.apache.org/jira/browse/BEAM-752
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Frances Perry
Assignee: James Malone


Once the python sdk gets merged to the master branch, a lot of our 
documentation (programming guide, walkthroughs, etc) will need to support 
multiple languages.

The hope is that the vast bulk of the prose can be written about Beam concepts 
in a language independent way. But for code snippets it would be great to be 
able to toggle languages.

Goals:
* Support tabbed language toggles for both code and small sections of text.
* Support easily changing the default per-user-visit so that the entire file 
(or even better entire site) defaults to showing a specific language



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-751) infrastructure for extracting code snippets into documentation

2016-10-14 Thread Frances Perry (JIRA)
Frances Perry created BEAM-751:
--

 Summary: infrastructure for extracting code snippets into 
documentation
 Key: BEAM-751
 URL: https://issues.apache.org/jira/browse/BEAM-751
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As we fill in more and more documentation, the number of code snippets is going 
to drastically increase, and we should ensure the quality of those snippets by 
automatically extracting them from code that is regularly compiled and tested.  

Goals:
* automatically extract code snippets from incubator-beam for use in the beam 
website documentation
* use stable references so folks editing the code can clearly tell what 
documentation changes this will result in (good: specially formatted comment, 
bad: line number)
* freshness (is live possible? or at least during the general 'jekyll build' 
phase?)

The best we've found so far is using jekyll-gist with gist-it, but that would 
rely on fragile line numbers.
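
To illustrate the "specially formatted comment" option from the goals above, here is a minimal sketch of a marker-based extractor. The `[START name]` / `[END name]` marker syntax and the function name `extract_snippets` are assumptions made for this example; the issue does not prescribe a format. Because snippets are looked up by tag rather than by line number, editing the surrounding code does not change which lines end up in the documentation.

```python
import re

# Hypothetical marker syntax: a comment containing [START tag] or [END tag].
START = re.compile(r"\[START\s+(\w+)\]")
END = re.compile(r"\[END\s+(\w+)\]")


def extract_snippets(source):
    """Map each tag to the text between its [START tag] and [END tag] lines."""
    snippets = {}
    open_tags = {}
    for line in source.splitlines():
        start = START.search(line)
        end = END.search(line)
        if start:
            open_tags[start.group(1)] = []
        elif end:
            tag = end.group(1)
            if tag not in open_tags:
                raise ValueError("END without START: " + tag)
            snippets[tag] = "\n".join(open_tags.pop(tag))
        else:
            # Marker lines themselves are never copied into a snippet.
            for body in open_tags.values():
                body.append(line)
    if open_tags:
        raise ValueError("unclosed tags: " + ", ".join(sorted(open_tags)))
    return snippets
```

A site build step could run this over the tested source files and substitute the result wherever a page names a tag, which would also cover the freshness goal if it runs during the 'jekyll build' phase.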




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-749) Syntax highlight on website

2016-10-13 Thread Frances Perry (JIRA)
Frances Perry created BEAM-749:
--

 Summary: Syntax highlight on website
 Key: BEAM-749
 URL: https://issues.apache.org/jira/browse/BEAM-749
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


We should be able to enable Rouge on the website in order to get syntax 
highlighting in the programming guide, walkthroughs, etc.

https://jekyllrb.com/docs/templates/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-728) Javadoc should clearly separate facts from runner requirements

2016-10-07 Thread Frances Perry (JIRA)
Frances Perry created BEAM-728:
--

 Summary: Javadoc should clearly separate facts from runner 
requirements
 Key: BEAM-728
 URL: https://issues.apache.org/jira/browse/BEAM-728
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Frances Perry
Assignee: Davor Bonaci


The javadoc for View.asMap() says the map needs to fit in memory. That's not 
true in all runners. (For example, Dataflow has distributed map support.) 

https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/View.java

This is likely just one specific case of a more general issue -- different 
runners will have common constraints on the scalability of portions of the 
model. Currently these are documented in the capability matrix on the website, 
but for usability we should consider surfacing these constraints on 
particularly relevant methods. But keeping things in sync in multiple locations 
is hard...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-570) Update AvroSource to support more compression types

2016-10-06 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553995#comment-15553995
 ] 

Frances Perry commented on BEAM-570:


Assigning to Konstantinos to follow up after #1053 is in.

> Update AvroSource to support more compression types
> ---
>
> Key: BEAM-570
> URL: https://issues.apache.org/jira/browse/BEAM-570
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Konstantinos Katsiapis
>
> Python AvroSource [1] currently only supports 'deflate' compression. We should 
> update it to support other compression types supported by the Avro library 
> (e.g.: snappy, bzip2).
> [1] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/avroio.py



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-570) Update AvroSource to support more compression types

2016-10-06 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-570:
---
Assignee: Konstantinos Katsiapis

> Update AvroSource to support more compression types
> ---
>
> Key: BEAM-570
> URL: https://issues.apache.org/jira/browse/BEAM-570
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Konstantinos Katsiapis
>
> Python AvroSource [1] currently only supports 'deflate' compression. We should 
> update it to support other compression types supported by the Avro library 
> (e.g.: snappy, bzip2).
> [1] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/avroio.py



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-602) make feature branches more discoverable

2016-08-29 Thread Frances Perry (JIRA)
Frances Perry created BEAM-602:
--

 Summary: make feature branches more discoverable
 Key: BEAM-602
 URL: https://issues.apache.org/jira/browse/BEAM-602
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: Frances Perry


We have great things happening on feature branches, but they are a bit hidden.

- update the contribution guide to add instructions for working on branches
- add a page under contribute/ that lists the feature branches, links to their 
JIRAs, etc.
- add a quick link from pages in use/ and learn/ to help make this discoverable 
for adventurous users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-590) Port examples web docs from Dataflow to Beam website.

2016-08-25 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437677#comment-15437677
 ] 

Frances Perry commented on BEAM-590:


Is this a dup of BEAM-194?

> Port examples web docs from Dataflow to Beam website.
> -
>
> Key: BEAM-590
> URL: https://issues.apache.org/jira/browse/BEAM-590
> Project: Beam
>  Issue Type: New Feature
>  Components: examples-java
>Reporter: Pei He
>Priority: Minor
>
> I am removing references to dataflow website in examples, such as:
> https://cloud.google.com/dataflow/java-sdk/wordcount-example
> Creating this issue to track web docs that we might want to port to Beam.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-276) Add PCollections Section

2016-08-24 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry closed BEAM-276.
--
   Resolution: Fixed
Fix Version/s: Not applicable

Done by Devin: 
http://beam.incubator.apache.org/learn/programming-guide/#pcollection

> Add PCollections Section
> 
>
> Key: BEAM-276
> URL: https://issues.apache.org/jira/browse/BEAM-276
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Devin Donnelly
> Fix For: Not applicable
>
>
> Add section with overview and usage of PCollection class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-277) Add Transforms Section

2016-08-24 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435328#comment-15435328
 ] 

Frances Perry commented on BEAM-277:


Partially completed: 
http://beam.incubator.apache.org/learn/programming-guide/#transforms 

> Add Transforms Section
> --
>
> Key: BEAM-277
> URL: https://issues.apache.org/jira/browse/BEAM-277
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Devin Donnelly
>
> Document general transforms usage and ParDo usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-275) Add Pipelines Section

2016-08-24 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry closed BEAM-275.
--
   Resolution: Fixed
Fix Version/s: Not applicable

Completed by Devin: 
http://beam.incubator.apache.org/learn/programming-guide/#pipeline

> Add Pipelines Section
> -
>
> Key: BEAM-275
> URL: https://issues.apache.org/jira/browse/BEAM-275
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Devin Donnelly
> Fix For: Not applicable
>
>
> Document overview and usage of Pipeline object, including creation and 
> options assignment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-274) Add Programming Guide Skeleton

2016-08-24 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry closed BEAM-274.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> Add Programming Guide Skeleton
> --
>
> Key: BEAM-274
> URL: https://issues.apache.org/jira/browse/BEAM-274
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Devin Donnelly
> Fix For: Not applicable
>
>
> Create headings, front matter, and a table of contents (TOC).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-274) Add Programming Guide Skeleton

2016-08-24 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435322#comment-15435322
 ] 

Frances Perry commented on BEAM-274:


Looks like this was already completed: 
http://beam.incubator.apache.org/learn/programming-guide/

Sorry for the miscommunication. I'll do a pass over Devin's issues and close 
the ones he finished.

> Add Programming Guide Skeleton
> --
>
> Key: BEAM-274
> URL: https://issues.apache.org/jira/browse/BEAM-274
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Devin Donnelly
> Fix For: Not applicable
>
>
> Create headings, front matter, and a table of contents (TOC).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-566) Implement proposal process

2016-08-18 Thread Frances Perry (JIRA)
Frances Perry created BEAM-566:
--

 Summary: Implement proposal process
 Key: BEAM-566
 URL: https://issues.apache.org/jira/browse/BEAM-566
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: Frances Perry


As discussed on the dev list...

- Update contribution guide to explain what the design doc / proposal should 
include (like is done in 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals)
- Clearly track the open proposals (potentially in JIRA with a known label and 
incrementing proposal IDs).
- Set expectations around the timelines for proposals -- both to ensure enough 
feedback is gathered and perhaps inactive proposals are archived.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-555) Documentation in BigQueryIO.java has awkward cut-and-paste error.

2016-08-16 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-555:
---
Assignee: Frank Yellin  (was: Davor Bonaci)

> Documentation in BigQueryIO.java has awkward cut-and-paste error.
> -
>
> Key: BEAM-555
> URL: https://issues.apache.org/jira/browse/BEAM-555
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Frank Yellin
>Assignee: Frank Yellin
>Priority: Trivial
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> Twice in the documentation, the sample code reads from 
> samples.weather_stations and calls the resulting TableRow "shakespeare".
> I suspect that these lines of code were copied from a different example, and 
> then only partially modified.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-556) typo in documentation

2016-08-16 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-556:
---
Assignee: Frank Yellin  (was: Frances Perry)

> typo in documentation
> -
>
> Key: BEAM-556
> URL: https://issues.apache.org/jira/browse/BEAM-556
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Frank Yellin
>Assignee: Frank Yellin
>Priority: Trivial
>   Original Estimate: 2m
>  Remaining Estimate: 2m
>
> transform.py:
> ergument -> argument  
> in documentation for parse_label_and_args



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-541) Add more documentation on Java DoFn Annotations

2016-08-09 Thread Frances Perry (JIRA)
Frances Perry created BEAM-541:
--

 Summary: Add more documentation on Java DoFn Annotations
 Key: BEAM-541
 URL: https://issues.apache.org/jira/browse/BEAM-541
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone
Priority: Minor


https://github.com/apache/incubator-beam-site/pull/36 made the basic 
documentation changes that correspond to BEAM-498, but we should add more 
details on how to use the advanced configurations for window access, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-515) Add feature logo and incubator logo

2016-08-02 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405326#comment-15405326
 ] 

Frances Perry commented on BEAM-515:


Submitted https://github.com/apache/incubator-beam-site/pull/33

> Add feature logo and incubator logo
> ---
>
> Key: BEAM-515
> URL: https://issues.apache.org/jira/browse/BEAM-515
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Frances Perry
>Priority: Critical
>
> Excerpt from: 
> http://mail-archives.apache.org/mod_mbox/incubator-general/201608.mbox/%3C7E0226B1-0386-499C-8473-61A8E51A691B%40classsoftware.com%3E
>  A feather ASF logo would be a nice addition as well. [4]
> http://www.apache.org/foundation/press/kit/#links
> While we're in there, I believe we still need to add the Apache Incubator egg 
> logo. http://incubator.apache.org/images/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (BEAM-514) Add all mandatory links

2016-08-02 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-514:
--

Assignee: Frances Perry  (was: James Malone)

> Add all mandatory links
> ---
>
> Key: BEAM-514
> URL: https://issues.apache.org/jira/browse/BEAM-514
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Frances Perry
>
> Excerpt from: 
> http://mail-archives.apache.org/mod_mbox/incubator-general/201608.mbox/%3C7E0226B1-0386-499C-8473-61A8E51A691B%40classsoftware.com%3E
> > Branding wise I think you are missing a few of the
> required links [3] including a link back to the Apache homepage.
> http://www.apache.org/foundation/marks/pmcs.html#navigation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (BEAM-515) Add feature logo and incubator logo

2016-08-02 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-515:
--

Assignee: Frances Perry  (was: James Malone)

> Add feature logo and incubator logo
> ---
>
> Key: BEAM-515
> URL: https://issues.apache.org/jira/browse/BEAM-515
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Frances Perry
>Priority: Critical
>
> Excerpt from: 
> http://mail-archives.apache.org/mod_mbox/incubator-general/201608.mbox/%3C7E0226B1-0386-499C-8473-61A8E51A691B%40classsoftware.com%3E
>  A feather ASF logo would be a nice addition as well. [4]
> http://www.apache.org/foundation/press/kit/#links
> While we're in there, I believe we still need to add the Apache Incubator egg 
> logo. http://incubator.apache.org/images/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-516) Update navigation for Javadoc

2016-08-02 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404687#comment-15404687
 ] 

Frances Perry commented on BEAM-516:


Added a basic link. Didn't do anything fancy yet with a latest link -- so 
leaving this bug to track that.

> Update navigation for Javadoc 
> --
>
> Key: BEAM-516
> URL: https://issues.apache.org/jira/browse/BEAM-516
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ismaël Mejía
>Assignee: Frances Perry
>Priority: Minor
> Attachments: screenshot.png
>
>
> The link to the latest version of the Java documentation disappeared with the 
> recent changes to the website.





[jira] [Updated] (BEAM-516) Update navigation for Javadoc

2016-08-02 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-516:
---
Assignee: James Malone  (was: Frances Perry)

> Update navigation for Javadoc 
> --
>
> Key: BEAM-516
> URL: https://issues.apache.org/jira/browse/BEAM-516
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ismaël Mejía
>Assignee: James Malone
>Priority: Minor
> Attachments: screenshot.png
>
>
> The link to the latest version of the Java documentation disappeared with the 
> recent changes to the website.





[jira] [Closed] (BEAM-500) Update website layout

2016-08-02 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry closed BEAM-500.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> Update website layout
> -
>
> Key: BEAM-500
> URL: https://issues.apache.org/jira/browse/BEAM-500
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
> Fix For: Not applicable
>
>
> As discussed on dev@, update the website layout to use this:
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Commented] (BEAM-500) Update website layout

2016-08-02 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404682#comment-15404682
 ] 

Frances Perry commented on BEAM-500:


Filed bugs to fill in remaining missing content. Closing this root issue.

> Update website layout
> -
>
> Key: BEAM-500
> URL: https://issues.apache.org/jira/browse/BEAM-500
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
>
> As discussed on dev@, update the website layout to use this:
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Commented] (BEAM-516) The Javadoc link disappeared in the website refactoring

2016-08-02 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404075#comment-15404075
 ] 

Frances Perry commented on BEAM-516:


[~jbonofre] Wow, that looks awesome ;-)

[~iemejia] Thanks for the report! I'll assign this to myself, since I just 
created a Java SDK subdirectory as part of BEAM-500. In the meantime, the 
workaround is to go to the URL directly: 
http://beam.incubator.apache.org/javadoc/0.1.0-incubating/

> The Javadoc link disappeared in the website refactoring
> ---
>
> Key: BEAM-516
> URL: https://issues.apache.org/jira/browse/BEAM-516
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ismaël Mejía
>Assignee: James Malone
>Priority: Minor
> Attachments: screenshot.png
>
>
> The link to the latest version of the Java documentation disappeared with the 
> recent changes to the website.





[jira] [Created] (BEAM-512) Fill in the contribute/testing section of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-512:
--

 Summary: Fill in the contribute/testing section of the website
 Key: BEAM-512
 URL: https://issues.apache.org/jira/browse/BEAM-512
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Created] (BEAM-511) Fill in the contribute/technical-vision section of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-511:
--

 Summary: Fill in the contribute/technical-vision section of the 
website
 Key: BEAM-511
 URL: https://issues.apache.org/jira/browse/BEAM-511
 Project: Beam
  Issue Type: Bug
Reporter: Frances Perry


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Created] (BEAM-509) Fill in the learn/resources portion of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-509:
--

 Summary: Fill in the learn/resources portion of the website
 Key: BEAM-509
 URL: https://issues.apache.org/jira/browse/BEAM-509
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit

Do a nicer curation of great Beam articles, videos, etc.





[jira] [Created] (BEAM-508) Fill in the learn/runners/dataflow portion of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-508:
--

 Summary: Fill in the learn/runners/dataflow portion of the website
 Key: BEAM-508
 URL: https://issues.apache.org/jira/browse/BEAM-508
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
Should be a landing page for Dataflow-runner-specific content





[jira] [Created] (BEAM-507) Fill in the learn/runners/spark portion of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-507:
--

 Summary: Fill in the learn/runners/spark portion of the website
 Key: BEAM-507
 URL: https://issues.apache.org/jira/browse/BEAM-507
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
Should be a landing page for Spark-specific information.





[jira] [Created] (BEAM-506) Fill in the learn/runners/flink portion of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-506:
--

 Summary: Fill in the learn/runners/flink portion of the website
 Key: BEAM-506
 URL: https://issues.apache.org/jira/browse/BEAM-506
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
Should be a landing page for Flink-specific details





[jira] [Created] (BEAM-505) Fill in the learn/runners/direct portion of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-505:
--

 Summary: Fill in the learn/runners/direct portion of the website
 Key: BEAM-505
 URL: https://issues.apache.org/jira/browse/BEAM-505
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.

Should be a landing page for the Direct runner





[jira] [Updated] (BEAM-504) Fill in the learn/sdks/java portion of the website

2016-08-01 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-504:
---
Summary: Fill in the learn/sdks/java portion of the website  (was: Fill in 
use/sdks/java portion of the website)

> Fill in the learn/sdks/java portion of the website
> --
>
> Key: BEAM-504
> URL: https://issues.apache.org/jira/browse/BEAM-504
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Frances Perry
>Assignee: James Malone
>
> As per 
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
> Should be a landing page for Java-SDK-specific content like existing IO 
> connectors, javadoc, etc.





[jira] [Created] (BEAM-504) Fill in use/sdks/java portion of the website

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-504:
--

 Summary: Fill in use/sdks/java portion of the website
 Key: BEAM-504
 URL: https://issues.apache.org/jira/browse/BEAM-504
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Frances Perry
Assignee: James Malone


As per 
https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.

Should be a landing page for Java-SDK-specific content like existing IO 
connectors, javadoc, etc.





[jira] [Updated] (BEAM-500) Update website layout

2016-08-01 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-500:
---
Summary: Update website layout  (was: Update website layou)

> Update website layout
> -
>
> Key: BEAM-500
> URL: https://issues.apache.org/jira/browse/BEAM-500
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
>
> As discussed on dev@, update the website layout to use this:
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Commented] (BEAM-500) Update website layout

2016-08-01 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402601#comment-15402601
 ] 

Frances Perry commented on BEAM-500:


(To be clear, this is just the page/navigation structure. The skin / main page 
is covered in BEAM-501.)

> Update website layout
> -
>
> Key: BEAM-500
> URL: https://issues.apache.org/jira/browse/BEAM-500
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
>
> As discussed on dev@, update the website layout to use this:
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Commented] (BEAM-500) Update website layout

2016-08-01 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402598#comment-15402598
 ] 

Frances Perry commented on BEAM-500:


Devin started this process in 
https://github.com/apache/incubator-beam-site/pull/25

I'll do the next round.

> Update website layout
> -
>
> Key: BEAM-500
> URL: https://issues.apache.org/jira/browse/BEAM-500
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Frances Perry
>
> As discussed on dev@, update the website layout to use this:
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Created] (BEAM-501) Update website skin

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-501:
--

 Summary: Update website skin
 Key: BEAM-501
 URL: https://issues.apache.org/jira/browse/BEAM-501
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Frances Perry
Assignee: Jean-Baptiste Onofré


Update the main landing page and website skin as discussed here

https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit
 





[jira] [Created] (BEAM-500) Update website layou

2016-08-01 Thread Frances Perry (JIRA)
Frances Perry created BEAM-500:
--

 Summary: Update website layou
 Key: BEAM-500
 URL: https://issues.apache.org/jira/browse/BEAM-500
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Frances Perry
Assignee: Frances Perry


As discussed on dev@, update the website layout to use this:

https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit





[jira] [Commented] (BEAM-434) When examples write output to file it creates many output files instead of one

2016-07-12 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373374#comment-15373374
 ] 

Frances Perry commented on BEAM-434:


Not overly constraining the sharding, so that the runner can choose a bundling 
that performs well, is pretty key to the model. So I think it's pretty 
important to introduce users to this idea in the examples.

The direct runner should be careful to create a small (but variable) number of 
files to show that the default is *not* one or a fixed number. I'd prefer we 
fix this in a way that is *not* specific to TextIO.Write -- the same thing will 
happen in many other places.

Can we wait for Thomas to return from vacation tomorrow and get his opinion?

> When examples write output to file it creates many output files instead of one
> --
>
> Key: BEAM-434
> URL: https://issues.apache.org/jira/browse/BEAM-434
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
>
> When using `TextIO.Write.to("/path/to/output")` without any restrictions on 
> the number of shards, it might generate many output files (depending on your 
> input). For WordCount, for example, you'll get as many output files as unique 
> words in your input.
> Since I think examples are expected to execute in a friendly manner, to "see" 
> what they do rather than to optimize for performance, I suggest using 
> `withoutSharding()` when writing the example output to an output file.
> Examples I could find that behave this way:
> org.apache.beam.examples.WordCount
> org.apache.beam.examples.complete.TfIdf
> org.apache.beam.examples.cookbook.DeDupExample
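
As a rough illustration of the behavior discussed above, here is a plain-Java sketch (not Beam code) of what sharded output means: records are spread over several files, and with a single shard everything lands in one file, which is roughly what `withoutSharding()` asks for. The class name `ShardedOutputSketch`, the round-robin assignment, and the `prefix-SSSSS-of-NNNNN` file naming are assumptions made for this example; a real runner chooses the shard count and the assignment itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical plain-Java illustration of sharded output; this is not Beam API. */
final class ShardedOutputSketch {

    /** Builds a shard file name such as "out-00001-of-00003" (naming is an assumption). */
    static String shardName(String prefix, int shard, int numShards) {
        return String.format(Locale.ROOT, "%s-%05d-of-%05d", prefix, shard, numShards);
    }

    /**
     * Distributes records round-robin over numShards in-memory "files".
     * Nothing is written to disk; the map keys stand in for file names.
     */
    static Map<String, List<String>> write(String prefix, List<String> records, int numShards) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        for (int shard = 0; shard < numShards; shard++) {
            files.put(shardName(prefix, shard, numShards), new ArrayList<>());
        }
        for (int i = 0; i < records.size(); i++) {
            files.get(shardName(prefix, i % numShards, numShards)).add(records.get(i));
        }
        return files;
    }
}
```

Calling `ShardedOutputSketch.write("out", records, 3)` yields three named shards, while passing `1` as the shard count keeps every record in a single file, which is the trade-off the comment above is about: fewer files are friendlier to read, but a fixed count takes a choice away from the runner.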





[jira] [Commented] (BEAM-320) Provide Beam turnkey binary distributions embedding runners and execution runtime

2016-06-01 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311053#comment-15311053
 ] 

Frances Perry commented on BEAM-320:


Ready-to-use distributions for common usage patterns sounds like a good idea -- 
it will make things much easier for users. 

For the Dataflow Runner, I think Google may prefer to provide a Google-built 
binary distribution based on Beam instead of providing this convenience as part 
of Beam, because for Google Cloud Platform customers, we may want to package in 
a few additional libraries for interacting with other Google Cloud Platform 
services. It doesn't sound right to complicate Beam with those dependencies. 

But I can definitely see there are some that would make sense as part of Beam. 
And in any case, we should make all of these distributions easy to find via 
documentation on the Beam site.

(Also keep in mind, that there will be multiple SDKs, so we likely want to name 
things to include both the runner and the sdk -- beam-java-spark, etc.)

> Provide Beam turnkey binary distributions embedding runners and execution 
> runtime
> -
>
> Key: BEAM-320
> URL: https://issues.apache.org/jira/browse/BEAM-320
> Project: Beam
>  Issue Type: Wish
>  Components: build-system
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Currently, the only distribution Beam provides is the source distribution.
> For new users, it could be interesting to have a ready-to-use binary 
> distribution embedding the SDK and a specific runner with its backend execution 
> runtime.
> For instance, we could provide:
> - beam-spark-xxx.tar.gz containing SDK, Spark runner, Spark
> - beam-flink-xxx.tar.gz containing SDK, Flink runner, Flink
> Thoughts ?





[jira] [Commented] (BEAM-262) Native Runners | Direct Compiler

2016-05-09 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276724#comment-15276724
 ] 

Frances Perry commented on BEAM-262:


I'm not sure why you think using Flink or Spark for execution is overkill for 
what Beam does. Creating a backend that can handle all Beam pipelines at scale 
is a huge undertaking! I agree with Davor that building backends is generally 
beyond the scope of Beam currently. 

Right now we're looking at creating the best programming model for writing data 
processing pipelines that generalizes functionality of a number of current 
distributed processing backends. Each backend has its own strengths in terms of 
what use cases it handles well, and users can choose which one fits their 
needs. Having a single backend that automatically does the best thing would be 
great, but I don't think it's feasible yet.

> Native Runners | Direct Compiler 
> -
>
> Key: BEAM-262
> URL: https://issues.apache.org/jira/browse/BEAM-262
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-ideas
>Reporter: Suminda Dharmasena
>Assignee: Davor Bonaci
>
> Having to depend on other frameworks to do the heavy lifting means that the 
> quirks, limitations and overhead of the other platform limit what can be 
> achieved. Hence, is it possible to have Beam directly generate code for LLVM, 
> JVM and .NET platforms without dependence on any other platform?
> Also, perhaps code could be generated in high-level languages like C/C++, 
> Java, C#, F#, Rust, Julia, D, Nim, etc., rather than directly as native code.





[jira] [Updated] (BEAM-193) Port existing Dataflow SDK documentation to Beam Programming Guide

2016-04-14 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-193:
---
Assignee: Devin Donnelly  (was: James Malone)

> Port existing Dataflow SDK documentation to Beam Programming Guide
> --
>
> Key: BEAM-193
> URL: https://issues.apache.org/jira/browse/BEAM-193
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Devin Donnelly
>Assignee: Devin Donnelly
>
> There is an extensive amount of documentation on the Dataflow SDK programming 
> model and classes. Port this documentation over as a new Beam Programming 
> Guide covering the following major topics:
> - Programming model overview
> - Pipeline structure
> - PCollections
> - Transforms
> - I/O





[jira] [Updated] (BEAM-192) Create new landing page for Apache Beam Documentation

2016-04-14 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-192:
---
Assignee: Devin Donnelly  (was: James Malone)

> Create new landing page for Apache Beam Documentation
> -
>
> Key: BEAM-192
> URL: https://issues.apache.org/jira/browse/BEAM-192
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Devin Donnelly
>Assignee: Devin Donnelly
>
> Revise the current stopgap Apache Beam landing page.
> - Explain the benefits of the Beam programming model
> - Disclose the status of the various Beam SDKs and runners
> - Provide an easy place to access release notes





[jira] [Updated] (BEAM-194) Create a walkthrough of Beam examples in mobile gaming domain

2016-04-14 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-194:
---
Assignee: Devin Donnelly  (was: James Malone)

> Create a walkthrough of Beam examples in mobile gaming domain
> -
>
> Key: BEAM-194
> URL: https://issues.apache.org/jira/browse/BEAM-194
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Devin Donnelly
>Assignee: Devin Donnelly
>
> The Beam SDKs provide a series of example pipelines in the mobile gaming 
> domain. The Dataflow documentation contains a detailed walkthrough of these 
> examples, explaining the use case, pipeline design, and some of the code.
> Port these examples to the Beam website for Beam users.





[jira] [Commented] (BEAM-138) Extend TextIO to new protocols (and maybe rename to FileIO)

2016-03-21 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15204593#comment-15204593
 ] 

Frances Perry commented on BEAM-138:


TextIO is really about a specific file format -- it requires 
newline-delimited records. It'd be great to increase the number of things it 
can read those from though. [~dhalp...@google.com] You probably know the status 
of generalizing the file system?

> Extend TextIO to new protocols (and maybe rename to FileIO)
> ---
>
> Key: BEAM-138
> URL: https://issues.apache.org/jira/browse/BEAM-138
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> The current TextIO supports:
> - local files, when using a path directly, like /path/to...
> - Google Cloud Storage files, using a path like gs:...
> On the other hand, we have a contribution (from Tom) to support HDFS.
> From a user perspective, it would be easier to use a single IO supporting 
> different protocols:
> - file:
> - gs:
> - hdfs:
> - mvn:
> - ...
> It would also be convenient to be able to combine protocols and eventually 
> use a different coder (for instance xml:hdfs:).
> In that case, maybe it would make sense to rename TextIO to a generic FileIO.
> Thoughts ?
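
To make the proposal above concrete, here is a minimal plain-Java sketch of dispatching on the protocol prefix of a path. It does not use any Beam API: `SchemeRegistrySketch`, `register`, and `open` are hypothetical names, handlers simply return a description string instead of opening anything, and treating a path with no scheme as `file` is an assumption of this example.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry that picks a handler by URI scheme; this is not Beam API. */
final class SchemeRegistrySketch {

    // Maps a scheme such as "file", "gs" or "hdfs" to the handler for that protocol.
    private final Map<String, Function<URI, String>> handlers = new HashMap<>();

    /** Registers the handler to use for one scheme. */
    void register(String scheme, Function<URI, String> handler) {
        handlers.put(scheme, handler);
    }

    /** Parses the spec and delegates to the handler registered for its scheme. */
    String open(String spec) {
        URI uri = URI.create(spec);
        // A bare path like /path/to/file has no scheme; assume it is a local file.
        String scheme = uri.getScheme() == null ? "file" : uri.getScheme();
        Function<URI, String> handler = handlers.get(scheme);
        if (handler == null) {
            throw new IllegalArgumentException("No handler registered for scheme: " + scheme);
        }
        return handler.apply(uri);
    }
}
```

With handlers registered for `file` and `gs`, both `open("/tmp/in.txt")` and `open("gs://bucket/in.txt")` go through the same entry point, and an unregistered protocol such as `mvn:` fails with a clear error. The record format (for example newline-delimited text) stays a separate concern from where the bytes come from, which matches the distinction drawn in the comment above.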





[jira] [Commented] (BEAM-137) Add implicit conf/pipeline-default.conf options file

2016-03-21 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15204607#comment-15204607
 ] 

Frances Perry commented on BEAM-137:


[~lcwik] Do you have plans for generalizing PipelineOptions in a multi-runner 
world? How would that affect this?

> Add implicit conf/pipeline-default.conf options file
> 
>
> Key: BEAM-137
> URL: https://issues.apache.org/jira/browse/BEAM-137
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Davor Bonaci
>
> Right now, most users provide the pipeline options via the main arguments.
> For instance, it's the classic way to provide the pipeline input, etc.
> For convenience, it would be great if the pipeline looked for options in 
> conf/[pipeline_name]-default.conf by default, with the main arguments 
> overriding those options.
> Thoughts ?
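
The lookup order proposed above (defaults from a conf file, overridden by the main arguments) can be sketched in plain Java as follows. This is only an illustration and not how PipelineOptions works: `DefaultOptionsSketch` and `resolve` are hypothetical names, the conf file is assumed to use `key=value` lines, and its contents are passed in as a string so the example does not depend on a real file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical merge of conf-file defaults with command-line overrides; not Beam API. */
final class DefaultOptionsSketch {

    /**
     * Loads defaults from the conf file contents, then applies every
     * "--name=value" argument on top, so arguments win over defaults.
     */
    static Properties resolve(String confFileContents, String[] args) {
        Properties options = new Properties();
        try {
            options.load(new StringReader(confFileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String arg : args) {
            int eq = arg.indexOf('=');
            // Arguments that are not of the form --name=value are ignored in this sketch.
            if (arg.startsWith("--") && eq > 2) {
                options.setProperty(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return options;
    }
}
```

For example, with defaults `input=/data/in.txt` and `runner=direct` and the single argument `--runner=spark`, the resolved options keep the default input and take the runner from the command line.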





[jira] [Commented] (BEAM-91) Retractions

2016-03-02 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177206#comment-15177206
 ] 

Frances Perry commented on BEAM-91:
---

Did you mean "backsies"? ;-)

> Retractions
> ---
>
> Key: BEAM-91
> URL: https://issues.apache.org/jira/browse/BEAM-91
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Tyler Akidau
>Assignee: Frances Perry
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We still haven't added retractions to Beam, even though they're a core part 
> of the model. We should document all the necessary aspects (uncombine, 
> reverting DoFn output with DoOvers, sink integration, source-level 
> retractions, etc), and then implement them.





[jira] [Commented] (BEAM-79) Gearpump runner

2016-02-29 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173229#comment-15173229
 ] 

Frances Perry commented on BEAM-79:
---

Happy to assign it to you, as you will clearly be the expert on Gearpump ;-)

But please note that Beam is still very much under construction and there are a 
number of breaking API changes likely in the near future. So please reach out 
before getting beyond the early design phase / determining how well the models 
align. If you haven't yet, I'd start with these resources: 
http://beam.incubator.apache.org/getting_started/


> Gearpump runner
> ---
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Tyler Akidau
>Assignee: Manu Zhang
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF 
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of 
> low-level primitives a la MillWheel, with some higher level primitives like 
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.





[jira] [Updated] (BEAM-79) Gearpump runner

2016-02-29 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-79:
--
Assignee: Manu Zhang  (was: James Malone)

> Gearpump runner
> ---
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Tyler Akidau
>Assignee: Manu Zhang
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF 
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of 
> low-level primitives a la MillWheel, with some higher level primitives like 
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.





[jira] [Updated] (BEAM-79) Gearpump runner

2016-02-29 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-79:
--
Assignee: James Malone

> Gearpump runner
> ---
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Tyler Akidau
>Assignee: James Malone
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF 
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of 
> low-level primitives a la MillWheel, with some higher level primitives like 
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.





[jira] [Created] (BEAM-77) Reorganize Directory structure

2016-02-26 Thread Frances Perry (JIRA)
Frances Perry created BEAM-77:
-

 Summary: Reorganize Directory structure
 Key: BEAM-77
 URL: https://issues.apache.org/jira/browse/BEAM-77
 Project: Beam
  Issue Type: Task
  Components: project-management
Reporter: Frances Perry
Assignee: Frances Perry


Now that we've done the initial Dataflow code drop, we will restructure 
directories to provide space for additional SDKs and Runners.





[jira] [Commented] (BEAM-14) Add data integration DSL

2016-02-15 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148117#comment-15148117
 ] 

Frances Perry commented on BEAM-14:
---

I think there are a few rough concepts in here that may need model extensions, 
but in general this seems to be about supporting a different DSL on top of the 
existing model.

> Add data integration DSL
> 
>
> Key: BEAM-14
> URL: https://issues.apache.org/jira/browse/BEAM-14
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Even if users would still be able to use the API directly, it would be great 
> to provide a DSL on top of the API covering batch and streaming data 
> processing as well as data integration.
> Instead of designing a pipeline as a chain of apply() calls wrapping functions 
> (DoFn), we can provide a fluent DSL allowing users to directly leverage 
> turnkey functions.
> For instance, a user would be able to design a pipeline like:
> {code}
> .from("kafka:localhost:9092?topic=foo").reduce(...).split(...).wiretap(...).map(...).to("jms:queue:foo….");
> {code}
> The DSL will also allow reusing existing pipelines, for instance:
> {code}
> .from("cxf:...").reduce().pipeline("other").map().to("kafka:localhost:9092?topic=foo=all")
> {code}
> So it means that we will have to create an IO sink that can trigger the 
> execution of a target pipeline: from("trigger:other") triggers the 
> pipeline execution when another pipeline design starts with 
> pipeline("other"). We can also imagine mixing runners: the pipeline() 
> can be on one runner and the from("trigger:other") on another. 
> It's not trivial, but it will give strong flexibility and key value for Beam.
> In a second step, we can provide DSLs in different languages (the first one 
> would be Java, but why not provide XML, Akka, or Scala DSLs).
> We can note in the previous examples that the DSL would also provide data 
> integration support to Beam in addition to data processing. Data integration 
> is an extension of the Beam API to support some Enterprise Integration Patterns 
> (EIPs). As we would need metadata for data integration (even if metadata can 
> also be interesting in stream/batch data processing pipelines), we can provide 
> a DataxMessage built on top of PCollection. A DataxMessage would contain:
> - structured headers
> - binary payload
> For instance, the headers can contain an Avro schema to describe the payload.
> The headers can also contain useful information coming from the IO source 
> (for instance the partition/path where the data comes from, …).





[jira] [Updated] (BEAM-12) Apply GroupByKey transforms on PCollection of normal type other than KV

2016-02-14 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-12:
--
   Assignee: Frances Perry
   Priority: Trivial  (was: Major)
Component/s: sdk-java-core

If you need to do something to the elements to extract the key before grouping, 
you can use a ParDo (or a derivative like MapElements). So something like:
 
input.apply(ParDo.of(new ExtractFn()))
.apply(GroupByKey.create());

I'm not sure what you meant by automatically extracting keys from data -- that 
sounds like something that would be application or domain specific.

As always, if you find yourself using a pattern often in your applications, you 
can create your own composite PTransform to do it more compactly.
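
The two-step pattern described above (extract a key per element, then group by that key) and its packaging into one reusable step can be sketched in plain Java, without any Beam dependency. Everything here is an analogy rather than Beam API: `ExtractThenGroupSketch`, `withKeys`, `groupByKey`, and `extractAndGroup` are hypothetical names, with lists and maps standing in for PCollections.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical plain-Java analogy of "extract key, then group"; this is not Beam API. */
final class ExtractThenGroupSketch {

    /** Step 1 (the ParDo/MapElements analog): pair each element with its extracted key. */
    static <T, K> List<Map.Entry<K, T>> withKeys(List<T> input, Function<T, K> extractKey) {
        List<Map.Entry<K, T>> keyed = new ArrayList<>();
        for (T element : input) {
            keyed.add(new SimpleEntry<>(extractKey.apply(element), element));
        }
        return keyed;
    }

    /** Step 2 (the GroupByKey analog): collect all values that share a key. */
    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> keyed) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> kv : keyed) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return grouped;
    }

    /** The "composite" analog: both steps wrapped behind a single call. */
    static <T, K> Map<K, List<T>> extractAndGroup(List<T> input, Function<T, K> extractKey) {
        return groupByKey(withKeys(input, extractKey));
    }
}
```

Grouping the words `apple`, `avocado`, `banana` by their first letter with `extractAndGroup(words, s -> s.charAt(0))` puts the first two under `a` and the third under `b`. The key-extraction function stays supplied by the caller, since what counts as a key is application specific.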


> Apply GroupByKey transforms on PCollection of normal type other than KV
> ---
>
> Key: BEAM-12
> URL: https://issues.apache.org/jira/browse/BEAM-12
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: bakeypan
>Assignee: Frances Perry
>Priority: Trivial
>
> Now the GroupByKey transform can only be applied to a PCollection<KV<K, V>>. So I 
> have to transform a PCollection<T> into a PCollection<KV<K, V>> before I can 
> apply GroupByKey.
> I think we can do better by applying GroupByKey to a PCollection of a normal type 
> other than KV. The user could supply a custom key-extraction function, or we could 
> offer a default one. Something like this:
> PCollection input = ...
> PCollection> result = input.apply(GroupByKey. V>create(new ExtractFn()));


