[3/3] beam-site git commit: This closes #146

2017-02-08 Thread frances
This closes #146


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/221f388d
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/221f388d
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/221f388d

Branch: refs/heads/asf-site
Commit: 221f388d3fd75af1e2a4ae2e59815bb96d292fbb
Parents: e1dd10a 6ebcb08
Author: Frances Perry <f...@google.com>
Authored: Wed Feb 8 10:54:44 2017 -0800
Committer: Frances Perry <f...@google.com>
Committed: Wed Feb 8 10:54:44 2017 -0800

--
 .../documentation/programming-guide/index.html  | 327 +++---
 src/documentation/programming-guide.md  | 341 +++
 2 files changed, 541 insertions(+), 127 deletions(-)
--




[2/3] beam-site git commit: Regenerate website

2017-02-08 Thread frances
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/6ebcb08c
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/6ebcb08c
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/6ebcb08c

Branch: refs/heads/asf-site
Commit: 6ebcb08cb503a3e58101fc73be11649116111c65
Parents: f277339
Author: Frances Perry <f...@google.com>
Authored: Wed Feb 8 10:53:36 2017 -0800
Committer: Frances Perry <f...@google.com>
Committed: Wed Feb 8 10:53:36 2017 -0800

--
 .../documentation/programming-guide/index.html  | 327 ---
 1 file changed, 274 insertions(+), 53 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/6ebcb08c/content/documentation/programming-guide/index.html
--
diff --git a/content/documentation/programming-guide/index.html b/content/documentation/programming-guide/index.html
index dee4869..9830735 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -208,7 +208,7 @@
 PCollection: A PCollection represents a distributed data set 
that your Beam pipeline operates on. The data set can be bounded, 
meaning it comes from a fixed source like a file, or unbounded, 
meaning it comes from a continuously updating source via a subscription or 
other mechanism. Your pipeline typically creates an initial PCollection by reading data from an external 
data source, but you can also create a PCollection from in-memory data within your 
driver program. From there, PCollections 
are the inputs and outputs for each step in your pipeline.
   
   
-Transform: A Transform represents a data processing 
operation, or a step, in your pipeline. Every Transform takes one or more PCollection objects as input, performs a 
processing function that you provide on the elements of that PCollection, and produces one or more output 
PCollection objects.
+Transform: A Transform represents a data processing 
operation, or a step, in your pipeline. Every Transform takes one or more PCollection objects as input, performs a 
processing function that you provide on the elements of that PCollection, and produces one or more output 
PCollection objects.
   
   
 I/O Source and Sink: Beam provides Source and Sink APIs to represent reading and writing 
data, respectively. Source encapsulates 
the code necessary to read data into your Beam pipeline from some external 
source, such as cloud file storage or a subscription to a streaming data 
source. Sink likewise encapsulates the 
code necessary to write the elements of a PCollection to an external data sink.
@@ -248,11 +248,13 @@
 
 
 
-from apache_beam.utils.pipeline_options import PipelineOptions
-
-# Will parse the arguments passed into the application and construct a PipelineOptions
+# Will parse the arguments passed into the application and construct a PipelineOptions object.
 # Note that --help will print registered options.
+
+from apache_beam.utils.pipeline_options import PipelineOptions
+
 p = beam.Pipeline(options=PipelineOptions())
+
 
 
 
@@ -286,13 +288,8 @@
 
 
 
-import apache_beam as beam
-
-# Create the pipeline.
-p = beam.Pipeline()
+lines = p | 'ReadMyFile' >> beam.io.ReadFromText('gs://some/inputData.txt')
 
-# Read the text file into a PCollection.
-lines = p | 'ReadMyFile' >> beam.io.Read(beam.io.TextFileSource("protocol://path/to/some/inputData.txt"))
 
 
 
@@ -327,20 +324,18 @@
 
 
 
-import apache_beam as beam
+p = beam.Pipeline(options=pipeline_options)
 
-# python list
-lines = [
-  "To be, or not to be: that is the question: ",
-  "Whether 'tis nobler in the mind to suffer ",
-  "The slings and arrows of outrageous fortune, ",
-  "Or to take arms against a sea of troubles, "
-]
+(p
+ | beam.Create([
+ 'To be, or not to be: that is the question: ',
+ 'Whether \'tis nobler in the mind to suffer ',
+ 'The slings and arrows of outrageous fortune, ',
+ 'Or to take arms against a sea of troubles, '])
+ | beam.io.WriteToText(my_options.output))
 
-# Create the pipeline.
-p = beam.Pipeline()
+result = p.run()
 
-collection = p | 'ReadMyLines' >> beam.Create(lines)
 
 
 
@@ -401,8 +396,8 @@
 How you apply your pipeline’s transforms determines the structure of your 
pipeline. The best way to think of your pipeline is as a directed acyclic 
graph, where the nodes are PCollections 
and the edges are transforms. For example, you can chain transforms to create a 
sequential pipeline, like this one:
 
 [Final Output PCollection] = [Initial Input PCollection].apply([First Transform])
-   .apply([Second Transform])
-

[jira] [Created] (BEAM-1556) Spark executors need to register IO factories

2017-02-25 Thread Frances Perry (JIRA)
Frances Perry created BEAM-1556:
---

 Summary: Spark executors need to register IO factories
 Key: BEAM-1556
 URL: https://issues.apache.org/jira/browse/BEAM-1556
 Project: Beam
  Issue Type: Bug
  Components: runner-spark
Reporter: Frances Perry
Assignee: Amit Sela


The Spark executors need to call IOChannelUtils.registerIOFactories(options) in 
order to support GCS files and make the default WordCount example work.

Context in this thread: 
https://lists.apache.org/thread.html/469a139c9eb07e64e514cdea42ab8000678ab743794a090c365205d7@%3Cuser.beam.apache.org%3E
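
The failure mode described above can be illustrated with a toy model. This is a minimal Python sketch, not Beam or Spark code: the Worker class, its register_io_factories method, the "extra_factories" option key, and the factory names are all hypothetical stand-ins, and the sketch assumes that registration only affects the process in which it is called.

```python
class Worker:
    """Toy stand-in for one process (driver or executor). Hypothetical."""

    def __init__(self):
        # Assumption: only the local file scheme is known before registration.
        self.factories = {"file": "LocalFileFactory"}

    def register_io_factories(self, options):
        # Assumption: registration mutates per-process state only.
        self.factories.update(options.get("extra_factories", {}))

    def can_open(self, path):
        # A path without a scheme is treated as a local file.
        scheme = path.split("://", 1)[0] if "://" in path else "file"
        return scheme in self.factories


options = {"extra_factories": {"gs": "GcsFactory"}}

driver, executor = Worker(), Worker()
driver.register_io_factories(options)  # assumed: only the driver registers today

driver_ok = driver.can_open("gs://bucket/input.txt")
executor_before = executor.can_open("gs://bucket/input.txt")  # executor never registered
executor.register_io_factories(options)  # the call the issue asks for
executor_after = executor.can_open("gs://bucket/input.txt")
```

In this model the executor cannot resolve a gs:// path until it performs its own registration, which is the gap the issue reports.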



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-370) Remove the .named() methods from PTransforms and sub-classes

2016-12-22 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15771289#comment-15771289
 ] 

Frances Perry commented on BEAM-370:


Can this issue be closed?

> Remove the .named() methods from PTransforms and sub-classes
> 
>
> Key: BEAM-370
> URL: https://issues.apache.org/jira/browse/BEAM-370
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Ben Chambers
>Priority: Minor
>  Labels: backward-incompatible
>
> 1. Update examples/tests/etc. to use named application instead of `.named()`
> 2. Remove the `.named()` methods from composite PTransforms
> 3. Where appropriate, use the PTransform constructor which takes a string 
> to use as the default name.
> See further discussion in the related thread 
> (http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201606.mbox/%3ccan-7fgzuz1f_szzd2orfyd2pk2_prymhgwjepjpefp01h7s...@mail.gmail.com%3E).





[jira] [Commented] (BEAM-501) Update website skin

2017-04-05 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958315#comment-15958315
 ] 

Frances Perry commented on BEAM-501:


Checked with JB that he's not actively working on this right now. Reassigning 
to Jeremy, who has some great thoughts ;-)

> Update website skin
> ---
>
> Key: BEAM-501
> URL: https://issues.apache.org/jira/browse/BEAM-501
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>    Reporter: Frances Perry
>Assignee: Jeremy Weinstein
>
> Update the main landing page and website skin as discussed here
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit
>  





[jira] [Assigned] (BEAM-501) Update website skin

2017-04-05 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-501:
--

Assignee: Jeremy Weinstein  (was: Jean-Baptiste Onofré)

> Update website skin
> ---
>
> Key: BEAM-501
> URL: https://issues.apache.org/jira/browse/BEAM-501
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>    Reporter: Frances Perry
>Assignee: Jeremy Weinstein
>
> Update the main landing page and website skin as discussed here
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit
>  





[jira] [Commented] (BEAM-1247) Session state should not be lost when discardingFiredPanes

2017-04-20 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977091#comment-15977091
 ] 

Frances Perry commented on BEAM-1247:
-

Any updates on this one?

> Session state should not be lost when discardingFiredPanes
> --
>
> Key: BEAM-1247
> URL: https://issues.apache.org/jira/browse/BEAM-1247
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> Today when {{discardingFiredPanes}} the entirety of state is cleared, 
> including the state of evolving sessions. This means that with multiple 
> triggerings a single session shows up as multiple. This also stymies 
> downstream stateful computations.
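
The effect described in this issue can be reproduced in a toy model. The sketch below is plain Python under stated assumptions, not the Beam implementation: session_panes and its clear_session_on_fire flag are hypothetical, a trigger is assumed to fire after every element, and a session is assumed to extend whenever an element arrives before the session's current end.

```python
def session_panes(timestamps, gap, clear_session_on_fire):
    """Fire a pane after every element and return the session window of each pane."""
    session = None  # (start, end) of the evolving session, if any
    panes = []
    for ts in sorted(timestamps):
        if session is not None and ts < session[1]:
            # Element falls inside the live session: extend its end.
            session = (session[0], max(session[1], ts + gap))
        else:
            # No live session covers this element: start a new one.
            session = (ts, ts + gap)
        panes.append(session)
        if clear_session_on_fire:
            # Models "the entirety of state is cleared", session bounds included.
            session = None
    return panes


cleared = session_panes([0, 5, 8], gap=10, clear_session_on_fire=True)
kept = session_panes([0, 5, 8], gap=10, clear_session_on_fire=False)
```

With the session state cleared on each firing, the three elements produce three windows with different start times. When the session bounds survive the firing, every pane belongs to one session that starts at 0 and grows.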





[jira] [Commented] (BEAM-170) Session windows should not be identified by their bounds

2017-04-20 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977095#comment-15977095
 ] 

Frances Perry commented on BEAM-170:


Ilya, should we find a new owner for this?

> Session windows should not be identified by their bounds
> 
>
> Key: BEAM-170
> URL: https://issues.apache.org/jira/browse/BEAM-170
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Ilya Ganelin
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> Today, if two session windows for the same key have the same bounds, they are 
> considered the same window. This is an accident. It is not intended that any 
> session windows are considered equal except via the operation of merging them 
> into the same session.
> A risk associated with this behavior is that two windows that happen to 
> coincide will share per-window-and-key state rather than evolving separately 
> and having their separate state reconciled by state merging logic. These code 
> paths are not required to be coherent, and in practice they are not.
> In particular, if the trigger for a session window ever finishes, then 
> subsequent data in a window with the same bounds will be dropped, whereas if 
> it had differed by a millisecond it would have created a new session, 
> ignoring the previously closed session.
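
A small Python sketch of the collision described above, under loud assumptions: BoundsWindow, IdentifiedWindow, and shares_state are hypothetical names, and the per-window-and-key state is modeled as a plain dict. It is not how the Java SDK stores state; it only shows why equality on bounds alone makes two coinciding windows share an entry.

```python
import itertools
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BoundsWindow:
    """Window whose identity is its bounds only (the behavior in the issue)."""
    start: int
    end: int


_next_id = itertools.count()


@dataclass(frozen=True)
class IdentifiedWindow:
    """Hypothetical alternative: each window also carries a unique id."""
    start: int
    end: int
    uid: int = field(default_factory=lambda: next(_next_id))


def shares_state(first, second, key="user"):
    # State is keyed by (key, window); a finished trigger is recorded for `first`.
    state = {(key, first): "trigger finished"}
    return (key, second) in state


same_bounds_collide = shares_state(BoundsWindow(0, 10), BoundsWindow(0, 10))
with_ids_collide = shares_state(IdentifiedWindow(0, 10), IdentifiedWindow(0, 10))
```

In this model the second BoundsWindow finds the first window's finished trigger, so its data would be dropped, while the two IdentifiedWindow instances keep separate state despite equal bounds.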





[jira] [Commented] (BEAM-622) Add checkpointing tests for DoFnOperator and WindowDoFnOperator

2017-04-20 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977100#comment-15977100
 ] 

Frances Perry commented on BEAM-622:


Is this a blocker for the first stable release? Is there a good owner if so?

> Add checkpointing tests for DoFnOperator and WindowDoFnOperator 
> 
>
> Key: BEAM-622
> URL: https://issues.apache.org/jira/browse/BEAM-622
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink
>Affects Versions: 0.3.0-incubating
>Reporter: Maximilian Michels
> Fix For: First stable release
>
>
> Tests which test the correct snapshotting of these two operators are missing. 





[jira] [Created] (BEAM-1624) Unable to deserialize Coder in DataflowRunner

2017-03-04 Thread Frances Perry (JIRA)
Frances Perry created BEAM-1624:
---

 Summary: Unable to deserialize Coder in DataflowRunner
 Key: BEAM-1624
 URL: https://issues.apache.org/jira/browse/BEAM-1624
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Frances Perry
Assignee: Davor Bonaci
Priority: Blocker
 Fix For: 0.6.0


To repro, sync to head and run the LeaderBoard example with the Dataflow runner.
Does not repro in 0.5.

Caused by: java.lang.RuntimeException: Unable to deserialize Coder: 
WindowedValue$FullWindowedValueCoder(KvCoder(BigQueryIO$ShardedKeyCoder(StringUtf8Coder),BigQueryIO$TableRowInfoCoder),IntervalWindow$IntervalWindowCoder).
 Check that a suitable constructor is defined.  See Coder for details.
at 
org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:115)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$StepTranslator.addOutput(DataflowPipelineTranslator.java:655)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$StepTranslator.addOutput(DataflowPipelineTranslator.java:602)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator.translateOutputs(DataflowPipelineTranslator.java:945)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator.access$1200(DataflowPipelineTranslator.java:111)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$6.translateMultiHelper(DataflowPipelineTranslator.java:836)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$6.translate(DataflowPipelineTranslator.java:826)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$6.translate(DataflowPipelineTranslator.java:823)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.visitPrimitiveTransform(DataflowPipelineTranslator.java:413)
at 
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:486)
at 
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
at 
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
at 
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
at 
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
at 
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
at 
org.apache.beam.sdk.runners.TransformHierarchy$Node.access$400(TransformHierarchy.java:231)
at 
org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:206)
at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:321)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.translate(DataflowPipelineTranslator.java:363)
at 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator.translate(DataflowPipelineTranslator.java:153)
at 
org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:505)
at 
org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:150)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:210)
at 
org.apache.beam.examples.complete.game.GameStats.main(GameStats.java:340)
... 6 more
Caused by: java.lang.RuntimeException: Unable to deserialize class interface 
org.apache.beam.sdk.coders.Coder
at org.apache.beam.sdk.util.Serializer.deserialize(Serializer.java:102)
at 
org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:112)
... 29 more





[jira] [Commented] (BEAM-1624) Unable to deserialize Coder in DataflowRunner

2017-03-04 Thread Frances Perry (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896112#comment-15896112
 ] 

Frances Perry commented on BEAM-1624:
-

[~altay] FYI -- considering this 0.6 release blocking until triaged.


> Unable to deserialize Coder in DataflowRunner
> -
>
> Key: BEAM-1624
> URL: https://issues.apache.org/jira/browse/BEAM-1624
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>    Reporter: Frances Perry
>Assignee: Davor Bonaci
>Priority: Blocker
> Fix For: 0.6.0
>
>
> To repro, sync to head and run the LeaderBoard example with the Dataflow 
> runner.
> Does not repro in 0.5.
> Caused by: java.lang.RuntimeException: Unable to deserialize Coder: 
> WindowedValue$FullWindowedValueCoder(KvCoder(BigQueryIO$ShardedKeyCoder(StringUtf8Coder),BigQueryIO$TableRowInfoCoder),IntervalWindow$IntervalWindowCoder).
>  Check that a suitable constructor is defined.  See Coder for details.
>   at 
> org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:115)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator$StepTranslator.addOutput(DataflowPipelineTranslator.java:655)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator$StepTranslator.addOutput(DataflowPipelineTranslator.java:602)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator.translateOutputs(DataflowPipelineTranslator.java:945)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator.access$1200(DataflowPipelineTranslator.java:111)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator$6.translateMultiHelper(DataflowPipelineTranslator.java:836)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator$6.translate(DataflowPipelineTranslator.java:826)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator$6.translate(DataflowPipelineTranslator.java:823)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.visitPrimitiveTransform(DataflowPipelineTranslator.java:413)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:486)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:481)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.access$400(TransformHierarchy.java:231)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:206)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:321)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.translate(DataflowPipelineTranslator.java:363)
>   at 
> org.apache.beam.runners.dataflow.DataflowPipelineTranslator.translate(DataflowPipelineTranslator.java:153)
>   at 
> org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:505)
>   at 
> org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:150)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:210)
>   at 
> org.apache.beam.examples.complete.game.GameStats.main(GameStats.java:340)
>   ... 6 more
> Caused by: java.lang.RuntimeException: Unable to deserialize class interface 
> org.apache.beam.sdk.coders.Coder
>   at org.apache.beam.sdk.util.Serializer.deserialize(Serializer.java:102)
>   at 
> org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:112)
>   ... 29 more





[jira] [Created] (BEAM-1627) Composite/DisplayData structure changed

2017-03-05 Thread Frances Perry (JIRA)
Frances Perry created BEAM-1627:
---

 Summary: Composite/DisplayData structure changed
 Key: BEAM-1627
 URL: https://issues.apache.org/jira/browse/BEAM-1627
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Frances Perry
Assignee: Thomas Groh
Priority: Blocker
 Fix For: 0.6.0


When running at head, pipeline composite structure has changed. My guess is 
this is related to pull/2145. 

(1) Steps that used to be leaf nodes are now expandable composites with a 
ParMultiDo inside them.

(2) For some (but not all) steps, display data appears to be lost.

This can be seen pretty clearly in the Dataflow monitoring UI. Attached 
screenshots showing
-- ParseGameEvent transform leaks an extra level of composite.
-- FixedWindows transform leaks an extra composite and loses display data.

[~tgroh] can you triage?
[~altay] FYI potential 0.6 release blocker





[jira] [Updated] (BEAM-1627) Composite/DisplayData structure changed

2017-03-05 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry updated BEAM-1627:

Attachment: ParseGame-0.5.png
ParseGame-snapshot-extraComposite.png
FixedWindows-0.5.png
FixedWindows-snapshot-extraComposite-noDisplayData.png

> Composite/DisplayData structure changed
> ---
>
> Key: BEAM-1627
> URL: https://issues.apache.org/jira/browse/BEAM-1627
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>    Reporter: Frances Perry
>Assignee: Thomas Groh
>Priority: Blocker
> Fix For: 0.6.0
>
> Attachments: FixedWindows-0.5.png, 
> FixedWindows-snapshot-extraComposite-noDisplayData.png, ParseGame-0.5.png, 
> ParseGame-snapshot-extraComposite.png
>
>
> When running at head, pipeline composite structure has changed. My guess is 
> this is related to pull/2145. 
> (1) Steps that used to be leaf nodes are now expandable composites with a 
> ParMultiDo inside them.
> (2) For some (but not all) steps, display data appears to be lost.
> This can be seen pretty clearly in the Dataflow monitoring UI. Attached 
> screenshots showing
> -- ParseGameEvent transform leaks an extra level of composite.
> -- FixedWindows transform leaks an extra composite and loses display data.
> [~tgroh] can you triage?
> [~altay] FYI potential 0.6 release blocker





[jira] [Assigned] (BEAM-1069) Add CountingInput Transform to python sdk

2017-03-31 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-1069:
---

Assignee: Tibor Kiss  (was: Frances Perry)

> Add CountingInput Transform to python sdk
> -
>
> Key: BEAM-1069
> URL: https://issues.apache.org/jira/browse/BEAM-1069
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Vikas Kedigehalli
>Assignee: Tibor Kiss
>Priority: Minor
>  Labels: starter
>
> Similar to java sdk,  
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java





[jira] [Assigned] (BEAM-1976) Allow only one runner profile active at once in examples and archetypes

2017-08-11 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-1976:
---

Assignee: (was: Frances Perry)

> Allow only one runner profile active at once in examples and archetypes
> ---
>
> Key: BEAM-1976
> URL: https://issues.apache.org/jira/browse/BEAM-1976
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-java
>Reporter: Aviem Zur
>
> Since only one SLF4J logger binding is allowed in the classpath, we shouldn't 
> allow more than one runner profile to be active at once in our 
> examples/archetype modules since different runners use different bindings.
> Also, remove slf4j-jdk14 dependency from root and place it instead in 
> direct-runner and dataflow-runner profiles, for the same reason.





[jira] [Assigned] (BEAM-91) Retractions

2017-08-11 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-91?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-91:
-

Assignee: (was: Frances Perry)

> Retractions
> ---
>
> Key: BEAM-91
> URL: https://issues.apache.org/jira/browse/BEAM-91
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Tyler Akidau
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We still haven't added retractions to Beam, even though they're a core part 
> of the model. We should document all the necessary aspects (uncombine, 
> reverting DoFn output with DoOvers, sink integration, source-level 
> retractions, etc), and then implement them.





[jira] [Assigned] (BEAM-2450) Transform names and named applications should not be null or empty

2017-08-11 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry reassigned BEAM-2450:
---

Assignee: (was: Frances Perry)

> Transform names and named applications should not be null or empty
> --
>
> Key: BEAM-2450
> URL: https://issues.apache.org/jira/browse/BEAM-2450
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, sdk-java-core, sdk-py
>Reporter: Scott Wegner
>Priority: Minor
>
> Beam SDK allows setting the name of a transform [1] and also naming the 
> transform application [2]. If no name is specified on application, the name 
> of the transform is used. If no name is specified for the transform, the 
> class name is used.
> The application name serves as metadata for the applied PTransforms in the 
> constructed graph. They are effectively extra display data (historically, 
> PTransform names predate display data). The names are used by runners for UI 
> and monitoring applications, such as the displayed pipeline graph in the 
> Dataflow Monitoring UI [3].
> Currently there is no explicit validation on the specified application name. 
> The current behavior seems to be:
> * null application names cause a NullPointerException at construction time.
> * Specifying the empty string compiles and succeeds in the DirectRunner, but 
> causes strange behavior in Dataflow when rendering the graph in the UI. I 
> have not tested the behavior of other runners.
> We should add explicit validation in the model on the specified transform 
> name and application name. I propose that we disallow null and empty names.
> This is technically a breaking change as the SDK currently allows the empty 
> string, but only because it is under-specified. The upgrade path for any 
> pipelines broken by this change is simple: specify a non-empty name or 
> fall back to the default class name.
> [1] 
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PTransform.java#L236
> [2] 
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java#L295
> [3] 
> https://cloud.google.com/dataflow/pipelines/dataflow-monitoring-intf#viewing-a-pipeline
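
The naming rules and the proposed validation can be sketched as follows. This is an illustrative Python model, not SDK code: Transform, application_name, and the _UNSET sentinel are hypothetical, and raising ValueError is an assumption about how a rejected name might surface.

```python
_UNSET = object()  # sentinel: "no name given", distinct from an explicit None


def _validate(name, what):
    # Proposed rule from the issue: disallow null and empty names.
    if name is None:
        raise ValueError(f"{what} must not be null")
    if name == "":
        raise ValueError(f"{what} must not be empty")
    return name


class Transform:
    """Toy stand-in for a PTransform."""

    def __init__(self, name=_UNSET):
        # With no transform name, fall back to the class name.
        self.name = type(self).__name__ if name is _UNSET else _validate(name, "transform name")


def application_name(transform, name=_UNSET):
    # With no application name, fall back to the transform's name.
    return transform.name if name is _UNSET else _validate(name, "application name")


class ParseGameEvent(Transform):
    pass


default_name = application_name(ParseGameEvent())
explicit_name = application_name(ParseGameEvent(), "ParseEvents")
```

Using a sentinel keeps "no name specified" (which falls back to a default) separate from an explicit null or empty name (which is rejected).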





[jira] [Closed] (BEAM-193) Port existing Dataflow SDK documentation to Beam Programming Guide

2017-04-28 Thread Frances Perry (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frances Perry closed BEAM-193.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> Port existing Dataflow SDK documentation to Beam Programming Guide
> --
>
> Key: BEAM-193
> URL: https://issues.apache.org/jira/browse/BEAM-193
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Devin Donnelly
>Assignee: Melissa Pashniak
> Fix For: Not applicable
>
>
> There is an extensive amount of documentation on the Dataflow SDK programming 
> model and classes. Port this documentation over as a new Beam Programming 
> Guide covering the following major topics:
> - Programming model overview
> - Pipeline structure
> - PCollections
> - Transforms
> - I/O


