[jira] [Commented] (BEAM-5654) Race condition in python sdk_worker causing KeyError

2018-10-04 Thread Alex Amato (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638766#comment-16638766
 ] 

Alex Amato commented on BEAM-5654:
--

Note: I believe the code is able to recover on the next iteration; I don't 
believe it causes pipelines to fail or get stuck. But it does cause noisy 
errors to be logged.

> Race condition in python sdk_worker causing KeyError
> 
>
> Key: BEAM-5654
> URL: https://issues.apache.org/jira/browse/BEAM-5654
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Alex Amato
>Assignee: Ahmet Altay
>Priority: Major
>
> We saw Dataflow pipelines hitting a KeyError at this line. This is likely 
> due to a race condition between the key existence check and the access.
> [https://github.com/apache/beam/blob/v2.6.0/sdks/python/apache_beam/runners/worker/sdk_worker.py#L189]
> One fix would be to do a single lookup for the value instead of the double 
> lookup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5654) Race condition in python sdk_worker causing KeyError

2018-10-04 Thread Alex Amato (JIRA)
Alex Amato created BEAM-5654:


 Summary: Race condition in python sdk_worker causing KeyError
 Key: BEAM-5654
 URL: https://issues.apache.org/jira/browse/BEAM-5654
 Project: Beam
  Issue Type: New Feature
  Components: sdk-py-core
Reporter: Alex Amato
Assignee: Ahmet Altay


We saw Dataflow pipelines hitting a KeyError at this line. This is likely due 
to a race condition between the key existence check and the access.

[https://github.com/apache/beam/blob/v2.6.0/sdks/python/apache_beam/runners/worker/sdk_worker.py#L189]

One fix would be to do a single lookup for the value instead of the double 
lookup.
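
A minimal sketch of the single-lookup idea, assuming the double lookup is a 
membership test followed by an index into a dict shared between threads. The 
names below (processors, double_lookup, single_lookup) are hypothetical and 
are not taken from sdk_worker.py.

```python
# Hypothetical stand-in for a dict that several worker threads read and mutate.
processors = {}


def double_lookup(key):
    # Racy pattern: another thread may delete `key` after the membership
    # test but before the indexing, and the indexing then raises KeyError.
    if key in processors:
        return processors[key]
    return None


def single_lookup(key):
    # A single dict operation: returns None when the key is absent, so there
    # is no window between the "check" and the "access".
    return processors.get(key)


def single_lookup_eafp(key):
    # Equivalent form that indexes once and handles the miss explicitly.
    try:
        return processors[key]
    except KeyError:
        return None
```

Either single-lookup form removes the gap between the check and the access; 
whether a missing key should then be skipped or retried depends on the 
caller.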





[jira] [Created] (BEAM-5635) FR: Enable Transactional writes with DatastoreIO

2018-10-03 Thread Alex Amato (JIRA)
Alex Amato created BEAM-5635:


 Summary: FR: Enable Transactional writes with DatastoreIO
 Key: BEAM-5635
 URL: https://issues.apache.org/jira/browse/BEAM-5635
 Project: Beam
  Issue Type: New Feature
  Components: sdk-java-core
Reporter: Alex Amato
Assignee: Kenneth Knowles


I have seen a user who would like to use Datastore Transactions to roll back a 
set of records if one of them fails to write. Let's consider this use case for 
DatastoreIO.





[jira] [Commented] (BEAM-4830) Determine why go vet failures invoked by ./gradlew check were not caught by jenkins build

2018-07-20 Thread Alex Amato (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550933#comment-16550933
 ] 

Alex Amato commented on BEAM-4830:
--

See related https://issues.apache.org/jira/projects/BEAM/issues/BEAM-4831

> Determine why go vet failures invoked by ./gradlew check were not caught by 
> jenkins build
> 
>
> Key: BEAM-4830
> URL: https://issues.apache.org/jira/browse/BEAM-4830
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alex Amato
>Assignee: Luke Cwik
>Priority: Major
>
> The purpose of this is to catch errors developers see when they first start 
> contributing to beam. Let's ensure we run the same commands in the 
> [contributing guide|https://beam.apache.org/contribute/].
>  
> Note: check runs more than build, so we are not catching these problems in 
> the continuous Jenkins testing.
>  
>  





[jira] [Commented] (BEAM-4830) Update Jenkins build to run ./gradlew check instead of just ./gradlew build

2018-07-20 Thread Alex Amato (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550931#comment-16550931
 ] 

Alex Amato commented on BEAM-4830:
--

Well, for some reason the go vet issues were not caught by Jenkins. We thought 
it was because check was not running, but if we are wrong, then it occurred 
for some other reason.

> Update Jenkins build to run ./gradlew check instead of just ./gradlew build
> ---
>
> Key: BEAM-4830
> URL: https://issues.apache.org/jira/browse/BEAM-4830
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alex Amato
>Assignee: Luke Cwik
>Priority: Major
>
> The purpose of this is to catch errors developers see when they first start 
> contributing to beam. Let's ensure we run the same commands in the 
> [contributing guide|https://beam.apache.org/contribute/].
>  
> Note: check runs more than build, so we are not catching these problems in 
> the continuous Jenkins testing.
>  
>  





[jira] [Updated] (BEAM-4830) Determine why go vet failures invoked by ./gradlew check were not caught by jenkins build

2018-07-20 Thread Alex Amato (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Amato updated BEAM-4830:
-
Summary: Determine why go vet failures invoked by ./gradlew check were not 
caught by jenkins build  (was: Update Jenkins build to run ./gradlew 
check instead of just ./gradlew build)

> Determine why go vet failures invoked by ./gradlew check were not caught by 
> jenkins build
> 
>
> Key: BEAM-4830
> URL: https://issues.apache.org/jira/browse/BEAM-4830
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alex Amato
>Assignee: Luke Cwik
>Priority: Major
>
> The purpose of this is to catch errors developers see when they first start 
> contributing to beam. Let's ensure we run the same commands in the 
> [contributing guide|https://beam.apache.org/contribute/].
>  
> Note: check runs more than build, so we are not catching these problems in 
> the continuous Jenkins testing.
>  
>  





[jira] [Created] (BEAM-4831) Fix broken go vet task in the gradle build

2018-07-19 Thread Alex Amato (JIRA)
Alex Amato created BEAM-4831:


 Summary: Fix broken go vet task in the gradle build
 Key: BEAM-4831
 URL: https://issues.apache.org/jira/browse/BEAM-4831
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Alex Amato
Assignee: Luke Cwik


Reproduce by running

{{./gradlew check}}

Today this is failing.

go vet seems to be trying to validate files in a vendor folder.

 

 





[jira] [Created] (BEAM-4830) Update Jenkins build to run ./gradlew check instead of just ./gradlew build

2018-07-19 Thread Alex Amato (JIRA)
Alex Amato created BEAM-4830:


 Summary: Update Jenkins build to run ./gradlew check instead of 
just ./gradlew build
 Key: BEAM-4830
 URL: https://issues.apache.org/jira/browse/BEAM-4830
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Alex Amato
Assignee: Luke Cwik


The purpose of this is to catch errors developers see when they first start 
contributing to beam. Let's ensure we run the same commands in the 
[contributing guide|https://beam.apache.org/contribute/].

 

Note: check runs more than build, so we are not catching these problems in the 
continuous Jenkins testing.

 

 





[jira] [Created] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2018-05-21 Thread Alex Amato (JIRA)
Alex Amato created BEAM-4374:


 Summary: Update existing metrics in the FN API to use new Metric 
Schema
 Key: BEAM-4374
 URL: https://issues.apache.org/jira/browse/BEAM-4374
 Project: Beam
  Issue Type: New Feature
  Components: beam-model
Reporter: Alex Amato
Assignee: Kenneth Knowles


Update existing metrics to use the new proto and cataloging schema defined in:

[_https://s.apache.org/beam-fn-api-metrics_]
 * Check in new protos
 * Define catalog file for metrics
 * Port existing metrics to use this new format, based on catalog names+metadata





[jira] [Commented] (BEAM-3926) Support MetricsPusher in Dataflow Runner

2018-05-16 Thread Alex Amato (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478254#comment-16478254
 ] 

Alex Amato commented on BEAM-3926:
--

Hi Etienne, I saw your PR for the metrics pusher 
([https://github.com/apache/beam/pull/4548/files])

It's true that the Dataflow engine today handles pushing metrics to different 
places inside of its service.

That said, it might be appropriate to have MetricsPusher push metrics to the 
Dataflow service; that seems like an appropriate use of the layer. However, 
perhaps your design assumes metrics are already aggregated before pushing. 
Dataflow expects workers to push metrics (the local values for each worker) to 
the service, which aggregates them together.

Does MetricsPusher rely on a metrics container existing on the cloud-hosted 
engine to collect these already-aggregated metrics, and then push them 
wherever appropriate? If this is the case, then you're right that 
MetricsPusher would need to be implemented in the Dataflow service, ideally 
accounting for the options/sinks you have specified.

That said, perhaps a design is possible that sends the pre-aggregated metrics 
back to a worker (by querying them from the service) and then uses the same 
MetricsPusher.
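
To illustrate the difference between per-worker values and aggregated values, 
here is a minimal sketch, assuming counter-style metrics that aggregate by 
summation. The function and metric names are hypothetical; this is not the 
Dataflow service's or Beam's actual API.

```python
from collections import Counter


def aggregate_worker_metrics(worker_updates):
    """Sum per-worker counter values into one value per metric name.

    `worker_updates` is a list of dicts, one per worker, mapping a
    (hypothetical) metric name to that worker's local value.
    """
    totals = Counter()
    for update in worker_updates:
        # Counter.update adds the values for keys that are already present.
        totals.update(update)
    return dict(totals)


# Each worker only knows its own local values.
worker_updates = [
    {"elements_read": 10, "bytes_written": 2048},
    {"elements_read": 7},
]

# The service-side step combines them; a pusher that expects aggregated
# metrics would need to run after this step, not on an individual worker.
aggregated = aggregate_worker_metrics(worker_updates)
```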

> Support MetricsPusher in Dataflow Runner
> 
>
> Key: BEAM-3926
> URL: https://issues.apache.org/jira/browse/BEAM-3926
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Scott Wegner
>Assignee: Pablo Estrada
>Priority: Major
>
> See [relevant email 
> thread|https://lists.apache.org/thread.html/2e87f0adcdf8d42317765f298e3e6fdba72917a72d4a12e71e67e4b5@%3Cdev.beam.apache.org%3E].
>  From [~echauchot]:
>   
> _AFAIK Dataflow being a cloud hosted engine, the related runner is very 
> different from the others. It just submits a job to the cloud hosted engine. 
> So, no access to metrics container etc... from the runner. So I think that 
> the MetricsPusher (component responsible for merging metrics and pushing them 
> to a sink backend) must not be instantiated in DataflowRunner otherwise it 
> would be more a client (driver) piece of code and we will lose all the 
> interest of being close to the execution engine (among other things 
> instrumentation of the execution of the pipelines).  I think that the 
> MetricsPusher needs to be instantiated in the actual Dataflow engine._
>  
>   





[jira] [Commented] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread Alex Amato (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425894#comment-16425894
 ] 

Alex Amato commented on BEAM-3250:
--

FWIW, here is what I learned yesterday. 

While looking to migrate:

job_beam_PostCommit_Java_MavenInstall.groovy

to a PostCommit file, i.e.

job_beam_PostCommit_Java_GradleBuild.groovy

I discovered that the Gradle precommit job 
(beam_PreCommit_Java_GradleBuild) was actually running most of the 
integration tests.

It was not running the apexRunnerTests, due to a bug listed in the 
examples/build.gradle file (which has a few functions to invoke all the 
integration test variants).

Maven tagged tests as integration-test but Gradle does not, so we need to 
define functions/tasks in Gradle to pick out the integration tests and run 
just those in postcommit.

 

Though I think for this one, we can just make a PostCommit file that does 
basically the same thing as beam_PreCommit_Java_GradleBuild, which appears to 
be running the integration tests that Maven was running before.

I believe Dan sent a PR for one of the PostCommit tests out yesterday.

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





[jira] [Assigned] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-03 Thread Alex Amato (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Amato reassigned BEAM-3250:


Assignee: Alex Amato  (was: Ben Sidhom)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Alex Amato
>Priority: Major
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





[jira] [Created] (BEAM-3139) Update dataflow.version in beam root pom.xml

2017-11-03 Thread Alex Amato (JIRA)
Alex Amato created BEAM-3139:


 Summary: Update dataflow.version in beam root pom.xml
 Key: BEAM-3139
 URL: https://issues.apache.org/jira/browse/BEAM-3139
 Project: Beam
  Issue Type: Bug
  Components: runner-core, runner-dataflow
Reporter: Alex Amato
Assignee: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)