[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139946=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139946 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 31/Aug/18 04:52 Start Date: 31/Aug/18 04:52 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #6309: [BEAM-5274][SQL] Check if iterator.hasNext URL: https://github.com/apache/beam/pull/6309#issuecomment-417551085 run java precommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139946) Time Spent: 1.5h (was: 1h 20m) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5264) Reference DirectRunner implementation of Python user state and timers API
[ https://issues.apache.org/jira/browse/BEAM-5264?focusedWorklogId=139933=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139933 ] ASF GitHub Bot logged work on BEAM-5264: Author: ASF GitHub Bot Created on: 31/Aug/18 01:42 Start Date: 31/Aug/18 01:42 Worklog Time Spent: 10m Work Description: charlesccychen commented on issue #6304: [BEAM-5264] Reference DirectRunner implementation of Python User State and Timers API URL: https://github.com/apache/beam/pull/6304#issuecomment-417522850 run python postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139933) Time Spent: 20m (was: 10m) > Reference DirectRunner implementation of Python user state and timers API > - > > Key: BEAM-5264 > URL: https://issues.apache.org/jira/browse/BEAM-5264 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.6.0 >Reporter: Charles Chen >Assignee: Charles Chen >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > This issue tracks the reference DirectRunner implementation of the Beam > Python User State and Timer API, described here: > [https://s.apache.org/beam-python-user-state-and-timers]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139928 ] ASF GitHub Bot logged work on BEAM-4495: Author: ASF GitHub Bot Created on: 31/Aug/18 00:38 Start Date: 31/Aug/18 00:38 Worklog Time Spent: 10m Work Description: udim commented on issue #6282: [BEAM-4495] Website pre-commit job URL: https://github.com/apache/beam/pull/6282#issuecomment-417512291 @pabloem I'll try to "run seed job" and verify that the precommit runs on Jenkins. I'll try that tomorrow since builds.apache.org is still down. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139928) Time Spent: 7.5h (was: 7h 20m) > Create website pre-commits for apache/beam repository > - > > Key: BEAM-4495 > URL: https://issues.apache.org/jira/browse/BEAM-4495 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Scott Wegner >Assignee: Udi Meiri >Priority: Major > Labels: beam-site-automation-reliability > Time Spent: 7.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139929 ] ASF GitHub Bot logged work on BEAM-4495: Author: ASF GitHub Bot Created on: 31/Aug/18 00:38 Start Date: 31/Aug/18 00:38 Worklog Time Spent: 10m Work Description: udim edited a comment on issue #6282: [BEAM-4495] Website pre-commit job URL: https://github.com/apache/beam/pull/6282#issuecomment-417512291 @pabloem I'll try to "run seed job" and verify that the precommit runs on Jenkins. I'll try that tomorrow since builds.apache.org is still down. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139929) Time Spent: 7h 40m (was: 7.5h) > Create website pre-commits for apache/beam repository > - > > Key: BEAM-4495 > URL: https://issues.apache.org/jira/browse/BEAM-4495 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Scott Wegner >Assignee: Udi Meiri >Priority: Major > Labels: beam-site-automation-reliability > Time Spent: 7h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[beam] 01/01: Merge pull request #6312: Add min_cpu_platform pipeline option
This is an automated email from the ASF dual-hosted git repository. chamikara pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 8030204647ff2e13c8465ecca0f96ca82ab7c761 Merge: 568c96a c75dd4a Author: Chamikara Jayalath AuthorDate: Thu Aug 30 17:03:22 2018 -0700 Merge pull request #6312: Add min_cpu_platform pipeline option sdks/python/apache_beam/options/pipeline_options.py | 6 ++ sdks/python/apache_beam/runners/dataflow/dataflow_runner.py | 11 +++ 2 files changed, 17 insertions(+)
[beam] branch master updated (568c96a -> 8030204)
This is an automated email from the ASF dual-hosted git repository. chamikara pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 568c96a Config Gradle javadoc with UTF-8 encoding add c75dd4a Add min_cpu_platform pipeline option new 8030204 Merge pull request #6312: Add min_cpu_platform pipeline option The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: sdks/python/apache_beam/options/pipeline_options.py | 6 ++ sdks/python/apache_beam/runners/dataflow/dataflow_runner.py | 11 +++ 2 files changed, 17 insertions(+)
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139924=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139924 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 23:58 Start Date: 30/Aug/18 23:58 Worklog Time Spent: 10m Work Description: akedin commented on issue #6309: [BEAM-5274][SQL] Check if iterator.hasNext URL: https://github.com/apache/beam/pull/6309#issuecomment-417505481
@apilloud I assume you've looked at the updated PR.
> LGTM, but there is an implied `// this should never happen` here.
I don't think `.hasNext()` implies that. Calling `.hasNext()` before `.next()` is the contract of the iterator interface, and it is good practice to guard `.next()` with `.hasNext()`. It is similar to checking `if (counters.size() > 0)` before `counters.get(0)`, which you probably always want to do unless there is a clear guarantee of non-emptiness. Here there is no such guarantee, so I think it is correct to handle the empty case. I don't know whether a lack of metrics always means that `count = 0`, but it seems reasonable to assume so. I agree that it is unclear from reading this line of code alone what the behavior of metrics is expected to be for an empty pipeline, or whether there are other cases in which metrics can be empty. In my opinion, though, the ambiguity is caused by the choice of an iterator as the interface for extracting metrics, not by how we handle it.
Issue Time Tracking --- Worklog Id: (was: 139924) Time Spent: 1h 20m (was: 1h 10m) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h >
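The guard described in the comment above can be sketched in isolation. This is a minimal illustration, not the code from PR #6309: the class `HasNextGuard`, the method `firstOrZero`, and the use of a plain `Iterable<Long>` in place of Beam's metric results are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical stand-in for the counter lookup discussed above; not Beam's API.
class HasNextGuard {
  /** Returns the first value of the iterable, or 0 when it is empty. */
  static long firstOrZero(Iterable<Long> counters) {
    Iterator<Long> it = counters.iterator();
    // Checking hasNext() first means next() is never called on an empty
    // iterator, so NoSuchElementException cannot be raised here.
    return it.hasNext() ? it.next() : 0L;
  }

  public static void main(String[] args) {
    System.out.println(firstOrZero(Collections.<Long>emptyList()));
    System.out.println(firstOrZero(Arrays.asList(42L, 7L)));
  }
}
```

Whether an empty result should really map to a count of 0 is the open question raised in the comment; the sketch simply adopts that assumption.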
[beam] branch release-2.8.0-lyft deleted (was ff645a2)
This is an automated email from the ASF dual-hosted git repository. thw pushed a change to branch release-2.8.0-lyft in repository https://gitbox.apache.org/repos/asf/beam.git. was ff645a2 [LYFT] Build support for 2.8 branch. This change permanently discards the following revisions: discard ff645a2 [LYFT] Build support for 2.8 branch.
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139921=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139921 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 23:20 Start Date: 30/Aug/18 23:20 Worklog Time Spent: 10m Work Description: apilloud commented on issue #6309: [BEAM-5274][SQL] Check if iterator.hasNext URL: https://github.com/apache/beam/pull/6309#issuecomment-417498570 LGTM, but there is an implied `// this should never happen` here. You found this in the direct runner; I wonder what other runners do with regard to metrics when you run a pipeline with no elements. (Could this be a bug in the direct runner?) Issue Time Tracking --- Worklog Id: (was: 139921) Time Spent: 1h 10m (was: 1h) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h >
[jira] [Assigned] (BEAM-5258) Investigate if we can disable Row type flattening in Calcite
[ https://issues.apache.org/jira/browse/BEAM-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang reassigned BEAM-5258: -- Assignee: (was: Rui Wang) > Investigate if we can disable Row type flattening in Calcite > > > Key: BEAM-5258 > URL: https://issues.apache.org/jira/browse/BEAM-5258 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Rui Wang >Priority: Major > > Either disable the flattening in PlannerImpl or Flattener could be a good > start. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[beam] 01/01: Config Gradle javadoc with UTF-8 encoding
This is an automated email from the ASF dual-hosted git repository. lcwik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 568c96ac189571f838e152c8044a3429a4db9493 Merge: 64ec7bd 778d498 Author: Lukasz Cwik AuthorDate: Thu Aug 30 16:00:43 2018 -0700 Config Gradle javadoc with UTF-8 encoding .../src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy| 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
[beam] branch master updated (64ec7bd -> 568c96a)
This is an automated email from the ASF dual-hosted git repository. lcwik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 64ec7bd Merge pull request #6247 from cclauss/patch-1 add 778d498 Config Gradle javadoc new 568c96a Config Gradle javadoc with UTF-8 encoding The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy| 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
[jira] [Commented] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()
[ https://issues.apache.org/jira/browse/BEAM-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598020#comment-16598020 ] Reuven Lax commented on BEAM-5036: -- Great. We should also fix GCS to use rewrite instead of copy/rename (I think GCS rewrite didn't exist back when this code was originally written), though that should probably be in a separate PR. On Thu, Aug 30, 2018 at 3:13 PM Tim Robertson (JIRA) > Optimize FileBasedSink's WriteOperation.moveToOutput() > -- > > Key: BEAM-5036 > URL: https://issues.apache.org/jira/browse/BEAM-5036 > Project: Beam > Issue Type: Improvement > Components: io-java-files >Affects Versions: 2.5.0 >Reporter: Jozef Vilcek >Assignee: Tim Robertson >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > moveToOutput() methods in FileBasedSink.WriteOperation implements move by > copy+delete. It would be better to use a rename() which can be much more > effective for some filesystems. > Filesystem must support cross-directory rename. BEAM-4861 is related to this > for the case of HDFS filesystem. > Feature was discussed here: > http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139915 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 22:41 Start Date: 30/Aug/18 22:41 Worklog Time Spent: 10m Work Description: amaliujia commented on a change in pull request #6309: [BEAM-5274][SQL] Catch NoSuchElementException URL: https://github.com/apache/beam/pull/6309#discussion_r214202380
## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -304,7 +305,11 @@ private static Object fieldToAvatica(Schema.FieldType type, Object beamValue) {
         MetricsFilter.builder()
             .addNameFilter(MetricNameFilter.named(BeamEnumerableConverter.class, "rows"))
             .build());
-    count = metrics.getCounters().iterator().next().getAttempted();
+    try {
+      count = metrics.getCounters().iterator().next().getAttempted();
+    } catch (NoSuchElementException e) {
Review comment: Updated. I thought about it: your approach is better because it avoids catching other NoSuchElementExceptions that are not caused by this issue.
Issue Time Tracking --- Worklog Id: (was: 139915) Time Spent: 1h (was: 50m) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h >
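The concern about catching unrelated exceptions can be illustrated with a small sketch. Everything here is hypothetical (`CatchVersusGuard`, `withCatch`, `withGuard`, and the `read` function that stands in for whatever is done with the first element); it only shows that a `try`/`catch` around the whole expression also swallows a NoSuchElementException raised further down the call chain, while a `hasNext()` guard does not.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical comparison of the two approaches discussed in the review.
class CatchVersusGuard {
  // Broad: any NoSuchElementException thrown inside the try block,
  // including one thrown by read, is treated as "the list was empty".
  static long withCatch(List<String> items, Function<String, Long> read) {
    try {
      return read.apply(items.iterator().next());
    } catch (NoSuchElementException e) {
      return 0L;
    }
  }

  // Narrow: only genuine emptiness maps to 0; exceptions thrown by
  // read propagate to the caller unchanged.
  static long withGuard(List<String> items, Function<String, Long> read) {
    Iterator<String> it = items.iterator();
    return it.hasNext() ? read.apply(it.next()) : 0L;
  }
}
```

With a `read` function that itself throws NoSuchElementException, `withCatch` quietly returns 0 for a non-empty list, while `withGuard` surfaces the failure.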
[beam] branch master updated (865b147 -> 64ec7bd)
This is an automated email from the ASF dual-hosted git repository. ccy pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 865b147 Merge pull request #6291 from pabloem/for-test-ss add 9506b7a tox.ini: Upgrade to current versions of PyLint add 60ad2d4 PyLint 1.9.3 --> 1.9.2 add 4450af9 Add failing tests to .pylintrc disable section add b8cfbd1 Add disable directive logging-not-lazy to .pylintrc add c74b1d9 Drop back to PyLint v1.8 when the --py3k flag is used add db50be4 Drop back to PyLint v1.8 when the --py3k flag is used add 0a2de30 Drop back to PyLint v1.7 when the --py3k flag is used add 12e87dd flake8 --exclude={toxinidir}/build/gradleenv add 085a974 Put a space between --exclude= and --select= add 1123909 Remove the unnecessary disable directives from .pylintrc add 6bd3277 Remove flake8 experimentation new 64ec7bd Merge pull request #6247 from cclauss/patch-1 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: sdks/python/.pylintrc | 7 +-- sdks/python/tox.ini | 12 ++-- 2 files changed, 11 insertions(+), 8 deletions(-)
[beam] 01/01: Merge pull request #6247 from cclauss/patch-1
This is an automated email from the ASF dual-hosted git repository. ccy pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 64ec7bd9612353555fc53e9a88f4c747e042fdaf Merge: 865b147 6bd3277 Author: Charles Chen AuthorDate: Thu Aug 30 15:41:18 2018 -0700 Merge pull request #6247 from cclauss/patch-1 tox.ini: Upgrade to current versions of PyLint sdks/python/.pylintrc | 7 +-- sdks/python/tox.ini | 12 ++-- 2 files changed, 11 insertions(+), 8 deletions(-)
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139913 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 22:32 Start Date: 30/Aug/18 22:32 Worklog Time Spent: 10m Work Description: akedin commented on a change in pull request #6309: [BEAM-5274][SQL] Catch NoSuchElementException URL: https://github.com/apache/beam/pull/6309#discussion_r214200886
## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -304,7 +305,11 @@ private static Object fieldToAvatica(Schema.FieldType type, Object beamValue) {
         MetricsFilter.builder()
             .addNameFilter(MetricNameFilter.named(BeamEnumerableConverter.class, "rows"))
             .build());
-    count = metrics.getCounters().iterator().next().getAttempted();
+    try {
+      count = metrics.getCounters().iterator().next().getAttempted();
+    } catch (NoSuchElementException e) {
Review comment: I think it's better to guard `.next()` with `.hasNext()` instead of catching the exception.
Issue Time Tracking --- Worklog Id: (was: 139913) Time Spent: 50m (was: 40m) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h >
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139909=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139909 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 22:20 Start Date: 30/Aug/18 22:20 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #6309: [BEAM-5274][SQL] Catch NoSuchElementException URL: https://github.com/apache/beam/pull/6309#issuecomment-417486000 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139909) Time Spent: 0.5h (was: 20m) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139910=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139910 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 22:20 Start Date: 30/Aug/18 22:20 Worklog Time Spent: 10m Work Description: amaliujia removed a comment on issue #6309: [BEAM-5274][SQL] Catch NoSuchElementException URL: https://github.com/apache/beam/pull/6309#issuecomment-417486000 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139910) Time Spent: 40m (was: 0.5h) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (BEAM-5268) Ardagan's test ticket
[ https://issues.apache.org/jira/browse/BEAM-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin closed BEAM-5268. --- Resolution: Invalid Fix Version/s: Not applicable > Ardagan's test ticket > -- > > Key: BEAM-5268 > URL: https://issues.apache.org/jira/browse/BEAM-5268 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > Fix For: Not applicable > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139907=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139907 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 22:19 Start Date: 30/Aug/18 22:19 Worklog Time Spent: 10m Work Description: amaliujia opened a new pull request #6309: [BEAM-5274][SQL] Catch NoSuchElementException URL: https://github.com/apache/beam/pull/6309
I found that when running `INSERT INTO empty_table FROM empty_table`, the SQL Shell throws NoSuchElementException, which it shouldn't.
Follow this checklist to help us incorporate your contribution quickly and easily:
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
It will help us expedite review of your Pull Request if you tag someone (e.g. `@username`) to look at it.
Post-Commit Tests Status (on master branch): build status badges for the Go, Java, and Python SDKs across the Apex, Dataflow, Flink, Gearpump, Samza, and Spark runners.
Issue Time Tracking --- Worklog Id: (was: 139907) Time Spent: 10m Remaining Estimate: 0h > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang
[jira] [Work logged] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
[ https://issues.apache.org/jira/browse/BEAM-5274?focusedWorklogId=139908=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139908 ] ASF GitHub Bot logged work on BEAM-5274: Author: ASF GitHub Bot Created on: 30/Aug/18 22:19 Start Date: 30/Aug/18 22:19 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #6309: [BEAM-5274][SQL] Catch NoSuchElementException URL: https://github.com/apache/beam/pull/6309#issuecomment-417485877 R: @apilloud CC: @akedin This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139908) Time Spent: 20m (was: 10m) > Handle NoSuchElementException When select from an empty table and insert into > another table > --- > > Key: BEAM-5274 > URL: https://issues.apache.org/jira/browse/BEAM-5274 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5275) Beam SQL RAND(seed) function isn't compatible with dofn
Andrew Pilloud created BEAM-5275: Summary: Beam SQL RAND(seed) function isn't compatible with dofn Key: BEAM-5275 URL: https://issues.apache.org/jira/browse/BEAM-5275 Project: Beam Issue Type: Bug Components: dsl-sql Reporter: Andrew Pilloud Assignee: Xu Mingmin We currently wrap the RAND(seed) operator with a normal dofn, but that isn't actually valid because we don't process the rows in a single instance of that dofn. We should revisit if this is something we even want to support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
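One way to see why per-instance state is a problem for a seeded generator is the following sketch. It is only an illustration under stated assumptions: `SeededRand` is a hypothetical stand-in for a function object that holds `RAND(seed)` state, and `java.util.Random` is assumed as the generator; neither is taken from Beam's implementation.

```java
import java.util.Random;

// Hypothetical illustration; not Beam's RAND(seed) implementation.
class SeededRandPerInstance {
  /** Stand-in for a function object that carries its own seeded generator. */
  static final class SeededRand {
    private final Random random;

    SeededRand(long seed) {
      this.random = new Random(seed);
    }

    double next() {
      return random.nextDouble();
    }
  }

  public static void main(String[] args) {
    // One instance handling two rows: the sequence advances between rows.
    SeededRand single = new SeededRand(42L);
    System.out.println(single.next());
    System.out.println(single.next());

    // Two instances handling one row each: both start the sequence over,
    // so both rows receive the same "random" value.
    System.out.println(new SeededRand(42L).next());
    System.out.println(new SeededRand(42L).next());
  }
}
```

If rows are spread over several instances, as the issue says they are, each instance restarts the same sequence, so the output depends on how rows happen to be distributed rather than on the seed alone.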
[jira] [Commented] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()
[ https://issues.apache.org/jira/browse/BEAM-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597976#comment-16597976 ] Tim Robertson commented on BEAM-5036: - Thanks [~reuvenlax] - that was in response to my concern about the files already existing right? It doesn't affect whether we use copy/delete or rename approach, or am I missing something? I have added {{FileAlreadyExistsException}} in the [PR for changing HDFSFileSystem.rename()|https://github.com/apache/beam/pull/6285]. With that we can handle the case of failure when the destination already exists, delete it and retry thus forcing the overwrite. Together with the IGNORE_MISSING_FILES that should be as idempotent as we can achieve I think. Sound reasonable? > Optimize FileBasedSink's WriteOperation.moveToOutput() > -- > > Key: BEAM-5036 > URL: https://issues.apache.org/jira/browse/BEAM-5036 > Project: Beam > Issue Type: Improvement > Components: io-java-files >Affects Versions: 2.5.0 >Reporter: Jozef Vilcek >Assignee: Tim Robertson >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > moveToOutput() methods in FileBasedSink.WriteOperation implements move by > copy+delete. It would be better to use a rename() which can be much more > effective for some filesystems. > Filesystem must support cross-directory rename. BEAM-4861 is related to this > for the case of HDFS filesystem. > Feature was discussed here: > http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
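The delete-and-retry idea described above can be sketched against the local filesystem. This is an assumption-laden analogy, not the code from the linked PR: it uses `java.nio.file` instead of the Hadoop filesystem API, and the class `RenameWithOverwrite` and method `moveOverwriting` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical local-filesystem analogy of "rename, and on conflict delete and retry".
class RenameWithOverwrite {
  static void moveOverwriting(Path src, Path dst) throws IOException {
    try {
      // Without REPLACE_EXISTING, the move fails if the destination exists.
      Files.move(src, dst);
    } catch (FileAlreadyExistsException e) {
      // Destination left over from an earlier attempt: delete it and
      // retry, which forces the overwrite.
      Files.delete(dst);
      Files.move(src, dst);
    } catch (NoSuchFileException e) {
      // Source already moved by an earlier attempt: tolerate this only
      // if the destination is present (the "ignore missing files" case).
      if (!Files.exists(dst)) {
        throw e;
      }
    }
  }
}
```

Calling `moveOverwriting` a second time after a successful move is then a no-op, which is the idempotence the comment is aiming for. The delete followed by the second move is not atomic, so a concurrent reader could briefly see no destination file.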
[jira] [Created] (BEAM-5274) Handle NoSuchElementException When select from an empty table and insert into another table
Rui Wang created BEAM-5274: -- Summary: Handle NoSuchElementException When select from an empty table and insert into another table Key: BEAM-5274 URL: https://issues.apache.org/jira/browse/BEAM-5274 Project: Beam Issue Type: Improvement Components: dsl-sql Reporter: Rui Wang Assignee: Rui Wang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4704) String operations yield incorrect results when executed through SQL shell
[ https://issues.apache.org/jira/browse/BEAM-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597973#comment-16597973 ]

Andrew Pilloud commented on BEAM-4704:

We can't override calcite's implementation if calcite is generating our code. Fixing is in calcite blocks BEAM-5112

> String operations yield incorrect results when executed through SQL shell
>
> Key: BEAM-4704
> URL: https://issues.apache.org/jira/browse/BEAM-4704
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Reporter: Kenneth Knowles
> Priority: Major
>
> {{TRIM}} is defined to trim _all_ the characters in the first string from the
> string-to-be-trimmed. Calcite has an incorrect implementation of this. We use
> our own fixed implementation. But when executed through the SQL shell, the
> results do not match what we get from the PTransform path. Here are test
> cases that pass on {{master}} but are incorrect in the shell:
> {code:sql}
> BeamSQL> select TRIM(LEADING 'eh' FROM 'hehe__hehe');
> +------------+
> | EXPR$0     |
> +------------+
> | hehe__hehe |
> +------------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(TRAILING 'eh' FROM 'hehe__hehe');
> +-----------+
> | EXPR$0    |
> +-----------+
> | hehe__heh |
> +-----------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(BOTH 'eh' FROM 'hehe__hehe');
> +-----------+
> | EXPR$0    |
> +-----------+
> | hehe__heh |
> +-----------+
> {code}
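The TRIM semantics stated in the issue (remove every character that appears in the first string) can be sketched in a few lines of Java. This is an illustration of that definition only, not Beam's or Calcite's implementation; `TrimSketch`, `Mode` and `trim` are hypothetical names.

```java
final class TrimSketch {
    enum Mode { LEADING, TRAILING, BOTH }

    /** Strips characters contained in `chars` from the chosen end(s) of `input`. */
    static String trim(Mode mode, String chars, String input) {
        int start = 0;
        int end = input.length();
        if (mode != Mode.TRAILING) {
            // Advance past every leading character that is in the trim set.
            while (start < end && chars.indexOf(input.charAt(start)) >= 0) {
                start++;
            }
        }
        if (mode != Mode.LEADING) {
            // Step back over every trailing character that is in the trim set.
            while (end > start && chars.indexOf(input.charAt(end - 1)) >= 0) {
                end--;
            }
        }
        return input.substring(start, end);
    }
}
```

Under this definition, the three shell queries above would be expected to return `__hehe`, `hehe__` and `__` respectively. These expected values are derived from the definition quoted in the issue; the issue itself does not list them.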
[jira] [Work logged] (BEAM-4861) Hadoop Filesystem silently fails
[ https://issues.apache.org/jira/browse/BEAM-4861?focusedWorklogId=139899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139899 ]

ASF GitHub Bot logged work on BEAM-4861:

Author: ASF GitHub Bot
Created on: 30/Aug/18 21:55
Start Date: 30/Aug/18 21:55
Worklog Time Spent: 10m
Work Description: timrobertson100 commented on issue #6285: [BEAM-4861] Autocreate directories when doing an HDFS rename
URL: https://github.com/apache/beam/pull/6285#issuecomment-417479658

@reuvenlax when builds.apache.org is back, can you PTAL? This is added to enable us to proceed with BEAM-5036, but I believe this is a good addition regardless of what we do on that ticket.

Issue Time Tracking
Worklog Id: (was: 139899)
Time Spent: 0.5h (was: 20m)

> Hadoop Filesystem silently fails
>
> Key: BEAM-4861
> URL: https://issues.apache.org/jira/browse/BEAM-4861
> Project: Beam
> Issue Type: Bug
> Components: io-java-hadoop
> Reporter: Jozef Vilcek
> Assignee: Tim Robertson
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hi,
> Beam Filesystem operations copy, rename and delete are void in the SDK. Hadoop
> native filesystem operations are not: they return a result. The current
> implementation in Beam ignores the result and passes as long as no exception
> is thrown.
> I got burned by this when using 'rename' to do a 'move' operation on HDFS. If
> the target directory does not exist, the operation returns false and does not
> touch the file.
> [https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java#L148]
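The failure mode in this issue, a rename that reports failure through its return value instead of an exception, also exists in the JDK: `java.io.File.renameTo` returns a boolean. The sketch below uses that method as a local stand-in to show the general shape of the fix (create the target directory, then turn a false result into an exception). It is not the code from the pull request; `CheckedRename` is a hypothetical name.

```java
import java.io.File;
import java.io.IOException;

final class CheckedRename {
    /** Renames src to dst, creating dst's parent directory and failing loudly. */
    static void rename(File src, File dst) throws IOException {
        File parent = dst.getParentFile();
        // Create the target directory first, so a missing directory
        // cannot make the rename fail quietly.
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Unable to create directory " + parent);
        }
        // renameTo signals failure only through its return value,
        // so the result must be checked explicitly.
        if (!src.renameTo(dst)) {
            throw new IOException("Unable to rename " + src + " to " + dst);
        }
    }
}
```

A caller that ignored the boolean would see no error at all when the rename did nothing, which is the silent failure the reporter describes.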
[jira] [Work logged] (BEAM-4861) Hadoop Filesystem silently fails
[ https://issues.apache.org/jira/browse/BEAM-4861?focusedWorklogId=139898=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139898 ]

ASF GitHub Bot logged work on BEAM-4861:

Author: ASF GitHub Bot
Created on: 30/Aug/18 21:50
Start Date: 30/Aug/18 21:50
Worklog Time Spent: 10m
Work Description: timrobertson100 commented on issue #6285: [BEAM-4861] Autocreate directories when doing an HDFS rename
URL: https://github.com/apache/beam/pull/6285#issuecomment-417478467

Retest this please

Issue Time Tracking
Worklog Id: (was: 139898)
Time Spent: 20m (was: 10m)

> Hadoop Filesystem silently fails
>
> Key: BEAM-4861
> URL: https://issues.apache.org/jira/browse/BEAM-4861
> Project: Beam
> Issue Type: Bug
> Components: io-java-hadoop
> Reporter: Jozef Vilcek
> Assignee: Tim Robertson
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Hi,
> Beam Filesystem operations copy, rename and delete are void in the SDK. Hadoop
> native filesystem operations are not: they return a result. The current
> implementation in Beam ignores the result and passes as long as no exception
> is thrown.
> I got burned by this when using 'rename' to do a 'move' operation on HDFS. If
> the target directory does not exist, the operation returns false and does not
> touch the file.
> [https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java#L148]
[jira] [Created] (BEAM-5273) Local file system does not work as expected on Portability Framework with Docker
Ankur Goenka created BEAM-5273:

Summary: Local file system does not work as expected on Portability Framework with Docker
Key: BEAM-5273
URL: https://issues.apache.org/jira/browse/BEAM-5273
Project: Beam
Issue Type: Bug
Components: sdk-go, sdk-java-harness, sdk-py-harness
Reporter: Ankur Goenka
Assignee: Ankur Goenka

With the portability framework, the local file system reads from and writes to the Docker container's file system. This makes local files unusable with the portability framework.
[jira] [Work logged] (BEAM-5272) Randomize the reduced splits in BigtableIO so that multiple workers may not hit the same tablet server
[ https://issues.apache.org/jira/browse/BEAM-5272?focusedWorklogId=139887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139887 ]

ASF GitHub Bot logged work on BEAM-5272:

Author: ASF GitHub Bot
Created on: 30/Aug/18 21:14
Start Date: 30/Aug/18 21:14
Worklog Time Spent: 10m
Work Description: kevinsi4508 opened a new pull request #6308: [BEAM-5272] Randomize the reduced splits in BigtableIO so that multiple workers may not hit the same tablet server
URL: https://github.com/apache/beam/pull/6308

Randomize the reduced splits so that multiple workers may not hit the same tablet server

Issue Time Tracking
Worklog Id: (was: 139887)
Time Spent: 10m
Remaining Estimate: 0h

> Randomize the reduced splits in BigtableIO so that multiple workers may not
> hit the same tablet server
>
> Key: BEAM-5272
> URL: https://issues.apache.org/jira/browse/BEAM-5272
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Reporter: Kevin Si
> Assignee: Chamikara Jayalath
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Randomize the reduced splits in BigtableIO so that multiple workers may not
> hit the same tablet server.
[jira] [Created] (BEAM-5272) Randomize the reduced splits in BigtableIO so that multiple workers may not hit the same tablet server
Kevin Si created BEAM-5272: -- Summary: Randomize the reduced splits in BigtableIO so that multiple workers may not hit the same tablet server Key: BEAM-5272 URL: https://issues.apache.org/jira/browse/BEAM-5272 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Kevin Si Assignee: Chamikara Jayalath Randomize the reduced splits in BigtableIO so that multiple workers may not hit the same tablet server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
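As a rough illustration of the idea in BEAM-5272, the sketch below shuffles a key-ordered list of splits before it is handed out, so that workers picking up consecutive entries are less likely to read neighbouring key ranges at the same time. It is a generic Java sketch, not the BigtableIO change: `SplitShuffleSketch` is a hypothetical name and the splits are represented by an arbitrary element type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class SplitShuffleSketch {
    /** Returns a shuffled copy of the splits; the input list is left untouched. */
    static <T> List<T> randomized(List<T> orderedSplits, Random random) {
        List<T> copy = new ArrayList<>(orderedSplits);
        // Key-ordered neighbours tend to live on the same server;
        // shuffling spreads the initial reads across servers.
        Collections.shuffle(copy, random);
        return copy;
    }
}
```

Passing the `Random` in keeps the function deterministic under test while allowing a fresh generator in production.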
[beam] branch master updated (5720c1d -> 865b147)
This is an automated email from the ASF dual-hosted git repository. ccy pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 5720c1d [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution (#6287) add d21d328 Adding for_test utility function for state sampler add 7ead7b8 Addressing comments new 865b147 Merge pull request #6291 from pabloem/for-test-ss The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: sdks/python/apache_beam/runners/worker/statesampler.py | 6 ++ 1 file changed, 6 insertions(+)
[beam] 01/01: Merge pull request #6291 from pabloem/for-test-ss
This is an automated email from the ASF dual-hosted git repository. ccy pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 865b14781ed28bab80be8db49843f9d2d3d15527 Merge: 5720c1d 7ead7b8 Author: Charles Chen AuthorDate: Thu Aug 30 14:04:34 2018 -0700 Merge pull request #6291 from pabloem/for-test-ss Adding for_test utility function for state sampler sdks/python/apache_beam/runners/worker/statesampler.py | 6 ++ 1 file changed, 6 insertions(+)
[jira] [Created] (BEAM-5271) Support INSERT OVERWRITE Statement
Rui Wang created BEAM-5271:

Summary: Support INSERT OVERWRITE Statement
Key: BEAM-5271
URL: https://issues.apache.org/jira/browse/BEAM-5271
Project: Beam
Issue Type: Improvement
Components: dsl-sql
Reporter: Rui Wang

We can support INSERT OVERWRITE to insert into a table while overwriting the table's existing contents.
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139882 ]

ASF GitHub Bot logged work on BEAM-4495:

Author: ASF GitHub Bot
Created on: 30/Aug/18 20:32
Start Date: 30/Aug/18 20:32
Worklog Time Spent: 10m
Work Description: udim commented on a change in pull request #6282: [BEAM-4495] Website pre-commit job
URL: https://github.com/apache/beam/pull/6282#discussion_r214170949

## File path: website/Dockerfile ##
@@ -0,0 +1,33 @@
+###
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+###
+
+# This image contains Ruby and dependencies required to build and test the Beam
+# website. It is used by tasks in build.gradle.
+
+FROM ruby:2.5
+
+WORKDIR /ruby
+RUN gem install bundler
+# Update buildDockerImage's inputs.files if you change this list.
+ADD Gemfile Gemfile.lock /ruby/
+RUN bundle install --deployment --path $GEM_HOME
+
+# Required for website testing using HTMLProofer.
+ENV LC_ALL C.UTF-8
+
+CMD sleep 3600

Review comment:
Used by `bundle install`

Issue Time Tracking
Worklog Id: (was: 139882)
Time Spent: 7h 20m (was: 7h 10m)

> Create website pre-commits for apache/beam repository
>
> Key: BEAM-4495
> URL: https://issues.apache.org/jira/browse/BEAM-4495
> Project: Beam
> Issue Type: Sub-task
> Components: testing, website
> Reporter: Scott Wegner
> Assignee: Udi Meiri
> Priority: Major
> Labels: beam-site-automation-reliability
> Time Spent: 7h 20m
> Remaining Estimate: 0h
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139863=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139863 ]

ASF GitHub Bot logged work on BEAM-4495:

Author: ASF GitHub Bot
Created on: 30/Aug/18 19:56
Start Date: 30/Aug/18 19:56
Worklog Time Spent: 10m
Work Description: udim commented on a change in pull request #6282: [BEAM-4495] Website pre-commit job
URL: https://github.com/apache/beam/pull/6282#discussion_r214160845

## File path: website/Rakefile ##
@@ -3,16 +3,18 @@ require 'html-proofer'
 require 'etc'
 task :test do
-  FileUtils.rm_rf('./.testcontent')
-  sh "bundle exec jekyll build --config _config.yml,_config_test.yml"
-  HTMLProofer.check_directory("./.testcontent", {
+  HTMLProofer.check_directory("./content", {
     :typhoeus => { :timeout => 60, :connecttimeout => 40 },
     :allow_hash_href => true,
     :check_html => true,
     :file_ignore => [/javadoc/, /v2/, /pydoc/],
     :url_ignore => [
+        # Javadocs and Pydocs are only available on asf-site branch
+        /documentation\/sdks\/javadoc/,
+        /documentation\/sdks\/pydoc/,

Review comment:
I'm not sure yet where the generated docs will go. They will either go to apache/beam-site or be pushed directly to gitbox. TBD when I get responses from INFRA.
This bug tracks generation of Javadocs and Pydocs: https://issues.apache.org/jira/browse/BEAM-4498

Issue Time Tracking
Worklog Id: (was: 139863)
Time Spent: 7h 10m (was: 7h)

> Create website pre-commits for apache/beam repository
>
> Key: BEAM-4495
> URL: https://issues.apache.org/jira/browse/BEAM-4495
> Project: Beam
> Issue Type: Sub-task
> Components: testing, website
> Reporter: Scott Wegner
> Assignee: Udi Meiri
> Priority: Major
> Labels: beam-site-automation-reliability
> Time Spent: 7h 10m
> Remaining Estimate: 0h
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139862 ]

ASF GitHub Bot logged work on BEAM-4495:

Author: ASF GitHub Bot
Created on: 30/Aug/18 19:56
Start Date: 30/Aug/18 19:56
Worklog Time Spent: 10m
Work Description: udim commented on a change in pull request #6282: [BEAM-4495] Website pre-commit job
URL: https://github.com/apache/beam/pull/6282#discussion_r214159508

## File path: website/Dockerfile ##
@@ -0,0 +1,33 @@
+###
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+###
+
+# This image contains Ruby and dependencies required to build and test the Beam
+# website. It is used by tasks in build.gradle.
+
+FROM ruby:2.5
+
+WORKDIR /ruby
+RUN gem install bundler
+# Update buildDockerImage's inputs.files if you change this list.
+ADD Gemfile Gemfile.lock /ruby/
+RUN bundle install --deployment --path $GEM_HOME
+
+# Required for website testing using HTMLProofer.
+ENV LC_ALL C.UTF-8
+
+CMD sleep 3600

Review comment:
Yes, that's specified in https://github.com/apache/beam/blob/master/website/Gemfile.

Issue Time Tracking
Worklog Id: (was: 139862)
Time Spent: 7h 10m (was: 7h)

> Create website pre-commits for apache/beam repository
>
> Key: BEAM-4495
> URL: https://issues.apache.org/jira/browse/BEAM-4495
> Project: Beam
> Issue Type: Sub-task
> Components: testing, website
> Reporter: Scott Wegner
> Assignee: Udi Meiri
> Priority: Major
> Labels: beam-site-automation-reliability
> Time Spent: 7h 10m
> Remaining Estimate: 0h
[jira] [Commented] (BEAM-4498) Migrate release Javadocs / Pydocs to [asf-site] branch and update release guide
[ https://issues.apache.org/jira/browse/BEAM-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597857#comment-16597857 ] Udi Meiri commented on BEAM-4498: - Release script: https://github.com/apache/beam/blob/5720c1d22771a65ad5d7be6a06ad8aa0754fa64b/release/src/main/scripts/build_release_candidate.sh#L224 > Migrate release Javadocs / Pydocs to [asf-site] branch and update release > guide > --- > > Key: BEAM-4498 > URL: https://issues.apache.org/jira/browse/BEAM-4498 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Scott Wegner >Assignee: Scott Wegner >Priority: Major > Labels: beam-site-automation-reliability > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5270) Finish Python 3 porting for coders subpackage
Robbe created BEAM-5270: --- Summary: Finish Python 3 porting for coders subpackage Key: BEAM-5270 URL: https://issues.apache.org/jira/browse/BEAM-5270 Project: Beam Issue Type: Sub-task Components: sdk-py-core Reporter: Robbe Assignee: Robbe -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4511) Create a tox environment that uses Py3 interpreter for pre/post commit test suites, once codebase supports Py3.
[ https://issues.apache.org/jira/browse/BEAM-4511?focusedWorklogId=139839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139839 ] ASF GitHub Bot logged work on BEAM-4511: Author: ASF GitHub Bot Created on: 30/Aug/18 19:01 Start Date: 30/Aug/18 19:01 Worklog Time Spent: 10m Work Description: RobbeSneyders commented on issue #6266: [BEAM-4511] added py3 tox env for first test URL: https://github.com/apache/beam/pull/6266#issuecomment-417431322 I've added a change which also regenerates the proto files when gen_protos.py has been updated, which is needed to apply the new changes automatically. I can also confirm that having "py3" in the testenv name uses the default python 3 version on my machine. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139839) Time Spent: 40m (was: 0.5h) > Create a tox environment that uses Py3 interpreter for pre/post commit test > suites, once codebase supports Py3. > > > Key: BEAM-4511 > URL: https://issues.apache.org/jira/browse/BEAM-4511 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Matthias Feys >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5269) Create integration tests for BigQueryIORead pipeline
yifan zou created BEAM-5269: --- Summary: Create integration tests for BigQueryIORead pipeline Key: BEAM-5269 URL: https://issues.apache.org/jira/browse/BEAM-5269 Project: Beam Issue Type: Bug Components: testing Reporter: yifan zou Assignee: yifan zou -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[beam] branch master updated: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution (#6287)
This is an automated email from the ASF dual-hosted git repository. thw pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 5720c1d [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution (#6287) 5720c1d is described below commit 5720c1d22771a65ad5d7be6a06ad8aa0754fa64b Author: Maximilian Michels AuthorDate: Thu Aug 30 20:02:45 2018 +0200 [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution (#6287) --- .../control/DockerJobBundleFactory.java| 266 ++--- ...undleFactory.java => JobBundleFactoryBase.java} | 137 +++ .../control/ProcessJobBundleFactory.java | 84 +++ .../environment/ProcessEnvironment.java| 77 ++ .../environment/ProcessEnvironmentFactory.java | 157 .../fnexecution/environment/ProcessManager.java| 225 + .../control/ProcessJobBundleFactoryTest.java | 195 +++ .../environment/ProcessEnvironmentFactoryTest.java | 127 ++ .../environment/ProcessEnvironmentTest.java| 44 .../environment/ProcessManagerTest.java| 103 10 files changed, 1056 insertions(+), 359 deletions(-) diff --git a/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DockerJobBundleFactory.java b/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DockerJobBundleFactory.java index 1e7f48b..3178a2e 100644 --- a/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DockerJobBundleFactory.java +++ b/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DockerJobBundleFactory.java @@ -18,46 +18,19 @@ package org.apache.beam.runners.fnexecution.control; import com.google.common.annotations.VisibleForTesting; -import com.google.common.cache.CacheBuilder; -import com.google.common.cache.CacheLoader; -import com.google.common.cache.LoadingCache; -import com.google.common.cache.RemovalNotification; -import 
com.google.common.collect.ImmutableMap; -import com.google.common.collect.Iterables; import com.google.common.net.HostAndPort; -import java.io.IOException; -import java.util.Map; -import java.util.concurrent.ExecutorService; -import java.util.concurrent.Executors; import java.util.concurrent.atomic.AtomicInteger; import java.util.concurrent.atomic.AtomicReference; import javax.annotation.concurrent.ThreadSafe; -import org.apache.beam.model.fnexecution.v1.BeamFnApi.Target; -import org.apache.beam.model.pipeline.v1.RunnerApi.Environment; -import org.apache.beam.runners.core.construction.graph.ExecutableStage; -import org.apache.beam.runners.fnexecution.GrpcContextHeaderAccessorProvider; import org.apache.beam.runners.fnexecution.GrpcFnServer; import org.apache.beam.runners.fnexecution.ServerFactory; import org.apache.beam.runners.fnexecution.artifact.ArtifactRetrievalService; -import org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService; -import org.apache.beam.runners.fnexecution.control.ProcessBundleDescriptors.ExecutableProcessBundleDescriptor; -import org.apache.beam.runners.fnexecution.control.SdkHarnessClient.BundleProcessor; -import org.apache.beam.runners.fnexecution.data.GrpcDataService; import org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory; import org.apache.beam.runners.fnexecution.environment.EnvironmentFactory; -import org.apache.beam.runners.fnexecution.environment.RemoteEnvironment; import org.apache.beam.runners.fnexecution.logging.GrpcLoggingService; -import org.apache.beam.runners.fnexecution.logging.Slf4jLogWriter; import org.apache.beam.runners.fnexecution.provisioning.JobInfo; import org.apache.beam.runners.fnexecution.provisioning.StaticGrpcProvisionService; -import org.apache.beam.runners.fnexecution.state.GrpcStateService; -import org.apache.beam.runners.fnexecution.state.StateRequestHandler; -import org.apache.beam.sdk.coders.Coder; import org.apache.beam.sdk.fn.IdGenerator; -import 
org.apache.beam.sdk.fn.IdGenerators; -import org.apache.beam.sdk.fn.data.FnDataReceiver; -import org.apache.beam.sdk.fn.stream.OutboundObserverFactory; -import org.apache.beam.sdk.util.WindowedValue; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -67,7 +40,7 @@ import org.slf4j.LoggerFactory; * thread-safe. Instead, a new stage factory should be created for each client. */ @ThreadSafe -public class DockerJobBundleFactory implements JobBundleFactory { +public class DockerJobBundleFactory extends JobBundleFactoryBase { private static final Logger LOG = LoggerFactory.getLogger(DockerJobBundleFactory.class); // Port offset for MacOS since we don't have host networking and need to use published ports @@ -77,7 +50,7 @@ public class DockerJobBundleFactory implements
[jira] [Assigned] (BEAM-5250) Python Wordcount fails with Flink portable streaming
[ https://issues.apache.org/jira/browse/BEAM-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels reassigned BEAM-5250: Assignee: Maximilian Michels > Python Wordcount fails with Flink portable streaming > > > Key: BEAM-5250 > URL: https://issues.apache.org/jira/browse/BEAM-5250 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Assignee: Maximilian Michels >Priority: Major > Labels: portability > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139800=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139800 ]

ASF GitHub Bot logged work on BEAM-5187:

Author: ASF GitHub Bot
Created on: 30/Aug/18 17:50
Start Date: 30/Aug/18 17:50
Worklog Time Spent: 10m
Work Description: mxm commented on issue #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution
URL: https://github.com/apache/beam/pull/6287#issuecomment-417408812

@tweise Fixed.

Issue Time Tracking
Worklog Id: (was: 139800)
Time Spent: 6h 40m (was: 6.5h)

> Create a ProcessJobBundleFactory for non-dockerized SDK harness
>
> Key: BEAM-5187
> URL: https://issues.apache.org/jira/browse/BEAM-5187
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Minor
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> As discussed on the mailing list [1], we want to give users an option to
> execute portable pipelines without Docker. Analogous to the
> {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to
> directly fork SDK harness processes.
> Artifacts will be provided by an artifact directory or could be set up
> similarly to the existing bootstrapping code ("boot.go") which we use for
> containers.
> The process-based execution can optionally be configured via the pipeline
> options.
> [1]
> [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E]
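"Directly fork SDK harness processes" can be pictured with the JDK's `ProcessBuilder`. The sketch below only prepares the command and environment for a child process. It is not the `ProcessJobBundleFactory` from the pull request: `HarnessProcessSketch` is a hypothetical name, and the executable path, argument and environment variable shown in the usage note are made-up placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class HarnessProcessSketch {
    /** Prepares, but does not start, a child process for a harness executable. */
    static ProcessBuilder harnessCommand(
            String executable, List<String> args, Map<String, String> env) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        command.addAll(args);
        ProcessBuilder builder = new ProcessBuilder(command);
        // Extra variables are added on top of the inherited environment.
        builder.environment().putAll(env);
        // Merge stderr into stdout so one stream carries all harness output.
        builder.redirectErrorStream(true);
        return builder;
    }
}
```

A caller would then invoke `start()` on the returned builder, for example after `harnessCommand("/opt/harness/boot", Arrays.asList("--id=1"), Collections.singletonMap("HARNESS_LOG_LEVEL", "INFO"))`, and keep the resulting `Process` handle so the child can be stopped when the job ends.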
[jira] [Assigned] (BEAM-5268) Ardagan's test ticket
[ https://issues.apache.org/jira/browse/BEAM-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin reassigned BEAM-5268: --- Assignee: Mikhail Gryzykhin > Ardagan's test ticket > -- > > Key: BEAM-5268 > URL: https://issues.apache.org/jira/browse/BEAM-5268 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5268) Ardagan's test ticket
Mikhail Gryzykhin created BEAM-5268: --- Summary: Ardagan's test ticket Key: BEAM-5268 URL: https://issues.apache.org/jira/browse/BEAM-5268 Project: Beam Issue Type: Bug Components: test-failures Reporter: Mikhail Gryzykhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5214) Update Java quickstart to use maven
[ https://issues.apache.org/jira/browse/BEAM-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597670#comment-16597670 ] Bruce Arctor commented on BEAM-5214: Should this be 'Update Java quickstart to use gradle'??? > Update Java quickstart to use maven > --- > > Key: BEAM-5214 > URL: https://issues.apache.org/jira/browse/BEAM-5214 > Project: Beam > Issue Type: Bug > Components: examples-java, website >Reporter: Robert Bradshaw >Assignee: Reuven Lax >Priority: Major > > The existing quickstart still uses mvn commands. > https://beam.apache.org/get-started/quickstart-java/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139788=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139788 ] ASF GitHub Bot logged work on BEAM-5187: Author: ASF GitHub Bot Created on: 30/Aug/18 16:52 Start Date: 30/Aug/18 16:52 Worklog Time Spent: 10m Work Description: tweise commented on issue #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution URL: https://github.com/apache/beam/pull/6287#issuecomment-417389707 Please run: `./gradlew :beam-runners-java-fn-execution:check` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139788) Time Spent: 6.5h (was: 6h 20m) > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 6.5h > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similar > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options. > [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139785=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139785 ] ASF GitHub Bot logged work on BEAM-4495: Author: ASF GitHub Bot Created on: 30/Aug/18 16:43 Start Date: 30/Aug/18 16:43 Worklog Time Spent: 10m Work Description: pabloem commented on a change in pull request #6282: [BEAM-4495] Website pre-commit job URL: https://github.com/apache/beam/pull/6282#discussion_r214101911 ## File path: website/Rakefile ## @@ -3,16 +3,18 @@ require 'html-proofer' require 'etc' task :test do - FileUtils.rm_rf('./.testcontent') - sh "bundle exec jekyll build --config _config.yml,_config_test.yml" - HTMLProofer.check_directory("./.testcontent", { + HTMLProofer.check_directory("./content", { Review comment: Thank you This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139785) Time Spent: 7h (was: 6h 50m) > Create website pre-commits for apache/beam repository > - > > Key: BEAM-4495 > URL: https://issues.apache.org/jira/browse/BEAM-4495 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Scott Wegner >Assignee: Udi Meiri >Priority: Major > Labels: beam-site-automation-reliability > Time Spent: 7h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139782=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139782 ] ASF GitHub Bot logged work on BEAM-4495: Author: ASF GitHub Bot Created on: 30/Aug/18 16:43 Start Date: 30/Aug/18 16:43 Worklog Time Spent: 10m Work Description: pabloem commented on a change in pull request #6282: [BEAM-4495] Website pre-commit job URL: https://github.com/apache/beam/pull/6282#discussion_r214101672 ## File path: website/Dockerfile ## @@ -0,0 +1,33 @@ +### +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +### + +# This image contains Ruby and dependencies required to build and test the Beam +# website. It is used by tasks in build.gradle. + +FROM ruby:2.5 + +WORKDIR /ruby +RUN gem install bundler +# Update buildDockerImage's inputs.files if you change this list. +ADD Gemfile Gemfile.lock /ruby/ +RUN bundle install --deployment --path $GEM_HOME + +# Required for website testing using HTMLProofer. +ENV LC_ALL C.UTF-8 + +CMD sleep 3600 Review comment: the other way around, does the ruby container have jekyll? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139782) Time Spent: 6.5h (was: 6h 20m) > Create website pre-commits for apache/beam repository > - > > Key: BEAM-4495 > URL: https://issues.apache.org/jira/browse/BEAM-4495 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Scott Wegner >Assignee: Udi Meiri >Priority: Major > Labels: beam-site-automation-reliability > Time Spent: 6.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139783=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139783 ] ASF GitHub Bot logged work on BEAM-4495: Author: ASF GitHub Bot Created on: 30/Aug/18 16:43 Start Date: 30/Aug/18 16:43 Worklog Time Spent: 10m Work Description: pabloem commented on a change in pull request #6282: [BEAM-4495] Website pre-commit job URL: https://github.com/apache/beam/pull/6282#discussion_r214101740 ## File path: website/build.gradle ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// Define common lifecycle tasks and artifact types +apply plugin: "base" + +def dockerImageTag = 'beam-website' +def dockerWorkDir = "/repo" +def buildDir = "$project.rootDir/build/website" + +task buildDockerImage(type: Exec) { + inputs.files 'Gemfile', 'Gemfile.lock' + commandLine 'docker', 'build', '-t', dockerImageTag, '.' 
+} + +task createDockerContainer(type: Exec) { + dependsOn buildDockerImage + standardOutput = new ByteArrayOutputStream() + ext.containerId = { +return standardOutput.toString().trim() + } + commandLine '/bin/bash', '-c', +"docker create -v $project.rootDir:$dockerWorkDir -u \$(id -u):\$(id -g) $dockerImageTag" +} + +task startDockerContainer(type: Exec) { + dependsOn createDockerContainer + ext.containerId = { +return createDockerContainer.containerId() + } + commandLine 'docker', 'start', +"${->createDockerContainer.containerId()}" // Lazily evaluate containerId. +} + +task stopAndRemoveDockerContainer(type: Exec) { + commandLine 'docker', 'rm', '-f', "${->createDockerContainer.containerId()}" +} + +task setupBuildDir(type: Copy) { + from('.') { +include 'Gemfile*' +include 'Rakefile' + } + into buildDir +} + +task cleanWebsite(type: Delete) { + delete buildDir +} +clean.dependsOn cleanWebsite + +task buildWebsite(type: Exec) { + dependsOn startDockerContainer, setupBuildDir + finalizedBy stopAndRemoveDockerContainer + inputs.files 'Gemfile.lock', '_config.yml' + inputs.dir 'src' + outputs.dir "$buildDir/.sass-cache" + outputs.dir "$buildDir/content" + commandLine 'docker', 'exec', '-w', "$dockerWorkDir/build/website", +"${->startDockerContainer.containerId()}", '/bin/bash', '-c', +"""bundle exec jekyll build \ + --config $dockerWorkDir/website/_config.yml \ + --incremental \ + --source $dockerWorkDir/website/src + """ +} +build.dependsOn buildWebsite + +task testWebsite(type: Exec) { + dependsOn startDockerContainer, buildWebsite Review comment: Fair enough : ) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139783) Time Spent: 6h 40m (was: 6.5h) > Create website pre-commits for apache/beam repository > - > > Key: BEAM-4495 > URL: https://issues.apache.org/jira/browse/BEAM-4495 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Scott Wegner >Assignee: Udi Meiri >Priority: Major > Labels: beam-site-automation-reliability > Time Spent: 6h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4495) Create website pre-commits for apache/beam repository
[ https://issues.apache.org/jira/browse/BEAM-4495?focusedWorklogId=139784=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139784 ] ASF GitHub Bot logged work on BEAM-4495: Author: ASF GitHub Bot Created on: 30/Aug/18 16:43 Start Date: 30/Aug/18 16:43 Worklog Time Spent: 10m Work Description: pabloem commented on a change in pull request #6282: [BEAM-4495] Website pre-commit job URL: https://github.com/apache/beam/pull/6282#discussion_r214101823 ## File path: website/Rakefile ## @@ -3,16 +3,18 @@ require 'html-proofer' require 'etc' Review comment: fair enough. Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139784) Time Spent: 6h 50m (was: 6h 40m) > Create website pre-commits for apache/beam repository > - > > Key: BEAM-4495 > URL: https://issues.apache.org/jira/browse/BEAM-4495 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Scott Wegner >Assignee: Udi Meiri >Priority: Major > Labels: beam-site-automation-reliability > Time Spent: 6h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[beam] 01/01: Merge pull request #6302 from boyuanzz/add_new_dependency
This is an automated email from the ASF dual-hosted git repository. pabloem pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit f469471cd850eba77fe7b8e4d3e385a6fb75416c Merge: 6497b0b 71491db Author: Pablo AuthorDate: Thu Aug 30 09:41:49 2018 -0700 Merge pull request #6302 from boyuanzz/add_new_dependency Add boyuanzz as a owner of java powermock dependency ownership/JAVA_DEPENDENCY_OWNERS.yaml | 5 + 1 file changed, 5 insertions(+)
[beam] branch master updated (6497b0b -> f469471)
This is an automated email from the ASF dual-hosted git repository. pabloem pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 6497b0b Merge pull request #6300 from apilloud/index add 71491db Add boyuanzz as a owner of powermock deps new f469471 Merge pull request #6302 from boyuanzz/add_new_dependency The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: ownership/JAVA_DEPENDENCY_OWNERS.yaml | 5 + 1 file changed, 5 insertions(+)
[jira] [Assigned] (BEAM-4819) Make portable Flink runner JobBundleFactory configurable
[ https://issues.apache.org/jira/browse/BEAM-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise reassigned BEAM-4819: -- Assignee: Maximilian Michels (was: Thomas Weise) > Make portable Flink runner JobBundleFactory configurable > > > Key: BEAM-4819 > URL: https://issues.apache.org/jira/browse/BEAM-4819 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Assignee: Maximilian Michels >Priority: Major > Labels: portability > > BEAM-4791 introduces factory override for testing, expand that to allow users > to configure a different factory via service loader to adopt alternative > execution environments. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
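A minimal sketch of the service-loader lookup that BEAM-4819 asks for, written against the JDK only. The `BundleFactory` interface and `DockerBundleFactory` class are hypothetical stand-ins for illustration and are not Beam's actual `JobBundleFactory` types; the fallback-to-default policy is likewise an assumption.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

/** Illustrative only: hypothetical types, not the Beam runner API. */
final class FactoryLookupSketch {

  /** Hypothetical stand-in for a bundle factory abstraction. */
  interface BundleFactory {
    String environment();
  }

  /** Hypothetical default used when no provider is registered. */
  static final class DockerBundleFactory implements BundleFactory {
    @Override
    public String environment() {
      return "docker";
    }
  }

  /**
   * Returns the first factory registered via META-INF/services, or the
   * default when the classpath registers none.
   */
  static BundleFactory load() {
    Iterator<BundleFactory> providers = ServiceLoader.load(BundleFactory.class).iterator();
    return providers.hasNext() ? providers.next() : new DockerBundleFactory();
  }

  public static void main(String[] args) {
    System.out.println("selected environment: " + load().environment());
  }
}
```

A user would opt in to an alternative execution environment by shipping a jar whose `META-INF/services` file names their own implementation; with no such file present, the lookup falls back to the default.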
[jira] [Work logged] (BEAM-5124) Write Euphoria in Beam documentation
[ https://issues.apache.org/jira/browse/BEAM-5124?focusedWorklogId=139770=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139770 ] ASF GitHub Bot logged work on BEAM-5124: Author: ASF GitHub Bot Created on: 30/Aug/18 16:27 Start Date: 30/Aug/18 16:27 Worklog Time Spent: 10m Work Description: pabloem commented on issue #540: [BEAM-5124] DSL Euphoria documentation update URL: https://github.com/apache/beam-site/pull/540#issuecomment-417381391 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139770) Time Spent: 1h 40m (was: 1.5h) > Write Euphoria in Beam documentation > > > Key: BEAM-5124 > URL: https://issues.apache.org/jira/browse/BEAM-5124 > Project: Beam > Issue Type: Sub-task > Components: dsl-euphoria >Reporter: Vaclav Plajt >Assignee: Vaclav Plajt >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3193) CoGroupByKey doesn't work in streaming mode
[ https://issues.apache.org/jira/browse/BEAM-3193?focusedWorklogId=139748=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139748 ] ASF GitHub Bot logged work on BEAM-3193: Author: ASF GitHub Bot Created on: 30/Aug/18 15:38 Start Date: 30/Aug/18 15:38 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on a change in pull request #5945: [BEAM-3193] Add SparkCoGroupByKeyStreaming validates runner to test CoGroupByKay bahavior in streaming mode on spark runner URL: https://github.com/apache/beam/pull/5945#discussion_r214079472 ## File path: runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/SparkCoGroupByKeyStreamingTest.java ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.beam.runners.spark.translation.streaming; + +import static org.hamcrest.Matchers.containsInAnyOrder; +import static org.junit.Assert.assertThat; + +import org.apache.beam.runners.spark.ReuseSparkContextRule; +import org.apache.beam.runners.spark.SparkPipelineOptions; +import org.apache.beam.runners.spark.StreamingTest; +import org.apache.beam.runners.spark.io.CreateStream; +import org.apache.beam.sdk.coders.KvCoder; +import org.apache.beam.sdk.coders.VarIntCoder; +import org.apache.beam.sdk.testing.PAssert; +import org.apache.beam.sdk.testing.TestPipeline; +import org.apache.beam.sdk.transforms.SerializableFunction; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.transforms.windowing.FixedWindows; +import org.apache.beam.sdk.transforms.windowing.Window; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.TimestampedValue; +import org.apache.beam.sdk.values.TupleTag; +import org.joda.time.Duration; +import org.joda.time.Instant; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A test that verifies that CoGroupByKey works in streaming mode in spark runner. */ +public class SparkCoGroupByKeyStreamingTest { + + private static final Logger LOG = LoggerFactory.getLogger(SparkCoGroupByKeyStreamingTest.class); Review comment: I think `LOG` is never used after, should be removed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139748) Time Spent: 1h (was: 50m) > CoGroupByKey doesn't work in streaming mode > --- > > Key: BEAM-3193 > URL: https://issues.apache.org/jira/browse/BEAM-3193 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Jean-Baptiste Onofré >Assignee: Etienne Chauchot >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > The CoGroupByKey PTransform doesn't throw an exception but doesn't actually > perform the grouping when used in streaming mode. I will attach a test > pipeline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3193) CoGroupByKey doesn't work in streaming mode
[ https://issues.apache.org/jira/browse/BEAM-3193?focusedWorklogId=139749=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139749 ] ASF GitHub Bot logged work on BEAM-3193: Author: ASF GitHub Bot Created on: 30/Aug/18 15:38 Start Date: 30/Aug/18 15:38 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on a change in pull request #5945: [BEAM-3193] Add SparkCoGroupByKeyStreaming validates runner to test CoGroupByKay bahavior in streaming mode on spark runner URL: https://github.com/apache/beam/pull/5945#discussion_r214079123 ## File path: runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/SparkCoGroupByKeyStreamingTest.java ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.beam.runners.spark.translation.streaming; + +import static org.hamcrest.Matchers.containsInAnyOrder; +import static org.junit.Assert.assertThat; + +import org.apache.beam.runners.spark.ReuseSparkContextRule; +import org.apache.beam.runners.spark.SparkPipelineOptions; +import org.apache.beam.runners.spark.StreamingTest; +import org.apache.beam.runners.spark.io.CreateStream; +import org.apache.beam.sdk.coders.KvCoder; +import org.apache.beam.sdk.coders.VarIntCoder; +import org.apache.beam.sdk.testing.PAssert; +import org.apache.beam.sdk.testing.TestPipeline; +import org.apache.beam.sdk.transforms.SerializableFunction; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.transforms.windowing.FixedWindows; +import org.apache.beam.sdk.transforms.windowing.Window; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.TimestampedValue; +import org.apache.beam.sdk.values.TupleTag; +import org.joda.time.Duration; +import org.joda.time.Instant; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A test that verifies that CoGroupByKey works in streaming mode in spark runner. 
*/ +public class SparkCoGroupByKeyStreamingTest { + + private static final Logger LOG = LoggerFactory.getLogger(SparkCoGroupByKeyStreamingTest.class); + private static final TupleTag INPUT1_TAG = new TupleTag<>("input1"); + private static final TupleTag INPUT2_TAG = new TupleTag<>("input2"); + + @Rule public final transient ReuseSparkContextRule noContextResue = ReuseSparkContextRule.no(); + + @Rule public final TestPipeline pipeline = TestPipeline.create(); + + private Duration batchDuration() { +return Duration.millis( + (pipeline.getOptions().as(SparkPipelineOptions.class)).getBatchIntervalMillis()); + } + + @Category(StreamingTest.class) + @Test + public void testInStreamingMode() throws Exception { +Instant instant = new Instant(0); +CreateStream> source1 = +CreateStream.of(KvCoder.of(VarIntCoder.of(), VarIntCoder.of()), batchDuration()) +.emptyBatch() +.advanceWatermarkForNextBatch(instant) +.nextBatch( +TimestampedValue.of(KV.of(1, 1), instant), +TimestampedValue.of(KV.of(1, 2), instant), +TimestampedValue.of(KV.of(1, 3), instant)) + .advanceWatermarkForNextBatch(instant.plus(Duration.standardSeconds(1L))) +.nextBatch( +TimestampedValue.of(KV.of(2, 4), instant.plus(Duration.standardSeconds(1L))), +TimestampedValue.of(KV.of(2, 5), instant.plus(Duration.standardSeconds(1L))), +TimestampedValue.of(KV.of(2, 6), instant.plus(Duration.standardSeconds(1L +.advanceNextBatchWatermarkToInfinity(); + +CreateStream> source2 = +CreateStream.of(KvCoder.of(VarIntCoder.of(), VarIntCoder.of()), batchDuration()) +.emptyBatch() +.advanceWatermarkForNextBatch(instant) +.nextBatch( +
[jira] [Work logged] (BEAM-5062) Add ability to configure S3ClientOptions
[ https://issues.apache.org/jira/browse/BEAM-5062?focusedWorklogId=139738=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139738 ] ASF GitHub Bot logged work on BEAM-5062: Author: ASF GitHub Bot Created on: 30/Aug/18 15:23 Start Date: 30/Aug/18 15:23 Worklog Time Spent: 10m Work Description: iemejia commented on issue #6122: [BEAM-5062] Add ability to provide custom S3ClientOptions URL: https://github.com/apache/beam/pull/6122#issuecomment-417359457 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139738) Time Spent: 2h 40m (was: 2.5h) > Add ability to configure S3ClientOptions > > > Key: BEAM-5062 > URL: https://issues.apache.org/jira/browse/BEAM-5062 > Project: Beam > Issue Type: Improvement > Components: io-java-aws >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 2h 40m > Remaining Estimate: 0h > > It would be very useful to be able to configure > [S3ClientOptions|https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/S3ClientOptions.html] > for Apache Beam jobs. > For example, some implementations of S3 do not support > virtual-hosted-style URLs for buckets, only path-style. Currently it is > impossible to enable path-style access for the Amazon S3 client used by > an Apache Beam job. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()
[ https://issues.apache.org/jira/browse/BEAM-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597563#comment-16597563 ] Reuven Lax commented on BEAM-5036: -- Once you get to the rename step, the set of files to rename should be deterministic. This isn't currently true for the Flink runner (it is for Dataflow) because support for @RequiresStableInput is not yet fully implemented there; without stable input to the rename step, many things can go wrong. The Flink implementation of stable input will block the rename step from executing until the snapshot is finalized, which means that a rollback will only roll back that far and not regenerate new output files. This does work in the current Spark runner (I believe) by forcing an RDD checkpoint. Of course, if the user manually reruns a pipeline, this can happen. On Thu, Aug 30, 2018 at 7:11 AM Tim Robertson (JIRA) > Optimize FileBasedSink's WriteOperation.moveToOutput() > -- > > Key: BEAM-5036 > URL: https://issues.apache.org/jira/browse/BEAM-5036 > Project: Beam > Issue Type: Improvement > Components: io-java-files >Affects Versions: 2.5.0 >Reporter: Jozef Vilcek >Assignee: Tim Robertson >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > The moveToOutput() methods in FileBasedSink.WriteOperation implement move by > copy+delete. It would be better to use rename(), which can be much more > efficient on some filesystems. > The filesystem must support cross-directory rename. BEAM-4861 is related to this > for the case of the HDFS filesystem. > The feature was discussed here: > http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
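The rename-versus-copy+delete trade-off in the BEAM-5036 description can be sketched with the JDK's local filesystem API. This is an illustration under stated assumptions, not Beam's `FileBasedSink` code: `MoveToOutputSketch` and `move` are hypothetical names, and `java.nio.file` stands in for Beam's own filesystem abstraction.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative only: hypothetical helper, not part of Beam. */
final class MoveToOutputSketch {

  /**
   * Moves src to dst, preferring a single rename and falling back to
   * copy+delete when the filesystem cannot rename atomically (for
   * example, across file stores). Returns which strategy was used.
   */
  static String move(Path src, Path dst) throws IOException {
    try {
      Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
      return "rename";
    } catch (AtomicMoveNotSupportedException e) {
      // Fallback: touches every byte, and is not atomic.
      Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
      Files.delete(src);
      return "copy+delete";
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("move-sketch");
    Path tmp = Files.createDirectory(dir.resolve("tmp")).resolve("part-0");
    Path out = Files.createDirectory(dir.resolve("out")).resolve("part-0");
    Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
    System.out.println(move(tmp, out) + ", output exists: " + Files.exists(out));
  }
}
```

The temporary and final paths live in different directories on purpose, since the issue notes that the filesystem must support cross-directory rename.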
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139736=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139736 ] ASF GitHub Bot logged work on BEAM-5187: Author: ASF GitHub Bot Created on: 30/Aug/18 15:13 Start Date: 30/Aug/18 15:13 Worklog Time Spent: 10m Work Description: mxm commented on issue #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution URL: https://github.com/apache/beam/pull/6287#issuecomment-417355811 @tweise I made the ShutdownHook gracefully stop the processes, followed by a kill if there are still running processes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139736) Time Spent: 6h 20m (was: 6h 10m) > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 6h 20m > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similar > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options. > [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
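The shutdown behaviour discussed on this pull request (a graceful stop, then a kill for anything still running, child processes included) might look roughly like the sketch below. It is not the code from PR #6287: `GracefulStopSketch` and `stop` are hypothetical names, the `ProcessHandle` API assumes Java 9 or later, and the demo in `main` assumes a POSIX `sleep` command is on the path.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: hypothetical helper, requires Java 9+. */
final class GracefulStopSketch {

  /**
   * Asks a process and its descendants to exit, waits up to graceMillis,
   * then force-kills whatever is still alive.
   *
   * @return true if the top-level process exited within the grace period
   */
  static boolean stop(Process process, long graceMillis) throws InterruptedException {
    // Snapshot the tree first: once the parent exits, its children may
    // no longer be reachable through it (the bash -> python case).
    List<ProcessHandle> tree =
        Stream.concat(process.descendants(), Stream.of(process.toHandle()))
            .collect(Collectors.toList());

    // Polite request (normal termination where the platform supports it).
    tree.forEach(ProcessHandle::destroy);
    boolean exitedInTime = process.waitFor(graceMillis, TimeUnit.MILLISECONDS);

    // The hammer: forcibly kill anything in the tree that is still alive.
    tree.stream().filter(ProcessHandle::isAlive).forEach(ProcessHandle::destroyForcibly);
    process.waitFor();
    return exitedInTime;
  }

  public static void main(String[] args) throws IOException, InterruptedException {
    Process harness = new ProcessBuilder("sleep", "30").start();
    boolean graceful = stop(harness, 2_000);
    System.out.println("exited within grace period: " + graceful);
  }
}
```

In a real runner the call to `stop` would sit inside a thread registered with `Runtime.getRuntime().addShutdownHook(...)`, with a short grace period so that JVM shutdown is not delayed for long.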
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139713=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139713 ] ASF GitHub Bot logged work on BEAM-5187: Author: ASF GitHub Bot Created on: 30/Aug/18 14:45 Start Date: 30/Aug/18 14:45 Worklog Time Spent: 10m Work Description: tweise commented on issue #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution URL: https://github.com/apache/beam/pull/6287#issuecomment-417345794 thanks! One more observation: The shutdown hook kills the top level process, but not its child process(es). For bash -> python, only bash will be killed. Maybe we can add a (very small) graceful termination period before taking out the hammer? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139713) Time Spent: 6h 10m (was: 6h) > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 6h 10m > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similar > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options.
> [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E]
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139697=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139697 ] ASF GitHub Bot logged work on BEAM-5187: Author: ASF GitHub Bot Created on: 30/Aug/18 14:25 Start Date: 30/Aug/18 14:25 Worklog Time Spent: 10m Work Description: mxm commented on issue #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution URL: https://github.com/apache/beam/pull/6287#issuecomment-417338622 @tweise sure, done. Issue Time Tracking --- Worklog Id: (was: 139697) Time Spent: 6h (was: 5h 50m) > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core > Reporter: Maximilian Michels > Assignee: Maximilian Michels > Priority: Minor > Time Spent: 6h > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similarly > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options. > [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E]
[jira] [Work logged] (BEAM-3912) Add support of HadoopOutputFormatIO
[ https://issues.apache.org/jira/browse/BEAM-3912?focusedWorklogId=139693=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139693 ] ASF GitHub Bot logged work on BEAM-3912: Author: ASF GitHub Bot Created on: 30/Aug/18 14:21 Start Date: 30/Aug/18 14:21 Worklog Time Spent: 10m Work Description: timrobertson100 commented on a change in pull request #6306: [BEAM-3912] Add HadoopOutputFormatIO support URL: https://github.com/apache/beam/pull/6306#discussion_r214049997 ## File path: sdks/java/io/hadoop-output-format/build.gradle ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +apply plugin: org.apache.beam.gradle.BeamModulePlugin +applyJavaNature() +provideIntegrationTestingDependencies() +enableJavaPerformanceTesting() + +description = "Apache Beam :: SDKs :: Java :: IO :: Hadoop Output Format" +ext.summary = "IO to write data to sinks that implement Hadoop Output Format." 
+ +def log4j_version = "2.6.2" +def elastic_search_version = "5.0.0" +// Migrate to using a version of the driver compatible with Guava 20 +def cassandra_driver = "3.2.0" + +// Ban dependencies from the test runtime classpath +configurations.testRuntimeClasspath { + // Ban hive-exec and mesos since they bundle protobuf without repackaging + exclude group: "org.apache.hive", module: "hive-exec" + exclude group: "org.apache.mesos", module: "mesos" + // Prevent a StackOverflow because of wiring LOG4J -> SLF4J -> LOG4J + exclude group: "org.slf4j", module: "log4j-over-slf4j" +} + +dependencies { + shadow project(path: ":beam-sdks-java-core", configuration: "shadow") + compile library.java.guava + shadow library.java.slf4j_api + shadow project(path: ":beam-sdks-java-io-hadoop-common", configuration: "shadow") + provided library.java.hadoop_common + provided library.java.hadoop_mapreduce_client_core + testCompile project(path: ":beam-runners-direct-java", configuration: "shadow") + testCompile project(path: ":beam-sdks-java-core", configuration: "shadowTest") + testCompile project(path: ":beam-sdks-java-io-common", configuration: "shadow") + testCompile project(path: ":beam-sdks-java-io-common", configuration: "shadowTest") + testCompile "io.netty:netty-transport-native-epoll:4.1.0.CR3" Review comment: Nit: inlined version This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139693) Time Spent: 50m (was: 40m) > Add support of HadoopOutputFormatIO > --- > > Key: BEAM-3912 > URL: https://issues.apache.org/jira/browse/BEAM-3912 > Project: Beam > Issue Type: Improvement > Components: io-java-hadoop >Reporter: Alexey Romanenko >Assignee: Alexey Romanenko >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > For the moment, there is only HadoopInputFormatIO in Beam. To provide a > support of different writing IOs, that are not yet natively supported in Beam > (for example, Apache Orc or HBase bulk load), it would make sense to add > HadoopOutputFormatIO as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139690=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139690 ] ASF GitHub Bot logged work on BEAM-5187: Author: ASF GitHub Bot Created on: 30/Aug/18 14:16 Start Date: 30/Aug/18 14:16 Worklog Time Spent: 10m Work Description: tweise commented on issue #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution URL: https://github.com/apache/beam/pull/6287#issuecomment-417335281 @mxm I rebased my branch: https://github.com/tweise/beam/commit/c64add1a4eb2e1a4ae818e7891516a5c57ef1fe1 Can you add the changes to ProcessManager and DockerJobBundleFactory to your PR? Issue Time Tracking --- Worklog Id: (was: 139690) Time Spent: 5h 50m (was: 5h 40m) > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core > Reporter: Maximilian Michels > Assignee: Maximilian Michels > Priority: Minor > Time Spent: 5h 50m > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similarly > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options.
> [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E]
[jira] [Commented] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()
[ https://issues.apache.org/jira/browse/BEAM-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597502#comment-16597502 ] Tim Robertson commented on BEAM-5036: - Thanks to everyone for contributing to this. [~JozoVilcek] I've come to a similar conclusion overnight and think we need to do one of:
# surface {{FileAlreadyExistsException}} as well as {{FileNotFoundException}} from {{FileSystem.rename()}} and let the caller decide (here I presume we would opt to overwrite by deleting the target only if the source still exists and then retry)
# document and implement that {{FileSystem.rename()}} will always replace existing files for all filesystems
# expose a {{forceOverwrite}} flag / option and use it here
I propose we should open a separate issue to explore optimising rename for Gcs. I had simply overlooked the rewrite option (sorry, I am not all that familiar with Gcs). I still have some concern about rewriting output files that already exist though. Isn't it the case that if "run 1" produced 45 avro file parts but for some reason "run 2" split differently and produced 43 file parts, anything using a glob on the directory would get incorrect data (i.e. the addition of 2 parts from run 1)? This would be relevant for bounded pipelines, but possibly also for a restart / recovery of a streaming scenario? > Optimize FileBasedSink's WriteOperation.moveToOutput() > -- > > Key: BEAM-5036 > URL: https://issues.apache.org/jira/browse/BEAM-5036 > Project: Beam > Issue Type: Improvement > Components: io-java-files > Affects Versions: 2.5.0 > Reporter: Jozef Vilcek > Assignee: Tim Robertson > Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > The moveToOutput() method in FileBasedSink.WriteOperation implements move by > copy+delete. It would be better to use a rename(), which can be much more > efficient for some filesystems. > The filesystem must support cross-directory rename. BEAM-4861 is related to this > for the case of the HDFS filesystem.
> Feature was discussed here: > http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E
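The first option in the comment above (surface the exceptions and let the caller decide) can be illustrated with plain `java.nio.file`, used here only as a stand-in for Beam's `FileSystem.rename()`. `RenameSketch` and `moveOverwriting` are hypothetical names, and treating "source gone, target present" as an already-completed move is an assumption made for the retry case, not something the thread settles.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical caller-side policy; java.nio stands in for Beam's FileSystem API. */
final class RenameSketch {

  /** Moves src to dst, replacing dst only while src still exists. */
  static void moveOverwriting(Path src, Path dst) throws IOException {
    try {
      // Without REPLACE_EXISTING this fails if dst is already there.
      Files.move(src, dst);
    } catch (FileAlreadyExistsException e) {
      // Overwrite by deleting the target, but only if the source is still present.
      if (Files.exists(src)) {
        Files.delete(dst);
        Files.move(src, dst);
      }
    } catch (NoSuchFileException e) {
      // Assumption: a missing source with an existing target means an earlier
      // attempt already completed the move, so a retry is a no-op.
      if (!Files.exists(dst)) {
        throw e;
      }
    }
  }
}
```

This only covers a single file. The stale-parts concern raised in the same comment (45 parts from one run, 43 from the next) is a directory-level problem that a per-file policy like this does not address.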
[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x
[ https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5267: - Priority: Major (was: Minor) > Update Flink Runner to Flink 1.6.x > -- > > Key: BEAM-5267 > URL: https://issues.apache.org/jira/browse/BEAM-5267 > Project: Beam > Issue Type: Improvement > Components: runner-flink > Reporter: Maximilian Michels > Assignee: Maximilian Michels > Priority: Major > Fix For: 2.8.0 > > > For the next release, the Flink version should be bumped. As changes for > 2.7.0 are already frozen, it is going to be 2.8.0.
[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x
[ https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5267: - Fix Version/s: 2.8.0 > Update Flink Runner to Flink 1.6.x > -- > > Key: BEAM-5267 > URL: https://issues.apache.org/jira/browse/BEAM-5267 > Project: Beam > Issue Type: Improvement > Components: runner-flink > Reporter: Maximilian Michels > Assignee: Maximilian Michels > Priority: Minor > Fix For: 2.8.0 > > > For the next release, the Flink version should be bumped. As changes for > 2.7.0 are already frozen, it is going to be 2.8.0.
[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x
[ https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5267: - Affects Version/s: (was: 2.8.0) > Update Flink Runner to Flink 1.6.x > -- > > Key: BEAM-5267 > URL: https://issues.apache.org/jira/browse/BEAM-5267 > Project: Beam > Issue Type: Improvement > Components: runner-flink > Reporter: Maximilian Michels > Assignee: Maximilian Michels > Priority: Minor > > For the next release, the Flink version should be bumped. As changes for > 2.7.0 are already frozen, it is going to be 2.8.0.
[jira] [Created] (BEAM-5267) Update Flink Runner to Flink 1.6.x
Maximilian Michels created BEAM-5267: Summary: Update Flink Runner to Flink 1.6.x Key: BEAM-5267 URL: https://issues.apache.org/jira/browse/BEAM-5267 Project: Beam Issue Type: Improvement Components: runner-flink Affects Versions: 2.8.0 Reporter: Maximilian Michels Assignee: Maximilian Michels For the next release, the Flink version should be bumped. As changes for 2.7.0 are already frozen, it is going to be 2.8.0.
[jira] [Work logged] (BEAM-5239) Allow configure latencyTrackingInterval
[ https://issues.apache.org/jira/browse/BEAM-5239?focusedWorklogId=139675=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139675 ] ASF GitHub Bot logged work on BEAM-5239: Author: ASF GitHub Bot Created on: 30/Aug/18 13:09 Start Date: 30/Aug/18 13:09 Worklog Time Spent: 10m Work Description: mxm commented on a change in pull request #6278: [BEAM-5239] Enable to configure latencyTrackingInterval URL: https://github.com/apache/beam/pull/6278#discussion_r214023347 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java ## @@ -171,4 +176,12 @@ public static StreamExecutionEnvironment createStreamExecutionEnvironment( return flinkStreamEnv; } + + private static void applyLatencyTrackingInterval( + ExecutionConfig config, FlinkPipelineOptions options) { +long latencyTrackingInterval = options.getLatencyTrackingInterval(); +if (latencyTrackingInterval >= 0) { Review comment: The default is now 0, so it gets disabled by default now. I agree with Aljoscha that the check could be removed entirely. The problem is, if you pass in negative numbers, they will simply be ignored and latency tracking will still be enabled... This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139675) Time Spent: 2h 20m (was: 2h 10m) > Allow configure latencyTrackingInterval > --- > > Key: BEAM-5239 > URL: https://issues.apache.org/jira/browse/BEAM-5239 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Affects Versions: 2.6.0 >Reporter: Jozef Vilcek >Assignee: Aljoscha Krettek >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > Because of FLINK-10226, we need to be able to set > latencyTrackingConfiguration for flink via FlinkPipelineOptions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
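The effect of the `>= 0` guard discussed in this review can be shown with a self-contained sketch. `StubConfig` is a hypothetical stand-in for Flink's `ExecutionConfig`, and its initial value of 2000 is purely illustrative; it only marks the "runner default, left untouched" case.

```java
/** Self-contained sketch; StubConfig stands in for Flink's ExecutionConfig. */
final class LatencyTrackingSketch {

  /** Hypothetical stand-in; the initial value only marks "default, untouched". */
  static final class StubConfig {
    static final long RUNNER_DEFAULT = 2000L; // illustrative, not taken from the thread
    long latencyTrackingInterval = RUNNER_DEFAULT;
  }

  /** Guarded variant from the diff under review: negative option values are skipped. */
  static StubConfig applyGuarded(long optionValue) {
    StubConfig config = new StubConfig();
    if (optionValue >= 0) {
      config.latencyTrackingInterval = optionValue;
    }
    return config;
  }

  public static void main(String[] args) {
    for (long option : new long[] {-1L, 0L, 500L}) {
      System.out.println(option + " -> " + applyGuarded(option).latencyTrackingInterval);
    }
  }
}
```

With the guard in place, a negative option leaves the stub at its default, 0 is applied as given, and a positive value overrides the default. That is the behaviour behind the concern that negative inputs are silently ignored while tracking stays enabled.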
[jira] [Work logged] (BEAM-5239) Allow configure latencyTrackingInterval
[ https://issues.apache.org/jira/browse/BEAM-5239?focusedWorklogId=139674=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139674 ] ASF GitHub Bot logged work on BEAM-5239: Author: ASF GitHub Bot Created on: 30/Aug/18 13:09 Start Date: 30/Aug/18 13:09 Worklog Time Spent: 10m Work Description: mxm commented on a change in pull request #6278: [BEAM-5239] Enable to configure latencyTrackingInterval URL: https://github.com/apache/beam/pull/6278#discussion_r214021455 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java ## @@ -167,4 +167,12 @@ Boolean isShutdownSourcesOnFinalWatermark(); void setShutdownSourcesOnFinalWatermark(Boolean shutdownOnFinalWatermark); + + @Description( + "Interval in milliseconds for sending latency tracking marks from the sources to the sinks. " + + "Interval value = 0 disablesthe feature.") Review comment: space missing `disablesthe` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139674) Time Spent: 2h 10m (was: 2h) > Allow configure latencyTrackingInterval > --- > > Key: BEAM-5239 > URL: https://issues.apache.org/jira/browse/BEAM-5239 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Affects Versions: 2.6.0 >Reporter: Jozef Vilcek >Assignee: Aljoscha Krettek >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Because of FLINK-10226, we need to be able to set > latencyTrackingConfiguration for flink via FlinkPipelineOptions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3912) Add support of HadoopOutputFormatIO
[ https://issues.apache.org/jira/browse/BEAM-3912?focusedWorklogId=139673=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139673 ] ASF GitHub Bot logged work on BEAM-3912: Author: ASF GitHub Bot Created on: 30/Aug/18 12:57 Start Date: 30/Aug/18 12:57 Worklog Time Spent: 10m Work Description: timrobertson100 commented on issue #6306: [BEAM-3912] Add HadoopOutputFormatIO support URL: https://github.com/apache/beam/pull/6306#issuecomment-417309907 @aromanenko-dev thank you for including me. This is just to say that I am a little busy and probably can't do a thorough review within the next 2 weeks. I will comment as best I can and am very interested in this though. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139673) Time Spent: 40m (was: 0.5h) > Add support of HadoopOutputFormatIO > --- > > Key: BEAM-3912 > URL: https://issues.apache.org/jira/browse/BEAM-3912 > Project: Beam > Issue Type: Improvement > Components: io-java-hadoop >Reporter: Alexey Romanenko >Assignee: Alexey Romanenko >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > For the moment, there is only HadoopInputFormatIO in Beam. To provide a > support of different writing IOs, that are not yet natively supported in Beam > (for example, Apache Orc or HBase bulk load), it would make sense to add > HadoopOutputFormatIO as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5266) TextIO doesn't support http schema anymore
Jean-Baptiste Onofré created BEAM-5266: -- Summary: TextIO doesn't support http schema anymore Key: BEAM-5266 URL: https://issues.apache.org/jira/browse/BEAM-5266 Project: Beam Issue Type: Bug Components: io-java-files Affects Versions: 2.6.0 Reporter: Jean-Baptiste Onofré Assignee: Jean-Baptiste Onofré Up to Beam 2.4.0 (at least), it was possible to use the {{http}} scheme directly with {{TextIO}}. However, something like: {code} TextIO.read().from("http://;) {code} now throws: {code} Caused by: java.lang.IllegalArgumentException: No filesystem found for scheme http {code} That's due to the "new" file system support. Both the {{file}} and {{http}} schemes should be handled for URLs.
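The error in this report is the kind that a scheme-keyed filesystem lookup produces when no entry is registered for `http`. The sketch below is a simplified illustration of that mechanism, not Beam's actual `FileSystems` code: `SchemeLookupSketch`, `register`, and `lookup` are hypothetical names, the example URL is a placeholder, and the fallback of bare paths to `file` is an assumption.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical scheme registry; a simplified illustration, not Beam's FileSystems. */
final class SchemeLookupSketch {
  private final Map<String, String> registry = new HashMap<>();

  void register(String scheme, String fileSystemName) {
    registry.put(scheme.toLowerCase(Locale.ROOT), fileSystemName);
  }

  /** Resolves a file spec to the name of the filesystem registered for its scheme. */
  String lookup(String spec) {
    String scheme = URI.create(spec).getScheme();
    if (scheme == null) {
      scheme = "file"; // assumption: a bare path means the local filesystem
    }
    String fileSystem = registry.get(scheme.toLowerCase(Locale.ROOT));
    if (fileSystem == null) {
      throw new IllegalArgumentException("No filesystem found for scheme " + scheme);
    }
    return fileSystem;
  }

  public static void main(String[] args) {
    SchemeLookupSketch lookup = new SchemeLookupSketch();
    lookup.register("file", "local");
    System.out.println(lookup.lookup("/tmp/input.txt"));
    try {
      lookup.lookup("http://example.org/data.txt"); // placeholder URL
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Under this model, handling `http` again would mean registering an entry for that scheme; whether Beam should do so is the question the issue raises.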
[jira] [Work logged] (BEAM-5239) Allow configure latencyTrackingInterval
[ https://issues.apache.org/jira/browse/BEAM-5239?focusedWorklogId=139647=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139647 ] ASF GitHub Bot logged work on BEAM-5239: Author: ASF GitHub Bot Created on: 30/Aug/18 11:42 Start Date: 30/Aug/18 11:42 Worklog Time Spent: 10m Work Description: aljoscha commented on issue #6278: [BEAM-5239] Enable to configure latencyTrackingInterval URL: https://github.com/apache/beam/pull/6278#issuecomment-417289738 Ah, Jenkins is down: https://status.apache.org This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139647) Time Spent: 2h (was: 1h 50m) > Allow configure latencyTrackingInterval > --- > > Key: BEAM-5239 > URL: https://issues.apache.org/jira/browse/BEAM-5239 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Affects Versions: 2.6.0 >Reporter: Jozef Vilcek >Assignee: Aljoscha Krettek >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > Because of FLINK-10226, we need to be able to set > latencyTrackingConfiguration for flink via FlinkPipelineOptions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5247) Remove slf4j-simple binding from dependencies
[ https://issues.apache.org/jira/browse/BEAM-5247?focusedWorklogId=139648=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139648 ] ASF GitHub Bot logged work on BEAM-5247: Author: ASF GitHub Bot Created on: 30/Aug/18 11:42 Start Date: 30/Aug/18 11:42 Worklog Time Spent: 10m Work Description: aljoscha commented on issue #6284: [BEAM-5247] Remove slf4j-simple binding from dependencies URL: https://github.com/apache/beam/pull/6284#issuecomment-417289783 @lukecwik Never mind, Jenkins is/was down: https://status.apache.org Issue Time Tracking --- Worklog Id: (was: 139648) Time Spent: 1h (was: 50m) > Remove slf4j-simple binding from dependencies > - > > Key: BEAM-5247 > URL: https://issues.apache.org/jira/browse/BEAM-5247 > Project: Beam > Issue Type: Improvement > Components: runner-flink > Reporter: Jozef Vilcek > Assignee: Jozef Vilcek > Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > The Flink runner declares an slf4j-simple binding in its dependencies. This can break > logging in applications that have their own binding and do not exclude > this one from Beam.
[jira] [Work logged] (BEAM-5239) Allow configure latencyTrackingInterval
[ https://issues.apache.org/jira/browse/BEAM-5239?focusedWorklogId=139646=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139646 ] ASF GitHub Bot logged work on BEAM-5239: Author: ASF GitHub Bot Created on: 30/Aug/18 11:40 Start Date: 30/Aug/18 11:40 Worklog Time Spent: 10m Work Description: aljoscha commented on a change in pull request #6278: [BEAM-5239] Enable to configure latencyTrackingInterval URL: https://github.com/apache/beam/pull/6278#discussion_r213997293 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java ## @@ -171,4 +176,12 @@ public static StreamExecutionEnvironment createStreamExecutionEnvironment( return flinkStreamEnv; } + + private static void applyLatencyTrackingInterval( + ExecutionConfig config, FlinkPipelineOptions options) { +long latencyTrackingInterval = options.getLatencyTrackingInterval(); +if (latencyTrackingInterval >= 0) { Review comment: I think we should disable by default. This is also what will happen for the next Flink version, I think. And it might save people some headaches if they don't have to debug it and find out the hard way, as you did. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139646) Time Spent: 1h 50m (was: 1h 40m) > Allow configure latencyTrackingInterval > --- > > Key: BEAM-5239 > URL: https://issues.apache.org/jira/browse/BEAM-5239 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Affects Versions: 2.6.0 >Reporter: Jozef Vilcek >Assignee: Aljoscha Krettek >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > Because of FLINK-10226, we need to be able to set > latencyTrackingConfiguration for flink via FlinkPipelineOptions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3912) Add support of HadoopOutputFormatIO
[ https://issues.apache.org/jira/browse/BEAM-3912?focusedWorklogId=139642=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139642 ] ASF GitHub Bot logged work on BEAM-3912: Author: ASF GitHub Bot Created on: 30/Aug/18 11:19 Start Date: 30/Aug/18 11:19 Worklog Time Spent: 10m Work Description: aromanenko-dev removed a comment on issue #6306: [BEAM-3912] Add HadoopOutputFormatIO support URL: https://github.com/apache/beam/pull/6306#issuecomment-417263723 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139642) Time Spent: 0.5h (was: 20m) > Add support of HadoopOutputFormatIO > --- > > Key: BEAM-3912 > URL: https://issues.apache.org/jira/browse/BEAM-3912 > Project: Beam > Issue Type: Improvement > Components: io-java-hadoop >Reporter: Alexey Romanenko >Assignee: Alexey Romanenko >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > For the moment, there is only HadoopInputFormatIO in Beam. To provide a > support of different writing IOs, that are not yet natively supported in Beam > (for example, Apache Orc or HBase bulk load), it would make sense to add > HadoopOutputFormatIO as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5239) Allow configure latencyTrackingInterval
[ https://issues.apache.org/jira/browse/BEAM-5239?focusedWorklogId=139639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139639 ] ASF GitHub Bot logged work on BEAM-5239: Author: ASF GitHub Bot Created on: 30/Aug/18 10:41 Start Date: 30/Aug/18 10:41 Worklog Time Spent: 10m Work Description: JozoVilcek commented on a change in pull request #6278: [BEAM-5239] Enable to configure latencyTrackingInterval URL: https://github.com/apache/beam/pull/6278#discussion_r213983463 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java ## @@ -171,4 +176,12 @@ public static StreamExecutionEnvironment createStreamExecutionEnvironment( return flinkStreamEnv; } + + private static void applyLatencyTrackingInterval( + ExecutionConfig config, FlinkPipelineOptions options) { +long latencyTrackingInterval = options.getLatencyTrackingInterval(); +if (latencyTrackingInterval >= 0) { Review comment: The idea was to stick to the default Flink configuration unless the user intends to override it. With this PR I can disable it by passing `--latencyTrackingInterval=0` to the runner. I did not want to disable it for everyone by default. Should I?
Issue Time Tracking --- Worklog Id: (was: 139639) Time Spent: 1h 40m (was: 1.5h) > Allow configure latencyTrackingInterval > --- > > Key: BEAM-5239 > URL: https://issues.apache.org/jira/browse/BEAM-5239 > Project: Beam > Issue Type: New Feature > Components: runner-flink > Affects Versions: 2.6.0 > Reporter: Jozef Vilcek > Assignee: Aljoscha Krettek > Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > Because of FLINK-10226, we need to be able to set > latencyTrackingConfiguration for Flink via FlinkPipelineOptions
[jira] [Work logged] (BEAM-5124) Write Euphoria in Beam documentation
[ https://issues.apache.org/jira/browse/BEAM-5124?focusedWorklogId=139636=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139636 ] ASF GitHub Bot logged work on BEAM-5124: Author: ASF GitHub Bot Created on: 30/Aug/18 10:21 Start Date: 30/Aug/18 10:21 Worklog Time Spent: 10m Work Description: VaclavPlajt commented on issue #540: [BEAM-5124] DSL Euphoria documentation update URL: https://github.com/apache/beam-site/pull/540#issuecomment-417269596 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139636) Time Spent: 1.5h (was: 1h 20m) > Write Euphoria in Beam documentation > > > Key: BEAM-5124 > URL: https://issues.apache.org/jira/browse/BEAM-5124 > Project: Beam > Issue Type: Sub-task > Components: dsl-euphoria >Reporter: Vaclav Plajt >Assignee: Vaclav Plajt >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5247) Remove slf4j-simple binding from dependencies
[ https://issues.apache.org/jira/browse/BEAM-5247?focusedWorklogId=139635=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139635 ] ASF GitHub Bot logged work on BEAM-5247: Author: ASF GitHub Bot Created on: 30/Aug/18 10:19 Start Date: 30/Aug/18 10:19 Worklog Time Spent: 10m Work Description: aljoscha commented on issue #6284: [BEAM-5247] Remove slf4j-simple binding from dependencies URL: https://github.com/apache/beam/pull/6284#issuecomment-417269203 @lukecwik Is "Run Flink ValidatesRunner" not the correct incantation anymore? I think there might be some other issue. Issue Time Tracking --- Worklog Id: (was: 139635) Time Spent: 50m (was: 40m) > Remove slf4j-simple binding from dependencies > - > > Key: BEAM-5247 > URL: https://issues.apache.org/jira/browse/BEAM-5247 > Project: Beam > Issue Type: Improvement > Components: runner-flink > Reporter: Jozef Vilcek > Assignee: Jozef Vilcek > Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > The Flink runner declares an slf4j-simple binding in its dependencies. This can break > logging in applications that have their own binding and do not exclude > this one from Beam.
[jira] [Work logged] (BEAM-5247) Remove slf4j-simple binding from dependencies
[ https://issues.apache.org/jira/browse/BEAM-5247?focusedWorklogId=139634=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139634 ] ASF GitHub Bot logged work on BEAM-5247: Author: ASF GitHub Bot Created on: 30/Aug/18 10:18 Start Date: 30/Aug/18 10:18 Worklog Time Spent: 10m Work Description: aljoscha commented on issue #6284: [BEAM-5247] Remove slf4j-simple binding from dependencies URL: https://github.com/apache/beam/pull/6284#issuecomment-417269034 Run Flink ValidatesRunner Issue Time Tracking --- Worklog Id: (was: 139634) Time Spent: 40m (was: 0.5h) > Remove slf4j-simple binding from dependencies > - > > Key: BEAM-5247 > URL: https://issues.apache.org/jira/browse/BEAM-5247 > Project: Beam > Issue Type: Improvement > Components: runner-flink > Reporter: Jozef Vilcek > Assignee: Jozef Vilcek > Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > The Flink runner declares an slf4j-simple binding in its dependencies. This can break > logging in applications that have their own binding and do not exclude > this one from Beam.
[jira] [Work logged] (BEAM-5239) Allow configure latencyTrackingInterval
[ https://issues.apache.org/jira/browse/BEAM-5239?focusedWorklogId=139633=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139633 ] ASF GitHub Bot logged work on BEAM-5239: Author: ASF GitHub Bot Created on: 30/Aug/18 10:18 Start Date: 30/Aug/18 10:18 Worklog Time Spent: 10m Work Description: aljoscha commented on issue #6278: [BEAM-5239] Enable to configure latencyTrackingInterval URL: https://github.com/apache/beam/pull/6278#issuecomment-417268836 Run Flink ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139633) Time Spent: 1.5h (was: 1h 20m) > Allow configure latencyTrackingInterval > --- > > Key: BEAM-5239 > URL: https://issues.apache.org/jira/browse/BEAM-5239 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Affects Versions: 2.6.0 >Reporter: Jozef Vilcek >Assignee: Aljoscha Krettek >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Because of FLINK-10226, we need to be able to set > latencyTrackingConfiguration for flink via FlinkPipelineOptions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5239) Allow configure latencyTrackingInterval
[ https://issues.apache.org/jira/browse/BEAM-5239?focusedWorklogId=139632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139632 ] ASF GitHub Bot logged work on BEAM-5239: Author: ASF GitHub Bot Created on: 30/Aug/18 10:16 Start Date: 30/Aug/18 10:16 Worklog Time Spent: 10m Work Description: aljoscha commented on a change in pull request #6278: [BEAM-5239] Enable to configure latencyTrackingInterval URL: https://github.com/apache/beam/pull/6278#discussion_r213977137
## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java
## @@ -171,4 +176,12 @@ public static StreamExecutionEnvironment createStreamExecutionEnvironment(
     return flinkStreamEnv;
   }
+
+  private static void applyLatencyTrackingInterval(
+      ExecutionConfig config, FlinkPipelineOptions options) {
+    long latencyTrackingInterval = options.getLatencyTrackingInterval();
+    if (latencyTrackingInterval >= 0) {
Review comment: Will this work? The default is `-1`, so this condition will be `false`, i.e. we never disable latency tracking in Flink which is the original point of this PR. I think we can just remove the check. What do you think?
This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139632) Time Spent: 1h 20m (was: 1h 10m) > Allow configure latencyTrackingInterval > --- > > Key: BEAM-5239 > URL: https://issues.apache.org/jira/browse/BEAM-5239 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Affects Versions: 2.6.0 >Reporter: Jozef Vilcek >Assignee: Aljoscha Krettek >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Because of FLINK-10226, we need to be able to set > latencyTrackingConfiguration for flink via FlinkPipelineOptions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
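The review comment on BEAM-5239 can be illustrated with a small stand-alone sketch. `FakeExecutionConfig` below is a hypothetical stand-in (it is not Flink's `ExecutionConfig`), and the 2000 ms starting value is an assumption chosen only for illustration. The sketch shows why a `>= 0` guard combined with a `-1` option default leaves the config untouched, while forwarding the option unconditionally does change it. Whether a negative interval actually disables latency tracking is a property of Flink that this sketch does not model.

```java
/** Hypothetical stand-in for a runner config; not Flink's ExecutionConfig. */
final class FakeExecutionConfig {
  // Assumed non-negative starting value, standing for "tracking enabled".
  private long latencyTrackingInterval = 2000L;

  void setLatencyTrackingInterval(long interval) {
    this.latencyTrackingInterval = interval;
  }

  long getLatencyTrackingInterval() {
    return latencyTrackingInterval;
  }
}

final class LatencyTrackingSketch {
  /** Option default as stated in the review comment. */
  static final long OPTION_DEFAULT = -1L;

  /** Guarded variant from the diff: only non-negative values are forwarded. */
  static void applyGuarded(FakeExecutionConfig config, long optionValue) {
    if (optionValue >= 0) {
      config.setLatencyTrackingInterval(optionValue);
    }
  }

  /** Variant suggested in the review: always forward the option value. */
  static void applyUnconditional(FakeExecutionConfig config, long optionValue) {
    config.setLatencyTrackingInterval(optionValue);
  }
}
```

With the guarded variant, a user who leaves the option at its default never reaches the setter, so the config keeps whatever value it started with.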
[jira] [Work logged] (BEAM-5247) Remove slf4j-simple binding from dependencies
[ https://issues.apache.org/jira/browse/BEAM-5247?focusedWorklogId=139631=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139631 ] ASF GitHub Bot logged work on BEAM-5247: Author: ASF GitHub Bot Created on: 30/Aug/18 10:12 Start Date: 30/Aug/18 10:12 Worklog Time Spent: 10m Work Description: aljoscha commented on issue #6284: [BEAM-5247] Remove slf4j-simple binding from dependencies URL: https://github.com/apache/beam/pull/6284#issuecomment-417267284 Run Flink ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139631) Time Spent: 0.5h (was: 20m) > Remove slf4j-simple binding from dependencies > - > > Key: BEAM-5247 > URL: https://issues.apache.org/jira/browse/BEAM-5247 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jozef Vilcek >Assignee: Jozef Vilcek >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > Flink runner declares a slf4j-simple binding in dependencies. This can break > logging of application if they have their own binding and does not exclude > this one from beam. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3912) Add support of HadoopOutputFormatIO
[ https://issues.apache.org/jira/browse/BEAM-3912?focusedWorklogId=139630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139630 ] ASF GitHub Bot logged work on BEAM-3912: Author: ASF GitHub Bot Created on: 30/Aug/18 09:59 Start Date: 30/Aug/18 09:59 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #6306: [BEAM-3912] Add HadoopOutputFormatIO support URL: https://github.com/apache/beam/pull/6306#issuecomment-417263723 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139630) Time Spent: 20m (was: 10m) > Add support of HadoopOutputFormatIO > --- > > Key: BEAM-3912 > URL: https://issues.apache.org/jira/browse/BEAM-3912 > Project: Beam > Issue Type: Improvement > Components: io-java-hadoop >Reporter: Alexey Romanenko >Assignee: Alexey Romanenko >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > For the moment, there is only HadoopInputFormatIO in Beam. To provide a > support of different writing IOs, that are not yet natively supported in Beam > (for example, Apache Orc or HBase bulk load), it would make sense to add > HadoopOutputFormatIO as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3912) Add support of HadoopOutputFormatIO
[ https://issues.apache.org/jira/browse/BEAM-3912?focusedWorklogId=139629=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139629 ] ASF GitHub Bot logged work on BEAM-3912: Author: ASF GitHub Bot Created on: 30/Aug/18 09:56 Start Date: 30/Aug/18 09:56 Worklog Time Spent: 10m Work Description: aromanenko-dev opened a new pull request #6306: [BEAM-3912] Add HadoopOutputFormatIO support URL: https://github.com/apache/beam/pull/6306 At the moment, Beam has only `HadoopInputFormatIO`. To support writing to sinks that are not yet natively supported in Beam (for example, Apache ORC or HBase bulk load), this PR adds a new Java IO, `HadoopOutputFormatIO`, which can write data to any sink that implements Hadoop `OutputFormat`. It is developed as a separate IO module, `hadoop-output-format`, to avoid confusion with the name of the existing `hadoop-input-format` module. Perhaps a future version of Beam should merge them into one common `HadoopFormatIO` module. The change is covered by unit tests and an integration test included in this PR. Follow this checklist to help us incorporate your contribution quickly and easily: - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). It will help us expedite review of your Pull Request if you tag someone (e.g. `@username`) to look at it. 
This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139628=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139628 ] ASF GitHub Bot logged work on BEAM-5187: Author: ASF GitHub Bot Created on: 30/Aug/18 09:42 Start Date: 30/Aug/18 09:42 Worklog Time Spent: 10m Work Description: mxm commented on issue #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution URL: https://github.com/apache/beam/pull/6287#issuecomment-417258241 > btw I noticed that after job server shutdown, launched processes still stick around and don't exit Please see the latest version of the PR from yesterday. I revised the shutdown logic to eventually kill processes if they don't stop gracefully. In your code you're using the old version. Also, I've added a ShutdownHook to kill running processes in case the JVM shuts down prematurely. @tweise @angoenka I've removed the singleton ProcessManager and instantiate it per `ProcessEnvironmentFactory`, which should get rid of the duplicate worker ids you were seeing, @tweise. @tweise For the IO, I've enabled inheriting IO when the log level is set to DEBUG. That should help with debugging process startup problems. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139628) Time Spent: 5h 40m (was: 5.5h) > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 5h 40m > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. 
Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similar > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options. > [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
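The shutdown behavior described in the comment on #6287 (stop gracefully, kill if that fails, and clean up when the JVM exits) can be sketched with the standard `java.lang.Process` API. This is a minimal illustration under assumptions, not the code from the PR: the class name `ProcessStopSketch`, the method names, and the timeout parameter are all hypothetical.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative helper; names and structure are assumptions, not the PR's ProcessManager. */
final class ProcessStopSketch {
  /**
   * Asks the process to exit, waits up to timeoutMillis, and kills it forcibly
   * if it is still running. Returns true if the process exited on its own.
   */
  static boolean stop(Process process, long timeoutMillis) {
    process.destroy(); // polite termination request
    try {
      if (process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
        return true;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the interrupt visible to callers
    }
    process.destroyForcibly(); // last resort if the process ignored the request
    return false;
  }

  /** Registers a JVM shutdown hook that stops the process if the JVM exits first. */
  static Thread registerShutdownHook(Process process, long timeoutMillis) {
    Thread hook = new Thread(() -> stop(process, timeoutMillis));
    Runtime.getRuntime().addShutdownHook(hook);
    return hook;
  }
}
```

Returning the hook thread lets a caller remove it with `Runtime.getRuntime().removeShutdownHook(hook)` once the process has been stopped normally.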
[jira] [Work logged] (BEAM-3193) CoGroupByKey doesn't work in streaming mode
[ https://issues.apache.org/jira/browse/BEAM-3193?focusedWorklogId=139621=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139621 ] ASF GitHub Bot logged work on BEAM-3193: Author: ASF GitHub Bot Created on: 30/Aug/18 09:35 Start Date: 30/Aug/18 09:35 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #5945: [BEAM-3193] Add SparkCoGroupByKeyStreaming validates runner to test CoGroupByKey behavior in streaming mode on spark runner URL: https://github.com/apache/beam/pull/5945#issuecomment-417256115 @echauchot Sure, I'll take a look today This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139621) Time Spent: 50m (was: 40m) > CoGroupByKey doesn't work in streaming mode > --- > > Key: BEAM-3193 > URL: https://issues.apache.org/jira/browse/BEAM-3193 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Jean-Baptiste Onofré >Assignee: Etienne Chauchot >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > The CoGroupByKey PTransform doesn't throw an exception but doesn't actually > perform the grouping when used in streaming mode. I will attach a test > pipeline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3193) CoGroupByKey doesn't work in streaming mode
[ https://issues.apache.org/jira/browse/BEAM-3193?focusedWorklogId=139618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139618 ] ASF GitHub Bot logged work on BEAM-3193: Author: ASF GitHub Bot Created on: 30/Aug/18 09:27 Start Date: 30/Aug/18 09:27 Worklog Time Spent: 10m Work Description: echauchot commented on issue #5945: [BEAM-3193] Add SparkCoGroupByKeyStreaming validates runner to test CoGroupByKey behavior in streaming mode on spark runner URL: https://github.com/apache/beam/pull/5945#issuecomment-417253207 @aromanenko-dev can you please take a look? JB and Ismael seem busy. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139618) Time Spent: 40m (was: 0.5h) > CoGroupByKey doesn't work in streaming mode > --- > > Key: BEAM-3193 > URL: https://issues.apache.org/jira/browse/BEAM-3193 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Jean-Baptiste Onofré >Assignee: Etienne Chauchot >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > The CoGroupByKey PTransform doesn't throw an exception but doesn't actually > perform the grouping when used in streaming mode. I will attach a test > pipeline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5265) Can not test Timer with processing time domain
[ https://issues.apache.org/jira/browse/BEAM-5265?focusedWorklogId=139614=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139614 ] ASF GitHub Bot logged work on BEAM-5265: Author: ASF GitHub Bot Created on: 30/Aug/18 08:44 Start Date: 30/Aug/18 08:44 Worklog Time Spent: 10m Work Description: JozoVilcek commented on issue #6305: [BEAM-5265] Use currentProcessingTime() for onTime with processing time domain URL: https://github.com/apache/beam/pull/6305#issuecomment-417239792 Simple patch to see if it breaks any existing tests This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139614) Time Spent: 20m (was: 10m) > Can not test Timer with processing time domain > -- > > Key: BEAM-5265 > URL: https://issues.apache.org/jira/browse/BEAM-5265 > Project: Beam > Issue Type: Bug > Components: runner-core, runner-direct >Reporter: Jozef Vilcek >Assignee: Kenneth Knowles >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > I have a stateful DoFn which has a timer on PROCESSING_TIME domain. While > writing tests, I noticed that it does not react to `advanceProcessingTime()` > on tests stream. Problem seems to be here: > [https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java#L260] > I can only tell that patching this place works for direct runner tests. Not > sure about broader impact on other runners since it is in `runner-core` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
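The symptom reported in BEAM-5265 can be pictured with a toy model. The sketch below is not the code at the linked `SimpleDoFnRunner` line and makes no claim about what that code does; the `Domain` enum and both methods are hypothetical. It only illustrates the idea in the PR title: a timer in the processing-time domain should be evaluated against the processing-time clock, so that advancing processing time alone can make it fire.

```java
/** Toy model of timer readiness; all names here are assumptions for illustration. */
final class TimerReadinessSketch {
  /** Hypothetical subset of time domains relevant to the issue. */
  enum Domain { EVENT_TIME, PROCESSING_TIME }

  /** Picks the clock a timer in the given domain is compared against. */
  static long clockFor(Domain domain, long watermarkMillis, long processingTimeMillis) {
    switch (domain) {
      case EVENT_TIME:
        return watermarkMillis;
      case PROCESSING_TIME:
        return processingTimeMillis;
      default:
        throw new IllegalArgumentException("Unknown domain: " + domain);
    }
  }

  /** A timer is ready once its domain's clock has reached the firing time. */
  static boolean isReady(
      Domain domain, long fireAtMillis, long watermarkMillis, long processingTimeMillis) {
    return clockFor(domain, watermarkMillis, processingTimeMillis) >= fireAtMillis;
  }
}
```

In this model, a processing-time timer set for 10 becomes ready when processing time reaches 15 even though the watermark is still at 0, which is the behavior the test in the issue expected after `advanceProcessingTime()`.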
[jira] [Work logged] (BEAM-5265) Can not test Timer with processing time domain
[ https://issues.apache.org/jira/browse/BEAM-5265?focusedWorklogId=139613=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139613 ] ASF GitHub Bot logged work on BEAM-5265: Author: ASF GitHub Bot Created on: 30/Aug/18 08:44 Start Date: 30/Aug/18 08:44 Worklog Time Spent: 10m Work Description: JozoVilcek opened a new pull request #6305: [BEAM-5265] Use currentProcessingTime() for onTime with processing time domain URL: https://github.com/apache/beam/pull/6305 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139613) Time Spent: 10m Remaining Estimate: 0h > Can not test Timer with processing time domain > -- > > Key: BEAM-5265 > URL: https://issues.apache.org/jira/browse/BEAM-5265 > Project: Beam > Issue Type: Bug > Components: runner-core, runner-direct >Reporter: Jozef Vilcek >Assignee: Kenneth Knowles >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > I have a stateful DoFn which has a timer on PROCESSING_TIME domain. While > writing tests, I noticed that it does not react to `advanceProcessingTime()` > on tests stream. Problem seems to be here: > [https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java#L260] > I can only tell that patching this place works for direct runner tests. Not > sure about broader impact on other runners since it is in `runner-core` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5264) Reference DirectRunner implementation of Python user state and timers API
[ https://issues.apache.org/jira/browse/BEAM-5264?focusedWorklogId=139612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139612 ] ASF GitHub Bot logged work on BEAM-5264: Author: ASF GitHub Bot Created on: 30/Aug/18 08:42 Start Date: 30/Aug/18 08:42 Worklog Time Spent: 10m Work Description: charlesccychen opened a new pull request #6304: [BEAM-5264] Reference DirectRunner implementation of Python User State and Timers API URL: https://github.com/apache/beam/pull/6304 This change adds the reference DirectRunner implementation of the Python User State and Timers API. With this change, a user can execute DoFns with state and timers on the DirectRunner. More details on the API design is available at https://s.apache.org/beam-python-user-state-and-timers. R: @robertwb CC: @lukecwik @tweise @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139612) Time Spent: 10m Remaining Estimate: 0h > Reference DirectRunner implementation of Python user state and timers API > - > > Key: BEAM-5264 > URL: https://issues.apache.org/jira/browse/BEAM-5264 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.6.0 >Reporter: Charles Chen >Assignee: Charles Chen >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This issue tracks the reference DirectRunner implementation of the Beam > Python User State and Timer API, described here: > [https://s.apache.org/beam-python-user-state-and-timers]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5265) Can not test Timer with processing time domain
Jozef Vilcek created BEAM-5265: -- Summary: Can not test Timer with processing time domain Key: BEAM-5265 URL: https://issues.apache.org/jira/browse/BEAM-5265 Project: Beam Issue Type: Bug Components: runner-core, runner-direct Reporter: Jozef Vilcek Assignee: Kenneth Knowles I have a stateful DoFn which has a timer on PROCESSING_TIME domain. While writing tests, I noticed that it does not react to `advanceProcessingTime()` on tests stream. Problem seems to be here: [https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java#L260] I can only tell that patching this place works for direct runner tests. Not sure about broader impact on other runners since it is in `runner-core` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5264) Reference DirectRunner implementation of Python user state and timers API
Charles Chen created BEAM-5264: -- Summary: Reference DirectRunner implementation of Python user state and timers API Key: BEAM-5264 URL: https://issues.apache.org/jira/browse/BEAM-5264 Project: Beam Issue Type: Improvement Components: sdk-py-core Affects Versions: 2.6.0 Reporter: Charles Chen Assignee: Charles Chen This issue tracks the reference DirectRunner implementation of the Beam Python User State and Timer API, described here: [https://s.apache.org/beam-python-user-state-and-timers]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5263) Add support for accumulators to `SingleValueCollector`
Vaclav Plajt created BEAM-5263: -- Summary: Add support for accumulators to `SingleValueCollector` Key: BEAM-5263 URL: https://issues.apache.org/jira/browse/BEAM-5263 Project: Beam Issue Type: Sub-task Components: dsl-euphoria Reporter: Vaclav Plajt Assignee: Vaclav Plajt -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness
[ https://issues.apache.org/jira/browse/BEAM-5187?focusedWorklogId=139599=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139599 ] ASF GitHub Bot logged work on BEAM-5187: Author: ASF GitHub Bot Created on: 30/Aug/18 08:27 Start Date: 30/Aug/18 08:27 Worklog Time Spent: 10m Work Description: mxm commented on a change in pull request #6287: [BEAM-5187] Add a ProcessJobBundleFactory for process-based execution URL: https://github.com/apache/beam/pull/6287#discussion_r213943954
## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/ProcessManager.java
## @@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.fnexecution.environment;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.collect.ImmutableList;
+import java.io.File;
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import javax.annotation.concurrent.ThreadSafe;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/** A simple process manager which forks processes and kills them if necessary. */
+@ThreadSafe
+class ProcessManager {
+  private static final Logger LOG = LoggerFactory.getLogger(ProcessManager.class);
+
+  private static final ProcessManager INSTANCE = new ProcessManager();
+
+  private final Map processes;
+
+  public static ProcessManager getInstance() {
+    return INSTANCE;
Review comment: Absolutely, the `ProcessManager` needs to be instantiated per `ProcessEnvironmentFactory`. Getting rid of the static instance also makes testing easier.
This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 139599) Time Spent: 5.5h (was: 5h 20m) > Create a ProcessJobBundleFactory for non-dockerized SDK harness > --- > > Key: BEAM-5187 > URL: https://issues.apache.org/jira/browse/BEAM-5187 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 5.5h > Remaining Estimate: 0h > > As discussed on the mailing list [1], we want to give users an option to > execute portable pipelines without Docker. Analogous to the > {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to > directly fork SDK harness processes. > Artifacts will be provided by an artifact directory or could be set up similar > to the existing bootstrapping code ("boot.go") which we use for containers. > The process-based execution can optionally be configured via the pipeline > options. > [1] > [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
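The point about replacing the static instance with one manager per factory can be shown with a toy example. Everything below is hypothetical: `WorkerRegistrySketch` and `EnvironmentFactorySketch` are illustrative stand-ins, not Beam's `ProcessManager` or `ProcessEnvironmentFactory`, and the command string is made up. The sketch shows that when each factory owns its registry, two factories can reuse the same worker id without colliding, and a test can hand in a fresh registry instead of sharing global state.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy registry keyed by worker id; a stand-in for illustration only. */
final class WorkerRegistrySketch {
  private final Map<String, String> commandsById = new HashMap<>();

  /** Registers a worker id and rejects duplicates within this instance. */
  synchronized void register(String workerId, String command) {
    if (commandsById.containsKey(workerId)) {
      throw new IllegalStateException("Duplicate worker id: " + workerId);
    }
    commandsById.put(workerId, command);
  }

  synchronized int size() {
    return commandsById.size();
  }
}

/** Hypothetical factory that is given its own registry instead of a static one. */
final class EnvironmentFactorySketch {
  private final WorkerRegistrySketch registry;

  EnvironmentFactorySketch(WorkerRegistrySketch registry) {
    this.registry = registry;
  }

  void startWorker(String workerId) {
    // The command string is a placeholder, not a real harness invocation.
    registry.register(workerId, "hypothetical-sdk-harness --id=" + workerId);
  }
}
```

Passing the registry through the constructor is one possible design; the review comment only states that the manager should be instantiated per factory, not how it is wired in.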
[jira] [Work logged] (BEAM-690) Backoff in the DirectRunner Monitor if no work is Available
[ https://issues.apache.org/jira/browse/BEAM-690?focusedWorklogId=139594=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139594 ] ASF GitHub Bot logged work on BEAM-690: --- Author: ASF GitHub Bot Created on: 30/Aug/18 08:24 Start Date: 30/Aug/18 08:24 Worklog Time Spent: 10m Work Description: janotav opened a new pull request #6303: [BEAM-690] Backoff in the DirectRunner if no work is available URL: https://github.com/apache/beam/pull/6303 Implementing backoff as described in the JIRA ticket for [BEAM-690]. Basically this PR: 1. adds a new DriverState (CONTINUE_THROTTLE) that signals that no work was available 2. performs capped exponential backoff in the ExecutorServiceParallelExecutor when CONTINUE_THROTTLE is encountered @tgroh, you are probably the most suitable reviewer Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). It will help us expedite review of your Pull Request if you tag someone (e.g. `@username`) to look at it. 
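The capped exponential backoff mentioned in the description of #6303 can be sketched as follows. This is a minimal illustration under assumptions, not the PR's implementation: the class name, the initial and maximum delays, and the choice to reset the delay once work is found are all hypothetical. Only the general technique (double the delay each time no work is available, up to a cap) comes from the description.

```java
/** Illustrative capped exponential backoff; names and reset policy are assumptions. */
final class BackoffSketch {
  private final long initialMillis;
  private final long maxMillis;
  private long currentMillis = 0L;

  BackoffSketch(long initialMillis, long maxMillis) {
    if (initialMillis <= 0 || maxMillis < initialMillis) {
      throw new IllegalArgumentException("Require 0 < initialMillis <= maxMillis");
    }
    this.initialMillis = initialMillis;
    this.maxMillis = maxMillis;
  }

  /** Called when no work was available; returns how long to wait before polling again. */
  long onNoWork() {
    if (currentMillis == 0L) {
      currentMillis = initialMillis;
    } else if (currentMillis > maxMillis / 2) {
      currentMillis = maxMillis; // doubling would exceed the cap (also avoids overflow)
    } else {
      currentMillis *= 2;
    }
    return currentMillis;
  }

  /** Called when work was found; the next idle period starts from the initial delay. */
  void onWork() {
    currentMillis = 0L;
  }
}
```

A driver loop would call `onNoWork()` and sleep for the returned duration whenever it sees the throttle state, and call `onWork()` otherwise.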
Issue Time Tracking --- Worklog Id: (was: 139594) Time Spent: 10m Remaining Estimate: 0h > Backoff in the DirectRunner Monitor if no work is Available > --- > > Key: BEAM-690 > URL:
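The capped exponential backoff described in the pull request for BEAM-690 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `DriverStateSketch`, `CappedBackoffSketch`, `delayFor`, and the millisecond parameters are hypothetical names, and only `CONTINUE_THROTTLE` is taken from the PR description.

```java
/** Hypothetical driver states; only CONTINUE_THROTTLE is named in the PR description. */
enum DriverStateSketch {
  CONTINUE,
  CONTINUE_THROTTLE
}

/** Hypothetical helper that doubles the wait after each idle poll, up to a cap. */
class CappedBackoffSketch {
  private final long initialMillis;
  private final long maxMillis;
  private long nextMillis;

  CappedBackoffSketch(long initialMillis, long maxMillis) {
    if (initialMillis <= 0 || maxMillis < initialMillis) {
      throw new IllegalArgumentException("need 0 < initialMillis <= maxMillis");
    }
    this.initialMillis = initialMillis;
    this.maxMillis = maxMillis;
    this.nextMillis = initialMillis;
  }

  /**
   * Returns how long the caller should wait before polling again. Any state other than
   * CONTINUE_THROTTLE means work was available, so the backoff resets and no wait is needed.
   */
  long delayFor(DriverStateSketch state) {
    if (state != DriverStateSketch.CONTINUE_THROTTLE) {
      nextMillis = initialMillis;
      return 0L;
    }
    long delay = nextMillis;
    // Double the next delay, clamped to the cap; the comparison avoids long overflow.
    nextMillis = (nextMillis > maxMillis / 2) ? maxMillis : nextMillis * 2;
    return delay;
  }
}
```

A monitor loop would call `delayFor` with the state returned by each drive step and sleep for the returned duration when it is positive. Resetting on any productive step keeps latency low once work reappears.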
[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas
[ https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=139586=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139586 ] ASF GitHub Bot logged work on BEAM-4461: Author: ASF GitHub Bot Created on: 30/Aug/18 07:55 Start Date: 30/Aug/18 07:55 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #6298: [BEAM-4461] Introduce Group transform. URL: https://github.com/apache/beam/pull/6298#issuecomment-417225567 run java precommit Issue Time Tracking --- Worklog Id: (was: 139586) Time Spent: 3h 10m (was: 3h) > Create a library of useful transforms that use schemas > -- > > Key: BEAM-4461 > URL: https://issues.apache.org/jira/browse/BEAM-4461 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > e.g. JoinBy(fields). Project, Filter, etc.
[jira] [Work logged] (BEAM-5124) Write Euphoria in Beam documentation
[ https://issues.apache.org/jira/browse/BEAM-5124?focusedWorklogId=139585=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139585 ] ASF GitHub Bot logged work on BEAM-5124: Author: ASF GitHub Bot Created on: 30/Aug/18 07:40 Start Date: 30/Aug/18 07:40 Worklog Time Spent: 10m Work Description: VaclavPlajt commented on issue #540: [BEAM-5124] DSL Euphoria documentation update URL: https://github.com/apache/beam-site/pull/540#issuecomment-417221301 Retest this please Issue Time Tracking --- Worklog Id: (was: 139585) Time Spent: 1h 20m (was: 1h 10m) > Write Euphoria in Beam documentation > > > Key: BEAM-5124 > URL: https://issues.apache.org/jira/browse/BEAM-5124 > Project: Beam > Issue Type: Sub-task > Components: dsl-euphoria >Reporter: Vaclav Plajt >Assignee: Vaclav Plajt >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h >