[jira] [Commented] (BEAM-4848) GC overhead limit exceeded in Java Pre and PostCommit tests

2018-07-23 Thread Jan Peuker (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553705#comment-16553705
 ] 

Jan Peuker commented on BEAM-4848:
--

In case it helps: for me this happens even when building locally, whether or 
not tests are built, on macOS 10.13.6, JDK 1.8.0_151, consuming 3.24 GB of RAM 
on a 16 GB machine.
Regular build fails with:
{{Execution failed for task ':rat'.}}
{{ > java.lang.OutOfMemoryError: Java heap space}}

Gradle check fails with:
{{Execution failed for task ':rat'.}}
{{> java.lang.OutOfMemoryError: GC overhead limit exceeded}}

Looking at the failures of rat, I wonder if it has something to do with parsing 
dependencies:
{{> Task :rat}}
{{skipping symbolic link   -- too many levels of symbolic links.}}
{{skipping symbolic link ...-- too many levels of symbolic links.}}
{{Expiring Daemon because JVM Tenured space is exhausted}}

More similar errors are in Python:
{{> Task :beam-sdks-python:lintPy27}}
{{Running flake8 for module apache_beam   gen_protos.py   setup.py
test_config.py:}}
{{Expiring Daemon because JVM Tenured space is exhausted}}
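
Both failures above come from the Gradle daemon's JVM running out of heap. As a rough aid for comparing environments, here is a minimal Java sketch that prints the maximum heap and the per-pool usage of whatever JVM it runs in. The class name `HeapReport` is made up for illustration and is not part of Beam or Gradle. It does not reproduce the rat failure. The pool names it prints depend on the garbage collector in use, and the old-generation pool is presumably what the daemon message calls "Tenured".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: reports the heap limits of the JVM this code runs in. */
final class HeapReport {

  /** Maximum heap the JVM will attempt to use, in MiB (governed by -Xmx). */
  static long maxHeapMiB() {
    return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
  }

  /** One line per heap memory pool: name, used bytes, max bytes (-1 if undefined). */
  static String describeHeapPools() {
    StringBuilder sb = new StringBuilder();
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getType() != MemoryType.HEAP) {
        continue; // skip metaspace, code cache and other non-heap pools
      }
      MemoryUsage usage = pool.getUsage();
      if (usage == null) {
        continue; // the pool is no longer valid
      }
      sb.append(String.format("%s: used=%d max=%d%n",
          pool.getName(), usage.getUsed(), usage.getMax()));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println("max heap (MiB): " + maxHeapMiB());
    System.out.print(describeHeapPools());
  }
}
```

The daemon's own heap is configured through `org.gradle.jvmargs` in gradle.properties; whether raising it is the right fix for this issue is not established here.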

> GC overhead limit exceeded in Java Pre and PostCommit tests
> ---
>
> Key: BEAM-4848
> URL: https://issues.apache.org/jira/browse/BEAM-4848
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Rui Wang
>Priority: Major
>
> Right now, Beam Java PreCommit and PostCommit tests are failing due to GC 
> issues. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4571) RedisIO support for write using SET operation

2018-07-22 Thread Jan Peuker (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551914#comment-16551914
 ] 

Jan Peuker commented on BEAM-4571:
--

Apologies for the tardy reply [~jbonofre], I was traveling too much. I believe 
[~hsuryawirawan] also has a fix ready that we're testing on a local branch. 
I will quickly check with him.

> RedisIO support for write using SET operation
> -
>
> Key: BEAM-4571
> URL: https://issues.apache.org/jira/browse/BEAM-4571
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-redis
>Affects Versions: 2.4.0
>Reporter: Jan Peuker
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
> Fix For: 2.7.0
>
>
> At the moment, RedisIO only supports "append" operation when writing to 
> Redis. This improvement is to add support for "set" operation.
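
To illustrate the difference the description refers to, here is a small Java sketch that models the two Redis string commands with an in-memory map. It is only a model: `StringStoreModel` is a hypothetical class, not RedisIO's API and not a Redis client, and it counts characters where Redis counts bytes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a Redis string keyspace. */
final class StringStoreModel {
  private final Map<String, String> data = new HashMap<>();

  /**
   * Models APPEND: concatenates onto the existing value; a missing key is
   * treated as the empty string. Returns the length of the new value.
   */
  int append(String key, String value) {
    String updated = data.getOrDefault(key, "") + value;
    data.put(key, updated);
    return updated.length();
  }

  /** Models SET: replaces whatever was stored under the key. */
  void set(String key, String value) {
    data.put(key, value);
  }

  String get(String key) {
    return data.get(key);
  }

  public static void main(String[] args) {
    StringStoreModel store = new StringStoreModel();
    store.append("k", "a");
    store.append("k", "b");
    System.out.println(store.get("k")); // prints "ab"
    store.set("k", "b");
    System.out.println(store.get("k")); // prints "b"
  }
}
```

Writing the same key twice therefore accumulates data under "append" but stays idempotent under "set", which is presumably why the latter is wanted for pipelines that may retry writes.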





[jira] [Commented] (BEAM-4096) BigQueryIO ValueProvider support for NumFileShards and Triggering Frequency

2018-07-22 Thread Jan Peuker (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551913#comment-16551913
 ] 

Jan Peuker commented on BEAM-4096:
--

My apologies [~pabloem] for the tardy reply, I was traveling too much. I used 
to have the PR ready but postponed it due to the Maven -> Gradle change (as 
most of the fixes actually involved upgrading to the latest BigQuery SDK). Now 
that this is done I intend to send a PR for 2.6.0, but I am happy to keep the 
fix version empty for now.

> BigQueryIO ValueProvider support for NumFileShards and Triggering Frequency
> ---
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Assignee: Jan Peuker
>Priority: Minor
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withTriggeringFrequency(..)
>  * withNumFileShards(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.
> withMethod(..) cannot be changed at runtime currently.
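
The idea behind the request can be sketched without Beam on the classpath. The `Provider` interface below is a simplified stand-in that mirrors the two methods of Beam's `ValueProvider<T>` (`get()` and `isAccessible()`); `Deferred`, `WriteStep` and `shardFor` are hypothetical names for illustration and do not exist in BigQueryIO.

```java
import java.util.concurrent.atomic.AtomicReference;

final class DeferredOptionSketch {

  /** Simplified stand-in for Beam's ValueProvider<T>. */
  interface Provider<T> {
    T get();
    boolean isAccessible();
  }

  /** A value already known when the pipeline is constructed. */
  static <T> Provider<T> fixed(final T value) {
    return new Provider<T>() {
      @Override public T get() { return value; }
      @Override public boolean isAccessible() { return true; }
    };
  }

  /** A value that is only supplied when the template is launched. */
  static final class Deferred<T> implements Provider<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();

    void supply(T value) { ref.set(value); }

    @Override public T get() {
      T v = ref.get();
      if (v == null) {
        throw new IllegalStateException("value is not available before runtime");
      }
      return v;
    }

    @Override public boolean isAccessible() { return ref.get() != null; }
  }

  /** Hypothetical write step: reads the shard count only when it processes data. */
  static final class WriteStep {
    private final Provider<Integer> numFileShards;

    WriteStep(Provider<Integer> numFileShards) {
      this.numFileShards = numFileShards; // stored, not resolved
    }

    int shardFor(int hash) {
      return Math.floorMod(hash, numFileShards.get());
    }
  }

  public static void main(String[] args) {
    Deferred<Integer> shards = new Deferred<>();
    WriteStep step = new WriteStep(shards); // built before the value exists
    System.out.println("accessible before supply: " + shards.isAccessible());
    shards.supply(100); // the runtime parameter arrives
    System.out.println("shard for 250: " + step.shardFor(250));
  }
}
```

Because the step only stores the provider and resolves it while processing, the same constructed graph can run with different shard counts.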





[jira] [Commented] (BEAM-4571) RedisIO support for write using SET operation

2018-06-18 Thread Jan Peuker (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515523#comment-16515523
 ] 

Jan Peuker commented on BEAM-4571:
--

Apologies for raising this with a spelling mistake first. My colleague Henry 
would like to work on this, waiting for him to sign up.

> RedisIO support for write using SET operation
> -
>
> Key: BEAM-4571
> URL: https://issues.apache.org/jira/browse/BEAM-4571
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-redis
>Affects Versions: 2.4.0
>Reporter: Jan Peuker
>Assignee: Jan Peuker
>Priority: Minor
> Fix For: 2.6.0
>
>
> At the moment, RedisIO only supports "append" operation when writing to 
> Redis. This improvement is to add support for "set" operation.





[jira] [Assigned] (BEAM-4571) RedisIO support for write using SET operation

2018-06-18 Thread Jan Peuker (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Peuker reassigned BEAM-4571:


Assignee: Jan Peuker  (was: Jean-Baptiste Onofré)

> RedisIO support for write using SET operation
> -
>
> Key: BEAM-4571
> URL: https://issues.apache.org/jira/browse/BEAM-4571
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-redis
>Affects Versions: 2.4.0
>Reporter: Jan Peuker
>Assignee: Jan Peuker
>Priority: Minor
> Fix For: 2.6.0
>
>
> At the moment, RedisIO only supports "append" operation when writing to 
> Redis. This improvement is to add support for "set" operation.





[jira] [Updated] (BEAM-4571) RedisIO support for write using SET operation

2018-06-18 Thread Jan Peuker (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Peuker updated BEAM-4571:
-
Description: At the moment, RedisIO only supports "append" operation when 
writing to Redis. This improvement is to add support for "set" operation.
Summary: RedisIO support for write using SET operation  (was: RedisIO 
support for writeet operation)

> RedisIO support for write using SET operation
> -
>
> Key: BEAM-4571
> URL: https://issues.apache.org/jira/browse/BEAM-4571
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-redis
>Affects Versions: 2.4.0
>Reporter: Jan Peuker
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
> Fix For: 2.6.0
>
>
> At the moment, RedisIO only supports "append" operation when writing to 
> Redis. This improvement is to add support for "set" operation.





[jira] [Created] (BEAM-4571) RedisIO support for writeet operation

2018-06-18 Thread Jan Peuker (JIRA)
Jan Peuker created BEAM-4571:


 Summary: RedisIO support for writeet operation
 Key: BEAM-4571
 URL: https://issues.apache.org/jira/browse/BEAM-4571
 Project: Beam
  Issue Type: Improvement
  Components: io-java-redis
Affects Versions: 2.4.0
Reporter: Jan Peuker
Assignee: Jean-Baptiste Onofré
 Fix For: 2.6.0








[jira] [Updated] (BEAM-4096) BigQueryIO ValueProvider support for NumFileShards and Triggering Frequency

2018-04-24 Thread Jan Peuker (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Peuker updated BEAM-4096:
-
Summary: BigQueryIO ValueProvider support for NumFileShards and Triggering 
Frequency  (was: BigQueryIO ValueProvider support for Method and Triggering 
Frequency)

> BigQueryIO ValueProvider support for NumFileShards and Triggering Frequency
> ---
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Assignee: Jan Peuker
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withTriggeringFrequency(..)
>  * withNumFileShards(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.
> withMethod(..) cannot be changed at runtime currently.





[jira] [Commented] (BEAM-4096) BigQueryIO ValueProvider support for NumFileShards and Triggering Frequency

2018-04-24 Thread Jan Peuker (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451598#comment-16451598
 ] 

Jan Peuker commented on BEAM-4096:
--

Removed Method; will focus on Shards (correcting the Javadoc as well) and 
Triggering Frequency.

> BigQueryIO ValueProvider support for NumFileShards and Triggering Frequency
> ---
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Assignee: Jan Peuker
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withTriggeringFrequency(..)
>  * withNumFileShards(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.
> withMethod(..) cannot be changed at runtime currently.





[jira] [Updated] (BEAM-4096) BigQueryIO ValueProvider support for Method and Triggering Frequency

2018-04-24 Thread Jan Peuker (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Peuker updated BEAM-4096:
-
Description: 
Enhance BigQueryIO to accept ValueProviders for:
 * withTriggeringFrequency(..)
 * withNumFileShards(..)

It would allow Dataflow templates to accept these parameters at runtime instead 
of being hardcoded. This opens up the ability to create Dataflow templates 
which allow users to flip back-and-forth between batch and streaming inserts.

withMethod(..) cannot be changed at runtime currently.

  was:
Enhance BigQueryIO to accept ValueProviders for:
 * withMethod(..)
 * withTriggeringFrequency(..)
 * withNumFileShards(..)

It would allow Dataflow templates to accept these parameters at runtime instead 
of being hardcoded. This opens up the ability to create Dataflow templates 
which allow users to flip back-and-forth between batch and streaming inserts.


> BigQueryIO ValueProvider support for Method and Triggering Frequency
> 
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Assignee: Jan Peuker
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withTriggeringFrequency(..)
>  * withNumFileShards(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.
> withMethod(..) cannot be changed at runtime currently.





[jira] [Commented] (BEAM-4096) BigQueryIO ValueProvider support for Method and Triggering Frequency

2018-04-23 Thread Jan Peuker (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447936#comment-16447936
 ] 

Jan Peuker commented on BEAM-4096:
--

Thanks Eugene, very helpful. I suppose you mean the fact that in 
BigQueryIO.Write.expand different PTransforms are created per method. In my 
tests using a manually constructed graph (i.e. StaticValueProvider) it worked, 
but that's probably because I haven't done enough dynamic testing yet.
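
A minimal sketch of why a per-method branch at graph construction conflicts with a runtime parameter, assuming a much simplified model: `expand` here is a hypothetical stand-in that returns step names instead of building PTransforms, the step names are invented, and only two of the write methods are modelled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class ExpandBranchSketch {

  /** Simplified: only two write methods are modelled. */
  enum Method { FILE_LOADS, STREAMING_INSERTS }

  /** Hypothetical stand-in for expand(): the graph shape depends on the method. */
  static List<String> expand(Method method) {
    switch (method) {
      case FILE_LOADS:
        return Arrays.asList("WriteFiles", "LoadJobs"); // invented step names
      case STREAMING_INSERTS:
        return Arrays.asList("StreamingInserts");
      default:
        throw new IllegalArgumentException("unknown method: " + method);
    }
  }

  /**
   * The same decision with a deferred method: the value has to be readable
   * while the graph is being built, otherwise no branch can be chosen.
   */
  static List<String> expandDeferred(Supplier<Method> method, boolean accessibleAtConstruction) {
    if (!accessibleAtConstruction) {
      throw new IllegalStateException("method must be known when the graph is built");
    }
    return expand(method.get());
  }

  public static void main(String[] args) {
    // A static value is readable at construction time, so this works.
    System.out.println(expandDeferred(() -> Method.FILE_LOADS, true));
    // A true runtime value is not, so no graph can be produced.
    try {
      expandDeferred(() -> Method.STREAMING_INSERTS, false);
    } catch (IllegalStateException e) {
      System.out.println("cannot expand: " + e.getMessage());
    }
  }
}
```

This would also explain why a test with a static value passes: the value is already accessible when the graph is built, so the branch can be resolved.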

> BigQueryIO ValueProvider support for Method and Triggering Frequency
> 
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withMethod(..)
>  * withTriggeringFrequency(..)
>  * withNumFileShards(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.





[jira] [Comment Edited] (BEAM-4096) BigQueryIO ValueProvider support for Method and Triggering Frequency

2018-04-16 Thread Jan Peuker (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440364#comment-16440364
 ] 

Jan Peuker edited comment on BEAM-4096 at 4/17/18 5:40 AM:
---

Hi this is Jan, all set up with Jira now.

Small addition here: We also need to change withNumFileShards to a 
ValueProvider, which is a required option right now. The default 1000 mentioned 
in the JavaDoc is incorrect and tends to cause OutOfMemoryError in 
DataflowRunner. From my current, naive benchmarks, a more sensible suggestion 
for most cases seems to be 100 shards (easy to calculate shards on powers of 
10, and it reaches common chunk sizes earlier).
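
As a worked example of the arithmetic only, the sketch below computes the average bytes per shard for the two shard counts mentioned. The 10 GB input size is an illustrative assumption and does not come from the benchmarks; `ShardSizing` is a hypothetical helper, not part of Beam.

```java
/** Hypothetical helper for back-of-the-envelope shard sizing. */
final class ShardSizing {

  /** Average bytes per shard when totalBytes is spread evenly, rounded up. */
  static long bytesPerShard(long totalBytes, int numShards) {
    if (numShards <= 0) {
      throw new IllegalArgumentException("numShards must be positive");
    }
    return (totalBytes + numShards - 1) / numShards;
  }

  public static void main(String[] args) {
    long total = 10_000_000_000L; // 10 GB, an assumed input size
    System.out.println(bytesPerShard(total, 100));  // prints 100000000
    System.out.println(bytesPerShard(total, 1000)); // prints 10000000
  }
}
```

With the assumed 10 GB input, 100 shards give files of about 100 MB each, while 1000 shards give ten times as many files of about 10 MB each.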


was (Author: janpeuker):
Hi this is Jan, all set up with Jira now.

Small addition here: We also need to change withNumFileShards to a 
ValueProvider which is a required option right now. The default 1000 mentioned 
in the JavaDoc is incorrect and tends to cause OutOfMemoryError in 
DataflowRunner. From my current, native, benchmarks it seems a more sensible 
suggestion for most cases seems to have 100 shards (easy to calculate shard on 
powers of 2 and reaches common chunk sizes earlier).

> BigQueryIO ValueProvider support for Method and Triggering Frequency
> 
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withMethod(..)
>  * withTriggeringFrequency(..)
>  * withNumFileShards(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.





[jira] [Comment Edited] (BEAM-4096) BigQueryIO ValueProvider support for Method and Triggering Frequency

2018-04-16 Thread Jan Peuker (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440364#comment-16440364
 ] 

Jan Peuker edited comment on BEAM-4096 at 4/17/18 4:19 AM:
---

Hi this is Jan, all set up with Jira now.

Small addition here: We also need to change withNumFileShards to a 
ValueProvider which is a required option right now. The default 1000 mentioned 
in the JavaDoc is incorrect and tends to cause OutOfMemoryError in 
DataflowRunner. From my current, native, benchmarks it seems a more sensible 
suggestion for most cases seems to have 100 shards (easy to calculate shard on 
powers of 2 and reaches common chunk sizes earlier).


was (Author: janpeuker):
Hi this is Jan, all set up with Jira now.

Small addition here: We also need to be change withNumFileShards to a 
ValueProviders which is a required option right now. The default 1000 mentioned 
in the JavaDoc is incorrect and tends to cause OutOfMemoryError in 
DataflowRunner. From my current, native, benchmarks it seems a more sensible 
suggestion for most cases seems to have 100 shards (easy to calculate shard on 
powers of 2 and reaches common chunk sizes earlier).

> BigQueryIO ValueProvider support for Method and Triggering Frequency
> 
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withMethod(..)
>  * withTriggeringFrequency(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.





[jira] [Commented] (BEAM-4096) BigQueryIO ValueProvider support for Method and Triggering Frequency

2018-04-16 Thread Jan Peuker (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440364#comment-16440364
 ] 

Jan Peuker commented on BEAM-4096:
--

Hi this is Jan, all set up with Jira now.

Small addition here: We also need to be change withNumFileShards to a 
ValueProviders which is a required option right now. The default 1000 mentioned 
in the JavaDoc is incorrect and tends to cause OutOfMemoryError in 
DataflowRunner. From my current, native, benchmarks it seems a more sensible 
suggestion for most cases seems to have 100 shards (easy to calculate shard on 
powers of 2 and reaches common chunk sizes earlier).

> BigQueryIO ValueProvider support for Method and Triggering Frequency
> 
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withMethod(..)
>  * withTriggeringFrequency(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.


