[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

2020-05-28 Thread Pablo Estrada (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118985#comment-17118985
 ] 

Pablo Estrada commented on BEAM-1438:
-

Ah, it looks like it's just a matter of removing the check.
[https://github.com/apache/beam/pull/11850] is out to fix this.

> The default behavior for the Write transform doesn't work well with the 
> Dataflow streaming runner
> -
>
> Key: BEAM-1438
> URL: https://issues.apache.org/jira/browse/BEAM-1438
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Reuven Lax
> Assignee: Reuven Lax
> Priority: P2
> Fix For: 2.5.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> If a Write specifies 0 output shards, that implies the runner should pick an 
> appropriate sharding. The default behavior is to write one shard per input 
> bundle. This works well with the Dataflow batch runner, but not with the 
> streaming runner, which produces large numbers of small bundles.
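
To make the sharding problem in the description concrete, here is a minimal sketch of how the number of output shards grows under a one-shard-per-bundle policy. It does not use the Beam API: the class name, the method name, and the bundle sizes are all hypothetical, chosen only to contrast a few large bundles (batch-like) with many small ones (streaming-like).

```java
public class ShardingSketch {

    /**
     * Number of output shards when every input bundle becomes one shard.
     * Hypothetical model: elements are split into bundles of at most
     * bundleSize elements, so the shard count is the ceiling of
     * elements / bundleSize.
     */
    static long shardsPerBundle(long elements, long bundleSize) {
        if (elements < 0 || bundleSize <= 0) {
            throw new IllegalArgumentException("elements >= 0 and bundleSize > 0 required");
        }
        return (elements + bundleSize - 1) / bundleSize;
    }

    public static void main(String[] args) {
        long elements = 1_000_000L;

        // Assumed bundle sizes, for illustration only.
        long batchLikeBundle = 100_000L;
        long streamingLikeBundle = 10L;

        System.out.println("batch-like shards:     "
                + shardsPerBundle(elements, batchLikeBundle));
        System.out.println("streaming-like shards: "
                + shardsPerBundle(elements, streamingLikeBundle));
    }
}
```

With these assumed numbers the same input yields 10 shards for the large bundles and 100,000 shards for the small ones, which is the kind of blow-up the description attributes to the streaming runner.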



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

2020-05-28 Thread Pablo Estrada (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118980#comment-17118980
 ] 

Pablo Estrada commented on BEAM-1438:
-

[~reuvenlax] are you able to take a look at this?






[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

2020-05-28 Thread Pablo Estrada (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118979#comment-17118979
 ] 

Pablo Estrada commented on BEAM-1438:
-

Reopening this issue: as others have correctly pointed out, this will not 
work on Dataflow.






[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

2019-11-06 Thread Amit Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968788#comment-16968788
 ] 

Amit Kumar commented on BEAM-1438:
--

I have also recently seen a failure with withNumShards(0) for an unbounded source.

> The default behavior for the Write transform doesn't work well with the 
> Dataflow streaming runner
> -
>
> Key: BEAM-1438
> URL: https://issues.apache.org/jira/browse/BEAM-1438
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Reuven Lax
> Assignee: Reuven Lax
> Priority: Major
> Fix For: 2.5.0
>
>
> If a Write specifies 0 output shards, that implies the runner should pick an 
> appropriate sharding. The default behavior is to write one shard per input 
> bundle. This works well with the Dataflow batch runner, but not with the 
> streaming runner which produces large numbers of small bundles.





[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

2019-10-15 Thread Alexey Romanenko (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951897#comment-16951897
 ] 

Alexey Romanenko commented on BEAM-1438:


Does it actually work with Dataflow for file-based IOs? As I see it, every IO that 
uses {{WriteFiles.withNumShards(0)}} will fail for an unbounded source because of 
the check that Robert mentioned above. Am I mistaken?
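
For readers following the thread, the kind of check being discussed can be modeled roughly as below. This is a sketch under stated assumptions, not the Beam source: the class, enum, method names, and error message are hypothetical stand-ins, and 0 is used to mean "let the runner pick the sharding", as in the issue description.

```java
public class UnboundedWriteCheckSketch {

    /** Hypothetical stand-in for whether the input collection is bounded. */
    enum Boundedness { BOUNDED, UNBOUNDED }

    /**
     * Hypothetical validation: an unbounded input is rejected unless the
     * caller asks for an explicit, positive number of output shards.
     * numShards == 0 means runner-determined sharding.
     */
    static void validate(Boundedness input, int numShards) {
        if (numShards < 0) {
            throw new IllegalArgumentException("numShards must be >= 0");
        }
        if (input == Boundedness.UNBOUNDED && numShards == 0) {
            throw new IllegalArgumentException(
                    "unbounded input requires an explicit number of output shards");
        }
    }

    /** Convenience wrapper: true if validate() accepts the configuration. */
    static boolean accepts(Boundedness input, int numShards) {
        try {
            validate(input, numShards);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(Boundedness.BOUNDED, 0));   // runner-determined sharding on bounded input
        System.out.println(accepts(Boundedness.UNBOUNDED, 0)); // the case reported as failing in this thread
        System.out.println(accepts(Boundedness.UNBOUNDED, 5)); // explicit sharding on unbounded input
    }
}
```

Under this model, removing the second condition in `validate` would let an unbounded input fall through to runner-determined sharding.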






[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

2019-10-15 Thread Alexey Romanenko (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951867#comment-16951867
 ] 

Alexey Romanenko commented on BEAM-1438:


+1 to Robert's question. Though I guess it was only fixed for the Dataflow and 
Flink runners.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner

2019-10-14 Thread Robert Bradshaw (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951374#comment-16951374
 ] 

Robert Bradshaw commented on BEAM-1438:
---

Does this mean that the error at 
https://github.com/apache/beam/blob/release-2.16.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L315
 can be removed? 



