[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2018-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323442#comment-16323442
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user coveralls commented on the issue:

https://github.com/apache/flink/pull/1563
  

[![Coverage Status](https://coveralls.io/builds/15015307/badge)](https://coveralls.io/builds/15015307)

Changes Unknown when pulling **df49d5bb8ba778cdd17b94318c4bf48c6d1747ad on rmetzger:flink3296** into ** on apache:master**.



> DataStream.write*() methods are not flushing properly
> -
>
> Key: FLINK-3296
> URL: https://issues.apache.org/jira/browse/FLINK-3296
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Critical
> Fix For: 1.0.0
>
>
> The DataStream.write() methods rely on the {{FileSinkFunctionByMillis}} 
> class, which has logic for flushing records, even though the underlying 
> stream is never flushed. This is misleading for users, as files are not 
> written the way they would expect.
> The code was initially written with FileOutputFormats in mind, but the types 
> were not set correctly. This PR opened the write() method to any output 
> format: https://github.com/apache/flink/pull/706/files



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148707#comment-15148707
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1563




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148652#comment-15148652
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1563#issuecomment-184700077
  
I'll merge the PR.




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148479#comment-15148479
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1563#issuecomment-184642511
  
I renamed the method to `writeUsingOutputFormat` and rebased to current 
master.




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144642#comment-15144642
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1563#issuecomment-183356403
  
Thank you for the review. I've addressed the comments and rebased the 
change.

Once travis has passed, I'll merge it!




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144868#comment-15144868
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1563#issuecomment-183414977
  
I think the `writeOutputFormat` method name could be misleading. To me it 
implies writing the `OutputFormat`, not writing something by using the 
`OutputFormat`.




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143176#comment-15143176
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1563#issuecomment-182983683
  
Changes look good to me




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126388#comment-15126388
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1563#issuecomment-178020653
  
Code looks good. I would like to update the docs to include a bit more info 
(like in the inline comment) and at least refer to the `addSink(...)` method 
for fault-tolerant sinks.




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-02-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126351#comment-15126351
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1563#discussion_r51428065
  
--- Diff: docs/apis/streaming/index.md ---
@@ -1768,6 +1768,11 @@ greater than 1, the output will also be prepended with the identifier of the tas
 
 
 
+Note that the `write*()` methods on `DataStream` are mainly intended for debugging purposes.
+They are not participating in Flink's checkpointing (no fault tolerance guarantees). The
--- End diff --

It may be worth adding that this usually means "at-least-once", but it may 
also mean data loss in cases where the output formats buffer data and do not 
immediately persist it.
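
The two outcomes described in this comment can be illustrated with a small, self-contained model. This is only a sketch and not Flink code: `NonCheckpointedSinkModel`, `ModelFormat`, `run` and all of their parameters are hypothetical names introduced for illustration. It assumes that after a failure the source replays every record after the last checkpoint, and that the sink keeps no state in the checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model (not Flink code) of a sink that does not take part in checkpointing. */
class NonCheckpointedSinkModel {

    /** Hypothetical stand-in for an output format; bufferSize 1 persists every record immediately. */
    static final class ModelFormat {
        private final int bufferSize;
        private final List<Integer> buffer = new ArrayList<>(); // in memory, lost on failure
        private final List<Integer> persisted;                  // survives a failure, e.g. a file

        ModelFormat(int bufferSize, List<Integer> persisted) {
            this.bufferSize = bufferSize;
            this.persisted = persisted;
        }

        void writeRecord(int record) {
            buffer.add(record);
            if (buffer.size() >= bufferSize) {
                flush();
            }
        }

        void close() {
            flush();
        }

        private void flush() {
            persisted.addAll(buffer);
            buffer.clear();
        }
    }

    /**
     * Processes records 1..total, fails after failAfter records, then restarts
     * from the source position recorded at the checkpoint (checkpointAt).
     */
    static List<Integer> run(int bufferSize, int total, int checkpointAt, int failAfter) {
        List<Integer> persisted = new ArrayList<>();

        ModelFormat beforeFailure = new ModelFormat(bufferSize, persisted);
        for (int r = 1; r <= failAfter; r++) {
            beforeFailure.writeRecord(r);
        }
        // Failure: the first instance is dropped without close(), so its buffer is lost.

        ModelFormat afterRestart = new ModelFormat(bufferSize, persisted);
        for (int r = checkpointAt + 1; r <= total; r++) {
            afterRestart.writeRecord(r); // replayed records
        }
        afterRestart.close();
        return persisted;
    }

    public static void main(String[] args) {
        // Immediate persistence: records 5 and 6 appear twice (at-least-once).
        System.out.println(run(1, 6, 4, 6));  // prints [1, 2, 3, 4, 5, 6, 5, 6]
        // Buffering format: records 1 to 4 were never persisted and are not replayed.
        System.out.println(run(10, 6, 4, 6)); // prints [5, 6]
    }
}
```

With a buffer size of 1 the replayed records show up as duplicates. With a larger buffer, everything that was buffered before the checkpoint but not yet persisted is gone, because the replay starts after the checkpoint. Real output formats may behave differently on restart (for example by overwriting the target file), which this model ignores.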




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-01-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124777#comment-15124777
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/1563#issuecomment-177088061
  
With the `RollingHDFSSink` this functionality is not needed any more and, as 
you suggested, it was apparently a misleading implementation anyway. I like your 
note in the docs. :+1: 
Could you please add `[api-breaking]` to the commit message when merging?




[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

2016-01-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124194#comment-15124194
 ] 

ASF GitHub Bot commented on FLINK-3296:
---

GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/1563

[FLINK-3296] Remove 'flushing' behavior of the OutputFormat in DataStream API

I removed the `FileSinkFunctionByMillis` and removed all the `millis` 
arguments on the writing functions.

The whole "buffering" and "flushing" functionality was broken: elements 
were kept in an ArrayList and sent to the OutputFormat on "flush()". However, 
the flush was not actually called periodically; it was only checked when new 
records arrived. So when a stream has no elements for a certain time, the last 
few elements just stay in the list until new elements arrive again.
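
The problem described above can be reproduced with a minimal model. This is a sketch, not the removed `FileSinkFunctionByMillis` code: `ArrivalTriggeredBufferingSink`, `ArrivalTriggeredFlushDemo`, their methods and the explicit `nowMillis` parameter are hypothetical and only serve to keep the example deterministic. The one property taken from the description is that the flush interval is evaluated only when a record arrives.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a sink that checks its flush interval only when a record arrives. */
class ArrivalTriggeredBufferingSink {
    private final long flushIntervalMillis;
    private final List<String> buffer = new ArrayList<>();
    private final List<String> written = new ArrayList<>(); // stands in for the OutputFormat
    private long lastFlushMillis;

    ArrivalTriggeredBufferingSink(long flushIntervalMillis, long nowMillis) {
        this.flushIntervalMillis = flushIntervalMillis;
        this.lastFlushMillis = nowMillis;
    }

    /** Called once per record; the current time is passed in instead of read from a clock. */
    void invoke(String record, long nowMillis) {
        buffer.add(record);
        // No timer exists: the interval is only evaluated here, on record arrival.
        if (nowMillis - lastFlushMillis >= flushIntervalMillis) {
            written.addAll(buffer);
            buffer.clear();
            lastFlushMillis = nowMillis;
        }
    }

    List<String> written() {
        return written;
    }

    List<String> buffered() {
        return buffer;
    }
}

class ArrivalTriggeredFlushDemo {
    public static void main(String[] args) {
        ArrivalTriggeredBufferingSink sink = new ArrivalTriggeredBufferingSink(100, 0);
        sink.invoke("a", 10);  // buffered, interval not yet elapsed
        sink.invoke("b", 120); // interval elapsed, "a" and "b" are handed over
        sink.invoke("c", 130); // buffered, and stays buffered if the stream goes idle
        // prints written=[a, b] buffered=[c]
        System.out.println("written=" + sink.written() + " buffered=" + sink.buffered());
    }
}
```

No matter how much time passes after the last call, "c" is never handed over unless another record arrives, which is the behavior the pull request removes.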

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink3296

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1563.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1563


commit 3275adaaf27f6e1ec74ffc2a48169239da0e1f5b
Author: Robert Metzger 
Date:   2016-01-28T13:56:29Z

[FLINK-3296] Remove 'flushing' behavior of the OutputFormat support of the 
DataStream API



