Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Jingsong Li
+1 to remove the Bucketing Sink. Thanks for the effort on ORC and `HadoopPathBasedBulkFormatBuilder`, I think it's safe to get rid of the old Bucketing API with them. Best, Jingsong On Thu, Oct 29, 2020 at 3:06 AM Kostas Kloudas wrote: > Thanks for the discussion! > > From this thread I do

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Kostas Kloudas
Thanks for the discussion! From this thread I do not see any objection to moving forward with removing the sink. Given this, I will open a voting thread tomorrow. Cheers, Kostas On Wed, Oct 28, 2020 at 6:50 PM Stephan Ewen wrote: > > +1 to remove the Bucketing Sink. > > It has been very

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Stephan Ewen
+1 to remove the Bucketing Sink. It has been very common in the past to remove code that was deprecated for multiple releases in favor of reducing baggage. Also in cases that had no perfect drop-in replacement, but needed users to forward fit the code. I am not sure I understand why this case is

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Chesnay Schepler
Then we can't remove it, because there is no way for us to ascertain whether anyone is still using it. Sure, the user ML is the best we've got, but you can't argue that we don't want any users to be affected and then use an imperfect means to find users. If you are fine with relying on the user

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Kostas Kloudas
No, I do not think that "we are fine with removing it at the cost of friction for some users". I believe that this can be another discussion that we should have as soon as we establish that someone is actually using it. The point I am trying to make is that if no user is using it, we should

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Chesnay Schepler
The alternative could also be to use a different argument than "no one uses it", e.g., we are fine with removing it at the cost of friction for some users because there are better alternatives. On 10/28/2020 10:46 AM, Kostas Kloudas wrote: I think that the mailing lists are the best we can do

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Kostas Kloudas
I think that the mailing lists are the best we can do and I would say that they seem to be working pretty well (e.g. the recent Mesos discussion). Of course they are not perfect, but the alternative would be to never remove anything user-facing until the next major release, which I find pretty

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Chesnay Schepler
If the conclusion is that we shouldn't remove it if _anyone_ is using it, then we cannot remove it because the user ML obviously does not reach all users. On 10/28/2020 9:28 AM, Kostas Kloudas wrote: Hi all, I am bringing this up again to see if there are any users actively using the

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Kostas Kloudas
Hi all, I am bringing this up again to see if there are any users actively using the BucketingSink. So far, if I am not mistaken (and really sorry if I forgot anything), it is only a discussion between devs about the potential problems of removing it. I totally understand Chesnay's concern about

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-16 Thread Chesnay Schepler
@Seth: Earlier in this discussion it was said that the BucketingSink would not be usable in 1.12. On 10/16/2020 4:25 PM, Seth Wiesman wrote: +1 It has been deprecated for some time and the StreamingFileSink has stabilized with a large number of formats and features. Plus, the bucketing sink

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-16 Thread Seth Wiesman
+1 It has been deprecated for some time and the StreamingFileSink has stabilized with a large number of formats and features. Plus, the bucketing sink only implements a small number of stable interfaces[1]. I would expect users to continue to use the bucketing sink from the 1.11 release with

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-15 Thread Kostas Kloudas
@Arvid Heise I also do not remember exactly what all the problems were. The fact that we added some more bulk formats to the streaming file sink definitely reduced the number of unsupported features. In addition, the latest discussion I found on the topic was [1] and the conclusion of that discussion

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-14 Thread Arvid Heise
I remember this conversation popping up a few times already and I'm in general a big fan of removing BucketingSink. However, until now there were a few features lacking in StreamingFileSink that are present in BucketingSink and that are being actively used (I can't exactly remember them now, but

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Kostas Kloudas
@Chesnay Schepler Off the top of my head, I cannot find an easy way to migrate from the BucketingSink to the StreamingFileSink. It may be possible but it will require some effort because the logic would be "read the old state, commit it, and start fresh with the StreamingFileSink." On Tue, Oct

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Aljoscha Krettek
On 13.10.20 14:01, David Anderson wrote: I thought this was waiting on FLIP-46 -- Graceful Shutdown Handling -- and in fact, the StreamingFileSink is mentioned in that FLIP as a motivating use case. Ah yes, I see FLIP-147 as a more general replacement for FLIP-46. Thanks for the reminder, we

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread David Anderson
> The BucketingSink suffers from the same problem. It's caused by the fact > that we don't do a "final" checkpoint before shutting down a pipeline. > We're trying to resolve that with FLIP-147 [1]. I thought this was waiting on FLIP-46 -- Graceful Shutdown Handling -- and in fact, the

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Jingsong Li
Hi, I'd like to share a concern: although we now support an ORC writer, it is not easy to support. We need to override something for ORC classes. Note that we are using a newer version of ORC, which is not forward compatible. Therefore, the data written by users using the Flink ORC writer may not be readable by

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Chesnay Schepler
How easy is the migration to the StreamingFileSink? On 10/13/2020 1:01 PM, Aljoscha Krettek wrote: On 13.10.20 11:18, David Anderson wrote: I think the pertinent question is whether there are interesting cases where the BucketingSink is still a better choice. One case I'm not sure about is

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Aljoscha Krettek
On 13.10.20 11:18, David Anderson wrote: I think the pertinent question is whether there are interesting cases where the BucketingSink is still a better choice. One case I'm not sure about is the situation described in the docs for the StreamingFileSink under Important Note 2 [1]: ... upon

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread David Anderson
I think the pertinent question is whether there are interesting cases where the BucketingSink is still a better choice. One case I'm not sure about is the situation described in the docs for the StreamingFileSink under Important Note 2 [1]: ... upon normal termination of a job, the last

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Konstantin Knauf
Given that it has been deprecated for three releases now, I am +1 to dropping it. On Mon, Oct 12, 2020 at 9:38 PM Chesnay Schepler wrote: > Is there a way for us to change the module (in a reasonable way) that > would allow users to continue using it? > Is it an API problem, or one of

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-12 Thread Chesnay Schepler
Is there a way for us to change the module (in a reasonable way) that would allow users to continue using it? Is it an API problem, or one of semantics? On 10/12/2020 4:57 PM, Kostas Kloudas wrote: Hi Chesnay, Unfortunately not from what I can see in the code. This is the reason why I am

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-12 Thread Kostas Kloudas
Hi Chesnay, Unfortunately not from what I can see in the code. This is the reason why I am opening a discussion. I think that if we supported backwards compatibility, this would have been an easier process. Kostas On Mon, Oct 12, 2020 at 4:32 PM Chesnay Schepler wrote: > > Are older versions

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-12 Thread Chesnay Schepler
Are older versions of the module compatible with 1.12+? On 10/12/2020 4:30 PM, Kostas Kloudas wrote: Hi all, As the title suggests, this thread is to discuss the removal of the flink-connector-filesystem module which contains (only) the deprecated BucketingSink. The BucketingSink is deprecated

[DISCUSS] Remove flink-connector-filesystem module.

2020-10-12 Thread Kostas Kloudas
Hi all, As the title suggests, this thread is to discuss the removal of the flink-connector-filesystem module which contains (only) the deprecated BucketingSink. The BucketingSink has been deprecated since Flink 1.9 [1] in favor of the relatively recently introduced StreamingFileSink. For the sake of a