Just submitted the patch: https://github.com/apache/spark/pull/29077
On Tue, Jun 16, 2020 at 3:40 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote: > Bump this again. I filed SPARK-31985 [1] and plan to submit a PR in a > couple of days if there's no voice on the reason we should keep it. > > 1. https://issues.apache.org/jira/browse/SPARK-31985 > > On Thu, May 21, 2020 at 8:54 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> > wrote: > >> Let me share the effect on removing the incomplete and undocumented code >> path. I manually tried out removing the code path and here's the change. >> >> >> https://github.com/HeartSaVioR/spark/commit/aa53e9b1b33c0b8aec37704ad290b42ffb2962d8 >> >> 1,120 lines deleted without hurting any existing streaming tests, except >> a suite I removed as well since it tests such code path. No need to update >> documentation as it was never publicized. This also removes some rules >> which only apply for continuous mode, which gave exceptions on the fact >> that "continuous mode is only available for map-like operations". Removing >> them would be back to simplify the usage of continuous mode. >> >> Also worth noting that I had to manually remove the code path instead of >> revert, because the code path has been changed to reflect DSv2 change. What >> this means? We have to "update" the code path and concern >> about compatibility, etc. while it never be used in production. >> >> On Tue, May 19, 2020 at 1:14 PM Jungtaek Lim < >> kabhwan.opensou...@gmail.com> wrote: >> >>> Hi devs, >>> >>> during experiment on complete mode I realized we left some incomplete >>> code parts on supporting aggregation for continuous mode. (shuffle & >>> coalesce) >>> >>> The work had been occurred around first half of 2018 and stopped, and no >>> work has been done for around 2 years (so I don't expect anyone is working >>> on this). The functionality is undocumented (as the work was only done >>> partially) and continuous mode is experimental so I don't feel risks to get >>> rid of the part. >>> >>> What do you think? If it makes sense then I'll raise a PR to get rid of >>> the incomplete codes. >>> >>> Thanks, >>> Jungtaek Lim (HeartSaVioR) >>> >>> ps. I'm kind of feeling the same with continuous mode itself - it has >>> been still experimental over the 2 years, and while it lacks lots of >>> essential things (backpressure, epoch synchronization, query progress, >>> etc.) there's no additional major work over 1 year. We eventually have been >>> excluded continuous mode for the new streaming feature like observable >>> metric, streaming UI, etc. because it is on top of the feature >>> which continuous mode lacks. >>> >>> Unlike incomplete code path, I'm not strongly against this, as it >>> has been documented and someone might use the feature in production. I >>> still think it's ideal to retire the feature smoothly. >>> (Please chime in with the use case if someone has the production cases.) >>> >>> ps2. It feels like "feature branch" looks to be a thing to consider for >>> efforts on one big feature. >>> >>