FYI: patch submitted - https://github.com/apache/spark/pull/25996
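[For reference, the fix is presumably along these lines in the zstd codec path. This is a sketch only, not the exact diff in PR 25996: "ZstdEventLogCodec" is an illustrative name, and it assumes the "closeFrameOnFlush" constructor argument of zstd-jni's ZstdOutputStream that the thread below discusses.]

import java.io.{BufferedOutputStream, OutputStream}
import com.github.luben.zstd.ZstdOutputStream

// Illustrative sketch: a codec whose output streams close the current zstd
// frame on flush(), so a reader can decode everything written so far even
// while the event log file is still being appended to.
class ZstdEventLogCodec(level: Int, bufferSize: Int) {
  def compressedOutputStream(s: OutputStream): OutputStream = {
    // closeFrameOnFlush = true: flush() ends the frame instead of leaving
    // it open until close(), which is what starves the SHS reader.
    val zstd = new ZstdOutputStream(s, level, /* closeFrameOnFlush = */ true)
    new BufferedOutputStream(zstd, bufferSize)
  }
}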
On Wed, Oct 2, 2019 at 3:25 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

> I need to run a full manual test to be sure, but according to an experiment
> (a small UT), "closeFrameOnFlush" seems to work.
>
> There was a relevant change on the master branch, SPARK-26283 [1], which
> switched reading of zstd event log files to "continuous" mode, which
> appears to read an open frame. With "closeFrameOnFlush" set to false on
> ZstdOutputStream, the frame is never closed (even when the output stream
> is flushed) until the output stream itself is closed.
>
> I'll raise a patch once the manual test passes. Sorry for the false alarm.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 1. https://issues.apache.org/jira/browse/SPARK-26283
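[To make the frame behavior above concrete, a minimal sketch against zstd-jni directly, again assuming the three-argument ZstdOutputStream constructor that exposes "closeFrameOnFlush". Against an in-memory snapshot the truncated frame fails fast; against a still-growing file the reader blocks instead, which is the hang in SPARK-29322.]

import java.io.{ByteArrayInputStream, ByteArrayOutputStream, IOException}
import com.github.luben.zstd.{ZstdInputStream, ZstdOutputStream}

object ZstdFlushDemo {
  // Write a line and flush, then snapshot the compressed bytes *without*
  // closing the stream -- mimicking an in-progress event log file.
  def flushedBytes(closeFrameOnFlush: Boolean): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val zos = new ZstdOutputStream(bos, 3, closeFrameOnFlush)
    zos.write("{\"Event\":\"SparkListenerApplicationStart\"}\n".getBytes("UTF-8"))
    zos.flush()
    bos.toByteArray
  }

  def readAll(bytes: Array[Byte]): String = {
    val in = new ZstdInputStream(new ByteArrayInputStream(bytes))
    try scala.io.Source.fromInputStream(in, "UTF-8").mkString finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Frame closed on flush: the snapshot is a complete frame and reads back.
    println(readAll(flushedBytes(closeFrameOnFlush = true)))
    // Frame left open: the snapshot ends mid-frame, so the read errors out
    // here (and would block forever on a file that is still being written).
    try println(readAll(flushedBytes(closeFrameOnFlush = false)))
    catch { case e: IOException => println(s"truncated frame: $e") }
  }
}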
> On Wed, Oct 2, 2019 at 2:33 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>
>> The changelog for zstd v1.4.3 suggests to me that the changes aren't
>> related.
>>
>> https://github.com/facebook/zstd/blob/dev/CHANGELOG#L1-L5
>>
>> v1.4.3
>> bug: Fix Dictionary Compression Ratio Regression by @cyan4973 (#1709)
>> bug: Fix Buffer Overflow in v0.3 Decompression by @felixhandte (#1722)
>> build: Add support for IAR C/C++ Compiler for Arm by @joseph0918 (#1705)
>> misc: Add NULL pointer check in util.c by @leeyoung624 (#1706)
>>
>> But it's only a matter of a dependency update and rebuild, so I'll try it
>> out.
>>
>> Before that, I noticed that ZstdOutputStream has a parameter
>> "closeFrameOnFlush" which seems to control flush behavior. We leave it at
>> its default value, which is "false". Let me set it to "true" and see
>> whether that helps. Please let me know if anyone knows why we picked
>> "false" (or left it at the default).
>>
>> On Wed, Oct 2, 2019 at 1:48 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>
>>> Thank you for reporting, Jungtaek.
>>>
>>> Can we try upgrading it to the newer version first?
>>>
>>> Since we are at 1.4.2, the newer version is 1.4.3.
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Tue, Oct 1, 2019 at 9:18 PM Mridul Muralidharan <mri...@gmail.com> wrote:
>>>
>>>> It makes more sense to drop support for zstd, assuming the fix is not
>>>> something at the Spark end (configuration, etc.).
>>>> It does not make sense to try to detect a deadlock in the codec.
>>>>
>>>> Regards,
>>>> Mridul
>>>>
>>>> On Tue, Oct 1, 2019 at 8:39 PM Jungtaek Lim
>>>> <kabhwan.opensou...@gmail.com> wrote:
>>>> >
>>>> > Hi devs,
>>>> >
>>>> > I've discovered an issue with the event logger: reading an incomplete
>>>> event log file compressed with 'zstd' causes the reader thread to get
>>>> stuck on that file.
>>>> >
>>>> > This is very easy to reproduce: set the configuration as below
>>>> >
>>>> > - spark.eventLog.enabled=true
>>>> > - spark.eventLog.compress=true
>>>> > - spark.eventLog.compression.codec=zstd
>>>> >
>>>> > and start a Spark application. While the application is running, load
>>>> it in the SHS web page. Replaying the event log may succeed, but more
>>>> likely both the replay and the loading page will get stuck.
>>>> >
>>>> > Please refer to SPARK-29322 for more details.
>>>> >
>>>> > As the issue only occurs with 'zstd', the simplest approach is
>>>> dropping 'zstd' support for event logs. A more general approach would be
>>>> introducing a timeout on reading the event log file, but that would have
>>>> to differentiate a stuck thread from a thread that is simply busy reading
>>>> a huge event log file.
>>>> >
>>>> > Which approach would be preferred in the Spark community, or does
>>>> someone have a better idea for handling this?
>>>> >
>>>> > Thanks,
>>>> > Jungtaek Lim (HeartSaVioR)
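[For anyone wanting to reproduce this, the settings quoted above translate directly into a SparkConf, equivalent to passing them as --conf flags to spark-submit. The app name is arbitrary.]

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Reproduction settings from the report: compress event logs with zstd,
// then open the running application in the History Server UI.
val conf = new SparkConf()
  .setAppName("zstd-eventlog-repro")
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.compress", "true")
  .set("spark.eventLog.compression.codec", "zstd")

val spark = SparkSession.builder().config(conf).getOrCreate()
// Keep the application alive (e.g., run a long job), then load its page in
// the SHS while it is still running to observe the stuck replay.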