FYI: patch submitted - https://github.com/apache/spark/pull/25996

On Wed, Oct 2, 2019 at 3:25 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

> I still need to run a full manual test to be sure, but according to an
> experiment (a small unit test), "closeFrameOnFlush" seems to work.
>
> There was a relevant change on the master branch, SPARK-26283 [1], which
> changed the way the zstd event log file is read to "continuous" mode; that
> mode appears to read the still-open frame. With "closeFrameOnFlush" left
> false on ZstdOutputStream, the frame is never closed (even when the output
> stream is flushed) until the output stream itself is closed.
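>
> A minimal sketch of the reading side of that interaction (illustrative
> only, not Spark's actual code; it assumes zstd-jni's "setContinuous"
> reader flag that SPARK-26283 switched to, and the file name is made up):
>
>   import java.io.{BufferedInputStream, FileInputStream, InputStream}
>   import com.github.luben.zstd.ZstdInputStream
>
>   // Continuous mode keeps reading an in-progress file. If the writer
>   // never closes the current frame, a read here can block waiting for
>   // frame data that only arrives when the writer's stream is closed.
>   val raw: InputStream = new FileInputStream("eventlog.zst.inprogress")
>   val in = new BufferedInputStream(
>     new ZstdInputStream(raw).setContinuous(true))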
>
> I'll raise a patch once the manual test passes. Sorry for the false alarm.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 1. https://issues.apache.org/jira/browse/SPARK-26283
>
> On Wed, Oct 2, 2019 at 2:33 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> The changelog for zstd v1.4.3 suggests to me that the changes are not
>> related.
>>
>> https://github.com/facebook/zstd/blob/dev/CHANGELOG#L1-L5
>>
>> v1.4.3
>> bug: Fix Dictionary Compression Ratio Regression by @cyan4973 (#1709)
>> bug: Fix Buffer Overflow in v0.3 Decompression by @felixhandte (#1722)
>> build: Add support for IAR C/C++ Compiler for Arm by @joseph0918 (#1705)
>> misc: Add NULL pointer check in util.c by @leeyoung624 (#1706)
>>
>> But it's only a matter of updating the dependency and rebuilding, so I'll
>> try it out.
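>>
>> (For reference, that means bumping the zstd-jni artifact; in sbt-style
>> coordinates - Spark itself pins the version in its Maven pom - that would
>> be something like:
>>
>>   libraryDependencies += "com.github.luben" % "zstd-jni" % "1.4.3"
>> )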
>>
>> Before that, I just noticed that ZstdOutputStream has a parameter
>> "closeFrameOnFlush" which seems to deal with flushing. We leave it at its
>> default value, which is "false". Let me set it to "true" and see whether
>> that helps. Please let me know if anyone knows why we chose false (or left
>> it at the default).
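>>
>> A minimal sketch of what I mean (assuming the zstd-jni constructor
>> overload that takes "closeFrameOnFlush"; stream names, buffer size, and
>> the file name are illustrative, not Spark's actual code):
>>
>>   import java.io.{BufferedOutputStream, FileOutputStream}
>>   import com.github.luben.zstd.ZstdOutputStream
>>
>>   // level = 1; closeFrameOnFlush = true, so flush() finishes the current
>>   // zstd frame and a concurrent reader always sees complete frames.
>>   val zstd = new ZstdOutputStream(new FileOutputStream("eventlog.zst"), 1, true)
>>   val out = new BufferedOutputStream(zstd, 32 * 1024)
>>   out.write("some event json\n".getBytes("UTF-8"))
>>   out.flush()  // closes the frame when closeFrameOnFlush is true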
>>
>>
>> On Wed, Oct 2, 2019 at 1:48 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
>> wrote:
>>
>>> Thank you for reporting, Jungtaek.
>>>
>>> Can we try to upgrade it to the newer version first?
>>>
>>> Since we are at 1.4.2, the newer version is 1.4.3.
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>>
>>>
>>> On Tue, Oct 1, 2019 at 9:18 PM Mridul Muralidharan <mri...@gmail.com>
>>> wrote:
>>>
>>>> It makes more sense to drop support for zstd, assuming the fix is not
>>>> something at the Spark end (configuration, etc.).
>>>> It does not make sense to try to detect a deadlock in the codec.
>>>>
>>>> Regards,
>>>> Mridul
>>>>
>>>> On Tue, Oct 1, 2019 at 8:39 PM Jungtaek Lim
>>>> <kabhwan.opensou...@gmail.com> wrote:
>>>> >
>>>> > Hi devs,
>>>> >
>>>> > I've discovered an issue with the event logger, specifically when
>>>> > reading an incomplete event log file compressed with 'zstd' - the
>>>> > reader thread gets stuck reading the file.
>>>> >
>>>> > This is very easy to reproduce: set the configuration as below
>>>> >
>>>> > - spark.eventLog.enabled=true
>>>> > - spark.eventLog.compress=true
>>>> > - spark.eventLog.compression.codec=zstd
>>>> >
>>>> > and start a Spark application. While the application is running, load
>>>> > the application in the SHS web page. It may succeed in replaying the
>>>> > event log, but most likely it will get stuck and the loading page will
>>>> > be stuck as well.
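>>>> >
>>>> > For instance, a sketch of setting the same flags programmatically
>>>> > (equivalent to putting them in spark-defaults.conf):
>>>> >
>>>> >   import org.apache.spark.SparkConf
>>>> >
>>>> >   val conf = new SparkConf()
>>>> >     .set("spark.eventLog.enabled", "true")
>>>> >     .set("spark.eventLog.compress", "true")
>>>> >     .set("spark.eventLog.compression.codec", "zstd")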
>>>> >
>>>> > Please refer to SPARK-29322 for more details.
>>>> >
>>>> > As the issue only occurs with 'zstd', the simplest approach is to drop
>>>> > support for 'zstd' in the event log. A more general approach would be
>>>> > to introduce a timeout on reading the event log file, but it would need
>>>> > to differentiate a thread that is stuck from a thread that is simply
>>>> > busy reading a huge event log file.
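>>>> >
>>>> > One possible shape of that idea (illustrative only, not Spark code):
>>>> > wrap the input stream so it counts bytes consumed, and time out only
>>>> > when that count stops advancing, rather than on wall-clock time alone:
>>>> >
>>>> >   import java.io.InputStream
>>>> >   import java.util.concurrent.atomic.AtomicLong
>>>> >
>>>> >   // Counts bytes read so a watchdog can tell "stuck" from "busy".
>>>> >   class ProgressTrackingInputStream(in: InputStream) extends InputStream {
>>>> >     val bytesRead = new AtomicLong(0)
>>>> >     override def read(): Int = {
>>>> >       val b = in.read()
>>>> >       if (b >= 0) bytesRead.incrementAndGet()
>>>> >       b
>>>> >     }
>>>> >     override def read(buf: Array[Byte], off: Int, len: Int): Int = {
>>>> >       val n = in.read(buf, off, len)
>>>> >       if (n > 0) bytesRead.addAndGet(n)
>>>> >       n
>>>> >     }
>>>> >   }
>>>> >
>>>> >   // Interrupt the reader thread only if no bytes were consumed for a
>>>> >   // whole interval: a busy reader keeps advancing the counter.
>>>> >   def watch(reader: Thread, s: ProgressTrackingInputStream,
>>>> >       intervalMs: Long): Unit = {
>>>> >     var last = -1L
>>>> >     while (reader.isAlive) {
>>>> >       Thread.sleep(intervalMs)
>>>> >       val now = s.bytesRead.get()
>>>> >       if (now == last) { reader.interrupt(); return }
>>>> >       last = now
>>>> >     }
>>>> >   }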
>>>> >
>>>> > Which approach would be preferred by the Spark community, or would
>>>> > someone like to propose a better idea for handling this?
>>>> >
>>>> > Thanks,
>>>> > Jungtaek Lim (HeartSaVioR)
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>
>>>>
