So, based on the comments in the PR, the underlying issue seems to be
'FileBasedSink.convertToFileResourceIfPossible(stringCoder.decode(inStream));'
not returning the correct result, right ?
If so I think the correct fix might be your proposal (2) - Try to fix the
underlying filesystem to do a better job of file/dir matching

This is a bug we probably have to fix anyways for the local filesystem
and/or HDFS and this will also give us a solution that does not break
update compatibility.

Thanks,
Cham

On Wed, Aug 5, 2020 at 3:41 PM Luke Cwik <[email protected]> wrote:

> Cham, that was one of the options I had mentioned on the PR. The
> difference here is that this is a bug fix and existing users could be
> broken unknowingly so it might be worthwhile to take that breaking change
> (and possibly provide users a way to perform an upgrade using the old
> implementation).
>
>
> On Wed, Aug 5, 2020 at 3:33 PM Chamikara Jayalath <[email protected]>
> wrote:
>
>> This might break the update compatibility for Dataflow streaming
>> pipelines. +Reuven Lax <[email protected]>  +Lukasz Cwik
>> <[email protected]>
>>
>> In other cases, to save update compatibility, we introduced a user option
>> that changes the coder only when the user explicitly asks for an updated
>> feature that requires the new coder. For example,
>> https://github.com/apache/beam/commit/304882caa89afe24150062b959ee915c79e72ab3
>>
>> Thanks,
>> Cham
>>
>>
>> On Mon, Aug 3, 2020 at 10:00 AM David Janíček <[email protected]>
>> wrote:
>>
>>> Hello everyone,
>>>
>>> I've reported an issue https://issues.apache.org/jira/browse/BEAM-10292
>>> which is about broken DefaultFilenamePolicy.ParamsCoder behavior.
>>> DefaultFilenamePolicy.ParamsCoder loses information whether
>>> DefaultFilenamePolicy.Params's baseFilename resource is file or directory
>>> on some filesystems, at least on local FS and HDFS.
>>>
>>> After discussion with @dmvk and @lukecwik, we have agreed that the best
>>> solution could be to take the breaking change and use ResourceIdCoder for
>>> encoding/decoding DefaultFilenamePolicy.Params's baseFilename, this way the
>>> file/directory information is preserved.
>>> The solution is implemented in pull request
>>> https://github.com/apache/beam/pull/12050.
>>>
>>> I'd like to ask if there is a consensus on this breaking change. Is
>>> everyone OK with this?
>>> Thanks in advance for answers.
>>>
>>> Best regards,
>>> David
>>>
>>

Reply via email to