It makes sense to have a more concrete URN including the version.
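
As a rough sketch of what I mean, assuming a URN layout of "<base>:<sdk version>": the class, the enum and the "example:coder:serialized_java" URN below are all hypothetical, made up for illustration, and not Beam APIs. The idea is that a runner could compare the coder URN of the running pipeline with the one in the replacement and adapt or reject accordingly.

```java
public final class VersionedCoderUrn {
    /** Result of comparing the coder URN of a running pipeline with its replacement. */
    public enum Compatibility { SAME, VERSION_CHANGED, DIFFERENT_CODER }

    private VersionedCoderUrn() {}

    /** Builds a URN such as "example:coder:serialized_java:2.8.0" (hypothetical layout). */
    public static String urn(String baseUrn, String sdkVersion) {
        return baseUrn + ":" + sdkVersion;
    }

    /** Returns the URN without its trailing version segment (text before the last ':'). */
    static String base(String urn) {
        int idx = urn.lastIndexOf(':');
        return idx < 0 ? urn : urn.substring(0, idx);
    }

    /** Classifies the change so a runner could decide whether to accept an update. */
    public static Compatibility compare(String oldUrn, String newUrn) {
        if (oldUrn.equals(newUrn)) {
            return Compatibility.SAME;
        }
        if (base(oldUrn).equals(base(newUrn))) {
            return Compatibility.VERSION_CHANGED;
        }
        return Compatibility.DIFFERENT_CODER;
    }

    public static void main(String[] args) {
        String before = urn("example:coder:serialized_java", "2.8.0");
        String after = urn("example:coder:serialized_java", "2.9.0");
        System.out.println(compare(before, after)); // prints VERSION_CHANGED
    }
}
```

With something like this, a coder that is stable across SDKs would keep an unversioned (or rarely bumped) URN, while SerializedJavaCoder would carry the SDK version and show up as VERSION_CHANGED on upgrade.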

Good idea, Robert.

Regards
JB

On 05/11/2018 16:52, Robert Bradshaw wrote:
> I think we'll want to allow upgrades across SDK versions. A runner
> should be able to recognize when a coder (or any other aspect of the
> pipeline) has changed and adapt/reject accordingly. (Until we remove
> coders from sources/sinks, there's also possibly the expectation that
> one should be able to read data from a source written with that same
> coder across versions as well.)
> 
> I think it really comes down to how coders are named. If we decide to
> let coders change arbitrarily between versions, probably the URN for
> SerializedJavaCoder should have the SDK version number in it. Coders
> that are stable across SDKs can have better, more stable URNs defined
> and registered.
> 
> I am more OK with changing the registry to infer different coders as
> the SDK evolves (which would be detected and manually overwritten with
> the old ones, on a case-by-case basis, if they still exist). This
> should still be done with caution as it will make upgrading harder.
> Highly composite, experimental coders should possibly be designed in
> an intrinsically extensible way.
> 
> On Mon, Nov 5, 2018 at 4:24 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>
>> That's really a pita. It's an important and impacting change.
>>
>> I would go with option 1.
>>
>> For LTS, as already said, I would create an LTS branch and only
>> cherry-pick some changes. Using master as the LTS release branch won't
>> work IMHO.
>>
>> Regards
>> JB
>>
>> On 05/11/2018 15:47, Ismaël Mejía wrote:
>>> For some extra context, this change touches more than FileIO. In
>>> reality it will affect updates in any file-based pipeline, because
>>> the metadata on each file will now have an extra field for the
>>> lastModifiedDate.
>>>
>>> The PR looks perfect; the only issue is the Coder backwards
>>> compatibility question. Knowing that Dataflow is probably the only
>>> runner affected, I would like to know what we can do.
>>>
>>> [1] Should we merge and the Coder updatability be tied to SDK versions
>>> (which makes sense and is probably more aligned with the LTS
>>> discussion)?
>>> [2] Should we have a MetadataCoderV2? (Does this imply a repeated
>>> Metadata object?) In this case, where is the right place to identify
>>> and decide which coder to use?
>>>
>>> Other ideas... ?
>>>
>>> Last thing, the link that Luke shared does not seem to work (it looks
>>> like a googley-friendly URL). Here is the full URL for those
>>> interested in the drain/update proposal:
>>>
>>> [2] 
>>> https://docs.google.com/document/d/1UWhnYPgui0gUYOsuGcCjLuoOUlGA4QaY91n8p3wz9MY/edit#
>>> On Fri, Nov 2, 2018 at 10:11 PM Lukasz Cwik <lc...@google.com> wrote:
>>>>
>>>> I think the idea is that you would use one coder for paths where you don't 
>>>> need this information and would have FileIO provide a separate path that 
>>>> uses your updated coder.
>>>> Existing users would not be impacted and users of the new FileIO that 
>>>> depend on this information would not be able to have updated their 
>>>> pipeline in the first place.
>>>>
>>>> If the feature in FileIO is experimental, we could choose to break it for 
>>>> existing users though since I don't know how feasible my suggestion above 
>>>> is.
>>>>
>>>>
>>>>
>>>> On Fri, Nov 2, 2018 at 12:56 PM Jeff Klukas <jklu...@mozilla.com> wrote:
>>>>>
>>>>> Lukasz - Thanks for those links. That's very helpful context.
>>>>>
>>>>> It sounds like there's no explicit user contract about evolving Coder 
>>>>> classes in the Java SDK and users might reasonably assume Coders to be 
>>>>> stable between SDK versions. Thus, users of the Dataflow or Flink runners 
>>>>> might reasonably expect that they can update the Java SDK version used in 
>>>>> their pipeline when performing an update.
>>>>>
>>>>> Based on that understanding, evolving a class like Metadata might not be 
>>>>> possible except in a major version bump where it's obvious to users to 
>>>>> expect breaking changes and not to expect an "update" operation to work.
>>>>>
>>>>> It's not clear to me what changing the "name" of a coder would look like 
>>>>> or whether that's a tenable solution here. Would that change be able to 
>>>>> happen within the SDK itself, or is it something users would need to 
>>>>> specify?
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com

-- 
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
