On 2022-09-02 01:34, Martin Grigorov wrote:
> On Fri, Sep 2, 2022, 02:53 Brennan Vincent <[email protected]> wrote:
>
>> I don’t understand what you mean. I am talking about what to do with names
>> that have no namespace. Obviously, in such a case there are no namespace
>> attributes to remove.
>>
>
> It seems I misunderstood your previous message then.
>
My point was that currently, there is no fullname
corresponding to a name with no namespace. In the future, if
we allow ".Foo", there will be one. Thus, following the
description of PCF which mandates replacing all names by fullnames,
we would replace "Foo" in a non-namespaced context by ".Foo", which
differs from the current behavior of PCF.
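
To make the difference concrete, here is a minimal sketch in Python. It is not the avro package's API: resolve_fullname, pcf_name and the dot_for_null_namespace flag are hypothetical names I made up for illustration, and the flag just switches between the current behavior (bare "Foo") and a literal reading of the [FULLNAMES] rule once ".Foo" is a valid fullname.

```python
def resolve_fullname(name, enclosing_namespace=None):
    """Split a name into (namespace, simple_name).

    Hypothetical helper, not part of any Avro SDK. A dotted name is taken
    as a fullname, so ".Foo" resolves to the null namespace (""). An
    undotted name inherits the nearest enclosing namespace, if any.
    """
    if "." in name:
        namespace, _, simple = name.rpartition(".")
        return namespace, simple
    return (enclosing_namespace or ""), name


def pcf_name(name, enclosing_namespace=None, dot_for_null_namespace=False):
    """Return the name as it would be written in PCF.

    dot_for_null_namespace=False models the current behavior (bare name).
    dot_for_null_namespace=True models a literal reading of [FULLNAMES],
    assuming ".Foo" is accepted as a fullname.
    """
    namespace, simple = resolve_fullname(name, enclosing_namespace)
    if namespace:
        return f"{namespace}.{simple}"
    return f".{simple}" if dot_for_null_namespace else simple


print(pcf_name("Foo"))                               # Foo
print(pcf_name("Foo", dot_for_null_namespace=True))  # .Foo
print(pcf_name("r", enclosing_namespace="ns"))       # ns.r
print(pcf_name(".r", enclosing_namespace="ns"))      # r
```

The first two calls are the case I am asking about: the same non-namespaced schema would get a different PCF, and therefore a different fingerprint, depending on which reading we pick.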
>
>
>>> On Sep 1, 2022, at 16:34, Martin Grigorov <[email protected]> wrote:
>>>
>>> 
>>>
>>>
>>>> On Thu, Sep 1, 2022 at 11:08 PM Brennan Vincent <[email protected]>
>> wrote:
>>>>
>>>>
>>>> On 2022-08-31 17:18, Martin Grigorov wrote:
>>>>> On Wed, Aug 31, 2022 at 9:59 PM Brennan Vincent <
>> [email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On 2022-08-31 13:38, Ryan Skraba wrote:
>>>>>>> Hello!  I've been trying out some POC code with Java to see what
>> would
>>>>>>> be the impact on that SDK -- in the past, a lot of the development
>> has
>>>>>>> been pretty Java-centric, but this is definitely not a requirement!
>>>>>>>
>>>>>>> Currently, the worst scenario I found is something like:
>>>>>>>
>>>>>>> { "type" : "record",
>>>>>>>   "name" : "A",
>>>>>>>   "fields" : [ { "name" : "a1",
>>>>>>>     "type" : {
>>>>>>>       "type" : "record",
>>>>>>>       "name" : "B",
>>>>>>>       "fields" : [ { "name" : "b1",  "type" : [ "null", "A" ],
>>>>>>> "default" : null  } ] } } ] }
>>>>>>>
>>>>>>> This is a recursive definition that would look like a linked list
>>>>>>> alternating A records containing B records containing A records,
>> etc.
>>>>>>>
>>>>>>> If you were to only change the name of B to test.B (A fully
>> qualified
>>>>>>> namespace), Java can still parse the schema but the generated code
>>>>>>> unsurprisingly no longer compiles.  It correctly finds the outer
>>>>>>> schema (and doesn't try to look for test.A) but it's impossible to
>>>>>>> import into the generated Java code.
>>>>>>>
>>>>>>> If you were to only change the name of A to test.A, this is fine.
>>>>>>>
>>>>>>> I was playing around a bit with "auto-mangling" the packages to put
>> A
>>>>>>> in root$.A for this case, but I think it's a hopeless case for Java
>> --
>>>>>>> there's too many ways for the default package to "sneak" into the
>>>>>>> system from other previously compiled classes, or from IDL, etc.
>>>>>>>
>>>>>>> I think it's still possible to try and accept the .Foo syntax but
>> we'd
>>>>>>> have to note that (for Java) mixing namespaced schemas and
>>>>>>> null-namespaced schemas is either not supported, or we supply a
>>>>>>> mechanism in Java to put ALL unnamespaced generated classes in a
>>>>>>> folder like root$.
>>>>>>>
>>>>>>> Thanks for pointing out part 4, I'm also taking a look at the impact
>>>>>>> there!  Given that these mixed namespace schemas are likely to
>> already
>>>>>>> be broken, I don't know if it's too big of an impact!  Especially if
>>>>>>> we say that the dot is only added when strictly necessary to prevent
>>>>>>> namespace inheritance.
>>>>>>
>>>>>> There is still a question for non-mixed schemas.
>>>>>>
>>>>>> Consider the following schema:
>>>>>>
>>>>>> {
>>>>>>     "type": "fixed",
>>>>>>     "name": "Foo",
>>>>>>     "size": 10
>>>>>> }
>>>>>>
>>>>>> Now, if we clarify the spec to say that leading dots are valid in
>>>>>> default-namespace fullnames, then when this is normalized, the
>>>>>> current language of the description of PCF implies that its
>>>>>>
>>>>>
>>>>> Please copy/paste the text from the spec that implies that the name
>> should
>>>>> be ".Foo".
>>>>> Otherwise we will have to guess which sentence you mean exactly.
>>>>
>>>> [FULLNAMES] Replace short names with fullnames, using applicable
>> namespaces
>>>> to do so. Then eliminate namespace attributes, which are now redundant.
>>>
>>> I totally agree that using namespaces everywhere is a best practice!
>>> But eliminating the namespace attribute is not really an option due to
>> backward compatibility.
>>>
>>>
>>>>
>>>>>
>>>>> I don't see any pluses or minuses in using the leading dot in the PCF
>> for
>>>>> top-level names. IMO there is no difference with both representations.
>>>>> For inner names the leading dot should be preserved in the PCF.
>> Otherwise
>>>>> it will start using the enclosing namespace after parsing.
>>>>>
>>>>>
>>>>>> name should be rewritten to ".Foo". However, this is contrary to
>> current
>>>>>> behavior.
>>>>>>
>>>>>> So, if it's okay to change the behavior on existing valid schemas,
>> then
>>>>>> we should do so. If it's not okay, then we should clarify the spec to
>>>>>> say that names are normalized to fullnames for PCF, _except_
>>>>>> in the special case of the default namespace.
>>>>>>
>>>>>>>
>>>>>>> I'll keep digging on the Java side.  Anybody else from the other
>> SDKs
>>>>>>> want to weigh in?  What would happen with C# generated code?
>>>>>>>
>>>>>>> All my best, Ryan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Aug 26, 2022 at 4:10 PM Brennan Vincent <
>> [email protected]>
>>>>>> wrote:
>>>>>>>>
>>>>>>>> I’m in favor of allowing .Foo as a fullname for the following
>> reasons:
>>>>>>>>
>>>>>>>> 1. I believe the *intent* of the initial change to the spec was to
>> only
>>>>>> refer to namespaces;
>>>>>>>> 2. Even if it is not possible in Java to generate code that refers
>> to a
>>>>>> non-namespaced context from a namespaced one, it may be possible in
>> other
>>>>>> languages;
>>>>>>>> 3. We do not lose anything by supporting it.
>>>>>>>> 4. Other parts of the spec assume that all names can be converted
>> to a
>>>>>> fullname, specifically the parsing canonical form algorithm.
>>>>>>>>
>>>>>>>> Point 4. brings me to another issue. Currently, non-namespaced
>> names
>>>>>> are left as bare names in PCF, at least by the Python SDK - they are
>> not
>>>>>> converted to fullnames like .Foo (which makes sense, since that is
>> out of
>>>>>> spec). However, it contradicts the spec:
>>>>>>>>
>>>>>>>> [FULLNAMES] Replace short names with fullnames, using applicable
>>>>>> namespaces to do so.
>>>>>>>>
>>>>>>>> The spec doesn’t say “only if the non-empty namespace is used”. It
>> says
>>>>>> to always do this. So if we enable the ability to write fullnames
>> like
>>>>>> .Foo, we need to decide whether to change the PCF behavior (this will
>>>>>> change the fingerprints of existing schemas) to match the spec, or
>> change
>>>>>> the spec to match the current behavior.
>>>>>>>>
>>>>>>>>> On Aug 26, 2022, at 03:57, Ryan Skraba <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hello!  We can just discuss the impact here in the mailing list
>> and
>>>>>>>>> make a decision by consensus.  Sometimes for major changes, we do
>> a
>>>>>>>>> more formal VOTE thread -- this might be one of those cases.
>>>>>>>>>
>>>>>>>>> What would happen if we were to say that ".MyRecord" was valid in
>> the
>>>>>>>>> next major version of Avro?
>>>>>>>>>
>>>>>>>>> Some SDKs used to accept this in the past and were made more
>> strict,
>>>>>>>>> causing working examples to break?  That is really unfortunate.
>>>>>>>>>
>>>>>>>>> On the other hand, if we generate Java code today and map
>> packages 1:1
>>>>>>>>> to namespaces... we still won't be able to mix namespaced (in a
>>>>>>>>> package) and unnamespaced (unpackaged) generated code.  Would we
>> just
>>>>>>>>> mangle the default namespace to "default$" or ... ?  A
>> configuration
>>>>>>>>> option for the SpecificCompiler in Java?
>>>>>>>>>
>>>>>>>>> Either way, it would be great if we didn't leave this point vague
>> in
>>>>>>>>> the spec!   There's always the possibility to allow language SDKs
>> to
>>>>>>>>> deviate from the spec -- if e.g. Python or Java has a
>>>>>>>>> "setValidateUnqualifiedNamespace(boolean)" method, we can leave
>> it up
>>>>>>>>> to the user whether or not to follow the strict spec.  We already
>> do
>>>>>>>>> this with validating defaults in Java, for example.
>>>>>>>>>
>>>>>>>>> It might take a bit of thought, but if we can find some elegant
>> way to
>>>>>>>>> make this work I don't see why we wouldn't make specification
>> changes!
>>>>>>>>>
>>>>>>>>> Ryan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Thu, Aug 25, 2022 at 7:31 PM Brennan Vincent <
>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> That is a fair point also.
>>>>>>>>>>
>>>>>>>>>> Anyway, since I'm not an Apache project member, I'm not quite
>> sure
>>>>>> what
>>>>>>>>>> is the best way to move forward here. Is there a formal process
>> for
>>>>>> proposing
>>>>>>>>>> changes to the spec and reaching a consensus?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Brennan
>>>>>>>>>>
>>>>>>>>>>> On 2022-08-25 01:36, Oscar Westra van Holthe - Kind wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Allowing references to the null namespace from within another
>>>>>> namespace
>>>>>>>>>>> gives schema authors more options.
>>>>>>>>>>>
>>>>>>>>>>> But if you're using namespaces at all, there must be a reason
>> for
>>>>>> it. As a
>>>>>>>>>>> schema author, you've made the decision to group your schemata.
>>>>>>>>>>>
>>>>>>>>>>> To make this decision from schema authors more visible, I'd opt
>> to
>>>>>> choose
>>>>>>>>>>> the Java route and in that case force all schemata to belong to
>> a
>>>>>> group.
>>>>>>>>>>> I.e., explicitly disallow identifiers to start with a dot (and
>>>>>> disallow
>>>>>>>>>>> references to the null namespace from within another namespace).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Kind regards,
>>>>>>>>>>> Oscar
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Oscar Westra van Holthe - Kind <[email protected]>
>>>>>>>>>>>
>>>>>>>>>>> Op wo 24 aug. 2022 14:42 schreef Ryan Skraba <[email protected]>:
>>>>>>>>>>>
>>>>>>>>>>>> Hello!  There is definitely an ambiguity here caused by
>> inheriting
>>>>>>>>>>>> namespaces.
>>>>>>>>>>>>
>>>>>>>>>>>> The obvious takeaway is to use a namespace with all of your
>> named
>>>>>>>>>>>> schemas.  As a best practice, that avoids the problem of mixing
>>>>>>>>>>>> schemas with and without namespaces, and it's probably this
>> technique
>>>>>>>>>>>>
>>>>>>>>>>>> This same problem occurs in Java classes, where you can have a
>> class
>>>>>>>>>>>> in the default package (without a package name), but it's an
>> error
>>>>>> to
>>>>>>>>>>>> import it into other packages.
>>>>>>>>>>>>
>>>>>>>>>>>> The ".MyRecord" notation might be the right way to clarify
>> this, but
>>>>>>>>>>>> we can also go the Java route (i.e. you can't mix namespaced
>> schema
>>>>>>>>>>>> and non-namespaced schemas).  What do you think?
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards, Ryan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 22, 2022 at 10:49 PM Brennan Vincent <
>>>>>> [email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2022/08/22 20:05:22 Martin Grigorov wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I might be wrong but I think your sample schema should be
>> valid!
>>>>>> Does
>>>>>>>>>>>> it
>>>>>>>>>>>>>> fail with any of the SDKs ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes. It fails with the Python avro package.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This part of the spec talks about the namespace, not the
>> type.
>>>>>> I.e.
>>>>>>>>>>>>>> "namespace": ".ns" would be an error.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The linked thread (
>>>>>>>>>>>>
>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587 )
>>>>>>>>>>>>> is a bit vague -- it's not totally clear whether the
>> restriction is
>>>>>>>>>>>> meant to apply to
>>>>>>>>>>>>> namespaces only, or to fullnames also.
>>>>>>>>>>>>>
>>>>>>>>>>>>> "The null namespace may not be used in a dot-separated
>> sequence of
>>>>>>>>>>>> names."
>>>>>>>>>>>>>
>>>>>>>>>>>>> certainly makes it sound like it applies to _any_ sequence of
>>>>>> names,
>>>>>>>>>>>> though,
>>>>>>>>>>>>> not just in a namespace field.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Aug 22, 2022 at 10:40 PM Brennan Vincent <
>>>>>> [email protected]
>>>>>>>>>>>>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://github.com/apache/avro/pull/917 introduced the
>> following
>>>>>>>>>>>> language
>>>>>>>>>>>>>>> to the spec:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The null namespace may not be used in a dot-separated
>> sequence
>>>>>> of
>>>>>>>>>>>> names.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thus ruling out fullnames like ".foo".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, this seems to rule out referring to names in the
>> default
>>>>>>>>>>>>>>> namespace from another namespace.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For example, this schema was previously allowed by the spec:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>    "type": "record",
>>>>>>>>>>>>>>>    "name": "r",
>>>>>>>>>>>>>>>    "fields": [
>>>>>>>>>>>>>>>        {
>>>>>>>>>>>>>>>            "name": "f",
>>>>>>>>>>>>>>>            "type": {
>>>>>>>>>>>>>>>                "type": "record",
>>>>>>>>>>>>>>>                "name": "r2",
>>>>>>>>>>>>>>>                "namespace": "ns",
>>>>>>>>>>>>>>>                "fields": [
>>>>>>>>>>>>>>>                    {
>>>>>>>>>>>>>>>                        "name": "f2",
>>>>>>>>>>>>>>>                        "type": ["null", ".r"]
>>>>>>>>>>>>>>>                    }
>>>>>>>>>>>>>>>                ]
>>>>>>>>>>>>>>>            }
>>>>>>>>>>>>>>>        }
>>>>>>>>>>>>>>>    ]
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Note ".r" in the type of "f2". This can't be changed to "r",
>>>>>>>>>>>>>>> because that would be interpreted as "ns.r" due to "ns"
>> being the
>>>>>>>>>>>> nearest
>>>>>>>>>>>>>>> enclosing namespace.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thus it seems that the new spec has restricted the set of
>> valid
>>>>>>>>>>>> schemas
>>>>>>>>>>>>>>> and there is no longer
>>>>>>>>>>>>>>> any way to accomplish this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am I misinterpreting the spec? Does the empty namespace
>> being
>>>>>>>>>>>> disallowed
>>>>>>>>>>>>>>> in dotted sequences
>>>>>>>>>>>>>>> of names only apply to initial name definitions, but not to
>> later
>>>>>>>>>>>> name
>>>>>>>>>>>>>>> references? Or is there
>>>>>>>>>>>>>>> some other way to express this?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here is the initial discussion of this change, where the
>> issue
>>>>>> I'm
>>>>>>>>>>>> raising
>>>>>>>>>>>>>>> here doesn't
>>>>>>>>>>>>>>> appear to have come up:
>>>>>>>>>>>>>>>
>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587
>>>>>>>>>>>>>>>
>>
