Re: Impossible to refer to names in the default namespace from another namespace?

Brennan Vincent Thu, 01 Sep 2022 16:52:59 -0700

I don’t understand what you mean. I am talking about what to do with names that 
have no namespace. Obviously, in such a case there are no namespace attributes 
to remove.
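
To make the question concrete, here is a minimal Python sketch of the two behaviors being discussed. The helpers `resolve_fullname` and `pcf_name` are hypothetical names made up for illustration and are not taken from any Avro SDK. The sketch assumes that a leading dot (".Foo") means "Foo in the null namespace", which is exactly the reading under debate and not settled spec behavior.

```python
def resolve_fullname(name, enclosing_namespace=""):
    """Resolve a name reference to a (namespace, short_name) pair.

    Assumption: any name containing a dot is a fullname, and a leading
    dot (".Foo") denotes the null namespace, written here as "".
    """
    if "." in name:
        namespace, _, short = name.rpartition(".")
        return namespace, short
    # A bare name inherits the nearest enclosing namespace.
    return enclosing_namespace, name


def pcf_name(namespace, short, leading_dot_for_null=False):
    """Render a resolved name the way it might appear in PCF.

    leading_dot_for_null=False keeps bare names (what the Python SDK
    reportedly does today); True applies [FULLNAMES] literally.
    """
    if namespace:
        return f"{namespace}.{short}"
    return f".{short}" if leading_dot_for_null else short


# Inside namespace "ns", a bare "r" resolves to ns.r, while ".r"
# reaches the null namespace.
print(resolve_fullname("r", "ns"))
print(resolve_fullname(".r", "ns"))

# The two candidate PCF renderings of a name with no namespace.
print(pcf_name("", "Foo"))
print(pcf_name("", "Foo", leading_dot_for_null=True))
```

The only difference between the two options is the `leading_dot_for_null` flag. Since the rendered name feeds into the fingerprint, switching it would change the fingerprints of existing schemas that have no namespace.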


> On Sep 1, 2022, at 16:34, Martin Grigorov <[email protected]> wrote:
> 
> 
> 
> 
>> On Thu, Sep 1, 2022 at 11:08 PM Brennan Vincent <[email protected]> 
>> wrote:
>> 
>> 
>> On 2022-08-31 17:18, Martin Grigorov wrote:
>> > On Wed, Aug 31, 2022 at 9:59 PM Brennan Vincent <[email protected]>
>> > wrote:
>> >
>> >>
>> >>
>> >> On 2022-08-31 13:38, Ryan Skraba wrote:
>> >>> Hello!  I've been trying out some POC code with Java to see what would
>> >>> be the impact on that SDK -- in the past, a lot of the development has
>> >>> been pretty Java-centric, but this is definitely not a requirement!
>> >>>
>> >>> Currently, the worst scenario I found is something like:
>> >>>
>> >>> { "type" : "record",
>> >>>   "name" : "A",
>> >>>   "fields" : [ { "name" : "a1",
>> >>>     "type" : {
>> >>>       "type" : "record",
>> >>>       "name" : "B",
>> >>>       "fields" : [ { "name" : "b1",  "type" : [ "null", "A" ],
>> >>> "default" : null  } ] } } ] }
>> >>>
>> >>> This is a recursive definition that would look like a linked list
>> >>> alternating A records containing B records containing A records, etc.
>> >>>
>> >>> If you were to only change the name of B to test.B (a fully qualified
>> >>> namespace), Java can still parse the schema but the generated code
>> >>> unsurprisingly no longer compiles.  It correctly finds the outer
>> >>> schema (and doesn't try to look for test.A) but it's impossible to
>> >>> import into the generated Java code.
>> >>>
>> >>> If you were to only change the name of A to test.A, this is fine.
>> >>>
>> >>> I was playing around a bit with "auto-mangling" the packages to put A
>> >>> in root$.A for this case, but I think it's a hopeless case for Java --
>> >>> there are too many ways for the default package to "sneak" into the
>> >>> system from other previously compiled classes, or from IDL, etc.
>> >>>
>> >>> I think it's still possible to try and accept the .Foo syntax but we'd
>> >>> have to note that (for Java) mixing namespaced schemas and
>> >>> null-namespaced schemas is either not supported, or we supply a
>> >>> mechanism in Java to put ALL unnamespaced generated classes in a
>> >>> folder like root$.
>> >>>
>> >>> Thanks for pointing out part 4, I'm also taking a look at the impact
>> >>> there!  Given that these mixed namespace schemas are likely to already
>> >>> be broken, I don't know if it's too big of an impact!  Especially if
>> >>> we say that the dot is only added when strictly necessary to prevent
>> >>> namespace inheritance.
>> >>
>> >> There is still a question for non-mixed schemas.
>> >>
>> >> Consider the following schema:
>> >>
>> >> {
>> >>     "type": "fixed",
>> >>     "name": "Foo",
>> >>     "size": 10
>> >> }
>> >>
>> >> Now, if we clarify the spec to say that leading dots are valid in
>> >> default-namespace fullnames, then when this is normalized, the
>> >> current language of the description of PCF implies that its
>> >>
>> >
>> > Please copy/paste the text from the spec that implies that the name should
>> > be ".Foo".
>> > Otherwise we will have to guess which sentence you mean exactly.
>> 
>> [FULLNAMES] Replace short names with fullnames, using applicable namespaces
>> to do so. Then eliminate namespace attributes, which are now redundant.
> 
> I totally agree that using namespaces everywhere is a best practice!
> But eliminating the namespace attribute is not really an option due to 
> backward compatibility.
> 
>  
>> 
>> >
>> > I don't see any pluses or minuses in using the leading dot in the PCF for
>> > top-level names. IMO there is no difference with both representations.
>> > For inner names the leading dot should be preserved in the PCF. Otherwise
>> > it will start using the enclosing namespace after parsing.
>> >
>> >
>> >> name should be rewritten to ".Foo". However, this is contrary to current
>> >> behavior.
>> >>
>> >> So, if it's okay to change the behavior on existing valid schemas, then
>> >> we should do so. If it's not okay, then we should clarify the spec to
>> >> say that names are normalized to fullnames for PCF, _except_
>> >> in the special case of the default namespace.
>> >>
>> >>>
>> >>> I'll keep digging on the Java side.  Anybody else from the other SDKs
>> >>> want to weigh in?  What would happen with C# generated code?
>> >>>
>> >>> All my best, Ryan
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Aug 26, 2022 at 4:10 PM Brennan Vincent <[email protected]>
>> >> wrote:
>> >>>>
>> >>>> I’m in favor of allowing .Foo as a fullname for the following reasons:
>> >>>>
>> >>>> 1. I believe the *intent* of the initial change to the spec was to only
>> >> refer to namespaces;
>> >>>> 2. Even if it is not possible in Java to generate code that refers to a
>> >> non-namespaced context from a namespaced one, it may be possible in other
>> >> languages;
>> >>>> 3. We do not lose anything by supporting it.
>> >>>> 4. Other parts of the spec assume that all names can be converted to a
>> >> fullname, specifically the parsing canonical form algorithm.
>> >>>>
>> >>>> Point 4. brings me to another issue. Currently, non-namespaced names
>> >> are left as bare names in PCF, at least by the Python SDK - they are not
>> >> converted to fullnames like .Foo (which makes sense, since that is out of
>> >> spec). However, it contradicts the spec:
>> >>>>
>> >>>> [FULLNAMES] Replace short names with fullnames, using applicable
>> >> namespaces to do so.
>> >>>>
>> >>>> The spec doesn’t say “only if the non-empty namespace is used”. It says
>> >> to always do this. So if we enable the ability to write fullnames like
>> >> .Foo, we need to decide whether to change the PCF behavior (this will
>> >> change the fingerprints of existing schemas) to match the spec, or change
>> >> the spec to match the current behavior.
>> >>>>
>> >>>>> On Aug 26, 2022, at 03:57, Ryan Skraba <[email protected]> wrote:
>> >>>>>
>> >>>>> Hello!  We can just discuss the impact here in the mailing list and
>> >>>>> make a decision by consensus.  Sometimes for major changes, we do a
>> >>>>> more formal VOTE thread -- this might be one of those cases.
>> >>>>>
>> >>>>> What would happen if we were to say that ".MyRecord" was valid in the
>> >>>>> next major version of Avro?
>> >>>>>
>> >>>>> Some SDKs used to accept this in the past and were made more strict,
>> >>>>> causing working examples to break?  That is really unfortunate.
>> >>>>>
>> >>>>> On the other hand, if we generate Java code today and map packages 1:1
>> >>>>> to namespaces... we still won't be able to mix namespaced (in a
>> >>>>> package) and unnamespaced (unpackaged) generated code.  Would we just
>> >>>>> mangle the default namespace to "default$" or ... ?  A configuration
>> >>>>> option for the SpecificCompiler in Java?
>> >>>>>
>> >>>>> Either way, it would be great if we didn't leave this point vague in
>> >>>>> the spec!   There's always the possibility to allow language SDKs to
>> >>>>> deviate from the spec -- if e.g. python or Java has a
>> >>>>> "setValidateUnqualifiedNamespace(boolean)" method, we can leave it up
>> >>>>> to the user whether or not to follow the strict spec.  We already do
>> >>>>> this with validating defaults in Java, for example.
>> >>>>>
>> >>>>> It might take a bit of thought, but if we can find some elegant way to
>> >>>>> make this work I don't see why we wouldn't make specification changes!
>> >>>>>
>> >>>>> Ryan
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>> On Thu, Aug 25, 2022 at 7:31 PM Brennan Vincent <
>> >> [email protected]> wrote:
>> >>>>>>
>> >>>>>> That is a fair point also.
>> >>>>>>
>> >>>>>> Anyway, since I'm not an Apache project member, I'm not quite sure
>> >> what
>> >>>>>> is the best way to move forward here. Is there a formal process for
>> >> proposing
>> >>>>>> changes to the spec and reaching a consensus?
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> Brennan
>> >>>>>>
>> >>>>>>> On 2022-08-25 01:36, Oscar Westra van Holthe - Kind wrote:
>> >>>>>>> Hi all,
>> >>>>>>>
>> >>>>>>> Allowing references to the null namespace from within another
>> >> namespace
>> >>>>>>> gives schema authors more options.
>> >>>>>>>
>> >>>>>>> But if you're using namespaces at all, there must be a reason for
>> >> it. As a
>> >>>>>>> schema author, you've made the decision to group your schemata.
>> >>>>>>>
>> >>>>>>> To make this decision from schema authors more visible, I'd opt to
>> >> choose
>> >>>>>>> the Java route and in that case force all schemata to belong to a
>> >> group.
>> >>>>>>> I.e., explicitly disallow identifiers to start with a dot (and
>> >> disallow
>> >>>>>>> references to the null namespace from within another namespace).
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Kind regards,
>> >>>>>>> Oscar
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Oscar Westra van Holthe - Kind <[email protected]>
>> >>>>>>>
>> >>>>>>> Op wo 24 aug. 2022 14:42 schreef Ryan Skraba <[email protected]>:
>> >>>>>>>
>> >>>>>>>> Hello!  There is definitely an ambiguity here caused by inheriting
>> >>>>>>>> namespaces.
>> >>>>>>>>
>> >>>>>>>> The obvious takeaway is to use a namespace with all of your named
>> >>>>>>>> schemas.  As a best practice, that avoids the problem of mixing
>> >>>>>>>> schemas with and without namespaces, and it's probably this techniq
>> >>>>>>>>
>> >>>>>>>> This same problem occurs in Java classes, where you can have a class
>> >>>>>>>> in the default package (without a package name), but it's an error
>> >> to
>> >>>>>>>> import it into other packages.
>> >>>>>>>>
>> >>>>>>>> The ".MyRecord" notation might be the right way to clarify this, but
>> >>>>>>>> we can also go the Java route (i.e. you can't mix namespaced schemas
>> >>>>>>>> and non-namespaced schemas).  What do you think?
>> >>>>>>>>
>> >>>>>>>> Best regards, Ryan
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Mon, Aug 22, 2022 at 10:49 PM Brennan Vincent <
>> >> [email protected]>
>> >>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> On 2022/08/22 20:05:22 Martin Grigorov wrote:
>> >>>>>>>>>> Hi,
>> >>>>>>>>>>
>> >>>>>>>>>> I might be wrong but I think your sample schema should be valid!
>> >> Does
>> >>>>>>>> it
>> >>>>>>>>>> fail with any of the SDKs ?
>> >>>>>>>>>
>> >>>>>>>>> Yes. It fails with the Python avro package.
>> >>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> This part of the spec talks about the namespace, not the type.
>> >> I.e.
>> >>>>>>>>>> "namespace": ".ns" would be an error.
>> >>>>>>>>>
>> >>>>>>>>> The linked thread (
>> >>>>>>>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587 )
>> >>>>>>>>> is a bit vague -- it's not totally clear whether the restriction is
>> >>>>>>>> meant to apply to
>> >>>>>>>>> namespaces only, or to fullnames also.
>> >>>>>>>>>
>> >>>>>>>>> "The null namespace may not be used in a dot-separated sequence of
>> >>>>>>>> names."
>> >>>>>>>>>
>> >>>>>>>>> certainly makes it sound like it applies to _any_ sequence of
>> >> names,
>> >>>>>>>> though,
>> >>>>>>>>> not just in a namespace field.
>> >>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Aug 22, 2022 at 10:40 PM Brennan Vincent <
>> >> [email protected]
>> >>>>>>>>>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Hello,
>> >>>>>>>>>>>
>> >>>>>>>>>>> https://github.com/apache/avro/pull/917 introduced the following
>> >>>>>>>> language
>> >>>>>>>>>>> to the spec:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> The null namespace may not be used in a dot-separated sequence
>> >> of
>> >>>>>>>> names.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thus ruling out fullnames like ".foo".
>> >>>>>>>>>>>
>> >>>>>>>>>>> However, this seems to rule out referring to names in the default
>> >>>>>>>>>>> namespace from another namespace.
>> >>>>>>>>>>>
>> >>>>>>>>>>> For example, this schema was previously allowed by the spec:
>> >>>>>>>>>>>
>> >>>>>>>>>>> {
>> >>>>>>>>>>>    "type": "record",
>> >>>>>>>>>>>    "name": "r",
>> >>>>>>>>>>>    "fields": [
>> >>>>>>>>>>>        {
>> >>>>>>>>>>>            "name": "f",
>> >>>>>>>>>>>            "type": {
>> >>>>>>>>>>>                "type": "record",
>> >>>>>>>>>>>                "name": "r2",
>> >>>>>>>>>>>                "namespace": "ns",
>> >>>>>>>>>>>                "fields": [
>> >>>>>>>>>>>                    {
>> >>>>>>>>>>>                        "name": "f2",
>> >>>>>>>>>>>                        "type": ["null", ".r"]
>> >>>>>>>>>>>                    }
>> >>>>>>>>>>>                ]
>> >>>>>>>>>>>            }
>> >>>>>>>>>>>        }
>> >>>>>>>>>>>    ]
>> >>>>>>>>>>> }
>> >>>>>>>>>>>
>> >>>>>>>>>>> Note ".r" in the type of "f2". This can't be changed to "r",
>> >>>>>>>>>>> because that would be interpreted as "ns.r" due to "ns" being the
>> >>>>>>>> nearest
>> >>>>>>>>>>> enclosing namespace.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thus it seems that the new spec has restricted the set of valid
>> >>>>>>>> schemas
>> >>>>>>>>>>> and there is no longer
>> >>>>>>>>>>> any way to accomplish this.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Am I misinterpreting the spec? Does the empty namespace being
>> >>>>>>>> disallowed
>> >>>>>>>>>>> in dotted sequences
>> >>>>>>>>>>> of names only apply to initial name definitions, but not to later
>> >>>>>>>> name
>> >>>>>>>>>>> references? Or is there
>> >>>>>>>>>>> some other way to express this?
>> >>>>>>>>>>>
>> >>>>>>>>>>> Here is the initial discussion of this change, where the issue
>> >> I'm
>> >>>>>>>> raising
>> >>>>>>>>>>> here doesn't
>> >>>>>>>>>>> appear to have come up:
>> >>>>>>>>>>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587
>> >>>>>>>>>>>
