On Thu, Sep 1, 2022 at 11:08 PM Brennan Vincent <[email protected]>
wrote:

>
>
> On 2022-08-31 17:18, Martin Grigorov wrote:
> > On Wed, Aug 31, 2022 at 9:59 PM Brennan Vincent <[email protected]>
> > wrote:
> >
> >>
> >>
> >> On 2022-08-31 13:38, Ryan Skraba wrote:
> >>> Hello!  I've been trying out some POC code with Java to see what would
> >>> be the impact on that SDK -- in the past, a lot of the development has
> >>> been pretty Java-centric, but this is definitely not a requirement!
> >>>
> >>> Currently, the worst scenario I found is something like:
> >>>
> >>> { "type" : "record",
> >>>   "name" : "A",
> >>>   "fields" : [ { "name" : "a1",
> >>>     "type" : {
> >>>       "type" : "record",
> >>>       "name" : "B",
> >>>       "fields" : [ { "name" : "b1",  "type" : [ "null", "A" ],
> >>> "default" : null  } ] } } ] }
> >>>
> >>> This is a recursive definition that would look like a linked list
> >>> alternating A records containing B records containing A records, etc.
> >>>
> >>> If you were to only change the name of B to test.B (a fully qualified
> >>> name), Java can still parse the schema but the generated code
> >>> unsurprisingly no longer compiles.  It correctly finds the outer
> >>> schema (and doesn't try to look for test.A) but it's impossible to
> >>> import into the generated Java code.
> >>>
> >>> If you were to only change the name of A to test.A, this is fine.
> >>>
> >>> I was playing around a bit with "auto-mangling" the packages to put A
> >>> in root$.A for this case, but I think it's a hopeless case for Java --
> >>> there are too many ways for the default package to "sneak" into the
> >>> system from other previously compiled classes, or from IDL, etc.
> >>>
> >>> I think it's still possible to try and accept the .Foo syntax but we'd
> >>> have to note that (for Java) mixing namespaced schemas and
> >>> null-namespaced schemas is either not supported, or we supply a
> >>> mechanism in Java to put ALL unnamespaced generated classes in a
> >>> folder like root$.
> >>>
> >>> Thanks for pointing out part 4, I'm also taking a look at the impact
> >>> there!  Given that these mixed namespace schemas are likely to already
> >>> be broken, I don't know if it's too big of an impact!  Especially if
> >>> we say that the dot is only added when strictly necessary to prevent
> >>> namespace inheritance.
> >>
> >> There is still a question for non-mixed schemas.
> >>
> >> Consider the following schema:
> >>
> >> {
> >>     "type": "fixed",
> >>     "name": "Foo",
> >>     "size": 10
> >> }
> >>
> >> Now, if we clarify the spec to say that leading dots are valid in
> >> default-namespace fullnames, then when this is normalized, the
> >> current language of the description of PCF implies that its
> >>
> >
> > Please copy/paste the text from the spec that implies that the name
> > should be ".Foo".
> > Otherwise we will have to guess which sentence you mean exactly.
>
> [FULLNAMES] Replace short names with fullnames, using applicable namespaces
> to do so. Then eliminate namespace attributes, which are now redundant.
>

I totally agree that using namespaces everywhere is a best practice!
But eliminating the namespace attribute is not really an option due to
backward compatibility.
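
To make the two readings of [FULLNAMES] concrete, here is a minimal Python
sketch for the fixed schema quoted above. The helper names (pcf_name,
fixed_pcf, fingerprint) and the dot_for_null_namespace flag are hypothetical
and not part of any Avro SDK. The sketch only handles a top-level fixed
schema, and SHA-256 stands in for whichever fingerprint algorithm is in use.

```python
import hashlib
import json


def pcf_name(name, namespace, dot_for_null_namespace=False):
    """Hypothetical helper: build the name written into the canonical form."""
    if namespace:
        return f"{namespace}.{name}"
    # Null namespace: the only case where the two readings differ.
    return f".{name}" if dot_for_null_namespace else name


def fixed_pcf(schema, dot_for_null_namespace=False):
    """Canonical form of a top-level fixed schema only (name, type, size)."""
    name = pcf_name(schema["name"], schema.get("namespace"), dot_for_null_namespace)
    canonical = {"name": name, "type": "fixed", "size": schema["size"]}
    # The canonical form has no whitespace between JSON tokens.
    return json.dumps(canonical, separators=(",", ":"))


def fingerprint(pcf):
    """Stand-in fingerprint; assumption: SHA-256 over the UTF-8 bytes."""
    return hashlib.sha256(pcf.encode("utf-8")).hexdigest()


schema = {"type": "fixed", "name": "Foo", "size": 10}
bare = fixed_pcf(schema)                                  # current behavior
dotted = fixed_pcf(schema, dot_for_null_namespace=True)   # literal spec reading
print(bare)
print(dotted)
print(fingerprint(bare) == fingerprint(dotted))
```

The two canonical forms differ by a single character, which is enough to
change the fingerprint of every existing schema in the null namespace.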



>
> >
> > I don't see any pluses or minuses in using the leading dot in the PCF for
> > top-level names. IMO there is no difference with both representations.
> > For inner names the leading dot should be preserved in the PCF. Otherwise
> > it will start using the enclosing namespace after parsing.
> >
> >
> >> name should be rewritten to ".Foo". However, this is contrary to current
> >> behavior.
> >>
> >> So, if it's okay to change the behavior on existing valid schemas, then
> >> we should do so. If it's not okay, then we should clarify the spec to
> >> say that names are normalized to fullnames for PCF, _except_
> >> in the special case of the default namespace.
> >>
> >>>
> >>> I'll keep digging on the Java side.  Anybody else from the other SDKs
> >>> want to weigh in?  What would happen with C# generated code?
> >>>
> >>> All my best, Ryan
> >>>
> >>>
> >>>
> >>> On Fri, Aug 26, 2022 at 4:10 PM Brennan Vincent <
> [email protected]>
> >> wrote:
> >>>>
> >>>> I’m in favor of allowing .Foo as a fullname for the following reasons:
> >>>>
> >>>> 1. I believe the *intent* of the initial change to the spec was to
> >>>> only refer to namespaces;
> >>>> 2. Even if it is not possible in Java to generate code that refers to
> >>>> a non-namespaced context from a namespaced one, it may be possible in
> >>>> other languages;
> >>>> 3. We do not lose anything by supporting it.
> >>>> 4. Other parts of the spec assume that all names can be converted to a
> >>>> fullname, specifically the parsing canonical form algorithm.
> >>>>
> >>>> Point 4. brings me to another issue. Currently, non-namespaced names
> >>>> are left as bare names in PCF, at least by the Python SDK - they are not
> >>>> converted to fullnames like .Foo (which makes sense, since that is out
> >>>> of spec). However, it contradicts the spec:
> >>>>
> >>>> [FULLNAMES] Replace short names with fullnames, using applicable
> >>>> namespaces to do so.
> >>>>
> >>>> The spec doesn’t say “only if the non-empty namespace is used”. It
> >>>> says to always do this. So if we enable the ability to write fullnames
> >>>> like .Foo, we need to decide whether to change the PCF behavior (this
> >>>> will change the fingerprints of existing schemas) to match the spec, or
> >>>> change the spec to match the current behavior.
> >>>>
> >>>>> On Aug 26, 2022, at 03:57, Ryan Skraba <[email protected]> wrote:
> >>>>>
> >>>>> Hello!  We can just discuss the impact here in the mailing list and
> >>>>> make a decision by consensus.  Sometimes for major changes, we do a
> >>>>> more formal VOTE thread -- this might be one of those cases.
> >>>>>
> >>>>> What would happen if we were to say that ".MyRecord" was valid in the
> >>>>> next major version of Avro?
> >>>>>
> >>>>> Some SDKs used to accept this in the past and were made more strict,
> >>>>> causing working examples to break?  That is really unfortunate.
> >>>>>
> >>>>> On the other hand, if we generate Java code today and map packages
> >>>>> 1:1 to namespaces... we still won't be able to mix namespaced (in a
> >>>>> package) and unnamespaced (unpackaged) generated code.  Would we just
> >>>>> mangle the default namespace to "default$" or ... ?  A configuration
> >>>>> option for the SpecificCompiler in Java?
> >>>>>
> >>>>> Either way, it would be great if we didn't leave this point vague in
> >>>>> the spec!   There's always the possibility to allow language SDKs to
> >>>>> deviate from the spec -- if e.g. python or Java has a
> >>>>> "setValidateUnqualifiedNamespace(boolean)" method, we can leave it up
> >>>>> to the user whether or not to follow the strict spec.  We already do
> >>>>> this with validating defaults in Java, for example.
> >>>>>
> >>>>> It might take a bit of thought, but if we can find some elegant way
> >>>>> to make this work I don't see why we wouldn't make specification
> >>>>> changes!
> >>>>>
> >>>>> Ryan
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Thu, Aug 25, 2022 at 7:31 PM Brennan Vincent <
> >> [email protected]> wrote:
> >>>>>>
> >>>>>> That is a fair point also.
> >>>>>>
> >>>>>> Anyway, since I'm not an Apache project member, I'm not quite sure
> >>>>>> what the best way to move forward here is. Is there a formal process
> >>>>>> for proposing changes to the spec and reaching a consensus?
> >>>>>>
> >>>>>> Thanks
> >>>>>> Brennan
> >>>>>>
> >>>>>>> On 2022-08-25 01:36, Oscar Westra van Holthe - Kind wrote:
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> Allowing references to the null namespace from within another
> >>>>>>> namespace gives schema authors more options.
> >>>>>>>
> >>>>>>> But if you're using namespaces at all, there must be a reason for
> >>>>>>> it. As a schema author, you've made the decision to group your
> >>>>>>> schemata.
> >>>>>>>
> >>>>>>> To make this decision from schema authors more visible, I'd opt to
> >>>>>>> choose the Java route and in that case force all schemata to belong
> >>>>>>> to a group. I.e., explicitly disallow identifiers to start with a dot
> >>>>>>> (and disallow references to the null namespace from within another
> >>>>>>> namespace).
> >>>>>>>
> >>>>>>>
> >>>>>>> Kind regards,
> >>>>>>> Oscar
> >>>>>>>
> >>>>>>> --
> >>>>>>> Oscar Westra van Holthe - Kind <[email protected]>
> >>>>>>>
> >>>>>>> Op wo 24 aug. 2022 14:42 schreef Ryan Skraba <[email protected]>:
> >>>>>>>
> >>>>>>>> Hello!  There is definitely an ambiguity here caused by inheriting
> >>>>>>>> namespaces.
> >>>>>>>>
> >>>>>>>> The obvious takeaway is to use a namespace with all of your named
> >>>>>>>> schemas.  As a best practice, that avoids the problem of mixing
> >>>>>>>> schemas with and without namespaces, and it's probably this
> techniq
> >>>>>>>>
> >>>>>>>> This same problem occurs in Java classes, where you can have a
> >>>>>>>> class in the default package (without a package name), but it's an
> >>>>>>>> error to import it into other packages.
> >>>>>>>>
> >>>>>>>> The ".MyRecord" notation might be the right way to clarify this,
> >>>>>>>> but we can also go the Java route (i.e. you can't mix namespaced
> >>>>>>>> schemas and non-namespaced schemas).  What do you think?
> >>>>>>>>
> >>>>>>>> Best regards, Ryan
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, Aug 22, 2022 at 10:49 PM Brennan Vincent <
> >> [email protected]>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> On 2022/08/22 20:05:22 Martin Grigorov wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I might be wrong but I think your sample schema should be valid!
> >>>>>>>>>> Does it fail with any of the SDKs?
> >>>>>>>>>
> >>>>>>>>> Yes. It fails with the Python avro package.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> This part of the spec talks about the namespace, not the type.
> >>>>>>>>>> I.e. "namespace": ".ns" would be an error.
> >>>>>>>>>
> >>>>>>>>> The linked thread (
> >>>>>>>>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587 )
> >>>>>>>>> is a bit vague -- it's not totally clear whether the restriction
> >>>>>>>>> is meant to apply to namespaces only, or to fullnames also.
> >>>>>>>>>
> >>>>>>>>> "The null namespace may not be used in a dot-separated sequence
> >>>>>>>>> of names."
> >>>>>>>>>
> >>>>>>>>> certainly makes it sound like it applies to _any_ sequence of
> >>>>>>>>> names, though, not just in a namespace field.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Aug 22, 2022 at 10:40 PM Brennan Vincent <
> >> [email protected]
> >>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hello,
> >>>>>>>>>>>
> >>>>>>>>>>> https://github.com/apache/avro/pull/917 introduced the
> >>>>>>>>>>> following language to the spec:
> >>>>>>>>>>>
> >>>>>>>>>>>> The null namespace may not be used in a dot-separated sequence
> >>>>>>>>>>>> of names.
> >>>>>>>>>>>
> >>>>>>>>>>> Thus ruling out fullnames like ".foo".
> >>>>>>>>>>>
> >>>>>>>>>>> However, this seems to rule out referring to names in the
> >>>>>>>>>>> default namespace from another namespace.
> >>>>>>>>>>>
> >>>>>>>>>>> For example, this schema was previously allowed by the spec:
> >>>>>>>>>>>
> >>>>>>>>>>> {
> >>>>>>>>>>>    "type": "record",
> >>>>>>>>>>>    "name": "r",
> >>>>>>>>>>>    "fields": [
> >>>>>>>>>>>        {
> >>>>>>>>>>>            "name": "f",
> >>>>>>>>>>>            "type": {
> >>>>>>>>>>>                "type": "record",
> >>>>>>>>>>>                "name": "r2",
> >>>>>>>>>>>                "namespace": "ns",
> >>>>>>>>>>>                "fields": [
> >>>>>>>>>>>                    {
> >>>>>>>>>>>                        "name": "f2",
> >>>>>>>>>>>                        "type": ["null", ".r"]
> >>>>>>>>>>>                    }
> >>>>>>>>>>>                ]
> >>>>>>>>>>>            }
> >>>>>>>>>>>        }
> >>>>>>>>>>>    ]
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> Note ".r" in the type of "f2". This can't be changed to "r",
> >>>>>>>>>>> because that would be interpreted as "ns.r" due to "ns" being
> >>>>>>>>>>> the nearest enclosing namespace.
> >>>>>>>>>>>
> >>>>>>>>>>> Thus it seems that the new spec has restricted the set of valid
> >>>>>>>>>>> schemas and there is no longer any way to accomplish this.
> >>>>>>>>>>>
> >>>>>>>>>>> Am I misinterpreting the spec? Does the empty namespace being
> >>>>>>>>>>> disallowed in dotted sequences of names only apply to initial
> >>>>>>>>>>> name definitions, but not to later name references? Or is there
> >>>>>>>>>>> some other way to express this?
> >>>>>>>>>>>
> >>>>>>>>>>> Here is the initial discussion of this change, where the issue
> >>>>>>>>>>> I'm raising here doesn't appear to have come up:
> >>>>>>>>>>>
> >>>>>>>>>>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587
> >>>>>>>>>>>
>
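
For reference, the name-resolution rule behind the original question (why
"r" inside namespace "ns" cannot reach the top-level record, while ".r" can)
can be sketched in Python as follows. This is a minimal sketch under the
reading that a leading dot selects the null namespace; the function resolve
is hypothetical, not an SDK API, and it does not validate odd inputs such as
".a.b".

```python
def resolve(name, enclosing_namespace):
    """Hypothetical helper: resolve a name reference to (namespace, short name).

    A name containing a dot is treated as a fullname whose namespace is
    everything before the last dot. A name without a dot inherits the
    nearest enclosing namespace. None represents the null namespace.
    """
    if "." in name:
        namespace, _, short = name.rpartition(".")
        # ".r" yields an empty namespace, i.e. the null namespace.
        return (namespace or None, short)
    return (enclosing_namespace or None, name)


# Inside record "ns.r2", the reference "r" picks up the enclosing namespace:
print(resolve("r", "ns"))
# The leading dot is what lets the reference escape to the null namespace:
print(resolve(".r", "ns"))
```

Under the stricter wording of the spec, the second call would have to be
rejected, which is what leaves the top-level record unreachable from "ns".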
