On Wed, Aug 31, 2022 at 8:40 PM Ryan Skraba <[email protected]> wrote:

> Hello!  I've been trying out some POC code with Java to see what would
> be the impact on that SDK -- in the past, a lot of the development has
> been pretty Java-centric, but this is definitely not a requirement!
>
> Currently, the worst scenario I found is something like:
>
> { "type" : "record",
>   "name" : "A",
>   "fields" : [ { "name" : "a1",
>     "type" : {
>       "type" : "record",
>       "name" : "B",
>       "fields" : [ { "name" : "b1",  "type" : [ "null", "A" ],
> "default" : null  } ] } } ] }
>
> This is a recursive definition that would look like a linked list
> alternating A records containing B records containing A records, etc.
>
> If you were to only change the name of B to test.B (a fully qualified
> name), Java can still parse the schema but the generated code
> unsurprisingly no longer compiles.  It correctly finds the outer
> schema (and doesn't try to look for test.A) but it's impossible to
> import into the generated Java code.
>
> If you were to only change the name of A to test.A, this is fine.
>
> I was playing around a bit with "auto-mangling" the packages to put A
> in root$.A for this case, but I think it's a hopeless case for Java --
> there are too many ways for the default package to "sneak" into the
> system from other previously compiled classes, or from IDL, etc.
>
> I think it's still possible to try and accept the .Foo syntax but we'd
> have to note that (for Java) mixing namespaced schemas and
> null-namespaced schemas is either not supported, or we supply a
> mechanism in Java to put ALL unnamespaced generated classes in a
> folder like root$.
>
> Thanks for pointing out part 4, I'm also taking a look at the impact
> there!  Given that these mixed namespace schemas are likely to already
> be broken, I don't know if it's too big of an impact!  Especially if
> we say that the dot is only added when strictly necessary to prevent
> namespace inheritance.
>
> I'll keep digging on the Java side.  Anybody else from the other SDKs
> want to weigh in?  What would happen with C# generated code?
>
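
To make the three cases above concrete, here is a minimal Python sketch that walks a schema and reports whether it defines named types in both the null namespace and a real namespace, which is the combination described above as problematic for generated Java code. The helpers `defined_namespaces`, `mixes_null_and_named` and `make_schema` are hypothetical names for illustration, not part of any Avro SDK, and the walk only covers the constructs used in this thread.

```python
from typing import Any, Optional, Set

NAMED_TYPES = {"record", "error", "enum", "fixed"}


def defined_namespaces(schema: Any, enclosing: Optional[str] = None) -> Set[str]:
    """Collect the namespace of every named type defined in a schema.

    The empty string stands for the null namespace.
    """
    found: Set[str] = set()
    if isinstance(schema, list):  # union: inspect every branch
        for branch in schema:
            found |= defined_namespaces(branch, enclosing)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if isinstance(kind, (dict, list)):
            # Nested definition such as {"type": {"type": "record", ...}}
            found |= defined_namespaces(kind, enclosing)
        elif kind in NAMED_TYPES:
            name = schema["name"]
            if "." in name:
                # A dotted name is a fullname; its prefix is the namespace.
                namespace = name.rsplit(".", 1)[0]
            else:
                # Otherwise use the explicit attribute, else inherit.
                namespace = schema.get("namespace", enclosing) or ""
            found.add(namespace)
            for field in schema.get("fields", []):
                found |= defined_namespaces(field["type"], namespace or None)
        elif kind == "array":
            found |= defined_namespaces(schema["items"], enclosing)
        elif kind == "map":
            found |= defined_namespaces(schema["values"], enclosing)
    return found


def mixes_null_and_named(schema: Any) -> bool:
    """True if the schema defines types both with and without a namespace."""
    namespaces = defined_namespaces(schema)
    return "" in namespaces and len(namespaces) > 1


def make_schema(a_name: str, b_name: str) -> dict:
    """Build the recursive A/B schema quoted above with the given names."""
    return {
        "type": "record",
        "name": a_name,
        "fields": [{
            "name": "a1",
            "type": {
                "type": "record",
                "name": b_name,
                "fields": [{"name": "b1", "type": ["null", "A"], "default": None}],
            },
        }],
    }
```

With these assumptions, `mixes_null_and_named(make_schema("A", "test.B"))` flags the case where the generated code fails to compile, while the original schema and the `test.A` variant (where B inherits the namespace) are not flagged.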

The Rust code generator does not support namespaces yet -
https://github.com/lerouxrgd/rsgen-avro/issues/31.
But when implemented I don't foresee any problems with the "root" namespace.
In Rust one can import with "use Abc" (relative path, i.e. anywhere in the
module hierarchy) or "use crate::Abc" (absolute path, i.e. to import a top
level struct/enum/trait/type/union).
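
For comparison with the relative and absolute Rust paths, here is a minimal Python sketch of how an Avro name reference could be resolved if a leading dot were accepted as the absolute form. `resolve_fullname` is a hypothetical helper, not an API of any Avro SDK, and the leading-dot rule is the proposal under discussion in this thread rather than current spec behaviour.

```python
from typing import Optional


def resolve_fullname(name: str, enclosing_namespace: Optional[str]) -> str:
    """Resolve a name reference to a fullname.

    Assumes a leading dot means "null namespace" (the proposed rule).
    """
    if name.startswith("."):
        # Proposed absolute form: ".r" is r in the null namespace.
        return name[1:]
    if "." in name:
        # Already a fullname such as "ns.r2".
        return name
    if enclosing_namespace:
        # Bare name: inherits the nearest enclosing namespace.
        return f"{enclosing_namespace}.{name}"
    # Bare name with no enclosing namespace stays in the null namespace.
    return name
```

Under these assumptions a bare `"r"` inside namespace `"ns"` resolves to `"ns.r"`, while `".r"` resolves to the top-level `"r"`, much like `use crate::Abc` reaches the crate root.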


>
> All my best, Ryan
>
>
>
> On Fri, Aug 26, 2022 at 4:10 PM Brennan Vincent <[email protected]>
> wrote:
> >
> > I’m in favor of allowing .Foo as a fullname for the following reasons:
> >
> > 1. I believe the *intent* of the initial change to the spec was to only
> > refer to namespaces;
> > 2. Even if it is not possible in Java to generate code that refers to a
> > non-namespaced context from a namespaced one, it may be possible in other
> > languages;
> > 3. We do not lose anything by supporting it.
> > 4. Other parts of the spec assume that all names can be converted to a
> > fullname, specifically the parsing canonical form algorithm.
> >
> > Point 4. brings me to another issue. Currently, non-namespaced names are
> > left as bare names in PCF, at least by the Python SDK - they are not
> > converted to fullnames like .Foo (which makes sense, since that is out of
> > spec). However, it contradicts the spec:
> >
> > [FULLNAMES] Replace short names with fullnames, using applicable
> > namespaces to do so.
> >
> > The spec doesn’t say “only if the non-empty namespace is used”. It says
> > to always do this. So if we enable the ability to write fullnames like
> > .Foo, we need to decide whether to change the PCF behavior (this will
> > change the fingerprints of existing schemas) to match the spec, or change
> > the spec to match the current behavior.
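
A minimal Python sketch of why this choice affects fingerprints: the same null-namespace record is rendered once with a bare name and once with a dotted fullname, and the two canonical texts hash differently. `canonical_record` and `fingerprint` are hypothetical helpers that hand-build the canonical text for one trivial record; they are not a full PCF implementation. SHA-256 is used only for brevity, and any fingerprint computed over the canonical text would be expected to change in the same way.

```python
import hashlib
import json


def canonical_record(name: str, use_dotted_null_namespace: bool) -> str:
    """Canonical-form text for a one-field record in the null namespace.

    Keys are emitted in the PCF order (name, type, fields) with no whitespace.
    """
    fullname = "." + name if use_dotted_null_namespace else name
    schema = {
        "name": fullname,
        "type": "record",
        "fields": [{"name": "f", "type": "int"}],
    }
    return json.dumps(schema, separators=(",", ":"))


def fingerprint(canonical_text: str) -> str:
    """SHA-256 over the UTF-8 bytes of the canonical text."""
    return hashlib.sha256(canonical_text.encode("utf-8")).hexdigest()


bare = canonical_record("Foo", use_dotted_null_namespace=False)
dotted = canonical_record("Foo", use_dotted_null_namespace=True)
fingerprints_differ = fingerprint(bare) != fingerprint(dotted)
```

Here `bare` keeps the name as `"Foo"` (the behavior described for the Python SDK) and `dotted` uses `".Foo"`, so switching between the two would change the fingerprint of every existing schema that has a null-namespace name.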
> >
> > > On Aug 26, 2022, at 03:57, Ryan Skraba <[email protected]> wrote:
> > >
> > > Hello!  We can just discuss the impact here in the mailing list and
> > > make a decision by consensus.  Sometimes for major changes, we do a
> > > more formal VOTE thread -- this might be one of those cases.
> > >
> > > What would happen if we were to say that ".MyRecord" was valid in the
> > > next major version of Avro?
> > >
> > > Some SDKs used to accept this in the past and were made more strict,
> > > causing working examples to break?  That is really unfortunate.
> > >
> > > On the other hand, if we generate Java code today and map packages 1:1
> > > to namespaces... we still won't be able to mix namespaced (in a
> > > package) and unnamespaced (unpackaged) generated code.  Would we just
> > > mangle the default namespace to "default$" or ... ?  A configuration
> > > option for the SpecificCompiler in Java?
> > >
> > > Either way, it would be great if we didn't leave this point vague in
> > > the spec!   There's always the possibility to allow language SDKs to
> > > deviate from the spec -- if e.g. python or Java has a
> > > "setValidateUnqualifiedNamespace(boolean)" method, we can leave it up
> > > to the user whether or not to follow the strict spec.  We already do
> > > this with validating defaults in Java, for example.
> > >
> > > It might take a bit of thought, but if we can find some elegant way to
> > > make this work I don't see why we wouldn't make specification changes!
> > >
> > > Ryan
> > >
> > >
> > >
> > >
> > >> On Thu, Aug 25, 2022 at 7:31 PM Brennan Vincent <
> [email protected]> wrote:
> > >>
> > >> That is a fair point also.
> > >>
> > >> Anyway, since I'm not an Apache project member, I'm not quite sure what
> > >> is the best way to move forward here. Is there a formal process for
> > >> proposing changes to the spec and reaching a consensus?
> > >>
> > >> Thanks
> > >> Brennan
> > >>
> > >>> On 2022-08-25 01:36, Oscar Westra van Holthe - Kind wrote:
> > >>> Hi all,
> > >>>
> > >>> Allowing references to the null namespace from within another
> > >>> namespace gives schema authors more options.
> > >>>
> > >>> But if you're using namespaces at all, there must be a reason for
> > >>> it. As a schema author, you've made the decision to group your schemata.
> > >>>
> > >>> To make this decision from schema authors more visible, I'd opt to
> > >>> choose the Java route and in that case force all schemata to belong to a
> > >>> group. I.e., explicitly disallow identifiers to start with a dot (and
> > >>> disallow references to the null namespace from within another namespace).
> > >>>
> > >>>
> > >>> Kind regards,
> > >>> Oscar
> > >>>
> > >>> --
> > >>> Oscar Westra van Holthe - Kind <[email protected]>
> > >>>
> > >>> Op wo 24 aug. 2022 14:42 schreef Ryan Skraba <[email protected]>:
> > >>>
> > >>>> Hello!  There is definitely an ambiguity here caused by inheriting
> > >>>> namespaces.
> > >>>>
> > >>>> The obvious takeaway is to use a namespace with all of your named
> > >>>> schemas.  As a best practice, that avoids the problem of mixing
> > >>>> schemas with and without namespaces, and it's probably this techniq
> > >>>>
> > >>>> This same problem occurs in Java classes, where you can have a class
> > >>>> in the default package (without a package name), but it's an error to
> > >>>> import it into other packages.
> > >>>>
> > >>>> The ".MyRecord" notation might be the right way to clarify this, but
> > >>>> we can also go the Java route (i.e. you can't mix namespaced schema
> > >>>> and non-namespaced schemas).  What do you think?
> > >>>>
> > >>>> Best regards, Ryan
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Mon, Aug 22, 2022 at 10:49 PM Brennan Vincent <
> [email protected]>
> > >>>> wrote:
> > >>>>>
> > >>>>> On 2022/08/22 20:05:22 Martin Grigorov wrote:
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> I might be wrong but I think your sample schema should be valid!
> > >>>>>> Does it fail with any of the SDKs ?
> > >>>>>
> > >>>>> Yes. It fails with the Python avro package.
> > >>>>>
> > >>>>>>
> > >>>>>> This part of the spec talks about the namespace, not the type.
> > >>>>>> I.e. "namespace": ".ns" would be an error.
> > >>>>>
> > >>>>> The linked thread (
> > >>>>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587 )
> > >>>>> is a bit vague -- it's not totally clear whether the restriction is
> > >>>>> meant to apply to namespaces only, or to fullnames also.
> > >>>>>
> > >>>>> "The null namespace may not be used in a dot-separated sequence of
> > >>>>> names."
> > >>>>>
> > >>>>> certainly makes it sound like it applies to _any_ sequence of names,
> > >>>>> though, not just in a namespace field.
> > >>>>>
> > >>>>>>
> > >>>>>> On Mon, Aug 22, 2022 at 10:40 PM Brennan Vincent <
> [email protected]
> > >>>>>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hello,
> > >>>>>>>
> > >>>>>>> https://github.com/apache/avro/pull/917 introduced the following
> > >>>>>>> language to the spec:
> > >>>>>>>
> > >>>>>>>> The null namespace may not be used in a dot-separated sequence of
> > >>>>>>>> names.
> > >>>>>>>
> > >>>>>>> Thus ruling out fullnames like ".foo".
> > >>>>>>>
> > >>>>>>> However, this seems to rule out referring to names in the default
> > >>>>>>> namespace from another namespace.
> > >>>>>>>
> > >>>>>>> For example, this schema was previously allowed by the spec:
> > >>>>>>>
> > >>>>>>> {
> > >>>>>>>    "type": "record",
> > >>>>>>>    "name": "r",
> > >>>>>>>    "fields": [
> > >>>>>>>        {
> > >>>>>>>            "name": "f",
> > >>>>>>>            "type": {
> > >>>>>>>                "type": "record",
> > >>>>>>>                "name": "r2",
> > >>>>>>>                "namespace": "ns",
> > >>>>>>>                "fields": [
> > >>>>>>>                    {
> > >>>>>>>                        "name": "f2",
> > >>>>>>>                        "type": ["null", ".r"]
> > >>>>>>>                    }
> > >>>>>>>                ]
> > >>>>>>>            }
> > >>>>>>>        }
> > >>>>>>>    ]
> > >>>>>>> }
> > >>>>>>>
> > >>>>>>> Note ".r" in the type of "f2". This can't be changed to "r",
> > >>>>>>> because that would be interpreted as "ns.r" due to "ns" being the
> > >>>>>>> nearest enclosing namespace.
> > >>>>>>>
> > >>>>>>> Thus it seems that the new spec has restricted the set of valid
> > >>>>>>> schemas and there is no longer any way to accomplish this.
> > >>>>>>>
> > >>>>>>> Am I misinterpreting the spec? Does the empty namespace being
> > >>>>>>> disallowed in dotted sequences of names only apply to initial name
> > >>>>>>> definitions, but not to later name references? Or is there some
> > >>>>>>> other way to express this?
> > >>>>>>>
> > >>>>>>> Here is the initial discussion of this change, where the issue I'm
> > >>>>>>> raising here doesn't appear to have come up:
> > >>>>>>> https://lists.apache.org/thread/q0o58fxgvstvdlgpoyv2pcz53borp587
> > >>>>>>>
> > >>>>>>> Thanks,
> >
>
