Re: [julia-users] Re: Defining a function in different modules

Jeff Bezanson Sat, 25 Apr 2015 15:13:28 -0700

When two people write packages independently, I claim there are only
two options: (1) they implement a common interface, (2) they don't. To
pick option (1), there has to be some kind of centralization or
agreement. For option (2), which is effectively the default, each
package just gets its own totally separate function, and you have to
say which one you want. We're not saying "you'd better get together
with all other developers", because with option (2) you don't need to.
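
As a rough illustration of option (1), here is a minimal sketch in current
Julia (1.x) syntax, which postdates this thread. The module names
`ConnectionInterface`, `PkgOne`, and `PkgTwo`, the types, and the return
values are all hypothetical; the point is only that both packages import and
extend one shared generic function with an agreed argument order.

```julia
# Hypothetical shared interface module: it owns the name `connect`.
module ConnectionInterface
export connect
# Declare the generic function without adding any methods yet.
function connect end
end

module PkgOne
import ..ConnectionInterface: connect
struct OneManager end
# Extend the shared function; agreed argument order is (manager, address, port).
connect(cm::OneManager, address, port) = "PkgOne -> $address:$port"
end

module PkgTwo
import ..ConnectionInterface: connect
struct TwoManager end
# Same shared function, same argument order, different manager type.
connect(cm::TwoManager, address, port) = "PkgTwo -> $address:$port"
end

using .ConnectionInterface: connect

# One generic function, two methods, dispatched on the manager type.
connect(PkgOne.OneManager(), "localhost", 8080)
connect(PkgTwo.TwoManager(), "localhost", 8080)
```

The agreement lives in `ConnectionInterface`; neither package needs to know
about the other, only about the shared interface.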


IIUC, you're proposing option (3), automatically merge everybody's
methods, assuming they don't conflict. But I don't see how this can
work. We could have:

module A
type AConnectionManager
end

# Argument order here: port, then address.
function connect(cm::AConnectionManager, port, address)
end
end

module B
type BConnectionManager
end

# Argument order here: address, then port.
function connect(cm::BConnectionManager, address, port)
end
end

Obviously, you cannot do `using A; using B`, and then freely use
`connect` and have everything work. The fact that the type of the
first argument distinguishes methods doesn't help. The rest of the
arguments don't match, and even if they did the behaviors might not
implement compatible semantics. The only options I see are my options
1 and 2: (1) move to a common interface, or (2) specify A.connect or
B.connect in client code, because the interfaces aren't compatible.
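
For option (2), a minimal sketch of the example above in current Julia (1.x)
syntax, where the `type` keyword of 2015 is written `struct`. The method
bodies and return values are made up for illustration. Each module owns a
separate `connect`, and client code picks one by qualifying the name.

```julia
module A
struct AConnectionManager end
# Argument order: port, then address (illustrative body).
connect(cm::AConnectionManager, port, address) = (address, port)
end

module B
struct BConnectionManager end
# Argument order: address, then port (illustrative body).
connect(cm::BConnectionManager, address, port) = (address, port)
end

# Two distinct generic functions that merely share a name.
A.connect !== B.connect

# Client code states which one it means.
A.connect(A.AConnectionManager(), 8080, "localhost")
B.connect(B.BConnectionManager(), "localhost", 8080)
```

Nothing is merged here: `A.connect` has no method for a `BConnectionManager`,
so passing one to it raises a `MethodError`.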


On Sat, Apr 25, 2015 at 5:55 PM, Scott Jones <[email protected]> wrote:
> The problem is, in practice, people *will* have names that collide, and will
> not mean the same thing.
> It seems that people here are trying to say, if you have a particular name
> you'd like to use, you'd better get together with all other developers past
> and future and hammer out who "owns" the name, and what concept it can be
> used for (like mathematical sin and fun sin, or tree bark and a dog's
> bark); it gets even worse when you consider other languages...
> [Say I'm in Spain, and I write a robotics package that has a function
> "coger"..., and somebody in Argentina writes a function "coger" that does
> something, well, XXX...]
>
> I just don't see this as working for any length of time (and I think it is
> already breaking down with Julia...
> to me, the fact that DataFrames picked using ~ as a binary operator, when
> that might have been
> something that somebody wanted to use in the core language, shows how
> fragile things
> are now...)
>
> Scott
>
> On Saturday, April 25, 2015 at 5:27:10 PM UTC-4, Mauro wrote:
>>
>> I don't think it is realistic to expect to be able to willy-nilly be
>> 'using' any number of packages and have it just work.  The way you
>> propose may work most of the time; however, there were some solid
>> arguments made in this thread on how that can lead to hard-to-catch
>> failures.
>>
>> And maybe more importantly, from a programmer's sanity perspective, I
>> think it is imperative that one generic function does just one
>> conceptual thing.  Otherwise it gets really hard to figure out what a
>> piece of code does.
>>
>> On Sat, 2015-04-25 at 22:24, Scott Jones <[email protected]> wrote:
>> > On Saturday, April 25, 2015 at 3:58:16 PM UTC-4, Jeff Bezanson wrote:
>> >>
>> >> I think this is just a different mindset than the one we've adopted.
>> >> In the mindset you describe, there really *ought* to be only one
>> >> function with each name, in other words a single global namespace. As
>> >> long as all new definitions for a function have disjoint signatures,
>> >> there are no conflicts. To deal with conflicts, each module has its
>> >> own "view" of a function that resolves conflicts in favor of its
>> >> definitions.
>> >>
>> >
>> > As a practical point, *why* should I have to know about every other
>> > package or module that users of my package might possibly want to use
>> > at the same time?
>> > With the way it is now, it seems I have to force everybody to not use
>> > using, and use fully specified names, which seems utterly against the
>> > extensibility of Julia,
>> > because if I try to export a function, I must know the intentions of
>> > every user, which packages they might load, etc. that might possibly
>> > have the same name.
>> >
>> > I have a module that defines a packed database format, and I want to
>> > define length, push!, and getindex methods...
>> > Then (for example's sake) I also want to define a foobar method that
>> > people can use, and be able to call it on objects from my module with
>> > just foobar(db,arg1,arg2) (where db is from my class).
>> > All is well and good, but then some user complains that they can't use
>> > my package together with package newdb, because coincidentally newdb
>> > also defined a function called foobar, which does have a different
>> > signature.
>> >
>> > I believe they should be able to use both, as long as there aren't any
>> > real
>> > conflicts, *without* spurious warnings...
>> >
>> >
>> >> This approach has a lot in common with class-based OO. For example in
>> >> Python when you say `x.sin()`, the `sin` name belongs to a single
>> >> method namespace. Sure there are different namespaces for *top level*
>> >> definitions, but not for method names. If you want a different `sin`
>> >> method, you need to make a new class, so the `x` part is different.
>> >> This corresponds to the requirement you describe of methods
>> >> referencing some new type from the same julia module.
>> >>
>> >> Well, that's not how we do things. For us, if two functions have the
>> >> same name it's just a cosmetic coincidence, at least initially. In
>> >> julia two functions can have the same name but refer to totally
>> >> different concepts. For example you can have Base.sin, which computes
>> >> the sine of a number, and Transgressions.sin, which implements all
>> >> sorts of fun behavior. Say Base only defines sin(x::Float64), and
>> >> Transgressions only defines sin(x::String). They're disjoint. However,
>> >> if you say
>> >>
>> >> map(sin, [1.0, "sloth", 2pi, "gluttony"])
>> >>
>> >> you can't get both behaviors. You'll get a method error on either the
>> >> 1.0 or the string. You have to decide which notion of `sin` you mean.
>> >> We're not going to automatically merge the two functions.
>> >>
>> >>
>> >
>> > I'm not saying you should... on the other hand, if I have two functions
>> > from different packages, developed independently,
>> > that happen to have a name in common (but with different signatures),
>> > the users should not have to somehow get the developers
>> > together (who may not even be around anymore) to somehow resolve the
>> > conflict (which would probably adversely affect other users
>> > of both packages if some names had to be changed).
>> >
>> >> Then if we
>> >> see the same name appearing in multiple packages, we decide if there
>> >> is indeed a common interface, and if so move the packages to using it,
>> >> e.g. by creating something like StatsBase or maybe adding something to
>> >> Base. But we don't want Base to grow much more, if at all.
>> >
>> >
>> > I'm sorry, but that just seems like a recipe for disaster... you are
>> > saying
>> > that *after* users finally
>> > decide they want to use two packages together, that then somehow you
>> > will
>> > force the
>> > developers of the packages to agree on a common interface, or change the
>> > names of conflicting functions,
>> > or make everybody use names qualified with the module name(s)...
>> >
>> > As for your map example...
>> > If instead, I have map(sin, [1.0, myslothdays, 2pi, mygluttonydays] ),
>> > where myslothdays and mygluttonydays both have the type MySinDiary, and
>> > there is a Transgressions.sin(x::Transgressions.MySinDiary) method...
>> > that should work, right?
>> >
>> > What is a good reason for it not to work?
>> >
>> > Scott
>> >
>> >> On Sat, Apr 25, 2015 at 3:27 PM, Scott Jones <[email protected]> wrote:
>> >> > My point is, if I have been careful, and export methods that always
>> >> > reference at least one type defined locally in my module, so that
>> >> > they should always be unambiguous, I should NOT have to know about
>> >> > any other module (or Base) that a user of my module might also be
>> >> > using having a function with the same name, and should NOT have to
>> >> > do an import.
>> >> >
>> >> > For methods where I *am* trying to extend some type defined in
>> >> > another module/package or Base, then yes, I believe you should do
>> >> > something explicitly to indicate that.
>> >> >
>> >> > I don't think there is any real conflict here... right now it is too
>> >> > restrictive when the module's programmer has clearly signaled their
>> >> > intent by always using their own, unambiguous
>> >> > signatures for their functions.
>> >> >
>> >> > Have I got something fundamentally wrong here?
>> >> >
>> >> > Thanks,
>> >> > Scott
>> >> >
>> >> > On Saturday, April 25, 2015 at 2:10:25 PM UTC-4, Jeff Bezanson wrote:
>> >> >>
>> >> >> Scott, the behavior you're trying to get sounds to me like "IF this
>> >> >> function exists in Base then I want to extend it, otherwise just
>> >> >> make
>> >> >> my own version of the function". That strikes me as a hack. What
>> >> >> we've
>> >> >> tended to do is let everybody define whatever they want. Then if we
>> >> >> see the same name appearing in multiple packages, we decide if there
>> >> >> is indeed a common interface, and if so move the packages to using
>> >> >> it,
>> >> >> e.g. by creating something like StatsBase or maybe adding something
>> >> >> to
>> >> >> Base. But we don't want Base to grow much more, if at all.
>> >> >>
>> >> >> Getting an error for using both Base and your package seems
>> >> >> annoying,
>> >> >> but alternatives that involve doing "something" silently surely must
>> >> >> be considered worse. If a colliding name gets added to Base, the
>> >> >> default behavior should not be to assume that you meant to interfere
>> >> >> with its behavior.
>> >> >>
>> >> >> On Sat, Apr 25, 2015 at 1:57 PM, Jeff Bezanson <[email protected]>
>> >> >> wrote:
>> >> >> > Michael, that's not a bad summary. I would make a couple edits.
>> >> >> > You don't really need to qualify *all* uses. If you want to use
>> >> >> > `foo` from module `A`, you can put `import A.foo` at the top and
>> >> >> > then use `foo` in your code. That will have no surprises and no
>> >> >> > breakage.
>> >> >> >
>> >> >> > Also I think calling it "SuperSecretBase" makes it sound worse
>> >> >> > than it is. You can have modules that describe a certain named
>> >> >> > interface, and then other modules extend it. Which reminds me that
>> >> >> > I need to implement #8283, so you can introduce functions without
>> >> >> > adding methods yet.
>> >> >> >
>> >> >> > On Sat, Apr 25, 2015 at 12:31 PM, Stefan Karpinski
>> >> >> > <[email protected]> wrote:
>> >> >> >> Scott, I'm not really understanding your problem. Can you give an
>> >> >> >> example?
>> >> >> >>
>> >> >> >>
>> >> >> >> On Sat, Apr 25, 2015 at 11:53 AM, Scott Jones
>> >> >> >> <[email protected]> wrote:
>> >> >> >>>
>> >> >> >>> A problem I'm running into is the following (maybe the best
>> >> >> >>> practice for this is documented, and I'm just too stupid to
>> >> >> >>> find it!):
>> >> >> >>> I have created a set of functions, which use my own type, so
>> >> >> >>> they should never be ambiguous.
>> >> >> >>> I would like to export them all, but I have to import any names
>> >> >> >>> that already exist...
>> >> >> >>> Then tomorrow, somebody adds that name to Base, and my code no
>> >> >> >>> longer works...
>> >> >> >>> I dislike having to explicitly import names to extend something;
>> >> >> >>> how am I supposed to know in advance all the other names that
>> >> >> >>> could be used?
>> >> >> >>>
>> >> >> >>> What am I doing wrong?
>> >> >> >>>
>> >> >> >>> On Saturday, April 25, 2015 at 11:20:14 AM UTC-4, Stefan
>> >> >> >>> Karpinski
>> >> >> >>> wrote:
>> >> >> >>>>
>> >> >> >>>> I think you're probably being overly optimistic about how
>> >> >> >>>> infrequently there will be dispatch ambiguities between
>> >> >> >>>> unrelated functions that happen to have the same name. I would
>> >> >> >>>> guess that if you try to merge two unrelated generic functions,
>> >> >> >>>> ambiguities will exist more often than not. If you were to
>> >> >> >>>> automatically merge generic functions from different modules,
>> >> >> >>>> there are two sane ways you could handle ambiguities:
>> >> >> >>>>
>> >> >> >>>> 1. warn about ambiguities when merging happens;
>> >> >> >>>> 2. raise an error when ambiguous calls actually occur.
>> >> >> >>>>
>> >> >> >>>> Warning when the ambiguity is caused is how we currently deal
>> >> >> >>>> with ambiguities in individual generic functions. This seems
>> >> >> >>>> like a good idea, but it turns out to be extremely annoying. In
>> >> >> >>>> practice, there are fairly legitimate cases where you can have
>> >> >> >>>> ambiguous intersections between very generic definitions and
>> >> >> >>>> you just don't care because the ambiguous case makes no sense.
>> >> >> >>>> This is especially true when loosely related modules extend
>> >> >> >>>> shared generic functions. As a result, #6190 has gained a lot
>> >> >> >>>> of support.
>> >> >> >>>>
>> >> >> >>>> If warning about ambiguities in a single generic function is
>> >> >> >>>> annoying, warning about ambiguities when merging different
>> >> >> >>>> generic functions that happen to share a name would be a
>> >> >> >>>> nightmare. Imagine popular packages A and B both export a
>> >> >> >>>> function `foo`. Initially there are no ambiguities, so things
>> >> >> >>>> are fine. Then B adds some methods to its `foo` that introduce
>> >> >> >>>> ambiguities with A's `foo`. In isolation A and B are both fine –
>> >> >> >>>> so neither package author sees any warnings or problems. But
>> >> >> >>>> suddenly every package in the ecosystem that uses both A and B –
>> >> >> >>>> which is a lot since they're both very popular – is spewing
>> >> >> >>>> warnings upon loading. Who is responsible? Package A didn't even
>> >> >> >>>> change anything. Package B just added some methods to its own
>> >> >> >>>> function and has no issues in isolation. How would someone using
>> >> >> >>>> both A and B avoid getting these warnings? They would have to
>> >> >> >>>> stop writing `using A` or `using B` and instead explicitly
>> >> >> >>>> import all the names they need from either A or B. To avoid
>> >> >> >>>> inflicting this on their users, A and B would have to carefully
>> >> >> >>>> coordinate to avoid any ambiguities between all of their generic
>> >> >> >>>> functions. Except that it's not just A and B – it's all
>> >> >> >>>> packages. At that point, why have namespaces with exports at
>> >> >> >>>> all?
>> >> >> >>>>
>> >> >> >>>> What if we only raise an error when making calls to `foo` that
>> >> >> >>>> are ambiguous between `A.foo` and `B.foo`? This eliminates the
>> >> >> >>>> warning annoyance, which is nice. But it makes code that uses A
>> >> >> >>>> and B that calls `foo` brittle in dangerous ways. Suppose, for
>> >> >> >>>> example, you call `foo(x,y)` somewhere and initially this can
>> >> >> >>>> only mean `A.foo` so things are fine. But then you upgrade B,
>> >> >> >>>> which adds a method to `B.foo` that also matches the call to
>> >> >> >>>> `foo(x,y)`. Now your code that used to work will fail at run
>> >> >> >>>> time – and only when invoked with ambiguous arguments. This case
>> >> >> >>>> may be possible but rare and not covered by your tests. It's a
>> >> >> >>>> ticking time bomb introduced into your code just by upgrading
>> >> >> >>>> dependencies.
>> >> >> >>>>
>> >> >> >>>> With the way this issue has actually been resolved, if you are
>> >> >> >>>> using A and B and call `foo`, which initially is exported only
>> >> >> >>>> by A, then as soon as package B starts exporting `foo`, you'll
>> >> >> >>>> get an error and be forced to explicitly disambiguate `foo`.
>> >> >> >>>> This is a bit annoying, but after you've done that, your code
>> >> >> >>>> will no longer be affected by any changes to `A.foo` or `B.foo`
>> >> >> >>>> – it's safe and permanently unambiguous. This still isn't 100%
>> >> >> >>>> bulletproof. When `B.foo` is initially introduced, your code
>> >> >> >>>> that used `foo`, expecting to call `A.foo`, will break when
>> >> >> >>>> `foo` is called – but you may not have tests to catch this, so
>> >> >> >>>> it could happen at an inconvenient time. But introducing new
>> >> >> >>>> exports is far less common than adding methods to existing
>> >> >> >>>> exports and you are much more likely to have tests that use
>> >> >> >>>> `foo` in some way than you are to have tests that exercise a
>> >> >> >>>> specific ambiguous case. In particular, it would be fairly
>> >> >> >>>> straightforward to check if the tests use every name that is
>> >> >> >>>> referred to anywhere in some code – this would be a simple
>> >> >> >>>> coverage measure. It is completely intractable, on the other
>> >> >> >>>> hand, to determine whether your tests cover all possible
>> >> >> >>>> ambiguities between functions with the same name in all your
>> >> >> >>>> dependencies.
>> >> >> >>>>
>> >> >> >>>> Anyway, I hope that's somewhat convincing. I think that the way
>> >> >> >>>> this has been resolved is a good balance between convenient
>> >> >> >>>> usage and "programming in the large".
>> >> >> >>>>
>> >> >> >>>> On Fri, Apr 24, 2015 at 10:55 PM, Michael Francis
>> >> >> >>>> <[email protected]>
>> >> >> >>>> wrote:
>> >> >> >>>>>
>> >> >> >>>>> The resolution of that issue seems odd. Say I have two
>> >> >> >>>>> completely unrelated libraries, DataFrames and one of my own.
>> >> >> >>>>> I export value(::MyType) and I'm happily using it. Some time
>> >> >> >>>>> later I Pkg.update(); unbeknownst to me, the DataFrames dev
>> >> >> >>>>> team have added an export of value(::DataFrame, ...), and
>> >> >> >>>>> suddenly all my code which imports both breaks and I have to
>> >> >> >>>>> go through the entire stack qualifying the calls, as do other
>> >> >> >>>>> users of my module? That doesn't seem right: there is no
>> >> >> >>>>> ambiguity I can see and the multiple dispatch should continue
>> >> >> >>>>> to work correctly.
>> >> >> >>>>>
>> >> >> >>>>> Fundamentally I want the two value() functions to collapse and
>> >> >> >>>>> not have to qualify them. If there is a dispatch ambiguity then
>> >> >> >>>>> game over, but if there isn't I don't see any advantage (and
>> >> >> >>>>> lots of negatives) to preventing the import.
>> >> >> >>>>>
>> >> >> >>>>> I'd argue the same is true with overloading methods in Base.
>> >> >> >>>>> Why would we locally mask get if there is no dispatch ambiguity,
>> >> >> >>>>> even if I don't importall Base?
>> >> >> >>>>>
>> >> >> >>>>> Qualifying names seems like an anti-pattern in a multiple
>> >> >> >>>>> dispatch world, except for those edge cases where there is an
>> >> >> >>>>> ambiguity of dispatch.
>> >> >> >>>>>
>> >> >> >>>>> Am I missing something? Perhaps I don't understand multiple
>> >> >> >>>>> dispatch well enough?
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>
>> >>
>>
>
