Re: [julia-users] Re: Defining a function in different modules

Scott Jones Sat, 25 Apr 2015 14:48:06 -0700

I think lindahua had it right:

> Generally, conflicting extensions of methods are a natural consequence of 
> allowing packages to evolve independently (which we should anyway). It is 
> unavoidable as the eco-system grows (even if we address such the Images + 
> DataArrays problem by other means). If this coupling over packages cannot 
> be addressed in a scalable way, it would severely influence the future 
> prospect of Julia to become a mainstream language.



Right now, Julia already has a big mess of overloaded operators and 
function names that aren't really exactly the same interface... (* and ^ on 
strings, for example, or ~ in DataFrames :-) ).

I do have a lot of experience of how code changes over time ;-)
But no, I did not assume that signatures wouldn't change within a package, 
not at all.

My suggestion doesn't break as signatures change... as long as you, as 
the package/module creator, have used one of "your" types, so that you know 
that things are unambiguous,
then things don't break no matter how the signatures of your functions 
change...

Your method means that users are forced to 1) not use using on packages 
with coincidentally conflicting names and specify everything with the 
package/module name, or
2) force one of the package developers (if they are even still around) to 
change their package to avoid conflicting with somebody else's package's 
names,
which will require users to remember to always use package A before 
package B (B being the one that had to change their package)...

You also seem to think that the only dynamic use of the language will be in 
the REPL...
If I write a system that can dynamically load Julia code from a database 
and execute it, where are all these warning messages going to be going?

Here's another thought:
Developer A makes a nice database binding package, with 20 different 
functions.

Developer B totally independently makes a new database binding package 
that, because people in the same area are likely to pick similar names,
has 3 names in common with A's (with totally different signatures).

Developer C comes along (me), and needs to use both packages... say A for 
the data sources and B for the backing store...
Why should I not be able to use "using A; using B" or "using B; using A", 
without worrying that they coincidentally used a few names that conflict,
and not have to try to get A & B together to come up with some common 
interface...
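
To make that scenario concrete, here is a minimal sketch. The modules A and B, 
their types, and the function name foobar are all made up for illustration and 
are not taken from any real package:

```julia
# Hypothetical packages A and B, written independently; both export `foobar`.
module A
export foobar, SourceDB
struct SourceDB end
foobar(db::SourceDB, key) = "A read $key"
end

module B
export foobar, StoreDB
struct StoreDB end
foobar(db::StoreDB, key, value) = "B stored $key => $value"
end

using .A
using .B

# The type names do not collide, so they are usable unqualified.
src = SourceDB()
dst = StoreDB()

# `foobar` is exported by both modules, so an unqualified `foobar(...)` call
# is not resolved, even though the two signatures are disjoint.
# Module-qualified calls work whichever order the `using` lines are in.
a_result = A.foobar(src, "id")
b_result = B.foobar(dst, "id", 42)
```

The qualified calls are the workaround available today; the complaint is that 
they are needed at all when the signatures cannot overlap.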

Remember, I am talking about something that I think would be pretty easy to 
determine, which somebody else had suggested somewhere
here (i.e. using the fact that a type local to the module was in the 
signature),  to say that you don't need the import <name> if you are 
exporting a function.
I am not talking about never having to do the import <name>, nor ever 
having the warnings in the case where you are creating a method on types
outside of the module.
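
A rough sketch of the distinction, using the packed database example from 
earlier in the thread. The module name PackedDBs, the PackedDB type, and foobar 
are hypothetical; only length, getindex, and push! are real Base functions:

```julia
module PackedDBs   # hypothetical module name
export PackedDB, foobar

# Names owned by Base must be imported (or qualified) before adding methods,
# even when every new method mentions a type local to this module.
import Base: length, getindex

struct PackedDB
    rows::Vector{String}
end

length(db::PackedDB) = length(db.rows)
getindex(db::PackedDB, i::Int) = db.rows[i]
# Equivalent alternative: qualify the definition instead of importing.
Base.push!(db::PackedDB, row::String) = (push!(db.rows, row); db)

# A brand-new name needs no import; the concern in this thread is what
# happens once some other module (or Base) starts exporting the same name.
foobar(db::PackedDB, prefix) = [r for r in db.rows if startswith(r, prefix)]
end

using .PackedDBs
db = PackedDB(["alpha", "beta"])
push!(db, "alpine")
matches = foobar(db, "al")
```

Under the suggestion above, the import line would be unnecessary here, because 
every method signature contains PackedDB, which is local to the module.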

To me, the fact that you have had to go to this "SuperSecretBase" points 
out the problems with the current design...
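
As I understand it, that workaround looks roughly like the sketch below. 
SharedNames is a made-up name standing in for that kind of module, and A, B, and 
foobar are hypothetical. The empty `function foobar end` declaration assumes a 
Julia version that can introduce a function without methods (the feature 
referred to as #8283 in this thread):

```julia
# Hypothetical shared module that owns the name but defines no methods.
module SharedNames
export foobar
function foobar end
end

module A
import ..SharedNames: foobar   # extend the shared function, not a new one
export SourceDB
struct SourceDB end
foobar(db::SourceDB, key) = "A read $key"
end

module B
import ..SharedNames: foobar
export StoreDB
struct StoreDB end
foobar(db::StoreDB, key, value) = "B stored $key => $value"
end

using .SharedNames, .A, .B

# One generic function now carries both packages' methods, so unqualified
# calls dispatch on the argument types.
r1 = foobar(SourceDB(), "id")
r2 = foobar(StoreDB(), "id", 42)
n_methods = length(methods(foobar))
```

This works, but only because both package authors agreed in advance to depend 
on the same third module, which is exactly the coordination I am arguing 
should not be required.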

Scott

On Saturday, April 25, 2015 at 5:06:04 PM UTC-4, Jeff Bezanson wrote:
>
> The reason for it not to work is that we have two different concepts 
> that happen to be spelled the same. 
>
> To me you're just describing inherent problems with name conflicts and 
> agreeing on interfaces. Having a single method namespace is hardly a 
> magic bullet for that. It seems to require just as much coordination. 
> In Python again, two people might develop socket libraries that 
> implement connect, but one uses foo.connect(address, port) and the 
> other uses bar.connect(port, address). At that point, you absolutely 
> have to get the two developers to agree on one interface so that 
> people can use both, and say x.connect() where x might be either foo 
> or bar. If the libraries can't be changed, you can write shims to make 
> them compatible. But you can do the same thing in julia. In julia the 
> disaster is no bigger than usual. 
>
> Quite rightly, you are focusing on how code changes over time, and 
> what problems that might cause. But your design focuses on adding 
> functions, and assumes signatures don't change as much and have a 
> particular structure (i.e. typically referring directly to a type 
> defined in the same module). If those assumptions hold, I agree that 
> it could work very well. But it "breaks" as signatures change, while 
> our design "breaks" as export lists change. I prefer our tradeoff 
> because method signatures are far more subtle. Comparing method 
> signatures is computationally difficult (exponential worst case!), 
> while looking for a symbol in a list is trivial. Warnings for name 
> conflicts may be annoying, but at least it's dead obvious what's 
> happening. If a subtle adjustment to a signature affects visibility 
> elsewhere, I'd think that would be much harder to track down. 
>
> On Sat, Apr 25, 2015 at 4:24 PM, Scott Jones <[email protected]> wrote: 
> > 
> > 
> > On Saturday, April 25, 2015 at 3:58:16 PM UTC-4, Jeff Bezanson wrote: 
> >> 
> >> I think this is just a different mindset than the one we've adopted. 
> >> In the mindset you describe, there really *ought* to be only one 
> >> function with each name, in other words a single global namespace. As 
> >> long as all new definitions for a function have disjoint signatures, 
> >> there are no conflicts. To deal with conflicts, each module has its 
> >> own "view" of a function that resolves conflicts in favor of its 
> >> definitions. 
> > 
> > 
> > As a practical point, *why* should I have to know about every other 
> package 
> > or module that users of my package might possibly want to use at the 
> same 
> > time? 
> > With the way it is now, it seems I have to force everybody to not use 
> using, 
> > and use fully specified names, which seems utterly against the 
> extensibility 
> > of Julia, 
> > because if I try to export a function, I must know the intentions of 
> every 
> > user, which packages they might load, etc. that might possibly have the 
> same 
> > name. 
> > 
> > I have a module that defines a packed database format, and I want to 
> define 
> > a length, push!, and getindex methods... 
> > Then (for example's sake) I also want to define a foobar method that 
> people 
> > can use, and be able to call it on objects from my module with just 
> > foobar(db,arg1,arg2) (where db is from my class). 
> > All is well and good, but then some user complains that they can't use 
> my 
> > package and package newdb, because coincidentally they also defined a 
> > function 
> > called foobar, that does have a different signature. 
> > 
> > I believe they should be able to use both, as long as there aren't any 
> real 
> > conflicts, *without* spurious warnings... 
> > 
> >> 
> >> This approach has a lot in common with class-based OO. For example in 
> >> Python when you say `x.sin()`, the `sin` name belongs to a single 
> >> method namespace. Sure there are different namespaces for *top level* 
> >> definitions, but not for method names. If you want a different `sin` 
> >> method, you need to make a new class, so the `x` part is different. 
> >> This corresponds to the requirement you describe of methods 
> >> referencing some new type from the same julia module. 
> >> 
> >> Well, that's not how we do things. For us, if two functions have the 
> >> same name it's just a cosmetic coincidence, at least initially. In 
> >> julia two functions can have the same name but refer to totally 
> >> different concepts. For example you can have Base.sin, which computes 
> >> the sine of a number, and Transgressions.sin, which implements all 
> >> sorts of fun behavior. Say Base only defines sin(x::Float64), and 
> >> Transgressions only defines sin(x::String). They're disjoint. However, 
> >> if you say 
> >> 
> >> map(sin, [1.0, "sloth", 2pi, "gluttony"]) 
> >> 
> >> you can't get both behaviors. You'll get a method error on either the 
> >> 1.0 or the string. You have to decide which notion of `sin` you mean. 
> >> We're not going to automatically merge the two functions. 
> >> 
> > 
> > 
> > I'm not saying you should... on the other hand, if I have two functions 
> from 
> > different packages, developed independently, 
> > that happen to have a name in common, (but with different signatures), 
> the 
> > users should not have to somehow get the developers 
> > together (who may not even be around anymore), to somehow resolve the 
> > conflict (which would probably adversely affect other users 
> > of both packages if some names had to be changed) 
> > 
> >> Then if we 
> >> see the same name appearing in multiple packages, we decide if there 
> >> is indeed a common interface, and if so move the packages to using it, 
> >> e.g. by creating something like StatsBase or maybe adding something to 
> >> Base. But we don't want Base to grow much more, if at all. 
> > 
> > 
> > I'm sorry, but that just seems like a recipe for disaster... you are 
> saying 
> > that *after* users finally 
> > decide they want to use two packages together, that then somehow you 
> will 
> > force the 
> > developers of the packages to agree on a common interface, or change the 
> > names of conflicting functions, 
> > or make everybody use names qualified with the module name(s)... 
> > 
> > As for your map example... 
> > If instead, I have map(sin, [1.0, myslothdays, 2pi, mygluttonydays] ), 
> > where myslothdays and mygluttonydays both have the type MySinDiary, and 
> > there is a Transgressions.sin(x::Transgressions.MySinDiary) method... 
> > that should work, right? 
> > 
> > What is a good reason for it not to work? 
> > 
> > Scott 
> > 
> >> On Sat, Apr 25, 2015 at 3:27 PM, Scott Jones <[email protected]> 
> >> wrote: 
> >> > My point is, if I have been careful, and export methods that always 
> >> > reference at least one type defined locally in my module, so that 
> >> > they 
> >> > should always be unambiguous, I should NOT have to know about any 
> other 
> >> > module (or Base) that a user of my module might also be using having 
> a 
> >> > function with the 
> >> > same name, and should NOT have to do an import. 
> >> > 
> >> > For methods where I *am* trying to extend some type defined in 
> another 
> >> > module/package or base, then yes, I believe you should do something 
> >> > explicitly to indicate that. 
> >> > 
> >> > I don't think there is any real conflict here... right now it is too 
> >> > restrictive when the module's programmer has clearly signaled their 
> >> > intent 
> >> > by always using their own, unambiguous 
> >> > signatures for their functions. 
> >> > 
> >> > Have I got something fundamentally wrong here? 
> >> > 
> >> > Thanks, 
> >> > Scott 
> >> > 
> >> > On Saturday, April 25, 2015 at 2:10:25 PM UTC-4, Jeff Bezanson wrote: 
> >> >> 
> >> >> Scott, the behavior you're trying to get sounds to me like "IF this 
> >> >> function exists in Base then I want to extend it, otherwise just 
> make 
> >> >> my own version of the function". That strikes me as a hack. What 
> we've 
> >> >> tended to do is let everybody define whatever they want. Then if we 
> >> >> see the same name appearing in multiple packages, we decide if there 
> >> >> is indeed a common interface, and if so move the packages to using 
> it, 
> >> >> e.g. by creating something like StatsBase or maybe adding something 
> to 
> >> >> Base. But we don't want Base to grow much more, if at all. 
> >> >> 
> >> >> Getting an error for using both Base and your package seems 
> annoying, 
> >> >> but alternatives that involve doing "something" silently surely must 
> >> >> be considered worse. If a colliding name gets added to Base, the 
> >> >> default behavior should not be to assume that you meant to interfere 
> >> >> with its behavior. 
> >> >> 
> >> >> On Sat, Apr 25, 2015 at 1:57 PM, Jeff Bezanson <[email protected]> 
> >> >> wrote: 
> >> >> > Michael, that's not a bad summary. I would make a couple edits. 
> You 
> >> >> > don't really need to qualify *all* uses. If you want to use `foo` 
> >> >> > from 
> >> >> > module `A`, you can put `import A.foo` at the top and then use 
> `foo` 
> >> >> > in your code. That will have no surprises and no breakage. 
> >> >> > 
> >> >> > Also I think calling it "SuperSecretBase" makes it sound worse 
> than 
> >> >> > it 
> >> >> > is. You can have modules that describe a certain named interface, 
> and 
> >> >> > then other modules extend it. Which reminds me that I need to 
> >> >> > implement #8283, so you can introduce functions without adding 
> >> >> > methods 
> >> >> > yet. 
> >> >> > 
> >> >> > On Sat, Apr 25, 2015 at 12:31 PM, Stefan Karpinski 
> >> >> > <[email protected]> wrote: 
> >> >> >> Scott, I'm not really understanding your problem. Can you give an 
> >> >> >> example? 
> >> >> >> 
> >> >> >> 
> >> >> >> On Sat, Apr 25, 2015 at 11:53 AM, Scott Jones 
> >> >> >> <[email protected]> 
> >> >> >> wrote: 
> >> >> >>> 
> >> >> >>> A problem I'm running into is the following (maybe the best 
> >> >> >>> practice 
> >> >> >>> for 
> >> >> >>> this is documented, and I'm just too stupid to find it!): 
> >> >> >>> I have created a set of functions, which use my own type, so 
> they 
> >> >> >>> should 
> >> >> >>> never be ambiguous. 
> >> >> >>> I would like to export them all, but I have to import any names 
> >> >> >>> that 
> >> >> >>> already exist... 
> >> >> >>> Then tomorrow, somebody adds that name to Base, and my code no 
> >> >> >>> longer 
> >> >> >>> works... 
> >> >> >>> I dislike having to explicitly import names to extend something, 
> >> >> >>> how 
> >> >> >>> am I 
> >> >> >>> supposed to know in advance all the other names that could be 
> used? 
> >> >> >>> 
> >> >> >>> What am I doing wrong? 
> >> >> >>> 
> >> >> >>> On Saturday, April 25, 2015 at 11:20:14 AM UTC-4, Stefan 
> Karpinski 
> >> >> >>> wrote: 
> >> >> >>>> 
> >> >> >>>> I think you're probably being overly optimistic about how 
> >> >> >>>> infrequently 
> >> >> >>>> there will be dispatch ambiguities between unrelated functions 
> >> >> >>>> that 
> >> >> >>>> happen 
> >> >> >>>> to have the same name. I would guess that if you try to merge 
> two 
> >> >> >>>> unrelated 
> >> >> >>>> generic functions, ambiguities will exist more often than not. 
> If 
> >> >> >>>> you 
> >> >> >>>> were 
> >> >> >>>> to automatically merge generic functions from different 
> modules, 
> >> >> >>>> there are 
> >> >> >>>> two sane ways you could handle ambiguities: 
> >> >> >>>> 
> >> >> >>>> warn about ambiguities when merging happens; 
> >> >> >>>> raise an error when ambiguous calls actually occur. 
> >> >> >>>> 
> >> >> >>>> Warning when the ambiguity is caused is how we currently deal 
> with 
> >> >> >>>> ambiguities in individual generic functions. This seems like a 
> >> >> >>>> good 
> >> >> >>>> idea, 
> >> >> >>>> but it turns out to be extremely annoying. In practice, there 
> are 
> >> >> >>>> fairly 
> >> >> >>>> legitimate cases where you can have ambiguous intersections 
> >> >> >>>> between 
> >> >> >>>> very 
> >> >> >>>> generic definitions and you just don't care because the 
> ambiguous 
> >> >> >>>> case makes 
> >> >> >>>> no sense. This is especially true when loosely related modules 
> >> >> >>>> extend 
> >> >> >>>> shared 
> >> >> >>>> generic functions. As a result, #6190 has gained a lot of 
> support. 
> >> >> >>>> 
> >> >> >>>> If warning about ambiguities in a single generic function is 
> >> >> >>>> annoying, 
> >> >> >>>> warning about ambiguities when merging different generic 
> functions 
> >> >> >>>> that 
> >> >> >>>> happen to share a name would be a nightmare. Imagine popular 
> packages 
> >> >> >>>> A 
> >> >> >>>> and B 
> >> >> >>>> both export a function `foo`. Initially there are no 
> ambiguities, 
> >> >> >>>> so 
> >> >> >>>> things 
> >> >> >>>> are fine. Then B adds some methods to its `foo` that introduce 
> >> >> >>>> ambiguities 
> >> >> >>>> with A's `foo`. In isolation A and B are both fine – so neither 
> >> >> >>>> package 
> >> >> >>>> author sees any warnings or problems. But suddenly every 
> package 
> >> >> >>>> in 
> >> >> >>>> the 
> >> >> >>>> ecosystem that uses both A and B – which is a lot since they're 
> >> >> >>>> both 
> >> >> >>>> very 
> >> >> >>>> popular – is spewing warnings upon loading. Who is responsible? 
> >> >> >>>> Package A 
> >> >> >>>> didn't even change anything. Package B just added some methods 
> to 
> >> >> >>>> its 
> >> >> >>>> own 
> >> >> >>>> function and has no issues in isolation. How would someone 
> using 
> >> >> >>>> both 
> >> >> >>>> A and 
> >> >> >>>> B avoid getting these warnings? They would have to stop writing 
> >> >> >>>> `using A` or 
> >> >> >>>> `using B` and instead explicitly import all the names they need 
> >> >> >>>> from 
> >> >> >>>> either 
> >> >> >>>> A or B. To avoid inflicting this on their users, A and B would 
> >> >> >>>> have 
> >> >> >>>> to 
> >> >> >>>> carefully coordinate to avoid any ambiguities between all of 
> their 
> >> >> >>>> generic 
> >> >> >>>> functions. Except that it's not just A and B – it's all 
> packages. 
> >> >> >>>> At 
> >> >> >>>> that 
> >> >> >>>> point, why have namespaces with exports at all? 
> >> >> >>>> 
> >> >> >>>> What if we only raise an error when making calls to `foo` that 
> are 
> >> >> >>>> ambiguous between `A.foo` and `B.foo`? This eliminates the 
> warning 
> >> >> >>>> annoyance, which is nice. But it makes code that uses A and B 
> that 
> >> >> >>>> calls 
> >> >> >>>> `foo` brittle in dangerous ways. Suppose, for example, you call 
> >> >> >>>> `foo(x,y)` 
> >> >> >>>> somewhere and initially this can only mean `A.foo` so things 
> are 
> >> >> >>>> fine. But 
> >> >> >>>> then you upgrade B, which adds a method to `B.foo` that also 
> >> >> >>>> matches 
> >> >> >>>> the 
> >> >> >>>> call to `foo(x,y)`. Now your code that used to work will fail 
> at 
> >> >> >>>> run 
> >> >> >>>> time – 
> >> >> >>>> and only when invoked with ambiguous arguments. This case may 
> be 
> >> >> >>>> possible 
> >> >> >>>> but rare and not covered by your tests. It's a ticking time 
> bomb 
> >> >> >>>> introduced 
> >> >> >>>> into your code just by upgrading dependencies. 
> >> >> >>>> 
> >> >> >>>> The way this issue has actually been resolved, if you were 
> using A 
> >> >> >>>> and B 
> >> >> >>>> and call `foo`, which initially is exported only by A, then as soon as 
> >> >> >>>> package B 
> >> >> >>>> starts 
> >> >> >>>> exporting `foo`, you'll get an error and be forced to 
> explicitly 
> >> >> >>>> disambiguate `foo`. This is a bit annoying, but after you've 
> done 
> >> >> >>>> that, your 
> >> >> >>>> code will no longer be affected by any changes to `A.foo` or 
> >> >> >>>> `B.foo` 
> >> >> >>>> – it's 
> >> >> >>>> safe and permanently unambiguous. This still isn't 100% 
> >> >> >>>> bulletproof. 
> >> >> >>>> When 
> >> >> >>>> `B.foo` is initially introduced, your code that used `foo`, 
> >> >> >>>> expecting 
> >> >> >>>> to 
> >> >> >>>> call `A.foo`, will break when `foo` is called – but you may not 
> >> >> >>>> have 
> >> >> >>>> tests 
> >> >> >>>> to catch this, so it could happen at an inconvenient time. But 
> >> >> >>>> introducing 
> >> >> >>>> new exports is far less common than adding methods to existing 
> >> >> >>>> exports and 
> >> >> >>>> you are much more likely to have tests that use `foo` in some 
> way 
> >> >> >>>> than you 
> >> >> >>>> are to have tests that exercise a specific ambiguous case. In 
> >> >> >>>> particular, it 
> >> >> >>>> would be fairly straightforward to check if the tests use every 
> >> >> >>>> name 
> >> >> >>>> that is 
> >> >> >>>> referred to anywhere in some code – this would be a simple 
> >> >> >>>> coverage 
> >> >> >>>> measure. 
> >> >> >>>> It is completely intractable, on the other hand, to determine 
> >> >> >>>> whether 
> >> >> >>>> your 
> >> >> >>>> tests cover all possible ambiguities between functions with the 
> >> >> >>>> same 
> >> >> >>>> name in 
> >> >> >>>> all your dependencies. 
> >> >> >>>> 
> >> >> >>>> Anyway, I hope that's somewhat convincing. I think that the way 
> >> >> >>>> this 
> >> >> >>>> has 
> >> >> >>>> been resolved is a good balance between convenient usage and 
> >> >> >>>> "programming in 
> >> >> >>>> the large". 
> >> >> >>>> 
> >> >> >>>> On Fri, Apr 24, 2015 at 10:55 PM, Michael Francis 
> >> >> >>>> <[email protected]> 
> >> >> >>>> wrote: 
> >> >> >>>>> 
> >> >> >>>>> the resolution of that issue seems odd -  If I have two 
> >> >> >>>>> completely 
> >> >> >>>>> unrelated libraries. Say DataFrames and one of my own. I 
> export 
> >> >> >>>>> value( 
> >> >> >>>>> ::MyType) I'm happily using it. Some time later I 
> Pkg.update(), 
> >> >> >>>>> unbeknownst 
> >> >> >>>>> to me the DataFrames dev team have added an export of value( 
> >> >> >>>>> ::DataFrame, 
> >> >> >>>>> ...) suddenly all my code which imports both breaks and I have 
> to 
> >> >> >>>>> go 
> >> >> >>>>> through 
> >> >> >>>>> the entire stack qualifying the calls, as do other users of my 
> >> >> >>>>> module? That 
> >> >> >>>>> doesn't seem right, there is no ambiguity I can see and the 
> >> >> >>>>> multiple 
> >> >> >>>>> dispatch should continue to work correctly. 
> >> >> >>>>> 
> >> >> >>>>> Fundamentally I want the two value() functions to collapse and 
> >> >> >>>>> not 
> >> >> >>>>> have 
> >> >> >>>>> to qualify them. If there is a dispatch ambiguity then game 
> over, 
> >> >> >>>>> but if 
> >> >> >>>>> there isn't I don't see any advantage (and lots of negatives) 
> to 
> >> >> >>>>> preventing 
> >> >> >>>>> the import. 
> >> >> >>>>> 
> >> >> >>>>> I'd argue the same is true with overloading methods in Base. 
> Why 
> >> >> >>>>> would 
> >> >> >>>>> we locally mask get if there is no dispatch ambiguity even if 
> I 
> >> >> >>>>> don't 
> >> >> >>>>> importall Base. 
> >> >> >>>>> 
> >> >> >>>>> Qualifying names seems like an anti pattern in a multiple 
> >> >> >>>>> dispatch 
> >> >> >>>>> world. Except for those edge cases where there is an ambiguity 
> of 
> >> >> >>>>> dispatch. 
> >> >> >>>>> 
> >> >> >>>>> Am I missing something? Perhaps I don't understand multiple 
> >> >> >>>>> dispatch 
> >> >> >>>>> well enough? 
> >> >> >>>> 
> >> >> >>>> 
> >> >> >> 
>
