Re: [julia-users] Re: Defining a function in different modules

Mauro Sat, 25 Apr 2015 14:27:48 -0700

I don't think it is realistic to expect to be able to willy-nilly
'using' any number of packages and have it just work.  The approach you
propose may work most of the time; however, solid arguments were made in
this thread about how it can lead to hard-to-catch failures.
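
To make the failure mode concrete, here is a minimal sketch of the
ambiguity case raised earlier in the thread.  It assumes that methods
from two unrelated packages were merged into one generic function;
`merged_foo` and both method bodies are hypothetical stand-ins, not code
from any real package.

```julia
# Hypothetical: what a merged `foo` could look like if methods from two
# unrelated packages ended up in one generic function.
merged_foo(x, y::Int) = "A's method"   # imagine package A defined this
merged_foo(x::Int, y) = "B's method"   # imagine an upgrade of B added this

merged_foo(1.0, 2)    # only A's method applies
merged_foo(1, 2.0)    # only B's method applies

# Both methods match (Int, Int) and neither is more specific than the
# other, so this call fails at run time with a MethodError.
ambiguous = try
    merged_foo(1, 2)
    false
catch err
    err isa MethodError
end
```

The first two calls keep working, which is why this kind of breakage can
slip past tests that never exercise the (Int, Int) case.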


And maybe more importantly, from a programmer's sanity perspective, I
think it is imperative that one generic function does just one
conceptual thing.  Otherwise it gets really hard to figure out what a
piece of code does.
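
One way to read this in code is the shared-interface pattern mentioned
further down the thread: a small module owns the function name, and
other modules add methods for their own types.  This is only a sketch.
`DiaryInterface`, `Sloth`, `Gluttony`, `summarize`, and both log types
are made-up names, and it assumes a Julia version that accepts the empty
`function name end` declaration and the `struct` keyword.

```julia
# All module, type, and function names here are hypothetical.
module DiaryInterface
    export summarize
    function summarize end    # declare the generic function, no methods yet
end

module Sloth
    import ..DiaryInterface: summarize   # explicit import: we intend to extend it
    export SlothLog
    struct SlothLog
        days::Int
    end
    summarize(entry::SlothLog) = "sloth for $(entry.days) days"
end

module Gluttony
    import ..DiaryInterface: summarize
    export FeastLog
    struct FeastLog
        meals::Int
    end
    summarize(entry::FeastLog) = "$(entry.meals) feasts"
end

using .DiaryInterface, .Sloth, .Gluttony

# One function, one concept, methods contributed by two modules.
results = map(summarize, [SlothLog(3), FeastLog(7)])
```

Because both modules extend the same function on purpose, there is only
one `summarize` in scope and nothing needs to be qualified.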

On Sat, 2015-04-25 at 22:24, Scott Jones <[email protected]> wrote:
> On Saturday, April 25, 2015 at 3:58:16 PM UTC-4, Jeff Bezanson wrote:
>>
>> I think this is just a different mindset than the one we've adopted. 
>> In the mindset you describe, there really *ought* to be only one 
>> function with each name, in other words a single global namespace. As 
>> long as all new definitions for a function have disjoint signatures, 
>> there are no conflicts. To deal with conflicts, each module has its 
>> own "view" of a function that resolves conflicts in favor of its 
>> definitions. 
>>
>
> As a practical point, *why* should I have to know about every other package 
> or module that users of my package might possibly want to use at the same 
> time?
> With the way it is now, it seems I have to force everybody to not use 
> using, and use fully specified names, which seems utterly against the 
> extensibility of Julia,
> because if I try to export a function, I must know the intentions of every 
> user, which packages they might load, etc. that might possibly have the 
> same name.
>
> I have a module that defines a packed database format, and I want to define 
> a length, push!, and getindex methods...
> Then (for example's sake) I also want to define a foobar method that people 
> can use, and be able to call it on objects from my module with just
> foobar(db,arg1,arg2) (where db is from my class).
> All is well and good, but then some user complains that they can't use my 
> package and package newdb, because coincidentally they also defined a 
> function
> called foobar, that does have a different signature.
>
> I believe they should be able to use both, as long as there aren't any real 
> conflicts, *without* spurious warnings...
>  
>
>> This approach has a lot in common with class-based OO. For example in 
>> Python when you say `x.sin()`, the `sin` name belongs to a single 
>> method namespace. Sure there are different namespaces for *top level* 
>> definitions, but not for method names. If you want a different `sin` 
>> method, you need to make a new class, so the `x` part is different. 
>> This corresponds to the requirement you describe of methods 
>> referencing some new type from the same julia module. 
>>
>> Well, that's not how we do things. For us, if two functions have the 
>> same name it's just a cosmetic coincidence, at least initially. In 
>> julia two functions can have the same name but refer to totally 
>> different concepts. For example you can have Base.sin, which computes 
>> the sine of a number, and Transgressions.sin, which implements all 
>> sorts of fun behavior. Say Base only defines sin(x::Float64), and 
>> Transgressions only defines sin(x::String). They're disjoint. However, 
>> if you say 
>>
>> map(sin, [1.0, "sloth", 2pi, "gluttony"]) 
>>
>> you can't get both behaviors. You'll get a method error on either the 
>> 1.0 or the string. You have to decide which notion of `sin` you mean. 
>> We're not going to automatically merge the two functions. 
>>
>>
>
> I'm not saying you should... on the other hand, if I have two functions from 
> different packages, developed independently,
> that happen to have a name in common, (but with different signatures), the 
> users should not have to somehow get the developers
> together (who may not even be around anymore), to somehow resolve the 
> conflict (which would probably adversely affect other users
> of both packages if some names had to be changed)
>
>> Then if we 
>> see the same name appearing in multiple packages, we decide if there 
>> is indeed a common interface, and if so move the packages to using it, 
>> e.g. by creating something like StatsBase or maybe adding something to 
>> Base. But we don't want Base to grow much more, if at all. 
>
>
> I'm sorry, but that just seems like a recipe for disaster... you are saying 
> that *after* users finally
> decide they want to use two packages together, that then somehow you will 
> force the
> developers of the packages to agree on a common interface, or change the 
> names of conflicting functions,
> or make everybody use names qualified with the module name(s)...
>
> As for your map example...
> If instead, I have map(sin, [1.0, myslothdays, 2pi, mygluttonydays] ),
> where myslothdays and mygluttonydays both have the type MySinDiary, and 
> there is a Transgressions.sin(x::Transgressions.MySinDiary) method...
> that should work, right?
>
> What is a good reason for it not to work?
>
> Scott
>
>> On Sat, Apr 25, 2015 at 3:27 PM, Scott Jones <[email protected]> wrote: 
>> > My point is, if I have been careful, and export methods that always 
>> > reference at least one type defined locally in my module, so that 
>> they 
>> > should always be unambiguous, I should NOT have to know about any other 
>> > module (or Base) that a user of my module might also be using having a 
>> > function with the 
>> > same name, and should NOT have to do an import. 
>> > 
>> > For methods where I *am* trying to extend some type defined in another 
>> > module/package or base, then yes, I believe you should do something 
>> > explicitly to indicate that. 
>> > 
>> > I don't think there is any real conflict here... right now it is too 
>> > restrictive when the module's programmer has clearly signaled their 
>> intent 
>> > by always using their own, unambiguous 
>> > signatures for their functions. 
>> > 
>> > Have I got something fundamentally wrong here? 
>> > 
>> > Thanks, 
>> > Scott 
>> > 
>> > On Saturday, April 25, 2015 at 2:10:25 PM UTC-4, Jeff Bezanson wrote: 
>> >> 
>> >> Scott, the behavior you're trying to get sounds to me like "IF this 
>> >> function exists in Base then I want to extend it, otherwise just make 
>> >> my own version of the function". That strikes me as a hack. What we've 
>> >> tended to do is let everybody define whatever they want. Then if we 
>> >> see the same name appearing in multiple packages, we decide if there 
>> >> is indeed a common interface, and if so move the packages to using it, 
>> >> e.g. by creating something like StatsBase or maybe adding something to 
>> >> Base. But we don't want Base to grow much more, if at all. 
>> >> 
>> >> Getting an error for using both Base and your package seems annoying, 
>> >> but alternatives that involve doing "something" silently surely must 
>> >> be considered worse. If a colliding name gets added to Base, the 
>> >> default behavior should not be to assume that you meant to interfere 
>> >> with its behavior. 
>> >> 
>> >> On Sat, Apr 25, 2015 at 1:57 PM, Jeff Bezanson <[email protected]> 
>> >> wrote: 
>> >> > Michael, that's not a bad summary. I would make a couple edits. You 
>> >> > don't really need to qualify *all* uses. If you want to use `foo` 
>> from 
>> >> > module `A`, you can put `import A.foo` at the top and then use `foo` 
>> >> > in your code. That will have no surprises and no breakage. 
>> >> > 
>> >> > Also I think calling it "SuperSecretBase" makes it sound worse than 
>> it 
>> >> > is. You can have modules that describe a certain named interface, and 
>> >> > then other modules extend it. Which reminds me that I need to 
>> >> > implement #8283, so you can introduce functions without adding 
>> methods 
>> >> > yet. 
>> >> > 
>> >> > On Sat, Apr 25, 2015 at 12:31 PM, Stefan Karpinski 
>> >> > <[email protected]> wrote: 
>> >> >> Scott, I'm not really understanding your problem. Can you give an 
>> >> >> example? 
>> >> >> 
>> >> >> 
>> >> >> On Sat, Apr 25, 2015 at 11:53 AM, Scott Jones <[email protected]> wrote: 
>> >> >>> 
>> >> >>> A problem I'm running into is the following (maybe the best 
>> practice 
>> >> >>> for 
>> >> >>> this is documented, and I'm just too stupid to find it!): 
>> >> >>> I have created a set of functions, which use my own type, so they 
>> >> >>> should 
>> >> >>> never be ambiguous. 
>> >> >>> I would like to export them all, but I have to import any names 
>> that 
>> >> >>> already exist... 
>> >> >>> Then tomorrow, somebody adds that name to Base, and my code no 
>> longer 
>> >> >>> works... 
>> >> >>> I dislike having to explicitly import names to extend something, 
>> how 
>> >> >>> am I 
>> >> >>> supposed to know in advance all the other names that could be used? 
>> >> >>> 
>> >> >>> What am I doing wrong? 
>> >> >>> 
>> >> >>> On Saturday, April 25, 2015 at 11:20:14 AM UTC-4, Stefan Karpinski 
>> >> >>> wrote: 
>> >> >>>> 
>> >> >>>> I think you're probably being overly optimistic about how 
>> >> >>>> infrequently 
>> >> >>>> there will be dispatch ambiguities between unrelated functions 
>> that 
>> >> >>>> happen 
>> >> >>>> to have the same name. I would guess that if you try to merge two 
>> >> >>>> unrelated 
>> >> >>>> generic functions, ambiguities will exist more often than not. If 
>> you 
>> >> >>>> were 
>> >> >>>> to automatically merge generic functions from different modules, 
>> >> >>>> there are 
>> >> >>>> two sane ways you could handle ambiguities: 
>> >> >>>> 
>> >> >>>> warn about ambiguities when merging happens; 
>> >> >>>> raise an error when ambiguous calls actually occur. 
>> >> >>>> 
>> >> >>>> Warning when the ambiguity is caused is how we currently deal with 
>> >> >>>> ambiguities in individual generic functions. This seems like a 
>> good 
>> >> >>>> idea, 
>> >> >>>> but it turns out to be extremely annoying. In practice, there are 
>> >> >>>> fairly 
>> >> >>>> legitimate cases where you can have ambiguous intersections 
>> between 
>> >> >>>> very 
>> >> >>>> generic definitions and you just don't care because the ambiguous 
>> >> >>>> case makes 
>> >> >>>> no sense. This is especially true when loosely related modules 
>> extend 
>> >> >>>> shared 
>> >> >>>> generic functions. As a result, #6190 has gained a lot of support. 
>> >> >>>> 
>> >> >>>> If warning about ambiguities in a single generic function is 
>> >> >>>> annoying, 
>> >> >>>> warning about ambiguities when merging different generic functions 
>> >> >>>> that 
>> >> >>>> happen to share a name would be a nightmare. Imagine popular packages 
>> A 
>> >> >>>> and B 
>> >> >>>> both export a function `foo`. Initially there are no ambiguities, 
>> so 
>> >> >>>> things 
>> >> >>>> are fine. Then B adds some methods to its `foo` that introduce 
>> >> >>>> ambiguities 
>> >> >>>> with A's `foo`. In isolation A and B are both fine – so neither 
>> >> >>>> package 
>> >> >>>> author sees any warnings or problems. But suddenly every package 
>> in 
>> >> >>>> the 
>> >> >>>> ecosystem that uses both A and B – which is a lot since they're 
>> both 
>> >> >>>> very 
>> >> >>>> popular – is spewing warnings upon loading. Who is responsible? 
>> >> >>>> Package A 
>> >> >>>> didn't even change anything. Package B just added some methods to 
>> its 
>> >> >>>> own 
>> >> >>>> function and has no issues in isolation. How would someone using 
>> both 
>> >> >>>> A and 
>> >> >>>> B avoid getting these warnings? They would have to stop writing 
>> >> >>>> `using A` or 
>> >> >>>> `using B` and instead explicitly import all the names they need 
>> from 
>> >> >>>> either 
>> >> >>>> A or B. To avoid inflicting this on their users, A and B would 
>> have 
>> >> >>>> to 
>> >> >>>> carefully coordinate to avoid any ambiguities between all of their 
>> >> >>>> generic 
>> >> >>>> functions. Except that it's not just A and B – it's all packages. 
>> At 
>> >> >>>> that 
>> >> >>>> point, why have namespaces with exports at all? 
>> >> >>>> 
>> >> >>>> What if we only raise an error when making calls to `foo` that are 
>> >> >>>> ambiguous between `A.foo` and `B.foo`? This eliminates the warning 
>> >> >>>> annoyance, which is nice. But it makes code that uses A and B that 
>> >> >>>> calls 
>> >> >>>> `foo` brittle in dangerous ways. Suppose, for example, you call 
>> >> >>>> `foo(x,y)` 
>> >> >>>> somewhere and initially this can only mean `A.foo` so things are 
>> >> >>>> fine. But 
>> >> >>>> then you upgrade B, which adds a method to `B.foo` that also 
>> matches 
>> >> >>>> the 
>> >> >>>> call to `foo(x,y)`. Now your code that used to work will fail at 
>> run 
>> >> >>>> time – 
>> >> >>>> and only when invoked with ambiguous arguments. This case may be 
>> >> >>>> possible 
>> >> >>>> but rare and not covered by your tests. It's a ticking time bomb 
>> >> >>>> introduced 
>> >> >>>> into your code just by upgrading dependencies. 
>> >> >>>> 
>> >> >>>> The way this issue has actually been resolved, if you were using A 
>> >> >>>> and B 
>> >> >>>> and call `foo`, which is initially exported only by A, then as soon as 
>> package B 
>> >> >>>> starts 
>> >> >>>> exporting `foo`, you'll get an error and be forced to explicitly 
>> >> >>>> disambiguate `foo`. This is a bit annoying, but after you've done 
>> >> >>>> that, your 
>> >> >>>> code will no longer be affected by any changes to `A.foo` or 
>> `B.foo` 
>> >> >>>> – it's 
>> >> >>>> safe and permanently unambiguous. This still isn't 100% 
>> bulletproof. 
>> >> >>>> When 
>> >> >>>> `B.foo` is initially introduced, your code that used `foo`, 
>> expecting 
>> >> >>>> to 
>> >> >>>> call `A.foo`, will break when `foo` is called – but you may not 
>> have 
>> >> >>>> tests 
>> >> >>>> to catch this, so it could happen at an inconvenient time. But 
>> >> >>>> introducing 
>> >> >>>> new exports is far less common than adding methods to existing 
>> >> >>>> exports and 
>> >> >>>> you are much more likely to have tests that use `foo` in some way 
>> >> >>>> than you 
>> >> >>>> are to have tests that exercise a specific ambiguous case. In 
>> >> >>>> particular, it 
>> >> >>>> would be fairly straightforward to check if the tests use every 
>> name 
>> >> >>>> that is 
>> >> >>>> referred to anywhere in some code – this would be a simple 
>> coverage 
>> >> >>>> measure. 
>> >> >>>> It is completely intractable, on the other hand, to determine 
>> whether 
>> >> >>>> your 
>> >> >>>> tests cover all possible ambiguities between functions with the 
>> same 
>> >> >>>> name in 
>> >> >>>> all your dependencies. 
>> >> >>>> 
>> >> >>>> Anyway, I hope that's somewhat convincing. I think that the way 
>> this 
>> >> >>>> has 
>> >> >>>> been resolved is a good balance between convenient usage and 
>> >> >>>> "programming in 
>> >> >>>> the large". 
>> >> >>>> 
>> >> >>>> On Fri, Apr 24, 2015 at 10:55 PM, Michael Francis 
>> >> >>>> <[email protected]> 
>> >> >>>> wrote: 
>> >> >>>>> 
>> >> >>>>> The resolution of that issue seems odd. Suppose I have two completely 
>> >> >>>>> unrelated libraries, say DataFrames and one of my own. I export 
>> >> >>>>> value( 
>> >> >>>>> ::MyType) I'm happily using it. Some time later I Pkg.update(), 
>> >> >>>>> unbeknownst 
>> >> >>>>> to me the DataFrames dev team have added an export of value( 
>> >> >>>>> ::DataFrame, 
>> >> >>>>> ...) suddenly all my code which imports both breaks and I have to 
>> go 
>> >> >>>>> through 
>> >> >>>>> the entire stack qualifying the calls, as do other users of my 
>> >> >>>>> module? That 
>> >> >>>>> doesn't seem right, there is no ambiguity I can see and the 
>> multiple 
>> >> >>>>> dispatch should continue to work correctly. 
>> >> >>>>> 
>> >> >>>>> Fundamentally I want the two value() functions to collapse and 
>> not 
>> >> >>>>> have 
>> >> >>>>> to qualify them. If there is a dispatch ambiguity then game over, 
>> >> >>>>> but if 
>> >> >>>>> there isn't I don't see any advantage (and lots of negatives) to 
>> >> >>>>> preventing 
>> >> >>>>> the import. 
>> >> >>>>> 
>> >> >>>>> I'd argue the same is true with overloading methods in Base. Why 
>> >> >>>>> would 
>> >> >>>>> we locally mask get if there is no dispatch ambiguity even if I 
>> >> >>>>> don't 
>> >> >>>>> importall Base? 
>> >> >>>>> 
>> >> >>>>> Qualifying names seems like an anti pattern in a multiple 
>> dispatch 
>> >> >>>>> world. Except for those edge cases where there is an ambiguity of 
>> >> >>>>> dispatch. 
>> >> >>>>> 
>> >> >>>>> Am I missing something? Perhaps I don't understand multiple 
>> dispatch 
>> >> >>>>> well enough? 
>> >> >>>> 
>> >> >>>> 
>> >> >> 
>>
