I think it's worth spelling out the precise scenarios where "using"
multiple modules at the same time is unambiguous, and those where you can
get into trouble. If nothing else, it gives developers an idea of the edge
cases that need to be handled, helps in thinking about design changes
and/or workarounds, and may help to define best practices more clearly. I
think the crux of the issue is this:
*Name clash between similar/competing modules, which likely don't know or
care about the other, and which may or may not define functionality common
to Base*
Good examples mentioned are databases and plotting packages. FileSystemDB
and CloudDB might both want to export a connect(s::String) method, just
like Winston and Gadfly might want to export a plot(v::Vector{Float64})
method. I don't think this is a bad thing, but for it to work with "using
FileSystemDB, CloudDB" we have a couple requirements:
- Within both FileSystemDB and CloudDB, each module *must* call its own
connect method internally. If this doesn't hold, then every
package writer must know about every other package in existence, now and in
the future, to ensure nothing breaks. This requirement could be relaxed a
little if the package writer had some control over what/how its internal
methods could be overwritten. (Comparing to C++, a class can have
protected methods which can effectively be redefined by another class, but
also private methods which cannot. It's up to the class writer to decide
which parts can be changed without breaking internals.) In effect, a
module could have "private" methods that are never monkey-patched, and
"public" methods that could be. Some languages do this with naming
conventions (underscores, etc). The decision would then rest with the
package developer as to whether it would break their code to allow a
different module to override their methods. Here's an example where
there's one function that you *really* don't want someone else to
overwrite, and another that doesn't really matter. (Idea: could possibly
achieve this by automatically converting calls to "_cleanup()" within this
module into "MyModule._cleanup()" during parsing?)
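module MyModule
    const somethingImportant = loadMassiveDatabaseIntoMemory()
    # "private": nobody else should ever be able to overwrite this
    _cleanup() = cleanup(somethingImportant)

    type MyType end
    # "public": no harm done if another module overrides this
    string(x::MyType) = "MyType{}"
    ...
end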
- In the scope where the "using" call occurred, ambiguous calls should
require explicit qualification (e.g. CloudDB.connect(s))
- The pain of this could certainly be lessened with syntax like
using Gadfly as G, Winston
which would use Winston's plot method by default and force you to
call G.plot() for Gadfly's. This might be close to how it works now,
anyways.
Or, potentially harder to implement properly (though maybe under the hood
it just re-sorts the module priorities?):
import Gadfly, Winston
with Gadfly
    plot(x)
end
with Winston
    plot(y)
end
Both are very reasonable syntax from my point of view. At some point
the user has to tell us what they want, right? You can't use multiple
packages defining the exact same method and expect something to "just
work".
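To make the "explicit calling" requirement concrete, here's a minimal sketch of the database case. The two modules are toy stand-ins I made up (the real packages, if they existed, would obviously do more), and it's written for Julia 1.x. Since the "as" syntax above is only a proposal, a plain const alias plays that role here.

```julia
# Toy stand-ins for the two competing database packages (hypothetical).
module FileSystemDB
export connect
connect(s::String) = "FileSystemDB connection to $s"
end

module CloudDB
export connect
connect(s::String) = "CloudDB connection to $s"
end

using .FileSystemDB, .CloudDB

# Both modules export `connect`, so the caller says which one is meant.
# (As far as I know, an unqualified `connect("x")` here is reported as a
# conflict instead of silently picking one of the two.)
fs    = FileSystemDB.connect("local.db")
cloud = CloudDB.connect("remote.db")

# A short alias keeps qualified calls readable; this is a stand-in for
# the proposed `using CloudDB as C`.
const C = CloudDB
cloud2 = C.connect("other.db")

println(fs)
println(cloud)
println(cloud2)
```

Each module still calls its own connect internally, so neither has to know the other exists. Only the end user, who loaded both, has to disambiguate.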
Note: I think monkey patching is ok in some circumstances, but usually
dangerous in packages (e.g. redefining
Base.sin(x::Real) = fly("Vegas")
in a package will lead to problems that the package maintainer just
couldn't foresee). Monkey patching by an end user is a very different
story, as they usually have a better idea of how all the components
interact. This kind of thing should end up in a best practices guide,
though... not be forced by the language.
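For contrast, here's a small sketch of what I'd consider the safe kind of extension: adding a method to a Base function for a type the package itself owns. The Shapes module and Circle type are made up for illustration, and the code assumes Julia 1.x (so "struct" instead of "type").

```julia
module Shapes   # hypothetical package
export Circle

struct Circle
    radius::Float64
end

# Safe: this adds a method for a type defined right here, so no existing
# call anywhere else changes meaning.
Base.show(io::IO, c::Circle) = print(io, "Circle(r=", c.radius, ")")

# Risky: a definition like the one below would change behavior that
# unrelated code depends on, so it stays commented out.
# Base.sin(x::Real) = ...
end

using .Shapes

s = sprint(show, Circle(2.0))
println(s)
```

The difference is ownership: the first definition touches only arguments the package controls, while the second rewrites behavior for types that belong to everyone.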
Maybe we just need better tooling to check package interop... e.g. a
test system that will do "using X, Y, Z, ... " in a systematic way before
testing a package, so that tests run with random subsets of other
packages polluting the Main module. That would show how fragile the package
may be in the wild (and also whether using that package leads to breakages
elsewhere).
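A first step toward that kind of tooling might look like the sketch below: compare what two modules export and report the names that would collide. PlotA and PlotB are made-up stand-ins for the plotting packages, export_clashes is a helper name I invented, and this assumes Julia 1.x. It only inspects exported names, so it says nothing about internal breakage.

```julia
# Made-up stand-ins for two plotting packages.
module PlotA
export plot, scatter
plot(v::Vector{Float64}) = "PlotA plot"
scatter(v::Vector{Float64}) = "PlotA scatter"
end

module PlotB
export plot, histogram
plot(v::Vector{Float64}) = "PlotB plot"
histogram(v::Vector{Float64}) = "PlotB histogram"
end

# Names exported by both modules that refer to *different* objects.
# A name both modules re-export from a common source is not a clash.
function export_clashes(a::Module, b::Module)
    shared = intersect(names(a), names(b))
    return sort(filter(n -> getfield(a, n) !== getfield(b, n), shared))
end

clashes = export_clashes(PlotA, PlotB)
println(clashes)
```

Running this pairwise over a set of installed packages would at least list which combinations force qualified calls.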
Thoughts? Does any of this exist already and I just don't know about it?
On Thursday, April 30, 2015 at 2:18:03 PM UTC-4, Matt Bauman wrote:
>
> On Thursday, April 30, 2015 at 1:11:27 PM UTC-4, Tom Breloff wrote:
>>
>> I agree that it would be really nice, in some cases, to auto-merge
>> function definitions between namespaces (database connects are a very simple
>> OO example). However, if 2 different modules define foo(x::Float64,
>> y::Int), then there should be an error if they're both exported (or if not
>> an error, then at least force qualified access??) Now in my mind, the
>> tricky part comes when a package writer defines:
>>
>> module MyModule
>> export foo
>> type MyType end
>> foo(x::MyType) = ...
>> foo(x) = ...
>> end
>>
>
> I think this is a very interesting discussion, but it all seems to come
> back to a human communication issue. Each package author must *somehow*
> communicate to both users and other package authors that they mean the same
> thing when they define a function that's intended to be used
> interchangeably. We can either do this explicitly (e.g., by joining or
> forming an organization like JuliaStats/StatsBase.jl, JuliaDB/DBI.jl,
> JuliaIO/FileIO.jl, etc.), or we can try to write code in Julia to help mediate
> this discussion.
>
> The heuristics you're proposing sound interesting (and may even work,
> especially when combined with delaying ambiguity warnings and making them
> errors at an ambiguous call), but I have a hunch that it will take a lot of
> work to implement. And I'm not sure that it really makes things better.
> Bad actors can still define their interfaces to prevent others from using
> the same names with multiple dispatch (e.g., by only defining
> `connect(::String)`). Doing the sort of automatic filetype dispatch (like
> FileIO is working towards) still needs *one* place where `load("data.jld")`
> is interpreted and re-dispatched to `load(::FileType{:jld})` that the
> HDF5/JLD package can define its dispatch on. Finally, one currently
> unsolved area is plotting. None of the `plot` methods defined in any of
> the various packages are combined into the same function, nor could they
> feasibly do so without massive coordination between the package authors
> (for no real functional gain). This proposal doesn't really solve that,
> either. It'll be just as impossible to do `using Gadfly, Winston` and have
> the `plot` function just work.
>
> I hope this doesn't read as overly negative. I think it's great that
> folks are pushing the edges here and proposing new ideas. But I'm afraid
> that this won't replace the collaboration needed to get these sorts of
> interfaces working well and interchangeably.
>