Re: [julia-users] Re: Defining a function in different modules

Tom Breloff Thu, 30 Apr 2015 13:01:22 -0700

I think it's a good thing to spell out precise scenarios where "using" 
multiple modules at the same time is good and unambiguous, and when you can 
get in trouble.  If anything, it gives developers an idea of the edge cases 
that need to be handled and can help in thinking about design changes 
and/or workarounds, and may help to more clearly define best practices.  I 
think the crux of the issue is this:


*Name clash between similar/competing modules, which likely don't know or 
care about the other, and which may or may not define functionality common 
to Base*

Good examples mentioned are databases and plotting packages.   FileSystemDB 
and CloudDB might both want to export a connect(s::String) method, just 
like Winston and Gadfly might want to export a plot(v::Vector{Float64}) 
method.  I don't think this is a bad thing, but for it to work with "using 
FileSystemDB, CloudDB" we have a couple requirements:

   - Each of FileSystemDB and CloudDB *must* call its own connect method 
   internally.  If this doesn't hold, then every package writer must know 
   about every other package in existence, now and in the future, to ensure 
   nothing breaks.  This requirement could be relaxed a 
   little if the package writer had some control over what/how its internal 
   methods could be overwritten.  (Comparing to C++, a class can have 
   protected methods which can effectively be redefined by another class, but 
   also private methods which cannot.  It's up to the class writer to decide 
   which parts can be changed without breaking internals.)  In effect, a 
   module could have "private" methods that are never monkey-patched, and 
   "public" methods that could be.  Some languages do this with naming 
   conventions (underscores, etc).  The decision would then rest with the 
   package developer as to whether it would break their code to allow a 
   different module to override their methods.  Here's an example where 
   there's one function that you *really* don't want someone else to 
   overwrite, and another that doesn't really matter.  (Idea: could possibly 
   achieve this by automatically converting calls to "_cleanup()" within this 
   module into "MyModule._cleanup()" during parsing?)
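   
   module MyModule
   
   # "Private": must never be overridden by another module.
   const somethingImportant = loadMassiveDatabaseIntoMemory()
   _cleanup() = cleanup(somethingImportant)
   
   # "Public": harmless if another module overrides it.
   type MyType end
   string(x::MyType) = "MyType{}"
   
   ...
   
   end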
   - In the scope where the "using" call occurred, ambiguous calls should 
   require explicit module qualification.
      - The pain of this could certainly be lessened with syntax like 
      using Gadfly as G, Winston
      which could use Winston's plot method by default, forcing you to 
      call G.plot() otherwise.  This might be close to how it works now, 
      anyway.  Another option, potentially harder to implement properly 
      (but maybe under the hood it just re-sorts the module priorities?), is:
      import Gadfly, Winston
      
      with Gadfly
        plot(x)
      end
      
      with Winston
        plot(y)
      end
      Both are very reasonable syntax from my point of view.  At some point 
      the user has to tell us what they want, right?  You can't use multiple 
      packages defining the exact same method and expect something to "just 
      work".
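
To make the two requirements above concrete, here is a minimal sketch, 
assuming Julia 1.x and two toy modules standing in for the real packages. 
The open_default helper and the connection strings are made up for 
illustration.  Each module's internal call binds to its own connect, and the 
caller picks one by qualifying the name.

```julia
module FileSystemDB
export connect
connect(s::String) = "FileSystemDB:" * s
# Internal call: resolves to FileSystemDB.connect, whatever else is loaded.
open_default() = connect("default")   # hypothetical helper
end

module CloudDB
export connect
connect(s::String) = "CloudDB:" * s
# Internal call: resolves to CloudDB.connect.
open_default() = connect("default")   # hypothetical helper
end

using .FileSystemDB, .CloudDB

# Both modules export `connect`, so the bare name is ambiguous in this
# scope; qualify it to say which one is meant.
fs    = FileSystemDB.connect("/tmp/data")
cloud = CloudDB.connect("my-bucket")

# A short alias keeps qualified calls cheap to type.
const FS = FileSystemDB
fs2 = FS.connect("/tmp/other")
```

The alias is only a stand-in for the "using Gadfly as G" idea; it does not 
give either module priority for unqualified calls.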
   


Note: I think monkey patching is OK in some circumstances, but it is usually 
dangerous in packages (e.g., redefining
Base.sin(x::Real) = fly("Vegas")
in a package will lead to problems that the package maintainer just 
couldn't foresee).  Monkey patching by an end user is a much different 
story, as they usually have a better idea of how all the components 
interact.  This kind of thing should end up in a best practices guide, 
though... not be forced by the language.
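
As a rough illustration of where the line might sit, here is a sketch 
assuming Julia 1.x; the Angles module and Degrees type are invented for the 
example.  Adding a method to Base.sin for a type the package owns leaves 
everyone else's calls alone, whereas redefining it for Real (left commented 
out) would change behaviour for every caller.

```julia
module Angles

# A type owned by this (hypothetical) package.
struct Degrees
    value::Float64
end

# Extending Base.sin for our own type: no existing call changes meaning.
Base.sin(d::Degrees) = sin(deg2rad(d.value))

# Monkey patching, by contrast, would replace behaviour for all Reals:
# Base.sin(x::Real) = fly("Vegas")

end

right_angle = sin(Angles.Degrees(90.0))  # dispatches to the new method
plain       = sin(0.0)                   # Base behaviour is untouched
```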

Maybe we just need better tooling to identify package interop problems, i.e. 
a test system that will do "using X, Y, Z, ..." in a systematic way before 
testing a package, so that tests run with random subsets of other packages 
polluting the Main module.  That would show how fragile the package may be 
in the wild (and also whether using that package leads to breakages 
elsewhere).
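
A very rough sketch of the bookkeeping such a tool would need, assuming 
Julia 1.x.  The helper names subsets and using_line are made up, and this 
version enumerates every subset instead of sampling random ones; it only 
builds the "using" lines and does not run any tests.

```julia
# All subsets of `names` (2^n of them), selected by the bits of `mask`.
function subsets(names)
    n = length(names)
    [[names[i] for i in 1:n if (mask >> (i - 1)) & 1 == 1]
     for mask in 0:(2^n - 1)]
end

# The `using` line for one trial: the polluting packages first, then the
# package under test.
using_line(others, target) = "using " * join(vcat(others, target), ", ")

# Example trial plan for a hypothetical package MyPkg.
trials = [using_line(s, "MyPkg") for s in subsets(["Gadfly", "Winston"])]

# Each line would then be evaluated in a fresh Julia process before the
# package's own tests, so that one trial cannot pollute the next.
```

For a long package list the 2^n subsets would have to be sampled, which is 
where the "random subsets" part comes in.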

Thoughts?  Does any of this exist already and I just don't know about it?


On Thursday, April 30, 2015 at 2:18:03 PM UTC-4, Matt Bauman wrote:
>
> On Thursday, April 30, 2015 at 1:11:27 PM UTC-4, Tom Breloff wrote:
>>
>> I agree that it would be really nice, in some cases, to auto-merge 
>> function definitions between namespaces (database connects are a very 
>> simple OO example).   However, if 2 different modules define 
>> foo(x::Float64, y::Int), then there should be an error if they're both 
>> exported (or if not an error, then at least force qualified access?). 
>> Now in my mind, the tricky part comes when a package writer defines:
>>
>> module MyModule
>> export foo
>> type MyType end
>> foo(x::MyType) = ...
>> foo(x) = ...
>> end
>>
>
> I think this is a very interesting discussion, but it all seems to come 
> back to a human communication issue.  Each package author must *somehow* 
> communicate to both users and other package authors that they mean the same 
> thing when they define a function that's intended to be used 
> interchangeably.  We can either do this explicitly (e.g., by joining or 
> forming an organization like JuliaStats/StatsBase.jl, JuliaDB/DBI.jl, 
> JuliaIO/FileIO.jl, etc.), or we can try to write code in Julia to help mediate 
> this discussion.
>
> The heuristics you're proposing sound interesting (and may even work, 
> especially when combined with delaying ambiguity warnings and making them 
> errors at an ambiguous call), but I have a hunch that it will take a lot of 
> work to implement.  And I'm not sure that it really makes things better. 
> Bad actors can still define their interfaces to prevent others from using 
> the same names with multiple dispatch (e.g., by only defining 
> `connect(::String)`). Doing the sort of automatic filetype dispatch (like 
> FileIO is working towards) still needs *one* place where `load("data.jld")` 
> is interpreted and re-dispatched to `load(::FileType{:jld})` that the 
> HDF5/JLD package can define its dispatch on.  Finally, one currently 
> unsolved area is plotting.  None of the `plot` methods defined in any of 
> the various packages are combined into the same function, nor could they 
> feasibly do so without massive coordination between the package authors 
> (for no real functional gain).  This proposal doesn't really solve that, 
> either.  It'll be just as impossible to do `using Gadfly, Winston` and have 
> the `plot` function just work.
>
> I hope this doesn't read as overly negative.  I think it's great that 
> folks are pushing the edges here and proposing new ideas.  But I'm afraid 
> that this won't replace the collaboration needed to get these sorts of 
> interfaces working well and interchangeably.
>
