mapping module identifiers to URLs (was RE: [Haskell] URLs in haskell module namespace)
My original goal in this thread was to explore a way to map module identifiers to the locations of module implementations, so that implementations can retrieve and compile those modules with minimal user intervention. We got sidetracked into grafting, and I'd like to return to the original goal.

The big open question is whether the mapping of imported module identifiers to retrieval locations is determined per import (like HTML links), per module (like HTML base tags), per build (think -i), per system (the registry), or centrally/federated (like the DNS).

Per-system and centrally/federated mappings feel like they involve too much bureaucracy, either through a local sysadmin or an IANA-like entity (currently the libraries mailing list), and they give the programmer too little control over which packages they actually want to use. Per-import and per-module mappings seem like unjustified maintenance headaches absent module identifier relativity (a feature rejected in the 2003 grafting thread to which Simon referred [1], though Malcolm seems to be backtracking on the issue in this thread). Therefore per-build seems like the way to go.

One implementation of per-build mapping is to extend -i to take URLs, but that requires the implementation to query every URL on the search path for every import, which seems inefficient. I actually think -i is harmful in general because it makes it much harder to track dependencies.

Here is a strawman proposal for replacing -i. The compiler/interpreter should accept a Modules file that maps third-party module identifiers to URLs at which source may be found. Here is a strawman file format:

   #moduleId   url(s)
   Foo.Bar.*   http://domain.com/package-v1.0.hkg http://domain.com/package-v2.0.hkg
   Foo.Bar     http://domain2.com/package2.hkg

The URLs on any line enumerate all packages that have compatible implementations of the module identified by the module identifier on that line. Each imported package may contain at most one Modules file.

The implementation attempts to find module URL implementation agreement among all imported packages. If that fails, foreign Modules files are interpreted as being included at the line where they were imported. Later URLs for the same moduleId override earlier ones. The implementation should give a warning if third-party packages give conflicting module locations.

Note: Yes, I know about Cabal's Build-Depends, but it doesn't serve the need described here.

-Alex-

[1] http://www.haskell.org/pipermail/libraries/2003-September/001457.html

__
S. Alexander Jacobson   tel:917-770-6565   http://alexjacobson.com

___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell
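[Editorial note: the strawman lookup above can be sketched in a few lines of Haskell. The format, the wildcard semantics, and every function name below are assumptions read off the proposal, not part of any existing tool; a real parser would also need to be smarter about '#' appearing inside URLs.]

```haskell
module ModulesFile where

import Data.List  (isPrefixOf)
import Data.Maybe (listToMaybe, mapMaybe)

type ModulePattern = String   -- e.g. "Foo.Bar" or "Foo.Bar.*"
type URL           = String

-- Parse the Modules file: one pattern per line followed by candidate
-- URLs; '#' begins a comment (so the header line is ignored).
parseModules :: String -> [(ModulePattern, [URL])]
parseModules = mapMaybe parseLine . lines
  where
    parseLine l = case words (takeWhile (/= '#') l) of
                    (pat : urls@(_:_)) -> Just (pat, urls)
                    _                  -> Nothing

-- Does a pattern cover a module identifier?  "Foo.Bar.*" covers any
-- module strictly under Foo.Bar; a plain identifier must match exactly.
matches :: ModulePattern -> String -> Bool
matches pat m =
  case stripStar pat of
    Just prefix -> (prefix ++ ".") `isPrefixOf` m
    Nothing     -> pat == m
  where
    stripStar p = case reverse p of
                    ('*' : '.' : rest) -> Just (reverse rest)
                    _                  -> Nothing

-- "Later URLs for the same moduleId override earlier ones", so we
-- search the table from the end.
lookupModule :: [(ModulePattern, [URL])] -> String -> Maybe [URL]
lookupModule table m =
  listToMaybe [ urls | (pat, urls) <- reverse table, pat `matches` m ]
```

On the strawman file above, `lookupModule` returns the package2 URL for `Foo.Bar` itself, the two package URLs for anything under `Foo.Bar.*`, and `Nothing` for unmapped modules.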
RE: [Haskell] URLs in haskell module namespace
On 23 March 2005 13:11, Malcolm Wallace wrote:
> I think this will be trivially possible once the compilers support
> multiple versioning of packages. (Ghc may even support it already.):
>
>    {-# OPTIONS -package foo-1.0 #-}
>    module Old (module Foo) where
>    import Foo
>
>    {-# OPTIONS -package foo-2.2 #-}
>    module New (module Foo) where
>    import Foo
>
>    module Convert where
>    import qualified Old
>    import qualified New
>    convert (Old.Foo x y) = New.Foo y x

We're not going to support this, at least for the foreseeable future. It's a pretty big change: every entity in the program becomes parameterised by the package name as well as the module name, because module names can overlap. This means a change to the language: there might be multiple types called M.T in the program, which are not compatible (they might have different representations). You can't pass a value of type M.T that you got from version 1.0 of the package to a function expecting M.T in version 2.

This issue came up in the thread about grafting from late 2003 on the libraries list (sorry, I don't have a link to hand).

Cheers,
Simon
Re: [Haskell] URLs in haskell module namespace
S. Alexander Jacobson wrote:
> As I move from machine to machine, it would be nice not to have to
> install all the libraries I use over and over again. I'd like to be
> able to do something like this:
>
>    import http://module.org/someLib as someLib
>
> If the requested module itself does local imports, the implementation
> would first try to resolve the names on the client machine and
> otherwise make requests along remote relative paths.

Embedding the path in the source code seems like a bad idea. If you want to allow modules to be loaded via URLs, it would make more sense to extend GHC's -i switch, i.e.

   ghc -i http://module.org/ ...

--
Glynn Clements [EMAIL PROTECTED]
[Haskell-cafe] Re: [Haskell] URLs in haskell module namespace
Dear list members,

I'd like to share some sketchy ideas I have on the subject to address some of the issues raised.

At 12:14 22/03/05, Malcolm Wallace wrote:
> I cannot see any of the Haskell compilers ever implementing this idea
> as presented. It would introduce an enormous raft of requirements
> (networking client, database mapping, caching, etc) that do not belong
> in a compiler - they belong in separate (preprocessing/packaging)
> tools. Furthermore, these tools already exist, albeit they are young
> and have a long maturation process still ahead of them.

An external program acting as a URI streamer might be the solution. Such a program (identified via an environment variable or a compiler command-line option, just like Hugs' external editor) would take a URI as its command-line argument and respond with the streamed contents of that URI on its stdout, like e.g. curl/wget -O -. All the compiler has to do is popen() that program if an import statement contains a URI. Using curl/wget helps get around various issues with proxies/encryption/etc., as those programs are specifically designed for that. I do not believe this would result in significant overhead compared to the regular fopen() used by the compiler for opening source files.

On a non-networked system, such a program would be a not-so-complicated shell script pretending it downloads from a URI, but reading from local disk (flash, or any other kind of storage) instead.

To address the problem of module/package URI changes over time, the following may be suggested. Either purl.org is used (and then it is the responsibility of a package maintainer to keep its URL pointer valid), or some kind of purl server may be set up somewhere (at haskell.org for example) which also supports mirroring. This means that for each package/module registered with this server, multiple locations are known (well, probably willing people might allocate their computer resources for that; at least I would not object, as long as I have my own web server). The hypothetical purl server serves redirects as usual, but shifts randomly to another mirror location for each new request for a module/package. So, if an attempt to retrieve a module/package fails, it may be repeated, and another mirror location will be tried. Mirrors will be synchronized behind the scenes.

Will such a centralized purl server be a bottleneck or a single point of failure? Probably not more than a centralized Hackage database (or is it planned to be distributed?).

Also, some resolver might be part of the URI streamer which maps module names to URIs. For example, the Prelude will most likely be stored locally, but some other module will not. This means that the resolver consults the local package database (cabal) and its own cache, and either streams a local file or popens curl with some URI constructed specifically from the desired module name. Once downloaded, a module may be cached with some TTL, so further recompilations do not result in curl involvement (until the TTL expires).

PS: This is all written on the assumption that Haskell source files are served. Binary distributions, of course, would require different techniques.

--
Dimitry Golubovsky
Anywhere on the Web

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
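[Editorial note: the resolver decision logic Dimitry sketches (local file first, then a TTL-bounded cache, then a fetch via curl/wget) can be written down as a small pure function. The `Env` record, the `Action` type, and all names below are illustrative assumptions, not part of any proposed tool.]

```haskell
module Resolver where

import qualified Data.Map as Map

type ModuleName = String
type URI        = String
type Seconds    = Integer

-- What the URI streamer should do for a given module.
data Action = ReadLocal FilePath   -- module installed locally
            | ReadCache FilePath   -- cached download still fresh
            | Fetch URI            -- popen curl/wget for this URI
            deriving (Eq, Show)

data Env = Env
  { localModules :: Map.Map ModuleName FilePath
  , cache        :: Map.Map ModuleName (FilePath, Seconds)  -- (path, fetch time)
  , remoteURI    :: ModuleName -> URI   -- how a URI is built from a name
  , ttl          :: Seconds
  }

-- Local package database wins; otherwise a cache entry is reused
-- until its TTL expires; otherwise we fall back to fetching.
resolve :: Env -> Seconds -> ModuleName -> Action
resolve env now m
  | Just p <- Map.lookup m (localModules env)   = ReadLocal p
  | Just (p, fetched) <- Map.lookup m (cache env)
  , now - fetched < ttl env                     = ReadCache p
  | otherwise                                   = Fetch (remoteURI env m)
```

The Prelude-stays-local behaviour falls out of the first guard, and "further recompilations do not result in curl involvement" is the second.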
Re: [Haskell] URLs in haskell module namespace
Hello,

Here are some problems I see with the scheme:

1) No easy way of seeing the dependencies of a single package.
Currently I can just look at the Cabal file and know what packages are required. In the new scheme I would need to grep all the source files. Not a very good improvement in my opinion.

2) What about machines without an easy http connection out?
These machines are not as uncommon as you may think. Some companies have internal networks not connected to the rest of the net (and separate internet-connected machines) - and even at home I would like to use Haskell even if there is a network outage.

3) Package versions.
You cannot have two modules with the same name in your application. This is a limit in GHC afaik, not the current import syntax. To handle versioning you should plan ahead and make new versions support the serialization syntax of older versions. So there would be no real advantage. Also, if we use a URL like http://package/newest then we don't have control over when we start using a new version of a library (while usually one wants to test that), and if we use http://package/version then we have to perform the update in all the source files which use the package (and if we support multiple versions then missing one will have fun results).

4) Does not help for nontrivial packages.
Most nontrivial packages are a lot more than simple Haskell source files scattered somewhere. They may have C sources, need configuration when built, or use alex, happy, cpphs or somesuch.

- Einar Karttunen
Re: [Haskell] URLs in haskell module namespace
S. Alexander Jacobson [EMAIL PROTECTED] writes:

> Ok, well let's unpack what is actually required here:
>
> 1. Change of syntax in import statements
> GHC already has lots of new syntax.

I can see that one might wish to broaden the import syntax, rather like Hugs once supported (but not now), to use a full filepath rather than a module name. For parsing, a URI enclosed in string quotes would be easiest. However, I suspect that the same reason why Hugs abandoned filepaths will arise here again. I don't know what that reason was, but I would imagine that non-local absolute links are very fragile. The better solution to my mind is to (a) include all the local source code in a local relative directory/namespace, and (b) have any non-local modules referenced by package name, not location. The package name is likely to be constant, whereas its location (whether on the build machine or the wider net) is likely to change.

> 2. Module names with package scope
> GHC already has a -i. I assume the complexity of generating a -i
> w/r/t a notation provided in the import statement is not that high.

I don't see a problem with adding

   {-# PACKAGES gtk2hs furble #-}

to the top of a source file.

> 3. Networking Client
> I think GHC already bundles Cabal and Cabal already handles being a
> network client and doing some database mapping. (Lemmih: Please
> correct me if I am mistaken.)

Cabal just does packaging. It is the Hackage project which does the networking stuff to make the finding and downloading of packages easy.

> Also, it is ridiculous for a modern language implementation NOT to
> have a network client library.

Of course the /language/ should (and does) have a network client library. But should every compiler expect to do networking stuff? No. For one thing, there is a huge number of machines out there with no network connection, whether for security or otherwise. What do you propose the compiler should do there? Inevitably, you are going to need some other tool, running on a different, net-connected machine, to collect the necessary library packages for physical transfer.

> 4. Caching
> Caching is new, but it is not that difficult to add to an existing
> HTTP requester and the benefits seem well worth this marginal cost.

Network caching belongs in a separate layer of the system environment - middleware if you like. It is a service unrelated to compilation.

> 5. Maturation of Packaging Tools
> I agree that the packaging tools are immature. That is why it makes
> sense to evaluate this proposal now. No one has a big investment in
> the current packaging model, and packaging tools optimized for a
> language that works in the way I propose would look very different
> from packaging tools organized for the pre-Internet world.

Fair enough. Just don't assume that every machine now or in the future will be net-enabled.

> > No, it spreads the dependency problem over lots of import
> > statements,
>
> Disaster? I don't think so. That is why purl.org exists. The HTTP 302
> status code is your friend. If you don't want to use purl.org, feel
> free to set up your own redirect server. I imagine various different
> redirect servers operated by different people with different policies
> about what counts as a bug fix vs what counts as a new version, etc.

Hmmm, this is pretty much what Hackage aims to provide. Give it a package name, and the tool will resolve it to a location and download the package. It will likely support multiple servers with different policies, and you simply configure which servers you want to use.

> And btw, it is a failure of Haskell right now that imports don't
> create dependency. Right now, I would like a sane way to import two
> different versions of the same module so I can do file conversion. It
> seems like the only way to accomplish this in Haskell as it stands is
> to rename one version and then I'm back in the world of global search
> and replace on import lines again. It would be MUCH nicer to do this
> via package URLs instead.

I think this will be trivially possible once the compilers support multiple versioning of packages. (Ghc may even support it already.):

   {-# OPTIONS -package foo-1.0 #-}
   module Old (module Foo) where
   import Foo

   {-# OPTIONS -package foo-2.2 #-}
   module New (module Foo) where
   import Foo

   module Convert where
   import qualified Old
   import qualified New
   convert (Old.Foo x y) = New.Foo y x

> > It would be much better to group the dependencies into a single
> > file per project - so there is just one place where changes need to
> > be made. This possibility already exists - just create a .cabal
> > file for the project.
>
> How do I depend on multiple versions of the same package in a single
> module?

You can't do it in a single module, because the namespaces overlap. But it should be straightforward using two or more modules, to separate out the namespaces via qualification, as illustrated above.

> How do I make sure that
Re: [Haskell] URLs in haskell module namespace
Here is what I designed and implemented for my language Kogut:

There is a file format of compilation parameters (compiler options, source file encoding, directories to look for interface files for imported modules, directories to look for libraries, C libraries to link, directories to look for packages, packages used). Parameters are gathered from several places:

- From the command line (in a slightly different format than a file).
- From the file name derived from the source file, with a changed extension.
- From common.kop in the directory of the source file.
- From common.kop in parent directories of the source, up to the current directory.
- From used packages. A package actually corresponds to such a parameter file, nothing else. Packages are included in the global dependency order, with duplicates removed.
- From the default file in the compiler installation.

Some parameters are accumulated (e.g. directory lists or libraries; with duplicates removed; the order is important for static libraries) while others are overridden (e.g. the C compiler to use).

So if a package is needed to bring in one module used in one place, it can be specified near the file which needs it (and there is less chance that the dependency will stick around, forgotten, when no longer needed). OTOH a package used all over the place will be given in the common.kop in the root of the directory tree.

A package usually has two parameter files: one common.kop used during its compilation, and another named after the package which is used by its clients.

Since compilation options can be put in parameter files corresponding to source files, it's not necessary to invent a way to specify them per-file in Makefiles.

--
 __(   Marcin Kowalczyk
 \__/  [EMAIL PROTECTED]
  ^^   http://qrnik.knm.org.pl/~qrczak/
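[Editorial note: the accumulate-vs-override merging Marcin describes has a natural monoid shape. The sketch below is a Haskell rendering of that idea only; the field names and the choice of which fields accumulate are assumptions for illustration, not Kogut's actual parameter set.]

```haskell
module Params where

import Control.Applicative ((<|>))
import Data.List (nub)

data Params = Params
  { includeDirs :: [FilePath]    -- accumulated, duplicates removed, order kept
  , libraries   :: [String]      -- accumulated (order matters for static libs)
  , cCompiler   :: Maybe String  -- overridden: the later file wins
  } deriving (Eq, Show)

-- Merging two parameter files: list-valued fields concatenate with
-- duplicates removed; scalar fields are overridden by the later file.
instance Semigroup Params where
  a <> b = Params
    { includeDirs = nub (includeDirs a ++ includeDirs b)
    , libraries   = nub (libraries a ++ libraries b)
    , cCompiler   = cCompiler b <|> cCompiler a
    }

instance Monoid Params where
  mempty = Params [] [] Nothing

-- Fold the parameter files in gathering order: compiler defaults first,
-- then common.kop files, then per-file and command-line parameters.
merge :: [Params] -> Params
merge = mconcat
```

A per-file parameter file can then name a package used in just one place, and dropping that file drops the dependency, exactly as described above.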
[Haskell-cafe] Re: [Haskell] URLs in haskell module namespace
Switching to -cafe...

At 12:14 22/03/05, Malcolm Wallace wrote:
> > import http://domain.org/package-1.0.cabal#Network.HTTP as HTTP
> > import http://hage.org/package-2.1.cabal#Network.HTTP as HTTP2
> > --note use of HTTP fragment identifier for module name
>
> I cannot see any of the Haskell compilers ever implementing this idea
> as presented. It would introduce an enormous raft of requirements
> (networking client, database mapping, caching, etc) that do not belong
> in a compiler - they belong in separate (preprocessing/packaging)
> tools. Furthermore, these tools already exist, albeit they are young
> and have a long maturation process still ahead of them.

Well, I'd agree the networking stuff doesn't belong in the compiler. In the operating system, I'd suggest, as part of the I/O and file system. In the long run, I believe the distinction between local and networked resources will fade away (other than for practical purposes, as today with specific networked file systems), and URIs are an effective and standardized way to identify all kinds of resources.

#g

Graham Klyne
For email: http://www.ninebynine.org/#Contact
Re: [Haskell] URLs in haskell module namespace
Proposal restatement: Import statements should be allowed to include the URL of a Cabal package. Module namespace in these import statements should be with respect to the package and not the local environment, e.g. these import statements allow us to import two different versions of Network.HTTP:

   import http://domain.org/package-1.0.cabal#Network.HTTP as HTTP
   import http://hage.org/package-2.1.cabal#Network.HTTP as HTTP2
   --note use of HTTP fragment identifier for module name

Advantages:

* Simplified place for the user to track/enforce external dependencies. If you are using non-standard modules, you had better know where they came from and how to get another copy. This proposal provides a sane way to do that (see below).

* It makes it easy to move code between machines. The implementation takes care of retrieving and building the packages automatically and as necessary. There is no need for a separate retrieve/build/install cycle.

* Eliminates the horrible globality of Haskell's module namespace. You can use two modules with the same name and different functionality, and you can use two modules that use different versions of the same module (see below).

* Users no longer need to think about package installation scope. Package installation is with respect to the current use. Whether multiple users are able to share the same installation is up to the installation. Users can't infest the machine's local namespace by adding new packages.

On Tue, 22 Mar 2005, Lemmih wrote:
> > 1. knowing the source package for each module used in their code
> > even if they didn't install the packages in the first place i.e.
> > import Foo.Bar just worked on my development machine.
>
> I'm not sure I completely understand what you're saying, but knowing
> the exact URL for every single module import seems more of a hassle
> than installing a few packages. You could perhaps even make a shell
> script containing 'cabal-get install package1 package2 ...'.

I am assuming that I may want to move my code to another machine and that therefore I need to keep a record *somewhere* of the source package of every module I actually use. If I don't, then moving will be much more difficult. Yes, keeping track of these packages is a hassle, but I don't see how it can be avoided. Once I am keeping track, the *somewhere* where it makes the most sense to do so is the point in the code where I am importing the module. That way the implementation can enforce correspondence, and if I stop using the module, the package dependency automatically vanishes. Doing this sort of work in a separate script strikes me as a maintenance headache, and means that all modules I use have to coexist in a shared namespace, which seems likely to create more headaches.

> > 2. knowing the current location of those packages even if they
> > didn't obtain them for installation on the original machine where
> > they used them and aren't on the mailing list for them.
>
> I assume you meant something like "The developer doesn't know where
> to find the packages." The location of the packages is irrelevant to
> the developer since it's handled by Cabal/Hackage.

I don't understand. Are you saying that there will be only one Hackage server ever, and it will have all information about all packages everywhere, and that the location of this hackage server will be hard coded into every cabal implementation? If so, I find that vision incredibly unappealing. I believe there should/will be multiple hackage servers carrying different hackages under the control of different parties (e.g. a corporation might have one for its own private code). And, if there are multiple hackage servers, we are going to need to identify the server from which a particular package originates and the location of that package on that server. This proposal provides an obvious method of doing so. And a big bonus here is that we get a simple solution to the problem of Haskell's global module namespace.

> There was a problem with module name spaces? Wouldn't there only be a
> problem if two packages used the same module name for different
> functionality?

Yes, and that happens whenever you have different modules using different versions of the same module, and it also happens when two different authors both choose to name their libraries Network.HTTP or Text.XML.Parse. If module names were with respect to packages, that would be entirely fine. But right now module names are global, and that is a serious problem.

-Alex-

__
S. Alexander Jacobson   tel:917-770-6565   http://alexjacobson.com
Re: [Haskell] URLs in haskell module namespace
At 15:47 21/03/05 -0500, S. Alexander Jacobson wrote:
> As I move from machine to machine, it would be nice not to have to
> install all the libraries I use over and over again. I'd like to be
> able to do something like this:
>
>    import http://module.org/someLib as someLib
>
> If the requested module itself does local imports, the implementation
> would first try to resolve the names on the client machine and
> otherwise make requests along remote relative paths.

I think this is an interesting idea, one that has some interesting implications for the possible evolution of Haskell as an integrated scripting language for the web. One of the principles of web architecture is, roughly, that anything worth naming is worth naming with a URI [1][2]. I think a logical consequence of using URIs for module names is that module exports can become URI references with fragment identifiers. Thus, given:

   module http://example.org/modules/foo ( func1, func2, val3 ) where ...

this also introduces the URI references:

   http://example.org/modules/foo#func1
   http://example.org/modules/foo#func2
   http://example.org/modules/foo#val3

This is exactly where web services and SOAP are going, using URIs to identify processing services in the Web.

Some other points to note...

Concerning dependency on http: http: is just one URI scheme among many. It just happens to be very widely used, and provides a uniform naming and resource location framework [3][6]. Within the Web technical community, there is a strong sense that *all* URIs should be stable [4][5], or breakage occurs. This is illustrated by the concern about dependence on changeable URIs expressed in another message in this thread, but I think that when persistence is needed (for technical reasons) it can be arranged.

#g
--
[1] http://www.w3.org/TR/webarch/#uri-benefits
[2] http://www.w3.org/DesignIssues/Webize.html
[3] http://www.w3.org/2002/Talks/www2002-tbl/
    http://www.w3.org/2002/Talks/www2002-tbl/slide12-0.html
    (I spent a little while looking for a more definitive view on this
    idea of the less-than-absolute distinction between identification
    and location in the web. Although this topic is much-discussed in
    web circles, this was the best I could find quickly.)
[4] http://www.w3.org/Provider/Style/URI
[5] http://www.w3.org/DesignIssues/PersistentDomains
[6] http://www.w3.org/DesignIssues/Naming

Graham Klyne
For email: http://www.ninebynine.org/#Contact
Re: [Haskell] URLs in haskell module namespace
S. Alexander Jacobson [EMAIL PROTECTED] writes:

> Proposal restatement: Import statements should be allowed to include
> the URL of a Cabal package. Module namespace in these import
> statements should be with respect to the package and not the local
> environment, e.g. these import statements allow us to import two
> different versions of Network.HTTP:
>
>    import http://domain.org/package-1.0.cabal#Network.HTTP as HTTP
>    import http://hage.org/package-2.1.cabal#Network.HTTP as HTTP2
>    --note use of HTTP fragment identifier for module name

I cannot see any of the Haskell compilers ever implementing this idea as presented. It would introduce an enormous raft of requirements (networking client, database mapping, caching, etc) that do not belong in a compiler - they belong in separate (preprocessing/packaging) tools. Furthermore, these tools already exist, albeit they are young and have a long maturation process still ahead of them.

> Advantages:
> * Simplified place for the user to track/enforce external
>   dependencies.

No, it spreads the dependency problem over lots of import statements, which will quickly become a maintenance headache when the URLs become invalid. Imagine a GUI project that uses, say, the GTK+ libraries in a hundred different import statements. Then the GTK server moves to a different URL. Disaster! It would be much better to group the dependencies into a single file per project - so there is just one place where changes need to be made. This possibility already exists - just create a .cabal file for the project.

> If you are using non-standard modules, you had better know where they
> came from and how to get another copy. This proposal provides a sane
> way to do that (see below).

The Hackage project is exactly a database of package/location mappings, which the /author/ of each package can keep up-to-date, not the user. Much more maintainable.

Regards,
Malcolm
Re: [Haskell] URLs in haskell module namespace
On Tue, 22 Mar 2005 06:40:02 -0500 (Eastern Standard Time), S. Alexander Jacobson [EMAIL PROTECTED] wrote: Proposal restatement: Import statements should be allowed to include URL of Cabal packages. Module namespace in these import statements should be with respect to the package and not the local environment. e.g. these import statements allow us to import two different versions of Network.HTTP import http://domain.org/package-1.0.cabal#Network.HTTP as HTTP import http://hage.org/package-2.1.cabal#Network.HTTP as HTTP2 --note use of HTTP fragment identifier for module name Advantages: * Simplified place for user to track/enforce external dependencies. If you are using non-standard modules, you had better know where they came from and how to get another copy. This proposal provides a sane way to do that (see below). * It makes it easy to move code between machines. The implementation takes care of retreiving and building the packages automatically and as necessary. There is no need for a separate retrieve/build/install cycle. * Eliminates the horrible globality of Haskell's module namespace You can use two modules with the same name and different functionality and you can use two modules that use different versions of the same module. (see below). * Users no longer need to think about package installation scope. Package installation is with respect to the current use. Whether multiple users are able to share the same installation is up to the installation. User's can't infest the machine's local namespace by adding new packages. On Tue, 22 Mar 2005, Lemmih wrote: 1. knowing the source package for each module used in their code even if they didn't insall the packages in the first place i.e. import Foo.Bar just worked on my development machine. I'm not sure I completely understand what you're saying but knowing the exact URL for every single module import seems more of a hassle than installing a few packages. 
You could perhaps even make a shell script containing 'cabal-get install package1 package2 ...'. I am assuming that I may want to move my code to another machine and that therefore I need to keep a record *somewhere* of the source package of every module I actually use. If I don't, then moving will be much more difficult. Yes, keeping track of these packages is a hassle, but I don't see how it can be avoided. Once I am keeping track, the *somewhere* that it makes the most sense to me to do so is the point in the code where I am importing the module. That way the implementation can enforce correspondence and if I stop using the module, the package dependency automatically vanishes. Doing this sort of work in a separate script strikes me as a maintenance headache and means that all modules I use have to coexist in a shared namespace which seems likely to create more headache. 2. knowing the current location of those packages even if they didn't obtain them for installation on the original machine where they used them and aren't on the mailing list for them. I assume you meant something like The developer don't know where to find the packages. The location of the packages is irrelevant to the developer since it's handled by Cabal/Hackage. I don't understand. Are you saying that there will be only one Hackage server ever and it will have all information about all packages everywhere and that the location of this hackage server will be hard coded into every cabal implementation? Hackage and (the soon to come) cabal-get are tools layered on the Cabal but they are not a part of it. Hard coded URLs are evil in almost every context, IMHO. But defaulting to some server when the user hasn't specified otherwise would greatly increase the ease of use. If so, I find that vision incredibly unappealing. I believe there should/will be multiple hackage servers with carrying different hackages under control of different parties (e.g. 
a corporation might have one for its own private code).

The idea with Hackage was to create a central place for people to put software or links to software, so keeping only one server for free (as in beer) packages would be desirable. However, this in no way limits how Hackage can be used for private code repositories.

And, if there are multiple Hackage servers, we are going to need to identify the server from which a particular package originates and the location of that package on that server. This proposal provides an obvious method of doing so.

Specifying sources on the command line or in /etc/cabal/sources.list sounds more maintainable to me.

And a big bonus here is that we get a simple solution to the problem of Haskell's global module namespace.

There was a problem with module namespaces? Wouldn't there only be a problem if two packages used the same module name for different functionality?

Yes, and
RE: [Haskell] URLs in haskell module namespace
On 22 March 2005 13:03, Lemmih wrote:

On Tue, 22 Mar 2005 06:40:02 -0500 (Eastern Standard Time), S. Alexander Jacobson [EMAIL PROTECTED] wrote:

I don't understand. Are you saying that there will be only one Hackage server ever, that it will have all information about all packages everywhere, and that the location of this Hackage server will be hard coded into every Cabal implementation?

Hackage and (the soon to come) cabal-get are tools layered on Cabal, but they are not a part of it. Hard coded URLs are evil in almost every context, IMHO, but defaulting to some server when the user hasn't specified otherwise would greatly increase the ease of use.

If so, I find that vision incredibly unappealing. I believe there should/will be multiple Hackage servers carrying different packages under the control of different parties (e.g. a corporation might have one for its own private code).

The idea with Hackage was to create a central place for people to put software or links to software, so keeping only one server for free (as in beer) packages would be desirable. However, this in no way limits how Hackage can be used for private code repositories.

It might make sense in the future to be able to express package names as URLs, with the default being relative to http://www.haskell.org/packages (or wherever). E.g. in your .cabal file you could say:

  BuildDepends: http://example.org/haskell-packages/foo >= 1.0

Cheers, Simon

___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
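Concretely, Simon's suggestion might look something like the following .cabal fragment. The package name, the version bounds, and above all the URL-valued dependency syntax are hypothetical; nothing like this existed in Cabal at the time, and a bare name is here imagined to resolve relative to the default package root:

```
Name:          my-project
Version:       0.1
-- Hypothetical syntax: one dependency named by an explicit URL,
-- one by a bare name resolved against the default package root.
Build-Depends: http://example.org/haskell-packages/foo >= 1.0,
               base >= 1.0
```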
Re: [Haskell] URLs in haskell module namespace
On Tue, 22 Mar 2005, Lemmih wrote:

The idea with Hackage was to create a central place for people to put software or links to software, so keeping only one server for free (as in beer) packages would be desirable. However, this in no way limits how Hackage can be used for private code repositories.

So I assume that means you also think we need a way to locate packages on various repositories. Would you agree that URLs would be a good way of doing so?

And, if there are multiple Hackage servers, we are going to need to identify the server from which a particular package originates and the location of that package on that server. This proposal provides an obvious method of doing so.

Specifying sources on the command line or in /etc/cabal/sources.list sounds more maintainable to me.

Except you then need to notice when you are no longer using a particular package and do the bookkeeping. You also have no way of saying that one of your modules is using version X of a package while another is using version Y without command line specification. If module names were with respect to packages, that would be entirely fine. But right now module names are global, and that is a serious problem.

But they are! GHC can even handle several versions of the same package. Modules from a package won't be in scope if you hide or ignore it.

But suppose you want to use two different versions of the same package in a single module? Perhaps because you need to read a file saved with a show corresponding to an old version into a data structure defined in the new version...

-Alex-

__ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com
Re: [Haskell] URLs in haskell module namespace
On Tue, 22 Mar 2005, Malcolm Wallace wrote:

Import statements should be allowed to include the URL of a Cabal package. The module namespace in these import statements should be with respect to the package and not the local environment. E.g., these import statements allow us to import two different versions of Network.HTTP:

  import http://domain.org/package-1.0.cabal#Network.HTTP as HTTP
  import http://hage.org/package-2.1.cabal#Network.HTTP as HTTP2
  -- note use of the HTTP fragment identifier for the module name

I cannot see any of the Haskell compilers ever implementing this idea as presented. It would introduce an enormous raft of requirements (networking client, database mapping, caching, etc.) that do not belong in a compiler - they belong in separate (preprocessing/packaging) tools. Furthermore, these tools already exist, albeit they are young and have a long maturation process still ahead of them.

Ok, well let's unpack what is actually required here:

1. Change of syntax in import statements. GHC already has lots of new syntax.

2. Module names with package scope. GHC already has -i. I assume the complexity of generating a -i with respect to a notation provided in the import statement is not that high.

3. Networking client. I think GHC already bundles Cabal, and Cabal already handles being a network client and doing some database mapping. (Lemmih: please correct me if I am mistaken.) Also, it is ridiculous for a modern language implementation NOT to have a network client library.

4. Caching. Caching is new, but it is not that difficult to add to an existing HTTP requester, and the benefits seem well worth this marginal cost.

5. Maturation of packaging tools. I agree that the packaging tools are immature. That is why it makes sense to evaluate this proposal now. No one has a big investment in the current packaging model, and packaging tools optimized for a language that works in the way I propose would look very different from packaging tools organized for the pre-Internet world.
Advantages: * Simplified place for user to track/enforce external dependencies.

No, it spreads the dependency problem over lots of import statements, which will quickly become a maintenance headache when the URLs become invalid. Imagine a GUI project that uses, say, the GTK+ libraries in a hundred different import statements. Then the GTK server moves to a different URL. Disaster!

Disaster? I don't think so. That is why purl.org exists. The HTTP 302 status code is your friend. If you don't want to use purl.org, feel free to set up your own redirect server. I imagine various redirect servers operated by different people with different policies about what counts as a bug fix vs. what counts as a new version, etc.

And by the way, it is a failure of Haskell right now that imports don't create dependencies. Right now, I would like a sane way to import two different versions of the same module so I can do file conversion. It seems like the only way to accomplish this in Haskell as it stands is to rename one version, and then I'm back in the world of global search and replace on import lines again. It would be MUCH nicer to do this via package URLs instead.

It would be much better to group the dependencies into a single file per project - so there is just one place where changes need to be made. This possibility already exists - just create a .cabal file for the project.

How do I depend on multiple versions of the same package in a single module? How do I make sure that my .cabal file is up to date with the actual content of my imports? I am proposing to automate this process. You appear to want to keep it manual.

If you are using non-standard modules, you had better know where they came from and how to get another copy. This proposal provides a sane way to do that (see below).

The Hackage project is exactly a database of package/location mappings, which the /author/ of each package can keep up-to-date, not the user. Much more maintainable.
See my comment to Lemmih about the possibility of multiple Hackage servers and needing to know locations on each of those servers. If Haskell allowed the import of package URLs, then a Hackage server would just be one of many 302 servers (like purl.org) and not require special plumbing.

Note, different people might have different judgements about what constitutes a bugfix vs. a new version. You can't rely on the package author to agree with you!

-Alex-

__ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com
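To make the proposed syntax concrete: resolving one of these imports would start by splitting the target at the '#' fragment into a package location and a module name. A minimal sketch of that one step, in ordinary Haskell; the URL is the illustrative one from the thread, and splitImport is a hypothetical helper, not part of any compiler:

```haskell
-- Split a proposed import target "pkgUrl#Module.Name" at the fragment
-- marker; targets without a '#' are treated as having no module part.
splitImport :: String -> (String, String)
splitImport s = case break (== '#') s of
  (pkg, '#' : m) -> (pkg, m)
  (pkg, _)       -> (pkg, "")

main :: IO ()
main = print (splitImport "http://domain.org/package-1.0.cabal#Network.HTTP")
```

From the pair, an implementation could fetch the package at the first component and bring only the named module into scope, which is all the "fragment identifier for the module name" idea requires.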
[Haskell] URLs in haskell module namespace
As I move from machine to machine, it would be nice not to have to install all the libraries I use over and over again. I'd like to be able to do something like this:

  import http://module.org/someLib as someLib

If the requested module itself does local imports, the implementation would first try to resolve the names on the client machine and otherwise make requests along remote relative paths. It would be nice if implementations cached these HTTP requests and did If-Modified-Since requests on each compile. If the document at the URL has been modified, it might show the diff and ask the user if it is appropriate to upgrade to the new version.

Does this make sense?

-Alex-

__ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com
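The cache-and-revalidate behaviour described above might work roughly like this. The cache layout and URL are illustrative, and a real client would send an actual If-Modified-Since header on the revalidation path (e.g. curl's --time-cond against the cached file's mtime) rather than the purely local check shown here:

```shell
# One cached file per fetched URL; all names are illustrative.
cache_dir=$(mktemp -d)
url="http://module.org/someLib"
cache="$cache_dir/someLib.hs"

check() {
  if [ ! -f "$cache" ]; then
    echo "fetch $url"        # first compile: full download
  else
    echo "revalidate $url"   # later compiles: conditional GET only
  fi
}

check            # no cached copy yet
touch "$cache"   # pretend the download succeeded
check            # now only a conditional request is needed
```

On the revalidate branch, a 304 Not Modified would leave the cached copy in place, and a 200 would be the point at which the implementation could show the diff and ask about upgrading.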
Re: [Haskell] URLs in haskell module namespace
Greetings Alexander, I have been thinking about something very much similar for some time. But:

On 21 March 2005 at 21:47, S. Alexander Jacobson wrote:

As I move from machine to machine, it would be nice not to have to install all the libraries I use over and over again. I'd like to be able to do something like this:

  import http://module.org/someLib as someLib

I'm not sure a URL is the right thing to use. For instance, what about the http part? In the end, the URL gives a certain location for the module, which might change. Programs using the module should not become invalid just by movement of the dependency.

If the requested module itself does local imports, the implementation would first try to resolve the names on the client machine and otherwise make requests along remote relative paths. It would be nice if implementations cached these HTTP requests and did If-Modified-Since requests on each compile. If the document at the URL has been modified, it might show the diff and ask the user if it is appropriate to upgrade to the new version.

Exactly. I think, even, that this kind of handling is what we _need_. I routinely feel, in writing my own modules, the hassle of questions like "how do I package this?". It would be much easier and more accessible to just put my modules up one by one on the Web, advertise them (by posting the documentation, preferably ;)) and know that people's GHC or whatnot will just auto-fetch them.

The next thought, of course, is versioning. To make sure my Haskell system gets the version I meant when I wrote my program, modules need version numbers. I'd propose the following:

  module A [1,5.2] (...) where ...

The bracketed expression after the module name is an interval of interface numbers: this version of the module exports interface 5.2, the decimal indicating the second revision since no. 5. The module further declares itself to be backwards-compatible with all interfaces down to version 1, inclusive (i.e. they form a sequence of subsets).
Nota bene, this scheme is the same as that used by GNU libtool (although libtool explains it in a much more complicated way). A module author would start with interface 1 (i.e. write [1,1]) and, upon changing the module:

- If the change was only a code revision with no interface or semantic changes at all, raise the fractional part, e.g. [1,1.1].
- If there was any change in the module exports, or the semantics of existing exports, raise the interface number (upper bound) to the next integer, e.g. [1,2].
- If the change broke compatibility with the last version (i.e. removed or changed any of the existing exports), snap the lower bound up to reduce the interval to a single element again, e.g. [3,3].

  import A 2 (...)

The import statement includes a single integer interface number, which is the number of the interface this module was written against. It indicates that any version of module A whose interface interval contains 2 is compatible.

Obviously, the Haskell system should be able to provide some convenience for managing the interface numbers. It should also be possible to devise a smart way of handling omitted interface info (both on the ex- and import side). Finally, one will wish for a system of providing adaptor modules to interface old importers to new versions of their importees. That way, interfaces can be evolved rapidly because backwards-compatibility need not be retained, as long as one provides a suitable adaptor (to be auto-installed by an importing system). In such a setting, the simple "latest compatible interval" approach also becomes sufficient to handle even strong interface fluctuation, because gaps can always be bridged with adaptors.

Does this make sense?

Cheers, Sven Moritz
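The compatibility rule itself fits in a few lines. A sketch, assuming interface numbers are represented as Doubles so the fractional revision numbers fit the same type; Interval and compatible are hypothetical names, not part of any proposal text:

```haskell
-- An interface interval [lo, hi]: hi is the current interface (possibly
-- carrying a revision fraction), lo the oldest interface still supported.
type Interval = (Double, Double)

-- "import A n" is satisfied by any version of A whose interval contains n.
compatible :: Interval -> Int -> Bool
compatible (lo, hi) n = lo <= v && v <= hi
  where v = fromIntegral n

main :: IO ()
main = mapM_ print
  [ compatible (1, 5.2) 2  -- module A [1,5.2] satisfies "import A 2"
  , compatible (3, 3)   2  -- after the break to [3,3], it no longer does
  ]
```

A resolver would then pick, among the available versions of A, the latest one for which compatible holds, which is exactly the "latest compatible interval" approach mentioned above.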
Re: [Haskell] URLs in haskell module namespace
A few quick thoughts:

1. Although technically HTTP URLs are locations rather than identifiers, that is the behavior we want in this context. If you want to trust someone else to serve you the correct module, you should specify it. A formal spec should define exactly which URI schemes are supported. I would like support for HTTP and HTTPS.

2. Versioning is an issue independent of whether Haskell allows HTTP URLs as module locators. However, if Haskell does end up with versioning AND HTTP support, it might make sense for it to use WebDAV versioning to access remote modules.

3. I love the concept of adaptors. In particular, I'd really like a way to make sure that Prelude.read does not produce an error when the saved representation of a datatype differs from the current one. Manual management is a big PITA. (And yes, this too is orthogonal to the question of URLs in the Haskell module namespace.)

-Alex-

__ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com

On Mon, 21 Mar 2005, Sven Moritz Hallberg wrote: [...]
Re: [Haskell] URLs in haskell module namespace
On Mon, 21 Mar 2005 23:06:25 +0100, Sven Moritz Hallberg [EMAIL PROTECTED] wrote:

Greetings Alexander, I have been thinking about something very much similar for some time. But:

On 21 March 2005 at 21:47, S. Alexander Jacobson wrote:

As I move from machine to machine, it would be nice not to have to install all the libraries I use over and over again. I'd like to be able to do something like this:

  import http://module.org/someLib as someLib

The extra complexity outstrips the gain since installing a package will soon be as easy as this: 'cabal-get install myPackage'. Check out the Cabal/Hackage project.

I'm not sure a URL is the right thing to use. For instance, what about the http part? In the end, the URL gives a certain location for the module, which might change. Programs using the module should not become invalid just by movement of the dependency.

If the requested module itself does local imports, the implementation would first try to resolve the names on the client machine and otherwise make requests along remote relative paths. It would be nice if implementations cached these HTTP requests and did If-Modified-Since requests on each compile. If the document at the URL has been modified, it might show the diff and ask the user if it is appropriate to upgrade to the new version.

Exactly. I think, even, that this kind of handling is what we _need_. I routinely feel, in writing my own modules, the hassle of questions like "how do I package this?". It would be much easier and more accessible to just put my modules up one by one on the Web, advertise them (by posting the documentation, preferably ;)) and know that people's GHC or whatnot will just auto-fetch them.

This is exactly what Cabal and Hackage are solving.

The next thought, of course, is versioning. To make sure my Haskell system gets the version I meant when I wrote my program, modules need version numbers. I'd propose the following:

  module A [1,5.2] (...) where ...
The bracketed expression after the module name is an interval of interface numbers: this version of the module exports interface 5.2, the decimal indicating the second revision since no. 5. The module further declares itself to be backwards-compatible with all interfaces down to version 1, inclusive (i.e. they form a sequence of subsets). Nota bene, this scheme is the same as that used by GNU libtool (although libtool explains it in a much more complicated way). A module author would start with interface 1 (i.e. write [1,1]) and, upon changing the module:

- If the change was only a code revision with no interface or semantic changes at all, raise the fractional part, e.g. [1,1.1].
- If there was any change in the module exports, or the semantics of existing exports, raise the interface number (upper bound) to the next integer, e.g. [1,2].
- If the change broke compatibility with the last version (i.e. removed or changed any of the existing exports), snap the lower bound up to reduce the interval to a single element again, e.g. [3,3].

  import A 2 (...)

The import statement includes a single integer interface number, which is the number of the interface this module was written against. It indicates that any version of module A whose interface interval contains 2 is compatible.

Cabal is using package based versioning, and you can show/hide packages.

Obviously, the Haskell system should be able to provide some convenience for managing the interface numbers. It should also be possible to devise a smart way of handling omitted interface info (both on the ex- and import side). Finally, one will wish for a system of providing adaptor modules to interface old importers to new versions of their importees. That way, interfaces can be evolved rapidly because backwards-compatibility need not be retained, as long as one provides a suitable adaptor (to be auto-installed by an importing system).
In such a setting, the simple "latest compatible interval" approach also becomes sufficient to handle even strong interface fluctuation, because gaps can always be bridged with adaptors.

You don't need a new package system for this.

Does this make sense?

I understand your desire, but all of this can/will be handled by Cabal and Hackage.

--
Friendly, Lemmih
Re: [Haskell] URLs in haskell module namespace
Lemmih,

The current Haskell/Cabal module and packaging system is substantially annoying for the typical non-sysadmin end-user. In particular, if they move their code to another machine they have to do a bunch of different administrivia, including:

1. knowing the source package for each module used in their code even if they didn't install the packages in the first place, i.e. import Foo.Bar just worked on my development machine.

2. knowing the current location of those packages even if they didn't obtain them for installation on the original machine where they used them and aren't on the mailing list for them.

3. going through the hassle of doing a cabal-get install for each of them once they have figured it all out.

I'd rather have a system that takes care of 1-3 for me and just reports errors if particular modules are irretrievable. That being said, Cabal definitely solves a lot of problems that my original proposal left unaddressed (e.g. producing executables needed to build modules, handling C code, versioning?). Perhaps the correct answer is to import Cabal packages rather than Haskell source, e.g.

  import http://package.org/package-1.0.cabal#Foo.Bar as Baz
  import http://package.org/package-2.0.cabal#Foo.Bar as Baz2
  -- note use of the HTTP fragment identifier for the module name

And a big bonus here is that we get a simple solution to the problem of Haskell's global module namespace. Now the module namespace is local to individual packages. If cabal also has a "cabal-put package MyPackage http://myhost.com/dir", then we have a really simple and beautiful system for sharing libraries over the Internet as well!

If the change of import syntax is blessed by the powers that be, would it be hard to adapt Cabal to work like this?

-Alex-

__ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com

On Tue, 22 Mar 2005, Lemmih wrote: [...]
Re: [Haskell] URLs in haskell module namespace
On Mon, 21 Mar 2005 22:06:25 -0500 (Eastern Standard Time), S. Alexander Jacobson [EMAIL PROTECTED] wrote:

Lemmih, The current Haskell/Cabal module and packaging system is substantially annoying for the typical non-sysadmin end-user. In particular, if they move their code to another machine they have to do a bunch of different administrivia, including:

1. knowing the source package for each module used in their code even if they didn't install the packages in the first place, i.e. import Foo.Bar just worked on my development machine.

I'm not sure I completely understand what you're saying, but knowing the exact URL for every single module import seems more of a hassle than installing a few packages. You could perhaps even make a shell script containing 'cabal-get install package1 package2 ...'.

2. knowing the current location of those packages even if they didn't obtain them for installation on the original machine where they used them and aren't on the mailing list for them.

I assume you meant something like "The developer doesn't know where to find the packages." The location of the packages is irrelevant to the developer since it's handled by Cabal/Hackage.

3. going through the hassle of doing a cabal-get install for each of them once they have figured it all out.

See the above mentioned shell script, or cabalize your software.

I'd rather have a system that takes care of 1-3 for me and just reports errors if particular modules are irretrievable. That being said, Cabal definitely solves a lot of problems that my original proposal left unaddressed (e.g. producing executables needed to build modules, handling C code, versioning?). Perhaps the correct answer is to import Cabal packages rather than Haskell source, e.g.

  import http://package.org/package-1.0.cabal#Foo.Bar as Baz
  import http://package.org/package-2.0.cabal#Foo.Bar as Baz2
  -- note use of the HTTP fragment identifier for the module name

Requiring Haskell implementations to honor this would be very bad, IMHO.
A preprocessor which looks for package pragmas ({- USE somepackage -} perhaps) and installs those packages via Cabal would be way easier to hack. But then again, a small script to fetch the packages or a .cabal file would be even simpler.

And a big bonus here is that we get a simple solution to the problem of Haskell's global module namespace.

There was a problem with module namespaces? Wouldn't there only be a problem if two packages used the same module name for different functionality?

Now the module namespace is local to individual packages. If cabal also has a "cabal-put package MyPackage http://myhost.com/dir", then we have a really simple and beautiful system for sharing libraries over the Internet as well!

This is essentially what Hackage does.

If the change of import syntax is blessed by the powers that be, would it be hard to adapt Cabal to work like this?

Extending the Haskell syntax to include something which could easily be handled by a preprocessor would be inadvisable.

-Alex-

__ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com

--
Friendly, Lemmih
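Lemmih's preprocessor idea is easy to prototype: scan the source for {- USE package -} pragmas and collect the package names, which a wrapper script could then hand to cabal-get. The pragma spelling is his off-hand suggestion, not an implemented feature, and usePragmas is a hypothetical helper:

```haskell
import Data.List (isPrefixOf)

-- Collect the package names mentioned in {- USE pkg -} pragmas,
-- one pragma per line, whitespace-separated as in the suggestion.
usePragmas :: String -> [String]
usePragmas src =
  [ pkg
  | l <- lines src
  , let ws = words l
  , ["{-", "USE"] `isPrefixOf` ws
  , pkg <- take 1 (drop 2 ws)
  ]

main :: IO ()
main = print (usePragmas "{- USE somepackage -}\n{- USE otherpkg -}\nmodule Main where")
```

Since the pragmas are ordinary block comments, the file stays valid Haskell for any compiler, which is exactly why this needs no language extension.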