The regex form is (and has always been) [A-Za-z0-9-_.]+ (completely opaque), and Cabal has never given any other guarantee about the structure of this identifier. Case in point: GHC 7.10 took advantage of this flexibility to try to enforce a maximum length on IPIs (although we backed out of this change). If the opaqueness of these identifiers is taken seriously, the only supported way of getting your hands on the name of the package and its version... is to have kept track of it from the beginning (cf. ConfiguredId in cabal-install's source code).
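
A minimal sketch, in Python, of what treating the identifier as opaque could look like for a consumer: validate only the character set, and carry the package name and version alongside the ID rather than parsing them back out of it. This is an illustration only, not what Cabal, cabal-install or Dh_Haskell.sh actually do; TrackedPackage, is_valid_ipi and track are hypothetical names.

```python
import re
from dataclasses import dataclass

# Opaque form of an installed package ID (IPI): one or more characters
# from the allowed set, with no further structure assumed.
OPAQUE_IPI = re.compile(r"[A-Za-z0-9\-_.]+")


@dataclass(frozen=True)
class TrackedPackage:
    """Hypothetical record pairing an IPI with the name and version
    that were known when the package was configured."""
    name: str
    version: str
    ipi: str


def is_valid_ipi(ipi: str) -> bool:
    """Check only the character set; do not look inside the ID."""
    return OPAQUE_IPI.fullmatch(ipi) is not None


def track(name: str, version: str, ipi: str) -> TrackedPackage:
    """Record name and version next to the IPI instead of parsing them out."""
    if not is_valid_ipi(ipi):
        raise ValueError(f"not a well-formed installed package ID: {ipi!r}")
    return TrackedPackage(name, version, ipi)
```

With this shape, a caller reads pkg.name and pkg.version from the record it built at configure time, and never depends on the length or layout of pkg.ipi.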

I looked at Dh_Haskell.sh and it seems like it would be inconvenient if this truly were the case. So it sounds like what you would like instead is a guarantee that an installed package identifier embeds the package name and version. I suppose that we could give this guarantee (although it would not hold for GHC 7.10!)

Supposing we gave that guarantee, currently, an installed package identifier can be matched with the following productions:

    $alphanum_minus_num   ::= [A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*
    $package_name         ::= $alphanum_minus_num(-$alphanum_minus_num)*
    $package_ver          ::= [0-9]+(\.[0-9]+)*
    $package_id           ::= $package_name-$package_ver
    $installed_package_id ::= $package_id(-$hash)?
    $hash                 ::= [A-Za-z0-9-_.]+

To actually give this guarantee, we would need to write this restriction into Cabal, and it won't be enforced by GHC until the next major release. Furthermore, some other changes to package identifier parsing would need to be made (specifically, we currently accept any number of trailing tags after version numbers, e.g., 0.1-alpha; additionally, we permit a version number to be dropped).

But I would quite like it if these identifiers could be kept opaque; with things like internal libraries and Backpack they definitely may be in flux. Perhaps it would be possible for Dh_Haskell.sh to pass around package identifiers (no hashes) rather than installed package IDs? I don't know enough about the script to say one way or another.

Edward

Excerpts from Joachim Breitner's message of 2016-07-02 03:55:04 -0400:
> Hi Edward,
>
> we treat it as opaque, but we currently try to match on a precise,
> predictable length – this seems to be more reliable than just matching
> on any sequence of characters. It helps, like, you know, types :-)
>
> But we can easily adjust. You can help us by giving a definite
> description of how package IDs can look like nowadays, e.g. as a regex.
>
> Greetings,
> Joachim
>
> On Friday, 01.07.2016, at 19:46 -0400, Edward Z. Yang wrote:
> > Yeah, we started compressing the IDs so that they take less
> > length. Is there something we can do to make things easier
> > for packagers? In general, these identifiers are supposed
> > to be treated as opaque.
> >
> > Edward
> >
> > Excerpts from Joachim Breitner's message of 2016-07-01 06:16:24 -0400:
> > > Hi Edward,
> > >
> > > On Friday, 01.07.2016, at 09:54 +0000, Clint Adams wrote:
> > > > When building tf-random with ghc 8, an id of
> > > >
> > > > tf-random-0.5-4z8OJUaXC1FRNfrLPFWAD
> > > >
> > > > is produced. Since this is the wrong length, this breaks
> > > > Dh_Haskell.sh .
> > > >
> > > > Can someone explain what's happening and what should be done
> > > > instead?
> > >
> > > previously, we (Debian Haskell packagers) could rely on package
> > > hashes to be 32 characters. Has this changed with GHC-8 somehow?
> > >
> > > Greetings,
> > > Joachim
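
For reference, the productions proposed in the reply at the top of this message can be sketched as a single Python regular expression. This is only an illustration of the hypothetical guarantee, not something Cabal or GHC enforce today. It assumes a package name may consist of a single component (so that a name like base matches), and parse_ipi is a made-up helper name.

```python
import re
from typing import Optional, Tuple

# One name component: alphanumeric, containing at least one letter.
ALNUM = r"[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*"
# Package name: one or more components joined by hyphens (assumption:
# a single component is allowed).
NAME = rf"{ALNUM}(?:-{ALNUM})*"
# Version: dot-separated runs of digits.
VERSION = r"[0-9]+(?:\.[0-9]+)*"
# Hash: opaque token from the allowed character set.
HASH = r"[A-Za-z0-9\-_.]+"

INSTALLED_PACKAGE_ID = re.compile(
    rf"(?P<name>{NAME})-(?P<version>{VERSION})(?:-(?P<hash>{HASH}))?"
)


def parse_ipi(ipi: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Split an ID into (name, version, hash), or return None if it
    does not fit the productions. The hash is None when absent."""
    m = INSTALLED_PACKAGE_ID.fullmatch(ipi)
    if m is None:
        return None
    return m.group("name"), m.group("version"), m.group("hash")
```

For example, parse_ipi("tf-random-0.5-4z8OJUaXC1FRNfrLPFWAD") splits into the name tf-random, the version 0.5 and the hash 4z8OJUaXC1FRNfrLPFWAD, without relying on the hash having any particular length. An ID with the version dropped, which is currently permitted, does not match and yields None.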