Sorry, I don't think I'm quite following yet. If a user downloads the
source of haskell-publicsuffixlist, that package includes a blob which they
can modify (albeit not in the same form as the original list).

The haskell-publicsuffixlist library also lets the user generate the
opaque Haskell data structure from any given list (so they can download it
themselves). The library includes a cache (read: "stale version") of this
list so that users don't have to download the list or read a file just to
query a particular suffix, which I think is a good thing. It also means
that the user can parse their own list at runtime instead of at compile
time.

I also have a requirement that the publicsuffixlist library can't parse the
list itself, because parsing the list requires ICU, and the maintainer of
http-conduit objects to requiring all of his users to install it at
compile time. This is why I split the package into the publicsuffixlist
and publicsuffixlistcreate packages.

I think that it should be noted that the Chrome browser takes this same
approach of providing a local copy of the public suffix list inside its own
code [1]. That's where I got the idea for this particular solution.

I see a couple of options:
- Make the Debian package for publicsuffixlist include *both* the
hs-publicsuffixlist and hs-publicsuffixlistcreate Hackage packages, and
modify the build step to generate Network.PublicSuffixList.DataStructure
from a file on disk instead of from Mozilla's website. The Debian package
can also depend on whichever package provides the file on disk, as well as
on ICU.
- Modify the source of publicsuffixlist to run an unsafePerformIO whenever
anyone asks for the included data structure. The unsafePerformIO would read
the suffix list in from a file. This would be hidden behind a #define
switch (which Debian can turn on in its build scripts), so the current
behavior would still be the default. The haskelly part of my brain frowns
on unsafePerformIO, but I understand the reasoning for this approach.
- Any other ideas?
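
To make the second option concrete, here is a rough sketch of what the
switch might look like. Everything in it is a stand-in I made up for
illustration: the SuffixSet type, bundledSuffixes, parseSuffixes, and the
RUNTIME_LIST macro are hypothetical and are not the package's real names or
its real data structure. The parser is deliberately naive (no wildcard or
exception rules, no IDNA handling, and it assumes the locale encoding can
read the file); the only point is the shape of the #ifdef.

```haskell
{-# LANGUAGE CPP #-}
-- Hypothetical sketch only: none of these names are taken from the real
-- publicsuffixlist package.

import qualified Data.Set as Set
import System.IO.Unsafe (unsafePerformIO)

-- Stand-in for the opaque data structure.
type SuffixSet = Set.Set String

-- Stand-in for the copy of the list that is baked in at build time.
bundledSuffixes :: SuffixSet
bundledSuffixes = Set.fromList ["com", "org", "co.uk"]

-- Naive parser: reads each line up to the first space and drops blank
-- lines and comment lines (those starting with two slashes).
parseSuffixes :: String -> SuffixSet
parseSuffixes = Set.fromList . filter isRule . map (takeWhile (/= ' ')) . lines
  where
    isRule l = not (null l) && take 2 l /= replicate 2 '/'

#ifdef RUNTIME_LIST
-- Distribution build: read the system copy of the list on first use.
systemListPath :: FilePath
systemListPath = "/usr/share/publicsuffix/effective_tld_names.dat"

dataStructure :: SuffixSet
dataStructure = unsafePerformIO (parseSuffixes <$> readFile systemListPath)
{-# NOINLINE dataStructure #-}
#else
-- Default build: keep using the bundled copy.
dataStructure :: SuffixSet
dataStructure = bundledSuffixes
#endif

-- Exact-match lookup against whichever structure was selected above.
isPublicSuffix :: String -> Bool
isPublicSuffix s = Set.member s dataStructure
```

Under these assumptions, Debian would build with -DRUNTIME_LIST and
everyone else would get the current behavior. The NOINLINE pragma is there
so the file is read at most once per process rather than at every use site.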

Thanks,
Myles

[1]
http://src.chromium.org/viewvc/chrome/trunk/src/net/base/registry_controlled_domains/effective_tld_names.cc?view=markup

On Wed, Feb 27, 2013 at 11:14 AM, Daniel Kahn Gillmor <[email protected]> wrote:

> On 02/27/2013 07:33 AM, Clint Adams wrote:
> > * We package a patched version of haskell-publicsuffixlistcreate
> >   that can generate a haskell-publicsuffixlist package from
> >   /usr/share/publicsuffix/effective_tld_names.dat .  The publicsuffix
> >   source package is modified to build-depend on
> >   libghc-publicsuffixlistcreate-dev and will produce a
> >   libghc-publicsuffixlist-dev package itself.  The stable update
> >   process would be tied completely to the publicsuffix source package
> >   followed by binNMUs of all the reverse dependencies.
> >
> > * Same as the previous except done in a shim package outside the
> >   publicsuffix source package.
>
> Using the publicsuffix package seems like the right way to go here.  The
> fewer copies of this list that we have floating around, the more the
> canonical list will be well-maintained and useful for everyone.
>
> i'm also happy to have co-maintainers on publicsuffix.  I've just moved
> publicsuffix into collab-maint to try to make it easier for anyone to
> jump in and help out.
>
>   git://git.debian.org/git/collab-maint/publicsuffix.git
>
> I've also just uploaded a new version of publicsuffix to unstable.
>
>         --dkg
>
>
