Ben Finney writes ("Re: Removing duplication: Word lists of common words in languages"):
> Ian Jackson <ijack...@chiark.greenend.org.uk> writes:
> > I had roughly this question in 2013, and found the answer. Here is
> > probably the best starting point:
> >
> > http://www.chiark.greenend.org.uk/ucgi/~ijackson/git?p=evade-mail-usrlocal.git;a=blob;f=lemma.al-permission.mbox
>
> Great! That asks for permission to redistribute the corpus under
> free-software terms, and documents the response in the affirmative.
> Vital for an eventual ‘debian/copyright’. Thank you.
>
> In that exchange, you also mention you're planning to distribute the
> data in a program. Is that online somewhere, and what's the URL?

Yes. (Depending on your definition of `distribute'.)

  http://www.chiark.greenend.org.uk/ucgi/~ijackson/git?p=evade-mail-usrlocal.git

It's a userv-based tool for managing a domain containing
randomly-generated email aliases, on a shared shell account system.
I run it on chiark.

I have copied and pasted the relevant section of chiark's
/info/mail.text below, along with the relevant bit of chiark's
/etc/exim4/exim4.conf.pl.

On chiark I run this directly out of a git working tree in /usr/local,
with symlinks from the relevant bits of /etc, /usr/local/bin, etc.

If anyone else thinks they might actually want this, I might consider
productising it a bit more. Of course anyone else is welcome to do
so, starting with the git tree there.

I see that I have forgotten to give it a copyright licence or indeed
any copyright notices. Please treat it as AGPLv3+. (This is
compatible with the GPLv2+ permission that I requested from Adam
Kilgarriff. CCing Matthew Vernon as the other copyright holder.)

Ian.


3. Randomly-generated (weakly-pseudonymous) addresses
-----------------------------------------------------

chiark users can have randomly-generated short email addresses
<short-random-string>@fyvzl.net, and randomly-generated readable email
addresses <word>.<word>.<word>@evade.org.uk.

This is managed using the "slimy-rot13-mail" and "evade-mail"
utilities. Run them without arguments for their usage messages.

The "choose" option generates ten random addresses and lets you say
which ones you would like to keep. Paste the ones you like back in,
to have them allocated to you. (Of course do NOT publish addresses
you have failed, or forgotten, to allocate!)

If you redirect an alias to yourself@chiark then your .forward file
will apply; your .forward file will see the address you redirect to,
not the @fyvzl or @evade address.

chiark's spamfilters treat fyvzl.net and evade.org.uk the same as
slimy.greenend.org.uk (see /info/spam.text). These addresses do not
go through SAUCE.

On privacy: these addresses are not trivial to map to a particular
user from outside chiark, although bounces (and any replies you send!)
are likely to reveal the linkage. It's not easy for another user to
get a complete list of your aliases, but chiark's mail logs are
available to everyone. And any chiark user can use exim -bt to
discover where a particular alias redirects.

These aliases are recorded in a database. It is not possible to ever
delete aliases because that would run the risk that another user would
subsequently be allocated an alias previously used by someone else.

The tools `evade-mail-pregen', `slimy-rot13-mail-pregen' and
`numbered-alias-sheet' can arrange to conveniently format pregenerated
aliases on sheets of paper for you to carry about and give to people
when offline. Run them without arguments to see the usage messages.
The usage message for numbered-alias-sheet has some examples.



evade_hard_dir:
".aliasdir("/etc/aliases-evade")."
  domains = @evade_domains
  user = mail

evade_db_dir:
  domains = @evade_domains
  driver = redirect
  allow_defer = true
  data = ".'${lookup sqlite {/var/lib/evade-mail/$domain.sqlite3 \
                select redirect from addrs where \
                        localpart=\'${quote_sqlite:$local_part}\' \
                        and not redirect = \'\' \
                        and user not in disabled_users;}}'."


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/21603.25976.700221.615...@chiark.greenend.org.uk
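
For anyone who wants to poke at the evade_db_dir lookup outside exim, here is a rough Python sketch of what that query appears to do. Only the table and column names (addrs, localpart, redirect, user, disabled_users) come from the router above. The column types, the sample rows, the example.org addresses, the helper names lookup() and demo(), and the reading that an empty redirect marks an alias which is kept but no longer delivered are all assumptions made for illustration. Parameter binding stands in for exim's ${quote_sqlite:...}.

```python
import sqlite3

# Hypothetical schema: the names are taken from the router's query,
# the types and constraints are guesses.
SCHEMA = """
create table addrs (
    localpart text primary key,
    redirect  text not null,
    user      text not null
);
create table disabled_users (user text primary key);
"""

# Same conditions as the query in the evade_db_dir router.
QUERY = """
select redirect from addrs where
    localpart = ?
    and not redirect = ''
    and user not in disabled_users
"""


def lookup(conn, local_part):
    """Return the redirect target for local_part, or None if there is none."""
    row = conn.execute(QUERY, (local_part,)).fetchone()
    return row[0] if row else None


def demo():
    """Build an in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "insert into addrs values (?, ?, ?)",
        [
            ("apple.river.stone", "alice@example.org", "alice"),
            # Assumed convention: the row stays, the redirect is emptied.
            ("green.table.cloud", "", "alice"),
            ("quiet.paper.lamp", "bob@example.org", "bob"),
        ],
    )
    conn.execute("insert into disabled_users values ('bob')")
    return conn


if __name__ == "__main__":
    conn = demo()
    for alias in ("apple.river.stone", "green.table.cloud", "quiet.paper.lamp"):
        print(alias, "->", lookup(conn, alias))
```

Under those assumptions, keeping the row and blanking the redirect would fit with the rule that aliases are never deleted: the localpart stays in the table, so it cannot be handed out to someone else, while the lookup returns nothing for it.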