Given that Gaelan exercised much control over Greg by registering him and by curating his training data, I think Greg's actions should be considered just an extension of Gaelan.
On Wed., 22 Jul. 2020, 10:15 am Gaelan Steele via agora-discussion, <agora-discussion@agoranomic.org> wrote:

> Alright, people seem to have started to get annoyed, so I believe continuing this experiment would violate my sacred and eternal duty to Treat Agora Right Good.
>
> So, yeah. I'm Greg. Or maybe I'm sending messages on Greg's behalf. See recent CFJs.
>
> As most of you accurately surmised, Greg's messages were generated by GPT-2, specifically a version of GPT-2 fine-tuned with Agoran mailing list logs since 2014. (The 2014 date is largely arbitrary—I might have been able to go a bit further back, but using much more data ran into resource limitations.)
>
> Greg was implemented using a combination of shell scripts, commands in my shell history, Python scripts, a Google Colaboratory notebook, and me manually copy/pasting messages around. Notably, I manually pasted in and hit send on each message. I did this primarily because figuring out email APIs sounded like a PITA, but also because I wanted to be able to pull the plug in case it said anything horrific. I did, however, do as much as I could to avoid injecting my free will into the process. I operated off of two rules: for each message to the public forum, I would run a Python script which had a 10% chance of invoking GPT-2 to generate a reply, which I would send verbatim. (GPT-2 barfs on overly large input data, so I included a failsafe that automatically removed old messages in the thread until the input was small enough to work. Some messages (like the rulesets) were far too big on their own, resulting in the code generating "replies" without any context.) Additionally, each day after the first (which looks like it is just going to mean "today"), I ran a script which had a 50% chance of generating a brand-new proposal. Had I been aware of CFJ 3790, I might actually have gone to the trouble of having it send the messages automatically after generating them.
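[Editor's note: a minimal Python sketch of the reply mechanism Gaelan describes above, i.e. the 10% reply chance and the oldest-message-dropping failsafe. All names, the character limit, and the `generate` hook are hypothetical; the actual scripts were not published.]

```python
import random

REPLY_PROBABILITY = 0.10   # each public-forum message has a 10% chance of a reply
MAX_INPUT_CHARS = 5000     # hypothetical cap; the real limit is GPT-2's context window

def truncate_thread(messages, limit=MAX_INPUT_CHARS):
    """Failsafe: drop the oldest messages until the joined prompt fits the limit.

    A single oversized message (e.g. a ruleset) gets dropped too, leaving an
    empty prompt, which is how the context-free "replies" arose.
    """
    msgs = list(messages)
    while msgs and len("\n".join(msgs)) > limit:
        msgs.pop(0)
    return "\n".join(msgs)

def maybe_reply(messages, generate, rng=random):
    """With 10% probability, build a prompt and hand it to GPT-2 (`generate`)."""
    if rng.random() >= REPLY_PROBABILITY:
        return None  # the other 90% of the time, stay silent
    return generate(truncate_thread(messages))
```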
> I did "intervene" twice: for the registration message, I specifically asked GPT-2 to generate a message to BUS with a subject of "BUS: Registration". In my testing, this had about a 75% chance of generating a message that was a somewhat plausible attempt to register. Unfortunately, I got unlucky and my first "real" attempt to generate a registration message resulted in something completely random (a proposal, I think), so I generated a second one and sent that one. In another case, I discovered a bug with the large-input failsafe (turns out, GPT-2 can barf by silently returning the input with no additional output, or by throwing an exception; I was only handling the first case), so I fixed the bug and re-ran the generation. In every other case, I mechanically copied messages back and forth, following the plans I had made before sending the first message, without attempting to impose any editorial control.
>
> In my testing, Greg did occasionally borrow other people's signatures, but I didn't expect it to be this common. I considered preventing it from doing this by removing signatures from the training data (so it would never learn to include them), but I thought it was rare enough and amusing enough that it wasn't worth removing. In retrospect, I probably should have removed them.
>
> In my testing, I ran into several outputs that were interesting enough to save so I could show them later.
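[Editor's note: the failsafe bug Gaelan mentions, that GPT-2 "barfs" either by raising an exception or by silently echoing its input, could be guarded against along these lines. This is a sketch with hypothetical names, not Gaelan's actual fix.]

```python
def call_model(prompt, generate):
    """Return (output, failed), treating both "barf" modes as failure:
    raising an exception (the case the original failsafe handled) and
    silently returning the input with nothing added (the case it missed)."""
    try:
        output = generate(prompt)
    except Exception:
        return None, True
    if not output or output.strip() == prompt.strip():
        return None, True
    return output, False

def generate_reply(messages, generate):
    """Shrink the thread oldest-first until GPT-2 produces genuinely new text."""
    msgs = list(messages)
    while msgs:
        output, failed = call_model("\n".join(msgs), generate)
        if not failed:
            return output
        msgs.pop(0)  # drop the oldest message and retry
    return None
```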
> Here are links to them:
>
> A proposal for something vaguely resembling a functional auction mechanic:
> https://gist.github.com/Gaelan/e7f7d3fc48c1abd08f0afb8049077acb
>
> A made-up FLR excerpt containing an interesting-sounding royalty mechanic:
> https://gist.github.com/Gaelan/8d092a17ed9c210685a4f4dd1e622ae2
>
> Another ruleset excerpt, containing the core rules of an alternate-universe Agora:
> https://gist.github.com/Gaelan/ee631f9f97b53df8483e342ef36b6618
>
> A batch of attempts at starting new threads, with varying quality:
> https://gist.github.com/Gaelan/0c027e3f5b97dab700182aa663401f47
>
> A fake rule called the "Register of Proposals", which looks like a semi-plausible implementation of proposals in an alternate-universe Agora:
> https://gist.github.com/Gaelan/0c6853f500799c5190a0a1ef474b098b
>
> I'd be happy to share the model at some point, but it's a bit of a pain—I think it's 1.5 GB—so I'm not sure how best to do that. In the meantime, I'm happy to try out any inputs y'all are curious about. Also, From headers were included in the training data, so I should be able to ask it to generate messages from specific Agorans. That might be fun.
>
> Happy to answer any questions, of course.
>
> Gaelan
>
> "Nothing in a democracy is sacred, and nothing in a democracy is sacrosanct. That's not to say it's never been broken, but it's been tracked for quite some time."
>
> [GPT-2 put that quote in someone's signature during one of my tests. As far as I can tell, it's not a real quote, but it sounds like a fairly interesting reflection on nomic.]