TL;DR: I started something I can't finish, want to help me?

Many times in my career I've needed large volumes of realistic test data
(aka "fixtures"), and I've long thought that a service could provide it well.
Last year I had time to work on the idea (using node.js), and made some good
progress building the core technology in a project I'm calling golems.io.
Later in the year I got sucked into a new venture (http://www.snupi.com/) and
no longer have time to dedicate to this. However, I don't want the effort so
far to go to waste, and am wondering 1) whether the community thinks this is a
potentially valuable service, and 2) whether there are any people or
organizations out there interested in taking it on.

So what's the point of this?  There are many, but the most obvious use case
I can think of is automated testing of web sites/services.
Consider this form-filling example from zombie.js
<http://zombie.labnotes.org/#Feeding>:

  browser.
    fill("Your Name", "Arm Biter").
    fill("Profession", "Living dead").
    select("Born", "1968").
    uncheck("Send me the newsletter").
    pressButton("Sign me up", function() {
      // Make sure we got redirected to thank you page.
      assert.equal(browser.location.pathname, "/thankyou");
    });

Consider the second arguments to fill and select--where do these come from?
Hard-coded values are perhaps fine for a simple unit test, but what if you
wanted to create a few hundred or a few thousand subscribers for your
newsletter?  That's where a fixture generator comes in handy, but there are
serious limitations to the existing ones.  First consider the output of a
classic random fixture generator like Faker
<https://github.com/marak/Faker.js/>:

  {
    "name":"Oswald Goldner",
    "username":"Izaiah",
    "email":"[email protected]",
    "address":{"zipcode":"35411"},
    "phone":"1-658-413-1550"
  }

While the text for each individual field is reasonable, there's no overall
coherence: the name ("Oswald...") and the email ("Michael...") suggest two
different people, the zip code suggests Alabama, the area code Oregon, the
email address isn't actually usable, etc.  There's another problem with using
random data: it's nearly impossible to reproduce.  Run the exact same test
again and you could get totally different results.

Now imagine a signup similar to the above that includes a password and a
typical email confirmation step, where you send an email to the user and they
click on a link in the email that includes a temporary unique key and requires
them to reenter their password, then shows them a personalized
"congratulations Oswald Goldner" page.  To test that scenario you need an
email address that works, some way to access that address's mailbox, and some
way of knowing or remembering the password and name across steps.  This
becomes quite difficult using transient random fixtures.

Golems are a different approach to generating fixtures, using a deterministic
but chaotic encryption algorithm instead of a random number generator.
Realistic statistical data sets are used for demographic data, and care is
taken to ensure consistency when possible, e.g. zip code and area code; year
of birth, gender and first name (they correlate surprisingly strongly,
especially for females).  Here's an example:

  {
    "gender":"female",
    "given_name":"Kimberly",
    "family_name":"West",
    "birth_date":"1979-05-23",
    "username":"g22yjght",
    "password":"2r%B0m%B",
    "email":"[email protected]",
    "address":{"postal_code":"94947"},
    "phone_number":"(707) 229-7163"
  }
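
To make the idea concrete, here is a minimal sketch of deterministic,
internally consistent generation.  It is not the golems.io code: the post does
not say which algorithm is used, so HMAC-SHA256 from node's crypto module
stands in for the "deterministic but chaotic" function, and SECRET_KEY,
REGIONS, GIVEN_NAMES, fieldValue, pick and makePerson are all hypothetical
names with tiny placeholder tables rather than real statistical data sets.

```javascript
const crypto = require("crypto");

// Hypothetical secret; a real service would keep this private.
const SECRET_KEY = "example-key";

// Placeholder table pairing a postal code with an area code from the same
// city, so the two fields can never contradict each other.
const REGIONS = [
  { postal_code: "10001", area_code: "212" }, // New York
  { postal_code: "60601", area_code: "312" }, // Chicago
  { postal_code: "94103", area_code: "415" }, // San Francisco
  { postal_code: "98101", area_code: "206" }, // Seattle
];

// Placeholder name lists keyed by gender (illustrative, not real statistics).
const GIVEN_NAMES = {
  female: ["Kimberly", "Jennifer", "Lisa"],
  male: ["Michael", "David", "James"],
};

// Derive an unsigned 32-bit value from (seed, field): the same inputs always
// give the same output, but nearby seeds give unrelated-looking values.
function fieldValue(seed, field) {
  return crypto
    .createHmac("sha256", SECRET_KEY)
    .update(`${seed}:${field}`)
    .digest()
    .readUInt32BE(0);
}

// Choose one entry of a list deterministically for a given seed and field.
function pick(list, seed, field) {
  return list[fieldValue(seed, field) % list.length];
}

function makePerson(seed) {
  const gender = pick(["female", "male"], seed, "gender");
  const region = pick(REGIONS, seed, "region");
  const digits = String(fieldValue(seed, "phone") % 10000).padStart(4, "0");
  return {
    gender,
    // The name is drawn from the list matching the already chosen gender.
    given_name: pick(GIVEN_NAMES[gender], seed, "given_name"),
    address: { postal_code: region.postal_code },
    // The area code comes from the same region as the postal code.
    phone_number: `(${region.area_code}) 555-${digits}`,
  };
}

console.log(makePerson(42));
```

Calling makePerson with the same seed (and the same key and tables) returns
the same person every time, which is what makes a test run reproducible.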

Every golem is grown from a single unique 32-bit number; given that number
(and some keys and the right version of the code) you can fully recreate every
attribute of the golem.  That number is directly derivable from a few of the
fields which are globally unique (username and email in this case), but can
typically be recovered from a few pieces of non-unique information (name and
phone number would be enough).  To fulfill the second part of our
email-confirmation-round-trip test we actually need nothing more than the
email address to which the confirmation was sent.  From this we can recreate
the full golem, retrieve the password, and even verify that the user's name is
correctly displayed on the congratulations page.
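
One way such a round trip could work is sketched below, under stated
assumptions: the 32-bit number is scrambled with a small keyed Feistel
permutation and the result is encoded as the username, so the number can be
read back out of the email address.  This is only an illustration of the
principle, not the scheme golems.io uses; KEY, ROUNDS, the "g" prefix, the
golems.example domain and every function name here are hypothetical.

```javascript
const crypto = require("crypto");

const KEY = "example-key"; // hypothetical secret
const ROUNDS = 4;

// Keyed round function: maps a 16-bit half to a 16-bit value.
function roundFn(half, round) {
  return crypto
    .createHmac("sha256", KEY)
    .update(`${round}:${half}`)
    .digest()
    .readUInt16BE(0);
}

// Feistel network over two 16-bit halves: a reversible scramble of a
// 32-bit unsigned integer.
function encryptSeed(seed) {
  let left = seed >>> 16;
  let right = seed & 0xffff;
  for (let r = 0; r < ROUNDS; r++) {
    [left, right] = [right, left ^ roundFn(right, r)];
  }
  return ((left << 16) | right) >>> 0;
}

// Undo the rounds in reverse order to get the original number back.
function decryptSeed(value) {
  let left = value >>> 16;
  let right = value & 0xffff;
  for (let r = ROUNDS - 1; r >= 0; r--) {
    [left, right] = [right ^ roundFn(left, r), left];
  }
  return ((left << 16) | right) >>> 0;
}

// Encode the scrambled number as a fixed-width base-36 username.
function usernameFor(seed) {
  return "g" + encryptSeed(seed).toString(36).padStart(7, "0");
}

function emailFor(seed) {
  return `${usernameFor(seed)}@golems.example`;
}

// Recover the original number from nothing but the email address.
function seedFromEmail(email) {
  const username = email.split("@")[0];
  return decryptSeed(parseInt(username.slice(1), 36));
}

const email = emailFor(123456789);
console.log(email, seedFromEmail(email) === 123456789);
```

In the confirmation test, the second step would take the address the message
was sent to, recover the number from it, regenerate the golem, and compare the
regenerated name against the congratulations page.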

So where is this project at?  I've developed all of the core enabling
technology, and much of the low-level stuff is already available on npm
<https://npmjs.org/%7Efemto113> and github <https://github.com/femto113>.
There's a prototype version of a service running on Heroku
<http://api.golems.io/person/random.json>.
Some of the more advanced features (like an API to let you retrieve mail sent
to golems) are in prototype state.  I've even created a zombie/golem hybrid
(a glombie) that lets you use zombie.js to test websites without having to
make up test data, and have outlined similar approaches that should work with
stuff like phantomjs & casperjs <http://casperjs.org/api.html#casper.fill>.

There's a lot of detail I haven't been able to go into here, but if anybody is
genuinely intrigued please contact me; I'm happy to discuss.  I've always
imagined that this would make a good freemium service with a mixed open/closed
source approach.  It might also make a great add-on or alternative to scale
testing services like https://www.blitz.io/.  My first choice would be to find
anyone interested in helping make this into a viable service business, but
failing that I'm open to open sourcing the whole thing as long as there's
someone (or some organization) that will actually carry it forward.

--Ken
