So, tl;dr: my suggestion would be to throw out all the cruft in the docgen and have it collect the list of modules, write it out in JSON format, and call it a day. Or do this for a single module; in that case the user can write out a collection of `module3.json` and `module4.json` files, as the original post needed, and some extra tool will stitch them together.
But by my estimate, the current JSON is by no means a case of "most stuff is already there".
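
For what it's worth, the "extra tool" that stitches per-module files together could be very small. Here is a rough sketch in Python, purely for illustration: the function names `stitch_modules` and `write_combined`, the top-level `modules` key, and the assumption that each per-module file holds one JSON value are all made up by me, not anything docgen emits today.

```python
import json
from pathlib import Path


def stitch_modules(module_files):
    """Load each per-module JSON file and return the contents as one list.

    The order of the result follows the order of `module_files`.
    """
    modules = []
    for path in module_files:
        with open(path, encoding="utf-8") as f:
            modules.append(json.load(f))
    return modules


def write_combined(module_files, out_path):
    """Write all modules into a single JSON file under a hypothetical
    top-level "modules" key, and return the combined structure."""
    combined = {"modules": stitch_modules(module_files)}
    Path(out_path).write_text(json.dumps(combined, indent=2), encoding="utf-8")
    return combined
```

Usage would be something like `write_combined(["module3.json", "module4.json"], "all_modules.json")`. The tool does not need to know anything about what is inside each module's JSON, which is the point of keeping the docgen side dumb.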