I'm writing a script that, amongst other things, provides some basic documentation for PHP scripts, and I'd like it to include some bits of info from the PHP manual. The script makes 'tool-tips' for things (in the case of PHP, the functions) by scraping a source (in the case of PHP, a local copy of the manual that the user has downloaded) and saving the relevant text as a PNG image. The point is to generate cacheable images so the user doesn't have to download zillions of them. The script is designed to work with any documentation source you can reasonably rely on, using regular expressions to grab the info.
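
To give a rough idea of the regex approach, here is a minimal sketch in Python. The sample HTML, the class names (`refname`, `verinfo`, `dc-title`) and the `scrape_tooltip` helper are all made up for illustration; the real manual markup may well differ, which is exactly the fragility I'm worried about.

```python
import re

# Hypothetical fragment resembling a manual page; the real markup may differ.
SAMPLE_PAGE = """
<div class="refnamediv">
 <h1 class="refname">strlen</h1>
 <p class="verinfo">(PHP 4, PHP 5)</p>
 <p class="refpurpose"><span class="dc-title">Get string length</span></p>
</div>
"""

# One pattern per field, so a markup change only breaks the field it touches.
PATTERNS = {
    "name": re.compile(r'<h1 class="refname">(.*?)</h1>', re.S),
    "versions": re.compile(r'<p class="verinfo">\((.*?)\)</p>', re.S),
    "summary": re.compile(r'<span class="dc-title">(.*?)</span>', re.S),
}


def scrape_tooltip(html):
    """Return a dict of the fields found; fields that don't match map to None."""
    info = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(html)
        info[field] = match.group(1).strip() if match else None
    return info


if __name__ == "__main__":
    print(scrape_tooltip(SAMPLE_PAGE))
```

Each field gets its own pattern and falls back to `None` when it doesn't match, so a formatting change shows up as a missing field rather than a crash.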

Of course, since this is sort of like local screen scraping, formatting changes to the manual from version to version will break it. So I wonder: is there an alternative? Is there any short, downloadable format with summary information that I could have users download to generate this info when they need it? I've seen editors and the like that know the parameter types functions take, plus other info. I'd also like to include version numbers, the basic description, the category of the function (e.g. filesystem), and maybe the first paragraph of what the manual now calls 'description'. Also, if I stick to scraping a local copy of the manual with regular expressions, how often is the format likely to change and require a new patch or regular expression? Any info on how to grab this other than by scraping a local copy of the manual, or on scraping the local manual in a way that's less apt to break in the future, would be most welcome.

Thanks!

- James Coder
