On 09/09/11 10:32, Rhodri James wrote:
On Fri, 09 Sep 2011 00:40:42 +0100, Simon Cropper

Ahem. You should expect a certain amount of ribbing after admitting that
your Google-fu is weak. So is mine, but hey.

I did not admit anything. I consider my ability to find things quite good, actually. Others assumed that my "Google-fu is weak".


4. If someone is willing to help me, rather than lecture me (or poke
me to see if they get a response), I would appreciate it.

The Google Python Sitemap Generator
(http://www.smart-it-consulting.com/article.htm?node=166&page=128,
fourth offering when you google "map a website with Python") looks like
a promising start. At least it produces something in XML -- filtering
that and turning it into HTML should be fairly straightforward.


I saw this in my original search. My conclusions were:

1. The last update was in 2005. That is 6 years ago. In that time we have had numerous upgrades to HTML, Logs, etc.
2. The script expects to run on the webserver. I don't have the ability to run Python on my webserver.
3. There are also a number of dead links and redirects to Google Webmaster Central / Tools, which then request that you submit a sitemap (as I alluded, we get into a circular, confusing cross-referencing situation).
4. The ultimate product, if you can get the package to work, would be an XML file you would need to massage to extract what you needed.

To me this seems like overkill.

I assume you could import the parent HTML file, scrape all the links on the same domain, dump these to a hierarchical list and represent this in HTML using BeautifulSoup or something similar. Certainly doable, but considering the sheer commonality of this task I don't understand why a simple script does not already exist, hence my original request for assistance.
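
For illustration, here is a minimal sketch of that approach. It is a rough outline under stated assumptions, not a finished tool: it uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra packages, it processes a single page's HTML passed in as a string (no fetching or recursive crawling), and the names LinkCollector, same_domain_links, build_tree and render, as well as the example.com URLs, are made up for the example.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(page_html, base_url):
    """Return sorted absolute URLs found in page_html that share base_url's host."""
    collector = LinkCollector()
    collector.feed(page_html)
    host = urlparse(base_url).netloc
    links = set()
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute = urljoin(base_url, href).split("#")[0]
        if urlparse(absolute).netloc == host:
            links.add(absolute)
    return sorted(links)


def build_tree(urls):
    """Nest URLs by path segment, e.g. {'docs': {'intro.html': {}}}."""
    tree = {}
    for url in urls:
        node = tree
        for segment in urlparse(url).path.strip("/").split("/"):
            if segment:
                node = node.setdefault(segment, {})
    return tree


def render(tree, indent=0):
    """Render the nested dict as an indented HTML unordered list."""
    if not tree:
        return ""
    pad = "  " * indent
    lines = [pad + "<ul>"]
    for name in sorted(tree):
        lines.append(pad + "  <li>" + escape(name))
        child = render(tree[name], indent + 2)
        if child:
            lines.append(child)
        lines.append(pad + "  </li>")
    lines.append(pad + "</ul>")
    return "\n".join(lines)


# Hypothetical sample page; a real run would read the site's index page instead.
sample = (
    '<a href="/docs/intro.html">Intro</a> '
    '<a href="docs/gis.html#top">GIS</a> '
    '<a href="http://other.example.org/">Elsewhere</a>'
)
print(render(build_tree(same_domain_links(sample, "http://example.com/"))))
```

To map a whole site, each discovered URL would itself need to be fetched (for instance with urllib.request.urlopen) and fed through same_domain_links, keeping a set of visited pages to avoid loops. That part is left out here.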

It would appear from the feedback so far this 'forum' is not the most appropriate to ask this question. Consequently, I will take your advice and keep looking... and if I don't find something within a reasonable time frame, just write something myself.

--
Cheers Simon

   Simon Cropper - Open Content Creator / Website Administrator

   Free and Open Source Software Workflow Guides
   ------------------------------------------------------------
   Introduction               http://www.fossworkflowguides.com
   GIS Packages               http://gis.fossworkflowguides.com
   bash / Python        http://scripting.fossworkflowguides.com
--
http://mail.python.org/mailman/listinfo/python-list
