def sitemap():
    # Import os and the regex web2py uses to find exposed controller functions
    import os
    from gluon.myregex import regex_expose

    # Find your controllers
    ctldir = os.path.join(request.folder, "controllers")
    ctls = os.listdir(ctldir)

    # Exclude appadmin.py, manage.py and administration.py
    if 'appadmin.py' in ctls: ctls.remove('appadmin.py')
    if 'manage.py' in ctls: ctls.remove('manage.py')
    if 'administration.py' in ctls: ctls.remove('administration.py')

    # Add the schemas for the sitemap
    xmlns = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n'
    xmlnsImg = 'xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"\n'
    xmlnsVid = 'xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"\n'
    sitemap_xml = '<?xml version="1.0" encoding="UTF-8"?>\n'
    sitemap_xml += '<urlset %s %s %s>\n' % (xmlns, xmlnsImg, xmlnsVid)

    # Pages that you don't want in the XML sitemap
    ExcludedPages = ['coming_soon', 'view']

    # Define your domain
    Domain = 'https://www.mydomain.com'

    for ctl in ctls:
        if not ctl.endswith(".bak"):
            filename = os.path.join(ctldir, ctl)
            # Read the controller source and close the file afterwards
            with open(filename, 'r') as fh:
                data = fh.read()
            functions = regex_expose.findall(data)
            print(functions)  # debug output
            ctl = ctl[:-3]  # strip the ".py" extension
            # Add static URLs from your controllers
            for f in functions:
                # Ignore the pages from the list above (ExcludedPages)
                # and any ajax_ helper functions
                if f not in ExcludedPages and 'ajax_' not in f:
                    sitemap_xml += '<url>\n<loc>%s/%s/%s</loc>\n</url>\n' % (Domain, ctl, f)

    # Dynamic URLs from tables, e.g. www.domain.com/post/1
    posts = db().select(db.blog.ALL, orderby=~db.blog.id)
    for item in posts:
        sitemap_xml += '<url>\n<loc>%s/blog/view/%s</loc>\n</url>\n' % (Domain, item.url)
    for country in db().select(db.country.ALL):
        sitemap_xml += '<url>\n<loc>%s/country/%s</loc>\n</url>\n' % (Domain, country.code)

    sitemap_xml += '</urlset>'
    return sitemap_xml

Not sure if this helps, but here's what we use. It's tweaked from a recipe
at http://www.web2pyslices.com/slice/show/1560/dynamic-xml-sitemaps
It's dynamic and, if placed in the default controller, should be served at
domain/sitemap.xml. Obviously, if we add new complex pages with dynamic
content to the site, we'll need to update the controller to cover them.
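
To make adding new dynamic pages less of a chore, the URL building could be
pulled into small helpers. This is only a rough sketch, not what we run in
production: url_entries and build_sitemap are names I made up, the key lists
stand in for the db().select(...) results, and it only emits the base sitemap
namespace. It also runs each URL through xml.sax.saxutils.escape, since an
unescaped & in a <loc> would make the XML invalid.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9'


def url_entries(domain, prefix, keys):
    """Return one <url> element per key, e.g. https://host/blog/view/<key>."""
    entries = []
    for key in keys:
        # escape() turns &, < and > into XML entities so <loc> stays valid
        loc = escape('%s/%s/%s' % (domain, prefix, key))
        entries.append('<url>\n<loc>%s</loc>\n</url>\n' % loc)
    return entries


def build_sitemap(entries):
    """Wrap already-built <url> elements in a <urlset> document."""
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return '%s<urlset xmlns="%s">\n%s</urlset>' % (
        header, SITEMAP_NS, ''.join(entries))


# Hypothetical data standing in for the rows selected from the database
entries = url_entries('https://www.mydomain.com', 'blog/view',
                      ['first-post', 'q&a'])
entries += url_entries('https://www.mydomain.com', 'country', ['GB', 'ZA'])
sitemap_xml = build_sitemap(entries)
```

In the controller, the key lists would come from the tables instead, for
example [item.url for item in posts], so a new dynamic page type is one more
url_entries call.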
On Wednesday, 21 June 2017 21:11:54 UTC+2, Ron Chatterjee wrote:
>
> Which directory sitemap.xml should be placed at? I am guessing under the
> app directory?
>
>
>
> On Wednesday, May 15, 2013 at 6:26:58 PM UTC-4, Omi Chiba wrote:
>>
>> You're right. The web site I previously posted creates the index up to
>> controller level. I think it's OK for now.
>>
>> On Wednesday, May 15, 2013 3:44:26 PM UTC-5, Niphlod wrote:
>>>
>>> PS: the problem with this approach is that your site needs to include
>>> some page where all the urls that it can generate are listed.
>>>
>>> e.g. you make a /app/default/showme/ controller that takes an integer
>>> and returns the representation of a product that you previously saved in a
>>> database (/app/default/showme/1, /app/default/showme/2, etc.)
>>>
>>> How do you crawl them "automatically" without a /app/default/show/all
>>> that lists all products ?
>>> Usually when you want to generate a sitemap.xml it's because bots (that
>>> basically point to your domain and start crawling like the script provided)
>>> can't know in advance your dynamically generated pages if there isn't
>>> somewhere a page that lists ALL of them.
>>>
>>>
>>>
>>>