I am looking for a way to make a static copy of a shop.
Google does not like dynamic sites where the client ID is part of the
URL: Google thinks those are different pages with duplicated content.
So my plan is to restrict cgi-bin for every spider other than wget
running from my own IP address, and to offer one static copy for
indexing by search engines.
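One way to sketch that restriction, assuming Apache with mod_rewrite (the IP address is a placeholder for my own, where wget runs):

```apache
# .htaccess sketch: forbid /cgi-bin/ for every client except
# 192.0.2.10 (placeholder address)
RewriteEngine On
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.10$
RewriteRule ^cgi-bin/ - [F]
```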

Will wget build me such a copy of the entire site?
Full interlinked and spiderable?
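As far as I know, yes: wget can mirror a site recursively and rewrite the internal links so the copy is browsable on its own. A minimal sketch (mydomain is a placeholder for the real host):

```shell
# Mirror the shop, convert internal links so pages reference each other
# in the static copy, give pages .html extensions so a plain web server
# can serve them, and fetch images/CSS needed to display each page.
wget --mirror \
     --convert-links \
     --html-extension \
     --page-requisites \
     http://mydomain/
```

The result lands in a directory named after the host and is fully interlinked, so a spider can crawl it like any static site.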

All pages of the static copy will then have the same client ID.
Or, if I use my shop software to recognize the visitor by IP address,
no client ID is issued at all.

I am thinking of using a tool to turn the dynamic URLs into short
static URLs, e.g.
mydomain/shop.cgi?action=add&templ=cart1  -> mydomain/add/cart1
Such a "dynamic-to-static" rewriting can be triggered by cron.
The indexed static URLs will then be mapped back to the dynamic ones
by mod_rewrite.
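The mod_rewrite half of that could look roughly like this (a sketch only; the pattern is an assumption based on the example URL above):

```apache
# .htaccess sketch: map the indexed static URL back to the CGI script
RewriteEngine On
RewriteRule ^add/(cart[0-9]+)$ /cgi-bin/shop.cgi?action=add&templ=$1 [L]
```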

What's a good Linux tool for that string replacement?
A table of replacement rules with regular expressions is required:
action=add&templ=cart1  -> mydomain/add/cart1
action=add&templ=cart2  -> mydomain/add/cart2

Is awk the right choice for that?
Or is there another, simpler tool that would suffice?
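For a fixed lookup table like the one above, sed may be simpler than awk; a minimal sketch (the two patterns come from the example rules, everything else is assumption):

```shell
# Apply the rewrite table to text on stdin: one sed expression per rule.
# '|' is used as the s-command delimiter so the '/' in the replacement
# needs no escaping.
rewrite_urls() {
  sed -e 's|shop\.cgi?action=add&templ=cart1|add/cart1|g' \
      -e 's|shop\.cgi?action=add&templ=cart2|add/cart2|g'
}

# Example: rewrite a link inside a mirrored HTML file
echo 'href="shop.cgi?action=add&templ=cart1"' | rewrite_urls
# prints: href="add/cart1"
```

With many rules it scales better to keep the expressions in a file and apply them to every mirrored HTML page via find, e.g. `find mirror/ -name '*.html' -exec sed -f rules.sed -i {} +`.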
Thanks, Maggi
