I am looking for a way to make a static copy of a shop. Google does not like dynamic sites where the client ID is part of the URL: Google thinks those are different pages with duplicated content. So my plan is to restrict cgi-bin against every spider except wget run from my own IP address, and to offer one static copy for indexing by search engines.
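One possible way to do that cgi-bin restriction, sketched as an Apache 2.2-style .htaccess fragment. The IP address (192.0.2.10 is a documentation placeholder) and the bot names are only examples; adjust them to your own address and the crawlers you actually see:

```apache
# Hypothetical .htaccess inside cgi-bin:
# flag well-known crawlers by User-Agent ...
SetEnvIfNoCase User-Agent "Googlebot|Slurp|bingbot" is_spider
# ... but clear the flag for requests from our own IP, where wget runs
SetEnvIf Remote_Addr "^192\.0\.2\.10$" !is_spider

Order Allow,Deny
Allow from all
Deny from env=is_spider
```

This denies the flagged spiders while leaving normal visitors and your own wget untouched.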
Will wget build me such a copy of the entire site, fully interlinked and spiderable? All pages of the static copy would then have the same client ID. Or, if I use my shop software to recognize the visitor from the same IP, no client ID is given at all.

I am also thinking of using a tool to turn the dynamic URLs into short static URLs, e.g. mydomain/shop.cgi?action=add&templ=cart1 -> mydomain/add/cart1. Such a "dynamic-to-static rewriting" could be triggered by cron, and the indexed static URLs would then be mapped back to the dynamic ones by mod_rewrite.

What is a good Linux tool for that string replacement? A replacement table with regular expressions is required, e.g.:

action=add&templ=cart1 -> mydomain/add/cart1
action=add&templ=cart2 -> mydomain/add/cart2

Is awk the right choice for that, or is there another, simpler tool that would suffice?

Thanks, Maggi
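On the string-replacement question: awk would work, but plain sed is usually enough for a fixed pattern-to-path table, since each table row maps directly onto one `s|pattern|replacement|g` command. A minimal sketch, assuming the shop URLs from the example above (the file name rewrite.sed and the demo link are made up for illustration):

```shell
# The static copy itself could come from something like:
#   wget --mirror --convert-links --adjust-extension http://mydomain/
# (run from the IP address that cgi-bin still allows)

# Hypothetical replacement table: one sed substitution per dynamic URL.
# '|' is used as the delimiter so the '/' in the replacement needs no escaping.
cat > rewrite.sed <<'EOF'
s|shop\.cgi?action=add&templ=cart1|add/cart1|g
s|shop\.cgi?action=add&templ=cart2|add/cart2|g
EOF

# Apply the table in place to every mirrored HTML file (GNU sed):
#   sed -i -f rewrite.sed *.html
# Demo on a single line:
echo '<a href="shop.cgi?action=add&templ=cart1">add</a>' | sed -f rewrite.sed
# -> <a href="add/cart1">add</a>
```

For a whole mirrored tree, `find . -name '*.html' -exec sed -i -f rewrite.sed {} +` applies the same table recursively; the table file can be regenerated by the same cron job that refreshes the mirror.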
