It seems to work now. I had to add an option telling httrack *not* to obey the robots directives :), and to change the directive to “noindex, follow”. I also changed the site-wide “robots.txt” file: without a specific directive, search engines might index test/ instead of web/, which would cause the same issue we have with spip/.
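In case it helps with the morning check, here is a minimal sketch of how the presence of the robots directive could be tested from a script. It is only an illustration: the class and function names and the two sample pages are made up for the example, and it only looks at `<meta name="robots">` tags, not at robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute names are lowercased by HTMLParser, values are not.
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.append(part)


def robots_directives(html):
    """Return the list of robots directives declared in an HTML page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """A page is treated as indexable unless it declares 'noindex'."""
    return "noindex" not in robots_directives(html)


# Hypothetical samples: a Spip page carrying the directive,
# and a static copy from which it has been removed.
spip_page = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body>Spip</body></html>"
)
static_page = "<html><head><title>Static</title></head><body>Copy</body></html>"

print(robots_directives(spip_page))
print(is_indexable(spip_page), is_indexable(static_page))
```

To use it on the real site, the HTML of a page under spip/ and of the same page under test/ would be downloaded first (for instance with `urllib.request`) and passed to `is_indexable`.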
JM.

----- phili...@free.fr wrote:

> Hello,
>
> As we discussed in a private mail, Salvatore is proposing to block
> indexing of Spip pages (but not our official website of course!).
> This proposal makes me suspect that Google started to index our pages
> in Spip and then, seeing them simply copied to our static official
> website, may have decided that Spip is the source to be indexed
> (instead of our official website).
>
> So I modified the Spip page skeleton to include a “noindex, nofollow”
> directive on every Spip page. I then modified the script that makes
> the static copy of Spip each night. I checked that Spip pages actually
> carry this directive. Tomorrow morning we'll have to check that pages at:
>
> http://www.doudoulinux.org/test/
>
> do *not* carry it. As the static copy of our website hasn't been
> explained in our online documentation yet, here are some details:
>
> 1. Authors/translators work with Spip, at
>    http://www.doudoulinux.org/spip/
> 2. Each night a snapshot of Spip is created using httrack; it is named
>    after the date, “snapshot-yyyymmdd”
> 3. The latest snapshot can be browsed at
>    http://www.doudoulinux.org/test/
> 4. Our official website http://www.doudoulinux.org/web/ is just a link
>    to a given snapshot
>
> Currently I'm still changing the web/ link manually, but I plan to
> write a simple script that would do this automatically once some
> tests have passed (we don't want a broken website :) ).
>
> We then have to wait for search engines to visit Spip again to see
> whether this solves the issue (it should, of course!).
>
> Cheers,
> JM.

_______________________________________________
Doudoulinux-docs mailing list
Doudoulinux-docs@gna.org
https://mail.gna.org/listinfo/doudoulinux-docs