I stumbled upon
https://bugzilla.redhat.com/show_bug.cgi?id=249743
and thought, "Easy, let's write a reasonable robots.txt", but then it got
complicated (as usual). I would like to know your opinion.
Four types of URL are publicly reachable:
  /rhn/Login.do  -- login page
  /rhn/help/*    -- help in JSP
  /help/         -- PXT help, or plain HTML
  /pub/          -- your local garbage
Did I forget something?
Now the robots.txt...
One approach would be to disallow everything:
  User-agent: *
  Disallow: /
But do we want that?
Is there a reason to forbid indexing of the login page (you might put
information about your company there and want it indexed)?
Is there a reason to forbid indexing of the help?
Is there a reason to forbid /pub? Either you do not expose it publicly
(and keep it behind a firewall), or it is out on the open internet, in which
case you probably do not care if somebody indexes it.
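To make the two options concrete, here is a rough sketch that compares the
"disallow everything" file with a selective one that opens only the four
public locations listed above. The selective file is just one possible
layout, not a decided policy. It relies on Allow lines, which are not part of
the original robots.txt convention, although the major crawlers honor them.
The check uses Python's standard urllib.robotparser, which matches plain path
prefixes, so /rhn/help/ stands in for /rhn/help/*. The bot name and the
/rhn/private/Page.do path are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The blanket policy quoted above.
BLANKET = """\
User-agent: *
Disallow: /
"""

# Hypothetical selective policy: open the four public locations,
# close everything else. Allow lines come first so that parsers
# using first-match semantics see them before "Disallow: /".
SELECTIVE = """\
User-agent: *
Allow: /rhn/Login.do
Allow: /rhn/help/
Allow: /help/
Allow: /pub/
Disallow: /
"""

# Sample paths; the last one is an invented non-public page.
PATHS = [
    "/rhn/Login.do",
    "/rhn/help/index.jsp",
    "/help/",
    "/pub/",
    "/rhn/private/Page.do",
]


def check(robots_txt, paths, agent="ExampleBot"):
    """Return {path: True/False} telling whether `agent` may fetch each path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


blanket = check(BLANKET, PATHS)
selective = check(SELECTIVE, PATHS)

for path in PATHS:
    print(f"{path:25} blanket={blanket[path]!s:6} selective={selective[path]}")
```

With the blanket file every path is refused; with the selective file only the
invented private page is refused. Dropping one Allow line (for /pub/, say) is
enough to close that single location again.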
Ideas? Comments?
Mirek
_______________________________________________
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel