I believe this subject is way off-topic for this mailing list, but here are my five cents.

1. GDPR, like any other bloated, convoluted law written in inhuman juridical language, mostly benefits two kinds of people: lawyers and government officials. It creates a great deal of fuss and expense, gives vast grounds for abuse of power, and so on and so forth.

As a side effect, it somewhat helps ordinary people control the use of their personal data. Since the lifespan of data on the Net can hardly be controlled as a whole, the potential for abuse of GDPR is limitless. Cheer the politicians for this excellent masterpiece of legislation.

As many such laws are introduced worldwide (the closest example of a similarly inadequate law is Russian Federal Law #152, "On Personal Data"), they will strike a lethal blow to the majority of small and medium businesses and cripple the foundation of normal human communication.

Let's just watch the process and enjoy the show.

2. The "Robot Exclusion Protocol", as it's defined in its text, is advisory only. It is not mandatory for any kind of data transmission. Thus any claims or demands about following its statements are void. You may ask, not to demand.

No entity transferring data over the Net can be reliably *and* efficiently identified as a human being (or a bot). It is quite easy to imitate either, which makes robots.txt a lame excuse for the lack of effective control over which data may be taken and by which means.
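As a rough illustration (a minimal Python sketch; the URL is made up), any client can present itself as an ordinary browser, and the server has no reliable way to tell the difference:

    import urllib.request

    # The crawler simply claims to be a desktop browser; the server only sees the header.
    req = urllib.request.Request(
        "https://example.org/some-page",  # hypothetical URL
        headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    )
    with urllib.request.urlopen(req) as resp:
        html = resp.read()  # fetched exactly as a human visitor would receive it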

Simply put: if you don't want your digital crap to be available to everyone, don't make it publicly available.

The use of robots.txt has been weird and strange in many cases. I remember several WordPress versions which, when installed, silently changed robots.txt to disable all page indexing. Also, you cannot magically demand the removal of data already downloaded and stored locally just by altering your robots.txt at will. That's pure nonsense.
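If memory serves, the file those WordPress installations wrote amounted to the two-line "block everything" form, which asks every compliant crawler to skip the whole site:

    User-agent: *
    Disallow: /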

Although in many cases I do not like the wording Archive Team uses, in this particular case I think they are generally right.

Sincerely,
Konstantin Boyandin

Listo Factor via Gnupg-users wrote 2019-07-06 19:06:
On 7/5/19 10:13 AM, Wiktor Kwapisiewicz via Gnupg-users <gnupg-users@gnupg.org> wrote:

As for robots.txt, not all archiving sites respect it:
https://www.archiveteam.org/index.php?title=Robots.txt

Thanks for posting the link. To quote from the text there:

What this situation does, in fact, is cause many more problems than it solves - catastrophic failures on a website are ensured total destruction with the addition of ROBOTS.TXT. Modifications, poor choices in URL transition, and all other sorts of management work can lead to a loss of historically important and relevant data. Unchecked, and left alone, the ROBOTS.TXT file ensures no mirroring or reference for items that may have general use and meaning beyond the website's context.
This is both stupid and arrogant. It is precisely for the owner of the
website and the data contained therein to decide what is and isn't of
"general use and meaning beyond the website's context", not for some
aggregator/archiver's management.

GDPR has indeed changed the nature of the Internet forever, and for
the better. If Google can be put in its place by the EU (well, at least
the first steps have been made...), surely it will be possible to force
other, lesser operators of "archived information" to toe the line. Among
other things, to respect the straightforward and simple Robot Exclusion
Protocol. It is not at all difficult to do.



_______________________________________________
Gnupg-users mailing list
Gnupg-users@gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-users
