MtDu has uploaded a new change for review.
https://gerrit.wikimedia.org/r/265769
Change subject: Make reflinks only fetch 1 Mb of each linked document
......................................................................
Make reflinks only fetch 1 Mb of each linked document
Bug: T124138
Change-Id: Icd83a1c59e8451bfb5a44fbdad45697c3847e47d
---
M scripts/reflinks.py
1 file changed, 3 insertions(+), 2 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/pywikibot/core refs/changes/69/265769/1
diff --git a/scripts/reflinks.py b/scripts/reflinks.py
index be630cb..a5594b6 100755
--- a/scripts/reflinks.py
+++ b/scripts/reflinks.py
@@ -523,7 +523,7 @@
f = None
try:
- f = requests.get(ref.url, headers=headers, timeout=60)
+ f = requests.get(ref.url, headers=headers, timeout=60, stream=True)
# Try to get Content-Type from server
contentType = f.headers.get('content-type')
@@ -582,7 +582,8 @@
new_text = new_text.replace(match.group(), repl)
continue
- linkedpagetext = f.content
+ # Read the first chunk, at most 1,000,000 bytes (about 0.95 MiB)
+ linkedpagetext = next(f.iter_content(1000000), b'')
except UnicodeError:
# example : http://www.adminet.com/jo/20010615¦/ECOC0100037D.html
# in [[fr:Cyanure]]
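
Note on the size cap: with stream=True, requests defers the body download, and Response.iter_content(chunk_size) returns an iterator over byte chunks rather than a bytes object, and a chunk may be shorter than chunk_size. A minimal sketch of one way to collect at most a fixed number of bytes from such an iterator follows. The helper name read_limited and its limit parameter are hypothetical and are not part of reflinks.py or of requests.

```python
def read_limited(chunks, limit=1000000):
    """Return at most `limit` bytes from an iterable of byte chunks.

    `chunks` can be any iterable of bytes objects, for example the
    iterator returned by requests' Response.iter_content().
    Hypothetical helper for illustration only.
    """
    buf = bytearray()
    for chunk in chunks:
        # Take only as much of this chunk as still fits under the limit.
        buf.extend(chunk[:limit - len(buf)])
        if len(buf) >= limit:
            # Stop before pulling any further chunks from the source.
            break
    return bytes(buf)


# Possible use with a streamed response (assumed, not taken from the patch):
#   f = requests.get(url, headers=headers, timeout=60, stream=True)
#   linkedpagetext = read_limited(f.iter_content(8192))
#   f.close()
```

Accumulating across chunks keeps the cap exact even when the server delivers the body in small pieces, at the cost of a short loop instead of a single call.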
--
To view, visit https://gerrit.wikimedia.org/r/265769
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icd83a1c59e8451bfb5a44fbdad45697c3847e47d
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: MtDu <[email protected]>