Xqt has submitted this change. ( 
https://gerrit.wikimedia.org/r/c/pywikibot/core/+/817363 )

Change subject: [backport] backport archivebot.py from master
......................................................................

[backport] backport archivebot.py from master

- Non-Latin digits support, introduced with
  https://gerrit.wikimedia.org/r/c/pywikibot/core/+/163213 ,
  never worked because variable replacements like %(counter)d
  expect an int instead of a str. This did not fail as long as
  textlib.to_local_digits returned the value unchanged when no local
  digits are defined for a language, but it could fail for languages
  that have them. With 7.5, textlib.to_local_digits always returns a
  str and the archivebot failed. This was fixed recently in 7.5.1.
- Users should be able to decide whether to use Latin or non-Latin
  digits. Therefore new fields such as 'localcounter' were introduced,
  which use the localized number instead of the Latin one. This does
  not break existing usage, which relies on the %d replacement,
  except in rare cases where the user had already replaced it with %s.
- Restore the old values for the non-local fields.
- Remove the 7.5.1 changes.
- Add a sanity check to the analyze_page() method for the case that the
  local fields are used with %d, and show a warning in this case.
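
The behaviour described above can be illustrated with a minimal,
self-contained sketch. It is not the archivebot code: the helper names
(to_local_digits_sketch, make_params, expand) are hypothetical, the digit
table covers only Persian ('fa') as an assumed example language, and the
real textlib.to_local_digits is replaced by a stand-in so the snippet runs
without Pywikibot.

```python
import re

# Assumption: a tiny stand-in for textlib.to_local_digits that only knows
# Persian digits (U+06F0..U+06F9); the real function covers more languages.
FA_DIGITS = {ord(str(i)): chr(0x06F0 + i) for i in range(10)}


def to_local_digits_sketch(value, lang):
    """Return value as a str, using localized digits for 'fa' only."""
    text = str(value)
    return text.translate(FA_DIGITS) if lang == 'fa' else text


def make_params(counter, year, lang):
    """Build int fields plus 'local...' str fields, as the change does."""
    params = {'counter': counter, 'year': year}
    # The comprehension is evaluated completely before update() runs,
    # so the dict is not modified while it is being iterated.
    params.update({'local' + key: to_local_digits_sketch(value, lang)
                   for key, value in params.items()})
    return params


def expand(pattern, params):
    """Expand pattern; fall back to %s if a str field was used with %d."""
    try:
        return pattern % params
    except TypeError:
        # Rewrite %(field)d to %(field)s for all known fields and retry.
        regex = re.compile(r'%(\((?:{})\))d'.format('|'.join(params)))
        return regex.sub(r'%\1s', pattern) % params


params = make_params(3, 2022, 'fa')
print(expand('Archive %(counter)d', params))
print(expand('Archive/%(localyear)s/%(localcounter)d', params))
```

The first call uses the Latin int field with %d; the second one mixes a
correct %(localyear)s with a wrong %(localcounter)d and is only expanded
after the fallback rewrite.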

- Use a generator instead of a list of pages to process. This decreases
  memory usage a lot and also shortens the start-up time because the
  pages are no longer sorted first.
- Preload the page contents.
- Catch KeyboardInterrupt and leave the main loop.
- Add the look & feel of CurrentPageBot.
- Finally, print the execution time.
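
The loop structure described in this list can be sketched as follows. This
is only an illustration under stated assumptions: iter_pages and run are
hypothetical names, plain strings stand in for pywikibot.Page objects, and
the generator replaces the real getReferences() call.

```python
import datetime


def iter_pages(lines):
    """Yield page titles lazily instead of building and sorting a list."""
    for line in lines:
        yield line.strip()


def run(pages, process):
    """Process pages one by one; return the titles that succeeded."""
    done = []
    for page in pages:
        try:
            process(page)
            done.append(page)
        except Exception as exc:
            # An error in one page must not abort the whole run.
            print('Error occurred while processing page {}: {}'
                  .format(page, exc))
        except KeyboardInterrupt:
            # KeyboardInterrupt is not an Exception subclass, so it is
            # handled here and ends the loop.
            print('User quit bot run...')
            break
    return done


if __name__ == '__main__':
    start = datetime.datetime.now()
    run(iter_pages(['Talk:A\n', 'Talk:B\n']), print)
    elapsed = datetime.datetime.now() - start
    # total_seconds() also counts days and fractions of a second,
    # whereas the .seconds attribute holds only the seconds component.
    print('Execution time: {} seconds'.format(elapsed.total_seconds()))
```

Because the pages are consumed lazily, the first page can be processed
before the remaining ones have been fetched.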

Bug: T71551
Bug: T313682
Bug: T313692
Bug: T313785
Change-Id: I10c8ac62656aa53f41be629003e5ed6a875f9310
---
M pywikibot/__metadata__.py
M scripts/archivebot.py
2 files changed, 82 insertions(+), 56 deletions(-)

Approvals:
  Xqt: Verified; Looks good to me, approved



diff --git a/pywikibot/__metadata__.py b/pywikibot/__metadata__.py
index 21e0e01..e00d7eb 100644
--- a/pywikibot/__metadata__.py
+++ b/pywikibot/__metadata__.py
@@ -11,7 +11,7 @@


 __name__ = 'pywikibot'
-__version__ = '7.5.1'
+__version__ = '7.5.2'
 __description__ = 'Python MediaWiki Bot Framework'
 __maintainer__ = 'The Pywikibot team'
 __maintainer_email__ = '[email protected]'
diff --git a/scripts/archivebot.py b/scripts/archivebot.py
index ecc33d6..33c776e 100755
--- a/scripts/archivebot.py
+++ b/scripts/archivebot.py
@@ -52,18 +52,33 @@
  key                  A secret key that (if valid) allows archives not to be
                       subpages of the page being archived.

-Variables below can be used in the value for "archive" in the template above:
+Variables below can be used in the value for "archive" in the template
+above; numbers are latin digits:

-%(counter)s          the current value of the counter
-%(year)s             year of the thread being archived
-%(isoyear)s          ISO year of the thread being archived
-%(isoweek)s          ISO week number of the thread being archived
-%(semester)s         semester term of the year of the thread being archived
-%(quarter)s          quarter of the year of the thread being archived
-%(month)s            month (as a number 1-12) of the thread being archived
+%(counter)d          the current value of the counter
+%(year)d             year of the thread being archived
+%(isoyear)d          ISO year of the thread being archived
+%(isoweek)d          ISO week number of the thread being archived
+%(semester)d         semester term of the year of the thread being archived
+%(quarter)d          quarter of the year of the thread being archived
+%(month)d            month (as a number 1-12) of the thread being archived
 %(monthname)s        localized name of the month above
 %(monthnameshort)s   first three letters of the name above
-%(week)s             week number of the thread being archived
+%(week)d             week number of the thread being archived
+
+Alternatively you may use localized digits. This is only available for a
+few site languages. Refer :attr:`NON_LATIN_DIGITS
+<pywikibot.userinterfaces.transliteration.NON_LATIN_DIGITS>` whether
+there is a localized one:
+
+%(localcounter)s     the current value of the counter
+%(localyear)s        year of the thread being archived
+%(localisoyear)s     ISO year of the thread being archived
+%(localisoweek)s     ISO week number of the thread being archived
+%(localsemester)s    semester term of the year of the thread being archived
+%(localquarter)s     quarter of the year of the thread being archived
+%(localmonth)s       month (as a number 1-12) of the thread being archived
+%(localweek)s        week number of the thread being archived

 The ISO calendar starts with the Monday of the week which has at least four
 days in the new Gregorian calendar. If January 1st is between Monday and
@@ -87,9 +102,8 @@
   -page:PAGE      archive a single PAGE, default ns is a user talk page
   -salt:SALT      specify salt

-.. versionchanged:: 7.5.1
-   string presentation type should be used for "archive" variable in the
-   template to support non latin values
+.. versionchanged:: 7.5.2
+   Localized variables for "archive" template parameter are supported
 """
 #
 # (C) Pywikibot team, 2006-2022
@@ -104,6 +118,7 @@
 from collections import OrderedDict, defaultdict
 from hashlib import md5
 from math import ceil
+from textwrap import fill
 from typing import Any, Optional, Pattern
 from warnings import warn

@@ -484,16 +499,10 @@
         return self.get_attr('key') == hexdigest

     def load_config(self) -> None:
-        """Load and validate archiver template.
-
-        .. versionchanged:: 7.5.1
-           replace archive pattern fields to string conversion
-        """
+        """Load and validate archiver template."""
         pywikibot.info('Looking for: {{{{{}}}}} in {}'
                        .format(self.tpl.title(), self.page))

-        fields = self.get_params(self.now, 0).keys()  # dummy parameters
-        pattern = re.compile(r'%(\((?:{})\))d'.format('|'.join(fields)))
         for tpl, params in self.page.raw_extracted_templates:
             try:  # Check tpl name before comparing; it might be invalid.
                 tpl_page = pywikibot.Page(self.site, tpl, ns=10)
@@ -503,11 +512,7 @@

             if tpl_page == self.tpl:
                 for item, value in params.items():
-                    # convert archive pattern fields to string
-                    # to support non latin digits
-                    if item == 'archive':
-                        value = pattern.sub(r'%\1s', value)
-                    self.set_attr(item.strip(), value.strip())
+                    self.set_attr(item, value)
                 break
         else:
             raise MissingConfigError('Missing or malformed template')
@@ -562,20 +567,22 @@
     def get_params(self, timestamp, counter: int) -> dict:
         """Make params for archiving template."""
         lang = self.site.lang
-        return {
-            'counter': to_local_digits(counter, lang),
-            'year': to_local_digits(timestamp.year, lang),
-            'isoyear': to_local_digits(timestamp.isocalendar()[0], lang),
-            'isoweek': to_local_digits(timestamp.isocalendar()[1], lang),
-            'semester': to_local_digits(int(ceil(timestamp.month / 6)), lang),
-            'quarter': to_local_digits(int(ceil(timestamp.month / 3)), lang),
-            'month': to_local_digits(timestamp.month, lang),
-            'monthname': self.month_num2orig_names[timestamp.month]['long'],
-            'monthnameshort': self.month_num2orig_names[
-                timestamp.month]['short'],
-            'week': to_local_digits(
-                int(time.strftime('%W', timestamp.timetuple())), lang),
+        params = {
+            'counter': counter,
+            'year': timestamp.year,
+            'isoyear': timestamp.isocalendar()[0],
+            'isoweek': timestamp.isocalendar()[1],
+            'semester': int(ceil(timestamp.month / 6)),
+            'quarter': int(ceil(timestamp.month / 3)),
+            'month': timestamp.month,
+            'week': int(time.strftime('%W', timestamp.timetuple())),
         }
+        params.update({'local' + key: to_local_digits(value, lang)
+                       for key, value in params.items()})
+        monthnames = self.month_num2orig_names[timestamp.month]
+        params['monthname'] = monthnames['long']
+        params['monthnameshort'] = monthnames['short']
+        return params

     def analyze_page(self) -> Set[ShouldArchive]:
         """Analyze DiscussionPage."""
@@ -588,6 +595,9 @@
         whys = set()
         pywikibot.output('Processing {} threads'
                          .format(len(self.page.threads)))
+        fields = self.get_params(self.now, 0).keys()  # dummy parameters
+        regex = re.compile(r'%(\((?:{})\))d'.format('|'.join(fields)))
+        stringpattern = regex.sub(r'%\1s', pattern)
         for i, thread in enumerate(self.page.threads):
             # TODO: Make an option so that unstamped (unsigned) posts get
             # archived.
@@ -598,7 +608,21 @@
             params = self.get_params(thread.timestamp, counter)
             # this is actually just a dummy key to group the threads by
             # "era" regardless of the counter and deal with it later
-            key = pattern % params
+            try:
+                key = pattern % params
+            except TypeError as e:
+                if 'a real number is required' in str(e):
+                    pywikibot.error(e)
+                    pywikibot.info(
+                        fill('<<lightblue>>Use string format field like '
+                             '%(localfield)s instead of %(localfield)d. '
+                             'Trying to solve it...'))
+                    pywikibot.info()
+                    pattern = stringpattern
+                    key = pattern % params
+                else:
+                    raise MalformedConfigError(e)
+
             threads_per_archive[key].append((i, thread))
             whys.add(why)  # xxx: we don't know if we ever archive anything

@@ -791,26 +815,22 @@
         return

     for template_name in templates:
-        pagelist = []
         tmpl = pywikibot.Page(site, template_name, ns=10)
-        if not filename and not pagename:
-            if namespace is not None:
-                ns = [str(namespace)]
-            else:
-                ns = []
-            pywikibot.output('Fetching template transclusions...')
-            pagelist.extend(tmpl.getReferences(only_template_inclusion=True,
-                                               follow_redirects=False,
-                                               namespaces=ns))
         if filename:
             with open(filename) as f:
-                for pg in f.readlines():
-                    pagelist.append(pywikibot.Page(site, pg, ns=10))
-        if pagename:
-            pagelist.append(pywikibot.Page(site, pagename, ns=3))
-        pagelist.sort()
-        for pg in pagelist:
-            pywikibot.output('Processing {}'.format(pg))
+                gen = [pywikibot.Page(site, line, ns=10) for line in f]
+        elif pagename:
+            gen = [pywikibot.Page(site, pagename, ns=3)]
+        else:
+            ns = [str(namespace)] if namespace is not None else []
+            pywikibot.output('Fetching template transclusions...')
+            gen = tmpl.getReferences(only_template_inclusion=True,
+                                     follow_redirects=False,
+                                     namespaces=ns,
+                                     content=True)
+        for pg in gen:
+            pywikibot.info('\n\n>>> <<lightpurple>>{}<<default>> <<<'
+                           .format(pg.title()))
             # Catching exceptions, so that errors in one page do not bail out
             # the entire process
             try:
@@ -823,7 +843,13 @@
             except Exception:
                 pywikibot.exception('Error occurred while processing page {}'
                                     .format(pg))
+            except KeyboardInterrupt:
+                pywikibot.info('\nUser quit bot run...')
+                return


 if __name__ == '__main__':
+    start = datetime.datetime.now()
     main()
+    pywikibot.info('\nExecution time: {} seconds'
+                   .format((datetime.datetime.now() - start).seconds))

--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/817363
To unsubscribe, or for help writing mail filters, visit 
https://gerrit.wikimedia.org/r/settings

Gerrit-Project: pywikibot/core
Gerrit-Branch: stable
Gerrit-Change-Id: I10c8ac62656aa53f41be629003e5ed6a875f9310
Gerrit-Change-Number: 817363
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <[email protected]>
Gerrit-Reviewer: D3r1ck01 <[email protected]>
Gerrit-Reviewer: Xqt <[email protected]>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
