jayvdb created this task.
jayvdb added a subscriber: jayvdb.
jayvdb added projects: pywikibot-core, Pywikibot-interwiki.py.
Herald added subscribers: pywikibot-bugs-list, Aklapper.

TASK DESCRIPTION
  I've always considered this method to be a very strange thing, with its 
definition of empty being unlikely to be useful elsewhere. Why 4? My guess is 
that anything shorter than 4 characters is too small to be useful, e.g. {{a}} 
and [[a]] are 5 chars each.
  
  If `isEmpty` is unintentionally used on a category page, which on small wikis 
is usually 'empty' except for category links, the category page will be 
considered to be 'empty'. This is why interwiki.py also needs to check 
`isCategory`.
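
  To make the category problem concrete, here is a rough, self-contained 
approximation of the behaviour described above: strip interwiki and category 
links, then treat fewer than 4 remaining characters as empty. The regexes and 
the names `is_empty_approx`, `LANG_LINK` and `CATEGORY_LINK` are my own 
stand-ins, not pywikibot code; the real method relies on the site's own 
language and namespace data.

```python
import re

# Hypothetical stand-ins: real link detection depends on the site's
# configured language codes and category namespace aliases.
LANG_LINK = re.compile(r"\[\[[a-z]{2,3}(?:-[a-z]+)*:[^\]\[]*\]\]")
CATEGORY_LINK = re.compile(r"\[\[\s*Category\s*:[^\]\[]*\]\]", re.IGNORECASE)


def is_empty_approx(text):
    """Approximate isEmpty: drop interwiki and category links, then count."""
    text = LANG_LINK.sub("", text)
    text = CATEGORY_LINK.sub("", text)
    # Fewer than 4 characters left over counts as 'empty'.
    return len(text.strip()) < 4


# A typical small-wiki category page: nothing but links.
category_page = "[[Category:Animals]]\n[[de:Kategorie:Tiere]]"
# A five-character template call is just long enough to count as content.
stub_article = "{{a}}"
```

  With these stand-ins, `is_empty_approx(category_page)` is true even though 
the page is a perfectly normal category, which is the case the extra 
`isCategory` check guards against.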
  
  My gut feeling is that this method should be copied to be a function in 
interwiki.py, where you can optimise it how you want, so that it skips useless 
pages using faster algorithms. For example, if the page is in a content 
namespace, rather than looking at interwiki links and category links, I am 
guessing that the following will be much faster and almost as good:
  
  ```
  def simpleEmptyCheck(page):
      try:
          # index 50 is the 51st character; raises IndexError if absent
          page.text[50]
          return False
      except IndexError:
          return True
  ```
  
  Then Page.isEmpty can be marked as `@deprecated`.
  
  The use in page_tests would then be moved into a new test method in 
TestPageDeprecation.
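
  A minimal sketch of what the deprecation and its test could look like, 
assuming nothing about pywikibot's own `deprecated` decorator: the 
`deprecated` helper and `FakePage` below are hypothetical stand-ins built only 
on the standard library's `warnings` and `unittest`, and the test class merely 
borrows the `TestPageDeprecation` name.

```python
import functools
import unittest
import warnings


def deprecated(func):
    """Hypothetical stand-in decorator: warn on every call, then delegate."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} is deprecated", DeprecationWarning, stacklevel=2
        )
        return func(*args, **kwargs)
    return wrapper


class FakePage:
    """Stand-in for Page; it only carries text."""

    def __init__(self, text):
        self.text = text

    @deprecated
    def isEmpty(self):
        # Simplified: the real method first strips interwiki and category links.
        return len(self.text) < 4


class TestPageDeprecation(unittest.TestCase):
    """Illustrative only; named after the test class mentioned above."""

    def test_is_empty_warns(self):
        # The call must still return a result and must emit the warning.
        with self.assertWarns(DeprecationWarning):
            self.assertTrue(FakePage("").isEmpty())
```

  The point of the sketch is only that the old method keeps working while 
callers are warned, and that the existing assertion moves under a check for 
the warning.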

TASK DETAIL
  https://phabricator.wikimedia.org/T112340

To: jayvdb
Cc: Aklapper, jayvdb, pywikibot-bugs-list


