Xqt created this task.
Xqt added projects: Pywikibot, Performance Issue.
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper.

TASK DESCRIPTION
  **Feature summary, use case**:
  
  We have had asynchronous put since Pywikibot 1.0 (compat). It shortens a 
script's execution time a bit because the script keeps working during the put 
throttle waiting time. This is useful if the processing time for a single page 
is very long compared to putting the page, where the put throttle would 
otherwise add to the total time.
  
  Instead of asynchronous put, parallel processing of pages retrieved from 
generators (run through BasePage.generator property for example) should be 
implemented, e.g. as an additional option.
  
  **Possible implementations**
  
  - concurrent.futures
  - threading
  - Awaitables
  
  
  https://docs.python.org/3/library/concurrency.html?highlight=concurrent%20future
  https://docs.python.org/3/library/asyncio-task.html
  https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.tools.html#pywikibot.tools.ThreadedGenerator
  https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.tools.html#pywikibot.tools.ThreadList
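  
  A minimal sketch of the `concurrent.futures` option, using only the standard 
library. `treat_page`, `run_parallel` and the string "pages" are hypothetical 
stand-ins for illustration, not Pywikibot API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def treat_page(title):
    """Hypothetical stand-in for the per-page work of a bot."""
    return title.upper()


def run_parallel(generator, max_workers=4):
    """Process every item of the generator in a pool of worker threads."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # The generator is consumed here, in the calling thread only.
        futures = {executor.submit(treat_page, item): item
                   for item in generator}
        for future in as_completed(futures):
            # result() re-raises any exception raised inside the task.
            results[futures[future]] = future.result()
    return results


pages = (f'Page {i}' for i in range(5))  # stand-in for a page generator
processed = run_parallel(pages)
```

  This sketch submits all items up front, so a very long generator would need 
some form of bounded submission instead.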
  
  **Benefits**
  
  - ~10 times faster than serial processing of generators
  - async put is not necessary in most cases because other threads are still 
working during put throttle wait cycles
    - Any exception handling is inside the processed task
    - `stopme()` no longer has to be called to wait for the put queue to 
finish; with async put, skipping that call can leave some callbacks 
unavailable after the Bot class has been left.
    - T104809 <https://phabricator.wikimedia.org/T104809> can probably be solved
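  
  The points above could look roughly like this. It is a sketch under 
assumptions: `treat_and_save` is a hypothetical task, and `THROTTLE_DELAY` 
with `time.sleep` only imitates a put throttle wait. None of it is Pywikibot 
code.

```python
import time
from concurrent.futures import ThreadPoolExecutor

THROTTLE_DELAY = 0.05  # assumed stand-in for the put throttle wait (seconds)


def treat_and_save(title):
    """Hypothetical task: process a page, then wait like a throttled put."""
    try:
        if not title:
            raise ValueError('empty title')
        time.sleep(THROTTLE_DELAY)  # other threads keep working meanwhile
        return title, 'saved'
    except ValueError as error:
        # The error is handled inside the task itself.
        return title, f'skipped: {error}'


titles = ['A', 'B', '', 'C']
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    outcome = dict(executor.map(treat_and_save, titles))
# Leaving the with block waits for all tasks, so no separate
# "wait for the queue" call is needed afterwards.
elapsed = time.perf_counter() - start
```

  Comparing `elapsed` with `len(titles) * THROTTLE_DELAY` gives a rough idea 
of how much of the waiting time overlaps.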
  
  **Known problems**
  
  - Logging 
<https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.html#module-pywikibot.logging>
 functions like `pywikibot.info` must be used either on entry or exit of the 
parallel task method/function, or their messages must start with the thread 
name/identifier. Otherwise the output is confusing because there is no context 
showing which page/generator item a message belongs to.
  - Any ui input function uses RLock 
<https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.tools.html#pywikibot.tools.RLock>
 to queue any other output; the queued output is flushed when the RLock is 
released. It must still be checked what happens if one ui input is waiting and 
another task also calls a ui input method. It should work because any stream 
output is made after the blocking lock 
<https://doc.wikimedia.org/pywikibot/master/_modules/pywikibot/userinterfaces/terminal_interface_base.html#UI.input>
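  
  One way to give each message its context, sketched with the standard 
`logging` module rather than `pywikibot.logging`: the formatter prefixes every 
message with the thread name. `ListHandler`, `treat_page` and the logger name 
are hypothetical; how this maps onto Pywikibot's own logging layer is not 
covered here.

```python
import logging
from concurrent.futures import ThreadPoolExecutor


class ListHandler(logging.Handler):
    """Collect formatted records in a list (for illustration only)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


handler = ListHandler()
# %(threadName)s is a standard LogRecord attribute of the logging module.
handler.setFormatter(logging.Formatter('[%(threadName)s] %(message)s'))
logger = logging.getLogger('parallel-sketch')
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def treat_page(title):
    """Hypothetical task that logs once on exit and names its item."""
    logger.info('%s: done', title)


with ThreadPoolExecutor(max_workers=2,
                        thread_name_prefix='worker') as executor:
    list(executor.map(treat_page, ['A', 'B', 'C']))
```

  Each collected line then carries both the worker thread and the item, so 
interleaved output stays attributable.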
  
  **Good to know**
  
  - `comms.threadedhttp` was dropped with the switch from httplib2 to requests 
and was never adapted
  - See T57889 <https://phabricator.wikimedia.org/T57889>

TASK DETAIL
  https://phabricator.wikimedia.org/T314121

To: Xqt
Cc: Aklapper, Xqt, pywikibot-bugs-list, PotsdamLamb, Jyoo1011, JohnsonLee01, 
SHEKH, Dijkstra, Khutuck, Zkhalido, Viztor, Wenyi, Tbscho, MayS, Vali.matei, 
Mdupont, JJMC89, Dvorapa, Altostratus, Avicennasis, mys_721tx, Dinoguy1000, 
jayvdb, Masti, Alchimista, Jay8g