Anomie added subscribers: ori, Anomie.
Anomie added a project: Performance.
Anomie added a comment.

After digging into this issue, I found that Arabic-language wikis are slower 
than other languages because LanguageAr::normalize() is relatively slow, and 
that cost is multiplied by almost 100000 calls.

Digging a little deeper: on PHP 5.5.9-1ubuntu4.11, using FastStringSearch 
instead of strtr for ReplacementArray is a fair bit faster for the replacement 
array from serialized/normalize-ar.ser. I can't compare the speeds in HHVM due 
to https://phabricator.wikimedia.org/T101418. The change in 
https://phabricator.wikimedia.org/rMWbdb17a79a4bc8ed8bb65936d96e423d272e91583 
landed shortly before this task was opened and is almost certainly the 
immediate cause.

@ori: This sounds like something you'd want to look into. My most-reduced test 
cases are:

  $data = unserialize( file_get_contents(
      "/srv/mediawiki/php-1.26wmf21/serialized/normalize-ar.ser"
  ) );
  $fss = fss_prep_replace( $data );
  for ( $i = 0; $i < 1000; $i++ ) {
      fss_exec_replace( $fss, "foo" );
  }

versus

  $data = unserialize( file_get_contents(
      "/srv/mediawiki/php-1.26wmf21/serialized/normalize-ar.ser"
  ) );
  for ( $i = 0; $i < 1000; $i++ ) {
      strtr( "foo", $data );
  }

The former takes about 0.034 seconds while the latter takes 0.422 seconds when 
run with `time php5` on mw1017. Increasing the number of iterations to 100000 
(which is about where the API query here is at), FSS goes to 0.074s while strtr 
jumps to over 20s. HHVM's behavior with strtr is in line with Zend PHP's.
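
For anyone who wants to reproduce this without access to mw1017, here is a rough sketch that times only the loop with microtime(), so interpreter startup and the unserialize() are left out of the numbers. The `timeLoop()` helper and the synthetic `$data` table are my own stand-ins (assumptions), not the real normalize-ar.ser contents, so absolute timings will differ from the ones above. The FSS branch only runs if the FastStringSearch extension is loaded.

```php
<?php
// Hypothetical helper: run $fn $iterations times and return elapsed seconds.
function timeLoop( callable $fn, $iterations ) {
    $start = microtime( true );
    for ( $i = 0; $i < $iterations; $i++ ) {
        $fn();
    }
    return microtime( true ) - $start;
}

// Synthetic replacement table (assumption: a stand-in for normalize-ar.ser).
$data = array();
for ( $i = 0; $i < 500; $i++ ) {
    $data["k$i"] = "v$i";
}

// strtr receives the raw array on every call.
$strtrTime = timeLoop( function () use ( $data ) {
    strtr( "foo", $data );
}, 1000 );
printf( "strtr: %.4fs\n", $strtrTime );

// FSS is prepared once, outside the timed loop, if the extension is available.
if ( function_exists( 'fss_prep_replace' ) ) {
    $fss = fss_prep_replace( $data );
    $fssTime = timeLoop( function () use ( $fss ) {
        fss_exec_replace( $fss, "foo" );
    }, 1000 );
    printf( "FSS:   %.4fs\n", $fssTime );
}
```

Swapping the synthetic table for the unserialized normalize-ar.ser data should give numbers comparable to the ones quoted here, minus process startup.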

The advantage for FSS seems to come mainly from being able to call 
`fss_prep_replace()` once, whereas strtr (presumably) has to do the equivalent 
on every iteration. Moving the `fss_prep_replace()` call inside the loop brings 
the FSS version up to around 18s for 100000 iterations.
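
The prepare-once pattern described above could look something like the following. This is only a sketch under my own assumptions: `PreparedReplacer` is a hypothetical class for illustration, not MediaWiki's ReplacementArray, and it falls back to strtr when FastStringSearch is not loaded.

```php
<?php
// Hypothetical wrapper: prepares the replacement table once in the
// constructor and reuses the prepared handle for every replace() call.
class PreparedReplacer {
    private $data;
    private $fss = null;

    public function __construct( array $data ) {
        $this->data = $data;
        // Do the expensive preparation once, if FastStringSearch is loaded.
        if ( function_exists( 'fss_prep_replace' ) ) {
            $this->fss = fss_prep_replace( $data );
        }
    }

    public function replace( $subject ) {
        if ( $this->fss !== null ) {
            return fss_exec_replace( $this->fss, $subject );
        }
        // Fallback: strtr is handed the raw array on every call.
        return strtr( $subject, $this->data );
    }
}

// Usage: construct once, then call replace() as often as needed.
$replacer = new PreparedReplacer( array( 'foo' => 'bar', 'baz' => 'qux' ) );
for ( $i = 0; $i < 3; $i++ ) {
    $out = $replacer->replace( "foo baz" );
}
echo $out, "\n";
```

The point of keeping the handle on the object is that the cost of preparing the table is paid per replacement array rather than per normalize() call.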


TASK DETAIL
  https://phabricator.wikimedia.org/T111479

To: Anomie
Cc: Anomie, ori, gerritbot, Aklapper, jayvdb, pywikibot-bugs-list, GWicke, 
GPHemsley, Darkdadaah, Krenair, Legoktm
