Re: [Wikitech-l] Help needed with ParserCache::getKey() and ParserCache::getOptionsKey()

2013-12-11 Thread Daniel Kinzler
On 10.12.2013 22:38, Brad Jorsch (Anomie) wrote:
 Looking at the code, ParserCache::getOptionsKey() is used to get the
 memc key which has a list of parser option names actually used when
 parsing the page. So for example, if a page uses only math and
 thumbsize while being parsed, the value would be array( 'math',
 'thumbsize' ).

On 11.12.2013 02:35, Tim Starling wrote:
 No, the set of options which fragment the cache is the same for all
 users. So if the user language is included in that set of options,
 then users with different languages will get different parser cache
 objects.

Ah, right, thanks! Got myself confused there.

The thing is: we are changing what's in the list of relevant options. Before the
deployment, there was nothing in it, while with the new code, the user language
should be there. I suppose that means we need to purge these pointers.

Would bumping wgCacheEpoch be sufficient for that? Note that we don't care much
about purging the actual parser cache entries, we want to purge the pointer
entries in the cache.

 We just tried to enable the use of the parser cache for wikidata, and it failed,
 resulting in page content being shown in random languages.
 
 That's probably because you incorrectly used $wgLang or
 RequestContext::getLanguage(). The user language for the parser is the
 one you get from ParserOptions::getUserLangObj().

Oh, thanks for that hint! Seems our code is inconsistent about this, using the
language from the parser options in some places, the one from the context in
others. Need to fix that!

 It's not necessary to call ParserOutput::recordOption().
 ParserOptions::getUserLangObj() will call it for you (via
 onAccessCallback).

Oh great, magic hidden information flow :)

Thanks for the info, I'll get hacking on it!

-- daniel


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Help needed with ParserCache::getKey() and ParserCache::getOptionsKey()

2013-12-10 Thread Brad Jorsch (Anomie)
On Tue, Dec 10, 2013 at 4:22 PM, Daniel Kinzler dan...@brightbyte.de wrote:

 what is the intention behind the current implementation of
 ParserCache::getOptionsKey()? It's based on the page ID only, not taking into
 account any options.

Looking at the code, ParserCache::getOptionsKey() is used to get the
memc key which has a list of parser option names actually used when
parsing the page. So for example, if a page uses only math and
thumbsize while being parsed, the value would be array( 'math',
'thumbsize' ).

Then ParserOptions::optionsHash is used to construct a key
corresponding to the actual ParserOptions object, for storing the
actual parser output for that page+ParserOptions combination. In the
example above, it would only use the 'math' and 'thumbsize' options to
vary the key; users having the same 'math' and 'thumbsize' would get
the same cached parser output even if they have different options for
stubthreshold, dateformat, numberheadings, userlang, editsection, and
so on. This reduces cache fragmentation.
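
As a rough illustration of the two-level lookup described above, here is a
minimal toy model in Python. It is not MediaWiki code: the function names, the
key formats and the choice of hash are all assumptions made for the sketch.
Only the idea (one per-page entry listing the used option names, and an output
key that varies only by those options) comes from the description above.

```python
import hashlib

# Toy model only; key formats and function names are hypothetical.

def options_key(page_id):
    """Per-page key. Its cached value is the list of option names the parse used."""
    return f"toy-options:{page_id}"

def options_hash(options, used_names):
    """Hash only the options that were recorded as used during the parse."""
    parts = [f"{name}={options[name]}" for name in sorted(used_names)]
    return hashlib.sha1("!".join(parts).encode("utf-8")).hexdigest()

def output_key(page_id, options, used_names):
    """Key for the parser output of this page + relevant-options combination."""
    return f"toy-output:{page_id}:{options_hash(options, used_names)}"

# The page only consulted 'math' and 'thumbsize' while being parsed.
used = ["math", "thumbsize"]

alice = {"math": 1, "thumbsize": 2, "userlang": "de", "dateformat": "dmy"}
bob = {"math": 1, "thumbsize": 2, "userlang": "fr", "dateformat": "mdy"}
carol = {"math": 0, "thumbsize": 2, "userlang": "de", "dateformat": "dmy"}

same = output_key(42, alice, used) == output_key(42, bob, used)
differs = output_key(42, alice, used) != output_key(42, carol, used)
```

In this sketch alice and bob share one cache entry despite different userlang
and dateformat, while carol gets her own because 'math' differs. Adding
'userlang' to the used list would split alice and bob as well.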

I doubt that the ContentHandler is really going to need to override
getOptionsKey; the ParserOptions options used to parse the page really
shouldn't vary depending on user language or other stuff like that.


-- 
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation


Re: [Wikitech-l] Help needed with ParserCache::getKey() and ParserCache::getOptionsKey()

2013-12-10 Thread Tim Starling
On 11/12/13 08:22, Daniel Kinzler wrote:
 what is the intention behind the current implementation of
 ParserCache::getOptionsKey()? It's based on the page ID only, not taking into
 account any options. This seems to imply that all users share the same parser
 cache key, ignoring all options that may impact cached content. Is that
 correct/intended? 

No, the set of options which fragment the cache is the same for all
users. So if the user language is included in that set of options,
then users with different languages will get different parser cache
objects.

That is to say, the options key stores the list of options which vary
the cache. ParserOptions::optionsHash() uses this list to form a
parser output key (as in ParserCache::getParserOutputKey()) which is
specific to the actual options requested.

If the parser output varies by language for some users, and not
others, then you may possibly have a problem, but it doesn't sound
like that is what you are doing.

 We just tried to enable the use of the parser cache for wikidata, and it failed,
 resulting in page content being shown in random languages.

That's probably because you incorrectly used $wgLang or
RequestContext::getLanguage(). The user language for the parser is the
one you get from ParserOptions::getUserLangObj().

During page save, a default ParserOptions is used, with the default
user language, for the purposes of link table construction. The
ParserOutput thus generated will be saved into the ParserCache. So
it's not correct to use the context user language during parse; doing
so will pollute the parser cache.
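
To make the pollution scenario concrete, here is a small toy model in Python.
It is only a sketch under stated assumptions: the dictionary cache, the
function names and the "en" default language are invented for the example and
are not MediaWiki's implementation.

```python
# Toy model only; names and the "en" default are hypothetical.

def parse(text, options, context_lang, use_context):
    """Render text in a language taken from the context or from the options."""
    lang = context_lang if use_context else options["userlang"]
    return f"{text} ({lang})"

def save_page(cache, page_id, text, context_lang, use_context):
    """On save, parse with default options and store the result in the cache."""
    default_options = {"userlang": "en"}
    key = (page_id, default_options["userlang"])
    cache[key] = parse(text, default_options, context_lang, use_context)
    return key

polluted = {}
clean = {}

# A user whose interface language is German saves the page.
save_page(polluted, 1, "Q1", context_lang="de", use_context=True)
save_page(clean, 1, "Q1", context_lang="de", use_context=False)
```

With use_context=True the entry stored under the default-language key holds
German output, so later readers who hit that key see the wrong language. With
use_context=False the stored output matches the key.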

 I tried to split the parser cache by user language using
 ParserOutput::recordOption to include userlang in the cache key. When tested
 locally, and also on our test system, that seemed to work fine (which seems
 strange now, looking at the code of getOptionsKey()).

It's not necessary to call ParserOutput::recordOption().
ParserOptions::getUserLangObj() will call it for you (via
onAccessCallback).
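
The access-callback mechanism can be pictured with the following toy model in
Python. The class and method names are hypothetical stand-ins, not the real
ParserOptions or ParserOutput classes; the sketch only shows how reading an
option through an accessor can record that option as a side effect.

```python
# Toy model only; classes and methods are hypothetical stand-ins.

class ToyParserOutput:
    def __init__(self):
        self.used_options = set()

    def record_option(self, name):
        """Remember that the output depends on this option."""
        self.used_options.add(name)

class ToyParserOptions:
    def __init__(self, values, on_access=None):
        self._values = dict(values)
        self._on_access = on_access

    def get(self, name):
        """Return an option value and notify the callback that it was read."""
        if self._on_access is not None:
            self._on_access(name)
        return self._values[name]

output = ToyParserOutput()
options = ToyParserOptions(
    {"userlang": "de", "thumbsize": 2},
    on_access=output.record_option,
)

lang = options.get("userlang")  # reading the option records it on the output
```

After this runs, output.used_options contains only 'userlang'. A language read
from anywhere other than the options object would never reach the callback,
which is why such a read leaves the cache key unaware of the dependency.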

 ParserCache::getOptionsKey could delegate to ContentHandler::getOptionsKey,
 which could then be used to override the default behavior. Would that be a
 sensible approach?

No.

-- Tim Starling

