[Wikidata-bugs] [Maniphest] [Commented On] T253125: Federated Properties, ApiEntityLookup, trying to get data that was not prefetched.

2020-05-25 Thread Addshore
Addshore added a comment.


  I dug into some of this code a little bit more and put some findings below.
  
  I believe that rather than use and abuse this entity info thing, we should 
create our own new things specifically for prefetching referenced entities if 
needed.
  
  EntityInfo as a concept is not currently intentionally used and should be 
removed.
  
  **What is EntityInfo, why does it exist and is it still used?**
  
  - EntityInfo was introduced back in the day to "prefetch" entity info needed 
for rendering a page, this includes terms needed for rendering a page
  - The EntityInfo builder currently uses the terms storage tables, and during 
the rollout of wb_terms towards the end another layer of caching needed to be 
added in front of it 
https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/546659/
  - When the "formattercache" was introduced in 
https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/441010/30/lib/includes/Formatters/WikibaseValueFormatterBuilders.php
 the use of `labelDescriptionLookupFactory->getLabelDescriptionLookup` was 
removed, which would have use EntityInfo based term lookups for formatting 
entity ID values. The new behaviour instead introduced a caching lookup that 
always used a EntityRetrievingTermLookup
  - Despite not using the entityinfo, it hasn't since been cleaned up / removed 
(I seem to remember there being a ticket relating to this? but I can't find it)
  
  As a result of the above, one of the first things that happens when rendering 
an entity is getting the entity info, which is then not used at all during 
rendering, as this is all now done via the formatted cache things.
  
  **Some other interesting observations (hinting toward removing entity info):**
  
  - `FormatterLabelDescriptionLookupFactory::OPT_LABEL_DESCRIPTION_LOOKUP` is 
likely fairly pointless now, as this option is not looked at (see above) when 
creating the formatted to be used for most things. However
- `WikibaseValueFormatterBuilders::newItemPropertyIdHtmlLinkFormatter` uses 
the formatted cache, and creates its own lookup
- Other methods in this class still call 
`labelDescriptionLookupFactory->getLabelDescriptionLookup` which includes a 
check for OPT_LABEL_DESCRIPTION_LOOKUP These include:
  - newPlainEntityIdFormatter (seems it might need terms? it is a plain 
formatter that is used in newEntityIdFormatter (see next line)
  - newEntityIdFormatter
  - getVocabularyUriFormatter
- However adding breakpoints locally to the EntityInfoTermLookup class 
methods didn't result in being hit
- Adding a breakpoint to the OPT_LABEL_DESCRIPTION_LOOKUP case in 
`getLabelDescriptionLookup` resulted in hits for:
  - getVocabularyUriFormatter (so this code path still gets the terminfo 
lookup specified)
  - newLabelsProviderEntityIdHtmlLinkFormatter (we just touched / are 
touching this in fed prop) (it is called by newEntityIdFormatter)
- All calls to this should be able to use the formatter cache and we remove 
EntityInfo
  
  **Some bookmarks of the code areas I was looking at throughout this**
  F31841798: image.png 
  
  **Impact of removal**
  
  The current rate of requests to this EntityInfo builder stuff: 
https://grafana.wikimedia.org/d/u5wAugyik/wikibase-statsdrecordingsimplecache?panelId=2&fullscreen&orgId=1&from=now-1h&to=now
  The rate of requests for EntityInfo is essentially the same as the # of 
parser output generations for wikibase data.
  Looking at the cache metrics
  In a 12 hour period this is ~120million calls, so 10 million an hour, or 166k 
per min or ~2700 calls per second
  The cache hits come from local server cache, but the misses result in lookups 
from the terms storage db tables.
  
  We also track the db calls for this code path that may not be needed on 
https://grafana.wikimedia.org/d/00548/wikibase-sql-term-storage?orgId=1&refresh=30s
  You can see these in the top panel for the 
"select.EntityInfoBuilder_collectTermsForEntities" metric.
  Averaging 4.7k ops a second/minute (TBA confirm this??)
  
  Needless to say that not doing all of this stuff that is not needed should 
speed things up and reduce memory consumption.
  It will also remove an old outdated concept that should have been killed some 
while ago.

TASK DETAIL
  https://phabricator.wikimedia.org/T253125

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Addshore
Cc: RhinosF1, Aklapper, Addshore, Jimfhahn, darthmon_wmde, Nandana, lucamauri, 
Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T253125: Federated Properties, ApiEntityLookup, trying to get data that was not prefetched.

2020-05-25 Thread Addshore
Addshore added a comment.


  So we did a task break down related to this ticket (although the immediate 
issue described in this ticket has already been fixed).
  
  The rough breakdown below should move the prefetching of data for federated 
properties to before "any" (perhaps) data is used.
  
  - Move the code from SingleEntitySourceServices::getEntityInfoBuilder into 
the entity type wiring files
  - Alter FullEntityParserOutput::getParserOutput to call getEntityInfo earlier 
and always (Need to think and check this is okay...) (See Fig1)
- Need to figure out what makes the calls to get parser output without 
getting the HTML?
- And how much this code path is called
- And what the alter in logic might do?
- TODO investigate what this entity info lookup is still used for etc... 
(Might be leftover from https://phabricator.wikimedia.org/project/view/3407/ ? )
  - Implement an EntityInfoBuilder for federated properties that prefetches 
$entityIds using the ApiEntityLookup
  - Override the EntityInfoBuilder for federated properties to do the 
prefetching
  
  Fig1.
  F31841545: image.png 
  
  The prefetching for other cases such as the history page should already be 
covered in 
LanguageFallbackLabelDescriptionLookupFactory::newLabelDescriptionLookup which 
does a `prefetchTerms` using a `termBuffer`.
  However looking at ItemHandler::getActionOverrides, the 
newLabelDescriptionLookup call which would allow the prefetching currently 
doesn't pass in any entity ids so maybe this doesn't happen?
  TODO figure this out? This probably doesn't happen on Wikidata either that 
means? but should it? because of the now used terms caching stuff?

TASK DETAIL
  https://phabricator.wikimedia.org/T253125

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Addshore
Cc: RhinosF1, Aklapper, Addshore, Jimfhahn, darthmon_wmde, Nandana, lucamauri, 
Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T253125: Federated Properties, ApiEntityLookup, trying to get data that was not prefetched.

2020-05-19 Thread gerritbot
gerritbot added a comment.


  Change 597297 **merged** by jenkins-bot:
  [mediawiki/extensions/Wikibase@master] FP: ApiEntityLookup fetch and log when 
not already prefetched
  
  https://gerrit.wikimedia.org/r/597297

TASK DETAIL
  https://phabricator.wikimedia.org/T253125

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Addshore, gerritbot
Cc: RhinosF1, Aklapper, Addshore, Blissjay007, Jimfhahn, Oblanco79, 
Alter-paule, Beast1978, Un1tY, Hook696, Daryl-TTMG, RomaAmorRoma, E.S.A-Sheild, 
darthmon_wmde, Kent7301, Meekrab2012, joker88john, CucyNoiD, Nandana, 
NebulousIris, Gaboe420, lucamauri, Versusxo, Majesticalreaper22, Giuliamocci, 
Adrian1985, Cpaulf30, Lahi, Gq86, Af420, Darkminds3113, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
LawExplorer, WSH1906, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Lydia_Pintscher, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs