[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-08-04 Thread Multichill
Multichill added a comment. @Smalyshev / @debt: I think this is one of those tasks where we have a bit of a misunderstanding about scope (see https://lists.wikimedia.org/pipermail/wikidata/2018-August/012282.html ). Close this one as resolved and make clearly scoped follow-up tasks to untangle

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-30 Thread Esc3300
Esc3300 added a comment. I don't see clear disadvantages of doing the indexing Multichill suggests. I don't see any mentioned here either, besides not indexing some specific ones (page number, e.g.). Compared to PubMed article titles, it seems at least as useful.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-23 Thread Stashbot
Stashbot added a comment. Mentioned in SAL (#wikimedia-operations) [2018-05-23T16:31:01Z] starting wikidata full reindex for T163642. TASK DETAIL: https://phabricator.wikimedia.org/T163642

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-10 Thread Multichill
Multichill added a comment. And https://www.wikidata.org/w/index.php?search=haswbstatement%3AP217%3DSK-C-5 works :-). https://www.wikidata.org/w/index.php?search="SK-C-5" doesn't work (yet?). Is that the next step?
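For reference, a minimal sketch of how the working search URL above can be assembled. The helper name `build_haswbstatement_url` is made up for illustration; only the `haswbstatement:P217=SK-C-5` keyword syntax and the `index.php?search=` endpoint are taken from the comment.

```python
from urllib.parse import urlencode

WIKIDATA_INDEX = "https://www.wikidata.org/w/index.php"


def build_haswbstatement_url(property_id: str, value: str) -> str:
    """Return a search URL for items that have the statement property_id=value.

    Hypothetical helper: it only percent-encodes the keyword query,
    it does not contact Wikidata.
    """
    query = f"haswbstatement:{property_id}={value}"
    # urlencode percent-encodes ":" and "=" inside the query value
    return f"{WIKIDATA_INDEX}?{urlencode({'search': query})}"


print(build_haswbstatement_url("P217", "SK-C-5"))
```

The printed URL should match the first link in the comment; the plain `"SK-C-5"` search is the case reported as not working yet.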

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-10 Thread Smalyshev
Smalyshev added a comment. @Multichill check out https://www.wikidata.org/w/index.php?title=Q219831 - it has the data now.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-09 Thread Smalyshev
Smalyshev added a comment. Yes, edit should show it.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-09 Thread Lea_Lacroix_WMDE
Lea_Lacroix_WMDE added a comment. Great!

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-09 Thread Smalyshev
Smalyshev added a comment. I'll note here when the reindex is done, and then I guess you can announce :) In the meantime I can check that everything works smoothly with edited entries.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread Lea_Lacroix_WMDE
Lea_Lacroix_WMDE added a comment. I have no deadline in mind, I was just wondering when to announce it, and whether you or I should do it :)

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread Stashbot
Stashbot added a comment. Mentioned in SAL (#wikimedia-operations) [2018-05-08T23:22:45Z] Synchronized wmf-config: SWAT: [[gerrit:431994|Add string and external-id types to Wikibase indexing]] T163642 T99899 (duration: 01m 26s)

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread Smalyshev
Smalyshev added a comment. @Lea_Lacroix_WMDE Also, for newly edited items it should be working as soon as wmf.3 is deployed. But older items will need a reindex.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread gerritbot
gerritbot added a comment. Change 431994 merged by jenkins-bot: [operations/mediawiki-config@master] Add string and external-id types to Wikibase indexing https://gerrit.wikimedia.org/r/431994

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread gerritbot
gerritbot added a comment. Change 431994 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [operations/mediawiki-config@master] Add string and external-id types to indexing https://gerrit.wikimedia.org/r/431994

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread Smalyshev
Smalyshev added a comment. Also, right now we can only locate by haswbstatement:P123=SK-C-5. If we want to index data without attached property IDs, we need to add a different field and analyzer to do that. Should we do it?
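To illustrate the distinction raised here, a toy in-memory sketch of a property-keyed lookup versus a bare-value lookup. This is not how the search backend is implemented; the function name, the dictionary layout, and the sample rows are all assumptions made up for the example.

```python
from collections import defaultdict


def build_indexes(statements):
    """Build two toy lookup tables from (item_id, property_id, value) tuples.

    keyed: "P217=SK-C-5" -> item IDs (what haswbstatement-style lookup needs)
    bare:  "SK-C-5"      -> item IDs (value searchable without a property ID)
    """
    keyed = defaultdict(set)
    bare = defaultdict(set)
    for item_id, property_id, value in statements:
        keyed[f"{property_id}={value}"].add(item_id)
        bare[value].add(item_id)
    return keyed, bare


# Sample rows for illustration only, not a claim about real Wikidata content.
sample = [
    ("Q219831", "P217", "SK-C-5"),
    ("Q999999999", "P528", "SK-C-5"),  # hypothetical second item
]
keyed, bare = build_indexes(sample)

print(sorted(keyed["P217=SK-C-5"]))  # lookup that requires the property ID
print(sorted(bare["SK-C-5"]))        # lookup by value alone
```

The second table is the extra field the comment refers to: it answers value-only queries, at the cost of storing every value a second time and mixing hits from unrelated properties.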

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread Smalyshev
Smalyshev added a comment. @Lea_Lacroix_WMDE we need to make configs that enable indexing (will be done next thing) and then we need to actually reindex. Reindexing takes several days, so I planned to do it immediately after the Hackathon, unless you need it sooner.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread Lea_Lacroix_WMDE
Lea_Lacroix_WMDE added a comment. Hey @Smalyshev, when is this going to be live?

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-08 Thread gerritbot
gerritbot added a comment. Change 430277 merged by jenkins-bot: [mediawiki/extensions/Wikibase@master] Add capability to exclude properties from by-type index https://gerrit.wikimedia.org/r/430277
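A rough sketch of what "index by type, with an exclusion list" could look like as a decision rule. The function and parameter names are hypothetical and do not reflect the actual Wikibase settings or the merged change; only the `string` and `external-id` type names appear in this thread.

```python
def should_index(property_id, datatype, indexed_types, excluded_properties):
    """Return True if statements of this property belong in the by-type index.

    Hypothetical rule: the property's datatype must be enabled for indexing
    and the property itself must not be on the exclusion list.
    """
    return datatype in indexed_types and property_id not in excluded_properties


indexed_types = {"string", "external-id"}
excluded_properties = {"P9999999"}  # placeholder ID, not a real exclusion

print(should_index("P217", "string", indexed_types, excluded_properties))
print(should_index("P9999999", "string", indexed_types, excluded_properties))
print(should_index("P18", "commonsMedia", indexed_types, excluded_properties))
```

An exclusion list of this kind lets a whole datatype be switched on while still leaving out individual high-volume or low-value properties.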

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-05-01 Thread gerritbot
gerritbot added a comment. Change 430277 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [mediawiki/extensions/Wikibase@master] Add capability to exclude properties from by-type index https://gerrit.wikimedia.org/r/430277

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-26 Thread Lydia_Pintscher
Lydia_Pintscher added a comment. Yeah let's leave them out for now.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-25 Thread Smalyshev
Smalyshev added a comment. There is a size limit on string values already. I don't remember the exact limit right now. Or are you looking for something else? I was thinking about shorter limit - not sure it makes sense to look up something by whole SPARQL query... but maybe we should just exclude

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-25 Thread Lydia_Pintscher
Lydia_Pintscher added a comment. In T163642#4155642, @Smalyshev wrote: OK, looking at current usage, there are only 21 string properties with more than 100K values. Looking at them in particular, the interesting ones are: HomoloGene ID (P593) - probably should be external ID. There are more like

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-24 Thread Smalyshev
Smalyshev added a comment. OK, looking at current usage, there are only 21 string properties with more than 100K values. Looking at them in particular, the interesting ones are: HomoloGene ID (P593) - probably should be external ID. There are more like this, with less usage. Over a million

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-24 Thread Lydia_Pintscher
Lydia_Pintscher added a comment. In T163642#411, @Smalyshev wrote: "Also just had a thought - this does not cover qualifiers and references of course. Do we want anything there or that is already WDQS domain?" I'd say for now let's leave them out.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-24 Thread Smalyshev
Smalyshev added a comment. Hmm 228 is not that bad... Let me see if I can get some usage stats. Also just had a thought - this does not cover qualifiers and references of course. Do we want anything there or that is already WDQS domain?

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-24 Thread Lydia_Pintscher
Lydia_Pintscher added a comment. Don't have a good answer but https://www.wikidata.org/w/index.php?title=Special:ListProperties/string=500=0 has a list of all of the current string properties.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-24 Thread Smalyshev
Smalyshev added a comment. OK, so outside of external IDs covered by T99899: [Story] Looking up entities by external identifiers, which string properties do we want to add to the index? I am still concerned all of them might be too much, but ready to hear other opinions.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2018-04-11 Thread Multichill
Multichill added a comment. The VIAF part is probably covered by T99899.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2017-12-20 Thread Multichill
Multichill added a comment. Both "P217:ГЭ-3836" or just "ГЭ-3836" would be great.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2017-12-19 Thread Smalyshev
Smalyshev added a comment. @Multichill just to be sure, if you could search for P217:ГЭ-3836, with this syntax, it would be fine? We may need to do some infrastructure work before this works properly, but it seems not too hard to implement.

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2017-12-19 Thread Multichill
Multichill added a comment. @Smalyshev coming back to the strings. It's just like Commons. I don't use the local search. I use Google. I noticed https://www.wikidata.org/w/index.php?title=Q45962939 and I'm pretty sure it's a duplicate. The item has an image with the link to the source and the

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2017-12-19 Thread Smalyshev
Smalyshev added a comment. That would require indexing the external identifiers with the property. I think it should be possible. The main question that remains is - do we want to search per-property (e.g. P214:1234 for VIAF ID 1234 specifically) or just something like externalid:1234 which would

[Wikidata-bugs] [Maniphest] [Commented On] T163642: Index Wikidata strings in statements in the search engine

2017-04-27 Thread Multichill
Multichill added a comment. In T163642#3218412, @debt wrote: This looks to be more wikidata than discovery search task at this time. "The Discovery Department of Wikimedia Engineering has the mission to make the wealth of knowledge and content in the Wikimedia projects easily discoverable. "