Re: [Wikitech-l] Ifexists across wikis

2015-12-07 Thread Stas Malyshev
Hi!

> I don't think there is a way to get a database name from an interwiki
> prefix.

Not a good or easy way, AFAIK. I looked into it recently, and the current
code does it with a lot of ad-hoc logic, external configs, hard-coded
configs, and special cases. I think this ticket:
https://phabricator.wikimedia.org/T113034
aims to improve it.

-- 
Stas Malyshev
smalys...@wikimedia.org

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Ifexists across wikis

2015-12-07 Thread MZMcBride
Bartosz Dziewoński wrote:
>It's also why #ifexist is expensive: it needs a separate database query
>for each time it's used, to check for a single page, because it's
>impossible to determine the list of pages to check in advance.

I'm not sure I understand the impossibility here.

When the expensive parser function count feature was added, I remember
this issue being discussed and my memory is that it seemed possible to
batch the ifexist lookups in a similar way to how we batch regular
internal link lookups against the pagelinks table, but nobody was
interested in implementing it at the time.

If the wikitext is parsed/evaluated on page save, I don't see why ifexist
lookups would be impossible to batch. We're already using the pagelinks
table for the ifexist functionality to properly work, as I understand it
(cf. ).

MZMcBride



Re: [Wikitech-l] Ifexists across wikis

2015-12-07 Thread Federico Leva (Nemo)
Italian projects would also like such a feature, especially for 
(semi)automatic creation of interproject links.

https://it.wikipedia.org/wiki/Discussioni_template:Interprogetto#Interprogetto_a_wikt:_quando_metterlo.3F

(By the way, the lack of Wiktionary on Wikidata even for interwiki links 
is extremely detrimental for a huge pile of things.)


Nemo

[Wikitech-l] Ifexists across wikis

2015-12-06 Thread Lars Aronsson

If I write a [[link]] it will be blue if the page exists and red otherwise.
But if I write [[:sw:link]] that will be an external or cross-wiki link,
that is never red, as if it were impossible to know whether that page
existed in Swahili Wikipedia.

But determining the existence of a page is just a quick database table
lookup, and all databases run on WMF's servers, so it shouldn't be more
expensive to look up a cross-wiki link, as long as it is one of WMF's wikis.

In Wiktionary, it is common to link to entries in foreign languages both
on the local wiki and to the native wiki for that language. For example,
in English Wiktionary the entry for "blue" links to the Swahili word "bluu"
both on en.wiktionary and on sw.wiktionary, using the template 
{{t+|sw|bluu}}.


https://en.wiktionary.org/wiki/blue#Translations

But since the Afrikaans translation "blou" doesn't have an entry on the
Afrikaans Wiktionary, another template is used: {{t|af|blou}}. And it is
a pain to know which one of these two templates to use. If it were possible
in {{#ifexist}} to determine the existence of a page in another wiki,
only one template would be needed, and the bot job to change to the right
template would not be needed.

#ifexist already works across namespaces (well, of course), so is there any
good reason it shouldn't work across wikis?

Oddly, the documentation says #ifexist is an "expensive" parser function.
That doesn't make much sense to me. It's as if red/blue links were
expensive, and most of our list pages should be banned.
https://www.mediawiki.org/wiki/Help:Extension:ParserFunctions#.23ifexist


--
  Lars Aronsson (l...@aronsson.se)
  Aronsson Datateknik - http://aronsson.se



Re: [Wikitech-l] Ifexists across wikis

2015-12-06 Thread Alex Monk
I don't think there is a way to get a database name from an interwiki
prefix.

Also, whether a page is known or not does not just depend on a simple
database lookup. Extensions can add arbitrary rules about which titles
should be considered known or not. EducationProgram, GlobalUserPage, and
WikimediaIncubator all do this.

Re: [Wikitech-l] Ifexists across wikis

2015-12-06 Thread Florian Schmidt
I'm not very familiar with this, but wouldn't this need a bigger change in
LinksUpdate? Put another way: how would a wiki know that a page was created
after it was linked, so that it can mark the link blue instead of red?

Re: [Wikitech-l] Ifexists across wikis

2015-12-06 Thread Bartosz Dziewoński

On 2015-12-06 17:26, Lars Aronsson wrote:

> If I write a [[link]] it will be blue if the page exists and red otherwise.
> But if I write [[:sw:link]] that will be an external or cross-wiki link,
> that is never red, as if it were impossible to know whether that page
> existed in Swahili Wikipedia.
>
> But determining the existence of a page is just a quick database table
> lookup, and all databases run on WMF's servers, so it shouldn't be more
> expensive to look up a cross-wiki link, as long as it is one of WMF's
> wikis.
>
> (...)
>
> #ifexist already works across namespaces (well, of course), so is there any
> good reason it shouldn't work across wikis?
>
> Oddly, the documentation says #ifexist is an "expensive" parser function.
> That doesn't make much sense to me. It's as if red/blue links were
> expensive, and most of our list pages should be banned.
> https://www.mediawiki.org/wiki/Help:Extension:ParserFunctions#.23ifexist


To add to what Alex and Florian said, the simple database lookup to 
check page existence is not actually that simple. When parsing a page, 
the query to determine link color (and to mark links to non-existent, 
redirect or disambig pages) is done in batches of 1000 links, after the 
whole page has been parsed and we know all the pages it links to.

Special pages that have lists of links use a similar method.

This wouldn't be possible if we needed to query a different database for 
each link (at best, perhaps we could batch them per-database, which 
doesn't help the Wiktionary use case of links to various sites).


It's also why #ifexist is expensive: it needs a separate database query 
for each time it's used, to check for a single page, because it's 
impossible to determine the list of pages to check in advance.


--
Bartosz Dziewoński
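
To make the cost difference concrete, here is a minimal sketch of the two
lookup styles described in this message: one query per title (how #ifexist
is said to behave) versus one query per batch of titles (how link colouring
is said to behave). It is not MediaWiki code. The single-column `page` table,
the function names, and the in-memory SQLite database are all assumptions
made for illustration; only the batch size of 1000 comes from the thread.

```python
import sqlite3
from typing import Dict, Iterable, List

BATCH_SIZE = 1000  # batch size mentioned in the thread


def make_demo_db(existing_titles: Iterable[str]) -> sqlite3.Connection:
    """Create an in-memory stand-in for a wiki's page table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (page_title TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO page VALUES (?)", [(t,) for t in existing_titles])
    return conn


def exists_one(conn: sqlite3.Connection, title: str) -> bool:
    """One query per title: the per-call pattern."""
    row = conn.execute(
        "SELECT 1 FROM page WHERE page_title = ?", (title,)
    ).fetchone()
    return row is not None


def exists_batched(
    conn: sqlite3.Connection, titles: Iterable[str], batch_size: int = BATCH_SIZE
) -> Dict[str, bool]:
    """One query per batch: possible only once all titles are known."""
    unique: List[str] = list(dict.fromkeys(titles))  # de-duplicate, keep order
    result = {title: False for title in unique}
    for start in range(0, len(unique), batch_size):
        chunk = unique[start:start + batch_size]
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            f"SELECT page_title FROM page WHERE page_title IN ({placeholders})",
            chunk,
        )
        for (title,) in rows:
            result[title] = True
    return result


if __name__ == "__main__":
    conn = make_demo_db(["Blue", "Bluu"])
    # Small batch size for the demo; some SQLite builds limit bound parameters.
    print(exists_batched(conn, ["Blue", "Blou", "Bluu"], batch_size=2))
```

The batched version needs the full list of titles up front, which is the
constraint discussed above. It also talks to a single database, so links that
point at many different wikis would need at least one query per target wiki.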

Re: [Wikitech-l] Ifexists across wikis

2015-12-06 Thread Purodha Blissenbach

How about using the API on the target side?
Purodha
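
As a rough sketch of what such an API check could look like: the MediaWiki
action API accepts `action=query` with a pipe-separated `titles` parameter,
and in the default JSON format it marks non-existent pages with a `missing`
key. The helper names, the target URL, and the sample response below are
assumptions for illustration. The sample is hand-written, not captured from a
live wiki, and no network request is made.

```python
import json
from typing import Dict, Iterable
from urllib.parse import urlencode


def build_query_url(api_base: str, titles: Iterable[str]) -> str:
    """Build an action=query URL that asks about several titles at once."""
    params = {"action": "query", "format": "json", "titles": "|".join(titles)}
    return f"{api_base}?{urlencode(params)}"


def parse_existence(response_text: str) -> Dict[str, bool]:
    """Map each returned title to True if the page exists, False if missing.

    Titles may come back normalized, so keys can differ from what was sent.
    """
    pages = json.loads(response_text)["query"]["pages"]
    return {page["title"]: "missing" not in page for page in pages.values()}


# Hand-written sample in the shape of the default JSON format (hypothetical IDs).
SAMPLE_RESPONSE = json.dumps({
    "query": {
        "pages": {
            "-1": {"ns": 0, "title": "Blou", "missing": ""},
            "123": {"pageid": 123, "ns": 0, "title": "Bluu"},
        }
    }
})

if __name__ == "__main__":
    print(build_query_url("https://sw.wiktionary.org/w/api.php", ["bluu", "blou"]))
    print(parse_existence(SAMPLE_RESPONSE))
```

One request can cover several titles on the same wiki, but each target wiki
still needs its own request, and the HTTP round trip would happen during
parsing unless the results are cached somewhere.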

Re: [Wikitech-l] Ifexists across wikis

2015-12-06 Thread John Erling Blad
Use Q-ids and get the links from Wikidata.

On Sun, Dec 6, 2015 at 10:49 PM, Purodha Blissenbach <
puro...@blissenbach.org> wrote:

> How about using the API on the target side?
> Purodha

Re: [Wikitech-l] Ifexists across wikis

2015-12-06 Thread Tim Starling
On 07/12/15 06:29, Bartosz Dziewoński wrote:
> To add to what Alex and Florian said, the simple database lookup to
> check page existence is not actually that simple. When parsing a page,
> the query to determine link color (and to mark links to non-existent,
> redirect or disambig pages) is done in batches of 1000 links, after
> the whole page has been parsed and we know all the pages it links to.
> Special pages that have lists of links use a similar method.

Also, when you make a red link, and then someone creates the page,
people expect the link to turn blue straight away. That's implemented
using the pagelinks table -- when a page is created, we use pagelinks
to find all pages with red links to that page, update all their
page_touched fields, and purge them from Varnish, so that all the
links will turn blue in under a second.

It's possible to do that for interwiki links, but it increases the
amount of time it would take to implement such a feature. We currently
don't have a way to efficiently find all interwiki links to a page, so
one would have to be added.

-- Tim Starling
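
The invalidation step described in this message can be sketched as a reverse
index from link target to linking pages. This is a toy in-memory model, not
MediaWiki code: the class, its method names, and the integer clock are
hypothetical, and the `purged` list merely stands in for a Varnish purge.

```python
from collections import defaultdict
from typing import Dict, List, Set


class LinkIndex:
    """Toy model of a pagelinks-style reverse index (illustrative only)."""

    def __init__(self) -> None:
        self.links_to: Dict[str, Set[str]] = defaultdict(set)  # target -> sources
        self.page_touched: Dict[str, int] = {}  # source -> last-touched tick
        self.purged: List[str] = []  # stand-in for cache purge requests
        self.clock = 0

    def record_links(self, source: str, targets: List[str]) -> None:
        """Remember which targets a page links to, whether or not they exist."""
        for target in targets:
            self.links_to[target].add(source)

    def on_page_created(self, title: str) -> List[str]:
        """Touch and purge every page that links to the newly created title."""
        self.clock += 1
        affected = sorted(self.links_to.get(title, set()))
        for source in affected:
            self.page_touched[source] = self.clock
            self.purged.append(source)
        return affected


if __name__ == "__main__":
    index = LinkIndex()
    index.record_links("blue", ["blou", "bluu"])
    index.record_links("colour", ["blou"])
    print(index.on_page_created("blou"))
```

For interwiki links, the equivalent of `links_to` would have to be queryable
by the wiki where the page is created, which is the missing piece mentioned
above.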


