Re: [Mediawiki-api] Estonian Wikipedia Web API give empty sentence extracts

Max Semenik Sun, 05 Oct 2014 09:32:17 -0700

The sentence handling algorithm appears to suck at HTML handling. I've filed a
bug for it: https://bugzilla.wikimedia.org/show_bug.cgi?id=71671
As a workaround, try plaintext extracts (add the explaintext parameter):
https://et.wikipedia.org/w/api.php?action=query&prop=extracts%7Ccategories&exsentences=1&explaintext&redirects=&format=jsonfm&cllimit=10&exlimit=1&indexpageids=&maxlag=10&titles=j%C3%A4rv
or switch from requesting a number of sentences (exsentences) to a number of
characters (exchars).
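
Here is a rough Python sketch of both workarounds, in case it helps. It only
builds the request URL and picks the extract out of an already-decoded JSON
response; it does not do any network I/O. The helper names build_extract_url
and first_extract are made up for illustration, and the response shape assumed
is the default format=json layout (pages keyed by page ID).

```python
from urllib.parse import urlencode

API_URL = "https://et.wikipedia.org/w/api.php"


def build_extract_url(title, chars=None):
    """Build a query URL for a plaintext extract of one page.

    With chars=None the request asks for one sentence (exsentences=1);
    otherwise it asks for roughly `chars` characters (exchars).
    """
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": "",  # plain text instead of limited HTML
        "redirects": "",
        "format": "json",
        "exlimit": 1,
        "titles": title,
    }
    if chars is None:
        params["exsentences"] = 1
    else:
        params["exchars"] = chars
    return API_URL + "?" + urlencode(params)


def first_extract(response):
    """Return the extract of the first page in a decoded response, or ""."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        return page.get("extract", "")
    return ""


if __name__ == "__main__":
    print(build_extract_url("järv"))
    print(build_extract_url("järv", chars=200))
```

Fetch the printed URL with whatever HTTP client you like, decode the JSON, and
pass the result to first_extract.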


On Sun, Oct 5, 2014 at 8:34 AM, Kristian Kankainen <[email protected]> wrote:

> Hi!
>
> When I query the Estonian Wikipedia's Web API for an article's first
> sentence, I sometimes get an empty response. Actually, it gives back a
> horizontal rule and that's it.
>
> For example:
> https://et.wikipedia.org/w/api.php?action=query&prop=extracts|categories&
> exsentences=1&redirects=&format=jsonfm&cllimit=10&exlimit=1&indexpageids=&
> maxlag=10&titles=järv
>
> gives only a horizontal rule as the extract:
> "extract": "<hr />",
>
> Can anyone say what is happening here? Is the article's source organized
> in a wrong way, or is it a problem on the API's sentence parser side?
>
> Best regards
> Kristian Kankainen
>
> _______________________________________________
> Mediawiki-api mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
>



-- 
Best regards,
Max Semenik ([[User:MaxSem]])

