The sentence handling algorithm appears to suck at handling HTML. I've filed a bug for it:
https://bugzilla.wikimedia.org/show_bug.cgi?id=71671

As a workaround, try plaintext extracts:
https://et.wikipedia.org/w/api.php?action=query&prop=extracts%7Ccategories&exsentences=1&explaintext&redirects=&format=jsonfm&cllimit=10&exlimit=1&indexpageids=&maxlag=10&titles=j%C3%A4rv

or switch from requesting a number of sentences (exsentences) to a number of characters (exchars).
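For scripted use, the two workarounds could be sketched roughly as follows. This is only a minimal illustration, not a tested client: the helper name `build_extract_url` is made up for this example, `format=json` is swapped in for the `jsonfm` debugging format used in the URLs here, and the remaining parameters are copied from the query in this thread. It only builds the URL and performs no request.

```python
from urllib.parse import urlencode

# Endpoint taken from the query in this thread.
API_URL = "https://et.wikipedia.org/w/api.php"


def build_extract_url(title, sentences=None, chars=None, plaintext=True):
    """Build an extracts query URL (hypothetical helper, illustration only).

    Pass exactly one of `sentences` (exsentences) or `chars` (exchars).
    """
    if (sentences is None) == (chars is None):
        raise ValueError("pass exactly one of sentences or chars")

    params = {
        "action": "query",
        "prop": "extracts|categories",
        "redirects": "",
        "format": "json",  # assumption: machine-readable instead of jsonfm
        "cllimit": 10,
        "exlimit": 1,
        "indexpageids": "",
        "maxlag": 10,
        "titles": title,
    }
    if sentences is not None:
        params["exsentences"] = sentences
    else:
        params["exchars"] = chars
    if plaintext:
        # Workaround 1: ask for plain text so no HTML reaches the splitter.
        params["explaintext"] = ""

    # urlencode percent-encodes "|" and non-ASCII characters such as "ä".
    return API_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_extract_url("järv", sentences=1))  # plaintext, one sentence
    print(build_extract_url("järv", chars=200))    # plaintext, by characters
```

The character count of 200 is an arbitrary value picked for the example; choose whatever length suits your use.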
On Sun, Oct 5, 2014 at 8:34 AM, Kristian Kankainen <[email protected]> wrote:
> Hi!
>
> When I query the Estonian Wikipedia's Web API for the article's first
> sentence, I sometimes get an empty response. Actually, it gives back a
> horizontal rule and that's it.
>
> For example:
> https://et.wikipedia.org/w/api.php?action=query&prop=extracts|categories&exsentences=1&redirects=&format=jsonfm&cllimit=10&exlimit=1&indexpageids=&maxlag=10&titles=järv
>
> gives only a horizontal rule as the extract:
> "extract": "<hr />",
>
> Can anyone say what is happening here? Is the article's source organized
> in a wrong way, or is it a problem on the API's sentence parser side?
>
> Best regards
> Kristian Kankainen
>
> _______________________________________________
> Mediawiki-api mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
>

-- 
Best regards,
Max Semenik ([[User:MaxSem]])
