[Bug 62209] feature request: Text extraction from custom wiki markup

2014-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=62209

Max Semenik changed:

What       |Removed     |Added

Status     |UNCONFIRMED |RESOLVED
Resolution |---         |WONTFIX

--- Comment #5 from Max Semenik ---
I don't think that turning TE (TextExtracts) into yet another wikitext parsing
facility is the way we want it to evolve. You can do it trivially for your own
infrastructure, though, using the ExtractFormatter class.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 62209] feature request: Text extraction from custom wiki markup

2014-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=62209

--- Comment #4 from Dimitris Kontokostas ---
Thanks again,

Still, is it possible to add these two parameters?
The current setup works for us, but it would suit us better if we had the
text/title option.

This way we only have to load the templates into the database and feed the text
to the API. Otherwise we need to load the whole dump.

If you agree to this request, we can work on this addition.



[Bug 62209] feature request: Text extraction from custom wiki markup

2014-03-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=62209

--- Comment #3 from Max Semenik ---
(In reply to Dimitris Kontokostas from comment #2)
> Does this extension load the whole page, convert it to HTML, and then return
> the first section?

Once again, 
> 1) If you specify &exintro only intro will be parsed.



[Bug 62209] feature request: Text extraction from custom wiki markup

2014-03-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=62209

--- Comment #2 from Dimitris Kontokostas ---
Thanks,

I already saw the &exintro option, so I have one question to understand it better.

When I use this call:
http://en.wikipedia.org/w/api.php?action=query&prop=extracts&exintro=&explaintext=&titles=Athens

Does this extension load the whole page, convert it to HTML, and then return
the first section?
If not, this extension is perfect for our purpose and you can skip the rest :)

If yes, we would like to avoid loading the whole page, as it would slow down our
extraction.
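
For reference, the call above could be assembled programmatically along these
lines. This is only a sketch: the helper name build_extract_url is hypothetical,
and format=json is an added assumption that is not part of the URL quoted above.
No request is sent; the function only builds the URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the call quoted above.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_extract_url(title, endpoint=API_ENDPOINT):
    """Build a prop=extracts query URL for the intro of one page (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": "",       # flag parameter: restrict the extract to the intro
        "explaintext": "",   # flag parameter: plain text instead of HTML
        "titles": title,
        "format": "json",    # assumption: not in the original call
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_extract_url("Athens"))
```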

What we do so far is take the wiki markup of the page up to the first
section and feed it to the MediaWiki "parse" API call [1], which normally
returns HTML. Then we hack the MediaWiki core to return cleaned text.

So, the request is to add "text" and "title" parameters to your API. When
they are given, instead of parsing the page by title, you would parse the "text"
parameter ("title" is used for magic words like {{PAGENAME}}), get the HTML,
and clean it the same way you do now.
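
The current workaround described above could look roughly like the following
sketch. The helper names intro_wikitext and build_parse_params are hypothetical,
the heading regex is a simplification of real wikitext heading rules, and
prop=text and format=json are assumed choices for the "parse" call. The code only
prepares the request parameters and does not contact any server.

```python
import re

# Simplified pattern for a section heading line such as "== History ==".
HEADING_RE = re.compile(r"^={1,6}.+={1,6}[ \t]*$", re.MULTILINE)


def intro_wikitext(wikitext):
    """Return the markup that precedes the first section heading."""
    match = HEADING_RE.search(wikitext)
    if match is None:
        return wikitext  # no sections: the whole page is the intro
    return wikitext[:match.start()].rstrip()


def build_parse_params(text, title):
    """Parameters for an action=parse request on raw wikitext."""
    return {
        "action": "parse",
        "text": text,      # wikitext to parse instead of a stored page
        "title": title,    # context for magic words such as {{PAGENAME}}
        "prop": "text",    # assumption: ask only for the rendered HTML
        "format": "json",  # assumption: JSON response format
    }


if __name__ == "__main__":
    page = "'''Athens''' is the capital of Greece.\n\n== History ==\nOlder text.\n"
    print(build_parse_params(intro_wikitext(page), "Athens"))
```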

Cheers,
Dimitris

[1] https://www.mediawiki.org/wiki/API:Parsing_wikitext#parse



[Bug 62209] feature request: Text extraction from custom wiki markup

2014-03-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=62209

--- Comment #1 from Max Semenik ---
1) If you specify &exintro, only the intro will be parsed.
2) TE operates only over the HTML returned by the parser; doing anything with
wikitext directly would essentially be a different extension. What do you mean
by "custom wiki markup"?
