Hi Axel,
Thank you for your answer.
I am wondering... how do you explain that the two templates
{{Guil|'''parti philosophique'''}} and {{s-|XVIII|e|}}
in my example are not processed correctly (by default) (*)?
Is it because Bliki works correctly with English
wiki articles and not with, for
2010-08-07 20:24, lmhelp wrote:
So why not use the real parser?
Exactly. Where can it be found, please?
Thanks and all the best,
--
Lmhelp
fetch the html from wikipedia.org with something like wget
(playing nicely and using delays!) and then extract the
first p element with something
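The "extract the first p element" step suggested above could be sketched as follows. This is only a minimal illustration using Python's standard html.parser module; the class name FirstParagraph and the helper first_paragraph are made up for this example, and the HTML is assumed to have been downloaded already (politely, with delays). It skips empty p elements and does not try to handle malformed markup.

```python
from html.parser import HTMLParser


class FirstParagraph(HTMLParser):
    """Collect the text of the first non-empty <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_p = False   # currently inside a <p> element
        self.done = False   # a non-empty paragraph has been captured
        self.parts = []     # text fragments of the paragraph being read

    def handle_starttag(self, tag, attrs):
        if tag == "p" and not self.done:
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            self.in_p = False
            if "".join(self.parts).strip():
                self.done = True      # keep this paragraph, ignore later ones
            else:
                self.parts = []       # empty paragraph: try the next one

    def handle_data(self, data):
        # Text inside inline tags such as <b> or <a> also arrives here.
        if self.in_p:
            self.parts.append(data)


def first_paragraph(html):
    """Return the plain text of the first non-empty paragraph in an HTML string."""
    parser = FirstParagraph()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Because the rendered HTML already has templates expanded, this sidesteps the Wikitext parsing problem entirely, at the cost of one HTTP request per article.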
On Sun, Aug 8, 2010 at 9:49 PM, lmhelp lm...@wanadoo.fr wrote:
Hi,
I have abandoned Bliki because look what happened:
Here is what I gave to Bliki as an input:
---
Le {{Guil|'''parti philosophique'''}} désignait
Thank you all for your contribs :).
Hi,
So... I was over-optimistic about managing to extract the first
paragraph of a Wikipedia article out of its Wikitext easily...
Yet, I managed (1) for instance (for the Wikipedia article Čokot)
to get the following Wikitext sentence:
On Sat, Aug 7, 2010 at 9:21 AM, lmhelp lm...@wanadoo.fr wrote:
MY FIRST QUESTION IS:
=
I was wondering if you knew a better tool than this one... one which
wouldn't miss some Wikitext chunks of code like in the above
example (or maybe which at least would handle usual
On Sat, Aug 7, 2010 at 10:54 AM, lmhelp lm...@wanadoo.fr wrote:
Hi,
Thank you for your answer.
mwlib is the best parser available for folks who want to do a quick job
such as yours.
Maybe it is, I don't know...
I know (since recently) it is not an easy task constructing a parser for
-
mwlib was written in conjunction with the WMF, and IIRC had at least some
input from Brion Vibber. It's high quality and works well. There is a 2-3
hour learning curve for navigating the python modules and methods
So why not use the real parser?
Exactly. Where can it be found, please?
Thanks and all the best,
--
Lmhelp
--
View this message in context:
http://old.nabble.com/Wikitext-grammar-tp29350471p29376156.html
Sent from the WikiMedia General mailing list archive at Nabble.com.
If you only need to extract the first paragraph of Wikipedia articles, no problem.
2010/8/6 Katharina Wolkwitz wolkw...@fh-swf.de
Hi,
On 05.08.2010 at 16:47, lmhelp2 wrote:
Thank you!
So here is the list I have for the moment:
I need to ignore lines:
- containing: {{...}}
=
Also ignore lines starting with #, :, (space), or ; .
Then there are (potentially nested) tables, which start with a line
beginning with {| and end in a line beginning with |}.
There are more magic words with the general pattern
__SOMEUPPERCASECHARACTERS__, IIRC.
Note that sometimes, people
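The line-level rules listed above (skip lines starting with #, :, a space or ;, skip possibly nested tables, drop magic words) can be sketched as below. This is a rough illustration only, not a real Wikitext parser; the function name first_paragraph_lines and the two constants are invented for this example, and templates and links are deliberately left untouched.

```python
import re

# Magic words such as __TOC__ or __NOTOC__ (general pattern from the thread).
MAGIC_WORD = re.compile(r"__[A-Z]+__")
# Line prefixes that mark lists, indentation, preformatted text or definitions.
SKIP_PREFIXES = ("#", ":", " ", ";")


def first_paragraph_lines(wikitext):
    """Return the first run of ordinary text lines, joined with spaces."""
    table_depth = 0
    paragraph = []
    for line in wikitext.splitlines():
        # Tables open with a line starting "{|" and may be nested.
        if line.startswith("{|"):
            table_depth += 1
            continue
        if table_depth > 0:
            if line.startswith("|}"):
                table_depth -= 1
            continue
        line = MAGIC_WORD.sub("", line)
        if not line.strip():
            if paragraph:
                break          # blank line ends the paragraph we collected
            continue           # still looking for the first paragraph
        if line.startswith(SKIP_PREFIXES):
            continue
        paragraph.append(line.strip())
    return " ".join(paragraph)
```

Every exception found later (templates, images, comments, and so on) would need another rule here, which is the main argument in this thread for using an existing parser or the rendered HTML instead.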
On Wed, Aug 4, 2010 at 1:45 PM, lmhelp lm...@wanadoo.fr wrote:
I need to extract automatically the first paragraph of a Wiki article...
See Extracted page extracts for Yahoo:
http://download.wikimedia.org/enwiki/20100730/
A colleague told me about that... so we had a look at it.
Unfortunately, abstracts are not correct most of the time...
-
Example (in French):
-
titleWikipédia : Arabie
On Fri, Aug 6, 2010 at 10:06 AM, Léa Massiot lea.mass...@ign.fr wrote:
A colleague told me about that... so we had a look at it.
Unfortunately, abstracts are not correct most of the time...
-
Example (in French):
On Fri, Aug 6, 2010 at 10:18 AM, Léa Massiot lea.mass...@ign.fr wrote:
Are you sure this will be able to extract the
introductory paragraph (only) which is not in any
section... (because it is not trivial).
There is only one example I could find at
http://code.pediapress.com/wiki/wiki/mwlib
The current parser is, as David Gerard said, not much of a parser by
any conventional definition. It's more of a macro-expander (for parser
tags and templates) and a series of mostly-regular-expression-based
replacement routines, which result in partially valid HTML which is then
repaired in
On 6 August 2010 18:59, Trevor Parscal tpars...@wikimedia.org wrote:
In short, the current parser is a bad example of how to write a
parser,
I forgot to call it a box of pure malevolent evil, a purveyor of
insidious insanity, an eldritch manifestation that would make Bill
Gates let out a low
@lists.wikimedia.org
Subject: [Mediawiki-l] Wikitext grammar
Hi,
Thank you for reading my post.
I am wondering if there exists a grammar for the Wikicode/Wikitext
language (or an *exhaustive* (and formal) set of rules about how a
Wikitext is constructed).
I've looked for such a grammar/set of rules
On Thu, Aug 5, 2010 at 1:10 PM, lmhelp2 lea.mass...@ign.fr wrote:
Hi,
Thanks to all of you for your answers.
I have decided (in the light of what you told me)
to read the Wikitext line after line.
I must ignore leading:
- templates (including the ones which span
over several
Hi,
there might be an occurrence of __TOC__ or __NOTOC__ before the first real
paragraph.
Good luck with finding all exceptions. :)
Katharina
On 05.08.2010 at 14:10, lmhelp2 wrote:
Hi,
Thanks to all of you for your answers.
I have decided (in the light of what you told me)
to read the
Thank you!
So here is the list I have for the moment:
I need to ignore lines:
- containing: {{...}}
= possibly spreading over several lines,
= being possibly nested {{... {{ ... }} ... }}.
- containing: [[...]]
= being possibly nested [[... [[ ... ]] ... ]].
-
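The "possibly nested, possibly multi-line" {{...}} case in the list above cannot be matched with a simple regular expression, but a depth counter handles it. The sketch below is an illustration only; strip_templates is a made-up name, and triple-brace template parameters ({{{...}}}) are not handled correctly.

```python
def strip_templates(wikitext):
    """Remove {{...}} blocks, including nested and multi-line ones."""
    out = []
    depth = 0   # how many "{{" are currently open
    i = 0
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            # Keep characters only when we are outside every template.
            if depth == 0:
                out.append(wikitext[i])
            i += 1
    return "".join(out)
```

Note that this throws away the template's content as well, so inline templates that carry visible text, such as {{Guil|'''parti philosophique'''}}, leave a hole in the sentence. Dropping templates wholesale is therefore only reasonable for leading infoboxes and similar blocks.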
Hi,
On 05.08.2010 at 16:47, lmhelp2 wrote:
Thank you!
So here is the list I have for the moment:
I need to ignore lines:
- containing: {{...}}
= possibly spreading over several lines,
= being possibly nested {{... {{ ... }} ... }}.
- containing: [[...]]
=
lmhelp wrote:
Hi,
Thank you for reading my post.
I am wondering if there exists a grammar for the Wikicode/Wikitext
language (or an *exhaustive* (and formal) set of rules about how a
Wikitext is constructed).
I've looked for such a grammar/set of rules on the Web but I couldn't find
On 4 August 2010 20:45, lmhelp lm...@wanadoo.fr wrote:
I am wondering if there exists a grammar for the Wikicode/Wikitext
language (or an *exhaustive* (and formal) set of rules about how a
Wikitext is constructed).
I've looked for such a grammar/set of rules on the Web but I couldn't find
On 4 August 2010 23:58, David Gerard dger...@gmail.com wrote:
On 4 August 2010 20:45, lmhelp lm...@wanadoo.fr wrote:
- Is a grammar available somewhere?
- Do you have any idea how to extract the first paragraph of a Wiki article?
- Any advice?
- Does a Java Wikitext parser exist which would