Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-06 Thread Dmitriy Sintsov
* George Herbert george.herb...@gmail.com [Wed, 5 Jan 2011 19:52:18 
-0800]:
 On Wed, Jan 5, 2011 at 7:37 PM, Jay Ashworth j...@baylink.com wrote:
   Original Message -
  From: Daniel Kinzler dan...@brightbyte.de
 
  On 05.01.2011 05:25, Jay Ashworth wrote:
   I believe the snap reaction here is you haven't tried to diff 
XML,
   have you?
 
  A text-based diff of XML sucks, but how about a DOM based
 (structural)
  diff?
 
  Sure, but how much more processor horsepower is that going to take.
 
  Scale is a driver in Mediawiki, for obvious reasons.

 I suspect that diffs are relatively rare events in the day to day WMF
 processing, though non-trivial.

 That said, and as much of a fan of some sort of conceptually object
 oriented page data approach... DOM?  Really??

 We're not trying to do 99% of what that does; we just need object /
 element contents, style and perhaps minimal other attributes, and
 order within a page.


DOM manipulation at the template level is not a bad thing. It could also 
be partially unified with parsing, because trees are used there as well. 
I just hope there is a chance to have an XML-to-wikitext mapping (at least 
partially compatible for basic markup).
Dmitriy

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-06 Thread Jay Ashworth
- Original Message -
 From: George Herbert george.herb...@gmail.com

  A text-based diff of XML sucks, but how about a DOM based
  (structural)
  diff?
 
  Sure, but how much more processor horsepower is that going to take.
 
  Scale is a driver in Mediawiki, for obvious reasons.
 
 I suspect that diffs are relatively rare events in the day to day WMF
 processing, though non-trivial.

Every single time you make an edit, unless I badly misunderstand the current 
architecture; that's how it's possible for multiple people editing the 
same article not to collide unless their edits actually collide at the
paragraph level.

Not to mention pulling old versions.

Can someone who knows the current code better than me confirm or deny?

Cheers,
-- jra



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-06 Thread Brion Vibber
On Thu, Jan 6, 2011 at 11:01 AM, Jay Ashworth j...@baylink.com wrote:

 - Original Message -
  From: George Herbert george.herb...@gmail.com

   A text-based diff of XML sucks, but how about a DOM based
   (structural)
   diff?
  
   Sure, but how much more processor horsepower is that going to take.
  
   Scale is a driver in Mediawiki, for obvious reasons.
 
  I suspect that diffs are relatively rare events in the day to day WMF
  processing, though non-trivial.

 Every single time you make an edit, unless I badly misunderstand the
 current
 architecture; that's how it's possible for multiple people editing the
 same article not to collide unless their edits actually collide at the
 paragraph level.

 Not to mention pulling old versions.

 Can someone who knows the current code better than me confirm or deny?


There's a few separate issues mixed up here, I think.


First: diffs for viewing and the external diff3 merging for resolving edit
conflicts are actually unrelated code paths and use separate diff engines.
(Nor does diff3 get used at all unless there actually is a conflict to
resolve -- if nobody else edited since your change, it's not called.)


Second: the notion that diffing a structured document must inherently be
very slow is, I think, not right.

A well-structured document should be pretty diff-friendly actually; our
diffs are already working on two separate levels (paragraphs as a whole,
then words within matched paragraphs). In the most common cases, the diffing
might actually work pretty much the same -- look for nodes that match, then
move on to nodes that don't; within changed nodes, look for sub-nodes that
can be highlighted. Comparisons between nodes may be slower than straight
strings, but the basic algorithms don't need to be hugely different, and the
implementation can be in heavily-optimized C++ just like our text diffs are
today.
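
A minimal sketch of the two-level idea described above, under stated assumptions: nodes are plain strings standing in for serialized document nodes, Python's difflib is used in place of the C++ diff engine, and the names two_level_diff and word_diff are hypothetical. It matches whole nodes first and only looks inside nodes that changed.

```python
from difflib import SequenceMatcher


def word_diff(old, new):
    """Second level: diff the words inside one changed node."""
    a, b = old.split(), new.split()
    out = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(("=", " ".join(a[i1:i2])))
        else:
            if i1 != i2:
                out.append(("-", " ".join(a[i1:i2])))
            if j1 != j2:
                out.append(("+", " ".join(b[j1:j2])))
    return out


def two_level_diff(old_nodes, new_nodes):
    """First level: match whole nodes; descend only into replaced ones."""
    changes = []
    matcher = SequenceMatcher(a=old_nodes, b=new_nodes, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # matched nodes need no further work
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of nodes on both sides: pair them up and diff words.
            for old, new in zip(old_nodes[i1:i2], new_nodes[j1:j2]):
                changes.append(("changed", word_diff(old, new)))
        else:
            for old in old_nodes[i1:i2]:
                changes.append(("removed", old))
            for new in new_nodes[j1:j2]:
                changes.append(("added", new))
    return changes
```

A real structural diff would compare node types and attributes as well as text; the point here is only that the outer algorithm can stay the same shape.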


Third: the most common diff view cases are likely adjacent revisions of
recent edits, which smells like cache. :) Heck, these could be made once and
then simply *stored*, never needing to be recalculated again.
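
A rough sketch of the "make once, then store" idea, assuming revisions are immutable so a diff between two revision ids never changes. The DiffCache class, its load_text callback and the in-memory dict are all hypothetical stand-ins for whatever storage would really be used.

```python
import difflib


class DiffCache:
    """Compute a diff between two revisions once, then serve it from storage."""

    def __init__(self, load_text):
        self._load_text = load_text  # callable: revision id -> revision text
        self._store = {}             # stand-in for a persistent or shared cache
        self.computed = 0            # how many diffs were actually calculated

    def get(self, old_id, new_id):
        key = (old_id, new_id)
        if key not in self._store:
            old = self._load_text(old_id).splitlines()
            new = self._load_text(new_id).splitlines()
            lines = difflib.unified_diff(old, new, lineterm="")
            self._store[key] = "\n".join(lines)
            self.computed += 1
        return self._store[key]
```

Because the key is just the pair of revision ids, nothing ever needs to be invalidated; the only question is how long stored diffs are worth keeping.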


Fourth: the notion that diffing structured documents would be overwhelming
for the entire Wikimedia infrastructure... even if we assume such diffs are
much slower, I think this is not really an issue compared to the huge CPU
savings that it could bring elsewhere.

The biggest user of CPU has long been parsing and re-parsing of wikitext.
Every time someone comes along with different view preferences, we have to
parse again. Every time a template or image changes, we have to parse again.
Every time there's an edit, we have to parse again. Every time something
fell out of cache, we have to parse again.

And that parsing is *really expensive* on large, complex pages. Much of the
history of MediaWiki's parser development has been in figuring out how to
avoid parsing quite as much, or setting limits to keep the worst corner
cases from bringing down the server farm.

We parse *way*, *way* more than we diff.
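
To illustrate why each of those events forces a fresh parse, here is a hedged sketch of a cache key for parsed output. It is not MediaWiki's actual parser cache scheme; the function name and its parameters are assumptions made for the example. Anything that feeds into the rendered result has to be part of the key, so a change to any of it yields a key with nothing stored under it.

```python
import hashlib
import json


def parser_cache_key(page_id, rev_id, view_options, dependency_revs):
    """Build a key from everything the rendered output depends on.

    view_options:    dict of per-user rendering preferences
    dependency_revs: dict mapping template/image title -> its revision id
    """
    payload = json.dumps(
        {
            "page": page_id,
            "rev": rev_id,
            "options": view_options,
            "deps": dependency_revs,
        },
        sort_keys=True,  # make the key independent of dict ordering
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()
```

A new edit changes rev_id, a different preference changes view_options, and a template edit changes dependency_revs; each one produces a key that misses the cache.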


Part of what makes these things slow is that we have to do a lot of work
from scratch every time, and we have to do it in slow PHP code, and we have
to keep going back and fetching more stuff halfway through. Expanding
templates can change the document structure at the next parsing level, so
referenced files and templates have to be fetched or recalculated, often one
at a time because it's hard to batch up a list of everything we need at
once.

I think there would be some very valuable savings to using a document model
that can be stored in a machine-readable way up front. A data structure that
can be described as JSON or XML (for example) allows leaving the low-level
"how do I turn a string into a structure" details to highly-tuned native C
code. A document model that is easily traversed and mapped to/from
hierarchical HTML allows code to process just the parts of the document it
needs at any given time, and would make it easier to share intermediate data
between variants if that's still needed.
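
As a small illustration of what an up-front machine-readable model buys, the sketch below loads a document from JSON and walks it once to collect every template it references, so they could be fetched in one batch rather than one at a time. The node layout ("type", "name", "children") is a made-up example, not a proposed schema.

```python
import json


def collect_templates(node, found=None):
    """Walk the document tree once and gather all referenced template names."""
    if found is None:
        found = set()
    if node.get("type") == "template":
        found.add(node["name"])
    for child in node.get("children", []):
        collect_templates(child, found)
    return found


# Hypothetical stored form of a page; string-to-structure work is left to json.
stored = """
{"type": "page", "children": [
  {"type": "template", "name": "Infobox", "children": [
    {"type": "template", "name": "Birth date", "children": []}
  ]},
  {"type": "text", "value": "foo"},
  {"type": "template", "name": "Infobox", "children": []}
]}
"""
doc = json.loads(stored)
needed = collect_templates(doc)
```

Here needed ends up holding both template names exactly once, which is the list a batched fetch would want.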

In some cases, work that is today done in the 'parser' could even be done by
client-side JavaScript (on supporting user-agents), moving little bits of
work from the server farm (where CPU time is vast but sharply limited) to
end-user browsers (where there's often a local surplus -- CPU's not doing
much while it's waiting on the network to transfer big JPEG images).


It may be easier to prototype a lot of this outside of MediaWiki, though, or
in specific areas such as media or interactive extensions, before we all go
trying to redo the full core.

-- brion


Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-06 Thread George Herbert
On Thu, Jan 6, 2011 at 11:38 AM, Brion Vibber br...@pobox.com wrote:
 We parse *way*, *way* more than we diff.
[...]

Even if we diff on average 2-3x per edit, we're only doing on the order of
ten edits a second across the projects, right?  I'm not going to dig up the
current stats, but that's what I remember from last time I looked.

So; priority remains parser and actual used syntax cleanup, from a
sanity point of view (being able to describe the syntax usefully, and
in a way that allows multiple parsers to be written), with diff
management as a distant low-impact priority...


-- 
-george william herbert
george.herb...@gmail.com



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-06 Thread Roan Kattouw
2011/1/6 Brion Vibber br...@pobox.com:
 Third: the most common diff view cases are likely adjacent revisions of
 recent edits, which smells like cache. :) Heck, these could be made once and
 then simply *stored*, never needing to be recalculated again.

We already do this for text diffs between revisions, we cache them in memcached.

Roan Kattouw (Catrope)



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-05 Thread Daniel Kinzler
On 05.01.2011 05:25, Jay Ashworth wrote:
 I believe the snap reaction here is you haven't tried to diff XML, have you?

A text-based diff of XML sucks, but how about a DOM based (structural) diff?

-- daniel



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-05 Thread Daniel Friesen
On 11-01-05 02:09 AM, Daniel Kinzler wrote:
 On 05.01.2011 05:25, Jay Ashworth wrote:
 I believe the snap reaction here is you haven't tried to diff XML, have you?
 A text-based diff of XML sucks, but how about a DOM based (structural) diff?

 -- daniel
I don't think a discussion on diff comparison of XML has much point.

I believe the idea floating around here (or at least the idea I'm 
thinking of based on these discussions) is that we would store page text 
in an xml format or a serialized php format or something else where 
contents are semantically noted with things like '<template 
title="Template:Foo"><param name="1">...</param><param 
name="foo">bar</param></template><i>This is italic</i><link 
internal="true" title="FooBar">FooBar</link>'. To actually edit this 
page content we provide the data in multiple formats:
- Fully parsed output for page viewing
- A semantically marked up version of the html that is compatible with 
the use of a WYSIWYG editor and can be converted back to the xml format 
and then saved
- A WikiText like format similar to the WikiText we already have that 
users can edit in plaintext; we take the xml and convert it into that 
format, and then when the user saves, we parse that back into the xml format.

Naturally, if we're doing things like this, then rather than diffing the 
ugly xml, the natural thing would most likely be to take the xml format 
of both pages, convert it into that WikiText-like plaintext format and 
show the user a diff of that so they know what meaningful changes were 
made to the page.
If you really wanted to, you could also show them a diff of the end html 
as an option, but that's fairly pointless.
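
A minimal sketch of that "convert both sides to a WikiText-like form, then diff" approach. The element names follow the example earlier in this message; the <page> wrapper element, the exact wikitext produced and the function names are assumptions for illustration only.

```python
import difflib
import xml.etree.ElementTree as ET


def to_wikitext(node):
    """Serialize a (hypothetical) stored XML node into WikiText-like plaintext."""
    inner = (node.text or "") + "".join(
        to_wikitext(child) + (child.tail or "") for child in node
    )
    if node.tag == "i":
        return "''" + inner + "''"
    if node.tag == "link":
        title = node.get("title", inner)
        if title == inner:
            return "[[" + title + "]]"
        return "[[" + title + "|" + inner + "]]"
    if node.tag == "param":
        return "|" + node.get("name", "") + "=" + inner
    if node.tag == "template":
        title = node.get("title", "")
        name = title[len("Template:"):] if title.startswith("Template:") else title
        return "{{" + name + inner + "}}"
    return inner  # wrapper or unknown element: keep only its contents


def diff_pages(old_xml, new_xml):
    """Diff two stored revisions in their plaintext form, not as raw XML."""
    old = to_wikitext(ET.fromstring(old_xml)).splitlines()
    new = to_wikitext(ET.fromstring(new_xml)).splitlines()
    return list(difflib.unified_diff(old, new, "old", "new", lineterm=""))
```

The diff a user sees then talks about {{Foo|foo=bar}} and ''italic'' text rather than angle brackets.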

As an extra bonus, besides enabling WYSIWYG, having that xml format also 
has a good chance of making it much easier to give users an in-page diff 
that marks up what was actually changed in the content itself.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]






Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-05 Thread David Gerard
On 5 January 2011 22:16, Ryan Kaldari rkald...@wikimedia.org wrote:

 Having XML-based content would also enable a wide variety of new re-uses
 of Wikimedia content. People could build all sorts of custom apps,
 games, feeds, etc., without having to worry about broken syntax or
 resorting to screen scraping (like we do for our mobile site). It would
 also make implementing semantic features easier and thus could improve
 our search capabilities. Plus it makes a great Bloody Mary!


Before we go haring off - what would be *really* nice would be getting
Magnus' WYSIFTW developed to a stage where it's fit to put in front of
nontechnical users and do some decent usability testing:

http://meta.wikimedia.org/wiki/WYSIFTW

Magnus does this stuff in his spare time, and has to get back to
actual work - but there's a list of needed features (which of course
anyone can add to) and I know he very much welcomes other people
hacking on it.

It's not ready for prime time yet, but it's one of the most promising
approaches I've seen in a while.

(And the nice thing about WYSIFTW is that it requires *no* action on
server side - the only thing it needs right now is to be developed to
a state where it can be usability-tested.)


- d.



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-05 Thread George Herbert
I just started testing WYSIWTF; I would like to encourage as many
other people on this list as possible to do so as well.



-- 
-george william herbert
george.herb...@gmail.com



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-05 Thread David Gerard
On 5 January 2011 22:47, George Herbert george.herb...@gmail.com wrote:

 I just started testing WYSIWTF; I would like to encourage as many
 other people on this list to do so as well.


It's not even close to finished - but the more features we can add and
the more bugs we can find, the closer to a proper usability test we
are.

Devs! Please give it a go! Please report problems!


- d.



 On Wed, Jan 5, 2011 at 2:22 PM, David Gerard dger...@gmail.com wrote:

 http://meta.wikimedia.org/wiki/WYSIFTW
 Magnus does this stuff in his spare time, and has to get back to
 actual work - but there's a list of needed features (which of course
 anyone can add to) and I know he very much welcomes other people
 hacking on it.



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-05 Thread Jay Ashworth
 Original Message -
 From: Daniel Kinzler dan...@brightbyte.de

 On 05.01.2011 05:25, Jay Ashworth wrote:
  I believe the snap reaction here is you haven't tried to diff XML,
  have you?
 
 A text-based diff of XML sucks, but how about a DOM based (structural)
 diff?

Sure, but how much more processor horsepower is that going to take?

Scale is a driver in Mediawiki, for obvious reasons.

Cheers,
-- jra



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-05 Thread George Herbert
On Wed, Jan 5, 2011 at 7:37 PM, Jay Ashworth j...@baylink.com wrote:
  Original Message -
 From: Daniel Kinzler dan...@brightbyte.de

 On 05.01.2011 05:25, Jay Ashworth wrote:
  I believe the snap reaction here is you haven't tried to diff XML,
  have you?

 A text-based diff of XML sucks, but how about a DOM based (structural)
 diff?

 Sure, but how much more processor horsepower is that going to take.

 Scale is a driver in Mediawiki, for obvious reasons.

I suspect that diffs are relatively rare events in the day to day WMF
processing, though non-trivial.

That said, and as much of a fan of some sort of conceptually object
oriented page data approach... DOM?  Really??

We're not trying to do 99% of what that does; we just need object /
element contents, style and perhaps minimal other attributes, and
order within a page.


-- 
-george william herbert
george.herb...@gmail.com



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-04 Thread Alex Brollo
I apologize, I sent an empty reply. :-(

Just a brief comment: there's no need to search for a perfect wiki
syntax, since it exists: it's the present model of well-formed markup, i.e.
xml.

While digging into the subtler troubles of wiki syntax, i.e. difficulties in
parsing it with scripts or understanding the fuzzy behavior of the code, I
always find that the trouble comes from the simple fact that wiki is a markup
that isn't intrinsically well formed - it doesn't respect the simple, basic
rules of a well-formed syntax: strict and evident rules about the beginning
and end of a modifier; no mixing of attributes and content inside its tags,
i.e. templates.

In part, wiki markup can be hacked to take a step forward; I'm using more
and more "well formed" templates, split into two parts, a "starting
template" and an "ending template". Just a banal example: it.source users
are encouraged to use the {{Centrato!l=20em}} text ...</div> syntax, where
text - as you see - is outside the template, while the usual
syntax {{Centrato| text ... |l=20em}} mixes tags and contents ("Centrato"
is the Italian name for "center", and the "l" attribute states the width of
the centered div). I find such a trick extremely useful when parsing text,
since - as follows from the use of a well-formed markup - I can retrieve the
whole text simply by removing any template code and any html tag; an
impossible task using the common "not well formed" syntax, where nothing
tells about the nature of parameters: they can only be classified by human
understanding of the template code or by the whole body of the wiki parser.
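
A small sketch of the text-retrieval trick just described, under stated assumptions: the opening template is written here as {{Centrato|l=20em}} (the exact spelling on it.source may differ), and the regular expressions and the plain_text helper are hypothetical. Once content lives outside the templates, stripping template calls and html tags is enough to get the text back.

```python
import re

TEMPLATE = re.compile(r"\{\{[^{}]*\}\}")   # one innermost {{...}} call
TAG = re.compile(r"</?[A-Za-z][^<>]*>")    # an opening or closing html tag


def plain_text(markup):
    """Remove all template calls and html tags, keeping only the text."""
    previous = None
    while previous != markup:
        # Repeat so nested templates are peeled off from the inside out.
        previous = markup
        markup = TEMPLATE.sub("", markup)
    return " ".join(TAG.sub("", markup).split())


well_formed = "{{Centrato|l=20em}} some centered text </div>"
mixed = "{{Centrato| some centered text |l=20em}}"
```

With these inputs, plain_text(well_formed) gives back "some centered text", while plain_text(mixed) gives an empty string: in the usual syntax the content is a parameter, and nothing in the markup says which parameter is text.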

Alex


Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-04 Thread Jay Ashworth
- Original Message -
 From: Alex Brollo alex.bro...@gmail.com

 Just a brief comment: there's no need to search for a perfect wiki
 syntax, since it exists: it's the present model of well-formed
 markup, i.e. xml.

I believe the snap reaction here is you haven't tried to diff XML, have you?

My personal snap reaction is that the increase in cycles necessary to process
XML in both directions, *multiplied by the number of machines in WMF data 
center* will make XML impractical, but I'm not a WMF engineer.

Cheers,
-- jra



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-03 Thread Ryan Kaldari
The perfect wiki syntax would be XML (at least behind the scenes). Then 
people could use whatever syntax they want and have it easily translated 
via XSLT.

Ryan Kaldari

On 1/1/11 9:51 AM, lampak wrote:
 I've been following the discussion and as I can see it's already become
 rather unproductive*. So I hope my cutting in will not be very much out
 of place (even if I don't really know what I'm talking about).

 Many people here have stated the main reason why a WYSIWYG editor is not
 feasible is the current wikitext syntax.

 What's actually wrong with it?

 The main thing I can think of is the fact one template may include an
 opening of a table etc. and another one a closing (e.g. {{col-begin}},
 {{col-end}}). It makes it impossible to isolate the template from the
 rest of the article - draw a frame around it, say "this box here is a
 template".

 It could be fixed by forbidding leaving unclosed tags in templates. As a
 replacement, a kind of foreach loop could be introduced to iterate
 through an unspecified number of arguments.

 Lack of standardisation has also been mentioned. Something else?

 I've tried to think how a perfect parser should work. Most of this has
 been already mentioned. I think it should work in two steps: first
 tokenise the code and transform it into an intermediate tree structure like
 * paragraph
     title:
       * plain text: Section 1
     content:
       * plain text: foo
       * bold text:
           * plain text: bar
       * template
           name: Infobox
           * argument
               name: last name:
               value:
                 * plain text: Shakespear
 and so on. Then this structure could be transformed into a) HTML for
 display, b) JSON for the WYSIWYG editor. Thanks to this you wouldn't
 need to write a whole new JS parser. The editor would get a half-ready
 product. The JS code would need to be able to: a) transform this
 structure into HTML, b) modify the structure, c) transform this
 structure back into wikitext.
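
The intermediate structure and its three transformations could be sketched roughly as follows. This is only an illustration under assumptions: the dict layout, the node type names and the HTML and wikitext chosen for each node are invented here, and templates are left out to keep it short.

```python
import json
from html import escape

# Hypothetical intermediate tree for a section titled "Section 1".
tree = {
    "type": "paragraph",
    "title": [{"type": "text", "value": "Section 1"}],
    "content": [
        {"type": "text", "value": "foo "},
        {"type": "bold", "content": [{"type": "text", "value": "bar"}]},
    ],
}


def to_html(node):
    """Transform the tree into HTML for display."""
    kind = node["type"]
    if kind == "text":
        return escape(node["value"])
    if kind == "bold":
        return "<b>" + "".join(map(to_html, node["content"])) + "</b>"
    if kind == "paragraph":
        title = "".join(map(to_html, node["title"]))
        body = "".join(map(to_html, node["content"]))
        return "<h2>" + title + "</h2><p>" + body + "</p>"
    raise ValueError("unknown node type: " + kind)


def to_wikitext(node):
    """Transform the same tree back into wikitext."""
    kind = node["type"]
    if kind == "text":
        return node["value"]
    if kind == "bold":
        return "'''" + "".join(map(to_wikitext, node["content"])) + "'''"
    if kind == "paragraph":
        title = "".join(map(to_wikitext, node["title"]))
        body = "".join(map(to_wikitext, node["content"]))
        return "== " + title + " ==\n" + body
    raise ValueError("unknown node type: " + kind)


# The JSON handed to an editor is just the tree serialized as-is.
as_json = json.dumps(tree)
```

An editor working on the JSON form only has to modify the tree and hand it back; both output directions are simple walks over the same structure.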

 But I guess it's more realistic to write a new JS parser than to write a
 new PHP parser. The former can start as a stub, the latter would need to
 be fully operational from the beginning.

 Stephanie's suggestions are also interesting.

 lampak

 * (except the WYSIWTF, of course)




Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-03 Thread George Herbert
On Sun, Jan 2, 2011 at 6:28 AM, Jay Ashworth j...@baylink.com wrote:
 [...]
 This has been done a dozen times in the last 5 years, lampak.  The short
 version, as much as *I* am displeased with the fact that we'll never have
 *bold*, /italic/ and _underscore_, is that the installed base, both of
 articles and editors, means that Mediawikitext will never change.


That we've multiply concluded that it will never change doesn't mean
it won't; as a thought exercise, as I suggested in OtherThread, we
should consider negating that conclusion and seeing what happens.


-- 
-george william herbert
george.herb...@gmail.com



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-03 Thread Rob Lanphier
On Mon, Jan 3, 2011 at 4:59 PM, George Herbert george.herb...@gmail.com wrote:
 That we've multiply concluded that it will never change doesn't mean
 it won't; as a thought exercise, as I suggested in OtherThread, we
 should consider negating that conclusion and seeing what happens.

Agreed.  I think part of the problem in the past is that the
conversation generally focused on the actual syntax, and not enough on
the incremental changes that we can make to MediaWiki to make this
happen.

If, for example, we can build some sort of per-revision indicator of
markup language (sort of similar to mime type) which would let us
support multiple parsers on the same wiki, then it would be possible
to build alternate parsers that people could try out on a per-article
basis (and more importantly, revert if it doesn't pan out).  The
thousands of MediaWiki installs could try out different syntax
options, and maybe a clear winner would emerge.
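
A minimal sketch of what a per-revision markup indicator could look like, assuming each revision carries a MIME-type-like string and a registry maps that string to a parser. The Revision class, the type strings and the placeholder parsers are all hypothetical; nothing here reflects existing MediaWiki code.

```python
from dataclasses import dataclass
from html import escape

PARSERS = {}  # markup identifier -> parser function


def register(markup):
    """Decorator that registers a parser for one markup identifier."""
    def decorator(func):
        PARSERS[markup] = func
        return func
    return decorator


@dataclass(frozen=True)
class Revision:
    text: str
    markup: str = "text/x-wiki"  # existing revisions default to today's syntax


@register("text/x-wiki")
def parse_wikitext(text):
    return "<p>" + escape(text) + "</p>"  # placeholder for the real parser


@register("text/plain")
def parse_plain(text):
    return "<pre>" + escape(text) + "</pre>"


def render(revision):
    """Pick the parser from the revision's own markup indicator."""
    try:
        parser = PARSERS[revision.markup]
    except KeyError:
        raise ValueError("no parser for " + revision.markup) from None
    return parser(revision.text)
```

Because the indicator lives on the revision, reverting an article to an older revision also reverts it to the older syntax with no extra work.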

Rob



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-03 Thread Chad
On Mon, Jan 3, 2011 at 8:41 PM, Rob Lanphier ro...@wikimedia.org wrote:
 If, for example, we can build some sort of per-revision indicator of
 markup language (sort of similar to mime type) which would let us
 support multiple parsers on the same wiki, then it would be possible
 to build alternate parsers that people could try out on a per-article
 basis (and more importantly, revert if it doesn't pan out).  The
 thousands of MediaWiki installs could try out different syntax
 options, and maybe a clear winner would emerge.


Or you end up supporting 5 different parsers that people like
for slightly different reasons :)

-Chad


Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-03 Thread Rob Lanphier
On Mon, Jan 3, 2011 at 5:54 PM, Chad innocentkil...@gmail.com wrote:
 On Mon, Jan 3, 2011 at 8:41 PM, Rob Lanphier ro...@wikimedia.org wrote:
 If, for example, we can build some sort of per-revision indicator of
 markup language (sort of similar to mime type) which would let us
 support multiple parsers on the same wiki, then it would be possible
 to build alternate parsers that people could try out on a per-article
 basis (and more importantly, revert if it doesn't pan out).  The
 thousands of MediaWiki installs could try out different syntax
 options, and maybe a clear winner would emerge.

 Or you end up supporting 5 different parsers that people like
 for slightly different reasons :)

Yup, that would definitely be a strong possibility without a
disciplined approach.  However, done correctly, killing off fringe
parsers on a particular wiki would be fairly easy to do.  Just because
the underlying wiki engine allows for 5 different parsers, doesn't
mean a particular wiki would need to allow the creation of new pages
or new revisions using any of the 5.  If we build the tools that allow
admins some ability to constrain the choices, it doesn't have to get
too out of hand on a particular wiki.

If we were to go down this development path, we'd need to commit ahead
of time to be pretty stingy about what we bless as a supported
parser, and brutal about killing off support for outdated parsers.
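
One hedged way to picture the admin-side constraint: a per-wiki policy that separates "formats we can still render" from "formats new revisions may be saved in". The MarkupPolicy class and the format names are invented for this sketch.

```python
class MarkupPolicy:
    """Per-wiki policy: which markup formats render, and which accept new saves."""

    def __init__(self, renderable, writable):
        self.renderable = frozenset(renderable)
        self.writable = frozenset(writable)
        if not self.writable <= self.renderable:
            raise ValueError("every writable format must also be renderable")

    def can_render(self, markup):
        return markup in self.renderable

    def can_save(self, markup):
        return markup in self.writable

    def retire(self, markup):
        """Stop accepting new revisions in a format; old ones stay readable."""
        return MarkupPolicy(self.renderable, self.writable - {markup})


policy = MarkupPolicy(
    renderable={"text/x-wiki", "text/x-experimental"},
    writable={"text/x-wiki", "text/x-experimental"},
)
after_cleanup = policy.retire("text/x-experimental")
```

After the retire call, pages already written in the experimental format still display, but nobody can create new revisions in it, which is the first step toward dropping the parser entirely.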

Rob



Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-03 Thread Alex Brollo


Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-02 Thread Jay Ashworth
- Original Message -
 From: lampak llam...@gmail.com

 I've been following the discussion and as I can see it's already
 become rather unproductive*. So I hope my cutting in will not be very much
 out of place (even if I don't really know what I'm talking about).
 
 Many people here has stated the main reason why a WYSIWYG editor is
 not feasible is the current wikitext syntax.
 
 What's actually wrong with it?

Oh god!  *Run*!

:-)

This has been done a dozen times in the last 5 years, lampak.  The short
version, as much as *I* am displeased with the fact that we'll never have
*bold*, /italic/ and _underscore_, is that the installed base, both of 
articles and editors, means that Mediawikitext will never change.

It *might* be possible to *extend* it, but that would require at least
one of the 94 projects to write a formally defined parser for it, in 
something resembling yacc, to reach completion -- and to my knowledge, 
none has done so.

Cheers,
-- jra



[Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-01 Thread lampak
I've been following the discussion and as I can see it's already become
rather unproductive*. So I hope my cutting in will not be very much out
of place (even if I don't really know what I'm talking about).

Many people here have stated the main reason why a WYSIWYG editor is not
feasible is the current wikitext syntax.

What's actually wrong with it?

The main thing I can think of is the fact that one template may include an
opening of a table etc. and another one a closing (e.g. {{col-begin}},
{{col-end}}). It makes it impossible to isolate the template from the
rest of the article - draw a frame around it, say this box here is a
template.

It could be fixed by forbidding leaving unclosed tags in templates. As a
replacement, a kind of foreach loop could be introduced to iterate
through an unspecified number of arguments.
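
A check for "no unclosed tags in a template" might look something like the sketch below. It is deliberately simplified and not how MediaWiki works: the function name `is_self_contained` and the token pattern are made up for illustration, and the pattern only knows about table delimiters (`{|`, `|}`) and HTML-style tags, ignoring real-world ambiguities such as `|}` appearing inside `{{{param|}}}`.

```python
import re

# Hypothetical, simplified token set: table delimiters plus HTML-style tags.
TOKEN = re.compile(r"\{\||\|\}|<(/?)([a-zA-Z]+)[^<>]*?(/?)>")
VOID_TAGS = {"br", "hr"}  # tags that never take a closing counterpart


def is_self_contained(template_source: str) -> bool:
    """Return True if everything opened in the template is closed in it."""
    stack = []
    for match in TOKEN.finditer(template_source):
        text = match.group(0)
        if text == "{|":
            stack.append("table")
        elif text == "|}":
            if not stack or stack.pop() != "table":
                return False
        elif match.group(3) or match.group(2).lower() in VOID_TAGS:
            continue  # self-closing or void tag: nothing to balance
        elif match.group(1):  # closing tag such as </div>
            if not stack or stack.pop() != match.group(2).lower():
                return False
        else:  # opening tag such as <div class="x">
            stack.append(match.group(2).lower())
    return not stack


col_begin = '{| class="columns"\n|'   # stand-in for a {{col-begin}}-style body
col_end = "|}"                        # stand-in for a {{col-end}}-style body
```

Under this rule both stand-ins would be rejected on their own, while a template containing the whole table would pass, which is what would let an editor draw a frame around it.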

Lack of standardisation has also been mentioned. Something else?

I've tried to think how a perfect parser should work. Most of this has
been already mentioned. I think it should work in two steps: first
tokenise the code and transform it into an intermediate tree structure like
* paragraph
  title:
    * plain text: Section 1
  content:
    * plain text: foo
    * bold text:
      * plain text: bar
    * template
      name: Infobox
      * argument
        name: last name:
        value:
          * plain text: Shakespeare
and so on. Then this structure could be transformed into a) HTML for
display, b) JSON for the WYSIWYG editor. Thanks to this you wouldn't
need to write a whole new JS parser. The editor would get a half-ready
product. The JS code would need to be able to: a) transform this
structure into HTML, b) modify the structure, c) transform this
structure back into wikitext.
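
The two-step idea can be sketched in a few lines. This is a toy under loud assumptions, not a proposal for the real parser: `Node`, `parse_inline` and the three serialisers are hypothetical names, only '''bold''' markup is recognised, the bold markers are assumed to be balanced, and templates are left out entirely. What it shows is one tree feeding HTML, JSON and a wikitext round trip.

```python
import json
from dataclasses import asdict, dataclass, field
from html import escape


@dataclass
class Node:
    kind: str                                   # "paragraph", "bold" or "text"
    text: str = ""                              # only used by "text" nodes
    children: list["Node"] = field(default_factory=list)


def parse_inline(source: str) -> list[Node]:
    """Step 1: split on ''' markers; odd-numbered pieces are bold."""
    nodes = []
    for i, part in enumerate(source.split("'''")):
        if not part:
            continue
        leaf = Node("text", text=part)
        nodes.append(Node("bold", children=[leaf]) if i % 2 else leaf)
    return nodes


def to_html(node: Node) -> str:
    """Step 2a: render the tree as HTML for display."""
    if node.kind == "text":
        return escape(node.text)
    inner = "".join(to_html(child) for child in node.children)
    tag = {"paragraph": "p", "bold": "b"}[node.kind]
    return f"<{tag}>{inner}</{tag}>"


def to_json(node: Node) -> str:
    """Step 2b: serialise the tree for a client-side editor."""
    return json.dumps(asdict(node))


def to_wikitext(node: Node) -> str:
    """Turn the (possibly modified) tree back into wikitext."""
    if node.kind == "text":
        return node.text
    inner = "".join(to_wikitext(child) for child in node.children)
    return f"'''{inner}'''" if node.kind == "bold" else inner


source = "foo '''bar'''"
tree = Node("paragraph", children=parse_inline(source))
```

Here `to_html(tree)` gives `<p>foo <b>bar</b></p>` and `to_wikitext(tree)` returns the original string, so an editor working on the JSON form would only need the modify and serialise halves.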

But I guess it's more realistic to write a new JS parser than to write a
new PHP parser. The former can start as a stub, the latter would need to
be fully operational from the beginning.

Stephanie's suggestions are also interesting.

lampak

* (except the WYSIWTF, of course)




Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)

2011-01-01 Thread Roan Kattouw
2011/1/1 lampak llam...@gmail.com:
 It could be fixed by forbidding leaving unclosed tags in templates.
[...]
 I've tried to think how a perfect parser should work. Most of this has
 been already mentioned. I think it should work in two steps: first
 tokenise the code and transform it into an intermediate tree structure like
[...]
 and so on. Then this structure could be transformed into a) HTML for
 display, b) JSON for the WYSIWYG editor. Thanks to this you wouldn't
 need to write a whole new JS parser. The editor would get a half-ready
 product. The JS code would need to be able to: a) transform this
 structure into HTML, b) modify the structure, c) transform this
 structure back into wikitext.

Trevor Parscal already has a proof-of-concept parser that follows this
philosophy pretty much to the letter. I don't think it's in our SVN
repository yet (he said he would commit it some time ago) and I
haven't succeeded in convincing him to reply on this list (holidays, I
guess), but he's been playing around with it for about nine months now,
on and off, and from what I've heard and seen it's promising and
entirely in the spirit of your post.

Roan Kattouw (Catrope)
