My view is we should “just do it”. If we wait until DocFormats is “stable” 
it’ll never get done, because it’s going to take me somewhere between a few 
weeks and a few months (depending on how my other commitments go) to fully 
solve the separation of change detection and change application, and I have 
lots of work to do on the editor as well (specifically, getting it into a 
state where it can be used in a web browser or embedded webview, outside the 
context of UX Write).

Similarly, design-by-committee style discussions on how to handle feature 
mapping and so on are only going to delay things even further, and would be 
uninformed by 
implementation experience. If we start on the code now, we’ll both have at 
least some useful progress (even if it needs to be replaced or significantly 
refactored later), and we’ll learn things from the experience that will greatly 
help us in decision making.

I suggest we start by following the current design of the Word filter. I want 
to document this design ASAP, but am trying to juggle this with other 
work. In particular, I think there are some key conceptual issues about the 
library that I haven’t adequately explained. Some relevant questions would 
give me good starting points for explanation/documentation.

What I’ll do as soon as I get the chance (hopefully next week) is to lay out a 
skeleton of the code in the repository, and put TODO comments everywhere, with 
explanations and references to the corresponding code in the Word filter. This 
initial skeleton will give us the most basic form of conversion (text only, 
with no formatting or special objects), and also illustrate how non-supported 
features are handled (no support for tables yet? no problem - the update code 
just doesn’t touch them). I’ll try to document both the Word 
filter and the general design approach incrementally as I go along, and will do 
that on the mailing list in the hope of prompting questions and discussions 
about improvements or alternative designs.
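
To make the "update code just doesn’t touch them" idea concrete, here is a minimal sketch in Python. It is not the DocFormats API or the Word filter's actual code; the names Node, get and put are hypothetical and chosen only for illustration. It assumes a concrete document is a flat list of nodes, that the abstract (editable) view exposes paragraph text only, and that the update step writes changes back in place.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical concrete-document node (not a DocFormats type)."""
    kind: str        # e.g. "paragraph" or "table"
    text: str = ""   # only meaningful for paragraphs in this sketch
    raw: str = ""    # opaque payload for features the filter does not understand


def get(concrete: List[Node]) -> Dict[int, str]:
    """Forward direction: expose paragraph text only, keyed by concrete position."""
    return {i: n.text for i, n in enumerate(concrete) if n.kind == "paragraph"}


def put(concrete: List[Node], edited: Dict[int, str]) -> None:
    """Backward direction: write edited text back in place, skipping all else."""
    for i, node in enumerate(concrete):
        if node.kind == "paragraph" and i in edited:
            node.text = edited[i]
        # Unsupported nodes (tables here) are never modified, so they survive.


doc = [
    Node("paragraph", text="Hello"),
    Node("table", raw="opaque table payload"),
    Node("paragraph", text="World"),
]
view = get(doc)          # the table is simply absent from the editable view
view[2] = "Corinthia"    # edit the second paragraph
put(doc, view)
```

Because put only ever writes to nodes it understands, adding table support later means teaching get and put about one more node kind, without changing how the other kinds are handled.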

What I’d ultimately like to do - and this is a big thing - is to design a 
programming language aimed specifically at implementing bidirectional 
transformations for tree-structured data, combined with built-in parsing 
capability that handles both XML and other arbitrary formal grammars. The 
parser will be based on PEG and Packrat parsing, and in terms of the 
transformation language I’m thinking something along the lines of Stratego/XT 
(check it out if you haven’t seen it yet; it’s utterly brilliant) but with 
static typing, similar to Haskell’s. I anticipate this greatly simplifying the 
implementation of filters, but it’ll likely be a year or more before we have 
something usable, and at that point we can rewrite filters to fit within 
it if appropriate. Also, this is something which will require input from 
everyone on the project, which means lots of prototypes and discussion.
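
For anyone who hasn't met packrat parsing before, the core idea is a recursive-descent PEG parser that memoizes each (rule, position) result so no rule is evaluated twice at the same input offset. The following is a minimal Python sketch under assumptions of my own: the toy grammar (Sum <- Num ('+' Num)*, Num <- [0-9]+) and the class name Packrat are purely illustrative and say nothing about what the eventual language or parser would look like.

```python
from typing import Callable, Dict, Optional, Tuple

# A rule returns (value, next position) on success, or None on failure.
Result = Optional[Tuple[int, int]]


class Packrat:
    """Toy packrat parser for: Sum <- Num ('+' Num)* ; Num <- [0-9]+"""

    def __init__(self, text: str) -> None:
        self.text = text
        self.memo: Dict[Tuple[str, int], Result] = {}

    def apply(self, name: str, rule: Callable[[int], Result], pos: int) -> Result:
        """Evaluate a rule at a position at most once, caching the outcome."""
        key = (name, pos)
        if key not in self.memo:
            self.memo[key] = rule(pos)
        return self.memo[key]

    def num(self, pos: int) -> Result:
        end = pos
        while end < len(self.text) and self.text[end] in "0123456789":
            end += 1
        return (int(self.text[pos:end]), end) if end > pos else None

    def sum(self, pos: int) -> Result:
        first = self.apply("num", self.num, pos)
        if first is None:
            return None
        total, pos = first
        while pos < len(self.text) and self.text[pos] == "+":
            nxt = self.apply("num", self.num, pos + 1)
            if nxt is None:
                break  # repetition stops; the dangling '+' is not consumed
            total, pos = total + nxt[0], nxt[1]
        return total, pos


parser = Packrat("12+30+4")
result = parser.apply("sum", parser.sum, 0)  # (46, 7)
```

The memo table is what distinguishes this from plain recursive descent: with backtracking alternatives in a larger grammar, it keeps parse time linear in the input length at the cost of extra memory.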

So let’s not wait for stability. That’s going to take too long, possibly 
postponing any new filter development indefinitely. The necessary pieces are 
already in place for us to begin work on the ODF filter, and I think it’ll be 
much more practical to deal with problems as we encounter them rather than 
trying to get a perfect “base” beforehand.

--
Dr. Peter M. Kelly
[email protected]
http://www.kellypmk.net/

PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)

> On 12 Feb 2015, at 12:45 am, Dennis E. Hamilton <[email protected]> 
> wrote:
> 
> -- replying below to --
> From: Louis Suárez-Potts [mailto:[email protected]] 
> Sent: Wednesday, February 11, 2015 06:04
> To: [email protected]; Dennis E. Hamilton
> Subject: Re: UnRTF Makes HTML
> [ ... ]
> 
> Perhaps we need to return, however, to our roadmap ambition. For instance, 
> what kind of plans do we have regarding ODF support? If we think it is time 
> to return to roadmap discussions, let’s start a new thread on that subject 
> and focus, yes?
> 
> <orcmid> A start?
> 
> I'm not so sure about plans, since it depends on where the developer effort 
> comes from, but I can see some definition happening.
> 
> 1. I don't think the code base around DocFormats and the HTML in and out is 
> quite stable yet.  Let's assume it is declared stable enough with acceptable 
> interface/architectural boundaries.
> 
> 2. Then we know that there needs to be an ODF access component and an ODF 
> edit/create component.
> 
> 3. With regard to feature support, there needs to be an agreement on how 
> features not supported through Corinthia are to be dealt with.  There are two 
> cases - features that cannot round-trip safely through the HTML, and features 
> that are not even mapped to or from the HTML.  This is an iterative cycle.
> 
> 4. Presumably, the feature set at the HTML level for editing of OOXML should 
> be the target at any point for ODF also.  So knowing what the HTML case is 
> and having the equivalent ODF features target those cases should be a 
> feasible way of tracking the evolution of Corinthia and DocFormats.
> 
> 5. I don't know if there is any source-target capability intended.  That is, 
> ODF -> HTML -> OOXML and vice versa.  That makes for nice testing cases 
> though.  It may serve the other Corinthia effort that is not being discussed, 
> the profiling of document-format provisions in implemented documents.
> 
> Is this enough to get the ball rolling?
> 
> </orcmid>
> 
