[
https://issues.apache.org/jira/browse/COR-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282084#comment-14282084
]
Peter Kelly commented on COR-35:
--------------------------------
I have a working prototype of this algorithm in Python, but am still working
out a few issues. Once I've solved these I will implement the algorithm in C
within DocFormats itself, and write a paper on it. Will write more soon; I'm
mostly just putting this issue here to flag that it's something I'm working on,
and that it will have a fundamental impact on the API and the ease with which
one can develop filters.
> Have put operation accept changes, not updated view
> ---------------------------------------------------
>
> Key: COR-35
> URL: https://issues.apache.org/jira/browse/COR-35
> Project: Corinthia
> Issue Type: Improvement
> Components: DocFormats - API
> Reporter: Peter Kelly
> Assignee: Peter Kelly
>
> The three canonical operations for bidirectional transformations are:
> get(S) -> V
> put(S,V') -> S'
> create(V) -> S
> where S is a "source", or "concrete" document (in our case a "third party"
> file format like .docx, .odt, or .md) and V is a "view", or "abstract"
> document (in our case, HTML). In the notation I've used, V is the original
> view created from the file, and V' is a modified version of the view obtained
> from the editor.
> The current implementation of put, at least in the Word filter (which is the
> only one that implements it so far), effectively reconstructs the original V
> produced during the get and compares it with V' (the edited version). This
> comparison is, however, done in an ad hoc and poorly defined manner, and
> would have to be replicated by every other filter.
> Essentially, the put operation has two jobs: 1) determine the set of changes
> that were made to the view - that is, D = diff(V,V'), and 2) apply those
> changes to the source document - that is, patch(S,D). The patch operation
> uses changes expressed in terms of the view's data model (HTML) to figure out
> how it must update the concrete representation.
> To simplify filter implementation, we can separate these two tasks, so that
> only the latter has to be performed by the put operation in each filter.
> This requires a diff algorithm that works on HTML files, and a way of
> representing the changes to the file. Once these exist, implementation of a
> patch algorithm will be relatively straightforward - it just consists of
> taking the original document V and executing the operations in sequence to
> produce V'. Thus, the relationship between the functions is as follows,
> assuming V is the original view, V' is the modified view produced by the
> editor, and D is the diff, or rather a sequence of change operations which
> mutate the DOM tree:
> diff(V,V') -> D
> patch(V,D) -> V'
> A line-by-line or similar linear diff is insufficient, as it does not take
> into account the tree structure of the document. Whereas a line-by-line diff
> consists of a set of insert/delete operations that work on a list/array, we
> need an algorithm that produces a set of operations that work on a tree.
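
To make the diff(V,V') -> D and patch(V,D) -> V' relationship above concrete,
here is a minimal sketch in Python. It is not the prototype mentioned in the
comment, and not a proposed DocFormats API. The Node class, the operation
tuples, and the path addressing scheme are all assumptions made for
illustration, and the diff is a naive positional comparison that stands in for
a real tree diff algorithm. It only shows that a change list expressed as tree
operations can be replayed in sequence to turn V into V'.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical stand-in for a DOM node: a tag, some text, and child nodes.
@dataclass
class Node:
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)

# A path is the sequence of child indices leading from the root to a node.
Path = Tuple[int, ...]

def clone(node: Node) -> Node:
    """Deep-copy a subtree so operations never share nodes with their input."""
    return Node(node.tag, node.text, [clone(c) for c in node.children])

def diff(old: Node, new: Node, path: Path = ()) -> list:
    """Naive positional tree diff (an assumption, not an optimal algorithm).

    Returns a list of operations, meant to be applied in order:
      ("set_text", path, text), ("delete", path), ("insert", path, node)
    """
    if old.tag != new.tag:
        raise ValueError("this sketch assumes both roots have the same tag")
    ops = []
    if old.text != new.text:
        ops.append(("set_text", path, new.text))
    common = min(len(old.children), len(new.children))
    for i in range(common):
        o, n = old.children[i], new.children[i]
        if o.tag == n.tag:
            ops.extend(diff(o, n, path + (i,)))
        else:
            # Different element at this position: replace the whole subtree.
            ops.append(("delete", path + (i,)))
            ops.append(("insert", path + (i,), clone(n)))
    # Surplus old children are deleted from the end so indices stay valid.
    for i in range(len(old.children) - 1, common - 1, -1):
        ops.append(("delete", path + (i,)))
    # Surplus new children are appended in order.
    for i in range(common, len(new.children)):
        ops.append(("insert", path + (i,), clone(new.children[i])))
    return ops

def patch(tree: Node, ops: list) -> Node:
    """Apply the operations in sequence to a copy of the tree."""
    root = clone(tree)
    for op in ops:
        kind, path = op[0], op[1]
        if kind == "set_text":
            node = root
            for i in path:
                node = node.children[i]
            node.text = op[2]
        elif kind in ("insert", "delete"):
            parent = root
            for i in path[:-1]:
                parent = parent.children[i]
            if kind == "insert":
                parent.children.insert(path[-1], clone(op[2]))
            else:
                del parent.children[path[-1]]
        else:
            raise ValueError(f"unknown operation: {kind}")
    return root

# V is the original view, V2 plays the role of V' (the edited view).
V = Node("body", children=[
    Node("h1", "Title"), Node("p", "First"), Node("p", "Second")])
V2 = Node("body", children=[
    Node("h1", "Title"), Node("p", "First, edited"),
    Node("ul", children=[Node("li", "item")])])

D = diff(V, V2)
assert patch(V, D) == V2
```

Because the operations address nodes by their position in the tree, a filter's
put would receive D in terms of the view's structure rather than as line
insertions and deletions. A real implementation would need a diff that also
recognises moved or re-parented subtrees, which this positional sketch treats
as a delete followed by an insert.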