In current leo, the data that could be associated with a vnode within
the leo file is the following (taken from leoNodes.py):
# Archived...
clonedBit = 0x01 # True: vnode has clone mark.
# not used = 0x02
expandedBit = 0x04 # True: vnode is expanded.
markedBit = 0x08 # True: vnode is marked
orphanBit = 0x10 # True: vnode saved in .leo file, not derived file.
selectedBit = 0x20 # True: vnode is current vnode.
topBit = 0x40 # True: vnode was top vnode when saved.
Arbitrary further attributes could be added by plugins.
The proposal for "unified nodes" moves this data to the tnode. E.g.,
the clonedBit is derived from the tnode (i.e., length of "parent"
list > 1). The expanded bit becomes an attribute of the tnode instead
of the vnode, and as I had foreseen, all nodes with the same tnode get
expanded and contracted in concert.
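
To make the consequence concrete, here is a minimal Python sketch of such a unified node. The class and attribute names are made up for illustration and are not Leo's actual API; the only points it tries to show are that "cloned" is computed from the parent list rather than stored, and that the expanded state lives in one place.

```python
# Hypothetical unified node, for illustration only (not Leo's real classes).
class Node:
    def __init__(self, gnx, headline="", body=""):
        self.gnx = gnx
        self.headline = headline
        self.body = body
        self.children = []     # ordered references to other nodes
        self.parents = []      # one entry per incoming link
        self.expanded = False  # stored once, on the node itself

    def add_child(self, child):
        self.children.append(child)
        child.parents.append(self)

    @property
    def is_cloned(self):
        # Derived, not stored: more than one incoming link means "clone".
        return len(self.parents) > 1


root = Node("gnx0", "root")
a = Node("gnx1", "a")
b = Node("gnx2", "b")
shared = Node("gnx3", "shared")
root.add_child(a)
root.add_child(b)
a.add_child(shared)
b.add_child(shared)

# Expanding the node once expands it under every parent.
shared.expanded = True
```

Here `shared` is reachable through both `a` and `b`, so it reports itself as cloned, and both appearances see the same expanded flag.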
In the new regime, it may be beneficial to continue attaching saved
information to the directed link between one node and another,
although perhaps more sparsely than at present: instead of
putting the entire expanded tree in this new form of leo file, "link"
data could default to "unexpanded, unmarked, unselected, not top", and a
link data item could be written only for the exceptions. What identifies a
link? *Not* the full path from the (implicit) root node, that is,
*not* a position. Instead, only the node from which it proceeds
(either a normal node or the inherent "root" node) and the index of
the child to which it proceeds (never the inherent root node). Nodes
are uniquely identified by their gnx, so such an element would look
like
<link from="gnx1" cx="4" a="EMOTV" unknAtrr1="foo" />
Leo would check that the gnx1 node is defined within the graph
and that it has enough children for the given index, or
report an error in the .leo file.
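
A rough Python sketch of that check, using the standard xml.etree.ElementTree module. Several things here are my assumptions and not part of any existing Leo file format: the function name `read_links`, a zero-based `cx`, a wrapping `<links>` element, and the reading of the letters in `a` (E = expanded, M = marked, V = selected, T = top; other letters are ignored in this sketch).

```python
import xml.etree.ElementTree as ET

# Defaults assumed for every link that has no <link> element of its own.
DEFAULT_LINK = {"expanded": False, "marked": False, "selected": False, "top": False}

# Assumed meaning of the letters in the "a" attribute.
FLAG_LETTERS = {"E": "expanded", "M": "marked", "V": "selected", "T": "top"}


def read_links(xml_text, children_by_gnx):
    """Return {(from_gnx, child_index): flags} for the non-default links.

    children_by_gnx maps each defined gnx to its ordered list of child gnxs.
    Raises ValueError if a link names an undefined node or a child index
    that the node does not have.
    """
    links = {}
    for elem in ET.fromstring(xml_text).iter("link"):
        gnx = elem.get("from")
        if gnx not in children_by_gnx:
            raise ValueError(f"link from undefined node {gnx!r}")
        index = int(elem.get("cx", ""))  # assumed zero-based
        if not 0 <= index < len(children_by_gnx[gnx]):
            raise ValueError(f"node {gnx!r} has no child at index {index}")
        flags = dict(DEFAULT_LINK)
        for letter in elem.get("a", ""):
            if letter in FLAG_LETTERS:
                flags[FLAG_LETTERS[letter]] = True
        links[(gnx, index)] = flags
    return links


children = {"gnx1": ["gnx2", "gnx3", "gnx3", "gnx4", "gnx5"]}
doc = '<links><link from="gnx1" cx="4" a="EMOTV" unknAtrr1="foo" /></links>'
links = read_links(doc, children)
```

Any link that does not appear in the result simply takes the defaults, which is what keeps the file sparse.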
Nodes would look like the current tnodes, but with child lists. As I
noted earlier, a node is defined by
node_data # header, body
ordered_list_of_children # references to other nodes
which could be represented by an element similar to the current <t>,
but with the newly consolidated data: the gnx, the child list, the
headline, and the body:
<node tx="gnx1" cl="gnx2,gnx3,gnx3" h="The headline">The body
</node>
Alternatively,
<node tx="gnx1" cl="gnx2,gnx3,gnx3">
<h>The headline</h>
<b>The body
</b>
</node>
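
Either form is easy to read back. The following Python sketch accepts both; the function name `read_node` and the tuple it returns are only illustrative assumptions, not an existing Leo reader.

```python
import xml.etree.ElementTree as ET


def read_node(elem):
    """Turn one <node> element into (gnx, child_gnxs, headline, body)."""
    gnx = elem.get("tx")
    # The child list is comma-separated; an absent or empty list means no children.
    child_gnxs = [c for c in elem.get("cl", "").split(",") if c]
    h = elem.find("h")
    if h is not None:
        # Second form: headline and body are child elements.
        headline = h.text or ""
        b = elem.find("b")
        body = (b.text or "") if b is not None else ""
    else:
        # First form: headline is an attribute, body is the element text.
        headline = elem.get("h", "")
        body = elem.text or ""
    return gnx, child_gnxs, headline, body


form1 = ET.fromstring(
    '<node tx="gnx1" cl="gnx2,gnx3,gnx3" h="The headline">The body\n</node>'
)
form2 = ET.fromstring(
    '<node tx="gnx1" cl="gnx2,gnx3,gnx3">\n'
    "<h>The headline</h>\n"
    "<b>The body\n</b>\n"
    "</node>"
)
parsed1 = read_node(form1)
parsed2 = read_node(form2)
```

Note that a child gnx may legitimately appear more than once in the list, as gnx3 does here.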
Leo would need to ensure that child nodes were defined, or generate a
warning that the leo file was corrupt and that it had created minimal
nodes to satisfy the references, with something like "Missing Headline"
and an empty body. I leave it as an exercise for the reader to find the
algorithm ensuring that the nodes describe a DAG and not a generalized
graph ;-}
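
For the impatient reader, one possible answer in Python: first fill in the missing children, then run a depth-first search and look for an edge back to a node that is still on the current path. This is the ordinary three-color cycle check, not anything specific to Leo; the dict layout and the names `fill_missing` and `find_cycle` are my own assumptions.

```python
def fill_missing(nodes):
    """Add a minimal node for every child gnx that is referenced but undefined.

    nodes maps gnx -> {"h": headline, "b": body, "cl": [child gnxs]}.
    Returns the gnxs that had to be created, so a warning can list them.
    """
    missing = []
    for node in list(nodes.values()):  # snapshot: we add to nodes below
        for child in node["cl"]:
            if child not in nodes:
                nodes[child] = {"h": "Missing Headline", "b": "", "cl": []}
                missing.append(child)
    return missing


def find_cycle(nodes):
    """Return a list of gnxs forming a cycle, or None if the nodes form a DAG.

    Assumes fill_missing() has already run, so every child gnx is defined.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {gnx: WHITE for gnx in nodes}
    for start in nodes:
        if color[start] != WHITE:
            continue
        color[start] = GREY
        path = [start]
        stack = [iter(nodes[start]["cl"])]
        while stack:
            child = next(stack[-1], None)
            if child is None:
                # All children done: this node is finished.
                color[path.pop()] = BLACK
                stack.pop()
            elif color[child] == GREY:
                # Edge back into the current path: that is a cycle.
                return path[path.index(child):] + [child]
            elif color[child] == WHITE:
                color[child] = GREY
                path.append(child)
                stack.append(iter(nodes[child]["cl"]))
    return None


nodes = {
    "gnx1": {"h": "The headline", "b": "The body\n", "cl": ["gnx2", "gnx3", "gnx3"]},
    "gnx2": {"h": "child", "b": "", "cl": ["gnx3"]},
}
created = fill_missing(nodes)
cycle_before = find_cycle(nodes)
nodes["gnx3"]["cl"].append("gnx1")  # deliberately corrupt the graph
cycle_after = find_cycle(nodes)
```

A node reached twice by different routes (a clone) is fine, since it is already finished by the second visit; only a route that leads back into itself is reported.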
- Stephen
On Apr 14, 3:29 pm, derwisch <[EMAIL PROTECTED]>
wrote:
> Edward, I am very glad you are sharing your thoughts in this depth.
> At the same time I still feel stymied as I just sent in the slides,
> where it is basically stated that Leo's data structure basically
> reflects
> ODM, that Leo is being developed for 10 years and has matured,
> is stable, unlikely to undergo big changes etc. It seems like the
> removal to Google Groups and Launchpad has re-vitalised the
> project (not that it ever felt dead anyway) and spurred not only
> development, but the thinking about the general data model. Again,
> I appreciate that you are helping existing projects to not fall behind
> the wayside. Of course I always saw some redundancy in the
> current data model, but I mostly failed to distinguish between p
> and vnodes.
>
> On 14 Apr., 16:32, "Edward K. Ream" <[EMAIL PROTECTED]> wrote:
>
> > On Feb 26, 3:15 pm, derwisch <[EMAIL PROTECTED]>
> > wrote:
>
> [...]
> > As I understand it, you propose to create clones of data, like this:
>
> > - a(1)
> > - a(2)
>
> > And then you distinguish between a(1) and a(2) using
> > v.unknownAttributes. Imo this is very bad style. You can not be
> > blamed: the fault is Leo's for giving you two flavors of uA's.
>
> A rose by any other name. If you look at the ODM you will see
> that there is a distinction between --Ref and --Def elements, and
> that some attributes are peculiar to the reference and some to
> the definition. You may refer to it as bad style but I was elated
> to see this structure mirrored by Leo.
>
>
>
> > A much better style would be the following. It is based on the
> > observation that view nodes are *not* clones, they *contain* clones.
> > So the organization would be:
>
> > - trial (summary view)
> > - common trial data(clone)
> > - trial view 1
> > - common trial data (clone)
> > - data local to trial 1
> > - trial view 2
> > - common trial data (clone)
> > - data local to trial 2
>
> It is quite obvious to me that you can still model DAGs with
> the data model to be. You just need auxiliary nodes, somehow
> like the --Ref nodes, which currently can be abstracted away.
>
> The question is, how do I hide the auxiliary nodes from the
> user. I would really like to preserve the tree view which is
> for instance seen in the screenshot of this
> offer: http://www.xml4pharma.com/SDTM-ETL/
>
> The other suggestion was outlined by you and Terry in the
> parallel thread: to somehow annotate the position during a
> traversal of the tree. You call that easy, but then you are a
> programmer.
>
> > Furthermore, you can create clones of the 'trial view 1' and 'trial
> > view 2' nodes and put those clones anywhere you like in the outline.
> > My guess is that this kind of organization gives you much more
> > flexibility than you had before. You can attach a uA to any of the
> > nodes, and there will be no need ever to distinguish what the uA
> > contains based on the location of the node, or whether it is a clone
> > or not, or on any other criterion except what the node *is*.
>
> > So we see that distinguishing between vnodes and tnodes naturally
> > leads people to *bad* style. This kind of mistake will simply not be
> > possible to make in the unified node world.
>
> > 2. Point 1 also shows why I am not enthusiastic about the graph world,
> > even if some low-level impediments will be removed in the unified-node
> > world. Indeed, **views do not exist in the graph world**. Or rather,
> > if they do exist, they will be a contrived combination of hard-to-
> > understand iters and specialized conventions. Imo, the essence of
> > understanding and manipulating data is the creation of arbitrarily
> > many views on the data. This is true regardless of the scale of the
> > problem: it is true for the human genome project, or for any other
> > project. No exceptions.
>
> > Edward