RE: [fs] RE: Data tables

2004-10-07 Thread Jon Berndt
> Summarising, the current DAVE-ML gridded table structure is valid XML
> but the gridded tables are not as fine-grained as they could be.
> However, the current structure seems to me a reasonable balance between
> size of the resulting DOM and accessibility of data points within it.
> Division into individual data points will not significantly improve
> accessibility for the calling program, and will add both memory and
> initialisation time overhead.
>
> I guess the point I'm really making here is that the appearance of the
> data to the program accessing a DOM is probably more important than its
> appearance to humans using various XML viewers.
>
> Dan Newman

I agree with you on this. I'm now writing code in JSBSim submodels to extract values from
the DOM tree structure that was parsed using an XML parsing class I wrote that's based on
(and wraps) eXpat. I can see that the deeper the tree, the more tedious it gets. I was
unsure before, but this discussion has made it pretty clear that the DAVE-ML gridded table
structure is a good balance.

Jon S. Berndt



Re: [fs] RE: Data tables

2004-10-07 Thread Dan Newman
On Fri, 2004-10-08 at 03:53, Giovanni A. Cignoni wrote:
> The gridded text is useful for readability for those who like to
> look at the XML code in a text editor. For all the other DAVE-ML
> items, that readability is disturbed anyway by the verbosity
> of the XML tags.
> 
> If you open the file with a general XML viewer the gridded text
> is usually put on a single line. Tables are unreadable.
> 
> A tagged solution looks terrible in the code. Viewed 
> with a standard XML viewer it is a little better, but still not very 
> readable (columns will be listed one by one). Try, for instance, to 
> read the following with MS IE:
> 
> 
>   
> 00  01  02  03
> 04  05  06  07
>   
>   
>  00 01 02 03 
>  04 05 06 07 
>   
> 
> 
> The tagged format can be useful if we want to process the XML 
> to convert the format (for instance into a readable HTML form).
> 
> Another thing that may seem odd to XMLers is the
> different handling of values: for variables they are tagged
> inside a specific attribute, for table-defined functions
> they are untagged.
> 
> XML is for exchanging data. The goal of XML is to identify 
> the relevant items of a set of data and give them semantics.
> We can't treat a table of n x m values the same as
> a textual description. I'm afraid the "right" thing to do for
> an XML format is to eventually go tagged (and, of course, 
> provide a viewer).
> 
It seems to me that the most important aspect is not how the XML looks
in either a text viewer or an XML viewer (although in passing I'd say
that the current format looks quite reasonable in Mlview, and not too
bad in XMLSpy).  We can use XSLT and stylesheets to make it look any way
we like in IE or Mozilla, if necessary.

The significant aspect is how parsers (or just our programs if we're
reading it directly and doing our own parsing) deal with the XML data. 
I've been using the Xerces-C parser to load a DOM, and this results in a
tree structure whose node attributes and content are all XMLStrings.  A
string-based structure will result no matter which XML format we use,
but the current DAVE-ML format will have many fewer nodes than one that
tags each gridded data point individually, although the DAVE-ML table
nodes will have longer content strings.
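To put rough numbers on the difference, the node counts can be estimated for a table of rows x cols values. The sketch below is illustrative only: the function names are hypothetical, and it assumes one element node plus one text node per tagged item, ignoring any whitespace-only text nodes a parser may create between elements.

```cpp
#include <cstddef>

// Current format: one table element holding a single text node.
std::size_t nodesGridded()
{
    return 2;
}

// Per-point format: the table element, plus an element node and a
// text node for each of the rows * cols data points.
std::size_t nodesPerPoint( std::size_t rows, std::size_t cols )
{
    return 1 + 2 * rows * cols;
}
```

Under these assumptions a 10 x 20 table needs 2 nodes in the current format and 401 nodes when every point is tagged.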

With each data point tagged, any subsequent program access to the DOM
contents, which involves traversing the tree structure, will have many
more nodes to process and is therefore likely to take longer.  Per-point
tagging also doesn't provide any
great accessibility benefit, since a computational model is unlikely to
want to access just one point somewhere in a gridded data table.
 
Also, since we can't perform flight mechanics computations using
strings, at some point we're going to have to transform all these DOM
table node contents to a numeric array.  At present I do that with a
loop over the data table string contents, like:

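// digits and delimiters are character-set strings defined elsewhere.
// transcode() allocates the buffer, so keep the pointer and release it.
char* text = XMLString::transcode( dataTableItem->getNodeValue() );
char* next = strpbrk( text, digits );
while ( NULL != next ) {
    dataTable_[i][ia] = strtod( next, NULL );
    ia = ia + 1;
    // Move past the current number, then to the start of the next one.
    next = strpbrk( next, delimiters );
    if ( NULL != next ) {
        next = strpbrk( next, digits );
    }
}
XMLString::release( &text );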
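
For readers without Xerces at hand, the same string-to-number conversion can be sketched against a plain C string. This is a minimal, self-contained sketch and not the model's actual code: the function name parseGriddedText is hypothetical, and it steps through the text using the end pointer of strtod instead of the digits and delimiters sets used above.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper: convert whitespace- or comma-separated table
// text into a flat list of doubles.
std::vector<double> parseGriddedText( const char* text )
{
    std::vector<double> values;
    const char* next = text;
    while ( *next != '\0' ) {
        char* end = NULL;
        double value = std::strtod( next, &end );
        if ( end == next ) {
            // No number starts here (e.g. a comma): skip one character.
            ++next;
        } else {
            values.push_back( value );
            next = end;
        }
    }
    return values;
}
```

The flat list can then be copied into a two-dimensional array once the table dimensions are known.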

If we tagged the gridded data points individually, the loop would instead
traverse the individual elements and convert each one separately, which
would add somewhat to the computational overhead of model initialisation.
The run-time computational load for tabular
data would be unchanged.
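
As a rough illustration of that per-point loop, the sketch below uses a hypothetical PointNode struct as a stand-in for a DOM element holding one data point. It is not Xerces code, and a real DOM traversal would also have to transcode each node's string.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM element that holds one data point.
struct PointNode {
    std::string text;
};

// Visit every point node and convert its text on its own: one node
// access and one strtod call per data point.
std::vector<double> convertPoints( const std::vector<PointNode>& points )
{
    std::vector<double> values;
    values.reserve( points.size() );
    for ( std::size_t i = 0; i < points.size(); ++i ) {
        values.push_back( std::strtod( points[i].text.c_str(), NULL ) );
    }
    return values;
}
```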

Summarising, the current DAVE-ML gridded table structure is valid XML
but the gridded tables are not as fine-grained as they could be. 
However, the current structure seems to me a reasonable balance between
size of the resulting DOM and accessibility of data points within it. 
Division into individual data points will not significantly improve
accessibility for the calling program, and will add both memory and
initialisation time overhead.

I guess the point I'm really making here is that the appearance of the
data to the program accessing a DOM is probably more important than its
appearance to humans using various XML viewers.

Dan Newman




RE: [fs] RE: Data tables

2004-10-07 Thread Chen, Alan
> XML is for exchanging data. The goal of XML is to identify 
> the relevant items of a set of data and give them semantics.
> We can't treat a table of n x m values the same as
> a textual description. I'm afraid the "right" thing to do for
> an XML format is to eventually go tagged (and, of course, 
> provide a viewer).

Perhaps a good compromise between the XML "granularity", readability,
and performance would be to tag the data per row as below:

 
   
 00  01  02  03
 04  05  06  07
   
   
  00 01 02 03 
  04 05 06 07 
   

  
00 01 02 03
04 05 06 07
  
 
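
Reading such a per-row layout into a two-dimensional array could look like the sketch below. It is only an illustration: the row strings stand in for the text content of each row element, and parseRow and parseRows are hypothetical names, not part of DAVE-ML or any parser API.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Convert the text of one row element into its numeric values.
std::vector<double> parseRow( const std::string& rowText )
{
    std::vector<double> row;
    std::istringstream in( rowText );
    double value;
    while ( in >> value ) {
        row.push_back( value );
    }
    return row;
}

// One node visit and one conversion pass per row, not per point.
std::vector< std::vector<double> > parseRows( const std::vector<std::string>& rowTexts )
{
    std::vector< std::vector<double> > table;
    for ( std::size_t i = 0; i < rowTexts.size(); ++i ) {
        table.push_back( parseRow( rowTexts[i] ) );
    }
    return table;
}
```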


Re: [fs] RE: Data tables

2004-10-07 Thread Giovanni A. Cignoni
> I like the format of the gridded table as it is now. It's quite 
> similar to the way we do it in JSBSim. I had gotten into a discussion
> with a more hard-core XML guy recently,
> though, and some of the points he made seemed sensible. 
> But, I wanted to get some opinions from others who are doing similar 
> work. The answers are pretty much along the lines of
> what I expected - that is, there's not really any convincing 
> reason to tag each data point.

The gridded text is useful for readability for those who like to
look at the XML code in a text editor. For all the other DAVE-ML
items, that readability is disturbed anyway by the verbosity
of the XML tags.

If you open the file with a general XML viewer the gridded text
is usually put on a single line. Tables are unreadable.

A tagged solution looks terrible in the code. Viewed 
with a standard XML viewer it is a little better, but still not very 
readable (columns will be listed one by one). Try, for instance, to 
read the following with MS IE:


  
00  01  02  03
04  05  06  07
  
  
 00 01 02 03 
 04 05 06 07 
  


The tagged format can be useful if we want to process the XML 
to convert the format (for instance into a readable HTML form).

Another thing that may seem odd to XMLers is the
different handling of values: for variables they are tagged
inside a specific attribute, for table-defined functions
they are untagged.

XML is for exchanging data. The goal of XML is to identify 
the relevant items of a set of data and give them semantics.
We can't treat a table of n x m values the same as
a textual description. I'm afraid the "right" thing to do for
an XML format is to eventually go tagged (and, of course, 
provide a viewer).

Just for the sake of discussion: I'm used to looking at the code,
so I'm happy with the gridded solution :)

Giovanni.