RE: [fs] RE: Data tables

2004-10-07 Thread Jon Berndt
> Summarising, the current DAVE-ML gridded table structure is valid XML
> but the gridded tables are not as fine-grained as they could be.
> However, the current structure seems to me a reasonable balance between
> size of the resulting DOM and accessibility of data points within it.
> Division into individual data points will not significantly improve
> accessibility for the calling program, and will add both memory and
> initialisation time overhead.
>
> I guess the point I'm really making here is that the appearance of the
> data to the program accessing a DOM is probably more important than its
> appearance to humans using various XML viewers.
>
> Dan Newman

I agree with you on this. I'm now writing code in JSBSim submodels to extract
values from the DOM tree structure that was parsed using an XML parsing class
I wrote that's based on (and wraps) eXpat. I can see that the deeper the tree,
the more tedious it gets. I was unsure before, but this discussion has made it
pretty clear that the DAVE-ML gridded table structure is a good balance.

Jon S. Berndt



Re: [fs] RE: Data tables

2004-10-07 Thread Dan Newman
On Fri, 2004-10-08 at 03:53, Giovanni A. Cignoni wrote:
> The gridded text is useful for readability for those who like to
> look in the XML code using a text editor. Readability that, for
> all the rest of DAVE-ML items, is anyway disturbed by the verbosity
> of the XML tags.
> 
> If you open the file with a general XML viewer the gridded text
> is usually put on a single line. Tables are unreadable.
> 
> A tagged solution is terrible looking in the code. Viewed 
> with a standard XML viewer is a little better, but still not very 
> readable (columns will be listed one by one). Try for instance to 
> read the following with MS IE:
> 
> 
>   
> 00  01  02  03
> 04  05  06  07
>   
>   
>  00 01 02 03 
>  04 05 06 07 
>   
> 
> 
> The tagged format can be useful if we want to process the XML 
> to convert the format (for instance in a HTML readable form).
> 
> Another thing that, for the XMLers, may result odd is the
> different handling of values: for variables they are tagged
> inside a specific attribute, for table defined functions
> they are untagged.
> 
> XML is for exchanging data. The goal of XML is to identify 
> the relevant items of a set of data and give them a semantics.
> We can't consider the same a table of n x m values and
> a textual description. I'm afraid the "right" thing to do for
> an XML format is to eventually go tagged (and, of course, 
> provide a viewer).
> 
It seems to me that the most important aspect is not how the XML looks
in either a text viewer or an XML viewer (although in passing I'd say
that the current format looks quite reasonable in Mlview, and not too
bad in XMLSpy).  We can use XSLT and stylesheets to make it look any way
we like in IE or Mozilla, if necessary.

The significant aspect is how parsers (or just our programs if we're
reading it directly and doing our own parsing) deal with the XML data. 
I've been using the Xerces-C parser to load a DOM, and this results in a
tree structure whose node attributes and content are all XMLStrings.  A
string-based structure will result no matter which XML format we use,
but the current DAVE-ML format will have many fewer nodes than one in
which each gridded data point is tagged individually, although the
DAVE-ML table nodes will have longer content strings.

With each point tagged individually, any subsequent program access to the
DOM contents, which involves traversing the tree structure, will have many
more nodes to process and is therefore likely to take longer.  That
tagging doesn't provide any
great accessibility benefit, since a computational model is unlikely to
want to access just one point somewhere in a gridded data table.
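
To put rough numbers on the difference in tree size, here is a minimal
sketch. It assumes that the tagged layout has the shape table > row > cell >
text, that the current layout costs one element node plus one text node, and
that whitespace-only text nodes are ignored. The function names are
hypothetical and belong to neither DAVE-ML nor Xerces-C.

```cpp
#include <cassert>
#include <cstddef>

// Nodes for the current layout: one table element holding one text node
// that contains every value as a whitespace-separated string.
std::size_t nodesForStringTable()
{
  return 2;
}

// Nodes for a fully tagged layout (assumed shape: table > row > cell > text).
// Whitespace-only text nodes between elements are ignored in this count.
std::size_t nodesForTaggedTable( std::size_t rows, std::size_t cols )
{
  assert( rows > 0 && cols > 0 );
  const std::size_t cells = rows * cols;
  return 1        // the table element
       + rows     // one element per row
       + cells    // one element per data point
       + cells;   // one text node per data point
}
```

Under these assumptions a 4 x 6 table costs 1 + 4 + 24 + 24 = 53 nodes when
fully tagged, against 2 for the single-string form.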
 
Also, since we can't perform flight mechanics computations using
strings, at some point we're going to have to transform all these DOM
table node contents to a numeric array.  At present I do that with a
loop over the data table string contents, like:

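// 'digits' and 'delimiters' are character sets defined elsewhere.
// transcode() allocates the buffer, so keep the pointer and release it.
char* text = XMLString::transcode( dataTableItem->getNodeValue() );
char* next = strpbrk( text, digits );          // start of the first number
while ( NULL != next ) {
  dataTable_[i][ia] = strtod( next, NULL );    // convert the number at 'next'
  ia = ia + 1;
  next = strpbrk( next, delimiters );          // move to the end of this number
  if ( NULL != next ) {
    next = strpbrk( next, digits );            // move to the start of the next
  }
}
XMLString::release( &text );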

If we tagged the gridded data points individually, this would change to
make the loop traverse the individual elements and convert each of them
individually, which would add somewhat to model initialisation
computational overhead.  The run-time computational load for tabular
data would be unchanged.
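
For anyone who wants to try the string-to-array conversion outside Xerces-C,
here is a minimal standalone sketch of the same idea. It assumes the node
text has already been transcoded into a std::string. parseTableText is a
hypothetical name, and the sketch relies on strtod's end pointer rather than
on the digits/delimiters character sets used in the loop above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Convert a whitespace-separated table string into a flat numeric array.
// Parsing stops at the first token that strtod cannot convert.
std::vector<double> parseTableText( const std::string& text )
{
  std::vector<double> values;
  const char* next = text.c_str();
  char* end = nullptr;
  for ( ;; ) {
    const double value = std::strtod( next, &end );  // skips leading whitespace
    if ( end == next ) {
      break;                                         // no conversion: finished
    }
    values.push_back( value );
    next = end;                                      // continue after the number
  }
  return values;
}
```

A call such as parseTableText( "1.5 -2\n3e2" ) would yield a three-element
array that the model can then reshape using the dimensions given by the
surrounding XML.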

Summarising, the current DAVE-ML gridded table structure is valid XML
but the gridded tables are not as fine-grained as they could be. 
However, the current structure seems to me a reasonable balance between
size of the resulting DOM and accessibility of data points within it. 
Division into individual data points will not significantly improve
accessibility for the calling program, and will add both memory and
initialisation time overhead.

I guess the point I'm really making here is that the appearance of the
data to the program accessing a DOM is probably more important than its
appearance to humans using various XML viewers.

Dan Newman




[fs] RE: Data tables

2004-10-07 Thread Brian, Geoff
From our experience (see also the previous comments by Dan Newman), we are comfortable 
with the tagging format of the gridded table as it is currently specified in DAVE-ML, 
but also think that including the type definition would be beneficial.

More thought required for the ungridded data though. It will require individual data 
source and quality tags, whereas the gridded data needs only a single data source and 
quality reference, because by definition it has been massaged into a regular grid and 
it is this massaging process that needs to be referenced rather than the raw data.

Maybe a single string with all the data for each ungridded point will make for a more 
easily manipulated DOM than the full-on XML / XHTML option; an ongoing thought.

Cheers
Geoff

-Original Message-
From: Chen, Alan [mailto:[EMAIL PROTECTED]
Sent: Friday, 8 October 2004 4:50 AM
To: [EMAIL PROTECTED] Nasa. Gov
Subject: RE: [fs] RE: Data tables


> XML is for exchanging data. The goal of XML is to identify 
> the relevant items of a set of data and give them a semantics.
> We can't consider the same a table of n x m values and
> a textual description. I'm afraid the "right" thing to do for
> an XML format is to eventually go tagged (and, of course, 
> provide a viewer).

Perhaps a good compromise between the XML "granularity", readability,
and performance would be to tag the data per row as below:

 
   
 00  01  02  03
 04  05  06  07
   
   
  00 01 02 03 
  04 05 06 07 
   

  
00 01 02 03
04 05 06 07
  
 


RE: [fs] RE: Data tables

2004-10-07 Thread Chen, Alan
> XML is for exchanging data. The goal of XML is to identify 
> the relevant items of a set of data and give them a semantics.
> We can't consider the same a table of n x m values and
> a textual description. I'm afraid the "right" thing to do for
> an XML format is to eventually go tagged (and, of course, 
> provide a viewer).

Perhaps a good compromise between the XML "granularity", readability,
and performance would be to tag the data per row as below:

 
   
 00  01  02  03
 04  05  06  07
   
   
  00 01 02 03 
  04 05 06 07 
   

  
00 01 02 03
04 05 06 07
  
 


Re: [fs] RE: Data tables

2004-10-07 Thread Giovanni A. Cignoni
> I like the format of the gridded table as it is now. It's quite 
> similar to the way we do it in JSBSim. I had gotten into a discussion
> with a more hard-core XML guy recently,
> though, and some of the points he made seemed sensible. 
> But, I wanted to get some opinions from others who are doing similar 
> work. The answers are pretty much along the lines of
> what I expected - that is, there's not really any convincing 
> reason to tag each data point.

The gridded text helps readability for those who like to look at
the XML code in a text editor, although for all the other DAVE-ML
items readability is disturbed anyway by the verbosity of the
XML tags.

If you open the file with a general XML viewer the gridded text
is usually put on a single line. Tables are unreadable.

A tagged solution looks terrible in the code. Viewed with a
standard XML viewer it is a little better, but still not very
readable (columns will be listed one by one). Try for instance to
read the following with MS IE:


  
00  01  02  03
04  05  06  07
  
  
 00 01 02 03 
 04 05 06 07 
  


The tagged format can be useful if we want to process the XML 
to convert the format (for instance into a readable HTML form).

Another thing that may seem odd to XMLers is the different
handling of values: for variables they are tagged inside a
specific attribute, while for table-defined functions they are
untagged.

XML is for exchanging data. The goal of XML is to identify 
the relevant items of a set of data and give them semantics.
We can't treat a table of n x m values the same as a textual
description. I'm afraid the "right" thing to do for an XML
format is to eventually go tagged (and, of course, provide a
viewer).

Just for the sake of discussion: I'm used to looking at the code,
so I'm happy with the gridded solution :)

Giovanni.



RE: Data tables

2004-10-01 Thread Jon Berndt
> The suggestion JSB has made looks to be in accordance with a "pure" XML
> or HTML layout, but in the DSTO application I'm involved in we're using
> the Xerces-C parser to build our DOM at low level, so the approach of
> tagging each table point will result in many more short nodes in the DOM
> rather than one node with a very long content string.  In effect,
> navigating the DOM tree could be slowed down quite a lot (including for
> access to data not related to the table), and the DOM might require more
> memory just for the parser's internal structure.
>
> On the whole the current structure seems a reasonable compromise.

Let me clarify that I wasn't suggesting a move to tagging each table data
point. In fact, I like the format of the gridded table as it is now. It's
quite similar to the way we do it in JSBSim. I had gotten into a discussion
with a more hard-core XML guy recently, though, and some of the points he made
seemed sensible. But, I wanted to get some opinions from others who are doing
similar work. The answers are pretty much along the lines of what I expected -
that is, there's not really any convincing reason to tag each data point.

Jon



Re: Data tables

2004-10-01 Thread Dan Newman
The suggestion JSB has made looks to be in accordance with a "pure" XML
or HTML layout, but in the DSTO application I'm involved in we're using
the Xerces-C parser to build our DOM at low level, so the approach of
tagging each table point will result in many more short nodes in the DOM
rather than one node with a very long content string.  In effect,
navigating the DOM tree could be slowed down quite a lot (including for
access to data not related to the table), and the DOM might require more
memory just for the parser's internal structure.  

Since we load the table string into a 1-D numeric array as part of the
run-time initialisation, and would similarly pre-load the individual
points if we had to, this proposed structure would not affect our
real-time computational performance, but it might make the actual
initialisation quite slow.
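
A minimal sketch of why run-time performance is unaffected, assuming a 2-D
table stored row by row in the 1-D array. The FlatTable name and its members
are hypothetical. Once the values are numeric, a lookup is a single index
computation, however the XML was tagged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A 2-D gridded table held as one contiguous array in row-major order.
struct FlatTable
{
  std::size_t rows;
  std::size_t cols;
  std::vector<double> data;   // size must be rows * cols

  // Value at (row, col); the cost does not depend on the XML layout
  // the numbers were originally read from.
  double at( std::size_t row, std::size_t col ) const
  {
    assert( row < rows && col < cols );
    assert( data.size() == rows * cols );
    return data[ row * cols + col ];
  }
};
```

Only the code that fills the array during initialisation would differ between
the two XML layouts.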

On the whole the current structure seems a reasonable compromise.

As far as the ungridded data goes, I'm working on it at present, and in
the structure I've developed so far, again the use of more detail in
more nodes will not affect run-time speed but may slow down both
initialisation and navigation of the DOM, as well as using more memory.

Dan Newman





Re: Data tables

2004-10-01 Thread Bruce Jackson


At 5:50 PM -0500 9/29/04, Jon S Berndt wrote:
I've been looking at the data table
format for DAVE-ML, for example:

 
 9.5013e-01  6.1543e-01  5.7891e-02  1.5274e-02  8.3812e-01  1.9343e-01
 2.3114e-01  7.9194e-01  3.5287e-01  7.4679e-01  1.9640e-02  6.8222e-01
 8.2141e-01  4.1027e-01  2.7219e-01  2.0265e-01  3.0462e-01  8.5366e-01
 4.4470e-01  8.9365e-01  1.9881e-01  6.7214e-01  1.8965e-01  5.9356e-01


For JSBSim, we have been working on moving our aircraft config file
format closer to DAVE-ML, where applicable. Our table format is
already similar to the above, though in our case we have the
breakpoints as part of the table data.

As we discuss the newer XML format we intend to use for JSBSim, the
question has been raised about the pros and cons of the above format.
For one, it is nice to be able to look at the data and see what it
means without a bunch of tags obfuscating the data. The above format
also is conducive to import/export with spreadsheet programs. However,
if an editor or automated process is to be used in authoring a DAVE-ML
file (or in our case a JSBSim file) we also ask if there is a greater
benefit in using a more "whole" XML specification, such
as:


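 <tr> <td> 0.1 </td> <td> 0.34 </td> </tr>
 <tr> <td> 0.2 </td> <td> 0.49 </td> </tr>
 <tr> <td> 0.3 </td> <td> 0.57 </td> </tr>
 <tr> <td> 0.4 </td> <td> 0.68 </td> </tr>
 <tr> <td> 0.5 </td> <td> 1.75 </td> </tr>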


Jon Berndt

Jon,

My goal in the current DAVE-ML table design was to make it as
efficient as an ASCII implementation could be, since these tables
comprise the majority of data in most "heavyweight"
simulations. The interpretation of rows/columns/higher dimensions
comes from the surrounding XML structure in which this table is
embedded; the table doesn't (usually) stand alone.

As you are probably aware, DAVE-ML doesn't embed breakpoints
directly in the tables because these are typically reused - for
example, the angle-of-attack breakpoint set is often used in dozens of
tables. Again, the surrounding XML structure contains a reference to
the appropriate breakpoint set.
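
A minimal sketch of that arrangement, with hypothetical names throughout
(BreakpointSets, TableRef, breakpointsFor); none of them are taken from the
DAVE-ML DTD. Each breakpoint set is stored once, and every table carries only
the ID of the set it uses.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Breakpoint sets stored once, keyed by an ID string.
typedef std::map< std::string, std::vector<double> > BreakpointSets;

// A 1-D table that refers to its breakpoints by ID instead of copying them.
struct TableRef
{
  std::string breakpointID;
  std::vector<double> values;   // one value per breakpoint
};

// Look up the breakpoints a table refers to; the ID must exist and the
// set must have as many entries as the table has values.
const std::vector<double>& breakpointsFor( const BreakpointSets& sets,
                                           const TableRef& table )
{
  BreakpointSets::const_iterator it = sets.find( table.breakpointID );
  assert( it != sets.end() );
  assert( it->second.size() == table.values.size() );
  return it->second;
}
```

Two tables that name the same ID get the very same breakpoint array back, so
an angle-of-attack set shared by dozens of tables is held in memory once.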

I notice the use of HTML row/data delimiters above; those can
easily be achieved for a given table in an XSL transformation
stylesheet so that a browser, for example, can display a DAVE-ML table
properly. We didn't really expect to have to do that, however.

I must confess that Bruce Hildreth has been urging an example of
an "ungridded" table, which the DAVE-ML format supports but which
has to date been ignored, sort of like:


   

   0.1, -4.0, 9.5013e-01   
 
 
   

   0.1,  0.0, 6.1543e-01   
 
 
   

   0.5, -4.0, 2.3114e-01   
 
 
   

   0.6,  0.0, 4.1027e-01   
 
 


where you'll notice we embed the independent value(s) in the data
record, ahead of the dependent value.  For this type of data,
there is no breakpoint set per se.  This is more in line with the
"whole" XML implementation you're considering.

I guess you could go hog-wild XML and generate something
like


  
    0.1
    -0.4
    9.5013e-01
 
 

...



but that seems to be overkill to me.

Also note the (optional) use of the modID attribute, so we can
tag each data point to a particular source data document and thus know
where it came from or why it was changed.
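
A minimal sketch of how a reader might hold one ungridded record together
with its optional modID, assuming the comma-separated layout of the example
above with the dependent value last. UngriddedPoint, parseUngriddedPoint and
the way the modID is passed in are hypothetical, not part of DAVE-ML.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One ungridded data point: independent value(s), then the dependent value,
// plus an optional modification ID (empty when the attribute is absent).
struct UngriddedPoint
{
  std::vector<double> independents;
  double dependent;
  std::string modID;
};

// Parse a record such as "0.1, -4.0, 9.5013e-01"; the last number is
// taken as the dependent value.
UngriddedPoint parseUngriddedPoint( const std::string& record,
                                    const std::string& modID = "" )
{
  std::vector<double> numbers;
  std::istringstream fields( record );
  std::string field;
  while ( std::getline( fields, field, ',' ) ) {
    std::istringstream one( field );
    double value = 0.0;
    if ( one >> value ) {            // leading whitespace is skipped
      numbers.push_back( value );
    }
  }
  assert( !numbers.empty() );        // a record needs at least one number

  UngriddedPoint point;
  point.dependent = numbers.back();
  numbers.pop_back();
  point.independents = numbers;
  point.modID = modID;
  return point;
}
```

Because each point keeps its own modID, a change to a single point can be
traced without touching the rest of the data.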

[IMPORTANT NOTE TO AIRCRAFT DESIGNERS: do not rely on the table
above to design your aircraft. It is intended for illustrative
purposes only.] :)

-- Bruce




Data tables

2004-09-29 Thread Jon S Berndt
I've been looking at the data table format for DAVE-ML, for example:
 
 9.5013e-01  6.1543e-01  5.7891e-02  1.5274e-02  8.3812e-01  1.9343e-01
 2.3114e-01  7.9194e-01  3.5287e-01  7.4679e-01  1.9640e-02  6.8222e-01
 8.2141e-01  4.1027e-01  2.7219e-01  2.0265e-01  3.0462e-01  8.5366e-01
 4.4470e-01  8.9365e-01  1.9881e-01  6.7214e-01  1.8965e-01  5.9356e-01


For JSBSim, we have been working on moving our aircraft config file 
format closer to DAVE-ML, where applicable. Our table format is 
already similar to the above, though in our case we have the 
breakpoints as part of the table data.

As we discuss the newer XML format we intend to use for JSBSim, the 
question has been raised about the pros and cons of the above format. 
For one, it is nice to be able to look at the data and see what it 
means without a bunch of tags obfuscating the data. The above format 
also is conducive to import/export with spreadsheet programs. However, 
if an editor or automated process is to be used in authoring a DAVE-ML 
file (or in our case a JSBSim file) we also ask if there is a greater 
benefit in using a more "whole" XML specification, such as:


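 <tr> <td> 0.1 </td> <td> 0.34 </td> </tr>
 <tr> <td> 0.2 </td> <td> 0.49 </td> </tr>
 <tr> <td> 0.3 </td> <td> 0.57 </td> </tr>
 <tr> <td> 0.4 </td> <td> 0.68 </td> </tr>
 <tr> <td> 0.5 </td> <td> 1.75 </td> </tr>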

Jon Berndt