While trying to fix a bug related to name attachments not 
saving/loading, I worked through the code for the NFIO file handler to 
try to understand how things work (you may have seen my commits with 
documentation and comments).

Anyway, I noticed a few things that I wanted to ask about:

- There was a lot more string comparison and processing than I expected 
in a native format.

I think I was expecting something more like the BIN file handler, which 
streams everything in and out.  How much better does the BIN file handler 
perform?  

There may be some things that could be done to improve the performance 
of the OSB file handler by replacing some of the string processing.

- Why are field container names written as strings to the file for every 
field container? 

Couldn't we decrease the file size and increase load performance by 
extending the header to include a map from an fc type id to a field 
container name?  That way, everywhere a field container name appears in 
the current file structure we could just use a UInt32 that we could map 
directly to a field container type.  Note: to write this header we would 
need the next idea....
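
To make the header idea concrete, here is a rough sketch of what writing 
and reading such a table could look like.  This is not the existing NFIO 
code: TypeNameTable, writeTypeTable and readTypeTable are names I made up, 
the layout (count, then id / name length / name bytes) is just one 
possibility, and I ignore endianness by writing in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical table: per-file type id -> field container type name.
using TypeNameTable = std::map<std::uint32_t, std::string>;

// Assumption: integers are written in host byte order. A real file
// format would have to fix the endianness.
inline void writeU32(std::ostream &os, std::uint32_t v)
{
    os.write(reinterpret_cast<const char *>(&v), sizeof(v));
}

inline std::uint32_t readU32(std::istream &is)
{
    std::uint32_t v = 0;
    is.read(reinterpret_cast<char *>(&v), sizeof(v));
    return v;
}

// Layout: entry count, then (id, name length, name bytes) per entry.
inline void writeTypeTable(std::ostream &os, const TypeNameTable &table)
{
    writeU32(os, static_cast<std::uint32_t>(table.size()));
    for (const auto &entry : table)
    {
        writeU32(os, entry.first);
        writeU32(os, static_cast<std::uint32_t>(entry.second.size()));
        os.write(entry.second.data(),
                 static_cast<std::streamsize>(entry.second.size()));
    }
}

inline TypeNameTable readTypeTable(std::istream &is)
{
    TypeNameTable table;
    const std::uint32_t count = readU32(is);
    for (std::uint32_t i = 0; i < count; ++i)
    {
        const std::uint32_t id  = readU32(is);
        const std::uint32_t len = readU32(is);
        std::string name(len, '\0');
        is.read(&name[0], static_cast<std::streamsize>(len));
        assert(is.good());  // a truncated table is treated as a bug here
        table[id] = name;
    }
    return table;
}
```

With something like this, each fc record would carry only the 4-byte id, 
and the loader would resolve the name once per type instead of once per 
fc.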

- I think the writer could be optimized and simplified quite a bit.

This may be a bit tough to explain, but I will do my best. The writer 
currently starts by writing the first fc.  In the process of writing an 
fc, a list is extended with any other fc's that are pointed to in the 
system and are not already in the list.  So after that first fc is 
written, the code loops over the list until the end of the list is 
reached.  When it has been reached then all fc's have been written.

This is a reasonable way to implement the code, but I see two problems:  
1) The reachability traversal is done as a side-effect of the writing 
process.  This is a little inelegant but works.  2) It seems dangerous 
to me to loop over a list at the same time it is being extended.
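
To show the pattern I mean, here is a toy version of write-while-
discovering.  ToyFC and writeWhileDiscovering are invented for this 
example and are not the NFIO classes, and I am assuming a std::vector as 
the work list.  With a vector the loop is only safe if it indexes by 
position and re-reads size() on every pass; an iterator or range-for 
would be invalidated whenever push_back reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy stand-in for a field container: an id plus the fc's it points to.
struct ToyFC
{
    int                  id;
    std::vector<ToyFC *> refs;
};

// Write-while-discovering: the work list grows while we walk it.
// Returns the ids in the order they would be written.
inline std::vector<int> writeWhileDiscovering(ToyFC *root)
{
    std::vector<ToyFC *> work;
    std::set<ToyFC *>    seen;
    std::vector<int>     written;

    work.push_back(root);
    seen.insert(root);

    // Index-based loop: size() is re-evaluated each pass, so entries
    // appended below are picked up without holding any iterator.
    for (std::size_t i = 0; i < work.size(); ++i)
    {
        ToyFC *fc = work[i];        // copy the pointer before the list grows
        written.push_back(fc->id);  // stands in for "write this fc"

        for (ToyFC *ref : fc->refs)
        {
            // insert() reports whether ref was new; only new fc's are queued.
            if (seen.insert(ref).second)
                work.push_back(ref);
        }
    }

    assert(written.size() == seen.size());
    return written;
}
```

It works, but the correctness depends on details of how the loop is 
written, which is the part that makes me nervous.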

The interesting thing is that there is already code in here that could 
simplify this a lot.  There is another recursive reachability crawl that 
is used to get an initial count of the fc's that will be written out.  
This is used for updating the writing progress bar.  We could reuse this.

So my idea would be to refactor this code to:
   - Count all fc's to be written and build list of reachable fc's
   - Write standard header but use the size field to store the number of 
fc's we will write
   - Write a header with mapping of all used fc_type_id to fc_type_names
   - Iterate through the list of reachable fc's and write them one at a time
   - Close file
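
The steps above could be sketched roughly like this.  Everything here is 
hypothetical: SketchFC, collectReachable and writeAll are made-up names, 
and the output is a text stand-in so the order of the sections is easy to 
see, not the real binary layout.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a field container.
struct SketchFC
{
    std::uint32_t           typeId;
    std::string             typeName;
    std::vector<SketchFC *> refs;
};

// Pass 1: recursive reachability crawl. Each fc is recorded exactly
// once, in discovery order.
inline void collectReachable(SketchFC                *fc,
                             std::set<SketchFC *>    &seen,
                             std::vector<SketchFC *> &order)
{
    if (fc == nullptr || !seen.insert(fc).second)
        return;
    order.push_back(fc);
    for (SketchFC *ref : fc->refs)
        collectReachable(ref, seen, order);
}

// Pass 2: the list is complete before anything is written, so the
// count and the type table can go into the header up front.
inline std::string writeAll(SketchFC *root)
{
    std::set<SketchFC *>    seen;
    std::vector<SketchFC *> order;
    collectReachable(root, seen, order);
    assert(order.size() == seen.size());

    // Map of every type id in use to its type name.
    std::map<std::uint32_t, std::string> typeNames;
    for (const SketchFC *fc : order)
        typeNames[fc->typeId] = fc->typeName;

    std::ostringstream out;
    out << "count " << order.size() << '\n';            // header: fc count
    for (const auto &entry : typeNames)                 // header: type table
        out << "type " << entry.first << ' ' << entry.second << '\n';
    for (const SketchFC *fc : order)                    // body: one record per fc
        out << "fc " << fc->typeId << '\n';
    return out.str();
}
```

The write loop never touches the list it iterates, and the loader can 
read the count from the header before it sees the first fc.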

This would have a couple of benefits:
   - Simplify the code a bit by having a single method that finds all 
reachable fc's
   - Allow use of fc_type_id's instead of names
   - Allow loader to find the number of expected fc's in the header, and 
use this for progress CB
   - List is not growing while we are writing so we could use a vector 
or even just reuse the counting set to write out the fcs

Anyway, this is just a thought.  Any comments?

-Allen

