Anders Logg wrote:
> On Sun, Mar 14, 2010 at 08:35:32AM +0000, Garth N. Wells wrote:
>>
>> Anders Logg wrote:
>>> On Sun, Mar 14, 2010 at 07:39:45AM +0000, Garth N. Wells wrote:
>>>> Anders Logg wrote:
>>>>> On Fri, Mar 12, 2010 at 06:58:22PM -0000, [email protected] wrote:
>>>>>> ------------------------------------------------------------
>>>>>> revno: 4635
>>>>>> committer: Garth N. Wells <[email protected]>
>>>>>> branch nick: dolfin-all
>>>>>> timestamp: Fri 2010-03-12 18:53:05 +0000
>>>>>> message:
>>>>>> Work on reading Vectors in parallel. Some issues to resolve still.
>>>>>>
>>>>>> Some issues:
>>>>>> - How should files be named when in parallel?
>>>>>> - Should we have a 'master' xml file which points to the files
>>>>>>   from different processes?
>>>>> I think this should be done in the same way as for Meshes. We
>>>>> discussed the following design:
>>>>>
>>>>> 1. Reading a single file "foo.xml" results in each process reading the
>>>>> entire file but skipping data located on another process as determined
>>>>> by local_range. This is what is implemented now for meshes (followed
>>>>> by communication and mesh partitioning). The difference for vectors
>>>>> would be that no extra communication is necessary.
>>>>>
>>>> OK.
>>>>
>>>>> 2. Reading a set of files "foo*.xml" results in each process reading
>>>>> its portion stored in "foo%d.xml" % p. The File interface then needs
>>>>> to check for the occurrence of '*' and figure out the correct file name
>>>>> based on its process number.
>>>>>
>>>> I think that there are a number of advantages to having a single .xml that
>>>> points to the 'sub-files'. An obvious advantage is that we won't need to
>>>> distinguish between cases 1 and 2 when reading in a vector.
>>>>
>>>> Garth
>>> I don't feel strongly about either option, but if we go for the
>>> master-file/sub-file design I think we should do the same for vectors
>>> and meshes.
>>>
>>> The master file could look something like this for vectors:
>>>
>>> <distributed_vector size="1024" num_partitions="16">
>>>   <sub_vector partition="0" file="foo_0.xml" offset="0"/>
>>>   <sub_vector partition="1" file="foo_1.xml" offset="64"/>
>>>   <sub_vector partition="2" file="foo_2.xml" offset="128"/>
>>>   ...
>>> </distributed_vector>
>>>
>>
>> Looks good, except 'offset' should be 'size', or 'local_size'.
>
> Yes, but then maybe it's not needed since the local size will be
> available in the local files (which can be standard XML vector data).
>
> But then won't the master files always be trivial? The only extra
> information that is contained in the master file is the total size,
> and the number of partitions (which will only be used to check that it
> matches the actual number of processes).
>
The master file is the definitive file. Say a program is run with 4
processes, and then with 2. The files vector_0.xml, vector_1.xml,
vector_2.xml and vector_3.xml will be floating around, but which files
make up the vector? The master file will point to vector_0.xml and
vector_1.xml.

Also, there should be no need to check that the number of 'partitions'
matches the number of processes.

Garth

> --
> Anders
>
>
>> Garth
>>
>>> For meshes, we can do this:
>>>
>>> <distributed_mesh num_partitions="16">
>>>   <sub_mesh partition="0" file="foo_0.xml"/>
>>>   <sub_mesh partition="1" file="foo_1.xml"/>
>>>   <sub_mesh partition="2" file="foo_2.xml"/>
>>>   ...
>>> </distributed_mesh>
>>>
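
To make the "definitive file" argument concrete, here is a minimal Python sketch of how a reader might use a master file to decide which sub-files belong to a vector. It is only an illustration under stated assumptions, not DOLFIN code: the function names `sub_files` and `files_for_process` are hypothetical, the XML follows the layout proposed in the thread (without the disputed 'offset' attribute), and the round-robin assignment for the case where the number of processes differs from the number of partitions is an assumed policy that the thread does not specify.

```python
import xml.etree.ElementTree as ET

# Hypothetical master file written after a run with 2 processes.
# Stale vector_2.xml and vector_3.xml from an earlier 4-process run may
# still be on disk, but they are not listed here and so are ignored.
MASTER_XML = """\
<distributed_vector size="1024" num_partitions="2">
  <sub_vector partition="0" file="vector_0.xml"/>
  <sub_vector partition="1" file="vector_1.xml"/>
</distributed_vector>
"""


def sub_files(master_xml):
    """Return the sub-file names listed in a master file, ordered by partition."""
    root = ET.fromstring(master_xml)
    entries = [
        (int(elem.get("partition")), elem.get("file"))
        for elem in root.iter("sub_vector")
    ]
    return [name for _, name in sorted(entries)]


def files_for_process(files, rank, num_processes):
    """Assign sub-files to one process round-robin (assumed policy)."""
    return files[rank::num_processes]


if __name__ == "__main__":
    files = sub_files(MASTER_XML)
    # A single process picks up every listed partition.
    print(files_for_process(files, rank=0, num_processes=1))
```

Because the set of files comes from the master file rather than from a "foo*.xml" pattern, leftover files from earlier runs cannot be picked up by accident, and a reader does not have to assume one partition per process.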

