John Darrington <[EMAIL PROTECTED]> writes:

> I'm generally happy with the structure you proposed.  Having said
> this, when I've done this sort of exercise in the past, at the end
> of it, I have a much better idea of what is appropriate than at the
> start.  It's difficult to get a good picture of the design while
> everything is mixed up in one place (which is the whole point of
> separating it in the first place).
>
> The procedure usually ends up as:
>
> 1. Remove all -I directives from the Makefiles.
>
> 2. Classify the source files according to some reasonable criteria,
>    and put them into respective sub-directories.
>
> 3. Put in whatever -I directives are necessary in order to make
>    the damn thing build.

#1 and #3 together are going to be kind of painful, because some
source files in PSPP need a lot of headers.

> 4. Run a script over the Makefiles, to extract the -I directives and
>    create a diagram of dependencies between directories.  Identify
>    any dependencies which don't seem to make sense.

This sounds like a good idea but I don't know of a script that does
this.  It sounds like you've done this before--any pointers?

> 5. Rework modules which cause illogical dependencies.  Split up any
>    files which seem to belong in more than one class.
>
> 6. Goto 1.

If you're volunteering... okay.  I personally hate "organizational"
type stuff--I'd much rather write code--which is one reason it's been
put off so long.

I'd propose that, when you get to someplace you think makes sense to
some extent (whether it compiles and links or not), you post a .tar.gz
of it somewhere and we can discuss it.  It's such a pain dealing with
major changes in a CVS tree that it'd be a shame to have to make major
changes more than once.

Here are the major components of PSPP in my opinion:

        * Dictionaries and associated data, including properties of
          variables, processing the active file (vfm), casefiles,
          sorting, missing values.

        * Data I/O, including the dfm, pfm, and sfm modules (which
          should really get better names) and the new any-* and
          scratch-* modules.

          I'm not sure whether format.* and data-{in,out}.* belong in
          the former or the latter category.  I suspect the former.

        * Parsing and executing the PSPP language.  (Actually I'd like
          to separate parsing from execution, but that's a big
          project, not something that can be accomplished by
          rearranging files.)  At the top level, this is things like
          the line reader (getl), the lexer, command.*, vars-prs.c,
          etc.  There would be multiple subcategories:

                - Control structures.

                - Commands that modify the dictionary (as their
                  primary purpose).

                - Commands for data I/O.

                - Statistical procedures (commands that analyze data
                  and produce output based on it).

                - Transformations (commands that modify data).

                - Utilities (commands that don't modify the dictionary
                  or access data).

        * Output: the table formatter, PostScript driver, etc.

                - "charts" as a subdirectory of "output" just because
                  they're easily distinguished and there are a lot of
                  them.

                - I have big plans for output but again that's another
                  big project in itself.

        * Statistical calculation library: these are routines that are
          tied to PSPP but otherwise just mathematics.  (Routines that
          are separable from PSPP would presumably go in "lib", not
          "src".)

        * User interface.  Presumably the GUI, when merged, would be
          in a separate directory too, but John can say for sure.

Some files will need to be split to fit this well, e.g. sort.c
currently implements both the SORT CASES BY command and the
infrastructure for sorting.  The former should go into
"transformations", the latter into the "dictionaries and data"
directory.

Other files aren't named well and we'd want to change them,
e.g. I've been a bit irritated with "sfm-read.c" and related files for
a while.  It should really be something like "sysfile-reader.c",
because that makes it a lot more obvious what it actually does.

Let me propose an initial file split to start out, based on that, and
everyone can criticize it.
I haven't done any file renaming in this sample, because then nobody
would really be sure what each file actually is:

.:
CVS/       Make.build   data/   glob.h     lib/    main.h   settings.c  stats/
ChangeLog  Makefile.am  glob.c  language/  main.c  output/  settings.h  ui/

./CVS:
Entries  Repository  Root

./data:
case.c           data-in.h          format.c          sort.c      vfm.h
case.h           data-out.c         format.def        sort.h      vfmP.h
casefile-test.c  dictionary.c       format.h          val.h
casefile.c       dictionary.h       io/               var.h
casefile.h       file-handle-def.c  missing-values.c  vars-atr.c
data-in.c        file-handle-def.h  missing-values.h  vfm.c

./data/io:
any-reader.c  dfm-read.h   pfm-write.c       scratch-reader.h  sfm-write.c
any-reader.h  dfm-write.c  pfm-write.h       scratch-writer.c  sfm-write.h
any-writer.c  dfm-write.h  scratch-handle.c  scratch-writer.h  sfmP.h
any-writer.h  pfm-read.c   scratch-handle.h  sfm-read.c
dfm-read.c    pfm-read.h   scratch-reader.c  sfm-read.h

./language:
command.c    dictionary/   io/        lexer.h      sort-prs.c  subclist.h
command.def  expressions/  lex-def.c  q2c.c        sort-prs.h  utilities/
command.h    getl.c        lex-def.h  range-prs.c  stats/      vars-prs.c
control/     getl.h        lexer.c    range-prs.h  subclist.c  xforms/

./language/control:
ctl-stack.c  ctl-stack.h  do-if.c  loop.c  repeat.c  repeat.h

./language/dictionary:
apply-dict.c  modify-vars.c  split-file.c    val-labs.c      var-labs.c
format-prs.c  numeric.c      sysfile-info.c  value-labels.c  vector.c
formats.c     rename-vars.c  temporary.c     value-labels.h  weight.c
mis-val.c     sample.c       title.c         var-display.c

./language/expressions:
CVS/         evaluate.h.pl    helpers.h        optimize.inc.pl  public.h
Makefile.am  evaluate.inc.pl  operations.def   parse.c
TODO         generate.pl      operations.h.pl  parse.inc.pl
evaluate.c   helpers.c        optimize.c       private.h

./language/expressions/CVS:
Entries  Repository  Root

./language/io:
data-list.c  file-handle.h  file-type.c  inpt-pgm.c  matrix-data.c
data-list.h  file-handle.q  get.c        list.q      print.c

./language/stats:
aggregate.c     crosstabs.q  flip.c         oneway.q      regression_export.h
autorecode.c    descript.c   frequencies.q  rank.q        t-test.q
correlations.q  examine.q    means.q        regression.q

./language/utilities:
copyleft.c  copyleft.h  date.c  echo.c  include.c  permissions.c  set.q

./language/xforms:
compute.c  count.c  recode.c  sel-if.c

./lib:
algorithm.c  calendar.c     hash.c         magic.h   random.c
algorithm.h  calendar.h     hash.h         mkfile.c  random.h
alloc.c      debug-print.h  linked-list.c  mkfile.h  str.c
alloc.h      filename.c     linked-list.h  pool.c    str.h
bitvector.h  filename.h     magic.c        pool.h    version.h

./output:
ascii.c  font.h        html.c   output.c  postscript.c  som.h  tab.h
charts/  groff-font.c  htmlP.h  output.h  som.c         tab.c

./output/charts:
barchart.c     chart.c        histogram.c  plot-chart.c
box-whisker.c  chart.h        histogram.h  plot-hist.c
cartesian.c    dummy-chart.c  piechart.c

./stats:
cat-routines.h   design-matrix.h  group.h       misc.c     percentiles.c
cat.c            factor_stats.c   group_proc.h  misc.h     percentiles.h
cat.h            factor_stats.h   levene.c      moments.c
design-matrix.c  group.c          levene.h      moments.h

./ui:
cmdline.c  cmdline.h  error.c  error.h  readln.c  readln.h

-- 
"I admire him, I frankly confess it; and when his time comes
 I shall buy a piece of the rope for a keepsake."
--Mark Twain

_______________________________________________
pspp-dev mailing list
pspp-dev@gnu.org
http://lists.gnu.org/mailman/listinfo/pspp-dev
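
A note on step 4 of the quoted procedure (extract the -I directives
and diagram the dependencies between directories): no existing script
is named in this thread, so the following is only a minimal sketch of
what one might look like, not anything taken from the PSPP tree.  It
assumes the Makefiles are automake-style "Makefile.am" files, that -I
arguments are either relative paths or start with $(top_srcdir) or
$(srcdir), and that a Graphviz "digraph" is an acceptable form for the
diagram.  All function names are hypothetical.

```python
import os
import posixpath
import re

# Matches -I flags such as -I../data, -I$(top_srcdir)/lib, or "-I ."
INCLUDE_FLAG = re.compile(r"(?<!\S)-I\s*(\S+)")


def include_dirs(makefile_text):
    """Return the raw -I arguments found in one Makefile, in order."""
    # Join backslash-continued lines so flags split across lines are seen.
    joined = makefile_text.replace("\\\n", " ")
    found = []
    for line in joined.splitlines():
        line = line.split("#", 1)[0]  # drop Makefile comments
        found.extend(INCLUDE_FLAG.findall(line))
    return found


def resolve(makefile_dir, raw):
    """Map a raw -I argument to a directory relative to the source root.

    Returns None for absolute paths and for variables this sketch does
    not understand (assumption: only top_srcdir and srcdir matter).
    """
    base = makefile_dir
    for prefix, root in (("$(top_srcdir)", "."), ("${top_srcdir}", "."),
                         ("$(srcdir)", makefile_dir),
                         ("${srcdir}", makefile_dir)):
        if raw.startswith(prefix):
            base, raw = root, "." + raw[len(prefix):]
            break
    if "$" in raw or raw.startswith("/"):
        return None
    return posixpath.normpath(posixpath.join(base, raw))


def dependency_edges(makefiles):
    """makefiles maps a directory (relative to the root) to Makefile text."""
    edges = set()
    for directory, text in makefiles.items():
        src = posixpath.normpath(directory)
        for raw in include_dirs(text):
            dst = resolve(src, raw)
            if dst is not None and dst != src:
                edges.add((src, dst))
    return sorted(edges)


def to_dot(edges):
    """Render the edges as a Graphviz digraph."""
    lines = ["digraph includes {"]
    lines += ['  "%s" -> "%s";' % (a, b) for a, b in edges]
    lines.append("}")
    return "\n".join(lines)


def load_makefiles(root, name="Makefile.am"):
    """Collect the text of every Makefile under root, keyed by directory."""
    found = {}
    for dirpath, _dirs, files in os.walk(root):
        if name in files:
            rel = os.path.relpath(dirpath, root).replace(os.sep, "/")
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                found[rel] = f.read()
    return found
```

Calling to_dot(dependency_edges(load_makefiles("src"))) and passing
the result to Graphviz (for example "dot -Tpng") should give a picture
in which an arrow such as "lib" -> "language/stats" stands out as a
dependency that doesn't make sense.  The "src" path is only an example.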