Our library automation system's statistical reports contain only
non-zero rows of data. This makes them hard to use effectively, because
comparing reports (for example, in Excel) requires manipulating the data
by hand to "normalize" two reports that differ in which rows contained
non-zero data. Because I'd like to
"clean up" this situation to help out our staff, I have a question about
which approach (algorithm) I should take in manipulating data (tables)
in J. What I'm thinking of doing is to compare each table file (upon
creation) against a "master" table file and then "normalize" the new
data file so that, in the future, it can be compared with any other
table file. Below is a "toy example" of the two files, boxed, and the
desired end result. The first column of the two tables is the row name
to match upon (the "key", if you will). The "master" table simply
contains zero values for every possible row name; these are the all-zero
rows that the reporting system suppressed when it produced the "new"
table.
"NEW" TABLE INSIDE J: "MASTER" FILE INSIDE J:
+----------+---+---+---+ +----------+---+---+---+
| LiteralA | 1 | 0 | 0 | | LiteralA | 0 | 0 | 0 |
+----------+---+---+---+ +----------+---+---+---+
| LiteralB | 2 | 4 | 3 | | LiteralB | 0 | 0 | 0 |
+----------+---+---+---+ +----------+---+---+---+
| LiteralD | 0 | 3 | 5 | | LiteralC | 0 | 0 | 0 |
+----------+---+---+---+ +----------+---+---+---+
| LiteralD | 0 | 0 | 0 |
+----------+---+---+---+
| LiteralE | 0 | 0 | 0 |
+----------+---+---+---+
DESIRED END RESULT:
+----------+---+---+---+
| LiteralA | 1 | 0 | 0 |
+----------+---+---+---+
| LiteralB | 2 | 4 | 3 |
+----------+---+---+---+
| LiteralC | 0 | 0 | 0 |
+----------+---+---+---+
| LiteralD | 0 | 3 | 5 |
+----------+---+---+---+
| LiteralE | 0 | 0 | 0 |
+----------+---+---+---+
My question: should I (1) attempt to copy/insert rows from the "master"
table into the "new" table or (2) overlay/replace the rows from the
"new" table in the "master" table (without, of course, destroying the
original "template", if you will, on disk)? Is there an algorithmic
preference in J (especially with regard to the matching of row names),
or is there not any significant difference between the two approaches?
Or does the choice depend on the size ratio between the "new" and
"master" tables?
Thanks in advance for your opinions and insights!
Harvey