Assuming that unique keys are the objective, and that the non-zero values are normally not single digits (and are all, in fact, numbers), my preference is simply to combine the two tables and eliminate duplicate keys. NEW must come first in the join, because ~: keeps the first occurrence of each key, e.g. -

   MASTER =: (,.<"1 'Literal',"1 ,.'ABCDE'),"1 ] 0;0;0
   NEW =: _4]\'LiteralA';1;0;0 ;'LiteralB';2;4;3;'LiteralD';0;3;5
   CONDENSE =: 13 : '(~: 0{"1 y)#y'
   CONDENSE NEW, MASTER
+--------+-+-+-+
|LiteralA|1|0|0|
+--------+-+-+-+
|LiteralB|2|4|3|
+--------+-+-+-+
|LiteralD|0|3|5|
+--------+-+-+-+
|LiteralC|0|0|0|
+--------+-+-+-+
|LiteralE|0|0|0|
+--------+-+-+-+
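
As a minimal sketch of why the order of the join matters: ~: (nub sieve) marks the first occurrence of each item, so the rows of NEW survive only because they precede the rows of MASTER. The key list and the names keys and mask below are made up for illustration.

```j
NB. hypothetical key column: three keys from NEW followed by three from MASTER
keys =: 'LiteralA';'LiteralB';'LiteralD';'LiteralA';'LiteralB';'LiteralC'
mask =: ~: keys          NB. 1 marks the first occurrence of each key
mask                     NB. 1 1 1 0 0 1
I. mask                  NB. 0 1 2 5  (positions of the rows that survive)
mask # keys              NB. boxes LiteralA, LiteralB, LiteralD, LiteralC
```

Joining in the other order (MASTER first) would keep the all-zero rows instead.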

Of course, if you want the rows sorted alphabetically by key, then -

   /:~ CONDENSE NEW, MASTER
+--------+-+-+-+
|LiteralA|1|0|0|
+--------+-+-+-+
|LiteralB|2|4|3|
+--------+-+-+-+
|LiteralC|0|0|0|
+--------+-+-+-+
|LiteralD|0|3|5|
+--------+-+-+-+
|LiteralE|0|0|0|
+--------+-+-+-+

This does mean that if you want to write a file for Excel or the like, you need to convert the numbers back to literal representations (so Excel can convert them again...) e.g. -

   untable  ": &.> /:~ CONDENSE NEW, MASTER
LiteralA        1       0       0
LiteralB        2       4       3
LiteralC        0       0       0
LiteralD        0       3       5
LiteralE        0       0       0

where untable is defined as -

   untable =: 3 : 0
  TAB untable y
:
  ; ((,(i.1{$y),. 1{$y),1+1{$y){"1 y,"1 x ;NL
)

TAB and NL have their usual meanings (for DOS, NL =: CR,LF)
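
The index expression inside untable is dense, so here is a small sketch of what it selects, assuming a four-column table as above. The names n, idx and row are illustrative only, and single characters stand in for the boxed cells, TAB and NL.

```j
n   =: 4                         NB. number of columns, i.e. 1{$y for the table above
idx =: (, (i. n) ,. n) , 1 + n   NB. 0 4 1 4 2 4 3 4 5
row =: 'a';'b';'c';'d';',';'!'   NB. stand-ins: four cells, then a separator and a terminator
; idx { row                      NB. a,b,c,d,!
```

Each cell is followed by the separator (index n), and the terminator (index 1+n) closes the row, so as written there is also a separator after the last cell.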

But perhaps energy would be better spent preening the keys. In my experience, such things often acquire "junk" like trailing, or even embedded, blanks. In any case, ~: is a good tool for the job.
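
A rough sketch of one way to tidy the keys before applying ~: follows. It assumes that blanks are never a legitimate part of a key, since it removes every blank; if only leading and trailing blanks are junk, the standard library's dltb is the gentler choice. The table T and the names clean, keys and T2 are hypothetical.

```j
NB. hypothetical table whose keys carry stray blanks
T =: _2 ]\ 'LiteralA  ';1;'Literal B';2;' LiteralD';3
clean =: -.&' ' &.>              NB. remove every blank inside each box
keys  =: clean {."1 T            NB. cleaned key column
T2    =: keys ,. }."1 T          NB. cleaned keys stitched back onto the data columns
```

T2 has the same shape as T, with its first column now reading LiteralA, LiteralB, LiteralD.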


At 17:26  -0500 2008/04/12, Hahn, Harvey wrote:
Our library automation system's statistical reports contain only
non-zero rows of data.  However, this makes them challenging to use
effectively because comparisons among reports (for example, in Excel)
require data manipulations by hand to "normalize" two reports that
differ in which rows contained non-zero data.  Because I'd like to
"clean up" this situation to help out our staff, I have a question about
which approach (algorithm) I should take in manipulating data (tables)
in J.  What I'm thinking of doing is to compare each table file (upon
creation) against a "master" table file and then "normalizing" the new
data file so that, in the future, it can be compared with any other
table file.  Below is a "toy example" of the two files, boxed, and the
desired end result.  The first column of the two tables is the row name
to match upon (the "key", if you will).  (The "master" table is merely a
table containing zero values for all possible row name values; these
zero values would have existed in the original, but suppressed, data
that was output in the "new" table.)

"NEW" TABLE INSIDE J:           "MASTER" FILE INSIDE J:

+----------+---+---+---+      +----------+---+---+---+
| LiteralA | 1 | 0 | 0 |      | LiteralA | 0 | 0 | 0 |
+----------+---+---+---+      +----------+---+---+---+
| LiteralB | 2 | 4 | 3 |      | LiteralB | 0 | 0 | 0 |
+----------+---+---+---+      +----------+---+---+---+
| LiteralD | 0 | 3 | 5 |      | LiteralC | 0 | 0 | 0 |
+----------+---+---+---+      +----------+---+---+---+
                              | LiteralD | 0 | 0 | 0 |
                              +----------+---+---+---+
                              | LiteralE | 0 | 0 | 0 |
                              +----------+---+---+---+

DESIRED END RESULT:

+----------+---+---+---+
| LiteralA | 1 | 0 | 0 |
+----------+---+---+---+
| LiteralB | 2 | 4 | 3 |
+----------+---+---+---+
| LiteralC | 0 | 0 | 0 |
+----------+---+---+---+
| LiteralD | 0 | 3 | 5 |
+----------+---+---+---+
| LiteralE | 0 | 0 | 0 |
+----------+---+---+---+

My question: should I (1) attempt to copy/insert rows from the "master"
table into the "new" table or (2) overlay/replace the rows from the
"new" table in the "master" table (without, of course, destroying the
original "template", if you will, on disk)?  Is there an algorithmic
preference in J (especially with regard to the matching of row names),
or is there not any significant difference between the two approaches?
Or does the choice depend on the size ratio between the "new" and
"master" tables?

Thanks in advance for your opinions and insights!

Harvey

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
