It is much simpler than I thought, but one thing is still unclear: what should be done with type-A and type-C lines, and with type-B lines whose second field never appears in a type-A line? Note in particular that if type-A lines are kept along with the modified type-B lines, it becomes impossible to distinguish them, since both then have four fields (I believe).

So, I believe you actually want to first separate the three types, e.g., with this trivial AWK program:
#!/usr/bin/awk -f
NF == 4 { print > "type-A" }
NF == 2 { print > "type-B" }
NF == 5 { print > "type-C" }

Then, you can pass "type-A" and "type-B" (in this order) as arguments to this program:
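#!/usr/bin/awk -f
# First file (type-A): map field 1 to the three fields that follow it.
FILENAME == ARGV[1] { m[$1] = $2 " " $3 " " $4 }
# Second file (type-B): replace field 2 if it is a known key, then print.
FILENAME == ARGV[2] { if ($2 in m) $2 = m[$2]; print }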

That two-rule program outputs as many lines as there are in the second input file ("type-B"). The lines that were modified have two additional fields, four in total, so they are easy to identify (in an AWK program, by testing NF == 4).
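
To make the whole procedure concrete, here is a minimal sketch of a full run. The sample data, the file names "input", "merged" and "modified", and the use of a temporary directory are my own assumptions for illustration; only the AWK rules themselves come from the two programs above.

```shell
#!/bin/sh
# Sketch only: the sample lines below are invented, not real data.
set -e

# Work in a scratch directory so the type-* files do not clutter anything.
dir=$(mktemp -d)
cd "$dir"

# Hypothetical input mixing the three line types (4, 2 and 5 fields).
cat > input <<'EOF'
k1 a b c
x k1
y k9
k2 d e f
p q r s t
z k2
EOF

# Step 1: separate the three types by number of fields.
awk 'NF == 4 { print > "type-A" }
     NF == 2 { print > "type-B" }
     NF == 5 { print > "type-C" }' input

# Step 2: replace field 2 of each type-B line by the matching type-A data.
awk 'FILENAME == ARGV[1] { m[$1] = $2 " " $3 " " $4 }
     FILENAME == ARGV[2] { if ($2 in m) $2 = m[$2]; print }' \
    type-A type-B > merged

# Step 3: keep only the lines that were actually modified (now 4 fields).
awk 'NF == 4' merged > modified

cat merged
```

With this sample, "merged" holds three lines ("x a b c", "y k9", "z d e f"), and "modified" holds the first and the last. The line "y k9" is left untouched because k9 never starts a type-A line.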

Notice the simplicity of those programs. Again, you do not need to study AWK for 77 hours to be able to write such programs (8 hours should be enough, I believe). And they do not make typos.
