On 28/01/18 06:07, Jon Perryman wrote:
> For large complicated problems, assembler is the language of choice.
For large complicated problems a domain-specific language,
targeted at the problem domain, is the language of choice.
At least, according to the theory of "Language Oriented Programming":
see the paper with that title published in
Software--Concepts and Tools, Volume 15 (4), pp 147-161, 1994
http://www.gkc.org.uk/martin/papers/middle-out-t.pdf
Other than utilities which need to dive into the guts of
the operating system, I am not sure if there are any domains
for which assembler is the ideal language: but I am not personally
familiar with all the various problem domains of computing.
For the XML parsing problem, perl and the XML::Parser package
would appear to be a good choice. It offers a choice of parsing methods:
    Debug:   just prints out the document in outline form
    Subs:    call the appropriately named subroutines for each element
    Tree:    return a parse tree
    Objects: like Tree but creates a hash object for each element
    Stream:  call callback routines at various points
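
To make the difference between the Tree and Stream styles concrete, here is
a minimal sketch in Python rather than Perl. It does not use XML::Parser:
it uses Python's standard xml.etree.ElementTree and xml.parsers.expat
modules as stand-ins, and the sample document and the function names
parse_tree and parse_stream are invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Hypothetical sample document, invented for this illustration.
DOC = "<order id='7'><item>tea</item><item>milk</item></order>"


def parse_tree(text):
    """Tree style: build the whole document in memory, then walk it."""
    root = ET.fromstring(text)
    return [item.text for item in root.iter("item")]


def parse_stream(text):
    """Stream style: callbacks fire as the parser meets each event."""
    items = []
    in_item = False  # True while we are inside an <item> element

    def start(name, attrs):
        nonlocal in_item
        if name == "item":
            in_item = True
            items.append("")

    def end(name):
        nonlocal in_item
        if name == "item":
            in_item = False

    def chars(data):
        # Character data may arrive in several chunks, so accumulate it.
        if in_item:
            items[-1] += data

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(text, True)  # True marks this as the final chunk
    return items
```

Both functions extract the same item names. The tree version is shorter
but holds the whole document in memory; the callback version keeps only
the state it needs, which matters for very large inputs.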
> There are certainly features that would be nice in ASM but a good
> optimizing language doesn't make up for a language that encourages
> good coding practices (e.g. XML parse).
Does assembler encourage good coding practices?
I am not asking "is it possible to write programs
using good coding practices in assembler", but rather
"does learning to code in assembler automatically
encourage programmers to use good coding practices?"
In theory, this seems to be unlikely. Learning to code
in assembler involves thinking of a program as a sequence
of "instructions" to be executed by a "machine".
This encourages the creation of spaghetti code
consisting of a mass of labels and branches.
When the programmer learns about structured macros,
they will have to unlearn their original unstructured coding style.
Very few organisations enforce the use of structured macros,
but much of the benefit of structured macros is lost
if they are mingled with unstructured branches.
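
The contrast between the two styles can be sketched in a higher-level
language. The example below is only an illustration, not assembler:
Python has no branch instruction, so the first function emulates labels
and branches with an explicit "program counter" variable. The function
names and label names are invented for this sketch. Both functions sum
the non-negative values in a list.

```python
def total_branch_style(values):
    """Emulates label-and-branch control flow with a 'program counter'."""
    pc, i, total = "LOOP", 0, 0
    while True:
        if pc == "LOOP":      # LOOP: branch to DONE at end of input
            pc = "DONE" if i >= len(values) else "TEST"
        elif pc == "TEST":    # TEST: branch to SKIP on a negative value
            pc = "SKIP" if values[i] < 0 else "ADD"
        elif pc == "ADD":     # ADD: accumulate, then continue at SKIP
            total += values[i]
            pc = "SKIP"
        elif pc == "SKIP":    # SKIP: advance the index, branch back to LOOP
            i += 1
            pc = "LOOP"
        elif pc == "DONE":    # DONE: return the result
            return total


def total_structured(values):
    """The same computation using structured constructs only."""
    total = 0
    for value in values:
        if value >= 0:
            total += value
    return total
```

The results are identical, but in the first version the reader has to
trace every branch target to recover the loop and the conditional that
the second version states directly.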
In practice, I have analysed some tens of millions of lines
of assembler (as part of my research, and working for
Software Migrations Ltd), and "good coding practices"
appear to be extremely rare. Some examples from a recent study
(presented at the GSE UK Conference in 2016):
US State Government Department:
    870,000 lines of code
    Bugs Detected: 550 per MLOC (Million Lines Of Code)
    Self-Modifying Code: 2,347 per MLOC
    Complex Subroutine Linkage: 74% of modules
    Non-Standard Module Linkage: 60% of modules

Large Insurance Company:
    1.8 million lines of code
    Bugs Detected: 274 per MLOC
    Self-Modifying Code: 590 per MLOC
    Complex Subroutine Linkage: 37% of modules
    Non-Standard Module Linkage: 14% of modules

Human Resource Company (Payroll Systems):
    350,000 lines of code
    Bugs Detected: 302 per MLOC
    Self-Modifying Code: 127 per MLOC
    Complex Subroutine Linkage: 53% of modules
    Non-Standard Module Linkage: 78% of modules

Here "Complex Subroutine Linkage" includes any unstructured
subroutine code, e.g. branching out of the middle of a subroutine
into main code or another subroutine, subroutines with multiple
entry points or multiple exit points, and so on.
"Non-Standard Module Linkage" includes any module calls
which do not use the standard calling convention.
For example, passing parameters in random registers,
returning to an offset of the return address,
passing parameters as inline data, and so on.
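
The per-MLOC rates in the study above can be turned into rough absolute
counts by multiplying by the size of each code base. The sketch below
does that arithmetic. The sizes and rates are copied from the figures
above; the resulting counts are derived estimates, not numbers reported
in the study, and since the sizes are rounded they are approximate. The
names STUDY, estimated_count and summarise are invented for this sketch.

```python
# (lines of code, bugs per MLOC, self-modifying code instances per MLOC)
STUDY = {
    "US State Government Department": (870_000, 550, 2347),
    "Large Insurance Company": (1_800_000, 274, 590),
    "Human Resource Company (Payroll Systems)": (350_000, 302, 127),
}


def estimated_count(loc, rate_per_mloc):
    """Convert a per-MLOC rate to an approximate count, rounded down."""
    return loc * rate_per_mloc // 1_000_000


def summarise(study):
    """Map each organisation to (estimated bugs, estimated SMC instances)."""
    return {
        name: (estimated_count(loc, bugs), estimated_count(loc, smc))
        for name, (loc, bugs, smc) in study.items()
    }


if __name__ == "__main__":
    for name, (bugs, smc) in summarise(STUDY).items():
        print(f"{name}: ~{bugs} bugs, ~{smc} self-modifying code instances")
```

For example, 274 bugs per MLOC over 1.8 million lines comes to roughly
493 detected bugs for the insurance company.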
"Progress will only be achieved in programming if we are willing
to temporarily fully ignore the interconnection between our programs
(in textual form) and their implementation as executable code...
... In short: for the effective understanding of programs,
we must learn to abstract away from the existence of computers."
-- Edsger W. Dijkstra
--
Martin
Dr Martin Ward | Email: [email protected] | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4