[ECCO] A cybernetic-connectionist view of genetic regulation

Francis Heylighen Mon, 29 Nov 2004 04:49:24 -0800

I just read the Scientific American article by John Mattick that my friend Cliff summarized below, and I agree with his interpretation of the importance and novelty of the hypothesis Mattick proposes: the "non-coding" or "junk" DNA in cells may actually represent the regulatory network that gives complex organisms their specific organization, thus distinguishing them from bacteria where most DNA directly codes for proteins (building blocks) rather than regulation (building plans). Its emergence can be seen as a metasystem transition to a higher level of complexity, thus extending our PCP or ECCO approach downward, to the cellular level.

I would go one step further, and suggest that this metasystem transition follows one of the scenarios that I outlined in my paper on Mediator Evolution (http://pespmc1.vub.ac.be/Papers/MediatorEvolution.pdf), and which is inspired by the work of John Stewart. The traditional interpretation of junk DNA is that it is a parasitic structure that profits from the replication machinery in the cell to have itself reproduced, but without contributing to the cell's survival. That may well have been how it originated, as John Mattick also suggests.

But according to my scenario, parasites ("exploiters") tend to evolve into "cultivators" (or what John Stewart calls "managers"), that regulate or mediate the interactions between the agents they exploit, in order to increase their collective productivity. Thus, since junk DNA is dependent on the well-functioning of the metabolic network in the cell, it is evolutionarily beneficial for it to make that metabolism more effective, e.g. by carefully regulating which proteins are produced when, so as to maximize their synergy. It does this, not by directly producing proteins, but by promoting or inhibiting the activity of the genes that do.

Let me take the speculation one step further, and try to understand how non-coding DNA could regulate the cell better than coding DNA does. In my most recent attempts to model distributed cognition (which is a form of distributed regulation), I have suggested that all such systems can be viewed as generalized connectionist networks, i.e. collections of nodes (agents or components) connected by variable-strength links (communication channels) along which "activation" is propagated. All DNA (coding and non-coding) is transcribed into RNA. This RNA can produce proteins, or it can activate or deactivate other RNA or DNA. Proteins in turn can also activate or deactivate DNA.

Thus we have three types of connections:

proteins -> DNA/RNA

DNA/RNA -> DNA/RNA

DNA/RNA -> proteins

If we see the RNA -> RNA connections as "internal" to the network, then proteins -> RNA connections represent the network's "input", and RNA -> proteins its "output". In other words, we treat proteins and other non-RNA/DNA molecules present in the cell as the "environment" of the regulatory system, and the RNA/DNA molecules and their direct interactions as its internal network.
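As a minimal sketch of this partition, assuming each molecule is simply tagged with a kind (the function name and the kind labels are hypothetical, chosen only for illustration), the role of a link follows from which side of the DNA/RNA boundary its endpoints lie on:

```python
# Molecule kinds counted as part of the regulatory network itself.
# The labels are illustrative assumptions, not a biological ontology.
NUCLEIC = {"DNA", "RNA"}


def classify_link(source_kind: str, target_kind: str) -> str:
    """Return the role a link plays relative to the DNA/RNA network."""
    source_inside = source_kind in NUCLEIC
    target_inside = target_kind in NUCLEIC
    if source_inside and target_inside:
        return "internal"      # DNA/RNA -> DNA/RNA
    if target_inside:
        return "input"         # proteins -> DNA/RNA
    if source_inside:
        return "output"        # DNA/RNA -> proteins
    return "environment"       # neither end belongs to the network


links = [("protein", "DNA"), ("RNA", "RNA"), ("RNA", "protein"), ("protein", "protein")]
roles = [classify_link(source, target) for source, target in links]
print(roles)
```

Protein-to-protein interactions fall outside the three connection types and are treated here as part of the environment.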

This network is analogous in its structure and function to a neural network, that takes a collection of sensory signals as perceptual input, processes that information, compares it to in-built (in the DNA) "reference signals" that represent the ideal situation, and sends outgoing signals to the effectors, in order for them to take the actions that will bring the perceived situation closer to the ideal situation. This is a very general model of cybernetic control or regulation, except that in the most general model the internal network structure is left implicit. (See my paper with Cliff: http://pespmc1.vub.ac.be/papers/Cybernetics-EPST.pdf).

This network can be seen as a kind of neural network, where the activation of a node corresponds to the concentration of the corresponding molecules. Excitatory connections between nodes are those where an increase in concentration in the first type of molecules produces an increase in the second one. Inhibitory connections are those in which an increase in the first produces a decrease in the second. The only further assumption we need to turn this into the molecular equivalent of a traditional neural network is that the activation of a node is determined by the SUM of incoming activation, i.e. that the function that maps the states (activations) of the input nodes onto the receiving node is linear. This requirement is not strictly needed, as there also exist neural net models with non-additive functions, but it may simplify modelling. It may also be realistic if we assume that the different molecules that can promote/inhibit the formation of a given molecule do not interact with each other, but only with their target.
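The summation assumption can be sketched as follows. This is only an illustration: the node names, the weights, and the clipping at zero (added on the assumption that a concentration cannot be negative) are not part of the argument above.

```python
def summed_activation(incoming, activation):
    """Activation of one node as the weighted SUM of its inputs.

    incoming:   maps each source node to a link weight
                (positive = excitatory, negative = inhibitory).
    activation: maps each source node to its current concentration.
    """
    total = sum(weight * activation[source] for source, weight in incoming.items())
    # Assumption: a concentration cannot drop below zero.
    return max(0.0, total)


# Hypothetical node C, promoted by A and inhibited by B.
incoming_to_c = {"A": 0.5, "B": -0.25}
low_b = summed_activation(incoming_to_c, {"A": 1.0, "B": 0.5})
high_b = summed_activation(incoming_to_c, {"A": 1.0, "B": 4.0})
print(low_b, high_b)  # prints: 0.375 0.0
```

Because the contributions of A and B are simply added, neither molecule changes the effect of the other, which is the "no interaction between regulators" assumption mentioned above.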

We now have all we need to look for the equivalent in cells of all the traditional perceptual and control functions that can be implemented by neural networks. This includes mechanisms such as feedforward, feedback, spreading activation, relaxation into an attractor, pattern recognition, classification, etc. This may for example explain cellular differentiation in multicellular organisms, where each cell during ontogeny is "classified" into one of a fixed set of discrete cell types, depending on its molecular environment. Note that Stuart Kauffman famously modelled that attractor-driven process of self-organization by representing genetic regulation with (discrete) Boolean networks, whereas I hypothesize here that this could be achieved through (continuous) neural networks.
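To illustrate relaxation into an attractor with continuous activations, here is a toy sketch of two mutually inhibiting nodes. The saturating inhibition function, the parameter values, and the "cell type" labels are all assumptions made for the example; they are not taken from Mattick's or Kauffman's work.

```python
def relax(x, y, a=2.0, n=4, dt=0.1, steps=500):
    """Let two mutually inhibiting concentrations settle (Euler integration).

    Each node is produced at rate a / (1 + other**n) and decays at rate 1.
    """
    for _ in range(steps):
        dx = a / (1.0 + y ** n) - x
        dy = a / (1.0 + x ** n) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y


def cell_type(x, y):
    """Read the attractor reached as a discrete (hypothetical) cell type."""
    return "type X" if x > y else "type Y"


# Two slightly different starting "molecular environments".
state_1 = relax(1.2, 0.8)
state_2 = relax(0.8, 1.2)
print(cell_type(*state_1), cell_type(*state_2))  # prints: type X type Y
```

A small initial difference in concentrations is amplified until one node dominates, so continuous dynamics can still produce a discrete classification.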

What is still missing for a full neural net is learning: the differential reinforcement of useful links, and weakening of unsuccessful ones. This is necessary for the network to develop a complex regulatory structure without getting bogged down by the combinatorial explosion of possible architectures. We now encounter one apparent difference between neural and genetic mechanisms: the strength of a neural connection (synapse) can vary continuously, but an RNA -> RNA link is typically binary: it is either there, or not (in part explaining Kauffman's discrete models). But there probably are ways to vary its relative strength, if only by repeating the corresponding DNA sequence in the genome a variable number of times, so that more or fewer copies of a certain RNA molecule are made once the necessary "activator" molecule is present. This may explain the many repeating sequences that are typically found in "junk" DNA. (This apparent redundancy is one of the reasons why such DNA tends to be interpreted as uninformative junk).
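A minimal sketch of this idea, assuming that every copy of a sequence contributes the same fixed amount to the link (the function name and the per-copy value are hypothetical):

```python
def effective_weight(copies: int, per_copy: float = 0.25, sign: int = 1) -> float:
    """Graded link strength obtained from a binary link repeated `copies` times.

    sign is +1 for a promoting link and -1 for an inhibiting one.
    """
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    return sign * copies * per_copy


weights = [effective_weight(c) for c in range(5)]
print(weights)  # prints: [0.0, 0.25, 0.5, 0.75, 1.0]
```

The weight is still quantized, but with enough repeats the steps become fine enough to approximate a continuously variable connection.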

The learning mechanism then could be the insertion of more copies of "successful" RNA catalysts, and removal of "unsuccessful" ones, which can happen by simple variation and selection. However, to make this mechanism more efficient, it would be good to have a measure of success that is simpler and more local than whole organism survival, e.g. an equivalent of the neural mechanisms of Hebbian reinforcement or backpropagation. But whether we can find the equivalent of those at the cellular level is something I would have to reflect on more deeply, and perhaps consult a specialist in molecular biology... On the other hand, it seems plausible that the cellular learning mechanisms are (much) less efficient than the neural ones, as biological evolution is obviously much slower than cognitive development in the brain.
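The variation-and-selection version of this learning can be sketched as a simple hill climb over copy numbers. Everything specific here is an assumption for illustration: the global error measure stands in for "whole organism survival", and the target responses are invented.

```python
import random


def output(copies, inputs, per_copy=0.25):
    """Linear response of one target to its inputs, weighted by copy number."""
    return sum(c * per_copy * x for c, x in zip(copies, inputs))


def error(copies, cases):
    """Global (organism-level) measure: squared distance from desired responses."""
    return sum((output(copies, inputs) - target) ** 2 for inputs, target in cases)


def evolve(copies, cases, generations=200, seed=0):
    """Insert or delete one copy at random; keep the variant if it is no worse."""
    rng = random.Random(seed)
    best = list(copies)
    best_err = error(best, cases)
    for _ in range(generations):
        mutant = list(best)
        i = rng.randrange(len(mutant))
        mutant[i] = max(0, mutant[i] + rng.choice((-1, 1)))  # variation
        err = error(mutant, cases)
        if err <= best_err:                                  # selection
            best, best_err = mutant, err
    return best, best_err


# Hypothetical desired responses to two input patterns.
cases = [((1.0, 0.0), 0.5), ((0.0, 1.0), 1.0)]
initial = [0, 0]
initial_err = error(initial, cases)
final, final_err = evolve(initial, cases)
print(initial_err, final, final_err)
```

Selection here only sees the global error, which is exactly the inefficiency noted above; a Hebbian-style rule would instead adjust each link from locally available signals.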

All comments welcome!


Francis

Date: Fri, 26 Nov 2004 22:28:06 -0700
To: [EMAIL PROTECTED]
From: Cliff Joslyn <[EMAIL PROTECTED]>
Subject: [pcp-discuss:] The likely Meta-System Transition in molecular evolution


I would like to draw everyone's attention to:

Mattick, John: (2004) ``The Hidden Genetic Program of Complex
Organisms'', Scientific American, v. 291:4, pp. 60-67

See also his technical papers:

"The evolution of controlled multitasked gene networks: The role of
introns and other noncoding RNAs in the development of complex
organisms", Mattick, JS; Gagen, MJ Source: Molecular Biology and
Evolution; September, 2001; v.18, no.9, p.1611-1630

"Challenging the dogma: The hidden layer of non-protein-coding RNAs in
complex organisms."  Mattick, JS Source: BioEssays; October 2003;
v.25, no.10, p.930-939

"RNA regulation: a new genetics?"  Mattick, JS Source: NATURE REVIEWS
GENETICS; APR 2004; v.5, no.4, p.316-323

I saw Mattick give a technical plenary at the 2003 Intelligent
Systems for Molecular Biology (ISMB 03, one of the two premier
bioinformatics conferences), and was really blown away. The Scientific
American article is a superb semi-technical distillation of his
work. He has a revolutionary, but simple and elegant, thesis, highly
coherent with the principles of evolutionary cybernetics, and most
importantly, highly likely to be TRUE, about molecular evolution. It
puts so much of what I know about biological systems in context, while
answering many current mysteries, and really opens up the kind of
explanatory paradigm we've been lacking for so long, but is so
obviously suggested by a cybernetic perspective.

In brief, consider these facts:

*) Most genomes are characterized by a VERY high proportion (> 98%)
of genomic sequence that is not genes, that is, does not code for
protein. This includes introns and so-called "intergenic space".

*) However, recent evidence indicates that much of this genome is
actually expressed as RNA, and moreover, good chunks of it are
identical among evolutionarily distinct organisms. This is a property
called "conservation", which indicates that it's functionally
significant for survival. And moreover, portions of non-coding DNA are
MORE highly conserved than proteins.

*) This is NOT true in prokaryotes (bacteria lacking nuclei), but is
in eukaryotes. Prokaryotes were the only life on earth for 2.5 B
years. But the few hundred million years after the emergence of
eukaryotes also saw the origin of metazoans (multi-cellular
organisms), all of which are eukaryotes.

*) Nonetheless, prokaryotes have the same order of magnitude of
genes as eukaryotes. The riddle that organismal size and
complexity (however measured, a different discussion) does not
correlate to the number of genes present is well noted, especially in
the wake of the genomic revolution.

*) BUT, total genome size, and in particular the RATIO of non-coding
to coding genome DOES more or less correlate with complexity.

*) Finally, we note that the standard hypothesis for explaining
regulatory organization of sufficient complexity to generate metazoans
is that it is somehow embedded in the combinatorics of protein
interaction, that is, proteins acting on each other to form regulatory
networks. This is despite the fact that to a first approximation,
regulatory complexity must grow nonlinearly with the number of
"components" controlled, on the order of the quadratic (to handle
pairs of proteins). And indeed, in PROKARYOTES the number of
regulatory genes increases roughly with the square of the total
number of genes, up to a limit where the number of regulatory genes
is predicted to exceed the number of functional genes, and then
plateaus.
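
The quadratic limit in the last point can be made concrete with a small
sketch. It reads the claim as "regulatory genes grow with the square of
the total gene count"; the constant k below is purely hypothetical and
is not a measured value.

```python
def regulatory_genes(total_genes, k):
    """Regulatory genes needed if the overhead grows quadratically."""
    return k * total_genes ** 2


def crossover_total(k):
    """Total gene count at which regulatory genes equal functional genes.

    Solves k * n**2 == n - k * n**2, which gives n == 1 / (2 * k).
    """
    return 1.0 / (2.0 * k)


k = 0.00005  # hypothetical scaling constant, for illustration only
limit = crossover_total(k)
for n in (1000, 5000, 20000):
    regulatory = regulatory_genes(n, k)
    print(n, regulatory, n - regulatory)
print("crossover near", round(limit), "genes")
```

Past the crossover, each added functional gene would cost more than one
regulatory gene, which is one way to read the plateau.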

The conclusion is inescapable: there was a major evolutionary step at
2.5 B years where an RNA-mediated network for the regulation of
protein function, encoded in "non-coding" DNA (introns and
intergenic space), arose, which resulted in the possibility of complex
organisms, including eukaryotes and especially the morphological
development of, and cell differentiation within, metazoans. Mattick
uses the metaphor of genes as simply the "parts list" (a description
of the individual TYPES of "lego blocks"), and the rest as the
instructions for putting them together (how many blocks of which type
to use where and when in morphological development).

The argument is so strong and so reasonable, and simply MUST be
accepted prima facie: "The implications of this rule are
staggering. We may have totally misunderstood the nature of the
genomic programming and the basis of variations in traits among
individuals and species." (Mattick, the Sci Am paper).

There's much more to this argument, including some fascinating
observations about further GENETIC specialty of primates, and even
humans. And while I've seen one of Mattick's technical talks, and read
the Sci Am piece, I have not studied his papers. Nor am I anything
like an expert in this area. My good colleagues here at LANL who are
molecular biologists say "yes, he's made a splash, but let's go slow".
And of course revolutionary ideas require the strongest evidence, and
Mattick is suggesting nothing other than a major revision to, if not
an obliteration of, the Central Dogma:

"We may be witnessing such a turning point in our understanding of
genetic information. The central dogma of molecular biology for the
past half a century and more has stated that genetic information
encoded in DNA is transcribed as intermediary molecules of RNA, which
are in turn translated into the amino acid sequences that make up
proteins. The prevailing assumption, embodied in the credo 'one gene,
one protein', has been that genes are generally synonymous with
proteins. A corollary has been that proteins, in addition to their
structural and enzymatic roles in cells, must be the primary agents
for regulating the expression, or activation, of genes."  (Mattick,
the Sci Am paper).

But fortunately, I'm not a biologist, and so I can without hesitancy
say the following to this group of people interested in (and some
dedicated to) Turchin's Meta-System Transition (MST) theory.

Turchin's original evolutionary system begins with multi-cellular
organisms, and we have speculated for some time about extending the
ideas to earlier evolutionary times. The route is now open with the
origin of the control of genetic expression. In the MST schema, this
is "X is the control of genetic expression", and I don't really know
what X is, something like "protein mechanisms" or "protein
interaction". But the other hallmarks of an MST are there, in the
possible divergence and specialization of the components being
controlled.
-----
O--------------------------------------------------------------------------->
| Cliff Joslyn, Research Team Leader (Cybernetician at Large)
| Knowledge Systems & Computational Biology; Computer & Computational Science
| Los Alamos National Laboratory, Mail Stop B265, Los Alamos NM 87545 USA
| [EMAIL PROTECTED]     http://www.c3.lanl.gov/~joslyn     (505) 667-9096
V All the world is biscuit-shaped. . .


--

Francis Heylighen
"Evolution, Complexity and Cognition" research group
Free University of Brussels
http://pespmc1.vub.ac.be/HEYL.html
