reply to Peter Marks, re: net net baud rate

Douglas P. Wilson Fri, 18 Dec 1998 00:23:07 -0800
My thanks to Peter Marks <[EMAIL PROTECTED]> for his comments on my
"net net baud rate" message, in which I wrote:

> > Of all the things I have to say, if there is only one thing you take 
> > seriously let it be this:  
> > 
> >    We can maximize the global net baud rate for interpersonal 
> >    communications by using combinatorial optimization to match people 
> >    based on personality, interest, and education profiles.

Peter Marks replied:
 
> This approach breaks "interpersonal communication" into two streams, 
> 
> + the actual interpersonal communication
> 
> + communication of each potential communicant's "parameters" to the optimizer

This is more or less what I intended, but it might help to spell it 
out a bit.  ALL acts of interpersonal communication involve a separate 
step of deciding who to communicate with.  Sometimes this is done 
directly in person, or by addressing a piece of e-mail to a single 
recipient, and it may also be done quite indirectly by sending a 
message to a mailing list, but it must happen.  Somehow people must 
make some kind of choice to take part in interpersonal communication, 
and making that choice is something quite distinct from the 
actual communication itself.

A mailing list is one way in which society helps people make these 
choices, and sometimes it works quite well.  The comments of Peter 
Marks illustrate this -- they raise good points and show considerable 
understanding of what I was trying to say, so my choice of mailing 
list to send my message to was probably a good one.

But http://www.liszt.com/ lists something like 90,000 mailing lists,
and I've barely scratched the surface -- it is hard to choose from
amongst the 90,000 mailing lists, let alone the 50 million people
I could conceivably send e-mail to.  

Years of bitter experience have proven to me that the ideas I'm trying 
to communicate are hard to understand and that I'm not very good at 
presenting them.  Another interpretation could be put on this: it may 
be that what I am trying to communicate is simply wrong, and the real 
barrier to communication is my failure to understand the arguments of 
the people I talk to.  But either way, there is a barrier to 
communication.

I daydream constantly about partaking in true high-net-baud-rate
discussions with other people.  I think that would mean discussions
that conclude with one of these two results:

1) they understand what I am trying to say and agree with me, OR

2) I understand their arguments against my views and have to agree 
   with them.

Of course I have a preference: I would obviously prefer the first
outcome to the second.  But if I am wrong, I'd like to know it,
and I'd like to be thoroughly convinced -- then I could stop wasting
my time.  So really either outcome would be productive.

Let us suppose my poor writing, choice of topic, or pig-headedness 
raises a barrier to communication so high that I could have a 
productive high-net-baud-rate discussion like that with only one in 
a million people or so.  That would mean there are somewhere around 
50 such people out of the 50 million I could send an e-mail to.

What are my chances of finding those 50 people?  In pre-web days the
chances were almost zero, but now because of search engines the 
odds are somewhat improved.  Somewhat.  But not much.  I'm sure we
all know how frustrating search engines can be.  
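
A minimal sketch of the arithmetic behind that question, assuming
(purely for illustration) that messages go to people picked at random
and independently.  The 50 million and one-in-a-million figures come
from the text above; the numbers of attempts are made up.

```python
def chance_of_at_least_one(one_in: int, attempts: int) -> float:
    """Probability that at least one of `attempts` random, independent
    contacts is a good match, when one person in `one_in` is suitable."""
    p_match = 1.0 / one_in
    return 1.0 - (1.0 - p_match) ** attempts


POPULATION = 50_000_000   # people reachable by e-mail (from the text)
ONE_IN = 1_000_000        # "one in a million" (from the text)

# Expected number of suitable people in the whole population.
expected_matches = POPULATION / ONE_IN

if __name__ == "__main__":
    print("suitable people overall:", expected_matches)
    # Hypothetical numbers of messages sent blindly.
    for attempts in (100, 10_000, 1_000_000):
        chance = chance_of_at_least_one(ONE_IN, attempts)
        print(f"{attempts:>9} random messages -> {chance:.4%} chance")
```

Under these assumptions even ten thousand blind messages leave the
chance of reaching a single suitable person at about one percent.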

> The bulk of the latter stream must be considered in determining overall system
> efficiency.  If the amount of information required by the optimizer is high
> (because a lot more detail must be specified than would be present in any
> individual interpersonal communique), or if the parameters have to be
> frequently updated, then the overhead of optimization could swamp any direct
> gains, and the efficiency comparison could actually go against the
> optimization scheme.

Well, yes, this is correct as it stands.  But note the "If" that 
begins the long second sentence.  What I have planned involves filling
out a questionnaire on a web page.  And yes, IF it took two weeks to
fill out the form and the results were only useful for a ten-minute
conversation, then this just wouldn't work.  But I'm thinking more
along the lines of an hour to fill out the questionnaire and the results
being useful for a matter of weeks or months.
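
The overhead argument can be put in rough numbers.  This is only a
sketch: counting "two weeks" as 80 working hours, and assuming 20
hours of resulting conversation in the hoped-for case, are both
illustrative guesses, not figures from the plan itself.

```python
def useful_fraction(setup_hours: float, conversation_hours: float) -> float:
    """Share of total time spent on actual conversation rather than on
    describing oneself to the matching program."""
    return conversation_hours / (setup_hours + conversation_hours)


# Bad case: two weeks of form-filling (assumed to be 80 working hours)
# for a single ten-minute conversation.
bad = useful_fraction(setup_hours=80, conversation_hours=10 / 60)

# Hoped-for case: one hour of form-filling; the 20 hours of resulting
# conversation over the following weeks is a made-up figure.
good = useful_fraction(setup_hours=1, conversation_hours=20)

if __name__ == "__main__":
    print(f"bad case:  {bad:.1%} of the time is conversation")
    print(f"good case: {good:.1%} of the time is conversation")
```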

Please note that I'm not talking about perfection here, just 
improvement.  Instead of the result being a precise list of the 50
best people to talk to, it might be a list of 150 candidates and
an estimate that there is, say, a 90% chance that 20 of the ideal 50 
are amongst the 150.  
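
Taking those illustrative figures at face value (20 of the ideal 50
inside a list of 150), a short calculation shows how much such a list
would improve on picking people blindly.  The variable names are
hypothetical, and the 90% confidence figure is left out of the sum.

```python
from fractions import Fraction

# Share of ideal partners among everyone reachable by e-mail.
base_rate = Fraction(50, 50_000_000)

# Share of ideal partners in the short list, if the estimate holds.
shortlist_rate = Fraction(20, 150)

# How many times richer in ideal partners the short list is.
enrichment = shortlist_rate / base_rate

# Average number of people to try before reaching an ideal partner,
# treating each try as an independent draw at the given rate.
tries_blind = 1 / base_rate
tries_shortlist = 1 / shortlist_rate

if __name__ == "__main__":
    print("enrichment factor:", float(enrichment))
    print("average tries, blind:", float(tries_blind))
    print("average tries, short list:", float(tries_shortlist))
```

So an imperfect list would still cut the average search from about a
million tries to fewer than ten.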

> > I can think of a few people out there who know me and my favourite 
> > topics quite well, people who can read most of my text quickly and 
> > still understand it almost perfectly.
> 
> "People who know me" presumably means that there has been significant prior
> communication and that (some interpretation of it) has been remembered.  This
> is sometimes characterized as establishing a shared context.  It is these
> contexts that must be somehow encoded for the optimizer.

Actually this part of the message was just a way of introducing an 
idea from information theory that some people might not know: even 
the free and easy conversation between old friends doesn't 
necessarily involve much actual communication.  The net baud rate 
may be quite low because the conversation lacks novelty.
 
> It's still an open question how usefully such contexts can be reduced to byte
> strings; much of Artificial Intelligence research has been an attempt to do
> this. But even accepting that possibility, it is not at all obvious to me that
> the many separate contexts which one person shares with other individuals and
> groups can be combined into a unified set of optimizer parameters for each
> person.

This is quite correct, but I'm not proposing anything that ambitious.

> Mathematical information theory speaks to the probability of different
> messages - low probability messages contain high information.  It doesn't
> require that each bit be equivalent to every other bit; such an assumption
> simply makes the math easier, and is adequate for assessing _maximum_ channel
> capacities.

Yes, of course, that's correct.
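
For readers who have not met the idea, the standard measure is the
self-information -log2(p) of a message with probability p, and the
entropy is its average over all messages.  A small sketch follows;
the two probability distributions are invented for illustration.

```python
import math


def surprisal_bits(p: float) -> float:
    """Information, in bits, carried by a message of probability p."""
    return -math.log2(p)


def entropy_bits(probs) -> float:
    """Average information per message for a source that emits
    messages with the given probabilities (which should sum to 1)."""
    return sum(p * surprisal_bits(p) for p in probs if p > 0)


# Hypothetical "bland" source: one message is almost always sent.
bland = [0.97, 0.01, 0.01, 0.01]

# Hypothetical "varied" source: four equally likely messages.
varied = [0.25, 0.25, 0.25, 0.25]

if __name__ == "__main__":
    print("bits in a 97%-likely message:", surprisal_bits(0.97))
    print("bits in a 1%-likely message: ", surprisal_bits(0.01))
    print("bland source, bits/message:  ", entropy_bits(bland))
    print("varied source, bits/message: ", entropy_bits(varied))
```

Both sources use the same four-message alphabet, but the bland one
delivers far less information per message.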

> Discounting bits that contain 'less information' than they might otherwise is
> well within the capability of the formulas of information theory.

Yes, it is.  I am trying to apply information theory here, not attack it.

> Transmitting in an unknown language is one such discounting, but so it
> transmitting bland, "unsurprising" messages.  In other words both examples
> seem to me of arguably low net baud rate (but for different reasons).

This is precisely the point I was trying to make.  I think most of us
have little interest in bland messages, and I am trying to suggest 
ways of matching people to encourage more actual communication of
information.

> From this perspective, it is in this flexibility of discounting that the
> overall scheme will ultimately break down.  It will prove impossible to
> find a discounting scheme which is simultaneously 
> 
> 1. consistent
> 2. general enough to usefully summarize a wide range of "intended meaning"
> 3. specific enough to make adequate distinctions for optimization purposes
> 4. computationally efficient

Well thanks, Peter, you've done a bit of requirements analysis for me.
I really wasn't intending anything so ambitious, but I'll think hard 
about these four requirements and see what comes of it.  But you say
"It will prove impossible to ..." and such statements are hazardous.

I'll quote again Clarke's First Law:

   When a distinguished but elderly scientist states that something is 
   possible, he is almost certainly right.  When he states that something 
   is impossible, he is very probably wrong.

I don't know if Peter Marks is distinguished, elderly, or a scientist,
but he should still use words like "impossible" very carefully.

> In short, the notion of "net baud rate" will turn out to not have a
> practicable definition, and therefore to not be subject to optimization.

I think "a practicable definition" is easy, because it is literally a
definition that can be used in practice.  It would be harder to
formalize the notion completely.  For my purposes I propose to use the
judgements of the people involved.

If you fill out my questionnaire (in a month or so when it is ready for 
use) and the matching program recommends an exchange of e-mail with 
several people, then I propose to accept your judgement about how 
productive the exchanges were.  And I will encourage you to provide 
that judgement as feedback to improve the matching process.
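
To make "matching by combinatorial optimization" concrete, here is a
toy sketch, not the planned program.  It assumes each profile is just
a set of interests, scores a pair by Jaccard similarity, and tries
every possible pairing of a small group.  The names, interests, and
scoring rule are all hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two interest sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_pairing(people: dict):
    """Exhaustively search every way of pairing people up and return
    (total_score, pairs) for the pairing with the highest total."""
    if len(people) % 2:
        raise ValueError("need an even number of people to pair up")

    def search(remaining):
        if not remaining:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        best = (-1.0, [])
        for i, partner in enumerate(rest):
            score, pairs = search(rest[:i] + rest[i + 1:])
            score += jaccard(people[first], people[partner])
            if score > best[0]:
                best = (score, [(first, partner)] + pairs)
        return best

    return search(sorted(people))


# Hypothetical profiles.
profiles = {
    "ann": {"information theory", "economics", "chess"},
    "bob": {"information theory", "economics", "sailing"},
    "cat": {"gardening", "chess", "poetry"},
    "dan": {"gardening", "poetry", "sailing"},
}

if __name__ == "__main__":
    score, pairs = best_pairing(profiles)
    print(pairs, score)
```

Exhaustive search is only workable for a handful of people, since the
number of pairings grows explosively; anything larger would need a
proper maximum-weight matching algorithm or a heuristic.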

I'll be using a machine-learning system, perhaps based on a neural 
network, and will essentially be training the machine (the program)
to generate suggestions that BOTH of the people involved in the
discussions will agree were good ones.
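
A minimal sketch of that kind of feedback loop, with a simple
logistic model standing in for the neural network.  The feature
layout, learning rate, and training data are all invented; the one
point taken from the text is that a suggestion counts as good only
if BOTH people say so.

```python
import math


def predict(weights, features) -> float:
    """Estimated chance that both people will rate the exchange good."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def update(weights, features, rating_a: bool, rating_b: bool, rate=0.5):
    """One gradient step of logistic regression on a single exchange."""
    target = 1.0 if (rating_a and rating_b) else 0.0
    error = target - predict(weights, features)
    return [w + rate * error * x for w, x in zip(weights, features)]


# Hypothetical features: [bias, shared interests, education gap],
# followed by each person's verdict on the exchange.
history = [
    ([1.0, 0.9, 0.1], True, True),
    ([1.0, 0.8, 0.2], True, True),
    ([1.0, 0.2, 0.9], True, False),   # only one side was satisfied
    ([1.0, 0.1, 0.8], False, False),
]

weights = [0.0, 0.0, 0.0]
for _ in range(200):                  # repeated passes over the feedback
    for features, a, b in history:
        weights = update(weights, features, a, b)

if __name__ == "__main__":
    print("similar pair:   ", predict(weights, [1.0, 0.9, 0.1]))
    print("dissimilar pair:", predict(weights, [1.0, 0.1, 0.8]))
```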

By the way, I'd like to add a prediction: the global (total) net baud
rate, a measure of genuine productive communication, will turn out to
be a significant factor in the global economy, and I hope my simulator
will be able to include it.  If I am correct, then my matching 
experiments should increase the amount of real communication and 
benefit the global economy.  But of course this must emerge from the 
simulation as a result, and must not be programmed into it as an 
assumption.

      dpw

Douglas P. Wilson     [EMAIL PROTECTED]
http://www.island.net/~dpwilson/index.html