I've got to write shorter emails. My apologies. I get started...

On Thu, Feb 21, 2019 at 5:15 PM Rob Freeman <[email protected]>
wrote:

>
> By "formal incompleteness" I mean Goedel's proof that "every sufficiently
> powerful formal system is either inconsistent or incomplete".
>

Oh. There are a variety of completeness and incompleteness theorems. They
always struck me as being similar to other statements about ...
completeness, closures, compactifications, entireness. So, like, Banach
spaces are complete. Great. Very important property when you work with
them, but otherwise it's apropos of nothing at all. Some other theory will be
different. It's just one of those properties that systems have or don't
have. But I never really read/looked/thought hard about that, so I dunno.
Doesn't strike me as important for AGI or anything.

Here's a better example. It's like "the real numbers lie in between the
rational numbers; the rationals are not a complete field".
Whoop-de-doo. So what? Was that supposed to be important in some way? How?
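(For concreteness, the textbook fact behind that remark -- a standard example,
not anything from this thread -- is that there are Cauchy sequences of
rationals with no rational limit:

```latex
% The decimal truncations of sqrt(2) are rational, form a Cauchy sequence,
% but converge to an irrational number; so Q is not complete, and R is
% precisely its metric completion.
a_1 = 1,\quad a_2 = 1.4,\quad a_3 = 1.41,\quad \ldots \qquad
\lim_{n\to\infty} a_n = \sqrt{2} \notin \mathbb{Q}
```

And indeed: important if you care about analysis, apropos of nothing otherwise.)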

> Not necessarily a big deal. But I had difficulty in the past trying to
> persuade linguists that these mathematical results have relevance for
> linguistics.

I was pointed to this book:

Solomon Marcus, Algebraic Linguistics; Analytical Models, Elsevier, 1967,
https://monoskop.org/images/2/26/Marcus_Solomon_editor_Algebraic_Linguistics_Analytical_Models_1967.pdf

where even a quick skim makes clear that he was anticipating categorial
linguistics.  I mean, it's almost the same thing, with
differences in terminology and notation, but otherwise clearly
recognizable.  So even as Chomsky was going on a tirade, others (I'm
blanking on names) were applying category theory to linguistics in the
60's and 70's. I think it was just wayyyy too abstract for anyone without
formal training in math; and linguists don't have that formal training, so
they can't really follow or play the game according to the rules.

>  The problem was that Chomsky shot it down because it resulted in
> "inconsistent or incoherent ... analyses". This was big. It cracked
> linguistics apart. Linguistics is divided by it to this day:
>
> Frederick J. Newmeyer, Generative Linguistics: A Historical Perspective,
> Routledge, 1996:
>
> "Part of the discussion of phonology in 'LBLT' is directed towards
> showing that the conditions that were supposed to define a phonemic
> representation (including complementary distribution, locally determined
> biuniqueness, linearity, etc.) were inconsistent or incoherent in some
> cases and led to (or at least allowed) absurd analyses in others."
>
> Sydney Lamb:
>
> 'For example, perhaps his most celebrated argument concerns the Russian
> obstruents. He correctly pointed out that the usual solution incorporates a
> loss of generality, but he misdiagnosed the problem. The problem was the
> criterion of linearity. He stubbornly holds on to this criterion, although
> it really is faulty, and comes up with a solution for the Russian
> obstruents that obscures the phonological structure. I showed (in accounts
> cited below) that by relaxing the linearity requirement we get an elegant
> solution while preserving "centrality of contrastive function of linguistic
> elements".'

Wow. I did not know that. Interesting, I suppose. It's ... well, beats me.
Science is littered with misunderstandings by brilliant people. Time
passes. Debates are forgotten.  I don't know what to do with this.

> What we need is a concerted effort to bring these theoretical concerns to
> the forefront.

:-/
> I think if we do that your problems finding volunteers to work on your
> codebase may disappear in quick time. The time is very ripe. The likes of
> Geoff Hinton, Ng, just about everyone, are coming out saying there is a
> problem, we need a reboot, we are looking for new solutions. A really tidy
> theoretical presentation of non-linearity might focus minds, and bring
> funding.

I'll take that as a motivational incentive. But I haven't finished writing
what I want to write, I haven't finished exploring and proving what I want
to explore/prove, and maybe 95% of the codebase that I'm talking about
really has little/nothing to do with *this* conversation; it's something
else again.

> You have your "jigsaw". OK in itself, but then a jigsaw only goes
> together in one way. Is that "one way" significant??

Yes, no, sort-of. It can be one way, it can be many ways.  So in the 1960's
and onward, talking about and articulating "context-free grammars" etc.
explained and clarified and gave solid form to various vague confusions.
The jigsaw-puzzle-piece analogy is more recent -- the first published
diagrams I know of were in the 1991 Sleator/Temperley paper -- but once you
know what to look for, you can see them in earlier works, just not
explicitly placed into a figure/diagram.  As an organizing principle, I
think they're a powerful metaphor; they break you out of the world of
1-dimensional strings of symbols, and enable thinking about the connectivity
of graphs.   And once you see them, they're just absolutely everywhere. But
once you see them everywhere, they become trite, like 1+1=2, and getting
past the triteness requires attention to detail. Like 1+1=2 implies prime
numbers implies the vast tract of number theory, and that's a journey of a
thousand steps. Jigsaw connectors are just one step; after that step...
well, there's many more.
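To make the metaphor concrete, here's a tiny sketch in the spirit of the
Sleator/Temperley connector diagrams. The three-word lexicon, the connector
names (D, S), and the greedy nearest-match rule are all made up for
illustration; it's nothing like the real parsing algorithm, just the shape
of the idea: pieces snap together when a rightward connector meets a
leftward connector of the same type.

```python
# Each word carries typed connectors: '+' wants a partner to its right,
# '-' wants a partner to its left. A link forms when a '+' connector
# meets a '-' connector of the same type. Hypothetical toy lexicon:

LEXICON = {
    "the": ["D+"],        # determiner, links rightward to a noun
    "cat": ["D-", "S+"],  # noun: takes a determiner, links to a verb
    "ran": ["S-"],        # verb: takes a subject on its left
}

def links(sentence):
    """Greedily link each '+' connector to the nearest later word
    carrying an unsatisfied '-' connector of the same type."""
    words = sentence.split()
    wants = [list(LEXICON[w]) for w in words]
    found = []
    for i, w in enumerate(words):
        for c in LEXICON[w]:
            if c.endswith("+"):
                typ = c[:-1]
                for j in range(i + 1, len(words)):
                    if typ + "-" in wants[j]:
                        wants[j].remove(typ + "-")
                        found.append((words[i], typ, words[j]))
                        break
    return found

print(links("the cat ran"))
# prints [('the', 'D', 'cat'), ('cat', 'S', 'ran')]
```

Notice that the result is a graph (a set of typed links), not a string --
that's the whole point of the jigsaw picture.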

> How efficient is your "jigsaw" search?

If you know what the pieces are, then assembling them is effectively a
parsing problem. Currently the English-dictionary parser benchmarks run at
5K sentences in 10 seconds. (Or not; those are short sentences. 18th
century Jane Austen, with her long sentences, dialog and archaic
English, stumbles with lots of errors and takes 10 minutes for the 5K
pseudo-sentence-mashup literary hash. So "it depends".) If you don't know
what the pieces are, but are setting out to discover them by looking for
statistical regularities in language, then it takes many weeks of elapsed
time on multicore machines to find something interesting. But this is
untuned, brute-force, stumbling-in-the-dark kind of work.  Once someone
figures out exactly how much data is needed, instead of being
pessimistic/optimistic; once it becomes clear what the optimal sizes and
lengths and amounts are, and which steps to carry out when -- shoot, you
could make it a thousand times faster.
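For flavor, the crudest possible version of "discover the pieces from
statistical regularities" is scoring word pairs by pointwise mutual
information. The three-sentence corpus and the adjacent-pair window below
are hypothetical stand-ins for the weeks-long multicore runs I mean -- a
sketch of the statistic, not the actual pipeline:

```python
import math
from collections import Counter

# Toy corpus, purely for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased the dog",
]

word_counts = Counter()
pair_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    word_counts.update(words)
    # count adjacent pairs only -- the crudest possible window
    pair_counts.update(zip(words, words[1:]))

n_words = sum(word_counts.values())
n_pairs = sum(pair_counts.values())

def pmi(w1, w2):
    """Pointwise mutual information of an observed adjacent pair:
    log2( p(w1,w2) / (p(w1) * p(w2)) )."""
    p_pair = pair_counts[(w1, w2)] / n_pairs
    return math.log2(p_pair / ((word_counts[w1] / n_words)
                               * (word_counts[w2] / n_words)))

# High-PMI pairs are candidate "jigsaw" attachments.
print(sorted(pair_counts, key=lambda p: -pmi(*p))[:3])
```

Scale that up by many orders of magnitude, with smarter windows and
counting, and you get the general character of the discovery runs.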

I mean, we've been talking theory; I like the theory. Actually proving that
these things work well and do the right thing -- that's incomplete work. It
looks really good to me, so far, but I encounter raw disbelief that won't
dissipate until a final, completed, working version is assembled. So that's
normal.

We haven't really clarified what "it" is, so talk of speed is premature.

>  I don't know if you saw the paper I linked for Ben earlier.

I didn't look.

> I could probably implement my relations in your codebase.

I work in four or five different codebases; they're really very, very
different, almost unrelated, and yet related. So which one, where, how? My
best ideas haven't even been coded yet; they're festering, abandoned in
papers and PDFs and crib notes and TODO files.

Anyway, most of the gaggle of ideas discussed here is disjoint from the
codebase. And the codebase ... well, phew, it's very nearly an unrelated
discussion. It's a whole different thing.

-- Linas


-- 
cassette tapes - analog TV - film cameras - you

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T581199cf280badd7-Ma5e60f09b41a2d13cb1937cb
Delivery options: https://agi.topicbox.com/groups/agi/subscription
