Linas,

The thing to take away from Pissanetzky's formulation is the idea of
finding cognitively relevant structure by permuting observations.

Permutation, not patterns. An active principle. It's a subtle difference.
You can analyse both with the same maths, but to do so is to miss the point.
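
To make that concrete, here is a toy sketch (my own illustration, not
Pissanetzky's published algorithm, and the causal links are invented):
take a few observed elements with causal links between them, and search
over orderings for those which minimise the total separation between
linked elements.

    # Toy sketch: structure as the invariant shared by all
    # "least action" permutations of a small causal set.
    from itertools import permutations

    # Hypothetical observations: two independent causal chains,
    # 0 -> 1 -> 2 and 3 -> 4 -> 5.
    links = [(0, 1), (1, 2), (3, 4), (4, 5)]

    def action(order):
        """Total separation between causally linked elements."""
        pos = {e: k for k, e in enumerate(order)}
        return sum(abs(pos[j] - pos[i]) for i, j in links)

    orders = list(permutations(range(6)))
    best = min(map(action, orders))
    minimal = [o for o in orders if action(o) == best]
    print(len(minimal))  # 8 orderings reach the minimum
    print(minimal[:2])   # (0, 1, 2, 3, 4, 5), (0, 1, 2, 5, 4, 3)

Eight different surface forms, one underlying block structure: {0, 1, 2}
and {3, 4, 5} always come out contiguous. The structure is in the
symmetry of the permutations, not in any one memorised pattern.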

The idea that perception is permuting observed elements to emphasize causal
symmetries is very powerful. Causal symmetries can be independent of a
particular form. The way the parts are put together can be quite different
each time. You can have "one-shot learning", even "creativity". Compare
this to the state of the art, which is to "learn" patterns. That's more
Platonic. Permutation is a more active process, more Heraclitus. There is a
change in perspective. It does reduce to the same maths as yours, Linas.
Maths is maths. Your element, I know. What matters is how you apply and
interpret it. All this stuff may be the same. Except for your assumption:

LV, Feb 9: "If you are picky, you can have several different kinds of
common nouns, but however picky you are, you need fewer classes than there
are words."

You don't address this at all in your reply, except to imply there is
context which justifies it. What is that context? Are you saying there are
fewer classes than words, or more? This is a crucial point.

Perhaps even more crucial is the related question of whether these classes,
these factorizations, have a gauge.

Note 10, p. 52, "Neural-Net vs. Symbolic Machine Learning":

"In physics, such an ambiguity of factorization is known as a global gauge
symmetry; fixing a gauge removes the ambiguity. In natural language, the
ambiguity is spontaneously broken: words have only a few senses, and are
often synonymous, making the left and right factors sparse. For this
reason, the analogy to physics is entertaining but mostly pointless."
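
For anyone following along, the ambiguity is easy to state concretely
(my gloss, not code from the paper): factor a word-by-context matrix M
into classes, M = LR, and any invertible G gives an equally good
factorization (LG)(G^-1 R). For example:

    # The "global gauge" freedom of factorization, in miniature.
    import numpy as np

    rng = np.random.default_rng(0)
    L = rng.random((4, 2))             # words x classes
    R = rng.random((2, 4))             # classes x contexts
    M = L @ R                          # the observed co-occurrence data

    G = np.array([[2.0, 1.0],          # any invertible "gauge" choice
                  [0.0, 1.0]])
    L2, R2 = L @ G, np.linalg.inv(G) @ R
    print(np.allclose(M, L2 @ R2))     # True: same data, different classes

Whether language "spontaneously breaks" that freedom, as your note
asserts, or whether the freedom itself is the interesting part, is
exactly what I am asking.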

You don't address that at all either, except to say:

'I doubt that the use of the word "gauge" there has anything at all to do
with the word "gauge" of "gauge theory".   In physics, "gauge" has a very
specific and precise meaning; I suspect that there's an abuse of that word,
here.'

"Doubt", "suspect"? You don't know? Which use of "gauge" do you suspect is
an abuse? My use is reference to your use.

Do you suspect your use of "gauge" is an abuse?

Why do you dismiss it, your use, as "mostly pointless"? For a meticulous
guy, you are remarkably quiet on this. I'm surprised you would leave
something at doubt and suspicion, or even "mostly", and go no further.

Do you assert that such "ambiguity of factorization" is insignificant in
natural language (even only "mostly", which is still non-zero and might be
significant)? On what evidence? On the evidence that your grammar-learning
teams find nice clean factors and grammar learning is demonstrably solved?
Or on the evidence that everybody ends up 'stumbling at the "clustering"
step'?
Certainly everybody was stumbling at this step in the '90s, when the likes
of David Powers and Hinrich Schuetze were trying to learn categories, and,
if you dig a bit deeper, it goes right back to the destruction of American
Structuralist linguistics by Chomsky in the '50s.

You keep ignoring the evidence Chomsky put his finger on 60 years ago, that
even distributional analyses of the phoneme '...were inconsistent or
incoherent in some cases and led to (or at least allowed) absurd analyses
in others.' Your only comment was:

"Wow. I did not know that. Interesting, I suppose. Its, well, beats me.
Science is littered with misunderstandings by brilliant people. Time
passes. Debates are forgotten.  I don't know what to do with this."

You "don't know what to do with this" so you ignore it and move on?

You see gauge in natural language, but you find reason to assume, to
"doubt", to "suspect" it is "mostly" not important, and move on?

These are points you keep ignoring as you remain in your comfort zone,
citing maths.

How are you applying the maths?

1) Are there more classes (permutations?) than examples? (See the sketch
after this list.)

2) Is observed gauge in language (your observation!) "pointless", "mostly
pointless" or not pointless at all, simply not explored further?

3) How do you explain Chomsky's observation 60 years ago that
distributional analysis was '...inconsistent or incoherent in some cases
and led to (or at least allowed) absurd analyses in others'?
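
On (1), some arithmetic for scale (my numbers, purely illustrative): if
a class is any subset of the vocabulary, candidates vastly outnumber
words before you impose any assumption at all.

    # Candidate classes vs. words, for a modest vocabulary.
    from math import comb, log10

    n = 100_000
    print(f"~10^{n * log10(2):.0f} candidate classes vs {n} words")
    print(comb(n, 2))  # 4999950000: even two-word classes outnumber words

Assuming "fewer classes than words" discards almost all of that space
before you start. That is why I want the assumption justified.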

-Rob

On Mon, Feb 25, 2019 at 6:30 PM Linas Vepstas <linasveps...@gmail.com>
wrote:

>
>
> On Sun, Feb 24, 2019 at 6:34 PM Rob Freeman <chaotic.langu...@gmail.com>
> wrote:
>
>>
>> Where I see the gap might be summarized with one statement by Linas:
>>
>> LV, Feb 9: "If you are picky, you can have several different kinds of
>> common nouns, but however picky you are, you need fewer classes than there
>> are words."
>>
>> "...however picky you are, you need fewer classes than there are
>> words..."
>>
>> "Fewer classes"?
>>
>> How do we know that is true?
>>
>
> Quoting out of context is dangerous, and can lead to misunderstanding. We
> know it's true because Noah Webster pointed it out in 1806 and pretty much
> everyone has agreed with him, ever since.
>
>
>> Experimentally, that language "gauge" is not pointless, was observed in
>> the results for the distributional learning of phonemes I cited, dating
>> back 60 years, to the time Chomsky used it to destroy distributional
>> learning as the central paradigm for linguistics.
>>
>
> I doubt that the use of the word "gauge" there has anything at all to do
> with the word "gauge" of "gauge theory".   In physics, "gauge" has a very
> specific and precise meaning; I suspect that there's an abuse of that word,
> here.
>
>
> > Sergio Pissanetzky comes close to this, with his analysis in terms of
> permutations, also resulting in a network:
>
> > "Structural Emergence in Partially Ordered Sets is the Key to
> Intelligence"
> http://sergio.pissanetzky.com/Publications/AGI2011.pdf
>
> I'll look. Is it actually that good?
>
>
>>
>>   Currently they are stumbling at the "clustering" step'
>>
>> Sure they will!
>>
>
> It's more mundane than that. They haven't bothered with some basic steps
> of data analysis. Getting them to actually filter out the junk from the
> valid data is the proverbial "like pulling teeth", they don't want to do it
> and insist that it'll be fine but then complain about the results.  It's
> ordinary lab science, repeated around the world dozens of times a day:
> experiments don't work because of contaminants and interference.
>
>
>>
>>
>> Looking up what you are doing with your "MST" parser. You start by
>> clustering links, starting with mutual information.
>>
>
> It's just one lens in a microscope assembly. It does a certain task, it
> does it OK-ish; what matters is its role in the entire assembly, and not
> as a stand-alone component. Too many people are too obsessed with it as a
> stand-alone component.
>
>
>>
>>
>> OK, you may be right that BERT models may have an advantage that Linas
>> doesn't see, because deep nets do allow some recombination of parts, within
>> a network layer, at run time.
>>
>
> I also haven't studied BERT. What's more important, BERT or Pissanetsky?
>
> --linas
>
>>
>>
>
> --
> cassette tapes - analog TV - film cameras - you
