On Sun, Sep 19, 2021 at 11:57 AM Adrian Borucki <[email protected]> wrote:
>
> Just to clarify: by “performance” I mean the rate of success on a given task,
> not necessarily speed.
Well, I think it's likely to be successful, but clearly I have not
convinced you of that.

> Anyway: I’m afraid I can’t help with the visual processing part then — I know
> nothing of using wavelets for image analysis

You don't have to use wavelets. You do have to have a basic understanding
of image processing, and of how one applies image-processing primitives to
extract information. There is an easy way to learn this, though: the
earliest programming task is simply to write Atomese wrappers for common
textbook image-processing primitives. In practice, this means downloading
a copy of OpenCV and reading through its documentation. Writing Atomese
wrappers for it would let you learn image processing "hands on" -- there
are a number of OpenCV demos; you can run them, convert them to Atomese,
run the Atomese versions, and verify that you get the same results. There
are textbooks on image processing, filled with examples; converting them
to Atomese and running them would be a good, practical way of learning the
core concepts.

An alternative would be to do this for audio; in some sense, this would be
simpler, but certainly a lot geekier: audio does not have the immediate
visual feedback of image processing. It's more abstract.

> so I can’t really say anything further until how this is supposed to work is
> fully sorted out.

I'm sorry to hear this. You seem to be politely backing away from the
project; I'm not sure what you expected it to be, but clearly what I
painted is not what you'd hoped for. The project is "sorted out", but I
guess I'm not communicating something important about it. Again: the
pipeline is already working in the language domain. I tried to provide
enough of an explanation, and enough pseudocode snippets, to explain how
to port it over for vision and audio. It's pretty concrete; there's no
airy-fairy hand-waving, just a pile of pseudocode that needs to be
converted to real code.
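To make "image-processing primitive" concrete, here is what one such textbook primitive looks like in plain Python. This is a minimal sketch, with no OpenCV and no Atomese -- just the kind of operation (a 3x3 box blur) that the proposed wrappers would expose; the function name is mine, not from any existing API:

```python
# Minimal sketch of an image-processing "primitive" of the kind one
# might wrap in Atomese. Plain Python; no OpenCV; names are illustrative.

def box_blur(image):
    """Apply a 3x3 box blur to a 2D grid of pixel intensities.
    Edge pixels are averaged over whichever neighbors exist."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [image[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(neighbors) / len(neighbors)
    return out

# A single bright pixel spreads into its neighborhood:
img = [[0, 0, 0],
       [0, 9, 0],
       [0, 0, 0]]
blurred = box_blur(img)
print(blurred[1][1])  # 1.0 -- the 9 averaged over all nine pixels
```

The real OpenCV equivalent (`cv2.blur`) does the same thing far faster; the point of a wrapper layer is only to make such primitives composable from Atomese.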
I'm guessing that, somehow, I still failed to explain what this is all
about. Perhaps I should bounce you to the abstract theory papers? There
are two: one that's hand-wavey with no math, another with lots of math.
These are
https://github.com/opencog/atomspace/blob/master/opencog/sheaf/docs/sheaves.pdf
and
https://github.com/opencog/learn/blob/master/learn-lang-diary/skippy.pdf

--linas

> On Friday, 17 September 2021 at 22:19:47 UTC+2 linas wrote:
>>
>> Hi Adrian,
>>
>> On Thu, Sep 16, 2021 at 3:02 PM Adrian Borucki <[email protected]> wrote:
>> >
>> > Yeah, this is clear to me now — the grammar learning part is kind of a
>> > given, the real question is how well this “image predicate” learning can
>> > go…
>>
>> Yes, that is a question. Based on current experience, I'll say "very
>> far", or at least, "much farther than anyone else has gone". But that
>> is rather speculative: it's based on what I've been learning in a 1D
>> setting, and so any doubters or skeptics in the audience are
>> justified in doubting. Basically, I'm proposing this because it looks
>> promising.
>>
>> It does not help that I am just one person proposing a rather novel,
>> radical, counter-cultural idea that flies in the face of conventional
>> wisdom. I'm quite aware of this. My burden of proof is much higher,
>> and I am trying to supply it as best as I can. Keep asking doubtful
>> questions; this is maybe the most useful thing you can do right now.
>> So I like how this is going. I'm only irritated that you can't read my
>> mind :-)
>>
>> > This is a deep question, as no one is even sure why neural nets themselves
>> > work so well.
>>
>> Well, again, this goes in a very different direction. Here, the
>> reason that it would "work so well" is much more obvious: we ourselves
>> are very good at spotting part-whole structure.
>> Why, in just a few
>> minutes, I can write down the obvious grammar for stop lights: glowing
>> red above yellow above green, surrounded by a painted yellow or black
>> harness. This is "obvious", and detecting this in images seems like it
>> should be pretty easy.
>>
>> This is in very sharp contrast to what neural nets do. You are right:
>> when a neural net picks out a stoplight from an image, we have no idea
>> how it is doing that. Perhaps somewhere in there are some weight
>> vectors for red, yellow, green, but where are they? Where are they
>> hiding? How do neural nets handle part-whole relationships? There is
>> a paper (from Hinton?) stating that the part-whole relationship for
>> neural nets is the grand challenge of the upcoming decades. By
>> contrast, the part-whole relationship for grammars is "obvious".
>>
>> > What needs clarification is what the structure of this filter learning
>> > would be — what is the algorithm, and what direct learning objective is it
>> > given?
>>
>> The exact same algo as in the existing grammar-learning code, modulo
>> needed tweaks. That code is debugged and works well. Getting it going
>> on images does pose some serious challenges and open questions, but I
>> think the general ideas survive.
>>
>> To recap that algo: given a set of inputs, one explores the parameter
>> space, and looks for high mutual-information correlations between
>> pairs. Once high-MI pairs are discovered, the dataset is passed over a
>> second time, this time creating maximal spanning trees. The tree
>> edges are then cut to give the grammar components.
>>
>> The above yields extremely high-dimensional sparse vectors: a
>> dimension of a million. By comparison, the highest dimension that
>> neural nets go up to is about a thousand. So this is one of the big
>> differences between the two approaches.
>> The other, of course, is that the basis is
>> labelled symbolically: you can see exactly which basis element
>> attaches to what ("red above yellow", etc.)
>>
>> I'm currently working on the best ways to cluster these vectors into
>> groupings. Early results look pretty good, but also show that these
>> can be made much better. I can say much more on this.
>>
>> > Like in the above example, where are all these filters and numerical
>> > arguments even coming from?
>>
>> Randomly generated. With or without some sampling bias.
>>
>> > The numerical part is especially difficult, given that you seemingly want
>> > to get some symbolic structure out of it.
>>
>> I don't understand this statement.
>>
>> > Going back to neural nets, the obvious problem is that if we make one big
>> > neural “filter” then you don’t know what is going on inside —
>>
>> That's correct.
>>
>> > so the learning will be “shallower”. The question is how much of a problem
>> > this really is.
>>
>> Well, the leading lights of the neural-net world claim that this is one
>> of the grand challenges of the upcoming decades, and I won't argue with
>> them about that.
>>
>> > Is learning down to the low-level filtering operations a viable approach
>> > right now?
>>
>> Yes, absolutely, I think so. Obviously, I haven't convinced you yet.
>> That is in part because I have not fully (clearly?) communicated the
>> general idea, just yet.
>>
>> > An interesting research question is if you could train a neural net that
>> > can be “queried”, possibly in natural language or some simple formal one,
>> > so that the system on top of it can learn to “extract” various statements
>> > about an image out of it — so these predicates would be essentially hooked
>> > to some queries that get sent to the underlying model.
>>
>> Sure, there are hundreds of people working on this, and they are
>> making progress. You can go to seminars; new results are regularly
>> presented on this.
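The two-pass recap above -- count pairs, score them by mutual information, link each input into a maximal spanning tree, then cut the edges into connector vectors -- can be sketched end-to-end in a few dozen lines of plain Python. This is a toy under stated assumptions: the corpus, the use of pointwise MI, and all names are mine, not the actual learn-pipeline code:

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

# Toy corpus standing in for the real input stream; in the image domain,
# the "words" would be filter outputs. Data here is purely illustrative.
sentences = [["red", "above", "yellow"],
             ["yellow", "above", "green"],
             ["red", "above", "green"]]

# Pass 1: count words and unordered co-occurring pairs within a sentence.
word_counts, pair_counts = Counter(), Counter()
for s in sentences:
    word_counts.update(s)
    pair_counts.update(frozenset(p) for p in combinations(s, 2))
total_words = sum(word_counts.values())
total_pairs = sum(pair_counts.values())

def mi(a, b):
    """Pointwise mutual information of the unordered pair (a, b)."""
    p_ab = pair_counts[frozenset((a, b))] / total_pairs
    return math.log2(p_ab / ((word_counts[a] / total_words) *
                             (word_counts[b] / total_words)))

# Pass 2: link each sentence into a maximum spanning tree over MI scores
# (Prim's algorithm on a tiny graph); the tree edges are the proto-grammar.
def mst(sentence):
    in_tree, edges = {sentence[0]}, []
    while len(in_tree) < len(sentence):
        a, b = max(((a, b) for a in in_tree for b in sentence
                    if b not in in_tree), key=lambda e: mi(*e))
        edges.append((a, b))
        in_tree.add(b)
    return edges

# Cutting each MST edge at the word yields sparse context vectors: one
# symbolically labelled basis element per connector ever observed.
vectors = defaultdict(Counter)
for s in sentences:
    for a, b in mst(s):
        vectors[a][("-", b)] += 1   # connector pointing right
        vectors[b][(a, "-")] += 1   # connector pointing left

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda w: math.sqrt(sum(x * x for x in w.values()))
    return dot / (norm(u) * norm(v))

print(mst(["red", "above", "yellow"]))
# "red" and "yellow" share connectors to "above", so they cluster together:
print(cosine(vectors["red"], vectors["yellow"]) > 0)
```

In the real pipeline the vectors have a million basis elements rather than a handful, and the clustering step groups them into grammatical classes; the point of the toy is only that every basis element carries a readable label.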
>> > Technically this probably falls somewhere in the Visual Question
>> > Answering field… the challenge is that these models are trained to
>> > answer questions about more abstract things like objects, not some
>> > low-level features of the image.
>>
>> Yes. The lack of a symbolic structure in neural nets impedes desirable
>> applications, such as symbolic reasoning.
>>
>> > The final big question is what can you really do after you get that
>> > grammar? What sort of inferences? How useful are they?
>>
>> Well, for starters, if the system recognizes a stop light, you can ask
>> it: "how do you know it's a stop light?" and get an answer: "because
>> red above yellow above green." You can ask "and what else?" and get
>> the answer "on a painted black or yellow background" -- "and what
>> else?" "the colors glow in the dark" -- "and what else?" "they are
>> round" -- "and what else?" "only one comes on at a time" -- "and what
>> else?" "the cycle time varies from 30 seconds to three minutes" --
>> "what is a cycle time?" "the parameter on the time filter by which
>> repetition repeats" -- "what do you mean by round?" "the image area of
>> the light is defined via a circular aperture filter".
>>
>> Good luck getting a neural net to answer even one of those questions,
>> never mind all of them.
>>
>> > The key thing here is that if you, say, have a system that classifies
>> > pictures, if it being built on top of this whole grammar and filter
>> > learning pipeline means it doesn’t achieve competitive performance with
>> > neural nets, then it’s difficult to see what the comparative advantage of
>> > it is — beyond the obvious advantage of interpretability, but that won’t
>> > save that solution if its performance is considerably lower.
>>
>> Really? The ability to do symbolic reasoning is valueless if it is
>> slow? If the filter that recognizes that lights are round also
>> appears in other grammatically meaningful situations, you can ask the
>> question "what else is round?"
>> "the sun, the moon, billiard balls,
>> bowling balls, baseballs, basketballs". I think we are very, very far
>> away from having a neural net do that kind of question answering. I
>> think this is well within reach of grammatical systems.
>>
>> The association between symbols and the things they represent is the
>> famous "symbol grounding problem", considered to be a very difficult,
>> unsolved problem in AI. I'm sketching a technique that solves this
>> problem. I think this is unique in the history of AI research. I don't
>> see that anyone else has ever proposed a plausible solution to the
>> symbol grounding problem.
>>
>> > Well, the problem is not really with grammars, which can definitely be
>> > useful, but if that “filter sequence” part works poorly, then it will
>> > bottleneck the performance of the entire system.
>>
>> Learning it, or running it, once learned? Clearly, running it can be
>> super-fast: even 1980's-era DSPs did image processing quite well.
>> Even single-threaded CPUs have no particular problem; these days we
>> have multi-core CPUs and oodles of GPUs.
>>
>> The learning algo is ... something else. There are two steps. Step one:
>> can we get it to work, at any speed? (I think we can.) Step two: can we
>> get it to work fast? (Who knows -- compare to deep learning, which
>> took decades of basic research, spanning hundreds of PhD theses, before
>> it started running fast. You and I and whatever fan-base might
>> materialize are not going to replicate a few thousand man-years of
>> basic research into performance.)
>>
>> > If that low-level layer outputs garbage, then all the upper layers get
>> > garbage, and we know what happens when you have garbage inputs in this
>> > field...
>>
>> Don't feed it garbage!
>>
>> --linas
>>
>> --
>> Patrick: Are they laughing at us?
>> Sponge Bob: No, Patrick, they are laughing next to us.
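For what it's worth, the stop-light question-answering sketched in this thread reduces, mechanically, to lookups against the learned structure. Here is a minimal sketch in plain Python, assuming a toy hand-written "grammar" -- the real one would be learned, and all the names below are illustrative, not any actual Atomese API:

```python
# Sketch of the kind of query a learned grammar supports: each object is
# described by named filters, so questions become simple index lookups.
# Toy, hand-written data; in the real system this structure is learned.
grammar = {
    "stop light": ["red-above-yellow", "glows-in-dark", "round"],
    "moon": ["round", "glows-in-dark"],
    "billiard ball": ["round"],
    "street sign": ["painted-black-harness"],
}

def what_else(feature, known):
    """Answer 'what else has <feature>?', excluding the object we started from."""
    return sorted(obj for obj, feats in grammar.items()
                  if feature in feats and obj != known)

def how_do_you_know(obj):
    """Answer 'how do you know it's a <obj>?' by listing its named filters."""
    return "because " + ", ".join(grammar[obj])

print(what_else("round", known="stop light"))  # ['billiard ball', 'moon']
print(how_do_you_know("stop light"))
```

The interpretability claim in the thread is exactly this: because every basis element carries a symbolic label, the answers are readable strings rather than opaque weight vectors.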
--
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.

--
You received this message because you are subscribed to the Google Groups
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/opencog/CAHrUA35-SZQ9RUh7fgsK2C_yQhWESmJP5_jRizFocnkEUzZjJA%40mail.gmail.com.
