On Fri, Dec 24, 2021 at 9:21 AM <[email protected]> wrote:
> On Thursday, December 23, 2021, at 10:50 PM, James Bowery wrote:
> > Where is Google's big investment in sparse matrix multiplication hardware, for example? Why the reliance on dense models when it is known that's not how the brain works?
>
> I don't know why, but at least they are getting close. When I would have made the investment at least 7 years ago. <https://jimbowery.blogspot.com/2013/04/a-circuit-minimizing-multicore-shared.html>
>
> On Thursday, December 23, 2021, at 10:50 PM, James Bowery wrote:
> > And why the emphasis on "big" models when it is known that minimizing the size of the algorithm that outputs the data is the optimal model selection criterion? Might it have anything to do with trying to HIDE that ground truth of AGI by throwing vast amounts of money around that they shouldn't have in the first place?
>
> ...So I think you overlooked this, James. In the LC contest it "looks" like you want to achieve a small envelope, yes. But if one were to include (if they do!) the ability to use 10GB or 100GBs of text or other data types, then suddenly the goal is not just to make the smallest envelope --- but the biggest envelope that has the best prediction accuracy for its size, or simply the best AI predictor. So bigger means better, but it is not the only way to improve AI. However, for the AI's evaluation, one does not need to use more than 1GB or 100MBs to measure Perplexity or Lossless Compression.

No, and this is another example of why I think the malincentives of capital misallocation are, in fact, preserving themselves by crippling AGI: by increasing the size of the corpus, one doesn't change the goal; one enhances the potential for discovering "bias" and "toxicity" in, for example, the social sciences that are about to send civilization into a massive bloodbath.

How could this be, one might ask? Simple: when you include more data under the SAME GOAL OF LOSSLESS COMPRESSION, you make it necessary to discover latent factors in the data that produce incoherence between the increasing number of data sources. It's called "cross checking".
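To put a number on "cross checking", here is a minimal sketch -- the toy "sources" are invented and zlib stands in crudely for an ideal compressor, so treat it as the accounting, not a proof:

import zlib

# Two invented "sources" reporting overlapping facts about the same world.
source_a = b"station 1 reports: temperature 20.1 C, humidity 40%. " * 40
source_b = b"station 2 reports: temperature 22.1 C, humidity 40%. " * 40

separate = len(zlib.compress(source_a, 9)) + len(zlib.compress(source_b, 9))
joint = len(zlib.compress(source_a + source_b, 9))

print(f"compressed separately: {separate} bytes")
print(f"compressed jointly:    {joint} bytes")  # smaller: shared structure

Compressed jointly, the corpus is smaller precisely because the compressor is forced to notice what the two sources share -- and, by the same token, exactly where they disagree. (Note that station 2 runs 2 degrees hot; more on that below.)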
It is tiresome to repeat this over and over and over to the lemmings, but here it is again as a Christmas present: we all have our subjective opinions about what is "bias" and "toxicity", but how can one go about deciding whose opinion is the right one, other than simply deferring to whichever opinions currently get one excluded from secure employment, or at least regarded as a social pariah? Well, if one is careful in one's thinking, one can see that somewhere, somehow, one got the impression that certain kinds of ideas are dangerously erroneous -- hence unethical to hold and promulgate, if not immoral, if not downright criminal. Note I said "erroneous" -- which refers not to any kind of value but to some notion of "truth". OK, so somehow one has these notions regarding "truth" that are value neutral -- except insofar as one may _decide_ things based upon those notions and take _actions_ as a result. So even before we get to the question of what we all value, and thence how we differ in our values, we have to come up with some notion of the "truth" of things, and that includes where we got our subjective notions of what is "erroneous" in the first place. Well, an intellectually honest public "conversation" would simply ask people to lay their cards on the table about where, exactly, they got their notions of "truth".

In this process, those cards would, ultimately, refer to certain "facts" or "observed phenomena" as the basis upon which one induced one's model of reality. So lossless compression as model selection of "truth" would say: "Fine, let's include ALL your data in the corpus, but you can't then tell everyone else that only _your_ data is 'unbiased' and lacks 'toxicity', since what we are now in the process of discovering is whose ideas are 'erroneous', and that is the precursor to discovering 'bias' and 'toxicity'! So now your data is included with everyone else's 'facts' that are in the 'conversation'. Fair enough?"

At this point, if someone's thermometer is, say, 2 C off from "reality", the losslessly compressed corpus will need to bring its measurements into alignment with those of the other measurement instruments, and this will be done by reifying the latent identity of that "biased" thermometer in an _explicit_ model of its "bias" which says: "This thermometer is off by 2 degrees, so to maximally compress the entire corpus we must offset its measurements by 2 degrees." Now, all of a sudden, latent identities are exposed, along with an explicit statement of their "bias" and thence exactly _how_ they are "erroneous".
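Here is that thermometer as a two-part code. Everything in this sketch is invented for illustration -- the readings, the Gaussian coding model, the flat 32 bits charged for stating a parameter -- but the arithmetic is the point: stating the instrument's bias once is vastly cheaper than paying for it on every reading, so the minimal description is forced to expose it.

import math

PARAM_BITS = 32  # assumed cost, in bits, of stating one explicit parameter

def code_length_bits(residuals, sigma=0.5):
    # Idealized code length of residuals under a Gaussian predictive
    # model: -log2 density, with the discretization constant dropped
    # because it is identical for the two models being compared.
    return sum(r * r / (2 * sigma * sigma * math.log(2))
               + math.log2(sigma * math.sqrt(2 * math.pi))
               for r in residuals)

truth = [20.0, 21.5, 19.8, 22.1, 20.7] * 20           # consensus values
noise = [0.05, -0.12, 0.33, -0.41, 0.18] * 20         # instrument noise
good = [t + n for t, n in zip(truth, noise)]          # calibrated sensor
biased = [t + 2.0 + n for t, n in zip(truth, noise)]  # reads 2 C high

# Model A: no bias term. The 2 C offset is re-encoded, expensively,
# in the residuals of every single reading.
bits_a = (code_length_bits([g - t for g, t in zip(good, truth)]) +
          code_length_bits([b - t for b, t in zip(biased, truth)]))

# Model B: reify the latent identity. Spend PARAM_BITS once to state
# "this sensor reads 2 C high", then encode the small residuals.
bits_b = (code_length_bits([g - t for g, t in zip(good, truth)]) +
          code_length_bits([b - 2.0 - t for b, t in zip(biased, truth)]) +
          PARAM_BITS)

print(f"without bias term:        {bits_a:7.1f} bits")
print(f"with explicit 2 C offset: {bits_b:7.1f} bits")

That explicit offset parameter is exactly the reified "latent identity": the compressor cannot minimize total code length without confessing which instrument is off, and by how much.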
Nuff said. I'm now blocking you.

> This is why I say in my Guide to AGI that big data and a better recognizer make a smarter AI, but the evaluation only has to use 100MBs for now. So if someone asks "why the emphasis on BIG?", the answer is not really evaluation score, but usefulness (helping humans, raking in cash; people get better generated data if it is scaled up).
>
> On Thursday, December 23, 2021, at 10:50 PM, James Bowery wrote:
> > And, finally, there is your misconception that Transformers are, in some sense, recurrent -- a misconception advanced by Google's own paper titled "Attention Is All You Need", leaving you here to do their dirty work for them in your Google-induced brain fog.
>
> Transformers can do everything LSTMs could do, and more. Prove me wrong.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T358f938c1cfb5c51-M5fdcc65327251dd2a40c12fb
Delivery options: https://agi.topicbox.com/groups/agi/subscription