Thanks. Those are impressive numbers.

Tom

On Sat, Oct 7, 2023 at 4:11 PM Marcus Daniels <mar...@snoutfarm.com> wrote:
> The “large” refers to the number of parameters used. A smaller large
> language model – a deep neural net – starts at about 3 billion parameters,
> but larger ones like Claude 2 (the latest large language model from the
> company that wrote the paper Steve mentioned) have more than 130 billion
> parameters. Amazingly, it is possible, using (rooms of) GPUs and other
> accelerators, to optimize in a space of this size. The billions of
> parameters come from the vocabulary size – the number of tokens that need
> to be discriminated; the many layers of transformers needed to capture the
> complexity of human and non-human languages (like DNA); and the context
> window size – how many paragraphs or pages the model is trained on at a
> time. A small language model might be suitable for understanding the
> geometries of chemicals, say.
>
> *From:* Friam <friam-boun...@redfish.com> *On Behalf Of* Tom Johnson
> *Sent:* Saturday, October 7, 2023 2:38 PM
> *To:* The Friday Morning Applied Complexity Coffee Group <friam@redfish.com>
> *Subject:* Re: [FRIAM] Language Model Understanding
>
> Thanks for passing this along, Steve. I wish, however, that the authors of
> this short piece had included a definition of, in their usage, "Large
> Language Models" and "Small Language Models." Perhaps I can find those in
> the larger paper.
>
> Tom
>
> On Sat, Oct 7, 2023 at 12:34 PM Steve Smith <sasm...@swcp.com> wrote:
>
> This popular-press article came through my Google News feed recently, which
> I thought might be useful to the journalists/English majors on the list to
> help understand how LLMs work. When I read it in detail (forwarded from my
> TS (TinyScreenPhone) to my LS (Large Screen Laptop)), I found it a bit more
> detailed and technical than I'd expected, but nevertheless rewarding, and
> possibly offering some traction to journalism/English majors as well as
> those with a larger investment in the CS/math implied.
> Decomposing Language Models into Understandable Components
> <https://www.anthropic.com/index/decomposing-language-models-into-understandable-components>
>
> and the (more) technical paper behind the article:
> https://transformer-circuits.pub/2023/monosemantic-features/index.html
>
> Despite having sent a few dogs into vaguely similar scuffles in my
> careen(r):
>
> Faceted Ontologies for Pre-Incident Indicator Analysis
> <https://apps.dtic.mil/sti/tr/pdf/ADA588086.pdf>
> SpindleViz <https://www.ehu.eus/ccwintco/uploads/c/c6/HAIS2010_925.pdf>
> ...
>
> ... I admit to finding this both intriguing and well over my head on
> casual inspection... the (metaphorical?) keywords that drew me in most
> strongly included *Superposition* and *Thought Vectors*, though they are
> (nod to Glen) probably riddled (heaped, overflowing, bursting, bloated ...)
> with excess meaning.
>
> https://gabgoh.github.io/ThoughtVectors/
>
> This leads me (surprise!) to an open-ended discursive series of thoughts
> probably better left for a separate posting (probably rendered in a
> semasiographic language like Heptapod B
> <https://en.wikipedia.org/wiki/Heptapod_languages#Orthography>).
>
> <must... stop... now... >
>
> - Steve
>
> -. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
> FRIAM Applied Complexity Group listserv
> Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom
> https://bit.ly/virtualfriam
> to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
> FRIAM-COMIC http://friam-comic.blogspot.com/
> archives: 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
> 1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
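Marcus's accounting of where the billions of parameters come from (vocabulary size, transformer layers, context window) can be turned into a rough back-of-envelope estimate. The sketch below assumes a generic decoder-only transformer with learned position embeddings and a 4x-wide MLP, ignoring biases and layer norms; the dimensions are illustrative, not those of Claude 2 or any particular model.

```python
# Back-of-envelope parameter count for a generic decoder-only transformer.
# All architectural choices and sizes here are illustrative assumptions.

def transformer_params(vocab_size: int, d_model: int, n_layers: int,
                       context_len: int) -> int:
    """Approximate parameter count, ignoring biases and layer norms."""
    embeddings = vocab_size * d_model      # token embedding table
    positions = context_len * d_model      # learned position embeddings
    attention = 4 * d_model * d_model      # Q, K, V, and output projections
    mlp = 2 * d_model * (4 * d_model)      # two matrices, 4x hidden width
    per_layer = attention + mlp            # ~12 * d_model^2 per block
    return embeddings + positions + n_layers * per_layer

# A hypothetical model at the "small" end of the large-model range:
small = transformer_params(vocab_size=50_000, d_model=4096, n_layers=14,
                           context_len=8192)
print(f"{small / 1e9:.1f}B parameters")  # prints "3.1B parameters"
```

Since the per-layer term scales as d_model squared, most of the growth from 3B to 130B+ models comes from widening and deepening the transformer stack rather than from the vocabulary.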