The “large” refers to the number of parameters used. A smaller large language
model – a deep neural net – starts at about 3 billion parameters, while larger
ones like Claude 2 (the latest large language model from the company that
wrote the paper Steve mentioned) have more than 130 billion parameters.
Amazingly, it is possible, using (rooms of) GPUs and other accelerators, to
optimize in a space of this size. The billions of parameters come from the
vocabulary size – the number of tokens that need to be discriminated – the
many layers of transformers needed to capture the complexity of human and
non-human languages (like DNA), and the context window size – how many
paragraphs or pages the model is trained on at a time. A small language model
might be suitable for understanding the geometries of chemicals, say.
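
For the quantitatively inclined, here is a back-of-the-envelope sketch (in
Python) of where counts like those come from, assuming a standard
decoder-only transformer. The vocabulary sizes, widths, and layer counts
below are illustrative guesses, not any real model's published configuration,
and small terms (biases, layer norms, positional encodings) are ignored.

    # Rough parameter count for a decoder-only transformer.
    # All dimensions are illustrative, not any real model's config.
    def transformer_params(vocab_size: int, d_model: int, n_layers: int) -> int:
        embedding = vocab_size * d_model    # token embedding matrix
        attention = 4 * d_model * d_model   # Q, K, V, and output projections
        mlp = 2 * d_model * (4 * d_model)   # two feed-forward matrices (4x expansion)
        return embedding + n_layers * (attention + mlp)

    # A "small" large language model (~3 billion parameters):
    print(f"{transformer_params(50_000, 3_200, 24):,}")    # 3,109,120,000

    # A much larger one (~130 billion parameters):
    print(f"{transformer_params(100_000, 12_000, 75):,}")  # 130,800,000,000

Tweaking the three arguments shows how quickly the per-layer terms come to
dominate the vocabulary term as models scale up.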

From: Friam <friam-boun...@redfish.com> On Behalf Of Tom Johnson
Sent: Saturday, October 7, 2023 2:38 PM
To: The Friday Morning Applied Complexity Coffee Group <friam@redfish.com>
Subject: Re: [FRIAM] Language Model Understanding

Thanks for passing this along, Steve. I wish, however, that the authors of
this short piece had included definitions of "Large Language Models" and
"Small Language Models" as they use those terms. Perhaps I can find those in
the larger paper.
Tom

On Sat, Oct 7, 2023 at 12:34 PM Steve Smith <sasm...@swcp.com> wrote:

This popular-press article came through my Google News feed recently; I
thought it might be useful to the Journalists/English-Majors on the list to
help understand how LLMs work, etc. When I read it in detail (forwarded from
my TS (TinyScreenPhone) to my LS (Large Screen Laptop)) I found it a bit more
detailed and technical than I'd expected, but nevertheless rewarding, and
possibly offering some traction to Journalism/English majors as well as those
with a larger investment in the CS/Math implied.

Decomposing Language Models into Understandable Components
<https://www.anthropic.com/index/decomposing-language-models-into-understandable-components>

and the (more) technical paper behind the article

https://transformer-circuits.pub/2023/monosemantic-features/index.html

Despite having sent a few dogs into vaguely similar scuffles in my careen(r):

Faceted Ontologies for Pre Incident Indicator Analysis
<https://apps.dtic.mil/sti/tr/pdf/ADA588086.pdf>
SpindleViz <https://www.ehu.eus/ccwintco/uploads/c/c6/HAIS2010_925.pdf>
...

... I admit to finding this both intriguing and well over my head on casual
inspection... the (metaphorical?) keywords that drew me in most strongly
included Superposition and Thought Vectors, though they are (nod to Glen)
probably riddled (heaped, overflowing, bursting, bloated ...) with excess
meaning.

https://gabgoh.github.io/ThoughtVectors/

This leads me (surprise!) to an open-ended discursive series of thoughts
probably better left for a separate posting (probably rendered in a
semasiographic language like Heptapod B
<https://en.wikipedia.org/wiki/Heptapod_languages#Orthography>).

<must... stop... now... >

- Steve
-. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe   /   Thursdays 9a-12p Zoom 
https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives:  5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
  1/2003 thru 6/2021  http://friam.383.s1.nabble.com/
