Thanks. Those are impressive numbers.

Tom

On Sat, Oct 7, 2023 at 4:11 PM Marcus Daniels <mar...@snoutfarm.com> wrote:
> The “large” refers to the number of parameters used. A smaller large
> language model – a deep neural net – starts at about 3 billion parameters,
> but larger ones like Claude 2 (the latest large language model from the
> company that wrote the paper Steve mentioned) have more than 130 billion
> parameters. Amazingly, it is possible, using (rooms of) GPUs and other
> accelerators, to optimize in a space of this size. The billions of
> parameters come from the vocabulary size – the number of tokens that need
> to be discriminated; the many layers of transformers needed to capture the
> complexity of human and non-human languages (like DNA); and the context
> window size – how many paragraphs or pages the model is trained on at a
> time. A small language model might be suitable for understanding the
> geometries of chemicals, say.
>
> *From:* Friam <friam-boun...@redfish.com> *On Behalf Of* Tom Johnson
> *Sent:* Saturday, October 7, 2023 2:38 PM
> *To:* The Friday Morning Applied Complexity Coffee Group <friam@redfish.com>
> *Subject:* Re: [FRIAM] Language Model Understanding
>
> Thanks for passing this along, Steve. I wish, however, that the authors of
> this short piece had included a definition of, in their usage, "Large
> Language Models" and "Small Language Models." Perhaps I can find those in
> the larger paper.
>
> Tom
>
> On Sat, Oct 7, 2023 at 12:34 PM Steve Smith <sasm...@swcp.com> wrote:
>
> This popular-press article came through my Google News feed recently, which
> I thought might be useful to the journalists/English majors on the list to
> help understand how LLMs work. When I read it in detail (forwarded from my
> TS (TinyScreenPhone) to my LS (Large Screen Laptop)), I found it a bit more
> detailed and technical than I'd expected, but nevertheless rewarding, and
> possibly offering some traction to journalism/English majors as well as
> those with a larger investment in the CS/math implied.
> Decomposing Language Models into Understandable Components
> <https://www.anthropic.com/index/decomposing-language-models-into-understandable-components>
>
> and the (more) technical paper behind the article:
> https://transformer-circuits.pub/2023/monosemantic-features/index.html
>
> Despite having sent a few dogs into vaguely similar scuffles in my
> careen(r):
>
> Faceted Ontologies for Pre-Incident Indicator Analysis
> <https://apps.dtic.mil/sti/tr/pdf/ADA588086.pdf>
> SpindleViz <https://www.ehu.eus/ccwintco/uploads/c/c6/HAIS2010_925.pdf>
> ...
>
> ... I admit to finding this both intriguing and well over my head on
> casual inspection... the (metaphorical?) keywords that drew me in most
> strongly included *Superposition* and *Thought Vectors*, though they are
> (nod to Glen) probably riddled (heaped, overflowing, bursting, bloated ...)
> with excess meaning.
>
> https://gabgoh.github.io/ThoughtVectors/
>
> This leads me (surprise!) to an open-ended discursive series of thoughts
> probably better left for a separate posting (probably rendered in a
> semasiographic language like Heptapod B
> <https://en.wikipedia.org/wiki/Heptapod_languages#Orthography>).
>
> <must... stop... now... >
>
> - Steve
>
> -. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
> FRIAM Applied Complexity Group listserv
> Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom
> https://bit.ly/virtualfriam
> to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
> FRIAM-COMIC http://friam-comic.blogspot.com/
> archives: 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
> 1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
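Marcus's accounting of where the billions of parameters come from (vocabulary size, transformer layers, context window) can be turned into a rough back-of-envelope estimate. The sketch below assumes a generic decoder-only transformer with learned position embeddings and a 4x-wide MLP, ignoring biases and layer norms; the dimensions are illustrative, not those of Claude 2 or any particular model.

```python
# Back-of-envelope parameter count for a generic decoder-only transformer.
# All architectural choices and sizes here are illustrative assumptions.

def transformer_params(vocab_size: int, d_model: int, n_layers: int,
                       context_len: int) -> int:
    """Approximate parameter count, ignoring biases and layer norms."""
    embeddings = vocab_size * d_model      # token embedding table
    positions = context_len * d_model      # learned position embeddings
    attention = 4 * d_model * d_model      # Q, K, V, and output projections
    mlp = 2 * d_model * (4 * d_model)      # two matrices, 4x hidden width
    per_layer = attention + mlp            # ~12 * d_model^2 per block
    return embeddings + positions + n_layers * per_layer

# A hypothetical model at the "small" end of the large-model range:
small = transformer_params(vocab_size=50_000, d_model=4096, n_layers=14,
                           context_len=8192)
print(f"{small / 1e9:.1f}B parameters")  # prints "3.1B parameters"
```

Since the per-layer term scales as d_model squared, most of the growth from 3B to 130B+ models comes from widening and deepening the transformer stack rather than from the vocabulary.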