Tom -

This is probably a little less relevant to the business of journalism than you deserve, but here is a pretty good (IMO) fleshing-out of some of the technical details of (L)LMs, which might fill in some blanks:

   https://developers.google.com/machine-learning/resources/intro-llms

There must be (by now) some more directly relevant resources for journalists in particular, not only to de-jargonize a little but also to make the material more relevant to your domain.

Here it is, straight from the /Horse's Mouth/ (GPT 3.5) itself (I leave it to you to tell us how well these answers work for you). It is probably worth engaging with ChatGPT yourself and asking more and more focused questions... a good self-study in "prompt engineering"!

   http://chat.openai.com

***

   Introduction to Modern Language Models for Veteran Journalists

   In today's digital age, the field of journalism is undergoing a
   significant transformation, largely driven by advancements in
   technology. One key innovation that has had a profound impact on the
   industry is the development of modern language models, often
   abbreviated as LLMs. While you may not be a tech expert,
   understanding the basics of LLMs can be invaluable for journalists.
   So, let's break it down in simple terms.

   **What Are Language Models?**

   At its core, a language model is like a super-smart digital
   writer. It's a computer program trained to understand and generate
   human language. Think of it as a tool that helps computers
   understand and use words, sentences, and paragraphs just like humans
   do.

   **The Power of Large Language Models (LLMs)**

   Now, let's talk about Large Language Models (LLMs). These are the
   big players in the world of language models. They're like language
   supercomputers because they're trained on massive amounts of text
   from the internet. This means they've learned a lot about how people
   write and talk.

   **How LLMs Help Journalists**

   1. **Content Generation**: LLMs can generate written content, from
   news articles to summaries and even creative pieces. Journalists can
   use them to draft stories faster and get ideas flowing.

   2. **Fact-Checking**: They're handy for fact-checking, helping you
   quickly verify information by scanning a vast database of knowledge.

   3. **Summarization**: LLMs can take long, complex documents and
   create concise summaries, saving you time on research.

   4. **Translation**: They can also translate text between
   languages, which can be incredibly useful for international reporting.

   5. **Personalization**: Some news websites use LLMs to tailor
   content to individual readers, providing a more personalized news
   experience.

   6. **Social Media Analysis**: LLMs can sift through social media
   data, helping journalists understand public sentiment and trends.

   **Potential Concerns**

   While LLMs are incredibly powerful, there are some concerns to
   keep in mind. They might unintentionally spread misinformation if
   not used carefully, and they can't replace human judgment or
   critical thinking.

   **Conclusion**

   In a nutshell, modern language models (LLMs) are like digital
   writing assistants that can help journalists write better, research
   faster, and adapt to the changing landscape of digital journalism.
   While they're powerful tools, it's important to use them
   responsibly, keeping in mind their strengths and limitations, as
   well as the ethical considerations of automated content creation in
   journalism.

Can you elaborate on some of the technical jargon?

   Certainly! Here is some common technical jargon related to Large
   Language Models (LLMs), explained in journalist-friendly terms:

   1. **Pretraining**: This is the initial phase where the LLM learns
   from a massive amount of text data on the internet. Think of it as
   the model absorbing a vast library of information to understand
   language better.

   2. **Fine-tuning**: After pretraining, LLMs undergo fine-tuning to
   specialize them for specific tasks. It's like giving the model extra
   training in a particular subject, like finance or healthcare, to
   make it more useful for specialized reporting.

   3. **Token**: In LLMs, text is broken down into smaller units
   called tokens. A token can be as short as one character or as long
   as one word. Knowing the number of tokens is important because it
   affects how much text the model can handle at once.

   4. **Inference**: This term refers to the process of using a
   trained LLM to generate text or make predictions. Journalists use
   inference to create content, answer questions, or analyze data.

   5. **Prompt**: A prompt is the input you provide to an LLM to get
   a response. It's like giving the model a question or a starting
   point for it to generate text. Journalists often craft prompts to
   get specific information or insights.

   6. **Bias**: LLMs can sometimes produce biased or unfair content
   because they've learned from biased internet data. Journalists
   should be aware of this and critically evaluate the model's outputs
   for any biases.

   7. **Algorithm**: This is a fancy word for a set of rules or
   instructions that an LLM follows to perform tasks. Algorithms guide
   how the model generates text and makes decisions.

   8. **NLP (Natural Language Processing)**: NLP is the field of
   study that focuses on teaching computers to understand and generate
   human language. LLMs are a product of NLP research.

   9. **ML (Machine Learning)**: LLMs are a type of machine learning
   model. Machine learning is a way to make computers learn from data
   and improve their performance over time.

   10. **Training Data**: This is the text data used to teach an LLM.
   It's like the model's textbooks, and the quality and diversity of
   this data influence how well the LLM performs.

   11. **Bias Mitigation**: This is the process of reducing bias in
   LLMs. It's important in journalism to ensure that the information
   provided by the model is fair and unbiased.

   12. **Ethical AI**: This refers to the practice of using LLMs
   responsibly and ethically. Journalists should consider the ethical
   implications of using these tools, especially when it comes to
   issues like privacy and misinformation.

   13. **API (Application Programming Interface)**: Some LLMs offer
   APIs that allow journalists to interact with the model through code,
   enabling custom applications and integrations.

   14. **Fine-Tuned Models**: These are LLMs that have been trained
   for specific tasks, like summarization, translation, or
   question-answering. They are more specialized and can be
   particularly useful for journalism.

   Understanding these terms can help journalists effectively
   leverage LLMs in their work while also being mindful of their
   limitations and ethical considerations.
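To make the "token" idea above concrete, here is a toy Python sketch. The `toy_tokenize` function and its word-plus-punctuation splitting rule are my own illustrative stand-ins; real LLMs use subword tokenizers (e.g. byte-pair encoding), so their token boundaries and counts will differ from this.

```python
import re

def toy_tokenize(text):
    # Split into runs of word characters or single punctuation marks --
    # a crude stand-in for a real subword tokenizer.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("LLMs can't replace human judgment.")
print(tokens)
print(len(tokens))  # 8 tokens -- note the contraction splits into three
```

Even this toy version shows why token counts matter: "can't" becomes three tokens, so a model's context limit (measured in tokens) fills up faster than a word count would suggest.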


- Steve

On 10/7/23 5:13 PM, Tom Johnson wrote:

Thanks. Those are impressive numbers.
Tom

On Sat, Oct 7, 2023 at 4:11 PM Marcus Daniels <mar...@snoutfarm.com> wrote:

    The “large” refers to the number of parameters used.  A smaller
    large language model – a deep neural net – starts at about 3 billion
    parameters, but larger ones like Claude 2 (the latest large
    language model from the company that wrote the paper Steve
    mentioned) have more than 130 billion parameters.  Amazingly, it
    is possible, using (rooms of) GPUs and other accelerators, to
    optimize in a space of this size.  The billions of
    parameters come from the vocabulary size – the number of tokens
    that need to be discriminated – the many layers of transformers
    that are needed to capture the complexity of human and non-human
    languages (like DNA), and the context window size – how many
    paragraphs or pages the model is trained on at a time.  A small
    language model might be suitable for understanding the geometries
    of chemicals, say.
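The arithmetic behind those parameter counts can be sketched in a few lines. The specific numbers below (vocabulary size, layer width, layer count) are illustrative assumptions, not the specs of any real model, and the 12·d_model² per-layer figure is only a common rule of thumb for transformer layers:

```python
# Back-of-the-envelope parameter count for a transformer language model.
# All numbers here are illustrative assumptions, not any real model's specs.

vocab_size = 50_000   # tokens the model must discriminate
d_model    = 4_096    # width of each layer's hidden vectors
n_layers   = 32       # number of transformer layers

# Token-embedding table: one d_model-wide vector per vocabulary entry.
embedding_params = vocab_size * d_model

# Rule of thumb: each transformer layer carries roughly 12 * d_model**2
# weights (attention projections plus the feed-forward block).
layer_params = 12 * d_model ** 2 * n_layers

total = embedding_params + layer_params
print(f"~{total / 1e9:.1f} billion parameters")  # prints: ~6.6 billion parameters
```

Notice that the layer term dominates: doubling the width quadruples the per-layer count, which is why model sizes climb into the tens and hundreds of billions so quickly.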

    *From:* Friam <friam-boun...@redfish.com> *On Behalf Of *Tom Johnson
    *Sent:* Saturday, October 7, 2023 2:38 PM
    *To:* The Friday Morning Applied Complexity Coffee Group
    <friam@redfish.com>
    *Subject:* Re: [FRIAM] Language Model Understanding

    Thanks for passing this along, Steve. I wish, however, the authors
    of this short piece would have included a definition of, in their
    usage, "Large Language Models" and "Small Language Models." 
    Perhaps I can find those in the larger paper.

    Tom

    On Sat, Oct 7, 2023 at 12:34 PM Steve Smith <sasm...@swcp.com> wrote:

        This popular-press article came through my Google News feed
        recently, and I thought it might be useful to the
        Journalists/English-Majors on the list to help understand how
        LLMs work.  When I read it in detail (forwarded from my
        TS (TinyScreenPhone) to my LS (Large Screen Laptop)) I found
        it a bit more detailed and technical than I'd expected, but
        nevertheless rewarding, and possibly offering some traction to
        Journalism/English majors as well as those with a larger
        investment in the CS/Math implied.

            Decomposing Language Models into Understandable Components
            <https://www.anthropic.com/index/decomposing-language-models-into-understandable-components>

        and the (more) technical paper behind the article

            
https://transformer-circuits.pub/2023/monosemantic-features/index.html

        Despite having sent a few dogs into vaguely similar scuffles
        in my careen(r):

            Faceted Ontologies for Pre Incident Indicator Analysis
            <https://apps.dtic.mil/sti/tr/pdf/ADA588086.pdf>
            SpindleViz
            <https://www.ehu.eus/ccwintco/uploads/c/c6/HAIS2010_925.pdf>
            ...

        ... I admit to finding this both intriguing and well over my
        head on casual inspection... the (metaphorical?) keywords
        that drew me in most strongly included /Superposition/ and
        /Thought Vectors/, though they are (nod to Glen) probably
        riddled (heaped, overflowing, bursting, bloated ...) with
        excess meaning.

        https://gabgoh.github.io/ThoughtVectors/

        This leads me (surprise!) to an open ended discursive series
        of thoughts probably better left for a separate posting
        (probably rendered in a semasiographic language like Heptapod
        B
        <https://en.wikipedia.org/wiki/Heptapod_languages#Orthography>).

        <must... stop... now... >

        - Steve

        -. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
        FRIAM Applied Complexity Group listserv
        Fridays 9a-12p Friday St. Johns Cafe   /  Thursdays 9a-12p
        Zoom https://bit.ly/virtualfriam
        to (un)subscribe
        http://redfish.com/mailman/listinfo/friam_redfish.com
        FRIAM-COMIC http://friam-comic.blogspot.com/
        archives:  5/2017 thru present
        https://redfish.com/pipermail/friam_redfish.com/
          1/2003 thru 6/2021 http://friam.383.s1.nabble.com/



