short answer: no
A more detailed answer... You can learn about the technology behind ChatGPT from Professor Stephen Wolfram in this long but informative blog post: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/ You will learn that what ChatGPT does is based on a simple principle: a colossal language model was trained to make word predictions. Given a sequence of words, what is the most probable word that should come next? So given an input text, a question, or a piece of text followed by a question, the system completes it, and usually this completion is the answer to your question. I will not go into details, but a critical underlying principle of the method is the distributional semantics approach: the semantics of words are learned from their use and represented as vectors of N dimensions.

Traditionally, the processing of natural language utterances was done in a modular way, mirroring the way linguists study natural languages: morphology, syntax, semantics, pragmatics, etc. Computer scientists also embraced the idea, since modularity is at the heart of programming: we break complex tasks and structures into simple ones that can be independently developed and then combined into the final solution. This is also supported by computational linguistics, which uses computational systems and methods to study languages and test hypotheses about them. We learn to start from a string, break it into tokens, group sequences of tokens into sentences, and keep adding metadata to these data structures or combining them into more elaborate ones. We usually consider tasks such as tokenization, part-of-speech tagging, lemmatization, syntactic/semantic parsing, named entity recognition, word sense disambiguation, etc. We have many libraries that experiment with different tasks and with different orders in which to combine them. Libraries such as OpenNLP or Freeling (https://nlp.lsi.upc.edu/freeling/) adopted this pipeline approach. More sophisticated systems recognize that humans don’t necessarily decide whether a given word is a noun or a verb before comprehending its contribution to the sentence, so instead of a pipeline of independent steps, they use a more integrated approach. Nevertheless, the idea is the same: from a string, construct data structures (or symbolic representations) to be further enriched or used directly in final applications. Applications include question answering, fact extraction from texts, sentiment analysis, translation, etc.

In recent years, more and more of the tasks described above have tended to ignore explicit linguistic knowledge encoded as rules (e.g., morphosyntactic rules) or enumerated in hand-crafted resources such as lexical-semantic dictionaries (see https://wordnet.princeton.edu/ or https://nlp.cs.nyu.edu/nomlex/). Instead, we started to see texts being annotated so that systems can learn how to reproduce the same analysis when they see similar text. See https://universaldependencies.org/, a vast collection of sentences in many languages annotated with syntactic analyses, used to train parsers. This approach of learning from annotated data (examples) became popular and started to give people the wrong impression that deep linguistic knowledge is irrelevant. Once a lot of annotated data became freely available, people forgot the cost of constructing these datasets and the value of the annotators and maintainers, who usually need proper linguistic training.
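To make a couple of the ideas above concrete, here are some toy sketches in Python. None of this is how ChatGPT (or OpenNLP) is actually implemented; the code and data are made up for illustration only. First, the next-word-prediction principle, reduced to a tiny bigram model that just counts which word most often follows which:

from collections import Counter, defaultdict

# Toy corpus; a real model is trained on a colossal amount of text
# and conditions on much longer contexts than a single preceding word.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each preceding word (bigrams).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def most_probable_next(word):
    # Return the most frequent continuation of `word` and its estimated probability.
    counts = following[word]
    best, freq = counts.most_common(1)[0]
    return best, freq / sum(counts.values())

print(most_probable_next("the"))  # ('cat', 0.25) -- four equally likely continuations
print(most_probable_next("sat"))  # ('on', 1.0)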
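Second, the distributional semantics idea: word meanings represented as vectors, with similarity of meaning approximated by similarity of vectors. The 3-dimensional numbers below are invented; real embeddings are learned from word usage and have hundreds of dimensions:

import math

# Invented toy vectors; real models learn these from co-occurrence in text.
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(vectors["cat"], vectors["dog"]))  # close to 1.0: similar usage, similar meaning
print(cosine(vectors["cat"], vectors["car"]))  # around 0.3: much less similar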
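Third, the modular pipeline approach: each step enriches the output of the previous one. The tokenizer and tagger here are deliberately naive stand-ins for what libraries like OpenNLP or Freeling do with proper models:

import re

def tokenize(text):
    # Naive tokenizer: words and punctuation marks become separate tokens.
    return re.findall(r"\w+|[^\w\s]", text)

def pos_tag(tokens):
    # Toy tagger with a tiny hand-made lexicon; real taggers use trained models
    # (and later stages would add lemmas, parse trees, named entities, etc.).
    lexicon = {"the": "DET", "cat": "NOUN", "sat": "VERB", "on": "ADP", "mat": "NOUN"}
    return [(tok, lexicon.get(tok.lower(), "X")) for tok in tokens]

def pipeline(text):
    return pos_tag(tokenize(text))

print(pipeline("The cat sat on the mat."))
# [('The', 'DET'), ('cat', 'NOUN'), ('sat', 'VERB'), ('on', 'ADP'),
#  ('the', 'DET'), ('mat', 'NOUN'), ('.', 'X')]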
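Finally, to make "annotated data" concrete: Universal Dependencies treebanks are distributed in the CoNLL-U format, with ten tab-separated columns per token. The tiny sentence below is my own example, not taken from a real treebank, and the reader is a minimal sketch, not a full CoNLL-U parser:

# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
sample = (
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tcat\tcat\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\t_\n"
)

def read_conllu(text):
    # Keep only a few of the ten columns for this sketch.
    tokens = []
    for line in text.strip().splitlines():
        cols = line.split("\t")
        tokens.append({"form": cols[1], "lemma": cols[2], "upos": cols[3],
                       "head": int(cols[6]), "deprel": cols[7]})
    return tokens

for tok in read_conllu(sample):
    print(tok["form"], tok["upos"], "-> head", tok["head"], tok["deprel"])
# The DET -> head 2 det
# cat NOUN -> head 3 nsubj
# sleeps VERB -> head 0 root

Treebanks like these are what parsers are trained on; building and maintaining them is exactly the costly, linguistically informed work mentioned above.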
In this view, the linguistic knowledge needed to model language in executable tools, such as computational grammars, becomes obsolete. Grammar engineers, for example, working with solid formalisms like HPSG and LFG, became dinosaurs, like COBOL programmers (or https://en.wikipedia.org/wiki/Jedi, if you prefer). But that was not the end. Later, developers of NLP applications, encouraged by the success of machine learning (in particular deep learning and other unsupervised methods) on many tasks, started experimenting with end-to-end learning without considering the intermediate tasks. Why not try to answer a natural language question directly from the input, without the cost of constructing and manipulating intermediate representations? Well, not quite directly, but adopting the minimum possible representation that can be universally manipulated. Yes, vectors. Once the input text is transformed into vectors or matrices of numbers, we need only an efficient linear algebra library to manipulate them. Simpler systems can be deployed faster, and tech companies love that.

So this is the trend in the area now: given the massive amount of text we have on the internet, we learn how to transform words and sentences into vectors; we turn our problems into optimization tasks and manipulate the vectors to obtain the parameters that maximize the performance of the system on a reference dataset. The parameters define a function we can then apply to other texts to solve the same problem.

The new methods and the LLMs are incredibly effective for some tasks, and ChatGPT impresses many people. But effectiveness in some practical use cases has nothing to do with other goals related to the study of language. How do languages work? What are their fundamental parts? How do humans understand and produce language? How do we develop a system as competent as humans in the use of language? See https://youtu.be/wPonuHqbNds. To make an analogy: how does studying the human body, its parts, and how they work together contribute to medicine? In the past, medicine was a collection of practices said to work in many cases. Practices were generalized from examples only, and false correlations were unfortunately taken as causation (see https://en.wikipedia.org/wiki/Bloodletting).

I hope that helps! Sorry for the long answer; even so, I made a lot of simplifications to keep the message a reasonable size! ;-)

Best,

--
Alexandre Rademaker
http://arademaker.github.io


> On 12 Aug 2023, at 06:12, Turritopsis Dohrnii Teo En Ming <tdtemc...@gmail.com> wrote:
>
> Good day from Singapore,
>
> Is Apache OpenNLP one of the building blocks of ChatGPT?
>
> Thank you.
>
> Regards,
>
> Mr. Turritopsis Dohrnii Teo En Ming
> Targeted Individual in Singapore
> Blogs:
> https://tdtemcerts.blogspot.com
> https://tdtemcerts.wordpress.com
> GIMP also stands for Government-Induced Medical Problems.