Re: [Vo]:LENR-CANR.org ChatBot improved

2023-05-17 Thread Jed Rothwell
Robin  wrote:


>  These bots have an API that lets them access wiki, or so they claim.
>  Try asking for something else that is not on LENR-CANR.


I excluded all data outside of LENR-CANR.org. That is an option in the
FastChat Bot configuration. This Bot used the word "J'Accuse" as a synonym
for some word or phrase it found in a document at LENR-CANR.org. I do not
know which document; that can be hard to discover. The Bots often use
synonyms or paraphrasing. I think this is because the data is tokenized
and the bot may not store the original words: it stores numeric tokens
instead, so the original words may not be available in the Bot database.

I asked the public version of ChatGPT about this. She is a leading expert
in AI technology. Here are her responses:


ME: What is tokenized data in an ai system?
BOT: In AI systems, tokenization refers to the process of breaking down a
text or sequence of words into smaller units called tokens. These tokens
can be individual words, phrases, or even characters.

Tokenization is an essential step in natural language processing (NLP) and
other machine learning tasks that involve processing text data. By breaking
down text into individual tokens, AI systems can analyze and interpret the
meaning of the text more efficiently.

Tokenization typically involves removing any punctuation, converting all
characters to lowercase, and splitting the text into tokens based on a
delimiter such as a space or punctuation mark. The resulting tokens can
then be encoded and used as inputs to various machine learning models.

Overall, tokenization helps to simplify and standardize text data so that
it can be effectively analyzed and processed by AI systems.
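The preprocessing the bot describes (lowercasing, stripping punctuation, splitting on a delimiter) can be sketched in a few lines of Python. This is only a toy illustration of that description, not how GPT-style models actually tokenize; they use subword schemes such as byte-pair encoding rather than whole-word splitting:

```python
import re

def tokenize(text):
    """Lowercase, strip punctuation, and split on whitespace,
    as described in the bot's answer. A toy illustration only."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)  # remove punctuation
    return text.split()

print(tokenize("Cold fusion, announced in 1989, remains controversial."))
# ['cold', 'fusion', 'announced', 'in', '1989', 'remains', 'controversial']
```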


ME: A token is a numerical value, right? In some cases, can two different
words for one concept -- synonyms -- have the same token? Or a token with a
close numeric value?

BOT: Tokens are not necessarily numerical values. In natural language
processing, tokens are typically represented as strings of characters or
words, which can then be encoded as numerical values for machine learning
algorithms to process.

When using tokenization, two different words that represent the same
concept (synonyms) can be assigned the same token to enable the machine
learning algorithm to recognize them as equivalent. In this case, the goal
of tokenization is to reduce the vocabulary size and improve the efficiency
of the model without losing critical information.

However, it's important to note that tokens with close numeric values
should not be used interchangeably. Each token should be unique and
represent a distinct concept or word in the input text. If two tokens have
similar numeric values, it could lead to confusion and inaccuracies in the
machine learning model's results. Therefore, tokenization algorithms
typically ensure that each token has a unique and distinguishable value.
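The scheme the bot describes, in which two synonyms are deliberately assigned one integer ID, can be illustrated with a toy vocabulary. The words and IDs below are invented for illustration, and real GPT-style tokenizers do not actually merge synonyms (each distinct string gets its own token IDs), but the sketch shows how decoding a shared ID would lose the original word:

```python
# Toy vocabulary in which two synonyms deliberately share ID 0.
# Invented for illustration; real tokenizers keep distinct strings distinct.
vocab = {"accuse": 0, "charge": 0, "letter": 1, "mit": 2}
inverse = {0: "accuse", 1: "letter", 2: "mit"}  # one string per ID

def encode(tokens):
    """Map each token string to its integer ID."""
    return [vocab[t] for t in tokens]

def decode(ids):
    """Map IDs back to strings. With a shared ID, the original
    word is unrecoverable: 'charge' comes back as 'accuse'."""
    return [inverse[i] for i in ids]

ids = encode(["charge", "letter"])
print(ids)          # [0, 1]
print(decode(ids))  # ['accuse', 'letter'] -- the word 'charge' is gone
```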


ME: I have noted that when an AI describes the content of a document, it
sometimes uses synonyms instead of the original words. In a few cases, I
think it did this even when I asked for a verbatim quote. Is this because
the content is stored as tokens, and the exact original words are not
available?

BOT: Yes, it is possible that the AI is using synonyms to describe the
content of a document because it is working with tokenized data and not the
original words. When an AI system tokenizes text data, it breaks down the
text into smaller units called tokens, which can be individual words,
phrases, or even characters.

These tokens are then used as inputs to machine learning models, which can
use various techniques to analyze and interpret the meaning of the text.
Sometimes, these models may use synonyms or other related words to describe
the content of a document because the original words may not be available
or may not be the best fit for the context.

Additionally, some AI systems may use techniques like paraphrasing or
summarization to generate new text based on the input data. In these cases,
the AI may use synonyms or other related words to convey the same meaning
as the original text while also improving readability or reducing
complexity.

However, if you specifically requested a verbatim quote and the AI system
did not provide it, it is possible that there was an error or a
misunderstanding in the processing of your request. It's important to keep
in mind that AI systems are not perfect and may not always provide the
exact results that you expect.


ME: How should I phrase a request to see the original data? Should I use
the word "verbatim"? Or some other word?

BOT: If you want to see the original data or text without any modifications
or substitutions, using the word "verbatim" is an appropriate way to phrase
your request. Verbatim means word for word, exactly as it was said or
written.

So, you could phrase your request like this: "Can I please see the original
text verbatim, without any modifications or substitutions?" This clearly

Re: [Vo]:LENR-CANR.org ChatBot improved

2023-05-17 Thread Robin
In reply to  Jed Rothwell's message of Wed, 17 May 2023 11:32:10 -0400:
Hi,
[snip]

These bots have an API that lets them access wiki, or so they claim.
Try asking for something else that is not on LENR-CANR.
 
>[The word "J'Accuse" appears nowhere in the LENR-CANR corpus.]
>
>. . . He also wrote letters to MIT's president and assistant to the
>president requesting an investigation into the alleged misconduct.
>Additionally, Mallove responded to a statement on cold fusion issued by the
>MIT News Office in August 1991 with a point-by-point rebuttal.
>
>[Not true. BOT #1 knows better.]
>
>BOT 1
>
>Is there a paper by Mallove describing events at MIT?
>BOT: Yes, there is a paper by Mallove that describes events at MIT. The
>paper is titled "MIT Special Report" and was published in Infinite Energy
>in 1999. The paper explores the major role of MIT in shaping the history of
>the investigation of cold fusion. You can find the paper at this URL:
>http://lenr-canr.org/acrobat/MalloveEmitspecial.pdf
Cloud storage:-

Unsafe, Slow, Expensive 

...pick any three.



Re: [Vo]:LENR-CANR.org ChatBot improved

2023-05-17 Thread Jed Rothwell
This is uncanny.


CHATBOT 2:

BOT: Hello! What can I do for you today?

What did Mallove say about the experiment at MIT?
BOT: Eugene Mallove accused MIT of publishing fudged experimental findings
to support their early condemnation of the work of Pons and Fleischmann. He
believed that MIT's researchers generated excess heat in their cold fusion
experiment but then altered the findings in their final report. Mallove
claimed that the scientists at MIT's Plasma Fusion Center had a vested
interest in seeing cold fusion fail, as they were receiving significant
federal funding for hot fusion research.
Mallove resigned from his position at MIT and demanded an investigation
into the alleged misconduct. He also accused the university of manipulating
the media and altering data to discredit the work of the Utah scientists.

Is there a paper by Mallove describing this?
BOT: There is no specific paper by Mallove solely focused on this incident
mentioned in the dataset. However, Mallove did submit a 17-page "J'Accuse"
letter to MIT detailing his accusations against the institution and its
researchers. . . .

[The word "J'Accuse" appears nowhere in the LENR-CANR corpus.]

. . . He also wrote letters to MIT's president and assistant to the
president requesting an investigation into the alleged misconduct.
Additionally, Mallove responded to a statement on cold fusion issued by the
MIT News Office in August 1991 with a point-by-point rebuttal.

[Not true. BOT #1 knows better.]

BOT 1

Is there a paper by Mallove describing events at MIT?
BOT: Yes, there is a paper by Mallove that describes events at MIT. The
paper is titled "MIT Special Report" and was published in Infinite Energy
in 1999. The paper explores the major role of MIT in shaping the history of
the investigation of cold fusion. You can find the paper at this URL:
http://lenr-canr.org/acrobat/MalloveEmitspecial.pdf


Re: [Vo]:LENR-CANR.org ChatBot improved

2023-05-17 Thread Jed Rothwell
H L V  wrote:


> For example, although chatgpt 4 is more accurate than 3, it has lost some
> speed.
>

That is typical of computer programs. You often have to choose between
speed and accuracy. When there is a tremendous amount of data, I often run
a program to extract summary data, to give a quick answer. For example,
there are 4.7 million download records so far this year at LENR-CANR.org. I
extract only the Acrobat file download totals into spreadsheets.

The one at LENR-CANR.org is particularly slow. The vendor asked me whether
it should be set for faster performance or a deeper search. I opted for a
deep search. I explained this on the input screen:

These Bots are from https://www.chatfast.io/. They may respond more slowly
than other ChatGPT bots you have used, because they are optimized to do
deep searches and to remember context, and because this is ChatGPT Version
4. They may take 30 seconds or more to answer your questions.


Re: [Vo]:LENR-CANR.org ChatBot improved

2023-05-17 Thread Stefan Israelsson Tampe
On the other hand, we have not yet started to optimize their speed. Also
consider what is happening on the hardware side.

On Wed, May 17, 2023 at 3:18 PM H L V  wrote:

> A cognitive scientist has suggested that the performance of these chatbots
> may have operational limits or trade-offs that we do not yet understand.
> For example, although chatgpt 4 is more accurate than 3, it has lost some
> speed.
>
> Harry
>
> On Tue, May 16, 2023 at 9:36 PM Jed Rothwell 
> wrote:
>
>> The LENR-CANR.org ChatBot has been upgraded to ChatGPT version 4. It
>> seems to work better, with fewer hallucinations. It may be slower.
>>
>> I asked it some of the questions version 3 could not answer. It answered
>> them correctly. It still gets confused. It thought that McKubre made a
>> statement that Bockris made. I asked for a verbatim quote and tracked it
>> down to Bockris.
>>
>>


Re: [Vo]:LENR-CANR.org ChatBot improved

2023-05-17 Thread H L V
A cognitive scientist has suggested that the performance of these chatbots
may have operational limits or trade-offs that we do not yet understand.
For example, although chatgpt 4 is more accurate than 3, it has lost some
speed.

Harry

On Tue, May 16, 2023 at 9:36 PM Jed Rothwell  wrote:

> The LENR-CANR.org ChatBot has been upgraded to ChatGPT version 4. It seems
> to work better, with fewer hallucinations. It may be slower.
>
> I asked it some of the questions version 3 could not answer. It answered
> them correctly. It still gets confused. It thought that McKubre made a
> statement that Bockris made. I asked for a verbatim quote and tracked it
> down to Bockris.
>
>