Richard,

The code you show is correct, and your message does not include the part where
you say ChatGPT explained the answer was 33/216 rather than the correct 42/216.

I gather it gave the proper fraction for the other two scenarios.

So what would cause such a localized error?

The method chosen is to enumerate all possible combinations, count how many of
them sum to a winning total, and then divide by the number of possible
combinations. Why would that go wrong?

As a human, at least on most days, I would calculate using just the first part
of the formula it used, to get the numerator:

> faces <- 1:6
> sum(rowSums(expand.grid(faces, faces, faces)) %in% c(7,11))
[1] 42

So is this perhaps a case where ChatGPT instead did some kind of reverse
engineering, and used something else to estimate what 0.1944444 might be as a
fraction with an integer numerator and denominator? There are often many ways
to do this, including some that would yield 7/36 (which is exactly 42/216
reduced) or some other near-miss fraction. Add the fact that floating point
representations are not exact, and it may have used an algorithm without being
told the result should be a fraction with a denominator of 216.
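
For what it's worth, a standard continued-fraction rationalization of the
printed decimal recovers the reduced form, not anything over 216. A small R
illustration of that idea (my speculation about the mechanism, not a claim
about what ChatGPT actually ran), using fractions() from the MASS package:

> library(MASS)
> fractions(0.1944444)   # rationalize the rounded decimal
[1] 7/36

Note that 7/36 is exactly 42/216 in lowest terms, so even a correct
rationalizer would hide the 216; an incorrect one could land somewhere else
entirely.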

If my guess is correct, you could argue this is partially an issue of
comprehension. Humans, to a limited extent, can look at a problem and see that
the solution to a second issue is already almost visible in what they did to
solve the first. A machine that searches the data it was fed may see your
question as having several parts, solve them sequentially using advice from
two places, and have an imperfect understanding of what it read in the second
place; or perhaps there was simply an error there.

This reminds me a bit of the in-line assignment operator that was added to
Python. The walrus operator allows part of an expression to be evaluated and
the result stored in a variable for use elsewhere in that expression, or
later. So if you have an expression that effectively needs the same
calculation two or more times, such as the sum of some numbers or their
average, you do it once, ask for the result to be saved, and then just refer
to it elsewhere in the expression; see the sketch below.
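
R can express much the same thing because assignment is itself an expression.
A minimal sketch of my own (relying on R evaluating the operands left to
right):

> x <- c(3, 1, 4, 1, 5)
> scaled <- (x - (m <- mean(x))) / m   # mean(x) computed once, reused as m
> m
[1] 2.8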

Or consider something like the quadratic formula, where you would otherwise
calculate the square root part twice because the two answers are plus or minus
the same thing.
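
In R the saving falls out naturally with a vectorized sign (again a small
sketch of my own):

> a <- 1; b <- -3; c0 <- 2                       # coefficients of x^2 - 3*x + 2
> (-b + c(1, -1) * (s <- sqrt(b^2 - 4*a*c0))) / (2*a)
[1] 2 1

The discriminant's square root is computed once, saved as s, and both roots
come out together.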

It is often easy for humans to see and extract such commonalities, but the
programs written so far are generally not designed to examine things this way.

I note that the above method can be a tad slow and expensive for very large
cases, like rolling a hundred dice, as you end up building a huge data
structure in which every entry sums to well above 11 since the minimum roll
per die is 1. Again, a human may realize this and skip the method entirely:
the chance of rolling 100 dice and getting a 7 or 11, or even a 99, is exactly
zero. For some other problems, such as rolling 8 dice, there are only
solutions for 11, not for 7. And rather than generating all possible
combinations in advance, there may be an algorithm that builds a tree with
pruning: a branch is abandoned as soon as the running total, plus one for each
die still to be thrown, already exceeds the target (with 8 dice, a first toss
of 5 or 6 can never lead to 11). Likewise, if the running total is such that
the only valid completion is all ones, you can declare that result and prune
any further exploration along that branch.
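
Here is a minimal R sketch of such a pruned, depth-first count (my own
illustration of the idea, not anything from the book):

> count_ways <- function(k, target) {
+   # ordered rolls of k fair dice summing exactly to target
+   if (k == 0) return(as.numeric(target == 0))
+   if (target < k || target > 6 * k) return(0)   # prune: branch cannot succeed
+   sum(vapply(1:6, function(face) count_ways(k - 1, target - face), 0))
+ }
> (count_ways(3, 7) + count_ways(3, 11)) / 6^3
[1] 0.1944444

The target < k test is the all-ones bound from above: if every remaining die
showed 1 the total would still overshoot, so the branch is dropped without
being enumerated.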

But would ChatGPT be flexible enough to suggest such an algorithm, or know to
switch to it above some number of dice?

Can anyone explain better what went wrong? I have heard statements before
about how some of these pseudo-AIs make simple mathematical errors, and this
sounds like one.



-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Richard O'Keefe
Sent: Saturday, April 13, 2024 5:54 AM
To: R Project Help <r-help@r-project.org>
Subject: [R] Just for your (a|be)musement.

I recently had the chance to read a book explaining how to use
ChatGPT with a certain programming language.  (I'm not going
to describe the book any more than that because I don't want to
embarrass whoever wrote it.)

They have appendix material showing three queries to ChatGPT
and the answers.  Paraphrased, the queries are "if I throw 2 (3, 4)
fair dice, what is the probability I get 7 or 11?  Show the reasoning."
I thought those questions would make a nice little example,
maybe something for Exercism or RosettaCode.  Here's the R version:

> faces <- 1:6
> sum(rowSums(expand.grid(faces, faces)) %in% c(7,11))/6^2
[1] 0.2222222
> sum(rowSums(expand.grid(faces, faces, faces)) %in% c(7,11))/6^3
[1] 0.1944444
> sum(rowSums(expand.grid(faces, faces, faces, faces)) %in% c(7,11))/6^4
[1] 0.09567901

Here's where it gets amusing.  ChatGPT explained its answers with
great thoroughness.  But its answer to the 3 dice problem, with what
was supposedly a list of success cases, was quite wrong.  ChatGPT
claimed the answer was 33/216 instead of 42/216.

Here's where it gets bemusing.  Whoever wrote the book included
the interaction in the book WITHOUT CHECKING the results, or at
least without commenting on the wrongness of one of them.

I actually wrote the program in 6 other programming languages,
and was startled at how simple and direct it was in base R.
Well done, R.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
