Re: [NonGNU ELPA] New package: llm

2023-09-01 Thread Ihor Radchenko
chad  writes:

> For large AI models specifically: there are many users for whom it is not
> practical to _actually_ recreate the model from scratch everywhere they
> might want to use it. It is important for computing freedom that such
> recreations be *possible*, but it would be very limiting to insist that
> everyone who wants to use such services actually do so, much as it would
> be limiting to insist that every potential Emacs user compile their own
> Emacs. In this case there's the extra wrinkle that recreating the
> currently-most-interesting large language models involves both _gigantic_
> amounts of resources and a fairly large amount of not-directly-reproducible
> randomness. It might be worth further consideration.

Let me refer to another message by RMS:

>>   > While I certainly appreciate the effort people are making to produce
>>   > LLMs that are more open than OpenAI (a low bar), I'm not sure if
>>   > providing several gigabytes of model weights in binary format is really
>>   > providing the *source*. It's true that you can still edit these models
>>   > in a sense by fine-tuning them, but you could say the same thing about a
>>   > project that only provided the generated output from GNU Bison, instead
>>   > of the original input to Bison.
>> 
>> I don't think that is valid.
>> Bison processing is very different from training a neural net.
>> Incremental retraining of a trained neural net
>> is the same kind of processing as the original training -- except
>> that you use other data and it produces a neural net
>> that is trained differently.
>> 
>> My conclusion is that the trained neural net is effectively a kind of
>> source code.  So we don't need to demand the "original training data"
>> as part of a package's source code.  That data does not have to be
>> free, published, or available.
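
To make the comparison concrete, here is a minimal sketch in Python/PyTorch
with a toy model and made-up data (nothing LLM-scale, purely illustrative):
the very same training loop that trains the network in the first place is
reused, unchanged, to retrain it on other data.

  # Sketch: "incremental retraining" is the same kind of processing as the
  # original training; only the starting weights and the data differ.
  # The model, data, and hyperparameters below are toy placeholders.
  import torch
  import torch.nn as nn

  def train(model, inputs, labels, steps=100, lr=1e-3):
      """One generic training loop, used both for initial training and retraining."""
      opt = torch.optim.SGD(model.parameters(), lr=lr)
      loss_fn = nn.CrossEntropyLoss()
      for _ in range(steps):
          opt.zero_grad()
          loss_fn(model(inputs), labels).backward()
          opt.step()
      return model

  model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

  # Original training on one dataset ...
  x1, y1 = torch.randn(64, 16), torch.randint(0, 4, (64,))
  train(model, x1, y1)

  # ... and incremental retraining: identical processing, different data,
  # starting from the already-trained weights.
  x2, y2 = torch.randn(64, 16), torch.randint(0, 4, (64,))
  train(model, x2, y2)

Scaling this up to an actual LLM changes the model, the data, and the
compute budget, but not the nature of the processing.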

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at 



Re: [NonGNU ELPA] New package: llm

2023-08-31 Thread chad
On Thu, Aug 31, 2023 at 5:06 AM Ihor Radchenko  wrote:

> Richard Stallman  writes:
>
> > As for LLMs that run on servers, they are a different issue entirely.
> > They are all SaaSS (Service as a Software Substitute), and SaaSS is
> > always unjust.
> >
> > See https://gnu.org/philosophy/who-does-that-server-really-serve.html
> > for explanation.
>
> I do not fully agree here. [...]
> Thus, for many users (those owning less powerful computers) LLMs as a
> service are going to be SaaS, not SaaSS. (Given that the SaaS LLM has a
> free licence and users who choose to buy the necessary hardware retain
> their freedom to run the same LLM on their own hardware.)
>

It's a somewhat subtle, gnarly point, and I couldn't express it as well as
Ihor Radchenko does here, but I will add: running your own SaaS as a
free-software-loving user has recently become both easier and harder. On
the one hand, it's difficult these days to run a personal email service
without getting trapped by the shifting myriad of overlapping
spam/fraud/monopoly `protection' features, at least if you want to
regularly send email to a wide variety of users. On the other hand, it's
increasingly viable to have a hand-held machine, a tiny fraction of the
size of a space-cadet keyboard, running (mostly; binary blobs are a
pernicious evil) free software that easily connects back to one's own
free-software "workstation" for medium and large jobs, all while avoiding
"the cloud trap", as it were.

(Such things have been a long-time hobby/interest of mine, dating back to
before my time as a professional programmer. They're still not common, but
they're becoming more so; native Android support for Emacs, as one example,
will likely help.)

For large AI models specifically: there are many users for whom it is not
practical to _actually_ recreate the model from scratch everywhere they
might want to use it. It is important for computing freedom that such
recreations be *possible*, but it would be very limiting to insist that
everyone who wants to use such services actually do so, much as it would be
limiting to insist that every potential Emacs user compile their own Emacs.
In this case there's the extra wrinkle that recreating the
currently-most-interesting large language models involves both _gigantic_
amounts of resources and a fairly large amount of not-directly-reproducible
randomness. It might be worth further consideration.

Re-reading this just now, it seems like a topic better suited to
emacs-tangents or even gnu-misc-discuss, so I'm changing the CC
accordingly. Apologies if this causes an accidental fork.

I hope that helps,
~Chad