from the wrong thread, below.
i got adapter finetuning to run without crashing by simply using the
existing script.
$ cd adapter-transformers/examples/pytorch/text-classification
$ { echo input,label; for ((b=0; b<16; b++)); do echo -e 'one,1\ntwo,2\nthree,3'; done; } > test.csv
$ python3
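# (the command above was cut off; below is my guess at a full invocation,
# assuming run_glue.py's standard flags plus the --train_adapter flag the
# adapter-transformers fork adds; every value here is an assumption)
$ python3 run_glue.py \
    --model_name_or_path bert-base-uncased \
    --train_file test.csv --validation_file test.csv \
    --do_train --do_eval \
    --max_seq_length 32 \
    --per_device_train_batch_size 8 \
    --learning_rate 1e-4 \
    --num_train_epochs 3 \
    --train_adapter --adapter_config pfeiffer \
    --output_dir ./out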
i'm poking at adapter transformers again
i have a simpler goal of just making a model pick a legal chess move
i'm now at
https://github.com/adapter-hub/adapter-transformers/blob/master/examples/pytorch/text-classification/run_glue_no_trainer.py
because it shows the training loop more clearly
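as a sketch, the loop in that script boils down to something like this, with the two adapter lines added (model, data, and hyperparameters here are placeholders i made up, not what the script uses):

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)
model.add_adapter("moves")    # adapter-transformers: insert adapter modules
model.train_adapter("moves")  # freeze the base model; only the adapter trains

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
batch = tokenizer(["one", "two", "three"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 2, 3])
for epoch in range(3):
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()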
On 7/5/22, punk wrote:
>
>
> oh and by the way karl. Now your messages get filtered directly to
> trash.
> Just like the messages of your coworker, jewnazi professor turd.
> Congratulations, asshole.
thank you for digging me out of the trash, to reply
i was reading https://docs.adapterhub.ml/training.html . its link to
run_glue is broken.
i was copying from
https://github.com/adapter-hub/adapter-transformers/blob/master/examples/pytorch/text-classification/run_glue.py#L364
which is the right link now
i had gotten this far and it was lots of
i'm having a dull feeling and stopping
i don't like it
my mind part says i could be risking the wellbeing of the world and so
it is stopping me
i don't like the dull feeling at all
looking at this i'm imagining that a number of people likely already
have AGI libraries that combine adapted pretrained models. of course i
can't know that, but there are billions of people on the planet, so it
seems very likely. the adapters library seems to have established a
little maturity
one of the things i like/love about STaR is that the researchers chose
to include reasons in their work
the model pattern means the model can learn to explain its reasons at
arbitrary granularity
hi
i'm doing the things described in this thread
i want to implement the STaR paper so i can exchange dialogue with
something that is logically consistent
i'm not usually able to do this, especially during psychotic breaks, or online
i like to extract the training loop and put it in a new .py file.
this makes more sense to me, since the scripts aren't apis and i want
to build an api.
honestly all the run_mlm, run_clm scripts used here really bug me,
partly because it is so difficult to bidirectionally interoperate
python code with shell code, especially in a reusable way,
but secretly mostly or also because it takes so much more memory to
use two things and move between them
the adapter transformers training documentation is at
https://docs.adapterhub.ml/training.html . it's sparse. basically, you
train it as if it's a huggingface model, except you add two lines to
put an adapter in.
then, in theory, it does the same thing but uses much less ram and
happens much faster.
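concretely, the two lines are these (a sketch of my understanding, assuming the adapter-transformers fork; 'my-task' is a made-up adapter name):

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
model.add_adapter("my-task")    # line 1: insert adapter modules into each layer
model.train_adapter("my-task")  # line 2: freeze the base weights; only the adapter trains
# from here, Trainer or a hand-written loop proceeds exactly as for any
# huggingface model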
i should make a profile picture like the logo of adapter transformers
where your whole head is turned into a banal expressionless robot,
like, holding a military rifle
except for like, part of one eye, and the corners of that eye are
crinkled into a caring smile :D
where the word 'first' should be the word 'fast' in the previous spam-mail
i ran the first adapter example at https://docs.adapterhub.ml/quickstart.html
it did the exact same thing as the model i am finetuning for 2 hours,
and it ran _super first_ at the same time as the ongoing training,
with no ram issues.
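from memory, the example amounts to roughly this (the adapter name follows the hub's naming scheme but may not be the exact one in the quickstart):

from transformers import AutoAdapterModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
adapter = model.load_adapter("sentiment/sst-2@ukp")  # a few-MB download from adapterhub.ml
model.set_active_adapters(adapter)
print(model(**tokenizer("this ran super fast", return_tensors="pt")).logits)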
the next section in the quickstart is doing your own
adapters is language model focused, and adds classes for new research.
something along those lines
what's cool is if i make a new pretrained adapter based on a paper i
could publish it on their hub and maybe become the king of the solar
system -- i mean of that model adapter -- if somebody else uses it
this looks like a good framework for this job, although of course i am
usually wrong
- it's designed for finetuning already
- it uses a mutated form of finetuning that seems likely to require less ram
- it has extensive finetuning documentation [filled in a different
reason here after i forgot]
i'm finding it fun now to consider accessing _all_ of the adapterhub
finetuning weights for a model
and ensembling them so as to make a supersmart customized model
i guess after training a finetuning ensemble you could actually store
it as a new finetuning to share with others
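if i read the docs right, the library already ships a mechanism for exactly this, AdapterFusion; a sketch (the hub adapter names are illustrative, not a combination i have tested):

from transformers import AutoAdapterModel
from transformers.adapters.composition import Fuse

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.load_adapter("nli/multinli@ukp", load_as="multinli", with_head=False)
model.load_adapter("sts/qqp@ukp", load_as="qqp", with_head=False)
fusion = Fuse("multinli", "qqp")
model.add_adapter_fusion(fusion)    # new weights that learn how to mix the adapters
model.set_active_adapters(fusion)
model.train_adapter_fusion(fusion)  # freeze everything except the fusion weights
# and model.save_adapter_fusion("./fused", "multinli,qqp") would store the
# trained mix as a new shareable artifact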
but maybe it is
i found https://adapterhub.ml/
it's a repository of little 3MB weights that can be tacked onto
existing pretrained models to give them totally new tasks.
its logo is a drawing of 'bert' from sesame street, except bert has
been 75% turned into a robot, leaving off the right half of their face
there's also lightweight finetuning stuff, like ladder side tuning
but that would never work out, inhibitions are very strong around metawork
really i'd like to pretrain an optimizer to train models quickly
rwkv's tiny enwik8 model would likely finetune on my system
i also know the rwkv model architecture well enough now to quickly
tweak it to run on smaller ram
_or_ we could train a tiny language model :D
there's value around setting up so that code runs on something faster:
some remote system where running these 150 epochs would cost a few
cents or less per test and complete faster
i ended up enabling the language modeling accelerations with my time
i'm not sure whether they help or not. it looks like the total time on
my system using them for the task that's coded in is about 2 hours and
18.3 minutes.
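for context, enabling them is a matter of passing algorithm objects to composer's trainer; something like this sketch (it assumes composer's HuggingFaceModel wrapper and these two algorithm classes; the dataset is junk just to keep it self-contained):

from torch.utils.data import DataLoader
from transformers import AutoModelForMaskedLM, AutoTokenizer
from composer import Trainer
from composer.algorithms import FusedLayerNorm, GatedLinearUnits
from composer.models import HuggingFaceModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = tokenizer(["one", "two", "three"] * 16, padding=True, return_tensors="pt")
samples = [
    {"input_ids": ids, "attention_mask": mask, "labels": ids}
    for ids, mask in zip(enc["input_ids"], enc["attention_mask"])
]
trainer = Trainer(
    model=HuggingFaceModel(AutoModelForMaskedLM.from_pretrained("bert-base-uncased")),
    train_dataloader=DataLoader(samples, batch_size=8),
    max_duration="150ep",
    algorithms=[FusedLayerNorm(), GatedLinearUnits()],  # the language modeling accelerations
)
trainer.fit()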
it looks like composer is mostly designed for vision models. it has
i found this; the wrong symbol is in the documentation and i had copied it over
a rare time when i think it was actually the documentation and not me
patch at https://github.com/mosaicml/composer/pull/1259
the next issue is with the next acceleration algorithm.
for it to start learning on
present state:
i'm bumping into a bug in composer where one of the acceleration
algorithms tries to use the wrong symbol.
likely this worked fine in an older version, or has been fixed in the
development version.
i'm working on this because i realised that i had not yet enabled
composer's
> i'm thinking about trying to generalise my code to use either a
> generative model, or a pretrained model. the two are very similar. i
> was likely planning to do this.
i meant masked model. both are pretrained. restated below.
i'm thinking about trying to generalise my code to use either a
generative model, or a masked model. the two are very similar. i
was likely planning to do this.
translation to amnesiatic english:
On 7/5/22, Undiscussed Horrific Abuse, One Victim of Many
wrote:
> my existing work is at https://github.com/xloem/rnlp .
>
> i just got composer_nb_alter_2.py to work.
this is just an implementation of example code for a library called
'mosaicml composer' which i
translation to amnesiatic english:
On 7/5/22, Undiscussed Horrific Abuse, One Victim of Many
wrote:
> ok let's try to implement STaR a little tiny smidge
by STaR I mean this paper: https://arxiv.org/abs/2203.14465
it is a way to bootstrap a generative language model into making
logical common sense, by finetuning it on its own rationales that led
to correct answers
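the core loop, as i read the paper, is small enough to sketch (the function names here are mine, placeholders for the prompting and finetuning pieces):

def star(base_model, problems, ask, finetune, iterations=3):
    # ask(model, question, hint=None) -> (rationale, guessed_answer)
    # finetune(model, examples)       -> a newly finetuned model
    model = base_model
    for _ in range(iterations):
        keep = []
        for question, answer in problems:
            rationale, guess = ask(model, question)
            if guess != answer:
                # "rationalization": retry with the true answer shown as a hint
                rationale, guess = ask(model, question, hint=answer)
            if guess == answer:
                keep.append((question, rationale, answer))
        # the paper restarts each iteration's finetuning from the base model
        model = finetune(base_model, keep)
    return model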
my existing work is at https://github.com/xloem/rnlp .
i just got composer_nb_alter_2.py to work.
i took an existing training sample for a masked language model, and
made it work with a generative language model.
the smallest pretrained generative models i found are bigger than the
masked ones, so it takes forever
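the switch is mostly a matter of which auto-class you use and what the labels are; roughly (the model name is an example, not necessarily what composer_nb_alter_2.py uses):

from transformers import AutoModelForCausalLM, AutoTokenizer

# generative ("causal") version: the labels are just the input ids,
# the model shifts them internally to predict the next token
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
batch = tokenizer("one two three", return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
# the masked version differs mainly in using AutoModelForMaskedLM and
# masking some input tokens to build the labels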
ok let's try to implement STaR a little tiny smidge
basically, you need to be able to finetune or train a language model.
this means paying a few dollars for time, having a powerful GPU,
compromising with a very small model, or implementing research
algorithms.
my plan is to compromise with a