Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-06-18 Thread Undiscussed Horrific Abuse, One Victim of Many
It's getting more normal to use recurrent models that no longer have bounds on their input and output sizes. This removes half the challenge of this task. https://github.com/BlinkDL/RWKV-LM

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-03-16 Thread Undiscussed Horrific Abuse, One Victim of Many
so maybe:

pip3 install https://github.com/xloem/GPTb

from GPTB import GPTBLMHeadModel
from transformers.models.gpt2.configuration_gpt2 import GPT2Config
config = GPT2Config()  # pass settings, or pull the config from some pretrained model and tweak it
config.rebias = True  # additional

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-24 Thread Undiscussed Horrific Abuse, One Victim of Many
i'm suspecting some people have been using fairseq for things like this https://github.com/pytorch/fairseq it's a facebook project focused on training sequence transformer models. noticed there was a deep learning related repo on the old gitopia, too, could be meaningful to look through such

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-03 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
regarding the idea for saving state, that could work here. basically you take a fancy text generation model and finetune it to produce its own embeddings by feeding it one token at a time instead of a document, each time feeding back its generated state as embeddings. it then is possibly bound by
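a rough sketch of the feeding-back idea, using gpt-2's key/value cache as a stand-in for the state that gets carried forward (the model choice and names here are just for illustration, not the real setup):

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

ids = tok("int add(int a, int b) { return a + b; }", return_tensors="pt").input_ids
past = None
for i in range(ids.shape[1]):
    # one token at a time; the carried state replaces re-reading the whole document
    out = model(input_ids=ids[:, i:i + 1], past_key_values=past, use_cache=True)
    past = out.past_key_values

the finetuning step would then train the model to keep working well when only this carried state, not the full document, is available.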

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-01 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
uhhh the discord i remember the best is eleutherai's. they made gptj and also an open source coding assistant app for vscode.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-01 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
Note: I won't be effective at using the cutting edge here, because I am not hanging in research chats on discord collaborating with researchers sharing their latest work. Anybody can do that by hopping through the chat servers, asking around. It feels a little overwhelming for me.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-01 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
Another idea: We could design something using human knowledge or ghidra, then review it and figure out how a model could have designed it on its own.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-01 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
I'm thinking I'd like to try training a bytes tokenizer for bigbird and extending its sequence length to entire binaries. I expect the result to be about 30% successful given my lack of experience and time.
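a minimal sketch of what i mean, with sizes that are guesses rather than tuned values; the simplest possible bytes tokenizer is just one id per byte value:

from transformers import BigBirdConfig

# reserve 0 for pad and 1 for eos, one id per byte value after that
def byte_tokenize(blob: bytes):
    return [b + 2 for b in blob]

# sequence length large enough for small whole binaries (a placeholder number)
config = BigBirdConfig(vocab_size=258, max_position_embeddings=65536,
                       attention_type="block_sparse")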

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-01 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
idea: a model could be trained to guess the source layout by sequentially producing filepaths and selecting areas of the source code to consider, like an agent that's similar to language generation except the output words/phrases are unordered: a set of filepaths. might be interesting to try

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-01 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
- I skimmed bigbird's description a little. it's trained for sequence lengths of 4096 tokens but it doesn't look like memory requirements would rise too much if that were increased somehow. curious if you can finetune a model with increased position embeddings, probably can. - I glanced at realm
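on the position-embedding question, a sketch of one way it could be done for finetuning; the attribute paths are assumptions about the model's layout and the copy-the-last-row init is a crude guess, not a tested recipe:

import torch
from transformers import BigBirdModel

model = BigBirdModel.from_pretrained("google/bigbird-roberta-base")
old = model.embeddings.position_embeddings              # learned absolute positions, 4096 rows
new_len = 8192
new = torch.nn.Embedding(new_len, old.embedding_dim)
with torch.no_grad():
    new.weight[: old.num_embeddings] = old.weight        # keep what was learned
    new.weight[old.num_embeddings :] = old.weight[-1]    # crude init for the added rows
model.embeddings.position_embeddings = new
# if the embeddings module keeps a position_ids buffer, it needs extending too
model.embeddings.position_ids = torch.arange(new_len).expand((1, -1))
model.config.max_position_embeddings = new_len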

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-02-01 Thread Undiscussed Horrific Abuse, One Victim & Survivor of Many
- a large pretrained model that has significant understanding of english logic and knowledge could be finetuned on bytes by training perceiver-like cross attention embedding/tokenization encoders and decoders to match the behaviors of its original tokenizer and embeddings but accept byte streams.
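a rough sketch of what such an encoder could look like; the names and sizes are made up, and the real thing would be trained so its outputs imitate the pretrained model's own embedding outputs:

import torch
import torch.nn as nn

class ByteCrossAttentionEncoder(nn.Module):
    def __init__(self, n_latents=64, d_model=768, n_heads=8):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)          # one embedding per byte value
        self.latents = nn.Parameter(torch.randn(n_latents, d_model) * 0.02)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, byte_ids):                               # (batch, n_bytes)
        kv = self.byte_embed(byte_ids)
        q = self.latents.expand(byte_ids.shape[0], -1, -1)
        out, _ = self.cross_attn(q, kv, kv)                    # latents attend over the byte stream
        return out                                             # (batch, n_latents, d_model)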

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-25 Thread k
- a large T5 model could be tpu compiled on colab notebooks by calling pmap() on individual blocks rather than the whole model - much larger models could be trained by masking the training weights to reduce autograd memory load as has been done for at-home training of large text generation models
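a toy sketch of the per-block idea, with a made-up block standing in for a real T5 block; compiling block by block avoids compiling the whole model as one program:

import jax
import jax.numpy as jnp

def block(params, x):
    # stand-in for one transformer block
    return jnp.tanh(x @ params["w"] + params["b"])

block_p = jax.pmap(block)       # compiled and run per block across the cores

n = jax.local_device_count()
params = {"w": jnp.ones((n, 4, 4)), "b": jnp.zeros((n, 4))}
x = jnp.ones((n, 2, 4))
y = block_p(params, x)          # shape (n, 2, 4)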

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-23 Thread k
- it turns out that deserialization of compiled tpu code isn't implemented in colab notebooks yet. might be easy to implement, might be nearly impossible, haven't looked. so not too much was accomplished by the use of tpu vms other than realising they're there for when a lot of speed is needed.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-22 Thread k
this has been going slower than needed because colab was bailing when i tried to run the model on google's tpus, during compilation. today i made a google cloud vm and precompiled the model in their shell, and added precompilation support to the notebook. it was _really_ hard to make the vm, my

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-19 Thread grarpamp
On 1/19/22, k wrote:
> decompiled function as of today:
> \00 def example_sum(left, right, sum):
> it doesn't look like much, but it's progress

There will be a party if your new ghidra prints printf("Hello world.\n");
https://github.com/NationalSecurityAgency/ghidra

> might take me a bit to

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-19 Thread k
this is currently autouploading new snapshots of the model training as it goes, for as long as google lets my notebook stay running. it's presently between 1.0 and 2.0 loss and is making decompilations that don't have weird symbols in them. it's training on only a little under 30k unreviewed and

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-18 Thread k
a jax contributor kindly shared this with me. you can store tpu models precompiled, which significantly speeds launch time, by using a compilation cache folder.

from jax.experimental.compilation_cache import compilation_cache as cc
cc.initialize_cache("/path/name/here", max_cache_size_bytes=32 *

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-18 Thread k
um er - i went back to that and it turned out i had just scrolled up, and the training was all there - i think i may have uploaded another snapshot - i let it train for a number more hours, but when i returned the vm had run out of ram and X wasn't accepting keyboard input. it took me some time

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-18 Thread k
- show and tell - the checkpoint on huggingface currently has a loss of around 2.1, so it doesn't succeed yet. but it turns out it can produce an output, and guesses a simple signature correctly:

git clone https://github.com/xloem/techsketball
cd techsketball
python3 demo.py

it compiles a very

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-18 Thread k
note: in my bumbling i found this doc which gives a general intro to flax/jax/huggingface from google: https://github.com/huggingface/transformers/blob/master/examples/research_projects/jax-projects/README.md . i'm wondering if stuff like that doc is how jax reached me.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-17 Thread k
i posted but my email disappeared for me, here's another. my continuing on this is waning atm, maybe will change. the first model was lost after a day or two when the notebook closed itself. reusing the token ids of the T5 tokenizer really speeds training from the T5 model. i spent some time

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-17 Thread k
note: - additionally, the perceiver model structure may not need tokenization - and, google made a new T5 called LongT5 that can handle much larger data already; the code is usually released in the coming months. given many functions are short, i might skip the length problem for now but maybe now something

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-17 Thread k
the T5 tokenizer the current code uses removes linebreaks, so the output source isn't recompilable. last night i added to find_pycode.py to add functions to train a tokenizer for the source, preserving linebreaks. there is a further big issue: embedded strings are not tokenized on the input
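for reference, a minimal sketch of training a byte-level tokenizer that keeps linebreaks as ordinary bytes; the file name and vocab size are placeholders, and this is not the find_pycode.py code:

from tokenizers import ByteLevelBPETokenizer

tok = ByteLevelBPETokenizer()
tok.train(files=["source_samples.txt"], vocab_size=32000,
          special_tokens=["<pad>", "</s>", "<unk>"])
tok.save_model(".")

ids = tok.encode("def f(x):\n    return x\n").ids
print(tok.decode(ids))   # the \n survives the round trip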

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-16 Thread k
batchsize of 20 is about the same speed. redaction: this is not actually the free colab. to make it work on the free colab, you'd drop the batchsize so it fits in ram. while frustrated with the tpu rpc timeouts i bought the paid colab. it didn't help, turns out because the timeout is hardcoded

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-16 Thread k
it's successfully fitting the model to the task on the colab gpu. the tpu compilation times out colab's rpc connection to google's cloud. the eta for 10 runs through my example data is within 520 hours (3 weeks) on the free colab gpu notebook using a batch size of 16.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-16 Thread k
hbm limits relate to the TPU linked to the notebook. a v2-8 (i think?) has 64 GB which gets split into 8x 8GB if all 8 cores are used. TRC provides larger TPUs, but it still raises the memory size issue.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-16 Thread k
[missing change was committed] t5-base with a batch size of 6 is looking for 22GB of hbm (tpu memory). the crashes complained it has only 7 gb, which might be a notebook limit or a time of day thing

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-16 Thread k
[after a number of psychotic breaks] the training loop runs now. it's likely not running very effectively. for the notebook to run right now, an uncommitted change is needed:

# compute loss
loss = optax.softmax_cross_entropy(logits, flax.training.common_utils.onehot(labels,
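my guess at how that truncated line completes, taking the vocab size from the logits shape (an assumption) and averaging over the batch; the arrays here are stand-ins for the notebook's real values:

import jax.numpy as jnp
import optax
from flax.training.common_utils import onehot

logits = jnp.zeros((2, 5, 32128))               # stand-in: (batch, seq, vocab)
labels = jnp.zeros((2, 5), dtype=jnp.int32)

# compute loss
loss = optax.softmax_cross_entropy(logits, onehot(labels, logits.shape[-1])).mean()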

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-01 Thread k
i've addressed bugs enough that it actually gets to the point where the tpus evaluate the model with passed data. so far the first evaluation pass hasn't returned, maybe cause this demo is low-end, unsure. i have no idea how long it should take and should try a smaller model to continue

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-01 Thread k
i'm looking at https://github.com/huggingface/transformers/blob/master/examples/flax/summarization/run_summarization_flax.py#L534 , which is for flax as a summarization task, and noting that the decoder input ids are the labels shifted by one. i'm thinking that summarization is basically the
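the shift looks roughly like this (my own paraphrase of the helper used in the flax examples, not a copy of it):

import numpy as np

def shift_tokens_right(labels, pad_token_id, decoder_start_token_id):
    # decoder inputs = labels shifted one to the right, with the start token first
    shifted = np.zeros_like(labels)
    shifted[:, 1:] = labels[:, :-1]
    shifted[:, 0] = decoder_start_token_id
    return np.where(shifted == -100, pad_token_id, shifted)   # -100 marks ignored positions

labels = np.array([[42, 43, 44, 1]])
print(shift_tokens_right(labels, pad_token_id=0, decoder_start_token_id=0))
# [[ 0 42 43 44]]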

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-01 Thread k
wow those two emails are _full_ of errors. don't take the log of logits, you'll get a double-log probability and nobody will know what to do with it except people investigating the insides of neural network models that manipulate other neural network models or something
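to be explicit about the fix: logits are already unnormalized log-probabilities, so normalize them with log_softmax rather than wrapping them in another log:

import jax
import jax.numpy as jnp

logits = jnp.array([[2.0, 0.5, -1.0]])
log_probs = jax.nn.log_softmax(logits, axis=-1)   # right: normalized log-probabilities
# wrong: jnp.log(logits) -- a double log, and it produces nan for negative logits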

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-01 Thread k
oh, and .view(-1, ...) means to squish an n-dimensional tensor so that it has the dimension sizes listed, where -1 means to make that dimension as large as needed to fit all the elements. so .view(-1) turns it into a 1-dimensional array.
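a quick example of that:

import torch

x = torch.arange(12).reshape(3, 4)
print(x.view(-1).shape)     # torch.Size([12]) -- flattened to 1-d
print(x.view(-1, 2).shape)  # torch.Size([6, 2]) -- the -1 is inferred as 6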

Re: [crazy][hobby][spam] Automated Reverse Engineering

2022-01-01 Thread k
so, the jax/flax hugging face t5 output doesn't include loss the way the huggingface t5 documentation implies. the pytorch output does. here's the loss from the huggingface pytorch t5 code. for me this is line 1643 of my old checkout of github.com/huggingface/transformers
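the snippet itself is cut off in this preview; from memory it is roughly the standard cross entropy over the flattened logits (a paraphrase, not an exact copy of that line, and the tensors here are stand-ins):

import torch
from torch.nn import CrossEntropyLoss

lm_logits = torch.randn(2, 5, 32128)             # stand-in: (batch, seq, vocab)
labels = torch.randint(0, 32128, (2, 5))

loss_fct = CrossEntropyLoss(ignore_index=-100)
loss = loss_fct(lm_logits.view(-1, lm_logits.size(-1)), labels.view(-1))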

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-31 Thread k
looks like i pasted together data batching code that doesn't line up. basically the code needs to be mutated such that each batch is a dict, rather than the whole data. the example at https://github.com/huggingface/transformers/blob/master/examples/flax/language-modeling/run_t5_mlm_flax.py uses
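roughly, each batch should come out as its own dict instead of one dict holding all the data; the field names below follow the usual seq2seq examples and are assumptions, not the repo's actual names:

import numpy as np

def iter_batches(input_ids, labels, batch_size):
    for i in range(0, len(input_ids), batch_size):
        yield {
            "input_ids": np.asarray(input_ids[i:i + batch_size]),
            "labels": np.asarray(labels[i:i + batch_size]),
        }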

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-31 Thread k
I've pasted a training function into the .ipynb in https://github.com/xloem/techsketball/ . it's not mutated into a .py yet. i've also added mmapping functionality to the data generator so data larger than ram can be used and cached between tests. it is not used yet. i code with dense bugs due to
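the mmapping amounts to something like this, so the token arrays live on disk instead of in ram; the path and shapes are made up for illustration:

import numpy as np

tokens = np.memmap("tokens.dat", dtype=np.int32, mode="w+", shape=(100_000, 512))
tokens[0] = 0        # written through the mapping, backed by the file
tokens.flush()

# later runs reopen the same cache read-only
tokens = np.memmap("tokens.dat", dtype=np.int32, mode="r", shape=(100_000, 512))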

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
The linked function will need to be mutated for T5 per the T5 page linked earlier in this thread and farther down in my repo readme. Or the page's instructions could simply be used, rather than this TPU-oriented tutorial.

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
i got some simple data prepared and into the training implementation but have not written it, hard to continue. i'm at https://colab.research.google.com/github/huggingface/notebooks/blob/master/examples/causal_language_modeling_flax.ipynb#scrollTo=GjKzb0zJd-aH code is

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
On 12/29/21, Punk-BatSoup-Stasi 2.0 wrote:
> On Wed, 29 Dec 2021 17:44:57 -0500 k wrote:
>> i think this example notebook shows training a transformer model on
>> the free tpus https://colab.research.google.com
>
> again, fuck you karl and your fucking JOOGLE SPAM. Take it

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
yeah i dunno =/ but hey, big corps guiding advanced tech to use big computing resources and then monopolising control of them is just like how we spam lists to do things, maybe! use whatcha got? gotta figure out how to turn the problem into a different solution

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
i think this example notebook shows training a transformer model on the free tpus https://colab.research.google.com/github/huggingface/notebooks/blob/master/examples/causal_language_modeling_flax.ipynb

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
the trick is to get how google's military contract is fueled by the terrorist propaganda it researches dispensing actually targeting the problems up there

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
yay progress! time for me to spin in circles a bit. [crazy]

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
i wrote a quick call-and-go class to generate short pairs of bytecode and sourcecode from the python runtime at https://github.com/xloem/techsketball/blob/main/find_pycode.py it might be reasonable to use this as a proof of concept, filtering on input length. since others are likely adding the
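the gist of it is something like this (a rough sketch of the idea, not the contents of find_pycode.py):

import inspect
import json
import types

def source_bytecode_pairs(mod):
    # pair each function's source text with its compiled bytecode
    for name, fn in vars(mod).items():
        if isinstance(fn, types.FunctionType):
            try:
                yield inspect.getsource(fn), fn.__code__.co_code
            except OSError:
                pass   # no source available for this one

for src, code in source_bytecode_pairs(json):
    print(len(code), "bytes of bytecode for", src.splitlines()[0])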

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
i summarized some things at https://github.com/xloem/techsketball/blob/main/README.md including a link to that memory reducing paper at https://arxiv.org/abs/2112.05682 and some python import statements. there's code for this paper at https://github.com/AminRezaei0x443/memory-efficient-attention

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
ok, this was great motion. i think vanilla models have a maximum sequence length. this can be expanded by altering the algorithm to not be O(n^2) for memory in the attention function. there's a paper out there on one approach to this. another idea is to chunk the text in some way and train

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
note: the huggingface demo passes information to the model using token ids. token ids are just indexed sets of character sequences that occur together frequently (the tokenizer counts and decides these). with something based on math, since it's going to be learning using linear algebra, i'm wondering
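concretely, the ids look like this for T5's tokenizer:

from transformers import T5TokenizerFast

tok = T5TokenizerFast.from_pretrained("t5-small")
ids = tok("def add(a, b): return a + b").input_ids
print(ids)                              # a list of integers indexing the vocabulary
print(tok.convert_ids_to_tokens(ids))   # the frequent character groups those ids stand for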

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
The tutorial looks not that great. I'm using google colab notebooks now to play on google's machines at https://colab.research.google.com/ and reading about the T5 transformer model which was the basis for the latest big free model, and is commonly used for translation:

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
This project provides a standard way to train on data _without_ storing it locally, which could make it much simpler to use google's TPUs: https://github.com/activeloopai/Hub

Re: [crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
The reason I picked a TPU, google-cloud-oriented tutorial is that google has a 1-month research program for access to higher-end tpus. So, if something basic is set up, then it can be upgraded for free for a month. Google cloud sdk is downloading at 20k/sec for me, so I'm thinking a good

[crazy][hobby][spam] Automated Reverse Engineering

2021-12-29 Thread k
Obviously it's everybody's duty to build this once you believe it's possible. I found this tutorial for finetuning a language model used for translation: https://pythonrepo.com/repo/gsarti-t5-flax-gcp I made this empty github repository that could hold some attempts: