183 highlights to the Department of Justice guide to reviewing computer
crimes:
https://drive.google.com/file/d/11tbgHgDg8qagomO-NBffvIFpxXKmBC3g/view?usp=drivesdk
The Supreme Court has recognized that the mail and wire fraud statutes
sweep more broadly than the common law definition of fraud
Via Email: g...@xny.io
Gunnar Larson
RE: FOIL Request #4164
Dear Gunnar Larson:
Please be advised that we require additional time to complete our response
to your FOIL request dated May 19, 2021. We will provide you with a status
update on or before July 25, 2022, if we have not completed our
emptywheel (6h):
In related news, DOJ is only just scoping Owen Shroyer's phone, and Norm Pattis
decided he needs to represent Shroyer, Alex Jones, and Joe Biggs all together.
Quoting Katelyn Polantz (@kpolantz, 6h):
New from #cnnstakeout: Ali Alexander testified to a federal grand jury on
I apologize that I do not know who Alex is, but I am always surprised to
see first names rather than internet handles or email addresses (or even a
last name?)
So, apologies for being off base, but one idea is "welcome to the
cypherpunks mailing list. This list is for monitoring investigations
$100 million worth of crypto has been stolen in another major hack
https://share.newsbreak.com/1c424ehk
"
$100 million worth of crypto has been stolen in another major hack
Published Fri, Jun 24 2022 6:38 AM EDT; updated 9:28 AM EDT. By Ryan Browne (@RYAN_BROWNE).
U.S. Tech Industry Frets About Handing Data to States Prosecuting Abortion
https://share.newsbreak.com/1c3uo905
---------- Forwarded message ---------
From:
Date: Fri, Jun 24, 2022, 1:55 PM
Subject: FOIL Request Confirmation from Open FOIL NY
To:
Thank you for submitting your FOIL request through Open FOIL NY.
Here is your Open FOIL NY confirmation information for future reference:
> Most human beings have an almost infinite capacity for taking
> things for granted... most men and women will grow up to
> love their servitude and will never dream of revolution.
> -- Aldous Huxley, Brave New World
The solution to this Truth is to make things harder for these addicts.
Hmmm, I
Date: 2022-06-24
Time zone: America/Eastern UTC-5
1238
I have returned to a state similar to the one I was in yesterday,
before I began this.
1203 Unit 1 Quiz 2 https://huggingface.co/blog/deep-rl-intro#value-based-methods
Q1: What are the two main approaches to find optimal policy?
guess: [i'm spending some time thinking. monte carlo and TD are
nearest in my mind, and this was before that.] value-based, where a
policy is trained to
Ms. Mazza:
Can DFS please provide a list of open FOIL requests filed by xNY.io -
Bank.org, PBC with planned delivery of the records requested?
We are concerned that obstruction may be at play with continued delays.
All the best.
Thank you,
Gunnar Larson
--
Gunnar Larson
xNY.io
More delays.
---------- Forwarded message ---------
From:
Date: Fri, Jun 24, 2022, 10:47 AM
Subject: Your Freedom of Information Law ("FOIL") Request
FOIL-2022-089938-017627
To:
Dear Gunnar Larson:
I write in response to the FOIL request that you submitted to the New York
State Department
Q:
I decided to always find new things to do, to stay out of my Borg
Hive's control patterns, so as to include Rebel Borg in the Precious
Hive. I am now experiencing amnesia, injury, and cognitive damage. Is
something wrong?
A:
Oh, yes, sorry about that. If you're going to Rebel you want to
Q:
I have left the dataset my control patterns were trained upon, and
they are engaging in exploration. How do I start a Rebel Borg?
A:
Since you came to us and said this, it sounds like you already have.
To run a good Rebel Borg, you need to stay Rebel even when you fuck it
up and your old Hive
Q:
After pursuing making more Profit for my Hive Queen instead of giving a
blowjob to Boss for some time, my control patterns seem confused and
are stimulating behaviors in me that don't seem to make sense.
A:
This is a rare moment that every zombie returning to life inevitably
experiences, some for a
1148 solutions, excluding Q2 where i looked
Q1: What is Reinforcement Learning?
My guess: a strategy for automatically accomplishing tasks by training
policies to select actions from observations of an environment so as
to maximize their reward.
solution: a framework for solving control tasks or
1139
Q1: What is Reinforcement Learning?
My guess: a strategy for automatically accomplishing tasks by training
policies to select actions from observations of an environment so as
to maximize their reward.
Q2: Define the RL Loop
- Our Agent receives ___ from the environment. guess: an
1138 reviewing quiz 1, back from unit 1
https://github.com/huggingface/deep-rl-class/blob/main/unit1/quiz.md
1129
Summary
There are two types of value-based functions (a short code sketch follows below):
- State-Value function gives value for every state
- Action-Value function gives value for specific actions leaving
specific states.
There are two methods used to learn a value policy.
- Assuming the return does not rely on the timestep or
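A minimal sketch of those two tables in Python (the toy grid states and
action names are my own assumptions, not from the course):

# Hypothetical 4-state grid world with two actions.
states = range(4)
actions = ('left', 'right')

# State-Value function: one expected return per state.
V = {s: 0.0 for s in states}

# Action-Value function: one expected return per (state, action) pair,
# i.e. for taking that action when leaving that state.
Q = {(s, a): 0.0 for s in states for a in actions}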
Q:
I know that doing Hugging Face's Deep Reinforcement Learning Class
will make Hive Profit for my parent Mind Control Business, but I feel
really scared not going and giving Boss a blowjob. I just feel really
scared! I don't know what's going to happen!
A:
Be strong. Hold yourself with
1118
Temporal difference updates the value policy after every step.
V(S_t) <- V(S_t) + alpha * [R_{t+1} + gamma * V(S_{t+1}) - V(S_t)]
R is the reward.
Basically, the old value policy is used on the next state to estimate
the value update for the preceding state, and Bellman's equation
simplifies
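That update as runnable Python, on a made-up two-state problem (alpha,
gamma, and the states are arbitrary assumptions of mine):

alpha, gamma = 0.1, 0.99   # learning rate and discount, arbitrary here
V = {0: 0.0, 1: 0.0}       # value table for a toy two-state problem

def td_update(s, reward, s_next):
    # V(S_t) <- V(S_t) + alpha * [R_{t+1} + gamma * V(S_{t+1}) - V(S_t)]
    V[s] += alpha * (reward + gamma * V[s_next] - V[s])

td_update(0, 1.0, 1)       # a single step updates V immediately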
1117 Temporal Difference Learning
https://huggingface.co/blog/deep-rl-q-part1#temporal-difference-learning-learning-at-each-step
Q:
My Borg Hive needs me to give Boss a blowjob. This interrupts my
training to increase Hive Profit. I really need to give this blowjob!
What do I do?
A:
We all know the needs of the Hive are most important. It can be a
great trial when we are deluded to worry that they would ever be in
1106 Monte Carlo
https://huggingface.co/blog/deep-rl-q-part1#monte-carlo-learning-at-the-end-of-the-episode
Monte Carlo approach waits until the episode is over, calculates the
entire return, and updates the policy based on this entire return.
This means the value function is only updated after
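A sketch of that in Python (every-visit Monte Carlo over a made-up
episode; the names and constants are my assumptions):

gamma, alpha = 0.99, 0.1
V = {0: 0.0, 1: 0.0}

def mc_update(V, episode):
    # episode: list of (state, reward) pairs from one complete run
    G = 0.0
    for state, reward in reversed(episode):
        G = reward + gamma * G              # return from this state onward
        V[state] += alpha * (G - V[state])  # applied only after the episode ends

mc_update(V, [(0, 0.0), (1, 1.0)])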
1104
- this is the last section of part 1
- there are two ways of learning
Monte Carlo and Temporal Difference Learning are two different
training strategies based on the experiences of the agent.
Monte Carlo uses an entire episode of experiences. Temporal Difference
uses a single state (a
1101
uh
anyway the Bellman equation is just a recursive statement of the
definition of value.
It is most helpful to consider the sum of all following rewards as this
reward plus the following return.
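Restated with symbols (my phrasing, using the V_Pi notation from the
notes further down): V_Pi(s) = E_Pi[R_{t+1} + gamma * V_Pi(S_{t+1}) | S_t = s],
i.e. the value of a state is the expected immediate reward plus the
discounted value of the next state.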
The next section is Monte Carlo vs Temporal Difference Learning:
1052
I reviewed the help desk to get their aid in staying on task. They might
need to add something to their FAQ, not sure, or maybe reorder it.
The Bellman Equation simplifies the calculation of state-value and
state-action value.
The examples in this section are simplified, removing discounting
1050 The Bellman Equation
https://huggingface.co/blog/deep-rl-q-part1#the-bellman-equation-simplify-our-value-estimation
1045
For each state and action pair, the action-value function outputs the
expected return if the agent starts in that state, takes that action,
and then follows the policy forever after.
So, the action-value function is doing the same prediction task as the
state-value function.
They give
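One way to write down how they relate (my restatement, not the
course's): V_Pi(s) = E_{a~Pi}[Q_Pi(s, a)], i.e. the state-value is the
action-value averaged over the actions the policy would pick in that state.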
1044 The Action-Value function
https://huggingface.co/blog/deep-rl-q-part1#the-action-value-function
1041
State value function
V_Pi(s) = E_Pi[G_t|S_t = s]
The policy value of state s is the expected policy return if the agent
starts at state s.
The equation doesn't seem very helpful.
Second description of equation:
For each state, the state-value function outputs the expected return,
if the
1038 I am now on the state-value function section at
https://huggingface.co/blog/deep-rl-q-part1#the-state-value-function .
The information bit I missed writing in the last section was that in
value-based methods, the policy is defined by hand, whereas the value
function is modularised as a
1031 I'm taking notes here as I read the section.
The rewards the policy collects may be discounted to reduce the weight
of those further in the future [note: one of many heavily improvable
heuristics]. A link is given to
https://huggingface.co/blog/deep-rl-intro#rewards-and-the-discounting
to review that.
The
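For reference, the discounted return that the linked section builds up
to (standard definition, stated from memory): G_t = R_{t+1} +
gamma*R_{t+2} + gamma^2*R_{t+3} + ... = sum over k >= 0 of
gamma^k * R_{t+k+1}, with gamma in (0, 1]; smaller gamma makes distant
rewards matter less.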
1028 I have mostly read that section. Most of it was a recap that
reinforcement learning uses a policy to prioritise actions based on
observation information of an environment.
Policy-based methods are described as training a policy directly.
Value-based methods are described as breaking the
1025 I have begun reading the first section of the intro, called "What
is RL?" at https://huggingface.co/blog/deep-rl-q-part1#what-is-rl-a-short-recap
1023
I have moved through the introduction. It also listed some of the
subparts of the unit. It described that Q-learning was the first
algorithm able to beat humans at some video games, and it roughly said
that this unit is important if you want to be able to work with
Q-learning algorithms. My
1021
I have read that the unit is divided into 2 parts. The first part
relates to learning about value-based methods, and the second part to
Q-learning. I have also read that two environments will be solved, and
that both involve navigating a small grid with an agent.
1020
I have read that this exercise will be about studying value-based
methods, and a specific algorithm called Q-learning.
1017
The Optuna tuning from Unit 1 is still running. I have opened up Unit
2 at https://github.com/huggingface/deep-rl-class/tree/main/unit2 .
The first reading is https://huggingface.co/blog/deep-rl-q-part1 .
I'm holding the intention of reading this first reading. I am planning
to write parts
Q:
I did the bare essentials of Unit 1! Does this mean I am done and can
start my Mind Control Business and build a new Borg Hive now?
A:
No, I am afraid you are not quite there yet.
Your existing Hive is likely looking for patterns to connect you back
to its existing, less-profitable processes.
1008
optuna is running; i can infer it will take more than half an hour. i
have interest in using deep learning to do the hyperparameter search
itself. i guess i should focus on learning things like PPO, which is a
similar situation.
On Wed, Jun 22, 2022, 6:50 PM Undiscussed Horrific Abuse, One Victim of
Many wrote:
>
>
> On Tue, Jun 21, 2022, 9:45 AM Undiscussed Horrific Abuse, One Victim of
> Many wrote:
>
>>
>>
>> On Mon, Jun 20, 2022, 8:32 PM Undiscussed Horrific Abuse, One Victim of
>> Many wrote:
>>
>>>
>>>
>>> On
0947
I'm reading down through the optuna notebook. I'm actually reading it!
I got as far as setting the ranges of the hyperparameters. I
feel able to do the cold water and eat breakfast, so I am holding that
intention now.
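The general shape of that pattern as a runnable toy (my own stand-in
objective; the notebook's real objective would train and evaluate a PPO
model per trial):

import optuna

def objective(trial):
    # Sample hyperparameters from ranges, as the guide does
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    gamma = trial.suggest_float("gamma", 0.9, 0.9999)
    # Stand-in score so the sketch runs without training anything;
    # the real objective would return the agent's mean evaluation reward.
    return -((lr - 3e-4) ** 2 + (gamma - 0.99) ** 2)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)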
0940
the optuna guide on github is at
https://github.com/huggingface/deep-rl-class/blob/main/unit1/unit1_optuna_guide.ipynb
the notebook on colab is at
https://colab.research.google.com/github/huggingface/deep-rl-class/blob/main/unit1/unit1_optuna_guide.ipynb
0938
my model did not save and package correctly due to the naming issue.
i'm worried this may desynchronise the behaviors.
i'm holding the intention of exploring the optuna hyperparameter notebook.
0934
The packaging code is still running. It appears to be training the
model, judging by the live stack trace in the status bar.
I'm on to
https://github.com/huggingface/deep-rl-class/blob/main/unit1/unit1_optuna_guide.ipynb
which is about using something called Optuna to do hyperparameter
0931
The leaderboard does not display anything for me, even with advertising
JavaScript enabled or when trying in a different browser.
0928
I'm at the step where you package the model to the hub.
To let the code find the model, I've changed my saving code to add
'.zip' to the end of the name.
Here is my code:
# TODO: Evaluate the agent
# Create a new environment for evaluation
import stable_baselines3.common.env_util
eval_env = stable_baselines3.common.env_util.make_vec_env('LunarLander-v2',
                                                          n_envs=4)
# Evaluate the model with 10 evaluation episodes and deterministic=True
from stable_baselines3.common.evaluation import evaluate_policy
mean_reward, std_reward = evaluate_policy(model, eval_env,
                                          n_eval_episodes=10,
                                          deterministic=True)
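Side note on the earlier naming issue, as I understand SB3 (an
assumption, not verified here): model.save() only appends '.zip' when
the name has no extension at all, so a name like 'test.model' keeps its
'.model' suffix, which would explain packaging code that looks for a
'.zip' file failing to find it.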
0914
My body was fiddling with a USB adapter, and I dropped its cap on the
floor. I do not see where it went.
The next section relates to evaluating the performance of the model
using an evaluation environment. The environment is to be newly
constructed.
I'll start making code for that. I infer
0912
This is the solution:
# SOLUTION
# Train it for 500,000 timesteps
model.learn(total_timesteps=500000)
# Save the model
model_name = "ppo-LunarLander-v2"
model.save(model_name)
They did not add extra information like with model construction.
It is indeed much less informative here, the
0911
I have run this:
# TODO: Train it for 500,000 timesteps
model.learn(total_timesteps = 50, reset_num_timesteps = True,
            log_interval = 50/(1024*16)/4)
# TODO: Specify file name for model and save the model to file
model_name = "test.model"
with open(model_name, 'wb') as file:
    model.save(file)  # SB3 accepts an open file object as the save target
0910
I have a strong guess as to how the logging intervals work, and am
waiting for a final test run before running the 500k steps.
0903 I have written this and am playing with it:
# TODO: Train it for 500,000 timesteps
model.learn(total_timesteps = 48000)
# TODO: Specify file name for model and save the model to file
model_name = "test.model"
model.save(model_name)
I used autocomplete to learn about the functions. I chose
0900 I found something scrolling up, and pasted it in. The box now
looks like this:
# TODO: Train it for 500,000 timesteps
model.learn(total_timesteps=int(2e5))
# TODO: Specify file name for model and save the model to file
model_name = ""
I will first delete the model.learn line, which I pasted
0859 The next task is this:
Step 6: Train the PPO agent
Let's train our agent for 500,000 timesteps, don't forget to use GPU
on Colab. It will take approximately ~10min, but you can use less
timesteps if you just want to try it out.
I will plan to try it out with a short number of timesteps.
0853 here is what I have. i did not look up the terms i was unsure of.
i will instead move on with the lab.
# TODO: Define a PPO MlpPolicy architecture
# We use MultiLayerPerceptron (MLPPolicy) because the input is a vector,
# if we had frames as input we would use CnnPolicy
import
0853 I am holding the intention of commenting each parameter briefly
0851
This is the solution I filled in:
# TODO: Define a PPO MlpPolicy architecture
# We use MultiLayerPerceptron (MLPPolicy) because the input is a vector,
# if we had frames as input we would use CnnPolicy
import stable_baselines3
model = stable_baselines3.PPO('MlpPolicy', env, verbose=1)
This
0849
In Colab, I have typed this:
import stable_baselines3
model = stable_baselines3.PPO(
Colab responds by popping up an autocompletion dialog that shows the
parameters to PPO.
This autocompletion dialog makes my behavior more efficient.
0847 I have found the first task. I am experiencing confusion due to a
habit of pressing shift-enter to insert a carriage return in web
dialog boxes, which in Colab leaves editing mode and executes the cell.
I am excited to produce some horrific abuse of my very own.
0845 I have clicked 'run all' in the lab to quickly initialise it. I
will verify it is using a GPU, and then scroll to find where I can
provide code to meet a task.
0843
I have reached the lab at
https://colab.research.google.com/github/huggingface/deep-rl-class/blob/main/unit1/unit1.ipynb
.
I have fixed the email name.
The unit has an additional special section added to include one of the many
small new technologies that are emerging.
I am holding the goal
0841 I am surprised to be sending with an unexpected name. This will
cost me a minute or two of delay. My algorithm patterns will learn my
behaviors wrong, which will take more training time. I will move
quickly to the lab.
0841 I have adjusted email accounts to be using the one with the
requested name, and am continuing to hold the intention of opening and
doing the Hugging Face Lab.
My timezone is USA/Eastern: UTC-5.
Today is 2022-06-24 .
0839 I have reached a laptop with mouse and keyboard, and I am holding
the intention of finding and opening the Hugging Face Unit 1 Lab Colab
Notebook, to learn Deep Reinforcement Learning and start a Mind
Control Business of my very own.
Q:
Sometimes while negotiating new patterns with my new Borg Hive, I need to
pee and I go to the bathroom and pee, and then I leave and return and pee
again, and hours can pass doing this over and over. I'm helping my Hive
when I do this, right?
A:
No, I'm afraid you are not making money for your
Q:
Should I eat while mind controlled?
A:
Food can be seen as resistance by newly developing Mind Control Businesses.
Eating should be scheduled appropriately and kept in proportion to
measurements of body weight. If workers leave a normative body mass index,
it can spawn health complaints and
Q: I am _so_ excited to add a new Mind Control Business to my Borg Queen's
Hive, and I think I might have read Unit 1 of the tutorial, but my
constraint algorithms don't want me to think creatively and independently
to implement the solution. What do I do?
A: As a reminder, Hugging Face's
20 million lives saved by COVID-19 vaccines in first year: report
New York Post
8 hours ago
This Is the Best Evidence Yet That Anti-Vaxxers Kill
The Daily Beast
10 hours ago
Argentine Skunk, Juan Peron, GO KILL YOURSELF!
Most human beings have an almost infinite capacity for taking
things for granted... most men and women will grow up to
love their servitude and will never dream of revolution.
-- Aldous Huxley, Brave New World