Re: [ot][spam] Behavior Log For Control Data: HFRL Unit 1 Lab

Undiscussed Horrific Abuse, One Victim of Many Fri, 24 Jun 2022 06:00:03 -0700

0859 The next task is this:

Step 6: Train the PPO agent 🏃
Let's train our agent for 500,000 timesteps, don't forget to use GPU
on Colab. It will take approximately ~10min, but you can use less
timesteps if you just want to try it out.


I plan to try it out with a small number of timesteps. My first
approach to finding out how to do this will be to scroll up in the lab.
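
A rough sketch of what a shorter run might look like. Several things here
are assumptions rather than anything stated in the lab text quoted above:
that the lab trains with stable-baselines3's PPO, that "CartPole-v1" is an
acceptable stand-in environment, that 20,000 is a reasonable "short" number
of timesteps, and that run time scales roughly linearly with timesteps. The
helper names estimated_minutes and train_briefly are hypothetical.

```python
def estimated_minutes(timesteps, ref_timesteps=500_000, ref_minutes=10.0):
    """Scale the lab's rough figure (500,000 timesteps in about 10 min).

    Assumes run time grows linearly with timesteps, which is only an
    approximation (setup and logging overhead are ignored).
    """
    return ref_minutes * timesteps / ref_timesteps


def train_briefly(total_timesteps=20_000, env_id="CartPole-v1"):
    """Train a PPO agent for a reduced number of timesteps.

    Assumption: stable-baselines3 is installed and is what the lab uses.
    env_id is a stand-in; the lab's actual environment may differ.
    """
    # Imported here so estimated_minutes works without the library.
    from stable_baselines3 import PPO

    model = PPO("MlpPolicy", env_id, verbose=1)
    # total_timesteps is the knob the lab says can be reduced.
    model.learn(total_timesteps=total_timesteps)
    return model
```

Under these assumptions, estimated_minutes(20_000) suggests a run of well
under a minute, and train_briefly() would return the trained model.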
