Hi dear statisticians,

I am trying to implement a simulation from the book "The Elements of Statistical Learning" by Hastie et al. My problem is that I don't understand how to generate the pseudo-data as they did. The book says:

/For each of N = 100 samples, we generated p standard Gaussian features X with pairwise correlation 0.2. The outcome Y was generated according to a linear model/

Y = \sum_{j=1}^p X_j*b_j + sigma*Epsilon

(Sorry, I don't know if a math mode exists here.)

/where Epsilon was generated from a standard Gaussian distribution. For each dataset, the set of coefficients b_j were also generated from a standard Gaussian distribution. We investigated p = 20, 100 and 1000. The standard deviation sigma was chosen in each case so that the signal-to-noise ratio Var[E(Y|X)]/sigma² equaled 2./
So far I have managed to generate the X's, the Epsilons and the b's. What I don't get is how I'm meant to generate Y without knowing sigma, when, according to the description of sigma, I need Y to compute it. Can someone please help me? What am I not understanding here?

Thanks and best regards!
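
One possible way out of the apparent circularity, offered as an assumption about what the book means rather than a confirmed reading: E(Y|X) = \sum_j X_j*b_j does not involve sigma or Epsilon, so for a fixed draw of the b_j its variance over X is b' Sigma b, where Sigma has ones on the diagonal and 0.2 elsewhere. Setting Var[E(Y|X)]/sigma² = 2 then gives sigma = sqrt(b' Sigma b / 2), which can be computed before Y. Below is a minimal Python sketch under that assumption. The function name `simulate` and its parameters `rho`, `snr` and `seed` are made up for illustration, and the shared-factor trick used to get equicorrelated features is just one convenient construction, not necessarily the one the authors used.

```python
import math
import random


def simulate(n=100, p=20, rho=0.2, snr=2.0, seed=0):
    """Draw one dataset (X, y) plus the coefficients b and the noise sd sigma."""
    rng = random.Random(seed)

    # Coefficients b_j ~ N(0, 1), drawn once per dataset.
    b = [rng.gauss(0.0, 1.0) for _ in range(p)]

    # Assumption: Var[E(Y|X)] is taken over X with b held fixed, i.e. b' Sigma b.
    # With Sigma = (1 - rho) * I + rho * 11' this equals
    # (1 - rho) * sum(b_j^2) + rho * (sum b_j)^2.
    signal_var = (1.0 - rho) * sum(bj * bj for bj in b) + rho * sum(b) ** 2
    sigma = math.sqrt(signal_var / snr)

    # Equicorrelated standard Gaussians via a shared factor:
    # X_j = sqrt(rho) * Z0 + sqrt(1 - rho) * Z_j has variance 1
    # and pairwise correlation rho.
    a, c = math.sqrt(rho), math.sqrt(1.0 - rho)
    X, y = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)
        row = [a * shared + c * rng.gauss(0.0, 1.0) for _ in range(p)]
        noise = rng.gauss(0.0, 1.0)
        X.append(row)
        y.append(sum(x * bj for x, bj in zip(row, b)) + sigma * noise)
    return X, y, b, sigma


if __name__ == "__main__":
    X, y, b, sigma = simulate(n=100, p=20)
    print("sigma =", sigma)
```

The same call with p=100 or p=1000 would cover the other two settings. If the book instead intends the empirical variance of the fitted signal in each sample, sigma could be derived from the sample variance of the X*b values in the same way.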