On Tue, 29 Dec 2020 02:52:15 -0800 (PST), Priya Singh wrote:
[snip]
> I have two spectra with wavelength, flux, and error on flux. I want to
> find out the variability of these two spectra based on the 2 sample
> Chi-square test. I am using following code:
>
> def compute_chi2_var(file1,file2,zemi,vmin,vmax):
>     w1,f1,e1,c1,vel1 = get_spec_vel(dir_data+file1,zemi)
>     id1 = np.where(np.logical_and(vel1 >= vmin, vel1 < vmax))[0]
>     w2,f2,e2,c2,vel2 = get_spec_vel(dir_data+file2,zemi)
>     id2 = np.where(np.logical_and(vel2 >= vmin, vel2 < vmax))[0]
>     f_int = interp1d(w1[id1], f1[id1]/c1[id1], kind='cubic')
>     e_int = interp1d(w1[id1], e1[id1]/c1[id1], kind='cubic')
>     f_obs,e_obs = f_int(w2[id2]), e_int(w2[id2])
>     f_exp, e_exp = f2[id2]/c2[id2], e2[id2]/c2[id2]
>     e_net = e_obs**2 + e_exp**2
>     chi_square = np.sum( (f_obs**2 - f_exp**2)/e_net )
>     dof = len(f_obs) - 1
>     pval = 1 - stats.chi2.cdf( chi_square, dof)
>     print('%.10E' % pval)
>
> NN = 320
> compute_chi2_var(file7[NN],file14[NN],zemi[NN],vmin[NN],vmax[NN])
>
>
> I am running this code on many files, and I want to grab those pair of
> spectra where, the p-value of chi-squa is less than 10^(-8), for the
> change to be unlikely due to a random occurrence.
>
> Is my code right concept-wise? Because the chi-squ value is coming out
> to be very large (positive and negative), such that my p-value is
> always between 1 and 0 which I know from other's results not correct.
>
> Can anyone suggest me is the concept of 2-sample chi-squ applied by me
> is correct or not?

1. This is not really a Python question, is it?

2. Recommendation: test your chi-squared code on simpler sample data.

3. Observation: P-values *are* normally between 0 and 1.

4. Observation: chi-squared values are never negative.

5. Recommendation: Learn a little about the chi-squared distribution
   (but not on a Python newsgroup). The chi-squared distribution with
   N degrees of freedom is the distribution expected for a quantity
   that is the sum of the squares of N normally distributed random
   variables with mean 0 and standard deviation 1. If you expect f_obs
   to equal f_exp plus some normally distributed noise with mean 0 and
   standard deviation sigma, then (f_obs-f_exp)/sigma should be
   normally distributed with mean 0 and standard deviation 1.

6. Observation: (f_obs**2 - f_exp**2)/e_net is probably not what you
   want, since it can be negative. You probably want something like
   (f_obs-f_exp)**2/e_net. But don't take my word for it.

Good luck.

-- 
To email me, substitute nowhere->runbox, invalid->com.
-- 
https://mail.python.org/mailman/listinfo/python-list
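
To make points 2 and 6 above concrete, here is a minimal sketch of the statistic computed on simple synthetic data instead of real spectra. It is only an illustration under stated assumptions: the function name chi2_two_sample and its signature are made up for this example, both fluxes are assumed to be already sampled on the same wavelength grid, the two errors are assumed independent and Gaussian, and the degrees of freedom default to the number of points (no fitted parameters). It uses stats.chi2.sf rather than 1 - stats.chi2.cdf, because the survival function keeps precision for very small p-values such as the 10^(-8) threshold mentioned in the question.

```python
import numpy as np
from scipy import stats


def chi2_two_sample(f_obs, e_obs, f_exp, e_exp, dof=None):
    """Chi-squared statistic and p-value for two fluxes on a common grid.

    Hypothetical helper for illustration only. All inputs are 1-D
    sequences of equal length; e_obs and e_exp are 1-sigma errors.
    """
    f_obs = np.asarray(f_obs, dtype=float)
    e_obs = np.asarray(e_obs, dtype=float)
    f_exp = np.asarray(f_exp, dtype=float)
    e_exp = np.asarray(e_exp, dtype=float)

    # Variance of the difference, assuming independent errors.
    var = e_obs**2 + e_exp**2

    # Square the difference, not the difference of the squares,
    # so every term (and therefore the sum) is non-negative.
    chi_square = np.sum((f_obs - f_exp) ** 2 / var)

    if dof is None:
        # Assumption: no parameters were fitted, so dof = number of points.
        dof = f_obs.size

    # Survival function = 1 - cdf, but accurate for tiny p-values.
    pval = stats.chi2.sf(chi_square, dof)
    return chi_square, pval


# Sanity check on simple data: two noisy copies of the same flat "spectrum".
rng = np.random.default_rng(0)
n = 200
sigma = np.full(n, 0.1)
a = 1.0 + rng.normal(0.0, sigma)
b = 1.0 + rng.normal(0.0, sigma)

chi_square, pval = chi2_two_sample(a, sigma, b, sigma)
print("chi2 = %.2f for %d points, p = %.3E" % (chi_square, n, pval))
```

For two spectra that differ only by noise of the stated size, chi_square should come out roughly comparable to the number of points and the p-value should not be tiny. If the test data already give a negative or enormous statistic, the problem is in the formula or the error estimates, not in the spectra.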