Re: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe
FYI, binaries linking OpenBLAS should add this patch in some way: https://github.com/numpy/numpy/pull/4580

Cliffs: linking OpenBLAS prevents parallelization via threading or multiprocessing. I just wasted a bunch of time figuring that out ... (though it's well documented in numerous Stack Overflow questions, too bad none of them reached us)

On 04/01/2014 02:59 AM, Matthew Brett wrote:
> Hi,
>
> On Wed, Mar 26, 2014 at 11:34 AM, Julian Taylor jtaylor.deb...@googlemail.com wrote:
>> On 26.03.2014 16:27, Olivier Grisel wrote:
>>> Hi Carl,
>>>
>>> I installed Python 2.7.6 64 bits on a Windows server instance from Rackspace cloud, ran get-pip.py, and could then successfully install the numpy and scipy wheel packages from your Google Drive folder. I tested dot products and scipy.linalg.svd and they work as expected.
>>>
>>> Would it make sense to embed the BLAS and LAPACK header files as part of this numpy wheel, and make numpy.distutils.system_info return the lib and include folders pointing to the embedded libopenblas.dll and header files, so as to make third-party libraries directly buildable against those?
>>
>> As for using OpenBLAS by default in binary builds: no. The pthread OpenBLAS build is now fork safe, which is great, but it is still not reliable enough for a default. E.g. the current latest release 0.2.8 still has one crash bug in dgemv [1], and wrong results in zherk/zher2 [2] and dgemv/cgemv [3]. git head has the former four fixed but still has wrong results for cgemv.
>
> I noticed that Carl was only getting three test failures on scipy - are these related?
>
> ======================================================================
> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4))
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest
>     self.test(*self.arg)
>   File "D:\devel\py27\lib\site-packages\scipy\linalg\tests\test_decomp.py", line 642, in eigenhproblem_general
>     assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype])
>   File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 811, in assert_array_almost_equal
>     header=('Arrays are not almost equal to %d decimals' % decimal))
>   File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare
>     raise AssertionError(msg)
> AssertionError:
> Arrays are not almost equal to 4 decimals
> (mismatch 100.0%)
>  x: array([ 0.,  0.,  0.], dtype=float32)
>  y: array([ 1.,  1.,  1.])
>
> ======================================================================
> FAIL: Tests for the minimize wrapper.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest
>     self.test(*self.arg)
>   File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 435, in test_minimize
>     self.test_powell(True)
>   File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 209, in test_powell
>     atol=1e-14, rtol=1e-7)
>   File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 1181, in assert_allclose
>     verbose=verbose, header=header)
>   File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare
>     raise AssertionError(msg)
> AssertionError:
> Not equal to tolerance rtol=1e-07, atol=1e-14
> (mismatch 100.0%)
>  x: array([[ 0.75077639, -0.44156936,  0.47100962],
>        [ 0.75077639, -0.44156936,  0.48052496],
>        [ 1.50155279, -0.88313872,  0.95153458],...
>  y: array([[ 0.72949016, -0.44156936,  0.47100962],
>        [ 0.72949016, -0.44156936,  0.48052496],
>        [ 1.45898031, -0.88313872,  0.95153458],...
>
> ======================================================================
> FAIL: Powell (direction set) optimization routine
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest
>     self.test(*self.arg)
>   File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 209, in test_powell
>     atol=1e-14, rtol=1e-7)
>   File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 1181, in assert_allclose
>     verbose=verbose, header=header)
>   File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare
>     raise AssertionError(msg)
> AssertionError:
> Not equal to tolerance rtol=1e-07, atol=1e-14
> (mismatch 100.0%)
>  x: array([[ 0.75077639, -0.44156936,  0.47100962],
>        [ 0.75077639, -0.44156936,  0.48052496],
>        [ 1.50155279, -0.88313872,  0.95153458],...
>  y: array([[ 0.72949016, -0.44156936,  0.47100962],
>        [ 0.72949016, -0.44156936,  0.48052496],
>        [ 1.45898031, -0.88313872,  0.95153458],...
Re: [Numpy-discussion] ANN: NumPy 1.8.1 release
On Wed, Apr 2, 2014 at 7:52 AM, Matthew Brett matthew.br...@gmail.com wrote:
> Hi,
>
> On Tue, Apr 1, 2014 at 4:46 PM, David Cournapeau courn...@gmail.com wrote:
>> On Wed, Apr 2, 2014 at 12:36 AM, Nathaniel Smith n...@pobox.com wrote:
>>> On Tue, Apr 1, 2014 at 11:58 PM, David Cournapeau courn...@gmail.com wrote:
>>>> On Tue, Apr 1, 2014 at 6:43 PM, Nathaniel Smith n...@pobox.com wrote:
>>>>> On Tue, Apr 1, 2014 at 6:26 PM, Matthew Brett matthew.br...@gmail.com wrote:
>>>>>> I'm guessing that LOAD_WITH_ALTERED_SEARCH_PATH means that a DLL loaded via:
>>>>>>
>>>>>>     hDLL = LoadLibraryEx(pathname, NULL, LOAD_WITH_ALTERED_SEARCH_PATH);
>>>>>>
>>>>>> will in turn (by default) search for its dependent DLLs in their own directory. Or maybe in the directory of the first DLL to be loaded with LOAD_WITH_ALTERED_SEARCH_PATH; damned if I can follow the documentation. Looking forward to doing my tax return after this.
>>>>>>
>>>>>> But - anyway - that means that any extensions in the DLLs directory will get their dependencies from the DLLs directory, but that is only true for extensions in that directory.
>>>>>
>>>>> So in conclusion, if we just drop our compiled dependencies next to the compiled module files then we're good, even on older Windows versions? That sounds much simpler than previous discussions, but good news if it's true...
>>>>
>>>> That does not work very well in my experience:
>>>> - numpy has extension modules in multiple directories, so we would need to copy the dlls into multiple subdirectories
>>>> - copying dlls means that Windows will load that dll multiple times, with all the ensuing problems (I don't know for MKL/OpenBLAS, but we've seen serious issues when doing something similar for the hdf5 dll and pytables/h5py)
>>>
>>> We could just ship all numpy's extension modules in the same directory if we wanted. It would be pretty easy to stick some code at the top of numpy/__init__.py to load them from numpy/all_dlls/ and then slot them into the appropriate places in the package namespace.
>>>
>>> Of course scipy and numpy will still both have to ship BLAS etc., and so I guess it will get loaded at least twice in *any* binary install system. I'm not sure why this would be a problem (Windows, unlike Unix, carefully separates DLL namespaces, right?)
>>
>> It does not really matter here. For pure blas/lapack that may be ok, because the functions are stateless, but I would not count on it either.
>>
>> The cleanest solution I can think of is to have a 'privately shared DLL', but that would AFAIK require patching Python, so it is not really an option.
>
> David - do you know anything about private assemblies [1]? I never managed to make that work properly. Might they work for our problem? How about AddDllDirectory [2]?

I don't think it is appropriate to use those functions in a C extension module (as it impacts the whole process).

David

> Cheers,
> Matthew
>
> [1] http://msdn.microsoft.com/en-us/library/windows/desktop/ff951638(v=vs.85).aspx
> [2] http://msdn.microsoft.com/en-us/library/windows/desktop/hh310513(v=vs.85).asp
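[For illustration, a minimal sketch of the kind of preloading Nathaniel describes, assuming a hypothetical numpy/all_dlls/ directory next to numpy/__init__.py; the directory name and layout are illustrative, not an actual numpy convention:]

    # Hypothetical snippet for the top of numpy/__init__.py: load every DLL
    # found in numpy/all_dlls/ once, so extension modules elsewhere in the
    # package resolve their dependencies against the copies already mapped
    # into the process instead of searching the filesystem again.
    import os
    import ctypes

    _dll_dir = os.path.join(os.path.dirname(__file__), "all_dlls")  # hypothetical layout
    if os.name == "nt" and os.path.isdir(_dll_dir):
        for _name in sorted(os.listdir(_dll_dir)):
            if _name.lower().endswith(".dll"):
                # ctypes.WinDLL keeps a handle, so the DLL stays loaded
                # for the lifetime of the process.
                ctypes.WinDLL(os.path.join(_dll_dir, _name))

[This sidesteps the multiple-copies problem David raises only if every extension module ends up binding to the already-loaded DLLs rather than private copies.]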
Re: [Numpy-discussion] Standard Deviation (std): Suggested change for ddof default value
alex argri...@ncsu.edu wrote:
> I don't have any opinion about this debate, but I love the justification in that thread:
>
> "Any surprise that is created by the different default should be mitigated by the fact that it's an opportunity to learn something about what you are doing."

That is so true.

Sturla
Re: [Numpy-discussion] Standard Deviation (std): Suggested change for ddof default value
josef.p...@gmail.com wrote:
> pandas came later and thought ddof=1 is worth more than consistency.

Pandas is a data analysis package. NumPy is a numerical array package.

I think ddof=1 is justified for Pandas, for consistency with statistical software (SPSS et al.). For NumPy, there are many computational tasks where the Bessel correction is not wanted, so providing an uncorrected result is the correct thing to do. NumPy should be a low-level array library that does very little magic. Those who need the Bessel correction can multiply with sqrt(n/float(n-1)) or specify ddof. But that belongs in the docs.

Sturla

P.S. Personally I am not convinced "unbiased" is ever a valid argument, as the biased estimator has smaller error. This is from experience in marksmanship: I'd rather shoot a tight series with small systematic error than scatter my bullets wildly but unbiased on the target. It is the total error that counts. The series with the smallest total error gets the best score. It is better to shoot two series and calibrate the sight in between than use a calibration-free sight that doesn't allow us to aim. That's why I think classical statistics got this one wrong. Unbiased is never a virtue, but the smallest error is. Thus, if we are to repeat an experiment, we should calibrate our estimator just like a marksman calibrates his sight. But the aim should always be calibrated to give the smallest error, not an unbiased scatter. No one in their right mind would claim a shotgun is more precise than a rifle because it has smaller bias. But that is what applying the Bessel correction implies.
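[To make the two options concrete, a small example (not from the thread) showing that the sqrt(n/(n-1)) rescaling Sturla mentions reproduces ddof=1 exactly:]

    import numpy as np

    x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
    n = x.size

    s0 = x.std()        # uncorrected estimate (ddof=0, NumPy's default)
    s1 = x.std(ddof=1)  # Bessel-corrected estimate

    # Rescaling the uncorrected result gives the corrected one.
    assert np.isclose(s0 * np.sqrt(n / float(n - 1)), s1)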
Re: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe
2014-04-03 14:56 GMT+02:00 Julian Taylor jtaylor.deb...@googlemail.com:
> FYI, binaries linking OpenBLAS should add this patch in some way: https://github.com/numpy/numpy/pull/4580
>
> Cliffs: linking OpenBLAS prevents parallelization via threading or multiprocessing. I just wasted a bunch of time figuring that out ... (though it's well documented in numerous Stack Overflow questions, too bad none of them reached us)

You mean because of the default CPU affinity stuff in the default OpenBLAS? If we ship OpenBLAS with a Windows binary of numpy / scipy, we can compile OpenBLAS with the NO_AFFINITY=1 flag to avoid the issue.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
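[A rough way to see the affinity problem in practice is to compare the process's CPU set before and after importing numpy. This sketch (not from the thread) uses os.sched_getaffinity, which is Linux-only and requires Python 3.3+, so it postdates the builds discussed here:]

    import os

    before = os.sched_getaffinity(0)  # CPUs the process may run on

    import numpy  # assumed linked against an affinity-setting OpenBLAS build

    after = os.sched_getaffinity(0)
    if after != before:
        # OpenBLAS pinned the process: forked or threaded workers will all
        # share whatever CPUs remain in `after`.
        print("CPU affinity changed on import:", before, "->", after)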
[Numpy-discussion] mtrand normal sigma = 0 too restrictive
Traceback (most recent call last):
  File "./test_inroute_frame.py", line 1694, in <module>
    run_line (sys.argv)
  File "./test_inroute_frame.py", line 1690, in run_line
    return run (opt, cmdline)
  File "./test_inroute_frame.py", line 1115, in run
    burst.tr (xbits, freq=freqs[i]+burst.freq_offset, tau=burst.time_offset, phase=burst.phase)
  File "/home/nbecker/hn-inroute-fixed/transmitter.py", line 191, in __call__
    self.channel_out, self.complex_channel_gain = self.channel (mix_out)
  File "./test_inroute_frame.py", line 105, in __call__
    ampl = 10**(0.05*self.pwr_gen())
  File "./test_inroute_frame.py", line 148, in __call__
    pwr = self.gen()
  File "./test_inroute_frame.py", line 124, in __call__
    x = self.gen()
  File "/home/nbecker/sigproc.ndarray/normal.py", line 11, in __call__
    return self.rs.normal (self.mean, self.std, size)
  File "mtrand.pyx", line 1479, in mtrand.RandomState.normal (numpy/random/mtrand/mtrand.c:9359)
ValueError: scale <= 0

I believe this restriction is too restrictive; the allowed domain should be scale >= 0. There is nothing wrong with scale == 0 as far as I know. It's a convenient way to turn off the noise in my simulation.
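[For context, a minimal sketch (not Neal's code) of the behavior he wants: a degenerate normal with scale == 0 is deterministic and should just return the mean. The NumPy version quoted above rejects scale == 0, so the sketch falls back to the equivalent result by hand:]

    import numpy as np

    rs = np.random.RandomState(0)
    mean = 1.0

    try:
        samples = rs.normal(loc=mean, scale=0.0, size=4)
    except ValueError:
        # The 1.8-era check rejects scale == 0; since a zero-scale normal
        # has no spread, the desired result is just the mean repeated.
        samples = np.full(4, mean)

    print(samples)  # [ 1.  1.  1.  1.]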
Re: [Numpy-discussion] Standard Deviation (std): Suggested change for ddof default value
On Wed, Apr 2, 2014 at 10:06 AM, Sturla Molden sturla.mol...@gmail.com wrote:
> josef.p...@gmail.com wrote:
>> pandas came later and thought ddof=1 is worth more than consistency.
>
> Pandas is a data analysis package. NumPy is a numerical array package.
>
> I think ddof=1 is justified for Pandas, for consistency with statistical software (SPSS et al.). For NumPy, there are many computational tasks where the Bessel correction is not wanted, so providing an uncorrected result is the correct thing to do. NumPy should be a low-level array library that does very little magic. Those who need the Bessel correction can multiply with sqrt(n/float(n-1)) or specify ddof. But that belongs in the docs.
>
> Sturla
>
> P.S. Personally I am not convinced "unbiased" is ever a valid argument, as the biased estimator has smaller error. This is from experience in marksmanship: I'd rather shoot a tight series with small systematic error than scatter my bullets wildly but unbiased on the target. It is the total error that counts. The series with the smallest total error gets the best score. It is better to shoot two series and calibrate the sight in between than use a calibration-free sight that doesn't allow us to aim.

calibration == bias correction ?

> That's why I think classical statistics got this one wrong. Unbiased is never a virtue, but the smallest error is. Thus, if we are to repeat an experiment, we should calibrate our estimator just like a marksman calibrates his sight. But the aim should always be calibrated to give the smallest error, not an unbiased scatter. No one in their right mind would claim a shotgun is more precise than a rifle because it has smaller bias. But that is what applying the Bessel correction implies.

https://www.youtube.com/watch?v=i4xcEZZDW_I

I spent several days trying to figure out what Stata is doing for small-sample corrections to reduce the bias of the rejection interval with uncorrected variance estimates.

Josef
Re: [Numpy-discussion] Standard Deviation (std): Suggested change for ddof default value
> Sturla
>
> P.S. Personally I am not convinced "unbiased" is ever a valid argument, as the biased estimator has smaller error. This is from experience in marksmanship: I'd rather shoot a tight series with small systematic error than scatter my bullets wildly but unbiased on the target. It is the total error that counts. The series with the smallest total error gets the best score. It is better to shoot two series and calibrate the sight in between than use a calibration-free sight that doesn't allow us to aim. That's why I think classical statistics got this one wrong. Unbiased is never a virtue, but the smallest error is. Thus, if we are to repeat an experiment, we should calibrate our estimator just like a marksman calibrates his sight. But the aim should always be calibrated to give the smallest error, not an unbiased scatter. No one in their right mind would claim a shotgun is more precise than a rifle because it has smaller bias. But that is what applying the Bessel correction implies.

I agree with the point, and what makes it even worse is that ddof=1 does not even produce an unbiased standard deviation estimate. It produces an unbiased variance estimate, but the sqrt of this variance estimate is a biased standard deviation estimate; see http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation.

Bago
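[A quick simulation (not from the thread) illustrating Bago's point for normal samples: with ddof=1 the variance estimate averages to sigma^2, but its square root still averages below sigma; for n = 5 the classical correction factor c4 is about 0.94:]

    import numpy as np

    rng = np.random.RandomState(42)
    sigma, n, reps = 1.0, 5, 200000

    samples = rng.normal(0.0, sigma, size=(reps, n))
    var1 = samples.var(axis=1, ddof=1)

    print(var1.mean())           # ~1.00: the variance estimate is unbiased
    print(np.sqrt(var1).mean())  # ~0.94: the std estimate is still biased low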
Re: [Numpy-discussion] Standard Deviation (std): Suggested change for ddof default value
On Thu, Apr 3, 2014 at 2:21 PM, Bago mrb...@gmail.com wrote:
> [snip]
>
> I agree with the point, and what makes it even worse is that ddof=1 does not even produce an unbiased standard deviation estimate. It produces an unbiased variance estimate, but the sqrt of this variance estimate is a biased standard deviation estimate; see http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation.

But ddof=1 still produces a smaller bias than ddof=0.

I think the main point in stats is that without ddof the variance will be too small, and t-tests or similar will be liberal in small samples, or confidence intervals will be too short. (For statisticians who prefer tests that maintain their level and prefer to err on the conservative side.)

Josef
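[For reference, the sizes of the two biases have a closed form for i.i.d. normal samples (a standard result, not derived in the thread). Writing $s_0$ and $s_1$ for the ddof=0 and ddof=1 standard deviation estimates:]

$$
\mathbb{E}[s_1] = c_4(n)\,\sigma, \qquad
c_4(n) = \sqrt{\frac{2}{n-1}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)} < 1, \qquad
\mathbb{E}[s_0] = \sqrt{\frac{n-1}{n}}\,c_4(n)\,\sigma.
$$

[Since $\sqrt{(n-1)/n} < 1$, the ddof=0 estimate sits further below $\sigma$ than the ddof=1 estimate, consistent with Josef's claim; both biases vanish as $n \to \infty$.]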
[Numpy-discussion] ANN: Scipy 0.14.0 release candidate 1
Hi,

I'm pleased to announce the availability of the first release candidate of Scipy 0.14.0. Please try this RC and report any issues on the scipy-dev mailing list. A significant number of fixes for scipy.sparse went in after the beta release, so users of that module may want to test this release carefully.

Source tarballs, binaries and the full release notes can be found at https://sourceforge.net/projects/scipy/files/scipy/0.14.0rc1/. The final release will follow in one week if no new issues are found.

A big thank you to everyone who contributed to this release!

Ralf