Hi all-
I think I just found a memory leak in numpy, or maybe I just don’t understand
generators. Anyway, the following snippet will quickly eat a ton of RAM:
from numpy import ndindex
from numpy.random import randint

P = randint(0, 2, (20, 13))
for i in range(50):
    for ai in ndindex((2,)*13):
        j = P.dot(ai)
If you replace the last line with
Hi all,
Please critique my draft exploring the possibilities of adding group_by
support to numpy:
http://pastebin.com/c5WLWPbp
In nearly every project I work on, I require group_by functionality of some
sort. There are other libraries that provide this kind of functionality,
such as pandas for
This conversation gets discussed often with Numpy developers but since the
requirement for optimized Blas is pretty common these days, how about
distributing Numpy with OpenBlas by default? People who don't want optimized
BLAS or OpenBLAS can then edit the site.cfg file to add/remove. I can
Francesc
Congratulations and will definitely be benchmarking Numexpr soon.
Will similar performance improvements be seen with OpenBLAS as with MKL?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
For me, binary data wrt arrays means that data values are [0|1]. Is this
what is meant in "The compression process is carried out internally by
Blosc, a high-performance compressor that is optimized for binary data."?
26.01.2014 14:44, Dinesh Vadhia wrote:
This conversation gets discussed often with Numpy developers but
since the requirement for optimized Blas is pretty common these
days, how about distributing Numpy with OpenBlas by default? People
who don't want optimized BLAS or OpenBLAS can then
Hi Dinesh Vadhia,
* Dinesh Vadhia dineshbvad...@hotmail.com [2014-01-26]:
For me, binary data wrt arrays means that data values are [0|1]. Is this
what is meant in "The compression process is carried out internally by
Blosc, a high-performance compressor that is optimized for binary data."?
Hi Eelco
On Sun, 26 Jan 2014 12:20:04 +0100, Eelco Hoogendoorn wrote:
key1 = list('abaabb')
key2 = np.random.randint(0,2,(6,2))
values = np.random.rand(6,3)
print group_by((key1, key2)).median(values)
I agree that group_by functionality could be handy in numpy.
In the above example, what
On Sun, 26 Jan 2014 16:40:44 +0200, Pauli Virtanen wrote:
The Numpy Windows binaries distributed in the numpy project at
sourceforge.net are compiled with ATLAS, which should count as an
optimized BLAS. I don't recall what's the situation with OSX binaries,
but I'd believe they're with Atlas
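Which BLAS/LAPACK a particular NumPy build was linked against can be checked from Python with `numpy.show_config()`, which prints the build configuration. A minimal sketch follows; the helper name `blas_build_info` is hypothetical, and the layout of the printed text differs between NumPy versions, so it is only captured here, not parsed.

```python
import contextlib
import io

import numpy as np


def blas_build_info():
    """Return the text printed by np.show_config() (hypothetical helper)."""
    buffer = io.StringIO()
    # show_config() prints to stdout, so redirect stdout into a string buffer.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


info = blas_build_info()
print(info)
```

Looking for names such as ATLAS, OpenBLAS, MKL or Accelerate in the printed text shows which library, if any, the build picked up.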
An object of type GroupBy.
So a call to group_by does not return any consumable output directly. If
you want for instance the unique keys, or groups if you will, you can call
GroupBy.unique. In this case, for a tuple of input keys, you'd get a tuple
of unique keys back. If you want to compute
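The idea of an intermediate GroupBy object can be sketched with plain NumPy. This is not the code from the pastebin draft: the class below is a hypothetical stand-in that handles a single 1-D key array only, and it assumes that a reduction such as `mean` returns a (unique keys, reduced values) pair.

```python
import numpy as np


class GroupBy:
    """Hypothetical stand-in for the draft's GroupBy object (1-D keys only)."""

    def __init__(self, keys):
        keys = np.asarray(keys)
        # unique: sorted distinct keys; inverse: group index of each input element.
        self.unique, inverse = np.unique(keys, return_inverse=True)
        self.inverse = inverse.ravel()

    def mean(self, values):
        """Return (unique keys, per-group mean of values along axis 0)."""
        values = np.asarray(values, dtype=float)
        counts = np.bincount(self.inverse, minlength=len(self.unique))
        sums = np.zeros((len(self.unique),) + values.shape[1:])
        # np.add.at accumulates unbuffered, so repeated group indices add up.
        np.add.at(sums, self.inverse, values)
        # Reshape counts so the division broadcasts over trailing value axes.
        shape = (-1,) + (1,) * (values.ndim - 1)
        return self.unique, sums / counts.reshape(shape)


def group_by(keys):
    # Assumed convenience constructor, mirroring the name used in the thread.
    return GroupBy(keys)


g = group_by(list("abaabb"))
unique, means = g.mean(np.array([1.0, 2.0, 3.0, 5.0, 4.0, 6.0]))
print(unique, means)
```

Because the sorting work happens once in the constructor, the same `g` can be reused for several reductions over different value arrays.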
To follow up with an example as to why it is useful that a temporary object
is created, consider the following (taken from the radial reduction
example):
g = group_by(np.round(radius, 5).flatten())
pp.errorbar(
    g.unique,
    g.mean(sample.flatten())[1],
On 1/26/2014 12:02 PM, Stéfan van der Walt wrote:
what would the output of
``group_by((key1, key2))``
I'd expect something named groupby to behave as below.
Alan
def groupby(seq, key):
    from collections import defaultdict
    groups = defaultdict(list)
    for item in seq:
On 26.01.2014 18:06, Stéfan van der Walt wrote:
On Sun, 26 Jan 2014 16:40:44 +0200, Pauli Virtanen wrote:
The Numpy Windows binaries distributed in the numpy project at
sourceforge.net are compiled with ATLAS, which should count as an
optimized BLAS. I don't recall what's the situation with
Alan:
The equivalent of that in my current draft would be group_by(keys, values),
which is shorthand for group_by(keys).group(values); an optional values
argument to the constructor of GroupBy is directly bound to return an
iterable over the grouped values; but we often want to bind different
My comment is just on the name.
I'd expect something named `groupby`
to behave essentially like Mathematica's `GatherBy` command.
http://reference.wolfram.com/mathematica/ref/GatherBy.html
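To make the naming point concrete: a GatherBy-style function returns the gathered items themselves, one sublist per distinct key value, rather than a per-group statistic. A minimal pure-Python sketch, assuming the key is a callable applied to each item; the name `gather_by` is hypothetical and this is not a reconstruction of any code posted in the thread.

```python
def gather_by(seq, key):
    """Gather items into sublists that share the same key(item) value."""
    groups = {}
    for item in seq:
        # dicts keep insertion order, so groups appear in first-seen order.
        groups.setdefault(key(item), []).append(item)
    return list(groups.values())


gathered = gather_by([1, 2, 3, 4, 5, 6], key=lambda n: n % 2)
print(gathered)
```

A grpstats-style function would instead reduce each of these sublists to a summary value such as a mean.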
I think you are after something more like Matlab's grpstats:
not off topic at all; there are several matters of naming that I am not at
all settled on yet, and I don't think it is unimportant.
indeed, those are closely related functions, and I wasn't aware of them
yet, so that's some welcome additional perspective. The Mathematica
function differs in that
Julian Taylor jtaylor.deb...@googlemail.com wrote:
if this issue disqualifies accelerate, it also disqualifies openblas as
a default. openblas has the same issue, we stuck a big fat warning into
the docs (site.cfg) for this now as people keep running into it.
What? Last time I checked,
On 26.01.2014 22:33, Sturla Molden wrote:
Julian Taylor jtaylor.deb...@googlemail.com wrote:
if this issue disqualifies accelerate, it also disqualifies openblas as
a default. openblas has the same issue, we stuck a big fat warning into
the docs (site.cfg) for this now as people keep running
Julian Taylor jtaylor.deb...@googlemail.com wrote:
the use of GNU OpenMP is probably the problem; forking with gomp is
only possible in very limited circumstances.
see e.g. https://github.com/xianyi/OpenBLAS/issues/294
maybe it will work with clang's Intel-based OpenMP which should be