Re: [Numpy-discussion] Fwd: Numpy for data manipulation

2015-10-06 Thread Alex Rogozhnikov

Thanks for the comments; I've fixed the issues you named.

The code is now Python 2 and 3 compatible, I aliased numpy, and I used the 
better permutation inversion.
Special thanks for pointing out histogram equalization: I've added an 
example for images.
Some other 'visual' examples would probably help. I'll try to come up with 
something for the other points, but this is not simple.
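
For reference, histogram equalization of an integer grayscale image can be sketched in a few lines of numpy. This is only an illustration under assumptions: the function name `equalize` and the `n_levels` parameter are made up here, and it is not necessarily the code used in the post.

```python
import numpy as np

def equalize(image, n_levels=256):
    """Histogram-equalize an integer image with values in [0, n_levels)."""
    # Count how often each gray level occurs.
    hist = np.bincount(image.ravel(), minlength=n_levels)
    # Empirical CDF of the gray levels, with values in (0, 1].
    cdf = np.cumsum(hist) / image.size
    # Lookup table that maps each old level to its new level.
    lut = np.floor(cdf * (n_levels - 1)).astype(image.dtype)
    return lut[image]
```

The lookup table is indexed with the image itself, so the whole remapping is a single fancy-indexing step.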


(I left %matplotlib inline because it renders more appropriately.)

Alex.

02.10.15 10:50, Kiko wrote:



2015-10-02 9:48 GMT+02:00 Kiko <kikocorre...@gmail.com 
<mailto:kikocorre...@gmail.com>>:




2015-10-02 9:38 GMT+02:00 Alex Rogozhnikov
<alex.rogozhni...@yandex.ru <mailto:alex.rogozhni...@yandex.ru>>:

I would suggest

%matplotlib notebook

It will still fall back to a nice png, but you get an
interactive figure when it is live.


Amazing, thanks. I was using mpld3 for this.
(for some strange reason I need to put %matplotlib notebook
before each plot)


You should create a figure before each plot instead of putting
%matplotlib notebook
plt.figure()


The recommendation of inverting a permutation by
argsort'ing it, while it works, is suboptimal, as it takes
O(n log(n)) time, and you can do it in linear time:

Actually, there is (later in post) a linear solution using
bincount, but your code is definitely better. Thanks!

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org <mailto:NumPy-Discussion@scipy.org>
https://mail.scipy.org/mailman/listinfo/numpy-discussion







[Numpy-discussion] Fwd: Numpy for data manipulation

2015-10-01 Thread Alex Rogozhnikov
Hi, I have written up some numpy tips and tricks that I use, which may be 
interesting to you.

It is quite a long read, so I've split it into two parts:

http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html
http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html

Comments are welcome, especially if you know any other ways to make this 
code faster (or better).


Regards,
Alex.



Re: [Numpy-discussion] Fwd: Numpy for data manipulation

2015-10-02 Thread Alex Rogozhnikov

I would suggest

%matplotlib notebook

It will still fall back to a nice png, but you get an interactive figure 
when it is live.


Amazing, thanks. I was using mpld3 for this.
(for some strange reason I need to put %matplotlib notebook before each 
plot)


The recommendation of inverting a permutation by argsort'ing it, while 
it works, is suboptimal, as it takes O(n log(n)) time, and you can do 
it in linear time:

Actually, there is (later in the post) a linear solution using bincount, but 
your code is definitely better. Thanks!
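
For readers following along, a linear-time inversion of a permutation can be sketched as below, next to the argsort version. The variable names are illustrative only, and this is a sketch of the general technique rather than a quote of the code from the earlier message.

```python
import numpy as np

rng = np.random.default_rng(0)
perm = rng.permutation(10)

# O(n log n): sorting the permutation yields its inverse.
inverse_sorted = np.argsort(perm)

# O(n): scatter the positions 0..n-1 into the slots named by the permutation.
inverse_linear = np.empty_like(perm)
inverse_linear[perm] = np.arange(len(perm))

# Both approaches agree, and composing with perm gives the identity.
assert np.array_equal(inverse_sorted, inverse_linear)
assert np.array_equal(perm[inverse_linear], np.arange(len(perm)))
```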



[Numpy-discussion] Weighted percentile / quantile

2016-03-01 Thread Alex Rogozhnikov
Hi, 
I know this topic was raised long ago: 
https://mail.scipy.org/pipermail/numpy-discussion/2010-July/051851.html

There are also several questions on SO:
http://stackoverflow.com/questions/20601872/numpy-or-scipy-to-calculate-weighted-median
http://stackoverflow.com/questions/13546146/percentile-calculation-with-weighted-data
http://stackoverflow.com/questions/26102867/python-weighted-median-algorithm-with-pandas

The only working solution with numpy:
http://stackoverflow.com/questions/21844024/weighted-percentile-using-numpy
uses sorting. 

Are there better options at the moment (numpy/scipy/pandas)?
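
For context, the sort-based approach can be sketched roughly as follows. The function name `weighted_quantile` and the midpoint convention for the cumulative weights are assumptions made for illustration; other conventions give slightly different answers, and this is not necessarily the exact code from the linked answer.

```python
import numpy as np

def weighted_quantile(values, quantiles, weights):
    """Quantiles (in [0, 1]) of `values` with positive `weights`, via sort + cumsum."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Sort the values and carry the weights along.
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    # Place each sample at the midpoint of its share of the total weight.
    cumulative = (np.cumsum(weights) - 0.5 * weights) / weights.sum()
    # Interpolate linearly between the samples.
    return np.interp(quantiles, cumulative, values)
```

The sort makes this O(n log n); a partition-based implementation could in principle avoid the full sort.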

Cheers, 
Alex.


Re: [Numpy-discussion] Weighted percentile / quantile

2016-03-02 Thread Alex Rogozhnikov
Hi, Joe, 
> I am working (slowly) on upgrading the C code for partitioning with
> arbitrary arrays of real weights
It's really good to know there is some work in this direction. 

On 2 March 2016, at 6:27, Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com> wrote:

> Alex,
> 
> At the moment, there does not appear to be anything in numpy. However,
> I am working (slowly) on upgrading the C code for partitioning with
> arbitrary arrays of real weights. That will get `partition`, `median`,
> `percentile` to work with weights, as well as enabling weights for the
> automated bin estimators of `histogram`. `mean` already has an
> implementation of weights via `average`.
> 
> You may be interested in my original post to the mailing list here:
> https://mail.scipy.org/pipermail/numpy-discussion/2016-February/075000.html.
> Josef P. mentioned in one of his responses that statsmodels has a
> weighted quantile computation available as of PR 2707:
> https://github.com/statsmodels/statsmodels/pull/2707. That should
> effectively serve your purpose.

It's the same sort+cumsum approach, and even worse because it relies on 
aggregating.
Thanks for letting me know, but I'll definitely prefer the implementation from 
SO (until numpy supports weights).

Cheers, 
Alex
> 
>    -Joe
> 
> 
> On Tue, Mar 1, 2016 at 6:03 PM, Alex Rogozhnikov
> <alex.rogozhni...@yandex.ru> wrote:
>> Hi,
>> I know the topic was already raised a long ago:
>> https://mail.scipy.org/pipermail/numpy-discussion/2010-July/051851.html
>> 
>> There are also several questions on SO:
>> http://stackoverflow.com/questions/20601872/numpy-or-scipy-to-calculate-weighted-median
>> http://stackoverflow.com/questions/13546146/percentile-calculation-with-weighted-data
>> http://stackoverflow.com/questions/26102867/python-weighted-median-algorithm-with-pandas
>> 
>> The only working solution with numpy:
>> http://stackoverflow.com/questions/21844024/weighted-percentile-using-numpy
>> uses sorting.
>> 
>> Are there better options at the moment (numpy/scipy/pandas)?
>> 
>> Cheers,
>> Alex.
>> 


Re: [Numpy-discussion] Fortran order in recarray.

2017-02-22 Thread Alex Rogozhnikov
Hi Nathaniel, 


> pandas

Yup, the idea was to have a minimal pandas.DataFrame-like storage (I used 
DataFrame for a long time), but without the irritating problems of its row 
indexing and some other problems, such as its interaction with matplotlib.

> A dict of arrays?


That's what I started from and implemented, but at some point I decided that 
I was reinventing the wheel and that numpy must already have something. In 
principle, I can ignore this 'column-oriented' storage requirement, but it may 
turn out to be quite slow if the dtype's itemsize is large.
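
To make the dict-of-arrays idea concrete, here is a minimal sketch. The class name `ColumnTable` and its interface are hypothetical and only meant to illustrate column-by-column storage; it is not an existing numpy feature and not the implementation mentioned above.

```python
import numpy as np

class ColumnTable:
    """Minimal table: named columns of equal length, no row labels."""

    def __init__(self, **columns):
        # Each column is stored as its own array, so dtypes may differ.
        self.columns = {name: np.asarray(col) for name, col in columns.items()}
        lengths = {len(col) for col in self.columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")

    def __len__(self):
        return len(next(iter(self.columns.values()))) if self.columns else 0

    def __getitem__(self, key):
        # A string selects one column; a slice, boolean mask or index array
        # is passed to numpy and selects the same rows in every column.
        if isinstance(key, str):
            return self.columns[key]
        return ColumnTable(**{name: col[key] for name, col in self.columns.items()})


table = ColumnTable(a=np.arange(10), b=np.arange(10, 20) * 0.5)
subset = table[table["a"] > 5]  # rows where column 'a' exceeds 5
```

Row selection follows plain numpy indexing rules here, so anything numpy rejects is rejected as well.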

Suggestions are welcome.

Another strange question:
in general, it is assumed that once a numpy.array is created, its shape does 
not change. 
But if I want to keep the same recarray and change its dtype and/or shape, is 
there a way to do this?

Thanks, 
Alex.



> On 22 Feb 2017, at 3:53, Nathaniel Smith <n...@pobox.com> wrote:
> 
> On Feb 21, 2017 3:24 PM, "Alex Rogozhnikov" <alex.rogozhni...@yandex.ru 
> <mailto:alex.rogozhni...@yandex.ru>> wrote:
> Ah, got it. Thanks, Chris!
> I thought recarray can be only one-dimensional (like tables with named 
> columns).
> 
> Maybe it's better to ask directly what I was looking for: 
> something that works like a table with named columns (but no labelling for 
> rows), and keeps data (of different dtypes) in a column-by-column way (and 
> this is numpy, not pandas). 
> 
> Is there such a magic thing?
> 
> Well, that's what pandas is for...
> 
> A dict of arrays?
> 
> -n


[Numpy-discussion] Fortran order in recarray.

2017-02-21 Thread Alex Rogozhnikov
Hi, 

A question about numpy.recarray:
There is an `order` parameter in the constructor 
https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.recarray.html 
but it seems to have no effect:

import numpy
x = numpy.recarray(dtype=[('a', int), ('b', float)], shape=[1000], order='C')
y = numpy.recarray(dtype=[('a', int), ('b', float)], shape=[1000], order='F')
print(numpy.array(x.ctypes.get_strides()))  # [16]
print(numpy.array(y.ctypes.get_strides()))  # [16]

Is this intended behavior or a bug?

Thanks,
Alex.


Re: [Numpy-discussion] Fortran order in recarray.

2017-02-22 Thread Alex Rogozhnikov
Hi Stephan, 
thanks for the note. The progress over the last two years wasn't impressive 
IMO, but I hope you'll manage.

As you suggest, I'll have a look at xarray too, now that I see xarray.Dataset. 
I was sure that it doesn't work with non-homogeneous data at all; clearly I 
need to update my opinion.



> On 22 Feb 2017, at 20:55, Stephan Hoyer <sho...@gmail.com> wrote:
> 
> On Wed, Feb 22, 2017 at 8:57 AM, Alex Rogozhnikov <alex.rogozhni...@yandex.ru 
> <mailto:alex.rogozhni...@yandex.ru>> wrote:
> Pandas may be nice, if you need a report, and you need get it done tomorrow. 
> Then you'll throw away the code. When we initially used pandas as main data 
> storage in yandex/rep, it looked like an good idea, but a year later it was 
> obvious this was a wrong decision. In case when you build data pipeline / 
> research that should be working several years later (using some other 
> installation by someone else), usage of pandas shall be minimal. 
> 
> The pandas development team (myself included) is well aware of these issues. 
> There are long term plans/hopes to fix this, but there's a lot of work to be 
> done and some hard choices to make:
> https://github.com/pandas-dev/pandas/issues/1 
> <https://github.com/pandas-dev/pandas/issues/1>
> https://github.com/pandas-dev/pandas/issues/13862 
> <https://github.com/pandas-dev/pandas/issues/13862> 
> 
>  That's why I am looking for a reliable pandas substitute, which should be: 
> - completely consistent with numpy and should fail when this wasn't 
> implemented / impossible
> - fewer new abstractions, nobody wants to learn 
> one-more-way-to-manipulate-the-data, specifically other researchers
> - it may be less convenient for interactive data mungling
>   - in particular, less methods is ok
> - written code should be interpretable, and hardly can be misinterpreted.
> - not super slow, 1-10 gigabytes datasets are a normal situation
> 
> This has some overlap with our motivations for writing Xarray 
> (http://xarray.pydata.org <http://xarray.pydata.org/>), so I encourage you to 
> take a look. It still might be more complex than you're looking for, but we 
> did try to clean up the really ambiguous APIs from pandas like indexing.


Re: [Numpy-discussion] Fortran order in recarray.

2017-02-22 Thread Alex Rogozhnikov

> On 22 Feb 2017, at 20:39, josef.p...@gmail.com wrote:
> 
> 
> 
> On Wed, Feb 22, 2017 at 11:57 AM, Alex Rogozhnikov 
> <alex.rogozhni...@yandex.ru <mailto:alex.rogozhni...@yandex.ru>> wrote:
> Hi Matthew, 
> maybe it is not the best place to discuss problems of pandas, but to show 
> that I am not missing something, let's consider a simple example.
> 
> import numpy
> import pandas
> 
> # simplest DataFrame
> x = pandas.DataFrame(dict(a=numpy.arange(10), b=numpy.arange(10, 20)))
> 
> # simplest indexing. Can you predict the results without looking at the comments?
> x[:2]              # returns the first two rows, as expected
> x[[0, 1]]          # returns a copy of x, the whole dataframe
> x[numpy.array(2)]  # fails with IndexError: indices are out-of-bounds (can you guess why?)
> x[[0, 1], :]       # unhashable type: list
> 
> just in case - I know about .loc and .iloc, but when you write code with many 
> subroutines, you concentrate on numpy inputs, and at some point you simply 
> forget to convert some of the data you operated with to numpy and it 
> continues to work, but it yields wrong results (while you tested everything, 
> but you tested this for numpy). Checking all the inputs in each small 
> subroutine is strange.
> 
> Ok, a bit more:
> x[x['a'] > 5]             # works as expected
> x[x['a'] > 5, :]          # 'Series' objects are mutable, thus they cannot be hashed
> lookup = numpy.arange(10)
> x[lookup[x['a']] > 5]     # works as expected
> x[lookup[x['a']] > 5, :]  # TypeError: unhashable type: 'numpy.ndarray'
> 
> x[lookup]['a']            # IndexError
> x['a'][lookup]            # works as expected
> 
> Now let's go a bit further: train/test splitted the data for machine learning 
> (again, the most frequent operation)
> 
> from sklearn.model_selection import train_test_split
> x1, x2 = train_test_split(x, random_state=42)
> 
> # compare the next operations with pandas.DataFrame
> col = x1['a']
> print(col[:2])               # first two elements
> print(col[[0, 1]])           # doesn't fail (although there is no row with index 0), fills it with NaN
> print(col[numpy.arange(2)])  # same as previous
> 
> print(col[col > 4])          # as expected
> print(col[col.values > 4])   # as expected
> print(col.values[col > 4])   # converts boolean to int, uses int indexing, but at least raises a warning
> 
> Mistakes done by such silent misoperating are not easy to detect (when your 
> data pipeline consists of several steps), quite hard to locate the source of 
> problem and almost impossible to be sure that you indeed avoided all such 
> caveats. Code review turns into paranoidal process (if you care about the 
> result, of course).
> 
> Things are even worse, because I've demonstrated this for my installation, 
> and probably if you run this with some other pandas installation, you get 
> some other results (that were really basic operations). So things that worked 
> ok in one version, may work different way in the other, this becomes 
> completely intractable. 
> 
> Pandas may be nice, if you need a report, and you need get it done tomorrow. 
> Then you'll throw away the code. When we initially used pandas as main data 
> storage in yandex/rep, it looked like an good idea, but a year later it was 
> obvious this was a wrong decision. In case when you build data pipeline / 
> research that should be working several years later (using some other 
> installation by someone else), usage of pandas shall be minimal. 
> 
> That's why I am looking for a reliable pandas substitute, which should be: 
> - completely consistent with numpy and should fail when this wasn't 
> implemented / impossible
> - fewer new abstractions, nobody wants to learn 
> one-more-way-to-manipulate-the-data, specifically other researchers
> - it may be less convenient for interactive data mungling
>   - in particular, less methods is ok
> - written code should be interpretable, and hardly can be misinterpreted.
> - not super slow, 1-10 gigabytes datasets are a normal situation
> 
> Just to the pandas part
> 
> statsmodels supported pandas almost from the very beginning (or maybe after 
> 1.5 years) when the new pandas was still very young.
> 
> However, what I insisted on is that pandas is in the wrapper/interface code, 
> and internally only numpy arrays are used. Besides the confusing "magic" 
> indexing of early pandas, there were a lot of details that silently produced 
> different results, e.g. default iteration on axis=1, ddof in std and var =1 
> instead of numpy =0.
> 
> Essentially, every interface corresponds to np.asarray, but we store the 
> DataFrame information, mainly the index and column names, so we can return 
> the appropriate pandas object if a pandas object was used for the input.
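
The pattern described in the quoted text (compute on plain arrays, re-wrap labelled input at the boundary) can be sketched without importing pandas at all. The function name `standardize` is hypothetical, the duck-typed check for `index` and `columns` is an assumption made for illustration, and this is not statsmodels code.

```python
import numpy as np

def standardize(data):
    """Center and scale columns; compute on numpy, re-wrap labelled input."""
    # Remember the labels if the input carries them (e.g. a pandas DataFrame).
    index = getattr(data, "index", None)
    columns = getattr(data, "columns", None)

    # All computation happens on a plain ndarray with numpy semantics,
    # including numpy's default ddof=0 for the standard deviation.
    values = np.asarray(data, dtype=float)
    result = (values - values.mean(axis=0)) / values.std(axis=0, ddof=0)

    # Return the same kind of object that was passed in.
    if index is not None and columns is not None:
        return type(data)(result, index=index, columns=columns)
    return result
```

Passing `ddof=0` explicitly sidesteps the differing defaults mentioned in the quoted text.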

Yes,

Re: [Numpy-discussion] Fortran order in recarray.

2017-02-21 Thread Alex Rogozhnikov
Ah, got it. Thanks, Chris!
I thought a recarray could only be one-dimensional (like a table with named 
columns).

Maybe it's better to ask directly what I was looking for: 
something that works like a table with named columns (but no labelling for 
rows), and keeps data (of different dtypes) in a column-by-column way (and this 
is numpy, not pandas). 

Is there such a magic thing?

Alex.


> On 22 Feb 2017, at 2:10, Chris Barker <chris.bar...@noaa.gov> wrote:
> 
> 
> 
> On Tue, Feb 21, 2017 at 3:05 PM, Alex Rogozhnikov <alex.rogozhni...@yandex.ru 
> <mailto:alex.rogozhni...@yandex.ru>> wrote:
> a question about numpy.recarray:
> There is a parameter order in constructor 
> https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.recarray.html
>  
> <https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.recarray.html>,
>  but it seems to have no effect:
> x = numpy.recarray(dtype=[('a', int), ('b', float)], shape=[1000], order='C')
> 
> you are creating a 1D array here -- there is no difference between Fortran 
> and C order for a 1D array. For 2D:
> 
> In [2]: x = numpy.recarray(dtype=[('a', int), ('b', float)], shape=[10,10], 
> order='C')
> 
> 
> In [3]: x.strides
> Out[3]: (160, 16)
> 
> 
> In [4]: y = numpy.recarray(dtype=[('a', int), ('b', float)], shape=[10,10], 
> order='F')
> 
> 
> In [5]: y.strides
> Out[5]: (16, 160)
> 
> note the easier way to get the strides, too :-)
> 
> -CHB
> 
> 
> 
> -- 
> 
> Christopher Barker, Ph.D.
> Oceanographer
> 
> Emergency Response Division
> NOAA/NOS/OR(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
> 
> chris.bar...@noaa.gov 
> <mailto:chris.bar...@noaa.gov>


Re: [Numpy-discussion] Fortran order in recarray.

2017-02-22 Thread Alex Rogozhnikov
Hi Matthew, 
maybe this is not the best place to discuss the problems of pandas, but to 
show that I am not missing something, let's consider a simple example.

import numpy
import pandas

# simplest DataFrame
x = pandas.DataFrame(dict(a=numpy.arange(10), b=numpy.arange(10, 20)))

# simplest indexing. Can you predict the results without looking at the comments?
x[:2]              # returns the first two rows, as expected
x[[0, 1]]          # returns a copy of x, the whole dataframe
x[numpy.array(2)]  # fails with IndexError: indices are out-of-bounds (can you guess why?)
x[[0, 1], :]       # unhashable type: list

Just in case: I know about .loc and .iloc. But when you write code with many 
subroutines, you concentrate on numpy inputs, and at some point you simply 
forget to convert some of the data to numpy. The code continues to work but 
yields wrong results (you did test everything, but only with numpy inputs). 
Checking all the inputs in each small subroutine would be strange.

Ok, a bit more:
x[x['a'] > 5]             # works as expected
x[x['a'] > 5, :]          # 'Series' objects are mutable, thus they cannot be hashed
lookup = numpy.arange(10)
x[lookup[x['a']] > 5]     # works as expected
x[lookup[x['a']] > 5, :]  # TypeError: unhashable type: 'numpy.ndarray'

x[lookup]['a']            # IndexError
x['a'][lookup]            # works as expected

Now let's go a bit further and split the data into train and test sets for 
machine learning (again, a very frequent operation):

from sklearn.model_selection import train_test_split
x1, x2 = train_test_split(x, random_state=42)

# compare the next operations with pandas.DataFrame
col = x1['a']
print(col[:2])               # first two elements
print(col[[0, 1]])           # doesn't fail (although there is no row with index 0), fills it with NaN
print(col[numpy.arange(2)])  # same as previous

print(col[col > 4])          # as expected
print(col[col.values > 4])   # as expected
print(col.values[col > 4])   # converts boolean to int, uses int indexing, but at least raises a warning

Mistakes caused by such silent misbehavior are not easy to detect (when your 
data pipeline consists of several steps), their source is quite hard to 
locate, and it is almost impossible to be sure that you have indeed avoided 
all such pitfalls. Code review turns into a paranoid process (if you care 
about the result, of course).

Things are even worse: I've demonstrated this for my installation, and if you 
run it with some other pandas installation, you will probably get different 
results (and these were really basic operations). So things that worked fine 
in one version may work differently in another, and this becomes completely 
intractable. 

Pandas may be nice if you need a report and need to get it done by tomorrow, 
after which you'll throw away the code. When we initially used pandas as the 
main data storage in yandex/rep, it looked like a good idea, but a year later 
it was obvious that this was the wrong decision. When you build a data 
pipeline or research code that should still work several years later (on some 
other installation, run by someone else), the use of pandas should be minimal. 

That's why I am looking for a reliable pandas substitute, which should be: 
- completely consistent with numpy, and should fail when something isn't 
implemented or is impossible
- fewer new abstractions: nobody wants to learn 
one-more-way-to-manipulate-the-data, especially other researchers
- it may be less convenient for interactive data munging
  - in particular, fewer methods is ok
- written code should be interpretable and hard to misinterpret
- not super slow; 1-10 gigabyte datasets are a normal situation

Well, that's it. 
Sorry for the long letter.

Alex.



> On 22 Feb 2017, at 18:38, Matthew Harrigan <harrigan.matt...@gmail.com> wrote:
> 
> Alex,
> 
> Can you please post some code showing exactly what you are trying to do and 
> any issues you are having, particularly the "irritating problems with its row 
> indexing and some other problems" you quote above?
> 
> On Wed, Feb 22, 2017 at 10:34 AM, Robert McLeod <robbmcl...@gmail.com 
> <mailto:robbmcl...@gmail.com>> wrote:
> Just as a note, Appveyor supports uploading modules to "public websites":
> 
> https://packaging.python.org/appveyor/ 
> <https://packaging.python.org/appveyor/>
> 
> The main issue I would see from this, is the PyPi has my password stored on 
> my machine in a plain text file.   I'm not sure whether there's a way to 
> provide Appveyor with a SSH key instead.
> 
> On Wed, Feb 22, 2017 at 4:23 PM, Alex Rogozhnikov <alex.rogozhni...@yandex.ru 
> <mailto:alex.rogozhni...@yandex.ru>> wrote:
> Hi Francesc, 
> thanks a lot for you reply and for your impressive job on bcolz! 
> 
> Bcolz seems to make stress on compression, which is not of much interest for 
> me, but the ctable, and chunked operations look very appropriate to me now. 
> (Of course, I'll need to test it much before

Re: [Numpy-discussion] From Python to Numpy

2016-12-30 Thread Alex Rogozhnikov
Hi Nicolas, 
that's very nice work!

> Comments/questions/fixes/ideas are of course welcome.

The boids example caught my attention too. Some comments on it:
- I find using complex numbers very natural here; this should speed things up 
and also shorten the code (rotating without einsum, etc.)
- you can probably speed things up by switching to sparse arrays 
- and you can go to really large numbers of 'birds' if you combine it with a 
preliminary splitting of space into squares, so that you analyze only birds 
from nearby squares
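
The last point, splitting space into squares, can be sketched as follows. The function name `neighbor_pairs` is hypothetical, and the sketch uses a plain dict of buckets rather than a fully vectorized layout, so it shows the idea rather than a tuned implementation.

```python
import numpy as np

def neighbor_pairs(points, radius):
    """Return sorted pairs (i, j), i < j, of 2D points closer than `radius`."""
    # Assign every point to a square cell with side `radius`.
    cells = np.floor(points / radius).astype(int)
    buckets = {}
    for idx, cell in enumerate(map(tuple, cells)):
        buckets.setdefault(cell, []).append(idx)

    pairs = []
    for (cx, cy), members in buckets.items():
        # Points within `radius` can only live in the 3x3 block of cells
        # around the current one, so only those are compared.
        candidates = [i
                      for dx in (-1, 0, 1)
                      for dy in (-1, 0, 1)
                      for i in buckets.get((cx + dx, cy + dy), [])]
        members = np.array(members)
        candidates = np.array(candidates)
        diff = points[members][:, None, :] - points[candidates][None, :, :]
        close = np.linalg.norm(diff, axis=-1) < radius
        for a, b in zip(*np.nonzero(close)):
            i, j = int(members[a]), int(candidates[b])
            if i < j:  # report each pair once and skip i == j
                pairs.append((i, j))
    return sorted(pairs)
```

With roughly uniform density, each bird is compared only with the birds in nine cells instead of with the whole flock.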

I also think it is worth adding some operations with HSV / HSL color spaces, 
as those can be visualized easily, e.g. on a photo.

Thanks,
Alex.



> On 23 Dec 2016, at 12:14, Kiko wrote:
> 
> 
> 
> 2016-12-22 17:44 GMT+01:00 Nicolas P. Rougier:
> 
> Dear all,
> 
> I've just put online a (kind of) book on Numpy and more specifically about 
> vectorization methods. It's not yet finished, has not been reviewed and it's 
> a bit rough around the edges. But I think there are some material that can be 
> interesting. I'm specifically happy with the boids example that show a nice 
> combination of numpy and matplotlib strengths.
> 
> Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/ 
> 
> Sources are available at: https://github.com/rougier/from-python-to-numpy 
> 
> 
> 
> Comments/questions/fixes/ideas are of course welcome.
> 
> Wow!!! Beautiful.
> 
> Thanks for sharing.
>  
> 
> 
> Nicolas


Re: [Numpy-discussion] From Python to Numpy

2017-01-05 Thread Alex Rogozhnikov

> On 31 Dec 2016, at 2:09, Nicolas P. Rougier <nicolas.roug...@inria.fr> wrote:
> 
>> 
>> On 30 Dec 2016, at 20:36, Alex Rogozhnikov <alex.rogozhni...@yandex.ru> 
>> wrote:
>> 
>> Hi Nicolas, 
>> that's a very nice work!
>> 
>>> Comments/questions/fixes/ideas are of course welcome.
>> 
>> Boids example brought my attention too, some comments on it:
>> - I find using complex numbers here very natural, this should speed up 
>> things and also shorten the code (rotating without einsum, etc.)
>> - you probably can speed up things with going to sparse arrays 
>> - and you can go to really large numbers of 'birds' if you combine it with 
>> preliminary splitting of space into squares, thus analyze only birds from 
>> close squares
>> 
>> Also I think worth adding some operations with HSV / HSL color spaces as 
>> those can be visualized easily e.g. on some photo.
>> 
>> Thanks,
>> Alex.
> 
> 
> Thanks.
> 
> I'm not sure to know how to use complex with this example. Could you 
> elaborate ?

Position and velocity are encoded as complex numbers.
Rotation is multiplication by exp(i \phi); translation is addition of a 
complex number.
Distance = abs(x - y). 

I think those are all the operations you need, but maybe I am missing something.
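
A small numpy sketch of these three operations, with made-up positions and velocities (the variable names are illustrative and not taken from the book's code):

```python
import numpy as np

# Each bird is one complex number: real part = x, imaginary part = y.
position = np.array([0 + 0j, 1 + 0j, 0 + 2j])
velocity = np.array([1 + 0j, 0 + 1j, -1 + 0j])

# Rotation by phi is multiplication by exp(i * phi).
phi = np.pi / 2
rotated = velocity * np.exp(1j * phi)

# Translation is addition of a complex number.
moved = position + rotated

# Pairwise distances are absolute values of differences.
distance = np.abs(position[:, None] - position[None, :])
```

The arrays are one-dimensional, so there is no separate coordinate axis to carry through the computation.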

> 
> For the preliminary splitting, a quadtree (scipy KDTree) could also help a 
> lot but I wanted to stick to numpy only.
> A simpler square splitting as you suggest could make thing faster but require 
> some work. I'm not sure yet I see how to restrict analysis to close squares.
> 
> Nicolas
> 
> 
>> 
>> 
>> 
>>> On 23 Dec 2016, at 12:14, Kiko <kikocorre...@gmail.com> wrote:
>>> 
>>> 
>>> 
>>> 2016-12-22 17:44 GMT+01:00 Nicolas P. Rougier <nicolas.roug...@inria.fr>:
>>> 
>>> Dear all,
>>> 
>>> I've just put online a (kind of) book on Numpy and more specifically about 
>>> vectorization methods. It's not yet finished, has not been reviewed and 
>>> it's a bit rough around the edges. But I think there are some material that 
>>> can be interesting. I'm specifically happy with the boids example that show 
>>> a nice combination of numpy and matplotlib strengths.
>>> 
>>> Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
>>> Sources are available at: https://github.com/rougier/from-python-to-numpy
>>> 
>>> 
>>> Comments/questions/fixes/ideas are of course welcome.
>>> 
>>> Wow!!! Beautiful.
>>> 
>>> Thanks for sharing.
>>> 
>>> 
>>> 
>>> Nicolas