[Numpy-discussion] smoothing function

2014-05-15 Thread rodrigo koblitz
Hello,
I'm reading Zuur's book (ecology models with R) and trying to reproduce
all of it in Python.
It has this function in R:
M4 <- gam(So ~ s(De) + factor(ID), subset = I1)

The s() term indicates that So is modelled as a smooth function of De.

I'm looking for something close to this in Python.

Can someone help me?

Regards,
Koblitz
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Fancy Indexing of Structured Arrays is Slow

2014-05-15 Thread Dave Hirschfeld
As can be seen from the code below (or in the notebook linked beneath) fancy 
indexing of a structured array is twice as slow as indexing both fields 
independently - making it 4x slower?

I found that fancy indexing was a bottleneck in my application so I was 
hoping to reduce the overhead by combining the arrays into a structured 
array and only doing one indexing operation. Unfortunately that doubled the 
time that it took!

Is there any reason for this? If not, I'm happy to open an enhancement issue 
on GitHub - just let me know.

Thanks,
Dave


In [32]: nrows, ncols = 365, 1

In [33]: items = np.rec.fromarrays(randn(2, nrows, ncols), names=['widgets', 'gadgets'])

In [34]: row_idx = randint(0, nrows, ncols)
...: col_idx = np.arange(ncols)

In [35]: %timeit filtered_items = items[row_idx, col_idx]
100 loops, best of 3: 3.45 ms per loop

In [36]: %%timeit 
...: widgets = items['widgets'][row_idx, col_idx]
...: gadgets = items['gadgets'][row_idx, col_idx]
...: 
1000 loops, best of 3: 1.57 ms per loop
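
The comparison above can be reproduced outside IPython with a sketch along the following lines. The array sizes, the use of np.random.default_rng in place of the pylab-style randn/randint, and the helper names index_structured and index_fields are assumptions made for illustration; absolute timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
nrows, ncols = 365, 1000  # illustrative sizes, not the poster's exact setup

# Two float fields packed into one record array of shape (nrows, ncols).
fields = rng.standard_normal((2, nrows, ncols))
items = np.rec.fromarrays([fields[0], fields[1]], names="widgets,gadgets")

# One random row per column, as in the session above.
row_idx = rng.integers(0, nrows, ncols)
col_idx = np.arange(ncols)


def index_structured():
    """Fancy-index the structured array once (both fields together)."""
    return items[row_idx, col_idx]


def index_fields():
    """Fancy-index each field separately."""
    return (items["widgets"][row_idx, col_idx],
            items["gadgets"][row_idx, col_idx])


combined = index_structured()
widgets, gadgets = index_fields()

t_struct = timeit.timeit(index_structured, number=200)
t_fields = timeit.timeit(index_fields, number=200)
print(f"structured array: {t_struct / 200 * 1e6:.1f} us per call")
print(f"separate fields:  {t_fields / 200 * 1e6:.1f} us per call")
```

Both approaches select the same values, so the difference is purely one of indexing overhead.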


http://nbviewer.ipython.org/urls/gist.githubusercontent.com/dhirschfeld/98b9970fb68adf23dfea/raw/10c0f968ea1489f0a24da80d3af30de7106848ac/Slow%20Structured%20Array%20Indexing.ipynb

https://gist.github.com/dhirschfeld/98b9970fb68adf23dfea





[Numpy-discussion] [JOB] Scientific software engineer at the Met Office

2014-05-15 Thread Phil Elson
I just wanted to let you know that there is currently a vacancy for a
full-time developer at the Met Office, the UK's National Weather Service,
within our Analysis, Visualisation and Data (AVD) team.

I'm posting on this list as the Met Office's AVD team are heavily involved
in the development of Python packages to support the work that our
scientists undertake on a daily basis. The vast majority of the AVD team's
time is spent working on our own open source Python packages Iris, cartopy
and biggus as well as working on packages such as numpy, scipy, matplotlib
and IPython; so we don't see this as just a great opportunity to work
within a world class scientific organisation, but a role which will also
deliver real benefits to the wider scientific Python community.

Please see http://goo.gl/3ScFaZ for full details and how to apply, or
contact hrenquir...@metoffice.gov.uk if you have any questions.

Many Thanks,

Phil


Re: [Numpy-discussion] smoothing function

2014-05-15 Thread josef . pktd
On Thu, May 15, 2014 at 8:04 AM, rodrigo koblitz
rodrigokobl...@gmail.com wrote:

> Hello,
> I'm reading Zuur's book (ecology models with R) and trying to reproduce
> all of it in Python.
> It has this function in R:
> M4 <- gam(So ~ s(De) + factor(ID), subset = I1)
>
> The s() term indicates that So is modelled as a smooth function of De.
>
> I'm looking for something close to this in Python.


These kinds of general questions are better asked on the scipy-user mailing
list, which covers more general topics than numpy-discussion.

As far as I know, GAMs are not available in Python; at least I have never
come across any.

statsmodels has an ancient GAM in the sandbox that has never been connected
to any smoother, since lowess, spline and kernel regression support was
missing. Nobody is working on that right now.
If you have only a single nonparametric variable, then statsmodels also has
a partial linear model based on kernel regression that is not cleaned up or
verified, but Padarn is currently working on this.

I think in this case using a penalized linear model with spline basis
functions would be more efficient, but there is also nothing clean
available, AFAIK.
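
To make the idea of a penalized linear model with spline basis functions concrete, here is a minimal NumPy-only sketch. It is not statsmodels code: the function names truncated_power_basis and fit_penalized_spline, the knot placement, the penalty weight and the simulated data are all assumptions, and a truncated power basis is swapped in for simplicity where a B-spline basis would be more common in practice.

```python
import numpy as np


def truncated_power_basis(x, knots, degree=3):
    """Columns 1, x, ..., x**degree, then (x - k)_+**degree for each knot."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    poly = np.vander(x, degree + 1, increasing=True)
    hinge = np.clip(x[:, None] - knots[None, :], 0.0, None) ** degree
    return np.hstack([poly, hinge])


def fit_penalized_spline(x, y, knots, lam=1e-4, degree=3):
    """Least squares with a ridge penalty on the knot coefficients only."""
    B = truncated_power_basis(x, knots, degree)
    penalty = np.zeros(B.shape[1])
    penalty[degree + 1:] = lam  # the polynomial part stays unpenalized
    return np.linalg.solve(B.T @ B + np.diag(penalty), B.T @ y)


# Simulated data (hypothetical): a smooth signal plus noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
truth = np.sin(2.0 * np.pi * x)
y = truth + rng.normal(scale=0.2, size=x.size)

knots = np.linspace(0.1, 0.9, 9)
coef = fit_penalized_spline(x, y, knots, lam=1e-4)
fitted = truncated_power_basis(x, knots) @ coef

rms = np.sqrt(np.mean((fitted - truth) ** 2))
print(f"RMS error against the true curve: {rms:.3f}")
```

Increasing lam shrinks the knot coefficients toward zero, so the fit moves toward a plain cubic polynomial; choosing lam (for example by cross-validation) is the part that takes the real work.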

It's not too difficult to write the basic models, but it takes time to
figure out the last 10% and to verify the results and write unit tests.


If you make your code publicly available, then I would be very interested
in a link. I'm trying to collect examples from books that have a python
solution.

Josef





Re: [Numpy-discussion] smoothing function

2014-05-15 Thread Nathaniel Smith
On Thu, May 15, 2014 at 1:04 PM, rodrigo koblitz
rodrigokobl...@gmail.com wrote:
> Hello,
> I'm reading Zuur's book (ecology models with R) and trying to reproduce
> all of it in Python.
> It has this function in R:
> M4 <- gam(So ~ s(De) + factor(ID), subset = I1)
>
> The s() term indicates that So is modelled as a smooth function of De.
>
> I'm looking for something close to this in Python.

The closest thing that doesn't require writing your own code is
probably to use patsy's [1] support for (simple unpenalized) spline
basis transformations [2]. I think using statsmodels this works like:

import statsmodels.formula.api as smf
# my_df is a pandas DataFrame with columns So, De and ID
# adjust '5' to taste -- bigger = wigglier, less bias, more overfitting
results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()
print(results.summary())

To graph the resulting curve you'll want to use the results to somehow
do prediction -- I'm not sure what the API for that looks like in
statsmodels. If you need help figuring it out, then asking on the
statsmodels list or stackoverflow is probably the quickest way to get
help.

-n

[1] http://patsy.readthedocs.org/en/latest/
[2] http://patsy.readthedocs.org/en/latest/builtins-reference.html#patsy.builtins.bs

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] smoothing function

2014-05-15 Thread josef . pktd
On Thu, May 15, 2014 at 12:17 PM, Nathaniel Smith n...@pobox.com wrote:

> On Thu, May 15, 2014 at 1:04 PM, rodrigo koblitz
> rodrigokobl...@gmail.com wrote:
> > Hello,
> > I'm reading Zuur's book (ecology models with R) and trying to reproduce
> > all of it in Python.
> > It has this function in R:
> > M4 <- gam(So ~ s(De) + factor(ID), subset = I1)
> >
> > The s() term indicates that So is modelled as a smooth function of De.
> >
> > I'm looking for something close to this in Python.
>
> The closest thing that doesn't require writing your own code is
> probably to use patsy's [1] support for (simple unpenalized) spline
> basis transformations [2]. I think using statsmodels this works like:
>
> import statsmodels.formula.api as smf
> # adjust '5' to taste -- bigger = wigglier, less bias, more overfitting
> results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()
> print(results.summary())

Nice



> To graph the resulting curve you'll want to use the results to somehow
> do prediction -- I'm not sure what the API for that looks like in
> statsmodels. If you need help figuring it out, then asking on the
> statsmodels list or stackoverflow is probably the quickest way to get
> help.


seems to work (in a very simple made up example)

results.predict({'De':np.arange(1,5), 'ID':['a']*4}, transform=True)
#array([ 0.75 , 1.0833, 0.75 , 0.4167])
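
A fuller version of the same idea, from fit to a prediction grid that can be plotted, might look like the sketch below. The data are simulated and the column values are hypothetical; ID is coded as integers here purely for convenience, and the sketch assumes pandas and statsmodels (with patsy formula support) are installed.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the real data: So depends smoothly on De,
# with a constant offset per ID group.
rng = np.random.default_rng(0)
n = 120
ids = rng.integers(0, 3, n)
de = rng.uniform(0.0, 10.0, n)
offsets = np.array([0.0, 1.0, -1.0])[ids]
my_df = pd.DataFrame({
    "De": de,
    "ID": ids,
    "So": np.sin(de / 2.0) + offsets + rng.normal(scale=0.1, size=n),
})

# Unpenalized B-spline basis with 5 degrees of freedom plus a factor for ID.
results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()

# Predict along a grid of De for one ID level. The grid stays inside the
# observed range of De, because the spline basis is only defined there.
grid = pd.DataFrame({
    "De": np.linspace(my_df["De"].min(), my_df["De"].max(), 50),
    "ID": 0,
})
curve = results.predict(grid)
print(curve.head())
```

Plotting grid["De"] against curve (for example with matplotlib) then gives the fitted smooth for that ID level.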

Josef





Re: [Numpy-discussion] NumPy-Discussion Digest, Vol 92, Issue 19

2014-05-15 Thread rodrigo koblitz
Dear Smith,
That's exactly what I wanted. Thanks!
Dear Josef,
I'm not planning to publish anything with the code. If you are
interested, I can share some of it, but it's probably very basic. Mainly I'm
building some basic functions for model selection. R is very good
at this (bestglm, leaps...) and I see few such tools in Python.
Finally,
does a scipy discussion list still exist? I haven't received anything from
it in months.

Regards,
Koblitz


