Re: [Numpy-discussion] Picking rows with the first (or last) occurrence of each key

2016-07-04 Thread Jeff Reback
This is trivial in pandas: a simple groupby.
In [6]: data = [['a', 27, 14.5], ['b', 12, 99.0], ['a', 17, 100.3], ['b', 12, -329.0]]
In [7]: df = DataFrame(data, columns=list('ABC'))
In [8]: df
Out[8]:
   A   B      C
0  a  27   14.5
1  b  12   99.0
2  a  17  100.3
3  b  12 -329.0
In [9]:
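The session above is cut off before the groupby call itself. A minimal sketch of what such a call could look like, assuming the key is column A and that "first" and "last" refer to row order. The names first_rows, last_rows, first_whole and last_whole are illustrative and do not come from the post.

```python
import pandas as pd

data = [['a', 27, 14.5], ['b', 12, 99.0], ['a', 17, 100.3], ['b', 12, -329.0]]
df = pd.DataFrame(data, columns=list('ABC'))

# First and last entry per key. GroupBy.first()/last() pick the first/last
# non-null value column by column, which matches the first/last row here
# only because the data has no missing values.
first_rows = df.groupby('A').first()
last_rows = df.groupby('A').last()

# An alternative that always keeps whole rows and the original index.
first_whole = df.drop_duplicates(subset='A', keep='first')
last_whole = df.drop_duplicates(subset='A', keep='last')
```

The drop_duplicates form is the safer choice when rows may contain NaN, since it never mixes values from different rows.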

[Numpy-discussion] ANN: v0.18.1 pandas Released

2016-05-04 Thread Jeff Reback
t on Numpy 1.10. Macosx wheels are courtesy of Matthew Brett. Installation via conda is: conda install pandas. Currently it's available via the conda-forge channel: conda install pandas -c conda-forge. It will be available on the main channel shortly. Please report any issues on our issue tracker <https://

[Numpy-discussion] ANN: pandas v0.18.0 Final released

2016-03-12 Thread Jeff Reback
Aycock - Christopher Scanlin - Cody - Da Wang - Daniel Grady - Dorozhko Anton - Dr-Irv - Erik M. Bray - Evan Wright - Francis T. O'Donovan - Frank Cleary - Gianluca Rossi - Graham Jeffries - Guillaume Horel - Henry Hammond - Isaac Schwabacher - Jean-

[Numpy-discussion] ANN: pandas v0.18.0rc2 - RELEASE CANDIDATE

2016-03-09 Thread Jeff Reback
Hi, I'm pleased to announce the availability of the second release candidate of Pandas 0.18.0. Please try this RC and report any issues here: Pandas Issues. Compared to RC1, we have added updated read_sas and fixed float indexing. We will be releasing

Re: [Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE

2016-02-15 Thread Jeff Reback
https://github.com/pydata/pandas/releases/tag/v0.18.0rc1 On Mon, Feb 15, 2016 at 12:51 PM, Derek Homeier < de...@astro.physik.uni-goettingen.de> wrote: > On 14 Feb 2016, at 1:53 am, Jeff Reback <jeffreb...@gmail.com> wrote: > > > > I'm pleased to announce the availa

Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-15 Thread Jeff Reback
Just an FYI: pandas implemented a RangeIndex in the upcoming 0.18.0, mainly for memory savings (see here), similar to how python range/xrange work. There are also substantial perf benefits, mainly with set operations, see
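To make the memory point concrete, here is a small sketch comparing a range-backed index with a materialized one. It uses the RangeIndex API as it exists in released pandas; the sizes and variable names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

n = 100_000
lazy = pd.RangeIndex(0, n)        # stores only start, stop and step
dense = pd.Index(np.arange(n))    # stores all n integers

# The range-backed index reports far fewer bytes than the materialized one.
lazy_bytes, dense_bytes = lazy.nbytes, dense.nbytes

# A set operation between two ranges.
overlap = lazy.intersection(pd.RangeIndex(50_000, 200_000))
```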

Re: [Numpy-discussion] [Suggestion] Labelled Array

2016-02-13 Thread Jeff Reback
In [10]: pd.options.display.max_rows=10
In [13]: np.random.seed(1234)
In [14]: c = np.random.randint(0,32,size=10)
In [15]: v = np.arange(10)
In [16]: df = DataFrame({'v' : v, 'c' : c})
In [17]: df
Out[17]:
    c  v
0  15  0
1  19  1
2   6  2
3  21
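The output above is cut off and the rest of the post is not shown. As an assumption about where it was heading, a labelled aggregation over the same frame might look like the sketch below. Grouping v by the label column c, and the name sums, are illustrations rather than the author's confirmed next step.

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
c = np.random.randint(0, 32, size=10)   # labels
v = np.arange(10)                       # values
df = pd.DataFrame({'v': v, 'c': c})

# Aggregate the values that share a label.
sums = df.groupby('c')['v'].sum()
```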

Re: [Numpy-discussion] [Suggestion] Labelled Array

2016-02-13 Thread Jeff Reback
] On Sat, Feb 13, 2016 at 1:39 PM, Jeff Reback <jeffreb...@gmail.com> wrote: > In [10]: pd.options.display.max_rows=10 > > In [13]: np.random.seed(1234) > > In [14]: c = np.random.randint(0,32,size=10) > > In [15]: v = np.arange(10) > > In [16]: df = DataFrame({'

[Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE

2016-02-13 Thread Jeff Reback
Hi, I'm pleased to announce the availability of the first release candidate of Pandas 0.18.0. Please try this RC and report any issues here: Pandas Issues We will be releasing officially in 1-2 weeks or so. **RELEASE CANDIDATE 1** This is a major

Re: [Numpy-discussion] Numpy pull requests getting out of hand.

2016-01-31 Thread Jeff Reback
FYI, it's also useful to simply close by time, say older than 6 months, with a message for the writer to reopen if they want to work on it. Then you don't get too many stale ones. My 2c. > On Jan 31, 2016, at 2:10 PM, Charles R Harris > wrote: > > Hi All, > > There are now

Re: [Numpy-discussion] Numpy 1.11.0b2 released

2016-01-30 Thread Jeff Reback
Just my 2c: it's fairly straightforward to add a test to the Travis matrix to grab built numpy wheels (works for conda or pip installs). So in pandas we are testing 2.7/3.5 against numpy master continuously: https://github.com/pydata/pandas/blob/master/ci/install-3.5_NUMPY_DEV.sh >

Re: [Numpy-discussion] When to stop supporting Python 2.6?

2015-12-03 Thread Jeff Reback
pandas is going to drop 2.6 and 3.3 next release at end of Jan (3.2 dropped in 0.17, in October) > On Dec 3, 2015, at 6:00 PM, Bryan Van de Ven wrote: > > >> On Dec 3, 2015, at 4:59 PM, Eric Firing wrote: >>

[Numpy-discussion] ANN: pandas v0.17.1 Released

2015-11-21 Thread Jeff Reback
ian Perez - Cody Piersall - Data & Code Expert Experimenting with Code on Data - DrIrv - Evan Wright - Guillaume Gay - Hamed Saljooghinejad - Iblis Lin - Jake VanderPlas - Jan Schulz - Jean-Mathieu Deschenes - Jeff Reback - Jimmy Callin - Joris Van den Bossche - K.-Michael Aye - Ka Wo Chen - Loïc Ségui

Re: [Numpy-discussion] deprecate fromstring() for text reading?

2015-10-23 Thread Jeff Reback
> On Oct 23, 2015, at 6:13 PM, Charles R Harris > wrote: > > > >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal >> wrote: >> >>> I think it would be good to keep the usage to read binary data at least. >> >> Agreed --

Re: [Numpy-discussion] deprecate fromstring() for text reading?

2015-10-23 Thread Jeff Reback
> On Oct 23, 2015, at 6:49 PM, Nathaniel Smith <n...@pobox.com> wrote: > > On Oct 23, 2015 3:30 PM, "Jeff Reback" <jeffreb...@gmail.com> wrote: > > > > On Oct 23, 2015, at 6:13 PM, Charles R Harris <charlesr.har...@gmail.com> > > wrote: >

Re: [Numpy-discussion] Make all comparisons with NaT false?

2015-10-13 Thread Jeff Reback
Here's another oddity to add to the list:
In [28]: issubclass(np.datetime64,np.integer)
Out[28]: False
In [29]: issubclass(np.timedelta64,np.integer)
Out[29]: True
On Tue, Oct 13, 2015 at 5:44 PM, Chris Barker wrote: > On Sun, Oct 11, 2015 at 8:38 PM, Stephan Hoyer
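The two checks from the post can be rerun as a script. The dtype-kind check at the end is an addition for illustration, not part of the original message. It is one way to tell the two types apart without relying on the class hierarchy.

```python
import numpy as np

# The checks from the post; the results reflect NumPy's scalar class hierarchy.
dt_is_int = issubclass(np.datetime64, np.integer)
td_is_int = issubclass(np.timedelta64, np.integer)
print(dt_is_int, td_is_int)

# A check that does not depend on inheritance: the dtype kind is 'M' for
# datetime64 and 'm' for timedelta64.
dt_kind = np.dtype('datetime64[ns]').kind
td_kind = np.dtype('timedelta64[ns]').kind
```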

[Numpy-discussion] ANN: pandas v0.17.0 released

2015-10-09 Thread Jeff Reback
- Frank Pinter - Gabriel Araujo - Garrett-R - Gianluca Rossi - Guillaume Gay - Guillaume Poulin - Harsh Nisar - Ian Henriksen - Ian Hoegen - Jaidev Deshpande - Jan Rudolph - Jan Schulz - Jason Swails - Jeff Reback - Jonas Buyl - Joris Van den Bossche

Re: [Numpy-discussion] [pydata] ANN: pandas v0.17.0rc2 - RELEASE CANDIDATE 2

2015-10-05 Thread Jeff Reback
to-date with pandas 0.17rc2 ? > >> On Sunday, October 4, 2015 at 7:36:26 AM UTC+2, Matthew Brett wrote: >> Hi, >> >> On Sat, Oct 3, 2015 at 2:33 PM, Jeff Reback <jeffr...@gmail.com> wrote: >> > Hi, >> > >> > I'm pleased to announce the

[Numpy-discussion] ANN: pandas v0.17.0rc2 - RELEASE CANDIDATE 2

2015-10-03 Thread Jeff Reback
Hi, I'm pleased to announce the availability of the second release candidate of Pandas 0.17.0. Please try this RC and report any issues here: Pandas Issues We will be releasing officially on October 9. **RELEASE CANDIDATE 2** From RC 1 we have:

[Numpy-discussion] ANN: pandas v0.17.0rc1 - RELEASE CANDIDATE

2015-09-11 Thread Jeff Reback
Hi, I'm pleased to announce the availability of the first release candidate of Pandas 0.17.0. Please try this RC and report any issues here: Pandas Issues We will be releasing officially in 1-2 weeks or so. **RELEASE CANDIDATE 1** This is a major

Re: [Numpy-discussion] testing numpy with downstream testsuites (was: Re: Notes from the numpy dev meeting at scipy 2015)

2015-08-26 Thread Jeff Reback
Pandas has for quite a while had a Travis build where we install numpy master and then run our test suite, e.g. here: https://travis-ci.org/pydata/pandas/jobs/77256007 Over the last year this has uncovered a couple of changes which affected pandas (mainly using something deprecated which was

Re: [Numpy-discussion] floats for indexing, reshape - too strict ?

2015-07-02 Thread Jeff Reback
FYI pandas followed the same pattern to deprecate float indexers (except for indexing in a Float64Index) about a year ago see here: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0140-deprecations On Jul 2, 2015, at 9:18 PM, josef.p...@gmail.com josef.p...@gmail.com

Re: [Numpy-discussion] Video meeting this week

2015-06-30 Thread Jeff Reback
you guys have an agenda? On Jun 30, 2015, at 12:58 AM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 26, 2015 at 2:32 AM, Nathaniel Smith n...@pobox.com wrote: Hi all, In a week and a half, this is happening:

[Numpy-discussion] ANN: pandas v0.16.2 released

2015-06-13 Thread Jeff Reback
van der Meeren - Christian Hudon - Constantine Glen Evans - Daniel Julius Lasiman - Evan Wright - Francesco Brundu - Gaëtan de Menten - Jake VanderPlas - James Hiebert - Jeff Reback - Joris Van den Bossche - Justin Lecher - Ka Wo Chen - Kevin Sheppard

[Numpy-discussion] ANN: pandas 0.16.1 released

2015-05-11 Thread Jeff Reback
- Chris Grinolds - Dan Birken - David BROCHART - David Hirschfeld - David Stephens - Dr. Leo - Evan Wright - Frans van Dunné - Hatem Nassrat - Henning Sperr - Hugo Herter - Jan Schulz - Jeff Blackburne - Jeff Reback - Jim Crist - Jonas Abernot - Joris Van

[Numpy-discussion] ANN: pandas 0.16.0 released

2015-03-23 Thread Jeff Reback
Hello, We are proud to announce v0.16.0 of pandas, a major release from 0.15.2. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was 4 months of work by 60 authors encompassing 204

[Numpy-discussion] Pandas v0.16.0 release candidate 1

2015-03-13 Thread Jeff Reback
Hi, I'm pleased to announce the availability of the first release candidate of Pandas 0.16.0. Please try this RC and report any issues here: Pandas Issues https://github.com/pydata/pandas/issues We will be releasing officially in 1 week or so. This is a major release from 0.15.2 and includes a

[Numpy-discussion] ANN: pandas v0.15.2

2014-12-12 Thread Jeff Reback
- Charalampos Papaloizou - Chris Warth - David Stephens - Fabio Zanini - Francesc Via - Henry Kleynhans - Jake VanderPlas - Jan Schulz - Jeff Reback - Jeff Tratner - Joris Van den Bossche - Kevin Sheppard - Matt Suggit - Matthew Brett - Phillip Cloud

Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-26 Thread Jeff Reback
You should have a read here: http://wesmckinney.com/blog/?p=543 Going below the 2x memory usage on read-in is non-trivial and costly in terms of performance. On Oct 26, 2014, at 4:46 AM, Saullo Castro saullogiov...@gmail.com wrote: I would like to start working on a memory efficient

Re: [Numpy-discussion] Memory efficient alternative for np.loadtxt and np.genfromtxt

2014-10-26 Thread Jeff Reback
usage (though ultimately still 2x); but combined with memory mapping can provide a fixed resource utilization On Oct 26, 2014, at 9:41 AM, Daπid davidmen...@gmail.com wrote: On 26 October 2014 12:54, Jeff Reback jeffreb...@gmail.com wrote: you should have a read here/ http://wesmckinney.com

[Numpy-discussion] ANN: Pandas 0.15.0 released

2014-10-19 Thread Jeff Reback
Hello, We are proud to announce v0.15.0 of pandas, a major release from 0.14.1. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was 4 months of work with 420 commits by 79 authors

[Numpy-discussion] ANN: Pandas 0.15.0 Release Candidate 1

2014-10-07 Thread Jeff Reback
Hi, I'm pleased to announce the availability of the first release candidate of Pandas 0.15.0. Please try this RC and report any issues here: Pandas Issues https://github.com/pydata/pandas/issues We will be releasing officially in 1-2 weeks or so. This is a major release from 0.14.1 and includes

[Numpy-discussion] Dataframe memory info printing

2014-09-23 Thread Jeff Reback
For the 0.15.0 release of pandas (coming 2nd week of Oct), we are going to include memory info printing; see here: https://github.com/pydata/pandas/pull/7619 This will be controllable by an option display.memory_usage. My question to the community: should this be True by default, e.g. show the
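For readers who want to see the feature in action, here is a short sketch using the memory reporting API as it exists in released pandas. The proposal in the linked PR may have differed in detail at the time. The frame and variable names are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': np.arange(1000, dtype='int64'),
                   'y': np.zeros(1000, dtype='float64')})

# Bytes used by each column (index excluded).
per_column = df.memory_usage(index=False)

# The option mentioned in the post controls whether info() shows memory.
current_default = pd.get_option('display.memory_usage')

# info() prints a "memory usage" line when asked to.
df.info(memory_usage=True)
```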

Re: [Numpy-discussion] Custom dtypes without C -- or, a standard ndarray-like type

2014-09-22 Thread Jeff Reback
Hopefully this is not TL;DR! There are 3 'dtype'-likes that exist in pandas that could in theory mostly be migrated back to numpy. These currently exist as the .values, in other words the object to which pandas defers data storage and computation for some/most operations. 1) SparseArray: This

Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

2014-07-24 Thread Jeff Reback
related recent issue: https://github.com/numpy/numpy/issues/4638 and pandas is now explicitly specifying the accumulator to avoid this problem: https://github.com/pydata/pandas/pull/6954/files pandas also implemented Welford's method for rolling_var in 0.14.0, see here:
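Both ideas mentioned above can be sketched briefly. The first part passes an explicit float64 accumulator to a float32 mean. The second is a plain single-pass Welford variance over a whole sequence, which is simpler than the rolling-window variant pandas implemented and is not taken from the pandas source. The function name welford_variance and the sample data are illustrative.

```python
import numpy as np

# Accumulate in float64 even though the data is stored as float32.
x = np.full(1000, 0.1, dtype=np.float32)
mean64 = x.mean(dtype=np.float64)


def welford_variance(values, ddof=1):
    """Single-pass variance using Welford's update."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for value in values:
        n += 1
        delta = value - mean
        mean += delta / n
        m2 += delta * (value - mean)
    if n <= ddof:
        return float('nan')
    return m2 / (n - ddof)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
var = welford_variance(sample)
```

The update avoids subtracting two large, nearly equal sums, which is where the naive sum-of-squares formula loses precision.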

Re: [Numpy-discussion] String type again.

2014-07-16 Thread Jeff Reback
In 0.15.0 pandas will have full-fledged support for categoricals, which in effect allow you to map a smaller number of strings to integers. This is now in pandas master: http://pandas-docs.github.io/pandas-docs-travis/categorical.html Feedback welcome! On Jul 14, 2014, at 1:00 PM, Olivier Grisel
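A small sketch of the string-to-integer mapping described above, using the categorical API as it exists in released pandas. The example values are made up.

```python
import pandas as pd

s = pd.Series(['low', 'high', 'low', 'medium', 'high'], dtype='category')

# The distinct strings are stored once...
categories = list(s.cat.categories)

# ...and each element is stored as a small integer code into that list.
codes = list(s.cat.codes)
```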

[Numpy-discussion] ANN: pandas 0.14.1 released

2014-07-12 Thread Jeff Reback
Hello, We are proud to announce v0.14.1 of pandas, a minor release from 0.14.0. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was 1.5 months of work with 244 commits by 45

Re: [Numpy-discussion] ANN: Pandas 0.14.0 Release Candidate 1

2014-07-12 Thread Jeff Reback
Ray, Matthew builds Mac OS X wheels for the scipy stack (those are Windows binaries). Thanks anyhow. On Jul 11, 2014, at 12:10 PM, RayS r...@blue-cove.com wrote: At 04:56 AM 7/11/2014, you wrote: Matthew, we posted the release of 0.14.1 last night. Are these picked up and build here

Re: [Numpy-discussion] Questions about fixes for 1.9.0rc2

2014-07-04 Thread Jeff Reback
ok from pandas we test with numpy master on Travis (which does pick up things!) thanks On Jul 4, 2014, at 7:07 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jul 4, 2014 at 3:33 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jul 4, 2014 at 10:31 PM, Charles R

Re: [Numpy-discussion] Questions about fixes for 1.9.0rc2

2014-07-04 Thread Jeff Reback
pandas 0.14.1 scheduled for end of next week (was waiting to see schedule for numpy 1.9) but works either way On Jul 4, 2014, at 7:41 PM, Nathaniel Smith n...@pobox.com wrote: On 5 Jul 2014 00:07, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jul 4, 2014 at 3:33 PM,

Re: [Numpy-discussion] genfromtxt universal newline support

2014-06-30 Thread Jeff Reback
In pandas 0.14.0, generic whitespace IS parsed via the c-parser, e.g. specifying '\s+' as a separator. Not sure when you were playing last with pandas, but the c-parser has been in place since late 2012. (version 0.8.0)
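As an illustration of the whitespace separator mentioned above, the sketch below parses a small in-memory text with irregular spacing. The sample text and names are invented for the example.

```python
import io

import pandas as pd

text = "a b   c\n1 2   3\n4 5   6\n"

# A regex separator of one or more whitespace characters.
df = pd.read_csv(io.StringIO(text), sep=r'\s+')
```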

Re: [Numpy-discussion] ANN: NumPy 1.9.0 beta release

2014-06-09 Thread Jeff Reback
The one pandas test failure that is valid: ERROR: test_interp_regression (pandas.tests.test_generic.TestSeries) has been fixed in pandas master / 0.14.1 (prob releasing in 1 month). (the other test failures are for clipboard / network issues) On Mon, Jun 9, 2014 at 7:21 PM, Christoph Gohlke

Re: [Numpy-discussion] ANN: Pandas 0.14.0 released

2014-05-31 Thread Jeff Reback
Sure, would take a PR for that; anything to make setup easier! On May 31, 2014, at 1:50 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sat, May 31, 2014 at 12:30 AM, Jeff Reback jeffreb...@gmail.com wrote: the upgrade flag on pip is apparently recursive on all deps Indeed

[Numpy-discussion] ANN: Pandas 0.14.0 released

2014-05-30 Thread Jeff Reback
Waeber - David Jung - David Stephens - Douglas McNeil - DSM - Garrett Drapala - Gouthaman Balaraman - Guillaume Poulin - hshimizu77 - hugo - immerrr - ischwabacher - Jacob Howard - Jacob Schaer - jaimefrio - Jason Sexauer - Jeff Reback - Jeffrey

Re: [Numpy-discussion] ANN: Pandas 0.14.0 released

2014-05-30 Thread Jeff Reback
the upgrade flag on pip is apparently recursive on all deps On May 30, 2014, at 6:16 PM, Neal Becker ndbeck...@gmail.com wrote: pip install --user --up pandas Downloading/unpacking pandas from

[Numpy-discussion] ANN: Pandas 0.14.0 Release Candidate 1

2014-05-17 Thread Jeff Reback
Hi, I'm pleased to announce the availability of the first release candidate of Pandas 0.14.0. Please try this RC and report any issues here: Pandas Issues https://github.com/pydata/pandas/issues We will be releasing officially in about 2 weeks or so. This is a major release from 0.13.1 and

Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-28 Thread Jeff Reback
FYI, here are the pandas docs on timezone handling. wesm worked through the various issues w.r.t. conversion, localization, and ambiguous zone crossing. http://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-zone-handling The implementation is largely in here: (underlying impl is a

Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-19 Thread Jeff Reback
Dave, your example is not a problem with numpy per se, rather that the default generation is in local timezone (same as what python datetime does). If you localize to UTC you get the results that you expect. In [49]: dates = pd.date_range('01-Apr-2014', '04-Apr-2014', freq='H')[:-1] In [50]:
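The session above is cut off after the first command. A minimal sketch of localizing such a range to UTC follows, assuming that is the step the post goes on to show. The post passes freq='H'. The sketch passes an equivalent Timedelta instead, because the spelling of the hourly alias has changed across pandas versions. The name utc_dates is illustrative.

```python
import pandas as pd

# Hourly stamps for 1-3 April 2014; the final midnight is dropped as in the post.
dates = pd.date_range('01-Apr-2014', '04-Apr-2014',
                      freq=pd.Timedelta(hours=1))[:-1]

# Attach UTC to the naive timestamps without shifting the wall-clock values.
utc_dates = dates.tz_localize('UTC')
```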

Re: [Numpy-discussion] 1.8.1 release

2014-02-24 Thread Jeff Reback
I am pretty sure that you guys test pandas master but 1.8.1 looks good to me On Feb 24, 2014, at 4:42 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Feb 24, 2014 at 1:54 PM, RayS r...@blue-cove.com wrote: Has anyone alerted C Gohlke?

[Numpy-discussion] [pydata] ANN: pandas 0.13.1 released

2014-02-09 Thread Jeff Reback
Hello, This is a minor release from 0.13.0 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Highlights include: - Added