Re: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow
Julian Taylor <jtaylor.debian at googlemail.com> writes:
> On 16.05.2014 10:59, Dave Hirschfeld wrote:
>> Yes, I'd heard about the improvements and am very excited to try them
>> out since indexing is one of the bottlenecks in our algorithm.
>
> I made a PR with the simple change:
> https://github.com/numpy/numpy/pull/4721 improves it by the expected
> 50%, but it's still 40% slower than the improved normal indexing.

Having some problems building numpy to test this out, but assuming it
does what it says on the tin I'd be very keen to get this into the
impending 1.9 release if possible.

Thanks,
Dave

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
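For anyone wanting to reproduce the comparison locally, here is a rough sketch of a benchmark setup (the array sizes and dtype are invented for illustration, not taken from Dave's actual workload):

```python
import numpy as np

# Toy setup: a plain float array and a structured array of the same length.
n = 100000
plain = np.arange(n, dtype=np.float64)
structured = np.zeros(n, dtype=[('a', np.float64), ('b', np.int32)])
idx = np.random.randint(0, n, size=10000)

# Fancy indexing both; wrapping these two lines in timeit shows the
# structured-vs-plain gap discussed in the thread.
out_plain = plain[idx]
out_structured = structured[idx]

assert out_plain.shape == (10000,)
assert out_structured.shape == (10000,)
assert out_structured.dtype == structured.dtype
```

Timing the two indexing lines with `timeit` on different NumPy versions makes the improvement from the PR visible.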
[Numpy-discussion] Easter Egg or what I am missing here?
Please would anyone tell me whether the following is an undocumented
bug, otherwise I will lose faith in everything:

==
import numpy as np

years = [2004,2005,2006,2007]
dates = [20040501,20050601,20060801,20071001]

for x in years:
    print 'year ',x
    xy = np.array([x*1.0e-4 for x in dates]).astype(np.int)
    print 'year ',x
==

Or is this a recipe to blow up a power plant?

Thanks,
Siegfried

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Re: [Numpy-discussion] Easter Egg or what I am missing here?
On Wed, May 21, 2014 at 3:29 PM, Siegfried Gonzi <siegfried.go...@ed.ac.uk> wrote:
> Please would anyone tell me the following is an undocumented bug
> otherwise I will lose faith in everything:
>
> ==
> import numpy as np
> years = [2004,2005,2006,2007]
> dates = [20040501,20050601,20060801,20071001]
> for x in years:
>     print 'year ',x
>     xy = np.array([x*1.0e-4 for x in dates]).astype(np.int)
>     print 'year ',x
> ==

It seems like a misunderstanding of Python scoping, or just an oversight
in your code, or I'm not understanding your question. Would you expect
the following code to print the same value twice in each iteration?

for x in (1, 2, 3):
    print x
    dummy = [x*x for x in (4, 5, 6)]
    print x
    print

> Or is this a recipe to blow up a power plant?

Now we're on the lists...

Cheers!
Re: [Numpy-discussion] Easter Egg or what I am missing here?
On Wed, May 21, 2014 at 12:38 PM, alex <argri...@ncsu.edu> wrote:
>> years = [2004,2005,2006,2007]
>> dates = [20040501,20050601,20060801,20071001]
>> for x in years:
>>     print 'year ',x
>>     xy = np.array([x*1.0e-4 for x in dates]).astype(np.int)
>>     print 'year ',x

Did you mean that to be:

    print 'year ', xy

I then get:

year  2004
year  [2004 2005 2006 2007]
year  2005
year  [2004 2005 2006 2007]
year  2006
year  [2004 2005 2006 2007]
year  2007
year  [2004 2005 2006 2007]

Or did you really want something like:

In [35]: %paste
years = [2004,2005,2006,2007]
dates = [20040501,20050601,20060801,20071001]
for x, d in zip(years, dates):
    print 'year ', x
    print 'date', d
    print int(d*1.0e-4)
    print 'just date:', d - x*1e4
## -- End pasted text --
year  2004
date 20040501
2004
just date: 501.0
year  2005
date 20050601
2005
just date: 601.0
year  2006
date 20060801
2006
just date: 801.0
year  2007
date 20071001
2007
just date: 1001.0

But using floating point for this is risky anyway, why not:

In [47]: d
Out[47]: 20071001

In [48]: d // 10**4
Out[48]: 2007

i.e. integer division.

-Chris

> It seems like a misunderstanding of Python scoping, or just an
> oversight in your code, or I'm not understanding your question. Would
> you expect the following code to print the same value twice in each
> iteration?
>
> for x in (1, 2, 3):
>     print x
>     dummy = [x*x for x in (4, 5, 6)]
>     print x
>     print
>
>> Or is this a recipe to blow up a power plant?
>
> Now we're on the lists... Cheers!

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959  voice
7600 Sand Point Way NE  (206) 526-6329  fax
Seattle, WA 98115       (206) 526-6317  main reception

chris.bar...@noaa.gov
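Chris's integer-division point generalizes: the whole YYYYMMDD list can be split into fields with vectorized integer arithmetic, avoiding floats entirely. A small sketch (the variable names here are mine, not from the original code):

```python
import numpy as np

dates = np.array([20040501, 20050601, 20060801, 20071001])

# Integer division and modulo carry no floating-point rounding risk.
years = dates // 10**4           # -> 2004, 2005, 2006, 2007
months = (dates // 10**2) % 100  # -> 5, 6, 8, 10
days = dates % 100               # -> 1, 1, 1, 1
```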
Re: [Numpy-discussion] Easter Egg or what I am missing here?
On 5/21/14, Siegfried Gonzi <siegfried.go...@ed.ac.uk> wrote:
> Please would anyone tell me the following is an undocumented bug
> otherwise I will lose faith in everything:
>
> ==
> import numpy as np
> years = [2004,2005,2006,2007]
> dates = [20040501,20050601,20060801,20071001]
> for x in years:
>     print 'year ',x
>     xy = np.array([x*1.0e-4 for x in dates]).astype(np.int)
>     print 'year ',x
> ==
>
> Or is this a recipe to blow up a power plant?

This is a wart of Python 2.x. The dummy variable used in a list
comprehension remains defined with its final value in the enclosing
scope. For example, this is Python 2.7:

>>> x = 100
>>> w = [x*x for x in range(4)]
>>> x
3

This behavior has been changed in Python 3. Here's the same sequence in
Python 3.4:

>>> x = 100
>>> w = [x*x for x in range(4)]
>>> x
100

Guido van Rossum gives a summary of this issue near the end of this blog:
http://python-history.blogspot.com/2010/06/from-list-comprehensions-to-generator.html

Warren

> Thanks,
> Siegfried
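A minimal way to stay safe on both Python 2 and Python 3 is simply to give the comprehension its own variable name, so the outer loop variable is never touched (a sketch based on Siegfried's data):

```python
dates = [20040501, 20050601, 20060801, 20071001]

x = 2004
# Using a distinct name 'd' inside the comprehension leaves the outer
# 'x' intact on Python 2 as well as Python 3.
xy = [int(d * 1.0e-4) for d in dates]

assert x == 2004
assert xy == [2004, 2005, 2006, 2007]
```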
[Numpy-discussion] NumPy C API question
Hello,

I would like to expose an existing (C++) object as a NumPy array to
Python. Right now I'm using PyArray_New, passing the pointer to my
object's storage. It now happens that the storage pointer of my object
may change over its lifetime, so I'd like to change the pointer that is
used in the PyArrayObject.

Is there any API to do this? (I'd like to avoid allocating a new
PyArrayObject, as that is presumably a costly operation.) If not, may I
access (i.e., change) the data member of the array object, or would I
risk corrupting the application state by doing that?

Many thanks,
Stefan

--
...ich hab' noch einen Koffer in Berlin...
Re: [Numpy-discussion] NumPy C API question
Hi Stefan,

Allocating a new PyArrayObject isn't terribly expensive (compared to all
the other allocations that Python programs are constantly doing), but
I'm afraid you have a more fundamental problem.

The reason there is no supported API to change the storage pointer of a
PyArrayObject is that the semantics of PyArrayObject are that the data
must remain allocated, and in the same place, until the PyArrayObject is
freed (and when this happens is, in general, up to the garbage
collector, not you). You could make a copy, but you can't free the
original buffer until Python tells you you can.

The problem is that many simple operations on arrays return views, which
are implemented as independent PyArrayObjects whose data field points
directly into your memory buffer; these views will hold a reference to
your PyArrayObject, but there's no supported way to reverse this mapping
to find all the views that might be pointing into your buffer.

If you're very determined there are probably hacks you could use (be
very careful never to allocate views, or maybe gc.get_referrers() will
work to let you run around and fix up all the views), but at that point
you're kind of on your own anyway, and violating PyArrayObject's
encapsulation boundary is the least of your worries :-).

Hope things are well with you,
-n

On Thu, May 22, 2014 at 12:03 AM, Stefan Seefeld <ste...@seefeld.name> wrote:
> Hello,
>
> I would like to expose an existing (C++) object as a NumPy array to
> Python. Right now I'm using PyArray_New, passing the pointer to my
> object's storage. It now happens that the storage pointer of my object
> may change over its lifetime, so I'd like to change the pointer that
> is used in the PyArrayObject. Is there any API to do this? (I'd like
> to avoid allocating a new PyArrayObject, as that is presumably a
> costly operation.) If not, may I access (i.e., change) the data member
> of the array object, or would I risk corrupting the application state
> by doing that?
> Many thanks,
> Stefan

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
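Nathaniel's point about views can be seen from pure Python (a small illustration of the `base` relationship; nothing here is specific to Stefan's library):

```python
import numpy as np

buf = np.arange(10)
view = buf[2:5]          # a view: shares buf's memory, holds a reference
assert view.base is buf  # the view keeps the original array alive

del buf                  # the buffer is NOT freed: 'view' still uses it
view[0] = 99
assert view[0] == 99     # the shared memory is still valid
```

This is exactly why the C++ side cannot free or move the buffer while any view into it is still reachable from Python.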
Re: [Numpy-discussion] NumPy C API question
Hi Nathaniel,

thanks for the prompt and thorough answer. You are entirely right, I
hadn't thought things through properly, so let me back up a bit.

I want to provide Python bindings for a C++ library I'm writing, which
is based on vector/matrix/tensor data types. In my naive view I would
expose these data types as NumPy arrays, creating PyArrayObject
instances as wrappers, i.e. wrappers that borrow raw pointers to the
storage managed by the C++ objects.

To make things slightly more interesting, those C++ objects have their
own storage management mechanism, which allows data to migrate across
different address spaces (such as from host to GPU-device memory), and
thus whether the host storage is valid (i.e., contains up-to-date data)
depends on where the last operation was performed (which is controlled
by an operation dispatcher that is part of the library, too).

It seems that if I let Python control the data lifetime, and borrow the
data temporarily from C++, I may be fine. However, I may want to expose
pre-existing C++ objects into Python, and it sounds like that might be
dangerous unless I am willing to clone the data so the Python runtime
can hold on to it even after my C++ runtime has released its copy. But
that changes the semantics, as the Python runtime no longer sees the
same data as the C++ runtime, unless I keep the two in sync each time I
cross the language boundary, which may be quite a costly operation...

Does all that sound sensible? It seems I have some more design to do.

Thanks,
Stefan

--
...ich hab' noch einen Koffer in Berlin...
Re: [Numpy-discussion] NumPy C API question
Hi Stefan,

One possibility that comes to mind: you may want in any case some way to
temporarily pin an object's memory in place (e.g., to prevent one thread
trying to migrate it while some other thread is working on it). If so,
then the Python wrapper could acquire a pin when the ndarray is
allocated, and release it when it is released. (The canonical way to do
this is to create a little opaque Python class that knows how to do the
acquire/release, and then assign it to the 'base' attribute of your
array -- the semantics of 'base' are simply that ndarray.__del__ will
decref whatever object is in 'base'.)

-n

On Thu, May 22, 2014 at 12:44 AM, Stefan Seefeld <ste...@seefeld.name> wrote:
> Hi Nathaniel,
>
> thanks for the prompt and thorough answer. [...]
>
> It seems if I let Python control the data lifetime, and borrow the
> data temporarily from C++ I may be fine. However, I may want to expose
> pre-existing C++ objects into Python, though, and it sounds like that
> might be dangerous unless I am willing to clone the data so the Python
> runtime can hold on to that even after my C++ runtime has released
> theirs.
> But that changes the semantics, as the Python runtime no longer sees
> the same data as the C++ runtime, unless I keep the two in sync each
> time I cross the language boundary, which may be quite a costly
> operation...
>
> Does all that sound sensible? It seems I have some more design to do.
>
> Thanks,
> Stefan

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
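The 'base' mechanism Nathaniel describes can be mimicked from Python with np.frombuffer, whose result keeps its source buffer alive through the base chain. A sketch of the idea (the Pin class and its pin/unpin hooks are hypothetical placeholders for whatever the C++ storage manager provides):

```python
import numpy as np

class Pin(object):
    """Hypothetical pin object: a real binding would call the storage
    manager's pin() on construction and unpin() on destruction."""
    def __init__(self, storage):
        self.storage = storage   # acquire the pin here in a real binding
    def __del__(self):
        pass                     # release the pin here

raw = bytearray(4 * 8)           # stand-in for the C++ object's host storage
pin = Pin(raw)
arr = np.frombuffer(raw, dtype=np.int64)

# frombuffer keeps 'raw' alive via arr.base, just as a C-level wrapper
# would keep its pin object alive via the ndarray's 'base' slot.
assert arr.base is not None
assert arr.shape == (4,)
```

At the C level the equivalent is PyArray_SetBaseObject on the freshly created array, handing it ownership of the pin object.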
[Numpy-discussion] Find Daily max - create lists using date and add hourly data to that list for the day
I have hourly 2D temperature data in a monthly netcdf and I would like
to find the daily maximum temperature. The shape of the netcdf is
(744, 106, 193).

I would like to use the year-month-day as a new list name (i.e.
2009-03-01, 2009-03-02, ..., 2009-03-31) and then add each of the hours'
worth of temperature data to each corresponding list. Therefore each new
list should contain 24 hours' worth of data and its shape should be
(24, 106, 193). This is the part I cannot seem to get to work.

I am using datetime and then groupby to group by date, but I am not sure
how to use the output to make a new list name and then add the data for
that day into that list. See below and attached for my latest attempt.
Any feedback will be greatly appreciated.

from netCDF4 import Dataset
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from netcdftime import utime
from datetime import datetime as dt
import os
import gc
from numpy import *
import pytz
from itertools import groupby

MainFolder = r'/DATA/2009/03'
dailydate = []
alltime = []
lists = {}
ncvariablename = 'T_SFC'

for (path, dirs, files) in os.walk(MainFolder):
    for ncfile in files:
        print ncfile
        fileext = '.nc'
        if ncfile.endswith(ncvariablename + '.nc'):
            print 'dealing with ncfiles:', path + ncfile
            ncfile = os.path.join(path, ncfile)
            ncfile = Dataset(ncfile, 'r+', 'NETCDF4')
            variable = ncfile.variables[ncvariablename][:,:,:]
            TIME = ncfile.variables['time'][:]
            ncfile.close()
            for temp, time in zip(variable[:], TIME[:]):
                cdftime = utime('seconds since 1970-01-01 00:00:00')
                ncfiletime = cdftime.num2date(time)
                timestr = str(ncfiletime)
                utc_dt = dt.strptime(timestr, '%Y-%m-%d %H:%M:%S')
                au_tz = pytz.timezone('Australia/Sydney')
                local_dt = utc_dt.replace(tzinfo=pytz.utc).astimezone(au_tz)
                alltime.append(local_dt)
                for k, g in groupby(alltime, key=lambda d: d.date()):
                    kstrp_local = k.strftime('%Y-%m-%d_%H')
                    klocal_date = k.strftime('%Y-%m-%d')
                    dailydate.append(klocal_date)
                for n in dailydate:
                    lists[n] = []
                    lists[n].append(temp)
big_array = np.ma.concatenate(lists[n])
DailyTemp = big_array.max(axis=0)
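Since March 2009 has exactly 31 x 24 = 744 hourly steps, an alternative to the per-date lists above is a reshape plus a reduction over the hour axis. A sketch with toy data (real data would first need the hours aligned to local midnight, as the timezone conversion in the script implies):

```python
import numpy as np

# Toy stand-in for the (744, 106, 193) monthly array: 3 days on a 2x2 grid.
hourly = np.arange(72 * 4, dtype=float).reshape(72, 2, 2)

# Group consecutive 24-hour blocks into days, then take the max over hours.
daily_max = hourly.reshape(-1, 24, 2, 2).max(axis=1)

assert daily_max.shape == (3, 2, 2)  # one (2, 2) field per day
```

For the real file the same pattern would be `variable.reshape(-1, 24, 106, 193).max(axis=1)`, giving a (31, 106, 193) array of daily maxima.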
Re: [Numpy-discussion] Find Daily max - create lists using date and add hourly data to that list for the day
Hello anonymous,

I recently wrote a package xray (http://xray.readthedocs.org/)
specifically to make it easier to work with high-dimensional labeled
data, as often found in NetCDF files. Xray has a groupby method for
grouping over subsets of your data, which would seem well suited to what
you're trying to do. Something like the following might work:

ds = xray.open_dataset(ncfile)
tmax = ds['temperature'].groupby('time.hour').max()

It also might be worth looking at other data analysis packages, either
more generic (e.g., pandas, http://pandas.pydata.org/) or
weather/climate data specific (e.g., Iris, http://scitools.org.uk/iris/
and CDAT,
http://www2-pcmdi.llnl.gov/cdat/manuals/cdutil/cdat_utilities.html).

Cheers,
Stephan

On Wed, May 21, 2014 at 5:27 PM, questions anon <questions.a...@gmail.com> wrote:
> I have hourly 2D temperature data in a monthly netcdf and I would like
> to find the daily maximum temperature. The shape of the netcdf is
> (744, 106, 193). I would like to use the year-month-day as a new list
> name (i.e. 2009-03-01, 2009-03-02, ..., 2009-03-31) and then add each
> of the hours' worth of temperature data to each corresponding list.
> Therefore each new list should contain 24 hours' worth of data and the
> shape should be (24, 106, 193). This is the part I cannot seem to get
> to work.
> [...]
Re: [Numpy-discussion] Find Daily max - create lists using date and add hourly data to that list for the day
Thanks Stephan,

It doesn't look like CDAT has a 'daily' option -- it has yearly,
seasonal and monthly! I would need to look into Iris more as it is new
to me, and I can't quite figure out all the steps required for xray,
although it looks great.

Another way around it: after converting to localtime_day I could append
the corresponding hourly arrays to a list, concatenate them, calculate
the max, and assign that max to that localtime_day. Then I could empty
the list and repeat, looping through the hours of the next day and
appending to the empty list. Although I really don't know how to get
this to work.

On Thu, May 22, 2014 at 10:56 AM, Stephan Hoyer <sho...@gmail.com> wrote:
> Hello anonymous,
>
> I recently wrote a package xray (http://xray.readthedocs.org/)
> specifically to make it easier to work with high-dimensional labeled
> data, as often found in NetCDF files. Xray has a groupby method for
> grouping over subsets of your data, which would seem well suited to
> what you're trying to do. Something like the following might work:
>
> ds = xray.open_dataset(ncfile)
> tmax = ds['temperature'].groupby('time.hour').max()
>
> It also might be worth looking at other data analysis packages, either
> more generic (e.g., pandas, http://pandas.pydata.org/) or
> weather/climate data specific (e.g., Iris, http://scitools.org.uk/iris/
> and CDAT,
> http://www2-pcmdi.llnl.gov/cdat/manuals/cdutil/cdat_utilities.html).
>
> On Wed, May 21, 2014 at 5:27 PM, questions anon wrote:
>> I have hourly 2D temperature data in a monthly netcdf and I would
>> like to find the daily maximum temperature. The shape of the netcdf
>> is (744, 106, 193). [...] This is the part I cannot seem to get to
>> work.
>> [...]
Re: [Numpy-discussion] Easter Egg or what I am missing here?
Warren Weckesser <warren.weckes...@gmail.com> wrote:
> On 5/21/14, Siegfried Gonzi <siegfried.go...@ed.ac.uk> wrote:
>> Please would anyone tell me the following is an undocumented bug
>> otherwise I will lose faith in everything: [...]
>> Or is this a recipe to blow up a power plant?
>
> This is a wart of Python 2.x. The dummy variable used in a list
> comprehension remains defined with its final value in the enclosing
> scope. [...] This behavior has been changed in Python 3.
>
> Guido van Rossum gives a summary of this issue near the end of this
> blog:
> http://python-history.blogspot.com/2010/06/from-list-comprehensions-to-generator.html

[I still do not know how to properly use the reply function here. I
apologise.]

Hi all, and thanks to all the responses. I think I would have expected
my code to behave like you said version 3.4 does. I would never have
thought 'x' was being changed during execution.

It took me nearly 2 hours to figure out what was going on in my code (it
was a lengthy piece of code and not so easy to spot).

Siegfried

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Re: [Numpy-discussion] Easter Egg or what I am missing here?
I agree; this 'wart' has also messed with my code a few times. I didn't
find it to be the case two years ago, but perhaps I should re-evaluate
whether the scientific Python stack has sufficiently migrated to
Python 3.

On Thu, May 22, 2014 at 7:35 AM, Siegfried Gonzi <siegfried.go...@ed.ac.uk> wrote:
> Warren Weckesser wrote:
>> This is a wart of Python 2.x. The dummy variable used in a list
>> comprehension remains defined with its final value in the enclosing
>> scope. [...] This behavior has been changed in Python 3.
>
> Hi all, and thanks to all the responses. I think I would have expected
> my code to behave like you said version 3.4 does. I would never have
> thought 'x' was being changed during execution.
> It took me nearly 2 hours to figure out what was going on (it was a
> lengthy piece of code and not so easy to spot).
>
> Siegfried