On Mon, Jun 18, 2012 at 3:46 PM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no> wrote:
> On 06/18/2012 12:14 PM, Thouis (Ray) Jones wrote:
>> Based on some previous discussion on the numpy list [1] and in
>> now-cancelled PRs [2,3], I'd like to solicit opinions on adding an
>> interface for numpy memory allocation event tracking, as implemented
>> in this PR:
>>
>> https://github.com/numpy/numpy/pull/309
>>
>> A brief summary of the changes:
>>
>> - PyDataMem_NEW/FREE/RENEW become functions in the numpy API.
>>   (They used to be macros for malloc/free/realloc.)
>>   These are the functions used to manage allocations for an array's
>>   internal data.  Most other numpy data is allocated through Python's
>>   allocator.
>>
>> - PyDataMem_NEW/RENEW return void* instead of char*.
>>
>> - Adds PyDataMem_SetEventHook() to the API, with this description:
>>    * Sets the allocation event hook for numpy array data.
>>    * Takes a PyDataMem_EventHookFunc *, which has the signature:
>>    *     void hook(void *old, void *new, size_t size, void *user_data).
>>    * Also takes a void *user_data, and void **old_data.
>>    *
>>    * Returns a pointer to the previous hook or NULL.  If old_data is
>>    * non-NULL, the previous user_data pointer will be copied to it.
>>    *
>>    * If not NULL, hook will be called at the end of each
>>    * PyDataMem_NEW/FREE/RENEW:
>>    *   result = PyDataMem_NEW(size)
>>    *       -> (*hook)(NULL, result, size, user_data)
>>    *   PyDataMem_FREE(ptr)
>>    *       -> (*hook)(ptr, NULL, 0, user_data)
>>    *   result = PyDataMem_RENEW(ptr, size)
>>    *       -> (*hook)(ptr, result, size, user_data)
>>    *
>>    * When the hook is called, the GIL will be held by the calling
>>    * thread.  The hook should be written to be reentrant, if it performs
>>    * operations that might cause new allocation events (such as the
>>    * creation/destruction of numpy objects, or creating/destroying Python
>>    * objects which might cause a gc).
>>
>>
>> The PR also includes an example using the hook functions to track
>> allocation via Python callback functions (in
>> tools/allocation_tracking).
>>
>> Why I think this is worth adding to numpy, even though other tools may
>> be able to provide similar functionality:
>>
>> - numpy arrays use orders of magnitude more memory than most Python
>>   objects, and this is often a limiting factor in algorithms.
>>
>> - numpy can behave in complicated ways with regard to memory
>>   management, e.g., views, OWNDATA, temporaries, etc., making it
>>   sometimes difficult to know where memory usage problems are
>>   happening and why.
>>
>> - numpy attracts a large number of programmers who have limited
>>   low-level programming expertise and don't have the skills to use
>>   external tools (or the time/motivation to acquire those skills), but
>>   still need to be able to diagnose these sorts of problems.
>>
>> - Other tools are not well integrated with Python, and vary a great
>>   deal between OS and compiler setup.
>>
>> I appreciate any feedback.
>
> Are the hooks able to change how allocation happens/override allocation?
> If one goes to this much pain already, I think one might as well go the
> extra step and allow hooks to override memory allocation.
>
> At least something to think about -- of course the above (as I
> understand it) would be a good start on a pluggable allocator even if it
> isn't done right away.
>
> Examples:
>
>  - Allocate NumPy arrays in process-shared memory using shmem/mmap
>  - Allocate NumPy arrays on some boundary (16-byte, 4096-byte..) using
>    memalign

That's not present in the current change, but the choice to use "EventHook" rather than the more generic "Hook" was to avoid colliding with a change like that in the future.
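
To make the hook contract quoted above concrete, here is a minimal C sketch of an event-counting hook. It does not use numpy: the sketch_* functions are hypothetical stand-ins that only mimic the behaviour described in the PR summary (hook called at the end of each NEW/FREE/RENEW, SetEventHook returning the previous hook), and struct alloc_stats and count_events are names made up for illustration. With the PR applied, one would presumably register count_events through PyDataMem_SetEventHook() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same shape as the hook signature quoted in the PR description. */
typedef void (sketch_event_hook_func)(void *old, void *new_ptr,
                                      size_t size, void *user_data);

/* Hypothetical stand-ins for the hook state; not numpy's code. */
static sketch_event_hook_func *current_hook = NULL;
static void *current_user_data = NULL;

/* Install a hook and return the previous one.  If old_data is
 * non-NULL, the previous user_data pointer is copied to it. */
static sketch_event_hook_func *
sketch_set_event_hook(sketch_event_hook_func *hook, void *user_data,
                      void **old_data)
{
    sketch_event_hook_func *previous = current_hook;
    if (old_data != NULL) {
        *old_data = current_user_data;
    }
    current_hook = hook;
    current_user_data = user_data;
    return previous;
}

/* Each wrapper calls the hook at the end, as the description says. */
static void *
sketch_new(size_t size)
{
    void *result = malloc(size);
    if (current_hook != NULL) {
        (*current_hook)(NULL, result, size, current_user_data);
    }
    return result;
}

static void
sketch_free(void *ptr)
{
    /* Keep the address as an integer so the hook can report it
     * without reading a pointer value after free(). */
    uintptr_t old_addr = (uintptr_t)ptr;
    free(ptr);
    if (current_hook != NULL) {
        (*current_hook)((void *)old_addr, NULL, 0, current_user_data);
    }
}

static void *
sketch_renew(void *ptr, size_t size)
{
    uintptr_t old_addr = (uintptr_t)ptr;
    void *result = realloc(ptr, size);
    if (current_hook != NULL) {
        (*current_hook)((void *)old_addr, result, size, current_user_data);
    }
    return result;
}

/* Example hook: count events by kind and sum the requested sizes. */
struct alloc_stats {
    size_t n_new;
    size_t n_renew;
    size_t n_free;
    size_t bytes_requested;
};

static void
count_events(void *old, void *new_ptr, size_t size, void *user_data)
{
    struct alloc_stats *stats = user_data;
    assert(stats != NULL);
    /* Simplification: failed allocations are not treated specially. */
    if (old == NULL && new_ptr != NULL) {
        stats->n_new++;        /* NEW:   (NULL, result, size) */
    } else if (old != NULL && new_ptr == NULL) {
        stats->n_free++;       /* FREE:  (ptr, NULL, 0) */
    } else if (old != NULL && new_ptr != NULL) {
        stats->n_renew++;      /* RENEW: (ptr, result, size) */
    }
    stats->bytes_requested += size;
}
```

The kind of event is recovered purely from which of the two pointers is NULL, which is why a single hook signature can cover all three calls. This hook allocates nothing itself, so the reentrancy caveat in the description does not come into play here; a hook that called back into Python would have to deal with it.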