Re: [Numpy-discussion] fast numpy i/o

2011-06-21 Thread Simon Lyngby Kokkendorff
Hi,

  I have been using h5py a lot (both on windows and Mac OSX) and can only
recommend it- haven't tried the other options though

 Cheers,
 Simon


On Tue, Jun 21, 2011 at 8:24 PM, Derek Homeier 
de...@astro.physik.uni-goettingen.de wrote:

 On 21.06.2011, at 7:58PM, Neal Becker wrote:

  I think, in addition, that HDF5 is the only one that easily
  interoperates with MATLAB?
 
  Speaking of HDF5, I see:
 
  pyhdf5io  0.7 - Python module containing high-level hdf5 load and save
  functions.
  h5py  2.0.0 - Read and write HDF5 files from Python
 
  Any thoughts on the relative merits of these?

 In my experience, HDF5 access usually approaches disk access speed, and
 random access to sub-datasets should be significantly faster than reading in
 the entire file, though I have not tested this systematically.
 I had not heard about pyhdf5io (how does it work together with numpy?) -
 as an alternative to h5py I'd also mention pytables, though I prefer h5py
 for its cleaner/simpler interface (but that probably depends on your
 programming habits).

 HTH,
 Derek

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

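As an aside to the thread above, the random-access behaviour Derek describes can be sketched with h5py (the file and dataset names are made up for illustration; this assumes h5py is installed):

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "demo.h5")

# Write a dataset once ...
with h5py.File(path, "w") as f:
    f.create_dataset("data", data=np.arange(1_000_000, dtype=np.float64))

# ... then read back only a slice: HDF5 pulls just the requested
# region from disk instead of loading the whole file.
with h5py.File(path, "r") as f:
    chunk = f["data"][1000:1010]
```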


Re: [Numpy-discussion] What Requires C and what is just python

2011-03-21 Thread Simon Lyngby Kokkendorff
Hi Ben,

  It's very easy to package numpy (and most other modules) with py2exe,
which, as Dan mentioned above, will include all the necessary (also
non-Python) libraries in a dist folder. The folder to distribute can of
course get quite large if you include a lot of libraries - but I think that
the standard library plus numpy will stay below 5 MB.

 Cheers,
 Simon
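For reference, a minimal py2exe build script looks roughly like this (mytool.py is a placeholder for the actual script; this is a sketch of the usual pattern, not a tested configuration):

```python
# setup.py - run with: python setup.py py2exe
# Importing py2exe registers the "py2exe" command with distutils.
from distutils.core import setup
import py2exe

setup(console=["mytool.py"])  # builds dist\mytool.exe plus the needed DLLs
```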


On Mon, Mar 21, 2011 at 9:30 AM, Paul Anton Letnes 
paul.anton.let...@gmail.com wrote:


 On 20. mars 2011, at 16.08, Ben Smith wrote:

 
  So, in addition to my computer science work, I'm a PhD student in econ.
 Right now, the class is using GAUSS for almost everything. This sort of
 pisses me off because it means people are building libraries of code that
 become valueless when they graduate (because right now we get GAUSS licenses
 for free, but it is absurdly expensive later) -- particularly when this is
 the only language they know.
 
   So, I had this idea of building some command line tools to do the same
  things using the most basic pieces of NumPy (arrays, dot products, transpose
  and inverse -- that's it). And it is going great. My problem, however, is
  that I'd like to be able to share these tools, but I know I'm opening up a
  big can of worms where I have to go around building numpy on 75 people's
  computers. What I'd like to do is limit myself to just the functions that
  are implemented in Python, package it with py2exe and hand that to anyone
  who needs it. So, my question, if anyone knows: what's implemented in
  Python and what depends on the C libraries? Is this even possible?

 I can testify that on most Windows computers python(x,y) will give you
 everything you need - numpy, scipy, matplotlib, pyqt for GUI design, and
 much more.

 The only problem I ever saw was that some people had problems with $PATH
 not being set properly on Windows. But this was on machines that seemed to
 be full of other problems.

 Oh, and in my experience, it is easier to run python scripts from the
 generic windows command line than in the ipython shell.

 Good luck,
 Paul


Re: [Numpy-discussion] How to limit the numpy.memmap's RAM usage?

2010-10-25 Thread Simon Lyngby Kokkendorff
Hi List,

  I had similar problems on Windows. I tried to use memmaps to buffer a
large amount of data and process it in chunks, but whenever I did, I ended
up filling RAM completely, which crashed my Python script with a
MemoryError. This led me - actually on advice from this list - to the h5py
module, which has a nice numpy interface to the HDF5 file format. With the
h5py module it was much clearer to me what was being buffered on disk and
what was stored in RAM.

  Cheers,
  Simon
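The chunked pattern Simon describes could look like the following with h5py (the dataset name and chunk size are illustrative; assumes h5py is installed):

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "buffer.h5")
step = 100_000

# Write the data piece by piece; only the current piece lives in RAM.
with h5py.File(path, "w") as f:
    dset = f.create_dataset("data", shape=(1_000_000,), dtype="f8")
    for start in range(0, dset.shape[0], step):
        dset[start:start + step] = np.random.rand(step)

# Process it in the same chunked fashion.
total = 0.0
with h5py.File(path, "r") as f:
    dset = f["data"]
    for start in range(0, dset.shape[0], step):
        total += dset[start:start + step].sum()
```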


On Sun, Oct 24, 2010 at 2:15 AM, David Cournapeau courn...@gmail.com wrote:

 On Sun, Oct 24, 2010 at 12:44 AM, braingateway braingate...@gmail.com
 wrote:

 
  I agree with you about the point of using memmap.

  That is why the behavior
  is so strange to me.

 I think it is expected. What kind of behavior were you expecting? To
 be clear, if I have a lot of available RAM, I expect memmap arrays to
 take almost all of it (virtual memory ~ resident memory). Now, if at
 the same time another process starts taking a lot of memory, I expect
 the OS to automatically lower the resident memory of the process using
 the memmap.

 I did a small experiment on Mac OS X, creating a giant mmap'd array in
 numpy and, at the same time, running a small C program using mlock (to
 lock pages into physical memory). As soon as I lock a big area (where
 big means most of my physical RAM), the Python process dealing with
 the mmap area sees its resident memory decrease. As soon as I kill the
 C program locking the memory, the resident memory starts increasing
 again.

 cheers,

 David
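David's observation can be approximated from the Python side with the stdlib resource module (Unix only; ru_maxrss is in kilobytes on Linux and bytes on Mac OS X - this is a rough sketch, not his original C/mlock experiment):

```python
import os
import resource
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "big.dat")

# Create an ~80 MB file-backed array; its pages are not resident yet.
mm = np.memmap(path, dtype="f8", mode="w+", shape=(10_000_000,))
before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

mm[:] = 1.0  # touching every page pulls them into resident memory
after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

# after >= before here, but under memory pressure the OS is free to
# evict these pages again, as David describes.
```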


[Numpy-discussion] Accessing data in a large file

2010-06-17 Thread Simon Lyngby Kokkendorff
Hi list,

   I am new to this list, so forgive me if this is a trivial problem;
however, I would appreciate any help.

  I am using numpy to work with large amounts of data - sometimes too much
to fit into memory. Therefore I want to be able to store data in binary
files and use numpy to read chunks of the file into memory. I've tried
numpy.memmap, as well as numpy.save and numpy.load with mmap_mode='r'.
However, when I try to perform any nontrivial operation on a slice of the
memmap, I always end up reading the entire file into memory - which then
leads to memory errors. Is there a way to get numpy to do what I want, using
an internal platform-independent numpy format like .npy, or do I have to
wrap a custom file reader with something like ctypes?
Of course numpy.fromfile is a possibility, but it seems a rather inflexible
alternative, as it doesn't really support slices and might have
platform-dependency problems (byte order).

Hope that someone can help, cheers,
 Simon
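On the .npy route Simon asks about: np.load with mmap_mode='r' returns a memmap, and explicitly copying a slice into a plain array is one way to bound how much is materialized in RAM (the file name is illustrative):

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "big.npy")
np.save(path, np.arange(1_000_000, dtype=np.float64))

a = np.load(path, mmap_mode="r")  # opens the file without a bulk read
chunk = np.array(a[1000:1010])    # materialize just this slice
```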


Re: [Numpy-discussion] Accessing data in a large file

2010-06-17 Thread Simon Lyngby Kokkendorff
Thanks for the references to these libraries - they seem to solve my problem!

Cheers,
Simon


On Thu, Jun 17, 2010 at 2:58 PM, davide lasagna dav...@gmail.com wrote:

 You may have a look at the nice python-h5py module, which gives an OO
 interface to the underlying HDF5 file format. I'm using it for storing
 large amounts (~10 GB) of experimental data. Very fast, very convenient.

 Ciao

 Davide

 On Thu, 2010-06-17 at 08:33 -0400, greg whittier wrote:
  On Thu, Jun 17, 2010 at 4:21 AM, Simon Lyngby Kokkendorff
  sil...@gmail.com wrote:
   memory errors. Is there a way to get numpy to do what I want, using an
   internal platform independent numpy-format like .npy, or do I have to
 wrap a
   custom file reader with something like ctypes?
 
  You might give http://www.pytables.org/ a try.

