Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-20 Thread william ratcliff
Actually,
I do use spec when I have synchrotron experiments. But why are your files so
large?
On Nov 16, 2010 9:20 AM, Darren Dale dsdal...@gmail.com wrote:
 I am wrapping up a small package to parse a particular ascii-encoded
 file format generated by a program we use heavily here at the lab. (In
 the unlikely event that you work at a synchrotron, and use Certified
 Scientific's spec program, and are actually interested, the code is
 currently available at
 https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
 .)

 I have been benchmarking the project against another python package
 developed by a colleague, which is an extension module written in pure
 C. My python/cython project takes about twice as long to parse and
 index a file (~0.8 seconds for 100MB), which is acceptable. However,
 actually converting ascii strings to numpy arrays, which is done using
 numpy.fromstring, takes a factor of 10 longer than the extension
 module. So I am wondering about the performance of np.fromstring:

 import time
 import numpy as np
 s = b'1 ' * 2048 *1200
 d = time.time()
 x = np.fromstring(s)
 print time.time() - d


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-19 Thread Darren Dale
Sorry, I accidentally hit send long before I was finished writing. But
to answer your question, they contain many 2048-element multi-channel
analyzer spectra.

Darren

On Tue, Nov 16, 2010 at 9:26 AM, william ratcliff
william.ratcl...@gmail.com wrote:
 Actually,
 I do use spec when I have synchrotron experiments. But why are your files so
 large?

 On Nov 16, 2010 9:20 AM, Darren Dale dsdal...@gmail.com wrote:
 I am wrapping up a small package to parse a particular ascii-encoded
 file format generated by a program we use heavily here at the lab. (In
 the unlikely event that you work at a synchrotron, and use Certified
 Scientific's spec program, and are actually interested, the code is
 currently available at
 https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
 .)

 I have been benchmarking the project against another python package
 developed by a colleague, which is an extension module written in pure
 C. My python/cython project takes about twice as long to parse and
 index a file (~0.8 seconds for 100MB), which is acceptable. However,
 actually converting ascii strings to numpy arrays, which is done using
 numpy.fromstring, takes a factor of 10 longer than the extension
 module. So I am wondering about the performance of np.fromstring:

 import time
 import numpy as np
 s = b'1 ' * 2048 *1200
 d = time.time()
 x = np.fromstring(s)
 print time.time() - d


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-19 Thread Darren Dale
On Tue, Nov 16, 2010 at 10:31 AM, Darren Dale dsdal...@gmail.com wrote:
 On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen p...@iki.fi wrote:
 Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
 [clip]
 That loop takes 0.33 seconds to execute, which is a good start. I need
 some help converting this example to return an actual numpy array. Could
 anyone please offer a suggestion?

 Easiest way is probably to use ndarray buffers and resize them when
 needed. For example:

 https://github.com/pv/scipy-work/blob/enh/interpnd-smooth/scipy/spatial/qhull.pyx#L980

 Thank you Pauli. That makes it *incredibly* simple:

 import time
 cimport numpy as np
 import numpy as np

 cdef extern from 'stdlib.h':
    double atof(char*)


 def test():
    py_string = '100'
    cdef char* c_string = py_string
    cdef int i, j
    cdef double val
    i = 0
    j = 2048*1200
    cdef np.ndarray[np.float64_t, ndim=1] ret

    ret_arr = np.empty((2048*1200,), dtype=np.float64)
    ret = ret_arr

    d = time.time()
    while i < j:
        c_string = py_string
        ret[i] = atof(c_string)
        i += 1
    ret_arr.shape = (1200, 2048)
    print ret_arr, ret_arr.shape, time.time()-d

 The loop now takes only 0.11 seconds to execute. Thanks again.


One follow-up issue: I can't cythonize this code for python-3. I've
installed numpy with the most recent changes to the 1.5.x maintenance
branch, then re-installed cython-0.13, but when I run python3
setup.py build_ext --inplace with this setup script:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

import numpy

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [
        Extension(
            'test_open', ['test_open.pyx'],
            include_dirs=[numpy.get_include()]
        )
    ]
)


I get the following error. Any suggestions what I need to fix, or
should I report it to the cython list?

$ python3 setup.py build_ext --inplace
running build_ext
cythoning test_open.pyx to test_open.c

Error converting Pyrex file to C:

...
# For use in situations where ndarray can't replace PyArrayObject*,
# like PyArrayObject**.
pass

ctypedef class numpy.ndarray [object PyArrayObject]:
cdef __cythonbufferdefaults__ = {"mode": "strided"}
^


/Users/darren/.local/lib/python3.1/site-packages/Cython/Includes/numpy.pxd:173:49:
mode is not a buffer option

Error converting Pyrex file to C:

...
   cdef char* c_string = py_string
   cdef int i, j
   cdef double val
   i = 0
   j = 2048*1200
   cdef np.ndarray[np.float64_t, ndim=1] ret
   ^


/Users/darren/temp/test/test_open.pyx:16:8: 'ndarray' is not a type identifier
building 'test_open' extension
/usr/bin/gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g
-fwrapv -O3 -Wall -Wstrict-prototypes
-I/Users/darren/.local/lib/python3.1/site-packages/numpy/core/include
-I/opt/local/Library/Frameworks/Python.framework/Versions/3.1/include/python3.1
-c test_open.c -o build/temp.macosx-10.6-x86_64-3.1/test_open.o
test_open.c:1:2: error: #error Do not use this file, it is the result
of a failed Cython compilation.
error: command '/usr/bin/gcc-4.2' failed with exit status 1


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-19 Thread Pauli Virtanen
Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
[clip]
 That loop takes 0.33 seconds to execute, which is a good start. I need
 some help converting this example to return an actual numpy array. Could
 anyone please offer a suggestion?

Easiest way is probably to use ndarray buffers and resize them when 
needed. For example:

https://github.com/pv/scipy-work/blob/enh/interpnd-smooth/scipy/spatial/qhull.pyx#L980
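
The pattern there amounts to growing a buffer geometrically and trimming
it at the end. A minimal pure-Python sketch of the idea (not the qhull
code itself):

import numpy as np

def parse_tokens(tokens):
    # grow-and-trim buffer pattern: start small, double on overflow
    buf = np.empty(16, dtype=np.float64)
    n = 0
    for tok in tokens:
        if n == buf.shape[0]:
            buf = np.resize(buf, 2 * buf.shape[0])  # reallocates and copies
        buf[n] = float(tok)
        n += 1
    return buf[:n].copy()  # trim to the number of values actually parsed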



Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-19 Thread Pauli Virtanen
Tue, 16 Nov 2010 09:20:29 -0500, Darren Dale wrote:
[clip]
 module. So I am wondering about the performance of np.fromstring:

Fromstring is slow, probably because it must work around locale-
dependence of the underlying C parsing functions. Moreover, the Numpy 
parsing mechanism generates many indirect calls.
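
For instance, C's atof()/strtod() honor LC_NUMERIC, which Python's own
parsing does not. A quick way to see it (assuming a de_DE locale is
installed on the system):

import locale

locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')  # assumes this locale exists
print locale.atof('3,14')  # -> 3.14: C-style parsing follows the locale
print float('3.14')        # Python's float() is locale-independent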

-- 
Pauli Virtanen



Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-19 Thread Darren Dale
Apologies, I accidentally hit send...

On Tue, Nov 16, 2010 at 9:20 AM, Darren Dale dsdal...@gmail.com wrote:
 I am wrapping up a small package to parse a particular ascii-encoded
 file format generated by a program we use heavily here at the lab. (In
 the unlikely event that you work at a synchrotron, and use Certified
 Scientific's spec program, and are actually interested, the code is
 currently available at
 https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
 .)

 I have been benchmarking the project against another python package
 developed by a colleague, which is an extension module written in pure
 C. My python/cython project takes about twice as long to parse and
 index a file (~0.8 seconds for 100MB), which is acceptable. However,
 actually converting ascii strings to numpy arrays, which is done using
 numpy.fromstring,  takes a factor of 10 longer than the extension
 module. So I am wondering about the performance of np.fromstring:

import time
import numpy as np
s = b'1 ' * 2048 *1200
d = time.time()
x = np.fromstring(s, dtype='d', sep=b' ')
print time.time() - d

That takes about 1.3 seconds on my machine. A similar metric for the
extension module is to load 1200 of these 2048-element arrays from the
file:

d=time.time()
x=[s.mca(i+1) for i in xrange(1200)]
print time.time()-d

That takes about 0.127 seconds on my machine. This discrepancy is
unacceptable for my use case, so I need to develop an alternative to
fromstring. Here is a bit of testing with cython:

import time

cdef extern from 'stdlib.h':
    double atof(char*)

py_string = '100'
cdef char* c_string = py_string
cdef int i, j
i = 0
j = 2048*1200

d = time.time()
while i < j:
    c_string = py_string
    val = atof(c_string)
    i += 1
print val, time.time()-d


That loop takes 0.33 seconds to execute, which is a good start. I need
some help converting this example to return an actual numpy array.
Could anyone please offer a suggestion?

Thanks,
Darren


[Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-19 Thread Darren Dale
I am wrapping up a small package to parse a particular ascii-encoded
file format generated by a program we use heavily here at the lab. (In
the unlikely event that you work at a synchrotron, and use Certified
Scientific's spec program, and are actually interested, the code is
currently available at
https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
.)

I have been benchmarking the project against another python package
developed by a colleague, which is an extension module written in pure
C. My python/cython project takes about twice as long to parse and
index a file (~0.8 seconds for 100MB), which is acceptable. However,
actually converting ascii strings to numpy arrays, which is done using
numpy.fromstring,  takes a factor of 10 longer than the extension
module. So I am wondering about the performance of np.fromstring:

import time
import numpy as np
s = b'1 ' * 2048 *1200
d = time.time()
x = np.fromstring(s)
print time.time() - d


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-16 Thread Darren Dale
On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen p...@iki.fi wrote:
 Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
 [clip]
 That loop takes 0.33 seconds to execute, which is a good start. I need
 some help converting this example to return an actual numpy array. Could
 anyone please offer a suggestion?

 Easiest way is probably to use ndarray buffers and resize them when
 needed. For example:

 https://github.com/pv/scipy-work/blob/enh/interpnd-smooth/scipy/spatial/qhull.pyx#L980

Thank you Pauli. That makes it *incredibly* simple:

import time
cimport numpy as np
import numpy as np

cdef extern from 'stdlib.h':
    double atof(char*)


def test():
    py_string = '100'
    cdef char* c_string = py_string
    cdef int i, j
    cdef double val
    i = 0
    j = 2048*1200
    cdef np.ndarray[np.float64_t, ndim=1] ret

    ret_arr = np.empty((2048*1200,), dtype=np.float64)
    ret = ret_arr

    d = time.time()
    while i < j:
        c_string = py_string
        ret[i] = atof(c_string)
        i += 1
    ret_arr.shape = (1200, 2048)
    print ret_arr, ret_arr.shape, time.time()-d

The loop now takes only 0.11 seconds to execute. Thanks again.


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-16 Thread Christopher Barker
On 11/16/10 7:31 AM, Darren Dale wrote:
 On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanenp...@iki.fi  wrote:
 Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
 [clip]
 That loop takes 0.33 seconds to execute, which is a good start. I need
 some help converting this example to return an actual numpy array. Could
 anyone please offer a suggestion?

Darren,

It's interesting that you found fromstring() so slow -- I've put some
time into trying to get fromfile() and fromstring() to be a bit more
robust and featureful, but found it to be some really painful code to
work on -- and it didn't dawn on me that it would be slow too! I saw all
the layers of function calls, but I still thought they would be minimal
compared to the actual string parsing. I guess not. It shows that you never
know where your bottlenecks are without profiling.
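
A profiler makes the point quickly -- essentially all of the time lands
in the single fromstring call, i.e. down in the C layer rather than in
Python-level overhead. A quick check:

import cProfile
import numpy as np

s = b'1 ' * 2048 * 1200
# all the time shows up inside the one fromstring call
cProfile.run("np.fromstring(s, dtype='d', sep=' ')", sort='cumulative')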

Slow is relative, of course, but since the whole point of 
fromfile/string is performance (otherwise, we'd just parse with python), 
it would be nice to get them as fast as possible.

I had been thinking that the way to make a good fromfile was Cython, so 
you've inspired me to think about it some more. Would you be interested 
in extending what you're doing to a more general purpose tool?

Anyway,  a comment or two:
 cdef extern from 'stdlib.h':
  double atof(char*)

One thing I found with the current numpy code is that the use of the
ato* functions is a source of a lot of bugs (all of them?). The core
problem is error handling -- you have to do a lot of pointer checking to
see if a call was successful, and with the fromfile code, that error
handling is not done in all the layers of calls.
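
The core of it is that atof() returns 0.0 on failure, indistinguishable
from a legitimate zero. That is easy to see from Python via ctypes (a
sketch, using whatever libc ctypes finds on the system):

import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library('c'))
libc.atof.restype = ctypes.c_double
libc.atof.argtypes = [ctypes.c_char_p]

print libc.atof('garbage')  # -> 0.0, indistinguishable from parsing '0'
print libc.atof('0')        # -> 0.0
# strtod() at least reports where parsing stopped via its endptr argument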

Anyone know what the advantage of ato* is over scanf()/fscanf()?

Also, why are you doing string parsing rather than parsing the files 
directly? Wouldn't that be a bit faster?

I've got some C extension code for simple parsing of text files into 
arrays of floats or doubles (using fscanf). I'd be curious how the 
performance compares to what you've got. Let me know if you're interested.

-Chris


 def test():
     py_string = '100'
     cdef char* c_string = py_string
     cdef int i, j
     cdef double val
     i = 0
     j = 2048*1200
     cdef np.ndarray[np.float64_t, ndim=1] ret

     ret_arr = np.empty((2048*1200,), dtype=np.float64)
     ret = ret_arr

     d = time.time()
     while i < j:
         c_string = py_string
         ret[i] = atof(c_string)
         i += 1
     ret_arr.shape = (1200, 2048)
     print ret_arr, ret_arr.shape, time.time()-d

 The loop now takes only 0.11 seconds to execute. Thanks again.


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-16 Thread Darren Dale
On Tue, Nov 16, 2010 at 11:46 AM, Christopher Barker
chris.bar...@noaa.gov wrote:
 On 11/16/10 7:31 AM, Darren Dale wrote:
 On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanenp...@iki.fi  wrote:
 Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
 [clip]
 That loop takes 0.33 seconds to execute, which is a good start. I need
 some help converting this example to return an actual numpy array. Could
 anyone please offer a suggestion?

 Darren,

 It's interesting that you found fromstring() so slow -- I've put some
 time into trying to get fromfile() and fromstring() to be a bit more
 robust and featureful, but found it to be some really painful code to
 work on -- and it didn't dawn on me that it would be slow too! I saw all
 the layers of function calls, but I still thought they would be minimal
 compared to the actual string parsing. I guess not. It shows that you never
 know where your bottlenecks are without profiling.

 Slow is relative, of course, but since the whole point of
 fromfile/string is performance (otherwise, we'd just parse with python),
 it would be nice to get them as fast as possible.

 I had been thinking that the way to make a good fromfile was Cython, so
 you've inspired me to think about it some more. Would you be interested
 in extending what you're doing to a more general purpose tool?

 Anyway,  a comment or two:
 cdef extern from 'stdlib.h':
      double atof(char*)

 One thing I found with the current numpy code is that the use of the
 ato* functions is a source of a lot of bugs (all of them?). The core
 problem is error handling -- you have to do a lot of pointer checking to
 see if a call was successful, and with the fromfile code, that error
 handling is not done in all the layers of calls.

In my case, I am making an assumption about the integrity of the file.

 Anyone know what the advantage of ato* is over scanf()/fscanf()?

 Also, why are you doing string parsing rather than parsing the files
 directly? Wouldn't that be a bit faster?

Rank inexperience, I guess. I don't understand what you have in mind.
scanf/fscanf don't actually convert strings to numbers, do they?

 I've got some C extension code for simple parsing of text files into
 arrays of floats or doubles (using fscanf). I'd be curious how the
 performance compares to what you've got. Let me know if you're interested.

I'm curious, yes.

Darren


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-16 Thread Christopher Barker
On 11/16/10 8:57 AM, Darren Dale wrote:
 In my case, I am making an assumption about the integrity of the file.

That does make things easier, but less universal. I guess this is the
whole trade-off about reusable code. It sure is a lot easier to write
code that does the one thing you need than something general purpose.

 Anyone know what the advantage of ato* is over scanf()/fscanf()?

 Also, why are you doing string parsing rather than parsing the files
 directly? Wouldn't that be a bit faster?

 Rank inexperience, I guess. I don't understand what you have in mind.

if your goal is to read numbers from an ascii file, you can use
fromfile() directly, rather than reading the file (or some of it) into a
string, and then using fromstring(). Also, in C, you can use fscanf to
read the file directly (of course, under the hood, it's putting stuff in
strings somewhere along the line, but presumably in an optimized way).
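
Something like this, assuming whitespace-separated doubles:

import numpy as np

# parse ascii straight from the file, no intermediate python string
arr = np.fromfile('test.dat', dtype=np.float64, sep=' ')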

 scanf/fscanf don't actually convert strings to numbers, do they?

yes, that's exactly what they do.

http://en.wikipedia.org/wiki/Scanf

The C lib may very well use ato* under the hood.

My idea at this point is to write a function in Cython that takes a file
and a numpy dtype, converts the dtype to a scanf format string, then
calls fscanf (or scanf) to parse out the file. My existing scanner code
more or less does that, but the format string is hard-coded to be either
for floats or doubles.
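
The dtype-to-format-string part is small; a sketch of what I mean (the
table below is illustrative, not exhaustive):

import numpy as np

# hypothetical dtype -> fscanf conversion-specifier table
_SCANF_FMT = {
    np.dtype(np.float64): '%lg',
    np.dtype(np.float32): '%g',
    np.dtype(np.int32): '%d',
}

def scanf_format(dtype):
    return _SCANF_FMT[np.dtype(dtype)]

print scanf_format('d')  # -> '%lg'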

 I've got some C extension code for simple parsing of text files into
 arrays of floats or doubles (using fscanf). I'd be curious how the
 performance compares to what you've got. Let me know if you're interested.

 I'm curious, yes.

OK -- I'll whip up a test similar to yours -- stay tuned!

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] seeking advice on a fast string-array conversion

2010-11-16 Thread Christopher Barker

On 11/16/10 10:01 AM, Christopher Barker wrote:


OK -- I'll whip up a test similar to yours -- stay tuned!


Here's what I've done:

import numpy as np
from maproomlib.utility import file_scanner

def gen_file():
    f = file('test.dat', 'w')
    for i in range(1200):
        f.write('1 ' * 2048)
        f.write('\n')
    f.close()

def read_file1():
    """read unknown length: doubles"""
    f = file('test.dat')
    arr = file_scanner.FileScan(f)
    f.close()
    return arr

def read_file2():
    """read known length: doubles"""
    f = file('test.dat')
    arr = file_scanner.FileScanN(f, 1200*2048)
    f.close()
    return arr

def read_file3():
    """read known length: singles"""
    f = file('test.dat')
    arr = file_scanner.FileScanN_single(f, 1200*2048)
    f.close()
    return arr

def read_fromfile1():
    """read unknown length with fromfile(): singles"""
    f = file('test.dat')
    arr = np.fromfile(f, dtype=np.float32, sep=' ')
    f.close()
    return arr

def read_fromfile2():
    """read unknown length with fromfile(): doubles"""
    f = file('test.dat')
    arr = np.fromfile(f, dtype=np.float64, sep=' ')
    f.close()
    return arr

def read_fromstring1():
    """read unknown length with fromstring(): singles"""
    f = file('test.dat')
    s = f.read()
    arr = np.fromstring(s, dtype=np.float32, sep=' ')
    f.close()
    return arr

And the results (ipython's timeit):

In [40]: timeit test.read_fromfile1()
1 loops, best of 3: 561 ms per loop

In [41]: timeit test.read_fromfile2()
1 loops, best of 3: 570 ms per loop

In [42]: timeit test.read_file1()
1 loops, best of 3: 336 ms per loop

In [43]: timeit test.read_file2()
1 loops, best of 3: 341 ms per loop

In [44]: timeit test.read_file3()
1 loops, best of 3: 515 ms per loop

In [46]: timeit test.read_fromstring1()
1 loops, best of 3: 301 ms per loop

So my filescanner is faster, but not radically so, than fromfile().
However, reading the whole file into a string, then using fromstring(),
is, in fact, the fastest method -- interesting -- shows you why you need
to profile!


Also, with my code, reading singles is slower than doubles -- odd.
Perhaps the C lib fscanf reads doubles anyway, then converts to singles?


Anyway, for my needs, my file_scanner and fromfile() are fast enough, 
and much faster than parsing the files with Python. My issue with 
fromfile is flexibility and robustness -- it's buggy in the face of 
ill-formed files. See the list archives and the bug reports for more detail.


Still, it seems your very basic method is indeed a faster way to go.

I've enclosed the files. It's currently built as part of a larger lib, 
so no setup.py -- though it could be written easily enough.


-Chris



--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
#include "Python.h"


#include "numpy/arrayobject.h"

// NOTE: these buffer sizes were picked very arbitrarily, and have
// remarkably little impact on performance on my system.
#define BUFFERSIZE1 1024
#define BUFFERSIZE2 64


int filescan_double(FILE *infile, int NNums, double *array){

    double N;
    int i, j;
    int c;

    for (i=0; i<NNums; i++){
        while ( (j = fscanf(infile, "%lg", &N)) == 0 ){
            c = fgetc(infile);
        }
        if (j == EOF) {
            return(i);
        }
        array[i] = N;
    }
    // Go to the end of any whitespace:
    while ( isspace(c = fgetc(infile)) ){
        //printf("skipping a whitespace character: %i\n", c);
        //printf("I'm at position %i in the file\n", ftell(infile));
    }
    if (c > -1){
        // not EOF, rewind the file one byte.
        fseek(infile, -1, SEEK_CUR);
    }
    return(i);
}

int filescan_single(FILE *infile, int NNums, float *array){

    float N;
    int i, j;
    int c;

    for (i=0; i<NNums; i++){
        /*  while ( (j = fscanf(infile, "%lg", &N)) == 0 ){  */
        while ( (j = fscanf(infile, "%g", &N)) == 0 ){
            c = fgetc(infile);
        }
        if (j == EOF) {
            return(i);
        }
        array[i] = N;
    }
    // Go to the end of any whitespace:
    while ( isspace(c = fgetc(infile)) ){
        //printf("skipping a whitespace character: %i\n", c);
        //printf("I'm at position %i in the file\n", ftell(infile));
    }
    if (c > -1){
        // not EOF, rewind the file one byte.
        fseek(infile, -1, SEEK_CUR);
    }
    return(i);
}

static char doc_FileScanN[] =
"FileScanN(file, N)\n\n"
"Reads N values in the ascii file, and produces a Numeric vector of\n"
"length N full of Floats (C doubles).\n\n"
"Raises an exception if there are fewer than N numbers in the file.\n\n"
"All text in the file that is not part of a floating point number is\n"
"skipped over.\n\n"
"After reading N numbers, the file is left before the next non-whitespace\n"
"character in the file. This will often leave the file at the start of\n"
"the