On Tue, Oct 6, 2009 at 12:31 PM, <josef.p...@gmail.com> wrote:
> On Mon, Oct 5, 2009 at 5:22 PM, Elaine Angelino
> <elaine.angel...@gmail.com> wrote:
>> Hi there,
>>
>> We are writing to announce the release of "Tabular", a package of Python
>> modules for working with tabular data.
>>
>> Tabular is a package of Python modules for working with tabular data. Its
>> main object is the tabarray class, a data structure for holding and
>> manipulating tabular data. By putting data into a tabarray object, you’ll
>> get a representation of the data that is more flexible and powerful than a
>> native Python representation. More specifically, tabarray provides:
>>
>> -- ultra-fast filtering, selection, and numerical analysis methods, using
>> convenient Matlab-style matrix operation syntax
>> -- spreadsheet-style operations, including row & column operations, 'sort',
>> 'replace', 'aggregate', 'pivot', and 'join'
>> -- flexible load and save methods for a variety of file formats, including
>> delimited text (CSV), binary, and HTML
>> -- helpful inference algorithms for determining formatting parameters and
>> data types of input files
>> -- support for hierarchical groupings of columns, both as data structures
>> and file formats
>>
>> You can download Tabular from PyPI (http://pypi.python.org/pypi/tabular/) or
>> alternatively clone our hg repository from bitbucket
>> (http://bitbucket.org/elaine/tabular/). We also have posted tutorial-style
>> Sphinx documentation (http://www.parsemydata.com/tabular/).
>>
>> The tabarray object is based on the record array object from the Numerical
>> Python package (NumPy), and Tabular is built to interface well with NumPy in
>> general. Our intended audience is two-fold: (1) Python users who, though
>> they may not be familiar with NumPy, are in need of a way to work with
>> tabular data, and (2) NumPy users who would like to do spreadsheet-style
>> operations on top of their more "numerical" work.
>>
>> We hope that some of you find Tabular useful!
>>
>> Best,
>>
>> Elaine and Dan
>
> I briefly looked at the Sphinx docs and the code. Tabular looks pretty
> useful, and the code can be partially read as recipes for working with
> recarrays or structured arrays. Thanks for the choice of license (it makes
> looking at the code "legal").
>
> I didn't see any explicit nan handling. Are missing values allowed,
> e.g. in the constructor?
>
> I looked a bit closer at functions like tabular.fast.recarrayisin since
> I always have problems with these row operations.
> Are these functions supposed to work with arbitrary structured arrays?
> The tests are only for 1d integer arrays.
> With floats the default string representation doesn't sort correctly.
> Or am I misreading the function?
>
> >>> arr = np.array([6,1,2,1e-13,0.5*1e-14,1,2e25,3,0,7]).view([('',float)]*2)
> >>> arr
> array([(6.0, 1.0), (2.0, 1e-013), (5e-015, 1.0),
>        (2.0000000000000002e+025, 3.0), (0.0, 7.0)],
>       dtype=[('f0', '<f8'), ('f1', '<f8')])
> >>> np.sort([str(l) for l in arr])
> array(['(0.0, 7.0)', '(2.0, 1e-013)', '(2.0000000000000002e+025, 3.0)',
>        '(5e-015, 1.0)', '(6.0, 1.0)'],
>       dtype='|S30')
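
The ordering problem in the session above can be reproduced without Tabular. The sketch below is only an illustration, not Tabular's code: it assumes the same ten values arranged as a plain 5x2 float array (the names `rows`, `text_order`, and `numeric_order` are made up here) and compares the order obtained from sorting string forms with the numeric lexicographic order given by np.lexsort.

```python
import numpy as np

# The ten values from the session above, arranged as five (f0, f1) rows.
rows = np.array([6, 1, 2, 1e-13, 0.5 * 1e-14, 1, 2e25, 3, 0, 7]).reshape(5, 2)

# Order obtained by sorting the text form of each row, as in the session.
as_text = [str(tuple(r)) for r in rows.tolist()]
text_order = sorted(range(len(as_text)), key=as_text.__getitem__)

# Numeric lexicographic order; np.lexsort treats the LAST key as primary,
# so column 0 is passed last.
numeric_order = np.lexsort((rows[:, 1], rows[:, 0]))

print(text_order)              # [4, 1, 3, 2, 0]
print(numeric_order.tolist())  # [4, 2, 1, 0, 3]
```

The two orders differ because strings compare character by character, so a row starting with 2e+25 lands before one starting with 5e-15.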

Maybe this doesn't matter for the purpose of this function. I will download
and try the code before I make any more irrelevant comments.

Josef

> Being able to do a searchsorted on rows of an array would be a useful feature
> in numpy. Is there a sortable 1d representation of the rows of a 2d float or
> mixed type array?
>
> Thanks,
>
> Josef
>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
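
On the question of a sortable 1d representation of rows: one possible sketch, for a 2d float array only, is to sort the rows with np.lexsort and then binary-search a list of per-row tuples with Python's bisect module. This is neither a NumPy feature nor anything from Tabular. The helper names `sort_rows` and `row_searchsorted` are hypothetical, and mixed-type rows and nan values are not handled.

```python
import bisect
import numpy as np

def sort_rows(rows):
    """Return the rows of a 2d array in lexicographic order (column 0 first)."""
    # np.lexsort uses the last key as the primary one, so reverse the columns.
    return rows[np.lexsort(rows.T[::-1])]

def row_searchsorted(sorted_rows, query):
    """Leftmost index where `query` could be inserted, keeping row order."""
    # Tuples of Python floats compare element by element, which matches the
    # lexicographic order produced by sort_rows.
    keys = [tuple(r) for r in sorted_rows.tolist()]
    return bisect.bisect_left(keys, tuple(query))

rows = np.array([[6, 1], [2, 1e-13], [5e-15, 1], [2e25, 3], [0, 7]], dtype=float)
sorted_rows = sort_rows(rows)

print(row_searchsorted(sorted_rows, (2.0, 1e-13)))  # 2
print(row_searchsorted(sorted_rows, (3.0, 0.0)))    # 3
```

Building the tuple list costs O(n) per call, so this is meant to show the idea rather than to be fast; for repeated lookups the keys could be built once and reused.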