[Numpy-discussion] genfromtxt view with object dtype
hi,

i am using genfromtxt with a dtype like this:

    [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'),
     ('start', 'i4'), ('end', 'i4'), ('score', 'f8'),
     ('strand', '|S1'), ('phase', 'i4'), ('attrs', '|O4')]

i'm having problems with the attrs column, which i'd like to be a dict. i can specify a converter to parse a string into a dict, and the value is correctly converted to a dict, but then io.py tries to take a view() of the rows with that dtype and raises this error:

        A = np.genfromtxt(fname, **kwargs)
      File /usr/lib/python2.5/site-packages/numpy/lib/io.py, line 922, in genfromtxt
        output = rows.view(dtype)
    TypeError: Cannot change data-type for object array.

is there any way around this, or must that column be kept as a string? it seems like genfromtxt expects you to specify either a dtype _or_ a converter, not both.

thanks,
-brent

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] genfromtxt view with object dtype
OK, Brent, try r6341. I fixed genfromtxt for cases like yours (an explicit dtype involving np.object). Note that the fix won't work if the dtype is nested and involves np.object fields (we would hit the problem of renaming fields that we observed...). Let me know how it goes.
P.

On Feb 4, 2009, at 4:03 PM, Brent Pedersen wrote:

On Wed, Feb 4, 2009 at 9:36 AM, Pierre GM pgmdevl...@gmail.com wrote:

Brent,
Please post a simple, self-contained example with a few lines of the file you want to load.

hi pierre, here is an example.
thanks,
-brent

    import numpy as np
    from cStringIO import StringIO

    gffstr = """\
    ##gff-version 3
    1\tucb\tgene\t2234602\t2234702\t.\t-\t.\tID=grape_1_2234602_2234702;match=EVM_prediction_supercontig_1.248,EVM_prediction_supercontig_1.248.mRNA
    1\tucb\tgene\t2300292\t2302123\t.\t+\t.\tID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8
    1\tucb\tgene\t2303615\t2303967\t.\t+\t.\tID=grape_1_2303615_2303967;match=EVM_prediction_supercontig_244.8
    1\tucb\tgene\t2303616\t2303966\t.\t+\t.\tParent=grape_1_2303615_2303967
    1\tucb\tgene\t3596400\t3596503\t.\t-\t.\tID=grape_1_3596400_3596503;match=evm.TU.supercontig_167.27
    1\tucb\tgene\t3600651\t3600977\t.\t-\t.\tmatch=evm.model.supercontig_1217.1,evm.model.supercontig_1217.1.mRNA
    """

    dtype = {'names': ('seqid', 'source', 'type', 'start', 'end',
                       'score', 'strand', 'phase', 'attrs'),
             'formats': ['S24', 'S16', 'S16', 'i4', 'i4',
                         'f8', 'S1', 'i4', 'S128']}

    # OK with S128 for attrs
    print np.genfromtxt(StringIO(gffstr), dtype=dtype)

    def _attr(kvstr):
        # split 'key1=val1;key2=val2' into a dict
        pairs = [kv.split("=") for kv in kvstr.split(";")]
        return dict(pairs)

    # change S128 to object to have col attrs as dictionary
    dtype['formats'][-1] = 'O'
    converters = {8: _attr}

    # NOT OK
    print np.genfromtxt(StringIO(gffstr), dtype=dtype, converters=converters)
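The `_attr` converter in the example above could be sketched for Python 3 roughly as follows. This is only an illustration, not code from the thread: the name `parse_gff_attrs` is made up here, and the bytes handling assumes that, depending on the NumPy version and the `encoding` argument, a converter may receive `bytes` rather than `str`. It also tolerates stray whitespace and a trailing `;`, which the original one-liner does not.

```python
def parse_gff_attrs(kvstr):
    """Parse a GFF3-style attribute string like 'ID=a;match=b,c' into a dict.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # Assumption: the converter may be handed bytes instead of str,
    # so accept both and decode when needed.
    if isinstance(kvstr, bytes):
        kvstr = kvstr.decode("utf-8")

    attrs = {}
    for pair in kvstr.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty pieces, e.g. after a trailing ';'
        # partition splits on the first '=' only, so values may contain '='
        key, _, value = pair.partition("=")
        attrs[key.strip()] = value.strip()
    return attrs


if __name__ == "__main__":
    print(parse_gff_attrs("ID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8"))
```

It would be passed the same way as `_attr`, i.e. `converters={8: parse_gff_attrs}`, with the attrs column declared as an object field in the dtype.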
Re: [Numpy-discussion] genfromtxt view with object dtype
On Wed, Feb 4, 2009 at 8:51 PM, Pierre GM pgmdevl...@gmail.com wrote:

OK, Brent, try r6341.

that fixes it. thanks again pierre!
-b