[Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Brent Pedersen
hi, i am using genfromtxt, with a dtype like this:
[('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'), ('start', 'i4'),
 ('end', 'i4'), ('score', 'f8'), ('strand', '|S1'), ('phase', 'i4'),
 ('attrs', '|O4')]

where i'm having problems with the attrs column, which i'd like to be a
dict. i can specify a converter to parse a string into a dict, and the
value is correctly converted to a dict,
but then io.py tries to take a view() of the rows with that dtype and
raises this error:

    A = np.genfromtxt(fname, **kwargs)
  File "/usr/lib/python2.5/site-packages/numpy/lib/io.py", line 922, in genfromtxt
    output = rows.view(dtype)
TypeError: Cannot change data-type for object array.


is there any way around this, or must that column be kept as a string?
it seems like genfromtxt expects you to specify either a dtype _or_ a
converter, not both.

thanks,
-brent
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
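
A possible workaround for the question above, sketched under assumptions: load attrs as a plain string column, which genfromtxt accepts, then copy the rows into a second structured array whose attrs field is object and parse each string into a dict there. The helper name parse_attrs, the trimmed three-field dtype, and the two sample rows are illustrative and not from the original post. The sketch targets Python 3 and passes encoding explicitly so that fields arrive as str. It also shows, in isolation, that view() refuses to reinterpret an object array, which is the failure in the traceback.

```python
import io
import numpy as np


def parse_attrs(kvstr):
    """Parse 'k1=v1;k2=v2' into a dict (hypothetical helper)."""
    return dict(kv.split("=", 1) for kv in kvstr.split(";"))


# view() refuses to reinterpret memory that holds Python object references.
try:
    np.array([{}, {}], dtype=object).view(np.intp)
    view_failed = False
except TypeError:
    view_failed = True

# Illustrative tab-separated rows (seqid, start, attrs); not the poster's file.
text = (
    "1\t2300292\tID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8\n"
    "1\t2303616\tParent=grape_1_2303615_2303967\n"
)

# Step 1: load with attrs as a fixed-width string column.
str_dtype = [("seqid", "U24"), ("start", "i4"), ("attrs", "U128")]
raw = np.genfromtxt(io.StringIO(text), dtype=str_dtype,
                    delimiter="\t", encoding="utf-8")

# Step 2: copy into an array whose attrs field holds Python objects,
# parsing each attribute string into a dict on the way.
obj_dtype = [("seqid", "U24"), ("start", "i4"), ("attrs", object)]
out = np.empty(raw.shape, dtype=obj_dtype)
out["seqid"] = raw["seqid"]
out["start"] = raw["start"]
for i, kvstr in enumerate(raw["attrs"]):
    out["attrs"][i] = parse_attrs(kvstr)
```

The cost of this route is a second copy of the data and a Python-level loop over the attrs column, so it is a fallback for when the converter cannot be combined with an object dtype directly.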


Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Pierre GM
OK, Brent, try r6341.
I fixed genfromtxt for cases like yours (an explicit dtype involving
np.object).
Note that the fix won't work if the dtype is nested and involves
np.object (we would hit the field-renaming problem we observed...).
Let me know how it goes.
P.
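
For readers on Python 3, the case described above (an explicit, flat dtype with one object field plus a converter for that column) might look like the sketch below. It assumes a NumPy release that contains this fix. The helper name parse_attrs and the two sample rows are illustrative; the rows use numeric score and phase values instead of the "." placeholders in the poster's data to keep the example minimal. The U string dtypes and the explicit encoding are choices made here so that the converter receives str.

```python
import io
import numpy as np


def parse_attrs(kvstr):
    """Parse 'k1=v1;k2=v2' into a dict (hypothetical helper)."""
    return dict(kv.split("=", 1) for kv in kvstr.split(";"))


# Two illustrative GFF-like rows; the '##' header is skipped as a comment.
text = (
    "##gff-version 3\n"
    "1\tucb\tgene\t2300292\t2302123\t0.5\t+\t0\t"
    "ID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8\n"
    "1\tucb\tgene\t2303616\t2303966\t0.9\t+\t0\t"
    "Parent=grape_1_2303615_2303967\n"
)

# Flat (non-nested) dtype whose last field stores arbitrary Python objects.
dtype = [("seqid", "U24"), ("source", "U16"), ("type", "U16"),
         ("start", "i4"), ("end", "i4"), ("score", "f8"),
         ("strand", "U1"), ("phase", "i4"), ("attrs", object)]

# Column 8 (attrs) goes through the converter; the others follow the dtype.
arr = np.genfromtxt(io.StringIO(text), dtype=dtype, delimiter="\t",
                    converters={8: parse_attrs}, encoding="utf-8")

print(arr["attrs"][0]["ID"])
```

Each element of arr["attrs"] is then an ordinary dict, so lookups such as arr["attrs"][1]["Parent"] work per row. A nested dtype is the case the note above says is not covered.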

On Feb 4, 2009, at 4:03 PM, Brent Pedersen wrote:

 On Wed, Feb 4, 2009 at 9:36 AM, Pierre GM pgmdevl...@gmail.com wrote:

 On Feb 4, 2009, at 12:09 PM, Brent Pedersen wrote:

 hi, i am using genfromtxt, with a dtype like this:
 [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'), ('start', 'i4'),
  ('end', 'i4'), ('score', 'f8'), ('strand', '|S1'), ('phase', 'i4'),
  ('attrs', '|O4')]

 Brent,
 Please post a simple, self-contained example with a few lines of the
 file you want to load.



 hi pierre, here is an example.
 thanks,
 -brent

##
# Python 2 example, as posted (cStringIO, print statement).

import numpy as np
from cStringIO import StringIO

# Tab-separated GFF3 records; each \t escape becomes a real tab.
gffstr = """\
##gff-version 3
1\tucb\tgene\t2234602\t2234702\t.\t-\t.\tID=grape_1_2234602_2234702;match=EVM_prediction_supercontig_1.248,EVM_prediction_supercontig_1.248.mRNA
1\tucb\tgene\t2300292\t2302123\t.\t+\t.\tID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8
1\tucb\tgene\t2303615\t2303967\t.\t+\t.\tID=grape_1_2303615_2303967;match=EVM_prediction_supercontig_244.8
1\tucb\tgene\t2303616\t2303966\t.\t+\t.\tParent=grape_1_2303615_2303967
1\tucb\tgene\t3596400\t3596503\t.\t-\t.\tID=grape_1_3596400_3596503;match=evm.TU.supercontig_167.27
1\tucb\tgene\t3600651\t3600977\t.\t-\t.\tmatch=evm.model.supercontig_1217.1,evm.model.supercontig_1217.1.mRNA
"""

dtype = {'names':
             ('seqid', 'source', 'type', 'start', 'end',
              'score', 'strand', 'phase', 'attrs'),
         'formats':
             ['S24', 'S16', 'S16', 'i4', 'i4', 'f8',
              'S1', 'i4', 'S128']}

# OK with S128 for attrs
print np.genfromtxt(StringIO(gffstr), dtype=dtype)


def _attr(kvstr):
    # Turn "k1=v1;k2=v2" into {'k1': 'v1', 'k2': 'v2'}.
    pairs = [kv.split("=") for kv in kvstr.split(";")]
    return dict(pairs)

# change S128 to object to have col attrs as dictionary
dtype['formats'][-1] = 'O'
converters = {8: _attr}
# NOT OK
print np.genfromtxt(StringIO(gffstr), dtype=dtype,
                    converters=converters)


Re: [Numpy-discussion] genfromtxt view with object dtype

2009-02-04 Thread Brent Pedersen
On Wed, Feb 4, 2009 at 8:51 PM, Pierre GM pgmdevl...@gmail.com wrote:
 OK, Brent, try r6341.
 I fixed genfromtxt for cases like yours (an explicit dtype involving
 np.object).
 Note that the fix won't work if the dtype is nested and involves
 np.object (we would hit the field-renaming problem we observed...).
 Let me know how it goes.
 P.


that fixes it. thanks again pierre!
-b



