jarod_v6--- via Tutor wrote:

> Dear All!
> I have these elements
>
> In [445]: pt = line.split("\t")[9]
>
> In [446]: pt
> Out[446]: 'gene_id "ENSG00000223972"; gene_version "5"; transcript_id
> "ENST00000456328"; transcript_version "2"; exon_number "1"; gene_name
> "DDX11L1"; gene_source "havana"; gene_biotype
> "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-002";
> transcript_source "havana"; transcript_biotype "processed_transcript";
> exon_id "ENSE00002234944"; exon_version "1"; tag "basic";
> transcript_support_level "1";\n'
>
> and I want to create a dictionary like this
>
> gene_id = "ENSG00000223972"; ...
>
> I found on Stack Overflow this way to create a dictionary of dictionaries
> (http://stackoverflow.com/questions/8550912/python-dictionary-of-dictionaries)
> # This is our sample data
> data = [("Milter", "Miller", 4), ("Milter", "Miler", 4), ("Milter",
> "Malter", 2)]
>
> # dictionary we want for the result
> dictionary = {}
>
> # loop that makes it work
> for realName, falseName, position in data:
>     dictionary.setdefault(realName, {})[falseName] = position
>
> I want to create a dictionary using setdefault, but I have difficulty
> transforming pt into a list of tuples.
>
> data = pt.split(";")
> <ipython-input-456-300c276109c6> in <module>()
>       1 for i in data:
>       2     l = i.split()
> ----> 3     print l[0]
>       4
>
> IndexError: list index out of range
>
> In [457]: for i in data:
>     l = i.split()
>     print l
>    .....:
> ['gene_id', '"ENSG00000223972"']
> ['gene_version', '"5"']
> ['transcript_id', '"ENST00000456328"']
> ['transcript_version', '"2"']
> ['exon_number', '"1"']
> ['gene_name', '"DDX11L1"']
> ['gene_source', '"havana"']
> ['gene_biotype', '"transcribed_unprocessed_pseudogene"']
> ['transcript_name', '"DDX11L1-002"']
> ['transcript_source', '"havana"']
> ['transcript_biotype', '"processed_transcript"']
> ['exon_id', '"ENSE00002234944"']
> ['exon_version', '"1"']
> ['tag', '"basic"']
> ['transcript_support_level', '"1"']
> []
>
> So how can I do that in a more elegant way?
> Thanks so much!!

I don't see why you would need dict.setdefault(); you already have the
necessary pieces together:

data = pt.split(";")
pairs = (item.split() for item in data)
mydict = {item[0]: item[1].strip('"') for item in pairs if len(item) == 2}

The len(item) == 2 test skips the empty list produced by the text after the
last ";", which is what caused your IndexError.

You can protect against whitespace in the quoted strings with
item.split(None, 1) instead of item.split(). If ";" is allowed in the quoted
strings you have to work a little harder.
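
To make the last two remarks concrete, here is a minimal sketch of both variants. The function names, the ATTRIBUTE pattern, and the shortened sample string are my own, not anything from the original data; the regex version also assumes that the quoted values never contain an escaped double quote.

```python
import re


def parse_attributes(text):
    """Split on ';', then split each item on its first run of whitespace."""
    result = {}
    for item in text.split(";"):
        # split(None, 1) splits only once, so a quoted value containing
        # spaces stays in one piece.
        parts = item.split(None, 1)
        if len(parts) == 2:  # skips the empty piece after the last ';'
            key, value = parts
            result[key] = value.strip().strip('"')
    return result


# Hypothetical pattern: a word, whitespace, then a double-quoted value.
# Assumption: values never contain an escaped double quote.
ATTRIBUTE = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_attributes_regex(text):
    """Find key/value pairs directly, so ';' inside quotes is harmless."""
    return dict(ATTRIBUTE.findall(text))


# Shortened version of the attribute string from the question.
pt = 'gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; tag "basic";\n'
print(parse_attributes(pt) == parse_attributes_regex(pt))
```

For well-behaved input both functions should give the same dictionary. Only the regex version keeps a value such as "a; b c" intact, because it never splits on ";" in the first place.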