Re: [BangPypers] How should I do it?
On Fri, Jan 15, 2010 at 1:04 PM, Dhananjay Nene dhananjay.n...@gmail.com wrote:

> This seems to be the output of PHP's print_r. If you have the flexibility, try to have the PHP code output the data in a language-neutral format (e.g. JSON, YAML, XML) and then parse it in Python using the appropriate parser. If not, you may have to write a custom parser. I did google to find if one existed, but couldn't easily locate one.

There is http://www.php.net/manual/en/book.json.php for PHP, and Python 2.6 onwards has json as part of the stdlib.

If you don't have access to the web server, you might be able to use the PHP interpreter on your own machine to parse this into something more language neutral.

--
~noufal
http://nibrahim.net.in
___
BangPypers mailing list
BangPypers@python.org
http://mail.python.org/mailman/listinfo/bangpypers
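If the PHP side can be changed to emit JSON, the Python side becomes very short. The following is only a rough sketch: the JSON text is a hypothetical example of what PHP's json_encode() might produce for two of the records, and "first translation" is taken to mean the one with the highest probability, which is an assumption about the intent.

```python
import json

# Hypothetical JSON, standing in for what the PHP side might emit.
php_json = r"""
{
  "confident": {"count": 4,
                "trans": {"ashahvasahta": 0.74918568,
                          "atahmavaishahvaasa": 0.09095465}},
  "consumers": {"count": 4,
                "trans": {"upabhaokahtaa": 0.75144362,
                          "upabhaokahtaaom\\.n": 0.12980166}}
}
"""

records = json.loads(php_json)

# For each English word, pick the translation with the highest probability.
pairs = [
    (english, max(entry["trans"], key=entry["trans"].get))
    for english, entry in records.items()
]

for english, translation in pairs:
    print(english, translation)
```

Picking the maximum rather than the first key avoids relying on the order in which the keys happen to be stored.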
Re: [BangPypers] How should I do it?
On Friday 15 Jan 2010 12:01:56 pm Eknath Venkataramani wrote:

> and I need to extract confident, ashahvasahta from the first record, consumers, upabhaokahtaa from the second record... i.e. the word in English and the first word in the probable translations

#!/usr/bin/python
# print() as a function, so this runs on Python 2.6+ and Python 3
from __future__ import print_function

# One dict per record. 'trans' is a list of (translation, probability)
# pairs, most probable first. Raw strings keep the backslashes literal.
words = [
    {'english': 'confident',
     'count': 4,
     'trans': [
         ('ashahvasahta', 0.74918568),
         ('atahmavaishahvaasa', 0.09095465),
         (r'pahraaram\.nbha', 0.06990729),
         ('mailatae', 0.02856427),
         ('utanai', 0.01929341),
         ('anaa', 0.01578552),
         ('uthaanae', 0.01403157),
         ('jaitanae', 0.01227762),
     ],
    },
    {'english': 'consumers',
     'count': 4,
     'trans': [
         ('upabhaokahtaa', 0.75144362),
         (r'upabhaokahtaaom\.n', 0.12980166),
     ],
    },
    {'english': 'a',
     'count': 1164,
     'trans': [
         ('eka', 0.14900491),
         ('kaisai', 0.08834675),
         ('haai', 0.06774697),
         ('kaoi', 0.05394308),
         ('kai', 0.04981982),
         (r'\(none\)', 0.04400085),
         ('kaa', 0.03726579),
         ('kae', 0.03446450),
     ],
    },
]

# English word followed by its first (most probable) translation
for word in words:
    print(word['english'], word['trans'][0][0])

--
regards
kg
http://lawgon.livejournal.com
Re: [BangPypers] How should I do it?
On Fri, Jan 15, 2010 at 2:40 PM, Anand Balachandran Pillai abpil...@gmail.com wrote:

> # Now, count and trans are not strings in
> # data, so Python will complain, hence we
> # define these as strings with same name!
> count, trans = 'count','trans'

Clever, that. I got to there, threw up my hands and went downstairs to eat lunch.

--
rm
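Only a fragment of that approach is quoted above, so the following is a guess at how the trick works rather than the code that was posted. It assumes the text has already been rewritten into Python dict-literal form with `count` and `trans` left as bare names; the `text` value and the `names` mapping are hypothetical.

```python
# Hypothetical input: dict-literal syntax, but `count` and `trans` are bare names.
text = "{'confident': {count: 4, trans: {'ashahvasahta': 0.74918568}}}"

# Bind each bare name to a string of the same name so that eval can resolve it.
names = {"count": "count", "trans": "trans"}

# An empty __builtins__ limits what the evaluated text can reach.
# It is not a real sandbox, so this is only reasonable for trusted files.
records = eval(text, {"__builtins__": {}}, names)

print(records["confident"]["trans"])
```

Passing the names as the locals mapping keeps them out of the module namespace, which is a small variation on defining `count` and `trans` as ordinary variables.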
Re: [BangPypers] How should I do it?
It is a clever hack, taking advantage of the nature of the data. But it is far faster than the other approaches posted here. I thought eval was evil :)

Regards,
BG

--
Baishampayan Ghose
b.ghose at gmail.com
Re: [BangPypers] How should I do it?
On Fri, Jan 15, 2010 at 4:00 PM, Baishampayan Ghose b.gh...@gmail.com wrote:

> It is a clever hack, taking advantage of the nature of the data. But it is far faster than the other approaches posted here. I thought eval was evil :)

The data looks like valid JSON. You can use simplejson.loads instead of eval.

Anand
Re: [BangPypers] How should I do it?
On Fri, Jan 15, 2010 at 4:13 PM, Anand Chitipothu anandol...@gmail.com wrote:

> On Fri, Jan 15, 2010 at 4:00 PM, Baishampayan Ghose b.gh...@gmail.com wrote:
> > It is a clever hack, taking advantage of the nature of the data. But it is far faster than the other approaches posted here. I thought eval was evil :)
>
> The data looks like valid JSON. You can use simplejson.loads instead of eval.

Python 2.6.2 (r262:71600, Aug 21 2009, 12:23:57)
[GCC 4.4.1 20090818 (Red Hat 4.4.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import simplejson
>>> data=open('data.txt').read().replace('[code]','').replace('[/code]','')
>>> data
'\nconfident = {\n count = 4,\n trans = {\nashahvasahta = 0.74918568,\n atahmavaishahvaasa = 0.09095465,\n pahraaram\\.nbha = 0.06990729,\nmailatae = 0.02856427,\n utanai = 0.01929341,\nanaa = 0.01578552,\nuthaanae = 0.01403157,\njaitanae = 0.01227762,\n },\n},\nconsumers = {\n count = 4,\n trans = {\n upabhaokahtaa = 0.75144362,\n upabhaokahtaaom\\.n = 0.12980166,\n sauda\\\xef\xbf\xbd\\\xef\xbf\xbd\\\xef\xbf\xbddha = 0.11875471,\n },\n},\na = {\n count = 1164,\n trans = {\n eka = 0.14900491,\n kaisai = 0.08834675,\nhaai = 0.06774697,\nkaoi = 0.05394308,\n kai = 0.04981982,\n\\(none\\) = 0.04400085,\n kaa = 0.03726579,\n kae = 0.03446450,\n },\n},\n\n'
>>> simplejson.loads(data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/simplejson/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/site-packages/simplejson/decoder.py", line 338, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 13 - line 37 column 1 (char 13 - 815)

--
--Anand
Re: [BangPypers] How should I do it?
On Fri, Jan 15, 2010 at 4:13 PM, Anand Chitipothu anandol...@gmail.com wrote:

> On Fri, Jan 15, 2010 at 4:00 PM, Baishampayan Ghose b.gh...@gmail.com wrote:
> > It is a clever hack, taking advantage of the nature of the data. But it is far faster than the other approaches posted here. I thought eval was evil :)
>
> The data looks like valid JSON. You can use simplejson.loads instead of eval.

Don't the '=' characters mess things up? One of the nice things about the repr of Python objects is that they're almost valid JSON. The same can't be said for PHP though.

--
~noufal
http://nibrahim.net.in
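To make the point about the '=' characters concrete, here is a rough sketch of what it would take to turn this format into text that a JSON parser accepts: quote every key, turn '=' into ':', wrap the records in one object, and drop the trailing commas. The function name `to_json_text` and the regular expressions are illustrative assumptions, tried only against data shaped like the sample in this thread.

```python
import json
import re


def to_json_text(text):
    """Rewrite `key = value` blocks into JSON text (a sketch, not a general converter)."""
    # Quote each key and turn '=' into ':'. json.dumps escapes any backslashes.
    text = re.sub(
        r"([^\s{}=,]+)\s*=",
        lambda m: json.dumps(m.group(1)) + ":",
        text,
    )
    # Wrap the top-level records in one object, then drop commas before '}'.
    text = "{" + text + "}"
    return re.sub(r",\s*\}", "}", text)


sample = r"""
a = {
    count = 1164,
    trans = {
        eka = 0.14900491,
        \(none\) = 0.04400085,
    },
},
"""

records = json.loads(to_json_text(sample))
print(records["a"]["trans"])
```

Without the rewrite, `json.loads(sample)` raises a ValueError, in line with the traceback shown earlier in the thread.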
[BangPypers] psyco V2
Hi,

Came across this post on codespeak. Christian Tismer of Stackless fame has taken up psyco and created a V2 of it, and it seems the effort continues...

http://codespeak.net/pipermail/pypy-dev/2009q3/005288.html

Interesting stuff...

Best regards,
Vishal Sapre
Re: [BangPypers] psyco V2
On Fri, Jan 15, 2010 at 5:27 PM, Vishal vsapr...@gmail.com wrote:

> Hi,
>
> Came across this post on codespeak. Christian Tismer of Stackless fame has taken up psyco and created a V2 of it, and it seems the effort continues...
>
> http://codespeak.net/pipermail/pypy-dev/2009q3/005288.html

Nice. Thanks for pointing this out.

--
~noufal
http://nibrahim.net.in
Re: [BangPypers] How should I do it?
On Fri, Jan 15, 2010 at 12:01 PM, Eknath Venkataramani eknath.i...@gmail.com wrote:

> I have a txt file in the following format:
>
> [code]
> confident = {
>     count = 4,
>     trans = {
>         ashahvasahta = 0.74918568,
>         atahmavaishahvaasa = 0.09095465,
>         pahraaram\.nbha = 0.06990729,
>         mailatae = 0.02856427,
>         utanai = 0.01929341,
>         anaa = 0.01578552,
>         uthaanae = 0.01403157,
>         jaitanae = 0.01227762,
>     },
> },
> consumers = {
>     count = 4,
>     trans = {
>         upabhaokahtaa = 0.75144362,
>         upabhaokahtaaom\.n = 0.12980166,
>         sauda\�\�\�dha = 0.11875471,
>     },
> },
> a = {
>     count = 1164,
>     trans = {
>         eka = 0.14900491,
>         kaisai = 0.08834675,
>         haai = 0.06774697,
>         kaoi = 0.05394308,
>         kai = 0.04981982,
>         \(none\) = 0.04400085,
>         kaa = 0.03726579,
>         kae = 0.03446450,
>     },
> },
> [/code]
>
> and I need to extract confident, ashahvasahta from the first record, consumers, upabhaokahtaa from the second record... i.e. the word in English and the first word in the probable translations
>
> Thanks in advance
> Eknath

Since I hadn't had a chance to write a recursive descent parser before, I took this opportunity to do a bit of an exercise. I used a parsing library called pyparsing.
-- Begin Code --
# coding=utf-8
# The two imports below let the same script run on Python 2.6+ and Python 3.
from __future__ import print_function
from functools import reduce
from pyparsing import *
import pprint
import sys

# Raw string so the backslashes stay literal. Keys containing anything other
# than letters and digits are double-quoted here so that quotedString can
# match them.
data = r'''
confident = {
    count = 4,
    trans = {
        ashahvasahta = 0.74918568,
        atahmavaishahvaasa = 0.09095465,
        "pahraaram\.nbha" = 0.06990729,
        mailatae = 0.02856427,
        utanai = 0.01929341,
        anaa = 0.01578552,
        uthaanae = 0.01403157,
        jaitanae = 0.01227762,
    },
},
consumers = {
    count = 4,
    trans = {
        upabhaokahtaa = 0.75144362,
        "upabhaokahtaaom\.n" = 0.12980166,
        "sauda\�\�\�dha" = 0.11875471,
    },
},
a = {
    count = 1164,
    trans = {
        eka = 0.14900491,
        kaisai = 0.08834675,
        haai = 0.06774697,
        kaoi = 0.05394308,
        kai = 0.04981982,
        "\(none\)" = 0.04400085,
        kaa = 0.03726579,
        kae = 0.03446450,
    },
}
'''

# Setup pyparsing tokens
dct = Forward()  # forward declaration: a dict can contain other dicts
pair_op = Literal("=")
comma = Literal(",").suppress()
beg_brace = Literal("{").suppress()
end_brace = Literal("}").suppress()
num = Word("0123456789.")
key = (Word(alphas + nums) ^ quotedString).setResultsName("key")
val = (num ^ dct).setResultsName("value")
key_value_pair = Group(key + pair_op.suppress() + val)
key_value_pair_list = delimitedList(key_value_pair)
dct << Group(beg_brace + key_value_pair_list + Optional(comma) + end_brace)

# parse data
parsed = key_value_pair_list.parseString(data)

# function to extract, i.e. form a Python data structure
def extract(result):
    if 'key' in result.keys():
        # a single key/value pair: recurse if the value is a nested dict
        if isinstance(result.value, ParseResults):
            return (result.key, extract(result.value))
        else:
            return (result.key, result.value)
    else:
        # a list of pairs: build a dict from them
        return dict(extract(elem) for elem in result)

# extract
extracted = extract(parsed)

# print extracted data
pprint.pprint(extracted, sys.stdout)

# print the English word and the translation with the highest probability
print("\n\n\nTranslations\n\n")
print(dict(
    (english,
     reduce(lambda x, y: (y[0], float(y[1])) if float(y[1]) > x[1] else x,
            translations['trans'].items(),
            ('', 0.0))[0])
    for english, translations in extracted.items()
))
-- End Code --

Dhananjay

--
blog: http://blog.dhananjaynene.com
twitter: http://twitter.com/dnene http://twitter.com/_pythonic
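For comparison, the same grammar is small enough to parse by hand with a recursive function and nothing beyond the standard library. This is a minimal Python 3 sketch, not a replacement for the pyparsing version above. The names `TOKEN`, `parse_pairs`, `parse` and `best_translations` are made up for illustration, every value is assumed to be a plain number, and the "first" translation is taken to be the one with the highest probability.

```python
import re

# Tokens: braces, '=', ',' or a run of any other non-space characters.
TOKEN = re.compile(r"[{}=,]|[^\s{}=,]+")


def parse_pairs(tokens, pos, closer):
    """Parse `key = value` pairs until `closer` (or end of input if closer is None)."""
    result = {}
    while pos < len(tokens) and tokens[pos] != closer:
        key = tokens[pos]
        if pos + 2 >= len(tokens) or tokens[pos + 1] != "=":
            raise ValueError("expected '= value' after %r" % key)
        pos += 2
        if tokens[pos] == "{":
            # Nested block: recurse, then step over the closing brace.
            value, pos = parse_pairs(tokens, pos + 1, "}")
            pos += 1
        else:
            value = float(tokens[pos])
            pos += 1
        result[key] = value
        # The comma after a pair is optional, which also allows trailing commas.
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
    if closer is not None and pos >= len(tokens):
        raise ValueError("missing closing brace")
    return result, pos


def parse(text):
    tokens = TOKEN.findall(text)
    result, pos = parse_pairs(tokens, 0, None)
    return result


def best_translations(records):
    # For each English word, keep the translation with the highest probability.
    return {
        english: max(entry["trans"], key=entry["trans"].get)
        for english, entry in records.items()
    }


sample = r"""
confident = {
    count = 4,
    trans = {
        ashahvasahta = 0.74918568,
        pahraaram\.nbha = 0.06990729,
    },
},
consumers = {
    count = 4,
    trans = {
        upabhaokahtaa = 0.75144362,
        upabhaokahtaaom\.n = 0.12980166,
    },
},
"""

records = parse(sample)
print(best_translations(records))
```

Because the tokenizer treats any run of characters other than whitespace, braces, '=' and ',' as one token, keys such as `pahraaram\.nbha` need no quoting in this version.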