[Tutor] really basic py/regex
Hi. Trying to quickly get re.match() to extract the groups from the string x="MATH 59900/40 [47490] - THE ". The regex has to return MATH, 59900, 40, and 47490. d=re.match(r'(\D+)...) gets the MATH... but I can't see (yet) how to get the rest of what I need. Pointers would be useful. Thanks ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
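A sketch of one pattern that pulls out all four fields; the exact separators (space, slash, brackets) are read off the example string and may need loosening for other course formats:

```python
import re

x = "MATH 59900/40 [47490] - THE "

# dept: letters, course: digits, "/", section digits, then "[crn]"
m = re.match(r'([A-Z]+)\s+(\d+)/(\d+)\s+\[(\d+)\]', x)
dept, course, section, crn = m.groups()
print(dept, course, section, crn)
```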
Re: [Tutor] prime factorisation
I think you would do much better if you wrote pseudo code first, i.e. wrote each step out in words; code is much easier to write following pseudo code. Are you trying to factor prime numbers? A prime number factors only as the prime number and 1. https://en.wikipedia.org/wiki/Table_of_prime_factors#1_to_100 https://www.mathsisfun.com/prime-factorization.html
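For the common case of factoring an arbitrary integer into primes, the pseudo-code suggestion might translate to a simple trial-division sketch like this (the function name is just illustrative):

```python
def prime_factors(n):
    """Return the prime factorisation of n as a list, smallest first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # divide out each prime factor completely
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever is left over is itself prime
        factors.append(n)
    return factors

print(prime_factors(60))
```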
[Tutor] vol 166, issue 20, 1. installing python and numpy on the Mac (OSX) (Peter Hodges)
sudo -H python3.6 -m pip install numpy
Re: [Tutor] really basic question..
Lord... redid a search just now. Found a bunch of sites that said it's doable.. embarrassed. Not sure what I was looking for earlier.. need r u m! On Sat, Aug 5, 2017 at 11:44 AM, bruce <badoug...@gmail.com> wrote: > Hey guys. > > A really basic question. I have the following: > try: > element = WebDriverWait(driver, > 100).until(EC.presence_of_element_located((By.ID, > "remarketingStoreId"))) > except TimeoutException: > driver.close() > > > I was wondering can I do something like the following to handle > "multiple" exceptions? Ie, have an "except" block that catches all > issues other than the specific TimeoutException. > > try: > element = WebDriverWait(driver, > 100).until(EC.presence_of_element_located((By.ID, > "remarketingStoreId"))) > except TimeoutException: > driver.close() > except : > driver.close() > > > I've looked all over SO, as well as the net in general. I might have > just missed what I was looking for though. > > Comments?? Thanks much.
[Tutor] really basic question..
Hey guys. A really basic question. I have the following:

try:
    element = WebDriverWait(driver, 100).until(
        EC.presence_of_element_located((By.ID, "remarketingStoreId")))
except TimeoutException:
    driver.close()

I was wondering: can I do something like the following to handle "multiple" exceptions? I.e., have an "except" block that catches all issues other than the specific TimeoutException.

try:
    element = WebDriverWait(driver, 100).until(
        EC.presence_of_element_located((By.ID, "remarketingStoreId")))
except TimeoutException:
    driver.close()
except:
    driver.close()

I've looked all over SO, as well as the net in general. I might have just missed what I was looking for though. Comments?? Thanks much.
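That general shape does work: Python tries except clauses top to bottom, so the specific exception goes first and a catch-all goes last (note a bare `except:` also catches things like KeyboardInterrupt, so `except Exception:` is usually safer). A minimal sketch, using the builtin TimeoutError in place of selenium's TimeoutException so it runs standalone:

```python
def risky():
    raise ValueError("boom")

try:
    risky()
except TimeoutError:      # the specific case, checked first
    handled = "timeout"
except Exception:         # everything else falls through to here
    handled = "other"

print(handled)
```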
[Tutor] pythonic ascii decoding!
Hi guys. Testing getting data from a number of different US based/targeted websites, so the input data source, for the most part, will be "ascii". I'm getting a few "weird" chars every now and then and, as far as I can tell, they should be utf-8. However, the following hasn't always worked: s=str(s).decode('utf-8').strip() So, is there a quick/dirty approach I can use to simply strip out the "non-ascii" chars? I know this might not be the "best/pythonic" way, and that it might result in loss of some data/chars, but I can live with it for now. thoughts/comments ?? thanks
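One quick/dirty option is an encode with errors="ignore", which simply drops anything outside ASCII. Shown here in Python 3 syntax; in Python 2 the same idea would be spelled with decode('utf-8', 'ignore') on the byte string:

```python
raw = b"caf\xc3\xa9 menu"                 # utf-8 bytes with one non-ascii char
text = raw.decode("utf-8", "ignore")      # bytes -> text, skipping bad bytes
ascii_only = text.encode("ascii", "ignore").decode("ascii")  # drop non-ascii
print(ascii_only)
```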
[Tutor] basic decorator question
Hi. I've seen sites discuss decorators as functions that "wrap" and return functions. But I'm sooo confused! My real question though: can a decorator have multiple internal functions? All the examples I've seen so far have a single internal function. And if a decorator can have multiple internal functions, how would the calling sequence work? But as a start, if you have pointers to any really "basic" step by step sites/examples I can look at, I'd appreciate it. I suspect I'm getting flummoxed by something simple. thanks
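Yes, a decorator can define as many internal functions as it likes; only the one it returns replaces the decorated function, and the others are just private helpers it calls. A sketch (all the names here are made up for illustration):

```python
import functools

def checked(func):
    # helper 1: runs before the wrapped call
    def validate(args):
        if not args:
            raise ValueError("need at least one argument")
    # helper 2: runs after the wrapped call
    def describe(result):
        return "result=%r" % (result,)
    # only this one is returned, so only it "wraps" func
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        validate(args)
        result = func(*args, **kwargs)
        describe(result)        # the helpers are ordinary closures
        return result
    return wrapper

@checked
def add(a, b):
    return a + b

print(add(2, 3))
```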
[Tutor] centos 7 - new setup.. weird python!
Hi. Testing setting up a new CentOS 7 instance. I ran python -v from the cmdline... and instantly got a bunch of the following! Pretty sure this isn't correct. Anyone able to give pointers as to what I've missed? thanks

python -v
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# /usr/lib64/python2.7/site.pyc matches /usr/lib64/python2.7/site.py
import site # precompiled from /usr/lib64/python2.7/site.pyc
# /usr/lib64/python2.7/os.pyc matches /usr/lib64/python2.7/os.py
import os # precompiled from /usr/lib64/python2.7/os.pyc
. . .
[Tutor] using sudo pip install
Hey guys.. Wanted to get thoughts. On an IRC chat, someone stated emphatically: never do a "sudo pip install --upgrade...". The claim was that it could cause issues, enough to seriously (possibly) damage the OS. So, is this true??
Re: [Tutor] subprocess.Popen / proc.communicate issue
Cameron!!! You are 'da man!! Read your explanation.. good stuff to recheck/test and investigate over time. In the short term, I'll implement some tests!! thanks! On Thu, Mar 30, 2017 at 6:51 PM, Cameron Simpson <c...@zip.com.au> wrote: > I wrote a long description of how .communicate can deadlock. > > Then I read the doco more carefully and saw this: > > Warning: Use communicate() rather than .stdin.write, .stdout.read > or .stderr.read to avoid deadlocks due to any of the other OS > pipe buffers filling up and blocking the child process. > > This suggests that .communicate uses Threads to send and to gather data > independently, and that therefore the deadlock situation may not arise. > > See what lsof and strace tell you; all my other advice stands regardless, > and > the deadlock description may or may not be relevant. Still worth reading and > understanding it when looking at this kind of problem. > > Cheers, > Cameron Simpson <c...@zip.com.au> > > > On 31Mar2017 09:43, Cameron Simpson <c...@zip.com.au> wrote: >> >> On 30Mar2017 13:51, bruce <badoug...@gmail.com> wrote: >>> >>> Trying to understand the "correct" way to run a sys command ("curl") >>> and to get the potential stderr. Checking Stackoverflow (SO), implies >>> that I should be able to use a raw/text cmd, with "shell=true". >> >> >> I strongly recommend avoiding shell=True if you can. It has many problems. >> All stackoverflow advice needs to be considered with caution. However, that >> is not the source of your deadlock. >> >>> If I leave the stderr out, and just use >>>s=proc.communicate() >>> the test works... >>> >>> Any pointers on what I might inspect to figure out why this hangs on >>> the proc.communicate process/line?? >> >> >> When it is hung, run "lsof" on the processes from another terminal i.e. >> lsof the python process and also lsof the curl process. That will make clear >> the connections between them, particularly which file descriptors ("fd"s) >> are associated with what. 
>> >> Then run "strace" on the processes. That should show you what system calls >> are in progress in each process. >> >> My expectation is that you will see Python reading from one file >> descriptor and curl writing to a different one, and neither progressing. >> >> Personally I avoid .communicate and do more work myself, largely to know >> precisely what is going on with my subprocesses. >> >> The difficulty with .communicate is that Python must read both stderr and >> stdout separately, but it will be doing that sequentially: read one, then >> read the other. That is just great if the command is "short" and writes a >> small enough amount of data to each. The command runs, writes, and exits. >> Python reads one and sees EOF after the data, because the command has >> exited. Then Python reads the other and collects the data and sees EOF >> because the command has exited. >> >> However, if the output of the command is large on whatever stream Python >> reads _second_, the command will stall writing to that stream. This is >> because Python is not reading the data, and therefore the buffers fill >> (stdio in curl plus the buffer in the pipe). So the command ("curl") stalls >> waiting for data to be consumed from the buffers. And because it has >> stalled, the command does not exit, and therefore Python does not see EOF on >> the _first_ stream. So it sits waiting for more data, never reading from the >> second stream. >> >> [...snip...] 
>>> >>> cmd='[r" curl -sS ' >>> #cmd=cmd+'-A "Mozilla/5.0 (X11; Linux x86_64; rv:38.0) >>> Gecko/20100101 Firefox/38.0"' >>> cmd=cmd+"-A '"+user_agent+"'" >>> ##cmd=cmd+' --cookie-jar '+cname+' --cookie '+cname+'' >>> cmd=cmd+' --cookie-jar '+ff+' --cookie '+ff+'' >>> #cmd=cmd+'-e "'+referer+'" -d "'+tt+'" ' >>> #cmd=cmd+'-e "'+referer+'"' >>> cmd=cmd+"-L '"+url1+"'"+'"]' >>> #cmd=cmd+'-L "'+xx+'" ' >> >> >> Might I recommend something like this: >> >> cmd_args = [ 'curl', '-sS' ] >> cmd_args.extend( [ '-A', user_agent ] ) >> cmd_args.extend( [ '--cookie-jar', ff, '--cookie', ff ] ) >> cmd_args.extend( [ '-L', url ] ) >> >> and using shell=False. This totally avoids any need to "quote" strings in >> the command, becau
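Cameron's list-args suggestion, fleshed out as a runnable sketch; `echo` stands in for curl here so there is no network dependency, and with shell=False (the default) nothing needs quoting:

```python
import subprocess

# build the argument list directly; no shell, so no quoting headaches
cmd_args = ["echo", "hello world"]

proc = subprocess.Popen(cmd_args,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
out, err = proc.communicate()
print(out, proc.returncode)
```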
[Tutor] test
sent a question earlier.. and got a reply saying it was in the moderation process???
[Tutor] subprocess.Popen / proc.communicate issue
Trying to understand the "correct" way to run a sys command ("curl") and to get the potential stderr. Checking Stackoverflow (SO) implies that I should be able to use a raw/text cmd with "shell=true". If I leave the stderr out, and just use s=proc.communicate(), the test works... Any pointers on what I might inspect to figure out why this hangs on the proc.communicate process/line?? I'm showing a very small chunk of the test, but it's the relevant piece. Thanks

. . .
cmd='[r" curl -sS '
#cmd=cmd+'-A "Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0"'
cmd=cmd+"-A '"+user_agent+"'"
##cmd=cmd+' --cookie-jar '+cname+' --cookie '+cname+''
cmd=cmd+' --cookie-jar '+ff+' --cookie '+ff+''
#cmd=cmd+'-e "'+referer+'" -d "'+tt+'" '
#cmd=cmd+'-e "'+referer+'"'
cmd=cmd+"-L '"+url1+"'"+'"]'
#cmd=cmd+'-L "'+xx+'" '

try_=1
while(try_):
    proc=subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    s,err=proc.communicate()
    s=s.strip()
    err=err.strip()
    if(err==0):
        try_=''
. . . 
the cmd is generated to be:

cmd=[r" curl -sS -A 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; yie8)' --cookie-jar /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp --cookie /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp -L 'http://www6.austincc.edu/schedule/index.php?op=browse=ViewSched=216F000=PCACC=2016=CC'"]

test code hangs; ctrl-C generates the following:

^CTraceback (most recent call last):
  File "/crawl_tmp/austinccFetch_cloud_test.py", line 3363, in
    ret=fetchClassSectionFacultyPage(a)
  File "/crawl_tmp/austinccFetch_cloud_test.py", line 978, in fetchClassSectionFacultyPage
    (s,err)=proc.communicate()
  File "/usr/lib64/python2.6/subprocess.py", line 732, in communicate
    stdout, stderr = self._communicate(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1328, in _communicate
    stdout, stderr = self._communicate_with_poll(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1400, in _communicate_with_poll
    ready = poller.poll(self._remaining_time(endtime))
KeyboardInterrupt

This works from the cmdline:

curl -sS -A 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; yie8)' --cookie-jar /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp --cookie /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp -L 'http://www6.austincc.edu/schedule/index.php?op=browse=ViewSched=216F000=PCACC=2016=CC'
[Tutor] implementing sed - termination error
Hi. Running a test on a linux box, with python. Trying to do a search/replace over a file, for a given string, replacing the string with a chunk of text that has multiple lines. From the cmdline, using sed, no prob. However, implementing sed runs into issues that result in a "termination error". The error gets thrown due to the "\" of the newline. SO, and other sites, have plenty to say about this, but I haven't run across any solution. The test file contains 6K lines, but the process requires doing lots of search/replace operations, so I'm interested in testing this method to see how "fast" the overall process is. The following pseudo code is what I've used to test, the key point being changing the "\n" portion to try to resolve the termination error.

import subprocess
ll_="ffdfdfdfg"
ll2_="12112121212121212"
hash="a"
data_=ll_+"\n"+ll2_+"\n"+qq22_
print data_
cc='sed -i "s/'+hash+'/'+data_+'/g" '+dname
print cc
proc=subprocess.Popen(cc, shell=True, stdout=subprocess.PIPE)
res=proc.communicate()[0].strip()

=== error
sed: -e expression #1, char 38: unterminated `s' command
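Since the replacement text contains newlines, an alternative that sidesteps sed's quoting entirely is to do the search/replace in Python itself. A sketch using a temp file (variable names loosely follow the post):

```python
import os
import tempfile

hash_ = "a"                              # the marker to replace
data_ = "ffdfdfdfg\n12112121212121212"   # multi-line replacement text

# set up a small file containing the marker
path = tempfile.mkstemp()[1]
with open(path, "w") as f:
    f.write("xxx a yyy\n")

# read, replace, write back; no sed, so "\n" needs no escaping at all
with open(path) as f:
    text = f.read()
with open(path, "w") as f:
    f.write(text.replace(hash_, data_))

with open(path) as f:
    out = f.read()
os.remove(path)
print(out)
```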
[Tutor] a bit off topic.. - more of a high level arch question!
Hi. Thinking of a situation where I have two "processes" running. They each want to operate on a list of files in a dir, on a first come first operate basis. Once a process finishes with a file, it deletes it. Only one process operates on a file. I'm curious for ideas/thoughts. As far as I can tell, using some sort of PID/lock file is "the" way of handling this:

ProcessA looks to see if the PIDFile is in use;
if it is, I wait a "bit";
if the PIDFile is "empty", I set it and proceed;
when I finish my work, I reset the PIDFile.

As long as both/all processes follow this logic, things should work, unless you get a "race" condition on the PIDFile. Any thoughts on how you might handle this kind of situation, short of having a master process that forks/spawns off children, with the master iterating through the list of files? Thanks..
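One way to avoid the race on a shared lock file is to make the claim itself atomic: each process tries to os.rename() the work file to a private name, and on POSIX rename is atomic, so exactly one process can win. A sketch:

```python
import os
import tempfile

def claim(path, worker_id):
    """Atomically claim `path`; return the new name, or None if we lost."""
    claimed = "%s.claimed.%s" % (path, worker_id)
    try:
        os.rename(path, claimed)   # atomic on POSIX: exactly one winner
        return claimed
    except OSError:                # someone else already renamed it away
        return None

workdir = tempfile.mkdtemp()
job = os.path.join(workdir, "job1.txt")
open(job, "w").close()

mine = claim(job, "A")     # the first claim succeeds
theirs = claim(job, "B")   # the file is gone now, so this one loses
print(mine, theirs)
```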
[Tutor] selenium bindings...
Hi. This is prob way off topic. Looking at web examples from different sites for selenium/python bindings. Basically, trying to get an understanding of how to get the "page" content of a page, after an implicit/explicit wait. I can see how to get an element, but can't see any site that describes how to get the complete page... As an example of getting an element:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()

But, as to getting the complete page, in the "try".. no clue. Any thoughts/pointers?? thanks
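For the complete page, selenium exposes the rendered document as the driver.page_source attribute, so inside the try, after the wait succeeds, you would read driver.page_source. Sketched here with a stand-in object so it runs without a browser:

```python
# Stand-in for a real selenium webdriver; only the attribute matters here.
class FakeDriver(object):
    page_source = "<html><body><div id='myDynamicElement'>loaded</div></body></html>"

driver = FakeDriver()
# with real selenium this line goes right after WebDriverWait(...).until(...)
html = driver.page_source
print(len(html))
```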
[Tutor] xpath - html entities issue --
Hi. Just realized I might have a prob with testing a crawl. I get a page of data via a basic curl. The returned data is html/charset-utf-8. I did a quick replace('&amp;','&') and it replaced the '&amp;' as desired, so the content only had '&' in it. I then did a parseString/xpath to extract what I wanted, and realized I have '&amp;' as representative of the '&' in the returned xpath content. My issue: is there a way/method/etc to only return the actual char, not the html entity (&amp;)? I can provide a more comprehensive chunk of code, but minimized the post to get to the heart of the issue. Also, I'd prefer not to use a sep parse lib.

code chunk

import libxml2dom
q1=libxml2dom
s2= q1.parseString(a.toString().strip(), html=1)
tt=s2.xpath(tpath)
tt=tt[0].toString().strip()
print "tit "+tt

the content of a.toString() (shortened)
. . . Organization Development & Change Edition: 10th . . .

the xpath results are
Organization Development &amp; Change Edition: 10th

As you can see, in the results of the xpath (toString()) the & --> &amp;. I'm wondering if there's a process that can be used within the toString(), or do you really have to wrap each xpath/toString with an unescape() kind of process to convert htmlentities to the requisite chars? Thanks
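The stdlib can do the entity-to-char conversion without a separate parsing lib; xml.sax.saxutils.unescape handles the basic entities, so wrapping the toString() output would look like:

```python
from xml.sax.saxutils import unescape

tt = "Organization Development &amp; Change Edition: 10th"
clean = unescape(tt)   # &amp; -> &, &lt; -> <, &gt; -> >
print(clean)
```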
Re: [Tutor] unicode decode/encode issue
Hey folks. (peter!) Thanks for the reply. I wound up doing:

#s=s.replace('\u2013', '-')
#s=s.replace(u'\u2013', '-')
#s=s.replace(u"\u2013", "-")
#s=re.sub(u"\u2013", "-", s)
s=s.encode("ascii", "ignore")
s=s.replace(u"\u2013", "-")
s=s.replace("", "-") ##<<< this was actually in the raw content apparently
print repr(s)

The test no longer has the unicode 'dash'. I'll revisit and simplify later. One or two of the above lines should be able to be removed and still have the unicode issue resolved. Thanks On Mon, Sep 26, 2016 at 1:54 PM, Peter Otten <__pete...@web.de> wrote: > bruce wrote: > > > Hi. > > > > Ive got a "basic" situation that should be simpl. So it must be a user > > (me) issue! > > > > > > I've got a page from a web fetch. I'm simply trying to go from utf-8 to > > ascii. I'm not worried about any cruft that might get stripped out as the > > data is generated from a us site. (It's a college/class dataset). > > > > I know this is a unicode issue. I know I need to have a much more > > robust/ythnic/correct approach. I will later, but for now, just want to > > resolve this issue, and get it off my plate so to speak. > > > > I've looked at stackoverflow, as well as numerous other sites, so I turn > > to the group for a pointer or two... > > > > The unicode that I'm dealing with is 'u\2013' > > > > The basic things I've done up to now are: > > > > s=content > > s=ascii_strip(s) > > s=s.replace('\u2013', '-') > > s=s.replace(u'\u2013', '-') > > s=s.replace(u"\u2013", "-") > > s=re.sub(u"\u2013", "-", s) > > print repr(s) > > > > When I look at the input content, I have : > > > > u'English 120 Course Syllabus \u2013 Fall \u2013 2006' > > > > So, any pointers on replacing the \u2013 with a simple '-' (dash) (or I > > could even handle just a ' ' (space) > > I suppose you want to replace the DASH with HYPHEN-MINUS. 
For that both > > > s=s.replace(u'\u2013', '-') > > s=s.replace(u"\u2013", "-") > > should work (the Python interpreter sees no difference between the two). > Let's try: > > >>> s = u'English 120 Course Syllabus \u2013 Fall \u2013 2006' > >>> t = s.replace(u"\u2013", "-") > >>> s == t > False > >>> s > u'English 120 Course Syllabus \u2013 Fall \u2013 2006' > >>> t > u'English 120 Course Syllabus - Fall - 2006' > > So it look like you did not actually try the code you posted. > > To remove all non-ascii codepoints you can use encode(): > > >>> s.encode("ascii", "ignore") > 'English 120 Course Syllabus Fall 2006' > > (Note that the result is a byte string)
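In Python 3 terms (where every str is already unicode, so no u prefix is needed), the two approaches Peter contrasts look like this:

```python
s = "English 120 Course Syllabus \u2013 Fall \u2013 2006"

replaced = s.replace("\u2013", "-")                     # swap the dash
stripped = s.encode("ascii", "ignore").decode("ascii")  # drop it entirely
print(replaced)
print(stripped)
```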
[Tutor] unicode decode/encode issue
Hi. I've got a "basic" situation that should be simple. So it must be a user (me) issue! I've got a page from a web fetch. I'm simply trying to go from utf-8 to ascii. I'm not worried about any cruft that might get stripped out, as the data is generated from a US site. (It's a college/class dataset.) I know this is a unicode issue. I know I need to have a much more robust/pythonic/correct approach. I will later, but for now, just want to resolve this issue and get it off my plate so to speak. I've looked at stackoverflow, as well as numerous other sites, so I turn to the group for a pointer or two... The unicode that I'm dealing with is u'\u2013'. The basic things I've done up to now are:

s=content
s=ascii_strip(s)
s=s.replace('\u2013', '-')
s=s.replace(u'\u2013', '-')
s=s.replace(u"\u2013", "-")
s=re.sub(u"\u2013", "-", s)
print repr(s)

When I look at the input content, I have:

u'English 120 Course Syllabus \u2013 Fall \u2013 2006'

So, any pointers on replacing the \u2013 with a simple '-' (dash)? (Or I could even handle just a ' ' (space).) thanks
Re: [Tutor] Unable to download , using Beautifulsoup
Hey Alan... Wow, APIs.. yeah.. would be cool!!! I've worked on scraping data from lots of public sites that have no issue with it (as long as you're kind) but have no clue/resources regarding offering APIs. However, yeah, if you're looking to "rip" off a site that has adverts, prob not a cool thing to do, no matter what tools are used. On Fri, Jul 29, 2016 at 6:59 PM, Alan Gauld via Tutor <tutor@python.org> wrote: > On 29/07/16 23:10, bruce wrote: > > > The most "complete" is the use of a headless browser. However, the > > use/implementation of a headless browser has its own share of issues. > > Speed, complexity, etc... > > Walter and Bruce have jumped ahead a few steps from where I was > heading but basically it's an increasingly common scenario where > web pages are no longer primarily html but rather are > Javascript programs that fetch data dynamically. > > A headless browser is the brute force way to deal with such issues > but a better (purer?) way is to access the same API that the browser > is using. Many web sites now publish RESTful APIs with web > services that you can call directly. It is worth investigating > whether your target has this. If so that will generally provide > a much nicer solution than trying to drive a headless browser. > > Finally you need to consider whether you have the right to the > data without running a browser? Many sites provide information > for free but get paid by adverts. If you bypass the web screen > (adverts) you bypass their revenue and they do not allow that. > So you need to be sure that you are legally entitled to scrape > data from the site or use an API. > > Otherwise you may be on the wrong end of a lawsuit, or at > best be contributing to the demise of the very site you are > trying to use. 
> > -- > Alan G > Author of the Learn to Program web site > http://www.alan-g.me.uk/ > http://www.amazon.com/author/alan_gauld > Follow my photo-blog on Flickr at: > http://www.flickr.com/photos/alangauldphotos
Re: [Tutor] Unable to download , using Beautifulsoup
In following up on what Walter said: if the browser without cookies/javascript enabled doesn't generate the content, you need to have a different approach. The most "complete" is the use of a headless browser. However, the use/implementation of a headless browser has its own share of issues. Speed, complexity, etc... A potentially better/more useful method is to view/look at the traffic (livehttpheaders for Firefox) to get a feel for exactly what the browser requires. At the same time, view the subordinate jscript functions. I've found it's often enough to craft the requisite cookies/curl functions in order to simulate the browser data. In a few cases though, I've run across situations where a headless browser is the only real solution. On Fri, Jul 29, 2016 at 3:28 AM, Crusier wrote: > I am using Python 3 on Windows 7. > > However, I am unable to download some of the data listed in the web > site as follows: > > http://data.tsci.com.cn/stock/00939/STK_Broker.htm > > 453.IMC 98.28M 18.44M 4.32 5.33 1499.Optiver 70.91M 13.29M 3.12 5.34 > 7387.花旗环球 52.72M 9.84M 2.32 5.36 > > When I use Google Chrome and use 'View Page Source', the data does not > show up at all. However, when I use 'Inspect', I am able to read the > data. > > '1453.IMC' > '98.28M' > '18.44M' > '4.32' > '5.33' > > '1499.Optiver ' > ' 70.91M' > '13.29M ' > '3.12' > '5.34' > > Please kindly explain to me if the data is hidden in a CSS style sheet or > is there any way to retrieve the data listed. 
> Thank you
>
> Regards, Crusier
>
> from bs4 import BeautifulSoup
> import urllib
> import requests
>
> stock_code = ('00939', '0001')
>
> def web_scraper(stock_code):
>
>     broker_url = 'http://data.tsci.com.cn/stock/'
>     end_url = '/STK_Broker.htm'
>
>     for code in stock_code:
>
>         new_url = broker_url + code + end_url
>         response = requests.get(new_url)
>         html = response.content
>         soup = BeautifulSoup(html, "html.parser")
>         Buylist = soup.find_all('div', id ="BuyingSeats")
>         Selllist = soup.find_all('div', id ="SellSeats")
>
>         print(Buylist)
>         print(Selllist)
>
> web_scraper(stock_code)
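The "craft the requisite cookies/headers" idea from the reply above, sketched with the stdlib (Python 3's urllib.request; in Python 2 this was urllib2). The header and cookie values are made-up placeholders, and only the request object is built, with no network call:

```python
import urllib.request

# build a request that mimics a browser; the values here are placeholders
req = urllib.request.Request(
    "http://example.com/page",
    headers={
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",  # pretend-browser UA
        "Cookie": "sessionid=abc123",                     # made-up cookie
    },
)
ua = req.get_header("User-agent")   # urllib normalizes header-name case
print(ua)
```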
Re: [Tutor] Counting and grouping dictionary values in Python 2.7
On Fri, Jul 8, 2016 at 1:33 PM, Alan Gauld via Tutor <tutor@python.org> wrote: > On 08/07/16 14:22, Bruce Dykes wrote: > > > with it is writing the list of dictionaries to a .csv file, and to date, > > we've been able to get by doing some basic analysis by simply using grep > > and wc, but I need to do more with it now. > > I'm a big fan of using the right tool for the job. > If you got your data in CSV have you considered using a > spreadsheet to read the data and analyse it? They have lots > of formulae and stats functions built in and can do really > cool graphs etc and can read csv files natively. > > Python might be a better tool if you want regular identical reports, say > on a daily basis, but for ad-hoc analysis, or at least till you know > exactly what you need, Excel or Calc are possibly better tools. > > > We can and have used spreadsheets for small ad-hoc things, but no, we need two things, first, as noted, a daily report with various basic analyses, mainly totals, and percentages, and second, possibly, some near-current alarm checks, depending. That's less important, actually, but it might be a nice convenience. In the first instance, we want the reports to be accessed and displayed as web pages. Now, likewise, I'm sure there's a CMS that might make semi-quick work of this as well, but really, all I need to do is to display some web pages and run some cgi scripts. bkd
[Tutor] Counting and grouping dictionary values in Python 2.7
I'm compiling application logs from a bunch of servers, reading the log entries, parsing each log entry into a dictionary, and compiling all the log entries into a single list of dictionaries. At present, all I'm doing with it is writing the list of dictionaries to a .csv file, and to date we've been able to get by doing some basic analysis by simply using grep and wc, but I need to do more with it now. Here's what the data structures look like:

NY = ['BX01','BX02','BK01','MN01','SI01']
NJ = ['NW01','PT01','PT02']
CT = ['ST01','BP01','NH01']

sales = [
{'store':'store','date':'date','time':'time','state':'state','transid':'transid','product':'product','price':'price'},
{'store':'BX01','date':'8','time':'08:55','state':'NY','transid':'387','product':'soup','price':'2.59'},
{'store':'NW01','date':'8','time':'08:57','state':'NJ','transid':'24','product':'apples','price':'1.87'},
{'store':'BX01','date':'8','time':'08:56','state':'NY','transid':'387','product':'crackers','price':'3.44'}]

The first group of lists with the state abbreviations is there to add the state information to the compiled log, as it's not included in the application log. The first dictionary in the list, with the duplicated key names in the value field, is there to provide a header line as the first line in the compiled .csv file. Now, what I need to do with this is arbitrarily count and total the values in the dictionaries, i.e. the total amount and number of items for transaction id 387, or the total number of crackers sold in NJ stores. I think the collections library has the functions I need, but I haven't been able to grok the example uses I've seen online. Likewise, I know I could build a lot of what I need using regex and lists, etc., but if Python 2.7 already has the blocks there to be used, well, let's use the blocks then. 
Also, is there any particular advantage to pickling the list and having two files (the pickled file to be read as a data source, and the .csv file for portability/readability), as opposed to just a single .csv file that gets reparsed by the reporting script? Thanks in advance bkd
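For the counting/totalling part, collections.defaultdict (or Counter) gets there with one loop; a sketch over a trimmed version of the sales data above:

```python
from collections import defaultdict

sales = [
    {'store': 'BX01', 'state': 'NY', 'transid': '387', 'product': 'soup',     'price': '2.59'},
    {'store': 'NW01', 'state': 'NJ', 'transid': '24',  'product': 'apples',   'price': '1.87'},
    {'store': 'BX01', 'state': 'NY', 'transid': '387', 'product': 'crackers', 'price': '3.44'},
]

totals = defaultdict(float)   # transid -> dollar total
counts = defaultdict(int)     # transid -> number of items
for row in sales:
    totals[row['transid']] += float(row['price'])
    counts[row['transid']] += 1

print(counts['387'], totals['387'])
```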
[Tutor] decorators -- treat me like i'm 6.. what are they.. why are they?
Hi. Saw the decorator thread earlier.. didn't want to pollute it. I know, I could google! But what are decorators, why are decorators, and who decided you needed them?! Thanks!
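The one-line answer: @decorator is just shorthand for func = decorator(func), i.e. a function that takes a function and hands back a replacement, which is handy when you want to bolt the same behaviour onto many functions without editing each one. A minimal sketch:

```python
def shout(func):
    # a decorator: takes a function, returns a replacement for it
    def wrapper(*args):
        return func(*args).upper()
    return wrapper

@shout            # exactly the same as: greet = shout(greet)
def greet(name):
    return "hello " + name

print(greet("bob"))
```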
[Tutor] simple regex question
Hi. I have a chunk of text, which has multiple lines. I'd like to do a regex, find a pattern, and in the line that matches the pattern, mod the line. Sounds simple. I've created a test regex. However, after spending time/google.. can't quite figure out how to then get the "complete" line containing the returned regex/pattern. Pretty sure this is simple, and I'm just missing something. My test "text" and regex are:

s='''
ACCT2081'''
pattern = re.compile(r'Course\S+|\S+\|')
aa= pattern.search(s).group()
print "sss"
print aa

So, once I get the group, I'd like to use the returned match to then get the complete line.. pointers/thoughts!! (no laughing!!) thanks guys..
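One way to get the whole line is to build the line boundaries into the pattern itself: with re.MULTILINE, ^ and $ match at every newline, so ^.*pattern.*$ returns the full matching line. A sketch with made-up sample text:

```python
import re

s = "first line\nCourse: ACCT2081 info\nlast line\n"

# ^ and $ anchor to line starts/ends under re.MULTILINE;
# . does not cross newlines, so the match is exactly one line
m = re.search(r'^.*ACCT2081.*$', s, re.MULTILINE)
line = m.group()
print(line)
```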
[Tutor] Py/selenium bindings
Hi. This might be a bit beyond the group, but I figured no harm/no foul. I'm looking to access a site that's generated via javascript. The jscript of the site is invoked via the browser, generating the displayed content. I'm testing using py/selenium bindings, as docs indicate that the py/selenium/PhantomJS browser (headless) combination should invoke the jscript and result in the required content. However, I can't seem to generate anything other than the initial encrypted page/content. Any thoughts/comments would be useful. The test script is:

#!/usr/bin/python
#-
#
# FileName:
#   udel_sel.py
#
#-
# test python script
import subprocess
import re
import libxml2dom
import urllib
import urllib2
import sys, string
import time
import os
import os.path
from hashlib import sha1
from libxml2dom import Node
from libxml2dom import NodeList
import hashlib
import pycurl
import StringIO
import uuid
import simplejson
import copy
from selenium import webdriver

#
if __name__ == "__main__":
    # main app
    url=" http://udel.bncollege.com/webapp/wcs/stores/servlet/TBListView?storeId=37554=Y=10001=-1=%3C%3Fxml+version%3D%221.0%22%3F%3E%3Ctextbookorder%3E%3Cschool+id%3D%22289%22+%2F%3E%3Ccourses%3E%3Ccourse+num%3D%22200%22+dept%3D%22ACCT%22+sect%3D%22010%22+term%3D%222163%22%2F%3E%3C%2Fcourses%3E%3C%2Ftextbookorder%3E "
    driver = webdriver.PhantomJS()
    driver.get(url)
    xx=driver.page_source
    print xx
    sys.exit()
---
Re: [Tutor] really basic - finding multiline chunk within larger chunk
hmm... Ok. For some reason, it appears to be a whitespace issue, which is what I thought. The basic process used to get the subchunk to test for was to actually do a copy/cut/paste of the subtext from the master text, and then to write the code to test. Yeah, testing for "text" with whitespace/multiline can be fragile. And yeah, the text might have been from the 90s, but that's irrelevant! Thanks for confirming what I thought, and thanks for the sample code as well. I might just wind up stripping tabs/spaces and joining on space to pre-massage the content prior to handling it.. 'ppreciate it guys/gals!

On Wed, Feb 17, 2016 at 12:02 AM, Danny Yoo wrote:
> (Ah, I see that Joel Goldstick also was able to do the search
> successfully; sorry about missing your message Joel!)
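The pre-massaging mentioned above (strip tabs/spaces and join on a single space) can be done with split()/join(), since split() with no argument splits on any run of whitespace, newlines included. A minimal sketch (Python 3, made-up sample text):

```python
def normalize_ws(text):
    # split() with no argument splits on any run of whitespace, so
    # joining on a single space collapses tabs, newlines, and doubled
    # spaces into one canonical form before comparing.
    return " ".join(text.split())

big = "Retail Price \t Less than $10\n  Required Yes\nUsed During Full Term"
sub = "Required\nYes"
print(normalize_ws(sub) in normalize_ws(big))  # True
```

Normalizing both the haystack and the needle the same way makes the membership test immune to the exact whitespace that the copy/paste happened to carry along.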
[Tutor] really basic - finding multiline chunk within larger chunk
Hi. I've got a test where I have a chunk of text "aa" and a subset of text "s2a". The subset is multiline. For some reason, I can't seem to return true on the find. I've pasted it at http://fpaste.org/323521/, but the following is an example as well. (Not sure if the pseudo code listed actually shows the chunks of text with the whitespace correctly. Obviously, this is an "example", not meant to be real code!!) Thoughts on what I've screwed up? Thanks

aa='''
Retail Price Less than $10
Required Yes
Used During Full Term
Copies on Reserve in Libraries No
'''

as a test:

s2a='''
Required Yes
'''

if (aa.find(s2a)>0):
    print "here ppp \n"
else:
    print "err \n"
sys.exit()
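As an aside on the test itself: str.find returns -1 when the substring is absent and 0 when the match starts at index 0, so a "find(...) > 0" check misreports both cases. The in operator is the clearer membership test. A sketch with simplified stand-in text:

```python
aa = "Retail Price Less than $10\nRequired Yes\nUsed During Full Term\n"

# str.find returns the index of the first match, or -1 if absent;
# a match at position 0 would make a "> 0" test report failure.
print(aa.find("Retail Price"))   # 0 -- present, but fails "> 0"
print("Required Yes" in aa)      # True -- clearer membership test
print("Required  Yes" in aa)     # False -- whitespace must match exactly
```

The last line also shows the whitespace sensitivity discussed in this thread: a single extra space in the needle is enough to make the test fail.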
Re: [Tutor] Creating a webcrawler
Hi Isac. I'm not going to get into the pythonic stuff; people on the list are way better than I. I've been doing a chunk of crawling; it's not too bad, depending on what you're trying to accomplish and the site you're targeting. So, no offense, but I'm going to treat you like a 6 year old (google it - from a movie!)

You need to back up and analyze the site/pages/structure you're going after. Use the tools - firefox - livehttpheaders/nettraffic/etc:
- you want to be able to see what the exchange is between the client/browser and the server
- often, this gives you the clues/insight to crafting the request from your client back to the server for the item/data you're going for

Once you've got that together, set up the basic process with wget/curl etc. to get a feel for any weird issues - cert issues? security issues? are cookies required? etc. A good deal of this stuff can be resolved/checked out at this level, without jumping into coding. Once you're comfortable at this point, you can crank out some simple code to go after the site you're targeting.

In the event you really have a javascript/dynamic site that you can't handle in any other manner, you're going to need to use a 'headless browser' process. There are a number of headless browser projects - I think most run on the webkit codebase (don't quote me): casperjs/phantomjs, and there are pythonic implementations as well.

So, there you go - should/hopefully this will get you on your way!

On Fri, Jan 8, 2016 at 9:01 PM, Whom Isac wrote:
> Hi I want to create a web-crawler but don't have any lead to choose any
> module. I have come across Jsoup but I am not familiar with how to use
> it in 3.5, as I tried looking at similar web crawler code from the 3.4
> dev version. I just want to build that crawler to crawl through a
> javascript enabled site and automatically detect a download link (for
> video file). And should I be using pickles to write the data in the
> text file / save file.
> Thanks
[Tutor] idle??
Hey guys/gals - list readers. Recently came across someone here mentioning IDLE!! Not knowing this, I hit google for a look. Is IDLE essentially an IDE for doing py dev? I see there are windows/linux (rpms) for it. I'm running py.. I normally do $ python to pop up the py env for quick tests, and of course run my test scripts/apps from the cmdline via ./foo.py... So, where does IDLE fit into this? Thanks (and yeah, I know I could continue to look at google, and even install the rpms to really check it out!!) tia!!
Re: [Tutor] idle??
Thanks Alan... So, as an IDE/shell.. I assume it's not quite Eclipse, but allows you to do reasonable editing/syntax tracking/etc., as well as run apps within the window/shell. I assume breakpoints as well, and a good chunk of the rest of the usual IDE functions... What about function completion, where I type a function and it displays a "list" of potential functions/defs? Does it provide "function" or item hovering, where the cursor can be placed over a function/item and information about the func or item (type/struct/etc..) is displayed? Thanks again, much appreciated!!

On Fri, Jan 8, 2016 at 6:42 PM, Alan Gauld <alan.ga...@btinternet.com> wrote:
> On 08/01/16 19:07, bruce wrote:
>
>> Is IDLE essentially an ide for doing py dev? I see there's a
>> windows/linux (rpms) for it.
>
> Yes, it's the official IDE for Python.
>
> There is an "unofficial" version called xidle which tends
> to get a lot of the new stuff before it makes it into the
> official release. For a long time not much happened with
> IDLE but recently there has been a bunch of activity, so
> I'm hopeful we may soon see some new features appearing.
>
>> So, where does IDLE fit into this
>
> It incorporates a shell window where you can type commands,
> and you can create blank editor windows (with syntax
> highlighting etc etc) from which you can save files,
> run them, debug them etc.
>
> There are some YouTube and ShowMeDo videos around, and
> Danny Yoo has a short tutorial that is quite old but
> still pretty much applicable.
>
> There is official documentation on the python.org
> website too.
>
> Finally, it's not universally loved and definitely has
> some quirks, but it's adequate for getting started, and
> definitely better than notepad, say, on Windows.
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
Re: [Tutor] parser recommendations (was Re: Tutor Digest, Vol 142, Issue 11)
beautifulsoup, selenium + PhantomJS, and dryscrape: no knowledge of dryscrape, never used it. The other tools/apps are used to handle/parse html/websites.

Soup can handle xml/html as well as other input structs. Good for being able to parse the resulting struct/dom to extract data, or to change/modify the struct itself.

Selenium is a framework, acting as a browser env, allowing you to 'test' the site/html. It's good for certain uses regarding testing.

Phantomjs/casperjs are essentially headless browsers, which also let you run/parse websites. While Soup is more for static pages, Phantom, because it's an actual headless browser, lets you deal with dynamic sites as well as static.

On Mon, Dec 14, 2015 at 2:56 PM, Alan Gauld wrote:
> On 14/12/15 16:16, Crusier wrote:
>
> Please always supply a useful subject line when replying to the digest
> and also delete all irrelevant text. Some people pay by the byte and we
> have all received these messages already.
>
>> Thank you very much for answering the question. If you don't mind,
>> please kindly let me know which library I should focus on among
>> beautifulsoup, selenium + PhantomJS, and dryscrape.
>
> I don't know anything about the others, but Beautiful Soup
> is good for html, especially badly written/generated html.
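For static pages, BeautifulSoup is the usual recommendation; when installing a third-party package isn't an option, the standard library's html.parser can handle simple extraction. A minimal sketch (not equivalent to BeautifulSoup's API; the HTML snippet is made up):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href attributes from <a> tags while parsing."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

p = LinkCollector()
p.feed('<html><body><a href="/one">1</a><a href="/two">2</a></body></html>')
print(p.links)  # ['/one', '/two']
```

For anything dynamic (content injected by javascript after load), a parser alone sees only the initial HTML, which is why the headless-browser route comes up in this thread.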
Re: [Tutor] Beautiful Soup
Hey Crusier (and others...). For your site, as Alan mentioned, it's a mix of html/jscript/etc. So you're going to (or perhaps should) need to extract just the json/struct that you need, and then go from there. I speak from experience, as I've had to handle a number of sites that are essentially just what you have. Here's a basic guide to start:

- I use libxml, simplejson
- fetch the page
- in the page, do a split to get the exact json (string) that you want
  - you'll do two splits: the 1st gets rid of extra pre-json stuff, the
    2nd gets rid of extra post-json stuff that you don't need
- at this point, you should have the json string you need, or you should
  be pretty close
- now, you might need to "pretty" up what you have, as py/json only
  accepts key/value in a certain format (single/double quotes, etc.)

Once you've gotten this far, you might actually have the json string, in which case you can load it directly into the json module and proceed as you wish. You might also find that what you have is really a py dictionary, and you can handle that as well! Have fun, let us know if you have issues...

On Sun, Dec 13, 2015 at 2:44 AM, Crusier wrote:
> Dear All,
>
> I am trying to scrape the following website; however, I have
> encountered some problems. As you can see, I am not really familiar
> with regex, and I hope you can give me some pointers on how to solve
> this problem.
>
> I hope I can download all the transaction data into the database.
> However, I need to retrieve it first.
> The data which I hope to retrieve is as follows:
>
> 15:59:59 A 500 6.790 3,395
> 15:59:53 B 500 6.780 3,390
>
> Thank you
>
> Below is my code:
>
> from bs4 import BeautifulSoup
> import requests
> import re
>
> url = 'https://bochk.etnet.com.hk/content/bochkweb/eng/quote_transaction_daily_history.php?code=6881=F=09=16=S=44c99b61679e019666f0570db51ad932=0=0'
>
> def turnover_detail(url):
>     response = requests.get(url)
>     html = response.content
>     soup = BeautifulSoup(html,"html.parser")
>     data = soup.find_all("script")
>     for json in data:
>         print(json)
>
> turnover_detail(url)
>
> Best Regards,
> Henry
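The two-split approach described above (trim everything before the JSON, then everything after, then load it) can be sketched like this. The marker strings and script content are made up; they have to be adapted to whatever actually brackets the JSON inside the real site's script tags:

```python
import json

# Hypothetical <script> content; "var table = " and "; render" are
# assumed markers standing in for whatever surrounds the real JSON.
script_text = 'var table = {"time": "15:59:59", "price": 6.79}; render(table);'

# 1st split drops everything before the JSON,
# 2nd split drops everything after it.
payload = script_text.split("var table = ", 1)[1]
payload = payload.split("; render", 1)[0]

data = json.loads(payload)
print(data["price"])  # 6.79
```

If json.loads complains about single quotes or unquoted keys, that's the "pretty up" step from the guide: the embedded object is javascript, not strict JSON, and needs massaging (or an eval-free converter like ast.literal_eval for dict-like strings) before loading.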
[Tutor] ascii to/from AL32UTF8 conversion
Hi. Doing a 'simple' test with linux command line curl, as well as pycurl, to fetch a page from a server. The page has a charset of AL32UTF8. Any way to convert this to straight ascii? Python is throwing a notice/error on the charset in another part of the test. The target site is US based, so there are no weird chars in it. I suspect that the page/system is based on legacy oracle. The metadata of the page is

I tried the usual

foo = foo.decode('utf-8')
foo = foo.decode('ansii')

etc., but no luck. Thanks for any pointers/help
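AL32UTF8 is Oracle's name for its UTF-8 character set, so the bytes can usually be decoded as UTF-8 and then reduced to ASCII. A hedged sketch (the sample bytes are made up; note also that 'ansii' above is not a codec name Python knows - 'ascii' is):

```python
# Assumed sample: UTF-8 bytes containing one non-ASCII character.
raw = b"Caf\xc3\xa9 menu"

text = raw.decode("utf-8")          # AL32UTF8 content is UTF-8
# Drop anything outside ASCII; use "replace" instead of "ignore"
# to keep a '?' placeholder rather than silently deleting chars.
ascii_only = text.encode("ascii", "ignore").decode("ascii")
print(ascii_only)  # Caf menu
```

If decode('utf-8') itself fails, the page is probably not actually UTF-8 despite the declared charset, and trying 'latin-1' (which never fails) is a common fallback for diagnosing what the bytes really are.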
[Tutor] ncurses question
Hi. Looking over various sites on ncurses. Curious: I see various chunks of code for creating multiple windows, but I haven't seen any kind of example that shows how to 'move' or switch between multiple windows. Anyone have a code sample, or a tutorial/site that you could point me to? I'm thinking of putting together a simple test to be able to select between a couple of test windows, select a given field in the window, and then generate the results in a lower window based on what's selected. Just curious. Any pointers greatly appreciated. Thanks
Re: [Tutor] Scraping Wikipedia Table (The retruned file is empty)
my $0.02 for what it might be worth.. You have some people on the list who are straight-out beginners, who might be doing cut/copy/paste from 'bad code'. You have people coming from other languages.. and then you have some who are trying to 'get through' something, who aren't trying to be the dev!! And yeah, with more time, they could easily (in most cases) find an answer, but sometimes you just want to get a soln and move on to the other 99 probs (no offense jay z!!) you guys have been a godsend at times! thanks - keep up the good fight/work.

On Sun, Oct 25, 2015 at 9:56 PM, Alan Gauld wrote:
> On 24/10/15 00:15, Mark Lawrence wrote:
>
>>> Looking more at the code...
>>>
>>>     for x in range(len(drama_actor)):
>>>
>>> This looks unusual...
>>
>> A better question IMHO is "where did you learn to write code like that
>> in the first place", as I've seen so many examples of this that I cannot
>> understand why people bother writing Python tutorials, as they clearly
>> don't get read?
>
> I think it's just a case of bad habits from other languages being
> hard to shake off. If your language doesn't have a for-each operator,
> then it's hard to wrap your brain around any other kind of for loop
> than one based on indexes.
>
> It's a bit like dictionaries. They are super powerful, but beginners
> coming from other languages nearly always start out using arrays
> (ie lists) and trying to "index" them by searching, which is hugely
> more complex, but it's what they are used to.
>
> JavaScript programmers tend to think the same about Python
> programmers who insist on writing separate functions for
> callbacks rather than just embedding an anonymous function.
> But Python programmers are used to brain-dead lambdas with
> a single expression, so they don't tend to think about
> embedding a full function. Familiarity with an idiom makes
> it easier to stick with what you know than to try something new.
[Tutor] generate a list/dict with a dynamic name..
Hi. I can do a basic a=[] to generate a simple list. I can do a="aa"+"bb". How can I do a=[] where the list would be named "aabb"? In other words, generate a list/dict with a dynamically generated name. IRC replies have been "don't do it".. or "it's bad".. but no one has said "you can do it this way".. just curious.. thanks
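The usual answer behind the "don't do it" replies is to key a dict by the computed name instead of creating a variable at runtime. A minimal sketch:

```python
# Instead of trying to create a variable literally named "aabb" at
# runtime, use the computed string as a dictionary key.
containers = {}
name = "aa" + "bb"
containers[name] = []            # a fresh list reachable by the dynamic name
containers[name].append(42)
print(containers["aabb"])  # [42]
```

This gives every benefit of a dynamically named variable (lookup by a string built at runtime) without touching globals() or exec, and the names stay neatly grouped in one namespace you control.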
[Tutor] aws/cloud questions..
Evening group! Hope we're all doing well, having fun, yada yada..!! I'm considering taking a dive into the cloud with an app that would be comprised of distributed machines running py apps, talking to a db on different server(s), etc. So, I was wondering if anyone has good docs/tutorials/walkthrough(s) that you can point me to, or even if someone is willing to play the role of online mentor/tutor!! Thanks
[Tutor] Parsing/Crawling test College Class Site.
Hi. I'm creating a test py app to do a quick crawl of a couple of pages of a psoft class schedule site. Before I start asking questions/pasting/posting code, I wanted to know if this is the kind of thing that can/should be here. The real issues I'm facing aren't so much pythonic as much as probably dealing with getting the cookies/post attributes correct. There's ongoing jscript on the site, but I'm hopeful/confident :) that if the cookies/post are correct, then the target page can be fetched. If this isn't the right list, let me know! And if it is, I'll start posting.. Thanks -bd
[Tutor] trying to convert pycurl/html to ascii
Hi. Doing a quick/basic pycurl test on a site and trying to convert the returned page to pure ascii. The page has the encoding line

meta http-equiv=Content-Type content=text/html;charset=ISO-8859-1

The test uses pycurl and StringIO to fetch the page into a str:

pycurl stuff
.
.
.
foo=gg.getBuffer()

At this point, foo has the page in a str buffer. What's happening is that the test is getting the following kind of error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 20: invalid start byte

The test is using python 2.6 on redhat. I've tried different decode functions based on different sites/articles/stackoverflow, but can't quite seem to resolve the issue. Any thoughts/pointers would be useful! Thanks
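Since the page declares ISO-8859-1, decoding the buffer as Latin-1 rather than UTF-8 avoids the 0xa0 error. A sketch (Python 3 syntax, with made-up bytes standing in for the fetched page):

```python
# 0xa0 is a non-breaking space in ISO-8859-1, but an invalid start
# byte in UTF-8 -- exactly the error reported above.
raw = b"price:\xa0100"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    print("utf-8 fails:", e.reason)

# ISO-8859-1 is the 'latin-1' codec; every byte 0x00-0xff is valid,
# so this decode cannot fail.
text = raw.decode("latin-1")
print(repr(text))
```

In short: always decode with the charset the page actually declares; reaching for UTF-8 by default is what produces the "invalid start byte" errors on Latin-1 content.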
Re: [Tutor] Installing twisted
Hey... When you get this resolved, if you don't mind, post the soln back here!! thanks

ps. I know, not strictly a py language issue.. but it might really help someone struggling to solve the same issue!

On Wed, Nov 26, 2014 at 7:45 PM, Gary gwengst...@yahoo.com.dmarc.invalid wrote:
> Hi all, I have been trying to install the zope interface as part of the
> twisted installation, with no luck. Any suggestions?
> Sent from my iPad
[Tutor] try/exception - error block
Hi. I have a long running process; it generates calls to a separate py app. The py app appears to generate errors, as indicated in the /var/log/messages file by the abrtd daemon. The errors are intermittent. So, to quickly capture all possible exceptions/errors, I decided to wrap the entire main block of the test py func in a try/exception block. This didn't work, as I'm not getting any output in the err file generated in the exception block. I'm posting the test code I'm using. Pointers/comments would be helpful/useful. The if that gets run is the fac1 logic, which operates on the input packet/data:

elif (level=='collegeFaculty1'):
    #getClasses(url, college, termVal,termName,deptName,deptAbbrv)
    ret=getParseCollegeFacultyList1(url,content)

Thanks.

if __name__ == "__main__":
    # main app
    try:
        #college="asu"
        #url="https://webapp4.asu.edu/catalog"
        #termurl="https://webapp4.asu.edu/catalog/TooltipTerms.ext"
        #termVal=2141
        #
        # get the input struct, parse it, determine the level
        #
        #cmd='cat /apps/parseapp2/asuclass1.dat'
        #proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
        #content=proc.communicate()[0].strip()
        #s=getClasses(content)
        if(len(sys.argv)<2):
            print "error\n"
            sys.exit()
        a=sys.argv[1]
        aaa=a
        #
        # data is coming from the parentApp.php
        # data has been rawurlencode(json_encode(t))
        # -reverse/split the data..
        # -do the fetch,
        # -save the fetched page/content if any
        # -create the returned struct
        # -echo/print/return the struct to the calling parent/call
        #
        z=simplejson.loads(urllib.unquote_plus(a))
        #z=simplejson.loads(urllib2.unquote(a).decode('utf8'))
        print z
        #
        # -passed in
        #
        url=str(z['currentURL'])
        level=str(z['level'])
        cname=str(z['parseContentFileName'])
        #
        # need to check the contentFname
        # -should have been checked in the parentApp
        # -check it anyway, return err if required
        # -if valid, get/import the content into
        #  the content var for the function/parsing
        #
        cmd='echo ${yolo_clientParseInputDir}/'
        proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
        cpath=proc.communicate()[0].strip()
        cname=cpath+cname
        cmd='test -e '+cname+' && echo 1'
        proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
        c1=proc.communicate()[0].strip()
        if(not c1):
            # got an error - process it, return
            print "error in parse"
        #
        # we're here, no err.. got content
        #
        with open(cname,"r") as myfile:
            content=myfile.read()
        myfile.close()
        ret={}  # null it out to start
        if (level=='rState'):
            ret=getParseStates(content)
        elif (level=='stateCollegeList'):
            #getDepts(url,college, termValue,termName)
            ret=getParseStateCollegeList(url,content)
        elif (level=='collegeFaculty1'):
            #getClasses(url, college, termVal,termName,deptName,deptAbbrv)
            ret=getParseCollegeFacultyList1(url,content)
        elif (level=='collegeFaculty2'):
            ret=getParseCollegeFacultyList2(content)
        #
        # the idea of this section.. we have the resulting
        # fetched content/page...
        #
        a={}
        status=False
        if(ret['status']==True):
            s=ascii_strip(ret['data'])
            if(((s.find("/html")>-1) or (s.find("/HTML")>-1)) and
               ((s.find("html")>-1) or (s.find("HTML")>-1)) and
               level=='classSectionDay'):
                status=True
            #
            # build the returned struct
            #
            a['Status']=True
            a['recCount']=ret['count']
            a['data']=ret['data']
            a['nextLevel']=''
            a['timestamp']=''
            a['macAddress']=''
        elif(ret['status']==False):
            a['Status']=False
            a['recCount']=0
            a['data']=''
            a['nextLevel']=''
            a['timestamp']=''
            a['macAddress']=''
        res=urllib.quote(simplejson.dumps(a))
        name=subprocess.Popen('uuidgen -t', shell=True,stdout=subprocess.PIPE)
        name=name.communicate()[0].strip()
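To make an except block like the one above actually record something useful, traceback.format_exc() captures the full stack trace as a string that can be written to the error file. A minimal sketch (Python 3; the file name and failing function are illustrative, not from the post):

```python
import traceback

def risky():
    # hypothetical stand-in for the real parse logic
    return 1 / 0

log_path = "parse_err.dat"  # illustrative; the real code builds one from uuidgen
try:
    risky()
except Exception:
    # format_exc() returns the exception type, message, and full stack
    # trace as a string. Writing only the exception object can leave a
    # near-empty file, since str(e) may be short or empty.
    with open(log_path, "w") as f:
        f.write(traceback.format_exc())

print(open(log_path).read().strip().splitlines()[-1])
```

Note also that sys.exit() raises SystemExit, which "except Exception" does not catch; a script that exits early via sys.exit() will never reach the error-file code at all.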
Re: [Tutor] try/exception - error block
chris.. my bad.. I wasn't intending to mail you personally, or I wouldn't have inserted the "thanks guys"!

thanks guys... but in all that, no one could tell me why I'm not getting any errs/exceptions in the err file which gets created on the exception!!! but thanks for the information on posting test code!

> Don't email me privately - respond to the list :)
> Also, please don't top-post.
>
> ChrisA

On Sun, Aug 3, 2014 at 10:29 AM, bruce badoug...@gmail.com wrote:
> Hi. I have a long running process, it generates calls to a separate py
> app. [snip - full original post quoted above]
Re: [Tutor] try/exception - error block
Hi Alan. Yep, the err file in the exception block gets created, and the weird thing is it matches the time of the abrtd information in the /var/log/messages log. Just nothing in the file!

On Sun, Aug 3, 2014 at 4:01 PM, Alan Gauld alan.ga...@btinternet.com wrote:
> On 03/08/14 18:52, bruce wrote:
>> but in all that.. no one could tell me why i'm not getting any
>> errs/exceptions in the err file which gets created on the exception!!!
>
> Does the file actually get created? Do you see the print statement
> output - is it what you expect? Did you try the things Steven suggested?
>
>     except Exception, e:
>         print e
>         print "pycolFac1 - error!! \n"
>         name=subprocess.Popen('uuidgen -t', shell=True,stdout=subprocess.PIPE)
>         name=name.communicate()[0].strip()
>         name=name.replace("-","_")
>
> This is usually a bad idea. You are using name for the process and its
> output. Use more names... What about:
>
>     uuid=subprocess.Popen('uuidgen -t',shell=True,stdout=subprocess.PIPE)
>     output=uuid.communicate()[0].strip()
>     name=output.replace("-","_")
>
>     name2="/home/ihubuser/parseErrTest/pp_"+name+".dat"
>
> This would be a good place to insert a print:
>
>     print name2
>
>     ofile1=open(name2,"w+")
>
> Why are you using w+ mode? You are only writing. Keep life as simple
> as possible.
>
>     ofile1.write(e)
>
> e is quite likely to be empty.
>
>     ofile1.write(aaa)
>
> Are you sure aaa exists at this point? Remember you are catching all
> errors, so if an error happens prior to aaa being created this will fail.
>
>     ofile1.close()
>
> You used the with form earlier, why not here too? It's considered
> better style...
>
> Some final comments:
> 1) You call sys.exit() several times inside the try block. sys.exit
>    will not be caught by your except block - is that what you expect?
> 2) The combination of confusing naming of variables, reuse of names,
>    poor code layout and excessive commented code makes it very
>    difficult to read your code. That makes it hard to figure out what
>    might be going on.
>    - Use sensible variable names, not a, aaa, z, etc.
>    - Use 3 or 4 level indentation, not 2.
>    - Use a version control system (RCS, CVS, SVN, ...) instead of
>      commenting out big blocks.
>    - Use consistent code style, eg "with f as ..." or open(f)/close(f),
>      but not both.
>    - Use the os module (and friends) instead of subprocess if possible.
> 3) Have you tried deleting all the files in the
>    /home/ihubuser/parseErrTest/ folder and starting again, just to be
>    sure that your current code is actually producing the empty files?
> 4) You use tmpParseDir in a couple of places but I don't see it being
>    set anywhere?
>
> That's about the best I can offer based on the information available.
[Tutor] capturing errors/exceptions..
[Tutor] capturing errors/exceptions..
Hi. Really basic question!! Got a chunk of some test python, and I'm trying to figure out a quick/easy way to capture all/any errors/exceptions that get thrown. For the test process, I need to ensure that I capture any/all potential errors.
- Could/should I wrap the entire func in a try/except when I call the function from the parent process?
- Should I have separate try/except blocks within the function?
- The test py app is being run from the CLI; is there a py command-line attribute that auto-captures all errors?
Any thoughts.. Thanks! A sample of the test code is:

    def getParseCollegeFacultyList1(url, content):
        s = content
        s = s.replace("&nbsp;", "")
        if debug == 1:
            print "s=" + s
        url = url.strip("/")
        # got the page/data... parse it and get the schools..
        # use the dept list as the school
        # s contains HTML, not XML text
        d = libxml2dom.parseString(s, html=1)
        # create the output data file for the registrar/start data
        # term_in=201336&sel_subj=ACCT
        if debug == 1:
            print "inside parse state/college function\n"
        # fetch the option val/text for the depts, which are used
        # as the dept abbrv/name on the master side
        # -- the school matches the dept...
        # -- this results in separate packets for each dept
        p = "//a[contains(@href,'SelectTeacher') and @id='last']//attribute::href"
        ap = "//a[contains(@href,'campusRatings.jsp')]//attribute::href"
        hpath = "//div[@id='profInfo']/ul/li[1]//a/attribute::href"  # get the college website
        cpath = "//div[@id='profInfo']/ul/li[2]/text()"              # get the city,state
        colpath = "//h2/text()"                                      # college name
        xpath = "//a[contains(@title,'school id:')]/attribute::href"
        hh_ = d.xpath(hpath)
        cc_ = d.xpath(cpath)
        col_ = d.xpath(colpath)
        ap_ = d.xpath(ap)
        if debug == 1:
            print "hhl " + str(len(hh_))
            print "ccl " + str(len(cc_))
        web = ""
        if len(hh_) > 0:
            web = hh_[0].textContent
        city = ""
        if len(cc_) > 0:
            city = cc_[0].textContent
        colname = ""
        if len(col_) > 0:
            colname = col_[0].textContent
        colname = colname.encode('ascii', 'ignore').strip()
        # set up out array
        ret = {}
        out = {}
        row = {}
        jrow = ""
        ndx = 0
        pcount_ = d.xpath(p)
        if len(pcount_) == 0:
            # at least one success/entry.. but apparently only a single page..
            status = True
            #count = pcount_[0].textContent.strip()
            #countp = count.split('pageNo=')
            #count = countp[1]
            #rr = countp[0]
            if len(ap_) == 1:
                idd = ap_[0].textContent.strip()
                idd = idd.split("?sid=")
                idd = idd[1].split("&")
                idd = idd[0].strip()
                nurl = url + "/SelectTeacher.jsp?sid=" + idd + "&pageNo=1"
                #nurl = url + "&pageNo=1"
                row = {}
                row['WriteData'] = True
                row['tmp5'] = web
                row['tmp6'] = city
                row['tmp7'] = colname
                row['tmp8'] = nurl
                # don't json for now
                #jrow = simplejson.dumps(row)
                jrow = row
                out[ndx] = jrow
                ndx = ndx + 1
        else:
            # at least one success/entry.. set the status
            status = True
            count = pcount_[0].textContent.strip()
            countp = count.split('pageNo=')
            count = countp[1]
            rr = countp[0]
            if debug == 1:
                print "c = " + str(count) + "\n"
            for t in range(1, int(count) + 1):
                nurl = url + rr + "pageNo=" + str(t)
                if debug == 1:
                    print "nurl = " + nurl + "\n"
                row = {}
                row['WriteData'] = True
                row['tmp5'] = web
                row['tmp6'] = city
                row['tmp7'] = colname
                row['tmp8'] = nurl
                # don't json for now
                #jrow = simplejson.dumps(row)
                jrow = row
                out[ndx] = jrow
                ndx = ndx + 1
        ret['data'] = simplejson.dumps(out)
        ret['count'] = ndx
        ret['status'] = status
        return ret

___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
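One way to sketch the "wrap the call in the parent process" option: a small helper (the name run_logged is illustrative, not from the original post) that returns either the result or the formatted traceback, so the test harness can record every failure in one place:

```python
import traceback

def run_logged(func, *args, **kwargs):
    """Call func and return (result, None) on success, or
    (None, formatted_traceback) if it raised anything, so the
    caller can log every failure without losing the details."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        return None, traceback.format_exc()
```

The parent could then do something like `result, err = run_logged(getParseCollegeFacultyList1, url, content)` and write `err` to a log whenever it isn't None.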
Re: [Tutor] capturing errors/exceptions..
Clarification. The test py app is being invoked via a system function from a separate app, and no stacktrace gets created. All I have, in /var/log/messages, is an indication that the pyTest app generated an error. This is noted by the abrtd process, but I have no other data to go on. Which is why I'm interested in implementing some basic capture/display-all-errors approach, to get a feel for what's happening.

On Fri, Aug 1, 2014 at 10:54 AM, Steven D'Aprano st...@pearwood.info wrote:

On Fri, Aug 01, 2014 at 10:14:38AM -0400, bruce wrote: Hi. Really basic question!! Got a chunk of some test python, trying to figure out a quick/easy way to capture all/any errors/exceptions that get thrown..

Why do you want to do that? The answer to your question will depend on what you expect to do with the exception once you've caught it, and the answer might very well be "don't do that".

For the test process, I need to ensure that I capture any/all potential errors..

Hmmm. I don't quite see the reason for this. If you're running by hand, manually, surely you want to see the exceptions so that you can fix them? If there's an exception, what do you expect to do next? If you're using the unittest module, it already captures the exceptions for you, no need to re-invent the wheel.

-Could/Should I wrap the entire func in a try/catch when I call the function from the parent process?

You mean something like this?

    try:
        mymodule.function_being_tested(x, y, z)
    except Exception:
        do_something_with_exception()

Sure. Sounds reasonable, if you have something reasonable to do once you've captured the exception.

-Should I have separate try/catch blocks within the function?

No. That means that the function is constrained by the testing regime.

-The test py app is being run from the CLI, is there a py command line attribute that auto captures all errors?

No. How would such a thing work? In general, once an exception occurs, you get a cascade of irrelevant errors:

    n = lne(some_data)      # Oops, I meant len
    m = 2*n + 1             # oops, this fails because n doesn't exist
    value = something[m]    # now this fails because m doesn't exist
    ...

Automatically recovering from an exception and continuing is not practical, hence Python halts after an exception unless you take steps to handle it yourself. -- Steven ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
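There is no command-line switch that does this, as Steven says, but the closest stdlib approximation for the "invoked via a system function, no stacktrace survives" situation is installing an excepthook that logs every uncaught exception before the interpreter exits. A minimal sketch (the log path is an assumption, pick whatever location suits the deployment):

```python
import sys
import traceback

LOG_PATH = "pytest_app.log"   # hypothetical log location

def log_all_exceptions(exc_type, exc_value, exc_tb):
    # append the full traceback to a log file, then defer to the
    # default hook so the usual stderr output still happens
    with open(LOG_PATH, "a") as f:
        traceback.print_exception(exc_type, exc_value, exc_tb, file=f)
    sys.__excepthook__(exc_type, exc_value, exc_tb)

# any *uncaught* exception from here on gets written to LOG_PATH
sys.excepthook = log_all_exceptions
```

This only fires for exceptions that reach the top level; anything swallowed by a bare `except:` inside the app never reaches the hook.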
[Tutor] cdata/xml question..
Hi. The following text contains sample data. I'm simply trying to parse it using libxml2dom as the lib to extract data. As an example, to get the name/desc test data:

    <class_meta_data><departments>
      <department><name><![CDATA[A HTG]]></name><desc><![CDATA[American Heritage]]></desc></department>
      <department><name><![CDATA[ACC]]></name><desc><![CDATA[Accounting]]></desc></department>
    </departments></class_meta_data>

    d = libxml2dom.parseString(s, html=1)
    p1 = "//department/name"
    p2 = "//department/desc"
    pcount_ = d.xpath(p1)
    p2_ = d.xpath(p2)
    print str(len(pcount_))
    nba = 0
    for a in pcount_:
        abbrv = a.nodeValue
        print abbrv
        abbrv = a.toString()
        print abbrv
        abbrv = a.textContent
        print abbrv

None of the above generates any of the CDATA name/desc data.. any pointers on what I'm missing??? I can/have created a quick parse/split process to get the data, but I thought there'd be a straightforward way to extract the data using one of the py libs.. thanks ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
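For comparison, the stdlib's ElementTree handles CDATA transparently: the section's contents come back as ordinary element text. A sketch that recovers the name/desc pairs (one possible culprit in the post above is parsing with html=1, since an HTML parser does not treat `<![CDATA[...]]>` as character data):

```python
from xml.etree import ElementTree

sample = """<class_meta_data><departments>
  <department><name><![CDATA[A HTG]]></name><desc><![CDATA[American Heritage]]></desc></department>
  <department><name><![CDATA[ACC]]></name><desc><![CDATA[Accounting]]></desc></department>
</departments></class_meta_data>"""

root = ElementTree.fromstring(sample)
# CDATA content is exposed as plain .text, so findtext works directly
pairs = [(dep.findtext('name'), dep.findtext('desc'))
         for dep in root.iter('department')]
```

Here `pairs` ends up as a list of (abbreviation, description) tuples, one per department.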
Re: [Tutor] Python Question
hey amy.. ok.. before we jump to coding (and forgive me if what I'm about to type is really basic!) let's play a bit with what's called pseudo-code. pseudo-code is a technique to kind of put your thoughts about a problem/approach in a mix of code/english.. kind of lets you lay out what you're trying to solve/program. so for your issue: you need to think about what you're trying to do. you want to give the user back something, based on you doing something to the thing the user gives you. so this means:
- you need some way of getting user input
- you want to do something to the input, so you need some way of capturing the input to perform the something (better known as an operation) on the user's input..
- then you want to redisplay stuff back to the user, so you're going to need a way of displaying the data/output back to the user..
create the pseudo-code, post it, and we'll get this in no time!

On Sat, Jan 11, 2014 at 12:23 PM, Amy Davidson amydavid...@sympatico.ca wrote: Hey! So luckily with the texts that were sent to me, I was able to figure out the answer (yay)! Unfortunately I am now stuck on a different question:

"Write a function called highlight() that prompts the user for a string. Your code should ensure that the string is all lower case. Next, prompt the user for a smaller 'substring' of one or more characters. Then replace every occurrence of the substring in the first string with an upper case version. Finally, report to the user how many changes were made (i.e., how many occurrences of the substring there were)."

On Jan 11, 2014, at 1:04 AM, Alex Kleider aklei...@sonic.net wrote: On 2014-01-10 17:57, Amy Davidson wrote: Hey Danny, I just started taking the course (Introduction to Computer Science) last Tuesday, so I am not too familiar. I have been doing my best to understand the material by reading the textbook, Learn Python the Hard Way.

A lot of people seem to think the Hard Way is the way to go. I disagree. I found that Allen Downey's book is excellent and free (although the book is also available in 'real' print, which works better for me): http://www.greenteapress.com/thinkpython/ My copy covers Python 2.7; you use Python 3, I believe, but I doubt that will be too much of a problem. At the intro level the differences are few. ak

In my quest to answer the question given to me, I have searched the internet high and low for other functions; thus, I am familiar with the basic knowledge of them (i.e. starting with def) as well as examples. We can attempt the approach to the method that you prefer. Thanks for helping me, by the way.

On Jan 10, 2014, at 5:25 PM, Danny Yoo d...@hashcollision.org wrote: On Fri, Jan 10, 2014 at 2:00 PM, Keith Winston keithw...@gmail.com wrote: Amy, judging from Danny's replies, you may be emailing him and not the list. If you want others to help, or to report on your progress, you'll need to make sure the tutor email is in your reply-to.

Hi Amy, Very much so. Please try to use Reply to All if you can. If you're wondering why I'm asking you to try to recall any other example function definitions, I'm doing so specifically because it is a general problem-solving technique. Try to see if the problem that's stumping you is similar to things you've seen before. Several of the heuristics from Polya's How to Solve It refer to this: http://en.wikipedia.org/wiki/How_to_Solve_It If you haven't ever seen any function definition before, then we do have to start from square one. But this would be a very strange scenario: to be asked to write a function definition without having seen any previous definitions. If you have seen a function before, then one approach we might take is to try to make analogies to those previous examples. That's an approach I'd prefer.
___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
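A minimal sketch of the string logic behind the highlight() exercise quoted above, with the transformation separated from the prompting so it can be checked on its own (the two-argument signature is an illustration; the assignment asks for input() prompts around it):

```python
def highlight(s, sub):
    """Lower-case s, upper-case every occurrence of sub in it,
    and return the new string plus the number of replacements."""
    s = s.lower()
    sub = sub.lower()
    count = s.count(sub)          # non-overlapping occurrences
    return s.replace(sub, sub.upper()), count
```

A wrapper would then prompt with input(), call this, and print the result and the count.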
[Tutor] trying to parse an xml file
Hi. Looking at a file -- http://www.marquette.edu/mucentral/registrar/snapshot/fall13/xml/BIOL_bysubject.xml The file is generated via an online/web url, and appears to be XML. However, when I use ElementTree:

    document = ElementTree.parse('/apps/parseapp2/testxml.xml')

I get an invalid error: not well-formed (invalid token). I started to go through the file to remove offending chars, but decided there has to be a better approach. I also looked at the underlying url/page to see what it's doing with the javascript to parse the XML. Anyone have any python suggestions as to how to proceed to parse out the data? thanks

the javascript chunk::

    var dsSnapshot = new Spry.Data.XMLDataSet("xml/BIOL_bysubject.xml", "RECORDS/RECORD");
    dsSnapshot.setColumnType("nt", "html");
    dsSnapshot.setColumnType("ti", "html");
    dsSnapshot.setColumnType("new", "html");
    dsSnapshot.setColumnType("se", "html");
    dsSnapshot.setColumnType("mt", "html");
    dsSnapshot.setColumnType("ex", "html");
    dsSnapshot.setColumnType("in", "html");

___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
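One hedged approach before reaching for a different parser: pre-clean the text of two common "not well-formed (invalid token)" culprits in web-generated feeds (control characters and bare ampersands) and then hand it to ElementTree. This is a sketch, not a guaranteed fix for that particular file:

```python
import re
from xml.etree import ElementTree

def parse_lenient(text):
    # strip control characters that XML 1.0 forbids
    cleaned = re.sub("[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)
    # escape bare ampersands that aren't already part of an entity
    cleaned = re.sub(r"&(?!\w+;|#\d+;)", "&amp;", cleaned)
    return ElementTree.fromstring(cleaned)
```

If the file is broken in deeper ways, lxml's XMLParser(recover=True) is the heavier-duty option, at the cost of a third-party dependency.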
[Tutor] libtiff--can't find library
Presumably this is a newbie question; apologies in advance, but I have spent hours trying to RTFM, to no avail. Can anyone help? I've installed pylibtiff-0.1-svn.win32.exe since I want to be able to read a TIFF file. But when I type (in IDLE)

    from libtiff import TIFFfile, TIFFimage

I get

    Traceback (most recent call last):
      File "<pyshell#1>", line 1, in <module>
        from libtiff import TIFFfile, TIFFimage
      File "E:\Python27\lib\site-packages\libtiff\__init__.py", line 4, in <module>
        from .libtiff import libtiff, TIFF
      File "E:\Python27\lib\site-packages\libtiff\libtiff.py", line 35, in <module>
        raise ImportError('Failed to find TIFF library. Make sure that libtiff is installed and its location is listed in PATH|LD_LIBRARY_PATH|..')
    ImportError: Failed to find TIFF library. Make sure that libtiff is installed and its location is listed in PATH|LD_LIBRARY_PATH|..

    >>> import sys
    >>> print sys.path
    ['E:\\Python27\\Lib\\idlelib', 'E:\\Windows\\system32\\python27.zip', 'E:\\Python27\\DLLs', 'E:\\Python27\\lib', 'E:\\Python27\\lib\\plat-win', 'E:\\Python27\\lib\\lib-tk', 'E:\\Python27', 'E:\\Python27\\lib\\site-packages']

Libtiff is in the 'E:\\Python27\\lib\\site-packages' directory, as it's supposed to be. So is, e.g., Numpy, which imports just fine. What am I doing wrong? FWIW, I tried the PIL package, and had the same problem (module not found). Why do these modules not import when Numpy, matplotlib, scipy, etc. import as expected? Running Win7, 32bit, Python 2.7.1. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
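A quick diagnostic sketch: the error above is not about sys.path. pylibtiff is a ctypes wrapper around the native libtiff DLL, which must be findable on PATH independently of the Python package in site-packages. ctypes can report whether the loader actually sees it (the library names below are guesses, since they vary by build):

```python
import ctypes.util

# None means the dynamic loader cannot find that library name,
# which is exactly the situation pylibtiff's ImportError describes
for name in ("tiff", "libtiff", "libtiff3"):
    location = ctypes.util.find_library(name)
    print(name, "->", location)
```

If every lookup prints None, installing a libtiff binary and adding its directory to PATH (not to sys.path) should be the fix.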
[Tutor] generating unique set of dicts from a list of dicts
trying to figure out how to generate a unique set of dicts from a json/list of dicts. The initial (test) list:

    [{"pStart1a": {"termVal": "1122", "termMenu": "CLASS_SRCH_WRK2_STRM",
                   "instVal": "OSUSI", "instMenu": "CLASS_SRCH_WRK2_INSTITUTION",
                   "goBtn": "CLASS_SRCH_WRK2_SSR_PB_SRCH",
                   "pagechk": "CLASS_SRCH_WRK2_SSR_PB_SRCH",
                   "nPage": "CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
      "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON",
                    "srchbtn": "DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
     {"pStart1": ""},
     {"pStart1a": {"termVal": "1122", "termMenu": "CLASS_SRCH_WRK2_STRM",
                   "instVal": "OSUSI", "instMenu": "CLASS_SRCH_WRK2_INSTITUTION",
                   "goBtn": "CLASS_SRCH_WRK2_SSR_PB_SRCH",
                   "pagechk": "CLASS_SRCH_WRK2_SSR_PB_SRCH",
                   "nPage": "CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
      "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON",
                    "srchbtn": "DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
     {"pStart1": ""}]

Trying to get the following list of unique dicts, so there aren't duplicate dicts. Searched various sites/SO.. and still have a mental block.

    [{"pStart1a": {"termVal": "1122", "termMenu": "CLASS_SRCH_WRK2_STRM",
                   "instVal": "OSUSI", "instMenu": "CLASS_SRCH_WRK2_INSTITUTION",
                   "goBtn": "CLASS_SRCH_WRK2_SSR_PB_SRCH",
                   "pagechk": "CLASS_SRCH_WRK2_SSR_PB_SRCH",
                   "nPage": "CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
      "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON",
                    "srchbtn": "DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
     {"pStart1": ""}]

I was considering iterating through the initial list, copying each dict into a new list, and doing a basic comparison, adding the next dict if it's not in the new list.. is there another/better way? Posted this to StackOverflow as well: http://stackoverflow.com/questions/8808286/simplifying-a-json-list-to-the-unique-dict-items There was a potential soln that I couldn't understand:

- The simplest approach -- using list(set(your_list_of_dicts)) -- won't work because Python dictionaries are mutable and not hashable (that is, they don't implement __hash__). This is because Python can't guarantee that the hash of a dictionary won't change after you insert it into a set or dict. However, in your case, since you (don't seem to be) modifying the data at all, you can compute your own hash, and use this along with a dictionary to relatively easily find the unique JSON objects without having to do a full recursive comparison of each dictionary to the others. First, we need a function to compute a hash of the dictionary. Rather than trying to build our own hash function, let's use one of the built-in ones from hashlib:

    import hashlib

    def dict_hash(d):
        out = hashlib.md5()
        for key, value in d.iteritems():
            out.update(unicode(key))
            out.update(unicode(value))
        return out.hexdigest()

(Note that this relies on unicode(...) for each of your values returning something unique -- if you have custom classes in the dictionaries whose __unicode__ returns something like "MyClass instance", this will fail or will require modification. Also, in your example, your dictionaries are flat, but I'll leave it as an exercise to the reader how to expand this solution to work with dictionaries that contain other dicts or lists.) Since dict_hash returns a string, which is immutable, you can now use a dictionary to find the unique elements:

    uniques_map = {}
    for d in list_of_dicts:
        uniques[dict_hash(d)] = d
    unique_dicts = uniques_map.values()

*** not sure what the "uniques" is, or what/how it should be defined

thoughts/comments are welcome thanks ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
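If the dicts only contain JSON-serializable values, as they appear to here, a simpler route than hand-rolled md5 hashing is to use a canonical JSON dump as the dictionary key; sort_keys makes key order irrelevant, and nested dicts are handled for free. A sketch:

```python
import json

def unique_dicts(list_of_dicts):
    """Return the dicts with duplicates removed, keeping first
    occurrences in their original order."""
    seen = {}
    for d in list_of_dicts:
        # canonical serialization: equal dicts produce equal strings
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen[key] = d
    return list(seen.values())
```

This sidesteps the "dicts aren't hashable" problem the quoted answer describes, because the JSON string stands in as the hashable key.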
[Tutor] list issue.. i think
hi. got a test where i have multiple lists with key/values. trying to figure out how to do a join/multiply, or whatever python calls it, where i have a series of resulting lists/dicts that look like the following.. the number of lists/rows is dynamic.. the size of the lists/rows will be dynamic as well. i've looked over the py docs, as well as different potential solns.. pseudo code, or pointers, would be helpful. thanks...

test data:

    a = {}
    a['a1'] = ['a1', 'a2', 'a3']
    a['a2'] = ['b1', 'b2', 'b3']
    a['a3'] = ['c1', 'c2', 'c3']

result:

    a1:a1, a2:b1, a3:c1
    a1:a2, a2:b1, a3:c1
    a1:a3, a2:b1, a3:c1
    a1:a1, a2:b2, a3:c1
    a1:a2, a2:b2, a3:c1
    a1:a3, a2:b2, a3:c1
    a1:a1, a2:b3, a3:c1
    a1:a2, a2:b3, a3:c1
    a1:a3, a2:b3, a3:c1
    a1:a1, a2:b1, a3:c2
    a1:a2, a2:b1, a3:c2
    a1:a3, a2:b1, a3:c2
    a1:a1, a2:b2, a3:c2
    a1:a2, a2:b2, a3:c2
    a1:a3, a2:b2, a3:c2
    a1:a1, a2:b3, a3:c2
    a1:a2, a2:b3, a3:c2
    a1:a3, a2:b3, a3:c2
    a1:a1, a2:b1, a3:c3
    a1:a2, a2:b1, a3:c3
    a1:a3, a2:b1, a3:c3
    a1:a1, a2:b2, a3:c3
    a1:a2, a2:b2, a3:c3
    a1:a3, a2:b2, a3:c3
    a1:a1, a2:b3, a3:c3
    a1:a2, a2:b3, a3:c3
    a1:a3, a2:b3, a3:c3

___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
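What's described above is a Cartesian product of the value lists, which the stdlib covers directly; a sketch with the test data (sorting the keys is an assumption about the desired ordering, since dict key order wasn't guaranteed in the Python of that era):

```python
import itertools

# the test data from the post
a = {
    'a1': ['a1', 'a2', 'a3'],
    'a2': ['b1', 'b2', 'b3'],
    'a3': ['c1', 'c2', 'c3'],
}

# fix a key order, then take the Cartesian product of the value lists;
# each combination becomes one result dict
keys = sorted(a)
rows = [dict(zip(keys, combo))
        for combo in itertools.product(*(a[k] for k in keys))]
# rows now holds all 3*3*3 = 27 combinations
```

This works for any number of keys and any list lengths, which matches the "dynamic" requirement: itertools.product takes as many iterables as you hand it.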
Re: [Tutor] multi-threaded/parallel processing - local tutor
not looking for docs.. already have code. looking to actually talk to someone in the san fran/bay area for an in-person talk/tutor session. thanks 2011/1/22 शंतनू shanta...@gmail.com: You may find the following useful. 2.6+ --- http://docs.python.org/library/multiprocessing.html 3.x --- http://docs.python.org/dev/library/multiprocessing.html On 23-Jan-2011, at 11:46 AM, bruce wrote: Hi. I'm working on a project that uses python to spawn/create multiple threads, to run parallel processes to fetch data from websites. I'm looking to (if possible) go over this in person with someone in the San Fran area. Lunch/beer/oj can be on me!! It's a little too complex to try to describe here, and pasting the code/apps wouldn't do any good without an associated conversation. So, if you're in the Bay area, and you're up to some in-person tutoring, let me know. Thanks guys for this list!! ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
[Tutor] multi-threaded/parallel processing - local tutor
Hi. I'm working on a project that uses python to spawn/create multiple threads, to run parallel processes to fetch data from websites. I'm looking to (if possible) go over this in person with someone in the San Fran area. Lunch/beer/oj can be on me!! It's a little too complex to try to describe here, and pasting the code/apps wouldn't do any good without an associated conversation. So, if you're in the Bay area, and you're up to some in person tutoring, let me know. Thanks guys for this list!! ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
[Tutor] list of tutors for python
Hi guys. Please don't slam me!! I'm working on a project and looking for a pretty good number of pythonistas. I'm trying to find resources I could use to find them, and thought I'd try here for suggestions. Any comments would be appreciated. Thanks ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] [Visualpython-users] tkinter and visual with objects
The following works to produce a window with nothing displayed in it:

    ball = sphere()
    ball.visible = 0

Another scheme would be this:

    scene.range = 1
    ball = sphere(radius=1e-6)

The point is that Visual doesn't create a window unless there is something to display. Bruce Sherwood

Mr Gerard Kelly wrote: I'm trying to make this very simple program, where the idea is that you click a tkinter button named "Ball" and then a ball will appear in the visual window. Problem is that the window itself doesn't pop up until the button is pressed and the ball is created. I would like it to start out blank, and then have the ball appear in it when the button is pressed. I thought that having self.display=display() in the __init__ of the Application would do this, but it doesn't seem to. What do I need to add to this code to make it start out with a blank window?

    from visual import *
    from Tkinter import *
    import sys

    class Ball:
        def __init__(self):
            sphere(pos=(0,0,0))

    class Application:
        def __init__(self, root):
            self.frame = Frame(root)
            self.frame.pack()
            self.display = display()
            self.button = Button(self.frame, text="Ball", command=self.ball)
            self.button.pack()
        def ball(self):
            self.ball = Ball()

    root = Tk()
    app = Application(root)
    root.mainloop()

___ Visualpython-users mailing list visualpython-us...@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/visualpython-users ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] [Visualpython-users] VPython and Tkinter
In Visual 3, there is an example program (Tk-visual.py, if I remember correctly) which shows a Tk window controlling actions in a separate Visual window. In Visual 5, I believe that this program would still work on Windows and Linux, but because there seems to be no way to make this work in the Carbon-based Mac version, the application was removed from the set of examples, which are platform-independent. Bruce Sherwood Mr Gerard Kelly wrote: Is there a way to make separate VPython and Tkinter windows run simultaneously from the one program? Or to have the VPython window run inside a Tkinter toplevel? If I have a program that uses both, it seems that one window has to close before the other will start running. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor