[Tutor] beautifulsoup
I am trying to scrape from the (span class='Number') elements. The content on the pages I am scraping looks like this:

    9910.00 (-0.1%)  Menu
    Max Quantity 100.000
    Average Quantity 822
    Previous Order 96
    Max Price 104
    Number of Trades 383
    Min Price 59
    Total Amount 800
    Start 10
    Low 98

I have tried to use BeautifulSoup to scrape the data; however, it prints nothing on the screen:

    from bs4 import BeautifulSoup

    html = response.content
    soup = BeautifulSoup(html, "html.parser")
    title = soup.select('td.styleB')[0].next_sibling
    title1 = soup.find_all('span', attrs={'class': 'Number'}).next_sibling
    print(title1)

I am hoping to retrieve the numbers as follows:

    Max Quantity: 100
    Average Quantity: 822
    Previous Order: 96
    Max Price: 104
    Number of Trades: 383
    Min Price: 59
    Total Amount: 800
    Start: 10
    Low: 98

Please advise what the problem with my code is in handling this query. Thank you

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
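[Editor's note] The likely culprit is that find_all() returns a ResultSet (a list of tags), which has no next_sibling attribute; each tag inside it does. A minimal sketch of the fix, using invented sample markup since the real page is not reproduced here (only the class name "Number" comes from the post):

```python
from bs4 import BeautifulSoup

# Invented sample markup; only the class name "Number" is from the post.
html = """
<span class="Number">100.000</span>
<span class="Number">822</span>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list of Tags); a list has no
# .next_sibling attribute, so iterate and read each tag's text instead.
numbers = [span.get_text() for span in soup.find_all("span", attrs={"class": "Number"})]
print(numbers)
```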
[Tutor] Unable to download, using Beautifulsoup
I am using Python 3 on Windows 7. However, I am unable to download some of the data listed on this web site:

http://data.tsci.com.cn/stock/00939/STK_Broker.htm

    453.IMC 98.28M 18.44M 4.32 5.33
    1499.Optiver 70.91M 13.29M 3.12 5.34
    7387.花旗环球 52.72M 9.84M 2.32 5.36

When I use Google Chrome's 'View Page Source', the data does not show up at all. However, when I use 'Inspect', I am able to read the data:

    '1453.IMC' '98.28M' '18.44M' '4.32' '5.33'
    '1499.Optiver ' ' 70.91M' '13.29M ' '3.12' '5.34'

Please kindly explain whether the data is hidden in a CSS style sheet, or whether there is any way to retrieve the data listed. Thank you

Regards,
Crusier

    from bs4 import BeautifulSoup
    import requests

    stock_code = ('00939', '0001')

    def web_scraper(stock_code):
        broker_url = 'http://data.tsci.com.cn/stock/'
        end_url = '/STK_Broker.htm'
        for code in stock_code:
            new_url = broker_url + code + end_url
            response = requests.get(new_url)
            html = response.content
            soup = BeautifulSoup(html, "html.parser")
            Buylist = soup.find_all('div', id="BuyingSeats")
            Selllist = soup.find_all('div', id="SellSeats")
            print(Buylist)
            print(Selllist)

    web_scraper(stock_code)
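[Editor's note] When a value shows up under 'Inspect' but not in 'View Page Source', it is filled in by JavaScript after the page loads, so requests + BeautifulSoup alone cannot see it; you would need the page's underlying data request or a browser driver such as Selenium. A quick way to confirm this, sketched with invented stand-in strings rather than a live fetch:

```python
# Invented stand-ins: raw_html is what requests.get(url).text would return,
# rendered_html is what the browser's Inspect view shows after JavaScript runs.
raw_html = '<div id="BuyingSeats"></div>'
rendered_html = '<div id="BuyingSeats">453.IMC 98.28M</div>'

def filled_by_javascript(page_source, needle):
    """True when a value visible in the browser is absent from the raw source."""
    return needle not in page_source

print(filled_by_javascript(raw_html, "453.IMC"))       # True: JS fills it in
print(filled_by_javascript(rendered_html, "453.IMC"))  # False: it is in the HTML
```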
[Tutor] Fwd: 10+ Different Python Script?
Dear All,

I am currently using Python 3.5 on Windows 7.

I have a Python script which downloads real-time stock quotes. Currently there are over 1000 stocks in the portfolio. If I download the market info during market hours, it takes over 40 minutes to finish downloading all of it. Since I am trading stocks in real time, I am thinking of breaking the portfolio down into groups of 100 stocks each, to make the download time shorter. The market info will go into a database for data analysis.

My question is as follows: please advise whether I should split the portfolio into 10 different scripts, so that each script handles a list of 100+ stocks, or whether there is a more Pythonic way to approach this problem. I am currently using multiple threads, but I am running into an OperationalError. I am also having a hard time approaching OOP; I just can't grasp it, as it seems so abstract.

Please advise. Thank you

Regards,
Hank
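[Editor's note] One portfolio and one script can still download concurrently: a thread pool caps how many requests run at once, with no need for 10 separate files. A minimal sketch with a stand-in fetch function (the real downloader and stock codes are not shown in the post). As an aside, an sqlite OperationalError under threads often comes from several threads sharing one connection; writing from a single thread, or opening one connection per thread, is the usual remedy.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the real downloader, which is not shown in the post.
def fetch_quote(code):
    return code, "quote-for-" + code

codes = ["%04d" % n for n in range(1, 101)]  # invented sample portfolio

# Up to 10 downloads run at a time; one list, one script.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = dict(pool.map(fetch_quote, codes))

print(len(results))     # 100
print(results["0001"])  # quote-for-0001
```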
Re: [Tutor] Web Page Scraping
Hi Walter,

Thank you for taking the time to explain all of this. Have a great day.

Cheers,
Hank

On Tue, May 24, 2016 at 10:45 PM, Walter Prins wrote:
> On 24 May 2016 at 15:37, Walter Prins wrote:
>> print(name1.encode(sys.stdout.encoding, "backslashreplace"))
>
> I forgot to mention, you might want to read the following documentation page:
>
> https://docs.python.org/3/howto/unicode.html
>
> (good luck.)
>
> W
[Tutor] Web Page Scraping
Dear All,

I am trying to scrape a web site using Beautiful Soup. However, BS doesn't show any of the data. I am wondering if it is JavaScript or some other feature which hides all the data. I have the following questions:

1) Please advise how to scrape the following data from this page:

http://www.dbpower.com.hk/en/quote/quote-warrant/code/10348

Type, Listing Date (Y-M-D), Call / Put, Last Trading Day (Y-M-D), Strike Price, Maturity Date (Y-M-D), Effective Gearing (X), Time to Maturity (D), Delta (%), Daily Theta (%), Board Lot...

2) I am able to scrape most of the data from the same site:

http://www.dbpower.com.hk/en/quote/quote-cbbc/code/63852

Please advise what the difference between these two pages is.

Attached is my code. Thank you

Regards,
Hank

    from bs4 import BeautifulSoup
    import requests

    warrants = ['10348']

    def web_scraper(warrants):
        url = "http://www.dbpower.com.hk/en/quote/quote-warrant/code/"
        # Scrape from the web
        for code in warrants:
            new_url = url + code
            response = requests.get(new_url)
            html = response.content
            soup = BeautifulSoup(html, "html.parser")
            print(soup)
            name = soup.findAll('div', attrs={'class': 'article_content'})
            #print(name)
            for n in name:
                name1 = str(n.text)
                s_code = name1[:4]
                print(name1)

    web_scraper(warrants)
Re: [Tutor] Tutor Digest, Vol 147, Issue 30
Dear Peter & Alan,

Thanks a lot. Have a great day

Cheers,
Hank

On Fri, May 20, 2016 at 12:00 AM, <tutor-requ...@python.org> wrote:
> Send Tutor mailing list submissions to
>         tutor@python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://mail.python.org/mailman/listinfo/tutor
> or, via email, send a message with subject or body 'help' to
>         tutor-requ...@python.org
>
> You can reach the person managing the list at
>         tutor-ow...@python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Tutor digest..."
>
> Today's Topics:
>
>   1. Re: SQLite (Peter Otten)
>
> Message: 1
> Date: Thu, 19 May 2016 11:20:32 +0200
> From: Peter Otten <__pete...@web.de>
> To: tutor@python.org
> Subject: Re: [Tutor] SQLite
>
> Crusier wrote:
>
>> Dear Alan,
>>
>> I have read your web page and tried to test things out on SQLite.
>>
>> Attached is my code:
>>
>> import sqlite3
>> conn = sqlite3.connect('example1.db')
>> c = conn.cursor()
>> c.execute('drop table if exists stocks')
>> c.execute('''CREATE TABLE stocks
>>              (code text)''')
>>
>> # Insert a row of data
>> List = ['1', '2', '3', '4', '5', '6', '7',
>>         '8', '9', '00010', '00011', '00012']
>
> List is a bad name; use something related to the problem domain, e.g.
> stocks.
>
>> c.executemany('INSERT INTO stocks VALUES (?)', List)
>>
>> # Save (commit) the changes
>> conn.commit()
>>
>> # We can also close the connection if we are done with it.
>> # Just be sure any changes have been committed or they will be lost.
>> conn.close()
>>
>> The following error came out:
>> sqlite3.ProgrammingError: Incorrect number of bindings supplied. The
>> current statement uses 1, and there are 5 supplied.
>>
>> Please advise.
>
> The List argument is interpreted as a sequence of records, and thus what
> you meant as a single value, e.g. "1", is seen as a sequence of fields,
> i.e. every character counts as a separate value.
>
> To fix the problem you can either change the list to a list of tuples or
> lists
>
>     List = [['1'], ['2'], ['3'], ...]
>
> or add a zip() call in the line
>
>     c.executemany('INSERT INTO stocks VALUES (?)', zip(List))
>
> which has the same effect:
>
>>>> list(zip(["foo", "bar", "baz"]))
> [('foo',), ('bar',), ('baz',)]
>
> End of Tutor Digest, Vol 147, Issue 30
[Tutor] SQLite
Dear Alan,

I have read your web page and tried to test things out on SQLite.

Attached is my code:

    import sqlite3

    conn = sqlite3.connect('example1.db')
    c = conn.cursor()
    c.execute('drop table if exists stocks')
    c.execute('''CREATE TABLE stocks
                 (code text)''')

    # Insert a row of data
    List = ['1', '2', '3', '4', '5', '6', '7',
            '8', '9', '00010', '00011', '00012']

    c.executemany('INSERT INTO stocks VALUES (?)', List)

    # Save (commit) the changes
    conn.commit()

    # We can also close the connection if we are done with it.
    # Just be sure any changes have been committed or they will be lost.
    conn.close()

The following error came out:

    sqlite3.ProgrammingError: Incorrect number of bindings supplied. The
    current statement uses 1, and there are 5 supplied.

Please advise. Thank you very much.

Regards,
Crusier
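[Editor's note] For reference, a runnable sketch of the zip() fix suggested in the digest reply, using an in-memory database so it leaves no file behind:

```python
import sqlite3

# In-memory database so this sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE stocks (code text)")

codes = ['1', '2', '3', '00010', '00011', '00012']

# executemany() wants a sequence of rows; zip() wraps each string in a
# 1-tuple row, so "00010" is one value instead of five characters.
c.executemany("INSERT INTO stocks VALUES (?)", zip(codes))
conn.commit()

rows = c.execute("SELECT code FROM stocks ORDER BY code").fetchall()
print(rows)  # [('00010',), ('00011',), ('00012',), ('1',), ('2',), ('3',)]
conn.close()
```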
Re: [Tutor] Tutor Digest, Vol 147, Issue 10
Thanks, Alan. Have a great day.

Henry

On Wed, May 4, 2016 at 12:00 AM, <tutor-requ...@python.org> wrote:
> Today's Topics:
>
>   1. Re: sqlite (Alan Gauld)
>
> Message: 1
> Date: Tue, 3 May 2016 16:40:17 +0100
> From: Alan Gauld <alan.ga...@yahoo.co.uk>
> To: tutor@python.org
> Subject: Re: [Tutor] sqlite
>
> On 03/05/16 10:09, Crusier wrote:
>
>> I am just wondering if there is any good reference from which I can
>> learn how to program SQLite using Python.
>>
>> I cannot find any book on SQLite with Python.
>
> You can try my tutorial below.
>
> http://www.alan-g.me.uk/tutor/tutdbms.htm
>
> If you want very similar information in book form then
> our book 'Python Projects' contains a chapter on databases,
> half of which is SQLite based.
>
> If you want a good book on SQLite itself I can recommend:
>
> Using SQLite by Kreibich.
>
> hth
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
> End of Tutor Digest, Vol 147, Issue 10
[Tutor] sqlite
Dear All,

I am just wondering if there is any good reference from which I can learn how to program SQLite using Python. I cannot find any book on SQLite with Python.

Thank you

Regards,
Hank
[Tutor] Beautiful Soup
Hi Python Tutors,

I am currently able to strip the page down to the string I want. However, I have problems with the JSON script, and I am not sure how to turn it into a dictionary.

    import json
    import requests
    from bs4 import BeautifulSoup

    url = 'https://bochk.etnet.com.hk/content/bochkweb/eng/quote_transaction_daily_history.php?code=6881=F=09=16=S=44c99b61679e019666f0570db51ad932=0=0'

    def web_scraper(url):
        response = requests.get(url)
        html = response.content
        soup = BeautifulSoup(html, 'lxml')
        stock1 = soup.findAll('script')[4].string
        stock2 = stock1.split()
        stock3 = stock2[3]
        # is stock3 sufficient to process as JSON, or does it need further cleaning?
        text = json.dumps(stock3)
        print(text)

    web_scraper(url)

If possible, please give me some pointers. Thank you

Regards,
Henry
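[Editor's note] json.dumps() goes the wrong way (a Python object to a string); json.loads() is what parses the extracted string into a dictionary. A sketch with an invented script body standing in for soup.findAll('script')[4].string, since the real page's JavaScript is not shown:

```python
import json

# Invented stand-in for the text of the page's <script> tag.
script_text = 'var data = {"code": "6881", "trades": [["15:59:59", "A", 500, 6.79]]};'

# Cut the JSON literal out of the surrounding JavaScript, then parse it.
start = script_text.index("{")
end = script_text.rindex("}") + 1
data = json.loads(script_text[start:end])

print(data["code"])       # 6881
print(data["trades"][0])  # ['15:59:59', 'A', 500, 6.79]
```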
Re: [Tutor] Tutor Digest, Vol 142, Issue 11
Dear Alan,

Thank you very much for answering the question. If you don't mind, please kindly let me know which library I should focus on among BeautifulSoup, Selenium + PhantomJS, and dryscrape.

Have a good day

Regards,
Hank

On Sun, Dec 13, 2015 at 5:11 PM, <tutor-requ...@python.org> wrote:
> Today's Topics:
>
>   1. Re: Calculation error with a simple program (Steven D'Aprano)
>   2. Re: Calculation error with a simple program (Jim Gallaher)
>   3. Re: Calculation error with a simple program (Todd Purple)
>   4. Re: Tutor Digest, Vol 142, Issue 10 (Ken Hammer)
>   5. Beautiful Soup (Crusier)
>   6. Re: Beautiful Soup (Alan Gauld)
>
> Message: 1
> Date: Sun, 13 Dec 2015 04:00:25 +1100
> From: Steven D'Aprano <st...@pearwood.info>
> To: tutor@python.org
> Subject: Re: [Tutor] Calculation error with a simple program
>
> On Sat, Dec 12, 2015 at 01:03:05AM -0600, Jim Gallaher wrote:
>> Hi everyone. I'm reading through a beginners Python book and came up
>> with a super simple program. I'm not getting any errors and everything
>> runs through, but there's a logical calculation error. What the program
>> does is take an amount and calculate a couple percentages and add a
>> couple fees.
>>
>> For example, if I put in a value of 1, it will output 752.12 as the sub
>> total and 753.12 as the grand total. It's off by 1 on sub total and 2 on
>> grand total.
>
> Check your arithmetic -- with a base price of $1, the sub total is
> $752.12 and the grand total is $753.12, exactly as Python calculates.
>
> But I wonder whether you have made a logic mistake in your code. You
> say:
>
>     dealerPrep = basePrice + 500
>     destinationCharge = basePrice + 250
>
> This means that the car will cost more than TRIPLE the base price.
> Instead of $1, enter a base price of $40,000 and you will see what I
> mean: now the dealer prep is $40500 and the destination charge is
> $40250, which added to the base price gives $120750 (plus tax and
> licence). Surely that's not right; the dealer charges more than the cost
> of the car for preparation? I think what you want is:
>
>     dealerPrep = 500
>     destinationCharge = 250
>
> --
> Steve
>
> Message: 2
> Date: Sat, 12 Dec 2015 11:34:27 -0600
> From: Jim Gallaher <jcgallahe...@gmail.com>
> To: tutor@python.org
> Subject: Re: [Tutor] Calculation error with a simple program
>
> Hi Alan,
>
> I'm 100 percent sure I'm wrong. :-) I verified it when I fixed the mistake.
> The problem was it was adding in the basePrice and the fixed
> rates/percentages each time. So I figured it out when Ian said something
> about that.
>
> Thanks for everyone's help! :-)
>
> Message: 3
> Date: Sat, 12 Dec 2015 08:13:23 -0500
> From: Todd Purple <todd_pur...@yahoo.com>
> To: Jim Gallaher <jcgallahe...@gmail.com>
> Cc: tutor@python.org
> Subject: Re: [Tutor] Calculation error with a simple program
>
>> On Dec 12, 2015, at 2:03 AM, Jim Gallaher <jcgallahe...@gmail.com> wrote:
>>
>> Hi everyone. I'm reading through a beginners Python book and came up with
>> a super simple program. I'm not getting any errors and everything runs
>> through, but there's a logical calculation error. What the program does
>> is take an amount and calculate a couple percentages and add a couple
>> fees.
>>
>> For example, if I put in a value of 1, it will output 752.12 as the sub
>> total and 753.12 as the grand total. It's off by 1 on sub total and 2 on
>> grand total.
>>
>> Thanks in advance! Jim Gallaher
>>
>> # Car Salesman Calculator
>>
>> # User enters the base price of the ca
[Tutor] Beautiful Soup
Dear All,

I am trying to scrape the following website; however, I have encountered some problems. As you can see, I am not really familiar with regex, and I hope you can give me some pointers on how to solve this problem.

I hope to download all the transaction data into a database, but I need to retrieve it first. The data I hope to retrieve looks like this:

    15:59:59 A 500 6.790 3,395
    15:59:53 B 500 6.780 3,390

Thank you. Below is my code:

    from bs4 import BeautifulSoup
    import requests

    url = 'https://bochk.etnet.com.hk/content/bochkweb/eng/quote_transaction_daily_history.php?code=6881=F=09=16=S=44c99b61679e019666f0570db51ad932=0=0'

    def turnover_detail(url):
        response = requests.get(url)
        html = response.content
        soup = BeautifulSoup(html, "html.parser")
        data = soup.find_all("script")
        for script in data:
            print(script)

    turnover_detail(url)

Best Regards,
Henry
[Tutor] Beautifulsoup Queries
Dear All,

I am using Python 3.4. I am trying to scrape the web and eventually put the data into a database for analysis. While using Beautifulsoup to scrape the web, I encountered 2 problems:

1. The webmaster on the other end is using the same class everywhere, so I get one big list of information (High, Low, Previous Close, Shares Traded, Turnover, etc.), and I really want to divide it up into separate categories.

2. If I write print(data.string) at line 27, I get an error ('ResultSet' object has no attribute 'string'). However, if I write print(data), the span elements print out.

    from bs4 import BeautifulSoup
    import requests

    tryout = ['0001', '0002', '0003', '0004', '0005', '0006', '0007',
              '0008', '0009', '0010', '0011', '0012', '0014', '0015',
              '0016', '0017', '0018', '0019', '0020']

    url = "https://www.etnet.com.hk/www/eng/stocks/realtime/quote.php?code="

    def web_scraper(url):
        for n in tryout:
            URL = url + n
            response = requests.get(URL)
            html = response.content
            soup = BeautifulSoup(html, "html.parser")
            # print(soup.prettify())

            Title = soup.find("div", attrs={"id": "StkQuoteHeader"})
            RT_down = soup.find("span", attrs={"class": "Price down2"})
            RT_up = soup.find("span", attrs={"class": "Price up2"})
            RT_unchange = soup.find("span", attrs={"class": "Price unchange2"})
            change_percent = soup.find("span", attrs={"class": "Change"})
            Day_High = soup.findAll("span", attrs={"class": "Number"})

            for data in [Title, RT_down, RT_up, RT_unchange,
                         change_percent, Day_High]:
                if data:
                    print(data)

    web_scraper(url)
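[Editor's note] The error in point 2 comes from findAll() returning a ResultSet (a list of tags) rather than a single tag; .string exists on each tag inside it. A minimal sketch on invented markup (only the class name "Number" comes from the post):

```python
from bs4 import BeautifulSoup

# Invented sample markup; the real quote page is not reproduced here.
html = '<span class="Number">98</span><span class="Number">104</span>'
soup = BeautifulSoup(html, "html.parser")

day_high = soup.findAll("span", attrs={"class": "Number"})
# day_high.string would raise AttributeError; take .string per tag instead.
values = [tag.string for tag in day_high]
print(values)  # ['98', '104']
```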
[Tutor] Problem on handling if statement
Dear All,

I am trying to do some web scraping. Attached below is my code:

    from bs4 import BeautifulSoup
    import requests

    url = "https://www.etnet.com.hk/www/eng/stocks/realtime/quote.php?code=0175"

    def web_scraper(url):
        response = requests.get(url)
        html = response.content
        soup = BeautifulSoup(html, "html.parser")

        real_time_down = soup.find("span", attrs={"class": "Price down2"})
        real_time_up = soup.find("span", attrs={"class": "Price up2"})
        real_time_unchange = soup.find("span", attrs={"class": "Price unchange2"})
        change_percent = soup.find("span", attrs={"class": "Change"})

        if real_time_down == soup.find("span", attrs={"class": "Price down2"}) \
                or real_time_up == soup.find("span", attrs={"class": "Price up2"}) \
                or real_time_unchange == soup.find("span", {"class": "Price unchange2"}):
            print(real_time_down)
            print(real_time_up)
            print(real_time_unchange)
            print(change_percent.string)
        else:
            return None

    web_scraper(url)

I have a problem trying to get rid of the None objects. For example, if I put 1177 into the url, real_time_down and real_time_unchange become None objects. I hope the program can sort this out automatically and print only the strings that exist.

Please help. Thank you very much

Regards,
Henry
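[Editor's note] find() returns None when nothing matches, so the usual pattern is to test each result before using it, rather than comparing it against a second find() call (a value compared with itself is always equal). A sketch on invented markup, keeping only the class names from the post:

```python
from bs4 import BeautifulSoup

# Invented markup: this "stock" is trading up, so only two spans exist.
html = '<span class="Price up2">6.79</span><span class="Change">+0.1%</span>'
soup = BeautifulSoup(html, "html.parser")

found = {}
for css_class in ("Price down2", "Price up2", "Price unchange2", "Change"):
    tag = soup.find("span", attrs={"class": css_class})
    if tag is not None:          # skip the classes absent from this page
        found[css_class] = tag.string

print(found)  # {'Price up2': '6.79', 'Change': '+0.1%'}
```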
[Tutor] Unable to retrieve the stock code
Dear All,

I am currently trying to download the stock codes. I am using Python 3.4, and my code is as follows:

    from bs4 import BeautifulSoup
    import requests
    import re

    url = 'https://www.hkex.com.hk/eng/market/sec_tradinfo/stockcode/eisdeqty.htm'

    def web_scraper(url):
        response = requests.get(url)
        html = response.content
        soup = BeautifulSoup(html, "html.parser")
        for link in soup.find_all("a"):
            stock_code = re.search('/d/d/d/d/d', "1")
            print(stock_code, '', link.text)
            print(link.text)

    web_scraper(url)

I am trying to retrieve the stock code from the link text or from the a href. Please kindly inform me which library I should use. Thanks

Henry
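[Editor's note] Two notes on the pattern: \d matches a digit, while /d (as written above) matches a slash followed by the letter d; and the search should run on each link's text rather than on the literal string "1". A sketch with invented link texts standing in for soup.find_all("a"):

```python
import re

# Invented link texts standing in for the real page's anchor elements.
link_texts = ["00001 CKH HOLDINGS", "00939 CCB", "About HKEX"]

codes = []
for text in link_texts:
    m = re.search(r"\d{4,5}", text)  # \d{4,5}: a run of four or five digits
    if m:
        codes.append(m.group())

print(codes)  # ['00001', '00939']
```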
[Tutor] Beautiful Soup
Hi,

I am using Python 3.4 and trying to do some web scraping at the moment. I got stuck because there is an IndexError: list index out of range if I put stock_code = (18). My desired output is for the script to detect and print the recent price, whether it is up, down or unchanged. Attached is the code:

    import requests
    from bs4 import BeautifulSoup

    stock_code = (939)
    url = ("http://www.etnet.com.hk/www/eng/stocks/realtime/quote.php?code="
           + str(stock_code))

    res = requests.get(url).text
    soup = BeautifulSoup(res, "html.parser")

    for item in soup.select('#StkDetailMainBox'):
        if item.select('.up2') == item.select('.up2'):
            print('Now is trading at UP', item.select('.up2')[0].text)
        elif item.select('.down2') == item.select('.down2'):
            print('Now is trading at DOWN', item.select('.down2')[0].text)
        elif item.select('.unchange2') == item.select('.unchange2'):
            print('Now is trading at UNCHANGE', item.select('.unchange2')[0].text)
        print('Change is ', item.select('.Change')[0].text)

    #for item in soup.select('.styleB'):
    #    print(item.select('.Number')[0].text)
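[Editor's note] Each comparison above compares a value with itself, so the first branch always runs and indexes an empty list whenever the stock is not up; that is the source of the IndexError. Testing the list's truthiness avoids it. A sketch on invented markup (only the id and class names come from the post):

```python
from bs4 import BeautifulSoup

# Invented markup for a stock trading down.
html = ('<div id="StkDetailMainBox">'
        '<span class="down2">6.78</span><span class="Change">-0.1%</span>'
        '</div>')
soup = BeautifulSoup(html, "html.parser")

lines = []
for item in soup.select('#StkDetailMainBox'):
    # select() returns a list; an empty list is falsy, so these tests
    # never index into a selector that matched nothing.
    if item.select('.up2'):
        lines.append('UP ' + item.select('.up2')[0].text)
    elif item.select('.down2'):
        lines.append('DOWN ' + item.select('.down2')[0].text)
    elif item.select('.unchange2'):
        lines.append('UNCHANGE ' + item.select('.unchange2')[0].text)

print(lines)  # ['DOWN 6.78']
```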
[Tutor] Beautiful Soup
Hi,

I have recently finished reading "Starting Out with Python" and I really want to do some web scraping. Please kindly advise where I can get more information about BeautifulSoup; the official documentation seems too hard for me.

Furthermore, I have tried to scrape this site, but it seems that there is an error. Please advise what I should do in order to overcome this.

    import urllib.request

    HKFile = urllib.request.urlopen(
        "https://bochk.etnet.com.hk/content/bochkweb/tc/quote_transaction_daily_history.php?code=2388")
    HKHtml = HKFile.read()
    HKFile.close()

    print(HKFile)

Thank you

Hank
[Tutor] String Problem
Dear All,

I have used urllib.request to download some information from a site. I am currently using Python 3.4. My program is as follows:

    import urllib.request

    response = urllib.request.urlopen(
        'http://www.hkex.com.hk/eng/ddp/Contract_Details.asp?PId=175')

    saveFile = open('HKEX.txt', 'w')
    saveFile.write(str(response.read()))
    saveFile.close()

And the result is as follows:

    d align=right - /tdtd align=right0/tdtd align=right8.56/tdtd align=rightN/A/tdtd align=right1/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr56 class=tableHdrB1 align=centertd align=centreC Jul-15 - 23.00/tdtd align=right - /tdtd align=right - /tdtd align=right0.01/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right0.01/tdtd align=rightN/A/tdtd align=right467/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr57 class=tableHdrB2 align=centertd align=centreP Jul-15 - 23.00/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right9.56/tdtd align=rightN/A/tdtd align=right0/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr58 class=tableHdrB1 align=centertd align=centreC Jul-15 - 24.00/tdtd align=right - /tdtd align=right - /tdtd align=right0.01/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right0.01/tdtd align=rightN/A/tdtd align=right156/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr59 class=tableHdrB2 align=centertd align=centreP Jul-15 - 24.00/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right10.56/tdtd align=rightN/A/tdtd align=right0/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr60 class=tableHdrB1 align=centertd align=centreC Jul-15 - 25.00/tdtd align=right - /tdtd align=right - /tdtd align=right0.01/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right0.01/tdtd align=rightN/A/tdtd align=right6/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr61 class=tableHdrB2 align=centertd align=centreP Jul-15 - 25.00/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right11.56/tdtd align=rightN/A/tdtd align=right0/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr62 class=tableHdrB1 align=centertd align=centreC Aug-15 - 8.75/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right4.71/tdtd align=rightN/A/tdtd align=right0/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr63 class=tableHdrB2 align=centertd align=centreP Aug-15 - 8.75/tdtd align=right - /tdtd align=right0.03/tdtd align=right0.05/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right0/tdtd align=right0.01/tdtd align=rightN/A/tdtd align=right35/td/tr\r\n
    \t\t\t\t\t\t\t\ttr id=tr64 class=tableHdrB1 align=centertd align=centreC Aug-15 - 9.00/tdtd align=right - /tdtd align=right - /tdtd align=right - /tdtd align=right - /tdt

Please let me know how to deal with this string. I hope I can put it into a table first; eventually, I am hoping to put all of this into a database. I need some guidance on which area of coding I should look into.

Thank you

Hank
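[Editor's note] The literal \r\n\t runs in the file happen because response.read() returns bytes, and str() on bytes produces the bytes literal with escape sequences spelled out; decoding the bytes yields real text with real newlines. A small illustration (the sample bytes are invented; the page's actual encoding would still need checking):

```python
# Invented sample bytes standing in for response.read().
raw = b"<td align=right>8.56</td>\r\n\t\t<td>N/A</td>"

wrong = str(raw)             # the bytes repr: starts with b' and shows \r\n as text
right = raw.decode("utf-8")  # real text with real newlines and tabs

print(wrong[:5])  # b'<td
print(right[:3])  # <td
```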
[Tutor] File Compare
Hi,

I am comparing two files, and the program has problems doing so. I am currently using Python 3.3 and Windows 7. Attached is my code:

    import difflib

    old_file = open('0105-up.txt', 'r')
    file_contentsA = old_file.read()
    A = file_contentsA.split(',')
    print(A)
    print()

    new_file = open('0106-up.txt', 'r')
    file_contentsB = new_file.read()
    B = file_contentsB.split(',')
    print(B)
    print()

    print('\n'.join(difflib.unified_diff(A, B)))

    old_file.close()
    new_file.close()

When the result comes out, I have noticed that some of the numbers are in both files, and the program fails to show the difference. Please help. Thank you

Henry

Contents of 0105-up.txt:

    0002.HK 0004.HK 0006.HK 0008.HK 0011.HK 0012.HK 0016.HK 0038.HK 0054.HK 0066.HK 0069.HK 0107.HK
    0151.HK 0177.HK 0218.HK 0220.HK 0232.HK 0291.HK 0295.HK 0303.HK 0315.HK 0316.HK 0323.HK 0327.HK
    0342.HK 0347.HK 0358.HK 0371.HK 0390.HK 0402.HK 0425.HK 0439.HK 0511.HK 0525.HK 0530.HK 0535.HK
    0548.HK 0566.HK 0588.HK 0590.HK 0607.HK 0631.HK 0656.HK 0669.HK 0683.HK 0686.HK 0691.HK 0697.HK
    0708.HK 0709.HK 0721.HK 0753.HK 0754.HK 0777.HK 0806.HK 0817.HK 0829.HK 0845.HK 0857.HK 0861.HK
    0868.HK 0884.HK 0902.HK 0911.HK 0914.HK 0915.HK 0934.HK 0939.HK 0960.HK 0966.HK 0967.HK 0991.HK
    0998.HK 1041.HK 1055.HK 1072.HK 1088.HK 1108.HK 1109.HK 1133.HK 1138.HK 1157.HK 1171.HK 1186.HK
    1190.HK 1199.HK 1212.HK 1238.HK 1253.HK 1288.HK 1313.HK 1323.HK 1336.HK 1339.HK 1363.HK 1375.HK
    1398.HK 1421.HK 1613.HK 1618.HK 1628.HK 1699.HK 1766.HK 1778.HK 1800.HK 1819.HK 1829.HK 1880.HK
    1893.HK 1919.HK 1929.HK 1963.HK 1988.HK 2005.HK 2007.HK 2009.HK 2202.HK 2208.HK 2233.HK 2318.HK
    2319.HK 2328.HK 2333.HK 2338.HK 2379.HK 2380.HK 2600.HK 2601.HK 2626.HK 2628.HK 2638.HK 2689.HK
    2800.HK 2811.HK 2822.HK 2823.HK 2827.HK 2828.HK 2866.HK 2868.HK 2883.HK 2888.HK 2899.HK 3049.HK
    3188.HK 3323.HK 3328.HK .HK 3339.HK 3360.HK 3366.HK 3377.HK 3383.HK 3618.HK 3698.HK 3777.HK
    3808.HK 3823.HK 3836.HK 3898.HK 3968.HK 3983.HK 3988.HK 6030.HK 6199.HK 6818.HK 6823.HK 6881.HK
    6883.HK 8007.HK 8039.HK 8088.HK 8089.HK 8201.HK 8250.HK 82822.HK 8292.HK 83188.HK 8321.HK

Contents of 0106-up.txt:

    1288.HK 3383.HK 0753.HK 0347.HK 0914.HK 0995.HK 0232.HK 3988.HK 3328.HK 2009.HK 1963.HK 0588.HK
    1880.HK 2868.HK 0371.HK 1190.HK 1253.HK 8321.HK 3777.HK 1848.HK 3188.HK 83188.HK 1375.HK 0939.HK
    6818.HK 6881.HK 1117.HK 1041.HK 2600.HK 0606.HK 8255.HK 3983.HK 6199.HK 1800.HK 1919.HK 0257.HK
    2628.HK 2883.HK 2380.HK 1186.HK 0390.HK 1109.HK 0291.HK 1088.HK 1138.HK 1055.HK 0966.HK 0728.HK
    0308.HK 2202.HK 1333.HK 1313.HK 8089.HK 1929.HK 0998.HK 6030.HK 0439.HK 0002.HK 3968.HK 1829.HK
    3323.HK 1778.HK 1199.HK 2007.HK 2601.HK 3618.HK 8088.HK 2866.HK 2822.HK 82822.HK 1766.HK 3898.HK
    1363.HK 0991.HK 0861.HK 8007.HK 1072.HK .HK 3360.HK 0038.HK 2662.HK 0778.HK 0656.HK 0817.HK
    1819.HK 0468.HK 0535.HK 0530.HK 2208.HK 2333.HK 0525.HK 0566.HK 0011.HK 0911.HK 1133.HK 3836.HK
    8292.HK 0012.HK 3118.HK 2638.HK 0388.HK 2626.HK 0054.HK 0754.HK 2828.HK 2811.HK 0902.HK 3698.HK
    1188.HK 1398.HK 1366.HK 0177.HK 0358.HK 0683.HK 1421.HK 0582.HK 2314.HK 2331.HK 2005.HK 0915.HK
    0960.HK 3339.HK 0577.HK 0590.HK 1108.HK 0323.HK 0231.HK 1618.HK 2319.HK 1988.HK 0066.HK 3918.HK
    1336.HK 0777.HK 0708.HK 0342.HK 1323.HK 2689.HK 1011.HK 0316.HK 6880.HK 0327.HK 0008.HK 0402.HK
    0857.HK 1339.HK 2318.HK 0006.HK 8201.HK 1699.HK 0631.HK 0363.HK 0691.HK 0218.HK 0548.HK 0016.HK
    0107.HK 3377.HK 1893.HK 0386.HK 3808.HK 0315.HK 0967.HK 2840.HK 2888.HK 1613.HK 1070.HK 3823.HK
    0669.HK 0511.HK 0700.HK 1065.HK 3886.HK 2800.HK 0220.HK 0686.HK 0806.HK 0303.HK 0151.HK 0607.HK
    2338.HK 2233.HK 0004.HK 2236.HK 3049.HK 2823.HK 2827.HK 0868.HK 1171.HK 1057.HK 2379.HK 2899.HK
    1157.HK
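[Editor's note] Two likely issues here: split(',') leaves each whole file as a single string, because the codes are separated by whitespace rather than commas, and difflib then has nothing line-sized to compare. A sketch with invented four-code file contents standing in for the real files:

```python
import difflib

# Invented stand-ins for the contents of 0105-up.txt and 0106-up.txt.
old = "0001.HK 0002.HK 0003.HK 0004.HK"
new = "0001.HK 0002.HK 0003.HK 0005.HK"

# split() with no argument splits on any whitespace, including newlines.
a = old.split()
b = new.split()

# Keep only the changed entries, dropping the ---/+++ file headers.
diff = [line for line in difflib.unified_diff(a, b, lineterm="")
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]
print(diff)  # ['-0004.HK', '+0005.HK']
```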
Re: [Tutor] File Compare
Hi Danny,

Thanks for your suggestion. The ideal output of this program is to show whether any new number was added to the new file. In other words, the content of file1, [0001.hk, 0002.hk, 0003.hk, 0004.hk], is compared with the content of file2, [0001.hk, 0002.hk, 0003.hk, 0005.hk]. The result should be:

    +0005.hk, -0004.hk

> Ah. One other thing. Can you explain what you're intending to do with
> this statement?
>
>     A = file_contentsA.split(',')

My thinking is that I want to turn both files into lists, so I can compare the two files. However, as you can see, it is only wishful thinking.

Best,
Henry

On Fri, Jan 9, 2015 at 1:54 PM, Danny Yoo <d...@hashcollision.org> wrote:
>> old_file = open('0105-up.txt','r')
>> file_contentsA = old_file.read()
>> A = file_contentsA.split(',')
>> print(A)
>> print()
>
> Ah. One other thing. Can you explain what you're intending to do with
> this statement?
>
>     A = file_contentsA.split(',')
>
> The reason I ask is because neither of your input files that you've
> shown us has a single comma in it, so I do not understand what the
> intent is.
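[Editor's note] For the "+0005.hk, -0004.hk" output described above, set difference is the most direct tool once each file is split into a list. A minimal sketch using the sample contents from this message:

```python
# The sample file contents given in the message.
file1 = ["0001.hk", "0002.hk", "0003.hk", "0004.hk"]
file2 = ["0001.hk", "0002.hk", "0003.hk", "0005.hk"]

added = sorted(set(file2) - set(file1))    # present only in the new file
removed = sorted(set(file1) - set(file2))  # dropped from the old file

result = ["+" + c for c in added] + ["-" + c for c in removed]
print(result)  # ['+0005.hk', '-0004.hk']
```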
[Tutor] Compare two text files
Hi Alan,

Attached are the two text files (stocklist.txt, stocklist1.txt) whose contents I want to compare. Basically, I want to see if there are any new numbers added to the file. Please comment on this sequence:

1. First, open file No. 1 and put the string into a list.
2. Second, open file No. 2 and put the string into a list.
3. Use difflib to compare.

This is some of the code I have written:

    # Remove .HK from the stock list
    def remove_HK():
        f = open('stock_list.txt', 'r')
        output = open('stock_lista.txt', 'w')
        output.write(f.read().replace('.hk', '').replace('.HK', ''))
        f.close()
        output.close()

    remove_HK()

My thinking is to first remove .HK so I can compare the numbers as ints against the other text file.

Thank you
Henry

stocklist.txt:
1728.HK 1033.HK 2393.HK 0968.HK 3378.HK 3049.HK 1661.HK 8269.HK 3393.HK 0151.HK 0303.HK 0345.HK 0913.HK 0220.HK 0696.HK 0570.HK 3886.HK 2283.HK 3382.HK 0882.HK 1065.HK 0826.HK 3823.HK 1613.HK 1228.HK 2382.HK 1089.HK 0981.HK 0598.HK 1099.HK 0361.HK 1177.HK 0750.HK 0444.HK 0460.HK 2877.HK 2313.HK 0152.HK 0747.HK 2607.HK 0563.HK 2727.HK 0205.HK 8047.HK 1004.HK 2010.HK 8201.HK 1345.HK 2328.HK 1515.HK 8311.HK 0402.HK 1323.HK 8180.HK 0553.HK 1618.HK 0231.HK 2186.HK 1108.HK 8058.HK 8237.HK 1212.HK 0381.HK 6136.HK 1638.HK 3336.HK 0419.HK 2211.HK 0923.HK 0438.HK 0091.HK 0167.HK 1886.HK 1071.HK 0336.HK 2811.HK 6823.HK 8292.HK 0911.HK 0566.HK 1367.HK 2208.HK 0283.HK 0530.HK 0175.HK 3800.HK 0451.HK 0500.HK 0038.HK 8123.HK 8018.HK 3360.HK 0729.HK 1856.HK 1808.HK 1330.HK 0895.HK 1072.HK 2880.HK 3898.HK 0080.HK 0867.HK 0471.HK 2722.HK 1060.HK 1313.HK 1333.HK 0728.HK 2198.HK 2380.HK 0572.HK 1185.HK 0085.HK 0217.HK 0370.HK 0031.HK 1196.HK 2623.HK 0476.HK 1375.HK 0996.HK 2324.HK 3188.HK 1848.HK 6828.HK 8321.HK 0285.HK 0154.HK 2357.HK 0232.HK 0161.HK 1803.HK 0899.HK 2020.HK 1131.HK

stocklist1.txt:
0471.HK 3800.HK 0728.HK 1033.HK 1099.HK 2357.HK 0566.HK 2328.HK 0232.HK 0729.HK 2208.HK 0598.HK 2186.HK 0231.HK 0175.HK 0981.HK 0285.HK 0460.HK 0553.HK 2382.HK 0031.HK 0747.HK 3188.HK 1071.HK 3382.HK 3823.HK 3898.HK 0451.HK 2727.HK 0968.HK 0750.HK 1680.HK 6136.HK 1072.HK 6823.HK 1177.HK 2020.HK 0419.HK 6828.HK 1060.HK 8047.HK 0867.HK 0336.HK 1848.HK 1856.HK 1313.HK 2607.HK 3886.HK 8292.HK 1618.HK 0572.HK 2211.HK 3336.HK 2313.HK 0220.HK 1323.HK 1638.HK 1185.HK 1004.HK 1808.HK 8321.HK 0205.HK 2623.HK 2393.HK 0161.HK 1613.HK 0855.HK 8201.HK 0882.HK 1212.HK 0696.HK 1375.HK 0091.HK 0038.HK 0911.HK 3360.HK 0085.HK 1333.HK 0152.HK 1522.HK 0570.HK 0938.HK 1330.HK 2880.HK 3049.HK 0546.HK 2198.HK 1108.HK 8237.HK 2380.HK 0996.HK 0402.HK 0036.HK 0732.HK 0444.HK 0895.HK 3393.HK 1345.HK 0476.HK 1369.HK 1131.HK 1228.HK 0154.HK 0548.HK 8123.HK 0899.HK 0718.HK 2322.HK 0926.HK 1661.HK 1089.HK 0811.HK 0433.HK 83188.HK 0303.HK 1728.HK 0260.HK 0107.HK 2348.HK 1599.HK 1065.HK 8311.HK 8018.HK 0530.HK 8207.HK 0440.HK 1308.HK 0564.HK 0568.HK
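The three-step sequence in the message (read both files into lists, then use difflib) could look like this minimal sketch. The two strings here stand in for the contents of the attached stocklist.txt and stocklist1.txt, which in practice would be read with open(...).read(); the function name is illustrative:

```python
# Sketch of the three-step sequence: file No. 1 as a list, file No. 2
# as a list, then difflib to compare. difflib.ndiff marks entries only
# in the first sequence with '-' and entries only in the second with '+'.
import difflib

def diff_codes(text_a, text_b):
    codes_a = sorted(text_a.split())   # step 1: file No. 1 as a sorted list
    codes_b = sorted(text_b.split())   # step 2: file No. 2 as a sorted list
    # step 3: keep only the added/removed lines ('? ' hint lines are dropped)
    return [line for line in difflib.ndiff(codes_a, codes_b)
            if line.startswith(('-', '+'))]

stocklist = "0001.HK 0002.HK 0003.HK 0004.HK"
stocklist1 = "0001.HK 0002.HK 0003.HK 0005.HK"
print(diff_codes(stocklist, stocklist1))
# prints ['- 0004.HK', '+ 0005.HK']
```

Note that with this approach the .HK suffix never needs to be stripped, so the remove_HK step could be skipped entirely.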
[Tutor] Web Parser
Hi,

I am new to programming. I want to know what I should look at if I want to learn more about web parsers. I know there is something called Beautiful Soup, but I think it is kind of difficult for me at this stage.

Thank you

Regards,
Crusier
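Before reaching for Beautiful Soup, one gentler starting point is the standard library's html.parser module, which ships with Python. A minimal sketch (the page string and class name here are made up for illustration) that collects the text of every <a> tag:

```python
# Minimal HTML parsing with only the standard library: subclass
# HTMLParser and override the handler methods that interest you.
from html.parser import HTMLParser

class LinkTextParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_link = False   # True while we are inside an <a>...</a>
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.links.append(data)

page = '<html><body><a href="/one">One</a> text <a href="/two">Two</a></body></html>'
parser = LinkTextParser()
parser.feed(page)
print(parser.links)
# prints ['One', 'Two']
```

Beautiful Soup does the same kind of job with far less code, so this is mainly a way to understand what a parser is doing under the hood.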
[Tutor] Scrapy vs. Beautiful Soup
Hi Emile,

I have also found that there is something called Scrapy. Please kindly comment on it: which one is easier to use, Scrapy or Beautiful Soup?

Thanks in advance.

Cheers,
Hank
[Tutor] Downloading data from web
I am trying to extract web data and put it into a database for analysis. I am just wondering what the best way to do it is, and I will try to dig up information from there. Please help me kick-start this project.

Cheers,
Hank
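For the "into a database" half of the pipeline, SQLite ships with Python and needs no server. A minimal sketch, assuming the rows have already been extracted from a page (e.g. with Beautiful Soup); the table name, columns, and sample rows here are illustrative, borrowed from the broker data discussed earlier in the thread:

```python
# Sketch: store already-parsed rows in SQLite for later analysis.
import sqlite3

# In a real scraper these rows would come from the parsing step.
rows = [("1453.IMC", "98.28M"), ("1499.Optiver", "70.91M")]

conn = sqlite3.connect(":memory:")   # use a file path to persist to disk
conn.execute("CREATE TABLE brokers (name TEXT, turnover TEXT)")
conn.executemany("INSERT INTO brokers VALUES (?, ?)", rows)
conn.commit()

for name, turnover in conn.execute("SELECT name, turnover FROM brokers"):
    print(name, turnover)
```

Once the data is in SQLite, ordinary SQL queries (or pandas' read_sql) can do the analysis.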