Re: Fastest way to retrieve and write html contents to file

2016-05-04 Thread DFS
On 5/3/2016 2:41 PM, Tim Chase wrote: On 2016-05-03 13:00, DFS wrote: On 5/3/2016 11:28 AM, Tim Chase wrote: On 2016-05-03 00:24, DFS wrote: One small comparison I was able to make was VBA vs python/pyodbc to summarize an Access database. Not quite a fair test, but interesting nonetheless.

Re: Fastest way to retrieve and write html contents to file

2016-05-03 Thread Tim Chase
On 2016-05-03 13:00, DFS wrote: > On 5/3/2016 11:28 AM, Tim Chase wrote: > > On 2016-05-03 00:24, DFS wrote: > >> One small comparison I was able to make was VBA vs python/pyodbc > >> to summarize an Access database. Not quite a fair test, but > >> interesting nonetheless. > >> > >> Access 2003

Re: Fastest way to retrieve and write html contents to file

2016-05-03 Thread DFS
On 5/3/2016 11:28 AM, Tim Chase wrote: On 2016-05-03 00:24, DFS wrote: One small comparison I was able to make was VBA vs python/pyodbc to summarize an Access database. Not quite a fair test, but interesting nonetheless. Access 2003 file Access 2003 VBA code Time: 0.18 seconds same Access

Re: Fastest way to retrieve and write html contents to file

2016-05-03 Thread Tim Chase
On 2016-05-03 00:24, DFS wrote: > One small comparison I was able to make was VBA vs python/pyodbc to > summarize an Access database. Not quite a fair test, but > interesting nonetheless. > > Access 2003 file > Access 2003 VBA code > Time: 0.18 seconds > > same Access 2003 file > 32-bit python

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS
On 5/3/2016 12:06 AM, Michael Torrie wrote: Now if you want to talk about processing the data once you have it, there we can talk about speeds and optimization. Be glad to. Helps me learn python, so bring whatever challenge you want and I'll try to keep up. One small comparison I was able

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Michael Torrie
On 05/02/2016 01:37 AM, DFS wrote: > So python matches or beats VBScript at this much larger file. Kewl. If you download something large enough to be meaningful, you'll find the runtime speeds should all converge to something showing your internet connection speed. Try downloading a 4 GB file,

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS
On 5/2/2016 10:00 PM, Chris Angelico wrote: On Tue, May 3, 2016 at 11:51 AM, DFS wrote: On 5/2/2016 3:19 AM, Chris Angelico wrote: There's an easier way to test if there's caching happening. Just crank the iterations up from 10 to 100 and see what happens to the times. If

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Chris Angelico
On Tue, May 3, 2016 at 11:51 AM, DFS wrote: > On 5/2/2016 3:19 AM, Chris Angelico wrote: > >> There's an easier way to test if there's caching happening. Just crank >> the iterations up from 10 to 100 and see what happens to the times. If >> your numbers are perfectly fair, they

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS
On 5/2/2016 3:19 AM, Chris Angelico wrote: There's an easier way to test if there's caching happening. Just crank the iterations up from 10 to 100 and see what happens to the times. If your numbers are perfectly fair, they should be perfectly linear in the iteration count; eg a 1.8 second
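
The linearity check suggested above could be sketched roughly as follows. This is only an illustration, assuming Python 3; the helper names `time_iterations` and `scaling_ratio` are hypothetical and do not come from the thread, and `job` stands for whatever download-and-write step is being measured.

```python
import time


def time_iterations(job, iterations):
    """Call job() `iterations` times and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        job()
    return time.perf_counter() - start


def scaling_ratio(job, small=10, large=100):
    """Return (time for `large` runs) / (time for `small` runs).

    With no caching and no one-off startup cost, the ratio should be
    close to large / small (10 for the default arguments).
    """
    t_small = time_iterations(job, small)
    t_large = time_iterations(job, large)
    if t_small <= 0:
        raise ValueError("job is too fast to time at this iteration count")
    return t_large / t_small
```

A ratio well below 10 would hint that the first runs carry a fixed cost or that later runs are being served from a cache; a ratio near 10 is what a fair, uncached test would be expected to show.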

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS
On 5/2/2016 4:42 AM, Peter Otten wrote: DFS wrote: Is VB using a local web cache, and Python not? I'm not specifying a local web cache with either (wouldn't know how or where to look). If you have Windows, you can try it. I don't have Windows, but if I'm to believe

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Tim Chase
On 2016-05-02 00:06, DFS wrote: > Then I tested them in loops - the VBScript is MUCH faster: 0.44 for > 10 iterations, vs 0.88 for python. In addition to the other debugging recommendations in sibling threads, a couple other things to try: 1) use a local debugging proxy so that you can compare

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Peter Otten
DFS wrote: >> Is VB using a local web cache, and Python not? > > I'm not specifying a local web cache with either (wouldn't know how or > where to look). If you have Windows, you can try it. I don't have Windows, but if I'm to believe

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Stephen Hansen
On Mon, May 2, 2016, at 12:37 AM, DFS wrote: > On 5/2/2016 2:27 AM, Stephen Hansen wrote: > > I'm again going back to the point of: it's fast enough. When comparing > > two small numbers, "twice as slow" is meaningless. > > Speed is always meaningful. > > I know python is relatively slow, but

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS
On 5/2/2016 2:27 AM, Stephen Hansen wrote:
On Sun, May 1, 2016, at 10:59 PM, DFS wrote:
startTime = time.clock()
for i in range(loops):
    r = urllib2.urlopen(webpage)
    f = open(webfile,"w")
    f.write(r.read())
    f.close
endTime = time.clock()
print "Finished urllib2 in

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Chris Angelico
On Mon, May 2, 2016 at 4:47 PM, DFS wrote: > I'm not specifying a local web cache with either (wouldn't know how or where > to look). If you have Windows, you can try it. > --- > Option Explicit > Dim xmlHTTP, fso,

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS
On 5/2/2016 2:05 AM, Steven D'Aprano wrote: On Monday 02 May 2016 15:00, DFS wrote: I tried the 10-loop test several times with all versions. The results were 100% consistent: VBScript xmlHTTP was always 2x faster than any python method. Are you absolutely sure you're comparing the same

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Stephen Hansen
On Sun, May 1, 2016, at 10:59 PM, DFS wrote:
> startTime = time.clock()
> for i in range(loops):
>     r = urllib2.urlopen(webpage)
>     f = open(webfile,"w")
>     f.write(r.read())
>     f.close
> endTime = time.clock()
> print "Finished urllib2 in %.2g seconds"
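
One thing worth noting in the quoted loop is that `f.close` without parentheses only names the method and never calls it, so the file is not explicitly closed inside the loop. A minimal Python 3 sketch of the same timing loop is given below. It is an illustration under stated assumptions, not the poster's code: `urllib.request` stands in for `urllib2`, `time.perf_counter` replaces `time.clock` (which was removed in Python 3.8), and the names `fetch_bytes` and `timed_downloads` are hypothetical.

```python
import time
import urllib.request


def fetch_bytes(url):
    """Download url and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def timed_downloads(url, path, loops, fetch=fetch_bytes):
    """Fetch url `loops` times, overwriting path each time.

    Returns the total elapsed time in seconds. `fetch` is a hypothetical
    hook so that the download step can be swapped out or stubbed.
    """
    start = time.perf_counter()
    for _ in range(loops):
        body = fetch(url)
        # The with block closes the file when it exits, so no explicit
        # close() call is needed.
        with open(path, "wb") as f:
            f.write(body)
    return time.perf_counter() - start
```

It could be called as `timed_downloads(webpage, webfile, 10)`, with `webpage` and `webfile` holding the same URL and file name used in the thread.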

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Steven D'Aprano
On Monday 02 May 2016 15:00, DFS wrote: > I tried the 10-loop test several times with all versions. > > The results were 100% consistent: VBScript xmlHTTP was always 2x faster > than any python method. Are you absolutely sure you're comparing the same job in two languages? Is VB using a local

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread Steven D'Aprano
On Monday 02 May 2016 15:04, DFS wrote: > 0.2 is half as fast as 0.1, here. > > And two small numbers turn into bigger numbers when the webpage is big, > and soon the download time differences are measured in minutes, not half > a second. It takes twice as long to screw a screw into timber than

Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS
On 5/2/2016 1:15 AM, Stephen Hansen wrote: On Sun, May 1, 2016, at 10:00 PM, DFS wrote: I tried the 10-loop test several times with all versions. Also how, _exactly_, are you testing this? C:\Python27>python -m timeit "filename='C:\\test.txt';

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen
On Sun, May 1, 2016, at 10:04 PM, DFS wrote: > And two small numbers turn into bigger numbers when the webpage is big, > and soon the download time differences are measured in minutes, not half > a second. Are you sure of that? Have you determined that the time is not a constant overhead versus

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen
On Sun, May 1, 2016, at 10:00 PM, DFS wrote: > I tried the 10-loop test several times with all versions. Also how, _exactly_, are you testing this? C:\Python27>python -m timeit "filename='C:\\test.txt'; webpage='http://econpy.pythonanywhere.com/ex/001.html'; import urllib2; r =

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Chris Angelico
On Mon, May 2, 2016 at 3:04 PM, DFS wrote: > And two small numbers turn into bigger numbers when the webpage is big, and > soon the download time differences are measured in minutes, not half a > second. > > So, any ideas? So, measure with bigger web pages, and find out whether

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS
On 5/2/2016 1:00 AM, Stephen Hansen wrote: On Sun, May 1, 2016, at 09:50 PM, DFS wrote: On 5/2/2016 12:40 AM, Chris Angelico wrote: On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen wrote: On Sun, May 1, 2016, at 09:06 PM, DFS wrote: Then I tested them in loops - the

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS
On 5/2/2016 12:49 AM, Ben Finney wrote: DFS writes: Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 iterations, vs 0.88 for python. […] urllib2 and requests were about the same speed as urllib.urlretrieve, while pycurl was significantly slower (1.2

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen
On Sun, May 1, 2016, at 09:50 PM, DFS wrote: > On 5/2/2016 12:40 AM, Chris Angelico wrote: > > On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen wrote: > >> On Sun, May 1, 2016, at 09:06 PM, DFS wrote: > >>> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS
On 5/2/2016 12:40 AM, Chris Angelico wrote: On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen wrote: On Sun, May 1, 2016, at 09:06 PM, DFS wrote: Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 iterations, vs 0.88 for python. ... I know it's asking a

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Chris Angelico
On Mon, May 2, 2016 at 2:49 PM, Ben Finney wrote: > One simple way to do that: Run the exact same test many times (say, > 10 000 or so) on the same machine, and then compute the average of all > the durations. > > Do the same for each different program, and then you
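
The averaging approach quoted above could be sketched like this. It is a rough illustration assuming Python 3 and the standard `timeit` and `statistics` modules; the function name `average_duration` is hypothetical, and `job` is a stand-in for the download-and-write step under test.

```python
import statistics
import timeit


def average_duration(job, runs=10_000):
    """Run job() `runs` times, timing each run on its own.

    Returns the mean duration of a single run in seconds.
    """
    # repeat=runs with number=1 yields one timing per individual call.
    durations = timeit.repeat(job, repeat=runs, number=1)
    return statistics.mean(durations)
```

For a real network fetch, 10 000 runs may take a long time and put load on the remote server, so a smaller `runs` value may be the more practical choice.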

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Ben Finney
DFS writes: > Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 > iterations, vs 0.88 for python. > > […] > > urllib2 and requests were about the same speed as urllib.urlretrieve, > while pycurl was significantly slower (1.2 seconds). Network access is

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Chris Angelico
On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen wrote: > On Sun, May 1, 2016, at 09:06 PM, DFS wrote: >> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 >> iterations, vs 0.88 for python. > ... >> I know it's asking a lot, but is there a really fast AND

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen
On Sun, May 1, 2016, at 09:06 PM, DFS wrote: > Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 > iterations, vs 0.88 for python. ... > I know it's asking a lot, but is there a really fast AND really short > python solution for this simple thing? 0.88 is not fast enough

Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS
I posted a little while ago about how short the python code was:
-
1. import urllib
2. urllib.urlretrieve(webpage, filename)
-
Which is very sweet compared to the VBScript version: