Re: Enormous Input and Output Test
John Yeung wrote: P.S. I hope people realize that the concise, intuitive, readable answers we all tried in our first couple of (failed) attempts are much more Pythonic than the beasts that were created just for SPOJ. Well, it is not often that we need to micro optimize stuff (or how would you call it) in Python, but this was a fun exercise to do that. You need to think about the small tricks that you can use to speed up your code. And then you come up with things like: - avoid method/function calls - use table lookups instead of computation - use iterators - use tools like psyco to help you out in the odd case where the above doesn't cut it These things are useful to keep in mind also in general I think! -irmen PS the actual loop of my solution isn't that much bigger than it was in my first, most readable, attempt. But I think that has everything to do with the complexity of the task: it is very simple. Ofcourse I cannot speak about the implementation of the solutions that are still faster by a factor of 2... -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
numerix's solution was excelled by Steve C's one (8.78s): http://www.spoj.pl/ranks/INOUTEST/lang=PYTH I don't understand nothing. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
Congrats, Irmen. PS so I think 7.5 seconds for the fastest ... It's becoming crazy :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
N00m The Instigator... hahaha :-) I always wish I was a producer, an entertainer, an impressario, or even a souteneur (kidding). Life is complex: it has both real and imaginary parts. Some producer (Mr. Gomelsky) nicknamed Eric Clapton as Slow Hand, many years ago. Gomel is my native town and apparently russian name Gomelsky means smb/smth belonging to Gomel. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
n00m wrote: numerix's solution was excelled by Steve C's one (8.78s): http://www.spoj.pl/ranks/INOUTEST/lang=PYTH I don't understand nothing. I just got my solution accepted, it ran in 14 seconds though. I have no idea how to shave more seconds off, so I think 7.5 seconds for the fastest solution is really mindboggling. Things that eventually made my solution run within the time limit: - I didn't use int() - I used Psyco Those two resulted in the biggest speed increase. Tweaking with buffered/unbuffered IO was insignificant. Cheers Irmen -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 7, 4:35 pm, Irmen de Jong irmen.nos...@xs4all.nl wrote: I just got my solution accepted, it ran in 14 seconds though. Hey, that's pretty good. Until n00m instigated the most recent INOUTEST craze, the only accepted answer besides numerix's was one that barely squeaked in at 19.81s, and that result was achieved with the older and apparently much faster (at least on the SPOJ machine) Python 2.5. (I see a lot of people trying to pass INOUTEST with Python these days, many or most of them because they have read this thread.) Things that eventually made my solution run within the time limit: - I didn't use int() - I used Psyco Those two resulted in the biggest speed increase. Tweaking with buffered/unbuffered IO was insignificant. Thank you so much for these tips. I quickly threw something together and now have my very own accepted INOUTEST answer! It's clearly not optimized, and I think a lot of people will soon have solutions much faster than mine and closer to yours, now that they are not wasting their efforts chasing dead ends. John P.S. I hope people realize that the concise, intuitive, readable answers we all tried in our first couple of (failed) attempts are much more Pythonic than the beasts that were created just for SPOJ. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
50-80%% of users from the 1st page in ranklist are super-extra-brilliant #5 there: http://www.spoj.pl/users/tourist/ This 14 y.o. schoolboy won IOI 2009, in this August, and he's about to get into Guiness' book as the youngest winner for all the history of international olympiads on informatics. He lives at 20 minutes of walking from my house, -- it's South East Belarus, near to Chernobyl (mutants?). Hardly I'm able to do even 10% of what he can do, though I'm older by 3 times than him. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
This takes 5 second on my machine using a file with 1,000,000 random... Surely it will fail to pass time limit too -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
What happened to performance of ver.2.6.2 (vs ver.2.5.x)? https://www.spoj.pl/forum/viewtopic.php?f=20t=5949 -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
but unlike us, he's routinely under 11s. Crazy. No wonder! 50-80%% of users from the 1st page in ranklist are super-extra-brilliant (young students) programmers. They are winners of numerous competitions, national and international olympiads on informatics, etc. Some of them are/were even true wunderkinders. E.g. Tomek Czajka from Poland (now he lives and works, in some university, in America) -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
Duncan Booth, alas... still TLE: 2800839 2009-10-04 13:03:59 Q Enormous Input and Output Test time limit exceeded - 88M PYTH -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
And this code time limits (no matter with or without Psyco): import psyco psyco.full() import sys def foo(): ##sys.stdin = open('D:/1583.txt', 'rt') sys.stdin.readline() while 1: try: x, y = sys.stdin.readline().split() sys.stdout.write(str(int(x) * int(y)) + '\n') except: break foo() -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
I've given up :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 4, 1:50 am, n00m n...@narod.ru wrote: It can be not so simple. There can be multiple input files, with *total* size ~30-50-80 MB. According to one of the global moderators, the 20s time limit is for each input file: https://www.spoj.pl/forum/viewtopic.php?f=6t=4667 John -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
PS Yes, they support psyco since long time ago (otherwise I'd get Compilitation Error verdict). I used Psyco there many many times. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
This time limits too: = import psyco psyco.full() import sys def foo(): ##sys.stdin = open('D:/1583.txt', 'rt') a = sys.stdin.readlines() a = a[1:int(a[0]) + 1] for ai in a: x, y = ai.split() sys.stdout.write(str(int(x) * int(y)) + '\n') foo() = -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
It can be not so simple. There can be multiple input files, with *total* size ~30-50-80 MB. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 4, 4:20 am, n00m n...@narod.ru wrote: I've given up :-) Well, that numerix user (who already had the top Python solution) just submitted a ton of new ones to that problem, apparently trying to get a faster time. I don't think he can squeeze much more out of that stone, but unlike us, he's routinely under 11s. Crazy. I wish they had an option to let you keep running your program past the limit (at least two or three times the limit) to give you more feedback, even if they still consider your solution unacceptable. Especially in the tutorial section, which doesn't seem to contribute to your score anyway. John -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
Terry Reedy: Don't waste your time with problem sites that judge raw-clock time over (and before) accuracy, thereby greatly favoring low-level languages and hack tricks over clear high-level code. I usually don't like to solve the kind of problems shown by those sites because those problems are too much artificial (and often too much difficult). But sometimes I have written some solutions. But those sites never judge raw running time over accuracy: in most or all such sites the programs are tested with tons of possible inputs, and if even one output is a bit wrong, the program is totally refused. This is a hard rule that encourages programmers to take a very good care of program correctness. Some sites add a little more noise in the inputs, simulating a bit more real-world inputs, while most of those online contests give clean inputs (the input bounds are well specified in the problem statement). From what I've seen from some of the best solutions submitted to those sites (some sites allow people to see the source of the contest entries), the programs usually don't (need to) use hack tricks as you say (even if probably some program uses them). Using hacks is often unsafe so people usually prefer safer ways to code, because just a little bug may fully compromise the score of the program. I agree that the timing scores in such sites often encourage low level languages, like C (and sometimes C++, that's a multilevel language), but on the other hand such languages exist, C is used in many real- world places, so designing sites where people compete with such languages is legit. C allows people to understand better what's going on inside the computer, this is valuable and positive. Bashing low- level languages is silly. Even CPython is written in C. A good programmer has to know both higher and lower level languages. And in real-life sometimes you need performance. This thread shows that a normal Python program is not up to those timings for the enormous input problem (even if there are ways to write a Python program to solve this problem). People at Google are trying to create a 5 times faster Python (Unladen Swallow project) because they use lot of real-world Python code and they think Python is slow. I've found plenty of situations where CPython code is not fast enough for my purposes. Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
n00m n...@narod.ru wrote: I've given up :-) Here's my attempt, which is about 30% faster than your original but I've no idea if it would be fast enough for you. import sys, time, os, itertools import gc gc.set_threshold() D = [] def foo(): ##sys.stdin = open('D:/1583.txt', 'rt') count = int(sys.stdin.readline()) data = sys.stdin.read().split() D.append(data) data = map(int, data) D.append(data) nextval = iter(data).next data = map(str, (nextval()*nextval() for a in xrange(count))) D.append(data) sys.stdout.write('\n'.join(data)) sys.stdout.write('\n') start = time.time() foo() print sys.stderr, time.time() - start os._exit(0) Storing the large lists in a global prevent them deallocating when the function returns, so speeds up the time as I recorded it. If they are timing the whole process then I think calling _exit() should avoid the final cleanup time. Playing with the garbage collector makes it ever so slightly faster although disabling it entirely makes it much slower. I feel there ought to be a faster alternative to xrange but if so I couldn't find it. As Terry said, it's all hack tricks. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 4, 12:08 pm, n00m n...@narod.ru wrote: Duncan Booth, alas... still TLE: 2800839 2009-10-04 13:03:59 Q Enormous Input and Output Test time limit exceeded - 88M PYTH Just to throw into the mix... What about buffering? Does anyone know what the effective stdin buffer is for Python? I mean, it really can't be the multiplying that's a bottleneck. Not sure if it's possible, but can a new stdin be created (possibly using os.fdopen) with a hefty buffer size? I'm probably way off, but something to share. Cheers, Jon. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
Jon Clements jon...@googlemail.com wrote: On Oct 4, 12:08 pm, n00m n...@narod.ru wrote: Duncan Booth, alas... still TLE: 2800839 2009-10-04 13:03:59 Q Enormous Input and Output Test time limit exceeded - 88M PYTH Just to throw into the mix... What about buffering? Does anyone know what the effective stdin buffer is for Python? I mean, it really can't be the multiplying that's a bottleneck. Not sure if it's possible, but can a new stdin be created (possibly using os.fdopen) with a hefty buffer size? I'm probably way off, but something to share. I did try a version where I just read the data in, split it up, and then wrote it out again. On my test file that took about 2 seconds compared with the 8 seconds it took the full code I posted, so while there may be scope for faster I/O (e.g. using mmap), any real speedup would have to be in the convert to int, multiply, convert back to str pipeline. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Sat, Oct 3, 2009 at 6:54 AM, n00m n...@narod.ru wrote: Hi, py.folk! I need your help to understand how http://www.spoj.pl/problems/INOUTEST/ can be passed in Python. snip def foo(): ##sys.stdin = open('D:/1583.txt', 'rt') a = sys.stdin.readlines() That line is probably a Very Bad Idea (TM) as it reads the *entire* enormous file into memory *at once*. It would probably be much better to iterate over the file, thus only reading one individual line at a time. I'm betting the massive malloc()ing involved with .readlines() is a large part of the slowness. snip But it gets Time Limit Exceeded verdict. Any ideas? Cheers, Chris -- http://blog.rebertia.com -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
n00m wrote: Hi, py.folk! I need your help to understand how http://www.spoj.pl/problems/INOUTEST/ can be passed in Python. I see two guys who managed to get accepted: http://www.spoj.pl/ranks/INOUTEST/lang=PYTH My code for this is: === import psyco psyco.full() import sys def noo(b): b = b.split() return str(int(b[0]) * int(b[1])) + '\n' def foo(): ##sys.stdin = open('D:/1583.txt', 'rt') a = sys.stdin.readlines() a = a[1:int(a[0]) + 1] a = map(noo, a) sys.stdout.writelines(a) foo() === But it gets Time Limit Exceeded verdict. Any ideas? Don't waste your time with problem sites that judge raw-clock time over (and before) accuracy, thereby greatly favoring low-level languages and hack tricks over clear high-level code. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 4, 2:29 am, Chris Rebert c...@rebertia.com wrote: That line is probably a Very Bad Idea (TM) as it reads the *entire* enormous file into memory *at once*. It would probably be much better to iterate over the file, thus only reading one individual line at a time. I'm betting the massive malloc()ing involved with .readlines() is a large part of the slowness. Certainly not. The culprit is line a = map(noo, a). Without it execution time = 2.59s (I've just checked it). -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
PS Time Limit for this problem = 20s -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 3, 11:54 pm, n00m n...@narod.ru wrote: I need your help to understand howhttp://www.spoj.pl/problems/INOUTEST/ can be passed in Python. My code for this is: === import psyco psyco.full() import sys def noo(b): b = b.split() return str(int(b[0]) * int(b[1])) + '\n' def foo(): ##sys.stdin = open('D:/1583.txt', 'rt') a = sys.stdin.readlines() a = a[1:int(a[0]) + 1] a = map(noo, a) sys.stdout.writelines(a) foo() === But it gets Time Limit Exceeded verdict. Any ideas? map() is really at its most efficient when used with built-ins, not user defined functions. In those cases, I believe you're better off using a for-loop or list comprehension. Also: are you sure they support psyco? It's not part of stdlib... I thought something simpler might work: import sys sys.stdin.readline() for line in sys.stdin.readlines(): nums = map(int, line.split()) print nums[0] * nums[1] But although this processes 5.5MB 2 secs on my PC (which meets the 2.5MB/sec criteria outlined in INTEST), I'm getting the same result that you are. Do you know how big the input data set actually is? -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
Do you know how big the input data set actually is? Of course, I don't know exact size of input. It's several MBs, I guess. And mind the fact: their testing machines are PIII (750MHz). -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 4, 1:58 pm, n00m n...@narod.ru wrote: Do you know how big the input data set actually is? Of course, I don't know exact size of input. It's several MBs, I guess. And mind the fact: their testing machines are PIII (750MHz). Well, then, that's moved the problem from challenging to ludicrous. Being asked to conform to one constraint with no knowledge of the others seems kind of ridiculous. On my machine, the above code handles ~50MB in ~10sec. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On my machine, the above code handles ~50MB in ~10sec. Means their input 40-50MB 2. I just see: two guys did it in Python and I feel myself curious how on earth?. -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
And *without* Psyco the above code gets TLE verdict... A kind of mystery :( -- http://mail.python.org/mailman/listinfo/python-list
Re: Enormous Input and Output Test
On Oct 3, 11:58 pm, n00m n...@narod.ru wrote: Do you know how big the input data set actually is? Of course, I don't know exact size of input. It's several MBs, I guess. And mind the fact: their testing machines are PIII (750MHz). You know the maximum size of the input, if you can trust the problem definition. The maximum number of lines in the input is 10**6 + 1. The first line is at most 7 characters, plus EOL. The subsequent lines are at most 13 characters each, plus EOL. John -- http://mail.python.org/mailman/listinfo/python-list