Re: Enormous Input and Output Test

2009-10-10 Thread Irmen de Jong

John Yeung wrote:

P.S.  I hope people realize that the concise, intuitive, readable
answers we all tried in our first couple of (failed) attempts are much
more Pythonic than the beasts that were created just for SPOJ.


Well, it is not often that we need to micro optimize stuff (or how would you call it) in 
Python, but this was a fun exercise to do that. You need to think about the small tricks 
that you can use to speed up your code. And then you come up with things like:


- avoid method/function calls
- use table lookups instead of computation
- use iterators
- use tools like psyco to help you out in the odd case where the above doesn't 
cut it

These things are useful to keep in mind also in general I think!

-irmen


PS the actual loop of my solution isn't that much bigger than it was in my first, most 
readable, attempt. But I think that has everything to do with the complexity of the 
task: it is very simple. Ofcourse I cannot speak about the implementation of the 
solutions that are still faster by a factor of 2...

--
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-08 Thread n00m
numerix's solution was excelled by Steve C's one (8.78s):
http://www.spoj.pl/ranks/INOUTEST/lang=PYTH
I don't understand nothing.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-08 Thread n00m
Congrats, Irmen.

PS
 so I think 7.5 seconds for the fastest ...

It's becoming crazy :-)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-08 Thread n00m
N00m The Instigator... hahaha :-)
I always wish I was a producer, an entertainer,
an impressario, or even a souteneur (kidding).

Life is complex: it has both real and imaginary parts.
Some producer (Mr. Gomelsky) nicknamed Eric Clapton as
Slow Hand, many years ago.
Gomel is my native town and apparently russian name
Gomelsky means smb/smth belonging to Gomel.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-07 Thread Irmen de Jong

n00m wrote:

numerix's solution was excelled by Steve C's one (8.78s):
http://www.spoj.pl/ranks/INOUTEST/lang=PYTH
I don't understand nothing.



I just got my solution accepted, it ran in 14 seconds though.
I have no idea how to shave more seconds off, so I think 7.5 seconds for the fastest 
solution is really mindboggling.


Things that eventually made my solution run within the time limit:
- I didn't use int()
- I used Psyco

Those two resulted in the biggest speed increase.
Tweaking with buffered/unbuffered IO was insignificant.

Cheers
Irmen
--
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-07 Thread John Yeung
On Oct 7, 4:35 pm, Irmen de Jong irmen.nos...@xs4all.nl wrote:
 I just got my solution accepted, it ran in 14 seconds though.

Hey, that's pretty good.  Until n00m instigated the most recent
INOUTEST craze, the only accepted answer besides numerix's was one
that barely squeaked in at 19.81s, and that result was achieved with
the older and apparently much faster (at least on the SPOJ machine)
Python 2.5.

(I see a lot of people trying to pass INOUTEST with Python these days,
many or most of them because they have read this thread.)

 Things that eventually made my solution run within the time limit:
 - I didn't use int()
 - I used Psyco

 Those two resulted in the biggest speed increase.
 Tweaking with buffered/unbuffered IO was insignificant.

Thank you so much for these tips.  I quickly threw something together
and now have my very own accepted INOUTEST answer!  It's clearly not
optimized, and I think a lot of people will soon have solutions much
faster than mine and closer to yours, now that they are not wasting
their efforts chasing dead ends.

John


P.S.  I hope people realize that the concise, intuitive, readable
answers we all tried in our first couple of (failed) attempts are much
more Pythonic than the beasts that were created just for SPOJ.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-06 Thread n00m
 50-80%% of users from the 1st page in ranklist are
 super-extra-brilliant

#5 there: http://www.spoj.pl/users/tourist/
This 14 y.o. schoolboy won IOI 2009, in this August,
and he's about to get into Guiness' book as the youngest
winner for all the history of international olympiads on
informatics.
He lives at 20 minutes of walking from my house, -- it's
South East Belarus, near to Chernobyl (mutants?).
Hardly I'm able to do even 10% of what he can do, though
I'm older by 3 times than him.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-06 Thread n00m
 This takes 5 second on my machine using a file with 1,000,000 random...

Surely it will fail to pass time limit too
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-06 Thread n00m
What happened to performance of ver.2.6.2 (vs ver.2.5.x)?

https://www.spoj.pl/forum/viewtopic.php?f=20t=5949
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-05 Thread n00m
 but unlike us, he's routinely under 11s.  Crazy.

No wonder!
50-80%% of users from the 1st page in ranklist are
super-extra-brilliant (young students) programmers.
They are winners of numerous competitions, national
and international olympiads on informatics, etc.
Some of them are/were even true wunderkinders.
E.g. Tomek Czajka from Poland (now he lives and works,
in some university, in America)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-05 Thread n00m
Duncan Booth,

alas... still TLE:

2800839
2009-10-04 13:03:59
Q
Enormous Input and Output Test
time limit exceeded
-
88M
PYTH
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread n00m
And this code time limits (no matter with or without Psyco):

import psyco
psyco.full()

import sys

def foo():
##sys.stdin = open('D:/1583.txt', 'rt')
sys.stdin.readline()
while 1:
try:
x, y = sys.stdin.readline().split()
sys.stdout.write(str(int(x) * int(y)) + '\n')
except:
break

foo()
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread n00m

I've given up :-)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread John Yeung
On Oct 4, 1:50 am, n00m n...@narod.ru wrote:
 It can be not so simple.
 There can be multiple input files,
 with *total* size ~30-50-80 MB.

According to one of the global moderators, the 20s time limit is for
each input file:

  https://www.spoj.pl/forum/viewtopic.php?f=6t=4667

John
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread n00m
PS
Yes, they support psyco since long time ago
(otherwise I'd get Compilitation Error verdict).
I used Psyco there many many times.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread n00m
This time limits too:

=
import psyco
psyco.full()

import sys

def foo():
##sys.stdin = open('D:/1583.txt', 'rt')
a = sys.stdin.readlines()
a = a[1:int(a[0]) + 1]
for ai in a:
x, y = ai.split()
sys.stdout.write(str(int(x) * int(y)) + '\n')

foo()
=
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread n00m
It can be not so simple.
There can be multiple input files,
with *total* size ~30-50-80 MB.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread John Yeung
On Oct 4, 4:20 am, n00m n...@narod.ru wrote:
 I've given up :-)

Well, that numerix user (who already had the top Python solution) just
submitted a ton of new ones to that problem, apparently trying to get
a faster time.  I don't think he can squeeze much more out of that
stone, but unlike us, he's routinely under 11s.  Crazy.

I wish they had an option to let you keep running your program past
the limit (at least two or three times the limit) to give you more
feedback, even if they still consider your solution unacceptable.
Especially in the tutorial section, which doesn't seem to contribute
to your score anyway.

John
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread Bearophile
Terry Reedy:

 Don't waste your time with problem sites that judge raw-clock time over
 (and before) accuracy, thereby greatly favoring low-level languages and
 hack tricks over clear high-level code.

I usually don't like to solve the kind of problems shown by those
sites because those problems are too much artificial (and often too
much difficult). But sometimes I have written some solutions.

But those sites never judge raw running time over accuracy: in most
or all such sites the programs are tested with tons of possible
inputs, and if even one output is a bit wrong, the program is totally
refused. This is a hard rule that encourages programmers to take a
very good care of program correctness.

Some sites add a little more noise in the inputs, simulating a bit
more real-world inputs, while most of those online contests give clean
inputs (the input bounds are well specified in the problem statement).

From what I've seen from some of the best solutions submitted to those
sites (some sites allow people to see the source of the contest
entries), the programs usually don't (need to) use hack tricks as
you say (even if probably some program uses them). Using hacks is
often unsafe so people usually prefer safer ways to code, because just
a little bug may fully compromise the score of the program.

I agree that the timing scores in such sites often encourage low level
languages, like C (and sometimes C++, that's a multilevel language),
but on the other hand such languages exist, C is used in many real-
world places, so designing sites where people compete with such
languages is legit. C allows people to understand better what's going
on inside the computer, this is valuable and positive. Bashing low-
level languages is silly. Even CPython is written in C. A good
programmer has to know both higher and lower level languages.

And in real-life sometimes you need performance. This thread shows
that a normal Python program is not up to those timings for the
enormous input problem (even if there are ways to write a Python
program to solve this problem). People at Google are trying to create
a 5 times faster Python (Unladen Swallow project) because they use lot
of real-world Python code and they think Python is slow. I've found
plenty of situations where CPython code is not fast enough for my
purposes.

Bye,
bearophile
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread Duncan Booth
n00m n...@narod.ru wrote:

 
 I've given up :-)

Here's my attempt, which is about 30% faster than your original but I've no 
idea if it would be fast enough for you.

import sys, time, os, itertools
import gc
gc.set_threshold()
D = []
def foo():
##sys.stdin = open('D:/1583.txt', 'rt')
count = int(sys.stdin.readline())
data = sys.stdin.read().split()
D.append(data)
data = map(int, data)
D.append(data)
nextval = iter(data).next
data = map(str, (nextval()*nextval() for a in xrange(count)))
D.append(data)
sys.stdout.write('\n'.join(data))
sys.stdout.write('\n')

start = time.time()
foo()
print sys.stderr, time.time() - start
os._exit(0)

Storing the large lists in a global prevent them deallocating when the 
function returns, so speeds up the time as I recorded it. If they are 
timing the whole process then I think calling _exit() should avoid the 
final cleanup time.

Playing with the garbage collector makes it ever so slightly faster 
although disabling it entirely makes it much slower.

I feel there ought to be a faster alternative to xrange but if so I 
couldn't find it.

As Terry said, it's all hack tricks.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread Jon Clements
On Oct 4, 12:08 pm, n00m n...@narod.ru wrote:
 Duncan Booth,

 alas... still TLE:

 2800839
 2009-10-04 13:03:59
 Q
 Enormous Input and Output Test
 time limit exceeded
 -
 88M
 PYTH

Just to throw into the mix...

What about buffering? Does anyone know what the effective stdin buffer
is for Python? I mean, it really can't be the multiplying that's a
bottleneck. Not sure if it's possible, but can a new stdin be created
(possibly using os.fdopen) with a hefty buffer size?

I'm probably way off, but something to share.

Cheers,

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-04 Thread Duncan Booth
Jon Clements jon...@googlemail.com wrote:

 On Oct 4, 12:08 pm, n00m n...@narod.ru wrote:
 Duncan Booth,

 alas... still TLE:

 2800839
 2009-10-04 13:03:59
 Q
 Enormous Input and Output Test
 time limit exceeded
 -
 88M
 PYTH
 
 Just to throw into the mix...
 
 What about buffering? Does anyone know what the effective stdin buffer
 is for Python? I mean, it really can't be the multiplying that's a
 bottleneck. Not sure if it's possible, but can a new stdin be created
 (possibly using os.fdopen) with a hefty buffer size?
 
 I'm probably way off, but something to share.
 
I did try a version where I just read the data in, split it up, and then 
wrote it out again. On my test file that took about 2 seconds compared with 
the 8 seconds it took the full code I posted, so while there may be scope 
for faster I/O (e.g. using mmap), any real speedup would have to be in the 
convert to int, multiply, convert back to str pipeline.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread Chris Rebert
On Sat, Oct 3, 2009 at 6:54 AM, n00m n...@narod.ru wrote:
 Hi, py.folk!

 I need your help to understand how
 http://www.spoj.pl/problems/INOUTEST/
 can be passed in Python.
snip
 def foo():
    ##sys.stdin = open('D:/1583.txt', 'rt')
    a = sys.stdin.readlines()

That line is probably a Very Bad Idea (TM) as it reads the *entire*
enormous file into memory *at once*. It would probably be much better
to iterate over the file, thus only reading one individual line at a
time. I'm betting the massive malloc()ing involved with .readlines()
is a large part of the slowness.

snip
 But it gets Time Limit Exceeded verdict.
 Any ideas?

Cheers,
Chris
--
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread Terry Reedy

n00m wrote:

Hi, py.folk!

I need your help to understand how
http://www.spoj.pl/problems/INOUTEST/
can be passed in Python.

I see two guys who managed to get accepted:
http://www.spoj.pl/ranks/INOUTEST/lang=PYTH

My code for this is:
===
import psyco
psyco.full()

import sys

def noo(b):
b = b.split()
return str(int(b[0]) * int(b[1])) + '\n'

def foo():
##sys.stdin = open('D:/1583.txt', 'rt')
a = sys.stdin.readlines()
a = a[1:int(a[0]) + 1]
a = map(noo, a)
sys.stdout.writelines(a)

foo()
===

But it gets Time Limit Exceeded verdict.
Any ideas?


Don't waste your time with problem sites that judge raw-clock time over 
(and before) accuracy, thereby greatly favoring low-level languages and 
hack tricks over clear high-level code.


--
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread n00m
On Oct 4, 2:29 am, Chris Rebert c...@rebertia.com wrote:


 That line is probably a Very Bad Idea (TM) as it reads the *entire*
 enormous file into memory *at once*. It would probably be much better
 to iterate over the file, thus only reading one individual line at a
 time. I'm betting the massive malloc()ing involved with .readlines()
 is a large part of the slowness.

Certainly not.
The culprit is line a = map(noo, a).
Without it execution time = 2.59s (I've just checked it).
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread n00m
PS
Time Limit for this problem = 20s
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread alex23
On Oct 3, 11:54 pm, n00m n...@narod.ru wrote:
 I need your help to understand howhttp://www.spoj.pl/problems/INOUTEST/
 can be passed in Python.

 My code for this is:
 ===
 import psyco
 psyco.full()

 import sys

 def noo(b):
     b = b.split()
     return str(int(b[0]) * int(b[1])) + '\n'

 def foo():
     ##sys.stdin = open('D:/1583.txt', 'rt')
     a = sys.stdin.readlines()
     a = a[1:int(a[0]) + 1]
     a = map(noo, a)
     sys.stdout.writelines(a)

 foo()
 ===

 But it gets Time Limit Exceeded verdict.
 Any ideas?

map() is really at its most efficient when used with built-ins, not
user defined functions. In those cases, I believe you're better off
using a for-loop or list comprehension. Also: are you sure they
support psyco? It's not part of stdlib...

I thought something simpler might work:

import sys

sys.stdin.readline()

for line in sys.stdin.readlines():
nums = map(int, line.split())
print nums[0] * nums[1]

But although this processes 5.5MB  2 secs on my PC (which meets the
2.5MB/sec criteria outlined in INTEST), I'm getting the same result
that you are.

Do you know how big the input data set actually is?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread n00m
 Do you know how big the input data set actually is?

Of course, I don't know exact size of input.
It's several MBs, I guess. And mind the fact:
their testing machines are PIII (750MHz).
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread alex23
On Oct 4, 1:58 pm, n00m n...@narod.ru wrote:
  Do you know how big the input data set actually is?

 Of course, I don't know exact size of input.
 It's several MBs, I guess. And mind the fact:
 their testing machines are PIII (750MHz).

Well, then, that's moved the problem from challenging to
ludicrous. Being asked to conform to one constraint with no
knowledge of the others seems kind of ridiculous.

On my machine, the above code handles ~50MB in ~10sec.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread n00m
 On my machine, the above code handles ~50MB in ~10sec.

Means their input  40-50MB
2.
I just see: two guys did it in Python
and I feel myself curious how on earth?.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread n00m

And *without* Psyco the above code gets TLE verdict...

A kind of mystery :(
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enormous Input and Output Test

2009-10-03 Thread John Yeung
On Oct 3, 11:58 pm, n00m n...@narod.ru wrote:
  Do you know how big the input data set actually is?

 Of course, I don't know exact size of input.
 It's several MBs, I guess. And mind the fact:
 their testing machines are PIII (750MHz).

You know the maximum size of the input, if you can trust the problem
definition.  The maximum number of lines in the input is 10**6 + 1.
The first line is at most 7 characters, plus EOL.  The subsequent
lines are at most 13 characters each, plus EOL.

John
-- 
http://mail.python.org/mailman/listinfo/python-list