Re: middle line, redux

Ariel Scolnicov Sun, 02 Dec 2001 06:50:02 -0800

"Bernie Cosell" <[EMAIL PROTECTED]> writes:

> Would someone take a step back from the Perl intricacies and tell me if 
> they even have an *algorithm* [much less a Perl program] that meets the 
> problem's criteria.  I as got it:
>    1) you can only read the file once, from beginning to end [no 
> 'seek'ing around or other hackery -- assume the file is a TCP stream or a 
> pipe]
>    2) you can only use O(1) storage
>    3) you must print out the 'middle' line of the file.


No.  A program using a finite amount of memory and reading its input
exactly once, sequentially, is really just a finite automaton[1].  A
finite automaton can't solve the problem.

Of course, this requires stating what O(1) memory is, when lines can
be any length.  If your model allows an arbitrarily-long line to fit
into constant memory, then I can say "$/=undef" and slurp the entire
file into "constant" memory, so that doesn't work.

2 alternatives:  First, you can demand that lines have bounded size.
In which case the problem reduces to printing the middle letter of a
string, and a finite automaton can't do that.  Still not a very good
model, because a finite automaton can't do indices into that string,
but usually these are considered O(1) memory.

Second, allow lines of any length, but only as opaque data items.  A
program is allowed to read such a line, print it, and keep in memory
O(1) such lines.  But you aren't allowed to read or modify the line.
Since I can see no legitimate use of such activities in the problem,
it seems a reasonable restriction.  Also allow indices into the text
of unbounded size (an index is large enough to count the input lines
and you may perform any arithmetic on it).

This model seems the right one to use for our problem.  In this model
(one pass over STDIN, O(1) many indices and lines) write a program to
print (only) the middle line of its input.

Unfortunately the finite-automaton argument doesn't work for this.  A
proof that the problem is impossible goes instead along the following
lines: Suppose you have a program which does solve the problem.  Then
for some constant c (the amount of memory used), the middle line gets
printed at most c lines after it is read.  Let n > c, and pick a file 
with 2n+1 lines.  The program prints out the middle line of that file
by line n+1+c.  But then for any 2n+101-line file starting with these
same L lines, the program prints out line n+1, which is wrong.

[...]

> My intuition is that the problem is too constrained and that there 
> *is*no* algorithm that meets all the criteria, but I'd be delighted to be 
> proven wrong...

Of course, you may disagree with my model...


Footnotes: 
[1]  If your program uses k bits, it has ~2^k states (more than 2^k if
you read lines in more than one place).  Every time you read a line,
your program moves to some other state.

-- 
Ariel Scolnicov        |"GCAAGAATTGAACTGTAG"            | [EMAIL PROTECTED]
Compugen Ltd.          |   +++ THIS SPACE TO LET +++    \ We recycle all our Hz
72 Pinhas Rosen St.    |Tel: +972-3-7658117 (Main office)`---------------------
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555    http://3w.compugen.co.il/~ariels

Re: middle line, redux

Reply via email to