On Wed, Sep 23, 1998 at 07:08:37PM -0400, Susan Duncan wrote:
> Rich Kulawiec wrote:
> 
> > A "few thousand items"?  You don't *need* a database.  Your data set is
> > far too small to justify one.  They're expensive, they eat resources,
> > they're tough to interface with (if you can), they (usually) take
> > training to use.  And commercial ones are black boxes: you can't
> > reach inside to fix it if it's broken, but instead have to wait
> > for support (a bad industry-wide joke) to do it for you.
> 
> I could go back to the dark ages and text search the whole thing

IMHO, there's nothing "dark ages" about this.  It solves the problem
in a cost-effective, scalable, robust manner without the unnecessary
overhead, complexity, and cost of a proprietary application.

IMHO, "dark ages" is running a bag (Windows NT) built on the side of
a bag (Windows) built on top of a bag (DOS).  "dark ages" is running
an OS with an ineffective and leaky virtual memory subsystem.  "dark ages"
is running an OS that wasn't designed from the ground up for TCP/IP
networking.  "dark ages" is running an OS that isn't inherently
multi-tasking, multi-user, scalable, and portable.  "dark ages" is
running an OS where reliability and security are bad jokes.

>  but what happens when it grows?

"Growth" was not one of the design parameters you specified.  However,
this model scales quite easily -- for at least a couple orders
of magnitude.  See below [*].

> is it going to be fast enough even as it is?

Absolutely.  Roughly speaking, it's possible to search memory
about three orders of magnitude faster than disk.  It takes
a *lot* of database smarts to compensate for that advantage.


[*] I have in fact set up a database similar to this with about 60K records;
the entire thing took up 55 Mbytes of physical memory -- no big deal,
since it was running on a machine with 128 Mbytes of memory, which
left plenty of room for the OS (BSDI Unix) and the requisite applications.
Its performance/cost blew the doors off the old solution, which required:

        - A dedicated database machine
        - Proprietary database software
        - A trained database programmer/analyst
        - Maintenance/support fees for the database s/w

The database itself was stored on a disk as a single flat file
that was pulled into memory with a single read() call -- meaning
that it loaded, at boot, into memory as fast as the OS could
shuffle bits from disk to memory, with almost zero overhead.
Given that this machine ran BSDI, the only time it was booted
was when it needed to be moved or have its OS upgraded.
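As a sketch of that in-memory approach (the file name and record
format here are made up for illustration), the entire query side is a
few lines of Perl: slurp the flat file once, then answer every lookup
with a regex scan over the in-memory array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical records -- in the real setup these would be slurped
# from a flat file on disk in a single read at boot, e.g.:
#   open(my $fh, '<', '/var/db/records.txt') or die $!;
#   my @records = <$fh>;
my @records = (
    "1001\tsmith\tjournal of foo\n",
    "1002\tjones\tannals of bar\n",
    "1003\tsmythe\tfoo quarterly\n",
);

# A query is just a case-insensitive regex scan over the array.
sub lookup {
    my ($pattern) = @_;
    return grep { /$pattern/i } @records;
}

my @hits = lookup("foo");
print scalar(@hits), " hits\n";    # prints "2 hits"
```

Since the data lives entirely in memory after that one read, there's
no per-query disk I/O at all -- which is where the speed advantage
over a disk-bound database comes from.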

Now, about a year later, the number of queries to this database rose
to a point where performance was starting to show the early signs
of sluggishness.  The answer?

I cloned the machine and caused the queries to round-robin between
the two.  Cost?  A jellybean Pentium box and about a half-day of
time to put it together, clone the original, and modify DNS to
round-robin the queries.
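For what it's worth, the DNS side of that is nothing more than two A
records for the same name -- BIND rotates the order it hands them out
in, so queries alternate between the boxes.  (Hostnames and addresses
here are made up.)

```
; BIND zone file fragment -- hypothetical names and addresses
db    IN  A  192.168.1.10    ; the original box
db    IN  A  192.168.1.11    ; the clone
```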

The next time I have to revisit this -- which will probably be
next year or so -- I'll replace both boxes (faster CPU/memory/disk,
100BaseT instead of 10BaseT network cards, newer version of BSDI Unix)
and, I would guess, increase performance by roughly an order of magnitude
or two.  That oughta hold 'em for another year or so.  I figure that'll
take a day of time, plus a couple $K of hardware.

Total cost over the life of the project, including the hardware,
software, and my time is still less than the original cost of the
database s/w alone.  Time from inception to deployment was
a couple of days -- the previous "database programmer analyst"
had been dorking around with it for a month before I came along.

> how long will it take to make a custom perl program compared to
> a cold fusion one? 

Apples and oranges.  What does Cold Fusion cost?  How long does it take
to learn?  What are its limitations?  Can you patch it INSTANTLY when
you discover a bug?  Is it portable across platforms?

Of course, this also depends on how good a programmer is doing
the job.  But given Perl's basic design features, which include
associative arrays and *very* fast regular expression handling,
it's quite easy to do things with Perl that are quite difficult
to do in other languages.  It's often referred to as the
"Swiss Army chainsaw" because it combines the best features of
C, shell, awk, sed and a few other tidbits.
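A small, hypothetical example of what those two features buy you:
building a keyword index over a set of records takes one associative
array and one regex.  (The record lines are invented for
illustration.)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record lines: an id, then a title.
my @records = (
    "1001 Introduction to Sports Medicine",
    "1002 Sports Nutrition Quarterly",
    "1003 Coaching Methods",
);

# Build a keyword -> record-ids index: an associative array (%index)
# keyed on lowercased words, split out of each title by a regex.
my %index;
for my $rec (@records) {
    my ($id, $title) = $rec =~ /^(\d+)\s+(.*)$/;
    push @{ $index{lc $_} }, $id for $title =~ /(\w+)/g;
}

print "@{ $index{sports} }\n";    # prints "1001 1002"
```

Try writing that in five lines of C -- that's the difference the
built-in data structures and regex engine make.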

Also, because Perl is freely reusable software with a healthy
user community, there are huge archives of programs out there
that are freely downloadable (e.g. CPAN).  I've often solved
problems by downloading something that does 95% of the job,
then adding a few lines of code.  Very fast, very efficient,
very cheap.

> Take a look at http://www.sportquest.com/ it's a database with about
> 14,000 records in the main table, a few hundred in various lookup files
> stored in an Access database.  It works nicely and I can upscale it if
> I ever need to.

As far as I can tell, it's a keyword-searchable list of web sites.
I can do that with Perl (or egrep, or awk, or...) just as easily.
(In fact, I've done it.)  And only 14K records?  No sweat.  Unless
there's something that's not obvious from the outsider's viewpoint,
this isn't even close to requiring a database.
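To put a number on "just as easily": with the records stored one per
line in a flat file (the name sites.txt is hypothetical), the whole
keyword search is a one-liner in either tool.

```shell
# Case-insensitive keyword search over a flat file of records,
# one record per line -- either tool handles 14K lines trivially.
egrep -i 'swimming' sites.txt
perl -ne 'print if /swimming/i' sites.txt
```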

> I can even compile the access
> database so that when they do download it to print labels or whatever,
> they don't even need to buy access.  They can also email the whole
> bunch of them automatically if they want to.

These are nice capabilities, but things that I'd take for granted.

> I've got another one that I'm working on (still buggy), but you can
> look at the input form for it at http://www.ccisd.ca/english/form.cfm
> You're gonna tell me that 5 pages of linked information with
> multiple occurances should go in a flat file?

I have made no such statement: you are attempting to put words in my
mouth, and I very much resent that.  I don't have the requirements
specification for that particular site in front of me, so I can't
tell you what I think the appropriate solution is.  Maybe it *is*
a flat file; maybe it's not.  But 30 seconds of looking over a single
web page doesn't give me the requisite information to make that decision.

I have simply tried to get you (and everyone else) to open your minds
to the idea that modern hardware/OS/language designs allow you to solve
what *appear to be* very complex problems with very simple tools, often
creating what turn out to be superior solutions.  "What's right" varies
on a case-by-case basis, but the one thing that is a constant is that
you cannot achieve superior results with inferior tools.  Windows et al.
are clearly vastly inferior tools, and so by building on top of them,
you have already started in the hole.  I prefer to start ahead rather
than behind, and by choosing a better base to start with (e.g. Linux/Perl)
I often have such a head start on the solution that the problem
becomes trivial.  I like it that way.  So do my clients.

> I've been programming since '82 everything from cobol on mainframes on
> down..databases are more efficient and easier to maintain than flat files.

Utter nonsense.  I have several hundred tools here that will slice-n-dice
flat files before your database s/w can even load and start up.

Databases are appropriate in their place, to be sure: but for small
amounts of data, they represent a waste of resources.  It's just
another case of possession of a hammer causing every problem to
look like a nail.

---Rsk
Rich Kulawiec
[EMAIL PROTECTED]