Just a few thoughts...

Ed McNierney wrote:
If all your data are coming from external SDE sources, outside of your
control, then don't worry about those.  Pay attention to the
configuration of your MapServer systems.

Also, you'll need to be very careful about your network setup: even simple things like DNS resolution can kill you in this type of system.

In your scenario, your MapServer systems will all be acting as OGC
clients and servers.  In those scenarios, those machines will be doing a
lot of disk I/O, writing data retrieved from the SDE boxes and then
reading that data to build and serve output data.  As a result, you will
have a lot of simultaneous reading and writing going on.  Each system
will have a very busy disk subsystem.

You will want a disk subsystem that has very low random seek times
(since you'll be jumping around among multiple requests all the time),
and can handle lots of simultaneous I/O requests.  Conversely, all your
MapServer data is transient data used only to serve the current request,
so you don't need to worry about redundancy or disk failure on your data
disks.

That sounds like a recipe for (a) a RAID 0 (striped) array of disks, (b)
SCSI drives, and (c) high-RPM disks.  The RAID 0 array will give you
excellent read and write performance.  It will give you no data
redundancy, but you don't care about that.  You should opt for a large
number of smaller disks rather than a small number of larger disks, as
this will improve your performance.  SCSI drives will handle multiple
I/O requests better than IDE drives.  And high-RPM drives (15K RPM) will
reduce your seek times (and are another reason to use SCSI, as SCSI
drives are generally available at higher RPMs).

I'd actually consider moving off of magnetic disk entirely for this application. Today, you can get commodity off-the-shelf server motherboards that will hold up to 64GB of RAM, and at the project size you're considering, you have even more options.

On a Linux system, you can use ramfs very simply:

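mount -t ramfs none /var/gis/tmp -o maxsize=2000000

(As far as I know, maxsize is counted in kilobytes on the kernels whose ramfs accepts it, so 2000000 = ~2GB. Stock ramfs has no size cap at all, so on those kernels use tmpfs with -o size=2g instead. You might also try additional options like noatime, noexec, nodev, nosuid.)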

And use that as your IMAGEPATH parameter (in the WEB section of the mapfile). (I believe all of the OGC client interfaces use the IMAGEPATH space for temporary files, but you might want to check that.)
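
For what it's worth, here is a rough sketch of the same idea using tmpfs, which does honor a size cap. The script name, the MOUNT_POINT, SIZE and APPLY variables, and the ramdisk_mount_cmd helper are all made up for illustration. By default it only prints the command it would run; the actual mount needs root.

```shell
#!/bin/sh
# Sketch only: build (and optionally run) a mount command for a size-capped,
# RAM-backed scratch directory. All names and defaults here are assumptions.
MOUNT_POINT="${MOUNT_POINT:-/var/gis/tmp}"
SIZE="${SIZE:-2g}"

ramdisk_mount_cmd() {
    # $1 = mount point, $2 = size (tmpfs accepts k/m/g suffixes)
    echo "mount -t tmpfs -o size=$2,noatime,noexec,nodev,nosuid tmpfs $1"
}

cmd=$(ramdisk_mount_cmd "$MOUNT_POINT" "$SIZE")
if [ "${APPLY:-0}" = "1" ]; then
    # Really mount it (requires root).
    mkdir -p "$MOUNT_POINT" && $cmd
else
    # Dry run: just show what would be executed.
    echo "$cmd"
fi
```

Run it once as-is to check the command, then again with APPLY=1 as root. Keep in mind that tmpfs pages can be swapped out under memory pressure, so size the box so that doesn't happen.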

If you have static data (such as base maps, or "common" shapefiles) that you use locally, you can always copy them (read-only) into a second ramdisk at bootup (with similar options).
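
The boot-time copy could look something like the sketch below. The stage_static_data function and both paths in the example call are hypothetical, and it assumes the second ramdisk is already mounted at the destination.

```shell
#!/bin/sh
# Hypothetical boot-time helper: copy static layers into a RAM-backed
# directory, then strip write permission from the copies.
stage_static_data() {
    src=$1
    dst=$2
    [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
    mkdir -p "$dst" || return 1
    # Copy the directory contents (including dotfiles) into the ramdisk.
    cp -R "$src"/. "$dst"/ || return 1
    # Make the in-memory copies read-only for all users.
    chmod -R a-w "$dst"
}

# Example call (paths are assumptions), e.g. from an rc.local-style script:
# stage_static_data /data/gis/static /var/gis/static
```

Since the contents vanish on every reboot, the on-disk copy stays the master and the ramdisk is rebuilt from it each time.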

With 4GB DDR2 memory sticks going for under $400, this may be a far less expensive option than the SCSI RAID path.

It's actually interesting to consider the fact that the whole box could then be a solid-state device :-) That would _seriously_ increase your uptime and reduce your maintenance costs, including backup, recovery, etc. Just DHCP boot it from an existing server and store syslog output on a remote host. However, that's going a bit afield, and may cause headaches in your IS department.

Just my $.02.

Bill
