I'm surprised that no earlier response suggested getting off of a workstation platform. The performance difference with a "real server" can be striking: disk I/O, CPU-to-memory transfers, and network I/O all generally go way up when you move to server-class hardware.

Of course there's a lot to it and, as others said, there isn't enough info in the original post.

If the processing involves reading A.txt into RAM, doing stuff to it there, and then cramming it into a database, you'd likely do well with an AMD solution, since their CPU-to-RAM transfers are really fast. In that case disk performance isn't as important.

If you are going to take A.txt, grep it, spit out several small files, do stuff to them, then cram the results into a database, you'd get more bang from a really good RAID subsystem, and I don't mean software RAID. I know that the software vs. hardware RAID argument is second only to vi vs. Emacs, but if you want real speed my vote is for hardware. Lots of cache on the controller and disks that are engineered in conjunction with the controller: boy, does it make a difference.



Are you:

Disk bound?
Processor bound?
Memory bound?
Network bound?
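
One rough way to start answering that is to time each phase of the job and compare wall-clock time with CPU time. The sketch below is only an illustration: `profile_phase`, `classify`, and the 0.8 threshold are names and values I made up, not anything from the original setup. If CPU time is close to wall time, the phase is probably processor bound; if wall time is much larger, the process is mostly waiting on disk or network. It will not tell you whether you are memory bound; for that you would watch swap activity with the usual system tools.

```python
import time


def profile_phase(func, *args):
    """Run func(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


def classify(wall_seconds, cpu_seconds, threshold=0.8):
    """Very rough guess at the bottleneck for one phase.

    threshold is an arbitrary assumption: a CPU/wall ratio at or above
    it is treated as processor bound.
    """
    if wall_seconds <= 0:
        return "too fast to measure"
    if cpu_seconds / wall_seconds >= threshold:
        return "processor bound"
    return "waiting on I/O (disk or network)"


def read_file(path):
    """Hypothetical read phase: pull the whole file into RAM."""
    with open(path, "rb") as handle:
        return handle.read()


def count_records(data):
    """Hypothetical parse phase: count newline-terminated records."""
    return data.count(b"\n")


if __name__ == "__main__":
    import sys

    data, wall, cpu = profile_phase(read_file, sys.argv[1])
    print("read :", classify(wall, cpu), f"(wall={wall:.2f}s cpu={cpu:.2f}s)")
    _, wall, cpu = profile_phase(count_records, data)
    print("parse:", classify(wall, cpu), f"(wall={wall:.2f}s cpu={cpu:.2f}s)")
```

Run it against one of the larger customer files and swap in the real parsing step for `count_records`. Note that `time.process_time()` only counts CPU used by this process, so work done inside a separate database server will show up as waiting.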




All that said, you could still be better off clustering some workstations. Then you have the cluster coding question: how do you best use the clustered systems? And will you be creating a support headache that you will "own" for the rest of your days?
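
To put rough numbers on the "cut processing time in half" claim, Amdahl's law gives an upper bound on the speedup from spreading work over several machines. This is a back-of-the-envelope sketch, not a prediction: it assumes all machines are equally fast (a mix of P3s and P4s is not) and it ignores the cost of shipping data between boxes. The function names are mine.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the job can be split up."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def min_parallel_fraction(target_speedup, workers):
    """Smallest parallel fraction that can reach target_speedup on workers machines."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / workers)


if __name__ == "__main__":
    workers = 5
    for fraction in (0.25, 0.50, 0.75, 0.90):
        speedup = amdahl_speedup(fraction, workers)
        print(f"{fraction:.0%} parallel on {workers} boxes: at most {speedup:.2f}x")
    needed = min_parallel_fraction(2.0, workers)
    print(f"halving the run time on {workers} boxes needs at least {needed:.1%} parallel work")
```

Under those assumptions, halving the run time on 5 machines requires that at least 62.5% of the job can run in parallel, and any time spent distributing files over the network pushes that figure higher.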



Lots to think about...



        Kevin


Mark Freeze wrote:
Someone please take my side and settle an argument for me.
 I have a friend who runs a business like mine and we have the same basic
setup. We normally receive files from customers that may be 50 to 100 MB. We
run programs on these files that parse text, create databases, purge
records, and so on. Normal database stuff. Converting and parsing records
with the software that I have written usually runs for about 1 hour on the
larger files and we may have 2 or 3 of these files each time a customer
transmits data to us.
 My friend says that he is considering clustering Linux boxes together to
improve the speed of the processing and he figures that he can cut
processing time in half. Now I may be in for a public spanking, but I did
not think that clustering would have that much of an effect on this type of
operation. Also, he is not talking about clustering new, workhorse p4
machines... He is talking about clustering up about 4 or 5 p3 & p4 machines
that he has as spares. From the things that I have read (including the link
that someone posted the other day) I think that he has a misconception of
clustering.
 Am I way off base? Will clustering have this dramatic of an effect?
 Thanks,
Mark.