--On 06/13/2005 09:56:30 AM -0700 Steve Lancaster wrote:
Is 10-year-old temperature data really worth ANYTHING? (Unless you are a
weather forecaster.)
While the "cost per megabyte" is low, that assumes that the machine you
use to store it, the backup tapes and labor you use to back it up, and
the staff you need to support the processes are all "free"..
Just because you can isn't a good reason to store piles of useless data...
Tell me.. do you remember where sensor 10.2C0442000800 was 10 years ago?
Or what the configuration of all the machines in all the racks was?
Do you even HAVE the same machines you had 10 years ago?
I thought not.
:-)
Steve
Steve,
Why the contentious tone?
I'd give this the famous "it depends." If there is some forethought as to
sensor location and collection, yes it can be. I would love to have the raw
data for any number of 25 year old research projects. They might have
flaws, but they generated seminal works and it would be very interesting to
see how hindsight views the analysis.
In the case of the person doing the rack sensor logging, I would think it
is very useful. With almost 10K nodes, you have a large population and
should be able to look for correlations between temperature and things like
system failures. And yes, believe it or not, there are systems where the
racks run untouched other than maintenance for many years. When you're
dealing with 10k nodes, you don't just up and decide to replace a few.
Now let's look at the cost and effort of storing the data. By far the
hardest/most expensive part is defining the format that the data will be
stored in, so let's just guess it will be XML with a bunch of tags. Now
let's assume a reading generates 256 bytes of XML to be stored. With 300
sensors generating samples every 30 seconds, I come up with 221 MB per day.
You get around 21 days on a DVD-R which costs about a buck and would take 5
minutes of someone's time to unmount and store the old one, then label and
mount the new one. We could make it part of a weekly operator action, and
the net cost is really quite small.
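For what it's worth, the arithmetic holds up. A quick sketch of the estimate (my own figures; I'm assuming a 4.7 GB single-layer DVD-R, and the 256-byte record size is of course just a guess):

```python
# Back-of-envelope check of the storage estimate above.
# Assumed inputs: 300 sensors, one reading every 30 seconds,
# ~256 bytes of XML per reading, 4.7 GB (decimal) DVD-R.

SENSORS = 300
INTERVAL_S = 30
BYTES_PER_READING = 256
DVD_BYTES = 4.7e9

readings_per_day = SENSORS * (24 * 3600 // INTERVAL_S)  # 864,000 readings
bytes_per_day = readings_per_day * BYTES_PER_READING    # ~221 MB
days_per_dvd = DVD_BYTES / bytes_per_day                # ~21 days

print(f"{bytes_per_day / 1e6:.0f} MB/day, "
      f"{days_per_dvd:.0f} days per DVD-R")
```

Halve the sample rate or gzip the XML and a single disc covers a couple of months.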
If I were doing the failure tracking of a cluster this size, I sure would
want to have this data to look for correlations. How many nodes fail each
day? Do top of rack units do worse? Do certain floor areas do worse? Do
temporary peaks cause accelerated failures, and if so what is the window
before the effects get lost in the noise?
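To illustrate the kind of correlation check I mean, here is a trivial sketch; the record layout and values are entirely made up, just to show the shape of the analysis:

```python
# Hypothetical sketch: compare failure rates by rack position, given
# per-node records. The (node_id, rack_slot, failed) layout is invented
# for illustration; real data would come from the temperature/failure logs.
from collections import defaultdict

records = [
    ("n001", "top", True),
    ("n002", "top", False),
    ("n003", "mid", False),
    ("n004", "bottom", False),
    ("n005", "top", True),
    ("n006", "mid", True),
]

counts = defaultdict(lambda: [0, 0])  # rack_slot -> [failures, total]
for _, slot, failed in records:
    counts[slot][0] += failed
    counts[slot][1] += 1

for slot, (fails, total) in sorted(counts.items()):
    print(f"{slot:>6}: {fails}/{total} failed ({fails / total:.0%})")
```

With 10K nodes the populations are big enough that even a simple breakdown like this would show whether top-of-rack units really run hotter and die sooner.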
Many moons ago I used to be a physicist. Every run of every experiment in a
synchrotron is recorded forever. There are physicists who make a living
reanalyzing old data. Not only do new results come up, but also the design
of experiments improves with this kind of review.
jerry
_______________________________________________
Owfs-developers mailing list
Owfs-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/owfs-developers