On 21/10/10 22:53, Ian Holsman wrote:
yep.. I've heard it's a source of contention...

but I'd like to see how we can minimize the number of patches that the
large companies apply on top of the current production Apache release, so
that the large installations are all running nearly identical code on their
clusters and we wouldn't need Yahoo or Cloudera to publish repos of their
patch sets.

So ideally I'd like to hear what kinds of things Apache needs to do to help
make these codebases less divergent.

In discussing it with people, I've heard that a major issue (not the only
one, I'm sure) is the lack of resources to actually test the Apache releases
on large clusters, and that it is very hard to get this done in short cycles
(hence the large gap between 0.20.x and 0.21).

So I thought I would start this thread to see if we could at least identify
what people think the problems are.

A big issue is the $ value of the data in a production cluster, the size of the large clusters, and the fact that they are in use. The only time you can test on a few hundred nodes (especially now that 12-24 TB/node is possible) is when you are bringing up a cluster of that size, which is a pretty rare event. Lots of us have small real or virtual clusters, but they don't stress the NameNode or JobTracker, and they don't surface emergent problems like the growing cost of rebalancing a 24 TB node if it goes down, etc.
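
To put a very rough number on that rebalancing cost, here's a back-of-envelope sketch in Python. Every parameter in it (node capacity, cluster size, NIC speed, the fraction of bandwidth you can spend on recovery) is an assumption picked for illustration, not a measurement from anyone's cluster:

# Rough sketch: hours to re-replicate a dead node's blocks, assuming
# the work spreads evenly across the surviving nodes' spare bandwidth.
def rereplication_hours(node_tb, surviving_nodes, nic_gbps, usable_fraction=0.3):
    bits_to_copy = node_tb * 8e12                  # TB of lost replicas -> bits
    agg_bps = surviving_nodes * nic_gbps * 1e9 * usable_fraction
    return bits_to_copy / agg_bps / 3600

# A full 24 TB node dying on a 20-node cluster with 1 Gbit/s NICs,
# spending ~30% of bandwidth on recovery: roughly 9 hours of reduced
# replication, during which a second failure can lose blocks outright.
print(f"{rereplication_hours(24, 20, 1.0):.1f} h")

The point of the arithmetic is that the exposure window grows linearly with node capacity, which is exactly the kind of problem a small real or virtual cluster will never show you.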
