On 16/06/11 14:49, David Scott Williams wrote:
http://gigaom.com/cloud/lexisnexis-open-sources-its-hadoop-killer/


It's interesting that they decided they had to go open source to survive. It's the Linux effect: Linux wasn't necessarily better than Solaris, it just built up the momentum.

A strength of Hadoop is that it has layers, and a lot of the interesting stuff sits above the basic layer -- Mahout, Pig, Hive, Hama, etc. -- and you can plug things in: filesystems, schedulers, HDFS placers. While we can debate what "compatible" means, by implementing the APIs that the higher layers use, MapR -- and hence EMC's products -- can run those higher layers. HPCC looks to be a completely new ecosystem.

Oh, and the license is AGPL, which complicates any external-facing web app far more than even the GPL does. Good for business models (you can pay for the alternate license), but not ideal for uptake.

HPCC have a good comparison page here; it seems quite unbiased:

http://hpccsystems.com/why-HPCC/HPCC-vs-hadoop

Regarding performance, I haven't seen any new terasort numbers for a while. Whoever next brings up a 1000+ node cluster should publish them.

As HPCC say: "In practice, HPCC configurations require significantly fewer nodes to provide the same processing performance as a Hadoop cluster. Sizing of clusters may depend however on the overall storage requirements for the distributed file system."

That means that if your cluster is driven by storage demands, storage fixes the node count more than CPU does (though if you need fewer CPUs, that's capex and opex savings, or the opportunity to do other things with the spare CPU time).
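To make that sizing point concrete, here's a toy calculation -- all the figures are made-up illustrative assumptions, not HPCC or Hadoop benchmarks:

```python
# Toy cluster-sizing sketch: the node count is whichever of storage
# or compute demands more nodes. All numbers are hypothetical.

def nodes_needed(total_tb, tb_per_node, core_hours, core_hours_per_node):
    storage_nodes = -(-total_tb // tb_per_node)            # ceiling division
    compute_nodes = -(-core_hours // core_hours_per_node)  # ceiling division
    return max(storage_nodes, compute_nodes)

# Example: 500 TB of data at 12 usable TB/node; the daily workload
# needs 2000 core-hours and each node supplies 200 of them.
n = nodes_needed(500, 12, 2000, 200)
print(n)  # storage wins: 42 nodes, even though compute alone needs only 10
```

So a platform that halves your per-node CPU requirement doesn't shrink a storage-bound cluster at all; it just frees up cycles on the nodes you already have to buy.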
