Hi stevel,
Thanks for your reply. I have not tried to debug InfiniBand; I only know of
it.
My Hadoop cluster currently runs HDFS + MapReduce, Hive, and a Derby server.
I want to add HBase to the cluster. How can I do that? Can you help me?
Thanks,
Pengbing Chu
> Date: Mon, 6 Sep 2010 11:14:10 +0100
> From: ste...@apache.org
> To: common-user@hadoop.apache.org
> Subject: Re: the question of hadoop
>
> On 06/09/10 09:32, 褚 鵬兵 wrote:
> >
> > Hi, my Hadoop friends: I have 3 questions about Hadoop. They are ....
> >
> > 1. The speed between the datanodes. With terabytes of data on one
> > datanode, data is transferred from one datanode to another. If that
> > speed is poor, Hadoop will be slow, I think. I have heard of the gNet
> > architecture in Greenplum; what about Hadoop? SAS storage + Gigabit
> > Ethernet is the best answer, isn't it?
>
> If your code has locality, gigabit Ethernet is fine, and it saves the
> hassle of getting faster stuff to work. Have you ever tried to debug
> InfiniBand cluster problems?
>
> > 2. The GUI tool. There is a Hive web tool in Hadoop, but it is too
> > simple for our business work.
> > If Hadoop + Hive is built into a DWH, how should users access it?
> > Through a CGI tool (command)? Through a newly developed web GUI tool?
>
> The community welcomes new contributions. I'd look at Cascading,
> Datameer's stuff, and other things. Hive is designed for people who
> know SQL, like PHP developers.
>
> > 3. A 5-computer Hadoop cluster versus 1 computer running SQL Server 2000.
> >    Hadoop: 5 computers, each Celeron 2.66 GHz, 1 GB memory, Ethernet;
> >      namenode + secondarynamenode + 3 datanodes
> >    SQL Server 2000: 1 computer, Celeron 2.66 GHz, 1 GB memory
> > I then ran a select operation on the same 100M of data:
> >    Hadoop (5 computers):         2 mins 30 secs
> >    SQL Server 2000 (1 computer): 2 mins 25 secs
> > The result is that the 5-computer Hadoop cluster is no better. Why? Can
> > anyone give me some advice?
> > Thanks in advance.
>
> Indexes give RDBMSs speed, but limit their scale. If your dataset fits
> onto a single MSSQL or MySQL server and you can afford the index costs,
> stay with that data in a RAID array. Hadoop isn't trying to compete in
> that space, though things like CouchDB are trying to.
>
> However, before you dismiss Hadoop, get in touch with your SQL Server or
> Oracle account team and say "we are planning on working with 15
> Petabytes of storage with data coming in at 1-2PB/month" and see what
> they say back and how big their quote is. The search terms "MapReduce a
> Major Step Backwards" show some of the debate going on.
>