This sounds like an interesting exercise.  We should do the same on this end,
proving a release on a cluster just before we put it out.
Are the keys that TeraGen makes binary?  Maybe check its source?

If they are, they'll look odd in the UI and in the shell; we don't support them
in the UI and shell (yet), but HBase should operate fine with binary keys.  Is it
not working for you?
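
One way to check by eye whether keys are binary is to print them with every
non-printable byte escaped.  The sketch below is only an illustration: the
class KeyEscape and its escape() method are hypothetical names, not part of
HBase or Hadoop, and the \xNN notation is an assumed display convention.  The
first sample key is copied from the region listing quoted below; the second is
made up.

```java
import java.nio.charset.StandardCharsets;

class KeyEscape {
    // Hypothetical helper (not an HBase API): keep printable ASCII as-is and
    // render every other byte, plus the backslash itself, as \xNN.
    static String escape(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 0x20 && v <= 0x7E && v != '\\') {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A key taken from the listing in this thread: all printable ASCII.
        byte[] printable = "uI-1OW2g=t".getBytes(StandardCharsets.US_ASCII);
        // A made-up key containing non-printable bytes.
        byte[] binary = {0x00, 0x41, (byte) 0xFF};
        System.out.println(escape(printable));
        System.out.println(escape(binary));
    }
}
```

If a key comes back unchanged from escape(), it contains only printable ASCII;
any \xNN sequence in the output marks a byte that a web page or terminal may
not show faithfully.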

St.Ack


On Sat, Feb 28, 2009 at 2:56 AM, schubert zhang <[email protected]> wrote:

> I have been using HBase and Hadoop for 5 months.
>
> My testbed has 5 nodes (1 master and 4 slaves):
> Hadoop-0.19.1
> HBase-0.19.0
>
> 1. I used the TeraGen mapreduce job from the hadoop examples to generate
> files with random key-value pairs.
>    I created one 1G data set and another 10G data set for later tests.
>
> 2. Then I wrote a job to read these TeraGen files and insert each record's
> key-value pair into an HBase table.
>    (create 'sort1g', {NAME => 't', VERSIONS => 1})
>    (create 'sort10g', {NAME => 't', VERSIONS => 1})
>    I want to use these insert jobs to simulate TeraSort, since HBase
> automatically sorts rows.
>
> 3. After the insert jobs finished, I found the following strange thing on
> the HBase web interface:
>
> Name Region Server Encoded Name Start Key End Key
> ......
> sort10g,%ql`{^8Bcf,1235730412828   nd2-rack0-cloud:60020   155375382
>  %ql`{^8Bcf   &YK&Uop0a=
> sort10g,&YK&Uop0a=,1235730749832  nd1-rack0-cloud:60020  1574155935
>  &YK&Uop0a=  'B'Zp+!]Tb
> sort10g,'B'Zp+!]Tb,1235730749832  nd1-rack0-cloud:60020  395792177
>  'B'Zp+!]Tb  ()o:
> sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
> sort10g,(qYp"7;j2$,1235730731006  nd1-rack0-cloud:60020  2143364419
>  (qYp"7;j2$  )Z/?>:ZM3Z
> sort10g,)Z/?>:ZM3Z,1235730853698  nd2-rack0-cloud:60020  440987412
>  )Z/?>:ZM3Z  *BuVHF#1ME
> .......
> sort10g,:Qt-(8;Y>i,1235730441379   nd1-rack0-cloud:60020   1461025497
>  :Qt-(8;Y>i   ;;Vg!IT[d"
> sort10g,;;Vg!IT[d",1235730461102  nd1-rack0-cloud:60020  36776992
>  ;;Vg!IT[d"  <$#
> sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
> sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
> sort10g,=VyK?xTtI`,1235730334262  nd3-rack0-cloud:60020  1165072084
>  =VyK?xTtI`  >A274Dj=vU
>  .......
> sort10g,s#Y}pGP|{3,1235730476424   nd1-rack0-cloud:60020   1728348677
>  s#Y}pGP|{3   soWA+0=0Ao
> sort10g,soWA+0=0Ao,1235730487163  nd1-rack0-cloud:60020  1275380223
>  soWA+0=0Ao  t\<
> sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t
> sort10g,uI-1OW2g=t,1235730515195  nd1-rack0-cloud:60020  232566103
>  uI-1OW2g=t  v6'-_5E]7'
>
>
> Among the lines above, some do not look normal:
> sort10g,()o:  nd1-rack0-cloud:60020  1176340729  ()o:  (qYp"7;j2$
> sort10g,<$#  nd1-rack0-cloud:60020  1430043392  <$#
> sort10g,  nd3-rack0-cloud:60020  1176532237   =VyK?xTtI`
> sort10g,t\<  nd1-rack0-cloud:60020  2080592534  t\<  uI-1OW2g=t
>
>
> Could you please tell me whether this is right or not?
>
