Re: why print this error when using MultipleOutputFormat?

2009-02-24 Thread ma qiang
Thanks for your reply.
If I increase the number of machines in the cluster, will that solve the
problem of running out of file descriptors?




On Wed, Feb 25, 2009 at 11:07 AM, jason hadoop  wrote:
> My first guess is that your application is running out of file
> descriptors, possibly because your MultipleOutputFormat instance is opening
> more output files than you expect.
> Opening lots of files in HDFS is generally a quick route to bad job
> performance if not job failure.
>
> On Tue, Feb 24, 2009 at 6:58 PM, ma qiang  wrote:
>
>> Hi all,
>>   I have one class that extends MultipleOutputFormat, as below:
>>
>>    public class MyMultipleTextOutputFormat<K, V> extends
>>            MultipleOutputFormat<K, V> {
>>        private TextOutputFormat<K, V> theTextOutputFormat = null;
>>
>>        @Override
>>        protected RecordWriter<K, V> getBaseRecordWriter(FileSystem fs,
>>                JobConf job, String name, Progressable arg3) throws IOException {
>>            if (theTextOutputFormat == null) {
>>                theTextOutputFormat = new TextOutputFormat<K, V>();
>>            }
>>            return theTextOutputFormat.getRecordWriter(fs, job, name, arg3);
>>        }
>>
>>        @Override
>>        protected String generateFileNameForKeyValue(K key, V value, String name) {
>>            return name + "_" + key.toString();
>>        }
>>    }
>>
>>
>> I also call conf.setOutputFormat(MultipleTextOutputFormat2.class) in my job
>> configuration, but when the program runs, the following error is printed:
>>
>> 09/02/25 10:22:32 INFO mapred.JobClient: Task Id :
>> attempt_200902250959_0002_r_01_0, Status : FAILED
>> java.io.IOException: Could not read from stream
>>        at
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:119)
>>        at java.io.DataInputStream.readByte(DataInputStream.java:248)
>>        at
>> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:325)
>>        at
>> org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:346)
>>        at org.apache.hadoop.io.Text.readString(Text.java:400)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2779)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2704)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
>>
>> 09/02/25 10:22:42 INFO mapred.JobClient:  map 100% reduce 69%
>> 09/02/25 10:22:55 INFO mapred.JobClient:  map 100% reduce 0%
>> 09/02/25 10:22:55 INFO mapred.JobClient: Task Id :
>> attempt_200902250959_0002_r_00_1, Status : FAILED
>> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
>>
>> /user/qiang/output/_temporary/_attempt_200902250959_0002_r_00_1/part-0_t0x5y3
>> could only be replicated to 0 nodes, instead of 1
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1270)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
>>        at org.apache.hadoop.ipc.Client.call(Client.java:696)
>>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>        at $Proxy1.addBlock(Unknown Source)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>        at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>        at $Proxy1.addBlock(Unknown Source)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2815)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2697)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
>>        at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
>>
>>
>> Of course the program runs successfully without MyMultipleTextOutputFormat.
>> Who can help me solve this problem?
>> Thanks.
>>
>> yours,    Qiang
>>
>
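
A minimal sketch of the advice above, not the poster's code: since generateFileNameForKeyValue here derives one output file per distinct key, a job with many distinct keys can exhaust file descriptors. One way to bound the number of open files is to hash keys into a fixed number of buckets. This assumes the old org.apache.hadoop.mapred API used in this thread; the Text key/value types and the bucket count of 16 are arbitrary choices for illustration.

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

    public class BucketedTextOutputFormat extends MultipleTextOutputFormat<Text, Text> {
        private static final int NUM_BUCKETS = 16;

        @Override
        protected String generateFileNameForKeyValue(Text key, Text value, String name) {
            // Hashing the key caps the number of distinct files a reduce task
            // opens at NUM_BUCKETS, instead of one file per distinct key.
            int bucket = (key.hashCode() & Integer.MAX_VALUE) % NUM_BUCKETS;
            return name + "_" + bucket;
        }
    }

With more machines the reduce tasks are spread out and each one sees fewer keys, but a task can still hit the per-process file descriptor limit if it opens one file per distinct key.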


why print this error when using MultipleOutputFormat?

2009-02-24 Thread ma qiang
Hi all,
   I have one class that extends MultipleOutputFormat, as below:

  public class MyMultipleTextOutputFormat<K, V> extends
          MultipleOutputFormat<K, V> {
      private TextOutputFormat<K, V> theTextOutputFormat = null;

      @Override
      protected RecordWriter<K, V> getBaseRecordWriter(FileSystem fs,
              JobConf job, String name, Progressable arg3) throws IOException {
          if (theTextOutputFormat == null) {
              theTextOutputFormat = new TextOutputFormat<K, V>();
          }
          return theTextOutputFormat.getRecordWriter(fs, job, name, arg3);
      }

      @Override
      protected String generateFileNameForKeyValue(K key, V value, String name) {
          return name + "_" + key.toString();
      }
  }


I also call conf.setOutputFormat(MultipleTextOutputFormat2.class) in my job
configuration, but when the program runs, the following error is printed:

09/02/25 10:22:32 INFO mapred.JobClient: Task Id :
attempt_200902250959_0002_r_01_0, Status : FAILED
java.io.IOException: Could not read from stream
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:119)
at java.io.DataInputStream.readByte(DataInputStream.java:248)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:325)
at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:346)
at org.apache.hadoop.io.Text.readString(Text.java:400)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2779)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2704)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)

09/02/25 10:22:42 INFO mapred.JobClient:  map 100% reduce 69%
09/02/25 10:22:55 INFO mapred.JobClient:  map 100% reduce 0%
09/02/25 10:22:55 INFO mapred.JobClient: Task Id :
attempt_200902250959_0002_r_00_1, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/user/qiang/output/_temporary/_attempt_200902250959_0002_r_00_1/part-0_t0x5y3
could only be replicated to 0 nodes, instead of 1
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1270)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
at org.apache.hadoop.ipc.Client.call(Client.java:696)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
at $Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy1.addBlock(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2815)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2697)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)


Of course the program runs successfully without MyMultipleTextOutputFormat.
Who can help me solve this problem?
Thanks.

yours, Qiang


Re: how many maps in a map task?

2008-11-10 Thread ma qiang
Yes,
it needs to be analyzed further in the reducer.


On Tue, Nov 11, 2008 at 10:28 AM, Mice <[EMAIL PROTECTED]> wrote:
> Is there a reducer in your program? Or do you need to output the result
> on the map side?
>
> 2008/11/11 ma qiang <[EMAIL PROTECTED]>:
>> hi all,
>>    I have a data set stored in HBase, and I run a MapReduce program
>> to analyze it. Now I want to know: how many maps are there in a map task?
>>    I want to use the number of maps in my program. For
>> example, if there are 100 maps in a map task, I want to collect all
>> the values, analyze these values, and then output the final result in
>> the last map.
>>    Generally, there is no output in the first 99 maps; output only exists
>> in the last map. So I need to know how many maps there are in a map task and
>> use that number in my code.
>>
>> Thanks a lot.
>>
>> yours qiang.
>>
>
>
>
> --
> http://www.hadoopchina.com
>


how many maps in a map task?

2008-11-10 Thread ma qiang
hi all,
    I have a data set stored in HBase, and I run a MapReduce program
to analyze it. Now I want to know: how many maps are there in a map task?
    I want to use the number of maps in my program. For
example, if there are 100 maps in a map task, I want to collect all
the values, analyze these values, and then output the final result in
the last map.
    Generally, there is no output in the first 99 maps; output only exists
in the last map. So I need to know how many maps there are in a map task and
use that number in my code.

Thanks a lot.

yours qiang.
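
A minimal sketch of reading the task's view of the job from inside a mapper, assuming the old mapred API of this era. Note that the configured number of maps is only a hint (the InputFormat's splits decide the real count), so relying on "the last map" to emit a combined result is fragile; a reducer, as suggested in the reply above, is the usual place for that kind of global aggregation. The mapred.task.partition property and the class name below are illustrative assumptions.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class CountingMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        private int configuredMaps;   // the -m hint, not necessarily the real split count
        private int taskPartition;    // this map task's index within the job

        @Override
        public void configure(JobConf job) {
            configuredMaps = job.getNumMapTasks();
            taskPartition = job.getInt("mapred.task.partition", -1);
        }

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            // per-record work; emitting the combined result from a reducer is
            // more reliable than trying to detect "the last map" here
        }
    }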


how to access data directly in webapplication?

2008-05-23 Thread ma qiang
Hi all,
I have developed a web application using Tomcat. In my app, a
client submits a request, and the server reads data from HBase and then
returns that data as the response. But for now I can only use the shell and
the Eclipse plugin to invoke programs on Hadoop. Who can tell me how to
access data in HBase directly from a Java program, such as a servlet, when
Tomcat processes a request from a client?
 Thank you very much!
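
A hedged sketch of one way a servlet can read HBase directly, assuming the Text-based HTable API used elsewhere in these threads (HTable(HBaseConfiguration, Text) and a byte[] get(Text row, Text column) lookup); the table name "HBaseTest", the column "a:", and the request parameter are placeholders. The HBase and Hadoop jars plus the HBase configuration must be on Tomcat's classpath so the client can find the master.

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTable;
    import org.apache.hadoop.io.Text;

    public class HBaseLookupServlet extends HttpServlet {
        private HTable table;   // reused across requests; creating it per request is slow

        @Override
        public void init() throws ServletException {
            try {
                table = new HTable(new HBaseConfiguration(), new Text("HBaseTest"));
            } catch (IOException e) {
                throw new ServletException("could not open HBase table", e);
            }
        }

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            byte[] cell = table.get(new Text(req.getParameter("row")), new Text("a:"));
            resp.getWriter().println(cell == null ? "" : new String(cell));
        }
    }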


why the value of attribute in map function will change ?

2008-03-15 Thread ma qiang
Hi all:
I have this map class, as below:
public class TestMap extends MapReduceBase implements Mapper
{
  private static int value;

  public TestMap()
  {
   value=100;
  }

  public void map……

 
}

 and my Driver class is as below:
   public class Driver {

public void main() throws Exception {
conf……… 

client.setConf(conf);
try {   
JobClient.runJob(conf);
} catch (Exception e) {
e.printStackTrace();
}

System.out.println(TestMap.value); //here the
value printed is 0;
}

I don't know why the TestMap.value printed is 0 rather than 100. Who can
tell me why?
Thanks very much!
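
A hedged sketch of what is probably happening and one workaround, assuming the map tasks run in separate child JVMs from the driver (the normal non-local setup): TestMap's constructor sets the static to 100 only inside the task JVMs, so the driver's own copy of the static never changes from its default of 0. Job counters are the usual channel for getting a small value back to the driver; this assumes a Hadoop version whose Reporter and RunningJob expose incrCounter and getCounters, and all class names below are placeholders.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class CounterMap extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        public enum MyCounter { RECORDS_SEEN }

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            reporter.incrCounter(MyCounter.RECORDS_SEEN, 1);   // aggregated by the framework
        }
    }

    // In the driver, after the job finishes:
    //   RunningJob job = JobClient.runJob(conf);
    //   long seen = job.getCounters().getCounter(CounterMap.MyCounter.RECORDS_SEEN);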


if can not close the connection to HBase using HTable ...?

2008-03-13 Thread ma qiang
Hi all,
 If I cannot close the connection to HBase held by an HTable, but
 the object is set to null, will the resources of this connection
 be released?

 The code is as below:

 public class MyMap extends MapReduceBase implements Mapper {
private HTable connection ;

public MyMap(){
   connection=new HTable(new HBaseConfiguration(),
new Text("HBaseTest"));
}

public void map(...){
  ..
  connection=null;  // I couldn't use  connection.close;
}

 }


Re: why can not initialize object in Map Class's constructor?

2008-03-13 Thread ma qiang
I'm sorry, my mistake. The object was initialized.






On Thu, Mar 13, 2008 at 3:13 PM, ma qiang <[EMAIL PROTECTED]> wrote:
> Hi all,
> My code as below,
>
>  public class MapTest extends MapReduceBase implements Mapper {
> private int[][] myTestArray;
> private int myTestInt;
>
> public MapTest()
> {
>System.out.println("The construtor run !");
>myTestArray=new int[10][10];
>myTestInt=100;
>  }
>  .
>
>   }
>
>   When I run this program, I find that the constructor runs and the field
>  myTestInt is 100, but myTestArray is null and is not initialized. I have
>  no idea why the object is not initialized in the Map class's constructor.
>  Thank you for your reply!
>
>  Qiang Ma
>


why can not initialize object in Map Class's constructor?

2008-03-13 Thread ma qiang
Hi all,
My code as below,

 public class MapTest extends MapReduceBase implements Mapper {
private int[][] myTestArray;
private int myTestInt;

public MapTest()
{
   System.out.println("The construtor run !");
   myTestArray=new int[10][10];
   myTestInt=100;
 }
.

  }

 When I run this program, I find that the constructor runs and the field
myTestInt is 100, but myTestArray is null and is not initialized. I have
no idea why the object is not initialized in the Map class's constructor.
Thank you for your reply!

Qiang Ma


connection to HBase using HTable

2008-03-06 Thread ma qiang
Hi all:
 In my map function, I need to connect to a table in HBase several
times, and I use the HTable class; the code is as below:

  HTable TestConnection = new HTable(new HBaseConfiguration(),
new Text("test"));

So every time the map function runs, this code must run, and that
time is wasted.

Because the connection is made so many times, most of the running
time is wasted in new HTable(). So I want to run this code only once
and have all of the tasks reuse the same connection. How can I do this?
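
A minimal sketch of moving the connection setup out of map(): MapReduceBase.configure(JobConf) runs once when the task starts, so the same HTable is reused for every record that task processes (each task is a separate JVM, so "one connection per task" is as shared as it can get). The table name "test" is taken from the message above; the class name is a placeholder.

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;

    public class TableReadingMapperBase extends MapReduceBase {
        protected HTable testTable;   // opened once per task, reused by every map() call

        @Override
        public void configure(JobConf job) {
            try {
                testTable = new HTable(new HBaseConfiguration(), new Text("test"));
            } catch (IOException e) {
                throw new RuntimeException("could not open HBase table", e);
            }
        }
    }

A mapper can extend this class (or put the same code in its own configure()) and then use testTable inside map() without paying the connection cost for every record.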


about the length of Text

2008-02-24 Thread ma qiang
Hi all;
An error happened in my program, and the system printed the following:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.io.Text.write(Text.java:243)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:349)
at MyTest.map(UserMinHashMap.java:43)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
at 
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1787)

I think the error happens here: output.collect(new
Text(testKeyString), new Text(testValueString));
I'm not sure why. I guess the reason is that testValueString is too
long to construct the Text or to write it to HDFS. Who can tell me why?

Thanks very much
Best Wishes!
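
A hedged sketch of the usual first remedy, assuming the failure really is the serialized record outgrowing the task's heap: raise the heap of the child JVMs that run the map tasks (mapred.child.java.opts, whose default in this era was roughly -Xmx200m), and where possible emit several smaller records instead of one enormous value string. The class name is a placeholder.

    import org.apache.hadoop.mapred.JobConf;

    public class HeapConfigSketch {
        public static void main(String[] args) {
            JobConf conf = new JobConf(HeapConfigSketch.class);
            // This sets the heap of the spawned map/reduce task JVMs, which is
            // where the OutOfMemoryError above occurs, not the client JVM.
            conf.set("mapred.child.java.opts", "-Xmx512m");
            // ... set mapper, input/output paths, and submit as usual
        }
    }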


Re: how to use two reduce functions?

2008-02-24 Thread ma qiang
Thanks for your reply. My problem is as follows: I have an
application that needs to use two reduce phases. In my first reduce
function, I divide all the data among several keys which will be used
in the second reduce function; in addition, my second reduce
function computes some values using data from the result of
the first reduce function. The result of the first reduce function is
the input data of the second reduce function.
Alternatively I could run two jobs, but in that case the map function of the
second job would do nothing except some I/O.


On Sun, Feb 24, 2008 at 3:29 AM, Jason Venner <[EMAIL PROTECTED]> wrote:
> If you set up a partitioner class, you could pre-partition the output of
>  the map into the relevant segments.
>  Then your reducer would be responsible for determining which reduce
>  function to apply based on which segment the key is part of.
>
>
>
>
>  Amar Kamat wrote:
>  > Can you provide more details on what exactly you wish to do? What
>  > is the nature of the reducers? A simple answer would be: with map (m) and
>  > reducers (r1, r2) you can run 2 jobs, i.e. job1(m, r1) and
>  > job2(IdentityMapper, r2). But it depends on what exactly r1 and r2 do.
>  > Combiners will also play an important role. Also, could you merge r1 and
>  > r2 into r and run a single job(m, r)?
>  > Amar
>  > On Sat, 23 Feb 2008, ma qiang wrote:
>  >
>  >> Hi all,
>  >>    I have a program that needs to use two reduce functions; who can tell me
>  >> how?
>  >>Thank you!
>  >>
>  >> Qiang
>  >>
>
>  --
>  Jason Venner
>  Attributor - Publish with Confidence <http://www.attributor.com/>
>  Attributor is hiring Hadoop Wranglers, contact if interested
>
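
A hedged sketch of the partitioner idea suggested above: the mapper tags each key with a segment prefix, the partitioner routes each segment to its own group of reduce tasks, and the reducer dispatches on the prefix to decide which "reduce function" to apply. The "A:"/"B:" prefixes are arbitrary illustrations, not anything from the original program; the old mapred Partitioner interface is assumed.

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.Partitioner;

    public class SegmentPartitioner implements Partitioner<Text, Text> {
        public void configure(JobConf job) { }

        public int getPartition(Text key, Text value, int numPartitions) {
            // keys look like "A:realKey" or "B:realKey"; each segment goes to
            // its own reduce partition, so a reducer only ever sees one segment
            int segment = key.toString().startsWith("A:") ? 0 : 1;
            return segment % numPartitions;
        }
    }

It would be registered with conf.setPartitionerClass(SegmentPartitioner.class); the reducer then checks the prefix of each key to choose which computation to run.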


how to use two reduce functions?

2008-02-22 Thread ma qiang
Hi all,
I have a program that needs to use two reduce functions; who can tell me how?
Thank you!

Qiang


how to set the result of the first mapreduce program as the input of the second mapreduce program?

2008-02-20 Thread ma qiang
Hi all:
 Here I have two MapReduce programs. I need to use the result of the
first program to compute other values which are generated in
the second program, and this intermediate result does not need
to be saved. So I want to run the second program automatically, using the
output of the first program as the input of the second
program. Who can tell me how?
 Thanks!
 Best Wishes!

Qiang
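
A minimal sketch of one way to chain the two jobs so the second starts automatically on the first one's output, assuming the JobConf input/output path setters of this era (later releases moved these onto FileInputFormat/FileOutputFormat). The intermediate directory is just scratch space that is deleted at the end; the class and path names are placeholders, and the real mapper/reducer classes go where the comments indicate.

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.IdentityMapper;

    public class TwoPassDriver {
        public static void main(String[] args) throws Exception {
            Path input = new Path(args[0]);
            Path intermediate = new Path(args[1]);   // scratch output of pass 1
            Path output = new Path(args[2]);

            JobConf first = new JobConf(TwoPassDriver.class);
            first.setJobName("pass-1");
            // first.setMapperClass(...); first.setReducerClass(...);  // your pass-1 classes
            first.setInputPath(input);
            first.setOutputPath(intermediate);
            JobClient.runJob(first);                 // blocks until pass 1 finishes

            JobConf second = new JobConf(TwoPassDriver.class);
            second.setJobName("pass-2");
            second.setMapperClass(IdentityMapper.class);   // only forwards pass-1 records
            // second.setReducerClass(...);                // your pass-2 reducer
            second.setInputPath(intermediate);
            second.setOutputPath(output);
            JobClient.runJob(second);

            // the intermediate result is not needed once pass 2 has run
            FileSystem.get(second).delete(intermediate);
        }
    }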


Re: why read HBase in map function is so slow?

2008-02-01 Thread ma qiang
About twice as long, or more. I guess the time spent connecting to the table is too large.



On Feb 1, 2008 8:15 PM, edward yoon <[EMAIL PROTECTED]> wrote:
> How long does it take to get the row data from table?
>
>
> On 2/1/08, ma qiang <[EMAIL PROTECTED]> wrote:
> > Hi all:
> > I have a MapReduce program, and in my map function I need to use
> > some parameters which are read from another table in HBase using the HTable
> > class. As a result, I find that this program runs very slowly. Who can tell
> > me why, and how to solve this problem?
> >Thank you very much!
> >Best Wishes!
> >
>
>
> --
> B. Regards,
> Edward yoon @ NHN, corp.
>


why read HBase in map function is so slow?

2008-02-01 Thread ma qiang
Hi all:
 I have a MapReduce program, and in my map function I need to use
some parameters which are read from another table in HBase using the HTable
class. As a result, I find that this program runs very slowly. Who can tell
me why, and how to solve this problem?
Thank you very much!
Best Wishes!


about the exception in mapreduce program?

2008-01-31 Thread ma qiang
Hi all:
  I have run into the following problem:
  My map function reads from a table in HBase, then merges several
strings and finally saves these strings into another HBase table. The
number of strings and the length of each string are large. After ten
minutes, Hadoop prints the error "out of memory, java heap is not
enough". The program was tested with small strings and there was no
error, but when the number and length of the strings became large, the
error happened. I installed Hadoop in non-distributed mode, and
my computer has 2 GB of memory, which should be enough for my
simple program in theory.
 Who can tell me why?
 Thank you very much!


Best Wishes!


Re: Hbase Timestamp Error

2008-01-31 Thread ma qiang
Additionally, I can only find version 0.15.3 among the releases.


On Feb 1, 2008 12:05 PM, ma qiang <[EMAIL PROTECTED]> wrote:
> Where can we get the higher version ? Which is the latest version?
> 0.16 or 0.15.2?
> Thank you !
>
>
>
>
> On Feb 1, 2008 6:58 AM, edward yoon <[EMAIL PROTECTED]> wrote:
> > It requires a higher version than Hadoop 0.16.
> >
> >
> > On 1/31/08, Peeyush Bishnoi <[EMAIL PROTECTED]> wrote:
> > > Hi ,
> > >
> > > While inserting the data into Hbase , timestamp is giving problem.
> > >
> > > Hbase> select * from test;
> > > +-----+---------+------+
> > > | Row | Column  | Cell |
> > > +-----+---------+------+
> > > | r1  | a:      | 10   |
> > > | r1  | b:      | 20   |
> > > | r1  | c:      | 30   |
> > > +-----+---------+------+
> > >
> > > 3 row(s) in set (0.19 sec)
> > >
> > > Hbase> insert into test(a,b,c) values ('1','2','3') where row='r1'
> > > TIMESTAMP '200';
> > > Syntax error : Type 'help;' for usage.
> > > Message : Encountered "TIMESTAMP" at line 1, column 62.
> > > Hbase>
> > >
> > > But data gets inserted
> > >
> > > Hbase> insert into test(a,b,c) values ('1','2','3') where
> > > row='r1' ;
> > > 1 row inserted successfully. (0.01 sec)
> > > Hbase>
> > >
> > > Hbase> select * from test;
> > > +-----+---------+------+
> > > | Row | Column  | Cell |
> > > +-----+---------+------+
> > > | r1  | a:      | 1    |
> > > | r1  | b:      | 2    |
> > > | r1  | c:      | 3    |
> > > +-----+---------+------+
> > >
> > > 3 row(s) in set (0.09 sec)
> > > Hbase>
> > >
> > > But my old data in the column gets overridden by the new values. Can anyone
> > > look into this and give their valuable suggestions on why this is so?
> > >
> > > Why is the timestamp not honored at all, even though it is claimed to be
> > > working at the following URL: http://wiki.apache.org/hadoop/Hbase/HbaseShell/ ?
> > >
> > > So please help me out with valuable solutions and example.
> > >
> > > Thanks
> > >
> > > ---
> > > Peeyush
> > >
> > >
> > >
> > >
> > >
> >
> >
> > --
> > B. Regards,
> > Edward yoon @ NHN, corp.
> >
>


Re: Hbase Timestamp Error

2008-01-31 Thread ma qiang
Where can we get the higher version ? Which is the latest version?
0.16 or 0.15.2?
Thank you !



On Feb 1, 2008 6:58 AM, edward yoon <[EMAIL PROTECTED]> wrote:
> It requires a higher version than Hadoop 0.16.
>
>
> On 1/31/08, Peeyush Bishnoi <[EMAIL PROTECTED]> wrote:
> > Hi ,
> >
> > While inserting the data into Hbase , timestamp is giving problem.
> >
> > Hbase> select * from test;
> > +-----+---------+------+
> > | Row | Column  | Cell |
> > +-----+---------+------+
> > | r1  | a:      | 10   |
> > | r1  | b:      | 20   |
> > | r1  | c:      | 30   |
> > +-----+---------+------+
> >
> > 3 row(s) in set (0.19 sec)
> >
> > Hbase> insert into test(a,b,c) values ('1','2','3') where row='r1'
> > TIMESTAMP '200';
> > Syntax error : Type 'help;' for usage.
> > Message : Encountered "TIMESTAMP" at line 1, column 62.
> > Hbase>
> >
> > But data gets inserted
> >
> > Hbase> insert into test(a,b,c) values ('1','2','3') where
> > row='r1' ;
> > 1 row inserted successfully. (0.01 sec)
> > Hbase>
> >
> > Hbase> select * from test;
> > +-----+---------+------+
> > | Row | Column  | Cell |
> > +-----+---------+------+
> > | r1  | a:      | 1    |
> > | r1  | b:      | 2    |
> > | r1  | c:      | 3    |
> > +-----+---------+------+
> >
> > 3 row(s) in set (0.09 sec)
> > Hbase>
> >
> > But my old data in the column gets overridden by the new values. Can anyone
> > look into this and give their valuable suggestions on why this is so?
> >
> > Why is the timestamp not honored at all, even though it is claimed to be
> > working at the following URL: http://wiki.apache.org/hadoop/Hbase/HbaseShell/ ?
> >
> > So please help me out with valuable solutions and example.
> >
> > Thanks
> >
> > ---
> > Peeyush
> >
> >
> >
> >
> >
>
>
> --
> B. Regards,
> Edward yoon @ NHN, corp.
>


how to set the output in map function directly to HBase?

2008-01-31 Thread ma qiang
Hi all:
I want to send the output of my map function directly to HBase, since I
don't need a reduce function. I have used conf.setNumReduceTasks(0),
but then the map function can't run any more. Who can tell me how?
Thanks very much!
Best wishes!


how to see the size of a table in HBase

2008-01-30 Thread ma qiang
HI all:
  I have a table in HBase. Who can tell me how to see its size? Thank you!

Best Wishes!


how to stop regionserver

2008-01-24 Thread ma qiang
Hi all;
 When I start my HBase, the following error is printed: localhost:
regionserver running as process 6893. Stop it first.

Can you tell me how to solve this problem? Why is the regionserver
still running after I stop HBase?

Best Wishes


Re: can't connect to HBase any more

2008-01-23 Thread ma qiang
I use a command in the shell like select * from tablename; the exception
is printed as below:


org.apache.hadoop.hbase.NotServingRegionException:
org.apache.hadoop.hbase.NotServingRegionException:
mq3,,-872142061201302060
at 
org.apache.hadoop.hbase.HRegionServer.getRegion(HRegionServer.java:1327)
at 
org.apache.hadoop.hbase.HRegionServer.getRegion(HRegionServer.java:1299)
at 
org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1163)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at 
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
at 
org.apache.hadoop.hbase.HTable$ClientScanner.nextScanner(HTable.java:796)
at org.apache.hadoop.hbase.HTable$ClientScanner.<init>(HTable.java:754)
at org.apache.hadoop.hbase.HTable.obtainScanner(HTable.java:451)
at org.apache.hadoop.hbase.HTable.obtainScanner(HTable.java:388)
at 
org.apache.hadoop.hbase.shell.SelectCommand.scanPrint(SelectCommand.java:199)
at 
org.apache.hadoop.hbase.shell.SelectCommand.execute(SelectCommand.java:91)
at org.apache.hadoop.hbase.Shell.main(Shell.java:96)

On Jan 24, 2008 2:21 PM, ma qiang <[EMAIL PROTECTED]> wrote:
> I'm sure I started Hadoop and HBase using the commands.
> I verified that the master is running: admin.isMasterRunning() == true.
> I have looked at the HBase log, and I find the following messages in it;
>   can you help me solve this problem?  Thank you very much!
>
>
> 2008-01-23 21:34:29,679 INFO
> org.apache.hadoop.hbase.HMaster$RootScanner: HMaster.rootScanner
> exiting
> 2008-01-23 21:34:31,065 INFO org.apache.hadoop.hbase.HMaster: Waiting
> on following regionserver(s) to go down (or region server lease
> expiration, whichever happens first): [address: 127.0.1.1:60020,
> startcode: -4600488168642437787, load: (requests: 0 regions: 7)]
> 2008-01-23 21:34:35,743 INFO
> org.apache.hadoop.hbase.HMaster$MetaScanner: HMaster.metaScanner
> exiting
> 2008-01-23 21:34:41,065 INFO org.apache.hadoop.hbase.HMaster: Waiting
> on following regionserver(s) to go down (or region server lease
> expiration, whichever happens first): [address: 127.0.1.1:60020,
> startcode: -4600488168642437787, load: (requests: 0 regions: 7)]
> 2008-01-23 21:34:51,065 INFO org.apache.hadoop.hbase.HMaster: Waiting
> on following regionserver(s) to go down (or region server lease
> expiration, whichever happens first): [address: 127.0.1.1:60020,
> startcode: -4600488168642437787, load: (requests: 0 regions: 7)]
> 2008-01-23 21:34:58,682 INFO org.apache.hadoop.hbase.Leases:
> HMaster.leaseChecker lease expired 1377822776/1377822776
> 2008-01-23 21:34:58,682 INFO org.apache.hadoop.hbase.HMaster:
> 127.0.1.1:60020 lease expired
> 2008-01-23 21:34:58,682 INFO org.apache.hadoop.hbase.HMaster: Stopping
> infoServer
> 2008-01-23 21:34:58,683 INFO org.mortbay.util.ThreadedServer: Stopping
> Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60010]
> 2008-01-23 21:34:58,750 INFO org.mortbay.http.SocketListener: Stopped
> SocketListener on 0.0.0.0:60010
> 2008-01-23 21:34:58,791 INFO org.mortbay.util.Container: Stopped
> HttpContext[/static,/static]
> 2008-01-23 21:34:58,845 INFO org.mortbay.util.Container: Stopped
> HttpContext[/logs,/logs]
> 2008-01-23 21:34:58,845 INFO org.mortbay.util.Container: Stopped
> [EMAIL PROTECTED]
> 2008-01-23 21:34:58,944 INFO org.mortbay.util.Container: Stopped
> WebApplicationContext[/,/]
> 2008-01-23 21:34:58,944 INFO org.mortbay.util.Container: Stopped
> [EMAIL PROTECTED]
> 2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: Stopping
> server on 6
> 2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 6: exiting
> 2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 6: exiting
> 2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 6 on 6: exiting
> 2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 9 on 6: exiting
> 2008-01-23 21:34:58,955 INFO org.apache.hadoop.ipc.Serve

Re: can't connect to HBase any more

2008-01-23 Thread ma qiang
I'm sure I started Hadoop and HBase using the commands.
I verified that the master is running: admin.isMasterRunning() == true.
I have looked at the HBase log, and I find the following messages in it;
  can you help me solve this problem?  Thank you very much!


2008-01-23 21:34:29,679 INFO
org.apache.hadoop.hbase.HMaster$RootScanner: HMaster.rootScanner
exiting
2008-01-23 21:34:31,065 INFO org.apache.hadoop.hbase.HMaster: Waiting
on following regionserver(s) to go down (or region server lease
expiration, whichever happens first): [address: 127.0.1.1:60020,
startcode: -4600488168642437787, load: (requests: 0 regions: 7)]
2008-01-23 21:34:35,743 INFO
org.apache.hadoop.hbase.HMaster$MetaScanner: HMaster.metaScanner
exiting
2008-01-23 21:34:41,065 INFO org.apache.hadoop.hbase.HMaster: Waiting
on following regionserver(s) to go down (or region server lease
expiration, whichever happens first): [address: 127.0.1.1:60020,
startcode: -4600488168642437787, load: (requests: 0 regions: 7)]
2008-01-23 21:34:51,065 INFO org.apache.hadoop.hbase.HMaster: Waiting
on following regionserver(s) to go down (or region server lease
expiration, whichever happens first): [address: 127.0.1.1:60020,
startcode: -4600488168642437787, load: (requests: 0 regions: 7)]
2008-01-23 21:34:58,682 INFO org.apache.hadoop.hbase.Leases:
HMaster.leaseChecker lease expired 1377822776/1377822776
2008-01-23 21:34:58,682 INFO org.apache.hadoop.hbase.HMaster:
127.0.1.1:60020 lease expired
2008-01-23 21:34:58,682 INFO org.apache.hadoop.hbase.HMaster: Stopping
infoServer
2008-01-23 21:34:58,683 INFO org.mortbay.util.ThreadedServer: Stopping
Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60010]
2008-01-23 21:34:58,750 INFO org.mortbay.http.SocketListener: Stopped
SocketListener on 0.0.0.0:60010
2008-01-23 21:34:58,791 INFO org.mortbay.util.Container: Stopped
HttpContext[/static,/static]
2008-01-23 21:34:58,845 INFO org.mortbay.util.Container: Stopped
HttpContext[/logs,/logs]
2008-01-23 21:34:58,845 INFO org.mortbay.util.Container: Stopped
[EMAIL PROTECTED]
2008-01-23 21:34:58,944 INFO org.mortbay.util.Container: Stopped
WebApplicationContext[/,/]
2008-01-23 21:34:58,944 INFO org.mortbay.util.Container: Stopped
[EMAIL PROTECTED]
2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: Stopping
server on 6
2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 6: exiting
2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3 on 6: exiting
2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 6: exiting
2008-01-23 21:34:58,944 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 9 on 6: exiting
2008-01-23 21:34:58,955 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 8 on 6: exiting
2008-01-23 21:34:58,955 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 7 on 6: exiting
2008-01-23 21:34:58,955 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 5 on 6: exiting
2008-01-23 21:34:58,955 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 4 on 6: exiting
2008-01-23 21:34:58,955 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 6: exiting
2008-01-23 21:34:58,955 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 6: exiting
2008-01-23 21:34:58,956 INFO org.apache.hadoop.hbase.Leases: HMaster
closing leases
2008-01-23 21:34:58,956 INFO
org.apache.hadoop.hbase.Leases$LeaseMonitor: HMaster.leaseChecker
exiting
2008-01-23 21:34:58,956 INFO org.apache.hadoop.hbase.Leases: HMaster
closed leases
2008-01-23 21:34:58,956 INFO org.apache.hadoop.hbase.HMaster: HMaster
main thread exiting
2008-01-23 21:34:58,957 INFO org.apache.hadoop.ipc.Server: Stopping
IPC Server listener on 6






On Jan 24, 2008 1:25 PM, stack <[EMAIL PROTECTED]> wrote:
> Is hbase running?
>
> Check your logs to see if you can figure why client is unable to
> connect.  See $HADOOP_HOME/logs.  Look in the hbase* logs.
>
> St.Ack
>
>
>
> ma qiang wrote:
> > I connected to HBase using the HBaseAdmin and HTable classes successfully, but
> > suddenly I can't manipulate HBase any more, and the error is printed as below:
> >
> > java.net.SocketTimeoutException: timed out waiting for rpc response
> >   at org.apache.hadoop.ipc.Client.call(Client.java:484)
> >   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> >   at $Proxy0.deleteTable(Unknown Source)
> >   at org.apache.hadoop.hbase.HBaseAdmin.deleteTable(HBaseAdmin.java:169)
> >   at hbasetest.DeleteTable.main(DeleteTable.java:29)
> >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >   at 
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >   at 
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >   at java.lang.reflect.Method.invoke(Met

can't connect to HBase any more

2008-01-23 Thread ma qiang
I connected to HBase using the HBaseAdmin and HTable classes successfully, but
suddenly I can't manipulate HBase any more, and the error is printed as below:

java.net.SocketTimeoutException: timed out waiting for rpc response
at org.apache.hadoop.ipc.Client.call(Client.java:484)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
at $Proxy0.deleteTable(Unknown Source)
at org.apache.hadoop.hbase.HBaseAdmin.deleteTable(HBaseAdmin.java:169)
at hbasetest.DeleteTable.main(DeleteTable.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)

Can you tell me why, and how to solve this problem? Thank you very much.

Best wishes;


Re: other methods can run examples in hadoop ?

2008-01-23 Thread ma qiang
I call org.apache.hadoop.examples.WordCount.main(new
String[]{"in-dir","out-dir"}) in my program, but I see the error as
below:
   Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/commons/logging/LogFactory
at org.apache.hadoop.mapred.JobClient.(JobClient.java:147)

I have developed an SWT project to manage all the examples in Hadoop
instead of using the command line. If I choose an example such as WordCount,
it will run.




On Jan 24, 2008 11:12 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
>
> On Jan 23, 2008, at 6:31 PM, ma qiang wrote:
>
> >  I am trying to invoke it using program, Can you tell me how to
> > run the examples using other methods?
>
> What are you trying to accomplish? Of course you could call:
>
>   org.apache.hadoop.examples.WordCount.main(new String[]{"in-
> dir","out-dir"});
>
> is that what you are asking about?
>
> -- Owen
>


other methods can run examples in hadoop ?

2008-01-23 Thread ma qiang
Hi all:
 As we know, we can run an example in Hadoop such as WordCount
using a command like this:  bin/hadoop jar hadoop-*-examples.jar
wordcount [-m <#maps>] [-r <#reducers>] <in-dir> <out-dir>.
 I am trying to invoke it from a program. Can you tell me how to
run the examples using other methods?

Best Wishes!
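
A hedged sketch of one alternative way to launch a bundled example from a plain Java (for instance SWT) application: shell out to the bin/hadoop script, which assembles the Hadoop classpath (including commons-logging, whose absence causes the NoClassDefFoundError mentioned in the reply above) before running the job. The installation path, examples jar name, and input/output directories below are placeholders.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class ExampleLauncher {
        public static void main(String[] args) throws Exception {
            ProcessBuilder pb = new ProcessBuilder(
                    "/path/to/hadoop/bin/hadoop", "jar",
                    "/path/to/hadoop/hadoop-0.15.3-examples.jar",
                    "wordcount", "in-dir", "out-dir");
            pb.redirectErrorStream(true);
            Process p = pb.start();
            BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
            String line;
            while ((line = out.readLine()) != null) {
                System.out.println(line);   // job progress as printed by JobClient
            }
            System.out.println("exit code: " + p.waitFor());
        }
    }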


Re: How to invoke program in hadoop

2008-01-22 Thread ma qiang
Sorry, I have this problem:
I have some MapReduce code in Hadoop. Now I am writing some Java
code in my own app that needs to use the results of the MapReduce
code's computation. My app does not run in Hadoop; it is a general Java
application, such as an SWT project.




On Jan 23, 2008 12:25 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>
> Adding -Dmapred.job.tracker=local to your command line will cause a
> map-reduce program to be executed entirely locally.  This may not be what
> you mean.
>
> You can run a map-reduce program on a machine outside of the hadoop cluster
> by just copying the configuration files to the machine where you want to run
> your program.  Your program will contact the namenode and job tracker to
> access or store data and to start tasks.  You can get away with very little
> of the normal distribution, but I find it easiest to copy the entire
> distribution to the machine that runs the program.
>
>
>
> On 1/22/08 8:01 PM, "ma qiang" <[EMAIL PROTECTED]> wrote:
>
> > Dear colleagues:
> > I have some MapReduce code in Hadoop; now I want to invoke
> > this code from another place, outside of the Hadoop server. Who can tell me
> > how?
> >Thanks !
>
>
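
A minimal sketch of the suggestion above: a client program outside the cluster only needs configuration pointing at the namenode and jobtracker in order to read results from HDFS or submit jobs. fs.default.name and mapred.job.tracker are the standard property names of this era; the host names, ports, and the output path below are placeholders (the part file name in particular is assumed).

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class RemoteClusterClient {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(RemoteClusterClient.class);
            // point the client at the remote cluster
            conf.set("fs.default.name", "hdfs://namenode-host:9000");
            conf.set("mapred.job.tracker", "jobtracker-host:9001");

            // the desktop (e.g. SWT) program can now read a previous job's output
            // from HDFS directly, or submit new jobs with JobClient.runJob(conf)
            FileSystem fs = FileSystem.get(conf);
            Path result = new Path("/user/qiang/output/part-00000");
            System.out.println(result + " exists: " + fs.exists(result));
        }
    }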


How to invoke program in hadoop

2008-01-22 Thread ma qiang
Dear colleagues:
I have some MapReduce code in Hadoop; now I want to invoke
this code from another place, outside of the Hadoop server. Who can tell me
how?
   Thanks !


select hql in hbase

2008-01-22 Thread ma qiang
Hi all;
 I want to search for a row in a table where a column's value equals
something. In SQL there is a statement for this: select * from table_name
where column_value='sth', but in HQL I don't know how to do it. Who can tell
me?
 Thank you very much!

Best Wishes!


Re: Reduce hangs

2008-01-21 Thread ma qiang
Do we need to update our mailing list subscription from hadoop-user to core-user?

On Jan 22, 2008 2:56 PM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
>
> On Jan 21, 2008, at 10:08 PM, Joydeep Sen Sarma wrote:
>
> > hey list admins
> >
> > what's this list and how is it different from the other one?
> > (hadoop-user). i still see mails on the other one - so curious ..
>
> This is part of Hadoop's move to a top level project at Apache. The
> code previously known as Hadoop is now Hadoop core. Therefore, we have
> gone from:
>
> hadoop-{user,dev,[EMAIL PROTECTED]
>
> to:
>
>   core-{user,dev,[EMAIL PROTECTED]
>
> -- Owen
>