Hi, Esteban,

There is no region splitting in this cluster, since we set the region size
upper bound very high to prevent splits.
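
Concretely, that bound is the kind of thing set via hbase.hregion.max.filesize
in hbase-site.xml, along these lines (the exact value below is illustrative):

  <property>
    <name>hbase.hregion.max.filesize</name>
    <!-- illustrative value: far above any realistic region size, so splits never trigger -->
    <value>107374182400</value>
  </property>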

I think this happens for all the regions of this table.

I repeatedly ran "hdfs dfs -lsr /hbase/.hbase-snapshot/ss_rich_pin_data_v1"
while taking the snapshot; no region was able to write into this directory. I
also turned on DEBUG logging on the RegionServers, and they all just report a
timeout failure with no specific reason.
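
For reference, the polling loop was essentially the following sketch (the
one-second interval is arbitrary):

  while true; do
    hdfs dfs -lsr /hbase/.hbase-snapshot/ss_rich_pin_data_v1
    sleep 1
  done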

Thanks
Tian-Ying

On Tue, May 19, 2015 at 11:06 AM, Esteban Gutierrez <[email protected]>
wrote:

> Hi Tianying,
>
> Is this happening consistently in this region or is it happening randomly
> across other regions too? One possibility is that there was a split going
> on at the time you started to take the snapshot and it failed. If you look
> into /hbase/rich_pin_data_v1, can you find a directory named
> dff681880bb2b23d0351d6656a1dbbb9 in there?
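>
> For example, something like this hypothetical one-liner would confirm whether
> the directory exists:
>
>   hdfs dfs -ls /hbase/rich_pin_data_v1 | grep dff681880bb2b23d0351d6656a1dbbb9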
>
> cheers,
> esteban.
>
>
> --
> Cloudera, Inc.
>
>
> On Mon, May 18, 2015 at 11:12 PM, Tianying Chang <[email protected]>
> wrote:
>
> > Hi,
> >
> > We have a cluster that used to be able to take snapshots, but recently one
> > table started failing with the error below. Other tables on the same
> > cluster are fine.
> >
> > Any idea what could go wrong? Is the table unhealthy? But when I run hbase
> > hbck, it reports the cluster as healthy.
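> >
> > For what it's worth, the check was along these lines (-details just adds
> > per-region output):
> >
> >   hbase hbck -details rich_pin_data_v1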
> >
> > BTW, we are running 0.94.7, so we need to take a snapshot of the data and
> > export it to a new 0.94.26 cluster as part of the upgrade (and eventually
> > upgrade to 1.x).
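> >
> > The export step after a successful snapshot would be roughly the following
> > (the destination URI here is illustrative):
> >
> >   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
> >     -snapshot ss_rich_pin_data_v1 \
> >     -copy-to hdfs://new-cluster-nn:8020/hbase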
> >
> > Thanks
> > Tian-Ying
> >
> >
> > 2015-05-19 06:00:45,505 ERROR org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking snapshot { ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH } due to exception:No region directory found for region:{NAME => 'rich_pin_data_v1,,1389319134976.dff681880bb2b23d0351d6656a1dbbb9.', STARTKEY => '', ENDKEY => '001ff3a165ff571471603035ca7b4be9', ENCODED => dff681880bb2b23d0351d6656a1dbbb9,}
> > org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: No region directory found for region:{NAME => 'rich_pin_data_v1,,1389319134976.dff681880bb2b23d0351d6656a1dbbb9.', STARTKEY => '', ENDKEY => '001ff3a165ff571471603035ca7b4be9', ENCODED => dff681880bb2b23d0351d6656a1dbbb9,}
> >         at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegion(MasterSnapshotVerifier.java:167)
> >         at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:152)
> >         at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:115)
> >         at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:156)
> >         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >         at java.lang.Thread.run(Thread.java:662)
> > 2015-05-19 06:00:45,505 INFO org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Stop taking snapshot={ ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH } because: Failed to take snapshot '{ ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH }' due to exception
> > 2015-05-19 06:00:49,745 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 50 on 60000 caught: java.lang.ArrayIndexOutOfBoundsException: 2
> >         at java.util.Arrays$ArrayList.get(Arrays.java:3381)
> >         at java.util.Collections$UnmodifiableList.get(Collections.java:1152)
> >         at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$SnapshotDescription$Type.getValueDescriptor(HBaseProtos.java:99)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at com.google.protobuf.GeneratedMessage.invokeOrDie(GeneratedMessage.java:1369)
> >         at com.google.protobuf.GeneratedMessage.access$1400(GeneratedMessage.java:57)
> >         at com.google.protobuf.GeneratedMessage$FieldAccessorTable$SingularEnumFieldAccessor.get(GeneratedMessage.java:1670)
> >         at com.google.protobuf.GeneratedMessage.getField(GeneratedMessage.java:162)
> >         at com.google.protobuf.GeneratedMessage.getAllFieldsMutable(GeneratedMessage.java:113)
> >         at com.google.protobuf.GeneratedMessage.getAllFields(GeneratedMessage.java:152)
> >         at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:228)
> >         at com.google.protobuf.TextFormat$Printer.access$200(TextFormat.java:217)
> >         at com.google.protobuf.TextFormat.print(TextFormat.java:68)
> >         at com.google.protobuf.TextFormat.printToString(TextFormat.java:115)
> >         at com.google.protobuf.AbstractMessage.toString(AbstractMessage.java:86)
> >         at org.apache.hadoop.hbase.snapshot.HSnapshotDescription.toString(HSnapshotDescription.java:72)
> >         at java.lang.String.valueOf(String.java:2826)
> >         at java.lang.StringBuilder.append(StringBuilder.java:115)
> >         at org.apache.hadoop.hbase.ipc.Invocation.toString(Invocation.java:152)
> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Call.toString(HBaseServer.java:304)
> >         at java.lang.String.valueOf(String.java:2826)
> >         at java.lang.StringBuilder.append(StringBuilder.java:115)
> >
>
