Re: Should Taking A Snapshot Work Even If Balancer Is Moving A Few Regions Around?

2018-03-21 Thread Ted Yu
Looking at
hbase-client/src/main/java/org/apache/hadoop/hbase/client/Admin.java in
branch-1.4 :

  boolean[] setSplitOrMergeEnabled(final boolean enabled, final boolean synchronous,
      final MasterSwitchType... switchTypes) throws IOException;

  boolean isSplitOrMergeEnabled(final MasterSwitchType switchType) throws IOException;
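
For illustration, a minimal sketch of wrapping a snapshot with these switches,
assuming an already-open Connection named "conn"; the table and snapshot names
are made-up placeholders, and setBalancerRunning() is the branch-1 Admin call
for pausing the balancer:

  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.MasterSwitchType;

  try (Admin admin = conn.getAdmin()) {
    // Turn off splits, merges, and the balancer for the snapshot window.
    admin.setSplitOrMergeEnabled(false, true,
        MasterSwitchType.SPLIT, MasterSwitchType.MERGE);
    admin.setBalancerRunning(false, true);
    try {
      admin.snapshot("bigtable-snap", TableName.valueOf("bigtable"));
    } finally {
      // Restore normal operation even if the snapshot fails.
      admin.setBalancerRunning(true, true);
      admin.setSplitOrMergeEnabled(true, true,
          MasterSwitchType.SPLIT, MasterSwitchType.MERGE);
    }
  }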

Please also see the following script:

hbase-shell/src/main/ruby/shell/commands/splitormerge_switch.rb
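
For example, from the HBase shell (a sketch; the last command comes from the
companion splitormerge_enabled.rb script):

  hbase> splitormerge_switch 'SPLIT', false
  hbase> splitormerge_switch 'MERGE', false
  hbase> splitormerge_enabled 'SPLIT'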

FYI

On Wed, Mar 21, 2018 at 11:33 AM, Vladimir Rodionov  wrote:

> >>So my question is whether taking a snapshot is supposed to work even with
> >>regions being moved around. In our case it is usually only a couple here
> >>and there.
>
> No, if a region is moved, split, or merged during the snapshot operation,
> the snapshot will fail.
> This is why taking snapshots on a large table is a 50/50 game.
>
> Disabling the balancer, region merging, and region splitting before the
> snapshot should help. This works in 2.0.
>
> Not sure if the merge/split switch is available in 1.4.
>
> -Vlad
>
> On Tue, Mar 20, 2018 at 8:00 PM, Saad Mufti  wrote:
>
> > Hi,
> >
> > We are using HBase 1.4.0 on AWS EMR. Since snapshots are written to S3,
> > they take much longer than when using local disk. We have a cron script
> > that takes regular snapshots as backups, and they fail quite often on
> > our largest table, which takes close to an hour to snapshot.
> >
> > The only thing I usually notice in the errors is a message about a
> > region moving or closing.
> >
> > So my question is whether taking a snapshot is supposed to work even with
> > regions being moved around. In our case it is usually only a couple here
> > and there.
> >
> > Thanks.
> >
> > 
> > Saad
> >
>


Re: Should Taking A Snapshot Work Even If Balancer Is Moving A Few Regions Around?

2018-03-21 Thread Vladimir Rodionov
>>So my question is whether taking a snapshot is supposed to work even with
>>regions being moved around. In our case it is usually only a couple here
>>and there.

No, if a region is moved, split, or merged during the snapshot operation,
the snapshot will fail.
This is why taking snapshots on a large table is a 50/50 game.

Disabling the balancer, region merging, and region splitting before the
snapshot should help. This works in 2.0.

Not sure if the merge/split switch is available in 1.4.

-Vlad

On Tue, Mar 20, 2018 at 8:00 PM, Saad Mufti  wrote:

> Hi,
>
> We are using HBase 1.4.0 on AWS EMR. Since snapshots are written to S3,
> they take much longer than when using local disk. We have a cron script
> that takes regular snapshots as backups, and they fail quite often on our
> largest table, which takes close to an hour to snapshot.
>
> The only thing I usually notice in the errors is a message about a
> region moving or closing.
>
> So my question is whether taking a snapshot is supposed to work even with
> regions being moved around. In our case it is usually only a couple here
> and there.
>
> Thanks.
>
> 
> Saad
>


Connection refused to NameNode Port (8020) When using HBaseTestingUtility

2018-03-21 Thread Nkechi Achara
Hi All,

I receive the following error when using HBaseTestingUtility and starting
the mini cluster via startMiniHBaseCluster. As you can see, I am trying to
keep my HBase test framework entirely local, so I cannot understand what
could be going wrong with the networking.

An exception or error caused a run to abort: Call From
krakendev/127.0.0.1 to localhost.localdomain:8020 failed on connection
exception: java.net.ConnectException: Connection refused; For more
details see:  http://wiki.apache.org/hadoop/ConnectionRefused
  java.net.ConnectException: Call From krakendev/127.0.0.1 to
localhost.localdomain:8020 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1470)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2095)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1214)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1210)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1210)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:424)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1409)
at org.apache.hadoop.fs.FileSystem.deleteOnExit(FileSystem.java:1364)
at org.apache.hadoop.hbase.HBaseTestingUtility.getNewDataTestDirOnTestFS(HBaseTestingUtility.java:479)
at org.apache.hadoop.hbase.HBaseTestingUtility.setupDataTestDirOnTestFS(HBaseTestingUtility.java:458)
at org.apache.hadoop.hbase.HBaseTestingUtility.getDataTestDirOnTestFS(HBaseTestingUtility.java:431)
at org.apache.hadoop.hbase.HBaseTestingUtility.getDefaultRootDirPath(HBaseTestingUtility.java:1105)
at org.apache.hadoop.hbase.HBaseTestingUtility.createRootDir(HBaseTestingUtility.java:1136)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:973)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:951)
at com.thomsonreuters.kraken.medusa.service.MySharedHBaseCluster$class.beforeAll(MySharedHBaseCluster.scala:28)
at com.thomsonreuters.kraken.medusa.service.AlertStatusServiceSpec.beforeAll(AlertStatusServiceSpec.scala:12)
at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:212)
at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
at com.thomsonreuters.kraken.medusa.service.AlertStatusServiceSpec.org$scalatest$BeforeAndAfter$$super$run(AlertStatusServiceSpec.scala:7)
at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:258)
at com.thomsonreuters.kraken.medusa.service.AlertStatusServiceSpec.run(AlertStatusServiceSpec.scala:7)
at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
at org.scalatest.tools.Runner$$anonfun$do
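
For reference, here is a minimal, hypothetical bring-up of the test cluster
(not my actual test code). startMiniHBaseCluster() starts only HBase and
expects a test filesystem and ZooKeeper to already be running, whereas
startMiniCluster() also starts an in-process DFS and ZooKeeper:

  import org.apache.hadoop.hbase.HBaseTestingUtility;

  public class MiniClusterSketch {
    public static void main(String[] args) throws Exception {
      HBaseTestingUtility util = new HBaseTestingUtility();
      // Brings up mini ZooKeeper, mini DFS, and mini HBase together, so the
      // test root dir lives on the in-process filesystem instead of an
      // external HDFS at localhost:8020.
      util.startMiniCluster();
      try {
        // ... create tables and run test logic here ...
      } finally {
        util.shutdownMiniCluster();
      }
    }
  }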

Re: Scan problem

2018-03-21 Thread Yang Zhang
Thanks to all of you, your answers helped me a lot.

2018-03-19 22:31 GMT+08:00 Saad Mufti :

> Another option, if you have enough disk space/off-heap memory, is to
> enable the bucket cache to hold even more of your data, and to set the
> PREFETCH_BLOCKS_ON_OPEN => 'true' option on the column families you want
> always cached. That way HBase will prefetch your data into the bucket
> cache and your scan won't have that initial slowdown. If you want to do
> it globally for all column families, set the configuration flag
> "hbase.rs.prefetchblocksonopen" to "true". Keep in mind that if you do
> this, you should have enough bucket cache space for all your data;
> otherwise there will be a lot of useless eviction activity at HBase
> startup and even later.
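>
> As a sketch, the per-family option can also be set from the Java client,
> assuming an existing Admin named "admin"; the table and family names here
> are made up:
>
>   import org.apache.hadoop.hbase.HColumnDescriptor;
>   import org.apache.hadoop.hbase.HTableDescriptor;
>   import org.apache.hadoop.hbase.TableName;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   // Fetch the current descriptor so other family settings are preserved.
>   HTableDescriptor desc = admin.getTableDescriptor(TableName.valueOf("mytable"));
>   HColumnDescriptor cf = desc.getFamily(Bytes.toBytes("d"));
>   cf.setPrefetchBlocksOnOpen(true);  // PREFETCH_BLOCKS_ON_OPEN => 'true'
>   admin.modifyColumn(TableName.valueOf("mytable"), cf);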
>
> Also, where a region is located is heavily influenced by which region
> balancer you have chosen and how you have tuned it, in terms of how often
> it runs and other parameters. A split region will initially stay on the
> same region server, but your balancer, if and when it runs, can move it
> (and indeed any region) elsewhere to satisfy its criteria.
>
> Cheers.
>
> 
> Saad
>
>
> On Mon, Mar 19, 2018 at 1:14 AM, ramkrishna vasudevan <
> ramkrishna.s.vasude...@gmail.com> wrote:
>
> > Hi
> >
> > First regarding the scans,
> >
> > Generally the data resides in the store files, which are in HDFS. So
> > probably the first scan that you are doing is reading from HDFS, which
> > involves disk reads. Once the blocks are read, they are cached in the
> > block cache of HBase. Your further reads go through that, and hence you
> > see the later scans speed up.
> >
> > >> And another question about region split, I want to know which
> > >> RegionServer will load the new region after the split,
> > >> Will it be the same one as the old region?
> > Yes. Generally the same region server hosts it.
> >
> > In master the code is here:
> > https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
> >
> > You may need to understand the entire flow to know how the regions are
> > opened after a split.
> >
> > Regards
> > Ram
> >
> > On Sat, Mar 17, 2018 at 9:02 PM, Yang Zhang  wrote:
> >
> > > Hello everyone
> > >
> > > I am trying to do many scans using RegionScanner in a coprocessor,
> > > and every time the first scan costs about 10 times as much as the
> > > others. I don't know why this happens.
> > >
> > > OneBucket Scan cost is : 8794 ms Num is : 710
> > > OneBucket Scan cost is : 91 ms Num is : 776
> > > OneBucket Scan cost is : 87 ms Num is : 808
> > > OneBucket Scan cost is : 105 ms Num is : 748
> > > OneBucket Scan cost is : 68 ms Num is : 200
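> > >
> > > For clarity, this is roughly the pattern I mean; a hypothetical sketch
> > > (HBase 1.x coprocessor API, where "env" is the coprocessor's
> > > RegionCoprocessorEnvironment and the bucket boundaries are made up):
> > >
> > >   Scan scan = new Scan(Bytes.toBytes("bucket-00"), Bytes.toBytes("bucket-01"));
> > >   RegionScanner scanner = env.getRegion().getScanner(scan);
> > >   List<Cell> cells = new ArrayList<>();
> > >   try {
> > >     boolean more;
> > >     do {
> > >       more = scanner.next(cells);  // one row's cells per call
> > >       // ... aggregate the cells for this bucket ...
> > >       cells.clear();
> > >     } while (more);
> > >   } finally {
> > >     scanner.close();
> > >   }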
> > >
> > >
> > > And another question about region split: I want to know which
> > > RegionServer will load the new region after the split. Will it be the
> > > same one as the old region? Does anyone know where I can find the code
> > > to learn about that?
> > >
> > >
> > > Thanks for your help
> > >
> >
>