[jira] [Created] (HBASE-10389) Add namespace help info in table related shell commands

2014-01-20 Thread Jerry He (JIRA)
Jerry He created HBASE-10389:


 Summary: Add namespace help info in table related shell commands
 Key: HBASE-10389
 URL: https://issues.apache.org/jira/browse/HBASE-10389
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.96.1, 0.96.0
Reporter: Jerry He
 Fix For: 0.98.0, 0.96.2


Currently, the help info of the table-related shell commands does not mention
or show the namespace as part of the table name.
For example, to create a table:

{code}
hbase(main):001:0> help 'create'
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:

  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

Table configuration options can be put at the end.
Examples:

  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then
call methods.
{code}

We should document the usage of namespaces in these commands.
For example:

{code}
# namespace=foo and table qualifier=bar
create 'foo:bar', 'fam'

# namespace=default and table qualifier=bar
create 'bar', 'fam'
{code}
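The naming convention in the examples above can be sketched in a few lines; this is illustrative code, not HBase's actual table-name parsing (the class and method names here are made up for the sketch):

```java
// Illustrative sketch of the "namespace:qualifier" table-name convention
// used by the shell examples above. Not HBase code; names are hypothetical.
public class TableNameSketch {

    // Splits "foo:bar" into {"foo", "bar"}; an unqualified name such as
    // "bar" falls into the "default" namespace, i.e. {"default", "bar"}.
    public static String[] parse(String tableName) {
        int sep = tableName.indexOf(':');
        if (sep < 0) {
            return new String[] { "default", tableName };
        }
        return new String[] { tableName.substring(0, sep),
                              tableName.substring(sep + 1) };
    }

    public static void main(String[] args) {
        String[] qualified = parse("foo:bar");
        String[] unqualified = parse("bar");
        System.out.println(qualified[0] + "/" + qualified[1]);     // foo/bar
        System.out.println(unqualified[0] + "/" + unqualified[1]); // default/bar
    }
}
```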



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HBASE-10389) Add namespace help info in table related shell commands

2014-01-21 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He reassigned HBASE-10389:


Assignee: Jerry He



[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands

2014-01-21 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878319#comment-13878319
 ] 

Jerry He commented on HBASE-10389:
--

I will do an initial survey of those commands and gather comments from there.



[jira] [Created] (HBASE-10448) ZKUtil create and watch methods don't set watch in some cases

2014-01-31 Thread Jerry He (JIRA)
Jerry He created HBASE-10448:


 Summary: ZKUtil create and watch methods don't set watch in some cases
 Key: HBASE-10448
 URL: https://issues.apache.org/jira/browse/HBASE-10448
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1.1, 0.96.0
Reporter: Jerry He
 Fix For: 0.98.1


While using the ZKUtil methods during testing, I found that the watch was not
set when it should have been, based on the method names and comments:
createNodeIfNotExistsAndWatch
createEphemeralNodeAndWatch

For example, in createNodeIfNotExistsAndWatch():

{code}
public static boolean createNodeIfNotExistsAndWatch(
    ZooKeeperWatcher zkw, String znode, byte [] data)
    throws KeeperException {
  try {
    zkw.getRecoverableZooKeeper().create(znode, data, createACL(zkw, znode),
        CreateMode.PERSISTENT);
  } catch (KeeperException.NodeExistsException nee) {
    try {
      zkw.getRecoverableZooKeeper().exists(znode, zkw);
    } catch (InterruptedException e) {
      zkw.interruptedException(e);
      return false;
    }
    return false;
  } catch (InterruptedException e) {
    zkw.interruptedException(e);
    return false;
  }
  return true;
}
{code}

The watch is only set, via the exists() call, when the node already exists.
Similarly in createEphemeralNodeAndWatch():
{code}
public static boolean createEphemeralNodeAndWatch(ZooKeeperWatcher zkw,
    String znode, byte [] data)
    throws KeeperException {
  try {
    zkw.getRecoverableZooKeeper().create(znode, data, createACL(zkw, znode),
        CreateMode.EPHEMERAL);
  } catch (KeeperException.NodeExistsException nee) {
    if (!watchAndCheckExists(zkw, znode)) {
      // It did exist but now it doesn't, try again
      return createEphemeralNodeAndWatch(zkw, znode, data);
    }
    return false;
  } catch (InterruptedException e) {
    LOG.info("Interrupted", e);
    Thread.currentThread().interrupt();
  }
  return true;
}
{code}
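One way to fix this is to register the watch on both paths, after a successful create as well as on NodeExistsException. The sketch below shows only the control flow; FakeZk is a hypothetical stand-in for the ZooKeeper client (the real RecoverableZooKeeper needs a live ensemble), so its names are made up for illustration:

```java
import java.util.HashSet;
import java.util.Set;

// Control-flow sketch of the fix. FakeZk is a hypothetical stand-in for the
// ZooKeeper client: it only tracks which znodes exist and which are watched.
public class WatchFixSketch {

    static class FakeZk {
        final Set<String> nodes = new HashSet<>();
        final Set<String> watched = new HashSet<>();

        // Mirrors create(): fails if the node already exists; sets no watch.
        void create(String znode) {
            if (!nodes.add(znode)) {
                throw new IllegalStateException("NodeExists");
            }
        }

        // Mirrors exists(znode, watcher): registers a watch in either case.
        boolean existsAndWatch(String znode) {
            watched.add(znode);
            return nodes.contains(znode);
        }
    }

    // Fixed flow: the watch is registered whether or not create() succeeded,
    // unlike the original, which watched only on the NodeExists path.
    static boolean createNodeIfNotExistsAndWatch(FakeZk zk, String znode) {
        boolean created = true;
        try {
            zk.create(znode);
        } catch (IllegalStateException nodeExists) {
            created = false;
        }
        zk.existsAndWatch(znode); // always set the watch
        return created;
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        boolean first = createNodeIfNotExistsAndWatch(zk, "/hbase/demo");
        boolean second = createNodeIfNotExistsAndWatch(zk, "/hbase/demo");
        // First call creates and watches; second finds the node existing
        // but still registers the watch.
        System.out.println(first + " " + second + " "
            + zk.watched.contains("/hbase/demo"));
    }
}
```

The same "always call exists() with the watcher afterwards" shape applies to createEphemeralNodeAndWatch() as well.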






[jira] [Commented] (HBASE-10448) ZKUtil create and watch methods don't set watch in some cases

2014-01-31 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888016#comment-13888016
 ] 

Jerry He commented on HBASE-10448:
--

I wonder how the callers/users of these two methods actually worked.
My guess is that they don't really depend on the watches being set in the
cases where the watches are not set.



[jira] [Commented] (HBASE-10448) ZKUtil create and watch methods don't set watch in some cases

2014-01-31 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888441#comment-13888441
 ] 

Jerry He commented on HBASE-10448:
--

Found HBASE-8937, which reported a similar problem.



[jira] [Updated] (HBASE-10448) ZKUtil create and watch methods don't set watch in some cases

2014-01-31 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10448:
-

Attachment: HBASE-10448-trunk.patch



[jira] [Updated] (HBASE-10448) ZKUtil create and watch methods don't set watch in some cases

2014-01-31 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10448:
-

Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-10448) ZKUtil create and watch methods don't set watch in some cases

2014-01-31 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888455#comment-13888455
 ] 

Jerry He commented on HBASE-10448:
--

Attached a patch that sets the watch whether or not we get a NodeExistsException.
I tried to avoid any other change in the behavior of these two methods.



[jira] [Commented] (HBASE-10448) ZKUtil create and watch methods don't set watch in some cases

2014-02-01 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888792#comment-13888792
 ] 

Jerry He commented on HBASE-10448:
--

Thanks, Ted, Andrew.
Could you also mark HBASE-8937 as resolved as a duplicate, so that we have
closure on that one too?

 ZKUtil create and watch methods don't set watch in some cases
 -

 Key: HBASE-10448
 URL: https://issues.apache.org/jira/browse/HBASE-10448
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.0, 0.96.1.1
Reporter: Jerry He
Assignee: Jerry He
 Fix For: 0.98.0, 0.99.0

 Attachments: HBASE-10448-trunk.patch




[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-03 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10389:
-

Attachment: HBASE-10389-trunk.patch



[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-03 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10389:
-

Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-03 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890267#comment-13890267
 ] 

Jerry He commented on HBASE-10389:
--

I attached a patch which basically adds a namespace-qualified table example to
each of the relevant commands, without much explanation.
Please comment to let me know whether we need to do more, or less.



[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-04 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10389:
-

Status: Open  (was: Patch Available)



[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-04 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10389:
-

Attachment: (was: HBASE-10389-trunk.patch)



[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-04 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10389:
-

Attachment: HBASE-10389-trunk.patch



[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-04 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10389:
-

Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891017#comment-13891017
 ] 

Jerry He commented on HBASE-10389:
--

Re-formatted the patch, and re-attached it.



[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands

2014-02-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891488#comment-13891488
 ] 

Jerry He commented on HBASE-10389:
--

All the following work as desired:

hbase> disable_all 't.*'
hbase> disable_all 'ns:t.*'
hbase> disable_all 'ns:.*'

and this one will include my_namespace1:table1, my_namespace2:table2 and my_table:
  hbase> disable_all 'my.*'
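A minimal stand-alone sketch (illustrative only, not the shell's implementation) of why a pattern like 'my.*' crosses namespaces when it is matched against the full 'namespace:qualifier' name:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class DisableAllMatch {
    // Illustrative: apply a regex to full 'namespace:qualifier' names, the
    // way the disable_all patterns above match table names.
    public static List<String> matching(List<String> fullNames, String regex) {
        Pattern p = Pattern.compile(regex);
        return fullNames.stream()
                        .filter(name -> p.matcher(name).matches())
                        .collect(Collectors.toList());
    }
}
```

With tables my_namespace1:table1, my_namespace2:table2, my_table and other:t1, the pattern 'my.*' matches the first three but not other:t1, while 'ns:t.*' matches only tables in namespace ns whose qualifier starts with t.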



[jira] [Created] (HBASE-10492) open daughter regions can unpredictably take long time

2014-02-10 Thread Jerry He (JIRA)
Jerry He created HBASE-10492:


 Summary: open daughter regions can unpredictably take long time
 Key: HBASE-10492
 URL: https://issues.apache.org/jira/browse/HBASE-10492
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.0
Reporter: Jerry He


During stress testing I have seen a client getting:

RetriesExhaustedWithDetailsException: Failed 748 actions: 
NotServingRegionException

In the master log, 2014-02-08 20:43 is the timestamp of the transition from 
OFFLINE to SPLITTING_NEW, and 2014-02-08 21:41 is the timestamp from 
SPLITTING_NEW to OPEN.

The corresponding time period in the region server log is:

{code}
2014-02-08 20:44:12,662 WARN 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
found for region: 010c1981882d1a59201af5e2dc589d44
2014-02-08 20:44:12,666 WARN 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
found for region: c2eb9b7971ca7f3fed3da86df5b788e7
{code}
There were no INFO messages related to these two regions until the following 
(note at the end: Split took 57mins, 16sec):

{code}
2014-02-08 21:41:14,029 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Onlined c2eb9b7971ca7f3fed3da86df5b788e7; next sequenceid=213355
2014-02-08 21:41:14,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Onlined 010c1981882d1a59201af5e2dc589d44; next sequenceid=213354
2014-02-08 21:41:14,032 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for 
region=tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
Updated row 
tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
 with server=hdtest208.svl.ibm.com,60020,1391887547473
2014-02-08 21:41:14,054 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy 
task for 
tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
2014-02-08 21:41:14,054 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for 
region=tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44.
2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Completed compaction of 10 file(s) in cf of 
tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c.
 into 451be6df8c604993ae540b808d9cfa08(size=72.8 M), total size for store is 
2.4 G. This selection was in queue for 0sec, and took 1mins, 40sec to execute.
2014-02-08 21:41:14,059 INFO 
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed compaction: 
Request = 
regionName=tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c.,
 storeName=cf, fileCount=10, fileSize=94.1 M, priority=9883, 
time=1391924373278861000; duration=1mins, 40sec
2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Starting compaction on cf in region 
tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Starting compaction of 10 file(s) in cf of 
tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
 into 
tmpdir=gpfs:/hbase/data/default/tpch_hb_1000_2.lineitem/c2eb9b7971ca7f3fed3da86df5b788e7/.tmp,
 totalSize=709.7 M
2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
Updated row 
tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. with 
server=hdtest208.svl.ibm.com,60020,1391887547473
2014-02-08 21:41:14,066 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy 
task for 
tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44.
2014-02-08 21:41:14,190 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: 
Region split, hbase:meta updated, and report to master. 
Parent=tpch_hb_1000_2.lineitem,,1391918508561.b576e8db65d56ec08db5ca900587c28d.,
 new regions: 
tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44., 
tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7..
 Split took 57mins, 16sec
{code}




[jira] [Commented] (HBASE-10492) open daughter regions can unpredictably take long time

2014-02-10 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896937#comment-13896937
 ] 

Jerry He commented on HBASE-10492:
--

The problem is probably caused by this part of the code in 
SplitTransaction.openDaughters()

{code}
  // Open daughters in parallel.
  DaughterOpener aOpener = new DaughterOpener(server, a);
  DaughterOpener bOpener = new DaughterOpener(server, b);
  aOpener.start();
  bOpener.start();
  try {
aOpener.join();
bOpener.join();
  } 
{code}

We are opening the daughter regions in separate new threads.  It is possible, 
although rare, that due to issues like thread scheduling the daughter regions 
are not opened until after a long time.
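A stripped-down sketch of the pattern (hypothetical class name; the real code lives in SplitTransaction): both joins must return before the split completes, so a scheduling or I/O delay in either opener stalls the whole operation:

```java
public class DaughterOpenSketch {
    // Simplified stand-in for SplitTransaction.openDaughters(): open both
    // daughter regions in parallel, then block until BOTH have finished.
    public static boolean openBoth(Runnable openA, Runnable openB) {
        Thread aOpener = new Thread(openA, "DaughterOpener-a");
        Thread bOpener = new Thread(openB, "DaughterOpener-b");
        aOpener.start();
        bOpener.start();
        try {
            aOpener.join();  // returns only once daughter 'a' is open
            bOpener.join();  // returns only once daughter 'b' is open
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;  // caller proceeds only after both openers finish
    }
}
```

The split is therefore only as fast as the slower of the two opener threads, which is why a stalled opener shows up as a very long overall split time.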


[jira] [Commented] (HBASE-10492) open daughter regions can unpredictably take long time

2014-02-11 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13898262#comment-13898262
 ] 

Jerry He commented on HBASE-10492:
--

The machines have 24 CPUs and 48 GB of memory, running 
Red Hat Enterprise Linux Server release 6.4 (Santiago) 2.6.32-358.el6.x86_64 
with IBM JDK 6, and there are 5 region servers (each also running a datanode 
and task tracker).  The load is an MR job loading data.
I have been trying to reproduce the long delay in opening the daughter regions.
With 'org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 200' I have 
seen delays of up to 6 mins.
See the log below (from 2014-02-11 02:35:52 to 2014-02-11 02:41:14 at the end):
{code}
2014-02-11 02:35:52,473 WARN 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
found for region: 10a421ac8075a42cbcb53bdc393c8e8c
2014-02-11 02:35:52,479 WARN 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
found for region: 5ff07e59d13c99ca14408807a6e61722
2014-02-11 02:35:52,589 INFO 
org.apache.hadoop.hbase.regionserver.compactions.CompactionConfiguration: size 
[4194304, 9223372036854775807); files [3, 10); ratio 1.20; off-peak ratio 
5.00; throttle point 2684354560; delete expired; major period 0, major 
jitter 0.50
2014-02-11 02:35:52,596 INFO 
org.apache.hadoop.hbase.regionserver.compactions.CompactionConfiguration: size 
[4194304, 9223372036854775807); files [3, 10); ratio 1.20; off-peak ratio 
5.00; throttle point 2684354560; delete expired; major period 0, major 
jitter 0.50
2014-02-11 02:35:55,458 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=4289924, memsize=256.6 M, hasBloomFilter=true, into tmp file 
gpfs:/hbase/data/default/TestTable/ed4d9fb392ae52c1a406a221defc6b00/.tmp/9e2cb318b0114248b9c62948cf47ac5b
2014-02-11 02:36:37,894 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=4289926, memsize=153.1 M, hasBloomFilter=true, into tmp file 
gpfs:/hbase/data/default/TestTable/110cc21c77569d595f7717b8c75fbf66/.tmp/4e55d6ba4b5644838163101f2ba20fdb
2014-02-11 02:36:53,067 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
Rolled WAL 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392114789609
 with entries=416, filesize=578.7 M; new WAL 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392114958416
2014-02-11 02:36:53,067 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
moving old hlog file 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112795409
 whose highest sequenceid is 4285071 to 
/hbase/oldWALs/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112795409
2014-02-11 02:36:53,162 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
moving old hlog file 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112818204
 whose highest sequenceid is 4285169 to 
/hbase/oldWALs/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112818204
2014-02-11 02:36:53,210 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
moving old hlog file 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112839023
 whose highest sequenceid is 4285266 to 
/hbase/oldWALs/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112839023
2014-02-11 02:37:13,297 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
moving old hlog file 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112862511
 whose highest sequenceid is 4285362 to 
/hbase/oldWALs/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112862511
2014-02-11 02:37:13,326 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
moving old hlog file 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112871587
 whose highest sequenceid is 4285453 to 
/hbase/oldWALs/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112871587
2014-02-11 02:37:13,383 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
moving old hlog file 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112877894
 whose highest sequenceid is 4285546 to 
/hbase/oldWALs/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112877894
2014-02-11 02:37:33,474 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
moving old hlog file 
/hbase/WALs/hdtest202.svl.ibm.com,60020,1392097223732/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112891408
 whose highest sequenceid is 4285641 to 
/hbase/oldWALs/hdtest202.svl.ibm.com%2C60020%2C1392097223732.1392112891408
2014-02-11 02:37:33,481 INFO org.apache.hadoop.hbase.regionserver.HStore: Added 

[jira] [Commented] (HBASE-10492) open daughter regions can unpredictably take long time

2014-02-12 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13899501#comment-13899501
 ] 

Jerry He commented on HBASE-10492:
--

I am able to reproduce the problem and get more realtime metrics on the region 
server.  Now it does not seem to be a thread scheduling problem, or does it?
We can see from the live metrics that 'Initializing region' has been RUNNING 
since 5mins, 18sec ago, and 'Instantiating store for column family' likewise 
since 5mins, 18sec ago:

{code}
Wed Feb 12 00:00:59 PST 2014    Initializing region 
tpch_hb_1000_2.lineitem,`\x01\x85\xF5\xEC\x8D\x01\x80\x00\x0B\x8E\x01\x80\x00\x00\x00\xB3\x9EN\xE3\x01\x80\x00\x00\x03,1392192032936.1d4381bb583f957a9996c1ef0fa3ce68.
    RUNNING (since 5mins, 18sec ago)    Instantiating store for column family 
{NAME => 'cf', REPLICATION_SCOPE => '0', KEEP_DELETED_CELLS => 'false', 
COMPRESSION => 'GZ', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true', 
MIN_VERSIONS => '0', DATA_BLOCK_ENCODING => 'NONE', IN_MEMORY => 'false', 
BLOOMFILTER => 'ROW', TTL => '2147483647', VERSIONS => '2147483647', 
BLOCKSIZE => '65536'} (since 5mins, 18sec ago)
Wed Feb 12 00:00:59 PST 2014    Initializing region 
tpch_hb_1000_2.lineitem,`\x01\x80\x01:\x94\x01\x80\x01:\x95\x01\x80\x00\x00\x00\xB5\xA8\x94\x04\x01\x80\x00\x00\x02,1392192032936.2980739184621d45397a972ea89c9411.
    RUNNING (since 5mins, 18sec ago)    Instantiating store for column family 
{NAME => 'cf', REPLICATION_SCOPE => '0', KEEP_DELETED_CELLS => 'false', 
COMPRESSION => 'GZ', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true', 
MIN_VERSIONS => '0', DATA_BLOCK_ENCODING => 'NONE', IN_MEMORY => 'false', 
BLOOMFILTER => 'ROW', TTL => '2147483647', VERSIONS => '2147483647', 
BLOCKSIZE => '65536'} (since 5mins, 18sec ago)

Wed Feb 12 00:00:59 PST 2014    Initializing region 
tpch_hb_1000_2.lineitem,`\x01\x85\xF5\xEC\x8D\x01\x80\x00\x0B\x8E\x01\x80\x00\x00\x00\xB3\x9EN\xE3\x01\x80\x00\x00\x03,1392192032936.1d4381bb583f957a9996c1ef0fa3ce68.
    RUNNING (since 8mins, 18sec ago)    Instantiating store for column family 
{NAME => 'cf', REPLICATION_SCOPE => '0', KEEP_DELETED_CELLS => 'false', 
COMPRESSION => 'GZ', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true', 
MIN_VERSIONS => '0', DATA_BLOCK_ENCODING => 'NONE', IN_MEMORY => 'false', 
BLOOMFILTER => 'ROW', TTL => '2147483647', VERSIONS => '2147483647', 
BLOCKSIZE => '65536'} (since 8mins, 18sec ago)
Wed Feb 12 00:00:59 PST 2014    Initializing region 
tpch_hb_1000_2.lineitem,`\x01\x80\x01:\x94\x01\x80\x01:\x95\x01\x80\x00\x00\x00\xB5\xA8\x94\x04\x01\x80\x00\x00\x02,1392192032936.2980739184621d45397a972ea89c9411.
    RUNNING (since 8mins, 18sec ago)    Instantiating store for column family 
{NAME => 'cf', REPLICATION_SCOPE => '0', KEEP_DELETED_CELLS => 'false', 
COMPRESSION => 'GZ', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true', 
MIN_VERSIONS => '0', DATA_BLOCK_ENCODING => 'NONE', IN_MEMORY => 'false', 
BLOOMFILTER => 'ROW', TTL => '2147483647', VERSIONS => '2147483647', 
BLOCKSIZE => '65536'} (since 8mins, 18sec ago)

{code}

Is it more likely a file system issue?

BTW, HBase's new metrics rock!

 open daughter regions can unpredictably take long time
 --

 Key: HBASE-10492
 URL: https://issues.apache.org/jira/browse/HBASE-10492
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.0
Reporter: Jerry He

 I have seen, during stress testing, the client getting 
 RetriesExhaustedWithDetailsException: Failed 748 actions: 
 NotServingRegionException
 On the master log, 2014-02-08 20:43 is the timestamp of the OFFLINE to 
 SPLITTING_NEW transition, and 2014-02-08 21:41 is the timestamp of SPLITTING_NEW to OPEN.
 The corresponding time period in the region server log is:
 {code}
 2014-02-08 20:44:12,662 WARN 
 org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
 found for region: 010c1981882d1a59201af5e2dc589d44
 2014-02-08 20:44:12,666 WARN 
 org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
 found for region: c2eb9b7971ca7f3fed3da86df5b788e7
 {code}
 There were no INFO messages related to these two regions until the following 
 (note at the end: Split took 57mins, 16sec):
 {code}
 2014-02-08 21:41:14,029 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
 Onlined c2eb9b7971ca7f3fed3da86df5b788e7; next sequenceid=213355
 2014-02-08 21:41:14,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
 Onlined 010c1981882d1a59201af5e2dc589d44; next sequenceid=213354
 2014-02-08 21:41:14,032 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks 
 for 
 region=tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
 Updated row 
 

[jira] [Commented] (HBASE-10549) when there is a hole,LoadIncrementalHFiles will hung up in an infinite loop.

2014-02-15 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902517#comment-13902517
 ] 

Jerry He commented on HBASE-10549:
--

Can you give more details, e.g. what is a hole?

 when there is a hole,LoadIncrementalHFiles will hung up in an infinite loop.
 

 Key: HBASE-10549
 URL: https://issues.apache.org/jira/browse/HBASE-10549
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 0.94.11
Reporter: yuanxinen





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10533) commands.rb is giving wrong error messages on exceptions

2014-02-17 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13903761#comment-13903761
 ] 

Jerry He commented on HBASE-10533:
--

HBASE-8798 should have fixed the first part?

 commands.rb is giving wrong error messages on exceptions
 

 Key: HBASE-10533
 URL: https://issues.apache.org/jira/browse/HBASE-10533
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: rajeshbabu
Assignee: rajeshbabu
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: HBASE-10533_trunk.patch


 1) Cloning into an existing table name prints the snapshot name instead of 
 the table name.
 {code}
 hbase(main):004:0> clone_snapshot 'myTableSnapshot-122112','table'
 ERROR: Table already exists: myTableSnapshot-122112!
 {code}
 The reason for this is that we are printing the first argument instead of the 
 exception message.
 {code}
 if cause.kind_of?(org.apache.hadoop.hbase.TableExistsException) then
   raise "Table already exists: #{args.first}!"
 end
 {code}
 2) If we give a wrong column family in put or delete, the expectation is to 
 print the actual column families in the table, but instead the raw exception is thrown.
 {code}
 hbase(main):002:0> put 't1','r','unkwown_cf','value'
 2014-02-14 15:51:10,037 WARN  [main] util.NativeCodeLoader: Unable to load 
 native-hadoop library for your platform... using builtin-java classes where 
 applicable
 2014-02-14 15:51:10,640 INFO  [main] hdfs.PeerCache: SocketCache disabled.
 ERROR: Failed 1 action: 
 org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column 
 family unkwown_cf does not exist in region 
 t1,,1392118273512.c7230b923c58f1af406a6d84930e40c1. in table 't1', 
 {NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', 
 REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '6', TTL => 
 '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE 
 => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4206)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3441)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3345)
 at 
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:28460)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)
 at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
 at 
 org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
 at 
 org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
 at 
 org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
 at java.lang.Thread.run(Thread.java:662)
 : 1 time,
 {code}
 The reason for this is that the server will not throw NoSuchColumnFamilyException 
 directly; instead, RetriesExhaustedWithDetailsException will be thrown.
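The fix described above has to dig the meaningful root cause out of the wrapper exception rather than match on the top-level type. A minimal, illustrative Python sketch of that cause-chain unwrapping (the exception classes here are stand-ins, not HBase's actual types):

```python
# Illustrative sketch (not HBase code): walk an exception's cause chain to
# find a specific root-cause type, the way the shell must dig a
# NoSuchColumnFamilyException out of a RetriesExhaustedWithDetailsException.
def find_cause(exc, cause_type):
    """Return the first exception in the cause chain matching cause_type."""
    while exc is not None:
        if isinstance(exc, cause_type):
            return exc
        exc = exc.__cause__  # the wrapped, lower-level exception
    return None

# Stand-ins for the two exception types involved
class NoSuchColumnFamily(Exception):
    pass

class RetriesExhausted(Exception):
    pass

caught = root = None
try:
    try:
        raise NoSuchColumnFamily("Column family unknown_cf does not exist")
    except NoSuchColumnFamily as inner:
        raise RetriesExhausted("Failed 1 action") from inner
except RetriesExhausted as wrapper:
    caught = wrapper
    root = find_cause(wrapper, NoSuchColumnFamily)

print(root)  # the meaningful root cause, not the wrapper
```

Reporting `root` rather than the wrapper (or the user's first argument) is what gives the shell a useful error message.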





[jira] [Created] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-25 Thread Jerry He (JIRA)
Jerry He created HBASE-10615:


 Summary: Make LoadIncrementalHFiles skip reference files
 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor


There is a use case where the source of hfiles for LoadIncrementalHFiles can be a 
FileSystem copy-out/backup of an HBase table or of archive hfiles.  For example,
1. Copy-out of hbase.rootdir, table dir, region dir (after disable) or archive 
dir.
2. ExportSnapshot

It is possible that there are reference files in the family dir in these cases.
We have such use cases where, when trying to load back into HBase, we'll get

{code}
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem 
reading HFile Trailer from file 
hdfs://HDFS-AMR/tmp/restoreTemp/117182adfe861c5d2b607da91d60aa8a/info/aed3d01648384b31b29e5bad4cd80bec.d179ab341fc68e7612fcd74eaf7cafbd
at 
org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:570)
at 
org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:594)
at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:636)
at 
org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:472)
at 
org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:393)
at 
org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:391)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
at java.util.concurrent.FutureTask.run(FutureTask.java:149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:738)
Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 16715777 
(expected to be between 2 and 2)
at 
org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:927)
at 
org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:426)
at 
org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:568)
{code}

It is desirable and safe to skip these reference files since they don't contain 
any real data for bulk load purposes.





[jira] [Updated] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-25 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10615:
-

Attachment: HBASE-10615-trunk.patch



[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-25 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912533#comment-13912533
 ] 

Jerry He commented on HBASE-10615:
--

Attached a patch to skip reference files in discoverLoadQueue().
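The skip is keyed off the file name: a reference file produced by a split is named "hfile.parentEncodedRegionName", while a plain hfile has no such suffix (in HBase the real check lives in the StoreFileInfo helpers). A hedged Python sketch of that name-based filter; the regex here is my own approximation, not HBase's:

```python
import re

# Approximate, illustrative pattern: a plain hfile is a hex name; a reference
# file appends "." plus the parent region's encoded (hex) name. This regex is
# only a sketch of the idea, not the actual HBase check.
REFERENCE_NAME = re.compile(r"^[0-9a-f]+\.[0-9a-f]+$")

def looks_like_reference(file_name: str) -> bool:
    return REFERENCE_NAME.match(file_name) is not None

def discover_load_queue(file_names):
    """Mimic the patched discoverLoadQueue(): keep plain hfiles, skip references."""
    kept = []
    for name in file_names:
        if looks_like_reference(name):
            # In the patch this is a LOG warning + skip rather than an error
            continue
        kept.append(name)
    return kept

files = [
    "aed3d01648384b31b29e5bad4cd80bec.d179ab341fc68e7612fcd74eaf7cafbd",  # reference
    "aed3d01648384b31b29e5bad4cd80bec",                                   # plain hfile
]
print(discover_load_queue(files))
```

The reference name in the example is the same one that triggered the CorruptHFileException in the issue description.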



[jira] [Updated] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-25 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10615:
-

Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-8073) HFileOutputFormat support for offline operation

2014-02-26 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13913397#comment-13913397
 ] 

Jerry He commented on HBASE-8073:
-

A good feature. 
An example is the WALPlayer. When replaying WALs to HFiles, it requires a live 
table with the same table name as in the WAL, which makes it not useful as an 
offline tool.

 HFileOutputFormat support for offline operation
 ---

 Key: HBASE-8073
 URL: https://issues.apache.org/jira/browse/HBASE-8073
 Project: HBase
  Issue Type: Sub-task
  Components: mapreduce
Reporter: Nick Dimiduk

 When using HFileOutputFormat to generate HFiles, it inspects the region 
 topology of the target table. The split points from that table are used to 
 guide the TotalOrderPartitioner. If the target table does not exist, it is 
 first created. This imposes an unnecessary dependence on an online HBase and 
 existing table.
 If the table exists, it can be used. However, the job can be smarter. For 
 example, if there's far more data going into the HFiles than the table 
 currently contains, the table regions aren't very useful for data split 
 points. Instead, the input data can be sampled to produce split points more 
 meaningful to the dataset. LoadIncrementalHFiles is already capable of 
 handling divergence between HFile boundaries and table regions, so this 
 should not pose any additional burdon at load time.
 The proper method of sampling the data likely requires a custom input format 
 and an additional map-reduce job perform the sampling. See a relevant 
 implementation: 
 https://github.com/alexholmes/hadoop-book/blob/master/src/main/java/com/manning/hip/ch4/sampler/ReservoirSamplerInputFormat.java
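The linked ReservoirSamplerInputFormat is built on classic reservoir sampling (Algorithm R): keep a fixed-size uniform sample of a stream of unknown length, then sort the sample to derive split points. A minimal Python sketch of the idea; the sample size, seed, and key format are illustrative choices, not anything from the linked implementation:

```python
import random

def reservoir_sample(stream, k, rng=random.Random(42)):
    """Algorithm R: uniform sample of k items from a stream of unknown length."""
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            # Item i survives with probability k / (i + 1)
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample

def split_points(keys, num_regions):
    """Derive num_regions - 1 split points from a sorted sample of the keys."""
    sample = sorted(reservoir_sample(keys, k=1000))
    step = len(sample) // num_regions
    return [sample[i * step] for i in range(1, num_regions)]

# Example: 100k synthetic row keys, pre-split into 4 ranges
keys = (b"row%08d" % n for n in range(100_000))
print(split_points(keys, num_regions=4))
```

The resulting split points would feed the TotalOrderPartitioner instead of the live table's region boundaries.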





[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-26 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13913904#comment-13913904
 ] 

Jerry He commented on HBASE-10615:
--

Hi, Matteo

Thanks for the comments!

There are two questions here.
1.  Should the bulk load throw an error or skip when it sees a reference file?  
My argument is that we should not throw an error.
 The existence of a reference file is not an error condition.
2.  Is it safe to skip the reference file for the purpose of bulk loading, from 
the user's perspective?  Matteo raised the issue of possible loss of data.
 My argument is that we are fine, for these reasons:
1)  The purpose of LoadIncrementalHFiles is to load the data contained in the 
hfiles of a given region dir into HBase safely.  
   As long as this is satisfied, we are fine for the data in this scope. 
2)  We can also take a broader view and consider the integrity of 
the entire table data.  
  The user of the bulk load tool controls the bulk loading. 
  For example, the user will not copy out the links in a cloned table from a 
snapshot and then expect to bulk load these links to get the data.
  In the reference example, the user will bulk load the parent region too. 

{quote} 
you upload the parent region data but not the daughter reference files
the CatalogJanitor kicks in and the parent is removed, since there are no 
references to the parent
and your data is lost...
{quote}
Why would the data be lost?  I thought the hfiles in the parent region would be 
added or sliced into an existing live region. The bulk load tool does not care 
whether the input hfile's region is a split parent or not, right?  Maybe I am 
missing or misunderstanding something? 

   



[jira] [Commented] (HBASE-10622) Improve log and Exceptions in Export Snapshot

2014-02-26 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13913963#comment-13913963
 ] 

Jerry He commented on HBASE-10622:
--

It would help to use job.getStatus().getFailureInfo() if the copy job failed.

Another area is to somehow intelligently estimate the number of copy mappers 
needed, based on the size and number of files, similar to DistCp.
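The DistCp-style sizing suggested above reduces to simple arithmetic: bound the mapper count by a bytes-per-mapper target, by the number of files, and by a hard cap. A sketch with made-up tuning constants (not ExportSnapshot's or DistCp's actual parameters):

```python
def estimate_mappers(total_bytes, num_files,
                     bytes_per_mapper=256 * 1024 * 1024,  # illustrative target
                     max_mappers=200):                     # illustrative cap
    """Pick a mapper count so each mapper copies roughly bytes_per_mapper,
    with never more mappers than files and never more than a hard cap."""
    by_size = max(1, total_bytes // bytes_per_mapper)
    return max(1, min(by_size, num_files, max_mappers))

# 100 GiB across 400 files: size suggests 400 mappers, the cap wins
print(estimate_mappers(100 * 1024**3, 400))
```

A tiny job collapses to a single mapper, while a job with a few huge files is bounded by the file count, since a file is the unit of copy.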

 Improve log and Exceptions in Export Snapshot 
 --

 Key: HBASE-10622
 URL: https://issues.apache.org/jira/browse/HBASE-10622
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.99.0

 Attachments: HBASE-10622-v0.patch


 from the logs of export snapshot is not really clear what's going on,
 adding some extra information useful to debug, and in some places the real 
 exception can be thrown





[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914928#comment-13914928
 ] 

Jerry He commented on HBASE-10615:
--

Let me give a practical use case, related to ExportSnapshot.  You can help to 
see if there is any loophole.

I take snapshots on cluster A, and export them to cluster B, which serves as 
backup storage.
When I want to clone the table on cluster C, I can do the following on cluster 
C (an alternative to restore/clone snapshot):

1. Construct the table based on the tableInfo, and possibly pre-split based on 
the region info stored with the snapshot.
2. Have a program that basically loops through the archive regions to bulk load 
the region data.

The parent region is in the archive and so are the daughters, if the snapshot 
happened to capture that moment. 
I remember you had a JIRA to include both parent and daughters in the snapshot.

I don't see any loss of data here.  I have been testing it for a while.  
I had to change LoadIncrementalHFiles to skip the reference files if they 
exist, to avoid the exception posted in this JIRA.



[jira] [Updated] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10615:
-

Attachment: HBASE-10615-trunk-v2.patch

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk.patch


 There is use base that the source of hfiles for LoadIncrementalHFiles can be 
 a FileSystem copy-out/backup of HBase table or archive hfiles.  For example,
 1. Copy-out of hbase.rootdir, table dir, region dir (after disable) or 
 archive dir.
 2. ExportSnapshot
 It is possible that there are reference files in the family dir in these 
 cases.
 We have such use cases, where trying to load back into HBase, we'll get
 {code}
 Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem 
 reading HFile Trailer from file 
 hdfs://HDFS-AMR/tmp/restoreTemp/117182adfe861c5d2b607da91d60aa8a/info/aed3d01648384b31b29e5bad4cd80bec.d179ab341fc68e7612fcd74eaf7cafbd
 at 
 org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:570)
 at 
 org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:594)
 at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:636)
 at 
 org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:472)
 at 
 org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:393)
 at 
 org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:391)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
 at java.util.concurrent.FutureTask.run(FutureTask.java:149)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
 at java.lang.Thread.run(Thread.java:738)
 Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 
 16715777 (expected to be between 2 and 2)
 at 
 org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:927)
 at 
 org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:426)
 at 
 org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:568)
 {code}
 It is desirable and safe to skip these reference files since they don't 
 contain any real data for bulk load purposes.
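The skip itself can be sketched as a name-based filter while walking the family dir. This is a minimal illustration only, not the actual patch; HBase's real check is StoreFileInfo.isReference(), and the REF_PATTERN regex below (reference files named <hfileName>.<encodedParentRegion>) is an assumption for this sketch:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ReferenceFileFilter {
  // Assumed naming convention for reference files: <hfileName>.<encodedParentRegion>,
  // while a plain hfile name is a hex string with no dot.
  private static final Pattern REF_PATTERN = Pattern.compile("^[0-9a-f]+\\..+$");

  static boolean isReference(String fileName) {
    return REF_PATTERN.matcher(fileName).matches();
  }

  // Keep only real hfiles; reference files carry no data for bulk load.
  static List<String> skipReferences(List<String> familyDirFiles) {
    return familyDirFiles.stream()
        .filter(f -> !isReference(f))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(skipReferences(Arrays.asList(
        "aed3d01648384b31b29e5bad4cd80bec.d179ab341fc68e7612fcd74eaf7cafbd", // reference
        "117182adfe861c5d2b607da91d60aa8a")));                               // real hfile
  }
}
```

Filtering by name avoids opening the file at all, so the CorruptHFileException above is never triggered for reference files.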
   



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915064#comment-13915064
 ] 

Jerry He commented on HBASE-10615:
--

Attached v2 with added LOG warns.
There is another place where we walk through the hfiles, in createTable() when the 
table does not exist. We read the files twice in this case.
Only warn once.

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk.patch




[jira] [Updated] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10615:
-

Attachment: HBASE-10615-trunk-v3.patch

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk-v3.patch, 
 HBASE-10615-trunk.patch




[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915320#comment-13915320
 ] 

Jerry He commented on HBASE-10615:
--

Rebased with latest from trunk and attached v3.

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk-v3.patch, 
 HBASE-10615-trunk.patch




[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915324#comment-13915324
 ] 

Jerry He commented on HBASE-10615:
--

A config parameter will probably be helpful if the bulk load tool itself can or 
wants to 'resolve' the reference/link.
But bulk load operates on one region dir only.

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk-v3.patch, 
 HBASE-10615-trunk.patch




[jira] [Updated] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10615:
-

Status: Open  (was: Patch Available)

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk-v3.patch, 
 HBASE-10615-trunk.patch




[jira] [Updated] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-10615:
-

Status: Patch Available  (was: Open)

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk-v3.patch, 
 HBASE-10615-trunk.patch




[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-02-27 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915435#comment-13915435
 ] 

Jerry He commented on HBASE-10615:
--

bq. A config parameter will probably be helpful if the bulk load tool itself 
can or wants to 'resolve' the reference/link.
Then it would be good to give the user an option to 'skip' or 'resolve'.
Right now the only option is to 'skip'.
Or 'error', which doesn't make much sense.
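The skip-or-resolve choice discussed above could hang off a simple policy value. A hedged sketch only: the Policy names and the SKIP default below are invented for illustration and are not part of the patch:

```java
public class ReferencePolicy {
  // Hypothetical policies for how the bulk-load tool could treat reference files.
  enum Policy { SKIP, RESOLVE, ERROR }

  // Parse a (hypothetical) config value, defaulting to SKIP when unset.
  static Policy fromConfig(String value) {
    return value == null ? Policy.SKIP : Policy.valueOf(value.toUpperCase());
  }

  public static void main(String[] args) {
    System.out.println(fromConfig(null));      // SKIP
    System.out.println(fromConfig("resolve")); // RESOLVE
  }
}
```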

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk-v3.patch, 
 HBASE-10615-trunk.patch




[jira] [Commented] (HBASE-10622) Improve log and Exceptions in Export Snapshot

2014-02-28 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916308#comment-13916308
 ] 

Jerry He commented on HBASE-10622:
--

Hi, Matteo

Are you reversing HBASE-9060 in the code by putting the path in the format?

Also, do you want to put a '%' character after:
(totalBytesWritten/(float)inputFileSize) * 100.0f
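The suggestion amounts to formatting that ratio with a trailing '%'. A one-line sketch; the method and variable names here are illustrative, not from the patch:

```java
import java.util.Locale;

public class ProgressFormat {
  // Render copied/total as a percentage string with a trailing '%'.
  static String percentCopied(long totalBytesWritten, long inputFileSize) {
    return String.format(Locale.ROOT, "%.1f%%",
        (totalBytesWritten / (float) inputFileSize) * 100.0f);
  }

  public static void main(String[] args) {
    System.out.println(percentCopied(65L, 130L)); // 50.0%
  }
}
```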

 Improve log and Exceptions in Export Snapshot 
 --

 Key: HBASE-10622
 URL: https://issues.apache.org/jira/browse/HBASE-10622
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.99.0

 Attachments: HBASE-10622-v0.patch, HBASE-10622-v1.patch, 
 HBASE-10622-v2.patch


 From the logs of ExportSnapshot it is not really clear what's going on.
 Add some extra information useful for debugging; in some places the real 
 exception can be thrown.





[jira] [Commented] (HBASE-10622) Improve log and Exceptions in Export Snapshot

2014-02-28 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916318#comment-13916318
 ] 

Jerry He commented on HBASE-10622:
--

bq. Are you reversing HBASE-9060 in the code by putting the path in the
Are you reversing HBASE-9060 in the code by putting the path in the format?

 Improve log and Exceptions in Export Snapshot 
 --

 Key: HBASE-10622
 URL: https://issues.apache.org/jira/browse/HBASE-10622
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.99.0

 Attachments: HBASE-10622-v0.patch, HBASE-10622-v1.patch, 
 HBASE-10622-v2.patch




[jira] [Commented] (HBASE-10622) Improve log and Exceptions in Export Snapshot

2014-03-02 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917556#comment-13917556
 ] 

Jerry He commented on HBASE-10622:
--

A few more comments (since you are doing the improvement ...)

{code}
-// Verify that the written size match
-if (totalBytesWritten != inputFileSize) {
-  String msg = "number of bytes copied not matching copied=" + totalBytesWritten +
-    " expected=" + inputFileSize + " for file=" + inputPath;
-  throw new IOException(msg);
{code}
Do you think this is unnecessary?

In run(), can we clean up/delete snapshotTmpDir if Step 2 failed, so that we 
don't ask the user to manually clean it, since it comes from our Step 1 copy?

Can we add a job counter, say 'COPIES_FILES', alongside 'BYTES_COPIED'?

Another issue is probably more involved, and does not need to be covered in 
this JIRA. It is the overall progress reporting of the ExportSnapshot job.
For example,  
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snapshot1 
-copy-to /user/biadmin/mysnapshots -mappers 30

{code}
14/03/02 12:19:54 INFO mapred.JobClient:  map 0% reduce 0%
14/03/02 12:20:12 INFO mapred.JobClient:  map 6% reduce 0%
14/03/02 12:20:13 INFO mapred.JobClient:  map 44% reduce 0%
14/03/02 12:20:19 INFO mapred.JobClient:  map 83% reduce 0%
{code}
There is about 130G to export.  But it takes just a few secs to get to 83%, 
after the first round of mappers is launched, and it will stay there for a long 
time.
Similarly, at the end it will show 100% for a long time while there are mappers 
still running.
The map progress percentage is quite inaccurate with regard to the overall 
progress. 

 Improve log and Exceptions in Export Snapshot 
 --

 Key: HBASE-10622
 URL: https://issues.apache.org/jira/browse/HBASE-10622
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.99.0

 Attachments: HBASE-10622-v0.patch, HBASE-10622-v1.patch, 
 HBASE-10622-v2.patch, HBASE-10622-v3.patch




[jira] [Commented] (HBASE-10622) Improve log and Exceptions in Export Snapshot

2014-03-03 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918364#comment-13918364
 ] 

Jerry He commented on HBASE-10622:
--

bq. other jira, it requires a new InputFormat/RecordReader with the progress 
based on the file size and not on the number of lines in the input file. The 
only progress that we track is the current file copy

Agree. Not an easy thing to do.  It seems that DistCp has the same issue.

Looks good.  Thanks.

 Improve log and Exceptions in Export Snapshot 
 --

 Key: HBASE-10622
 URL: https://issues.apache.org/jira/browse/HBASE-10622
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.99.0

 Attachments: HBASE-10622-v0.patch, HBASE-10622-v1.patch, 
 HBASE-10622-v2.patch, HBASE-10622-v3.patch, HBASE-10622-v4.patch




[jira] [Commented] (HBASE-10615) Make LoadIncrementalHFiles skip reference files

2014-03-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920074#comment-13920074
 ] 

Jerry He commented on HBASE-10615:
--

The findbugs warnings and the TestLogRolling failure do not seem to be caused by the patch.
Hi, [~stack], [~mbertozzi], [~yuzhih...@gmail.com]
Are you ok with the patch?

 Make LoadIncrementalHFiles skip reference files
 ---

 Key: HBASE-10615
 URL: https://issues.apache.org/jira/browse/HBASE-10615
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-10615-trunk-v2.patch, HBASE-10615-trunk-v3.patch, 
 HBASE-10615-trunk.patch




[jira] [Commented] (HBASE-8304) Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured without default port.

2014-03-06 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923197#comment-13923197
 ] 

Jerry He commented on HBASE-8304:
-

Maybe I am asking for too much.
Does this JIRA/patch cover the following case:
source uri:  webhdfs://myhost1:14000/
target uri:  hdfs://myhost1:9000/

We should not copy in this case, since we are really on the same cluster as well.
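A plain scheme/authority comparison shows why those two URIs look like different filesystems even though they reach the same cluster. This is a sketch of the symptom only, not HBase's actual check:

```java
import java.net.URI;

public class FsUriCompare {
  // Naive equality on scheme + authority, roughly the comparison that makes
  // the tool fall back to copying instead of renaming.
  static boolean sameFs(String a, String b) {
    URI ua = URI.create(a);
    URI ub = URI.create(b);
    return ua.getScheme().equals(ub.getScheme())
        && ua.getAuthority().equals(ub.getAuthority());
  }

  public static void main(String[] args) {
    // Different scheme and port, so a naive comparison says "different FS",
    // even though both URIs can point at the same physical cluster.
    System.out.println(sameFs("webhdfs://myhost1:14000/", "hdfs://myhost1:9000/")); // false
    System.out.println(sameFs("hdfs://myhost1:9000/", "hdfs://myhost1:9000/"));     // true
  }
}
```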


 Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured 
 without default port.
 ---

 Key: HBASE-8304
 URL: https://issues.apache.org/jira/browse/HBASE-8304
 Project: HBase
  Issue Type: Bug
  Components: HFile, regionserver
Affects Versions: 0.94.5
Reporter: Raymond Liu
Assignee: haosdent
  Labels: bulkloader
 Attachments: HBASE-8304-v2.patch, HBASE-8304-v3.patch, 
 HBASE-8304.patch


 When fs.default.name or fs.defaultFS in hadoop core-site.xml is configured as 
 hdfs://ip, and hbase.rootdir is configured as hdfs://ip:port/hbaserootdir 
 where port is the hdfs namenode's default port, the bulkload operation will 
 not remove the file in the bulk output dir. Store::bulkLoadHfile will treat 
 hdfs://ip and hdfs://ip:port as different filesystems and go with the copy 
 approach instead of rename.
 The root cause is that the hbase master will rewrite fs.default.name/fs.defaultFS 
 according to hbase.rootdir when the regionserver starts; thus, the dest fs uri from 
 the hregion will not match the src fs uri passed from the client.
 Any suggestion what is the best approach to fix this issue? 
 I kind of think that we could check for the default port if the src uri comes without 
 port info.





[jira] [Commented] (HBASE-8304) Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured without default port.

2014-03-06 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923220#comment-13923220
 ] 

Jerry He commented on HBASE-8304:
-

{code}
+String srcServiceName = srcFs.getCanonicalServiceName();
+String desServiceName = desFs.getCanonicalServiceName();
...
+  // If one serviceName is in HA format while the other is in non-HA format,
+  // maybe they refer to the same FileSystem.
+  // For example, srcFs is ha-hdfs://nameservices and desFs is hdfs://activeNamenode:port
+  Set<InetSocketAddress> srcAddrs = getNNAddresses((DistributedFileSystem) srcFs, conf);
+  Set<InetSocketAddress> desAddrs = getNNAddresses((DistributedFileSystem) desFs, conf);
+  if (Sets.intersection(srcAddrs, desAddrs).size() > 0) {
+    return true;
{code}
A little unclear about this.  Given your example:
 // For example, srcFs is ha-hdfs://nameservices and desFs is hdfs://activeNamenode:port
If the desFs is HA enabled, then you will get the 'ha-hdfs://' format, right? 
If it returns hdfs://, does that already tell you they are different filesystems?

It is a good JIRA.
I didn't know fs.getCanonicalServiceName() would return ha-hdfs://nameservices 
in the HA case.
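The overlap check quoted above can be reduced to a small sketch: both
filesystems are resolved to their sets of namenode addresses, and any common
address means both sides reach the same filesystem. This is an illustrative
model only; namenode addresses are represented as plain "host:port" strings
and the resolver that would produce them is assumed, not shown.
{code}
require 'set'

def same_filesystem?(src_nn_addrs, dest_nn_addrs)
  # An HA URI (ha-hdfs://nameservices) and a non-HA URI
  # (hdfs://activeNamenode:port) can resolve to overlapping
  # namenode sets; any common address means the same filesystem.
  !(Set.new(src_nn_addrs) & Set.new(dest_nn_addrs)).empty?
end
{code}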

 Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured 
 without default port.
 ---

 Key: HBASE-8304
 URL: https://issues.apache.org/jira/browse/HBASE-8304
 Project: HBase
  Issue Type: Bug
  Components: HFile, regionserver
Affects Versions: 0.94.5
Reporter: Raymond Liu
Assignee: haosdent
  Labels: bulkloader
 Attachments: HBASE-8304-v2.patch, HBASE-8304-v3.patch, 
 HBASE-8304.patch


 When fs.default.name or fs.defaultFS in hadoop core-site.xml is configured as 
 hdfs://ip, and hbase.rootdir is configured as hdfs://ip:port/hbaserootdir 
 where port is the hdfs namenode's default port, the bulkload operation will 
 not remove the files in the bulk output dir. Store::bulkLoadHfile will treat 
 hdfs://ip and hdfs://ip:port as different filesystems and go with the copy 
 approach instead of rename.
 The root cause is that the hbase master rewrites fs.default.name/fs.defaultFS 
 according to hbase.rootdir when the regionserver starts; thus, the dest fs uri from 
 the hregion will not match the src fs uri passed from the client.
 Any suggestions on the best approach to fix this issue? 
 I kind of think that we could check for the default port if the src uri comes 
 without port info.





[jira] [Commented] (HBASE-8798) Fix a minor bug in shell command with clone_snapshot table error

2013-07-01 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697010#comment-13697010
 ] 

Jerry He commented on HBASE-8798:
-

Here is the output after the fix.

hbase(main):001:0> list
TABLE
TestTable
1 row(s) in 1.1350 seconds

=> ["TestTable"]
hbase(main):002:0> list_snapshots
SNAPSHOT                     TABLE + CREATION TIME
 mysnapshot1                 TestTable (Mon Jun 24 13:29:00 -0700 2013)
1 row(s) in 0.2040 seconds

=> ["mysnapshot1"]
hbase(main):003:0> clone_snapshot 'mysnapshot1', 'TestTable'

ERROR: Table already exists: *TestTable!*

Here is some help for this command:
Create a new table by cloning the snapshot content.
There're no copies of data involved.
And writing on the newly created table will not influence the snapshot data.

Examples:
  hbase> clone_snapshot 'snapshotName', 'tableName'

 Fix a minor bug in shell command with clone_snapshot table error
 

 Key: HBASE-8798
 URL: https://issues.apache.org/jira/browse/HBASE-8798
 Project: HBase
  Issue Type: Bug
  Components: shell, snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-8798-trunk.patch


 In HBase shell, the syntax for clone_snapshot is:
   hbase> clone_snapshot 'snapshotName', 'tableName'
 If the target table already exists, we'll get an error.
 For example:
 --
 hbase(main):011:0> clone_snapshot 'mysnapshot1', 'TestTable'
 ERROR: Table already exists: mysnapshot1!
 Here is some help for this command:
 Create a new table by cloning the snapshot content.
 There're no copies of data involved.
 And writing on the newly created table will not influence the snapshot data.
 Examples:
   hbase> clone_snapshot 'snapshotName', 'tableName'
 --
 The bug is in the ERROR message:
 *ERROR: Table already exists: mysnapshot1!*
 We should output the table name, not the snapshot name.
 Currently, in command.rb, we have the output fixed as args.first for 
 TableExistsException:
 {code}
   def translate_hbase_exceptions(*args)
     yield
   rescue org.apache.hadoop.hbase.exceptions.TableNotFoundException
     raise "Unknown table #{args.first}!"
   rescue org.apache.hadoop.hbase.exceptions.NoSuchColumnFamilyException
     valid_cols = table(args.first).get_all_columns.map { |c| c + '*' }
     raise "Unknown column family! Valid column names: #{valid_cols.join(", ")}"
   rescue org.apache.hadoop.hbase.exceptions.TableExistsException
     raise "Table already exists: #{args.first}!"
   end
 {code}
 This is fine with commands like 'create tableName ...' but not 
 'clone_snapshot snapshotName tableName'.
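 One possible shape of a fix is to let the caller pass in the name to report,
 instead of always reporting args.first. The sketch below is hypothetical (the
 exception class is stubbed so it is self-contained) and is not the committed
 patch:
 {code}
 module Exceptions
   class TableExistsException < StandardError; end  # stand-in for the Java exception
 end

 def translate_hbase_exceptions(name_to_report)
   yield
 rescue Exceptions::TableExistsException
   raise "Table already exists: #{name_to_report}!"
 end

 # clone_snapshot would report the *table* (its second argument),
 # not the snapshot name:
 #   translate_hbase_exceptions(table_name) { admin.clone_snapshot(snapshot, table_name) }
 {code}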

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-8798) Fix a minor bug in shell command with clone_snapshot table error

2013-07-01 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697290#comment-13697290
 ] 

Jerry He commented on HBASE-8798:
-

Hi, Ted, Matteo

Thanks for the review. 
The commands.rb has some existing logic that tries to 'rescue' certain exceptions. 
It will get more and more difficult to keep it artificially generic as we 
evolve and add more ...
The approach of expanding (instead of shrinking) this logic is not a bad or ugly 
one ...

But your suggested solution is fine.

 

 Fix a minor bug in shell command with clone_snapshot table error
 

 Key: HBASE-8798
 URL: https://issues.apache.org/jira/browse/HBASE-8798
 Project: HBase
  Issue Type: Bug
  Components: shell, snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBASE-8798-trunk.patch


 In HBase shell, the syntax for clone_snapshot is:
   hbase> clone_snapshot 'snapshotName', 'tableName'
 If the target table already exists, we'll get an error.
 For example:
 --
 hbase(main):011:0> clone_snapshot 'mysnapshot1', 'TestTable'
 ERROR: Table already exists: mysnapshot1!
 Here is some help for this command:
 Create a new table by cloning the snapshot content.
 There're no copies of data involved.
 And writing on the newly created table will not influence the snapshot data.
 Examples:
   hbase> clone_snapshot 'snapshotName', 'tableName'
 --
 The bug is in the ERROR message:
 *ERROR: Table already exists: mysnapshot1!*
 We should output the table name, not the snapshot name.
 Currently, in command.rb, we have the output fixed as args.first for 
 TableExistsException:
 {code}
   def translate_hbase_exceptions(*args)
     yield
   rescue org.apache.hadoop.hbase.exceptions.TableNotFoundException
     raise "Unknown table #{args.first}!"
   rescue org.apache.hadoop.hbase.exceptions.NoSuchColumnFamilyException
     valid_cols = table(args.first).get_all_columns.map { |c| c + '*' }
     raise "Unknown column family! Valid column names: #{valid_cols.join(", ")}"
   rescue org.apache.hadoop.hbase.exceptions.TableExistsException
     raise "Table already exists: #{args.first}!"
   end
 {code}
 This is fine with commands like 'create tableName ...' but not 
 'clone_snapshot snapshotName tableName'.



[jira] [Updated] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-07-12 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8760:


Assignee: (was: Jerry He)

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8
Reporter: Jerry He
 Fix For: 0.94.10

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-thz-v0.patch


 Right after a region split but before the daughter regions are compacted, we 
 have two daughter regions containing Reference files to the parent hfiles.
 If we take a snapshot right at that moment, the snapshot will succeed, but it 
 will only contain the daughter Reference files. Since there is no hold on the 
 parent hfiles, they will be deleted by the HFile Cleaner soon after they are no 
 longer needed by the daughter regions.
 At a minimum, we need to keep these parent hfiles from being deleted. 



[jira] [Updated] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-07-12 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8760:


Affects Version/s: 0.95.1

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.94.10

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-thz-v0.patch


 Right after a region split but before the daughter regions are compacted, we 
 have two daughter regions containing Reference files to the parent hfiles.
 If we take a snapshot right at that moment, the snapshot will succeed, but it 
 will only contain the daughter Reference files. Since there is no hold on the 
 parent hfiles, they will be deleted by the HFile Cleaner soon after they are no 
 longer needed by the daughter regions.
 At a minimum, we need to keep these parent hfiles from being deleted. 



[jira] [Updated] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-07-12 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8760:


Fix Version/s: 0.95.2

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.95.2, 0.94.10

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-thz-v0.patch


 Right after a region split but before the daughter regions are compacted, we 
 have two daughter regions containing Reference files to the parent hfiles.
 If we take a snapshot right at that moment, the snapshot will succeed, but it 
 will only contain the daughter Reference files. Since there is no hold on the 
 parent hfiles, they will be deleted by the HFile Cleaner soon after they are no 
 longer needed by the daughter regions.
 At a minimum, we need to keep these parent hfiles from being deleted. 



[jira] [Created] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-16 Thread Jerry He (JIRA)
Jerry He created HBASE-8967:
---

 Summary: Duplicate call to snapshotManager.stop() in HRegionServer
 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.9, 0.95.1
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2


snapshotManager.stop() is called twice in HRegionServer shutdown process

{code}
2013-07-12 12:06:56,909 INFO 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
Stopping RegionServerSnapshotManager gracefully.
2013-07-12 12:06:56,909 INFO 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
regionserver60020.cacheFlusher exiting
2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
LogRoller exiting.
2013-07-12 12:06:56,909 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
regionserver60020.compactionChecker exiting
2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
...
2013-07-12 12:06:56,911 INFO 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
Stopping RegionServerSnapshotManager gracefully.
{code}
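The usual remedy for a component being stopped twice during shutdown is an
idempotent stop guard. The class below is a stub invented for illustration
(it is not HBase code) showing the pattern a fix could follow:
{code}
class SnapshotManagerStub
  attr_reader :stop_count

  def initialize
    @stopped = false
    @stop_count = 0
  end

  def stop
    return if @stopped   # a second call becomes a no-op
    @stopped = true
    @stop_count += 1     # real teardown work would happen here
  end
end
{code}
With this guard in place, the "Stopping RegionServerSnapshotManager
gracefully" log line would appear only once.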



[jira] [Updated] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-16 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8967:


Attachment: HBASE-8967.patch.patch

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Updated] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-16 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8967:


Attachment: (was: HBASE-8967.patch.patch)

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Updated] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-16 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8967:


Status: Patch Available  (was: Open)

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.9, 0.95.1
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBASE-8967.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Updated] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-16 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8967:


Attachment: HBASE-8967.patch

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBASE-8967.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Commented] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-16 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710603#comment-13710603
 ] 

Jerry He commented on HBASE-8967:
-

Also included in the patch is a minor cleanup related to HBASE-8783

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBASE-8967.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Updated] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-17 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8967:


Attachment: HBASE-8967-v2.patch

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBASE-8967.patch, HBASE-8967-v2.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Commented] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-17 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711480#comment-13711480
 ] 

Jerry He commented on HBASE-8967:
-

Thanks, Ted, Matteo
Attached v2 to add back the comment line.

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBASE-8967.patch, HBASE-8967-v2.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Updated] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-17 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8967:


Attachment: HBASE-8967-v2-0.94.patch

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBASE-8967.patch, HBASE-8967-v2-0.94.patch, 
 HBASE-8967-v2.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Commented] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-17 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711501#comment-13711501
 ] 

Jerry He commented on HBASE-8967:
-

Attached v2-0.94 for 0.94 consideration.

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBASE-8967.patch, HBASE-8967-v2-0.94.patch, 
 HBASE-8967-v2.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Commented] (HBASE-8967) Duplicate call to snapshotManager.stop() in HRegionServer

2013-07-17 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711894#comment-13711894
 ] 

Jerry He commented on HBASE-8967:
-

Thanks for the comment and good reminder, Stack, Ted.

 Duplicate call to snapshotManager.stop() in HRegionServer
 -

 Key: HBASE-8967
 URL: https://issues.apache.org/jira/browse/HBASE-8967
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.95.2

 Attachments: HBASE-8967.patch, HBASE-8967-v2-0.94.patch, 
 HBASE-8967-v2.patch


 snapshotManager.stop() is called twice in HRegionServer shutdown process
 {code}
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: 
 regionserver60020.cacheFlusher exiting
 2013-07-12 12:06:56,909 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2013-07-12 12:06:56,909 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver60020.compactionChecker exiting
 2013-07-12 12:06:56,909 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: 
 Stopping catalog tracker 
 org.apache.hadoop.hbase.catalog.CatalogTracker@1bfd1bfd
 ...
 2013-07-12 12:06:56,911 INFO 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: 
 Stopping RegionServerSnapshotManager gracefully.
 {code}



[jira] [Created] (HBASE-9029) Backport HBASE-8706 Some improvement in snapshot to 0.94

2013-07-23 Thread Jerry He (JIRA)
Jerry He created HBASE-9029:
---

 Summary: Backport HBASE-8706 Some improvement in snapshot to 0.94
 Key: HBASE-9029
 URL: https://issues.apache.org/jira/browse/HBASE-9029
 Project: HBase
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.94.11


'HBASE-8706 Some improvement in snapshot' has some good parameter tuning and 
improvements for snapshot handling, making snapshots more robust.
It would be nice to put it in 0.94.



[jira] [Updated] (HBASE-9029) Backport HBASE-8706 Some improvement in snapshot to 0.94

2013-07-23 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-9029:


Attachment: HBase-9029-0.94.patch

 Backport HBASE-8706 Some improvement in snapshot to 0.94
 

 Key: HBASE-9029
 URL: https://issues.apache.org/jira/browse/HBASE-9029
 Project: HBase
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.94.11

 Attachments: HBase-9029-0.94.patch


 'HBASE-8706 Some improvement in snapshot' has some good parameter tuning and 
 improvements for snapshot handling, making snapshots more robust.
 It would be nice to put it in 0.94.



[jira] [Commented] (HBASE-9029) Backport HBASE-8706 Some improvement in snapshot to 0.94

2013-07-23 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717551#comment-13717551
 ] 

Jerry He commented on HBASE-9029:
-

It was backported to our 0.94.9 branch a short time ago.

Unit tests are clean on a local run:

{code}
Running org.apache.hadoop.hbase.TestHServerInfo
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 215.893 sec

Results :

Tests run: 692, Failures: 0, Errors: 0, Skipped: 0

Running org.apache.hadoop.hbase.snapshot.TestSnapshotDescriptionUtils
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.532 sec
Running org.apache.hadoop.hbase.snapshot.TestExportSnapshot
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.308 sec

Results :

Tests run: 888, Failures: 0, Errors: 0, Skipped: 2

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 31:52.827s
[INFO] Finished at: Fri Jul 19 20:52:08 PDT 2013
[INFO] Final Memory: 26M/226M
[INFO] 
{code}

Also ran some ad hoc snapshot testing successfully.

 Backport HBASE-8706 Some improvement in snapshot to 0.94
 

 Key: HBASE-9029
 URL: https://issues.apache.org/jira/browse/HBASE-9029
 Project: HBase
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.94.11

 Attachments: HBase-9029-0.94.patch


 'HBASE-8706 Some improvement in snapshot' has some good parameter tuning and 
 improvements for snapshot handling, making snapshots more robust.
 It would be nice to put it in 0.94.



[jira] [Created] (HBASE-9060) ExportSnapshot job fails if target path contains percentage character

2013-07-27 Thread Jerry He (JIRA)
Jerry He created HBASE-9060:
---

 Summary: ExportSnapshot job fails if target path contains 
percentage character
 Key: HBASE-9060
 URL: https://issues.apache.org/jira/browse/HBASE-9060
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.10, 0.95.1
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor


Here is the stack trace:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot table1_snapshot 
-copy-to hdfs:///myhbasebackup/table1_snapshot
 
{code}
13/07/26 18:09:50 INFO mapred.JobClient:  map 0% reduce 0%
13/07/26 18:09:58 INFO mapred.JobClient: Task Id : 
attempt_201307261804_0002_m_01_0, Status : FAILED
java.util.MissingFormatArgumentException: Format specifier ') from 
family1/table1=3567d8ac6cfee83dfe81c346f139fb9c-c5bc120475a54d188f30d4b621d505b1
 to hdfs:/myhbase%2C'
at java.util.Formatter.getArgument(Formatter.java:592)
at java.util.Formatter.format(Formatter.java:561)
at java.util.Formatter.format(Formatter.java:510)
at java.lang.String.format(String.java:1977)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyData(ExportSnapshot.java:274)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:204)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:149)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:98)
{code}

The problem is this code in copyData():
{code}
final String statusMessage = "copied %s/" + StringUtils.humanReadableInt(inputFileSize) +
    " (%.3f%%) from " + inputPath + " to " + outputPath;
{code}

Since we don't know what characters the path may contain that could confuse the 
formatter, we need to pull that part out of the format string.

Also the percentage completion math seems to be wrong in the same code.
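The failure and the idea of pulling the paths out of the format string can be reproduced in plain Java. The sketch below uses hypothetical names (`FormatDemo`, `safeStatus`) and is not the actual HBASE-9060 patch; the key point is that paths supplied as `%s` arguments are treated as plain text, so a literal '%' in a path (such as the URL-encoded "%2C") is never parsed as a format specifier.

```java
import java.util.MissingFormatArgumentException;

public class FormatDemo {
    // Safe: the paths are supplied as %s arguments, so "%2C" inside a
    // path is never interpreted by the Formatter.
    static String safeStatus(String in, String out, long copied, long total) {
        return String.format("copied %d/%d (%.3f%%) from %s to %s",
                copied, total, copied * 100.0 / total, in, out);
    }

    public static void main(String[] args) {
        String out = "hdfs:/myhbase%2Cbackup/table1_snapshot";
        // Buggy pattern: the path is concatenated into the format string
        // itself, so "%2C" is parsed as a format specifier with no
        // matching argument.
        try {
            String.format("copied %s (%.3f%%) from src to " + out, "10M", 1.0);
        } catch (MissingFormatArgumentException e) {
            System.out.println("buggy version throws " + e.getClass().getSimpleName());
        }
        System.out.println(safeStatus("src", out, 50, 200));
    }
}
```

Running this catches `MissingFormatArgumentException` for the buggy concatenation and then prints a well-formed status line for the safe version.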



[jira] [Updated] (HBASE-9060) ExportSnapshot job fails if target path contains percentage character

2013-07-27 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-9060:


Fix Version/s: 0.95.2
   Status: Patch Available  (was: Open)

 ExportSnapshot job fails if target path contains percentage character
 -

 Key: HBASE-9060
 URL: https://issues.apache.org/jira/browse/HBASE-9060
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.10, 0.95.1
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBase-9060.patch


 Here is the stack trace:
 hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 
 table1_snapshot -copy-to hdfs:///myhbasebackup/table1_snapshot
  
 {code}
 13/07/26 18:09:50 INFO mapred.JobClient:  map 0% reduce 0%
 13/07/26 18:09:58 INFO mapred.JobClient: Task Id : 
 attempt_201307261804_0002_m_01_0, Status : FAILED
 java.util.MissingFormatArgumentException: Format specifier ') from 
 family1/table1=3567d8ac6cfee83dfe81c346f139fb9c-c5bc120475a54d188f30d4b621d505b1
  to hdfs:/myhbase%2C'
 at java.util.Formatter.getArgument(Formatter.java:592)
 at java.util.Formatter.format(Formatter.java:561)
 at java.util.Formatter.format(Formatter.java:510)
 at java.lang.String.format(String.java:1977)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyData(ExportSnapshot.java:274)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:204)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:149)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:98)
 {code}
 The problem is this code in copyData():
 {code}
 final String statusMessage = "copied %s/" + StringUtils.humanReadableInt(inputFileSize) +
     " (%.3f%%) from " + inputPath + " to " + outputPath;
 {code}
 Since we don't know what characters the path may contain that could confuse the 
 formatter, we need to pull that part out of the format string.
 Also the percentage completion math seems to be wrong in the same code.



[jira] [Updated] (HBASE-9060) ExportSnapshot job fails if target path contains percentage character

2013-07-27 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-9060:


Attachment: HBase-9060.patch

 ExportSnapshot job fails if target path contains percentage character
 -

 Key: HBASE-9060
 URL: https://issues.apache.org/jira/browse/HBASE-9060
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.10
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Attachments: HBase-9060.patch


 Here is the stack trace:
 hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 
 table1_snapshot -copy-to hdfs:///myhbasebackup/table1_snapshot
  
 {code}
 13/07/26 18:09:50 INFO mapred.JobClient:  map 0% reduce 0%
 13/07/26 18:09:58 INFO mapred.JobClient: Task Id : 
 attempt_201307261804_0002_m_01_0, Status : FAILED
 java.util.MissingFormatArgumentException: Format specifier ') from 
 family1/table1=3567d8ac6cfee83dfe81c346f139fb9c-c5bc120475a54d188f30d4b621d505b1
  to hdfs:/myhbase%2C'
 at java.util.Formatter.getArgument(Formatter.java:592)
 at java.util.Formatter.format(Formatter.java:561)
 at java.util.Formatter.format(Formatter.java:510)
 at java.lang.String.format(String.java:1977)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyData(ExportSnapshot.java:274)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:204)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:149)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:98)
 {code}
 The problem is this code in copyData():
 {code}
 final String statusMessage = "copied %s/" + StringUtils.humanReadableInt(inputFileSize) +
     " (%.3f%%) from " + inputPath + " to " + outputPath;
 {code}
 Since we don't know what characters the path may contain that could confuse the 
 formatter, we need to pull that part out of the format string.
 Also the percentage completion math seems to be wrong in the same code.



[jira] [Updated] (HBASE-9060) ExportSnapshot job fails if target path contains percentage character

2013-07-28 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-9060:


Description: 
Here is the stack trace:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot table1_snapshot 
-copy-to hdfs:///myhbase%2Cbackup/table1_snapshot
 
{code}
13/07/26 18:09:50 INFO mapred.JobClient:  map 0% reduce 0%
13/07/26 18:09:58 INFO mapred.JobClient: Task Id : 
attempt_201307261804_0002_m_01_0, Status : FAILED
java.util.MissingFormatArgumentException: Format specifier ') from 
family1/table1=3567d8ac6cfee83dfe81c346f139fb9c-c5bc120475a54d188f30d4b621d505b1
 to hdfs:/myhbase%2C'
at java.util.Formatter.getArgument(Formatter.java:592)
at java.util.Formatter.format(Formatter.java:561)
at java.util.Formatter.format(Formatter.java:510)
at java.lang.String.format(String.java:1977)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyData(ExportSnapshot.java:274)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:204)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:149)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:98)
{code}

The problem is this code in copyData():
{code}
final String statusMessage = "copied %s/" + StringUtils.humanReadableInt(inputFileSize) +
    " (%.3f%%) from " + inputPath + " to " + outputPath;
{code}

Since we don't know what characters the path may contain that could confuse the 
formatter, we need to pull that part out of the format string.

Also the percentage completion math seems to be wrong in the same code.

  was:
Here is the stack trace:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot table1_snapshot 
-copy-to hdfs:///myhbasebackup/table1_snapshot
 
{code}
13/07/26 18:09:50 INFO mapred.JobClient:  map 0% reduce 0%
13/07/26 18:09:58 INFO mapred.JobClient: Task Id : 
attempt_201307261804_0002_m_01_0, Status : FAILED
java.util.MissingFormatArgumentException: Format specifier ') from 
family1/table1=3567d8ac6cfee83dfe81c346f139fb9c-c5bc120475a54d188f30d4b621d505b1
 to hdfs:/myhbase%2C'
at java.util.Formatter.getArgument(Formatter.java:592)
at java.util.Formatter.format(Formatter.java:561)
at java.util.Formatter.format(Formatter.java:510)
at java.lang.String.format(String.java:1977)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyData(ExportSnapshot.java:274)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:204)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:149)
at 
org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:98)
{code}

The problem is this code in copyData():
{code}
final String statusMessage = "copied %s/" + StringUtils.humanReadableInt(inputFileSize) +
    " (%.3f%%) from " + inputPath + " to " + outputPath;
{code}

Since we don't know what characters the path may contain that could confuse the 
formatter, we need to pull that part out of the format string.

Also the percentage completion math seems to be wrong in the same code.


 ExportSnapshot job fails if target path contains percentage character
 -

 Key: HBASE-9060
 URL: https://issues.apache.org/jira/browse/HBASE-9060
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.10
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBase-9060.patch


 Here is the stack trace:
 hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 
 table1_snapshot -copy-to hdfs:///myhbase%2Cbackup/table1_snapshot
  
 {code}
 13/07/26 18:09:50 INFO mapred.JobClient:  map 0% reduce 0%
 13/07/26 18:09:58 INFO mapred.JobClient: Task Id : 
 attempt_201307261804_0002_m_01_0, Status : FAILED
 java.util.MissingFormatArgumentException: Format specifier ') from 
 family1/table1=3567d8ac6cfee83dfe81c346f139fb9c-c5bc120475a54d188f30d4b621d505b1
  to hdfs:/myhbase%2C'
 at java.util.Formatter.getArgument(Formatter.java:592)
 at java.util.Formatter.format(Formatter.java:561)
 at java.util.Formatter.format(Formatter.java:510)
 at java.lang.String.format(String.java:1977)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyData(ExportSnapshot.java:274)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:204)
 at 
 

[jira] [Commented] (HBASE-9060) ExportSnapshot job fails if target path contains percentage character

2013-07-28 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721892#comment-13721892
 ] 

Jerry He commented on HBASE-9060:
-

Edited the description to correct the path in the command (I had originally 
given a path without the problem character).

 ExportSnapshot job fails if target path contains percentage character
 -

 Key: HBASE-9060
 URL: https://issues.apache.org/jira/browse/HBASE-9060
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.1, 0.94.10
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.95.2

 Attachments: HBase-9060.patch


 Here is the stack trace:
 hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 
 table1_snapshot -copy-to hdfs:///myhbase%2Cbackup/table1_snapshot
  
 {code}
 13/07/26 18:09:50 INFO mapred.JobClient:  map 0% reduce 0%
 13/07/26 18:09:58 INFO mapred.JobClient: Task Id : 
 attempt_201307261804_0002_m_01_0, Status : FAILED
 java.util.MissingFormatArgumentException: Format specifier ') from 
 family1/table1=3567d8ac6cfee83dfe81c346f139fb9c-c5bc120475a54d188f30d4b621d505b1
  to hdfs:/myhbase%2C'
 at java.util.Formatter.getArgument(Formatter.java:592)
 at java.util.Formatter.format(Formatter.java:561)
 at java.util.Formatter.format(Formatter.java:510)
 at java.lang.String.format(String.java:1977)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyData(ExportSnapshot.java:274)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:204)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:149)
 at 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:98)
 {code}
 The problem is this code in copyData():
 {code}
 final String statusMessage = "copied %s/" + StringUtils.humanReadableInt(inputFileSize) +
     " (%.3f%%) from " + inputPath + " to " + outputPath;
 {code}
 Since we don't know what characters the path may contain that could confuse the 
 formatter, we need to pull that part out of the format string.
 Also the percentage completion math seems to be wrong in the same code.



[jira] [Commented] (HBASE-9029) Backport HBASE-8706 Some improvement in snapshot to 0.94

2013-07-31 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13725469#comment-13725469
 ] 

Jerry He commented on HBASE-9029:
-

Hi, guys

Should we put this in 0.94?

 Backport HBASE-8706 Some improvement in snapshot to 0.94
 

 Key: HBASE-9029
 URL: https://issues.apache.org/jira/browse/HBASE-9029
 Project: HBase
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 0.94.9
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.94.11

 Attachments: HBase-9029-0.94.patch


 'HBASE-8706 Some improvement in snapshot' has some good parameter tuning and 
 improvements for snapshot handling, making snapshots more robust.
 It would be nice to put it in 0.94.



[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-02 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727919#comment-13727919
 ] 

Jerry He commented on HBASE-8760:
-

Hi, Matteo

I agree your patch is probably the best we can do for now.

We can probably do more in HBASE-7987 to have a new solution for this problem. 
For example, include the parent hfiles in the manifest file but add 
indicators/markers to show that they are meant to be parent hfiles.

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-thz-v0.patch


 Right after a region split but before the daughter regions are compacted, we 
 have two daughter regions containing Reference files to the parent hfiles.
 If we take a snapshot at that moment, the snapshot will succeed, but it will 
 only contain the daughter Reference files. Since there is no hold on the 
 parent hfiles, they will be deleted by the HFile Cleaner soon after they are 
 no longer needed by the daughter regions.
 At a minimum we need to keep these parent hfiles from being deleted. 



[jira] [Updated] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-02 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8565:


Description: 
In stop-hbase.sh:
{code}
  # TODO: store backup masters in ZooKeeper and have the primary send them a 
shutdown message
  # stop any backup masters
  $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
--hosts ${HBASE_BACKUP_MASTERS} stop master-backup
{code}

After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the backup 
master too via the cluster status znode.

We should not need the above code anymore.

Another issue happens when the current master has died and the backup master 
has become the active master.
{code}
nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
    --config ${HBASE_CONF_DIR} \
    master stop "$@" > $logout 2>&1 < /dev/null &

waitForProcessEnd `cat $pid` 'stop-master-command'
{code}
We can still issue 'stop-hbase.sh' from the old master.
stop-hbase.sh -> hbase master stop -> look for active master -> request shutdown
This process still works.
But the waitForProcessEnd statement will not work since the local master pid is 
no longer relevant.
What is the best way in this case?


  was:
In stop-hbase.sh:
{code}
  # TODO: store backup masters in ZooKeeper and have the primary send them a 
shutdown message
  # stop any backup masters
  $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
--hosts ${HBASE_BACKUP_MASTERS} stop master-backup
{code}

After HBASE-5213, stop-hbase.sh -> hbase master stop will bring up the backup 
master too via the cluster status znode.

We should not need the above code anymore.

Another issue happens when the current master has died and the backup master 
has become the active master.
{code}
nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
    --config ${HBASE_CONF_DIR} \
    master stop "$@" > $logout 2>&1 < /dev/null &

waitForProcessEnd `cat $pid` 'stop-master-command'
{code}
We can still issue 'stop-hbase.sh' from the old master.
stop-hbase.sh -> hbase master stop -> look for active master -> request shutdown
This process still works.
But the waitForProcessEnd statement will not work since the local master pid is 
no longer relevant.
What is the best way in this case?



 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.94.7, 0.95.0
Reporter: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.95.2, 0.94.12


 In stop-hbase.sh:
 {code}
   # TODO: store backup masters in ZooKeeper and have the primary send them a 
 shutdown message
   # stop any backup masters
   $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
 --hosts ${HBASE_BACKUP_MASTERS} stop master-backup
 {code}
 After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the 
 backup master too via the cluster status znode.
 We should not need the above code anymore.
 Another issue happens when the current master has died and the backup master 
 has become the active master.
 {code}
 nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
    --config ${HBASE_CONF_DIR} \
    master stop "$@" > $logout 2>&1 < /dev/null &
 waitForProcessEnd `cat $pid` 'stop-master-command'
 {code}
 We can still issue 'stop-hbase.sh' from the old master.
 stop-hbase.sh -> hbase master stop -> look for active master -> request 
 shutdown
 This process still works.
 But the waitForProcessEnd statement will not work since the local master pid 
 is no longer relevant.
 What is the best way in this case?
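One way to sidestep the stale-pid problem raised above: instead of waiting on the local master pid, poll the active master's RPC port until it stops accepting connections. The bash sketch below uses a hypothetical helper name and assumes the 0.94-era default master port; it is not the actual HBase change, just an illustration of the approach.

```shell
#!/usr/bin/env bash
# Hedged sketch (hypothetical helper; not the actual HBase patch):
# instead of waiting on the local master pid -- which is stale once a
# backup master has taken over -- poll the active master's RPC port
# until it stops accepting connections.
wait_for_master_shutdown() {
  host="$1"; port="$2"; tries="${3:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # bash's /dev/tcp pseudo-device: the redirect fails once the port
    # is closed, i.e. the master process has exited.
    if ! (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      echo "master at ${host}:${port} is down"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for master at ${host}:${port}" >&2
  return 1
}

# Example: assume the master RPC port is 60000 (the 0.94-era default).
wait_for_master_shutdown localhost 60000 3
```

This works from any host that can reach the master, so it does not matter which machine stop-hbase.sh was issued from.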



[jira] [Commented] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-02 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727970#comment-13727970
 ] 

Jerry He commented on HBASE-8565:
-

Correct a typo in the description:
Old: After HBASE-5213, stop-hbase.sh -> hbase master stop will bring up the 
backup master too via the cluster status znode.
New: After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the 
backup master too via the cluster status znode.

 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.94.7, 0.95.0
Reporter: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.95.2, 0.94.12


 In stop-hbase.sh:
 {code}
   # TODO: store backup masters in ZooKeeper and have the primary send them a 
 shutdown message
   # stop any backup masters
   $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
 --hosts ${HBASE_BACKUP_MASTERS} stop master-backup
 {code}
 After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the 
 backup master too via the cluster status znode.
 We should not need the above code anymore.
 Another issue happens when the current master has died and the backup master 
 has become the active master.
 {code}
 nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
    --config ${HBASE_CONF_DIR} \
    master stop "$@" > $logout 2>&1 < /dev/null &
 waitForProcessEnd `cat $pid` 'stop-master-command'
 {code}
 We can still issue 'stop-hbase.sh' from the old master.
 stop-hbase.sh -> hbase master stop -> look for active master -> request 
 shutdown
 This process still works.
 But the waitForProcessEnd statement will not work since the local master pid 
 is no longer relevant.
 What is the best way in this case?



[jira] [Updated] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-02 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8565:


Status: Patch Available  (was: Open)

 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.95.0, 0.94.7
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBASE-8565-v1-0.94.patch, HBASE-8565-v1-trunk.patch


 In stop-hbase.sh:
 {code}
   # TODO: store backup masters in ZooKeeper and have the primary send them a 
 shutdown message
   # stop any backup masters
   $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
 --hosts ${HBASE_BACKUP_MASTERS} stop master-backup
 {code}
 After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the 
 backup master too via the cluster status znode.
 We should not need the above code anymore.
 Another issue happens when the current master has died and the backup master 
 has become the active master.
 {code}
 nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
    --config ${HBASE_CONF_DIR} \
    master stop "$@" > $logout 2>&1 < /dev/null &
 waitForProcessEnd `cat $pid` 'stop-master-command'
 {code}
 We can still issue 'stop-hbase.sh' from the old master.
 stop-hbase.sh -> hbase master stop -> look for active master -> request 
 shutdown
 This process still works.
 But the waitForProcessEnd statement will not work since the local master pid 
 is no longer relevant.
 What is the best way in this case?



[jira] [Assigned] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-02 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He reassigned HBASE-8565:
---

Assignee: Jerry He

 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.94.7, 0.95.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBASE-8565-v1-0.94.patch, HBASE-8565-v1-trunk.patch


 In stop-hbase.sh:
 {code}
   # TODO: store backup masters in ZooKeeper and have the primary send them a 
 shutdown message
   # stop any backup masters
   $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
 --hosts ${HBASE_BACKUP_MASTERS} stop master-backup
 {code}
 After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the 
 backup master too via the cluster status znode.
 We should not need the above code anymore.
 Another issue happens when the current master has died and the backup master 
 has become the active master.
 {code}
 nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
    --config ${HBASE_CONF_DIR} \
    master stop "$@" > $logout 2>&1 < /dev/null &
 waitForProcessEnd `cat $pid` 'stop-master-command'
 {code}
 We can still issue 'stop-hbase.sh' from the old master.
 stop-hbase.sh -> hbase master stop -> look for active master -> request 
 shutdown
 This process still works.
 But the waitForProcessEnd statement will not work since the local master pid 
 is no longer relevant.
 What is the best way in this case?



[jira] [Commented] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-02 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728365#comment-13728365
 ] 

Jerry He commented on HBASE-8565:
-

Attached an initial patch.

 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.94.7, 0.95.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBASE-8565-v1-0.94.patch, HBASE-8565-v1-trunk.patch


 In stop-hbase.sh:
 {code}
   # TODO: store backup masters in ZooKeeper and have the primary send them a 
 shutdown message
   # stop any backup masters
   $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
 --hosts ${HBASE_BACKUP_MASTERS} stop master-backup
 {code}
 After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the 
 backup master too via the cluster status znode.
 We should not need the above code anymore.
 Another issue happens when the current master has died and the backup master 
 has become the active master.
 {code}
 nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
    --config ${HBASE_CONF_DIR} \
    master stop "$@" > $logout 2>&1 < /dev/null &
 waitForProcessEnd `cat $pid` 'stop-master-command'
 {code}
 We can still issue 'stop-hbase.sh' from the old master.
 stop-hbase.sh -> hbase master stop -> look for active master -> request 
 shutdown
 This process still works.
 But the waitForProcessEnd statement will not work since the local master pid 
 is no longer relevant.
 What is the best way in this case?



[jira] [Updated] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-02 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8565:


Attachment: HBASE-8565-v1-0.94.patch
HBASE-8565-v1-trunk.patch

 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.94.7, 0.95.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBASE-8565-v1-0.94.patch, HBASE-8565-v1-trunk.patch




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728951#comment-13728951
 ] 

Jerry He commented on HBASE-8760:
-

This is a meaningful change!
Parent region and daughter regions are all included in the snapshot. After 
restore/clone, will all of them be included and brought online?

Code comments:
{code}
 snapshtoDisabledRegion(snapshotDir, regionInfo);
{code}
Typo in the method name.
{code}
public void verifyRegions(Path snapshotDir) throws IOException
{code}
=> private void verifyRegions(final Path snapshotDir) throws IOException
{code}
  private void verifyRegion(final FileSystem fs, final Path snapshotDir, final 
HRegionInfo region)
  throws IOException {
 // make sure we have region in the snapshot
{code}
That comment line is not needed anymore.


 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch


 Right after a region split but before the daughter regions are compacted, we 
 have two daughter regions containing Reference files to the parent hfiles.
 If we take a snapshot right at that moment, the snapshot will succeed, but it 
 will only contain the daughter Reference files. Since there is no hold on the 
 parent hfiles, they will be deleted by the HFile Cleaner soon after they are 
 no longer needed by the daughter regions.
 At a minimum we need to keep these parent hfiles from being deleted.



[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728970#comment-13728970
 ] 

Jerry He commented on HBASE-8760:
-

And the population of .META. is based on the .regioninfo files which were 
carried over from the original table?
That makes sense.
Thanks.

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-05 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730002#comment-13730002
 ] 

Jerry He commented on HBASE-8760:
-

I had seen the problem on a live 0.94 cluster. It happened when there was heavy 
write load on the cluster while the snapshot was taken.
Later, to re-create the problem, I had to suspend the compaction thread 
manually so that right after a region split the new regions would not be 
compacted right away.
I have not had a chance to test this patch yet.


 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, HBASE-8760-thz-v2.patch




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-11 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736454#comment-13736454
 ] 

Jerry He commented on HBASE-8760:
-

Hi, Matteo

I've just tested the v4 patch against 0.94 and 0.95.2. These are the basic steps:
1. Change the code to disable compaction (similar to what you mentioned).
2. Start hbase.
3. Create and populate a TestTable with 'hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 5'.
4. Split TestTable.
5. snapshot 'TestTable', 'my_snapshot1'  (This snapshot includes parent and daughter references.)
6. Stop hbase.
7. Change the code to enable normal compaction.
8. Start hbase.
9. Wait for normal compactions (and/or additional splits) to run their course, and for the hfile cleaners to run as well.
10. clone_snapshot 'my_snapshot1', 'TestTable_clone'
11. Count the rows of TestTable_clone to verify the number is the same as in TestTable.
12. Verify there are no exceptions in region server logs like 'can not open link' or 'can not open file'.

13. snapshot 'TestTable_clone', 'my_snapshot2'
14. clone_snapshot 'my_snapshot2', 'TestTable_clone_clone'
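Collected as a script body, the shell portion of steps 4, 5, 10, and 11 above would look roughly like this (table and snapshot names are taken from the steps; the commands are only printed here, since actually running them requires a live cluster):

```shell
# The hbase shell commands for steps 4-5 and 10-11 above, collected into a
# here-doc; piping this into 'hbase shell -n' would run them on a live cluster.
hbase_cmds=$(cat <<'EOF'
split 'TestTable'
snapshot 'TestTable', 'my_snapshot1'
clone_snapshot 'my_snapshot1', 'TestTable_clone'
count 'TestTable_clone'
EOF
)
printf '%s\n' "$hbase_cmds"
```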


 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, 
 HBASE-8760-thz-v2.patch, HBASE-8760-thz-v3.patch, HBASE-8760-v4.patch




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-11 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736461#comment-13736461
 ] 

Jerry He commented on HBASE-8760:
-

The patch is working well up to step 12. I've not been able to re-create the 
problem.

But I have seen problems and exceptions in both 0.94 and 0.95.2 during steps 
13 and 14, for a second-level snapshot and clone.

For example, in 0.94:
{code}
hbase(main):005:0> snapshot 'TestTable_clone', 'my_snapshot2'

ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: 
org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { 
ss=my_snapshot2 table=TestTable_clone type=FLUSH } had an error.  my_snapshot2 
not found in proclist []
at 
org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:359)
at 
org.apache.hadoop.hbase.master.HMaster.isSnapshotDone(HMaster.java:2185)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via 
Failed taking snapshot { ss=my_snapshot2 table=TestTable_clone type=FLUSH } due 
to exception:Missing parent hfile for: 
TestTable=83935cdbb327ac84f45a7248f4d58173-048d68de11a042e9aba294ab336ddbf3.630c188f55575e0cce497ba342b562bb:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException:
 Missing parent hfile for: 
TestTable=83935cdbb327ac84f45a7248f4d58173-048d68de11a042e9aba294ab336ddbf3.630c188f55575e0cce497ba342b562bb
at 
org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:85)
at 
org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:282)
at 
org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:349)
... 7 more
Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Missing 
parent hfile for: 
TestTable=83935cdbb327ac84f45a7248f4d58173-048d68de11a042e9aba294ab336ddbf3.630c188f55575e0cce497ba342b562bb
at 
org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyStoreFile(MasterSnapshotVerifier.java:223)
at 
org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.access$000(MasterSnapshotVerifier.java:85)
at 
org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier$1.storeFile(MasterSnapshotVerifier.java:209)
at 
org.apache.hadoop.hbase.util.FSVisitor.visitRegionStoreFiles(FSVisitor.java:115)
{code}
From the logs, in this failure 630c188f55575e0cce497ba342b562bb is a region 
in TestTable_clone that went thru its own split. It was gone (not even in 
.archive) after its split.
But somehow there are remaining links/references to it in TestTable_clone.
TestTable_clone has 3M-plus rows. It could go thru compactions and splits on 
its own. That seems to have confused snapshot operations.
If you need the relevant master/region server logs, I can send them to you or 
attach them here.

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, 
 HBASE-8760-thz-v2.patch, HBASE-8760-thz-v3.patch, HBASE-8760-v4.patch




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-11 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736468#comment-13736468
 ] 

Jerry He commented on HBASE-8760:
-

Some exceptions were seen against 0.95.2 during step 14. They are different 
from 0.94, but could just be due to random timing.
Step 13 was ok.
Step 14 was successful as well, but there were errors in the logs:
{code}
2013-08-10 23:00:16,463 ERROR [RS_OPEN_REGION-hdtest009:60021-1] 
handler.OpenRegionHandler: Failed open of 
region=TestTable_clone_clone,,1376197826879.c3ea5fba0fe4a49a9e93102d133b99fd., 
starting to roll back the global memstore size.
...
Caused by: java.io.IOException: java.io.FileNotFoundException: Unable to open 
link: org.apache.hadoop.hbase.io.HFileLink 
locations=[hdfs://hdtest009.svl.ibm.com:9000/hbase95/.data/default/TestTable_clone/9d76f97c231b0ffa4f9ecbe73bfc2acd/info/9af07c31650045d28aa13d8b37251690,
 
hdfs://hdtest009.svl.ibm.com:9000/hbase95/.tmp/.data/default/TestTable_clone/9d76f97c231b0ffa4f9ecbe73bfc2acd/info/9af07c31650045d28aa13d8b37251690,
 
hdfs://hdtest009.svl.ibm.com:9000/hbase95/.archive/.data/default/TestTable_clone/9d76f97c231b0ffa4f9ecbe73bfc2acd/info/9af07c31650045d28aa13d8b37251690]
at 
org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:448)
at org.apache.hadoop.hbase.regionserver.HStore.init(HStore.java:241)
at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3122)
{code}
Based on the logs, the failed region was a parent region. The daughter regions 
were ok; therefore the end row count was good.
Again, if you need the relevant logs, I can send them to you or attach them here.

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, 
 HBASE-8760-thz-v2.patch, HBASE-8760-thz-v3.patch, HBASE-8760-v4.patch




[jira] [Updated] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-11 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8760:


Attachment: v4-patch-testing-0.95.2.zip
v4-patch-testing-0.94.zip

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, 
 HBASE-8760-thz-v2.patch, HBASE-8760-thz-v3.patch, HBASE-8760-v4.patch, 
 v4-patch-testing-0.94.zip, v4-patch-testing-0.95.2.zip




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-11 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736557#comment-13736557
 ] 

Jerry He commented on HBASE-8760:
-

Attached zip files from the testing.
Each contains:
1. master/region server logs
2. file listings for the snapshots and tables during the testing.

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, 
 HBASE-8760-thz-v2.patch, HBASE-8760-thz-v3.patch, HBASE-8760-v4.patch, 
 v4-patch-testing-0.94.zip, v4-patch-testing-0.95.2.zip




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-14 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739972#comment-13739972
 ] 

Jerry He commented on HBASE-8760:
-

Hi, Matteo

From the master and region server logs, the offline regions from the snapshot 
were clearly being brought online.
I didn't make the connection between that and the failures/exceptions.
I am very hopeful this latest patch will solve it. I will do some quick testing.
Thanks.

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-0.94-v5.patch, HBASE-8760-0.94-v6.patch, 
 HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, HBASE-8760-thz-v2.patch, 
 HBASE-8760-thz-v3.patch, HBASE-8760-v4.patch, v4-patch-testing-0.94.zip, 
 v4-patch-testing-0.95.2.zip




[jira] [Updated] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-18 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-8760:


Attachment: HBASE-8760-0.94-v8-addendum.patch

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-0.94-v5.patch, HBASE-8760-0.94-v6.patch, 
 HBASE-8760-0.94-v7.patch, HBASE-8760-0.94-v8-addendum.patch, 
 HBASE-8760-0.94-v8.patch, HBASE-8760-thz-v0.patch, HBASE-8760-trunk-v8.patch, 
 HBASE-8760-v4.patch, v4-patch-testing-0.94.zip, v4-patch-testing-0.95.2.zip




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-18 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13743522#comment-13743522
 ] 

Jerry He commented on HBASE-8760:
-

Hi, Matteo

Thank you for the time and effort you spent on this JIRA!  There had been more 
complexity and problems than anticipated.

I applied HBASE-9207, HBASE-9233, and then the HBASE-8760-0.94-v8.patch on my 
0.94 cluster.

I went through the test steps outlined in my previous comment a few times, 
sometimes with minor changes in the steps.

There is one more issue. (Hopefully this is the last one!)
We should not include the offline regions' ServerNames in the online snapshot 
procedure. Otherwise the snapshot procedure will time out waiting for an 
obsolete ServerName if the ServerName has changed, e.g. after a restart.

Attached a 0.94-v8-addendum. It goes on top of HBASE-8760-0.94-v8.patch.

After this, I have not seen any failure or exceptions during the testing. 
The row counts always match. The logs are clean without errors or exceptions 
too.
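As an illustration only (the data format and names here are hypothetical, not the patch's actual code), the addendum's idea of excluding offline regions' ServerNames from the online snapshot procedure amounts to a filter like:

```shell
# Hypothetical sketch: drop offline regions (e.g. split parents) before
# collecting the ServerNames an online snapshot procedure must wait on,
# so it never waits on a stale ServerName after a restart.
# Format per line: "<servername> <online|offline>".
regions='rs1.example.com,60020,111 online
rs-old.example.com,60020,42 offline
rs2.example.com,60020,222 online'

servers=$(printf '%s\n' "$regions" | awk '$2 == "online" { print $1 }' | sort -u)
printf '%s\n' "$servers"
```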

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v4.patch, HBASE-8760-0.94-v5.patch, HBASE-8760-0.94-v6.patch, 
 HBASE-8760-0.94-v7.patch, HBASE-8760-0.94-v8-addendum.patch, 
 HBASE-8760-0.94-v8.patch, HBASE-8760-thz-v0.patch, HBASE-8760-trunk-v8.patch, 
 HBASE-8760-v4.patch, v4-patch-testing-0.94.zip, v4-patch-testing-0.95.2.zip




[jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split

2013-08-20 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745565#comment-13745565
 ] 

Jerry He commented on HBASE-8760:
-

Thanks Matteo and everyone!

 possible loss of data in snapshot taken after region split
 --

 Key: HBASE-8760
 URL: https://issues.apache.org/jira/browse/HBASE-8760
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.8, 0.95.1
Reporter: Jerry He
Assignee: Matteo Bertozzi
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, 
 HBASE-8760-0.94-v10.patch, HBASE-8760-0.94-v4.patch, 
 HBASE-8760-0.94-v5.patch, HBASE-8760-0.94-v6.patch, HBASE-8760-0.94-v7.patch, 
 HBASE-8760-0.94-v8-addendum.patch, HBASE-8760-0.94-v8.patch, 
 HBASE-8760-0.94-v9.patch, HBASE-8760-thz-v0.patch, 
 HBASE-8760-trunk-v10.patch, HBASE-8760-trunk-v8.patch, 
 HBASE-8760-trunk-v9.patch, HBASE-8760-v4.patch, v4-patch-testing-0.94.zip, 
 v4-patch-testing-0.95.2.zip




[jira] [Commented] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-22 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747703#comment-13747703
 ] 

Jerry He commented on HBASE-8565:
-

Hi, Stack
Thank you for the comment.

bq. On the second issue, test for presence of the process before waiting on it?

We could do a check on the local master pid.
But to make it work even when the master is not local anymore, instead of 
waiting on the local master pid, can we borrow the idea of waiting on the 
master znode in ZooKeeper:
{code}
zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.master`
if [ "$zmaster" == "null" ]; then zmaster="master"; fi
zmaster=$zparent/$zmaster
echo -n "Waiting for Master ZNode ${zmaster} to expire"
while ! $bin/hbase zkcli stat $zmaster 2>&1 | grep "Node does not exist"; do
  echo -n "."
  sleep 1
done
{code}
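Stack's alternative from the quoted suggestion (test for the presence of the process before waiting on it) could be sketched as a small helper; the function name and timeout here are illustrative, not HBase's actual hbase-common.sh code:

```shell
# Illustrative sketch only: check that the pid is actually present before
# polling for its exit, so stop-hbase.sh never blocks on a stale local
# master pid.
wait_for_process_end() {
  pid=$1
  tries=${2:-30}
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "pid $pid not running locally; skipping wait"
    return 0
  fi
  # Poll until the process is gone or we run out of tries.
  while kill -0 "$pid" 2>/dev/null && [ "$tries" -gt 0 ]; do
    sleep 1
    tries=$((tries - 1))
  done
}

sleep 2 &                      # stand-in for a local master process
wait_for_process_end $! 10
echo "master process ended"
```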

 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.94.7, 0.95.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: HBASE-8565-v1-0.94.patch, HBASE-8565-v1-trunk.patch




[jira] [Commented] (HBASE-8565) stop-hbase.sh clean up: backup master

2013-08-22 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748317#comment-13748317
 ] 

Jerry He commented on HBASE-8565:
-

Hi, Lars, Stack

The extra code that was removed doesn't break anything. It is just maybe 
redundant.
But keeping it the way it is currently for 0.94 to reduce risk is prudent.

Feel free to close this one. As you suggested, any additional polishing of 
stop-hbase.sh will be addressed in another jira.

Thanks!

 stop-hbase.sh clean up: backup master
 -

 Key: HBASE-8565
 URL: https://issues.apache.org/jira/browse/HBASE-8565
 Project: HBase
  Issue Type: Bug
  Components: master, scripts
Affects Versions: 0.94.7, 0.95.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
 Fix For: 0.98.0, 0.96.0

 Attachments: HBASE-8565-v1-0.94.patch, HBASE-8565-v1-trunk.patch


 In stop-hbase.sh:
 {code}
   # TODO: store backup masters in ZooKeeper and have the primary send them a shutdown message
   # stop any backup masters
   $bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
 --hosts ${HBASE_BACKUP_MASTERS} stop master-backup
 {code}
 After HBASE-5213, stop-hbase.sh -> hbase master stop will bring down the 
 backup master too via the cluster status znode.
 We should not need the above code anymore.
 Another issue arises when the current master has died and the backup master 
 has become the active master.
 {code}
 nohup nice -n ${HBASE_NICENESS:-0} $HBASE_HOME/bin/hbase \
--config ${HBASE_CONF_DIR} \
master stop $@ > $logout 2>&1 < /dev/null &
 waitForProcessEnd `cat $pid` 'stop-master-command'
 {code}
 We can still issue 'stop-hbase.sh' from the old master.
 stop-hbase.sh -> hbase master stop -> look for active master -> request shutdown
 This process still works.
 But the waitForProcessEnd statement will not work since the local master pid 
 is not relevant anymore.
 What is the best way to handle this case?



[jira] [Created] (HBASE-9397) Snapshots with the same name are allowed to proceed concurrently

2013-08-30 Thread Jerry He (JIRA)
Jerry He created HBASE-9397:
---

 Summary: Snapshots with the same name are allowed to proceed 
concurrently
 Key: HBASE-9397
 URL: https://issues.apache.org/jira/browse/HBASE-9397
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.11, 0.95.2
Reporter: Jerry He
Assignee: Jerry He
 Fix For: 0.94.12, 0.96.0


Snapshots with the same name (but on different tables) are allowed to proceed 
concurrently.
This seems to be a loophole created by allowing multiple snapshots (on different 
tables) to run concurrently.
There are two checks in SnapshotManager, but they fail to catch this particular 
case.
In isSnapshotCompleted(), we only check the completed snapshot directory.
In isTakingSnapshot(), we only check for the same table name.

The end result is that concurrently running snapshots with the same name 
overlap and interfere with each other, for example by cleaning up each other's 
snapshot working directory in .hbase-snapshot/.tmp/snapshot-name.

{code}
2013-08-29 18:25:13,443 ERROR 
org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking 
snapshot { ss=mysnapshot table=TestTable type=FLUSH } due to exception:Couldn't 
read snapshot info 
from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
snapshot info 
from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
at 
org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:321)
at 
org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshotDescription(MasterSnapshotVerifier.java:123)
{code}
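The working-directory collision behind the error above can be illustrated with a small sketch. The layout under .hbase-snapshot mirrors the paths in the log, but the function and its root-directory argument are hypothetical; the actual fix belongs in SnapshotManager (extending the isTakingSnapshot() check to compare snapshot names across tables), not in a shell script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: treat an existing working directory under
# .hbase-snapshot/.tmp/<name> as a same-name snapshot already in flight,
# no matter which table it targets. "root" stands in for the HBase root dir.

snapshot_name_free() {
  local root="$1" name="$2"
  if [ -d "$root/.hbase-snapshot/.tmp/$name" ]; then
    # A second snapshot reusing this directory would corrupt the first,
    # producing the CorruptedSnapshotException shown above.
    echo "snapshot '$name' is already in progress" >&2; return 1
  elif [ -d "$root/.hbase-snapshot/$name" ]; then
    echo "snapshot '$name' already exists" >&2; return 2
  fi
  echo "snapshot name '$name' is free"
}
```

The key point is that the check keys on the snapshot name alone, independent of the table, which is exactly the case the two existing checks miss.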

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-9397) Snapshots with the same name are allowed to proceed concurrently

2013-08-30 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-9397:


Status: Patch Available  (was: Open)

 Snapshots with the same name are allowed to proceed concurrently
 

 Key: HBASE-9397
 URL: https://issues.apache.org/jira/browse/HBASE-9397
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.94.11, 0.95.2
Reporter: Jerry He
Assignee: Jerry He
 Fix For: 0.94.12, 0.96.0

 Attachments: HBASE-9397-0.94.patch, HBASE-9397-trunk.patch


 Snapshots with the same name (but on different tables) are allowed to proceed 
 concurrently.
 This seems to be a loophole created by allowing multiple snapshots (on 
 different tables) to run concurrently.
 There are two checks in SnapshotManager, but they fail to catch this 
 particular case.
 In isSnapshotCompleted(), we only check the completed snapshot directory.
 In isTakingSnapshot(), we only check for the same table name.
 The end result is that concurrently running snapshots with the same name 
 overlap and interfere with each other, for example by cleaning up each other's 
 snapshot working directory in .hbase-snapshot/.tmp/snapshot-name.
 {code}
 2013-08-29 18:25:13,443 ERROR 
 org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking 
 snapshot { ss=mysnapshot table=TestTable type=FLUSH } due to 
 exception:Couldn't read snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
 snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 at 
 org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:321)
 at 
 org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshotDescription(MasterSnapshotVerifier.java:123)
 {code}



[jira] [Updated] (HBASE-9397) Snapshots with the same name are allowed to proceed concurrently

2013-08-30 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-9397:


Attachment: HBASE-9397-trunk.patch
HBASE-9397-0.94.patch

 Snapshots with the same name are allowed to proceed concurrently
 

 Key: HBASE-9397
 URL: https://issues.apache.org/jira/browse/HBASE-9397
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.2, 0.94.11
Reporter: Jerry He
Assignee: Jerry He
 Fix For: 0.94.12, 0.96.0

 Attachments: HBASE-9397-0.94.patch, HBASE-9397-trunk.patch


 Snapshots with the same name (but on different tables) are allowed to proceed 
 concurrently.
 This seems to be a loophole created by allowing multiple snapshots (on 
 different tables) to run concurrently.
 There are two checks in SnapshotManager, but they fail to catch this 
 particular case.
 In isSnapshotCompleted(), we only check the completed snapshot directory.
 In isTakingSnapshot(), we only check for the same table name.
 The end result is that concurrently running snapshots with the same name 
 overlap and interfere with each other, for example by cleaning up each other's 
 snapshot working directory in .hbase-snapshot/.tmp/snapshot-name.
 {code}
 2013-08-29 18:25:13,443 ERROR 
 org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking 
 snapshot { ss=mysnapshot table=TestTable type=FLUSH } due to 
 exception:Couldn't read snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
 snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 at 
 org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:321)
 at 
 org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshotDescription(MasterSnapshotVerifier.java:123)
 {code}



[jira] [Commented] (HBASE-9397) Snapshots with the same name are allowed to proceed concurrently

2013-08-30 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755113#comment-13755113
 ] 

Jerry He commented on HBASE-9397:
-

Attached a straightforward fix. Any comments and other suggestions are welcome.

 Snapshots with the same name are allowed to proceed concurrently
 

 Key: HBASE-9397
 URL: https://issues.apache.org/jira/browse/HBASE-9397
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.2, 0.94.11
Reporter: Jerry He
Assignee: Jerry He
 Fix For: 0.94.12, 0.96.0

 Attachments: HBASE-9397-0.94.patch, HBASE-9397-trunk.patch


 Snapshots with the same name (but on different tables) are allowed to proceed 
 concurrently.
 This seems to be a loophole created by allowing multiple snapshots (on 
 different tables) to run concurrently.
 There are two checks in SnapshotManager, but they fail to catch this 
 particular case.
 In isSnapshotCompleted(), we only check the completed snapshot directory.
 In isTakingSnapshot(), we only check for the same table name.
 The end result is that concurrently running snapshots with the same name 
 overlap and interfere with each other, for example by cleaning up each other's 
 snapshot working directory in .hbase-snapshot/.tmp/snapshot-name.
 {code}
 2013-08-29 18:25:13,443 ERROR 
 org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking 
 snapshot { ss=mysnapshot table=TestTable type=FLUSH } due to 
 exception:Couldn't read snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
 snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 at 
 org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:321)
 at 
 org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshotDescription(MasterSnapshotVerifier.java:123)
 {code}



[jira] [Commented] (HBASE-9397) Snapshots with the same name are allowed to proceed concurrently

2013-08-31 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755615#comment-13755615
 ] 

Jerry He commented on HBASE-9397:
-

Matteo,

Thanks for the comment. Yes, the 'restoreHandlers' part was a last-minute 
copy-paste error.
I corrected it and followed your second suggestion too.

It is easy to test with one relatively big table (100G): the snapshot takes a 
few seconds, which leaves time for another snapshot with the same name (or on 
the same table) to sneak in.

 Snapshots with the same name are allowed to proceed concurrently
 

 Key: HBASE-9397
 URL: https://issues.apache.org/jira/browse/HBASE-9397
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.2, 0.94.11
Reporter: Jerry He
Assignee: Jerry He
 Fix For: 0.94.12, 0.96.0

 Attachments: HBASE-9397-0.94.patch, HBASE-9397-0.94-v2.patch, 
 HBASE-9397-trunk.patch, HBASE-9397-trunk-v2.patch


 Snapshots with the same name (but on different tables) are allowed to proceed 
 concurrently.
 This seems to be a loophole created by allowing multiple snapshots (on 
 different tables) to run concurrently.
 There are two checks in SnapshotManager, but they fail to catch this 
 particular case.
 In isSnapshotCompleted(), we only check the completed snapshot directory.
 In isTakingSnapshot(), we only check for the same table name.
 The end result is that concurrently running snapshots with the same name 
 overlap and interfere with each other, for example by cleaning up each other's 
 snapshot working directory in .hbase-snapshot/.tmp/snapshot-name.
 {code}
 2013-08-29 18:25:13,443 ERROR 
 org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking 
 snapshot { ss=mysnapshot table=TestTable type=FLUSH } due to 
 exception:Couldn't read snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
 snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 at 
 org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:321)
 at 
 org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshotDescription(MasterSnapshotVerifier.java:123)
 {code}



[jira] [Updated] (HBASE-9397) Snapshots with the same name are allowed to proceed concurrently

2013-08-31 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-9397:


Attachment: HBASE-9397-trunk-v2.patch
HBASE-9397-0.94-v2.patch

 Snapshots with the same name are allowed to proceed concurrently
 

 Key: HBASE-9397
 URL: https://issues.apache.org/jira/browse/HBASE-9397
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.95.2, 0.94.11
Reporter: Jerry He
Assignee: Jerry He
 Fix For: 0.94.12, 0.96.0

 Attachments: HBASE-9397-0.94.patch, HBASE-9397-0.94-v2.patch, 
 HBASE-9397-trunk.patch, HBASE-9397-trunk-v2.patch


 Snapshots with the same name (but on different tables) are allowed to proceed 
 concurrently.
 This seems to be a loophole created by allowing multiple snapshots (on 
 different tables) to run concurrently.
 There are two checks in SnapshotManager, but they fail to catch this 
 particular case.
 In isSnapshotCompleted(), we only check the completed snapshot directory.
 In isTakingSnapshot(), we only check for the same table name.
 The end result is that concurrently running snapshots with the same name 
 overlap and interfere with each other, for example by cleaning up each other's 
 snapshot working directory in .hbase-snapshot/.tmp/snapshot-name.
 {code}
 2013-08-29 18:25:13,443 ERROR 
 org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking 
 snapshot { ss=mysnapshot table=TestTable type=FLUSH } due to 
 exception:Couldn't read snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
 snapshot info 
 from:hdfs://hdtest009:9000/hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
 at 
 org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:321)
 at 
 org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshotDescription(MasterSnapshotVerifier.java:123)
 {code}


