[jira] [Created] (HBASE-7775) Regionservers continue to read/parse XML config files after startup.

2013-02-05 Thread Aravind Gottipati (JIRA)
Aravind Gottipati created HBASE-7775:


 Summary: Regionservers continue to read/parse XML config files after startup.
 Key: HBASE-7775
 URL: https://issues.apache.org/jira/browse/HBASE-7775
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
 Environment: linux x86_64
Reporter: Aravind Gottipati


It appears that the region servers continue to parse the XML config files as 
part of their normal operation (and not just on startup).  I realize this might 
be coming from Hadoop's config parsing, but it is still a major problem 
should you happen to push out a bad XML config.

Here is the stack trace from the problem in our environment.

13/02/05 13:46:12 INFO regionserver.HRegion: Starting compaction on region tsdb,\x00\x00\x0FP\xFEU\x10\x00\x00\x01\x00 \xEF\x00\x00\x0B\x00\x00\x0F\x00\x00\x0C\x00\x00\x19,1359827642230.aadcc5a9ef4d4f16fb8937c9e93763a1.
13/02/05 13:46:12 INFO regionserver.Store: Started compaction of 3 file(s) in cf=t  into hdfs://nn-blah:8020/hbase/tsdb/aadcc5a9ef4d4f16fb8937c9e93763a1/.tmp, seqid=1704265883, totalSize=141.6m
13/02/05 13:47:57 INFO regionserver.StoreFile: Bloom added to HFile (hdfs://nn-blah:8020/hbase/tsdb/aadcc5a9ef4d4f16fb8937c9e93763a1/.tmp/7838002095040213865): 793.2k, 656192/677964 (97%)
13/02/05 13:47:57 INFO regionserver.StoreFile$Reader: Loaded row bloom filter metadata for hdfs://nn-blah:8020/hbase/tsdb/aadcc5a9ef4d4f16fb8937c9e93763a1/t/8826149174883990976
13/02/05 13:47:57 INFO regionserver.Store: Completed compaction of 3 file(s), new file=hdfs://nn-blah:8020/hbase/tsdb/aadcc5a9ef4d4f16fb8937c9e93763a1/t/8826149174883990976, size=141.5m; total size for store is 2.4g
13/02/05 13:47:57 INFO regionserver.HRegion: completed compaction on region tsdb,\x00\x00\x0FP\xFEU\x10\x00\x00\x01\x00 \xEF\x00\x00\x0B\x00\x00\x0F\x00\x00\x0C\x00\x00\x19,1359827642230.aadcc5a9ef4d4f16fb8937c9e93763a1. after 1mins, 44sec
[Fatal Error] mapred-site.xml:173:13: The string -- is not permitted within comments.
13/02/05 13:55:33 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException: The string -- is not permitted within comments.
13/02/05 13:55:33 INFO regionserver.StoreFile: Bloom added to HFile (hdfs://nn-blah:8020/hbase/tsdb/74a4a785bc317da7282c331f577918a0/.tmp/4658027967280663602): 5.3k, 1/4519 (0%)
[Fatal Error] mapred-site.xml:173:13: The string -- is not permitted within comments.
13/02/05 13:55:33 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException: The string -- is not permitted within comments.
13/02/05 13:55:33 FATAL regionserver.HRegionServer: ABORTING region server serverName=hbrs-blah,60020,1360021443595, load=(requests=5434, regions=73, usedHeap=4085, maxHeap=15979): Replay of HLog required. Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,\x00\x060P\xF4\x1Dp\x00\x00\x01\x00\x07\xDB\x00\x00\x02\x00\x02\x02\x00\x00\x87\x00\xB0Y,1359421493144.74a4a785bc317da7282c331f577918a0.
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1054)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:954)
    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:902)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:394)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:368)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:242)
Caused by: java.lang.RuntimeException: org.xml.sax.SAXParseException: The string -- is not permitted within comments.
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1393)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1251)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1192)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:493)
    at com.hadoop.compression.lzo.LzoCodec.getCompressionStrategy(LzoCodec.java:205)
    at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:204)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
    at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:236)
    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:397)
    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.close(HFile.java:621)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:877)
    at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:495)
at 
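
The parse failure reported in the trace (a double hyphen inside an XML comment) 
can be reproduced outside HBase with the JDK's own XML parser.  The sketch 
below is only an illustration of a check one could run on a config file before 
pushing it out; the class name ConfigXmlCheck and the method firstParseError 
are hypothetical and are not part of HBase or Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical pre-push check: is this config text well-formed XML? */
final class ConfigXmlCheck {

    /** Returns null if the text parses, otherwise "line:column: message". */
    static String firstParseError(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing
            // "[Fatal Error] ..." to stderr first.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            return e.getLineNumber() + ":" + e.getColumnNumber()
                + ": " + e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O problems: report them as-is.
            return e.toString();
        }
    }

    public static void main(String[] args) {
        String good = "<configuration><!-- fine --></configuration>";
        // A double hyphen inside a comment is not well-formed XML.
        String bad = "<configuration><!-- bad -- comment --></configuration>";
        System.out.println("good: " + firstParseError(good));
        System.out.println("bad:  " + firstParseError(bad));
    }
}
```

Running a check like this against mapred-site.xml before deployment would have 
flagged the file, assuming the deployed parser rejects the same input.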

[jira] [Commented] (HBASE-3866) Script to add regions gradually to a new regionserver.

2012-09-05 Thread Aravind Gottipati (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448647#comment-13448647 ]

Aravind Gottipati commented on HBASE-3866:
--

I will defer to you folks regarding including this script with the 
distribution.  Stack's suggestion of closing the JIRA is a fine one; as he 
said, this would leave the script here for others to use.

I would however like to note a few things.

1. The script attached here is outdated.  A newer version of the script that 
worked with 0.92 is here 
(https://github.com/aravind/hbase-utils/blob/master/region_mover.rb).  I 
haven't been keeping up with the latest releases, so there is a very good 
chance it might not work with versions after 0.92.

2. The script is pretty inefficient in how it moves and balances regions.  It 
maintains two internal hashmaps that map each server to its number of 
regions, to keep the region count balanced.

3. It is as portable as the original region mover script, since it re-uses most 
of the same mechanisms.


 Script to add regions gradually to a new regionserver.
 --

 Key: HBASE-3866
 URL: https://issues.apache.org/jira/browse/HBASE-3866
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.90.2
Reporter: Aravind Gottipati
Priority: Minor
 Attachments: 3866-max-regions-per-iteration.patch, slow_balancer.rb, 
 slow_balancer.rb


 When a new region server is brought online, the current balancer kicks off a 
 whole bunch of region moves and causes a lot of regions to be unavailable 
 right away.  A slower balancer that gradually balances the cluster is 
 probably a good script to have.  I have an initial version that mooches off 
 the region_mover script to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-5929) HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked to majorcompact regions.

2012-05-03 Thread Aravind Gottipati (JIRA)
Aravind Gottipati created HBASE-5929:


 Summary: HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked to majorcompact regions.
 Key: HBASE-5929
 URL: https://issues.apache.org/jira/browse/HBASE-5929
 Project: HBase
  Issue Type: Bug
  Components: client, shell
Affects Versions: 0.92.1
 Environment: Linux Ubuntu Lucid 64bit
Reporter: Aravind Gottipati
Priority: Minor


I have been noticing that calls to HBaseAdmin.majorCompact throw exceptions 
randomly for some regions.  I could not find a pattern to these exceptions.  
The code I have simply does this: 
admin.majorCompact(region.getRegionNameAsString()), where admin is an instance 
of HBaseAdmin and region is an instance of HRegionInfo.  The exception I get is:

org.apache.hadoop.hbase.TableNotFoundException: -ROOT-,,0
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) ~[hbase-0.92.1.jar:0.92.1]
    at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown Source) [hbase_compact.jar:na]


In this case it's the root region, but I get similar exceptions for other 
tables, like this.


2012-05-03 19:03:42,994 WARN  [main] HBaseCompact: Could not compact:
org.apache.hadoop.hbase.TableNotFoundException: ad_daily,49842:2009-07-10,1269763588508.1997607018
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1196) ~[hbase-0.92.1.jar:0.92.1]
    at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown Source) [hbase_compact.jar:na]
    at com.stumbleupon.hbaseadmin.HBaseCompact.main(Unknown Source) [hbase_compact.jar:na]


I see this on hbase shell as well.  However, I don't see these exceptions if I 
use admin.majorCompact(region.getRegionName()), so it looks like something gets 
lost when I use getRegionNameAsString().

Let me know if I can provide more information.






[jira] [Commented] (HBASE-5929) HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked to majorcompact regions.

2012-05-03 Thread Aravind Gottipati (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267792#comment-13267792 ]

Aravind Gottipati commented on HBASE-5929:
--

Here is the output from hbase shell for a similar table:

hbase(main):004:0> major_compact 'ad_campaign_daily_stumbles,81738:2009-02-08,1269765634190.1290583321'

ERROR: Unknown table ad_campaign_daily_stumbles,81738:2009-02-08,1269765634190.1290583321!

Here is some help for this command:
Run major compaction on passed table or pass a region row
to major compact an individual region


hbase(main):005:0>

I get these region names by querying the HRegionInterface of the server, and 
then proceed to compact them.  This is all on the dev cluster (if you want to 
replicate/test).






[jira] [Created] (HBASE-4298) Support to drain RS nodes through ZK

2011-08-30 Thread Aravind Gottipati (JIRA)
Support to drain RS nodes through ZK


 Key: HBASE-4298
 URL: https://issues.apache.org/jira/browse/HBASE-4298
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
 Environment: all
Reporter: Aravind Gottipati
Priority: Minor
 Fix For: 0.90.4


HDFS currently has a way to exclude certain datanodes and prevent them from 
getting new blocks.  HDFS goes one step further and even drains these nodes for 
you.  This enhancement is a step in that direction.

The idea is that we mark nodes in zookeeper as draining nodes.  This means that 
they don't get any more new regions.  These draining nodes look exactly the 
same as the corresponding nodes in /rs, except they live under /draining.
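
As a rough illustration of the idea (not the code in the patches), the set of 
servers eligible for new regions would be the servers registered under /rs 
minus those also listed under /draining.  The sketch below models both znode 
listings as plain in-memory collections; the class name DrainingFilter and 
the server names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of "draining servers receive no new regions". */
final class DrainingFilter {

    /**
     * Returns the online servers (children of /rs) that are not marked as
     * draining (children of /draining), preserving the input order.
     */
    static List<String> assignable(List<String> onlineServers,
                                   Set<String> drainingServers) {
        List<String> result = new ArrayList<>();
        for (String server : onlineServers) {
            if (!drainingServers.contains(server)) {
                result.add(server);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative names only, in the usual host,port,startcode form.
        List<String> rs = Arrays.asList(
            "rs1,60020,1314700000000",
            "rs2,60020,1314700000001",
            "rs3,60020,1314700000002");
        Set<String> draining =
            new HashSet<>(Arrays.asList("rs2,60020,1314700000001"));
        System.out.println(assignable(rs, draining));
    }
}
```

Because the draining entries use the same names as the entries in /rs, a 
plain set-membership test is enough to exclude them in this model.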

Eventually, support for draining them can be added.  I am submitting two 
patches for review - one for the 0.90 branch and one for trunk (in git).

Here are the two patches:
0.90 - https://github.com/aravind/hbase/commit/181041e72e7ffe6a4da6d82b431ef7f8c99e62d2
trunk - https://github.com/aravind/hbase/commit/e127b25ae3b4034103b185d8380f3b7267bc67d5

I have tested both these patches and they work as advertised.






[jira] [Created] (HBASE-3866) Script to add regions gradually to a new regionserver.

2011-05-06 Thread Aravind Gottipati (JIRA)
Script to add regions gradually to a new regionserver.
--

 Key: HBASE-3866
 URL: https://issues.apache.org/jira/browse/HBASE-3866
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.90.2
Reporter: Aravind Gottipati
Priority: Minor


When a new region server is brought online, the current balancer kicks off a 
whole bunch of region moves and causes a lot of regions to be unavailable 
right away.  A slower balancer that gradually balances the cluster is probably 
a good script to have.  I have an initial version that mooches off the 
region_mover script to do this.





[jira] [Updated] (HBASE-3866) Script to add regions gradually to a new regionserver.

2011-05-06 Thread Aravind Gottipati (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravind Gottipati updated HBASE-3866:
-

Attachment: slow_balancer.rb

This script uses a lot of the code from region_mover.rb.  The script should be 
invoked like this.

 HBASE_NOEXEC=true $HBASE_HOME/bin/hbase org.jruby.Main $HBASE_HOME/bin/slow_balancer.rb --debug -l 2

The -l option is the target difference in region count between the server 
with the most regions and the server with the fewest.  Once the delta reaches 
this point, the script exits.  If -l is not passed, it defaults to the number 
of region servers in your environment.
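
The stopping rule described above can be sketched as follows.  This is a 
model of the region counts only, written in Java rather than the script's 
JRuby, and it is not taken from slow_balancer.rb: the class name 
SlowBalanceSketch, the one-region-per-step policy and the requirement that 
the limit be at least 1 are all assumptions made for the illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of balancing until (max - min) <= limit. */
final class SlowBalanceSketch {

    /**
     * Repeatedly moves one region from the server with the most regions to
     * the server with the fewest, until the difference is at most limit.
     * Mutates regionCounts and returns the number of moves made.
     */
    static int balance(Map<String, Integer> regionCounts, int limit) {
        if (limit < 1) {
            // With limit 0 and an odd total the loop could ping-pong forever.
            throw new IllegalArgumentException("limit must be >= 1");
        }
        int moves = 0;
        while (true) {
            String max = null;
            String min = null;
            for (Map.Entry<String, Integer> e : regionCounts.entrySet()) {
                if (max == null || e.getValue() > regionCounts.get(max)) {
                    max = e.getKey();
                }
                if (min == null || e.getValue() < regionCounts.get(min)) {
                    min = e.getKey();
                }
            }
            // Stop when there are no servers or the delta reached the target.
            if (max == null
                    || regionCounts.get(max) - regionCounts.get(min) <= limit) {
                return moves;
            }
            // In the real script this step would be an actual region move.
            regionCounts.put(max, regionCounts.get(max) - 1);
            regionCounts.put(min, regionCounts.get(min) + 1);
            moves++;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("rs-old", 10); // existing server
        counts.put("rs-new", 0);  // freshly added server
        int moves = balance(counts, 2);
        System.out.println(moves + " moves, final counts " + counts);
    }
}
```

A larger limit ends the run sooner and leaves the cluster less evenly 
balanced; a limit of 1 is the tightest balance this model allows.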




[jira] [Updated] (HBASE-3866) Script to add regions gradually to a new regionserver.

2011-05-06 Thread Aravind Gottipati (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravind Gottipati updated HBASE-3866:
-

Attachment: slow_balancer.rb
