[jira] [Commented] (HBASE-13890) Get/Scan from MemStore only (Client API)

2015-07-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616234#comment-14616234
 ] 

ramkrishna.s.vasudevan commented on HBASE-13890:


bq.Yes, of course, but I think this should be additional hint/attribute of an 
operation.
Having an additional hint or option would be better than another RPC call.
bq.Data can be partial
So the result is getting marked as partial? 

 Get/Scan from MemStore only (Client API)
 

 Key: HBASE-13890
 URL: https://issues.apache.org/jira/browse/HBASE-13890
 Project: HBase
  Issue Type: New Feature
  Components: API, Client, Scanners
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Attachments: HBASE-13890-v1.patch


 This is a short-circuit read for get/scan when the most recent data (version) of a cell 
 can be found only in the MemStore (with very high probability). 
 Good examples are atomic counters and appends. This feature will allow 
 store file scanners to be bypassed completely, improving performance and latency.
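The hint/attribute approach discussed in the comment above can be sketched as follows. This is a self-contained model, not HBase's actual API: the attribute name, `ReadRequest`, and `ScannerSelector` are all hypothetical, but they illustrate how a per-operation hint would let the server build its scanner list from the MemStore alone and skip store file scanners entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of memstore-only short-circuit reads (not HBase's real API).
class ReadRequest {
    private final Map<String, byte[]> attributes = new HashMap<>();
    void setAttribute(String name, byte[] value) { attributes.put(name, value); }
    boolean hasAttribute(String name) { return attributes.containsKey(name); }
}

class ScannerSelector {
    static final String MEMSTORE_ONLY = "_memstore_only_"; // hypothetical attribute name

    // Returns the scanner sources consulted for this request.
    static List<String> selectScanners(ReadRequest req, int storeFileCount) {
        List<String> scanners = new ArrayList<>();
        scanners.add("memstore"); // the memstore is always consulted
        if (!req.hasAttribute(MEMSTORE_ONLY)) {
            for (int i = 0; i < storeFileCount; i++) {
                scanners.add("storefile-" + i); // skipped when the hint is set
            }
        }
        return scanners;
    }
}

public class MemstoreOnlyDemo {
    public static void main(String[] args) {
        ReadRequest hinted = new ReadRequest();
        hinted.setAttribute(ScannerSelector.MEMSTORE_ONLY, new byte[0]);
        System.out.println(ScannerSelector.selectScanners(hinted, 3).size());          // 1
        System.out.println(ScannerSelector.selectScanners(new ReadRequest(), 3).size()); // 4
    }
}
```

Because the hint rides on the existing operation, no new RPC is needed, which matches the preference stated in the comment.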



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-13858) RS/MasterDumpServlet dumps threads before its “Stacks” header

2015-07-07 Thread sunhaitao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunhaitao reassigned HBASE-13858:
-

Assignee: sunhaitao

 RS/MasterDumpServlet dumps threads before its “Stacks” header
 -

 Key: HBASE-13858
 URL: https://issues.apache.org/jira/browse/HBASE-13858
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, UI
Affects Versions: 1.1.0
Reporter: Lars George
Assignee: sunhaitao
Priority: Trivial
  Labels: beginner
 Fix For: 2.0.0, 1.3.0


 The stack traces are captured using a Hadoop helper method, and its output is 
 then merged with the current output. I presume a simple flush is missing after 
 outputting the "Stacks" header, before the captured output is dumped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13867) Add endpoint coprocessor guide to HBase book

2015-07-07 Thread Gaurav Bhardwaj (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616959#comment-14616959
 ] 

Gaurav Bhardwaj commented on HBASE-13867:
-

Patch corrected

 Add endpoint coprocessor guide to HBase book
 

 Key: HBASE-13867
 URL: https://issues.apache.org/jira/browse/HBASE-13867
 Project: HBase
  Issue Type: Task
  Components: Coprocessors, documentation
Reporter: Vladimir Rodionov
Assignee: Gaurav Bhardwaj
 Attachments: HBASE-13867.1.patch


 Endpoint coprocessors are very poorly documented.
 The Coprocessor section of the HBase book must be updated, either with its own 
 endpoint-coprocessor HOW-TO guide or, at least, with link(s) to some 
 other guides. There is a good description here:
 http://www.3pillarglobal.com/insights/hbase-coprocessors



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616987#comment-14616987
 ] 

Josh Elser commented on HBASE-13561:


If you didn't get to it.. I deleted 
{{\x00\x02\x91\x1E\xA5U\x97\xC9x\xA0\xAE\xCD\xED*C\x92}}, then re-{{Verify}}ied.

{noformat}
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$Counts
REFERENCED=3998
UNDEFINED=1
UNREFERENCED=1
undef
\x00\x02\x91\x1E\xA5U\x97\xC9x\xA0\xAE\xCD\xED*C\x92=1
unref
\xD7\xFE{~0\x1C\x91#\xA5\xE1\x01T\xA7UwY=1
2015-07-07 12:52:12,362 ERROR [main] test.IntegrationTestBigLinkedList$Verify: 
Found an undefined node. Undefined count=1
% echo $?
1
{noformat}

I did notice that I could update the usage for ITBLL to be a bit more accurate 
after these changes. Will put up a v2 shortly.

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-v1.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does, and not just leaving it up to the visual 
 inspection of whomever launched the task.
 Am I missing something?
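The kind of counter evaluation the report is asking {{Verify}} to do can be sketched as follows. The counter names match the job output shown earlier in this thread; everything else (the plain map standing in for MapReduce counters, the class name) is illustrative, not the actual ITBLL code.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a post-job counter check (a plain map stands in for the
// MapReduce counters the real code would read).
public class VerifyCounters {
    // Returns true only when the counters describe a fully consistent table:
    // no UNDEFINED (referenced but missing) and no UNREFERENCED nodes.
    static boolean verify(Map<String, Long> counters) {
        long undefined = counters.getOrDefault("UNDEFINED", 0L);
        long unreferenced = counters.getOrDefault("UNREFERENCED", 0L);
        if (undefined > 0) System.err.println("Found undefined nodes: " + undefined);
        if (unreferenced > 0) System.err.println("Found unreferenced nodes: " + unreferenced);
        return undefined == 0 && unreferenced == 0;
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        counters.put("REFERENCED", 3998L);
        counters.put("UNDEFINED", 1L);
        counters.put("UNREFERENCED", 1L);
        // A non-zero exit code lets callers and scripts detect the failure,
        // matching the `echo $?` check shown in the comment above.
        int exitCode = verify(counters) ? 0 : 1;
        System.out.println("exit code: " + exitCode);
    }
}
```

Returning a non-zero exit code is what makes the result scriptable instead of relying on visual inspection of the counter dump.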



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client

2015-07-07 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616904#comment-14616904
 ] 

Matteo Bertozzi commented on HBASE-13415:
-

I think it just needs another round with the fixes pointed out on ReviewBoard, 
nothing too big. If [~syuanjiang] does not have time I may be able to complete 
it. 

 Procedure V2 - Use nonces for double submits from client
 

 Key: HBASE-13415
 URL: https://issues.apache.org/jira/browse/HBASE-13415
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.3.0


 The client can submit a procedure, but the master might fail before the client 
 gets the procId back. In this case, the client request will fail and the client 
 will re-submit the request. With a 1.1 client, or if there is no contention for 
 the table lock, the time window is pretty small, but it can still happen. 
 If the proc was accepted and stored in the procedure store, a re-submit from 
 the client will add another procedure, which will execute after the first 
 one. The first one will likely succeed, and the second one will fail (for 
 example, in the case of create table, the second one will throw 
 TableExistsException). 
 One idea is to use the client-generated nonces (that we already have) to guard 
 against these cases. The client will submit the request with the nonce, and 
 the nonce will be saved together with the procedure in the store. In case of 
 a double submit, the nonce cache is checked and the procId of the original 
 request is returned. 
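The nonce-cache idea described above can be sketched as follows. This is an illustrative model, not the actual ProcedureStore code: the first submission with a given nonce registers a new procId, and a double submit with the same nonce gets the original procId back instead of creating a second procedure.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of deduplicating procedure submissions by client-generated nonce
// (illustrative; the real implementation would also persist the nonce with
// the procedure in the procedure store so the cache survives master failover).
public class NonceCache {
    private final Map<Long, Long> nonceToProcId = new ConcurrentHashMap<>();
    private final AtomicLong procIdGen = new AtomicLong(0);

    // Atomically returns the existing procId for this nonce, or assigns a new one.
    long submit(long nonce) {
        return nonceToProcId.computeIfAbsent(nonce, n -> procIdGen.incrementAndGet());
    }

    public static void main(String[] args) {
        NonceCache cache = new NonceCache();
        long first = cache.submit(42L); // original submission
        long retry = cache.submit(42L); // client re-submits after master failover
        System.out.println(first == retry); // true: no duplicate procedure created
    }
}
```

`computeIfAbsent` keeps the check-and-register step atomic, so two racing submissions with the same nonce still get a single procId.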



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13867) Add endpoint coprocessor guide to HBase book

2015-07-07 Thread Gaurav Bhardwaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaurav Bhardwaj updated HBASE-13867:

Attachment: HBASE-13867.1.patch

Uploading correct patch.

 Add endpoint coprocessor guide to HBase book
 

 Key: HBASE-13867
 URL: https://issues.apache.org/jira/browse/HBASE-13867
 Project: HBase
  Issue Type: Task
  Components: Coprocessors, documentation
Reporter: Vladimir Rodionov
Assignee: Gaurav Bhardwaj
 Attachments: HBASE-13867.1.patch


 Endpoint coprocessors are very poorly documented.
 The Coprocessor section of the HBase book must be updated, either with its own 
 endpoint-coprocessor HOW-TO guide or, at least, with link(s) to some 
 other guides. There is a good description here:
 http://www.3pillarglobal.com/insights/hbase-coprocessors



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table

2015-07-07 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617007#comment-14617007
 ] 

Ashish Singhi commented on HBASE-8642:
--

Ping for more reviews! 

 [Snapshot] List and delete snapshot by table
 

 Key: HBASE-8642
 URL: https://issues.apache.org/jira/browse/HBASE-8642
 Project: HBase
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2
Reporter: Julian Zhou
Assignee: Ashish Singhi
 Fix For: 2.0.0, 0.98.14, 1.3.0

 Attachments: 8642-trunk-0.95-v0.patch, 8642-trunk-0.95-v1.patch, 
 8642-trunk-0.95-v2.patch, HBASE-8642-0.98.patch, HBASE-8642-v1.patch, 
 HBASE-8642-v2.patch, HBASE-8642-v3.patch, HBASE-8642-v4.patch, 
 HBASE-8642.patch


 Support list and delete snapshots by table names.
 User scenario:
 A user wants to delete all the snapshots taken in January for a table 't', 
 where the snapshot names start with 'Jan'.
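The user scenario above boils down to filtering snapshots by table and name prefix before deleting. A minimal sketch of that selection logic follows; the `Snapshot` record and method names are illustrative stand-ins, not the Admin snapshot API the real feature would use.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch: select snapshot names to delete for one table,
// matching a name prefix (e.g. all January snapshots named 'Jan*').
public class SnapshotFilter {
    record Snapshot(String name, String table) {}

    static List<String> toDelete(List<Snapshot> all, String table, String prefix) {
        return all.stream()
                .filter(s -> s.table().equals(table))      // only this table's snapshots
                .filter(s -> s.name().startsWith(prefix))  // only the requested name range
                .map(Snapshot::name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Snapshot> all = List.of(
                new Snapshot("Jan-t-01", "t"),
                new Snapshot("Feb-t-01", "t"),
                new Snapshot("Jan-u-01", "u"));
        System.out.println(toDelete(all, "t", "Jan")); // [Jan-t-01]
    }
}
```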



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13867) Add endpoint coprocessor guide to HBase book

2015-07-07 Thread Gaurav Bhardwaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaurav Bhardwaj updated HBASE-13867:

Status: Patch Available  (was: In Progress)

Please use patch 
[HBASE-13867.1.patch|https://issues.apache.org/jira/secure/attachment/12743985/HBASE-13867.1.patch]

 Add endpoint coprocessor guide to HBase book
 

 Key: HBASE-13867
 URL: https://issues.apache.org/jira/browse/HBASE-13867
 Project: HBase
  Issue Type: Task
  Components: Coprocessors, documentation
Reporter: Vladimir Rodionov
Assignee: Gaurav Bhardwaj
 Attachments: HBASE-13867.1.patch


 Endpoint coprocessors are very poorly documented.
 The Coprocessor section of the HBase book must be updated, either with its own 
 endpoint-coprocessor HOW-TO guide or, at least, with link(s) to some 
 other guides. There is a good description here:
 http://www.3pillarglobal.com/insights/hbase-coprocessors



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client

2015-07-07 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616972#comment-14616972
 ] 

Stephen Yuan Jiang commented on HBASE-13415:


[~busbey]  I should be able to complete it soon.

 Procedure V2 - Use nonces for double submits from client
 

 Key: HBASE-13415
 URL: https://issues.apache.org/jira/browse/HBASE-13415
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.3.0


 The client can submit a procedure, but the master might fail before the client 
 gets the procId back. In this case, the client request will fail and the client 
 will re-submit the request. With a 1.1 client, or if there is no contention for 
 the table lock, the time window is pretty small, but it can still happen. 
 If the proc was accepted and stored in the procedure store, a re-submit from 
 the client will add another procedure, which will execute after the first 
 one. The first one will likely succeed, and the second one will fail (for 
 example, in the case of create table, the second one will throw 
 TableExistsException). 
 One idea is to use the client-generated nonces (that we already have) to guard 
 against these cases. The client will submit the request with the nonce, and 
 the nonce will be saved together with the procedure in the store. In case of 
 a double submit, the nonce cache is checked and the procId of the original 
 request is returned. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616958#comment-14616958
 ] 

Hadoop QA commented on HBASE-13387:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12743962/HBASE-13387_v2.patch
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743962

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 18 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:red}-1 javac{color}.  The applied patch generated 20 javac compiler 
warnings (more than the master's current 16 warnings).

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1899 checkstyle errors (more than the master's current 1898 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14691//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14691//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14691//artifact/patchprocess/checkstyle-aggregate.html

Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14691//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14691//console

This message is automatically generated.

 Add ByteBufferedCell an extension to Cell
 -

 Key: HBASE-13387
 URL: https://issues.apache.org/jira/browse/HBASE-13387
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0

 Attachments: ByteBufferedCell.docx, HBASE-13387_v1.patch, 
 HBASE-13387_v2.patch, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch, 
 benchmark.zip


 This came up during the discussion on the parent Jira, and recently Stack added 
 it as a comment on the E2E patch on the parent Jira.
 The idea is to add a new interface, 'ByteBufferedCell', in which we can add 
 new buffer-based getter APIs and getters for the position of components in the BB.  
 We will keep this interface @InterfaceAudience.Private.   When the Cell is 
 backed by a DBB, we can create an object implementing this new interface.
 The Comparators have to be aware of this new Cell extension and have to use 
 the BB-based APIs rather than getXXXArray().  Also provide util APIs in CellUtil 
 to abstract the checks for the new Cell type (like matchingXXX APIs, 
 getValueAsType APIs, etc.)
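The shape of the proposal can be sketched as follows. This is an illustrative, self-contained model of the idea (the interface and method names here are invented for the sketch, not the patch's actual signatures): a buffer-backed cell exposes its ByteBuffer plus the position and length of each component, so comparator-style code can read values in place instead of copying them out via `getXXXArray()`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of a buffer-backed cell extension (names illustrative): getters
// return the backing ByteBuffer plus component position/length, so readers
// can compare bytes in place without materializing a byte[] copy.
interface BufferBackedValue {
    ByteBuffer getValueByteBuffer();
    int getValuePosition();
    int getValueLength();
}

public class BufferCellDemo implements BufferBackedValue {
    private final ByteBuffer buf;
    private final int pos, len;

    BufferCellDemo(ByteBuffer buf, int pos, int len) {
        this.buf = buf; this.pos = pos; this.len = len;
    }

    public ByteBuffer getValueByteBuffer() { return buf; }
    public int getValuePosition() { return pos; }
    public int getValueLength() { return len; }

    // A comparator-style read that never copies the value out of the buffer.
    static boolean matchesValue(BufferBackedValue cell, byte[] expected) {
        if (cell.getValueLength() != expected.length) return false;
        ByteBuffer b = cell.getValueByteBuffer();
        for (int i = 0; i < expected.length; i++) {
            if (b.get(cell.getValuePosition() + i) != expected[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.wrap("rowkey:value".getBytes(StandardCharsets.UTF_8));
        BufferCellDemo cell = new BufferCellDemo(bb, 7, 5); // "value" starts at offset 7
        System.out.println(matchesValue(cell, "value".getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

The same position/length pattern works for off-heap (direct) buffers, which is the DBB case the description targets.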



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2015-07-07 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-7912:
-
Attachment: HBaseBackupRestore-Jira-7912-v6.pdf

Updated version of the design document. Added a section on KV deduplication and 
some other things. 

 HBase Backup/Restore Based on HBase Snapshot
 

 Key: HBASE-7912
 URL: https://issues.apache.org/jira/browse/HBASE-7912
 Project: HBase
  Issue Type: Sub-task
Reporter: Richard Ding
Assignee: Vladimir Rodionov
  Labels: backup
 Fix For: 2.0.0

 Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
 HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
 HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
 HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf


 Finally, we have completed the implementation of our backup/restore solution and 
 would like to share it with the community through this jira. 
 We leverage the existing HBase snapshot feature and provide a general 
 solution for common users. Our full backup uses a snapshot to capture 
 metadata locally and uses ExportSnapshot to move data to another cluster; 
 the incremental backup uses an offline WALPlayer to back up HLogs; we also 
 leverage globally distributed log-roll and flush to improve performance; other 
 added values include convert, merge, progress report, and CLI commands, so 
 that a common user can back up HBase data without in-depth knowledge of HBase. 
  Our solution also contains some usability features for enterprise users. 
 The detailed design document and CLI commands will be attached to this jira. We 
 plan to use 10~12 subtasks to share each of the following features and to 
 document the detailed implementation in the subtasks: 
 * *Full Backup* : provide local and remote back/restore for a list of tables
 * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
 backup)
 * *distributed* Logroll and distributed flush 
 * Backup *Manifest* and history
 * *Incremental* backup: to build on top of full backup as daily/weekly backup 
 * *Convert*  incremental backup WAL files into hfiles
 * *Merge* several backup images into one (like merging weekly into monthly)
 * *add and remove* table to and from Backup image
 * *Cancel* a backup process
 * backup progress *status*
 * full backup based on *existing snapshot*
 *-*
 *Below is the original description, to keep here as the history for the 
 design and discussion back in 2013*
 There have been attempts in the past to come up with a viable HBase 
 backup/restore solution (e.g., HBASE-4618).  Recently, there have been many 
 advancements and new features in HBase, for example, FileLink, Snapshot, and 
 Distributed Barrier Procedure. This is a proposal for a backup/restore 
 solution that utilizes these new features to achieve better performance and 
 consistency. 
  
 A common practice of backup and restore in databases is to first take a full 
 baseline backup, and then periodically take incremental backups that capture 
 the changes since the full baseline backup. An HBase cluster can store a 
 massive amount of data.  The combination of full backups with incremental 
 backups has tremendous benefit for HBase as well.  The following is a typical 
 scenario for full and incremental backup.
 # The user takes a full backup of a table or a set of tables in HBase. 
 # The user schedules periodical incremental backups to capture the changes 
 from the full backup, or from last incremental backup.
 # The user needs to restore table data to a past point of time.
 # The full backup is restored to the table(s) or to different table name(s).  
 Then the incremental backups that are up to the desired point in time are 
 applied on top of the full backup. 
 We would support the following key features and capabilities.
 * Full backup uses HBase snapshot to capture HFiles.
 * Use HBase WALs to capture incremental changes, but we use bulk load of 
 HFiles for fast incremental restore.
 * Support single table or a set of tables, and column family level backup and 
 restore.
 * Restore to different table names.
 * Support adding additional tables or CF to backup set without interruption 
 of incremental backup schedule.
 * Support rollup/combining of incremental backups into longer period and 
 bigger incremental backups.
 * Unified command line interface for all the above.
 The solution will support HBase backup to FileSystem, either on the same 
 cluster or across clusters.  It has the flexibility to support backup to 
 other devices and servers in the future.  
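The point-in-time restore sequence described above (restore the full backup, then apply the incremental backups up to the desired time, in order) can be sketched as follows. Backups are modeled as plain timestamps; all names here are illustrative, not the proposal's actual classes.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch of planning a point-in-time restore: after the full backup is
// restored, only the incrementals taken at or before the target time are
// applied, oldest first.
public class RestorePlan {
    static List<Long> incrementalsToApply(List<Long> incrementalTimestamps, long pointInTime) {
        return incrementalTimestamps.stream()
                .filter(ts -> ts <= pointInTime) // drop incrementals past the target time
                .sorted()                        // replay oldest-first on top of the full backup
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> incrementals = List.of(300L, 100L, 200L, 400L);
        System.out.println(incrementalsToApply(incrementals, 250L)); // [100, 200]
    }
}
```

Ordering matters because each incremental captures changes since the previous backup; applying them out of order, or past the target time, would not reproduce the table state at the desired point.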



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13337) Table regions are not assigning back, after restarting all regionservers at once.

2015-07-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617056#comment-14617056
 ] 

Lars Hofhansl commented on HBASE-13337:
---

v3 looks reasonable to me. At the very least it won't hurt :)

Nice find. +1

 Table regions are not assigning back, after restarting all regionservers at 
 once.
 -

 Key: HBASE-13337
 URL: https://issues.apache.org/jira/browse/HBASE-13337
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 2.0.0
Reporter: Y. SREENIVASULU REDDY
Assignee: Samir Ahmic
Priority: Blocker
 Fix For: 2.0.0

 Attachments: HBASE-13337-v2.patch, HBASE-13337-v3.patch, 
 HBASE-13337.patch


 Regions of the table are continuously in state=FAILED_CLOSE.
 {noformat}
 RegionState   
   
   RIT time (ms)
 8f62e819b356736053e06240f7f7c6fd  
 t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. 
 state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), 
 server=VM1,16040,1427362531818  113929
 caf59209ae65ea80fca6bdc6996a7d68  
 t1,,1427362431330.caf59209ae65ea80fca6bdc6996a7d68. 
 state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), 
 server=VM2,16040,1427362533691  113929
 db52a74988f71e5cf257bbabf31f26f3  
 t1,,1427362431330.db52a74988f71e5cf257bbabf31f26f3. 
 state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), 
 server=VM3,16040,1427362533691  113920
 43f3a65b9f9ff283f598c5450feab1f8  
 t1,,1427362431330.43f3a65b9f9ff283f598c5450feab1f8. 
 state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), 
 server=VM1,16040,1427362531818  113920
 {noformat}
 *Steps to reproduce:*
 1. Start an HBase cluster with more than one regionserver.
 2. Create a table with pre-created regions (let's say 15 regions).
 3. Make sure the regions are well balanced.
 4. Restart all the regionserver processes at once across the cluster, except 
 the HMaster process.
 5. After restarting, the regionservers successfully connect to the HMaster.
 *Bug:*
 But no regions are assigned back to the regionservers.
 *Master log shows as follows:*
 {noformat}
 2015-03-26 15:05:36,201 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd 
 state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to 
 {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, 
 server=VM1,16040,1427362531818}
 2015-03-26 15:05:36,202 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.RegionStateStore: Updating row 
 t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. with 
 state=PENDING_OPENsn=VM1,16040,1427362531818
 2015-03-26 15:05:36,244 DEBUG [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.AssignmentManager: Force region state offline 
 {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, 
 server=VM1,16040,1427362531818}
 2015-03-26 15:05:36,244 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd 
 state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to 
 {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, 
 server=VM1,16040,1427362531818}
 2015-03-26 15:05:36,244 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.RegionStateStore: Updating row 
 t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. with 
 state=PENDING_CLOSE
 2015-03-26 15:05:36,248 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.AssignmentManager: Server VM1,16040,1427362531818 returned 
 java.nio.channels.ClosedChannelException for 
 t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=1 of 10
 2015-03-26 15:05:36,248 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.AssignmentManager: Server VM1,16040,1427362531818 returned 
 java.nio.channels.ClosedChannelException for 
 t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=2 of 10
 2015-03-26 15:05:36,249 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.AssignmentManager: Server VM1,16040,1427362531818 returned 
 java.nio.channels.ClosedChannelException for 
 t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=3 of 10
 2015-03-26 15:05:36,249 INFO  [VM2,16020,1427362216887-GeneralBulkAssigner-0] 
 master.AssignmentManager: Server VM1,16040,1427362531818 returned 
 java.nio.channels.ClosedChannelException for 
 t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., 

[jira] [Updated] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues

2015-07-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-13897:
---
 Hadoop Flags: Reviewed
Fix Version/s: 1.3.0
   2.0.0

 OOM may occur when Import imports a row with too many KeyValues
 ---

 Key: HBASE-13897
 URL: https://issues.apache.org/jira/browse/HBASE-13897
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.13
Reporter: Liu Junhong
Assignee: Liu Junhong
 Fix For: 2.0.0, 0.98.14, 1.3.0

 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, 
 HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, 
 HBASE-13897-master-20150707.patch, HBASE-13897-master.patch


 When importing a row with too many KeyValues (it may have too many columns or 
 versions), KeyValueReducer will incur an OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14027) Clean up netty dependencies

2015-07-07 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-14027:

Attachment: HBASE-14027.3.patch

-03
* adds test dependency for hbase-it so that integration tests run via maven 
will work
* unifies the netty 3.x version used in tests between hbase-server and hbase-it

Given the above, I still need to test that keeping the jar out of the assembly 
doesn't prevent the ITs from running on a deployed cluster.


 Clean up netty dependencies
 ---

 Key: HBASE-14027
 URL: https://issues.apache.org/jira/browse/HBASE-14027
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 1.0.0
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14027.1.patch, HBASE-14027.2.patch, 
 HBASE-14027.3.patch


 We have multiple copies of Netty (3?) getting shipped around. Clean some up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617050#comment-14617050
 ] 

stack commented on HBASE-13561:
---

I tried it... v1. Looks good.

{code}
...
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$Counts
REFERENCED=499395826
UNDEFINED=302087
UNREFERENCED=302087
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=20746296
undef
\x00\x00$\xA6\x96W\x9F\xC5\x81\x83r\x5Co[;3=1
\x00\x005\xBC\x06\x0D\xE4'\xDD\xA6l\xA0\xB1c6~=1
\x00\x007\xB3\x0D^\x11\xF2\xC4\xD74\xAA\xC5!\xA8o=1
\x00\x008 \xD3\xD6Z\xCD\xD0\xBC\x9C\xE7\x1F\xEE\x11.=1
\x00\x00=+\xD2\xB4\x91h\xCFJ8`\xF8\x82\xA5\xE7=1
\x00\x00S\xE1\xD5\xC5n\xB9Y\xA3\xB8\xB9`\xA1\xCF\xB9=1
\x00\x00\x0D\xC3)\xCB\x85@t\x0E\x8EZ\xBAy6\x8E=1
\x00\x00\x16\xFF\x8E\x94\x0F\xFC\x13\xC1m~\xB9\xA8!\x85=1
\x00\x00\x82@Z\xB1N\x1B\xA05\xFB\xBC\xDB\xD0\x0D\x04=1
\x00\x00\x96z\xB9\x18\xE5\x9B\xC7\x14\xB1\xA6\x0Bf\x1F\xE7=1
\x00\x00\x98*U(\x8Fqi\x04\xD8A\x13\x0E8j=1
\x00\x00\xA3\xD8\x0F\x02\x13\x06n5\xD45.Y\xB3\x81=1
\x00\x00\xC4ItJ\x0BX\x9F\x8A\x0D\xB5\xDDn\xAE=1
\x00\x00\xEB\xA7\x902X\xB0\xDD\xE1\x17\x83\xAD\x0C\xD0\x9F=1
\x00\x00a\xF2k6\xBC;\xDD)5\xB2\xAD\xA7\xBA(=1
\x00\x00g\xBB\xF5\xD2\xBE\x9Dm\xE1L\x8F\xB1\xAB/=1
\x00\x00{\x8E\x12\xE0\x1Des!\xF3I\xC7}Zn=1
\x00\x00}k\x89\xF8b\x970\xC0\x07Xu\xAF\xDA\xC5=1
\x00\x01+\xBE+\x10/\x87\xA4\xB5\xF8aEDdU=1
\x00\x01.8\x9FiBj\xD3\x8E6e\xCF\xEC\xF8\xC9=1
\x00\x010F\xCB\x0B \xA3\x07\x0B\x8D^X\xC7\x5C\x5C=1
\x00\x016\xB0\x17\xD9\xE2\xF6S\xE9v\xB4u\xDD\xBF}=1
\x00\x01?\xB3\xB8\x88\x1A\xF4\xA4\xAF\xFA!\xA8\xA1\x93\x8B=1
\x00\x01J{tXz\x92\xDAI1\x96\x98E\x0E\x97=1
\x00\x01X\xB1[C]0\xEAP\x90\xDF\xBE\xD87\xBD=1
\x00\x01\x07\x93\x88/c3h\xA2i\xBAs\xB8\xB9\x5C=1
\x00\x01\x8C\xAB\xED+\x95\xD4\x07\x178\xA4m2\xCE*=1
\x00\x01\xAFGX\x0C\xFBi\xEB\xA4\xCB\x0D\x9B\xA3=1
\x00\x01\xD5i\xF3\x95\x8Bn\xFEx{\xEC\x13\xFE\xE5\xBB=1
\x00\x01\xDC\xF1\xE3FXZ\xE9\x00\xB4i.\x01\xFD\xC1=1
\x00\x01\xE5?\xB8eB\xDEM\x01\x90\xF3\xC8\x04\xB0=1
\x00\x01\xF0\xD1\x14\x1DK\xAB\xE6\xADZ\xBC\xE5y\x12o=1
\x00\x01c\xD1`\x00\x871qK \xB0\x88z;\x86=1
\x00\x02 \x1C\x0E\xF3\xBD\xCDSb3\xEB\x8E\xA7\xFAs=1
\x00\x020\xFA\x1F4\xAD\xA2K\xF7\xC2\xF5\xD9=\x86\x84=1
\x00\x02\x09\x00xb\x06\x0B\xFB\x89\xC9\xDF\xEB9\xB7\xC7=1
\x00\x02\xFE\xC6o\x91z\x85\xA6\xC1\xA2\xFDH\x05EK=1
\x00\x02{\x1F\xD9{5\x06\x06H\xC5ql\xB0\x93\xF8=1
\x00\x03`\xC0\xD1\xA1)\x8B\x18\x99=|\xCAk\x88\x88=1
\x00\x03es\xA0\xC9h\xEEd\xCFL\xDFB\x9A\x92C=1
unref
\x00\x00\x87\xFE\xFA\xFF`\xD7\x8B\x0D#\xD9\xE2\xEFy\x89=1
\x00\x00\xC5_\x9F\xFC\xBB\x969\xBE%\x89\xAB\xC7\x94W=1
\x00\x00\xEF\xB0\xFC\xFD\x025$\xF9\x14\xC48\xA95\x8A=1
\x00\x00wb?\xA1=\xEA\xDC\x19\xBD\xD6\xEC\x09\xEE=1
\x00\x01RK\x86\x18|0\xB8\xE3\xA2C\xA1\x07\xA4\x0C=1
15/07/07 10:26:21 ERROR test.IntegrationTestBigLinkedList$Verify: Found an 
undefined node. Undefined count=302087
[stack@c2020 ~]$
[stack@c2020 ~]$
[stack@c2020 ~]$
[stack@c2020 ~]$
[stack@c2020 ~]$
[stack@c2020 ~]$
[stack@c2020 ~]$
[stack@c2020 ~]$ echo $?
1
{code}

Let me commit [~elserj]

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just 

[jira] [Commented] (HBASE-14024) ImportTsv is not loading hbase-default.xml

2015-07-07 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616994#comment-14616994
 ] 

Ashish Singhi commented on HBASE-14024:
---

Ping for reviews! 

 ImportTsv is not loading hbase-default.xml
 --

 Key: HBASE-14024
 URL: https://issues.apache.org/jira/browse/HBASE-14024
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 2.0.0
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Critical
 Fix For: 2.0.0

 Attachments: HBASE-14024.patch


 ImportTsv job is failing with below exception
 {noformat}
 Exception in thread "main" java.lang.IllegalArgumentException: Can not create 
 a Path from a null string
   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123)
   at org.apache.hadoop.fs.Path.<init>(Path.java:135)
   at org.apache.hadoop.fs.Path.<init>(Path.java:89)
   at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591)
   at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441)
   at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406)
   at 
 org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555)
   at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772)
 {noformat}
 {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. 
 I found that hbase configuration resources from its xml are not loaded into 
 conf object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-13561:
---
Attachment: HBASE-13561-v2.patch

Verified that a failure in Verify is reported as expected. Updated the ITBLL 
usage text to advise that the return code is checked in addition to counters.

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments; still, it's not even checking 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whoever launched the task.
 Am I missing something?
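
A post-job counter check of the sort the description asks for could be sketched 
as below. This is a plain-Java sketch with the job counters modeled as a map; 
the counter names and threshold logic are illustrative assumptions, not the 
actual ITBLL enum.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: evaluate job counters after completion instead of
// leaving them to visual inspection. Counter names are illustrative.
public class VerifyCounters {
    static boolean verify(Map<String, Long> counters, long expectedReferenced) {
        long undefined = counters.getOrDefault("UNDEFINED", 0L);
        long unreferenced = counters.getOrDefault("UNREFERENCED", 0L);
        long referenced = counters.getOrDefault("REFERENCED", 0L);
        // Any undefined or unreferenced node means the linked list is broken.
        if (undefined > 0 || unreferenced > 0) {
            return false;
        }
        // When the expected count is known, it must match exactly;
        // a negative expectation means "unknown", so skip that check.
        return expectedReferenced < 0 || referenced == expectedReferenced;
    }

    public static void main(String[] args) {
        Map<String, Long> bad = new HashMap<>();
        bad.put("UNDEFINED", 302087L);
        System.out.println(verify(bad, -1L));        // false: undefined nodes found

        Map<String, Long> good = new HashMap<>();
        good.put("REFERENCED", 1000000L);
        System.out.println(verify(good, 1000000L));  // true
    }
}
```

A non-zero exit code would then be derived from the boolean, which is what lets 
{{echo $?}} in the transcript above report the failure.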



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12848) Utilize Flash storage for WAL

2015-07-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617045#comment-14617045
 ] 

Lars Hofhansl commented on HBASE-12848:
---

Hmm... Interesting. Only renames are atomic. In theory we could rename (move the 
inode) in the NameNode and move the blocks lazily at the DataNodes, but that'd 
presumably need more HDFS support.


 Utilize Flash storage for WAL
 -

 Key: HBASE-12848
 URL: https://issues.apache.org/jira/browse/HBASE-12848
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.1.0

 Attachments: 12848-v1.patch, 12848-v2.patch, 12848-v3.patch, 
 12848-v4.patch, 12848-v4.patch


 One way to improve data ingestion rate is to make use of Flash storage.
 HDFS is doing the heavy lifting - see HDFS-7228.
 We assume an environment where:
 1. Some servers have a mix of flash, e.g. 2 flash drives and 4 traditional 
 drives.
 2. Some servers have all traditional storage.
 3. RegionServers are deployed on both profiles within one HBase cluster.
 This JIRA allows WAL to be managed on flash in a mixed-profile environment.
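
For reference, the mixed-profile setup described above would be driven by 
configuration along these lines. The property name follows this patch series; 
defaults and supported values may differ by release, so treat this as a sketch 
to verify against the shipped hbase-default.xml.

```xml
<!-- hbase-site.xml: ask HDFS to place WAL blocks on flash where available.
     ONE_SSD keeps one replica on SSD; ALL_SSD would use flash for all
     replicas. Servers without flash fall back to default placement. -->
<property>
  <name>hbase.wal.storage.policy</name>
  <value>ONE_SSD</value>
</property>
```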



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13890) Get/Scan from MemStore only (Client API)

2015-07-07 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616309#comment-14616309
 ] 

Anoop Sam John commented on HBASE-13890:


{code}
if (results.size() == 0 && get.isMemstoreOnly()) {
  // memory store mode
  // Nothing was found - return empty result or null
  return increment.isReturnResults() ? Result.create(results) : null;
}
{code}

I see..  Checking the patch now.  So this will fail back to the client..  Can 
the get op be repeated (without the memstore-only setting) on the server side 
instead?
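
The server-side fallback being asked about could look roughly like the sketch 
below, with the memstore and store files modeled as plain maps. The method and 
structure names are illustrative assumptions, not the patch's actual API.

```java
import java.util.List;
import java.util.Map;

// Sketch of the proposed fallback: try the memstore first and, on a miss,
// repeat the read against the store files within the same RPC instead of
// failing back to the client. All names are illustrative.
public class MemstoreFirstGet {
    static List<String> get(String row,
                            Map<String, List<String>> memstore,
                            Map<String, List<String>> storeFiles) {
        List<String> results = memstore.get(row);
        if (results != null && !results.isEmpty()) {
            return results; // fast path: recent version found in memstore
        }
        // Miss: retry without the memstore-only restriction, still server side.
        List<String> onDisk = storeFiles.get(row);
        return onDisk == null ? List.of() : onDisk;
    }

    public static void main(String[] args) {
        Map<String, List<String>> mem = Map.of("r1", List.of("v2"));
        Map<String, List<String>> disk =
            Map.of("r1", List.of("v1"), "r2", List.of("v0"));
        System.out.println(get("r1", mem, disk)); // [v2]
        System.out.println(get("r2", mem, disk)); // [v0]
        System.out.println(get("r3", mem, disk)); // []
    }
}
```

The trade-off is that a server-side retry hides the miss from the client, while 
the patch as posted surfaces it, which is what the question above is probing.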

 Get/Scan from MemStore only (Client API)
 

 Key: HBASE-13890
 URL: https://issues.apache.org/jira/browse/HBASE-13890
 Project: HBase
  Issue Type: New Feature
  Components: API, Client, Scanners
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Attachments: HBASE-13890-v1.patch


 This is short-circuit read for get/scan when recent data (version) of a cell 
 can be found only in MemStore (with very high probability). 
 Good examples are: Atomic counters and appends. This feature will allow to 
 bypass completely store file scanners and improve performance and latency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14027) Clean up netty dependencies

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616319#comment-14616319
 ] 

Hadoop QA commented on HBASE-14027:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12743885/HBASE-14027.2.patch
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743885

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14688//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14688//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14688//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14688//console

This message is automatically generated.

 Clean up netty dependencies
 ---

 Key: HBASE-14027
 URL: https://issues.apache.org/jira/browse/HBASE-14027
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 1.0.0
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14027.1.patch, HBASE-14027.2.patch


 We have multiple copies of Netty (3?) getting shipped around. Clean some up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13988) Add exception handler for lease thread

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616330#comment-14616330
 ] 

Hudson commented on HBASE-13988:


SUCCESS: Integrated in HBase-0.98 #1049 (See 
[https://builds.apache.org/job/HBase-0.98/1049/])
HBASE-13988 Add exception handler for lease thread (Liu Shaohui) (enis: rev 
9894497e7158905b3a8091e6ec8454e699be3e72)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 Add exception handler for lease thread
 --

 Key: HBASE-13988
 URL: https://issues.apache.org/jira/browse/HBASE-13988
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.2, 1.3.0, 0.98.15

 Attachments: HBASE-13988-v001.diff, HBASE-13988-v002.diff


 In a prod cluster, a region server exited because some important threads were 
 not alive. After ruling out other threads in the log, we suspected the lease 
 thread was the root cause. 
 So we need to add an exception handler to the lease thread to debug why it 
 exits in the future.
  
 {quote}
 2015-06-29,12:46:09,222 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: One or more 
 threads are no longer alive -- stop
 2015-06-29,12:46:09,223 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
 server on 21600
 ...
 2015-06-29,12:46:09,330 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Thread-37 exiting
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver21600.compactionChecker exiting
 2015-06-29,12:46:12,403 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$PeriodicMemstoreFlusher: 
 regionserver21600.periodicFlusher exiting
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13988) Add exception handler for lease thread

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616384#comment-14616384
 ] 

Hudson commented on HBASE-13988:


SUCCESS: Integrated in HBase-1.3-IT #25 (See 
[https://builds.apache.org/job/HBase-1.3-IT/25/])
HBASE-13988 Add exception handler for lease thread (Liu Shaohui) (enis: rev 
3da5058337579d72ef046166ac0c979dda5eb74b)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 Add exception handler for lease thread
 --

 Key: HBASE-13988
 URL: https://issues.apache.org/jira/browse/HBASE-13988
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.2, 1.3.0, 0.98.15

 Attachments: HBASE-13988-v001.diff, HBASE-13988-v002.diff


 In a prod cluster, a region server exited because some important threads were 
 not alive. After ruling out other threads in the log, we suspected the lease 
 thread was the root cause. 
 So we need to add an exception handler to the lease thread to debug why it 
 exits in the future.
  
 {quote}
 2015-06-29,12:46:09,222 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: One or more 
 threads are no longer alive -- stop
 2015-06-29,12:46:09,223 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
 server on 21600
 ...
 2015-06-29,12:46:09,330 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Thread-37 exiting
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver21600.compactionChecker exiting
 2015-06-29,12:46:12,403 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$PeriodicMemstoreFlusher: 
 regionserver21600.periodicFlusher exiting
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13858) RS/MasterDumpServlet dumps threads before its “Stacks” header

2015-07-07 Thread sunhaitao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunhaitao updated HBASE-13858:
--
Assignee: (was: sunhaitao)

 RS/MasterDumpServlet dumps threads before its “Stacks” header
 -

 Key: HBASE-13858
 URL: https://issues.apache.org/jira/browse/HBASE-13858
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, UI
Affects Versions: 1.1.0
Reporter: Lars George
Priority: Trivial
  Labels: beginner
 Fix For: 2.0.0, 1.3.0


 The stacktraces are captured using a Hadoop helper method, then its output is 
 merged with the current. I presume there is a simple flush after outputing 
 the Stack header missing, before then the caught output is dumped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-11182) Store backup information in a manifest file using protobuff format

2015-07-07 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-11182.
---
Resolution: Won't Fix

Closing this JIRA as it is no longer relevant to the current Backup/Restore 
roadmap.

 Store backup information in a manifest file using protobuff format
 --

 Key: HBASE-11182
 URL: https://issues.apache.org/jira/browse/HBASE-11182
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.99.0
Reporter: Jerry He
Assignee: Enoch Hsu

 A manifest file is used to store information about a backup image such as:
 Table Name
 Type: Full or Incremental
 Size
 Timestamp Info
 State Info: Converted, Merged, Compacted, etc.
 Dependency Lineage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14030) Backup/Restore Phase 1

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14030:
-

 Summary: Backup/Restore Phase 1
 Key: HBASE-14030
 URL: https://issues.apache.org/jira/browse/HBASE-14030
 Project: HBase
  Issue Type: Umbrella
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


This is the umbrella ticket for Backup/Restore Phase 1. See HBase-7912 design 
doc for the phase description.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617131#comment-14617131
 ] 

Hadoop QA commented on HBASE-13897:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12743973/13897-v2.txt
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743973

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14693//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14693//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14693//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14693//console

This message is automatically generated.

 OOM may occur when Import imports a row with too many KeyValues
 ---

 Key: HBASE-13897
 URL: https://issues.apache.org/jira/browse/HBASE-13897
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.13
Reporter: Liu Junhong
Assignee: Liu Junhong
 Fix For: 2.0.0, 0.98.14, 1.3.0

 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, 
 HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, 
 HBASE-13897-master-20150707.patch, HBASE-13897-master.patch


 When importing a row with too many KeyValues (e.g. too many columns or 
 versions), KeyValueReducer may incur an OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14034) HBase Backup/Restore Phase 1: Abstract Coordination manager (Zk) operations

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14034:
-

 Summary: HBase Backup/Restore Phase 1: Abstract Coordination 
manager (Zk) operations
 Key: HBASE-14034
 URL: https://issues.apache.org/jira/browse/HBASE-14034
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


Abstract Coordination manager (Zk) operations. See 
org.apache.hadoop.hbase.coordination package for references. Provide Zookeeper 
implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14036) HBase Backup/Restore Phase 1: Custom WAL archive cleaner

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14036:
-

 Summary: HBase Backup/Restore Phase 1: Custom WAL archive cleaner
 Key: HBASE-14036
 URL: https://issues.apache.org/jira/browse/HBASE-14036
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


Custom WAL archive cleaner (BackupLogCleaner). We need to keep WAL files in the 
archive until they either get copied over to the backup destination during an 
incremental backup, or a full backup (for ALL tables) happens. This is tricky, 
but doable. The backup-aware WAL archive cleaner should consult hbase:backup to 
determine whether a WAL file is safe to purge.
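
The cleaner's decision rule could be sketched as below, with the hbase:backup 
lookup modeled as an in-memory set of already-copied WAL names. The class and 
method names are illustrative, not the actual cleaner-delegate API.

```java
import java.util.Set;

// Sketch of a backup-aware WAL archive cleaner: a WAL file in the archive
// is safe to purge only once it has been copied to the backup destination.
// The hbase:backup consultation is modeled as a set membership check.
public class BackupLogCleanerSketch {
    private final Set<String> copiedWals;

    BackupLogCleanerSketch(Set<String> copiedWals) {
        this.copiedWals = copiedWals;
    }

    boolean isLogDeletable(String walName) {
        // Keep the file until an incremental (or full) backup has consumed it.
        return copiedWals.contains(walName);
    }

    public static void main(String[] args) {
        BackupLogCleanerSketch cleaner =
            new BackupLogCleanerSketch(Set.of("wal.1", "wal.2"));
        System.out.println(cleaner.isLogDeletable("wal.1")); // true
        System.out.println(cleaner.isLogDeletable("wal.3")); // false: not yet backed up
    }
}
```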



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs

2015-07-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617158#comment-14617158
 ] 

ramkrishna.s.vasudevan commented on HBASE-12295:


Will correct the long lines in my next revision based on the comments from RB.

 Prevent block eviction under us if reads are in progress from the BBs
 -

 Key: HBASE-12295
 URL: https://issues.apache.org/jira/browse/HBASE-12295
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 2.0.0

 Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, 
 HBASE-12295_10.patch, HBASE-12295_2.patch, HBASE-12295_4.patch, 
 HBASE-12295_4.pdf, HBASE-12295_5.pdf, HBASE-12295_9.patch, 
 HBASE-12295_trunk.patch


 While we try to serve the reads from the BBs directly from the block cache, 
 we need to ensure that the blocks does not get evicted under us while 
 reading.  This JIRA is to discuss and implement a strategy for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14024) ImportTsv is not loading hbase-default.xml

2015-07-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617287#comment-14617287
 ] 

Ted Yu commented on HBASE-14024:


lgtm

 ImportTsv is not loading hbase-default.xml
 --

 Key: HBASE-14024
 URL: https://issues.apache.org/jira/browse/HBASE-14024
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 2.0.0
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Critical
 Fix For: 2.0.0

 Attachments: HBASE-14024.patch


 ImportTsv job is failing with below exception
 {noformat}
 Exception in thread main java.lang.IllegalArgumentException: Can not create 
 a Path from a null string
   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123)
   at org.apache.hadoop.fs.Path.init(Path.java:135)
   at org.apache.hadoop.fs.Path.init(Path.java:89)
   at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591)
   at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441)
   at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406)
   at 
 org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555)
   at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772)
 {noformat}
 {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. 
 I found that hbase configuration resources from its xml are not loaded into 
 conf object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14032) HBase Backup/Restore Phase 1: Abstract SnapshotCopy (full backup)

2015-07-07 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14032:
--
Summary: HBase Backup/Restore Phase 1: Abstract SnapshotCopy (full backup)  
(was: Backup/Restore Phase 1: Abstract SnapshotCopy (full backup))

 HBase Backup/Restore Phase 1: Abstract SnapshotCopy (full backup)
 -

 Key: HBASE-14032
 URL: https://issues.apache.org/jira/browse/HBASE-14032
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov

 Abstract SnapshotCopy (full backup) to support non-M/R based implementations. 
 Provide M/R implementation. SnapshotCopy is used to copy snapshot’s data 
 during full backup operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14033) HBase Backup/Restore Phase1: Abstract WALPlayer (incremental restore)

2015-07-07 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14033:
--
Summary: HBase Backup/Restore Phase1: Abstract WALPlayer (incremental 
restore)  (was: Backup/Restore Phase1: Abstract WALPlayer (incremental restore))

 HBase Backup/Restore Phase1: Abstract WALPlayer (incremental restore)
 -

 Key: HBASE-14033
 URL: https://issues.apache.org/jira/browse/HBASE-14033
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


 Abstract WALPlayer (incremental restore) to support non-M/R based 
 implementations. Provide M/R implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14027) Clean up netty dependencies

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617168#comment-14617168
 ] 

Hadoop QA commented on HBASE-14027:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12743982/HBASE-14027.3.patch
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743982

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14694//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14694//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14694//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14694//console

This message is automatically generated.

 Clean up netty dependencies
 ---

 Key: HBASE-14027
 URL: https://issues.apache.org/jira/browse/HBASE-14027
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 1.0.0
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14027.1.patch, HBASE-14027.2.patch, 
 HBASE-14027.3.patch


 We have multiple copies of Netty (3?) getting shipped around. Clean some up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14038) Incremental backup list set is ignored during backup

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14038:
-

 Summary: Incremental backup list set is ignored during backup
 Key: HBASE-14038
 URL: https://issues.apache.org/jira/browse/HBASE-14038
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


BUG: during incremental backup, the provided table list is ignored and replaced 
with the set of tables which have already been backed up before. Test case: back 
up T1, T2, T3, then request an incremental backup for T1, T2 => T3 will be 
included as well. See: BackupClient.requestBackup.
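
The fix could be sketched as intersecting the requested list with the 
previously-backed-up set rather than replacing it. The method name is an 
illustrative assumption, not the actual BackupClient code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the fix: honor the requested table list by intersecting it with
// the set of previously backed-up tables, instead of silently substituting
// the whole set (the reported bug).
public class IncrementalBackupTables {
    static Set<String> tablesToBackup(Set<String> requested,
                                      Set<String> previouslyBackedUp) {
        Set<String> result = new LinkedHashSet<>(requested);
        // Only tables that already have a full backup can be backed up incrementally.
        result.retainAll(previouslyBackedUp);
        return result;
    }

    public static void main(String[] args) {
        Set<String> prev = Set.of("T1", "T2", "T3");
        Set<String> req = new LinkedHashSet<>(List.of("T1", "T2"));
        System.out.println(tablesToBackup(req, prev)); // [T1, T2] -- T3 excluded
    }
}
```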



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14025) Update CHANGES.txt for 1.2

2015-07-07 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617122#comment-14617122
 ] 

Sean Busbey commented on HBASE-14025:
-

{quote}
I think I get your point, but not sure how easily we can follow this practice. 
It gets complicated for committers to think about what fixVersions to set at 
the time of the commit. The thing we have where we just mark the next scheduled 
version from that branch is simple and worked so far.
{quote}

That's fair. I think this is the kind of thing that release managers will have 
to do, since they know if something actually made it into the release branch. 
Since it's one-time work on non-patch releases, it doesn't seem so bad.

I see it the same as how when the RCs get close the RM needs to move out things 
that aren't going to make it in (e.g. replacing a 1.2.0 version with 1.2.1 and 
1.3.0) or move back in things that got committed between candidates (e.g. by 
replacing 1.2.1 with 1.2.0).

{quote}
Can we do a middle ground where we keep the fixVersions in jira, and filter 
them out in CHANGES.txt for convenience if not needed?
{quote}

The problem with this is that then the release notes in Jira will be less 
accurate than the CHANGES.txt file, and I'd very much like to point to the jira 
data as authoritative.

What about having the committers continue with their current habit and then 
leaving it to the release managers to make things consistent around release 
time?


 Update CHANGES.txt for 1.2
 --

 Key: HBASE-14025
 URL: https://issues.apache.org/jira/browse/HBASE-14025
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 1.2.0
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 1.2.0


 Since it's more effort than I expected, making a ticket to track actually 
 updating CHANGES.txt so that new RMs have an idea what to expect.
 Maybe will make doc changes if there's enough here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13446) Add docs warning about missing data for downstream on versions prior to HBASE-13262

2015-07-07 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-13446:
--
Fix Version/s: (was: 1.0.2)
   1.0.3

 Add docs warning about missing data for downstream on versions prior to 
 HBASE-13262
 ---

 Key: HBASE-13446
 URL: https://issues.apache.org/jira/browse/HBASE-13446
 Project: HBase
  Issue Type: Task
  Components: documentation
Affects Versions: 0.98.0, 1.0.0
Reporter: Sean Busbey
Priority: Critical
 Fix For: 2.0.0, 0.98.14, 1.0.3


 From conversation at the end of HBASE-13262:
 [~davelatham]
 {quote}
 Should we put a warning somewhere (mailing list? book?) about this? Something 
 like:
 IF (client OR server is >= 0.98.11/1.0.0) AND server has a smaller value for 
 hbase.client.scanner.max.result.size than client does, THEN scan requests 
 that reach the server's hbase.client.scanner.max.result.size are likely to 
 miss data. In particular, 0.98.11 defaults 
 hbase.client.scanner.max.result.size to 2MB but other versions default to 
 larger values, so be very careful using 0.98.11 servers with any other client 
 version.
 {quote}
 [~busbey]
 {quote}
 How about we add a note in the ref guide for upgrades and for
 troubleshooting?
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14040) Small refactoring in BackupHandler

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14040:
-

 Summary: Small refactoring in BackupHandler
 Key: HBASE-14040
 URL: https://issues.apache.org/jira/browse/HBASE-14040
 Project: HBase
  Issue Type: Improvement
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


Move distributed log roll procedure call to BackupHandler.call from 
IncrementalBackupManager.getLogFilesForNewBackup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14000) Region server failed to report Master and stuck in reportForDuty retry loop

2015-07-07 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617210#comment-14617210
 ] 

Jerry He commented on HBASE-14000:
--

In HBASE-13317, we try to be conservative if the region server gets 
ServerNotRunningYetException during reportForDuty.
ServerNotRunningYetException means the master may still be initializing, so 
there may not be an immediate need to try a new RPC connection.

In your case, do you see the loop stuck for a long time, meaning that the old 
master continued to return ServerNotRunningYetException for a long time? 

 Region server failed to report Master and stuck in reportForDuty retry loop
 ---

 Key: HBASE-14000
 URL: https://issues.apache.org/jira/browse/HBASE-14000
 Project: HBase
  Issue Type: Bug
Reporter: Pankaj Kumar
Assignee: Pankaj Kumar
 Attachments: HBASE-14000.patch


 In a HA cluster, region server got stuck in reportForDuty retry loop if the 
 active master is restarting and later on master switch happens before it 
 reports successfully.
 Root cause is same as HBASE-13317, but the region server tried to connect 
 master when it was starting, so rssStub reset didn't happen, as
 {code}
   if (ioe instanceof ServerNotRunningYetException) {
     LOG.debug("Master is not running yet");
   }
 {code}
 When master starts, master switch happened. So RS always tried to connect to 
 standby master.
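The fix direction discussed here can be sketched in plain Java (hypothetical names, not HBase's actual classes): keep retrying reportForDuty, but drop the cached master stub after repeated ServerNotRunningYetExceptions, so that a master switch is eventually picked up instead of retrying the old standby forever.

```java
import java.util.function.Supplier;

// Simplified model of the reportForDuty retry loop: if the cached master stub
// keeps answering "not running yet", drop it and look up the (possibly new)
// active master. All names here are illustrative, not HBase internals.
public class ReportForDutyLoop {

    static class ServerNotRunningYetException extends Exception {}

    interface MasterStub { void reportForDuty() throws ServerNotRunningYetException; }

    private MasterStub rssStub;                 // cached connection to the master
    private final Supplier<MasterStub> lookup;  // resolves the current active master
    private final int resetAfter;               // failures tolerated before resetting

    ReportForDutyLoop(Supplier<MasterStub> lookup, int resetAfter) {
        this.lookup = lookup;
        this.resetAfter = resetAfter;
    }

    /** Returns the attempt number on which reporting succeeded, or -1. */
    int run(int maxAttempts) {
        int failures = 0;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (rssStub == null) {
                rssStub = lookup.get();
            }
            try {
                rssStub.reportForDuty();
                return attempt;
            } catch (ServerNotRunningYetException e) {
                // Keeping the stub forever is the stuck case described above;
                // resetting it after a few failures lets the RS find the new
                // active master after a switch.
                if (++failures >= resetAfter) {
                    rssStub = null;
                    failures = 0;
                }
            }
        }
        return -1;
    }
}
```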



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-13561:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: (was: 1.0.3)
   (was: 0.98.14)
   Status: Resolved  (was: Patch Available)

Pushed to branch-1, branch-1.1, branch-1.2, and master. Did not push to 
branch-1.0 or 0.98 (if you put up a backport, I'll apply to these branches too 
[~elserj])

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking whether 
 things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whoever launched the task.
 Am I missing something?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14032) Backup/Restore Phase 1: Abstract SnapshotCopy (full backup)

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14032:
-

 Summary: Backup/Restore Phase 1: Abstract SnapshotCopy (full 
backup)
 Key: HBASE-14032
 URL: https://issues.apache.org/jira/browse/HBASE-14032
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Abstract SnapshotCopy (full backup) to support non-M/R based implementations. 
Provide an M/R implementation. SnapshotCopy is used to copy a snapshot's data 
during the full backup operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617085#comment-14617085
 ] 

Josh Elser commented on HBASE-13561:


Awesome. Thanks for the shepherding, [~stack]. LMK if you have any issues 
backporting, I can rebase easily enough if desired.

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking whether 
 things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whoever launched the task.
 Am I missing something?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues

2015-07-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617136#comment-14617136
 ] 

Ted Yu commented on HBASE-13897:


QA run passed:
{code}
[INFO] HBase - Shaded - Server ... SUCCESS [  0.488 s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 02:09 h
[INFO] Finished at: 2015-07-07T18:24:00+00:00
{code}

 OOM may occur when Import imports a row with too many KeyValues
 ---

 Key: HBASE-13897
 URL: https://issues.apache.org/jira/browse/HBASE-13897
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.13
Reporter: Liu Junhong
Assignee: Liu Junhong
 Fix For: 2.0.0, 0.98.14, 1.3.0

 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, 
 HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, 
 HBASE-13897-master-20150707.patch, HBASE-13897-master.patch


 When importing a row with too many KeyValues (may have too many columns or 
 versions), KeyValueReducer will incur an OOM.
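The general pattern for avoiding this kind of OOM - a hedged sketch in plain Java, not the actual Import/KeyValueReducer code - is to flush the per-row buffer in bounded batches instead of accumulating every KeyValue of a wide row in memory:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustration of the bounded-buffer pattern: flush whenever the buffer
// reaches a size threshold, so memory stays bounded no matter how many
// KeyValues (columns/versions) one row carries. Strings stand in for
// KeyValues; the sink stands in for the reducer's context.write(...).
public class BoundedRowBuffer {
    private final int flushSize;
    private final Consumer<List<String>> sink;
    private final List<String> buffer = new ArrayList<>();
    private int flushes = 0;

    BoundedRowBuffer(int flushSize, Consumer<List<String>> sink) {
        this.flushSize = flushSize;
        this.sink = sink;
    }

    void add(String kv) {
        buffer.add(kv);
        if (buffer.size() >= flushSize) {
            flush();  // emit early instead of holding the whole row
        }
    }

    void close() {
        if (!buffer.isEmpty()) flush();  // emit whatever remains
    }

    private void flush() {
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        flushes++;
    }

    int flushCount() { return flushes; }
}
```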



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14031) Backup/Restore Phase 1: Abstract DistCp in incremental backup

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14031:
-

 Summary: Backup/Restore Phase 1: Abstract DistCp in incremental 
backup
 Key: HBASE-14031
 URL: https://issues.apache.org/jira/browse/HBASE-14031
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


Abstract DistCp (incremental backup) to support non-M/R based implementations. 
Provide M/R implementation. DistCp is used to copy WAL files during incremental 
backup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12848) Utilize Flash storage for WAL

2015-07-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617170#comment-14617170
 ] 

ramkrishna.s.vasudevan commented on HBASE-12848:


Reading the intent of the JIRA, it is ideally about specifying where the active 
WAL is going to be, not about the archived WALs or any other file. Movement of 
the archived files should happen at the back end, something like a copy option, 
so it should be a back-end option.
Moving some files to SSDs should also be a back-end option; it is more like 
HDFS doing it on receiving an instruction to move the files. 

 Utilize Flash storage for WAL
 -

 Key: HBASE-12848
 URL: https://issues.apache.org/jira/browse/HBASE-12848
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.1.0

 Attachments: 12848-v1.patch, 12848-v2.patch, 12848-v3.patch, 
 12848-v4.patch, 12848-v4.patch


 One way to improve data ingestion rate is to make use of Flash storage.
 HDFS is doing the heavy lifting - see HDFS-7228.
 We assume an environment where:
 1. Some servers have a mix of flash, e.g. 2 flash drives and 4 traditional 
 drives.
 2. Some servers have all traditional storage.
 3. RegionServers are deployed on both profiles within one HBase cluster.
 This JIRA allows WAL to be managed on flash in a mixed-profile environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617250#comment-14617250
 ] 

Hadoop QA commented on HBASE-13561:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12743994/HBASE-13561-v2.patch
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743994

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.phoenix.mapreduce.IndexToolIT.testMutalbleIndexWithUpdates(IndexToolIT.java:228)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14696//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14696//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14696//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14696//console

This message is automatically generated.

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking whether 
 things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whoever launched the task.
 Am I missing something?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14033) Backup/Restore Phase1: Abstract WALPlayer (incremental restore)

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14033:
-

 Summary: Backup/Restore Phase1: Abstract WALPlayer (incremental 
restore)
 Key: HBASE-14033
 URL: https://issues.apache.org/jira/browse/HBASE-14033
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


Abstract WALPlayer (incremental restore) to support non-M/R based 
implementations. Provide M/R implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14030) HBase Backup/Restore Phase 1

2015-07-07 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14030:
--
Summary: HBase Backup/Restore Phase 1  (was: Backup/Restore Phase 1)

 HBase Backup/Restore Phase 1
 

 Key: HBASE-14030
 URL: https://issues.apache.org/jira/browse/HBASE-14030
 Project: HBase
  Issue Type: Umbrella
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov

 This is the umbrella ticket for Backup/Restore Phase 1. See HBase-7912 design 
 doc for the phase description.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14031) HBase Backup/Restore Phase 1: Abstract DistCp in incremental backup

2015-07-07 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14031:
--
Summary: HBase Backup/Restore Phase 1: Abstract DistCp in incremental 
backup  (was: Backup/Restore Phase 1: Abstract DistCp in incremental backup)

 HBase Backup/Restore Phase 1: Abstract DistCp in incremental backup
 ---

 Key: HBASE-14031
 URL: https://issues.apache.org/jira/browse/HBASE-14031
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


 Abstract DistCp (incremental backup) to support non-M/R based 
 implementations. Provide M/R implementation. DistCp is used to copy WAL files 
 during incremental backup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14039) BackupHandler.deleteSnapshot MUST use HBase Snapshot API

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14039:
-

 Summary: BackupHandler.deleteSnapshot MUST use HBase Snapshot API 
 Key: HBASE-14039
 URL: https://issues.apache.org/jira/browse/HBASE-14039
 Project: HBase
  Issue Type: Improvement
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


BackupHandler.deleteSnapshot MUST use the HBase API for that (HBaseAdmin) - not 
direct FS access (deleting the snapshot folder may not be enough?).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13867) Add endpoint coprocessor guide to HBase book

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617215#comment-14617215
 ] 

Hadoop QA commented on HBASE-13867:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12743985/HBASE-13867.1.patch
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743985

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+HBase Coprocessors are modeled after the Coprocessors which are part of 
Google's BigTable 
(http://static.googleusercontent.com/media/research.google.com/en//people/jeff/SOCC2010-keynote-slides.pdf,
 pages 41-42.). + 
+Coprocessor is a framework that provides an easy way to run your custom code 
directly on Region Server.
+. Mingjie Lai's blog post  
link:https://blogs.apache.org/hbase/entry/coprocessor_introduction[Coprocessor 
Introduction].
+. Gaurav Bhardwaj's blog post 
link:http://www.3pillarglobal.com/insights/hbase-coprocessors[The How To Of 
HBase Coprocessors].
+When working with any data store (like RDBMS or HBase) you fetch the data (in 
case of RDBMS you might use SQL query and in case of HBase you use either Get 
or Scan). To fetch only relevant data you filter it (for RDBMS you put 
conditions in 'WHERE' clause and in HBase you use Filters). After fetching the 
desired data, you perform your business computation on the data. +
+This scenario is close to ideal for small data, where few thousand rows and 
a bunch of columns are returned from the data store. Now imagine a scenario 
where there are billions of rows and millions of columns and you want to 
perform some computation which requires all the data, like calculating average 
or sum. Even if you are interested in just few columns, you still have to fetch 
all the rows. There are a few drawbacks in this approach as described below:
+. In this approach the data transfer (from data store to client side) will 
become the bottleneck, and the time required to complete the operation is 
limited by the rate at which data transfer is taking place.
+. Bandwidth is one of the most precious resources in any data center. 
Operations like this will severely impact the performance of your cluster.
+. Your client code is becoming thick as you are maintaining the code for 
calculating average or summation on client side. Not a major drawback when 
talking of severe issues like performance/bandwidth but still worth giving 
consideration.
+In a scenario like this it's better to move the computation (i.e. user's 
custom code) to the data itself (Region Server). Coprocessor helps you achieve 
this but you can do more than that. There is another advantage that your code 
runs in parallel (i.e. on all Regions). To give an idea of Coprocessor's 
capabilities, different people give different analogies. The three most famous 
analogies for Coprocessor present in the industry are:

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at org.apache.oozie.test.MiniHCatServer$1.run(MiniHCatServer.java:137)
at 
org.apache.oozie.test.XTestCase$MiniClusterShutdownMonitor.run(XTestCase.java:1071)
at org.apache.oozie.test.XTestCase.waitFor(XTestCase.java:692)
at 
org.apache.oozie.action.hadoop.TestMapReduceActionExecutor.testSetExecutionStats_when_user_has_specified_stats_write_TRUE(TestMapReduceActionExecutor.java:976)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14695//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14695//artifact/patchprocess/newFindbugsWarnings.html

[jira] [Created] (HBASE-14035) HBase Backup/Restore Phase 1: hbase:backup - backup system table

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14035:
-

 Summary: HBase Backup/Restore Phase 1: hbase:backup - backup 
system table
 Key: HBASE-14035
 URL: https://issues.apache.org/jira/browse/HBASE-14035
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


*hbase:backup* - move all backup meta info from Zk (coordination manager) to an 
hbase system table.  Do not use Zk (coordination manager) as persistent 
storage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14037) Deletion of a table from backup set results in RTE during next backup

2015-07-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14037:
-

 Summary: Deletion of a table from backup set results in RTE 
during next backup 
 Key: HBASE-14037
 URL: https://issues.apache.org/jira/browse/HBASE-14037
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


 Deletion of a table with backup history (has Zk node) results in 
RuntimeException on all subsequent backup requests. See: 
BackupClient.requestBackup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13890) Get/Scan from MemStore only (Client API)

2015-07-07 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616950#comment-14616950
 ] 

Vladimir Rodionov commented on HBASE-13890:
---

{quote}
I see.. Checking the patch now. So this will fail it to client.. Can the get op 
be repeated (with out memstore only setting) at server side only?
{quote}
Yes, I think it can be improved. I am working on patch #2. I want to clarify a 
little bit what this patch is for.

This is mostly to improve high performance counters (HPC), not Get, not Append 
(is anybody using them anyway?) and not Scan operations. The most recent 
version of an HPC is always in the Memstore (99.99% of the time), but each 
store file in this region/cf has its version as well (before major compaction). 
When HBase reads a counter it has to go through all store files and compare 
results - very inefficient. This patch allows store files to be bypassed 
completely most of the time. 
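A toy model of the short-circuit read in plain Java (the real patch works inside the region scanner, and the hint name here is hypothetical): check the MemStore first, and only fall back to the store files when the cell is not found there.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Conceptual model of the memstore-first read: for hot counters the newest
// version is almost always in the MemStore, so the store-file scanners can
// usually be skipped entirely. Maps stand in for the MemStore and HFiles.
public class MemStoreFirstRead {
    private final Map<String, Long> memstore;
    private final List<Map<String, Long>> storeFiles;
    int storeFileReads = 0;  // counts how often we hit the slow path

    MemStoreFirstRead(Map<String, Long> memstore, List<Map<String, Long>> storeFiles) {
        this.memstore = memstore;
        this.storeFiles = storeFiles;
    }

    Optional<Long> get(String row, boolean memstoreOnlyHint) {
        Long v = memstore.get(row);
        if (v != null) return Optional.of(v);           // short-circuit: no file I/O
        if (memstoreOnlyHint) return Optional.empty();  // caller asked memstore-only
        storeFileReads++;                               // slow path: scan store files
        return storeFiles.stream().map(f -> f.get(row))
                .filter(x -> x != null).findFirst();
    }
}
```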

For 

 Get/Scan from MemStore only (Client API)
 

 Key: HBASE-13890
 URL: https://issues.apache.org/jira/browse/HBASE-13890
 Project: HBase
  Issue Type: New Feature
  Components: API, Client, Scanners
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Attachments: HBASE-13890-v1.patch


 This is short-circuit read for get/scan when recent data (version) of a cell 
 can be found only in MemStore (with very high probability). 
 Good examples are: Atomic counters and appends. This feature will allow to 
 bypass completely store file scanners and improve performance and latency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14030) HBase Backup/Restore Phase 1

2015-07-07 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14030:
--
Attachment: HBASE-14030-v0.patch

Current version (patch) derived from HBASE-11085 is attached.

 HBase Backup/Restore Phase 1
 

 Key: HBASE-14030
 URL: https://issues.apache.org/jira/browse/HBASE-14030
 Project: HBase
  Issue Type: Umbrella
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Attachments: HBASE-14030-v0.patch


 This is the umbrella ticket for Backup/Restore Phase 1. See HBase-7912 design 
 doc for the phase description.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2015-07-07 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617316#comment-14617316
 ] 

Vladimir Rodionov commented on HBASE-7912:
--

For the most recent updates please go to HBASE-14030.

 HBase Backup/Restore Based on HBase Snapshot
 

 Key: HBASE-7912
 URL: https://issues.apache.org/jira/browse/HBASE-7912
 Project: HBase
  Issue Type: Sub-task
Reporter: Richard Ding
Assignee: Vladimir Rodionov
  Labels: backup
 Fix For: 2.0.0

 Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
 HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
 HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
 HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf


 Finally, we completed the implementation of our backup/restore solution, and 
 would like to share with community through this jira. 
 We are leveraging the existing hbase snapshot feature and provide a general 
 solution for common users. Our full backup uses snapshot to capture 
 metadata locally and exportsnapshot to move data to another cluster; 
 the incremental backup uses an offline WALPlayer to back up HLogs; we also 
 leverage globally distributed log roll and flush to improve performance, plus 
 added value such as convert, merge, progress report, and CLI commands, so 
 that a common user can back up hbase data without in-depth knowledge of hbase. 
  Our solution also contains some usability features for enterprise users. 
 The detail design document and CLI command will be attached in this jira. We 
 plan to use 10~12 subtasks to share each of the following features, and 
 document the detail implement in the subtasks: 
 * *Full Backup* : provide local and remote back/restore for a list of tables
 * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
 backup)
 * *distributed* Logroll and distributed flush 
 * Backup *Manifest* and history
 * *Incremental* backup: to build on top of full backup as daily/weekly backup 
 * *Convert*  incremental backup WAL files into hfiles
 * *Merge* several backup images into one(like merge weekly into monthly)
 * *add and remove* table to and from Backup image
 * *Cancel* a backup process
 * backup progress *status*
 * full backup based on *existing snapshot*
 *-*
 *Below is the original description, to keep here as the history for the 
 design and discussion back in 2013*
 There have been attempts in the past to come up with a viable HBase 
 backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
 advancements and new features in HBase, for example, FileLink, Snapshot, and 
 Distributed Barrier Procedure. This is a proposal for a backup/restore 
 solution that utilizes these new features to achieve better performance and 
 consistency. 
  
 A common practice of backup and restore in database is to first take full 
 baseline backup, and then periodically take incremental backup that capture 
 the changes since the full baseline backup. HBase cluster can store massive 
 amount data.  Combination of full backups with incremental backups has 
 tremendous benefit for HBase as well.  The following is a typical scenario 
 for full and incremental backup.
 # The user takes a full backup of a table or a set of tables in HBase. 
 # The user schedules periodical incremental backups to capture the changes 
 from the full backup, or from last incremental backup.
 # The user needs to restore table data to a past point of time.
 # The full backup is restored to the table(s) or to different table name(s).  
 Then the incremental backups that are up to the desired point in time are 
 applied on top of the full backup. 
 We would support the following key features and capabilities.
 * Full backup uses HBase snapshot to capture HFiles.
 * Use HBase WALs to capture incremental changes, but we use bulk load of 
 HFiles for fast incremental restore.
 * Support single table or a set of tables, and column family level backup and 
 restore.
 * Restore to different table names.
 * Support adding additional tables or CF to backup set without interruption 
 of incremental backup schedule.
 * Support rollup/combining of incremental backups into longer period and 
 bigger incremental backups.
 * Unified command line interface for all the above.
 The solution will support HBase backup to FileSystem, either on the same 
 cluster or across clusters.  It has the flexibility to support backup to 
 other devices and servers in the future.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14038) Incremental backup list set is ignored during backup

2015-07-07 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617416#comment-14617416
 ] 

Jerry He commented on HBASE-14038:
--

Hi, [~vrodionov]

In the original design, this is intended.  The incremental backup is controlled 
by the 'incremental backup table set'.
For example, if the table set contains (table1, table2, table3), incremental 
backup will back up the WALs, which cover all the tables in the table set.
It is to avoid copying the same set of WALs, which would be the likely case if 
you backed up table1, then backed up table2.
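The point about shared WALs can be illustrated with a small sketch (hypothetical names): WAL files are per region server rather than per table, so a single pass over the union of files covers the whole backup set, while per-table backups would copy overlapping files repeatedly.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Sketch: given which WAL files carry edits for each table, compare copying
// per table (duplicates included) versus one pass over the whole table set.
public class WalCopySet {
    /** Files copied if each table were backed up separately (duplicates count). */
    static int perTableCopies(Map<String, Set<String>> walsByTable) {
        return walsByTable.values().stream().mapToInt(Set::size).sum();
    }

    /** Files copied by a single pass over the whole backup table set. */
    static Set<String> unionCopy(Map<String, Set<String>> walsByTable) {
        Set<String> union = new LinkedHashSet<>();
        walsByTable.values().forEach(union::addAll);
        return union;
    }
}
```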

 Incremental backup list set is ignored during backup
 

 Key: HBASE-14038
 URL: https://issues.apache.org/jira/browse/HBASE-14038
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


 BUG: during incremental backup, the provided table list is ignored and replaced 
 with the set of tables which have already been backed up before. Test case: 
 back up T1, T2, T3, then request an incremental backup for T1, T2 => T3 will be 
 included as well. See: BackupClient.requestBackup. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14038) Incremental backup list set is ignored during backup

2015-07-07 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617454#comment-14617454
 ] 

Vladimir Rodionov commented on HBASE-14038:
---

OK, this is not a bug but a feature, at least until we implement WAL 
filtering (by table(s)) on backup. I will leave it for now, but in the future, 
when Phase 2 begins, I will link this JIRA to Phase 2.

 Incremental backup list set is ignored during backup
 

 Key: HBASE-14038
 URL: https://issues.apache.org/jira/browse/HBASE-14038
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


 BUG: during incremental backup, the provided table list is ignored and replaced 
 with the set of tables which have already been backed up before. Test case: 
 back up T1, T2, T3, then request an incremental backup for T1, T2 => T3 will be 
 included as well. See: BackupClient.requestBackup. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics

2015-07-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617464#comment-14617464
 ] 

Ted Yu commented on HBASE-13965:


Loaded patch v5 on a small cluster and obtained the following:
{code}
  }, {
    "name" : "Hadoop:service=HBase,name=Master,sub=Balancer",
    "modelerType" : "Master,sub=Balancer",
    "tag.Context" : "master",
    "tag.Hostname" : "cn013.l42scl.hortonworks.com",
    "IntegrationTestBigLinkedList_StoreFileCostFunction" : 3.262317387568032,
    "IntegrationTestBigLinkedList_LocalityCostFunction" : 2.4739584,
    "IntegrationTestBigLinkedList_TableSkewCostFunction" : 5.60546874999,
    "IntegrationTestBigLinkedList_Overall" : 35.09174447090137,
    "IntegrationTestBigLinkedList_WriteRequestCostFunction" : 0.0,
    "IntegrationTestBigLinkedList_RegionCountSkewCostFunction" : 0.0,
    "IntegrationTestBigLinkedList_ReadRequestCostFunction" : 5.0,
    "IntegrationTestBigLinkedList_MemstoreSizeCostFunction" : 0.0,
    "IntegrationTestBigLinkedList_RegionReplicaHostCostFunction" : 0.0,
    "IntegrationTestBigLinkedList_RegionReplicaRackCostFunction" : 0.0,
    "IntegrationTestBigLinkedList_MoveCostFunction" : 18.75,
{code}
Do you think it makes sense to expose each cost (other than Overall) as a 
percentage?
This way, it would be easier for users to figure out which cost is the 
dominant factor.
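As a quick illustration of the percentage idea, the hypothetical script below converts the per-cost-function values from the JMX output above (table-name prefix dropped for brevity) into percentage contributions; this is not part of the patch.

```python
# Sketch: turn the per-cost-function JMX values pasted above into percentage
# contributions of the summed cost (the individual values add up to roughly
# the reported Overall of 35.09).
costs = {
    "StoreFileCostFunction": 3.262317387568032,
    "LocalityCostFunction": 2.4739584,
    "TableSkewCostFunction": 5.60546874999,
    "WriteRequestCostFunction": 0.0,
    "RegionCountSkewCostFunction": 0.0,
    "ReadRequestCostFunction": 5.0,
    "MemstoreSizeCostFunction": 0.0,
    "RegionReplicaHostCostFunction": 0.0,
    "RegionReplicaRackCostFunction": 0.0,
    "MoveCostFunction": 18.75,
}
total = sum(costs.values())
percentages = {name: 100.0 * value / total for name, value in costs.items()}
# MoveCostFunction dominates at roughly 53% of the summed cost.
```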

 Stochastic Load Balancer JMX Metrics
 

 Key: HBASE-13965
 URL: https://issues.apache.org/jira/browse/HBASE-13965
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, metrics
Reporter: Lei Chen
Assignee: Lei Chen
 Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, 
 HBASE-13965-v5.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, 
 stochasticloadbalancerclasses_v2.png


 Today’s default HBase load balancer (the Stochastic load balancer) is cost 
 function based. The cost function weights are tunable, but no visibility into 
 those cost function results is directly provided.
 A driving example is a cluster we have been tuning which has skewed rack sizes 
 (one rack has half the nodes of the other few racks). We are tuning the 
 cluster for uniform response time from all region servers, with the ability to 
 tolerate a rack failure. Balancing LocalityCost, RegionReplicaRackCost and 
 RegionCountSkewCost is difficult without a way to attribute each cost 
 function’s contribution to the overall cost. 
 What this jira proposes is to provide visibility via JMX into each cost 
 function of the stochastic load balancer, as well as the overall cost of the 
 balancing plan.





[jira] [Commented] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617377#comment-14617377
 ] 

Hudson commented on HBASE-13897:


SUCCESS: Integrated in HBase-TRUNK #6635 (See 
[https://builds.apache.org/job/HBase-TRUNK/6635/])
HBASE-13897 OOM may occur when Import imports a row with too many KeyValues 
(Liu Junhong) (tedyu: rev 1162cbdf15acfc63b64835cb9e7ef29d5b9c6494)
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java


 OOM may occur when Import imports a row with too many KeyValues
 ---

 Key: HBASE-13897
 URL: https://issues.apache.org/jira/browse/HBASE-13897
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.13
Reporter: Liu Junhong
Assignee: Liu Junhong
 Fix For: 2.0.0, 0.98.14, 1.3.0

 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, 
 HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, 
 HBASE-13897-master-20150707.patch, HBASE-13897-master.patch


 When importing a row with too many KeyValues (it may have too many columns or 
 versions), the KeyValueReducer will incur an OOM.
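A bounded-batch write pattern avoids holding a whole row's KeyValues in memory at once. The sketch below is an illustrative Python outline of that idea; the function name is invented, and the actual Import.java fix may differ.

```python
# Sketch (assumption, not the actual HBASE-13897 patch): instead of
# materializing every KeyValue of a row before writing, the reducer can emit
# them in bounded batches so memory use stays O(batch_size), not O(row).

def write_in_batches(values, write, batch_size=1000):
    batch = []
    for kv in values:          # `values` may be a very large iterator for one row
        batch.append(kv)
        if len(batch) >= batch_size:
            write(batch)
            batch = []         # start a fresh list; memory stays bounded
    if batch:
        write(batch)           # flush the final partial batch

written = []
write_in_batches(range(2500), written.append, batch_size=1000)
# -> three batches of sizes 1000, 1000, 500
```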





[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617475#comment-14617475
 ] 

Hudson commented on HBASE-13561:


SUCCESS: Integrated in HBase-1.2-IT #42 (See 
[https://builds.apache.org/job/HBase-1.2-IT/42/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev 5a11c80aa0fe6e19f16abc9346467e4eef179526)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does, not just leaving it up to the visual inspection 
 of whoever launched the task.
 Am I missing something?
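The kind of counter check being asked for can be outlined as follows. This is an illustrative Python sketch: the counter names (REFERENCED, UNDEFINED, UNREFERENCED) follow ITBLL's convention, but the helper itself is hypothetical, not the patch.

```python
# Sketch of the counter verification {{Verify}} could do after the job
# completes, modeled on what {{Loop}} does.

def verify_counters(counters, expected_referenced=None):
    """counters: dict of job counter name -> count."""
    if counters.get("UNDEFINED", 0) != 0:
        return False           # back-references to rows that were never written
    if counters.get("UNREFERENCED", 0) != 0:
        return False           # rows that nothing points at
    if expected_referenced is not None:
        return counters.get("REFERENCED", 0) == expected_referenced
    return True

assert verify_counters({"REFERENCED": 100, "UNDEFINED": 0})
assert not verify_counters({"REFERENCED": 99, "UNDEFINED": 1})
```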





[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617470#comment-14617470
 ] 

Hudson commented on HBASE-13561:


SUCCESS: Integrated in HBase-1.3-IT #26 (See 
[https://builds.apache.org/job/HBase-1.3-IT/26/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev 4e84ac7924a4f09be05c57ec018c796b960d3760)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does, not just leaving it up to the visual inspection 
 of whoever launched the task.
 Am I missing something?





[jira] [Commented] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616847#comment-14616847
 ] 

Hadoop QA commented on HBASE-13897:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12743970/HBASE-13897-master-20150707.patch
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743970

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14692//console

This message is automatically generated.

 OOM may occur when Import imports a row with too many KeyValues
 ---

 Key: HBASE-13897
 URL: https://issues.apache.org/jira/browse/HBASE-13897
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.13
Reporter: Liu Junhong
Assignee: Liu Junhong
 Fix For: 0.98.14

 Attachments: HBASE-13897-0.98.patch, 
 HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, 
 HBASE-13897-master-20150707.patch, HBASE-13897-master.patch


 When importing a row with too many KeyValues (it may have too many columns or 
 versions), the KeyValueReducer will incur an OOM.





[jira] [Updated] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues

2015-07-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-13897:
---
Attachment: 13897-v2.txt

 OOM may occur when Import imports a row with too many KeyValues
 ---

 Key: HBASE-13897
 URL: https://issues.apache.org/jira/browse/HBASE-13897
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.13
Reporter: Liu Junhong
Assignee: Liu Junhong
 Fix For: 0.98.14

 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, 
 HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, 
 HBASE-13897-master-20150707.patch, HBASE-13897-master.patch


 When importing a row with too many KeyValues (it may have too many columns or 
 versions), the KeyValueReducer will incur an OOM.





[jira] [Commented] (HBASE-14023) HBase Srores NULL Value from delimited File Input

2015-07-07 Thread Soumendra Kumar Mishra (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616863#comment-14616863
 ] 

Soumendra Kumar Mishra commented on HBASE-14023:


Can you provide the mailing list address where I should post the issue?

 HBase Srores NULL Value from delimited File Input
 -

 Key: HBASE-14023
 URL: https://issues.apache.org/jira/browse/HBASE-14023
 Project: HBase
  Issue Type: Bug
Reporter: Soumendra Kumar Mishra

 Data:
 101,SMITH,41775,,1000,,100,10
 102,ALLEN,,77597,2000,,,20
 103,WARD,,,2000,500,,30
 Result:
 ROW  COLUMN+CELL
 101  column=info:dept, timestamp=1435992182400, value=10
 101  column=info:ename, timestamp=1435992182400, value=SMITH
 101  column=pay:bonus, timestamp=1435992182400, value=100
 101  column=pay:comm, timestamp=1435992182400, value=
 101  column=pay:sal, timestamp=1435992182400, value=1000
 101  column=tel:mobile, timestamp=1435992182400, value=
 101  column=tel:telephone, timestamp=1435992182400, value=41775
 I am using Pig to write data into HBase. The same issue happened when data 
 was inserted from a text file into HBase.





[jira] [Updated] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-13561:
--
Fix Version/s: 1.0.3
   0.98.14

Pushed the 0.98 and branch-1.0 patches. Thanks [~elserj]

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-0.98-v2.patch, 
 HBASE-13561-branch-1.0-v2.patch, HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does, not just leaving it up to the visual inspection 
 of whoever launched the task.
 Am I missing something?





[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617550#comment-14617550
 ] 

Hudson commented on HBASE-13561:


FAILURE: Integrated in HBase-1.1 #577 (See 
[https://builds.apache.org/job/HBase-1.1/577/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev 555c42a3f1c89196d9d9f0a70cd73fa7464fa42c)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1

 Attachments: HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does, not just leaving it up to the visual inspection 
 of whoever launched the task.
 Am I missing something?





[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.

2015-07-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-13971:
---
Attachment: 13971-v1.txt

 Flushes stuck since 6 hours on a regionserver.
 --

 Key: HBASE-13971
 URL: https://issues.apache.org/jira/browse/HBASE-13971
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 1.3.0
 Environment: Caused while running IntegrationTestLoadAndVerify for 20 
 M rows on cluster with 32 region servers each with max heap size of 24GBs.
Reporter: Abhilash
Priority: Critical
 Attachments: 13971-v1.txt, jstack.1, jstack.2, jstack.3, jstack.4, 
 jstack.5, rsDebugDump.txt, screenshot-1.png


 One region server is stuck while flushing (possible deadlock). It has been 
 trying to flush two regions for the last 6 hours (see the screenshot).
 This was caused while running IntegrationTestLoadAndVerify for 20 M rows with 
 600 mapper jobs and 100 back references. There have been ~37 million writes 
 on each regionserver so far, but no writes have happened on any regionserver 
 for the past 6 hours and their memstore size is zero (I don't know if this is 
 related). This particular regionserver, however, has had a memstore size of 
 9 GB for the past 6 hours.
 Relevant snaps from debug dump:
 Tasks:
 ===
 Task: Flushing 
 IntegrationTestLoadAndVerify,R\x9B\x1B\xBF\xAE\x08\xD1\xA2,1435179555993.8e2d075f94ce7699f416ec4ced9873cd.
 Status: RUNNING:Preparing to flush by snapshotting stores in 
 8e2d075f94ce7699f416ec4ced9873cd
 Running for 22034s
 Task: Flushing 
 IntegrationTestLoadAndVerify,\x93\xA385\x81Z\x11\xE6,1435179555993.9f8d0e01a40405b835bf6e5a22a86390.
 Status: RUNNING:Preparing to flush by snapshotting stores in 
 9f8d0e01a40405b835bf6e5a22a86390
 Running for 22033s
 Executors:
 ===
 ...
 Thread 139 (MemStoreFlusher.1):
   State: WAITING
   Blocked count: 139711
   Waited count: 239212
   Waiting on java.util.concurrent.CountDownLatch$Sync@b9c094a
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
 java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
 org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
 
 org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
 org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902)
 org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
 java.lang.Thread.run(Thread.java:745)
 Thread 137 (MemStoreFlusher.0):
   State: WAITING
   Blocked count: 138931
   Waited count: 237448
   Waiting on java.util.concurrent.CountDownLatch$Sync@53f41f76
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
 java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
 org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
 
 org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
 

[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.

2015-07-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-13971:
---
Attachment: (was: 13971-v1.txt)

 Flushes stuck since 6 hours on a regionserver.
 --

 Key: HBASE-13971
 URL: https://issues.apache.org/jira/browse/HBASE-13971
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 1.3.0
 Environment: Caused while running IntegrationTestLoadAndVerify for 20 
 M rows on cluster with 32 region servers each with max heap size of 24GBs.
Reporter: Abhilash
Priority: Critical
 Attachments: 13971-v1.txt, jstack.1, jstack.2, jstack.3, jstack.4, 
 jstack.5, rsDebugDump.txt, screenshot-1.png


 One region server is stuck while flushing (possible deadlock). It has been 
 trying to flush two regions for the last 6 hours (see the screenshot).
 This was caused while running IntegrationTestLoadAndVerify for 20 M rows with 
 600 mapper jobs and 100 back references. There have been ~37 million writes 
 on each regionserver so far, but no writes have happened on any regionserver 
 for the past 6 hours and their memstore size is zero (I don't know if this is 
 related). This particular regionserver, however, has had a memstore size of 
 9 GB for the past 6 hours.
 Relevant snaps from debug dump:
 Tasks:
 ===
 Task: Flushing 
 IntegrationTestLoadAndVerify,R\x9B\x1B\xBF\xAE\x08\xD1\xA2,1435179555993.8e2d075f94ce7699f416ec4ced9873cd.
 Status: RUNNING:Preparing to flush by snapshotting stores in 
 8e2d075f94ce7699f416ec4ced9873cd
 Running for 22034s
 Task: Flushing 
 IntegrationTestLoadAndVerify,\x93\xA385\x81Z\x11\xE6,1435179555993.9f8d0e01a40405b835bf6e5a22a86390.
 Status: RUNNING:Preparing to flush by snapshotting stores in 
 9f8d0e01a40405b835bf6e5a22a86390
 Running for 22033s
 Executors:
 ===
 ...
 Thread 139 (MemStoreFlusher.1):
   State: WAITING
   Blocked count: 139711
   Waited count: 239212
   Waiting on java.util.concurrent.CountDownLatch$Sync@b9c094a
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
 java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
 org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
 
 org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
 org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902)
 org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
 java.lang.Thread.run(Thread.java:745)
 Thread 137 (MemStoreFlusher.0):
   State: WAITING
   Blocked count: 138931
   Waited count: 237448
   Waiting on java.util.concurrent.CountDownLatch$Sync@53f41f76
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
 java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
 org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
 
 org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
 

[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617627#comment-14617627
 ] 

Hudson commented on HBASE-13561:


SUCCESS: Integrated in HBase-1.3 #41 (See 
[https://builds.apache.org/job/HBase-1.3/41/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev 4e84ac7924a4f09be05c57ec018c796b960d3760)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1

 Attachments: HBASE-13561-0.98-v2.patch, 
 HBASE-13561-branch-1.0-v2.patch, HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does, not just leaving it up to the visual inspection 
 of whoever launched the task.
 Am I missing something?





[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617641#comment-14617641
 ] 

Hudson commented on HBASE-13561:


FAILURE: Integrated in HBase-TRUNK #6636 (See 
[https://builds.apache.org/job/HBase-TRUNK/6636/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev f5ad736282c8c9c27b14131919d60b72834ec9e4)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1

 Attachments: HBASE-13561-0.98-v2.patch, 
 HBASE-13561-branch-1.0-v2.patch, HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 whether things like UNDEFINED records were found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does, not just leaving it up to the visual inspection 
 of whoever launched the task.
 Am I missing something?





[jira] [Updated] (HBASE-13971) Flushes stuck since 6 hours on a regionserver.

2015-07-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-13971:
---
Attachment: 13971-v1.txt

First attempt.

Set an upper limit on the duration for which HRegion waits for a sequence 
number to be assigned.
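The proposed bound on the wait can be illustrated with an event timeout. The real code is Java (a `CountDownLatch.await` inside `WALKey.getSequenceId`, per the stack traces below), so this Python sketch only illustrates the idea of failing fast instead of blocking forever.

```python
# Sketch: bound the wait for the sequence id instead of blocking forever on
# the latch. The function name mirrors the Java method but is illustrative.
import threading

def get_sequence_id(assigned: threading.Event, timeout_secs: float):
    # Event.wait(timeout) is the Python analogue of CountDownLatch.await(t, unit)
    if not assigned.wait(timeout=timeout_secs):
        raise TimeoutError("sequence id not assigned within %ss" % timeout_secs)
    return True

ev = threading.Event()
timed_out = False
try:
    get_sequence_id(ev, 0.01)      # nobody assigns -> times out, no deadlock
except TimeoutError:
    timed_out = True
ev.set()                           # sequence id assigned
assert get_sequence_id(ev, 0.01)   # now returns promptly
```

The flusher thread can then surface the timeout as a failed flush and retry, rather than wedging the MemStoreFlusher indefinitely.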

 Flushes stuck since 6 hours on a regionserver.
 --

 Key: HBASE-13971
 URL: https://issues.apache.org/jira/browse/HBASE-13971
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 1.3.0
 Environment: Caused while running IntegrationTestLoadAndVerify for 20 
 M rows on cluster with 32 region servers each with max heap size of 24GBs.
Reporter: Abhilash
Priority: Critical
 Attachments: 13971-v1.txt, jstack.1, jstack.2, jstack.3, jstack.4, 
 jstack.5, rsDebugDump.txt, screenshot-1.png


 One region server is stuck while flushing (possible deadlock). It has been 
 trying to flush two regions for the last 6 hours (see the screenshot).
 This was caused while running IntegrationTestLoadAndVerify for 20 M rows with 
 600 mapper jobs and 100 back references. There have been ~37 million writes 
 on each regionserver so far, but no writes have happened on any regionserver 
 for the past 6 hours and their memstore size is zero (I don't know if this is 
 related). This particular regionserver, however, has had a memstore size of 
 9 GB for the past 6 hours.
 Relevant snaps from debug dump:
 Tasks:
 ===
 Task: Flushing 
 IntegrationTestLoadAndVerify,R\x9B\x1B\xBF\xAE\x08\xD1\xA2,1435179555993.8e2d075f94ce7699f416ec4ced9873cd.
 Status: RUNNING:Preparing to flush by snapshotting stores in 
 8e2d075f94ce7699f416ec4ced9873cd
 Running for 22034s
 Task: Flushing 
 IntegrationTestLoadAndVerify,\x93\xA385\x81Z\x11\xE6,1435179555993.9f8d0e01a40405b835bf6e5a22a86390.
 Status: RUNNING:Preparing to flush by snapshotting stores in 
 9f8d0e01a40405b835bf6e5a22a86390
 Running for 22033s
 Executors:
 ===
 ...
 Thread 139 (MemStoreFlusher.1):
   State: WAITING
   Blocked count: 139711
   Waited count: 239212
   Waiting on java.util.concurrent.CountDownLatch$Sync@b9c094a
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
 java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
 org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
 
 org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
 org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902)
 org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
 java.lang.Thread.run(Thread.java:745)
 Thread 137 (MemStoreFlusher.0):
   State: WAITING
   Blocked count: 138931
   Waited count: 237448
   Waiting on java.util.concurrent.CountDownLatch$Sync@53f41f76
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
 java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
 org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
 
 org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
 

[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617655#comment-14617655
 ] 

Hudson commented on HBASE-13561:


FAILURE: Integrated in HBase-1.2 #57 (See 
[https://builds.apache.org/job/HBase-1.2/57/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev 5a11c80aa0fe6e19f16abc9346467e4eef179526)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-0.98-v2.patch, 
 HBASE-13561-branch-1.0-v2.patch, HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 if there were things like UNDEFINED records found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whomever launched the task.
 Am I missing something?
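The missing check could be as simple as asserting on the job's counters after it completes, the way {{Loop}} does. A minimal sketch, with illustrative counter names rather than the actual ITBLL counter enum:

```java
import java.util.HashMap;
import java.util.Map;

public class CounterCheck {

    // Returns true only if the job counted no broken references.
    // "UNDEFINED"/"UNREFERENCED" are illustrative names, not the real enum.
    public static boolean verifyCounters(Map<String, Long> counters) {
        long undefined = counters.getOrDefault("UNDEFINED", 0L);
        long unreferenced = counters.getOrDefault("UNREFERENCED", 0L);
        return undefined == 0 && unreferenced == 0;
    }

    public static void main(String[] args) {
        Map<String, Long> ok = new HashMap<>();
        ok.put("REFERENCED", 1000000L);
        Map<String, Long> bad = new HashMap<>(ok);
        bad.put("UNDEFINED", 3L);
        System.out.println(verifyCounters(ok));   // true
        System.out.println(verifyCounters(bad));  // false
    }
}
```

The point is only that the tool fails loudly instead of relying on visual inspection of the counter output.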



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions

2015-07-07 Thread Ishan Chhabra (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617698#comment-14617698
 ] 

Ishan Chhabra commented on HBASE-6028:
--

[~esteban], are you working on this?

 Implement a cancel for in-progress compactions
 --

 Key: HBASE-6028
 URL: https://issues.apache.org/jira/browse/HBASE-6028
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Derek Wollenstein
Assignee: Esteban Gutierrez
Priority: Minor
  Labels: beginner

 Depending on current server load, it can be extremely expensive to run 
 periodic minor / major compactions.  It would be helpful to have a feature 
 where a user could use the shell or a client tool to explicitly cancel an 
 in-progress compactions.  This would allow a system to recover when too many 
 regions became eligible for compactions at once



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13867) Add endpoint coprocessor guide to HBase book

2015-07-07 Thread Gaurav Bhardwaj (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617701#comment-14617701
 ] 

Gaurav Bhardwaj commented on HBASE-13867:
-

Is there any restriction that the length of a line should be less than 100 
characters? By line I mean the number of characters (including whitespace) 
between two carriage returns. 
Note: http://hbase.apache.org/book.html has many occurrences where the line 
length is greater than 100.

[~vrodionov], [~gliptak] please suggest.

 Add endpoint coprocessor guide to HBase book
 

 Key: HBASE-13867
 URL: https://issues.apache.org/jira/browse/HBASE-13867
 Project: HBase
  Issue Type: Task
  Components: Coprocessors, documentation
Reporter: Vladimir Rodionov
Assignee: Gaurav Bhardwaj
 Attachments: HBASE-13867.1.patch


 Endpoint coprocessors are very poorly documented.
 Coprocessor section of HBase book must be updated either with its own 
 endpoint coprocessors HOW-TO guide or, at least, with the link(s) to some 
 other guides. There is good description here:
 http://www.3pillarglobal.com/insights/hbase-coprocessors



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-13561:
---
Attachment: HBASE-13561-branch-1.0-v2.patch
HBASE-13561-0.98-v2.patch

v2 patches for 0.98 and branch-1.0

 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1

 Attachments: HBASE-13561-0.98-v2.patch, 
 HBASE-13561-branch-1.0-v2.patch, HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 if there were things like UNDEFINED records found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whomever launched the task.
 Am I missing something?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-14023) HBase Stores NULL Value from delimited File Input

2015-07-07 Thread Pankaj Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar reassigned HBASE-14023:


Assignee: Pankaj Kumar

 HBase Stores NULL Value from delimited File Input
 -

 Key: HBASE-14023
 URL: https://issues.apache.org/jira/browse/HBASE-14023
 Project: HBase
  Issue Type: Bug
Reporter: Soumendra Kumar Mishra
Assignee: Pankaj Kumar

 Data:
 101,SMITH,41775,,1000,,100,10
 102,ALLEN,,77597,2000,,,20
 103,WARD,,,2000,500,,30
 Result:
 ROW  COLUMN+CELL
 101  column=info:dept, timestamp=1435992182400, value=10
 101  column=info:ename, timestamp=1435992182400, value=SMITH
 101  column=pay:bonus, timestamp=1435992182400, value=100
 101  column=pay:comm, timestamp=1435992182400, value=
 101  column=pay:sal, timestamp=1435992182400, value=1000
 101  column=tel:mobile, timestamp=1435992182400, value=
 101  column=tel:telephone, timestamp=1435992182400, value=41775
 I am using Pig to write data into HBase. The same issue happened when data 
 was inserted from a text file into HBase.
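A possible client-side workaround is to skip empty delimited fields before writing, so HBase stores no cell at all for a missing value instead of an empty one. A hypothetical sketch (the column order is inferred from the result listing above, and {{RowParser}} is an illustrative name, not part of any HBase or Pig API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowParser {

    // Column order inferred from the sample data and scan result above.
    static final String[] COLS =
        {"id", "ename", "telephone", "mobile", "sal", "comm", "bonus", "dept"};

    // Parse one delimited line, dropping empty fields so that no cell is
    // ever written with an empty value.
    public static Map<String, String> parse(String line) {
        String[] parts = line.split(",", -1);  // -1 keeps trailing empties
        Map<String, String> cells = new LinkedHashMap<>();
        for (int i = 0; i < COLS.length && i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                cells.put(COLS[i], parts[i]);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        // mobile and comm are empty in the input, so they produce no cells.
        System.out.println(parse("101,SMITH,41775,,1000,,100,10"));
    }
}
```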



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs

2015-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616633#comment-14616633
 ] 

Hadoop QA commented on HBASE-12295:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12743926/HBASE-12295_10.patch
  against master branch at commit 7acb061e63614ad957da654f920f54ac7a02edd6.
  ATTACHMENT ID: 12743926

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 41 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1908 checkstyle errors (more than the master's current 1898 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  ret = new SharedMemorySizeCachedNoTagsKeyValue(blockBuffer.array(), blockBuffer.arrayOffset()
+// those cells are referring to a shared memory area which if evicted by the BucketCache would lead
+// readers using this block are aware of this fact and do the necessary action to prevent eviction
+// An RpcCallBack that creates a list of scanners that needs to perform callBack operation on completion of multiGets
+return Result.create(results, get.isCheckExistenceOnly() ? !results.isEmpty() : null, stale);
+  // HBaseAdmin only waits for regions to appear in hbase:meta we should wait until they are assigned
+  public void testGetsWithMultiColumnsAndExplicitTracker() throws IOException, InterruptedException {
+private void slowdownCode(final ObserverContext&lt;RegionCoprocessorEnvironment&gt; e, boolean isGet) {
+// call return twice because for the isCache cased the counter would have got incremented twice

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14690//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14690//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14690//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14690//console

This message is automatically generated.

 Prevent block eviction under us if reads are in progress from the BBs
 -

 Key: HBASE-12295
 URL: https://issues.apache.org/jira/browse/HBASE-12295
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 2.0.0

 Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, 
 HBASE-12295_10.patch, HBASE-12295_2.patch, HBASE-12295_4.patch, 
 HBASE-12295_4.pdf, HBASE-12295_5.pdf, HBASE-12295_9.patch, 
 HBASE-12295_trunk.patch


 While we try to serve the reads from the BBs directly from the block cache, 
 we need to ensure that the blocks do not get evicted from under us while 
 reading.  This JIRA is to discuss and implement a strategy for the same.
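One common strategy for this is reference counting: a reader pins the block before reading and unpins it when done, and eviction is refused while any reference is held. A minimal sketch of the idea, with hypothetical names rather than the actual BucketCache code:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CachedBlock {

    private final AtomicInteger refCount = new AtomicInteger(0);

    public void retain() {
        refCount.incrementAndGet();   // a reader starts using the block
    }

    public void release() {
        refCount.decrementAndGet();   // the reader is done with it
    }

    // Eviction is allowed only while no reader holds a reference.
    public boolean tryEvict() {
        return refCount.get() == 0;
    }

    public static void main(String[] args) {
        CachedBlock block = new CachedBlock();
        block.retain();
        System.out.println(block.tryEvict());  // false: a read is in progress
        block.release();
        System.out.println(block.tryEvict());  // true: safe to evict now
    }
}
```

A real implementation also has to handle the race between the count reaching zero and eviction proceeding, which is where most of the design discussion in this JIRA lives.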



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-2236) Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)

2015-07-07 Thread Heng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heng Chen reassigned HBASE-2236:


Assignee: Heng Chen

 Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)
 --

 Key: HBASE-2236
 URL: https://issues.apache.org/jira/browse/HBASE-2236
 Project: HBase
  Issue Type: Bug
  Components: regionserver, wal
Reporter: stack
Assignee: Heng Chen
Priority: Critical
  Labels: moved_from_0_20_5

 So hbase-2053 is not aggressive enough.  WALs can still overwhelm the upper 
 limit on log count.  While the code added by HBASE-2053 will, when done, 
 ensure we let go of the oldest WAL, doing so might require flushing many 
 regions.  E.g.:
 {code}
 2010-02-15 14:20:29,351 INFO org.apache.hadoop.hbase.regionserver.HLog: Too 
 many hlogs: logs=45, maxlogs=32; forcing flush of 5 regions(s): 
 test1,193717,1266095474624, test1,194375,1266108228663, 
 test1,195690,1266095539377, test1,196348,1266095539377, 
 test1,197939,1266069173999
 {code}
 This takes time.  If we are taking on edits at a furious rate, we might have 
 rolled the log again in the meantime, maybe more than once.
 Also log rolls happen inline with a put/delete as soon as it hits the 64MB 
 (default) boundary whereas the necessary flushing is done in background by a 
 single thread and the memstore can overrun the (default) 64MB size.  Flushes 
 needed to release logs will be mixed in with natural flushes as memstores 
 fill.  Flushes may take longer than the writing of an HLog because they can 
 be larger.
 So, on an RS that is struggling, the tendency would seem to be for a slight 
 rise in WALs.  Only if the RS gets a breather will the flusher catch up.
 If HBASE-2087 happens, then the count of WALs gets a boost.
 Ideas to fix this for good would be :
 + Priority queue for queuing up flushes with those that are queued to free up 
 WALs having priority
 + Improve the HBASE-2053 code so that it will free more than just the last 
 WAL, maybe even queuing flushes so we clear all WALs such that we are back 
 under the maximum WALS threshold again.
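The priority-queue idea above could be sketched roughly as follows; the classes and fields are hypothetical illustrations, not the actual MemStoreFlusher code:

```java
import java.util.concurrent.PriorityBlockingQueue;

public class FlushQueue {

    // A flush request; requests queued to free up WALs sort ahead of
    // ordinary memstore-pressure flushes.
    public static class FlushRequest implements Comparable<FlushRequest> {
        public final String region;
        public final boolean freesWal;

        public FlushRequest(String region, boolean freesWal) {
            this.region = region;
            this.freesWal = freesWal;
        }

        @Override
        public int compareTo(FlushRequest other) {
            // true (WAL-freeing) sorts before false, so it is served first.
            return Boolean.compare(other.freesWal, this.freesWal);
        }
    }

    public static void main(String[] args) {
        PriorityBlockingQueue<FlushRequest> queue = new PriorityBlockingQueue<>();
        queue.add(new FlushRequest("test1,193717", false));
        queue.add(new FlushRequest("test1,194375", true));  // queued to free a WAL
        queue.add(new FlushRequest("test1,195690", false));
        System.out.println(queue.poll().region);  // test1,194375 is served first
    }
}
```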



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14029) getting started for standalone still references hadoop-version-specific binary artifacts

2015-07-07 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-14029:
---

 Summary: getting started for standalone still references 
hadoop-version-specific binary artifacts
 Key: HBASE-14029
 URL: https://issues.apache.org/jira/browse/HBASE-14029
 Project: HBase
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.0.0
Reporter: Sean Busbey


As of HBase 1.0 we no longer have binary artifacts that are tied to a 
particular hadoop release. The current section of the ref guide for getting 
started with standalone mode still refers to them:

{quote}
Choose a download site from this list of Apache Download Mirrors. Click on the 
suggested top link. This will take you to a mirror of HBase Releases. Click on 
the folder named stable and then download the binary file that ends in .tar.gz 
to your local filesystem. Be sure to choose the version that corresponds with 
the version of Hadoop you are likely to use later. In most cases, you should 
choose the file for Hadoop 2, which will be called something like 
hbase-0.98.3-hadoop2-bin.tar.gz. Do not download the file ending in src.tar.gz 
for now.
{quote}

Either remove the reference or turn it into a note call-out for versions 0.98 
and earlier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14000) Region server failed to report Master and stuck in reportForDuty retry loop

2015-07-07 Thread Pankaj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616483#comment-14616483
 ] 

Pankaj Kumar commented on HBASE-14000:
--

[~tedyu], I tried but I was not able to simulate this scenario in a test case.

 Region server failed to report Master and stuck in reportForDuty retry loop
 ---

 Key: HBASE-14000
 URL: https://issues.apache.org/jira/browse/HBASE-14000
 Project: HBase
  Issue Type: Bug
Reporter: Pankaj Kumar
Assignee: Pankaj Kumar
 Attachments: HBASE-14000.patch


 In an HA cluster, the region server got stuck in the reportForDuty retry 
 loop when the active master was restarting and a master switch happened 
 before the RS could report successfully.
 The root cause is the same as HBASE-13317, but the region server tried to 
 connect to the master while it was starting, so the rssStub reset didn't 
 happen because of this check:
 {code}
   if (ioe instanceof ServerNotRunningYetException) {
     LOG.debug("Master is not running yet");
   }
 {code}
 By the time the master started, a master switch had happened, so the RS kept 
 trying to connect to the standby master.
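A minimal sketch of the kind of fix this implies: drop the cached stub on any IOException so the next retry re-resolves the active master. All names here are hypothetical stand-ins, not the actual HRegionServer code:

```java
import java.io.IOException;

public class ReportForDuty {

    public interface MasterStub {
        void report() throws IOException;
    }

    public interface StubFactory {
        MasterStub create();  // resolves whichever master is currently active
    }

    private MasterStub rssStub;
    private final StubFactory factory;

    public ReportForDuty(StubFactory factory) {
        this.factory = factory;
    }

    // One report attempt. On *any* IOException the cached stub is dropped,
    // so the next attempt re-resolves the active master instead of retrying
    // a stale (possibly standby) stub forever.
    public boolean tryReport() {
        if (rssStub == null) {
            rssStub = factory.create();
        }
        try {
            rssStub.report();
            return true;
        } catch (IOException ioe) {
            rssStub = null;  // forget the old master
            return false;
        }
    }
}
```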



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-2236) Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)

2015-07-07 Thread Heng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heng Chen updated HBASE-2236:
-
Assignee: (was: Heng Chen)

 Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)
 --

 Key: HBASE-2236
 URL: https://issues.apache.org/jira/browse/HBASE-2236
 Project: HBase
  Issue Type: Bug
  Components: regionserver, wal
Reporter: stack
Priority: Critical
  Labels: moved_from_0_20_5

 So hbase-2053 is not aggressive enough.  WALs can still overwhelm the upper 
 limit on log count.  While the code added by HBASE-2053 will, when done, 
 ensure we let go of the oldest WAL, doing so might require flushing many 
 regions.  E.g.:
 {code}
 2010-02-15 14:20:29,351 INFO org.apache.hadoop.hbase.regionserver.HLog: Too 
 many hlogs: logs=45, maxlogs=32; forcing flush of 5 regions(s): 
 test1,193717,1266095474624, test1,194375,1266108228663, 
 test1,195690,1266095539377, test1,196348,1266095539377, 
 test1,197939,1266069173999
 {code}
 This takes time.  If we are taking on edits at a furious rate, we might have 
 rolled the log again in the meantime, maybe more than once.
 Also log rolls happen inline with a put/delete as soon as it hits the 64MB 
 (default) boundary whereas the necessary flushing is done in background by a 
 single thread and the memstore can overrun the (default) 64MB size.  Flushes 
 needed to release logs will be mixed in with natural flushes as memstores 
 fill.  Flushes may take longer than the writing of an HLog because they can 
 be larger.
 So, on an RS that is struggling, the tendency would seem to be for a slight 
 rise in WALs.  Only if the RS gets a breather will the flusher catch up.
 If HBASE-2087 happens, then the count of WALs gets a boost.
 Ideas to fix this for good would be :
 + Priority queue for queuing up flushes with those that are queued to free up 
 WALs having priority
 + Improve the HBASE-2053 code so that it will free more than just the last 
 WAL, maybe even queuing flushes so we clear all WALs such that we are back 
 under the maximum WALS threshold again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13988) Add exception handler for lease thread

2015-07-07 Thread Liu Shaohui (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616541#comment-14616541
 ] 

Liu Shaohui commented on HBASE-13988:
-

The addendum for patch v001 just removes the out-of-date comments about the 
lease thread.
I will commit it tomorrow if there is no objection. 

 Add exception handler for lease thread
 --

 Key: HBASE-13988
 URL: https://issues.apache.org/jira/browse/HBASE-13988
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.2, 1.3.0, 0.98.15

 Attachments: HBASE-13988-addendum.diff, HBASE-13988-v001.diff, 
 HBASE-13988-v002.diff


 In a prod cluster, a region server exited because some important 
 threads were not alive. After excluding other threads from the log, we 
 suspected the lease thread was the root cause. 
 So we need to add an exception handler to the lease thread so we can debug 
 why it exits in the future.
  
 {quote}
 2015-06-29,12:46:09,222 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: One or more 
 threads are no longer alive -- stop
 2015-06-29,12:46:09,223 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
 server on 21600
 ...
 2015-06-29,12:46:09,330 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Thread-37 exiting
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver21600.compactionChecker exiting
 2015-06-29,12:46:12,403 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$PeriodicMemstoreFlusher: 
 regionserver21600.periodicFlusher exiting
 {quote}
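A minimal sketch of what such a handler might look like, using the standard Thread#setUncaughtExceptionHandler hook; the thread name and the logging destination are illustrative, not the actual Leases code:

```java
public class LeaseThreadDemo {

    // Captures the last uncaught error for demonstration; a real handler
    // would write to the region server log instead.
    public static volatile String lastError = null;

    public static Thread newLeaseThread(Runnable leaseLoop) {
        Thread t = new Thread(leaseLoop, "RegionServer.LeaseChecker");
        // Record why the thread died before it disappears, so the reason
        // for the exit is visible afterwards.
        t.setUncaughtExceptionHandler((thread, e) ->
            lastError = thread.getName() + " died: " + e.getMessage());
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = newLeaseThread(() -> {
            throw new IllegalStateException("boom");
        });
        t.start();
        t.join();
        System.out.println(lastError);
    }
}
```

Without such a handler, the exception vanishes with the thread and the only evidence left is the generic "one or more threads are no longer alive" shutdown above.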



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13988) Add exception handler for lease thread

2015-07-07 Thread Liu Shaohui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-13988:

Attachment: HBASE-13988-addendum.diff

addendum for patch v001

 Add exception handler for lease thread
 --

 Key: HBASE-13988
 URL: https://issues.apache.org/jira/browse/HBASE-13988
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.2, 1.3.0, 0.98.15

 Attachments: HBASE-13988-addendum.diff, HBASE-13988-v001.diff, 
 HBASE-13988-v002.diff


 In a prod cluster, a region server exited because some important 
 threads were not alive. After excluding other threads from the log, we 
 suspected the lease thread was the root cause. 
 So we need to add an exception handler to the lease thread so we can debug 
 why it exits in the future.
  
 {quote}
 2015-06-29,12:46:09,222 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: One or more 
 threads are no longer alive -- stop
 2015-06-29,12:46:09,223 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
 server on 21600
 ...
 2015-06-29,12:46:09,330 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
 LogRoller exiting.
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Thread-37 exiting
 2015-06-29,12:46:09,330 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: 
 regionserver21600.compactionChecker exiting
 2015-06-29,12:46:12,403 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer$PeriodicMemstoreFlusher: 
 regionserver21600.periodicFlusher exiting
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs

2015-07-07 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-12295:
---
Status: Open  (was: Patch Available)

 Prevent block eviction under us if reads are in progress from the BBs
 -

 Key: HBASE-12295
 URL: https://issues.apache.org/jira/browse/HBASE-12295
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 2.0.0

 Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, 
 HBASE-12295_2.patch, HBASE-12295_4.patch, HBASE-12295_4.pdf, 
 HBASE-12295_5.pdf, HBASE-12295_9.patch, HBASE-12295_trunk.patch


 While we try to serve the reads from the BBs directly from the block cache, 
 we need to ensure that the blocks do not get evicted from under us while 
 reading.  This JIRA is to discuss and implement a strategy for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs

2015-07-07 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-12295:
---
Status: Patch Available  (was: Open)

 Prevent block eviction under us if reads are in progress from the BBs
 -

 Key: HBASE-12295
 URL: https://issues.apache.org/jira/browse/HBASE-12295
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 2.0.0

 Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, 
 HBASE-12295_10.patch, HBASE-12295_2.patch, HBASE-12295_4.patch, 
 HBASE-12295_4.pdf, HBASE-12295_5.pdf, HBASE-12295_9.patch, 
 HBASE-12295_trunk.patch


 While we try to serve the reads from the BBs directly from the block cache, 
 we need to ensure that the blocks do not get evicted from under us while 
 reading.  This JIRA is to discuss and implement a strategy for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs

2015-07-07 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-12295:
---
Attachment: HBASE-12295_10.patch

Updated patch based on RB comments.

 Prevent block eviction under us if reads are in progress from the BBs
 -

 Key: HBASE-12295
 URL: https://issues.apache.org/jira/browse/HBASE-12295
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 2.0.0

 Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, 
 HBASE-12295_10.patch, HBASE-12295_2.patch, HBASE-12295_4.patch, 
 HBASE-12295_4.pdf, HBASE-12295_5.pdf, HBASE-12295_9.patch, 
 HBASE-12295_trunk.patch


 While we try to serve the reads from the BBs directly from the block cache, 
 we need to ensure that the blocks do not get evicted from under us while 
 reading.  This JIRA is to discuss and implement a strategy for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617754#comment-14617754
 ] 

Hudson commented on HBASE-13561:


FAILURE: Integrated in HBase-1.0 #989 (See 
[https://builds.apache.org/job/HBase-1.0/989/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev 4fcc3103bda026a9b89414191896a6042af6e01d)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-0.98-v2.patch, 
 HBASE-13561-branch-1.0-v2.patch, HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 if there were things like UNDEFINED records found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whomever launched the task.
 Am I missing something?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13561) ITBLL.Verify doesn't actually evaluate counters after job completes

2015-07-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617803#comment-14617803
 ] 

Hudson commented on HBASE-13561:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1004 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1004/])
HBASE-13561 ITBLL.Verify doesn't actually evaluate counters after job completes 
(Josh Elser) (stack: rev 2f6ef83adc203d6979e11f9527efe242d59ae04d)
* 
hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 ITBLL.Verify doesn't actually evaluate counters after job completes
 ---

 Key: HBASE-13561
 URL: https://issues.apache.org/jira/browse/HBASE-13561
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.12
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3

 Attachments: HBASE-13561-0.98-v2.patch, 
 HBASE-13561-branch-1.0-v2.patch, HBASE-13561-v1.patch, HBASE-13561-v2.patch


 Was digging through ITBLL and noticed this oddity:
 The {{Verify}} Tool doesn't actually call {{Verify#verify(long)}} like the 
 {{Loop}} Tool does. Granted, it doesn't know the total number of KVs that 
 were written given the current arguments, but it's not even checking to see 
 if there were things like UNDEFINED records found.
 It seems to me that {{Verify}} should really be doing *some* checking on the 
 counters like {{Loop}} does and not just leaving it up to the visual 
 inspection of whomever launched the task.
 Am I missing something?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13867) Add endpoint coprocessor guide to HBase book

2015-07-07 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617832#comment-14617832
 ] 

Gabor Liptak commented on HBASE-13867:
--

[~ndimiduk] Would the 100 character length limit apply to adoc files too? Thanks


 Add endpoint coprocessor guide to HBase book
 

 Key: HBASE-13867
 URL: https://issues.apache.org/jira/browse/HBASE-13867
 Project: HBase
  Issue Type: Task
  Components: Coprocessors, documentation
Reporter: Vladimir Rodionov
Assignee: Gaurav Bhardwaj
 Attachments: HBASE-13867.1.patch


 Endpoint coprocessors are very poorly documented.
 Coprocessor section of HBase book must be updated either with its own 
 endpoint coprocessors HOW-TO guide or, at least, with the link(s) to some 
 other guides. There is good description here:
 http://www.3pillarglobal.com/insights/hbase-coprocessors



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13832) Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low

2015-07-07 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-13832:

Priority: Blocker  (was: Critical)

 Procedure V2: master fail to start due to WALProcedureStore sync failures 
 when HDFS data nodes count is low
 ---

 Key: HBASE-13832
 URL: https://issues.apache.org/jira/browse/HBASE-13832
 Project: HBase
  Issue Type: Sub-task
  Components: master, proc-v2
Affects Versions: 2.0.0, 1.1.0, 1.2.0
Reporter: Stephen Yuan Jiang
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0

 Attachments: HBASE-13832-v0.patch, HBASE-13832-v1.patch, 
 HBASE-13832-v2.patch, HBASE-13832-v4.patch, HBASE-13832-v5.patch, 
 HDFSPipeline.java, hbase-13832-test-hang.patch, hbase-13832-v3.patch


 when the data node count is &lt; 3, we got a failure in 
 WALProcedureStore#syncLoop() during master start.  The failure prevents the 
 master from getting started.  
 {noformat}
 2015-05-29 13:27:16,625 ERROR [WALProcedureStoreSyncThread] 
 wal.WALProcedureStore: Sync slot failed, abort.
 java.io.IOException: Failed to replace a bad datanode on the existing 
 pipeline due to no more good datanodes being available to try. (Nodes: 
 current=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK],
  
 DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]],
  
 original=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK],
  DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-
 490ece56c772,DISK]]). The current failed datanode replacement policy is 
 DEFAULT, and a client may configure this via 
 'dfs.client.block.write.replace-datanode-on-failure.policy'  in its 
 configuration.
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951)
 {noformat}
 One proposal is to implement logic similar to FSHLog: if an IOException is 
 thrown during syncLoop in WALProcedureStore#start(), instead of aborting 
 immediately, we could try to roll the log and see whether this resolves the 
 issue; if the new log cannot be created, or rolling the log throws more 
 exceptions, we then abort.
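The roll-and-retry proposal could be sketched like this; the {{Log}} interface and {{FlakyLog}} are hypothetical stand-ins, not the WALProcedureStore API:

```java
import java.io.IOException;

public class SyncWithRoll {

    public interface Log {
        void sync() throws IOException;
        void roll() throws IOException;
    }

    // On a sync failure, roll to a fresh log (which gets a new, hopefully
    // healthy, HDFS pipeline) and retry the sync once before giving up.
    public static boolean syncOrRoll(Log log) {
        try {
            log.sync();
            return true;
        } catch (IOException first) {
            try {
                log.roll();
                log.sync();
                return true;
            } catch (IOException second) {
                return false;  // still failing: abort as the current code does
            }
        }
    }

    // A fake log whose write pipeline is bad until the log is rolled.
    public static class FlakyLog implements Log {
        private boolean badPipeline = true;

        public void sync() throws IOException {
            if (badPipeline) throw new IOException("bad datanode in pipeline");
        }

        public void roll() {
            badPipeline = false;
        }
    }

    public static void main(String[] args) {
        System.out.println(syncOrRoll(new FlakyLog()));  // true: the roll recovered
    }
}
```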



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13415) Procedure V2 - Use nonces for double submits from client

2015-07-07 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-13415:

Priority: Blocker  (was: Major)

 Procedure V2 - Use nonces for double submits from client
 

 Key: HBASE-13415
 URL: https://issues.apache.org/jira/browse/HBASE-13415
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.3.0


 The client can submit a procedure, but before getting the procId back, the 
 master might fail. In this case, the client request will fail and the client 
 will re-submit the request. With a 1.1 client, or if there is no contention 
 for the table lock, the time window is pretty small, but it can still happen. 
 If the proc was accepted and stored in the procedure store, a re-submit from 
 the client will add another procedure, which will execute after the first 
 one. The first one will likely succeed, and the second one will fail (for 
 example in the case of create table, the second one will throw 
 TableExistsException). 
 One idea is to use client-generated nonces (which we already have) to guard 
 against these cases. The client will submit the request with the nonce and 
 the nonce will be saved together with the procedure in the store. In case of 
 a double submit, the nonce-cache is checked and the procId of the original 
 request is returned. 
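The nonce-cache idea can be sketched minimally as follows; the class and method names are illustrative, not the real procedure-store API. The first submit records nonce -> procId, and a retried submit with the same nonce gets the original procId back instead of registering a second procedure:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative nonce cache: maps a client-generated nonce to the procId
// assigned on the first submit, so a retried submit is de-duplicated.
public class NonceCacheSketch {
    private final Map<Long, Long> nonceToProcId = new ConcurrentHashMap<>();
    private final AtomicLong nextProcId = new AtomicLong(1);

    /** Returns the procId for this nonce, allocating one only on first sight. */
    public long submit(long nonce) {
        return nonceToProcId.computeIfAbsent(nonce, n -> nextProcId.getAndIncrement());
    }
}
```

In the real design the nonce would also be persisted with the procedure in the store, so the cache can be rebuilt after a master failover.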



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14023) HBase Srores NULL Value from delimited File Input

2015-07-07 Thread Pankaj Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-14023:
-
Assignee: (was: Pankaj Kumar)

 HBase Srores NULL Value from delimited File Input
 -

 Key: HBASE-14023
 URL: https://issues.apache.org/jira/browse/HBASE-14023
 Project: HBase
  Issue Type: Bug
Reporter: Soumendra Kumar Mishra

 Data:
 101,SMITH,41775,,1000,,100,10
 102,ALLEN,,77597,2000,,,20
 103,WARD,,,2000,500,,30
 Result:
 ROW  COLUMN+CELL
 101  column=info:dept, timestamp=1435992182400, value=10
 101  column=info:ename, timestamp=1435992182400, value=SMITH
 101  column=pay:bonus, timestamp=1435992182400, value=100
 101  column=pay:comm, timestamp=1435992182400, value=
 101  column=pay:sal, timestamp=1435992182400, value=1000
 101  column=tel:mobile, timestamp=1435992182400, value=
 101  column=tel:telephone, timestamp=1435992182400, value=41775
 I am using Pig to write data into HBase. The same issue happened when data 
 was inserted from a text file into HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13387) Add ByteBufferedCell an extension to Cell

2015-07-07 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-13387:
---
Attachment: HBASE-13387_v2.patch

 Add ByteBufferedCell an extension to Cell
 -

 Key: HBASE-13387
 URL: https://issues.apache.org/jira/browse/HBASE-13387
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0

 Attachments: ByteBufferedCell.docx, HBASE-13387_v1.patch, 
 HBASE-13387_v2.patch, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch, 
 benchmark.zip


 This came up in the discussion about the parent Jira, and Stack recently added 
 it as a comment on the E2E patch on the parent Jira.
 The idea is to add a new interface, 'ByteBufferedCell', in which we can add 
 buffer-based getter APIs and getters for the position of each component in 
 the ByteBuffer. We will keep this interface @InterfaceAudience.Private. When 
 the Cell is backed by a DBB, we can create an object implementing this new 
 interface.
 The comparators have to be aware of this new Cell extension and use the 
 BB-based APIs rather than getXXXArray(). Also provide util APIs in CellUtil 
 to abstract the checks for the new Cell type (like matchingXXX APIs, 
 getValueAsType APIs, etc.).
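A rough sketch of the shape such an interface and a CellUtil-style helper could take. All names here are illustrative, not the actual HBase API; the real interface lives inside HBase and is @InterfaceAudience.Private:

```java
import java.nio.ByteBuffer;

// Illustrative sketch: buffer-based getters plus the position and length of
// each component in the backing ByteBuffer, so comparators can work directly
// on the buffer instead of copying to byte[] via getXXXArray().
public class ByteBufferedCellSketch {
    public interface BBCell {
        ByteBuffer getRowByteBuffer();
        int getRowPosition();
        int getRowLength();
        // ...family, qualifier, and value accessors would follow the same pattern
    }

    /** CellUtil-style helper: compares rows without materializing byte[]. */
    public static boolean matchingRows(BBCell a, BBCell b) {
        if (a.getRowLength() != b.getRowLength()) return false;
        for (int i = 0; i < a.getRowLength(); i++) {
            if (a.getRowByteBuffer().get(a.getRowPosition() + i)
                    != b.getRowByteBuffer().get(b.getRowPosition() + i)) {
                return false;
            }
        }
        return true;
    }

    /** Minimal implementation used only for illustration. */
    public static BBCell cellOf(ByteBuffer buf, int pos, int len) {
        return new BBCell() {
            public ByteBuffer getRowByteBuffer() { return buf; }
            public int getRowPosition() { return pos; }
            public int getRowLength() { return len; }
        };
    }
}
```

The win is that a DBB-backed cell never has to copy its components on-heap just to be compared.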



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-14023) HBase Srores NULL Value from delimited File Input

2015-07-07 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14023.
---
Resolution: Invalid

Please ask the question on the mailing list rather than file an issue.

 HBase Srores NULL Value from delimited File Input
 -

 Key: HBASE-14023
 URL: https://issues.apache.org/jira/browse/HBASE-14023
 Project: HBase
  Issue Type: Bug
Reporter: Soumendra Kumar Mishra

 Data:
 101,SMITH,41775,,1000,,100,10
 102,ALLEN,,77597,2000,,,20
 103,WARD,,,2000,500,,30
 Result:
 ROW  COLUMN+CELL
 101  column=info:dept, timestamp=1435992182400, value=10
 101  column=info:ename, timestamp=1435992182400, value=SMITH
 101  column=pay:bonus, timestamp=1435992182400, value=100
 101  column=pay:comm, timestamp=1435992182400, value=
 101  column=pay:sal, timestamp=1435992182400, value=1000
 101  column=tel:mobile, timestamp=1435992182400, value=
 101  column=tel:telephone, timestamp=1435992182400, value=41775
 I am using Pig to write data into HBase. The same issue happened when data 
 was inserted from a text file into HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client

2015-07-07 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616736#comment-14616736
 ] 

Sean Busbey commented on HBASE-13415:
-

Bumping priority to Blocker for 1.2 per request from [~enis]. How's this coming 
along?

 Procedure V2 - Use nonces for double submits from client
 

 Key: HBASE-13415
 URL: https://issues.apache.org/jira/browse/HBASE-13415
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.3.0


 The client can submit a procedure, but before getting the procId back, the 
 master might fail. In this case, the client request will fail and the client 
 will re-submit the request. With a 1.1 client, or if there is no contention 
 for the table lock, the time window is pretty small, but it can still happen. 
 If the proc was accepted and stored in the procedure store, a re-submit from 
 the client will add another procedure, which will execute after the first 
 one. The first one will likely succeed, and the second one will fail (for 
 example in the case of create table, the second one will throw 
 TableExistsException). 
 One idea is to use client-generated nonces (which we already have) to guard 
 against these cases. The client will submit the request with the nonce and 
 the nonce will be saved together with the procedure in the store. In case of 
 a double submit, the nonce-cache is checked and the procId of the original 
 request is returned. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table

2015-07-07 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617966#comment-14617966
 ] 

Anoop Sam John commented on HBASE-12015:


[~jingchengdu] Any chance of a patch? This is the only pending issue in the MOB 
branch.
Any of our Huawei friends ready to take this up? cc [~ashish singhi], 
[~ashutosh_jindal]

 Not cleaning Mob data when Mob CF is removed from table
 ---

 Key: HBASE-12015
 URL: https://issues.apache.org/jira/browse/HBASE-12015
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-11339
Reporter: Anoop Sam John
 Fix For: hbase-11339


 During modifyTable, if a MOB CF is removed from a table, the corresponding 
 mob data also should get removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client

2015-07-07 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617994#comment-14617994
 ] 

Stephen Yuan Jiang commented on HBASE-13415:


I ran the TestModifyTableProcedure test suite under both Maven (mvn) and 
Eclipse; everything is normal.

 Procedure V2 - Use nonces for double submits from client
 

 Key: HBASE-13415
 URL: https://issues.apache.org/jira/browse/HBASE-13415
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: HBASE-13415.v1-master.patch


 The client can submit a procedure, but before getting the procId back, the 
 master might fail. In this case, the client request will fail and the client 
 will re-submit the request. With a 1.1 client, or if there is no contention 
 for the table lock, the time window is pretty small, but it can still happen. 
 If the proc was accepted and stored in the procedure store, a re-submit from 
 the client will add another procedure, which will execute after the first 
 one. The first one will likely succeed, and the second one will fail (for 
 example in the case of create table, the second one will throw 
 TableExistsException). 
 One idea is to use client-generated nonces (which we already have) to guard 
 against these cases. The client will submit the request with the nonce and 
 the nonce will be saved together with the procedure in the store. In case of 
 a double submit, the nonce-cache is checked and the procId of the original 
 request is returned. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12596) bulkload needs to follow locality

2015-07-07 Thread Victor Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Xu updated HBASE-12596:
--
Attachment: HBASE-12596-master-v5.patch
HBASE-12596-0.98-v5.patch

Addressed Ashish's comments and added a unit test for 
hbase.bulkload.locality.sensitive.enabled = true.

 bulkload needs to follow locality
 -

 Key: HBASE-12596
 URL: https://issues.apache.org/jira/browse/HBASE-12596
 Project: HBase
  Issue Type: Improvement
  Components: HFile, regionserver
Affects Versions: 0.98.8
 Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7
Reporter: Victor Xu
Assignee: Victor Xu
 Fix For: 0.98.14

 Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-0.98-v2.patch, 
 HBASE-12596-0.98-v3.patch, HBASE-12596-0.98-v4.patch, 
 HBASE-12596-0.98-v5.patch, HBASE-12596-master-v1.patch, 
 HBASE-12596-master-v2.patch, HBASE-12596-master-v3.patch, 
 HBASE-12596-master-v4.patch, HBASE-12596-master-v5.patch, HBASE-12596.patch


 Normally, we have two steps to perform a bulkload: 1. use a job to write the 
 HFiles to be loaded; 2. move these HFiles to the right HDFS directory. 
 However, locality can be lost during the first step. Why not just write the 
 HFiles directly into the right place? We can do this easily because 
 StoreFile.WriterBuilder has the withFavoredNodes method, and we just need to 
 call it in HFileOutputFormat's getNewWriter().
 This feature is enabled by default, and we could use 
 'hbase.bulkload.locality.sensitive.enabled=false' to disable it.
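The effect of choosing a favored node per region can be sketched as follows. This is an illustrative stand-in for the location lookup HFileOutputFormat would do, not HBase code; the real change passes the resulting host to StoreFile.WriterBuilder#withFavoredNodes so HDFS places a block replica on the region's server:

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.NavigableMap;

// Illustrative lookup: map a row key to the region server currently hosting
// the region containing it (keyed by region start key), so the HFile writer
// can be created with that host as an HDFS favored node.
public class FavoredNodeSketch {
    /** Picks the favored node for a row key from a startKey -> host map. */
    public static InetSocketAddress favoredNodeFor(String rowKey,
            NavigableMap<String, InetSocketAddress> regionStartKeyToHost) {
        Map.Entry<String, InetSocketAddress> e = regionStartKeyToHost.floorEntry(rowKey);
        return e == null ? null : e.getValue();
    }
}
```

With the replica already local, the later "move into place" step preserves locality instead of leaving the region server to read its data over the network.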



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14028) DistributedLogReplay drops edits when ITBLL 125M

2015-07-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617997#comment-14617997
 ] 

stack commented on HBASE-14028:
---

I have been playing more with this. Losing data is pretty easy to do. I am 
trying to find out why the end of a WAL goes missing during replay; there is 
not enough info to debug, and it is a little tough to trace where we are at 
any one time. Trying to backfill the logging.

 DistributedLogReplay drops edits when ITBLL 125M
 

 Key: HBASE-14028
 URL: https://issues.apache.org/jira/browse/HBASE-14028
 Project: HBase
  Issue Type: Bug
  Components: Recovery
Affects Versions: 1.2.0
Reporter: stack

 Testing DLR before 1.2.0RC gets cut, we are dropping edits.
 The issue seems to be around replay into a deployed region that is on a 
 server that dies before all edits have finished replaying. Logging is sparse 
 on sequenceid accounting, so I can't tell for sure how it is happening (and 
 whether our accounting now being by Store is messing up DLR). Digging.
 I also notice that DLR does not refresh its cache of the region location on 
 error -- it just keeps trying until the whole WAL fails after 8 
 retries...about 30 seconds. We could do a bit of refactoring and have the 
 replay find the region in its new location if it moved during DLR replay.
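The suggested refactor (refresh the cached region location on a replay failure instead of retrying the stale server) could be sketched as below, with illustrative interfaces standing in for the real replay machinery:

```java
// Illustrative sketch: on a replay failure, refresh the cached region
// location and retry against the possibly-new server, rather than retrying
// the stale location until the whole WAL replay fails.
public class ReplayRetrySketch {
    public interface Locator { String locate(String regionName, boolean refreshCache); }
    public interface Replayer { boolean replayTo(String server); }

    public static boolean replayWithRelocation(String region, Locator loc,
            Replayer rep, int maxRetries) {
        String server = loc.locate(region, false);     // start from the cache
        for (int i = 0; i <= maxRetries; i++) {
            if (rep.replayTo(server)) return true;
            server = loc.locate(region, true);         // refresh on error
        }
        return false;
    }
}
```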



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics

2015-07-07 Thread Lei Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617922#comment-14617922
 ] 

Lei Chen commented on HBASE-13965:
--

Thanks for testing the patch and posting the result metrics.
I agree that using percentages is easier for a quick look. 
I will update the patch.

 Stochastic Load Balancer JMX Metrics
 

 Key: HBASE-13965
 URL: https://issues.apache.org/jira/browse/HBASE-13965
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, metrics
Reporter: Lei Chen
Assignee: Lei Chen
 Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, 
 HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, 
 HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png


 Today’s default HBase load balancer (the stochastic load balancer) is 
 cost-function based. The cost function weights are tunable, but no visibility 
 into the cost function results is directly provided.
 A driving example is a cluster we have been tuning which has skewed rack size 
 (one rack has half the nodes of the other few racks). We are tuning the 
 cluster for uniform response time from all region servers with the ability to 
 tolerate a rack failure. Balancing LocalityCost, RegionReplicaRackCost, and 
 RegionCountSkewCost is difficult without a way to attribute each cost 
 function’s contribution to the overall cost. 
 What this jira proposes is to provide visibility via JMX into each cost 
 function of the stochastic load balancer, as well as the overall cost of the 
 balancing plan.
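The kind of attribution the proposed metrics would expose can be sketched as a per-function share of the total weighted cost. The class name, cost function names, and numbers below are made up for illustration; they mirror the balancer's weighted-sum structure, not its actual code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: the stochastic balancer's overall cost is a weighted
// sum of individual cost functions; the JMX metrics would expose each
// function's weighted contribution as a percentage of the total.
public class BalancerCostSketch {
    /** Input: function name -> {weight, raw cost}. Output: share of total, 0-100. */
    public static Map<String, Double> contributionPercent(Map<String, double[]> weightAndCost) {
        double total = 0;
        for (double[] wc : weightAndCost.values()) total += wc[0] * wc[1];
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : weightAndCost.entrySet()) {
            out.put(e.getKey(), 100.0 * e.getValue()[0] * e.getValue()[1] / total);
        }
        return out;
    }
}
```

Seeing, say, that LocalityCost dominates the total is exactly the signal needed when tuning weights for a skewed-rack cluster like the one described above.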



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

