Here are a few things, David.

Regarding your tool, you could have just run 'select info:regioninfo from .META.;' in the shell and it would output the same data (if you did something like "echo 'select info:regioninfo from .META.;' |./bin/hbase shell --html &> /tmp/meta.html", the output would be HTML-ized and easier to read than an ASCII table).

If you want to merge regions, check out the main method of org.apache.hadoop.hbase.util.Merge.

Regarding offline regions: looking at your report below, all offlined regions look legit. Their status is offline, but they also have the split attribute set. (On split, the parent is offlined and the daughter regions take its place. The parent hangs around until the daughters no longer hold references to it; then the parent is deleted.)
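
As a rough sketch of that reading of the report (plain Python, not HBase code; the function name and the labels are made up for illustration, and treating "offline without split" as suspicious is only my assumption, since a disabled table would look like that too):

```python
def classify_region(online, split):
    """Classify a region from the two flags printed in the report."""
    if online:
        return "serving"
    if split:
        # Parent of a split: offlined on purpose, deleted once the
        # daughters no longer reference it.
        return "split parent, awaiting cleanup"
    # Assumption: offline with no split flag deserves a closer look.
    return "offline without split, check it"


# (Id, online?, split?) for three regions copied from the report below.
report = [
    (1208892792202, True, False),
    (1208891918491, False, True),
    (1208893494773, False, True),
]

statuses = [(rid, classify_region(on, sp)) for rid, on, sp in report]
for rid, status in statuses:
    print(rid, status)
```

Every offline region in your report falls into the second bucket, which is why none of them worries me.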

Regarding the 144 missing rows, is it possible you fed your map task duplicates? Each duplicate would increment the map count of inputs processed, but reduce would squash the duplicates together and output a single row. If you don't have that many rows, perhaps dump the inputs and the outputs and try to figure out where the 144 are going missing?
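
A minimal sketch of that comparison, assuming you can dump the row keys your maps were fed into a list (plain Python; the keys and function names here are hypothetical, nothing to do with the HBase or MapReduce APIs):

```python
from collections import Counter


def find_duplicate_keys(input_keys):
    """Return {key: count} for every row key seen more than once."""
    counts = Counter(input_keys)
    return {key: n for key, n in counts.items() if n > 1}


def row_shortfall(input_keys):
    """Inputs counted by map minus the distinct rows reduce would write."""
    return len(input_keys) - len(set(input_keys))


# Hypothetical map inputs: 'doc-b' was fed to the job twice.
keys = ["doc-a", "doc-b", "doc-b", "doc-c"]

duplicates = find_duplicate_keys(keys)
shortfall = row_shortfall(keys)
print(duplicates, shortfall)
```

If the shortfall on your real inputs comes out at 144, the rows were never lost; they were just counted twice on the way in.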

Regarding hbase buckling under load, please send us logs. If you are using TRUNK, it should easily carry ten concurrent clients, and where it can't keep up, it puts up a gate to block updates. It shouldn't be falling over.

Thanks D,
St.Ack

David Alves wrote:
Hi Guys

        Regarding my previous problems, I'm glad to say that I can now crawl an
entire repository with only a small percentage of failed tasks. The latest
hbase version plus the correction of the replication property seems to
have solved it for me.
        Still, I have two issues I'd appreciate your input on. The first one
regards splits. I've made a small tool (built upon stack's) that checks DB
state and can online/offline tables, merge regions, etc. This tool gives me
the report at the end of this email. The problem is that I seem to have
lost 144 rows (comparing the output format's output record count with the
actual rows in the table from a select count(*)). I suspect these rows are
in the offline splits. Can I use my tool to merge the splits against their
online parents using HRegion.merge()? Or is it a big no-no?
        The second issue is more problematic. I misconfigured my last job and
it ran 10 maps instead of the 1 it should have, and under that kind of
load hbase completely failed: regionservers went down, and one time I had
to completely erase the database because it wouldn't start again (I
suspect .META. was offline). The other time I was able to recover all the
data by simply restarting it. Is there any kind of procedure I should
use in this situation?
Best Regards
David Alves

Log Trace:
Found region: cyclops-documents-database,,1208892792201
Id: 1208892792201
Start Key:
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
 TO USE HTML 3.2/ch6.htm
Online/Offline Status: ONLINE
Split?: FALSE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
 TO USE HTML 3.2/ch6.htm,1208892792202
Id: 1208892792202
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
 TO USE HTML 3.2/ch6.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm
Online/Offline Status: ONLINE
Split?: FALSE
DEBUG 23-04 14:54:50,744 (DFSClient.java:readChunk:934)  -DFSClient
readChunk got seqno 2 offsetInBlock 8192 lastPacketInBlock false
packetLen 4132
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm,1208891918491
Id: 1208891918491
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION USING MICROSOFT BACKOFFICE, VOLUME 1/ch05/06.htm
Online/Offline Status: OFFLINE
Split?: TRUE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm,1208893494772
Id: 1208893494772
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm
Online/Offline Status: ONLINE
Split?: FALSE
DEBUG 23-04 14:54:50,754 (DFSClient.java:readChunk:934)  -DFSClient
readChunk got seqno 3 offsetInBlock 12288 lastPacketInBlock false
packetLen 4132
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm,1208893494773
Id: 1208893494773
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION USING MICROSOFT BACKOFFICE, VOLUME 1/ch05/06.htm
Online/Offline Status: OFFLINE
Split?: TRUE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm,1208894034845
Id: 1208894034845
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch40/01.htm
Online/Offline Status: ONLINE
Split?: FALSE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch40/01.htm,1208896414707
Id: 1208896414707
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch40/01.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Programming/Delphi/Delphi
 Informant [1995-2003]/Works/95index.PDF
Online/Offline Status: ONLINE
Split?: FALSE
DEBUG 23-04 14:54:50,756 (DFSClient.java:readChunk:934)  -DFSClient
readChunk got seqno 4 offsetInBlock 16384 lastPacketInBlock true
packetLen 3402
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Programming/Delphi/Delphi
 Informant [1995-2003]/Works/95index.PDF,1208896478277
Id: 1208896478277
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Programming/Delphi/Delphi
 Informant [1995-2003]/Works/95index.PDF
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Programming/Java/java-look-feel-design-guidelines-2nd/HIG.Text3.html
Online/Offline Status: ONLINE
Split?: FALSE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Programming/Java/java-look-feel-design-guidelines-2nd/HIG.Text3.html,1208896478277
Id: 1208896478277
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Programming/Java/java-look-feel-design-guidelines-2nd/HIG.Text3.html
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION USING MICROSOFT BACKOFFICE, VOLUME 1/ch05/06.htm
Online/Offline Status: ONLINE
Split?: FALSE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION USING MICROSOFT BACKOFFICE, VOLUME 1/ch05/06.htm,1208891918491
Id: 1208891918491
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION USING MICROSOFT BACKOFFICE, VOLUME 1/ch05/06.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION, USING LOTUS NOTES/e-book/ch25.htm
Online/Offline Status: ONLINE
Split?: FALSE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION, USING LOTUS NOTES/e-book/ch25.htm,1208891541773
Id: 1208891541773
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION, USING LOTUS NOTES/e-book/ch25.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/oreilly-cgionwww/oreilly-cgionwww/ch01_02.txt
Online/Offline Status: ONLINE
Split?: FALSE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/oreilly-cgionwww/oreilly-cgionwww/ch01_02.txt,1208891541774
Id: 1208891541774
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/oreilly-cgionwww/oreilly-cgionwww/ch01_02.txt
End Key:
Online/Offline Status: ONLINE
Split?: FALSE
Found region: cyclops-links-database,,1208891170959
Id: 1208891170959
Start Key:
End Key:
Online/Offline Status: ONLINE
Split?: FALSE
        
