[jira] [Commented] (HBASE-5506) Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220729#comment-13220729 ] Phabricator commented on HBASE-5506: sc has commented on the revision HBASE-5506 [jira] Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo(). @stack: I added some exclude conditions to make the test pass (see the inline comments). If we remove those, the test will fail. Is this the right way to do it? What do you think? INLINE COMMENTS src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServerCmdLine.java:95 If we comment this out, the test will fail. REVISION DETAIL https://reviews.facebook.net/D2031 BRANCH test-getregioninfo Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo() - Key: HBASE-5506 URL: https://issues.apache.org/jira/browse/HBASE-5506 Project: HBase Issue Type: Test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Attachments: HBASE-5506.D2031.1.patch, HBASE-5506.D2031.2.patch, HBASE-5506.D2031.3.patch We observed that, with the framed transport option, the thrift call ThriftServerRunner.HbaseHandler.getRegionInfo() receives a corrupted parameter (some garbage string attached to the beginning). This may be a thrift bug and requires further investigation. Add a unit test to reproduce the problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
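The "garbage string attached to the beginning" symptom is what a framing mismatch typically looks like: a framed transport prefixes each message with a 4-byte length header, and a reader that does not expect framing consumes that header as payload. The toy sketch below (not Thrift's actual wire code; all names here are illustrative) shows how the same bytes decode cleanly with the header skipped and with 4 garbage bytes prepended without it:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy illustration of a framing mismatch: [4-byte big-endian length][payload].
public class FramingMismatchDemo {
    // Sender wraps the payload in a frame header.
    static byte[] frame(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length);
        buf.putInt(payload.length);
        buf.put(payload);
        return buf.array();
    }

    // A naive unframed reader returns the raw bytes, header included,
    // so the first field arrives with garbage prepended.
    static String readUnframed(byte[] wire) {
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // A frame-aware reader skips the 4-byte header first.
    static String readFramed(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        int len = buf.getInt();
        byte[] payload = new byte[len];
        buf.get(payload);
        return new String(payload, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] wire = frame("tableName".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(readFramed(wire));   // the clean parameter
        System.out.println(readUnframed(wire)); // 4 length bytes + parameter
    }
}
```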
[jira] [Commented] (HBASE-5140) TableInputFormat subclass to allow N number of splits per region during MR jobs
[ https://issues.apache.org/jira/browse/HBASE-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220725#comment-13220725 ] Rajesh Balamohan commented on HBASE-5140: - @Josh - Thanks for this patch. The for loop within getSplits() generates the splits with the help of generateRegionSplits(). However, the returned List<InputSplit> is not added back to List<InputSplit> splits = new ArrayList<InputSplit>(keys.getFirst().length); TableInputFormat subclass to allow N number of splits per region during MR jobs --- Key: HBASE-5140 URL: https://issues.apache.org/jira/browse/HBASE-5140 Project: HBase Issue Type: New Feature Components: mapreduce Affects Versions: 0.90.4 Reporter: Josh Wymer Priority: Trivial Labels: mapreduce, split Fix For: 0.90.4 Attachments: Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch, Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch.1, Added_functionality_to_split_n_times_per_region_on_mapreduce_jobs.patch Original Estimate: 72h Remaining Estimate: 72h In regards to [HBASE-5138|https://issues.apache.org/jira/browse/HBASE-5138] I am working on a patch for the TableInputFormat class that overrides getSplits in order to generate N splits per region and/or N splits per job. The idea is to convert the startKey and endKey for each region from byte[] to BigDecimal, take the difference, divide by N, convert back to byte[], and generate splits on the resulting values. Assuming your keys are fully distributed, this should generate splits with nearly the same number of rows per split. Any suggestions on this issue are welcome.
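The key arithmetic described in the issue can be sketched as follows. This is a hypothetical illustration of the idea (pad both keys to a common width, treat them as unsigned integers, cut the range into N even steps), not the code from the attached patches; class and method names are invented for the example:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch: divide a region's [startKey, endKey] range into n even sub-ranges.
// Assumes startKey < endKey and a bounded end key (no empty "last region" key).
public class RegionKeySplitter {
    // Right-pad a key with 0x00 so both keys compare at the same width.
    static byte[] pad(byte[] key, int width) {
        byte[] out = new byte[width];
        System.arraycopy(key, 0, out, 0, key.length);
        return out;
    }

    // Returns n+1 boundary keys: start, n-1 intermediate cuts, end.
    static List<byte[]> splitRange(byte[] start, byte[] end, int n) {
        int width = Math.max(start.length, end.length);
        BigInteger lo = new BigInteger(1, pad(start, width)); // unsigned
        BigInteger hi = new BigInteger(1, pad(end, width));
        BigInteger step = hi.subtract(lo).divide(BigInteger.valueOf(n));
        List<byte[]> bounds = new ArrayList<>();
        for (int i = 0; i <= n; i++) {
            BigInteger cut = (i == n) ? hi : lo.add(step.multiply(BigInteger.valueOf(i)));
            byte[] raw = cut.toByteArray();
            byte[] key = new byte[width];
            // toByteArray() may carry a leading sign byte or be short; right-align.
            int copy = Math.min(raw.length, width);
            System.arraycopy(raw, raw.length - copy, key, width - copy, copy);
            bounds.add(key);
        }
        return bounds;
    }
}
```

With fully distributed keys each sub-range covers roughly the same key space, which is what makes the per-split row counts come out nearly equal.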
[jira] [Commented] (HBASE-5504) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220734#comment-13220734 ] Lars Hofhansl commented on HBASE-5504: -- bq. 18. Move old regions to hbase trash folder TODO: There is no trash folder under /hbase currently. We should add one. (M) That would be awesome for a variety of other reasons, for example snapshots. Online Merge Key: HBASE-5504 URL: https://issues.apache.org/jira/browse/HBASE-5504 Project: HBase Issue Type: Brainstorming Components: client, master, shell, zookeeper Affects Versions: 0.94.0 Reporter: Mubarak Seyed Fix For: 0.96.0 As discussed, please refer to the discussion at [HBASE-4991|https://issues.apache.org/jira/browse/HBASE-4991] Design suggestion from Stack: {quote} I suggest a design below. It has some prerequisites, some general function that this feature could use (and others). The prereqs, if you think them good, could be done outside of this JIRA. Here's a suggested rough outline of how I think this feature should run. The feature I'm describing below is merge and deleteRegion, for I see them as in essence the same thing. (C) Client, (M) Master, RS (Region server), ZK (ZooKeeper) 1. Client calls merge or deleteRegion API. API is a range of rows. (C) 2. Master gets call. (M) 3. Master obtains a write lock on the table so it can't be disabled from under us. The write lock will also disable splitting. This is one of the prereqs I think. It's HBASE-5494 (Or we could just do something simpler where we have a flag up in zk that splitRegion checks, but that's less useful I think; OR we do the dynamic configs issue and set splits to off via a config. change). There'd be a timer for how long we wait on the table lock. (M - ZK) 4. If we get the lock, write intent to merge a range up into zk. It also hoists into zk whether it's a pure merge or a merge that drops the region data (a deleteRegion call) (M) 5. 
Return to the client either our failed attempt at locking the table or an id of some sort identifying this running operation; the client can use it to query status. (M - C) 6. Turn off balancer. TODO/prereq: Do it in a way that is persisted. Balancer switch is currently in memory only, so if the master crashes, the new master will come up in balancing mode (If we had dynamic config. could hoist up to zk a config. that disables the balancer rather than have a balancer-specific flag/znode OR if a write lock is outstanding on a table, then the balancer does not balance regions in the locked table - this latter might be the easiest to do) (M) 7. Write into zk that we just turned off the balancer (If it was on) (M - ZK) 8. Get regions that are involved in the span (M) 9. Hoist the list up into zk. (M - ZK) 10. Create region to span the range. (M) 11. Write that we did this up into zk. (M - ZK) 12. Close regions in parallel. Confirm close in parallel. (M - RS) 13. Write up into zk that regions closed (This might not be necessary since we can ask if a region is open). (M - ZK) 14. If a merge and not a deleteRegion, move files under the new region. Might multithread this (moves should go pretty fast). If a deleteRegion, we skip this step. (M) 15. On completion mark zk (though this may not be necessary since it's easy to look in the fs to see the state of the move). (M - ZK) 16. Edit .META. (M) 17. Confirm edits went in. (M) 18. Move old regions to hbase trash folder TODO: There is no trash folder under /hbase currently. We should add one. (M) 19. Enable balancer (if it was off) (M) 20. Unlock table (M) {quote}
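Steps 8 and 10 of the outline above (find the regions involved in the span, then create one region covering them) can be sketched with plain key arithmetic. This is a hedged illustration only; the Region class below is a stand-in record, not an HBase type, and real code would read the boundaries from .META.:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of steps 8 and 10: select regions whose [start, end) ranges overlap
// the requested row span, then span them with one replacement region.
public class MergeSpan {
    static class Region {
        final byte[] start, end; // empty end key = unbounded, like the last region
        Region(byte[] start, byte[] end) { this.start = start; this.end = end; }
    }

    static boolean before(byte[] a, byte[] b) { // a < b, unsigned byte order
        return Arrays.compareUnsigned(a, b) < 0;
    }

    // Step 8: regions must be sorted and non-overlapping, as in .META.
    static List<Region> overlapping(List<Region> regions, byte[] spanStart, byte[] spanEnd) {
        List<Region> hit = new ArrayList<>();
        for (Region r : regions) {
            boolean startsBeforeSpanEnd = spanEnd.length == 0 || before(r.start, spanEnd);
            boolean endsAfterSpanStart = r.end.length == 0 || before(spanStart, r.end);
            if (startsBeforeSpanEnd && endsAfterSpanStart) hit.add(r);
        }
        return hit;
    }

    // Step 10: the merged region runs from the first hit's start key
    // to the last hit's end key.
    static Region merged(List<Region> hits) {
        return new Region(hits.get(0).start, hits.get(hits.size() - 1).end);
    }
}
```

For a deleteRegion call the same selection applies; only step 14 (moving the store files under the new region) is skipped.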
[jira] [Commented] (HBASE-5419) FileAlreadyExistsException has moved from mapred to fs package
[ https://issues.apache.org/jira/browse/HBASE-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220739#comment-13220739 ] stack commented on HBASE-5419: -- Well, maybe this is for 0.96 then? That ok w/ you Dhruba? 0.96 will be the singularity: the release that gets the protobuf rpcs will require a cluster shutdown and restart, but thereafter we should be able to upgrade a running hbase across major versions. FileAlreadyExistsException has moved from mapred to fs package -- Key: HBASE-5419 URL: https://issues.apache.org/jira/browse/HBASE-5419 Project: HBase Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur Priority: Minor Fix For: 0.94.0 Attachments: D1767.1.patch, D1767.1.patch The FileAlreadyExistsException has moved from org.apache.hadoop.mapred.FileAlreadyExistsException to org.apache.hadoop.fs.FileAlreadyExistsException. HBase is currently using a class that is deprecated in hadoop trunk.
[jira] [Commented] (HBASE-5504) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220743#comment-13220743 ] Zhihong Yu commented on HBASE-5504: --- bq. You can't alter region edges once created. Understood. I meant that the data for the neighbor region we choose should be copied. The neighbor region would have a new delimiting key. Online Merge Key: HBASE-5504 URL: https://issues.apache.org/jira/browse/HBASE-5504 Project: HBase Issue Type: Brainstorming Components: client, master, shell, zookeeper Affects Versions: 0.94.0 Reporter: Mubarak Seyed Fix For: 0.96.0 As discussed, please refer to the discussion at [HBASE-4991|https://issues.apache.org/jira/browse/HBASE-4991]
[jira] [Updated] (HBASE-5506) Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5506: -- Hadoop Flags: Reviewed Status: Patch Available (was: Open) Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo() - Key: HBASE-5506 URL: https://issues.apache.org/jira/browse/HBASE-5506 Project: HBase Issue Type: Test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Attachments: HBASE-5506.D2031.1.patch, HBASE-5506.D2031.2.patch, HBASE-5506.D2031.3.patch We observed that, with the framed transport option, the thrift call ThriftServerRunner.HbaseHandler.getRegionInfo() receives a corrupted parameter (some garbage string attached to the beginning). This may be a thrift bug and requires further investigation. Add a unit test to reproduce the problem.
[jira] [Commented] (HBASE-5419) FileAlreadyExistsException has moved from mapred to fs package
[ https://issues.apache.org/jira/browse/HBASE-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220754#comment-13220754 ] dhruba borthakur commented on HBASE-5419: - sounds fine to me, thanks for checking, Stack. FileAlreadyExistsException has moved from mapred to fs package -- Key: HBASE-5419 URL: https://issues.apache.org/jira/browse/HBASE-5419 Project: HBase Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur Priority: Minor Fix For: 0.94.0 Attachments: D1767.1.patch, D1767.1.patch The FileAlreadyExistsException has moved from org.apache.hadoop.mapred.FileAlreadyExistsException to org.apache.hadoop.fs.FileAlreadyExistsException. HBase is currently using a class that is deprecated in hadoop trunk.
[jira] [Updated] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5074: --- Attachment: D1521.11.patch dhruba updated the revision [jira] [HBASE-5074] Support checksums in HBase block cache. Reviewers: mbautin 1. I modified the ChecksumType code to not dump an exception stack trace to the output if CRC32C is not available. Ted's suggestion of pulling CRC32C into hbase code sounds reasonable, but I would like to do it as part of another jira. Also, if hbase moves to hadoop 2.0, then it will automatically get CRC32C. 2. I added a minorVersion= to the output of HFilePrettyPrinter. Stack, will you be able to run bin/hbase hfile -m -f filename on your cluster to verify that this checksum feature is switched on. If it prints minorVersion=1, then you are using this feature. Do you still need a print somewhere saying that this feature is on? The older files that were pre-created before the patch was deployed will still use hdfs-checksum verification, so you could possibly see hdfs-checksum-verification in stack traces on a live regionserver. 3. I did some thinking (again) on the semantics of major version and minor version. The major version represents a new file format, e.g. suppose we add a new thing to the file's trailer, then we might need to bump up the major version. The minor version indicates the format of data inside an HFileBlock. In the current code, major versions 1 and 2 share the same HFileFormat (indicated by a minor version of 0). In this patch, we have a new minorVersion 1 because the data contents inside an HFileBlock have changed. Technically, both major versions 1 and 2 could have either minorVersion 0 or 1. Now, suppose we want to add a new field to the trailer of the HFile. We can bump the majorVersion to 3 but not change the minorVersion, because we did not change the internal format of an HFileBlock. 
Given the above, does it make sense to say that HFileBlock is independent of the majorVersion? REVISION DETAIL https://reviews.facebook.net/D1521 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/fs src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/main/java/org/apache/hadoop/hbase/regionserver/Store.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
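The graceful-fallback behavior described in point 1 (probe for CRC32C, fall back quietly instead of dumping a stack trace) can be sketched with reflection. This is an assumption-laden illustration, not the patch's ChecksumFactory: the class name probed below is invented, and the sketch simply falls back to java.util.zip.CRC32 when the preferred class is absent:

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Sketch: look up a CRC32C implementation by class name; if it is not on the
// classpath, log one line (no stack trace) and fall back to plain CRC32.
public class ChecksumProbe {
    static Checksum newChecksum(String preferredClass) {
        try {
            return (Checksum) Class.forName(preferredClass)
                                   .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Not available on this classpath: a single quiet line, not a trace.
            System.err.println("falling back to CRC32: " + e.getMessage());
            return new CRC32();
        }
    }
}
```

On hadoop 2.0 classpaths (and JDK 9+, which ships java.util.zip.CRC32C) the probe would find a real CRC32C; elsewhere the fallback keeps the output clean.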