[jira] [Created] (HBASE-5431) Improve delete marker handling in Import M/R jobs
Improve delete marker handling in Import M/R jobs

Key: HBASE-5431
URL: https://issues.apache.org/jira/browse/HBASE-5431
Project: HBase
Issue Type: Sub-task
Components: mapreduce
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.94.0

Import currently creates a new Delete object for each delete KV found in a Result object. This can be improved with the new Delete API that allows adding a delete KV to an existing Delete object.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5440) Allow import to optionally use HFileOutputFormat
Allow import to optionally use HFileOutputFormat

Key: HBASE-5440
URL: https://issues.apache.org/jira/browse/HBASE-5440
Project: HBase
Issue Type: Improvement
Components: mapreduce
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.94.0

importtsv supports importing into a live table or generating HFiles for bulk load. Import should allow the same. Could even consider merging these tools into one (in principle the only difference is the parsing part, although that is maybe for a different jira).
[jira] [Created] (HBASE-5460) Add protobuf as M/R dependency jar
Add protobuf as M/R dependency jar

Key: HBASE-5460
URL: https://issues.apache.org/jira/browse/HBASE-5460
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0

Getting this from M/R jobs (Export for example):

Error: java.lang.ClassNotFoundException: com.google.protobuf.Message
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
	at org.apache.hadoop.hbase.io.HbaseObjectWritable.<clinit>(HbaseObjectWritable.java:262)
[jira] [Created] (HBASE-5472) LoadIncrementalHFiles loops forever if the target table misses a CF
LoadIncrementalHFiles loops forever if the target table misses a CF

Key: HBASE-5472
URL: https://issues.apache.org/jira/browse/HBASE-5472
Project: HBase
Issue Type: Bug
Components: mapreduce
Reporter: Lars Hofhansl
Priority: Minor

I have some HFiles for two column families 'y' and 'z', but I specified a target table that only has CF 'y'. I see the following repeated forever:

...
12/02/23 22:57:37 WARN mapreduce.LoadIncrementalHFiles: Attempt to bulk load region containing into table z with files [family:y path:hdfs://bunnypig:9000/bulk/z2/y/bd6f1c3cc8b443fc9e9e5fddcdaa3b09, family:z path:hdfs://bunnypig:9000/bulk/z2/z/38f12fdbb7de40e8bf0e6489ef34365d] failed. This is recoverable and they will be retried.
12/02/23 22:57:37 DEBUG client.MetaScanner: Scanning .META. starting at row=z,,00 for max=2147483647 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7b7a4989
12/02/23 22:57:37 INFO mapreduce.LoadIncrementalHFiles: Split occured while grouping HFiles, retry attempt 1596 with 2 files remaining to group or split
12/02/23 22:57:37 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://bunnypig:9000/bulk/z2/y/bd6f1c3cc8b443fc9e9e5fddcdaa3b09 first=r last=r
12/02/23 22:57:37 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://bunnypig:9000/bulk/z2/z/38f12fdbb7de40e8bf0e6489ef34365d first=r last=r
12/02/23 22:57:37 DEBUG mapreduce.LoadIncrementalHFiles: Going to connect to server region=z,,1330066309814.d5fa76a38c9565f614755e34eacf8316., hostname=localhost, port=60020 for row
...
[jira] [Created] (HBASE-5475) Allow importtsv and Import to work truly offline when using bulk import option
Allow importtsv and Import to work truly offline when using bulk import option

Key: HBASE-5475
URL: https://issues.apache.org/jira/browse/HBASE-5475
Project: HBase
Issue Type: Improvement
Components: mapreduce
Reporter: Lars Hofhansl

Currently importtsv (and now also Import, with HBASE-5440) supports using HFileOutputFormat for later bulk loading. However, currently that cannot be done without access to the table we're going to import to, because both importtsv and Import need to look up the split points and find the compression setting. It would be nice if there were an offline way to provide the split points and compression setting.
[jira] [Created] (HBASE-5497) Add protobuf as M/R dependency jar (mapred)
Add protobuf as M/R dependency jar (mapred)

Key: HBASE-5497
URL: https://issues.apache.org/jira/browse/HBASE-5497
Project: HBase
Issue Type: Sub-task
Components: mapreduce
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0

Getting this from M/R jobs (Export for example):

Error: java.lang.ClassNotFoundException: com.google.protobuf.Message
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
	at org.apache.hadoop.hbase.io.HbaseObjectWritable.<clinit>(HbaseObjectWritable.java:262)
[jira] [Created] (HBASE-5509) MR based copier for copying HFiles (trunk version)
MR based copier for copying HFiles (trunk version)

Key: HBASE-5509
URL: https://issues.apache.org/jira/browse/HBASE-5509
Project: HBase
Issue Type: Sub-task
Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

This copier is a modification of the distcp tool in HDFS. It does the following:
1. List out all the regions in the HBase cluster for the required table
2. Write the above out to a file
3. Each mapper:
   3.1 lists all the HFiles for a given region by querying the regionserver
   3.2 copies all the HFiles
   3.3 outputs success if the copy succeeded, failure otherwise; failed regions are retried in another loop
4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying
[jira] [Created] (HBASE-5523) Fix Delete Timerange logic for KEEP_DELETED_CELLS
Fix Delete Timerange logic for KEEP_DELETED_CELLS

Key: HBASE-5523
URL: https://issues.apache.org/jira/browse/HBASE-5523
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.94.0, 0.96.0

A Delete at time T marks a Put at time T as deleted. In the parent issue I invented special logic that inserts a virtual millisecond into the TimeRange if the encountered KV is a delete marker. This was so that there is a way to specify a timerange that allows one to see the Put but not the Delete:

{code}
    if (kv.isDelete()) {
      if (!keepDeletedCells) {
        // first ignore delete markers if the scanner can do so, and the
        // range does not include the marker
        boolean includeDeleteMarker = seePastDeleteMarkers ?
            // +1, to allow a range between a delete and put of same TS
            tr.withinTimeRange(timestamp + 1) :
            tr.withinOrAfterTimeRange(timestamp);
{code}

Discussed this today with a coworker and he convinced me that this is very confusing and also not needed. When we have a Delete and a Put at the same time T, there *is* no timerange that can include the Put but not the Delete. So I will change the code to this (and fix the tests):

{code}
    if (kv.isDelete()) {
      if (!keepDeletedCells) {
        // first ignore delete markers if the scanner can do so, and the
        // range does not include the marker
        boolean includeDeleteMarker = seePastDeleteMarkers ?
            tr.withinTimeRange(timestamp) :
            tr.withinOrAfterTimeRange(timestamp);
{code}

It's easier to understand, and does not lead to strange scenarios when the TS is used as a controlled counter. Needs to be done before 0.94 goes out.
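The argument that no timerange can include the Put but not the Delete at the same T can be checked with a small toy model. The class below is a hypothetical stand-in for HBase's TimeRange (a half-open range [min, max)), written only to illustrate the point, not the real class:

```java
// Toy model of a half-open time range [min, max); illustrative only.
public class TimeRangeDemo {
    final long min, max;

    TimeRangeDemo(long min, long max) { this.min = min; this.max = max; }

    boolean withinTimeRange(long ts) { return ts >= min && ts < max; }

    public static void main(String[] args) {
        long t = 100; // a Put and a Delete, both at time T = 100
        // Any range that admits the Put at T necessarily admits the Delete at T,
        // so checking timestamp instead of timestamp+1 loses nothing.
        TimeRangeDemo tr = new TimeRangeDemo(100, 101);
        System.out.println(tr.withinTimeRange(t));     // covers both KVs at T
        System.out.println(tr.withinTimeRange(t + 1)); // what the old +1 check asked
    }
}
```

Since both KVs share the timestamp, the two checks only differ in which boundary they probe; the simplified form probes the timestamp the KVs actually carry.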
[jira] [Created] (HBASE-5541) Avoid holding the rowlock during HLog sync in HRegion.mutateRowWithLocks
Avoid holding the rowlock during HLog sync in HRegion.mutateRowWithLocks

Key: HBASE-5541
URL: https://issues.apache.org/jira/browse/HBASE-5541
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0

Currently mutateRowsWithLocks holds the row lock while the HLog is sync'ed. Similar to what we do in doMiniBatchPut, we should create the log entry with the lock held, but only sync the HLog after the lock is released, along with rollback logic in case the sync'ing fails.
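The proposed ordering can be sketched as follows. The method and the two callbacks are hypothetical stand-ins, not the actual HRegion code: edits and the WAL append happen under the row lock, the slow sync happens after it is released.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of "append under the lock, sync outside it"; names are made up.
public class SyncOutsideLockDemo {
    static final ReentrantLock rowLock = new ReentrantLock();

    static void mutateRow(Runnable applyEditsAndAppendWal, Runnable syncWal) {
        rowLock.lock();
        try {
            applyEditsAndAppendWal.run(); // fast: memstore changes + WAL append
        } finally {
            rowLock.unlock(); // do not hold the row lock across the slow sync
        }
        syncWal.run(); // if this fails, the caller must roll the edits back
    }

    public static void main(String[] args) {
        final boolean[] lockedDuringSync = {true};
        mutateRow(() -> {}, () -> lockedDuringSync[0] = rowLock.isHeldByCurrentThread());
        System.out.println(lockedDuringSync[0]); // false: lock released before sync
    }
}
```

The rollback arm is the price of this reordering: a reader on the same row may briefly observe edits whose sync later fails, which is exactly the trade-off doMiniBatchPut already makes.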
[jira] [Created] (HBASE-5547) Don't delete HFiles when in backup mode
Don't delete HFiles when in backup mode

Key: HBASE-5547
URL: https://issues.apache.org/jira/browse/HBASE-5547
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl

This came up in a discussion I had with Stack. It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either:
1. rename HFiles to be deleted to file.bck
2. rename the HFiles into a special directory
3. rename them to a general trash directory (which would not need to be tied to backup mode)

That way one should be able to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those). #1 makes cleanup a bit harder.
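Option 3 above might look roughly like this; the helper, the trash-directory layout, and the backup flag are all hypothetical, sketched here with plain java.nio file operations rather than HDFS:

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of "rename to trash instead of delete" while a backup is running.
public class ArchiveDemo {
    static void removeHFile(Path file, Path trashDir, boolean backupInProgress) throws Exception {
        if (backupInProgress) {
            Files.createDirectories(trashDir);
            // Keep the bytes around; a cleanup pass empties the trash later.
            Files.move(file, trashDir.resolve(file.getFileName()));
        } else {
            Files.delete(file);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("hfiles");
        Path f = Files.createFile(dir.resolve("abc123"));
        removeHFile(f, dir.resolve(".trash"), true); // backup mode: rename, don't delete
        System.out.println(Files.exists(dir.resolve(".trash").resolve("abc123"))); // true
        System.out.println(Files.exists(f));                                       // false
    }
}
```

A rename is atomic within a filesystem, which is what makes this safe to do concurrently with a backup reader.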
[jira] [Created] (HBASE-5569) TestAtomicOperation.testMultiRowMutationMultiThreads fails occasionally
TestAtomicOperation.testMultiRowMutationMultiThreads fails occasionally

Key: HBASE-5569
URL: https://issues.apache.org/jira/browse/HBASE-5569
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Priority: Minor

What I have pieced together so far is that it is the *scanning* side that has problems sometimes. Every time I see an assertion failure in the log I see this before:

{quote}
2012-03-12 21:48:49,523 DEBUG [Thread-211] regionserver.StoreScanner(499): Storescanner.peek() is changed where before = rowB/colfamily11:qual1/75366/Put/vlen=6,and after = rowB/colfamily11:qual1/75203/DeleteColumn/vlen=0
{quote}

The order of the Put and the Delete is sometimes reversed. The test threads should always see exactly one KV: if the "before" was the Put, the threads see 0 KVs; if the "before" was the Delete, the threads see 2 KVs. This debug message comes from StoreScanner.checkReseek. It seems we still have some consistency issues with scanning sometimes :(
[jira] [Created] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.

Key: HBASE-5604
URL: https://issues.apache.org/jira/browse/HBASE-5604
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl

Just an idea I had. Might be useful for restoring a backup using the HLogs. This could either be a standalone tool and/or an M/R job (with a mapper per HLog file).
[jira] [Created] (HBASE-5622) Improve efficiency of mapred version of RowCounter
Improve efficiency of mapred version of RowCounter

Key: HBASE-5622
URL: https://issues.apache.org/jira/browse/HBASE-5622
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Priority: Minor
Fix For: 0.94.1
[jira] [Created] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.
decayingSampleTick1 prevents HBase from shutting down.

Key: HBASE-5641
URL: https://issues.apache.org/jira/browse/HBASE-5641
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Priority: Blocker
Fix For: 0.94.0, 0.96.0
Attachments: 5641.txt

I think this is the problem. It creates a non-daemon thread:

{code}
  private static final ScheduledExecutorService TICK_SERVICE = Executors.newScheduledThreadPool(1,
      Threads.getNamedThreadFactory("decayingSampleTick"));
{code}
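A minimal, self-contained illustration of the likely fix: give the scheduler a thread factory that marks its worker as a daemon, so it cannot keep the JVM from exiting. The factory lambda below is a stand-in for Threads.getNamedThreadFactory, not the actual HBase helper:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;

// Scheduler whose single worker thread is a daemon; illustrative names only.
public class DaemonTickDemo {
    static ScheduledExecutorService newDaemonScheduler(String name) {
        return Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true); // the missing piece: non-daemon threads block shutdown
            return t;
        });
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService svc = newDaemonScheduler("decayingSampleTick");
        Future<Boolean> isDaemon = svc.submit(() -> Thread.currentThread().isDaemon());
        System.out.println(isDaemon.get()); // true
        svc.shutdown();
    }
}
```

With the default factory, the pool's worker is a non-daemon thread, and the JVM waits for it forever unless the executor is explicitly shut down.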
[jira] [Created] (HBASE-5659) TestAtomicOperation.testMultiRowMutationMultiThreads is still failing occasionally
TestAtomicOperation.testMultiRowMutationMultiThreads is still failing occasionally

Key: HBASE-5659
URL: https://issues.apache.org/jira/browse/HBASE-5659
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Priority: Minor

See run here: https://builds.apache.org/job/PreCommit-HBASE-Build/1318//testReport/org.apache.hadoop.hbase.regionserver/TestAtomicOperation/testMultiRowMutationMultiThreads/

{quote}
2012-03-27 04:36:12,627 DEBUG [Thread-118] regionserver.StoreScanner(499): Storescanner.peek() is changed where before = rowB/colfamily11:qual1/7202/Put/vlen=6/ts=7922,and after = rowB/colfamily11:qual1/7199/DeleteColumn/vlen=0/ts=0
2012-03-27 04:36:12,629 INFO [Thread-121] regionserver.HRegion(1558): Finished memstore flush of ~2.9k/2952, currentsize=1.6k/1640 for region testtable,,1332822963417.7cd30e219714cfc5e91f69def66e7f81. in 14ms, sequenceid=7927, compaction requested=true
2012-03-27 04:36:12,629 DEBUG [Thread-126] regionserver.TestAtomicOperation$2(362): flushing
2012-03-27 04:36:12,630 DEBUG [Thread-126] regionserver.HRegion(1426): Started memstore flush for testtable,,1332822963417.7cd30e219714cfc5e91f69def66e7f81., current region memstore size 1.9k
2012-03-27 04:36:12,630 DEBUG [Thread-126] regionserver.HRegion(1474): Finished snapshotting testtable,,1332822963417.7cd30e219714cfc5e91f69def66e7f81., commencing wait for mvcc, flushsize=1968
2012-03-27 04:36:12,630 DEBUG [Thread-126] regionserver.HRegion(1484): Finished snapshotting, commencing flushing stores
2012-03-27 04:36:12,630 DEBUG [Thread-126] util.FSUtils(153): Creating file=/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/b9091c3c-961e-4035-850a-83ad14d517cc/TestAtomicOperationtestMultiRowMutationMultiThreads/testtable/7cd30e219714cfc5e91f69def66e7f81/.tmp/61954619003e469baf1a34be5ff2ec57 with permission=rwxrwxrwx
2012-03-27 04:36:12,631 DEBUG [Thread-126] hfile.HFileWriterV2(143): Initialized with CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false]
2012-03-27 04:36:12,631 INFO [Thread-126] regionserver.StoreFile$Writer(997): Delete Family Bloom filter type for /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/b9091c3c-961e-4035-850a-83ad14d517cc/TestAtomicOperationtestMultiRowMutationMultiThreads/testtable/7cd30e219714cfc5e91f69def66e7f81/.tmp/61954619003e469baf1a34be5ff2ec57: CompoundBloomFilterWriter
2012-03-27 04:36:12,632 INFO [Thread-126] regionserver.StoreFile$Writer(1220): NO General Bloom and NO DeleteFamily was added to HFile (/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/b9091c3c-961e-4035-850a-83ad14d517cc/TestAtomicOperationtestMultiRowMutationMultiThreads/testtable/7cd30e219714cfc5e91f69def66e7f81/.tmp/61954619003e469baf1a34be5ff2ec57)
2012-03-27 04:36:12,632 INFO [Thread-126] regionserver.Store(770): Flushed , sequenceid=7934, memsize=1.9k, into tmp file /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/b9091c3c-961e-4035-850a-83ad14d517cc/TestAtomicOperationtestMultiRowMutationMultiThreads/testtable/7cd30e219714cfc5e91f69def66e7f81/.tmp/61954619003e469baf1a34be5ff2ec57
2012-03-27 04:36:12,632 DEBUG [Thread-126] regionserver.Store(795): Renaming flushed file at /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/b9091c3c-961e-4035-850a-83ad14d517cc/TestAtomicOperationtestMultiRowMutationMultiThreads/testtable/7cd30e219714cfc5e91f69def66e7f81/.tmp/61954619003e469baf1a34be5ff2ec57 to /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/b9091c3c-961e-4035-850a-83ad14d517cc/TestAtomicOperationtestMultiRowMutationMultiThreads/testtable/7cd30e219714cfc5e91f69def66e7f81/colfamily11/61954619003e469baf1a34be5ff2ec57
2012-03-27 04:36:12,634 INFO [Thread-126] regionserver.Store(818): Added /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/b9091c3c-961e-4035-850a-83ad14d517cc/TestAtomicOperationtestMultiRowMutationMultiThreads/testtable/7cd30e219714cfc5e91f69def66e7f81/colfamily11/61954619003e469baf1a34be5ff2ec57, entries=12, sequenceid=7934, filesize=1.3k
2012-03-27 04:36:12,642 DEBUG [Thread-118] regionserver.TestAtomicOperation$2(392): []
Exception in thread Thread-118 junit.framework.AssertionFailedError
	at junit.framework.Assert.fail(Assert.java:48)
	at junit.framework.Assert.fail(Assert.java:56)
	at org.apache.hadoop.hbase.regionserver.TestAtomicOperation$2.run(TestAtomicOperation.java:394)
2012-03-27 04:36:12,643 INFO [Thread-126] regionserver.HRegion(1558): Finished memstore flush of ~1.9k/1968, currentsize=1.3k/1312 for region
{quote}
[jira] [Created] (HBASE-5670) Have Mutation implement the Row interface.
Have Mutation implement the Row interface.

Key: HBASE-5670
URL: https://issues.apache.org/jira/browse/HBASE-5670
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Trivial

In HBASE-4347 I factored some code from Put/Delete/Append into Mutation. In a discussion with a co-worker I noticed that Put/Delete/Append still implement the Row interface, but Mutation does not. In a trivial change I would like to move that interface up to Mutation, along with changing HTable.batch(List<Row>) to HTable.batch(List<? extends Row>) (HConnection.processBatch takes List<? extends Row> already anyway), so that HTable.batch can be used with a list of Mutations.
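The effect of the signature change can be shown with toy stand-ins for the HBase classes (Row/Mutation/Put below are made up for illustration): a batch(List<Row>) would reject a List<Put>, while the bounded wildcard accepts any list of Row subtypes.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of why the wildcard matters; not the real HBase types.
public class WildcardDemo {
    interface Row { }
    static abstract class Mutation implements Row { }
    static class Put extends Mutation { }

    // With List<Row> this would not compile for a List<Put> argument;
    // List<? extends Row> accepts List<Put>, List<Mutation>, List<Row>, ...
    static int batch(List<? extends Row> actions) { return actions.size(); }

    public static void main(String[] args) {
        List<Put> puts = Arrays.asList(new Put(), new Put());
        System.out.println(batch(puts)); // 2
    }
}
```

Generics are invariant in Java: List<Put> is not a subtype of List<Row>, so without the wildcard every caller would have to copy into a List<Row> first.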
[jira] [Created] (HBASE-5682) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers (port to 0.94)
Add retry logic in HConnectionImplementation#resetZooKeeperTrackers (port to 0.94)

Key: HBASE-5682
URL: https://issues.apache.org/jira/browse/HBASE-5682
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl

Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable when the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport).
[jira] [Created] (HBASE-5096) Replication does not handle deletes correctly.
Replication does not handle deletes correctly.

Key: HBASE-5096
URL: https://issues.apache.org/jira/browse/HBASE-5096
Project: HBase
Issue Type: Sub-task
Components: replication
Affects Versions: 0.94.0, 0.92.1
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl

Teruyoshi Zenmyo discovered this problem. The problem turns out to be this code in ReplicationSink.java:

{code}
if (kvs.get(0).isDelete()) {
  ...
  if (kv.isDeleteFamily()) {
    delete.deleteFamily(kv.getFamily());
  } else if (!kv.isEmptyColumn()) {
    delete.deleteColumn(kv.getFamily(), kv.getQualifier());
  }
}
...
{code}

So the code deals with family delete markers and then assumes that if it's not a family delete marker it must have been a version delete marker. (deleteColumn sets a version delete marker; deleteColumns sets a column delete marker.) I.e. column delete markers are not replicated correctly.
[jira] [Created] (HBASE-5118) Fix Scan documentation
Fix Scan documentation

Key: HBASE-5118
URL: https://issues.apache.org/jira/browse/HBASE-5118
Project: HBase
Issue Type: Sub-task
Components: documentation
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Trivial

Current documentation for scan states:

{code}
Scan scan = new Scan();
scan.addColumn(Bytes.toBytes(cf), Bytes.toBytes(attr));
scan.setStartRow(Bytes.toBytes(row));                  // start key is inclusive
scan.setStopRow(Bytes.toBytes(row + new byte[] {0}));  // stop key is exclusive
for (Result result : htable.getScanner(scan)) {
  // process Result instance
}
{code}

row + new byte[] {0} is not correct. That should be row + (char)0.
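The bug in the documented snippet is plain Java string semantics, so it can be demonstrated without HBase at all (assuming row is a String, as in the docs): concatenating a byte[] onto a String stringifies the array reference, while (char) 0 appends a real NUL character, yielding the smallest key strictly greater than the row.

```java
// Self-contained demo of the wrong vs. right stop-row construction.
public class StopRowDemo {
    public static void main(String[] args) {
        String row = "row";
        String wrong = row + new byte[] {0}; // "row[B@..." -- array's toString, garbage
        String right = row + (char) 0;       // "row\0" -- immediate successor in byte order

        System.out.println(wrong.startsWith("row[B@")); // true
        System.out.println(right.length());             // 4
        System.out.println((int) right.charAt(3));      // 0
    }
}
```

With the wrong form the scan's stop row is an unrelated garbage key, so the scan does not stop after the intended single row.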
[jira] [Created] (HBASE-5164) Better HTable resource consumption in CoprocessorHost
Better HTable resource consumption in CoprocessorHost

Key: HBASE-5164
URL: https://issues.apache.org/jira/browse/HBASE-5164
Project: HBase
Issue Type: Sub-task
Components: coprocessors
Reporter: Lars Hofhansl
Priority: Minor
Fix For: 0.94.0

HBASE-4805 allows for more control over HTable's resource consumption. This is currently not used by CoprocessorHost (even though controlling this is even more critical inside the RegionServer). It's not immediately obvious how to do that. Maybe CoprocessorHost should maintain a lazy ExecutorService and HConnection and reuse both for all HTables retrieved via CoprocessorEnvironment.getTable(...). Not sure how critical this is, but I feel that without this it is dangerous to use getTable, as it would lead to all the resource consumption problems we find in the client, but inside a crucial part of the HBase servers.
[jira] [Created] (HBASE-5203) Fix atomic put/delete with region server failures.
Fix atomic put/delete with region server failures.

Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl

HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.
For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted; the refactoring needed would be non-trivial.
#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Created] (HBASE-5205) Delete handles deleteFamily incorrectly
Delete handles deleteFamily incorrectly

Key: HBASE-5205
URL: https://issues.apache.org/jira/browse/HBASE-5205
Project: HBase
Issue Type: Bug
Components: client
Reporter: Lars Hofhansl
Priority: Minor

Delete.deleteFamily clears all other markers for the same family. That is not correct, as some of these other markers might be for a later time. That logic should be removed. If (really) needed, this can be slightly optimized by keeping track of the max TS so far for each family: if both the TS-so-far and the TS of a new deleteFamily request are LATEST_TIMESTAMP and the marker-so-far is a deleteFamily marker, then the previous delete marker can be removed. I think that might be overkill, as most deletes issued from clients are for LATEST_TIMESTAMP (which the server translates to the current time). I'll have a (one-line) patch soon, unless folks insist on the optimization I mentioned above.
[jira] [Created] (HBASE-5229) Support atomic region operations
Support atomic region operations

Key: HBASE-5229
URL: https://issues.apache.org/jira/browse/HBASE-5229
Project: HBase
Issue Type: New Feature
Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5229.txt

As discussed (at length) on the dev mailing list: with HBASE-3584 and HBASE-5203 committed, supporting atomic cross-row transactions within a region becomes simple. I am aware of the hesitation about the usefulness of this feature, but we have to start somewhere. Let's use this jira for discussion; I'll attach a patch (with tests) momentarily to make this concrete.
[jira] [Created] (HBASE-5257) Allow filter to be evaluated after version handling
Allow filter to be evaluated after version handling

Key: HBASE-5257
URL: https://issues.apache.org/jira/browse/HBASE-5257
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl

There are various use cases and filter types where evaluating the filter before versions are handled either does not make sense or makes filter handling more complicated. Also see this comment in ScanQueryMatcher:

{code}
  /**
   * Filters should be checked before checking column trackers. If we do
   * otherwise, as was previously being done, ColumnTracker may increment its
   * counter for even that KV which may be discarded later on by Filter. This
   * would lead to incorrect results in certain cases.
   */
{code}

So we had Filters after the column trackers (which do the version checking), and then moved them. This should be at the discretion of the Filter. We could either add a new method to FilterBase (maybe excludeVersions() or something), or have a new Filter wrapper (like WhileMatchFilter) that should only be used as the outermost filter and indicates the same (maybe ExcludeVersionsFilter). See latest comments on HBASE-5229 for motivation.
[jira] [Created] (HBASE-5266) Add documentation for ColumnRangeFilter
Add documentation for ColumnRangeFilter

Key: HBASE-5266
URL: https://issues.apache.org/jira/browse/HBASE-5266
Project: HBase
Issue Type: Sub-task
Components: documentation
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.94.0

There are only a few lines of documentation for ColumnRangeFilter. Given the usefulness of this filter for efficient intra-row scanning (see HBASE-5229 and HBASE-4256), we should make this filter more prominent in the documentation.
[jira] [Created] (HBASE-5268) Add delete column prefix delete marker
Add delete column prefix delete marker

Key: HBASE-5268
URL: https://issues.apache.org/jira/browse/HBASE-5268
Project: HBase
Issue Type: Improvement
Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0

This is another part missing in the wide row challenge. Currently entire families of a row can be deleted, or individual columns or versions. There is no facility to mark multiple columns for deletion by column prefix. Turns out that this can be achieved with very little code (it's possible that I missed some of the new delete bloom filter code, so please review this thoroughly). I'll attach a patch soon; just working on some tests now.
[jira] [Created] (HBASE-5304) Pluggable split key policy
Pluggable split key policy -- Key: HBASE-5304 URL: https://issues.apache.org/jira/browse/HBASE-5304 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 We need a way to specify custom policies to determine split keys.
[jira] [Created] (HBASE-5311) Allow inmemory Memstore compactions
Allow inmemory Memstore compactions --- Key: HBASE-5311 URL: https://issues.apache.org/jira/browse/HBASE-5311 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Just like we periodically compact the StoreFiles, we should also periodically compact the MemStore. During these compactions we could eliminate deleted cells, expired cells, cells to be removed because of version count, etc., before we even do a memstore flush. Besides the optimization we could get from this, it should also allow us to remove the special handling of ICV, Increment, and Append (all of which use upsert logic to avoid accumulating excessive cells in the Memstore). Not targeting this.
[jira] [Created] (HBASE-5333) Introduce Memstore backpressure for writes
Introduce Memstore backpressure for writes Key: HBASE-5333 URL: https://issues.apache.org/jira/browse/HBASE-5333 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Currently, if the memstore/flush/compaction cannot keep up with the write load, we block writers for up to hbase.hstore.blockingWaitTime milliseconds (default is 90000). Would be nice if there was a concept of a soft backpressure that slows writing clients gracefully *before* we reach this condition. From the log: 2012-02-04 00:00:06,963 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region table,,1328313512779.c2761757621ddf8fb78baf5288d71271. has too many store files; delaying flush up to 90000ms
[jira] [Created] (HBASE-5336) Spurious exceptions in HConnectionImplementation
Spurious exceptions in HConnectionImplementation Key: HBASE-5336 URL: https://issues.apache.org/jira/browse/HBASE-5336 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl I have seen this on the client a few times during heavy write testing: java.util.concurrent.ExecutionException: java.io.IOException: java.io.IOException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1524) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1376) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:891) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:743) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:730) at NewsFeedCreate.insert(NewsFeedCreate.java:91) at NewsFeedCreate$1.run(NewsFeedCreate.java:38) at java.lang.Thread.run(Thread.java:619) Caused by: java.io.IOException: java.io.IOException: java.lang.NullPointerException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79) at org.apache.hadoop.hbase.client.ServerCallable.translateException(ServerCallable.java:228) at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:212) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1360) at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1348) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) ... 1 more Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099) at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:243) at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1289) at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1386) at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2161) at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1954) at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3363) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:899) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150) at $Proxy1.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1353) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1351) at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210) ... 
7 more
[jira] [Created] (HBASE-5368) Move PrefixSplitKeyPolicy out of the src/test into src, so it is accessible in HBase installs
Move PrefixSplitKeyPolicy out of the src/test into src, so it is accessible in HBase installs - Key: HBASE-5368 URL: https://issues.apache.org/jira/browse/HBASE-5368 Project: HBase Issue Type: Sub-task Components: regionserver Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Very simple change to make PrefixSplitKeyPolicy accessible in HBase installs (the user still needs to set up the table(s) accordingly). Right now it is in src/test/org.apache.hadoop.hbase.regionserver; I propose moving it to src/org.apache.hadoop.hbase.regionserver (alongside ConstantSizeRegionSplitPolicy), and maybe renaming it too.
[jira] [Created] (HBASE-5370) Allow HBase shell to set HTableDescriptor values
Allow HBase shell to set HTableDescriptor values - Key: HBASE-5370 URL: https://issues.apache.org/jira/browse/HBASE-5370 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Priority: Minor Currently it does not seem to be possible to set values on a table's HTableDescriptor (either on creation or afterwards). The syntax I have in mind is something like: create {NAME='table', 'somekey'='somevalue'}, 'column' In analogy to how we allow a column to be either a string ('column') or an association {NAME='column', ...}, alter would be changed to allow setting arbitrary values.
[jira] [Created] (HBASE-5774) Add documentation for WALPlayer to HBase reference guide.
Add documentation for WALPlayer to HBase reference guide. - Key: HBASE-5774 URL: https://issues.apache.org/jira/browse/HBASE-5774 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl
[jira] [Created] (HBASE-4488) Store could miss rows during flush
Store could miss rows during flush -- Key: HBASE-4488 URL: https://issues.apache.org/jira/browse/HBASE-4488 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.94.0 Reporter: Lars Hofhansl Priority: Critical While looking at HBASE-4344 I found that my change in HBASE-4241 contains a critical mistake.
[jira] [Created] (HBASE-4496) HFile V2 does not honor setCacheBlocks when scanning.
HFile V2 does not honor setCacheBlocks when scanning. - Key: HBASE-4496 URL: https://issues.apache.org/jira/browse/HBASE-4496 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.94.0 Reporter: Lars Hofhansl Fix For: 0.92.0, 0.94.0 While testing the LRU cache during the scanning I noticed quite some churn in the cache even when Scan.cacheBlocks is set to false. After debugging this, I found that HFile V2 always caches blocks in the LRU cache regardless of the cacheBlocks setting. Here's a trace (from Eclipse) showing the problem: HFileReaderV2.readBlock(long, int, boolean, boolean, boolean) line: 279 HFileReaderV2.readBlockData(long, long, int, boolean) line: 219 HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, HFileBlock) line: 191 HFileReaderV2$ScannerV2.seekTo(byte[], int, int, boolean) line: 502 HFileReaderV2$ScannerV2.reseekTo(byte[], int, int) line: 539 StoreFileScanner.reseekAtOrAfter(HFileScanner, KeyValue) line: 151 StoreFileScanner.reseek(KeyValue) line: 110 KeyValueHeap.reseek(KeyValue) line: 255 StoreScanner.reseek(KeyValue) line: 409 StoreScanner.next(ListKeyValue, int) line: 304 KeyValueHeap.next(ListKeyValue, int) line: 114 KeyValueHeap.next(ListKeyValue) line: 143 HRegion$RegionScannerImpl.nextRow(byte[]) line: 2774 HRegion$RegionScannerImpl.nextInternal(int) line: 2722 HRegion$RegionScannerImpl.next(ListKeyValue, int) line: 2682 HRegion$RegionScannerImpl.next(ListKeyValue) line: 2699 HRegionServer.next(long, int) line: 2092 Every scanner.next causes a reseek, which eventually causes a call to HFileBlockIndex$BlockIndexReader.seekToDataBlock(...) at which point the cacheBlocks information is lost. HFileReaderV2.readBlockData calls HFileReaderV2.readBlock with cacheBlocks set unconditionally to true. 
The fix is not immediately clear, unless we want to pass cacheBlocks to HFileBlockIndex$BlockIndexReader.seekToDataBlock and then on to HFileBlock.BasicReader.readBlockData and all its implementers, which is ugly, as readBlockData should not care about caching. Avoiding caching during scans is somewhat important for us.
[jira] [Created] (HBASE-4517) Document new replication features in 0.92
Document new replication features in 0.92 - Key: HBASE-4517 URL: https://issues.apache.org/jira/browse/HBASE-4517 Project: HBase Issue Type: Sub-task Components: documentation Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.92.0, 0.94.0 Document changes from HBASE-2195 and HBASE-2196
[jira] [Created] (HBASE-4536) Allow CF to retain deleted rows
Allow CF to retain deleted rows --- Key: HBASE-4536 URL: https://issues.apache.org/jira/browse/HBASE-4536 Project: HBase Issue Type: Sub-task Components: regionserver Affects Versions: 0.92.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.92.0, 0.94.0 The parent issue allows a cluster to retain rows for a TTL or keep a minimum number of versions. However, if a client deletes a row, all versions older than the delete tombstone will be removed at the next major compaction (and even at memstore flush - see HBASE-4241). There should be a way to retain those versions to guard against software errors. I see two options here: 1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED. 2. Fold this into the parent change, i.e. keep minimum-number-of-versions of versions even past the delete marker. #1 would allow for more flexibility. #2 comes somewhat naturally with the parent change (from a user viewpoint). Comments? Any other options?
[jira] [Created] (HBASE-4556) Fix all incorrect uses of InternalScanner.next(...)
Fix all incorrect uses of InternalScanner.next(...) --- Key: HBASE-4556 URL: https://issues.apache.org/jira/browse/HBASE-4556 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl There are cases all over the code where InternalScanner.next(...) is not used correctly. I see this a lot: {code} while(scanner.next(...)) { } {code} The correct pattern is: {code} boolean more = false; do { more = scanner.next(...); } while (more); {code}
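The pitfall matters because next(...) may return false on the very call that delivers the final batch of results, so the while-condition form never processes that batch. A self-contained toy (no HBase dependency; ToyScanner is hypothetical and only mimics the return-value contract) demonstrates it:

```java
// ToyScanner mimics InternalScanner semantics: next(results, limit) fills
// 'results' and returns true only if more rows remain AFTER this call.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScannerPattern {
    static class ToyScanner {
        private final List<String> rows;
        private int pos = 0;
        ToyScanner(List<String> rows) { this.rows = rows; }
        boolean next(List<String> results, int limit) {
            int end = Math.min(pos + limit, rows.size());
            results.addAll(rows.subList(pos, end));
            pos = end;
            return pos < rows.size();
        }
    }

    // Incorrect pattern: the loop body never runs for the batch that came
    // back together with 'false', so the last rows are silently dropped.
    public static List<String> collectWrong(List<String> data, int limit) {
        ToyScanner scanner = new ToyScanner(data);
        List<String> seen = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        while (scanner.next(batch, limit)) {
            seen.addAll(batch);
            batch.clear();
        }
        return seen;
    }

    // Correct pattern: consume the batch before testing the return value.
    public static List<String> collectRight(List<String> data, int limit) {
        ToyScanner scanner = new ToyScanner(data);
        List<String> seen = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(batch, limit);
            seen.addAll(batch);
            batch.clear();
        } while (more);
        return seen;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("r1", "r2", "r3");
        System.out.println(collectWrong(data, 2));  // [r1, r2] -- r3 lost
        System.out.println(collectRight(data, 2));  // [r1, r2, r3]
    }
}
```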
[jira] [Created] (HBASE-4583) Integrate RWCC with Append and Increment operations
Integrate RWCC with Append and Increment operations --- Key: HBASE-4583 URL: https://issues.apache.org/jira/browse/HBASE-4583 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.94.0 Currently Increment and Append operations do not work with RWCC, and hence a client could see the results of multiple such operations mixed in the same Get/Scan. The semantics might be a bit more interesting here, as upsert adds to and removes from the memstore.
[jira] [Created] (HBASE-4626) Filters unnecessarily copy byte arrays...
Filters unnecessarily copy byte arrays... - Key: HBASE-4626 URL: https://issues.apache.org/jira/browse/HBASE-4626 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Just looked at SingleColumnValueFilter and ValueFilter: on every column compared, they create a copy of the column and/or value portion of the KV.
[jira] [Created] (HBASE-4673) NPE in HFileReaderV2.close during major compaction when hfile.block.cache.size is set to 0
NPE in HFileReaderV2.close during major compaction when hfile.block.cache.size is set to 0 --- Key: HBASE-4673 URL: https://issues.apache.org/jira/browse/HBASE-4673 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Lars Hofhansl Priority: Minor On a test system I got this exception when hfile.block.cache.size is set to 0: java.lang.NullPointerException at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.close(HFileReaderV2.java:321) at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1065) at org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:539) at org.apache.hadoop.hbase.regionserver.StoreFile.deleteReader(StoreFile.java:549) at org.apache.hadoop.hbase.regionserver.Store.completeCompaction(Store.java:1314) at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:686) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1016) at org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest.run(CompactionRequest.java:178) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Minor issue, as nobody in their right mind will have hfile.block.cache.size=0. Looks like this is due to HBASE-4422
[jira] [Created] (HBASE-4682) Support deleted rows using Import/Export
Support deleted rows using Import/Export Key: HBASE-4682 URL: https://issues.apache.org/jira/browse/HBASE-4682 Project: HBase Issue Type: Sub-task Components: mapreduce Reporter: Lars Hofhansl The parent issue allows keeping deleted rows around. Would be nice if those could be exported and imported as well. All the building blocks are there.
[jira] [Created] (HBASE-4683) Create config option to only cache index blocks
Create config option to only cache index blocks --- Key: HBASE-4683 URL: https://issues.apache.org/jira/browse/HBASE-4683 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Priority: Minor Fix For: 0.94.0 This would add a new boolean config option: hfile.block.cache.datablocks. Default would be true. Setting this to false allows running HBase in a mode where only index blocks are cached, which is useful for analytical scenarios where a useful working set of the data cannot be expected to fit into the cache. This is the equivalent of setting cacheBlocks to false on all scans (including scans on behalf of gets). I would like to get a general feeling for what folks think about this. The change itself would be simple.
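If adopted, the option would presumably be set in hbase-site.xml like other block cache knobs; this is a sketch assuming the property name proposed above is kept as-is:

```xml
<!-- Sketch only: assumes the proposed property name is adopted unchanged. -->
<property>
  <name>hfile.block.cache.datablocks</name>
  <!-- false = cache only index blocks; data blocks are never cached -->
  <value>false</value>
</property>
```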
[jira] [Created] (HBASE-4691) Remove more unnecessary byte[] copies from KeyValues
Remove more unnecessary byte[] copies from KeyValues Key: HBASE-4691 URL: https://issues.apache.org/jira/browse/HBASE-4691 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Just looking through the code I found some more spots where we unnecessarily copy byte[] rather than just passing offset and length around.
[jira] [Created] (HBASE-4800) Result.compareResults is incorrect
Result.compareResults is incorrect -- Key: HBASE-4800 URL: https://issues.apache.org/jira/browse/HBASE-4800 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.4, 0.92.0, 0.94.0 Reporter: Lars Hofhansl A coworker of mine (James Taylor) found a bug in Result.compareResults(...). This condition: {code} if (!ourKVs[i].equals(replicatedKVs[i]) && !Bytes.equals(ourKVs[i].getValue(), replicatedKVs[i].getValue())) { throw new Exception("This result was different: {code} should be {code} if (!ourKVs[i].equals(replicatedKVs[i]) || !Bytes.equals(ourKVs[i].getValue(), replicatedKVs[i].getValue())) { throw new Exception("This result was different: {code} Just checked, this is wrong in all branches.
[jira] [Created] (HBASE-4805) Allow better control of resource consumption in HTable
Allow better control of resource consumption in HTable -- Key: HBASE-4805 URL: https://issues.apache.org/jira/browse/HBASE-4805 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl From some internal discussions at Salesforce we concluded that we need better control over the resources (mostly threads) consumed by HTable when used in an AppServer with many client threads. Since HTable is not thread safe, the only options are to cache HTables (in a custom thread local or using HTablePool) or to create them on demand. I propose a simple change: add a new constructor to HTable that takes an optional ExecutorService and HConnection instance. That would make HTable a pretty lightweight object, and we would manage the ES and HC separately. I'll upload a patch soon to get some feedback.
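A sketch of what the proposed usage could look like (the constructor signature is an assumption until the patch is posted): the application manages one shared HConnection and ExecutorService, and HTable becomes a cheap per-request object:

```java
// Sketch only: assumes HTable gains a constructor taking a shared
// HConnection and ExecutorService; the exact signature may differ.
ExecutorService pool = Executors.newCachedThreadPool();
HConnection connection = HConnectionManager.getConnection(conf);

// Now cheap to create (and discard) per client thread/request:
HTable table = new HTable(Bytes.toBytes("mytable"), connection, pool);
try {
  table.put(new Put(Bytes.toBytes("row1")));
} finally {
  table.close(); // would not tear down the shared pool/connection
}
```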
[jira] [Created] (HBASE-4838) Port 2856 (TestAcidGuarantees is failing) to 0.92
Port 2856 (TestAcidGuarantees is failing) to 0.92 - Key: HBASE-4838 URL: https://issues.apache.org/jira/browse/HBASE-4838 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.92.0 Moving the backport into a separate issue (as suggested by JonH), because this is not trivial.
[jira] [Created] (HBASE-4844) Coprocessor hooks for log rolling
Coprocessor hooks for log rolling - Key: HBASE-4844 URL: https://issues.apache.org/jira/browse/HBASE-4844 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0 Reporter: Lars Hofhansl Priority: Minor In order to eventually do point-in-time recovery, we need a way to reliably back up the logs. Rather than adding some hard-coded changes, we can provide coprocessor hooks and folks can implement their own policies.
[jira] [Created] (HBASE-4886) truncate fails in HBase shell
truncate fails in HBase shell - Key: HBASE-4886 URL: https://issues.apache.org/jira/browse/HBASE-4886 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 4886.txt Seeing this in trunk: {noformat} hbase(main):001:0> truncate 'table' Truncating 'table' table (it may take a while): ERROR: wrong number of arguments (1 for 3) Here is some help for this command: Disables, drops and recreates the specified table. {noformat} ... caused by the removal of the HTable(String) constructor.
[jira] [Created] (HBASE-4945) NPE in HRegion.bulkLoadHFiles(...)
NPE in HRegion.bulkLoadHFiles(...) -- Key: HBASE-4945 URL: https://issues.apache.org/jira/browse/HBASE-4945 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.0 Reporter: Lars Hofhansl Priority: Minor Was playing with completebulkload, and ran into an NPE. The problem is here: {code} Store store = getStore(familyName); if (store == null) { IOException ioe = new DoNotRetryIOException( "No such column family " + Bytes.toStringBinary(familyName)); ioes.add(ioe); failures.add(p); } try { store.assertBulkLoadHFileOk(new Path(path)); } catch (WrongRegionException wre) { // recoverable (file doesn't fit in region) failures.add(p); } catch (IOException ioe) { // unrecoverable (hdfs problem) ioes.add(ioe); } {code} This should be {code} Store store = getStore(familyName); if (store == null) { ... } else { try { store.assertBulkLoadHFileOk(new Path(path)); ... } {code}
[jira] [Created] (HBASE-4979) Setting KEEP_DELETED_CELLS fails in shell
Setting KEEP_DELETED_CELLS fails in shell Key: HBASE-4979 URL: https://issues.apache.org/jira/browse/HBASE-4979 Project: HBase Issue Type: Sub-task Components: shell Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl admin.rb uses the wrong method on HColumnDescriptor to enable keeping of deleted cells.
[jira] [Created] (HBASE-4981) add raw scan support to shell
add raw scan support to shell - Key: HBASE-4981 URL: https://issues.apache.org/jira/browse/HBASE-4981 Project: HBase Issue Type: Sub-task Components: shell Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl The parent issue adds raw scan support to include delete markers and deleted rows in scan results. Would be nice if that were available in the shell, to see exactly what exists in a table.
[jira] [Created] (HBASE-4998) Support deleted rows in CopyTable
Support deleted rows in CopyTable - Key: HBASE-4998 URL: https://issues.apache.org/jira/browse/HBASE-4998 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor It turns out that with HBASE-4682 in place, it is trivial to add this to CopyTable as well. This would be another tool in the backup arsenal.
[jira] [Created] (HBASE-5058) Allow HBaseAdmin to use an existing connection
Allow HBaseAdmin to use an existing connection - Key: HBASE-5058 URL: https://issues.apache.org/jira/browse/HBASE-5058 Project: HBase Issue Type: Sub-task Components: client Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor What HBASE-4805 does for HTable, this should do for HBaseAdmin. Along with this, the shared error handling and retrying between HBaseAdmin and HConnectionManager can also be improved. I'll attach a first-pass patch soon.
[jira] [Created] (HBASE-5059) Tests for: Support deleted rows in CopyTable
Tests for: Support deleted rows in CopyTable Key: HBASE-5059 URL: https://issues.apache.org/jira/browse/HBASE-5059 Project: HBase Issue Type: Sub-task Components: mapreduce Affects Versions: 0.94.0 Reporter: Lars Hofhansl Priority: Minor