[jira] [Created] (HBASE-11402) Scanner perform redundant datanode requests
Max Lapan created HBASE-11402:
-
Summary: Scanner perform redundant datanode requests
Key: HBASE-11402
URL: https://issues.apache.org/jira/browse/HBASE-11402
Project: HBase
Issue Type: Bug
Components: HFile, Scanners
Reporter: Max Lapan

Using HBase 0.94.6, I found duplicate datanode requests of this sort:
{noformat}
2014-06-09 14:12:22,039 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.103.0.73:50010, dest: /10.103.0.38:57897, bytes: 1056768, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1702752887_26, offset: 35840, srvID: DS-504316153-10.103.0.73-50010-1342437562377, blockid: BP-404551095-10.103.0.38-1376045452213:blk_3541255952831727320_613837, duration: 109928797000
2014-06-09 14:12:22,080 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.103.0.73:50010, dest: /10.103.0.38:57910, bytes: 1056768, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1702752887_26, offset: 0, srvID: DS-504316153-10.103.0.73-50010-1342437562377, blockid: BP-404551095-10.103.0.38-1376045452213:blk_3541255952831727320_613837, duration: 3825
{noformat}
After a short investigation, I found the source of this behaviour:
* StoreScanner's constructor calls StoreFileScanner::seek, which (after several levels of calls) calls HFileBlock::readBlockDataInternal, which reads the block and pre-reads the header of the next block.
* This pre-read header is stored in a ThreadLocal PrefetchedHeader variable, and the stream is left positioned right behind the header of the next block.
* After the constructor finishes, the scanner code performs the scan and, once the pre-read block's data is exhausted, calls HFileReaderV2::readNextDataBlock, which again calls HFileBlock::readBlockDataInternal. But this call occurs from a different thread, so there is nothing useful in the ThreadLocal variable.
* Because of this, the stream is asked to seek backwards, which causes a duplicate DN request.
As far as I understand from the trunk code, the problem hasn't been fixed yet.
Log of calls for the process above:
{noformat}
2014-06-18 14:55:36,616 INFO org.apache.hadoop.hbase.io.hfile.HFileBlockIndex: loadDataBlockWithScanInfo: entered
2014-06-18 14:55:36,616 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: seekTo: readBlock, ofs = 0, size = -1
2014-06-18 14:55:36,617 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: Before block read: path = hdfs://tsthdp1.p:9000/hbase/webpagesII/ba16051997b1272f00bed5f65094dc63/p/c866b7b0eded4b
2014-06-18 14:55:36,617 INFO org.apache.hadoop.hbase.io.hfile.HFile: readBlockDataInternal. Ofs = 0, is.pos = 137257042, ondDiskSizeWithHeader = -1
2014-06-18 14:55:36,617 INFO org.apache.hadoop.hbase.io.hfile.HFile: readBlockDataInternal: prefetchHeader.ofs = -1, thread = 48
2014-06-18 14:55:36,617 INFO org.apache.hadoop.hbase.io.hfile.HFile: FSReaderV2: readAtOffset: size = 24, offset = 0, peekNext = false
2014-06-18 14:55:36,617 INFO org.apache.hadoop.hdfs.DFSClient: seek: targetPos = 0, pos = 137257042, blockEnd = 137257229
2014-06-18 14:55:36,617 INFO org.apache.hadoop.hdfs.DFSClient: seek: not done, blockEnd = -1
2014-06-18 14:55:36,617 INFO org.apache.hadoop.hdfs.DFSClient: readWithStrategy: before seek, pos = 0, blockEnd = -1, currentNode = 10.103.0.73:50010
2014-06-18 14:55:36,618 INFO org.apache.hadoop.hdfs.DFSClient: getBlockAt: blockEnd updated to 137257229
2014-06-18 14:55:36,618 INFO org.apache.hadoop.hdfs.DFSClient: blockSeekTo: loop, target = 0
2014-06-18 14:55:36,618 INFO org.apache.hadoop.hdfs.DFSClient: getBlockReader: dn = tsthdp2.p, file = /hbase/webpagesII/ba16051997b1272f00bed5f65094dc63/p/c866b7b0eded4b42bc40aa9e18ac8a4b, bl
2014-06-18 14:55:36,627 INFO org.apache.hadoop.hdfs.DFSClient: readBuffer: ofs = 0, len = 24
2014-06-18 14:55:36,627 INFO org.apache.hadoop.hdfs.DFSClient: readBuffer: try to read
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: readBuffer: done, len = 24
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hbase.io.hfile.HFile: FSReaderV2: readAtOffset: size = 35899, offset = 24, peekNext = true
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: seek: targetPos = 24, pos = 24, blockEnd = 137257229
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: seek: check that we cat skip diff = 0
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: seek: try to fast-forward on diff = 0, pos = 24
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: seek: pos after = 24
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: readBuffer: ofs = 24, len = 35923
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: readBuffer: try to read
2014-06-18 14:55:36,641 INFO org.apache.hadoop.hdfs.DFSClient: readBuffer: done, len = 35923
2014-06-18 14:55:36,642 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: Block data read
2014-06-18 14:55:36,642 INFO
{noformat}
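The per-thread prefetch cache described above is the crux. Below is a minimal, self-contained sketch of the pattern; the names (PrefetchedHeader, readAtOffset, the 24-byte header) echo the report, but the code is illustrative rather than the actual HBase implementation:
{code}
// Illustrative sketch of a per-thread header prefetch cache; names echo the
// report, but this is not the actual HBase code.
class PrefetchedHeader {
    long offset = -1;              // offset of the block whose header was prefetched
    byte[] header = new byte[24];  // header size seen in the report's logs
}

class BlockReaderSketch {
    // One cached header per thread: a prefetch done by thread A is invisible
    // to thread B, which then has to re-read the header.
    private final ThreadLocal<PrefetchedHeader> prefetchedHeaderForThread =
        ThreadLocal.withInitial(PrefetchedHeader::new);

    byte[] readBlockHeader(long offset) {
        PrefetchedHeader cached = prefetchedHeaderForThread.get();
        if (cached.offset == offset) {
            return cached.header;  // hit: the stream already sits past this header
        }
        // Miss (e.g. a different thread): the stream must seek back to 'offset',
        // which is what produces the duplicate datanode request in the logs.
        return readAtOffset(offset, 24);
    }

    private byte[] readAtOffset(long offset, int len) {
        return new byte[len];      // stand-in for the HDFS positional read
    }
}
{code}
A prefetch done during the constructor on one thread is invisible to the scan running on another thread, which is exactly the miss-plus-backwards-seek visible in the log above.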
[jira] [Updated] (HBASE-11402) Scanner performs redundant datanode requests
[ https://issues.apache.org/jira/browse/HBASE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Lapan updated HBASE-11402:
--
Summary: Scanner performs redundant datanode requests (was: Scanner perform redundant datanode requests)
[jira] [Created] (HBASE-10118) Major compact keeps deletes with future timestamps
Max Lapan created HBASE-10118:
-
Summary: Major compact keeps deletes with future timestamps
Key: HBASE-10118
URL: https://issues.apache.org/jira/browse/HBASE-10118
Project: HBase
Issue Type: Bug
Components: Compaction, Deletes, regionserver
Reporter: Max Lapan
Priority: Minor

Hello!
During migration from HBase 0.90.6 to 0.94.6, we found a behaviour change in how major compaction handles delete markers with timestamps in the future. Before HBASE-4721, major compaction purged deletes regardless of their timestamp. Newer versions keep them in the HFile until the timestamp is reached. I guess this happens due to the new check in ScanQueryMatcher: {{(EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes}}. This can be worked around by specifying a large negative value in the {{hbase.hstore.time.to.purge.deletes}} option, but, unfortunately, negative values are pulled up to zero by a Math.max in HStore.java. It is very possible that we are trying to do something weird by specifying a delete timestamp in the future, but HBASE-4721 definitely breaks the old behaviour we rely on.
Steps to reproduce this:
{code}
put 'test', 'delmeRow', 'delme:something', 'hello'
flush 'test'
delete 'test', 'delmeRow', 'delme:something', 1394161431061
flush 'test'
major_compact 'test'
{code}
Before major_compact, we have two HFiles with the following:
{code}
first: K: delmeRow/delme:something/1384161431061/Put/vlen=5/ts=0
second: K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}
After major compact, we get the following:
{code}
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}
In our installation, we resolved this by removing the Math.max and setting hbase.hstore.time.to.purge.deletes to Integer.MIN_VALUE, which purges the delete markers.
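The interaction is easy to reproduce in isolation. Here is a self-contained sketch; only {{timeToPurgeDeletes}}, the Math.max clamp, and the option name come from the report, while the sample timestamps and the simplified keep-condition are illustrative:
{code}
// A minimal, self-contained sketch of the interaction described above,
// not the actual HBase source.
public class PurgeDeletesSketch {
    public static void main(String[] args) {
        long configured = Long.MIN_VALUE;                   // attempted workaround
        long timeToPurgeDeletes = Math.max(configured, 0);  // HStore clamps negatives to 0

        long now = 1384161431061L;                          // "current" time, ms
        long deleteTs = 1394161431061L;                     // delete marker dated in the future

        // ScanQueryMatcher-style check: keep the marker while it is too young to purge.
        // For a future timestamp, (now - deleteTs) is negative, so negative <= 0 holds
        // and the marker survives every major compaction until its timestamp passes.
        boolean keep = (now - deleteTs) <= timeToPurgeDeletes;
        System.out.println("delete marker kept by major compaction: " + keep); // true
    }
}
{code}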
[jira] [Updated] (HBASE-10118) Major compact keeps deletes with future timestamps
[ https://issues.apache.org/jira/browse/HBASE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Lapan updated HBASE-10118:
--
Description: ... In our installation, we resolved this by removing the Math.max and setting hbase.hstore.time.to.purge.deletes to Integer.MIN_VALUE, which purges the delete markers, and it looks like a solution. But maybe there is a better approach. (was: ... which purges the delete markers.)
[jira] [Created] (HBASE-8959) Bitmasks handling
Max Lapan created HBASE-8959:
-
Summary: Bitmasks handling
Key: HBASE-8959
URL: https://issues.apache.org/jira/browse/HBASE-8959
Project: HBase
Issue Type: New Feature
Reporter: Max Lapan
Priority: Minor

I think it would be useful to natively support bitmasks in HBase columns, with the ability to check/set/clear/toggle individual bits of a byte[] value in any column. Currently, we are forced to store lots of feature flags as separate 1-byte values, which is a waste of space. Compression helps, I guess, but it is still not ideal.
I see this as a set of new KeyValue.Type values that describe the needed bit operations, with the operations themselves performed during compaction or scans. Similar things are done for KeyValue.Minimum and KeyValue.Maximum. What do you think about this feature? Maybe I'm missing something and it is much harder to implement than I think?
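To make the proposal concrete, here is a toy sketch of folding such bit-operation cells into a base value at compaction or scan time. Everything in it is hypothetical: HBase has no such KeyValue.Type, and the BitOp enum and apply helper are invented for illustration:
{code}
// Purely hypothetical sketch of the proposed bit-operation cells.
public class BitmaskSketch {
    enum BitOp { SET, CLEAR, TOGGLE }

    // Fold one recorded bit operation into the current value of a byte.
    static byte apply(byte value, BitOp op, int bit) {
        byte mask = (byte) (1 << bit);
        switch (op) {
            case SET:    return (byte) (value | mask);
            case CLEAR:  return (byte) (value & ~mask);
            case TOGGLE: return (byte) (value ^ mask);
            default:     return value;
        }
    }

    public static void main(String[] args) {
        byte flags = 0;
        flags = apply(flags, BitOp.SET, 0);    // client A sets bit 0
        flags = apply(flags, BitOp.SET, 3);    // client B sets bit 3, no lost update
        flags = apply(flags, BitOp.TOGGLE, 0); // toggle bit 0 back off
        System.out.printf("flags = %8s%n", Integer.toBinaryString(flags & 0xFF));
    }
}
{code}
Because SET/CLEAR/TOGGLE on distinct bits commute, the order in which the region server folds them does not matter, which is what would make a compaction-time merge safe for concurrent writers.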
[jira] [Commented] (HBASE-8959) Bitmasks handling
[ https://issues.apache.org/jira/browse/HBASE-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709877#comment-13709877 ]
Max Lapan commented on HBASE-8959:
--
When such an update is performed by a client, there is a chance of a concurrent update: one task sets one bit, another updates a different bit in the same byte, and the overall result is unpredictable. If the bit operations are in KeyValues, there is no such problem: both clients put two KeyValues that update different bits, and on the next compaction the RS will combine them into the correct value.
[jira] [Commented] (HBASE-8959) Bitmasks handling
[ https://issues.apache.org/jira/browse/HBASE-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710669#comment-13710669 ]
Max Lapan commented on HBASE-8959:
--
Not yet, thanks for the suggestion.
[jira] [Commented] (HBASE-5071) HFile has a possible cast issue.
[ https://issues.apache.org/jira/browse/HBASE-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585860#comment-13585860 ]
Max Lapan commented on HBASE-5071:
--
Adding my notes on this bug. This could be helpful to someone who, like us, still uses HFileV1. This bug was introduced by the HBASE-3040 performance optimisation and cannot be fixed by Harsh's patch, which truncates the index data (problems arise later when parsing the index). I fixed this issue in our installation by replacing readAllIndex with BufferedInputStreams, which is transparent and has no index size limitations: https://github.com/Shmuma/hbase/commit/d0ef517482a0475588e229344558c31b47d5a269

HFile has a possible cast issue.
Key: HBASE-5071
URL: https://issues.apache.org/jira/browse/HBASE-5071
Project: HBase
Issue Type: Bug
Components: HFile, io
Affects Versions: 0.90.0
Reporter: Harsh J
Labels: hfile
Fix For: 0.96.0

HBASE-3040 introduced this line originally in HFile.Reader#loadFileInfo(...):
{code}
int allIndexSize = (int)(this.fileSize - this.trailer.dataIndexOffset - FixedFileTrailer.trailerSize());
{code}
Which on trunk today, for HFile v1, is:
{code}
int sizeToLoadOnOpen = (int) (fileSize - trailer.getLoadOnOpenDataOffset() - trailer.getTrailerSize());
{code}
This computed (and casted) integer is then used to build an array of the same size. But if fileSize is very large (> Integer.MAX_VALUE), there's an easy chance this can go negative at some point and spew out exceptions such as:
{code}
java.lang.NegativeArraySizeException
at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readAllIndex(HFile.java:805)
at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:832)
at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.loadFileInfo(StoreFile.java:1003)
at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:382)
at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:438)
at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:267)
at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:209)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2088)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:358)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2661)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2647)
{code}
Did we accidentally limit single region sizes this way? (Unsure about HFile v2's structure so far, so I do not know if v2 has the same issue.)
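The failure is easy to demonstrate in isolation. Here is a self-contained sketch of the subtract-then-cast pattern quoted above; the sizes are made up:
{code}
// Demonstrates the overflow: casting a long size greater than
// Integer.MAX_VALUE to int goes negative, and allocating an array of
// that size throws NegativeArraySizeException.
public class CastOverflowSketch {
    public static void main(String[] args) {
        long fileSize = 3L * 1024 * 1024 * 1024;  // a 3 GB HFile (illustrative)
        long loadOnOpenOffset = 1024;             // index/trailer offsets (illustrative)
        long trailerSize = 212;

        int sizeToLoadOnOpen = (int) (fileSize - loadOnOpenOffset - trailerSize);
        System.out.println("casted size = " + sizeToLoadOnOpen); // negative

        byte[] buf = new byte[sizeToLoadOnOpen]; // throws NegativeArraySizeException
    }
}
{code}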
[jira] [Commented] (HBASE-5071) HFile has a possible cast issue.
[ https://issues.apache.org/jira/browse/HBASE-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586673#comment-13586673 ]
Max Lapan commented on HBASE-5071:
--
Harsh: at the moment, we aren't ready to upgrade from 0.90.6 to 0.92.
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13540062#comment-13540062 ]
Max Lapan commented on HBASE-5416:
--
bq. I think you meant we only have storeHeap.
No, exactly one KVS in the joinedScanner heap and an empty storeHeap. It was caused by the extra {{!scan.doLoadColumnFamiliesOnDemand()}} condition in the constructor.

Improve performance of scans with some kind of filters.
---
Key: HBASE-5416
URL: https://issues.apache.org/jira/browse/HBASE-5416
Project: HBase
Issue Type: Improvement
Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Sergey Shelukhin
Fix For: 0.96.0
Attachments: 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, HBASE-5416-v9.patch

When a scan is performed, the whole row is loaded into the result list, and after that the filter (if one exists) is applied to decide whether the row is needed. But when the scan covers several CFs and the filter checks data from only a subset of them, the data from the CFs not checked by the filter is not needed at the filter stage, only once we have decided to include the current row. In such cases we can significantly reduce the amount of IO performed by the scan by loading only the values actually checked by the filter.
For example, we have two CFs: flags and snap. Flags is quite small (a bunch of megabytes) and is used to filter large entries from snap. Snap is very large (tens of GB) and quite costly to scan. If we need only rows with some flag specified, we use a SingleColumnValueFilter to limit the result to a small subset of the region. But the current implementation loads both CFs to perform the scan when only a small subset is needed.
The attached patch adds one routine to the Filter interface that allows a filter to specify which CFs it needs for its operation. In HRegion, we separate all scanners into two groups: those needed by the filter and the rest (joined). When a new row is considered, only the needed data is loaded and the filter applied; only if the filter accepts the row is the rest of the data loaded. On our data, this speeds up such scans 30-50 times. It also gives us a way to better normalize the data into separate columns by optimizing the scans performed.
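As an illustration of the two-phase scan the description outlines, here is a toy sketch. It is not the patch itself: the real change splits KeyValueScanners into two heaps inside HRegion, while the maps, record, and names here are stand-ins:
{code}
import java.util.*;
import java.util.function.Predicate;

// Toy sketch of the two-phase scan: read the filter's essential CF first,
// and read the expensive CF only for rows the filter accepted.
public class JoinedScanSketch {
    record Row(String key, byte[] flags, byte[] snap) {}

    static List<Row> scan(List<String> keys,
                          Map<String, byte[]> flagsCf,   // small CF, cheap to read
                          Map<String, byte[]> snapCf,    // huge CF, expensive to read
                          Predicate<byte[]> flagFilter) {
        List<Row> results = new ArrayList<>();
        for (String key : keys) {
            byte[] flags = flagsCf.get(key);      // phase 1: essential CF only
            if (flags == null || !flagFilter.test(flags)) {
                continue;                         // row excluded: snap never read
            }
            byte[] snap = snapCf.get(key);        // phase 2: load the joined CF
            results.add(new Row(key, flags, snap));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, byte[]> flags = Map.of("r1", new byte[]{1}, "r2", new byte[]{0});
        Map<String, byte[]> snap  = Map.of("r1", new byte[1024], "r2", new byte[1024]);
        List<Row> out = scan(List.of("r1", "r2"), flags, snap, f -> f[0] == 1);
        System.out.println(out.size() + " row(s) passed; snap was read only for those");
    }
}
{code}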
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13536069#comment-13536069 ]
Max Lapan commented on HBASE-5416:
--
bq. is this part of acceptable inconsistency that was discussed above? I understand this can be a valid scenario (speedup is huge at the cost of very few (or none if pre-split) easily recoverable errors), but just wonder if you are aware of that...
No. It is very strange, because the splitting process is handled at the lower level of the Store class and should be transparent to the HRegion level (at least in the 0.90.6 codebase; maybe something changed dramatically). In our production, splits are a quite common operation and run without issues. We had one problem (HBASE-6499), which was caused by no one calling seek/reseek frequently; it may be a similar issue.
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13530714#comment-13530714 ]
Max Lapan commented on HBASE-5416:
--
Have you increased the forkedProcessTimeoutInSeconds option in pom.xml?
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492173#comment-13492173 ]
Max Lapan commented on HBASE-5416:
--
If no one is against inclusion, let's include it :). But I have a small improvement to make. Personally, I don't like the Filter interface alteration. When I started, I thought more filters would conform to the optimization, but only SingleColumnValueFilter and SingleColumnValueExcludeFilter do. So I'd rather just check for these filters in HRegionScanner than introduce an extra method in the interface.
[jira] [Commented] (HBASE-6499) StoreScanner's QueryMatcher not reset on store update
[ https://issues.apache.org/jira/browse/HBASE-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484835#comment-13484835 ]
Max Lapan commented on HBASE-6499:
--
Yes, this bug is related to HBASE-6900, but this also fixes the seek() case.
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484842#comment-13484842 ]
Max Lapan commented on HBASE-5416:
--
Yes, I think a CP will work, thanks. The sad thing is that we use the 0.90.6 (CDH) version of HBase, which doesn't have CPs. In fact, we use this patch on our production system without major issues and are quite happy with it. But I don't think it's a good idea to include it in trunk when a much better approach exists.
[jira] [Created] (HBASE-6499) StoreScanner's QueryMatcher not reset on store update
Max Lapan created HBASE-6499:
-
Summary: StoreScanner's QueryMatcher not reset on store update
Key: HBASE-6499
URL: https://issues.apache.org/jira/browse/HBASE-6499
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.96.0
Reporter: Max Lapan
Assignee: Max Lapan

When the underlying store changes (due to a compaction, bulk load, etc.), we destroy the current KeyValueHeap and recreate it via the checkReseek call. Besides recreating the heap, this resets the underlying QueryMatcher instance. The problem is that checkReseek is not called by seek() and reseek(), only by next(). If someone calls seek() just after the store changed, they get wrong scanner results; a call to reseek() may end up with an NPE. AFAIK, the current codebase doesn't call seek and reseek, but it is quite possible in the future. Personally, I spent lots of time finding the source of wrong scanner results in HBASE-5416.
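Reduced to a sketch, the bug pattern looks like this (illustrative only; the real classes are StoreScanner and KeyValueHeap, and the guard is checkReseek):
{code}
// The "store changed" check lives only in next(), so seek()/reseek() can run
// against a stale heap/matcher after a compaction swaps the store files.
class StoreScannerSketch {
    private volatile boolean storeChanged;   // set by compaction/bulk load
    private Object heap = new Object();      // stand-in for KeyValueHeap

    private void checkReseek() {
        if (storeChanged) {
            storeChanged = false;
            heap = new Object();             // rebuild heap, reset query matcher
        }
    }

    Object next()   { checkReseek(); return heap; } // safe: guard is called
    Object seek()   { return heap; }                // bug: guard skipped, stale heap used
    Object reseek() { return heap; }                // bug: guard skipped here too
}
{code}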
[jira] [Updated] (HBASE-6499) StoreScanner's QueryMatcher not reset on store update
[ https://issues.apache.org/jira/browse/HBASE-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Lapan updated HBASE-6499:
-
Attachment: StoreScanner_not_reset_matcher.patch
[jira] [Updated] (HBASE-6499) StoreScanner's QueryMatcher not reset on store update
[ https://issues.apache.org/jira/browse/HBASE-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Lapan updated HBASE-6499:
-
Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: Filtered_scans_v7.patch Implemented a benchmark of the joined scanners. You can run it with {{mvn test -P localTests -Dtest=TestJoinedScanners}}. It runs for about an hour, so don't forget to increase {{forkedProcessTimeoutInSeconds}} in the pom.xml file. On my notebook I got the following output:
{quote}
2012-06-29 22:12:00,182 INFO [main] regionserver.TestJoinedScanners(102): Make 10 rows, total size = 9765.0 MB
2012-06-29 22:56:51,231 INFO [main] regionserver.TestJoinedScanners(128): Data generated in 2691.048310914 seconds
2012-06-29 23:03:03,865 INFO [main] regionserver.TestJoinedScanners(152): Slow scanner finished in 372.634075184 seconds, got 1000 rows
2012-06-29 23:04:02,443 INFO [main] regionserver.TestJoinedScanners(172): Joined scanner finished in 58.577552657 seconds, got 1000 rows
2012-06-29 23:09:41,837 INFO [main] regionserver.TestJoinedScanners(195): Slow scanner finished in 339.394307354 seconds, got 1000 rows
{quote}
I ran the slow scanner test twice to be sure the result is not a cache effect. So, it's about a 5.7x speedup on this toy data. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: filters, performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch When a scan is performed, the whole row is loaded into the result list, and only afterwards is the filter (if any) applied to decide whether the row is needed. But when a scan covers several CFs and the filter checks data from only a subset of them, the data from the unchecked CFs is not needed at the filter stage; it is needed only once we have decided to include the current row. In such cases we can significantly reduce the amount of IO a scan performs by loading only the values the filter actually checks. For example, we have two CFs: flags and snap. Flags is quite small (a bunch of megabytes) and is used to filter large entries from snap. Snap is very large (tens of GB) and quite costly to scan. If we need only rows with some flag set, we use a SingleColumnValueFilter to limit the result to a small subset of the region, but the current implementation still loads both CFs to perform the scan. The attached patch adds one routine to the Filter interface that lets a filter specify which CFs it needs for its operation. In HRegion, we separate all scanners into two groups: those needed by the filter, and the rest (joined). When a new row is considered, only the needed data is loaded and the filter applied; only if the filter accepts the row is the rest of the data loaded. On our data, this speeds up such scans 30-50 times. It also gives us a way to better normalize the data into separate columns by optimizing the scans performed.
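To illustrate the two-phase idea, here is a minimal self-contained sketch with invented names throughout: row maps and a plain Predicate stand in for HBase's KeyValues and Filter. (A per-CF hook of this shape later surfaced on the Filter interface as isFamilyEssential in released HBase versions, but the sketch below is not that API.)
{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Sketch of the joined scan: scanners over CFs the filter inspects run
// first; the remaining CFs are read only once the filter accepts the row.
final class JoinedScanSketch {
    static List<Map<String, String>> scan(
            List<Map<String, String>> region,       // one CF->value map per row
            List<String> essentialFamilies,         // CFs the filter needs
            Predicate<Map<String, String>> filter) {
        List<Map<String, String>> results = new ArrayList<>();
        for (Map<String, String> row : region) {
            // Phase 1: read only the essential CFs (cheap, e.g. 'flags').
            Map<String, String> essential = new HashMap<>();
            for (String cf : essentialFamilies) {
                if (row.containsKey(cf)) {
                    essential.put(cf, row.get(cf));
                }
            }
            if (!filter.test(essential)) {
                continue; // the expensive CFs (e.g. 'snap') were never touched
            }
            // Phase 2: the filter accepted the row; now load everything.
            results.add(row);
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> r1 = new HashMap<>();
        r1.put("flags", "keep");
        r1.put("snap", "huge-blob-1");
        Map<String, String> r2 = new HashMap<>();
        r2.put("flags", "skip");
        r2.put("snap", "huge-blob-2");
        System.out.println(scan(List.of(r1, r2), List.of("flags"),
                row -> "keep".equals(row.get("flags")))); // only r1 survives
    }
}
{code}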
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Hadoop Flags: (was: Reviewed) Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: Filtered_scans_v5.1.patch Fixed issues from an incorrect rebase and applied the changes suggested in the first review.
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: Filtered_scans_v5.patch Fixed issues with limits in the next() call.
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13282497#comment-13282497 ] Max Lapan commented on HBASE-5416: -- After a long delay, I decided to return to this optimization. We have had this patch on our production system (300 TB of HBase data, 160 nodes) for the last two months without issues. Tests of the 2-phase approach demonstrated a much smaller performance improvement than this patch: only a 2x speedup vs. nearly 20x. I extended the tests, but I don't feel experienced enough to implement the concurrent, multithreaded test as suggested, sorry.
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: (was: Filtered_scans_v5.patch)
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: Filtered_scans_v5.patch
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: Filtered_scans_v5.patch
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: (was: Filtered_scans_v5.patch)
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13282528#comment-13282528 ] Max Lapan commented on HBASE-5416: -- Additional code handles the case when InternalScanner::next is called with limit != -1. In this case, we must remember which KeyValueHeap we were populating when the limit was reached and resume that population on the next call of the method. I also added a test case for this situation. A minimal self-contained sketch of that resume-on-limit bookkeeping follows.
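All names in the sketch are invented; the real patch tracks a KeyValueHeap inside HRegion, while this only models the saved-position idea.
{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch: remember the populate position when the limit interrupts a row,
// and resume from it on the following next() call instead of restarting.
final class LimitedPopulationSketch<T> {
    private Iterator<T> inFlight; // non-null while a row is partially delivered

    /** Copies up to 'limit' items into 'out'; returns true if more remain. */
    boolean next(List<T> out, int limit, Iterable<T> joinedData) {
        if (this.inFlight == null) {
            this.inFlight = joinedData.iterator(); // fresh row: start populating
        }
        while (this.inFlight.hasNext()) {
            if (limit != -1 && out.size() >= limit) {
                return true;                       // limit hit: keep position
            }
            out.add(this.inFlight.next());
        }
        this.inFlight = null;                      // row fully delivered
        return false;
    }

    public static void main(String[] args) {
        LimitedPopulationSketch<Integer> s = new LimitedPopulationSketch<>();
        List<Integer> row = List.of(1, 2, 3, 4, 5);
        List<Integer> out = new ArrayList<>();
        boolean more = s.next(out, 3, row);        // delivers 1,2,3 then stops
        out.clear();
        more = s.next(out, 3, row);                // resumes at 4, not at 1
        System.out.println(out + " more=" + more); // prints "[4, 5] more=false"
    }
}
{code}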
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13282539#comment-13282539 ] Max Lapan commented on HBASE-5416: -- I tried to post it there, but I constantly get an Internal Server Error.
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13282543#comment-13282543 ] Max Lapan commented on HBASE-5416: -- Ahhh, I'm stupid, it works with the hbase-git repository. Posted https://reviews.apache.org/r/5225/