[jira] [Commented] (HBASE-7337) SingleColumnValueFilter seems to get unavailble data
[ https://issues.apache.org/jira/browse/HBASE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564032#comment-13564032 ]

Anoop Sam John commented on HBASE-7337:
---------------------------------------

[~zhou wen jian]
When you scan a table that holds multiple versions of a cell, the Scan object specifies which versions the scan should return. By default HBase returns only the latest version, but the Scan object has setters that let you ask for more versions.

SCVF is used to specify a column value check. If the condition is not satisfied for a row, that row is filtered out of the returned results entirely. With latestVersionOnly = true, you are asking for the condition to be checked only against the latest version. [This does *not* mean that only the latest version is returned in the result.] If it is set to false, all the versions are checked against the condition, and if the value of any version satisfies it, the row is included.

But remember that SCVF cannot specify whether only the latest version of a cell is returned. It only specifies the condition, and the filter sees all the versions of the cells. Which versions are returned is decided further down the line, in another part of the code that runs after this Filter#filterKeyValue(KeyValue).

"SingleColumnValueFilter seems to get unavailble data" - Your heading says it is getting unavailable data. Can you tell us more? Or is your problem that you are getting the older versions? Based on your reply we can check whether there is a real bug or not. If there is no issue, we can close this.

SingleColumnValueFilter seems to get unavailble data

Key: HBASE-7337
URL: https://issues.apache.org/jira/browse/HBASE-7337
Project: HBase
Issue Type: Bug
Components: Filters
Affects Versions: 0.94.3, 0.96.0
Environment: 0.94
Reporter: Zhou wenjian
Assignee: Zhou wenjian
Fix For: 0.96.0, 0.94.6

put multi versions of a row.
r1 cf:q version:1 value:1
r1 cf:q version:2 value:3
r1 cf:q version:3 value:2
the filter in scan is set as below:
SingleColumnValueFilter valueF = new SingleColumnValueFilter(
    family, qualifier, CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes(2)));
then i found all of the three versions will be emitted, then i set latestVersionOnly to false, the result does not change.
{code}
public ReturnCode filterKeyValue(KeyValue keyValue) {
  // System.out.println("REMOVE KEY=" + keyValue.toString() + ", value=" + Bytes.toString(keyValue.getValue()));
  if (this.matchedColumn) {
    // We already found and matched the single column, all keys now pass
    return ReturnCode.INCLUDE;
  } else if (this.latestVersionOnly && this.foundColumn) {
    // We found but did not match the single column, skip to next row
    return ReturnCode.NEXT_ROW;
  }
  if (!keyValue.matchingColumn(this.columnFamily, this.columnQualifier)) {
    return ReturnCode.INCLUDE;
  }
  foundColumn = true;
  if (filterColumnValue(keyValue.getBuffer(),
      keyValue.getValueOffset(), keyValue.getValueLength())) {
    return this.latestVersionOnly ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
  }
  this.matchedColumn = true;
  return ReturnCode.INCLUDE;
}
{code}
From the code above, it seems that version 3 will be emitted first and will set matchedColumn to true, which leads to the following versions 2 and 1 being emitted too.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
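To make the report above easier to follow, here is a minimal, self-contained sketch that replays the quoted filterKeyValue logic against the three reported versions. It is a simplified model and not the HBase class: `ScvfTrace`, `Cell`, and `trace` are hypothetical names, values are plain ints compared for equality, only cells of the filtered column are fed in (so the matchingColumn check is omitted), and cells are assumed to arrive newest version first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScvfTrace {
    enum ReturnCode { INCLUDE, NEXT_ROW }

    // Hypothetical stand-in for a KeyValue: one version of r1 cf:q.
    static final class Cell {
        final long version;
        final int value;
        Cell(long version, int value) { this.version = version; this.value = value; }
    }

    private final int expected;
    private final boolean latestVersionOnly;
    private boolean foundColumn = false;
    private boolean matchedColumn = false;

    ScvfTrace(int expected, boolean latestVersionOnly) {
        this.expected = expected;
        this.latestVersionOnly = latestVersionOnly;
    }

    // Same branch structure as the quoted method, minus the column check.
    ReturnCode filterKeyValue(Cell cell) {
        if (matchedColumn) {
            return ReturnCode.INCLUDE;            // already matched: everything passes
        } else if (latestVersionOnly && foundColumn) {
            return ReturnCode.NEXT_ROW;           // latest version seen and it did not match
        }
        foundColumn = true;
        if (cell.value != expected) {             // plays the role of filterColumnValue(...)
            return latestVersionOnly ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
        }
        matchedColumn = true;
        return ReturnCode.INCLUDE;
    }

    // Feeds the reported data (v3=2, v2=3, v1=1) to a filter looking for value 2.
    static List<ReturnCode> trace(boolean latestVersionOnly) {
        List<Cell> newestFirst = Arrays.asList(new Cell(3, 2), new Cell(2, 3), new Cell(1, 1));
        ScvfTrace filter = new ScvfTrace(2, latestVersionOnly);
        List<ReturnCode> codes = new ArrayList<>();
        for (Cell cell : newestFirst) {
            ReturnCode rc = filter.filterKeyValue(cell);
            codes.add(rc);
            if (rc == ReturnCode.NEXT_ROW) {
                break;                            // the rest of the row is skipped
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println("latestVersionOnly=true:  " + trace(true));
        System.out.println("latestVersionOnly=false: " + trace(false));
    }
}
```

Under these assumptions both settings yield INCLUDE for all three cells, because the newest version already matches. That agrees with the observation in the report that changing latestVersionOnly makes no difference for this data.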
[jira] [Commented] (HBASE-7337) SingleColumnValueFilter seems to get unavailble data
[ https://issues.apache.org/jira/browse/HBASE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532042#comment-13532042 ]

Anoop Sam John commented on HBASE-7337:
---------------------------------------

bq. then i set latestVersionOnly to false, the result does no change.
latestVersionOnly in SCVF does not determine what the Scan returns. It specifies which values of a row's column are checked when multiple versions of the cell are available. When latestVersionOnly = true, only the latest version's value is checked, so in your case the row is selected only if the latest version's value is 2. When latestVersionOnly = false, all the versions are checked, and the row is selected if the value of any version is 2.

BTW, what does the Scan object look like? I guess you have set the Scan to get all the versions. SCVF does not decide how many versions of a cell are sent back to the client. It just checks the cell value against the specified value.
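The row-selection rule described in this comment can be sketched in a few lines. This is only an illustration of the stated semantics, not HBase code: `LatestVersionOnlyDemo` and `rowSelected` are hypothetical names, cell values are plain ints listed newest first, and the case where the column is missing from the row is simply treated as "not selected".

```java
public class LatestVersionOnlyDemo {
    // Returns true if the row would be selected by an EQUAL check against `expected`.
    static boolean rowSelected(int[] valuesNewestFirst, int expected, boolean latestVersionOnly) {
        for (int value : valuesNewestFirst) {
            if (value == expected) {
                return true;      // a checked version satisfies the condition
            }
            if (latestVersionOnly) {
                return false;     // only the latest version is consulted, and it did not match
            }
        }
        return false;             // no version matched (or the column is absent)
    }

    public static void main(String[] args) {
        int[] latestIsThree = {3, 2};   // latest value 3, older value 2
        System.out.println(rowSelected(latestIsThree, 2, true));   // false
        System.out.println(rowSelected(latestIsThree, 2, false));  // true

        int[] reported = {2, 3, 1};     // data from the issue: latest value is 2
        System.out.println(rowSelected(reported, 2, true));        // true
        System.out.println(rowSelected(reported, 2, false));       // true
    }
}
```

The flag only changes the outcome when the latest version does not match but an older one does. With the data from the issue, the latest version already holds 2, so the row is selected either way.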
[jira] [Commented] (HBASE-7337) SingleColumnValueFilter seems to get unavailble data
[ https://issues.apache.org/jira/browse/HBASE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532066#comment-13532066 ]

Zhou wenjian commented on HBASE-7337:
-------------------------------------

[~anoopsamjohn]
Actually, I did get all the versions. Do you mean that when using SCVF it is not allowed to set versions in the Scan, because the Scan would then return data that goes against the SCVF?
[jira] [Commented] (HBASE-7337) SingleColumnValueFilter seems to get unavailble data
[ https://issues.apache.org/jira/browse/HBASE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532082#comment-13532082 ]

Anoop Sam John commented on HBASE-7337:
---------------------------------------

In the Scan you can set how many versions you need: all the versions, or only the latest version. This is irrespective of whether SCVF or any other filter is set on the Scan. In fact, the filter gets to see all the versions (in filterKeyValue).
[jira] [Commented] (HBASE-7337) SingleColumnValueFilter seems to get unavailble data
[ https://issues.apache.org/jira/browse/HBASE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529746#comment-13529746 ]

ramkrishna.s.vasudevan commented on HBASE-7337:
-----------------------------------------------

Did you check your values? For example, are the inserted values Strings, and is the value you are querying for also a String? Just to verify...
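The question about value types matters because the comparison is done on raw bytes. The sketch below shows, using only the JDK, how an int 2 and a String "2" encode differently. `ValueEncodingDemo`, `intBytes`, and `stringBytes` are hypothetical helpers; they are assumed to mirror what Bytes.toBytes(int) (4 bytes, big-endian) and Bytes.toBytes(String) (UTF-8) produce, and the HBase utility itself is not used here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ValueEncodingDemo {
    // 4-byte big-endian encoding of an int (ByteBuffer is big-endian by default).
    static byte[] intBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // UTF-8 encoding of a String.
    static byte[] stringBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(intBytes(2)));       // [0, 0, 0, 2]
        System.out.println(Arrays.toString(stringBytes("2")));  // [50]
        // A byte-wise EQUAL check between the two encodings fails.
        System.out.println(Arrays.equals(intBytes(2), stringBytes("2")));  // false
    }
}
```

If the cells had been written as Strings while the comparator was built from an int, as in the `Bytes.toBytes(2)` call quoted in the issue, an EQUAL check would not match any version. That is why it is worth confirming that both sides use the same encoding.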
[jira] [Commented] (HBASE-7337) SingleColumnValueFilter seems to get unavailble data
[ https://issues.apache.org/jira/browse/HBASE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529773#comment-13529773 ]

Zhou wenjian commented on HBASE-7337:
-------------------------------------

They are both Strings.