[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469906#comment-13469906 ]

Hudson commented on HBASE-6757:
-------------------------------

Integrated in HBase-0.94-security-on-Hadoop-23 #8 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/8/])
HBASE-6757 Very inefficient behaviour of scan using FilterList (Revision 1383749)

Result = FAILURE
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java

Very inefficient behaviour of scan using FilterList
---------------------------------------------------

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Bug
Components: Filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Assignee: Lars Hofhansl
Fix For: 0.94.2, 0.96.0
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java

The behaviour of scan is very inefficient when used with FilterList. The FilterList rewrites the return code from a filter from NEXT_ROW to SKIP if Operator.MUST_PASS_ALL is used. This happens when using ColumnPrefixFilter. Even though the ColumnPrefixFilter indicates to jump to NEXT_ROW because no further match can be found, the scan continues to scan all versions of a column in that row and all columns of that row, because the ReturnCode from ColumnPrefixFilter has been rewritten by the FilterList from NEXT_ROW to SKIP.

This is particularly inefficient when there are many versions in a column, because the check is performed on all versions of the column instead of just checking the qualifier of the column name.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
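The cost described in the issue can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ReturnCode`, `CellFilter`, `QualifierPrefixFilter`, `MustPassAllList` and `RowScanSim` are hypothetical stand-ins written for this illustration, not the HBase classes, and a row is modelled as a sorted list of qualifiers in which each repeated entry is one more version of that column.

```java
import java.util.List;

// Hypothetical, simplified stand-in for HBase's filter return codes.
enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

// Hypothetical, simplified stand-in for a per-cell filter.
interface CellFilter {
    ReturnCode filterCell(String qualifier);
}

// Models a column prefix check over sorted qualifiers.
final class QualifierPrefixFilter implements CellFilter {
    private final String prefix;

    QualifierPrefixFilter(String prefix) { this.prefix = prefix; }

    @Override
    public ReturnCode filterCell(String qualifier) {
        if (qualifier.startsWith(prefix)) {
            return ReturnCode.INCLUDE;
        }
        // Qualifiers are sorted, so once we are past the prefix nothing later can match.
        return qualifier.compareTo(prefix) > 0 ? ReturnCode.NEXT_ROW : ReturnCode.SKIP;
    }
}

// Models a MUST_PASS_ALL list. rewriteToSkip = true mimics the reported behaviour,
// false mimics passing the inner filter's code through unchanged.
final class MustPassAllList implements CellFilter {
    private final List<CellFilter> filters;
    private final boolean rewriteToSkip;

    MustPassAllList(List<CellFilter> filters, boolean rewriteToSkip) {
        this.filters = filters;
        this.rewriteToSkip = rewriteToSkip;
    }

    @Override
    public ReturnCode filterCell(String qualifier) {
        for (CellFilter filter : filters) {
            ReturnCode code = filter.filterCell(qualifier);
            if (code != ReturnCode.INCLUDE) {
                return rewriteToSkip ? ReturnCode.SKIP : code;
            }
        }
        return ReturnCode.INCLUDE;
    }
}

final class RowScanSim {
    // Counts how many cells of one row the filter is asked about before the scan moves on.
    static int cellsExamined(List<String> sortedCells, CellFilter filter) {
        int examined = 0;
        for (String qualifier : sortedCells) {
            examined++;
            if (filter.filterCell(qualifier) == ReturnCode.NEXT_ROW) {
                break; // the rest of the row is abandoned
            }
        }
        return examined;
    }
}
```

For a row with two versions of `a1`, three of `b1` and two of `c1`, and the prefix `a`, the pass-through variant stops at the first `b1` cell, while the rewriting variant is asked about every cell in the row.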
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455034#comment-13455034 ]

Hudson commented on HBASE-6757:
-------------------------------

Integrated in HBase-0.94-security #52 (See [https://builds.apache.org/job/HBase-0.94-security/52/])
HBASE-6757 Very inefficient behaviour of scan using FilterList (Revision 1383749)

Result = SUCCESS
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Assignee: Lars Hofhansl
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453918#comment-13453918 ]

Hudson commented on HBASE-6757:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #169 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/169/])
HBASE-6757 Very inefficient behaviour of scan using FilterList (Revision 1383748)

Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Assignee: Lars Hofhansl
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453068#comment-13453068 ]

Lars Hofhansl commented on HBASE-6757:
--------------------------------------

Interesting. I just looked at the code and I agree this is wrong.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453077#comment-13453077 ]

Lars Hofhansl commented on HBASE-6757:
--------------------------------------

There are more inefficiencies in there too. For example, for MUST_PASS_ONE we can break out of the outer loop as soon as we find one filter indicating INCLUDE.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Attachments: CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
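The MUST_PASS_ONE short-circuit mentioned in this comment might look roughly like the following. It is a sketch only: `ReturnCode`, `CellFilter`, `CountingFilter` and `MustPassOneList` are hypothetical names for this illustration, not the HBase types, and the model assumes stateless filters, so it says nothing about filters that need to see every cell to keep their own state.

```java
import java.util.List;

// Hypothetical, simplified stand-in for HBase's filter return codes.
enum ReturnCode { INCLUDE, SKIP }

// Hypothetical, simplified stand-in for a per-cell filter.
interface CellFilter {
    ReturnCode filterCell(String qualifier);
}

// Test helper: always gives the same answer and records how often it was consulted.
final class CountingFilter implements CellFilter {
    private final ReturnCode answer;
    int calls = 0;

    CountingFilter(ReturnCode answer) { this.answer = answer; }

    @Override
    public ReturnCode filterCell(String qualifier) {
        calls++;
        return answer;
    }
}

// Models a MUST_PASS_ONE list that stops at the first INCLUDE.
final class MustPassOneList implements CellFilter {
    private final List<? extends CellFilter> filters;

    MustPassOneList(List<? extends CellFilter> filters) { this.filters = filters; }

    @Override
    public ReturnCode filterCell(String qualifier) {
        for (CellFilter filter : filters) {
            if (filter.filterCell(qualifier) == ReturnCode.INCLUDE) {
                // One passing filter is enough; the remaining filters are not consulted.
                return ReturnCode.INCLUDE;
            }
        }
        return ReturnCode.SKIP;
    }
}
```

With three filters answering SKIP, INCLUDE, INCLUDE in that order, the list returns INCLUDE after consulting only the first two.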
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453104#comment-13453104 ]

Jerry Lam commented on HBASE-6757:
----------------------------------

Hi Lars:

Thanks for looking into it! I have a little concern, not about the change but about existing code that depends on the SKIP return code. There might be some users with a complex FilterList (a filterlist of a filterlist of a filterlist, etc.) that depends on the SKIP behaviour. The users might not be aware of it. What do you think?

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453138#comment-13453138 ]

Lars Hofhansl commented on HBASE-6757:
--------------------------------------

As long as FilterList does the right thing it should be transparent, right?

Now, if somebody wrote their own filter that wraps a FilterList, then the behavior might change slightly. Then again, FilterList.filterKeyValue can already return other codes (NEXT_COL, SEEK_NEXT_USING_HINT, etc.), which the wrapper would have to deal with.

So I think the change is fine. I'll let some of my fellow committers comment on this too.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
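A wrapper that stays correct regardless of which code the wrapped list returns could be sketched as below. The types are hypothetical stand-ins for this illustration (`ReturnCode`, `CellFilter`, `IncludeCountingWrapper` are not HBase classes); the only point is that the wrapper acts on INCLUDE and passes every other code through untouched instead of assuming the answer is always INCLUDE or SKIP.

```java
// Hypothetical, simplified stand-in for HBase's filter return codes.
enum ReturnCode { INCLUDE, SKIP, NEXT_COL, NEXT_ROW, SEEK_NEXT_USING_HINT }

// Hypothetical, simplified stand-in for a per-cell filter.
interface CellFilter {
    ReturnCode filterCell(String qualifier);
}

// Wraps another filter (for example a filter list) and counts the cells it lets through.
final class IncludeCountingWrapper implements CellFilter {
    private final CellFilter inner;
    int included = 0;

    IncludeCountingWrapper(CellFilter inner) { this.inner = inner; }

    @Override
    public ReturnCode filterCell(String qualifier) {
        ReturnCode code = inner.filterCell(qualifier);
        if (code == ReturnCode.INCLUDE) {
            included++;
        }
        // Do not collapse the other codes into SKIP: NEXT_COL, NEXT_ROW and seek hints
        // carry information the scan can use to jump ahead.
        return code;
    }
}
```

A wrapper written this way is unaffected by whether the inner list answers SKIP or NEXT_ROW for a non-matching cell.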
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453142#comment-13453142 ]

Ted Yu commented on HBASE-6757:
-------------------------------

I think Jerry has a point.

The fix should be fine for 0.96. For 0.94, we should take a cautious stance.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453160#comment-13453160 ]

Lars Hofhansl commented on HBASE-6757:
--------------------------------------

The case Jerry mentions is not a problem. A filterlist of a filterlist of a filterlist, etc. will work fine.

Also see this code in FilterList.filterKeyValue:
{code}
if (operator == Operator.MUST_PASS_ALL) {
  if (filter.filterAllRemaining()) {
    return ReturnCode.NEXT_ROW;
  }
  // excerpt ends here; the rest of the enclosing block is not shown
{code}

So it is already possible now that FilterList.filterKeyValue returns NEXT_ROW, and hence calling code must deal with it.

Definitely fine for 0.96, but I'm cool with this in 0.94 as well.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
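To make the quoted excerpt concrete, here is a standalone model of a MUST_PASS_ALL list that answers NEXT_ROW as soon as one of its filters reports that everything remaining is filtered. All names are hypothetical stand-ins for this illustration (`ReturnCode`, `CellFilter`, `LimitFilter`, `MustPassAllList`), and `LimitFilter` is an invented filter that gives up after a fixed number of cells; none of this is the HBase implementation.

```java
import java.util.List;

// Hypothetical, simplified stand-in for HBase's filter return codes.
enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

// Hypothetical, simplified stand-in for a filter with a "nothing more will pass" signal.
interface CellFilter {
    ReturnCode filterCell(String qualifier);
    boolean filterAllRemaining();
}

// Invented for this sketch: includes cells until a limit is reached, then reports it is done.
final class LimitFilter implements CellFilter {
    private final int limit;
    private int seen = 0;

    LimitFilter(int limit) { this.limit = limit; }

    @Override
    public ReturnCode filterCell(String qualifier) {
        seen++;
        return ReturnCode.INCLUDE;
    }

    @Override
    public boolean filterAllRemaining() {
        return seen >= limit;
    }
}

// Models the MUST_PASS_ALL branch: check filterAllRemaining first, then ask the filter.
final class MustPassAllList {
    private final List<? extends CellFilter> filters;

    MustPassAllList(List<? extends CellFilter> filters) { this.filters = filters; }

    ReturnCode filterCell(String qualifier) {
        for (CellFilter filter : filters) {
            if (filter.filterAllRemaining()) {
                return ReturnCode.NEXT_ROW; // callers already have to handle this code
            }
            ReturnCode code = filter.filterCell(qualifier);
            if (code != ReturnCode.INCLUDE) {
                return code;
            }
        }
        return ReturnCode.INCLUDE;
    }
}
```

With a limit of two, the first two cells are included and the third call returns NEXT_ROW, which is the situation the comment says calling code must already cope with.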
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453203#comment-13453203 ]

Andrew Purtell commented on HBASE-6757:
---------------------------------------

+1 for 0.94 as well. The current behavior is wrong.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453205#comment-13453205 ]

Ted Yu commented on HBASE-6757:
-------------------------------

Then this JIRA should be a bug.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453209#comment-13453209 ]

Andrew Purtell commented on HBASE-6757:
---------------------------------------

@Ted, so then update the jira metadata if you'd like. And per recent discussion on the dev list, I believe the consensus is that the conservative approach is appropriate for 0.92 and earlier, FYI.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453231#comment-13453231 ]

Lars Hofhansl commented on HBASE-6757:
--------------------------------------

Generally I am fine with Improvements and even (targeted, small) New Features in 0.94, simply because it will be a while before 0.96 can be considered stable.

0.92 is different, since it is a maintenance release and there is a clean, safe upgrade path to 0.94 (well, at least once HBASE-6710 is in).

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Improvement
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453676#comment-13453676 ]

Hadoop QA commented on HBASE-6757:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12544647/6757.txt
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    -1 javac. The patch appears to cause mvn compile goal to fail.

    -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2849//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2849//console

This message is automatically generated.

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Assignee: Lars Hofhansl
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453707#comment-13453707 ]

Hudson commented on HBASE-6757:
-------------------------------

Integrated in HBase-TRUNK #3321 (See [https://builds.apache.org/job/HBase-TRUNK/3321/])
HBASE-6757 Very inefficient behaviour of scan using FilterList (Revision 1383748)

Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Assignee: Lars Hofhansl
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java
[jira] [Commented] (HBASE-6757) Very inefficient behaviour of scan using FilterList
[ https://issues.apache.org/jira/browse/HBASE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453710#comment-13453710 ]

Hudson commented on HBASE-6757:
-------------------------------

Integrated in HBase-0.94 #463 (See [https://builds.apache.org/job/HBase-0.94/463/])
HBASE-6757 Very inefficient behaviour of scan using FilterList (Revision 1383749)

Result = SUCCESS
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java

Key: HBASE-6757
URL: https://issues.apache.org/jira/browse/HBASE-6757
Project: HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.90.6
Reporter: Jerry Lam
Assignee: Lars Hofhansl
Fix For: 0.96.0, 0.94.2
Attachments: 6757.txt, CopyOfTestColumnPrefixFilter.java, DisplayFilter.java