[jira] [Commented] (PHOENIX-1334) Issue when LIKE expression contains Chinese characters on Key column
[ https://issues.apache.org/jira/browse/PHOENIX-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164878#comment-14164878 ] jay wong commented on PHOENIX-1334: --- [~wuyang630] The reason is that a Chinese character encodes to more than one byte. As a workaround, maybe you can use: where VAL like '%中%';
Issue when LIKE expression contains Chinese characters on Key column
Key: PHOENIX-1334 URL: https://issues.apache.org/jira/browse/PHOENIX-1334 Project: Phoenix Issue Type: Bug Affects Versions: 4.1 Environment: jdk 1.8 linux Reporter: wuyang
When I use a LIKE expression in a SELECT query, it works well when I put *Chinese* characters in a LIKE expression on NON-PRIMARY-KEY columns. But when I put them in a LIKE expression on a *PRIMARY KEY* column, it throws an exception:
select * from test3 where PK like '中%';
||COLUMN_NAME||DATA_TYPE||TYPE_NAME||
|PK|12|VARCHAR|
|VAL|12|VARCHAR|
{quote}
org.apache.phoenix.schema.IllegalDataException: CHAR types may only contain single byte characters (中)
at org.apache.phoenix.schema.PDataType$2.toBytes(PDataType.java:216)
at org.apache.phoenix.compile.WhereOptimizer$KeyExpressionVisitor.visitLeave(WhereOptimizer.java:829)
at org.apache.phoenix.compile.WhereOptimizer$KeyExpressionVisitor.visitLeave(WhereOptimizer.java:349)
at org.apache.phoenix.expression.LikeExpression.accept(LikeExpression.java:269)
at
{quote}
The types of the PRIMARY KEY column and the non-primary-key column are both VARCHAR. The relevant source code:
{code}
byte[] b = VARCHAR.toBytes(object);
if (b.length != ((String) object).length()) {
    throw new IllegalDataException("CHAR types may only contain single byte characters (" + object + ")");
}
{code}
Actually, Chinese (or other non-Latin) characters will never satisfy the condition b.length == ((String) object).length(), since the default encoding is UTF-8.
*Use the following statements to reproduce:*
create table test_c ( pk varchar primary key , val varchar);
upsert into test_c values ('中文','中文');
select * from test_c where VAL like '中%'; _// works well_
select * from test_c where PK like '中%'; _// oops..._
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
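For illustration, here is a minimal plain-Java sketch (not Phoenix code; the class and method names are made up) of the single-byte invariant that fails here: a string passes only when its UTF-8 byte length equals its character count, which CJK characters can never satisfy, since each encodes to three bytes in UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class CharWidthCheck {
    // The CHAR-type invariant: every character must encode to a single byte.
    public static boolean isSingleByte(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length == s.length();
    }

    public static void main(String[] args) {
        System.out.println(isSingleByte("abc"));  // true: 3 chars, 3 bytes
        System.out.println(isSingleByte("中文")); // false: 2 chars, 6 bytes
    }
}
```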
[jira] [Updated] (PHOENIX-1267) Set scan.setSmall(true) when appropriate
[ https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1267: -- Attachment: smallscan2.patch
Set scan.setSmall(true) when appropriate
Key: PHOENIX-1267 URL: https://issues.apache.org/jira/browse/PHOENIX-1267 Project: Phoenix Issue Type: Bug Reporter: James Taylor Assignee: jay wong Attachments: smallscan.patch, smallscan2.patch
There's a nice optimization that has been in HBase for a while now to set a scan as small. This prevents extra RPC calls, I believe. We should add a hint for queries that forces it to be set/not set, and make our best guess on when it should default to true.
[jira] [Commented] (PHOENIX-1267) Set scan.setSmall(true) when appropriate
[ https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163236#comment-14163236 ] jay wong commented on PHOENIX-1267: --- [~jamestaylor] The isScanForbidden logic is not about TopN; it is about ORDER BY. But I found that an ORDER BY query is flagged as TopN. For example, suppose a small scan fetches 100 rows per page and we get a result like:
|key|col1|
|1|2|
|2|1|
|...|...|
|100|55|
|2|101|
The next page starts from the row whose rowkey is greater than the last rowkey of the previous page, so the scan can fall into an infinite loop.
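To make the failure mode concrete, here is a hypothetical, self-contained model (illustrative only, not Phoenix code) of resuming a paged scan from the last returned rowkey when each page has been re-sorted by a value column. The last row of a value-sorted page is generally not the page's maximum key, so the resume point moves backwards relative to key order and already-returned keys get fetched again:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderByPagingSketch {
    // table rows are {key, value}, stored in key order.
    // Each page: next pageSize rows with key > bound, then re-sorted by value
    // (modeling ORDER BY); resume bound = key of the page's last row.
    public static List<Integer> emittedKeys(int[][] table, int pageSize) {
        List<Integer> emitted = new ArrayList<>();
        int bound = Integer.MIN_VALUE;
        while (true) {
            List<int[]> page = new ArrayList<>();
            for (int[] r : table) {
                if (r[0] > bound) {
                    page.add(r);
                    if (page.size() == pageSize) break;
                }
            }
            if (page.isEmpty()) break;
            page.sort(Comparator.comparingInt((int[] r) -> r[1])); // ORDER BY value
            for (int[] r : page) emitted.add(r[0]);
            bound = page.get(page.size() - 1)[0]; // resume after "last" row
        }
        return emitted;
    }

    public static void main(String[] args) {
        int[][] table = { {1, 5}, {2, 4}, {3, 3}, {4, 2}, {5, 1} };
        // Several keys are emitted more than once: 12 entries over 5 distinct keys.
        System.out.println(emittedKeys(table, 3));
    }
}
```

In this toy model the bound still advances so the loop eventually stops, but rows are re-fetched and duplicated; under the re-query pattern described in the comment the scan can fail to make progress at all.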
[jira] [Commented] (PHOENIX-1267) Set scan.setSmall(true) when appropriate
[ https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163246#comment-14163246 ] jay wong commented on PHOENIX-1267: --- [~jamestaylor] About joins: I think a join query is almost always a big query over most of the table, and a small scan performs badly on big queries. So I set smallScan to false for joins.
[jira] [Assigned] (PHOENIX-1267) Set scan.setSmall(true) when appropriate
[ https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong reassigned PHOENIX-1267: - Assignee: jay wong
[jira] [Commented] (PHOENIX-1267) Set scan.setSmall(true) when appropriate
[ https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163003#comment-14163003 ] jay wong commented on PHOENIX-1267: --- I was on holiday for the past several days, so sorry for the late reply. I see what you mean: a hint is normally more structured and a better approach, and using a hint to control the small scan is a good point. The small scan will be set to true by default when both the start key and the stop key are set, but if we have an ORDER BY query and small is true, the result will be an infinite loop. So the small scan is not only a query optimization for the user; it can cause a bug. That is why I think the smallScanForbidden check is still needed.
[jira] [Updated] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1258: -- Attachment: regex_substr_func_group3.patch
RegexpSubstrFunction is not useable with group by query
Key: PHOENIX-1258 URL: https://issues.apache.org/jira/browse/PHOENIX-1258 Project: Phoenix Issue Type: Bug Affects Versions: 3.1 Reporter: jay wong Assignee: jay wong Fix For: 3.1 Attachments: regex_substr_func_group.patch, regex_substr_func_group2.patch, regex_substr_func_group3.patch
0: jdbc:phoenix:10.232.98.99:/hbase-phoenix-l select gmt subgmt from test_salt1 limit 2;
| SUBGMT |
| asdfasf?fas |
| basdfasf?fas |
2 rows selected (0.071 seconds)
0: jdbc:phoenix:10.232.98.99:/hbase-phoenix-l select REGEXP_SUBSTR(gmt, '[^\\?]+') subgmt from test_salt1 limit 2;
| SUBGMT |
| asdfasf |
| basdfasf |
2 rows selected (0.07 seconds)
0: jdbc:phoenix:10.232.98.99:/hbase-phoenix-l select REGEXP_SUBSTR(gmt, '[^\\?]+') subgmt,sum(int_A) from test_salt1 group by subgmt;
| SUBGMT | SUM(A.INT_A) |
| null | 17498 |
1 row selected (0.082 seconds)
The GROUP BY result is not as expected. The expected result is:
|SUBGMT |SUM(A.INT_A)|
| asdfasf | 8342 |
| basdfasf | 5449 |
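For reference, a plain-Java model (class and method names are made up for illustration) of what REGEXP_SUBSTR(gmt, '[^\\?]+') is expected to return: the first substring matching the pattern, i.e. everything before the first '?'. Note the SQL literal escapes the '?' as '\\?', while in a Java character class no escape is needed:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpSubstrSketch {
    // Return the first substring of input matching pattern, or null if none.
    public static String regexpSubstr(String input, String pattern) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(regexpSubstr("asdfasf?fas", "[^?]+"));  // asdfasf
        System.out.println(regexpSubstr("basdfasf?fas", "[^?]+")); // basdfasf
    }
}
```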
[jira] [Commented] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138615#comment-14138615 ] jay wong commented on PHOENIX-1258: --- [~samarthjain] Of course. Thanks for the review and the suggestions.
[jira] [Updated] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1258: -- Attachment: regex_substr_func.patch
[jira] [Commented] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139908#comment-14139908 ] jay wong commented on PHOENIX-1258: --- [~gabriel.reid] I see what you mean, and I agree with it. Thanks for the review and the excellent suggestions.
[jira] [Commented] (PHOENIX-1252) Potential lock leak for ungrouped aggregation
[ https://issues.apache.org/jira/browse/PHOENIX-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137007#comment-14137007 ] jay wong commented on PHOENIX-1252: --- Thanks for the review and commit, [~jamestaylor].
Potential lock leak for ungrouped aggregation
Key: PHOENIX-1252 URL: https://issues.apache.org/jira/browse/PHOENIX-1252 Project: Phoenix Issue Type: Bug Affects Versions: 3.1 Reporter: jay wong Assignee: jay wong Fix For: 4.2, 3.2 Attachments: lock-leak.patch
If innerScanner.close() throws an exception, region.closeRegionOperation() will never execute.
[jira] [Created] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
jay wong created PHOENIX-1258: - Summary: RegexpSubstrFunction is not useable with group by query Key: PHOENIX-1258 URL: https://issues.apache.org/jira/browse/PHOENIX-1258 Project: Phoenix Issue Type: Bug Affects Versions: 3.1 Reporter: jay wong Assignee: jay wong Fix For: 3.1
[jira] [Updated] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1258: -- Description: (updated; same query transcript as in the issue description above)
[jira] [Updated] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1258: -- Description: (updated; added the expected result, as shown in the issue description above)
[jira] [Commented] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138395#comment-14138395 ] jay wong commented on PHOENIX-1258: --- [~gabriel.reid] Yes, overriding readFields would be a better way. As for why I named the test case GroupByWithFunctionIT: a query that only uses RegexpSubstrFunction returns correct results; the bad case is a query with both GROUP BY and the function, because the expression is serialized into the scan attribute keyed UNORDERED_GROUP_BY_EXPRESSIONS. So I think it is a group-by test.
[jira] [Comment Edited] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138395#comment-14138395 ] jay wong edited comment on PHOENIX-1258 at 9/18/14 2:07 AM: [~gabriel.reid] Yes, overriding readFields would be a better way. As for why I named the test case GroupByWithFunctionIT: a query that only uses RegexpSubstrFunction returns correct results; the bad case is a query with both GROUP BY and the function, because the expression is serialized into the scan attribute keyed UNORDERED_GROUP_BY_EXPRESSIONS. So I think it is a group-by test. Thanks for the review and the suggestions.
[jira] [Comment Edited] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138395#comment-14138395 ] jay wong edited comment on PHOENIX-1258 at 9/18/14 2:11 AM: [~gabriel.reid] Yes, overriding readFields would be a better way. As for why I named the test case GroupByWithFunctionIT: a query that only uses RegexpSubstrFunction returns correct results; the bad case is a query with both GROUP BY and the function, because the expression is serialized into the scan attribute keyed UNORDERED_GROUP_BY_EXPRESSIONS. And **other functions** may have the same problem with GROUP BY. So I think it is a group-by test. Thanks for the review and the suggestions.
[jira] [Updated] (PHOENIX-1258) RegexpSubstrFunction is not useable with group by query
[ https://issues.apache.org/jira/browse/PHOENIX-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1258: -- Attachment: regex_substr_func_group2.patch
[jira] [Updated] (PHOENIX-1252) A lock maybe leak.
[ https://issues.apache.org/jira/browse/PHOENIX-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1252: -- Attachment: lock-leak.patch
A lock maybe leak.
-- Key: PHOENIX-1252 URL: https://issues.apache.org/jira/browse/PHOENIX-1252 Project: Phoenix Issue Type: Bug Affects Versions: 3.1 Reporter: jay wong Fix For: 3.1 Attachments: lock-leak.patch
If innerScanner.close() throws an exception, region.closeRegionOperation() will never execute.
[jira] [Updated] (PHOENIX-1074) ParallelIteratorRegionSplitterFactory get Splits is not rational
[ https://issues.apache.org/jira/browse/PHOENIX-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1074: -- Attachment: SkipScanFilterSaltedIntersectTest.java ParallelIteratorRegionSplitterFactory get Splits is not rational Key: PHOENIX-1074 URL: https://issues.apache.org/jira/browse/PHOENIX-1074 Project: Phoenix Issue Type: Bug Reporter: jay wong Attachments: SkipScanFilterSaltedIntersectTest.java create a table {code} create table if not exists table1( gmt VARCHAR NOT NULL, spm_type VARCHAR NOT NULL, spm VARCHAR NOT NULL, A.int_a INTEGER, B.int_b INTEGER, B.int_c INTEGER CONSTRAINT pk PRIMARY KEY (gmt, spm_type, spm)) SALT_BUCKETS = 4, bloomfilter='ROW'; {code} and made the table 29 partitions as this. |startrow|endrow| | |\x0020140201| |\x0020140201|\x0020140202| |\x0020140202|\x0020140203| |\x0020140203|\x0020140204| |\x0020140204|\x0020140205| |\x0020140205|\x0020140206| |\x0020140206|\x0020140207| |\x0020140207|\x0120140201| |\x0120140201|\x0120140202| |\x0120140202|\x0120140203| |\x0120140203|\x0120140204| |\x0120140204|\x0120140205| |\x0120140205|\x0120140206| |\x0120140206|\x0120140207| |\x0120140207|\x0220140201| |\x0220140201|\x0220140202| |\x0220140202|\x0220140203| |\x0220140203|\x0220140204| |\x0220140204|\x0220140205| |\x0220140205|\x0220140206| |\x0220140206|\x0220140207| |\x0220140207|\x0320140201| |\x0320140201|\x0320140202| |\x0320140202|\x0320140203| |\x0320140203|\x0320140204| |\x0320140204|\x0320140205| |\x0320140205|\x0320140206| |\x0320140206|\x0320140207| |\x0320140207| | Then insert some data; |GMT | SPM_TYPE |SPM | INT_A| INT_B| INT_C | | 20140201 | 1 | 1.2.3.4546 | 218| 218| null | | 20140201 | 1 | 1.2.44545 | 190| 190| null | | 20140201 | 1 | 1.353451312 | 246| 246| null | | 20140201 | 2 | 1.2.3.6775 | 183| 183| null | |...|...|...|...|...|...| | 20140207 | 3 | 1.2.3.4546 | 224| 224| null | | 20140207 | 3 | 1.2.44545 | 196| 196| null | | 20140207 | 3 | 1.353451312 | 168| 168| null | | 
20140207 | 4 | 1.2.3.6775 | 189| 189| null | | 20140207 | 4 | 1.23.345345 | 217| 217| null | | 20140207 | 4 | 1.23234234234 | 245| 245| null |
Then print a log like this:
{code}
public class ParallelIterators extends ExplainTable implements ResultIterators {
    @Override
    public List<PeekingResultIterator> getIterators() throws SQLException {
        boolean success = false;
        final ConnectionQueryServices services = context.getConnection().getQueryServices();
        ReadOnlyProps props = services.getProps();
        int numSplits = splits.size();
        List<PeekingResultIterator> iterators = new ArrayList<PeekingResultIterator>(numSplits);
        List<Pair<byte[], Future<PeekingResultIterator>>> futures = new ArrayList<Pair<byte[], Future<PeekingResultIterator>>>(numSplits);
        final UUID scanId = UUID.randomUUID();
        try {
            ExecutorService executor = services.getExecutor();
            System.out.println("the split size is " + numSplits);
        }
    }
}
{code}
Then execute some SQL:
{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2'
the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207'
the split size is 27
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2' and spm like '1.%'
the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2'
the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204'
the split size is 12
{code}
But I think
{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
{code}
and
{code}
select * from table1 where gmt > '20140202' and gmt < '20140207'
{code}
should produce the same splits. Why don't they?
-- This message was sent by Atlassian JIRA (v6.2#6252)
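As context for the split counts above, here is a minimal sketch (illustrative only; the class name and the printable salt-byte stand-in are assumptions, not Phoenix internals) of why a salted table multiplies one logical key range into one physical range per salt bucket, matching the \x00..\x03 prefixes in the partition table:

```java
public class SaltedRangesSketch {
    // For a table with SALT_BUCKETS = N, one logical key range [start, stop)
    // on the row key is scanned as N physical ranges, one per salt-byte
    // prefix. A printable "\xNN" string stands in for the real salt byte.
    public static String[][] saltedRanges(String start, String stop, int buckets) {
        String[][] ranges = new String[buckets][2];
        for (int b = 0; b < buckets; b++) {
            String prefix = String.format("\\x%02d", b);
            ranges[b][0] = prefix + start; // e.g. \x0020140202
            ranges[b][1] = prefix + stop;  // e.g. \x0020140207
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (String[] r : saltedRanges("20140202", "20140207", 4)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```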
[jira] [Commented] (PHOENIX-1074) ParallelIteratorRegionSplitterFactory get Splits is not rational
[ https://issues.apache.org/jira/browse/PHOENIX-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068585#comment-14068585 ]

jay wong commented on PHOENIX-1074:
-----------------------------------

[~jamestaylor] I have added a test case. The second test case demonstrates the problem. I know the rule is hard to correct, but it is not advisable as it stands.

ParallelIteratorRegionSplitterFactory get Splits is not rational

Key: PHOENIX-1074
URL: https://issues.apache.org/jira/browse/PHOENIX-1074
Project: Phoenix
Issue Type: Bug
Reporter: jay wong
Attachments: SkipScanFilterSaltedIntersectTest.java

Create a table:

{code}
create table if not exists table1(
    gmt VARCHAR NOT NULL,
    spm_type VARCHAR NOT NULL,
    spm VARCHAR NOT NULL,
    A.int_a INTEGER,
    B.int_b INTEGER,
    B.int_c INTEGER
    CONSTRAINT pk PRIMARY KEY (gmt, spm_type, spm)) SALT_BUCKETS = 4, bloomfilter='ROW';
{code}

and pre-split the table into 29 regions like this:

|startrow|endrow|
| |\x0020140201|
|\x0020140201|\x0020140202|
|\x0020140202|\x0020140203|
|\x0020140203|\x0020140204|
|\x0020140204|\x0020140205|
|\x0020140205|\x0020140206|
|\x0020140206|\x0020140207|
|\x0020140207|\x0120140201|
|\x0120140201|\x0120140202|
|\x0120140202|\x0120140203|
|\x0120140203|\x0120140204|
|\x0120140204|\x0120140205|
|\x0120140205|\x0120140206|
|\x0120140206|\x0120140207|
|\x0120140207|\x0220140201|
|\x0220140201|\x0220140202|
|\x0220140202|\x0220140203|
|\x0220140203|\x0220140204|
|\x0220140204|\x0220140205|
|\x0220140205|\x0220140206|
|\x0220140206|\x0220140207|
|\x0220140207|\x0320140201|
|\x0320140201|\x0320140202|
|\x0320140202|\x0320140203|
|\x0320140203|\x0320140204|
|\x0320140204|\x0320140205|
|\x0320140205|\x0320140206|
|\x0320140206|\x0320140207|
|\x0320140207| |

Then insert some data:

|GMT | SPM_TYPE |SPM | INT_A| INT_B| INT_C |
| 20140201 | 1 | 1.2.3.4546 | 218| 218| null |
| 20140201 | 1 | 1.2.44545 | 190| 190| null |
| 20140201 | 1 | 1.353451312 | 246| 246| null |
| 20140201 | 2 | 1.2.3.6775 | 183| 183| null |
|...|...|...|...|...|...|
| 20140207 | 3 | 1.2.3.4546 | 224| 224| null |
| 20140207 | 3 | 1.2.44545 | 196| 196| null |
| 20140207 | 3 | 1.353451312 | 168| 168| null |
| 20140207 | 4 | 1.2.3.6775 | 189| 189| null |
| 20140207 | 4 | 1.23.345345 | 217| 217| null |
| 20140207 | 4 | 1.23234234234 | 245| 245| null |

Print a log like this:

{code}
public class ParallelIterators extends ExplainTable implements ResultIterators {
    @Override
    public List<PeekingResultIterator> getIterators() throws SQLException {
        boolean success = false;
        final ConnectionQueryServices services = context.getConnection().getQueryServices();
        ReadOnlyProps props = services.getProps();
        int numSplits = splits.size();
        List<PeekingResultIterator> iterators = new ArrayList<PeekingResultIterator>(numSplits);
        List<Pair<byte[], Future<PeekingResultIterator>>> futures = new ArrayList<Pair<byte[], Future<PeekingResultIterator>>>(numSplits);
        final UUID scanId = UUID.randomUUID();
        try {
            ExecutorService executor = services.getExecutor();
            System.out.println("the split size is " + numSplits);
        }
    }
}
{code}

Then execute some SQL:

{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
-- the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2'
-- the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207'
-- the split size is 27
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2' and spm like '1.%'
-- the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2'
-- the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204'
-- the split size is 12
{code}

But I think

{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
{code}

and

{code}
select * from table1 where gmt > '20140202' and gmt < '20140207'
{code}

should produce the same splits. Why don't they?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
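For readers unfamiliar with salting: with SALT_BUCKETS = 4, Phoenix prepends a one-byte, hash-derived salt to every row key, which is why the region boundaries above carry \x00 through \x03 prefixes and why a single range scan on gmt intersects regions in every salt bucket. A minimal sketch of the scheme, with an illustrative hash (Phoenix's SaltingUtil uses its own hash function, so the exact bucket assignment here is an assumption):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch: prefix a salt byte in [0, SALT_BUCKETS) to a row key, so rows
// with the same gmt value are scattered across SALT_BUCKETS key ranges.
public class SaltSketch {
    static final int SALT_BUCKETS = 4;

    // Illustrative hash-modulo; not Phoenix's actual hash function.
    static byte saltByte(byte[] rowKey) {
        int h = Arrays.hashCode(rowKey);
        return (byte) Math.abs(h % SALT_BUCKETS);
    }

    static byte[] saltedKey(String rowKey) {
        byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
        byte[] salted = new byte[key.length + 1];
        salted[0] = saltByte(key);                       // one-byte salt prefix
        System.arraycopy(key, 0, salted, 1, key.length); // original key follows
        return salted;
    }

    public static void main(String[] args) {
        byte[] salted = saltedKey("20140203");
        // The salt byte always falls inside [0, SALT_BUCKETS).
        System.out.println("salt bucket = " + salted[0]);
    }
}
```

Because the salt byte leads the key, a contiguous gmt range becomes SALT_BUCKETS disjoint scan ranges, and the splitter must intersect each of them with the region boundaries.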
[jira] [Commented] (PHOENIX-1074) ParallelIteratorRegionSplitterFactory get Splits is not rational
[ https://issues.apache.org/jira/browse/PHOENIX-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057369#comment-14057369 ]

jay wong commented on PHOENIX-1074:
-----------------------------------

[~jamestaylor] Please check my problem again. This is my primary key and salt-bucket setting:

{code}
CONSTRAINT pk PRIMARY KEY (gmt, spm_type, spm)) SALT_BUCKETS = 4
{code}

{code}
select * from table1 where gmt > '20140202' and gmt < '20140204'
-- the split size is 12 (which is logical)
{code}

{code}
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2'
-- the split size is 28 (I think a split size of 12 would be the logical result here too)
{code}

This is only a small example. My online table has 1900 regions. With a logical split policy it would have only about 20 splits, but it gets 1900 splits.

ParallelIteratorRegionSplitterFactory get Splits is not rational

Key: PHOENIX-1074
URL: https://issues.apache.org/jira/browse/PHOENIX-1074
Project: Phoenix
Issue Type: Bug
Reporter: jay wong
[jira] [Created] (PHOENIX-1074) ParallelIteratorRegionSplitterFactory get Splits is not rational
jay wong created PHOENIX-1074:
---------------------------------

Summary: ParallelIteratorRegionSplitterFactory get Splits is not rational
Key: PHOENIX-1074
URL: https://issues.apache.org/jira/browse/PHOENIX-1074
Project: Phoenix
Issue Type: Wish
Reporter: jay wong
[jira] [Updated] (PHOENIX-1074) ParallelIteratorRegionSplitterFactory get Splits is not rational
[ https://issues.apache.org/jira/browse/PHOENIX-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jay wong updated PHOENIX-1074:
------------------------------

Issue Type: Bug (was: Wish)

ParallelIteratorRegionSplitterFactory get Splits is not rational

Key: PHOENIX-1074
URL: https://issues.apache.org/jira/browse/PHOENIX-1074
Project: Phoenix
Issue Type: Bug
Reporter: jay wong
[jira] [Created] (PHOENIX-1065) In order by case, I think nulls last will be default
jay wong created PHOENIX-1065:
---------------------------------

Summary: In order by case, I think nulls last will be default
Key: PHOENIX-1065
URL: https://issues.apache.org/jira/browse/PHOENIX-1065
Project: Phoenix
Issue Type: New Feature
Reporter: jay wong

1. jdbc:phoenix:ip:/hbase-phoenix-l> select * from testorder;

|PK1 |COL1|COL2|
| row1 | 2 | 3 |
| row2 | 3 | 4 |
| row3 | 4 | 5 |
| row4 | 5 | null |
| row5 | 5 | null |

2. jdbc:phoenix:ip:/hbase-phoenix-l> select * from testorder order by col2;

|PK1 |COL1|COL2|
| row4 | 5 | null |
| row5 | 5 | null |
| row1 | 2 | 3 |
| row2 | 3 | 4 |
| row3 | 4 | 5 |

3. jdbc:phoenix:ip:/hbase-phoenix-l> select * from testorder order by col2 desc;

|PK1 |COL1|COL2|
| row4 | 5 | null |
| row5 | 5 | null |
| row3 | 4 | 5 |
| row2 | 3 | 4 |
| row1 | 2 | 3 |

I think a null column value should sort last by default, as it does in MySQL and Oracle. That is more in line with the normal logic of thinking.
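The nulls-last ordering being requested is the same ordering java.util.Comparator provides out of the box; a minimal sketch of the expected default behavior on the COL2 values above:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch of the requested default: ascending order with nulls sorted last,
// matching MySQL/Oracle behavior rather than the nulls-first result shown
// in step 2 above.
public class NullsLastSketch {
    public static void main(String[] args) {
        // COL2 values from the example table (two nulls from row4/row5).
        List<Integer> col2 = Arrays.asList(null, null, 3, 4, 5);
        col2.sort(Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
        System.out.println(col2); // [3, 4, 5, null, null]
    }
}
```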
[jira] [Created] (PHOENIX-1064) In order by case, I think nulls last will be default
jay wong created PHOENIX-1064:
---------------------------------

Summary: In order by case, I think nulls last will be default
Key: PHOENIX-1064
URL: https://issues.apache.org/jira/browse/PHOENIX-1064
Project: Phoenix
Issue Type: New Feature
Reporter: jay wong
[jira] [Updated] (PHOENIX-1064) In order by case, I think nulls last will be default
[ https://issues.apache.org/jira/browse/PHOENIX-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jay wong updated PHOENIX-1064:
------------------------------

In order by case, I think nulls last will be default

Key: PHOENIX-1064
URL: https://issues.apache.org/jira/browse/PHOENIX-1064
Project: Phoenix
Issue Type: New Feature
Reporter: jay wong
[jira] [Updated] (PHOENIX-1064) In order by case, I think nulls last will be default
[ https://issues.apache.org/jira/browse/PHOENIX-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jay wong updated PHOENIX-1064:
------------------------------

Attachment: PHOENIX-1064.patch

In order by case, I think nulls last will be default

Key: PHOENIX-1064
URL: https://issues.apache.org/jira/browse/PHOENIX-1064
Project: Phoenix
Issue Type: New Feature
Reporter: jay wong
Attachments: PHOENIX-1064.patch
[jira] [Commented] (PHOENIX-1062) A SQL Trimmer for log sql execute times
[ https://issues.apache.org/jira/browse/PHOENIX-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055756#comment-14055756 ]

jay wong commented on PHOENIX-1062:
-----------------------------------

[~jamestaylor] When SQL statements such as

select col1 from table1 where col2=1;
select col1 from table1 where col2=2;
select col1 from table1 where col2=3;
select col1 from table1 where col2=99;

are executed, I count them as the single statement

select col1 from table1 where col2=?;

executed 99 times. So the tool trims every condition value and limit value in the SQL to '?'.

A SQL Trimmer for log sql execute times
---------------------------------------

Key: PHOENIX-1062
URL: https://issues.apache.org/jira/browse/PHOENIX-1062
Project: Phoenix
Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: jay wong
Priority: Critical
Fix For: 3.1
Attachments: SQLTrimmer.java

If we need statistics on how many times each SQL statement is executed, for example:

select a,b,c from table1 where d=13 and e='abc' limit 20;

the literal condition values are not needed (they only differ between otherwise identical statements), so the statement is trimmed to:

select a,b,c from table1 where d=? and e=? limit ?;

This tool does that.
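The trimming described in the comment can be sketched with a couple of regular expressions. This is a hypothetical stand-in for the attached SQLTrimmer.java, not its actual implementation:

```java
import java.util.regex.Pattern;

// Sketch of the normalization described above: replace string and numeric
// literals with '?' so structurally identical statements collapse into a
// single bucket for execution-count statistics.
public class SqlTrimmerSketch {
    private static final Pattern STRING_LITERAL = Pattern.compile("'[^']*'");
    private static final Pattern NUMBER_LITERAL = Pattern.compile("\\b\\d+\\b");

    static String trim(String sql) {
        // Strings first, so digits inside quoted literals never survive.
        String s = STRING_LITERAL.matcher(sql).replaceAll("?");
        // \b keeps digits embedded in identifiers (table1, col2) untouched.
        return NUMBER_LITERAL.matcher(s).replaceAll("?");
    }

    public static void main(String[] args) {
        System.out.println(trim("select col1 from table1 where col2=99"));
        // -> select col1 from table1 where col2=?
        System.out.println(trim("select a,b,c from table1 where d=13 and e='abc' limit 20"));
        // -> select a,b,c from table1 where d=? and e=? limit ?
    }
}
```

A real trimmer would also need to handle escaped quotes and hints, which this sketch ignores.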
[jira] [Commented] (PHOENIX-1056) A ImportTsv tool for phoenix to build table data and all index data.
[ https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055809#comment-14055809 ]

jay wong commented on PHOENIX-1056:
-----------------------------------

[~jamestaylor] Yes, it creates both the table data and the index data (HFiles) in a single job. The patch is an alpha version; I uploaded it for a preview. Eventually it will be a part of CsvImportTsv.

A ImportTsv tool for phoenix to build table data and all index data.
--------------------------------------------------------------------

Key: PHOENIX-1056
URL: https://issues.apache.org/jira/browse/PHOENIX-1056
Project: Phoenix
Issue Type: Task
Affects Versions: 3.0.0
Reporter: jay wong
Fix For: 3.1
Attachments: PHOENIX-1056.patch

I have built a tool that builds table data and index table data, just like the ImportTsv job:
http://hbase.apache.org/book/ops_mgt.html#importtsv

When ImportTsv runs, it writes HFiles into a path per column family. For example, if a table has two CFs, A and B, the output is:

./outputpath/A
./outputpath/B

In my job, we have a table, TableOne, and two indexes, IdxOne and IdxTwo. The output will be:

./outputpath/TableOne/A
./outputpath/TableOne/B
./outputpath/IdxOne
./outputpath/IdxTwo

If anyone needs it, I will build a clean tool.
[jira] [Created] (PHOENIX-1073) A memory table in every region is needed?
jay wong created PHOENIX-1073:
---------------------------------

Summary: A memory table in every region is needed?
Key: PHOENIX-1073
URL: https://issues.apache.org/jira/browse/PHOENIX-1073
Project: Phoenix
Issue Type: Wish
Reporter: jay wong

When I do a group-by query, assume a region has 30M of data: about 100K rows with 30 KVs per row. The RT of GroupedAggregateRegionObserver is about 3 sec, but nearly 2.2 sec of that is actually spent on the RegionScanner scanning all the rows. I ran a test: the first pass scans all of the data into memory; the second pass loads the data from memory only. The RT of the GroupedAggregateRegionObserver execution is then only 0.7 sec. So, is a memory table needed for Phoenix's computation-intensive scenarios?
[jira] [Updated] (PHOENIX-1056) A ImportTsv tool for phoenix to build table data and all index data.
[ https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jay wong updated PHOENIX-1056:
------------------------------

Attachment: PHOENIX-1056.patch

A preview tool patch for review.

A ImportTsv tool for phoenix to build table data and all index data.

Key: PHOENIX-1056
URL: https://issues.apache.org/jira/browse/PHOENIX-1056
Project: Phoenix
Issue Type: Task
Affects Versions: 3.0.0
Reporter: jay wong
Fix For: 3.1
Attachments: PHOENIX-1056.patch
[jira] [Commented] (PHOENIX-1056) A ImportTsv tool for phoenix to build table data and all index data.
[ https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053505#comment-14053505 ]

jay wong commented on PHOENIX-1056:
-----------------------------------

Normal HBase ImportTsv is:

{code}
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:a,cf:b,cf:c -Dimporttsv.bulk.output=hdfs://storefile-outputdir tablename hdfs-inputdir
{code}

Phoenix ImportTsv is this, and it supports Phoenix data types:

{code}
bin/hbase org.apache.hadoop.hbase.mapreduce.PhoneixImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:A:PH_INT,CF:B:PH:BIGINT,cf:c -Dimporttsv.index.all=true -Dimporttsv.bulk.output=hdfs://storefile-outputdir tablename hdfs-inputdir
{code}

If the primary key is a multi-column key, replacing HBASE_ROW_KEY with a rule such as HBASE_ROW_KEY^CF1:Q1:PH_INT^CF2:Q2^0^CF1:Q3:PH_INT is supported.

Parameters:
-Dimporttsv.index.all=true: build all index table data; default is false.
-Dimporttsv.build.table=true: build the data table; default is true.
-Dimporttsv.index.names=INDEX1,INDEX2: which index tables to build; requires -Dimporttsv.index.all=false.

A ImportTsv tool for phoenix to build table data and all index data.

Key: PHOENIX-1056
URL: https://issues.apache.org/jira/browse/PHOENIX-1056
Project: Phoenix
Issue Type: Task
Affects Versions: 3.0.0
Reporter: jay wong
Fix For: 3.1
Attachments: PHOENIX-1056.patch
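Parsing the extended column spec the comment describes, where each column may carry a Phoenix type suffix such as CF:A:PH_INT, might look like the sketch below. The class and field names, and the VARCHAR default for untyped columns, are illustrative assumptions, not taken from the patch:

```java
// Hypothetical sketch of parsing the extended -Dimporttsv.columns entries
// described above: "CF:Q" for a plain column, "CF:Q:PH_INT" for a column
// with an explicit Phoenix type.
public class ColumnSpecSketch {
    static final class ColumnSpec {
        final String family;
        final String qualifier;
        final String phoenixType;
        ColumnSpec(String family, String qualifier, String phoenixType) {
            this.family = family;
            this.qualifier = qualifier;
            this.phoenixType = phoenixType;
        }
    }

    static ColumnSpec parse(String spec) {
        String[] parts = spec.split(":");
        // Assumed default when no type suffix is given.
        String type = parts.length > 2 ? parts[2] : "PH_VARCHAR";
        return new ColumnSpec(parts[0], parts[1], type);
    }

    public static void main(String[] args) {
        ColumnSpec typed = parse("CF:A:PH_INT");
        System.out.println(typed.family + "/" + typed.qualifier + "/" + typed.phoenixType);
        ColumnSpec plain = parse("cf:c");
        System.out.println(plain.family + "/" + plain.qualifier + "/" + plain.phoenixType);
    }
}
```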
[jira] [Commented] (PHOENIX-1056) A ImportTsv tool for phoenix to build table data and all index data.
[ https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053507#comment-14053507 ]

jay wong commented on PHOENIX-1056:
-----------------------------------

Anyway, my ImportTsv supports control-character and unicode separators, whereas the Apache code only supports single-byte separators. For example:

-Dimporttsv.separator=\001
-Dimporttsv.separator=\u0019

A ImportTsv tool for phoenix to build table data and all index data.

Key: PHOENIX-1056
URL: https://issues.apache.org/jira/browse/PHOENIX-1056
Project: Phoenix
Issue Type: Task
Affects Versions: 3.0.0
Reporter: jay wong
Fix For: 3.1
Attachments: PHOENIX-1056.patch
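Separators given as \001 or \u0019 arrive as literal escape text on the command line and must be decoded into the actual character before splitting lines. A hypothetical decoding helper (not the patch's implementation) could look like:

```java
// Hypothetical sketch of decoding -Dimporttsv.separator values such as
// "\001" (octal) or "\u0019" (hex unicode escape) into real characters.
public class SeparatorSketch {
    static char decode(String spec) {
        if (spec.startsWith("\\u")) {
            // "\u0019" -> hex code point 0x19
            return (char) Integer.parseInt(spec.substring(2), 16);
        }
        if (spec.startsWith("\\")) {
            // "\001" -> octal code point 1
            return (char) Integer.parseInt(spec.substring(1), 8);
        }
        // Plain single-character separator, e.g. "\t" already decoded.
        return spec.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println((int) decode("\\u0019")); // 25
        System.out.println((int) decode("\\001"));   // 1
    }
}
```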
[jira] [Updated] (PHOENIX-1062) A SQL Trimmer for log sql execute times
[ https://issues.apache.org/jira/browse/PHOENIX-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1062: -- Attachment: SQLTrimmer.java A SQL Trimmer for log sql execute times --- Key: PHOENIX-1062 URL: https://issues.apache.org/jira/browse/PHOENIX-1062 Project: Phoenix Issue Type: New Feature Affects Versions: 3.0.0 Reporter: jay wong Priority: Critical Fix For: 3.1 Attachments: SQLTrimmer.java If we need a statistics that which sql execute times just like : select a,b,c from table1 where d=13 and e='abc' limit 20; but the condition value is not needed because of overlap so the will be trim as : select a,b,c from table1 where d=? and e=? limit ?; Now the tool fix it -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (PHOENIX-1062) A SQL Trimmer for log sql execute times
jay wong created PHOENIX-1062: - Summary: A SQL Trimmer for log sql execute times Key: PHOENIX-1062 URL: https://issues.apache.org/jira/browse/PHOENIX-1062 Project: Phoenix Issue Type: New Feature Affects Versions: 3.0.0 Reporter: jay wong Priority: Critical Fix For: 3.1 Attachments: SQLTrimmer.java If we need a statistics that which sql execute times just like : select a,b,c from table1 where d=13 and e='abc' limit 20; but the condition value is not needed because of overlap so the will be trim as : select a,b,c from table1 where d=? and e=? limit ?; Now the tool fix it -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (PHOENIX-1053) UngroupedAggregateRegionObserver make lock unlock
jay wong created PHOENIX-1053:
---------------------------------

Summary: UngroupedAggregateRegionObserver make lock unlock
Key: PHOENIX-1053
URL: https://issues.apache.org/jira/browse/PHOENIX-1053
Project: Phoenix
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: jay wong
Fix For: 3.1
Attachments: PHOENIX-1053.patch

{code}
region.startRegionOperation();
try {
    // do something
} finally {
    innerScanner.close();
    region.closeRegionOperation();
}
{code}

But when innerScanner.close() throws an exception, the lock will never be unlocked.
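A nested try/finally fixes the leak described above: closeRegionOperation() moves into its own finally block so an exception from innerScanner.close() cannot skip it. The Region and Scanner interfaces below are simplified stand-ins for the HBase types, just to make the control flow runnable:

```java
// Sketch of the fix: guarantee the region lock is released even when the
// inner scanner's close() throws.
public class LockReleaseSketch {
    public interface Region {
        void startRegionOperation();
        void closeRegionOperation();
    }

    public interface Scanner {
        void close();
    }

    public static void scanSafely(Region region, Scanner innerScanner) {
        region.startRegionOperation();
        try {
            // do something
        } finally {
            try {
                innerScanner.close();
            } finally {
                // Always runs, even if close() above throws.
                region.closeRegionOperation();
            }
        }
    }

    public static void main(String[] args) {
        final boolean[] unlocked = {false};
        Region region = new Region() {
            public void startRegionOperation() {}
            public void closeRegionOperation() { unlocked[0] = true; }
        };
        try {
            scanSafely(region, () -> { throw new RuntimeException("close failed"); });
        } catch (RuntimeException expected) {
            // the close() failure still propagates to the caller
        }
        System.out.println("lock released: " + unlocked[0]); // lock released: true
    }
}
```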
[jira] [Commented] (PHOENIX-136) Support derived tables
[ https://issues.apache.org/jira/browse/PHOENIX-136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043287#comment-14043287 ]

jay wong commented on PHOENIX-136:
----------------------------------

[~maryannxue] Could you attach a patch for the 3.0.0-release version? Thanks.

Support derived tables
----------------------

Key: PHOENIX-136
URL: https://issues.apache.org/jira/browse/PHOENIX-136
Project: Phoenix
Issue Type: Task
Reporter: James Taylor
Assignee: Maryann Xue
Labels: enhancement
Fix For: 5.0.0, 3.1, 4.1

Add support for derived queries of the form:

SELECT * FROM (SELECT company, revenue FROM Company ORDER BY revenue) LIMIT 10

Adding support for this requires a compile-time change as well as a runtime execution change. The first version of the compile-time change could limit aggregation to only be allowed in the inner or the outer query, but not both. In this case, the inner and outer queries can be combined into a single query, with the outer select becoming just a remapping of a subset of the projection from the inner select. The second version of the compile-time change could handle aggregation in both the inner and outer select by performing the outer aggregation client-side (this is likely a less common scenario). For the runtime execution change, the UngroupedAggregateRegionObserver would be modified to look for a new TopNLimit attribute with an int value in the Scan. This would control the maximum number of values for the coprocessor to hold on to as the scan is performed. Then the GroupedAggregatingResultIterator would be modified to handle keeping the topN values received back from all the child iterators.
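The "hold on to at most N values while scanning" idea behind the proposed TopNLimit attribute can be sketched with a bounded min-heap. This is an illustration of the technique, not the actual coprocessor change:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of top-N retention during a scan: keep a min-heap of at most n
// values, evicting the smallest whenever the bound is exceeded, so memory
// stays O(n) regardless of how many rows the scan visits.
public class TopNSketch {
    static List<Integer> topN(Iterable<Integer> revenues, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
        for (int r : revenues) {
            heap.offer(r);
            if (heap.size() > n) {
                heap.poll(); // drop the smallest; only the top n survive
            }
        }
        List<Integer> top = new ArrayList<>(heap);
        top.sort(Comparator.reverseOrder());
        return top;
    }

    public static void main(String[] args) {
        System.out.println(topN(List.of(5, 1, 9, 3, 7), 3)); // [9, 7, 5]
    }
}
```

Merging the per-region top-N lists on the client, as the iterator change describes, amounts to running the same bounded heap over the concatenation of the child iterators' results.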
[jira] [Commented] (PHOENIX-1040) support qurey using metadata cache
[ https://issues.apache.org/jira/browse/PHOENIX-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027362#comment-14027362 ] jay wong commented on PHOENIX-1040: --- [~jamestaylor] I think it's remarkable in using simple query, such as primary key condition query. At before, phoneix MetaData cache is not using as a rule. But the opportunity for update meta data cache it's a option for user setting. support qurey using metadata cache --- Key: PHOENIX-1040 URL: https://issues.apache.org/jira/browse/PHOENIX-1040 Project: Phoenix Issue Type: Improvement Affects Versions: 3.0.0 Reporter: daniel meng Labels: PointLookup, metadata Attachments: PHOENIX-1040.PATCH I have use case like : select * from t where ROW = ? (ROW is primary key),so it just a hbase get. and it turn out a single query will do two RPCs for metadata, they are all calling to {code} public static ColumnResolver getResolverForQuery(SelectStatement statement, PhoenixConnection connection) throws SQLException { ListTableNode fromNodes = statement.getFrom(); if (fromNodes.size() == 1) return new SingleTableColumnResolver(connection, (NamedTableNode)fromNodes.get(0), true); {code} as you can see SingleTableColumnResolver(connection, (NamedTableNode)fromNodes.get(0), true), the third parameter is hard coded to true, which means it never use cache. for performance, we let the query use metadata cache, our test show 30% improvement on RT in our use case. in our patch, we do: 1. add an timer based updater at the global cache (the one in ConnectionQueryServices) 2. the phoenixconnection doesn't have it's own cache, it has a reference to global cache instead i'm looking forward for your advice, thanks -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PHOENIX-1027) Why is position reset in SkipScanFilter
[ https://issues.apache.org/jira/browse/PHOENIX-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020753#comment-14020753 ] jay wong commented on PHOENIX-1027: --- [~jamestaylor] OK, thanks. I tried to reproduce the case in another table or in a unit test, but I couldn't find a way to reproduce it. In a production table, however, the error definitely exists, so I created this issue to ask whether there is anything I can do to help reproduce it. If the code {code} (position[i] = (position[i] + 1) % slots.get(i).size()) == 0) {code} is changed to {code} ((position[i] + 1) % slots.get(i).size()) == 0) {code} everything works. I would like to know whether that change is correct or not. Why is position reset in SkipScanFilter --- Key: PHOENIX-1027 URL: https://issues.apache.org/jira/browse/PHOENIX-1027 Project: Phoenix Issue Type: Bug Reporter: jay wong The method in SkipScanFilter.java: {code} private int nextPosition(int i) { while (i >= 0 && slots.get(i).get(position[i]).isSingleKey() && (position[i] = (position[i] + 1) % slots.get(i).size()) == 0) { i--; } return i; } {code} In one query case, position[] has two elements, position[0] = 3 and position[1] = 0, and slots.get(i).size() = 1. After the method executes, both elements of position[] are 0. Then: {code} private boolean intersect(byte[] lowerInclusiveKey, byte[] upperExclusiveKey, List<List<KeyRange>> newSlots) { for (int i = 0; i <= lastSlot; i++) { List<KeyRange> newRanges = slots.get(i).subList(lowerPosition[i], Math.min(position[i] + 1, slots.get(i).size())); ...
} return true; } {code} With lowerPosition[i] = 3 and Math.min(position[i] + 1, slots.get(i).size()) = 1 (because position[i] = 0), the result is: {code} java.lang.IndexOutOfBoundsException: end index (1) must not be less than start index (3) at com.google.common.base.Preconditions.checkPositionIndexes(Preconditions.java:384) at com.google.common.collect.RegularImmutableList.subList(RegularImmutableList.java:118) at com.google.common.collect.RegularImmutableList.subList(RegularImmutableList.java:31) at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:221) at org.apache.phoenix.filter.SkipScanFilter.hasIntersect(SkipScanFilter.java:121) at org.apache.phoenix.compile.ScanRanges.intersect(ScanRanges.java:251) at org.apache.phoenix.iterate.SkipRangeParallelIteratorRegionSplitter$1.apply(SkipRangeParallelIteratorRegionSplitter.java:84) at org.apache.phoenix.iterate.SkipRangeParallelIteratorRegionSplitter$1.apply(SkipRangeParallelIteratorRegionSplitter.java:65) at com.google.common.collect.Iterators$7.computeNext(Iterators.java:649) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at com.google.common.collect.Lists.newArrayList(Lists.java:138) at com.google.common.collect.Lists.newArrayList(Lists.java:119) at org.apache.phoenix.iterate.SkipRangeParallelIteratorRegionSplitter.filterRegions(SkipRangeParallelIteratorRegionSplitter.java:93) at org.apache.phoenix.iterate.SkipRangeParallelIteratorRegionSplitter.getAllRegions(SkipRangeParallelIteratorRegionSplitter.java:54) at org.apache.phoenix.iterate.DefaultParallelIteratorRegionSplitter.getSplits(DefaultParallelIteratorRegionSplitter.java:244) at org.apache.phoenix.iterate.ParallelIterators.getSplits(ParallelIterators.java:205) at org.apache.phoenix.iterate.ParallelIterators.init(ParallelIterators.java:78) at org.apache.phoenix.execute.AggregatePlan.newIterator(AggregatePlan.java:162) at
org.apache.phoenix.execute.BasicQueryPlan.iterator(BasicQueryPlan.java:144) at org.apache.phoenix.execute.BasicQueryPlan.iterator(BasicQueryPlan.java:125) at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:202) at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:922) at sqlline.SqlLine$Commands.execute(SqlLine.java:3673) at sqlline.SqlLine$Commands.sql(SqlLine.java:3584) at sqlline.SqlLine.dispatch(SqlLine.java:821) at sqlline.SqlLine.begin(SqlLine.java:699) at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441) at sqlline.SqlLine.main(SqlLine.java:424) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
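The failure mode in the report can be reduced to a tiny standalone check: once nextPosition() has reset position[i] to 0 via the assignment in its loop condition, intersect() asks for subList(lowerPosition[i], min(position[i] + 1, size)), i.e. subList(3, 1), an out-of-order range. A minimal repro (note: Guava's immutable lists throw IndexOutOfBoundsException here, while JDK lists throw IllegalArgumentException for out-of-order endpoints, so the sketch only checks that the slice fails):

```java
import java.util.List;

// Minimal reproduction of the SkipScanFilter.intersect() slice failure:
// after position[i] is reset to 0, the requested sublist range is inverted.
public class SubListRepro {
    public static boolean sliceThrows(List<String> slot,
                                      int lowerPosition, int position) {
        int end = Math.min(position + 1, slot.size());
        try {
            slot.subList(lowerPosition, end); // subList(3, 1) is illegal
            return false;
        } catch (RuntimeException e) {
            // IndexOutOfBoundsException (Guava) or IllegalArgumentException (JDK)
            return true;
        }
    }
}
```

With lowerPosition = 3 and position = 0 the slice fails, matching the reported exception; with an in-order range it succeeds, which is why the suggested fix (dropping the assignment so position[i] is not reset) avoids the crash.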
[jira] [Updated] (PHOENIX-1025) PhoenixConnection constructor's Properties copy isn't a deep copy
[ https://issues.apache.org/jira/browse/PHOENIX-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jay wong updated PHOENIX-1025: -- Attachment: PHOENIX-1025.patch PhoenixConnection constructor's Properties copy isn't a deep copy Key: PHOENIX-1025 URL: https://issues.apache.org/jira/browse/PHOENIX-1025 Project: Phoenix Issue Type: Bug Affects Versions: 3.0.0 Reporter: jay wong Fix For: 3.0.0 Attachments: PHOENIX-1025.patch {code} public PhoenixConnection(ConnectionQueryServices services, String url, Properties info, PMetaData metaData) throws SQLException { this.url = url; // Copy so client cannot change this.info = info == null ? new Properties() : new Properties(info); final PName tenantId = JDBCUtil.getTenantId(url, info); if (this.info.isEmpty() && tenantId == null) { this.services = services; } else { ... } } {code} Because this.info is set with new Properties(info), which stores info only as defaults, this.info.isEmpty() in the code above always returns true. I think this.info should be set with PropertiesUtil.deepCopy(info). -- This message was sent by Atlassian JIRA (v6.2#6252)
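The bug behind this issue is a standard java.util.Properties pitfall and is easy to demonstrate: the Properties(Properties defaults) constructor installs the argument as a defaults chain, not as entries, so Hashtable-based methods like isEmpty() do not see them even though getProperty() still resolves them.

```java
import java.util.Properties;

// Demonstrates why this.info.isEmpty() always returned true in the
// PhoenixConnection constructor: new Properties(info) copies nothing,
// it only links info in as defaults.
public class PropertiesCopyDemo {
    public static boolean copyLooksEmpty() {
        Properties info = new Properties();
        info.setProperty("TenantId", "abc");
        Properties copy = new Properties(info); // info becomes defaults only
        // getProperty still resolves through the defaults chain...
        boolean resolves = "abc".equals(copy.getProperty("TenantId"));
        // ...but isEmpty() reports true, which is what broke the tenantId check.
        return resolves && copy.isEmpty();
    }
}
```

A deep copy that re-inserts every key as a real entry (as PropertiesUtil.deepCopy is proposed to do) makes isEmpty() reflect the caller's properties again.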