[jira] [Commented] (PHOENIX-1056) A ImportTsv tool for phoenix to build table data and all index data.
[ https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057258#comment-14057258 ] James Taylor commented on PHOENIX-1056: --- Good point, [~jaywong]. A ImportTsv tool for phoenix to build table data and all index data. Key: PHOENIX-1056 URL: https://issues.apache.org/jira/browse/PHOENIX-1056 Project: Phoenix Issue Type: Task Affects Versions: 3.0.0 Reporter: jay wong Fix For: 3.1 Attachments: PHOENIX-1056.patch I have just built a tool that builds table data and index table data, just like the ImportTsv job: http://hbase.apache.org/book/ops_mgt.html#importtsv When ImportTsv runs, it writes HFiles into one path per column family. For example, if a table has two CFs, A and B, the output is ./outputpath/A and ./outputpath/B. In my job, we have a table, TableOne, and two indexes, IdxOne and IdxTwo; the output will be ./outputpath/TableOne/A ./outputpath/TableOne/B ./outputpath/IdxOne ./outputpath/IdxTwo. If anyone needs it, I will build a clean tool. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PHOENIX-1016) Support MINVALUE, MAXVALUE, and CYCLE options in CREATE SEQUENCE
[ https://issues.apache.org/jira/browse/PHOENIX-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-1016: Attachment: PHOENIX-1016.v2.patch, PHOENIX-1016.v2.3.0.patch Support MINVALUE, MAXVALUE, and CYCLE options in CREATE SEQUENCE Key: PHOENIX-1016 URL: https://issues.apache.org/jira/browse/PHOENIX-1016 Project: Phoenix Issue Type: Bug Reporter: James Taylor Assignee: Thomas D'Silva Attachments: PHOENIX-1016.3.0.patch, PHOENIX-1016.patch, PHOENIX-1016.v2.3.0.patch, PHOENIX-1016.v2.patch We currently don't support the MINVALUE, MAXVALUE, and CYCLE options in CREATE SEQUENCE, but we should. See http://msdn.microsoft.com/en-us/library/ff878091.aspx for the syntax. I believe MINVALUE applies if the INCREMENT is negative, while MAXVALUE applies otherwise. If the value of a sequence goes beyond MINVALUE/MAXVALUE, then:
- if CYCLE is true, the sequence value should start again at the START WITH value (or the MINVALUE if specified too? Not sure about this).
- if CYCLE is false, an exception should be thrown.
To implement this:
- make the grammar changes in PhoenixSQL.g
- add member variables for MINVALUE, MAXVALUE, and CYCLE to CreateSequenceStatement
- add the appropriate error checking and handle bind variables for these new options in CreateSequenceCompiler
- modify the MetaDataClient.createSequence() call by passing along these new parameters
- same for the ConnectionQueryServices.createSequence() call
- same for Sequence.createSequence()
- pass along these parameters as new KeyValues in the Append that constitutes the RPC call
- act on these in the SequenceRegionObserver coprocessor as indicated above
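The MINVALUE/MAXVALUE/CYCLE behavior described above can be modeled in a few lines. This is a toy illustration of the described semantics only, not Phoenix code; on cycling it restarts at MINVALUE when one is given, otherwise at the START WITH value, which is one possible answer to the open question in the description.

```python
class SequenceExhaustedError(Exception):
    """Raised when a non-CYCLE sequence passes its MINVALUE/MAXVALUE bound."""

class Sequence:
    # Toy model of CREATE SEQUENCE ... START WITH / INCREMENT BY /
    # MINVALUE / MAXVALUE / CYCLE semantics as described in the issue.
    def __init__(self, start, increment=1, min_value=None, max_value=None, cycle=False):
        self.start, self.increment = start, increment
        self.min_value, self.max_value = min_value, max_value
        self.cycle = cycle
        self.current = None

    def next_value(self):
        nxt = self.start if self.current is None else self.current + self.increment
        if self.increment > 0 and self.max_value is not None and nxt > self.max_value:
            # MAXVALUE applies when the increment is positive.
            if not self.cycle:
                raise SequenceExhaustedError("sequence exceeded MAXVALUE")
            nxt = self.min_value if self.min_value is not None else self.start
        elif self.increment < 0 and self.min_value is not None and nxt < self.min_value:
            # MINVALUE applies when the increment is negative.
            if not self.cycle:
                raise SequenceExhaustedError("sequence exceeded MINVALUE")
            nxt = self.max_value if self.max_value is not None else self.start
        self.current = nxt
        return nxt

seq = Sequence(start=1, increment=1, min_value=1, max_value=3, cycle=True)
print([seq.next_value() for _ in range(5)])  # [1, 2, 3, 1, 2]
```

Note how the two error cases mirror the two bullets above: a cycling sequence wraps at the bound, a non-cycling one raises.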
[jira] [Resolved] (PHOENIX-1079) ConnectionQueryServicesImpl : Close HTable after use
[ https://issues.apache.org/jira/browse/PHOENIX-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John resolved PHOENIX-1079. - Resolution: Fixed Fix Version/s: (was: 4.0.0) (was: 3.0.0) 4.1 3.1 Committed to all branches. Thanks for the patch Samarth Jain. ConnectionQueryServicesImpl : Close HTable after use Key: PHOENIX-1079 URL: https://issues.apache.org/jira/browse/PHOENIX-1079 Project: Phoenix Issue Type: Bug Reporter: Samarth Jain Assignee: Samarth Jain Fix For: 5.0.0, 3.1, 4.1 Attachments: master.patch
[jira] [Commented] (PHOENIX-1074) ParallelIteratorRegionSplitterFactory get Splits is not rational
[ https://issues.apache.org/jira/browse/PHOENIX-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057369#comment-14057369 ] jay wong commented on PHOENIX-1074: --- [~jamestaylor] please check my problem again. This is my primary key and salt_buckets:
{code}
CONSTRAINT pk PRIMARY KEY (gmt, spm_type, spm)) SALT_BUCKETS = 4
{code}
{code}
select * from table1 where gmt > '20140202' and gmt < '20140204'
the split size is 12 (which is logical)
{code}
{code}
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2'
the split size is 28 (I think a split size of 12 would also be the logical result here)
{code}
This is only an epitome. My online table has 1900 regions. If it ran with a logical split policy, it would have only about 20 splits, BUT it has 1900 splits. ParallelIteratorRegionSplitterFactory get Splits is not rational Key: PHOENIX-1074 URL: https://issues.apache.org/jira/browse/PHOENIX-1074 Project: Phoenix Issue Type: Bug Reporter: jay wong create a table
{code}
create table if not exists table1(
    gmt VARCHAR NOT NULL,
    spm_type VARCHAR NOT NULL,
    spm VARCHAR NOT NULL,
    A.int_a INTEGER,
    B.int_b INTEGER,
    B.int_c INTEGER
    CONSTRAINT pk PRIMARY KEY (gmt, spm_type, spm)) SALT_BUCKETS = 4, bloomfilter='ROW';
{code}
and made the table 29 partitions as this.
|startrow|endrow|
| |\x0020140201|
|\x0020140201|\x0020140202|
|\x0020140202|\x0020140203|
|\x0020140203|\x0020140204|
|\x0020140204|\x0020140205|
|\x0020140205|\x0020140206|
|\x0020140206|\x0020140207|
|\x0020140207|\x0120140201|
|\x0120140201|\x0120140202|
|\x0120140202|\x0120140203|
|\x0120140203|\x0120140204|
|\x0120140204|\x0120140205|
|\x0120140205|\x0120140206|
|\x0120140206|\x0120140207|
|\x0120140207|\x0220140201|
|\x0220140201|\x0220140202|
|\x0220140202|\x0220140203|
|\x0220140203|\x0220140204|
|\x0220140204|\x0220140205|
|\x0220140205|\x0220140206|
|\x0220140206|\x0220140207|
|\x0220140207|\x0320140201|
|\x0320140201|\x0320140202|
|\x0320140202|\x0320140203|
|\x0320140203|\x0320140204|
|\x0320140204|\x0320140205|
|\x0320140205|\x0320140206|
|\x0320140206|\x0320140207|
|\x0320140207| |
Then insert some data:
|GMT | SPM_TYPE |SPM | INT_A| INT_B| INT_C |
| 20140201 | 1 | 1.2.3.4546 | 218| 218| null |
| 20140201 | 1 | 1.2.44545 | 190| 190| null |
| 20140201 | 1 | 1.353451312 | 246| 246| null |
| 20140201 | 2 | 1.2.3.6775 | 183| 183| null |
|...|...|...|...|...|...|
| 20140207 | 3 | 1.2.3.4546 | 224| 224| null |
| 20140207 | 3 | 1.2.44545 | 196| 196| null |
| 20140207 | 3 | 1.353451312 | 168| 168| null |
| 20140207 | 4 | 1.2.3.6775 | 189| 189| null |
| 20140207 | 4 | 1.23.345345 | 217| 217| null |
| 20140207 | 4 | 1.23234234234 | 245| 245| null |
print a log like this
{code}
public class ParallelIterators extends ExplainTable implements ResultIterators {
    @Override
    public List<PeekingResultIterator> getIterators() throws SQLException {
        boolean success = false;
        final ConnectionQueryServices services = context.getConnection().getQueryServices();
        ReadOnlyProps props = services.getProps();
        int numSplits = splits.size();
        List<PeekingResultIterator> iterators = new ArrayList<PeekingResultIterator>(numSplits);
        List<Pair<byte[],Future<PeekingResultIterator>>> futures = new ArrayList<Pair<byte[],Future<PeekingResultIterator>>>(numSplits);
        final UUID scanId = UUID.randomUUID();
        try {
            ExecutorService executor = services.getExecutor();
            System.out.println("the split size is " + numSplits);
        }
    }
}
{code}
then execute some sql
{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2'
the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207'
the split size is 27
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2' and spm like '1.%'
the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2'
the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204'
the split size is 12
{code}
but I think
{code}
select * from table1 where gmt > '20140202' and gmt
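One way to read the "logical" split count of 12 reported above (a rough sketch only, not Phoenix's ParallelIteratorRegionSplitter): with SALT_BUCKETS = 4 and regions split per day as in the region table, a gmt range overlapping three per-day regions in each salt bucket yields one parallel scan per (bucket, region) pair. The region count per bucket here is an assumption read off the example.

```python
SALT_BUCKETS = 4              # from the table's SALT_BUCKETS = 4
REGIONS_IN_RANGE_PER_BUCKET = 3  # assumed: per-day regions the gmt range overlaps per bucket

# A range scan on a salted table runs one contiguous scan per salt bucket,
# and each of those is chunked once per overlapping region:
splits = [(bucket, region)
          for bucket in range(SALT_BUCKETS)
          for region in range(REGIONS_IN_RANGE_PER_BUCKET)]
print(len(splits))  # 4 buckets x 3 regions = 12
```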
[jira] [Commented] (PHOENIX-1079) ConnectionQueryServicesImpl : Close HTable after use
[ https://issues.apache.org/jira/browse/PHOENIX-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057394#comment-14057394 ] Hudson commented on PHOENIX-1079: - SUCCESS: Integrated in Phoenix | Master | Hadoop1 #266 (See [https://builds.apache.org/job/Phoenix-master-hadoop1/266/]) PHOENIX-1079 ConnectionQueryServicesImpl : Close HTable after use. (Samarth) (anoopsamjohn: rev 1e61578555e2c54ed801ffa166fcab2cc4499971) * phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java ConnectionQueryServicesImpl : Close HTable after use Key: PHOENIX-1079 URL: https://issues.apache.org/jira/browse/PHOENIX-1079 Project: Phoenix Issue Type: Bug Reporter: Samarth Jain Assignee: Samarth Jain Fix For: 5.0.0, 3.1, 4.1 Attachments: master.patch
[jira] [Commented] (PHOENIX-1080) Fix PhoenixRuntime.decodepk for salted tables. Add integration tests.
[ https://issues.apache.org/jira/browse/PHOENIX-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057418#comment-14057418 ] James Taylor commented on PHOENIX-1080: --- [~anoop.hbase] - would you mind committing this one too? Fix PhoenixRuntime.decodepk for salted tables. Add integration tests. - Key: PHOENIX-1080 URL: https://issues.apache.org/jira/browse/PHOENIX-1080 Project: Phoenix Issue Type: Bug Affects Versions: 3.0.0, 4.0.0, 5.0.0 Reporter: Samarth Jain Assignee: Samarth Jain Attachments: encodeDecode_3.patch, encodeDecode_master_4.patch
[GitHub] phoenix pull request: PHOENIX-933 Local index support to Phoenix
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/1#discussion_r14762643 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/TrackOrderPreservingExpressionCompiler.java --- @@ -69,6 +70,7 @@ boolean isSharedViewIndex = table.getViewIndexId() != null; // TODO: util for this offset, as it's computed in numerous places positionOffset = (isSalted ? 1 : 0) + (isMultiTenant ? 1 : 0) + (isSharedViewIndex ? 1 : 0); +this.isOrderPreserving = table.getIndexType() != IndexType.LOCAL; --- End diff -- One thing that's necessary, though, to maintain rows in row key order is to modify ScanPlan.java:118 to do a merge sort instead of a concat:

    if ((isSalted || isLocalIndex)
            && (context.getConnection().getQueryServices().getProps().getBoolean(
                    QueryServices.ROW_KEY_ORDER_SALTED_TABLE_ATTRIB,
                    QueryServicesOptions.DEFAULT_ROW_KEY_ORDER_SALTED_TABLE)
                || orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY
                || orderBy == OrderBy.REV_ROW_KEY_ORDER_BY)) {
        // ORDER BY was optimized out b/c query is in row key order
        scanner = new MergeSortRowKeyResultIterator(iterators, SaltingUtil.NUM_SALTING_BYTES,
                orderBy == OrderBy.REV_ROW_KEY_ORDER_BY);
    } else {
        scanner = new ConcatResultIterator(iterators);
    }

Local indexes are similar to salted tables in that the parallel scans will all be within a region, ordered correctly. As long as we do a merge sort across the results of these scans, the rows will be ordered correctly.
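The merge-sort-vs-concat point can be illustrated outside of Phoenix. In the sketch below (illustrative Python, not Phoenix code), each per-bucket scan is already sorted by its salted key; a k-way merge that compares keys with the salt byte stripped, analogous to MergeSortRowKeyResultIterator with SaltingUtil.NUM_SALTING_BYTES, restores global row-key order, while simple concatenation does not.

```python
import heapq

NUM_SALTING_BYTES = 1  # mirrors SaltingUtil.NUM_SALTING_BYTES

def merge_salted_scans(scans):
    # Each scan yields (salted_key, row) pairs sorted by salted_key.
    # Comparing on the key minus its salt prefix yields global row-key order.
    return list(heapq.merge(*scans, key=lambda kv: kv[0][NUM_SALTING_BYTES:]))

scan_bucket0 = [(b"\x00apple", "row1"), (b"\x00pear", "row4")]
scan_bucket1 = [(b"\x01banana", "row2"), (b"\x01mango", "row3")]

merged = merge_salted_scans([scan_bucket0, scan_bucket1])
print([row for _, row in merged])        # ['row1', 'row2', 'row3', 'row4']

concatenated = scan_bucket0 + scan_bucket1
print([row for _, row in concatenated])  # ['row1', 'row4', 'row2', 'row3'] - not in key order
```

The same reasoning carries over to local indexes: each region's scan is internally ordered, so a merge across regions preserves order where concatenation would interleave incorrectly.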
[jira] [Commented] (PHOENIX-933) Local index support to Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057420#comment-14057420 ] ASF GitHub Bot commented on PHOENIX-933: Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/1#discussion_r14762643 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/TrackOrderPreservingExpressionCompiler.java --- @@ -69,6 +70,7 @@ boolean isSharedViewIndex = table.getViewIndexId() != null; // TODO: util for this offset, as it's computed in numerous places positionOffset = (isSalted ? 1 : 0) + (isMultiTenant ? 1 : 0) + (isSharedViewIndex ? 1 : 0); +this.isOrderPreserving = table.getIndexType() != IndexType.LOCAL; --- End diff -- One thing that's necessary, though, to maintain rows in row key order is to modify ScanPlan.java:118 to do a merge sort instead of a concat:

    if ((isSalted || isLocalIndex)
            && (context.getConnection().getQueryServices().getProps().getBoolean(
                    QueryServices.ROW_KEY_ORDER_SALTED_TABLE_ATTRIB,
                    QueryServicesOptions.DEFAULT_ROW_KEY_ORDER_SALTED_TABLE)
                || orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY
                || orderBy == OrderBy.REV_ROW_KEY_ORDER_BY)) {
        // ORDER BY was optimized out b/c query is in row key order
        scanner = new MergeSortRowKeyResultIterator(iterators, SaltingUtil.NUM_SALTING_BYTES,
                orderBy == OrderBy.REV_ROW_KEY_ORDER_BY);
    } else {
        scanner = new ConcatResultIterator(iterators);
    }

Local indexes are similar to salted tables in that the parallel scans will all be within a region, ordered correctly. As long as we do a merge sort across the results of these scans, the rows will be ordered correctly. Local index support to Phoenix -- Key: PHOENIX-933 URL: https://issues.apache.org/jira/browse/PHOENIX-933 Project: Phoenix Issue Type: New Feature Reporter: rajeshbabu Hindex (https://github.com/Huawei-Hadoop/hindex) provides local indexing support to HBase.
It stores region-level indexes in a separate table, and co-locates the user and index table regions with a custom load balancer. See http://goo.gl/phkhwC and http://goo.gl/EswlxC for more information. This JIRA addresses integrating the local indexing solution into Phoenix.
[jira] [Commented] (PHOENIX-1074) ParallelIteratorRegionSplitterFactory get Splits is not rational
[ https://issues.apache.org/jira/browse/PHOENIX-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057436#comment-14057436 ] James Taylor commented on PHOENIX-1074: --- The second query is using a skip scan because there's range information for all columns in your PK:
{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
{code}
So it'll run the skip scan over all regions since the table is salted. How is performance for this query? You can force it to do a range scan with a hint like this:
{code}
select /*+ RANGE_SCAN */ * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
{code}
Please let us know how performance compares between the two. ParallelIteratorRegionSplitterFactory get Splits is not rational Key: PHOENIX-1074 URL: https://issues.apache.org/jira/browse/PHOENIX-1074 Project: Phoenix Issue Type: Bug Reporter: jay wong create a table
{code}
create table if not exists table1(
    gmt VARCHAR NOT NULL,
    spm_type VARCHAR NOT NULL,
    spm VARCHAR NOT NULL,
    A.int_a INTEGER,
    B.int_b INTEGER,
    B.int_c INTEGER
    CONSTRAINT pk PRIMARY KEY (gmt, spm_type, spm)) SALT_BUCKETS = 4, bloomfilter='ROW';
{code}
and made the table 29 partitions as this.
|startrow|endrow|
| |\x0020140201|
|\x0020140201|\x0020140202|
|\x0020140202|\x0020140203|
|\x0020140203|\x0020140204|
|\x0020140204|\x0020140205|
|\x0020140205|\x0020140206|
|\x0020140206|\x0020140207|
|\x0020140207|\x0120140201|
|\x0120140201|\x0120140202|
|\x0120140202|\x0120140203|
|\x0120140203|\x0120140204|
|\x0120140204|\x0120140205|
|\x0120140205|\x0120140206|
|\x0120140206|\x0120140207|
|\x0120140207|\x0220140201|
|\x0220140201|\x0220140202|
|\x0220140202|\x0220140203|
|\x0220140203|\x0220140204|
|\x0220140204|\x0220140205|
|\x0220140205|\x0220140206|
|\x0220140206|\x0220140207|
|\x0220140207|\x0320140201|
|\x0320140201|\x0320140202|
|\x0320140202|\x0320140203|
|\x0320140203|\x0320140204|
|\x0320140204|\x0320140205|
|\x0320140205|\x0320140206|
|\x0320140206|\x0320140207|
|\x0320140207| |
Then insert some data:
|GMT | SPM_TYPE |SPM | INT_A| INT_B| INT_C |
| 20140201 | 1 | 1.2.3.4546 | 218| 218| null |
| 20140201 | 1 | 1.2.44545 | 190| 190| null |
| 20140201 | 1 | 1.353451312 | 246| 246| null |
| 20140201 | 2 | 1.2.3.6775 | 183| 183| null |
|...|...|...|...|...|...|
| 20140207 | 3 | 1.2.3.4546 | 224| 224| null |
| 20140207 | 3 | 1.2.44545 | 196| 196| null |
| 20140207 | 3 | 1.353451312 | 168| 168| null |
| 20140207 | 4 | 1.2.3.6775 | 189| 189| null |
| 20140207 | 4 | 1.23.345345 | 217| 217| null |
| 20140207 | 4 | 1.23234234234 | 245| 245| null |
print a log like this
{code}
public class ParallelIterators extends ExplainTable implements ResultIterators {
    @Override
    public List<PeekingResultIterator> getIterators() throws SQLException {
        boolean success = false;
        final ConnectionQueryServices services = context.getConnection().getQueryServices();
        ReadOnlyProps props = services.getProps();
        int numSplits = splits.size();
        List<PeekingResultIterator> iterators = new ArrayList<PeekingResultIterator>(numSplits);
        List<Pair<byte[],Future<PeekingResultIterator>>> futures = new ArrayList<Pair<byte[],Future<PeekingResultIterator>>>(numSplits);
        final UUID scanId = UUID.randomUUID();
        try {
            ExecutorService executor = services.getExecutor();
            System.out.println("the split size is " + numSplits);
        }
    }
}
{code}
then execute some sql
{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2' and spm like '1.%'
the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207' and spm_type = '2'
the split size is 31
select * from table1 where gmt > '20140202' and gmt < '20140207'
the split size is 27
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2' and spm like '1.%'
the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204' and spm_type = '2'
the split size is 28
select * from table1 where gmt > '20140202' and gmt < '20140204'
the split size is 12
{code}
but I think
{code}
select * from table1 where gmt > '20140202' and gmt < '20140207' and
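To make the skip-scan-vs-range-scan trade-off concrete, here is a toy count of the key ranges each plan would build, using the buckets and days from the example table. This is illustrative Python only, not Phoenix's planner; Phoenix's actual skip scan intersects ranges with region boundaries, so real split counts differ.

```python
from itertools import product

SALT_BUCKETS = 4
gmt_days = ["20140203", "20140204", "20140205", "20140206"]  # days inside the gmt range
spm_types = ["2"]

# Skip scan: the cross product of per-PK-column ranges, replicated into
# every salt bucket - which is why it touches all regions of a salted table.
skip_ranges = list(product(range(SALT_BUCKETS), gmt_days, spm_types))
print(len(skip_ranges))   # 16 narrow ranges across all 4 buckets

# RANGE_SCAN hint: one contiguous gmt range per salt bucket.
range_ranges = [(bucket, gmt_days[0], gmt_days[-1]) for bucket in range(SALT_BUCKETS)]
print(len(range_ranges))  # 4 contiguous ranges
```

The skip scan does more, narrower reads; the range scan does fewer, wider ones. Which wins depends on how selective the trailing PK predicates are, hence the request to compare performance.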
[jira] [Commented] (PHOENIX-938) Use higher priority queue for index updates to prevent deadlock
[ https://issues.apache.org/jira/browse/PHOENIX-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057462#comment-14057462 ] James Taylor commented on PHOENIX-938: -- How about this as a plan, [~jesse_yates] and [~apurtell]?
- check in your patch and document that it fixes the issue for 0.98.3 only (assuming it doesn't break Phoenix for 0.98.2- and 0.98.3+)
- work with the HBase community to make these APIs public and evolving for the 0.98.next release
- implement the solution on top of these public APIs
I don't think transactions are really going to help with this, so it'd be good to get an as-permanent-as-possible solution IMO. Use higher priority queue for index updates to prevent deadlock --- Key: PHOENIX-938 URL: https://issues.apache.org/jira/browse/PHOENIX-938 Project: Phoenix Issue Type: Bug Affects Versions: 4.0.0, 4.1 Reporter: James Taylor Assignee: Jesse Yates Fix For: 5.0.0, 4.1 Attachments: phoenix-938-4.0-v0.patch, phoenix-938-master-v0.patch, phoenix-938-master-v1.patch With our current global secondary indexing solution, a batched Put of table data causes a RS to do a batch Put to other RSs. This has the potential to lead to a deadlock if all RSs are overloaded and unable to process the pending batched Puts. To prevent this, we should use a higher priority queue to submit these Puts so that they're always processed before other Puts. This will prevent the potential for a deadlock under high load. Note that this will likely require some HBase 0.98 code changes and would not be feasible to implement for HBase 0.94.
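The deadlock-avoidance idea in this issue, serving index-update RPCs from a higher-priority queue so they always make progress ahead of ordinary client Puts, can be sketched with a plain priority queue. This is illustrative Python, not the HBase/Phoenix RPC scheduler; the priority values and task names are made up.

```python
import queue

INDEX_PRIORITY, NORMAL_PRIORITY = 0, 1  # lower number = dequeued first

rpc_queue = queue.PriorityQueue()
# The middle sequence number breaks ties so equal-priority tasks stay FIFO.
rpc_queue.put((NORMAL_PRIORITY, 1, "client Put batch A"))
rpc_queue.put((NORMAL_PRIORITY, 2, "client Put batch B"))
rpc_queue.put((INDEX_PRIORITY, 3, "index-update Put batch"))

served = []
while not rpc_queue.empty():
    _, _, task = rpc_queue.get()
    served.append(task)
print(served)  # ['index-update Put batch', 'client Put batch A', 'client Put batch B']
```

Because index-update work is always drained first, a region server that is saturated with client Puts can still complete the RS-to-RS index writes those Puts depend on, breaking the circular wait described above.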
[jira] [Commented] (PHOENIX-1071) Provide integration for exposing Phoenix tables as Spark RDDs
[ https://issues.apache.org/jira/browse/PHOENIX-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057549#comment-14057549 ] Josh Mahonin commented on PHOENIX-1071: --- Hi Andrew, It's definitely a starting point. The Pig integration doesn't quite have the full JDBC feature set yet, so there's a fair bit of client-side processing necessary that could be handled server-side instead. The DSL you describe above, including the on-demand save / schema-creation feature, would be an amazing addition. That said, the fact that today we can read and process a full Phoenix data-set across a Spark cluster is pretty neat. Josh Provide integration for exposing Phoenix tables as Spark RDDs - Key: PHOENIX-1071 URL: https://issues.apache.org/jira/browse/PHOENIX-1071 Project: Phoenix Issue Type: New Feature Reporter: Andrew Purtell A core concept of Apache Spark is the resilient distributed dataset (RDD), a fault-tolerant collection of elements that can be operated on in parallel. One can create RDDs referencing a dataset in any external storage system offering a Hadoop InputFormat, like PhoenixInputFormat and PhoenixOutputFormat. There could be opportunities for additional interesting and deep integration. Add the ability to save RDDs back to Phoenix with a {{saveAsPhoenixTable}} action, implicitly creating necessary schema on demand. Add support for {{filter}} transformations that push predicates to the server. Add a new {{select}} transformation supporting a LINQ-like DSL, for example:
{code}
// Count the number of different coffee varieties offered by each
// supplier from Guatemala
phoenixTable("coffees")
  .select(c => where(c.origin == "GT"))
  .countByKey()
  .foreach(r => println(r._1 + " = " + r._2))
{code}
Support conversions between Scala and Java types and Phoenix table data.
[jira] [Updated] (PHOENIX-1081) With phoenix case CPU usage 100%
[ https://issues.apache.org/jira/browse/PHOENIX-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yang ming updated PHOENIX-1081: --- Description: The concurrency of the system is not high, but CPU usage often goes up to 100%. I had stopped the system, but the regionserver's CPU usage is still high. What can cause this problem?
table row count: 6000 million
table ddl:
create table if not exists summary (
    videoid integer not null,
    date date not null,
    platform varchar not null,
    device varchar not null,
    systemgroup varchar not null,
    system varchar not null,
    vv bigint,
    ts bigint,
    up bigint,
    down bigint,
    comment bigint,
    favori bigint,
    favord bigint,
    quote bigint,
    reply bigint
    constraint pk primary key (videoid, date, platform, device, systemgroup, system)
) salt_buckets = 30, versions=1, compression='snappy';
query 1:
select sum(vv) as sumvv,sum(comment) as sumcomment,sum(up) as sumup,sum(down) as sumdown,sum(reply) as sumreply,count(*) as count from summary(reply bigint) where videoid in(137102991,151113895,171559204,171559439,171573932,171573932,171573932,171574082,171574082,171574164,171677219,171794335,171902734,172364368,172475141,172700554,172700554,172700554,172716705,172784258,172835778,173112067,173165316,173165316,173379601,173448315,173503961,173692664,173911358,174077089,174099017,174349633,174349877,174651474,174651474,174759297,174883566,174883566,174987670,174987670,175131298) and date >= to_date('2013-09-01','yyyy-MM-dd') and date <= to_date('2014-07-07','yyyy-MM-dd')

was: The system is not highly concurrent, but CPU usage is often 100%. I had stopped the system, but the regionserver's CPU usage is still high. What can cause this problem?
table row count: 6000 million
table ddl: create table if not exists summary ( videoid integer not null, date date not null, platform varchar not null, device varchar not null, systemgroup varchar not null, system varchar not null, vv bigint, ts bigint, up bigint, down bigint, comment bigint, favori bigint, favord bigint, quote bigint, reply bigint constraint pk primary key (videoid, date, platform, device, systemgroup, system) ) salt_buckets = 30, versions=1, compression='snappy';
query 1: select sum(vv) as sumvv,sum(comment) as sumcomment,sum(up) as sumup,sum(down) as sumdown,sum(reply) as sumreply,count(*) as count from summary(reply bigint) where videoid in(137102991,151113895,171559204,171559439,171573932,171573932,171573932,171574082,171574082,171574164,171677219,171794335,171902734,172364368,172475141,172700554,172700554,172700554,172716705,172784258,172835778,173112067,173165316,173165316,173379601,173448315,173503961,173692664,173911358,174077089,174099017,174349633,174349877,174651474,174651474,174759297,174883566,174883566,174987670,174987670,175131298) and date >= to_date('2013-09-01','yyyy-MM-dd') and date <= to_date('2014-07-07','yyyy-MM-dd')

With phoenix case CPU usage 100% Key: PHOENIX-1081 URL: https://issues.apache.org/jira/browse/PHOENIX-1081 Project: Phoenix Issue Type: Bug Affects Versions: 3.0.0 Reporter: yang ming Priority: Critical The concurrency of the system is not high, but CPU usage often goes up to 100%. I had stopped the system, but the regionserver's CPU usage is still high. What can cause this problem?
table row count: 6000 million
table ddl: create table if not exists summary ( videoid integer not null, date date not null, platform varchar not null, device varchar not null, systemgroup varchar not null, system varchar not null, vv bigint, ts bigint, up bigint, down bigint, comment bigint, favori bigint, favord bigint, quote bigint, reply bigint constraint pk primary key (videoid, date, platform, device, systemgroup, system) ) salt_buckets = 30, versions=1, compression='snappy';
query 1: select sum(vv) as sumvv,sum(comment) as sumcomment,sum(up) as sumup,sum(down) as sumdown,sum(reply) as sumreply,count(*) as count from summary(reply bigint) where videoid in(137102991,151113895,171559204,171559439,171573932,171573932,171573932,171574082,171574082,171574164,171677219,171794335,171902734,172364368,172475141,172700554,172700554,172700554,172716705,172784258,172835778,173112067,173165316,173165316,173379601,173448315,173503961,173692664,173911358,174077089,174099017,174349633,174349877,174651474,174651474,174759297,174883566,174883566,174987670,174987670,175131298) and date >= to_date('2013-09-01','yyyy-MM-dd') and date <= to_date('2014-07-07','yyyy-MM-dd')
[jira] [Updated] (PHOENIX-1081) CPU usage 100% With phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yang ming updated PHOENIX-1081: --- Summary: CPU usage 100% With phoenix (was: With phoenix case CPU usage 100%) CPU usage 100% With phoenix Key: PHOENIX-1081 URL: https://issues.apache.org/jira/browse/PHOENIX-1081 Project: Phoenix Issue Type: Bug Affects Versions: 3.0.0 Reporter: yang ming Priority: Critical Attachments: JMX.jpg, jstat.jpg, the jstack of all threads, the jstack of thread 12725.jpg, the jstack of thread 12748.jpg, the threads of regionserver process.jpg The concurrency of the system is not high, but CPU usage often goes up to 100%. I had stopped the system, but the regionserver's CPU usage is still high. What can cause this problem?
table row count: 6000 million
table ddl: create table if not exists summary ( videoid integer not null, date date not null, platform varchar not null, device varchar not null, systemgroup varchar not null, system varchar not null, vv bigint, ts bigint, up bigint, down bigint, comment bigint, favori bigint, favord bigint, quote bigint, reply bigint constraint pk primary key (videoid, date, platform, device, systemgroup, system) ) salt_buckets = 30, versions=1, compression='snappy';
query 1: select sum(vv) as sumvv,sum(comment) as sumcomment,sum(up) as sumup,sum(down) as sumdown,sum(reply) as sumreply,count(*) as count from summary(reply bigint) where videoid in(137102991,151113895,171559204,171559439,171573932,171573932,171573932,171574082,171574082,171574164,171677219,171794335,171902734,172364368,172475141,172700554,172700554,172700554,172716705,172784258,172835778,173112067,173165316,173165316,173379601,173448315,173503961,173692664,173911358,174077089,174099017,174349633,174349877,174651474,174651474,174759297,174883566,174883566,174987670,174987670,175131298) and date >= to_date('2013-09-01','yyyy-MM-dd') and date <= to_date('2014-07-07','yyyy-MM-dd')
[jira] [Commented] (PHOENIX-1081) CPU usage 100% With phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057581#comment-14057581 ] yang ming commented on PHOENIX-1081: [~jamestaylor] CPU usage 100% With phoenix Key: PHOENIX-1081 URL: https://issues.apache.org/jira/browse/PHOENIX-1081 Project: Phoenix Issue Type: Bug Affects Versions: 3.0.0 Reporter: yang ming Priority: Critical Attachments: JMX.jpg, jstat.jpg, the jstack of all threads, the jstack of thread 12725.jpg, the jstack of thread 12748.jpg, the threads of regionserver process.jpg The concurrency of the system is not high, but CPU usage often goes up to 100%. I had stopped the system, but the regionserver's CPU usage is still high. What can cause this problem?
table row count: 6000 million
table ddl: create table if not exists summary ( videoid integer not null, date date not null, platform varchar not null, device varchar not null, systemgroup varchar not null, system varchar not null, vv bigint, ts bigint, up bigint, down bigint, comment bigint, favori bigint, favord bigint, quote bigint, reply bigint constraint pk primary key (videoid, date, platform, device, systemgroup, system) ) salt_buckets = 30, versions=1, compression='snappy';
query 1: select sum(vv) as sumvv,sum(comment) as sumcomment,sum(up) as sumup,sum(down) as sumdown,sum(reply) as sumreply,count(*) as count from summary(reply bigint) where videoid in(137102991,151113895,171559204,171559439,171573932,171573932,171573932,171574082,171574082,171574164,171677219,171794335,171902734,172364368,172475141,172700554,172700554,172700554,172716705,172784258,172835778,173112067,173165316,173165316,173379601,173448315,173503961,173692664,173911358,174077089,174099017,174349633,174349877,174651474,174651474,174759297,174883566,174883566,174987670,174987670,175131298) and date >= to_date('2013-09-01','yyyy-MM-dd') and date <= to_date('2014-07-07','yyyy-MM-dd')
[jira] [Updated] (PHOENIX-950) Improve Secondary Index Update Failure Handling
[ https://issues.apache.org/jira/browse/PHOENIX-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated PHOENIX-950: -- Attachment: TransactionSupportPhoenixSecondaryIndexUpdate.pdf Improve Secondary Index Update Failure Handling --- Key: PHOENIX-950 URL: https://issues.apache.org/jira/browse/PHOENIX-950 Project: Phoenix Issue Type: Improvement Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Attachments: Improve Phoenix Secondary Index Update Failure Handling.pdf, TransactionSupportPhoenixSecondaryIndexUpdate.pdf Currently, a secondary index update failure can trigger chained region server failures, which isn't friendly to end users. Even if we disable the index after an index update failure before aborting, a lot of human involvement is required, because index update failures aren't a rare situation. In this JIRA, I propose a 2PC-like protocol. The "like" means it's not a real 2PC, because there is no indefinite blocking, but it requires read-time (query) reconciliation of inconsistencies between index and data. Since I'm not familiar with the query-time logic, please let me know if the proposal could fly. Thanks.