[jira] [Commented] (PHOENIX-1550) Freeze and thaw Phoenix tables

2015-03-25 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14379397#comment-14379397
 ] 

Sun Fulin commented on PHOENIX-1550:


[~maghamraviki...@gmail.com] Any progress ? 

 Freeze and thaw Phoenix tables
 --

 Key: PHOENIX-1550
 URL: https://issues.apache.org/jira/browse/PHOENIX-1550
 Project: Phoenix
  Issue Type: New Feature
Reporter: Andrew Purtell

 HBase snapshots provide a way to freeze the state of a table and then later 
 restore that state from the snapshot or clone it into a new table. This can 
 be very convenient for a wide range of use cases. However, we can't apply 
 this technique directly to Phoenix tables because Phoenix generates and uses 
 more state in more locations than a basic HBase table - a Phoenix table may 
 have one or more secondary indexes, global or local, schema in the system 
 catalog, sequence state, statistics.
 This issue is about considering a freeze and thaw feature for Phoenix tables, 
 perhaps based on HBase snapshots. A freeze operation would create a 
 consistent point in time snapshot of a Phoenix table and all related index 
 data and metadata. A thaw operation would restore frozen state either by 
 overwriting the table or by cloning the state into a new table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-1550) Freeze and thaw Phoenix tables

2015-03-16 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362752#comment-14362752
 ] 

Sun Fulin edited comment on PHOENIX-1550 at 3/16/15 6:15 AM:
-

[~jamestaylor] - The priority lies that snapshot scan can improve aggregation 
work performance greatly as provided through HBASE-8369. We had integrated 
spark with hbase and had tested a lot for hbase scan VS snapshot, which had got 
results for that snapshot upgrade performance a lot in use cases of some 
hourly, daily, weekly and monthly aggregation work. We would prefer more like 
[~maghamraviki...@gmail.com] designed for mapreduce support, e.g. issuing more 
like {{PhoenixSnapshotInputFormat}}. Thus we believe can achieve our goal of 
resolving some kind of phoenix full table scan performance issues.

Sorry for some misunderstanding about my comments.


was (Author: sunfl):
[~jamestaylor] - The priority lies that snapshot scan can improve aggregation 
work performance greatly as provided through HBASE-8369. We had integrated 
spark with hbase and had tested a lot for hbase scan VS snapshot, which had got 
results for that snapshot upgrade performance a lot in use cases of some 
hourly, daily, weekly and monthly aggregation work. We would prefer more like 
[~maghamraviki...@gmail.com] designed for mapreduce support, e.g. issuing more 
like PhoenixSnapshotInputFormat. Thus we believe can achieve our goal of 
resolving some kind of phoenix full table scan performance issues.

Sorry for some misunderstanding about my comments.

 Freeze and thaw Phoenix tables
 --

 Key: PHOENIX-1550
 URL: https://issues.apache.org/jira/browse/PHOENIX-1550
 Project: Phoenix
  Issue Type: New Feature
Reporter: Andrew Purtell

 HBase snapshots provide a way to freeze the state of a table and then later 
 restore that state from the snapshot or clone it into a new table. This can 
 be very convenient for a wide range of use cases. However, we can't apply 
 this technique directly to Phoenix tables because Phoenix generates and uses 
 more state in more locations than a basic HBase table - a Phoenix table may 
 have one or more secondary indexes, global or local, schema in the system 
 catalog, sequence state, statistics.
 This issue is about considering a freeze and thaw feature for Phoenix tables, 
 perhaps based on HBase snapshots. A freeze operation would create a 
 consistent point in time snapshot of a Phoenix table and all related index 
 data and metadata. A thaw operation would restore frozen state either by 
 overwriting the table or by cloning the state into a new table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1550) Freeze and thaw Phoenix tables

2015-03-15 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362752#comment-14362752
 ] 

Sun Fulin commented on PHOENIX-1550:


[~jamestaylor] - The priority lies that snapshot scan can improve aggregation 
work performance greatly as provided through HBASE-8369. We had integrated 
spark with hbase and had tested a lot for hbase scan VS snapshot, which had got 
results for that snapshot upgrade performance a lot in use cases of some 
hourly, daily, weekly and monthly aggregation work. We would prefer more like 
[~maghamraviki...@gmail.com] designed for mapreduce support, e.g. issuing more 
like PhoenixSnapshotInputFormat. Thus we believe can achieve our goal of 
resolving some kind of phoenix full table scan performance issues.

Sorry for some misunderstanding about my comments.

 Freeze and thaw Phoenix tables
 --

 Key: PHOENIX-1550
 URL: https://issues.apache.org/jira/browse/PHOENIX-1550
 Project: Phoenix
  Issue Type: New Feature
Reporter: Andrew Purtell

 HBase snapshots provide a way to freeze the state of a table and then later 
 restore that state from the snapshot or clone it into a new table. This can 
 be very convenient for a wide range of use cases. However, we can't apply 
 this technique directly to Phoenix tables because Phoenix generates and uses 
 more state in more locations than a basic HBase table - a Phoenix table may 
 have one or more secondary indexes, global or local, schema in the system 
 catalog, sequence state, statistics.
 This issue is about considering a freeze and thaw feature for Phoenix tables, 
 perhaps based on HBase snapshots. A freeze operation would create a 
 consistent point in time snapshot of a Phoenix table and all related index 
 data and metadata. A thaw operation would restore frozen state either by 
 overwriting the table or by cloning the state into a new table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1248) CsvBulkLoadTool is failing with IAE when local index specified for --index-table parameter

2015-02-27 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14341173#comment-14341173
 ] 

Sun Fulin commented on PHOENIX-1248:


[~apurtell] Would these changes affect a lot for Phoenix 4.3 release to install 
upon 0.98.4-0.98.6 hbase cluster? Cause I got 3 major obstacles when trying to 
deploy and utilize the latest 4.3 release. 
  1.  Phoenix 4.3 release shall not be compiled if we specify 
Dhbase.version=0.98.6-cdh5.3.0
  2.  CsvBulkloadTool would not function well under the current hbase 
0.98.6-cdh5.3.0 version, the full stack trace can be referenced for above 
commemt of [~nagav]
  3.  Local index not work with current hbase 0.98.6 version.
[~jamestaylor] Can the above incompatibility issues be smoothly resolved 
without upgrading our current hbase version? Also, I deployed the latest hbase 
1.0.0 version and install phoenix with that, problems occure when starting the 
hbase cluster. Exception goes to :
The interface of EnvironmentEdgeManager changed between 0.98.9 and 
1.0.0 hbase version. Do you think to significantly specify these 
incompatibility issues offically or there can be other solution to resolve 
these? 

 2015-02-27 16:52:27,096 FATAL [hmaster:16020.activeMasterManager] 
master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NoSuchMethodError: 
org.apache.hadoop.hbase.util.EnvironmentEdgeManager.currentTimeMillis()J
at 
org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer.clinit(IndexLoadBalancer.java:84)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at 
org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
at 
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
at 
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
at 
org.apache.hadoop.hbase.master.balancer.LoadBalancerFactory.getLoadBalancer(LoadBalancerFactory.java:40)
at 
org.apache.hadoop.hbase.master.HMaster.initializeZKBasedSystemTrackers(HMaster.java:519)
at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:613)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:165)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1425)
at java.lang.Thread.run(Thread.java:744)

 CsvBulkLoadTool is failing with IAE when local index specified for 
 --index-table parameter
 --

 Key: PHOENIX-1248
 URL: https://issues.apache.org/jira/browse/PHOENIX-1248
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.1
Reporter: rajeshbabu
Assignee: Gabriel Reid
 Fix For: 5.0.0, 4.3

 Attachments: PHOENIX-1248.patch, PHOENIX-1248b.patch, 
 PHOENIX-1248c.patch, p1248_0.log


 {code}
 14/09/11 22:38:11 INFO mapreduce.HFileOutputFormat2: Writing partition 
 information to /tmp/partitions_81e2ee23-836c-4eae-b413-ac6bd6a96623
 14/09/11 22:38:11 ERROR mapreduce.CsvBulkLoadTool: Import job on 
 table=TEST_IDX failed due to exception:java.lang.IllegalArgumentException: No 
 regions passed
 14/09/11 22:38:11 INFO client.HConnectionManager$HConnectionImplementation: 
 Closing zookeeper sessionid=0x148123145090287
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1248) CsvBulkLoadTool is failing with IAE when local index specified for --index-table parameter

2015-02-27 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14341228#comment-14341228
 ] 

Sun Fulin commented on PHOENIX-1248:


[~apurtell] Thanks for the guidance. 

bq. What's the compile issue?

MVN Command: {{mvn clean package -DskipTests -Dhadoop.version=2.5.0-cdh5.3.0 
-Dhbase.version=0.98.6-cdh5.3.0}}

Error message: 

   [ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on 
project phoenix-core: Compilation failure
[ERROR] 
/home/soft/phoenix-4.3.0-src/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexSplitter.java:[89,23]
 cannot find symbol
[ERROR] symbol:   variable useZKForAssignment
[ERROR] location: variable st of type 
org.apache.hadoop.hbase.regionserver.SplitTransaction

bq. You're going to want to upgrade to 0.98.10.1 if using local mutable 
indexes, see PHOENIX-1634

Got that, shall make a try.

 CsvBulkLoadTool is failing with IAE when local index specified for 
 --index-table parameter
 --

 Key: PHOENIX-1248
 URL: https://issues.apache.org/jira/browse/PHOENIX-1248
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.1
Reporter: rajeshbabu
Assignee: Gabriel Reid
 Fix For: 5.0.0, 4.3

 Attachments: PHOENIX-1248.patch, PHOENIX-1248b.patch, 
 PHOENIX-1248c.patch, p1248_0.log


 {code}
 14/09/11 22:38:11 INFO mapreduce.HFileOutputFormat2: Writing partition 
 information to /tmp/partitions_81e2ee23-836c-4eae-b413-ac6bd6a96623
 14/09/11 22:38:11 ERROR mapreduce.CsvBulkLoadTool: Import job on 
 table=TEST_IDX failed due to exception:java.lang.IllegalArgumentException: No 
 regions passed
 14/09/11 22:38:11 INFO client.HConnectionManager$HConnectionImplementation: 
 Closing zookeeper sessionid=0x148123145090287
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1550) Freeze and thaw Phoenix tables

2015-02-11 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315876#comment-14315876
 ] 

Sun Fulin commented on PHOENIX-1550:


Any progress on this issue ? [~apurtell]

 Freeze and thaw Phoenix tables
 --

 Key: PHOENIX-1550
 URL: https://issues.apache.org/jira/browse/PHOENIX-1550
 Project: Phoenix
  Issue Type: New Feature
Reporter: Andrew Purtell

 HBase snapshots provide a way to freeze the state of a table and then later 
 restore that state from the snapshot or clone it into a new table. This can 
 be very convenient for a wide range of use cases. However, we can't apply 
 this technique directly to Phoenix tables because Phoenix generates and uses 
 more state in more locations than a basic HBase table - a Phoenix table may 
 have one or more secondary indexes, global or local, schema in the system 
 catalog, sequence state, statistics.
 This issue is about considering a freeze and thaw feature for Phoenix tables, 
 perhaps based on HBase snapshots. A freeze operation would create a 
 consistent point in time snapshot of a Phoenix table and all related index 
 data and metadata. A thaw operation would restore frozen state either by 
 overwriting the table or by cloning the state into a new table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-1484) Index creation failed due to specifying DEFAULT_COLUMN_FAMILY option

2014-11-25 Thread Sun Fulin (JIRA)
Sun Fulin created PHOENIX-1484:
--

 Summary: Index creation failed due to specifying 
DEFAULT_COLUMN_FAMILY option
 Key: PHOENIX-1484
 URL: https://issues.apache.org/jira/browse/PHOENIX-1484
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.2
 Environment: HBase 0.98.6-cdh5.2.0 
Reporter: Sun Fulin


I tryied again with phoenix 4.1 version and still got the exception. Following 
is my statements:
  1 create table example (id integer not null,fn varchar,ln varchar constraint 
pk primary key(id)) DEFAULT_COLUMN_FAMILY='F';
   Notice that I would like to create default mutable index for the table.
  2  create local index my_idx on example (fn) DEFAULT_COLUMN_FAMILY='F';   
   Here no matter with or without specifying DEFAULT_COLUMN_FAMILY options 
we would always got the following error:
   Error: ERROR 1009 (42L02): Properties may not be defined for a view. 
(state=42L02,code=1009)
  3  I also tried to create local index with immutable rows setting true, 
however, I still got the above error. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1249) Support local immutable index

2014-10-08 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14163150#comment-14163150
 ] 

Sun Fulin commented on PHOENIX-1249:


[~jamestaylor]  Yes. Sorry for not giving response in time. The patch indeed 
improves our performance issues. Thanks for great work, [~rajesh23].

 Support local immutable index 
 --

 Key: PHOENIX-1249
 URL: https://issues.apache.org/jira/browse/PHOENIX-1249
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 4.1
 Environment: Hbase 0.98.4-Hadoop2
 Phoenix 4.1
Reporter: Sun Fulin
Assignee: rajeshbabu
 Fix For: 5.0.0, 4.2

 Attachments: PHOENIX-1249.patch, PHOENIX-1249_v2.patch, 
 PHOENIX-1249_v3.patch


 Currently local indexing are forced created as default mutable index which 
 requires index maintenance and server side processing, while immutable 
 indexes are appropriate for write-once/append-only use case that may fit for 
 many use cases. Hope some work for capability to create local index as 
 immutable index.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1248) CsvBulkLoadTool is failing with IAE when local index specified for --index-table parameter

2014-09-12 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131148#comment-14131148
 ] 

Sun Fulin commented on PHOENIX-1248:


[~rajesh23] Yes. Rajeshbabu said that CSVBulkloader via mapreduce has a little 
problem when dealing with local index related data bulkloading. 

 CsvBulkLoadTool is failing with IAE when local index specified for 
 --index-table parameter
 --

 Key: PHOENIX-1248
 URL: https://issues.apache.org/jira/browse/PHOENIX-1248
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.1
Reporter: rajeshbabu
Assignee: rajeshbabu
 Fix For: 5.0.0, 4.2


 {code}
 14/09/11 22:38:11 INFO mapreduce.HFileOutputFormat2: Writing partition 
 information to /tmp/partitions_81e2ee23-836c-4eae-b413-ac6bd6a96623
 14/09/11 22:38:11 ERROR mapreduce.CsvBulkLoadTool: Import job on 
 table=TEST_IDX failed due to exception:java.lang.IllegalArgumentException: No 
 regions passed
 14/09/11 22:38:11 INFO client.HConnectionManager$HConnectionImplementation: 
 Closing zookeeper sessionid=0x148123145090287
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1203) Uable to work for count (distinct col) queries via phoenix table with secondary indexes

2014-08-28 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114890#comment-14114890
 ] 

Sun Fulin commented on PHOENIX-1203:


[~anoop.hbase]
Any progress about the test?

 Uable to work for count (distinct col) queries via phoenix table with 
 secondary indexes
 ---

 Key: PHOENIX-1203
 URL: https://issues.apache.org/jira/browse/PHOENIX-1203
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.0.0, 4.1
 Environment: hadoop-2.2.0   hbase: Version 0.98.3-hadoop2
Reporter: Sun Fulin
Assignee: Anoop Sam John
  Labels: distinct, secondaryIndex, test

 I build the latest 4.1 rc0 from here:  
 https://github.com/apache/phoenix/releases
 And examine the count (distinct col) query within the new environment. 
 However, the problem still exists with index queries as the following while 
 the correct distinct query result is expected to be 536 for my project:
 0: jdbc:phoenix:zookeeper1 select count (distinct t.imsi) from ranapsignal t 
 where t.pkttime=140496480 and t.pkttime=140496569 and t.sac=32351 
 and t.nasmsgtype=0 and t.ranapmsgtype=0 and t.ranapsubmsgtype=0 ;
 +-+
 | COUNT(IMSI) |
 +-+
 | 2322|
 +-+
 1 row selected (70.572 seconds)
 As James suggests, I conduct the query adding group by t.imsi with /without 
 secondary indexes. And the result seems to be fine as they both got the 
 correct 536 distinct groups. 
 Here are some considerations:
 1. count (distinct col) query over index table did not work as expectation.
 2. only distinct query over index table works fine.
 3. If the phoenix version got some wrong configuration, correct me.
 Thanks and Best Regards,
 Sun
 ---
 Hi Sun,
 Thanks for the detailed description. Yes, your syntax is correct, and
 it's definitely true that the count distinct query should return the
 same result with and without the index. Would you mind trying this on
 our latest 3.1 RC2 and/or 4.1 RC0 and if the problem still occurs to
 file a JIRA?
  
 One thing that make make it easier for your testing: do you know about
 our NO_INDEX hint which forces the query *not* to use an index, like
 this:
  
 select /*+ NO_INDEX */ ...
  
 Another question too. What about this query with and with/out the index:
  
 select count(*) from ranapsignal t
 where t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0
 and t.ranapsubmsgtype=0
 group by t.imsi;
  
 Thanks,
 James
  
 On Thu, Aug 21, 2014 at 10:38 PM, su...@certusnet.com.cn
 su...@certusnet.com.cn wrote:
 
 
  Hi James,
 
Recently I got trouble while trying to conduct some query performance test
  in my phoenix tables with secondary indexes.
I created a table called RANAPSIGNAL for my projects in phoenix via
  sqlline client and load data into the table. Then I create
an index on the specific column PKTTIME for the table RANAPSIGNAL while
  including other more columns for adjusting my
index query, like the following DDL:
   create index if not exists pkt_idx on RANAPSIGNAL (PKTTIME) include
  (SAC,NASMSGTYPE, RANAPMSGTYPE, RANAPSUBMSGTYPE );
 The index creation worked successfully without any errors. So, when I am
  trying to conduct such query as:
 select count (distinct t.imsi) from ranapsignal t where
  t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
  t.ranapsubmsgtype=0 ;
 Without secondary indexes, the final result got 536 distinct imsi, wihch
  is the right distinct count results. However, after I  create the above
secondary index PKT_IDX and reconducting the above count (distinct imsi)
  query, I got 2322 imsi rows which obviously are not the expected
distinct counts results. I used the explain grammar to observe the scan of
  the above select query and found that it definitely scaned over
the index table PKT_IDX. I then tried to conduct the following query with
  no count function:
  select distinct t.imsi from ranapsignal t where
  t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
  t.ranapsubmsgtype=0 ;
 And the result is right 536 distinct imsi over scanning the index table.
  By the way, imsi is one of the primary key when creating the table
  RANAPSIGNAL.
 Here are several considerations for my trouble and practice:
 1. Did you guys ever practice such count (distinct) queries over phoenix
  table via secondary index?
 2. I am not sure whether this problem was due to the index table, but my
  practice may assume that conclusion.
 3. Corrects my if 

[jira] [Commented] (PHOENIX-1203) Uable to work for count (distinct col) queries via phoenix table with secondary indexes

2014-08-26 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110506#comment-14110506
 ] 

Sun Fulin commented on PHOENIX-1203:


James Taylor - I rebuilt the latest phoenix version 4.1 RC from here: 
https://dist.apache.org/repos/dist/dev/phoenix/phoenix-4.1.0-rc1/src/ 
And confirm queries with another unit test:
DDL: create table example (my_pk integer not null, first_name varchar(20), 
last_name varchar(20) constraint pk PRIMARY KEY (my_pk));
create index my_idx on example (first_name);
DML: Upsert into example values (9 records with several same first_name groups)
The following queries: select /+NO_INDEX/ count (distinct first_name) from 
example; got finally 5 rows. 
The following queries: select /+INDEX(example my_idx)/ count (distinct 
first_name) from example; got all 9 records.
So James, the count (distinct col) query over index table still not get 
expecting results even in the latest release.
Hoping Anoop Sam John can explain this and get some available report.


 
From: James Taylor (JIRA)
Date: 2014-08-26 02:52
To: sunfl
Subject: [jira] [Commented] (PHOENIX-1203) Uable to work for count (distinct 
col) queries via phoenix table with secondary indexes
 
[ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109498#comment-14109498
 ] 
 
James Taylor commented on PHOENIX-1203:
---
 
[~sunfl] - can you confirm that when you tried with with the latest 4.1 RC that 
you replaced *both* the client and the *server* jar? If the problem is what I 
suspect it is, this should have been fixed. 
 
 
 
 
--
This message was sent by Atlassian JIRA
(v6.2#6252)
 
 


 Uable to work for count (distinct col) queries via phoenix table with 
 secondary indexes
 ---

 Key: PHOENIX-1203
 URL: https://issues.apache.org/jira/browse/PHOENIX-1203
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.0.0, 4.1
 Environment: hadoop-2.2.0   hbase: Version 0.98.3-hadoop2
Reporter: Sun Fulin
Assignee: Anoop Sam John
  Labels: distinct, secondaryIndex, test

 I build the latest 4.1 rc0 from here:  
 https://github.com/apache/phoenix/releases
 And examine the count (distinct col) query within the new environment. 
 However, the problem still exists with index queries as the following while 
 the correct distinct query result is expected to be 536 for my project:
 0: jdbc:phoenix:zookeeper1 select count (distinct t.imsi) from ranapsignal t 
 where t.pkttime=140496480 and t.pkttime=140496569 and t.sac=32351 
 and t.nasmsgtype=0 and t.ranapmsgtype=0 and t.ranapsubmsgtype=0 ;
 +-+
 | COUNT(IMSI) |
 +-+
 | 2322|
 +-+
 1 row selected (70.572 seconds)
 As James suggests, I conduct the query adding group by t.imsi with /without 
 secondary indexes. And the result seems to be fine as they both got the 
 correct 536 distinct groups. 
 Here are some considerations:
 1. count (distinct col) query over index table did not work as expectation.
 2. only distinct query over index table works fine.
 3. If the phoenix version got some wrong configuration, correct me.
 Thanks and Best Regards,
 Sun
 ---
 Hi Sun,
 Thanks for the detailed description. Yes, your syntax is correct, and
 it's definitely true that the count distinct query should return the
 same result with and without the index. Would you mind trying this on
 our latest 3.1 RC2 and/or 4.1 RC0 and if the problem still occurs to
 file a JIRA?
  
 One thing that make make it easier for your testing: do you know about
 our NO_INDEX hint which forces the query *not* to use an index, like
 this:
  
 select /*+ NO_INDEX */ ...
  
 Another question too. What about this query with and with/out the index:
  
 select count(*) from ranapsignal t
 where t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0
 and t.ranapsubmsgtype=0
 group by t.imsi;
  
 Thanks,
 James
  
 On Thu, Aug 21, 2014 at 10:38 PM, su...@certusnet.com.cn
 su...@certusnet.com.cn wrote:
 
 
  Hi James,
 
Recently I got trouble while trying to conduct some query performance test
  in my phoenix tables with secondary indexes.
I created a table called RANAPSIGNAL for my projects in phoenix via
  sqlline client and load data into the table. Then I create
an index on the specific column PKTTIME for the table RANAPSIGNAL while
  including other more columns for adjusting my
index query, like the following DDL:
   create index if not exists pkt_idx on RANAPSIGNAL (PKTTIME) include
  (SAC,NASMSGTYPE, RANAPMSGTYPE, RANAPSUBMSGTYPE );
 The index 

[jira] [Issue Comment Deleted] (PHOENIX-1203) Uable to work for count (distinct col) queries via phoenix table with secondary indexes

2014-08-26 Thread Sun Fulin (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sun Fulin updated PHOENIX-1203:
---

Comment: was deleted

(was: James Taylor - I rebuilt the latest phoenix version 4.1 RC from here: 
https://dist.apache.org/repos/dist/dev/phoenix/phoenix-4.1.0-rc1/src/ 
And confirm queries with another unit test:
DDL: create table example (my_pk integer not null, first_name varchar(20), 
last_name varchar(20) constraint pk PRIMARY KEY (my_pk));
create index my_idx on example (first_name);
DML: Upsert into example values (9 records with several same first_name groups)
The following queries: select /+NO_INDEX/ count (distinct first_name) from 
example; got finally 5 rows. 
The following queries: select /+INDEX(example my_idx)/ count (distinct 
first_name) from example; got all 9 records.
So James, the count (distinct col) query over index table still not get 
expecting results even in the latest release.
Hoping Anoop Sam John can explain this and get some available report.


 
From: James Taylor (JIRA)
Date: 2014-08-26 02:52
To: sunfl
Subject: [jira] [Commented] (PHOENIX-1203) Uable to work for count (distinct 
col) queries via phoenix table with secondary indexes
 
[ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109498#comment-14109498
 ] 
 
James Taylor commented on PHOENIX-1203:
---
 
[~sunfl] - can you confirm that when you tried with with the latest 4.1 RC that 
you replaced *both* the client and the *server* jar? If the problem is what I 
suspect it is, this should have been fixed. 
 
 
 
 
--
This message was sent by Atlassian JIRA
(v6.2#6252)
 
 
)

 Uable to work for count (distinct col) queries via phoenix table with 
 secondary indexes
 ---

 Key: PHOENIX-1203
 URL: https://issues.apache.org/jira/browse/PHOENIX-1203
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.0.0, 4.1
 Environment: hadoop-2.2.0   hbase: Version 0.98.3-hadoop2
Reporter: Sun Fulin
Assignee: Anoop Sam John
  Labels: distinct, secondaryIndex, test

 I build the latest 4.1 rc0 from here:  
 https://github.com/apache/phoenix/releases
 And examine the count (distinct col) query within the new environment. 
 However, the problem still exists with index queries as the following while 
 the correct distinct query result is expected to be 536 for my project:
 0: jdbc:phoenix:zookeeper1 select count (distinct t.imsi) from ranapsignal t 
 where t.pkttime=140496480 and t.pkttime=140496569 and t.sac=32351 
 and t.nasmsgtype=0 and t.ranapmsgtype=0 and t.ranapsubmsgtype=0 ;
 +-+
 | COUNT(IMSI) |
 +-+
 | 2322|
 +-+
 1 row selected (70.572 seconds)
 As James suggests, I conduct the query adding group by t.imsi with /without 
 secondary indexes. And the result seems to be fine as they both got the 
 correct 536 distinct groups. 
 Here are some considerations:
 1. count (distinct col) query over index table did not work as expectation.
 2. only distinct query over index table works fine.
 3. If the phoenix version got some wrong configuration, correct me.
 Thanks and Best Regards,
 Sun
 ---
 Hi Sun,
 Thanks for the detailed description. Yes, your syntax is correct, and
 it's definitely true that the count distinct query should return the
 same result with and without the index. Would you mind trying this on
 our latest 3.1 RC2 and/or 4.1 RC0 and if the problem still occurs to
 file a JIRA?
  
 One thing that make make it easier for your testing: do you know about
 our NO_INDEX hint which forces the query *not* to use an index, like
 this:
  
 select /*+ NO_INDEX */ ...
  
 Another question too. What about this query with and with/out the index:
  
 select count(*) from ranapsignal t
 where t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0
 and t.ranapsubmsgtype=0
 group by t.imsi;
  
 Thanks,
 James
  
 On Thu, Aug 21, 2014 at 10:38 PM, su...@certusnet.com.cn
 su...@certusnet.com.cn wrote:
 
 
  Hi James,
 
Recently I got trouble while trying to conduct some query performance test
  in my phoenix tables with secondary indexes.
I created a table called RANAPSIGNAL for my projects in phoenix via
  sqlline client and load data into the table. Then I create
an index on the specific column PKTTIME for the table RANAPSIGNAL while
  including other more columns for adjusting my
index query, like the following DDL:
   create index if not exists pkt_idx on RANAPSIGNAL (PKTTIME) include
  (SAC,NASMSGTYPE, RANAPMSGTYPE, RANAPSUBMSGTYPE );
 The index creation worked successfully 

[jira] [Commented] (PHOENIX-1203) Uable to work for count (distinct col) queries via phoenix table with secondary indexes

2014-08-26 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111750#comment-14111750
 ] 

Sun Fulin commented on PHOENIX-1203:


[~jamestaylor] Fine. Following provides the records in table example after 
upsert statements. If I run the query: SELECT /*+NO_INDEX*/ COUNT (DISTINCT 
FIRST_NAME) FROM EXAMPLE; I can correctly got the final 5 rows as 'ssd', 'fsd', 
'fdfd', 'dsad' and 'dffds'. However, If I run the query:  SELECT 
/*+INDEX(example my_idx)*/ COUNT (DISTINCT FIRST_NAME) FROM EXAMPLE; I still 
got 9 counts and the distinct never works.
0: jdbc:phoenix:zookeeper1 select * from example;
++++
|   MY_PK| FIRST_NAME | LAST_NAME  |
++++
| 1  | ssd| ad |
| 2  | fsd| re |
| 3  | dffds  | gfgf   |
| 4  | dsad   | gfrg   |
| 5  | fdfd   | fdsf   |
| 6  | ssd| fdfgg  |
| 7  | fsd| re |
| 8  | dsad   | ere|
| 9  | ssd| ere|
++++

 Uable to work for count (distinct col) queries via phoenix table with 
 secondary indexes
 ---

 Key: PHOENIX-1203
 URL: https://issues.apache.org/jira/browse/PHOENIX-1203
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.0.0, 4.1
 Environment: hadoop-2.2.0   hbase: Version 0.98.3-hadoop2
Reporter: Sun Fulin
Assignee: Anoop Sam John
  Labels: distinct, secondaryIndex, test

 I build the latest 4.1 rc0 from here:  
 https://github.com/apache/phoenix/releases
 And examine the count (distinct col) query within the new environment. 
 However, the problem still exists with index queries as the following while 
 the correct distinct query result is expected to be 536 for my project:
 0: jdbc:phoenix:zookeeper1 select count (distinct t.imsi) from ranapsignal t 
 where t.pkttime=140496480 and t.pkttime=140496569 and t.sac=32351 
 and t.nasmsgtype=0 and t.ranapmsgtype=0 and t.ranapsubmsgtype=0 ;
 +-+
 | COUNT(IMSI) |
 +-+
 | 2322|
 +-+
 1 row selected (70.572 seconds)
 As James suggests, I conduct the query adding group by t.imsi with /without 
 secondary indexes. And the result seems to be fine as they both got the 
 correct 536 distinct groups. 
 Here are some considerations:
 1. count (distinct col) query over index table did not work as expectation.
 2. only distinct query over index table works fine.
 3. If the phoenix version got some wrong configuration, correct me.
 Thanks and Best Regards,
 Sun
 ---
 Hi Sun,
 Thanks for the detailed description. Yes, your syntax is correct, and
 it's definitely true that the count distinct query should return the
 same result with and without the index. Would you mind trying this on
 our latest 3.1 RC2 and/or 4.1 RC0 and if the problem still occurs to
 file a JIRA?
  
 One thing that make make it easier for your testing: do you know about
 our NO_INDEX hint which forces the query *not* to use an index, like
 this:
  
 select /*+ NO_INDEX */ ...
  
 Another question too. What about this query with and with/out the index:
  
 select count(*) from ranapsignal t
 where t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0
 and t.ranapsubmsgtype=0
 group by t.imsi;
  
 Thanks,
 James
  
 On Thu, Aug 21, 2014 at 10:38 PM, su...@certusnet.com.cn
 su...@certusnet.com.cn wrote:
 
 
  Hi James,
 
Recently I got trouble while trying to conduct some query performance test
  in my phoenix tables with secondary indexes.
I created a table called RANAPSIGNAL for my projects in phoenix via
  sqlline client and load data into the table. Then I create
an index on the specific column PKTTIME for the table RANAPSIGNAL while
  including other more columns for adjusting my
index query, like the following DDL:
   create index if not exists pkt_idx on RANAPSIGNAL (PKTTIME) include
  (SAC,NASMSGTYPE, RANAPMSGTYPE, RANAPSUBMSGTYPE );
 The index creation worked successfully without any errors. So, when I am
  trying to conduct such query as:
 select count (distinct t.imsi) from ranapsignal t where
  t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
  t.ranapsubmsgtype=0 ;
 Without secondary indexes, the final result got 536 distinct imsi, wihch
  is the right distinct count results. However, after I  create the above
secondary index PKT_IDX and reconducting the above count (distinct imsi)
  query, I got 2322 imsi rows which 

[jira] [Commented] (PHOENIX-1203) Uable to work for count (distinct col) queries via phoenix table with secondary indexes

2014-08-26 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111863#comment-14111863
 ] 

Sun Fulin commented on PHOENIX-1203:


[~anoop.hbase]
Do you have available phoenix cluster and sqlline client now? Cause I did not 
run queries from java client. Mind if you just test the above data records or 
even just one table with two distinct records? I suspect the count distinct 
query would not function correctly as long as there was a secondary index.

 Uable to work for count (distinct col) queries via phoenix table with 
 secondary indexes
 ---

 Key: PHOENIX-1203
 URL: https://issues.apache.org/jira/browse/PHOENIX-1203
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.0.0, 4.1
 Environment: hadoop-2.2.0   hbase: Version 0.98.3-hadoop2
Reporter: Sun Fulin
Assignee: Anoop Sam John
  Labels: distinct, secondaryIndex, test

 I build the latest 4.1 rc0 from here:  
 https://github.com/apache/phoenix/releases
 And examine the count (distinct col) query within the new environment. 
 However, the problem still exists with index queries as the following while 
 the correct distinct query result is expected to be 536 for my project:
 0: jdbc:phoenix:zookeeper1 select count (distinct t.imsi) from ranapsignal t 
 where t.pkttime=140496480 and t.pkttime=140496569 and t.sac=32351 
 and t.nasmsgtype=0 and t.ranapmsgtype=0 and t.ranapsubmsgtype=0 ;
 +-+
 | COUNT(IMSI) |
 +-+
 | 2322|
 +-+
 1 row selected (70.572 seconds)
 As James suggests, I conduct the query adding group by t.imsi with /without 
 secondary indexes. And the result seems to be fine as they both got the 
 correct 536 distinct groups. 
 Here are some considerations:
 1. count (distinct col) query over index table did not work as expectation.
 2. only distinct query over index table works fine.
 3. If the phoenix version got some wrong configuration, correct me.
 Thanks and Best Regards,
 Sun
 ---
 Hi Sun,
 Thanks for the detailed description. Yes, your syntax is correct, and
 it's definitely true that the count distinct query should return the
 same result with and without the index. Would you mind trying this on
 our latest 3.1 RC2 and/or 4.1 RC0 and if the problem still occurs to
 file a JIRA?
  
 One thing that make make it easier for your testing: do you know about
 our NO_INDEX hint which forces the query *not* to use an index, like
 this:
  
 select /*+ NO_INDEX */ ...
  
 Another question too. What about this query with and with/out the index:
  
 select count(*) from ranapsignal t
 where t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0
 and t.ranapsubmsgtype=0
 group by t.imsi;
  
 Thanks,
 James
  
 On Thu, Aug 21, 2014 at 10:38 PM, su...@certusnet.com.cn
 su...@certusnet.com.cn wrote:
 
 
  Hi James,
 
Recently I got trouble while trying to conduct some query performance test
  in my phoenix tables with secondary indexes.
I created a table called RANAPSIGNAL for my projects in phoenix via
  sqlline client and load data into the table. Then I create
an index on the specific column PKTTIME for the table RANAPSIGNAL while
  including other more columns for adjusting my
index query, like the following DDL:
   create index if not exists pkt_idx on RANAPSIGNAL (PKTTIME) include
  (SAC,NASMSGTYPE, RANAPMSGTYPE, RANAPSUBMSGTYPE );
 The index creation worked successfully without any errors. So, when I am
  trying to conduct such query as:
 select count (distinct t.imsi) from ranapsignal t where
  t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
  t.ranapsubmsgtype=0 ;
 Without secondary indexes, the final result got 536 distinct imsi, wihch
  is the right distinct count results. However, after I  create the above
secondary index PKT_IDX and reconducting the above count (distinct imsi)
  query, I got 2322 imsi rows which obviously are not the expected
distinct counts results. I used the explain grammar to observe the scan of
  the above select query and found that it definitely scaned over
the index table PKT_IDX. I then tried to conduct the following query with
  no count function:
  select distinct t.imsi from ranapsignal t where
  t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
  t.ranapsubmsgtype=0 ;
 And the result is right 536 distinct imsi over scanning the index table.
  By the way, imsi is one of the primary key when creating the table
  RANAPSIGNAL.
 Here are several 

[jira] [Commented] (PHOENIX-1203) Uable to work for count (distinct col) queries via phoenix table with secondary indexes

2014-08-25 Thread Sun Fulin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14108861#comment-14108861
 ] 

Sun Fulin commented on PHOENIX-1203:


[~anoop.hbase]
Yes. I conduct several queries about count (distinct col) over index table, 
which show
-
COUNT(IMSI)
-
instead of 
-
DISTINCT_COUNT(IMSI)
-
That means the query did not function well.

 Uable to work for count (distinct col) queries via phoenix table with 
 secondary indexes
 ---

 Key: PHOENIX-1203
 URL: https://issues.apache.org/jira/browse/PHOENIX-1203
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.0.0, 4.1
 Environment: hadoop-2.2.0   hbase: Version 0.98.3-hadoop2
Reporter: Sun Fulin
Assignee: Anoop Sam John
  Labels: distinct, secondaryIndex, test

 I build the latest 4.1 rc0 from here:  
 https://github.com/apache/phoenix/releases
 And examine the count (distinct col) query within the new environment. 
 However, the problem still exists with index queries as the following while 
 the correct distinct query result is expected to be 536 for my project:
 0: jdbc:phoenix:zookeeper1 select count (distinct t.imsi) from ranapsignal t 
 where t.pkttime=140496480 and t.pkttime=140496569 and t.sac=32351 
 and t.nasmsgtype=0 and t.ranapmsgtype=0 and t.ranapsubmsgtype=0 ;
 +-+
 | COUNT(IMSI) |
 +-+
 | 2322|
 +-+
 1 row selected (70.572 seconds)
 As James suggests, I conduct the query adding group by t.imsi with /without 
 secondary indexes. And the result seems to be fine as they both got the 
 correct 536 distinct groups. 
 Here are some considerations:
 1. count (distinct col) query over index table did not work as expectation.
 2. only distinct query over index table works fine.
 3. If the phoenix version got some wrong configuration, correct me.
 Thanks and Best Regards,
 Sun
 ---
 Hi Sun,
 Thanks for the detailed description. Yes, your syntax is correct, and
 it's definitely true that the count distinct query should return the
 same result with and without the index. Would you mind trying this on
 our latest 3.1 RC2 and/or 4.1 RC0 and if the problem still occurs to
 file a JIRA?
  
 One thing that make make it easier for your testing: do you know about
 our NO_INDEX hint which forces the query *not* to use an index, like
 this:
  
 select /*+ NO_INDEX */ ...
  
 Another question too. What about this query with and with/out the index:
  
 select count(*) from ranapsignal t
 where t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0
 and t.ranapsubmsgtype=0
 group by t.imsi;
  
 Thanks,
 James
  
 On Thu, Aug 21, 2014 at 10:38 PM, su...@certusnet.com.cn
 su...@certusnet.com.cn wrote:
 
 
  Hi James,
 
Recently I got trouble while trying to conduct some query performance test
  in my phoenix tables with secondary indexes.
I created a table called RANAPSIGNAL for my projects in phoenix via
  sqlline client and load data into the table. Then I create
an index on the specific column PKTTIME for the table RANAPSIGNAL while
  including other more columns for adjusting my
index query, like the following DDL:
   create index if not exists pkt_idx on RANAPSIGNAL (PKTTIME) include
  (SAC,NASMSGTYPE, RANAPMSGTYPE, RANAPSUBMSGTYPE );
 The index creation worked successfully without any errors. So, when I am
  trying to conduct such query as:
 select count (distinct t.imsi) from ranapsignal t where
  t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
  t.ranapsubmsgtype=0 ;
 Without secondary indexes, the final result got 536 distinct imsi, wihch
  is the right distinct count results. However, after I  create the above
secondary index PKT_IDX and reconducting the above count (distinct imsi)
  query, I got 2322 imsi rows which obviously are not the expected
distinct counts results. I used the explain grammar to observe the scan of
  the above select query and found that it definitely scaned over
the index table PKT_IDX. I then tried to conduct the following query with
  no count function:
  select distinct t.imsi from ranapsignal t where
  t.pkttime=140496480 and t.pkttime=140496569
 and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
  t.ranapsubmsgtype=0 ;
 And the result is right 536 distinct imsi over scanning the index table.
  By the way, imsi is one of the primary key when creating the table
  RANAPSIGNAL.
 Here are several considerations for my trouble and practice:
 1. Did you guys ever practice such 

[jira] [Created] (PHOENIX-1203) Uable to work for count (distinct col) queries via phoenix table with secondary indexes

2014-08-24 Thread Sun Fulin (JIRA)
Sun Fulin created PHOENIX-1203:
--

 Summary: Uable to work for count (distinct col) queries via 
phoenix table with secondary indexes
 Key: PHOENIX-1203
 URL: https://issues.apache.org/jira/browse/PHOENIX-1203
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.0.0, 4.1
 Environment: hadoop-2.2.0   hbase: Version 0.98.3-hadoop2
Reporter: Sun Fulin


I build the latest 4.1 rc0 from here:  
https://github.com/apache/phoenix/releases
And examine the count (distinct col) query within the new environment. However, 
the problem still exists with index queries as the following while the correct 
distinct query result is expected to be 536 for my project:

0: jdbc:phoenix:zookeeper1 select count (distinct t.imsi) from ranapsignal t 
where t.pkttime=140496480 and t.pkttime=140496569 and t.sac=32351 and 
t.nasmsgtype=0 and t.ranapmsgtype=0 and t.ranapsubmsgtype=0 ;
+-+
| COUNT(IMSI) |
+-+
| 2322|
+-+
1 row selected (70.572 seconds)

As James suggests, I conduct the query adding group by t.imsi with /without 
secondary indexes. And the result seems to be fine as they both got the correct 
536 distinct groups. 

Here are some considerations:
1. count (distinct col) query over index table did not work as expectation.
2. only distinct query over index table works fine.
3. If the phoenix version got some wrong configuration, correct me.

Thanks and Best Regards,
Sun
---
Hi Sun,
Thanks for the detailed description. Yes, your syntax is correct, and
it's definitely true that the count distinct query should return the
same result with and without the index. Would you mind trying this on
our latest 3.1 RC2 and/or 4.1 RC0 and if the problem still occurs to
file a JIRA?
 
One thing that make make it easier for your testing: do you know about
our NO_INDEX hint which forces the query *not* to use an index, like
this:
 
select /*+ NO_INDEX */ ...
 
Another question too. What about this query with and with/out the index:
 
select count(*) from ranapsignal t
where t.pkttime=140496480 and t.pkttime=140496569
and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0
and t.ranapsubmsgtype=0
group by t.imsi;
 
Thanks,
James
 
On Thu, Aug 21, 2014 at 10:38 PM, su...@certusnet.com.cn
su...@certusnet.com.cn wrote:


 Hi James,

   Recently I got trouble while trying to conduct some query performance test
 in my phoenix tables with secondary indexes.
   I created a table called RANAPSIGNAL for my projects in phoenix via
 sqlline client and load data into the table. Then I create
   an index on the specific column PKTTIME for the table RANAPSIGNAL while
 including other more columns for adjusting my
   index query, like the following DDL:
  create index if not exists pkt_idx on RANAPSIGNAL (PKTTIME) include
 (SAC,NASMSGTYPE, RANAPMSGTYPE, RANAPSUBMSGTYPE );
The index creation worked successfully without any errors. So, when I am
 trying to conduct such query as:
select count (distinct t.imsi) from ranapsignal t where
 t.pkttime=140496480 and t.pkttime=140496569
and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
 t.ranapsubmsgtype=0 ;
Without secondary indexes, the final result got 536 distinct imsi, wihch
 is the right distinct count results. However, after I  create the above
   secondary index PKT_IDX and reconducting the above count (distinct imsi)
 query, I got 2322 imsi rows which obviously are not the expected
   distinct counts results. I used the explain grammar to observe the scan of
 the above select query and found that it definitely scaned over
   the index table PKT_IDX. I then tried to conduct the following query with
 no count function:
 select distinct t.imsi from ranapsignal t where
 t.pkttime=140496480 and t.pkttime=140496569
and t.sac=32351 and t.nasmsgtype=0 and t.ranapmsgtype=0 and
 t.ranapsubmsgtype=0 ;
And the result is right 536 distinct imsi over scanning the index table.
 By the way, imsi is one of the primary key when creating the table
 RANAPSIGNAL.
Here are several considerations for my trouble and practice:
1. Did you guys ever practice such count (distinct) queries over phoenix
 table via secondary index?
2. I am not sure whether this problem was due to the index table, but my
 practice may assume that conclusion.
3. Corrects my if I am wrong with my previous operations (index creation,
 DDL grammar, etc..)
4. Any available hints or reply are best appreciated.

   Thanks and Best Regards
   Sun
 



--
This message was sent by Atlassian JIRA
(v6.2#6252)