[jira] [Updated] (PHOENIX-4925) Use a Variant Segment tree to organize Guide Post Info

2019-07-09 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Attachment: PHOENIX-4925.phoenix-stats.003.patch

> Use a Variant Segment tree to organize Guide Post Info
> --
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-4925.phoenix-stats.003.patch, 
> PHOENIX-4925.phoenix-stats.0502.patch, PHOENIX-4925.phoenix-stats.0510.patch
>
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = 'x' ORDER BY Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = 'x' ORDER BY pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = 'x' AND nonpk1__c ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column in the primary key.
>
>  Because guide post info is prefix encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScans(...) O(n), where n is the total count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info.
> We can use a Segment Tree to address both the memory and the performance concerns. The guide posts are partitioned into k chunks (k=1024?), each chunk is prefix encoded, and the encoded data forms a leaf node of the tree. Each inner node holds summary info (row count, data size) for the subtree rooted at that node.
> With this tree-like data structure, the increase in size compared to the current data structure (mainly the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity of "EXPLAIN" for queries a, b, and c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.
>  
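To make the proposal concrete, here is a minimal sketch of such a tree over prefix-encoded guide post chunks. The class and method names are illustrative only, not the patch's actual API: leaves hold one encoded chunk plus its totals, inner nodes hold only aggregated summaries, and a range estimate descends the tree so that at most the boundary chunks ever need decoding.

{code:java}
// Sketch only: a segment-tree-like index over guide post chunks.
// Leaf nodes keep a prefix-encoded chunk of up to k guide posts; inner nodes
// keep aggregated row/byte counts so estimates can skip whole subtrees
// without decoding them.
abstract class StatsNode {
    long rowCount;   // total rows covered by this subtree
    long byteCount;  // total bytes covered by this subtree
    byte[] maxKey;   // largest guide post key in this subtree
}

class LeafNode extends StatsNode {
    byte[] encodedChunk;  // prefix-encoded guide posts, decoded only on demand
    int guidePostCount;   // at most k guide posts per chunk
}

class InnerNode extends StatsNode {
    StatsNode left;
    StatsNode right;
}

class GuidePostsTree {
    private final StatsNode root;
    private final java.util.Comparator<byte[]> cmp;

    GuidePostsTree(StatsNode root, java.util.Comparator<byte[]> cmp) {
        this.root = root;
        this.cmp = cmp;
    }

    /** Estimated rows for guide posts with key <= endKey; O(log(n/k)) inner-node hops. */
    long estimateRowsUpTo(byte[] endKey) {
        return estimate(root, endKey);
    }

    private long estimate(StatsNode node, byte[] endKey) {
        if (cmp.compare(node.maxKey, endKey) <= 0) {
            return node.rowCount;  // whole subtree covered: use its summary, no decoding
        }
        if (node instanceof LeafNode) {
            return decodeAndSum((LeafNode) node, endKey);  // only a boundary chunk is decoded
        }
        InnerNode inner = (InnerNode) node;
        if (cmp.compare(inner.left.maxKey, endKey) >= 0) {
            return estimate(inner.left, endKey);  // answer lies entirely in the left subtree
        }
        return inner.left.rowCount + estimate(inner.right, endKey);  // left subtree fully covered
    }

    private long decodeAndSum(LeafNode leaf, byte[] endKey) {
        // Placeholder: decode the prefix-encoded chunk and sum rows for keys <= endKey.
        return 0L;
    }
}
{code}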



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4927) Disentangle the granularity of guidepost data from the granularity stored in the client cache

2019-07-08 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4927:
-
Summary: Disentangle the granularity of guidepost data from the granularity 
stored in the client cache  (was: Disentangle the granularity of guidepost data 
from that of client cached guide post data)

> Disentangle the granularity of guidepost data from the granularity stored in 
> the client cache
> -
>
> Key: PHOENIX-4927
> URL: https://issues.apache.org/jira/browse/PHOENIX-4927
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The expected behaviors:
>  # It should be possible to have 10MB guideposts for precision without forcing clients to cache that many guide posts.
>  # It should be possible to combine as many guideposts into chunks as needed for a particular query (see the sketch below).
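A rough sketch of the second point, with hypothetical names rather than the proposed API: the client collapses every j consecutive server-side guideposts into one coarser guidepost, carrying the summed row and byte counts, so server-side precision does not dictate client cache size.

{code:java}
import java.util.ArrayList;
import java.util.List;

final class GuidePostChunker {
    // Hypothetical client-side guidepost: a key plus the rows/bytes accumulated since the previous one.
    static final class GuidePost {
        final byte[] key;
        final long rows;
        final long bytes;
        GuidePost(byte[] key, long rows, long bytes) { this.key = key; this.rows = rows; this.bytes = bytes; }
    }

    /** Collapses every j consecutive server guideposts into one coarser client-side guidepost. */
    static List<GuidePost> coarsen(List<GuidePost> serverGuidePosts, int j) {
        List<GuidePost> combined = new ArrayList<>();
        long rows = 0, bytes = 0;
        for (int i = 0; i < serverGuidePosts.size(); i++) {
            GuidePost gp = serverGuidePosts.get(i);
            rows += gp.rows;
            bytes += gp.bytes;
            boolean lastInChunk = (i + 1) % j == 0 || i == serverGuidePosts.size() - 1;
            if (lastInChunk) {
                combined.add(new GuidePost(gp.key, rows, bytes)); // keep the chunk's last key, summed stats
                rows = 0;
                bytes = 0;
            }
        }
        return combined;
    }
}
{code}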



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use a Variant Segment tree to organize Guide Post Info

2019-05-10 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Attachment: PHOENIX-4925.phoenix-stats.0510.patch

> Use a Variant Segment tree to organize Guide Post Info
> --
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-4925.phoenix-stats.0502.patch, 
> PHOENIX-4925.phoenix-stats.0510.patch
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = 'x' ORDER BY Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = 'x' ORDER BY pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = 'x' AND nonpk1__c ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column in the primary key.
>
>  Because guide post info is prefix encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScans(...) O(n), where n is the total count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info.
> We can use a Segment Tree to address both the memory and the performance concerns. The guide posts are partitioned into k chunks (k=1024?), each chunk is prefix encoded, and the encoded data forms a leaf node of the tree. Each inner node holds summary info (row count, data size) for the subtree rooted at that node.
> With this tree-like data structure, the increase in size compared to the current data structure (mainly the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity of "EXPLAIN" for queries a, b, and c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use a Variant Segment tree to organize Guide Post Info

2019-05-03 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Summary: Use a Variant Segment tree to organize Guide Post Info  (was: Use 
a Variant Segment tree to organize Guide Post Info.)

> Use a Variant Segment tree to organize Guide Post Info
> --
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = 'x' ORDER BY Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = 'x' ORDER BY pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = 'x' AND nonpk1__c ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column in the primary key.
>
>  Because guide post info is prefix encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScans(...) O(n), where n is the total count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info.
> We can use a Segment Tree to address both the memory and the performance concerns. The guide posts are partitioned into k chunks (k=1024?), each chunk is prefix encoded, and the encoded data forms a leaf node of the tree. Each inner node holds summary info (row count, data size) for the subtree rooted at that node.
> With this tree-like data structure, the increase in size compared to the current data structure (mainly the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity of "EXPLAIN" for queries a, b, and c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use a Variant Segment tree to organize Guide Post Info.

2019-05-03 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Attachment: (was: PHOENIX-4925.phoenix-stats.0502.patch)

> Use a Variant Segment tree to organize Guide Post Info.
> ---
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = 'x' ORDER BY Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = 'x' ORDER BY pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = 'x' AND nonpk1__c ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column in the primary key.
>
>  Because guide post info is prefix encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScans(...) O(n), where n is the total count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info.
> We can use a Segment Tree to address both the memory and the performance concerns. The guide posts are partitioned into k chunks (k=1024?), each chunk is prefix encoded, and the encoded data forms a leaf node of the tree. Each inner node holds summary info (row count, data size) for the subtree rooted at that node.
> With this tree-like data structure, the increase in size compared to the current data structure (mainly the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity of "EXPLAIN" for queries a, b, and c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use a Variant Segment tree to organize Guide Post Info.

2019-05-03 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Summary: Use a Variant Segment tree to organize Guide Post Info.  (was: Use 
a Variant Segment tree to organize Guide Post Info)

> Use a Variant Segment tree to organize Guide Post Info.
> ---
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = 'x' ORDER BY Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = 'x' ORDER BY pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = 'x' AND nonpk1__c ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column in the primary key.
>
>  Because guide post info is prefix encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScans(...) O(n), where n is the total count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info.
> We can use a Segment Tree to address both the memory and the performance concerns. The guide posts are partitioned into k chunks (k=1024?), each chunk is prefix encoded, and the encoded data forms a leaf node of the tree. Each inner node holds summary info (row count, data size) for the subtree rooted at that node.
> With this tree-like data structure, the increase in size compared to the current data structure (mainly the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity of "EXPLAIN" for queries a, b, and c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use a Variant Segment tree to organize Guide Post Info

2019-05-03 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Summary: Use a Variant Segment tree to organize Guide Post Info  (was: Use 
Segment tree to organize Guide Post Info)

> Use a Variant Segment tree to organize Guide Post Info
> --
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = 'x' ORDER BY Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = 'x' ORDER BY pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = 'x' AND nonpk1__c ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column in the primary key.
>
>  Because guide post info is prefix encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScans(...) O(n), where n is the total count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info.
> We can use a Segment Tree to address both the memory and the performance concerns. The guide posts are partitioned into k chunks (k=1024?), each chunk is prefix encoded, and the encoded data forms a leaf node of the tree. Each inner node holds summary info (row count, data size) for the subtree rooted at that node.
> With this tree-like data structure, the increase in size compared to the current data structure (mainly the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity of "EXPLAIN" for queries a, b, and c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5265) Phoenix Test should use gold files for result comparison instead of using hard-corded comparison.

2019-04-30 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5265:
-
Description: 
Currently, in Phoenix tests, the comparison between the returned result and the 
expected result is always hard-coded. This pattern is especially widespread in 
E2E and integration tests that compare query results, including the result of 
the EXPLAIN query plan, and it has significantly impaired the productivity and 
efficiency of working on Phoenix.

The right approach is for each test case to write the generated result to a 
file under a new directory and then compare that file with a gold file under a 
"gold" directory. After we make some code changes and confirm they are correct 
by manually verifying a few specific test cases, we can safely replace the 
whole "gold" directory with the directory containing the newly generated test 
files from the test run. In this way, we can easily rebase all the tests.

For example, in BaseStatsCollectorIT.java we verify the estimated row size in 
the returned EXPLAIN query plan. The row size is determined by many factors: 
whether the column is encoded, whether the table is mutable, whether it uses a 
transaction provider, and whether that provider is TEPHRA or OMID. The code 
snippet "testWithMultiCF" below shows a typical test case; the comparisons 
between the returned result and the expected result are hard-coded in its 
asserts. Now imagine that we change the way stats are collected, the column 
encoding scheme, or the cell storage format for TEPHRA or OMID, all of which 
are likely to happen. We would then need to manually change all of those 
hard-coded comparisons everywhere, and it isn't trivial to re-calculate all the 
expected row sizes under the different conditions. Today it might take one or 
two weeks to rebase all the tests, which is a huge waste. We should use "gold" 
files here so that we can rebase the tests very easily.

BTW, the newly generated test result files and the gold files should be in XML 
or JSON, and the result of an "EXPLAIN" query should be in XML or JSON too, 
because those formats match the structure of a query plan.
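As a rough illustration of the gold-file approach (the directory names and JSON serialization here are assumptions, not part of this proposal's final design), a test would serialize its result, write it under a generated-results directory, and assert equality against the checked-in gold file; rebasing then amounts to copying the generated directory over the "gold" directory.

{code:java}
import static org.junit.Assert.assertEquals;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public final class GoldFileAssert {
    // Assumed layout: generated results land under target/, gold files are checked in.
    private static final Path GENERATED_DIR = Paths.get("target", "generated-test-results");
    private static final Path GOLD_DIR = Paths.get("src", "test", "resources", "gold");

    public static void assertMatchesGold(String testName, String actualResult) throws Exception {
        // Always write the actual result so a test run can be promoted to the new gold set.
        Path generated = GENERATED_DIR.resolve(testName + ".json");
        Files.createDirectories(generated.getParent());
        Files.write(generated, actualResult.getBytes(StandardCharsets.UTF_8));

        // Compare against the checked-in gold file.
        Path gold = GOLD_DIR.resolve(testName + ".json");
        String expected = new String(Files.readAllBytes(gold), StandardCharsets.UTF_8);
        assertEquals("Result differs from gold file " + gold, expected, actualResult);
    }
}
{code}

A test like testWithMultiCF below would then call assertMatchesGold("testWithMultiCF", serializedResult) instead of hard-coding the per-configuration byte counts in its asserts.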
{code:java}
// code placeholder
@Test
public void testWithMultiCF() throws Exception {
    int nRows = 20;
    Connection conn = getConnection(0);
    PreparedStatement stmt;
    conn.createStatement().execute(
            "CREATE TABLE " + fullTableName
                    + "(k VARCHAR PRIMARY KEY, a.v INTEGER, b.v INTEGER, c.v INTEGER NULL, d.v INTEGER NULL) "
                    + tableDDLOptions );
    stmt = conn.prepareStatement("UPSERT INTO " + fullTableName + " VALUES(?,?, ?, ?, ?)");
    byte[] val = new byte[250];
    for (int i = 0; i < nRows; i++) {
        stmt.setString(1, Character.toString((char)('a' + i)) + Bytes.toString(val));
        stmt.setInt(2, i);
        stmt.setInt(3, i);
        stmt.setInt(4, i);
        stmt.setInt(5, i);
        stmt.executeUpdate();
    }
    conn.commit();
    stmt = conn.prepareStatement("UPSERT INTO " + fullTableName + "(k, c.v, d.v) VALUES(?,?,?)");
    for (int i = 0; i < 5; i++) {
        stmt.setString(1, Character.toString((char)('a' + 'z' + i)) + Bytes.toString(val));
        stmt.setInt(2, i);
        stmt.setInt(3, i);
        stmt.executeUpdate();
    }
    conn.commit();

    ResultSet rs;
    String actualExplainPlan;
    collectStatistics(conn, fullTableName);
    List keyRanges = getAllSplits(conn, fullTableName);
    assertEquals(26, keyRanges.size());
    rs = conn.createStatement().executeQuery("EXPLAIN SELECT * FROM " + fullTableName);
    actualExplainPlan = QueryUtil.getExplainPlan(rs);
    assertEquals(
            "CLIENT 26-CHUNK 25 ROWS " + (columnEncoded ? ( mutable ? "12530" : "14190" )
                    : (TransactionFactory.Provider.OMID.name().equals(transactionProvider)) ? "25320" : "12420")
                    + " BYTES PARALLEL 1-WAY FULL SCAN OVER " + physicalTableName,
            actualExplainPlan);

    ConnectionQueryServices services = conn.unwrap(PhoenixConnection.class).getQueryServices();
    List regions = services.getAllTableRegions(Bytes.toBytes(physicalTableName));
    assertEquals(1, regions.size());

    collectStatistics(conn, fullTableName, Long.toString(1000));
    keyRanges = getAllSplits(conn, fullTableName);
    boolean oneCellPerColFamilyStorageScheme = !mutable && columnEncoded;
    boolean hasShadowCells = TransactionFactory.Provider.OMID.name().equals(transactionProvider);
    assertEquals(oneCellPerColFamilyStorageScheme ? 14 : hasShadowCells ? 24 : 13, keyRanges.size());

    rs = conn
            .createStatement()
            .executeQuery(
                    "SELECT COLUMN_FAMILY,SUM(GUIDE_POSTS_ROW_COUNT),SUM(GUIDE_POSTS_WIDTH),COUNT(*) from \"SYSTEM\".STATS where PHYSICAL_NAME = '"
                            + physicalTableName + "' GROUP BY COLUMN_FAMILY ORDER BY COLUMN_FAMILY");

    assertTrue(rs.next());

[jira] [Updated] (PHOENIX-5265) Phoenix Test should use gold files for result comparison instead of using hard-corded comparison.

2019-04-30 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5265:
-
Description: 
Currently, in Phoenix tests, the comparison between the returned result and the 
expected result is always hard-coded. This pattern is especially widespread in 
E2E and integration tests that compare query results, including the result of 
the EXPLAIN query plan, and it has significantly impaired the productivity and 
efficiency of working on Phoenix.

The right approach is for each test case to write the generated result to a 
file under a new directory and then compare that file with a gold file under a 
"gold" directory. After we make some code changes and confirm they are correct 
by manually verifying a few specific test cases, we can safely replace the 
whole "gold" directory with the directory containing the newly generated test 
files from the test run. In this way, we can easily rebase all the tests.

For example, in BaseStatsCollectorIT.java we verify the estimated row size in 
the returned EXPLAIN query plan. The row size is determined by many factors: 
whether the column is encoded, whether the table is mutable, whether it uses a 
transaction provider, and whether that provider is TEPHRA or OMID. The code 
snippet "testWithMultiCF" below shows a typical test case; the comparisons 
between the returned result and the expected result are hard-coded in its 
asserts. Now imagine that we change the way stats are collected, the column 
encoding scheme, or the cell storage format for TEPHRA or OMID, all of which 
are likely to happen. We would then need to manually change all of those 
hard-coded comparisons everywhere, and it isn't trivial to re-calculate all the 
expected row sizes under the different conditions. Today it might take one or 
two weeks to rebase all the tests, which is a huge waste. We should use "gold" 
files here so that we can rebase the tests very easily.

 
{code:java}
// code placeholder
@Test
public void testWithMultiCF() throws Exception {
int nRows = 20;
Connection conn = getConnection(0);
PreparedStatement stmt;
conn.createStatement().execute(
"CREATE TABLE " + fullTableName
+ "(k VARCHAR PRIMARY KEY, a.v INTEGER, b.v INTEGER, c.v 
INTEGER NULL, d.v INTEGER NULL) "
+ tableDDLOptions );
stmt = conn.prepareStatement("UPSERT INTO " + fullTableName + " VALUES(?,?, 
?, ?, ?)");
byte[] val = new byte[250];
for (int i = 0; i < nRows; i++) {
stmt.setString(1, Character.toString((char)('a' + i)) + 
Bytes.toString(val));
stmt.setInt(2, i);
stmt.setInt(3, i);
stmt.setInt(4, i);
stmt.setInt(5, i);
stmt.executeUpdate();
}
conn.commit();
stmt = conn.prepareStatement("UPSERT INTO " + fullTableName + "(k, c.v, 
d.v) VALUES(?,?,?)");
for (int i = 0; i < 5; i++) {
stmt.setString(1, Character.toString((char)('a' + 'z' + i)) + 
Bytes.toString(val));
stmt.setInt(2, i);
stmt.setInt(3, i);
stmt.executeUpdate();
}
conn.commit();

ResultSet rs;
String actualExplainPlan;
collectStatistics(conn, fullTableName);
List keyRanges = getAllSplits(conn, fullTableName);
assertEquals(26, keyRanges.size());
rs = conn.createStatement().executeQuery("EXPLAIN SELECT * FROM " + 
fullTableName);
actualExplainPlan = QueryUtil.getExplainPlan(rs);
assertEquals(
"CLIENT 26-CHUNK 25 ROWS " + (columnEncoded ? ( mutable ? "12530" : 
"14190" ) : 
(TransactionFactory.Provider.OMID.name().equals(transactionProvider)) ? "25320" 
: "12420") +
" BYTES PARALLEL 1-WAY FULL SCAN OVER " + physicalTableName,
actualExplainPlan);

ConnectionQueryServices services = 
conn.unwrap(PhoenixConnection.class).getQueryServices();
List regions = 
services.getAllTableRegions(Bytes.toBytes(physicalTableName));
assertEquals(1, regions.size());

collectStatistics(conn, fullTableName, Long.toString(1000));
keyRanges = getAllSplits(conn, fullTableName);
boolean oneCellPerColFamilyStorageScheme = !mutable && columnEncoded;
boolean hasShadowCells = 
TransactionFactory.Provider.OMID.name().equals(transactionProvider);
assertEquals(oneCellPerColFamilyStorageScheme ? 14 : hasShadowCells ? 24 : 
13, keyRanges.size());

rs = conn
.createStatement()
.executeQuery(
"SELECT 
COLUMN_FAMILY,SUM(GUIDE_POSTS_ROW_COUNT),SUM(GUIDE_POSTS_WIDTH),COUNT(*) from 
\"SYSTEM\".STATS where PHYSICAL_NAME = '"
+ physicalTableName + "' GROUP BY COLUMN_FAMILY 
ORDER BY COLUMN_FAMILY");

assertTrue(rs.next());
assertEquals("A", rs.getString(1));
assertEquals(25, rs.getInt(2));
assertEquals(columnEncoded ? ( mutable ? 12530 : 14190 ) : hasShadowCells ? 
25320 : 12420, rs.getInt(3));

[jira] [Created] (PHOENIX-5265) Phoenix Test should use gold files for result comparison instead of using hard-corded comparison.

2019-04-30 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-5265:


 Summary: Phoenix Test should use gold files for result comparison 
instead of using hard-corded comparison.
 Key: PHOENIX-5265
 URL: https://issues.apache.org/jira/browse/PHOENIX-5265
 Project: Phoenix
  Issue Type: Improvement
 Environment: {code:java}
// code placeholder
@Test
public void testWithMultiCF() throws Exception {
int nRows = 20;
Connection conn = getConnection(0);
PreparedStatement stmt;
conn.createStatement().execute(
"CREATE TABLE " + fullTableName
+ "(k VARCHAR PRIMARY KEY, a.v INTEGER, b.v INTEGER, c.v 
INTEGER NULL, d.v INTEGER NULL) "
+ tableDDLOptions );
stmt = conn.prepareStatement("UPSERT INTO " + fullTableName + " VALUES(?,?, 
?, ?, ?)");
byte[] val = new byte[250];
for (int i = 0; i < nRows; i++) {
stmt.setString(1, Character.toString((char)('a' + i)) + 
Bytes.toString(val));
stmt.setInt(2, i);
stmt.setInt(3, i);
stmt.setInt(4, i);
stmt.setInt(5, i);
stmt.executeUpdate();
}
conn.commit();
stmt = conn.prepareStatement("UPSERT INTO " + fullTableName + "(k, c.v, 
d.v) VALUES(?,?,?)");
for (int i = 0; i < 5; i++) {
stmt.setString(1, Character.toString((char)('a' + 'z' + i)) + 
Bytes.toString(val));
stmt.setInt(2, i);
stmt.setInt(3, i);
stmt.executeUpdate();
}
conn.commit();

ResultSet rs;
String actualExplainPlan;
collectStatistics(conn, fullTableName);
List keyRanges = getAllSplits(conn, fullTableName);
assertEquals(26, keyRanges.size());
rs = conn.createStatement().executeQuery("EXPLAIN SELECT * FROM " + 
fullTableName);
actualExplainPlan = QueryUtil.getExplainPlan(rs);
assertEquals(
"CLIENT 26-CHUNK 25 ROWS " + (columnEncoded ? ( mutable ? "12530" : 
"14190" ) : 
(TransactionFactory.Provider.OMID.name().equals(transactionProvider)) ? "25320" 
: "12420") +
" BYTES PARALLEL 1-WAY FULL SCAN OVER " + physicalTableName,
actualExplainPlan);

ConnectionQueryServices services = 
conn.unwrap(PhoenixConnection.class).getQueryServices();
List regions = 
services.getAllTableRegions(Bytes.toBytes(physicalTableName));
assertEquals(1, regions.size());

collectStatistics(conn, fullTableName, Long.toString(1000));
keyRanges = getAllSplits(conn, fullTableName);
boolean oneCellPerColFamilyStorageScheme = !mutable && columnEncoded;
boolean hasShadowCells = 
TransactionFactory.Provider.OMID.name().equals(transactionProvider);
assertEquals(oneCellPerColFamilyStorageScheme ? 14 : hasShadowCells ? 24 : 
13, keyRanges.size());

rs = conn
.createStatement()
.executeQuery(
"SELECT 
COLUMN_FAMILY,SUM(GUIDE_POSTS_ROW_COUNT),SUM(GUIDE_POSTS_WIDTH),COUNT(*) from 
\"SYSTEM\".STATS where PHYSICAL_NAME = '"
+ physicalTableName + "' GROUP BY COLUMN_FAMILY 
ORDER BY COLUMN_FAMILY");

assertTrue(rs.next());
assertEquals("A", rs.getString(1));
assertEquals(25, rs.getInt(2));
assertEquals(columnEncoded ? ( mutable ? 12530 : 14190 ) : hasShadowCells ? 
25320 : 12420, rs.getInt(3));
assertEquals(oneCellPerColFamilyStorageScheme ? 13 : hasShadowCells ? 23 : 
12, rs.getInt(4));

assertTrue(rs.next());
assertEquals("B", rs.getString(1));
assertEquals(oneCellPerColFamilyStorageScheme ? 25 : 20, rs.getInt(2));
assertEquals(columnEncoded ? ( mutable ? 5600 : 7260 ) : hasShadowCells ? 
11260 : 5540, rs.getInt(3));
assertEquals(oneCellPerColFamilyStorageScheme ? 7 : hasShadowCells ? 10 : 
5, rs.getInt(4));

assertTrue(rs.next());
assertEquals("C", rs.getString(1));
assertEquals(25, rs.getInt(2));
assertEquals(columnEncoded ? ( mutable ? 7005 : 7280 ) : hasShadowCells ? 
14085 : 6930, rs.getInt(3));
assertEquals(hasShadowCells ? 13 : 7, rs.getInt(4));

assertTrue(rs.next());
assertEquals("D", rs.getString(1));
assertEquals(25, rs.getInt(2));
assertEquals(columnEncoded ? ( mutable ? 7005 : 7280 ) : hasShadowCells ? 
14085 : 6930, rs.getInt(3));
assertEquals(hasShadowCells ? 13 : 7, rs.getInt(4));

assertFalse(rs.next());

// Disable stats
conn.createStatement().execute("ALTER TABLE " + fullTableName + 
" SET " + PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH + "=0");
collectStatistics(conn, fullTableName);
// Assert that there are no more guideposts
rs = conn.createStatement().executeQuery("SELECT count(1) FROM " + 
PhoenixDatabaseMetaData.SYSTEM_STATS_NAME + 
" WHERE " + PhoenixDatabaseMetaData.PHYSICAL_NAME + "='" + 
physicalTableName + "' AND " + PhoenixDatabaseMetaData.COLUMN_FAMILY + " IS NOT 
NULL");
assertTrue(rs.next());
assertEquals(0, rs.getLong(1));
assertFalse(rs.next());
rs = 

[jira] [Updated] (PHOENIX-5262) Wrong Result on Salted table with Varbinary PK

2019-04-29 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5262:
-
Comment: was deleted

(was: [~dbwong], The description for this bug isn't accurate. This is a bug 
generally for Salted Table regardless of PK. If the PK is "k INTEGER PRIMARY 
KEY", we actually have the same issue.)

> Wrong Result on Salted table with Varbinary PK
> --
>
> Key: PHOENIX-5262
> URL: https://issues.apache.org/jira/browse/PHOENIX-5262
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Daniel Wong
>Assignee: Daniel Wong
>Priority: Major
> Attachments: PHOENIX-5262.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5176) KeyRange.compareUpperRange(KeyRang 1, KeyRang 2) returns wrong result when two key ranges have the same upper bound values but one is inclusive and another is exclusive

2019-03-05 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5176:
-
Fix Version/s: (was: 4.14.2)
   (was: 5.1.0)

> KeyRange.compareUpperRange(KeyRang 1, KeyRang 2) returns wrong result when 
> two key ranges have the same upper bound values but one is inclusive and 
> another is exclusive 
> -
>
> Key: PHOENIX-5176
> URL: https://issues.apache.org/jira/browse/PHOENIX-5176
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Blocker
> Fix For: 4.15.0
>
>
> In KeyRange.java:
>     public static int compareUpperRange(KeyRange rowKeyRange1, KeyRange rowKeyRange2) {
>         int result = Boolean.compare(rowKeyRange1.upperUnbound(), rowKeyRange2.upperUnbound());
>         if (result != 0) {
>             return result;
>         }
>         result = Bytes.BYTES_COMPARATOR.compare(rowKeyRange1.getUpperRange(), rowKeyRange2.getUpperRange());
>         if (result != 0) {
>             return result;
>         }
>         return Boolean.compare(rowKeyRange2.isUpperInclusive(), rowKeyRange1.isUpperInclusive());
>     }
> The operands of the last comparison are swapped; it should read "return Boolean.compare(rowKeyRange1.isUpperInclusive(), rowKeyRange2.isUpperInclusive());". Given rowKeyRange1 [3, 5) and rowKeyRange2 [3, 5], the function should return -1, but it currently returns 1 because of this bug.
>  
> KeyRange.compareUpperRange is only used in KeyRange.intersect(List rowKeyRanges1, List rowKeyRanges2). Given rowKeyRanges1 {[3, 5), [5, 6)} and rowKeyRanges2 {[3, 5], [6, 7]}, the function should return {[3, 5), [5, 5]}, i.e., {[3, 5]}, but it seems that it currently returns {[3, 5)} because of this bug.
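For illustration only, a self-contained check of the swapped comparison (plain Java, no Phoenix classes), showing why [3, 5) must compare as less than [3, 5] once the operands are in the proposed order:

{code:java}
public final class CompareUpperRangeFixCheck {
    public static void main(String[] args) {
        // rowKeyRange1 = [3, 5): exclusive upper bound; rowKeyRange2 = [3, 5]: inclusive upper bound.
        boolean r1UpperInclusive = false;
        boolean r2UpperInclusive = true;

        int buggy = Boolean.compare(r2UpperInclusive, r1UpperInclusive); // current code: operands swapped
        int fixed = Boolean.compare(r1UpperInclusive, r2UpperInclusive); // proposed fix

        System.out.println("buggy = " + buggy); // 1  -> [3, 5) incorrectly sorts after [3, 5]
        System.out.println("fixed = " + fixed); // -1 -> [3, 5) correctly sorts before [3, 5]
    }
}
{code}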



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-5176) KeyRange.compareUpperRange(KeyRang 1, KeyRang 2) returns wrong result when two key ranges have the same upper bound values but one is inclusive and another is exclusive

2019-03-05 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-5176:


 Summary: KeyRange.compareUpperRange(KeyRang 1, KeyRang 2) returns 
wrong result when two key ranges have the same upper bound values but one is 
inclusive and another is exclusive 
 Key: PHOENIX-5176
 URL: https://issues.apache.org/jira/browse/PHOENIX-5176
 Project: Phoenix
  Issue Type: Bug
Reporter: Bin Shi


In KeyRange.java:

    public static int compareUpperRange(KeyRange rowKeyRange1, KeyRange rowKeyRange2) {
        int result = Boolean.compare(rowKeyRange1.upperUnbound(), rowKeyRange2.upperUnbound());
        if (result != 0) {
            return result;
        }
        result = Bytes.BYTES_COMPARATOR.compare(rowKeyRange1.getUpperRange(), rowKeyRange2.getUpperRange());
        if (result != 0) {
            return result;
        }
        return Boolean.compare(rowKeyRange2.isUpperInclusive(), rowKeyRange1.isUpperInclusive());
    }

The operands of the last comparison are swapped; it should read "return 
Boolean.compare(rowKeyRange1.isUpperInclusive(), rowKeyRange2.isUpperInclusive());". 
Given rowKeyRange1 [3, 5) and rowKeyRange2 [3, 5], the function should return -1, 
but it currently returns 1 because of this bug.

KeyRange.compareUpperRange is only used in KeyRange.intersect(List rowKeyRanges1, 
List rowKeyRanges2). Given rowKeyRanges1 {[3, 5), [5, 6)} and rowKeyRanges2 
{[3, 5], [6, 7]}, the function should return {[3, 5), [5, 5]}, i.e., {[3, 5]}, 
but it seems that it currently returns {[3, 5)} because of this bug.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-5176) KeyRange.compareUpperRange(KeyRang 1, KeyRang 2) returns wrong result when two key ranges have the same upper bound values but one is inclusive and another is exclusiv

2019-03-05 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi reassigned PHOENIX-5176:


Assignee: Bin Shi

> KeyRange.compareUpperRange(KeyRang 1, KeyRang 2) returns wrong result when 
> two key ranges have the same upper bound values but one is inclusive and 
> another is exclusive 
> -
>
> Key: PHOENIX-5176
> URL: https://issues.apache.org/jira/browse/PHOENIX-5176
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> In KeyRange.java:
>     public static int compareUpperRange(KeyRange rowKeyRange1, KeyRange rowKeyRange2) {
>         int result = Boolean.compare(rowKeyRange1.upperUnbound(), rowKeyRange2.upperUnbound());
>         if (result != 0) {
>             return result;
>         }
>         result = Bytes.BYTES_COMPARATOR.compare(rowKeyRange1.getUpperRange(), rowKeyRange2.getUpperRange());
>         if (result != 0) {
>             return result;
>         }
>         return Boolean.compare(rowKeyRange2.isUpperInclusive(), rowKeyRange1.isUpperInclusive());
>     }
> The operands of the last comparison are swapped; it should read "return Boolean.compare(rowKeyRange1.isUpperInclusive(), rowKeyRange2.isUpperInclusive());". Given rowKeyRange1 [3, 5) and rowKeyRange2 [3, 5], the function should return -1, but it currently returns 1 because of this bug.
>  
> KeyRange.compareUpperRange is only used in KeyRange.intersect(List rowKeyRanges1, List rowKeyRanges2). Given rowKeyRanges1 {[3, 5), [5, 6)} and rowKeyRanges2 {[3, 5], [6, 7]}, the function should return {[3, 5), [5, 5]}, i.e., {[3, 5]}, but it seems that it currently returns {[3, 5)} because of this bug.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info

2019-02-19 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Description: 
As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
 a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
 b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
 c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
 d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
 e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
  
 By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O( n ) , where n is the total 
count of guide posts.

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmission, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use Segment Tree to address both memory and performance concerns. The 
guide posts are partitioned to k chunks (k=1024?), each chunk is encoded by 
prefix encoding and the encoded data is a leaf node of the tree. The inner node 
contains summary info (the count of rows, the data size) of the sub tree rooted 
at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAIN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).

The tree can also integrate AVL and B+ characteristics to support partial 
load/unload when interacting with stats client cache.

 

  was:
As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
 a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
 b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
 c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
 d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
 e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
  
 By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
count of guide posts.

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmission, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use Segment Tree to address both memory and performance concerns. The 
guide posts are partitioned to k chunks (k=1024?), each chunk is encoded by 
prefix encoding and the encoded data is a leaf node of the tree. The inner node 
contains summary info (the count of rows, the data size) of the sub tree rooted 
at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAIN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).

The tree can also integrate AVL and B+ characteristics to support partial 
load/unload when interacting with stats client cache.

 


> Use Segment tree to organize Guide Post Info
> 
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> As reported, Query compilation (for the sample queries showed below), 
> especially deriving estimation and generating parallel scans from guide 
> posts, 

[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info

2019-02-19 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Summary: Use Segment tree to organize Guide Post Info  (was: Use 
Segment/SUM tree to organize Guide Post Info)

> Use Segment tree to organize Guide Post Info
> 
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, became much slower after we introduced Phoenix Stats.
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = 'x' ORDER BY Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = 'x' ORDER BY pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = 'x' AND nonpk1__c ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column in the primary key.
>
>  Because guide post info is prefix encoded, we have to decode and traverse the guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScans(...) O(n), where n is the total count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and the over-the-wire encoding for guide post info.
> We can use something like a Sum Tree (or even a Binary Indexed Tree) to address both the memory and the performance concerns. The guide posts are partitioned into k chunks (k=1024?), each chunk is prefix encoded, and the encoded data forms a leaf node of the tree. Each inner node holds summary info (row count, data size) for the subtree rooted at that node.
> With this tree-like data structure, the increase in size compared to the current data structure (mainly the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity of "EXPLAIN" for queries a, b, and c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ tree characteristics to support partial load/unload when interacting with the stats client cache.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info

2019-02-19 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Description: 
As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
 a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
 b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
 c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
 d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
 e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
  
 By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
count of guide posts.

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmission, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use Segment Tree to address both memory and performance concerns. The 
guide posts are partitioned to k chunks (k=1024?), each chunk is encoded by 
prefix encoding and the encoded data is a leaf node of the tree. The inner node 
contains summary info (the count of rows, the data size) of the sub tree rooted 
at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAIN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).

The tree can also integrate AVL and B+ characteristics to support partial 
load/unload when interacting with stats client cache.

 

  was:
As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
 a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
 b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
 c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
 d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
 e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
  
 By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
count of guide posts.

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmission, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use something like Sum Tree (even Binary Indexed Tree) to address both 
memory and performance concerns. The guide posts are partitioned to k chunks 
(k=1024?), each chunk is encoded by prefix encoding and the encoded data is a 
leaf node of the tree. The inner node contains summary info (the count of rows, 
the data size) of the sub tree rooted at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAIN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).

The tree can also integrate AVL and B+ characteristics to support partial 
load/unload when interacting with stats client cache.

 


> Use Segment tree to organize Guide Post Info
> 
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> As reported, Query compilation (for the sample queries showed below), 
> especially deriving estimation and generating 

[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2019-02-13 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: PHOENIX-5069.4.x-HBase-1.3.001.patch

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.001.patch, 
> PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.002.patch, 
> PHOENIX-5069.4.x-HBase-1.3.001.patch, PHOENIX-5069.4.x-HBase-1.4.001.patch, 
> PHOENIX-5069.master.001.patch, PHOENIX-5069.master.002.patch, 
> PHOENIX-5069.master.003.patch, PHOENIX-5069.master.004.patch, 
> PHOENIX-5069.patch
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy. A cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the 
> graph above, fetching stats from the cache is a blocking operation: when 
> there is a cache miss, it takes a round trip over the wire to scan the 
> SYSTEM.STATS table, get the latest stats info, rebuild the cache, and finally 
> return the stats to the Compiler/Optimizer. Whenever there is a cache miss, 
> this blocking call causes a significant performance penalty and shows up as 
> periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 
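A minimal sketch of the asynchronous-refresh idea using Guava's LoadingCache. This is only an outline under assumed names and the default 900s period, not the actual patch: refreshAfterWrite keeps serving the existing entry while reload() recomputes it in the background, so callers only ever block on the very first load.

{code:java}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListenableFutureTask;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StatsCacheSketch {
    // Placeholder for the per-table guide post payload.
    static class TableStats {
        final long byteCount;
        TableStats(long byteCount) { this.byteCount = byteCount; }
    }

    private final Executor refreshExecutor = Executors.newSingleThreadExecutor();

    private final LoadingCache<String, TableStats> cache = CacheBuilder.newBuilder()
            .refreshAfterWrite(900, TimeUnit.SECONDS) // matches the default 900s TTL
            .build(new CacheLoader<String, TableStats>() {
                @Override
                public TableStats load(String tableName) {
                    return scanSystemStats(tableName); // blocking, but only for the very first access
                }

                @Override
                public ListenableFuture<TableStats> reload(String tableName, TableStats oldValue) {
                    // Keep serving oldValue; refresh the entry in the background.
                    ListenableFutureTask<TableStats> task =
                            ListenableFutureTask.create(() -> scanSystemStats(tableName));
                    refreshExecutor.execute(task);
                    return task;
                }
            });

    TableStats getStats(String tableName) {
        return cache.getUnchecked(tableName);
    }

    // Stand-in for the round trip that scans the SYSTEM.STATS table.
    private TableStats scanSystemStats(String tableName) {
        return new TableStats(0L);
    }
}
{code}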



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4594) Query Compilation where there are many guideposts causes overhead

2019-02-10 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4594:
-
Summary: Query Compilation where there are many guideposts causes overhead  
(was: Perform binary search on guideposts during query compilation)

> Query Compilation where there are many guideposts causes overhead
> -
>
> Key: PHOENIX-4594
> URL: https://issues.apache.org/jira/browse/PHOENIX-4594
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-4594-0913.patch, PHOENIX-4594_0917.patch, 
> PHOENIX-4594_0918.patch
>
>
> If there are many guideposts, performance will suffer during query 
> compilation because we do a linear search of the guideposts to find the 
> intersection with the scan ranges. Instead, in 
> BaseResultIterators.getParallelScans() we should populate an array of 
> guideposts and perform a binary search. 
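A small sketch of what the description proposes, assuming the guideposts have already been materialized into a sorted array of row keys (names are illustrative, not the actual patch): instead of the linear walk, use binary search to find the first guidepost at or after a scan's start key.

{code:java}
import java.util.Arrays;
import org.apache.hadoop.hbase.util.Bytes;

final class GuidePostSearch {
    /** Index of the first guidepost whose key is >= startKey (sortedGuidePosts must be sorted). */
    static int firstGuidePostAtOrAfter(byte[][] sortedGuidePosts, byte[] startKey) {
        int idx = Arrays.binarySearch(sortedGuidePosts, startKey, Bytes.BYTES_COMPARATOR);
        // Arrays.binarySearch returns (-(insertionPoint) - 1) when the key is not present.
        return idx >= 0 ? idx : -(idx + 1);
    }
}
{code}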



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4594) Query Compilation where there are many guideposts causes overhead

2019-02-10 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4594:
-
Description: If there are many guideposts, performance will suffer during 
query compilation.  (was: If there are many guideposts, performance will suffer 
during query compilation because we do a linear search of the guideposts to 
find the intersection with the scan ranges. Instead, in 
BaseResultIterators.getParallelScans() we should populate an array of 
guideposts and perform a binary search. )

> Query Compilation where there are many guideposts causes overhead
> -
>
> Key: PHOENIX-4594
> URL: https://issues.apache.org/jira/browse/PHOENIX-4594
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-4594-0913.patch, PHOENIX-4594_0917.patch, 
> PHOENIX-4594_0918.patch
>
>
> If there are many guideposts, performance will suffer during query 
> compilation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2019-02-10 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: PHOENIX-5069.master.004.patch

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.001.patch, 
> PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.002.patch, 
> PHOENIX-5069.4.x-HBase-1.4.001.patch, PHOENIX-5069.master.001.patch, 
> PHOENIX-5069.master.002.patch, PHOENIX-5069.master.003.patch, 
> PHOENIX-5069.master.004.patch, PHOENIX-5069.patch
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2019-02-05 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: PHOENIX-5069.4.x-HBase-1.4.001.patch

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.001.patch, 
> PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.002.patch, 
> PHOENIX-5069.4.x-HBase-1.4.001.patch, PHOENIX-5069.master.001.patch, 
> PHOENIX-5069.master.002.patch, PHOENIX-5069.master.003.patch, 
> PHOENIX-5069.patch
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (PHOENIX-4924) Provide strong guarantee for Stats Write to prevent inconsistent and incomplete data issue

2019-02-05 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi resolved PHOENIX-4924.
--
Resolution: Fixed

> Provide strong guarantee for Stats Write to prevent inconsistent and 
> incomplete data issue
> --
>
> Key: PHOENIX-4924
> URL: https://issues.apache.org/jira/browse/PHOENIX-4924
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The Stats could be inconsistent and incomplete due to region servers going 
> down, RPC failures when committing guidepost data to the stats table, and the 
> lack of retries and atomic operations. The expected behavior from the Platform 
> is “We 
> need a strong guarantee that on stats write there are sufficient retries and 
> monitoring in place so that stats writes are resilient such that we will not 
> end up with inconsistent, partial or empty stats until the next run of UPDATE 
> STATISTICS”. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4924) Provide strong guarantee for Stats Write to prevent inconsistent and incomplete data issue

2019-02-05 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4924:
-
Comment: was deleted

(was: h2. Conclusion:
 # The test result clearly shows that the change doesn't introduce any new 
performance regressions or issues.
 # The test result clearly shows that the pattern of “periodic query spikes” 
observed with the blocking stats cache (see the design document [Use 
Asynchronous Refresh to Provide Non-blocking Phoenix Stats 
Cache|https://salesforce.quip.com/rxokAkVatiQO] for details) is gone.
 # When analyzing the noise in the Flighting test results, I suspected it was 
caused by multiple async stats-loading tasks being triggered for the same 
cached entry during the same period. After exploring the Google Guava Cache 
code base, I found that this negative case won't happen, because Guava Cache 
won't try to reload a cache entry while another thread is already performing 
the refresh.

h2. Test Result:
h3. Baseline 1st Run

[screenshot: baseline 1st run results]
h3. Flighting 1st Run

[screenshot: flighting 1st run results]
h3. Baseline 4th Run

[screenshot: baseline 4th run results]
h3. Flighting 4th Run

[screenshot: flighting 4th run results]
)

> Provide strong guarantee for Stats Write to prevent inconsistent and 
> incomplete data issue
> --
>
> Key: PHOENIX-4924
> URL: https://issues.apache.org/jira/browse/PHOENIX-4924
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The Stats could be inconsistent and incomplete due to region servers going 
> down, RPC failures when committing guidepost data to the stats table, and the 
> lack of retries and atomic operations. The expected behavior from the Platform 
> is “We 
> need a strong guarantee that on stats write there are sufficient retries and 
> monitoring in place so that stats writes are resilient such that we will not 
> end up with inconsistent, partial or empty stats until the next run of UPDATE 
> STATISTICS”. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (PHOENIX-4924) Provide strong guarantee for Stats Write to prevent inconsistent and incomplete data issue

2019-02-05 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi reopened PHOENIX-4924:
--

> Provide strong guarantee for Stats Write to prevent inconsistent and 
> incomplete data issue
> --
>
> Key: PHOENIX-4924
> URL: https://issues.apache.org/jira/browse/PHOENIX-4924
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The Stats could be inconsistent and incomplete due to region servers going 
> down, RPC failures when committing guidepost data to the stats table, and the 
> lack of retries and atomic operations. The expected behavior from the Platform 
> is “We 
> need a strong guarantee that on stats write there are sufficient retries and 
> monitoring in place so that stats writes are resilient such that we will not 
> end up with inconsistent, partial or empty stats until the next run of UPDATE 
> STATISTICS”. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2019-01-30 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.001.patch

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-5069-4.14.1-hbase-1.3-phoenix-stats.001.patch, 
> PHOENIX-5069.master.001.patch, PHOENIX-5069.master.002.patch, 
> PHOENIX-5069.master.003.patch, PHOENIX-5069.patch
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2019-01-23 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: PHOENIX-5069.master.003.patch

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-5069.master.001.patch, 
> PHOENIX-5069.master.002.patch, PHOENIX-5069.master.003.patch, 
> PHOENIX-5069.patch
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2019-01-18 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: PHOENIX-5069.master.002.patch

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-5069.master.001.patch, 
> PHOENIX-5069.master.002.patch, PHOENIX-5069.patch
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2019-01-15 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: PHOENIX-5069.master.001.patch

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-5069.master.001.patch, PHOENIX-5069.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2018-12-14 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi reassigned PHOENIX-5069:


Assignee: Bin Shi

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2018-12-13 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Description: 
The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached entry 
expires after a given amount of time (900s by default) has passed since the 
entry was created, which leads to a cache miss the next time the 
Compiler/Optimizer fetches stats from the cache. As you can see from the above 
graph, fetching stats from the cache is a blocking operation: on a cache miss, 
there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
latest stats, rebuild the cache and finally return the stats to the 
Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes a 
significant performance penalty and produces periodic spikes.

*This Jira suggests using an asynchronous refresh mechanism to provide a 
non-blocking cache. For details, please see the linked design document below.*

[~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
[~sergey soldatov] 

  was:
The current Phoenix Stats Cache uses TTL based eviction policy. A cached entry 
will expire after a given amount of time (900s by default) passed since the 
entry's been created. This will lead to cache miss when Compiler/Optimizer 
fetches stats from cache at the next time. As you can see from the above graph, 
fetching stats from the cache is a blocking operation — when there is cache 
miss, it has a round trip over the wire to scan the SYSTEM.STATS Table and to 
get the latest stats info, rebuild the cache and finally return the stats to 
the Compiler/Optimizer. Whenever there is a cache miss, this blocking call 
causes significant performance penalty and see periodic spikes.

This Jira suggests to use asynchronous refresh mechanism to provide a 
non-blocking cache. 


> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Priority: Major
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> *This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. For details, please see the linked design document below.*
> [~karanmehta93] [~twdsi...@gmail.com] [~dbwong] [~elserj] [~an...@apache.org] 
> [~sergey soldatov] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2018-12-13 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Description: 
The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached entry 
expires after a given amount of time (900s by default) has passed since the 
entry was created, which leads to a cache miss the next time the 
Compiler/Optimizer fetches stats from the cache. As you can see from the above 
graph, fetching stats from the cache is a blocking operation: on a cache miss, 
there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
latest stats, rebuild the cache and finally return the stats to the 
Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes a 
significant performance penalty and produces periodic spikes.

This Jira suggests using an asynchronous refresh mechanism to provide a 
non-blocking cache. 

  was:
Below is the high level picture of Phoenix Stats Cache which is based on Google 
Guava cache.
[image: high-level diagram of the Phoenix Stats Cache] The current Phoenix Stats Cache uses TTL based 
eviction policy. A cached entry will expire after a given amount of time (900s 
by default) passed since the entry's been created. This will lead to cache miss 
when Compiler/Optimizer fetches stats from cache at the next time. As you can 
see from the above graph, fetching stats from the cache is a blocking operation 
— when there is cache miss, it has a round trip over the wire to scan the 
SYSTEM.STATS Table and to get the latest stats info, rebuild the cache and 
finally return the stats to the Compiler/Optimizer. Whenever there is a cache 
miss, this blocking call causes significant performance penalty and see 
periodic spikes.

This Jira suggests to use asynchronous refresh mechanism to fix this and 
provide a non-blocking cache.


> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Priority: Major
>
> The current Phoenix Stats Cache uses a TTL-based eviction policy: a cached 
> entry expires after a given amount of time (900s by default) has passed since 
> the entry was created, which leads to a cache miss the next time the 
> Compiler/Optimizer fetches stats from the cache. As you can see from the above 
> graph, fetching stats from the cache is a blocking operation: on a cache miss, 
> there is a round trip over the wire to scan the SYSTEM.STATS table, get the 
> latest stats, rebuild the cache and finally return the stats to the 
> Compiler/Optimizer. Whenever there is a cache miss, this blocking call causes 
> a significant performance penalty and produces periodic spikes.
> This Jira suggests using an asynchronous refresh mechanism to provide a 
> non-blocking cache. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2018-12-13 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Attachment: (was: image-2018-12-13-11-16-16-884.png)

> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Priority: Major
>
> Below is the high-level picture of the Phoenix Stats Cache, which is based on 
> Google Guava cache.
> [image: high-level diagram of the Phoenix Stats Cache] The current Phoenix 
> Stats Cache uses a TTL-based eviction policy: a cached entry expires after a 
> given amount of time (900s by default) has passed since the entry was created, 
> which leads to a cache miss the next time the Compiler/Optimizer fetches stats 
> from the cache. As you can see from the above graph, fetching stats from the 
> cache is a blocking operation: on a cache miss, there is a round trip over the 
> wire to scan the SYSTEM.STATS table, get the latest stats, rebuild the cache 
> and finally return the stats to the Compiler/Optimizer. Whenever there is a 
> cache miss, this blocking call causes a significant performance penalty and 
> produces periodic spikes.
> This Jira suggests using an asynchronous refresh mechanism to fix this and 
> provide a non-blocking cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2018-12-13 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-5069:
-
Description: 
Below is the high-level picture of the Phoenix Stats Cache, which is based on 
Google Guava cache.
[image: high-level diagram of the Phoenix Stats Cache] The current Phoenix 
Stats Cache uses a TTL-based eviction policy: a cached entry expires after a 
given amount of time (900s by default) has passed since the entry was created, 
which leads to a cache miss the next time the Compiler/Optimizer fetches stats 
from the cache. As you can see from the above graph, fetching stats from the 
cache is a blocking operation: on a cache miss, there is a round trip over the 
wire to scan the SYSTEM.STATS table, get the latest stats, rebuild the cache 
and finally return the stats to the Compiler/Optimizer. Whenever there is a 
cache miss, this blocking call causes a significant performance penalty and 
produces periodic spikes.

This Jira suggests using an asynchronous refresh mechanism to fix this and 
provide a non-blocking cache.

  was:
Below is the high level picture of Phoenix Stats Cache which is based on Google 
Guava cache.
[image: high-level diagram of the Phoenix Stats Cache] The current Phoenix Stats Cache uses TTL based 
eviction policy. A cached entry will expire after a given amount of time (900s 
by default) passed since the entry's been created. This will lead to cache miss 
when Compiler/Optimizer fetches stats from cache at the next time. As you can 
see from the above graph, fetching stats from the cache is a blocking operation 
— when there is cache miss, it has a round trip over the wire to scan the 
SYSTEM.STATS Table and to get the latest stats info, rebuild the cache and 
finally return the stats to the Compiler/Optimizer. Whenever there is a cache 
miss, this blocking call causes significant performance penalty and see 
periodic spikes.

This Jira suggests to use asynchronous refresh mechanism[link 
title|http://example.com][link title|http://example.com] to fix this and 
provide a non-blocking cache.


> Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache
> ---
>
> Key: PHOENIX-5069
> URL: https://issues.apache.org/jira/browse/PHOENIX-5069
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Priority: Major
>
> Below is the high-level picture of the Phoenix Stats Cache, which is based on 
> Google Guava cache.
> [image: high-level diagram of the Phoenix Stats Cache] The current Phoenix 
> Stats Cache uses a TTL-based eviction policy: a cached entry expires after a 
> given amount of time (900s by default) has passed since the entry was created, 
> which leads to a cache miss the next time the Compiler/Optimizer fetches stats 
> from the cache. As you can see from the above graph, fetching stats from the 
> cache is a blocking operation: on a cache miss, there is a round trip over the 
> wire to scan the SYSTEM.STATS table, get the latest stats, rebuild the cache 
> and finally return the stats to the Compiler/Optimizer. Whenever there is a 
> cache miss, this blocking call causes a significant performance penalty and 
> produces periodic spikes.
> This Jira suggests using an asynchronous refresh mechanism to fix this and 
> provide a non-blocking cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-5069) Use asynchronous refresh to provide non-blocking Phoenix Stats Client Cache

2018-12-13 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-5069:


 Summary: Use asynchronous refresh to provide non-blocking Phoenix 
Stats Client Cache
 Key: PHOENIX-5069
 URL: https://issues.apache.org/jira/browse/PHOENIX-5069
 Project: Phoenix
  Issue Type: Improvement
Reporter: Bin Shi
 Attachments: image-2018-12-13-11-16-16-884.png

Below is the high-level picture of the Phoenix Stats Cache, which is based on 
Google Guava cache.
[image: high-level diagram of the Phoenix Stats Cache] The current Phoenix 
Stats Cache uses a TTL-based eviction policy: a cached entry expires after a 
given amount of time (900s by default) has passed since the entry was created, 
which leads to a cache miss the next time the Compiler/Optimizer fetches stats 
from the cache. As you can see from the above graph, fetching stats from the 
cache is a blocking operation: on a cache miss, there is a round trip over the 
wire to scan the SYSTEM.STATS table, get the latest stats, rebuild the cache 
and finally return the stats to the Compiler/Optimizer. Whenever there is a 
cache miss, this blocking call causes a significant performance penalty and 
produces periodic spikes.

This Jira suggests using an asynchronous refresh mechanism to fix this and 
provide a non-blocking cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-5058) Improvements to client side cache guideposts cache

2018-12-13 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi reassigned PHOENIX-5058:


Assignee: Bin Shi

> Improvements to client side cache guideposts cache
> --
>
> Key: PHOENIX-5058
> URL: https://issues.apache.org/jira/browse/PHOENIX-5058
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.15.0, 5.1.0
>Reporter: Karan Mehta
>Assignee: Bin Shi
>Priority: Major
>
> The current implementation of the Phoenix guideposts cache has certain 
> limitations and is not well suited to production use cases. I am attaching a 
> design document to this Jira which highlights these aspects of the cache and 
> potential solutions to mitigate them. These improvements should be taken up as 
> child Jiras under this umbrella Jira.
> [~Bin Shi] [~dbwong] [~elserj] [~an...@apache.org] [~sergey soldatov] 
> Thoughts?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use Segment/SUM tree to organize Guide Post Info

2018-12-03 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Summary: Use Segment/SUM tree to organize Guide Post Info  (was: Use SUM 
tree to organize Guide Post Info)

> Use Segment/SUM tree to organize Guide Post Info
> 
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> As reported, query compilation (for the sample queries shown below), 
> especially deriving estimates and generating parallel scans from guide 
> posts, becomes much slower after we introduced Phoenix Stats. 
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY 
> Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
> pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
> ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 
> 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only 
> column to make the primary key.
>   
>  By using prefix encoding for guide post info, we have to decode and traverse 
> guide posts sequentially, which causes the time complexity of 
> BaseResultIterators.getParallelScan(...) to be O(n), where n is the total 
> count of guide posts.
> According to PHOENIX-2417, to reduce the footprint in the client cache and 
> over the wire, prefix encoding is used as the in-memory and over-the-wire 
> encoding for guide post info.
> We can use something like a Sum Tree (or even a Binary Indexed Tree) to 
> address both memory and performance concerns. The guide posts are partitioned 
> into k chunks (k=1024?), each chunk is encoded with prefix encoding, and the 
> encoded data forms a leaf node of the tree. Each inner node contains summary 
> info (the count of rows, the data size) of the subtree rooted at that node.
> With this tree-like data structure, compared to the current data structure, 
> the increase in size (mainly coming from the n/k-1 inner nodes) is negligible. 
> The time complexity for queries a, b, c can be reduced to O(m), where m is the 
> total count of regions; the time complexity for "EXPLAIN" queries a, b, c can 
> be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can 
> even be reduced to O(1). For queries d and e, the time complexity to find the 
> start of the target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ characteristics to support partial 
> load/unload when interacting with the stats client cache.
>  
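For illustration only, a rough sketch of the proposed shape under the 
assumptions above (placeholder fields, prefix decoding omitted): leaves hold 
encoded chunks of roughly k guide posts, and inner nodes hold the summaries 
used for estimates.

{code:java}
import java.util.List;

// Sketch only; "encodedChunk" stands in for a prefix-encoded block of guide posts.
class GuidePostTreeNode {
    long rowCount;        // subtree summary: total estimated row count
    long byteSize;        // subtree summary: total estimated data size
    byte[] encodedChunk;  // non-null only for leaves
    GuidePostTreeNode left;
    GuidePostTreeNode right;

    // Builds the tree bottom-up over the leaf chunks in [from, to).
    static GuidePostTreeNode build(List<GuidePostTreeNode> leaves, int from, int to) {
        if (to - from == 1) {
            return leaves.get(from);
        }
        int mid = (from + to) / 2;
        GuidePostTreeNode node = new GuidePostTreeNode();
        node.left = build(leaves, from, mid);
        node.right = build(leaves, mid, to);
        // Inner nodes only aggregate summaries, so region-level estimates can be
        // answered without decoding the leaf chunks underneath.
        node.rowCount = node.left.rowCount + node.right.rowCount;
        node.byteSize = node.left.byteSize + node.right.byteSize;
        return node;
    }
}
{code}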



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-4998) The estimated instance size of GuidePostsInfo doesn't count the last update timestamp array

2018-10-28 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi reassigned PHOENIX-4998:


Assignee: Bin Shi

> The estimated instance size of GuidePostsInfo doesn't count the last update 
> timestamp array
> ---
>
> Key: PHOENIX-4998
> URL: https://issues.apache.org/jira/browse/PHOENIX-4998
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> GuidePostsCache (implemented with Guava Cache) uses the estimated instance 
> size of GuidePostsInfo as the weight of a cache entry. The bug results in 
> underestimating the weight of cache entries.
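For context, a sketch of how such a per-entry weight feeds a Guava cache; the 
field names are hypothetical stand-ins for GuidePostsInfo's contents, and the 
point of this bug is that the timestamp array must be included in the estimate.

{code:java}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.Weigher;

class WeightedStatsCacheSketch {
    // Hypothetical stand-in for GuidePostsInfo, with the arrays that take up space.
    static class StatsEntry {
        byte[] encodedGuidePosts = new byte[0];
        long[] rowCounts = new long[0];
        long[] byteCounts = new long[0];
        long[] lastUpdateTimestamps = new long[0]; // the array the estimate was missing
    }

    static final Weigher<String, StatsEntry> WEIGHER = (key, entry) ->
            entry.encodedGuidePosts.length
                    + 8 * entry.rowCounts.length
                    + 8 * entry.byteCounts.length
                    + 8 * entry.lastUpdateTimestamps.length; // count it in the weight

    static final Cache<String, StatsEntry> CACHE = CacheBuilder.newBuilder()
            .maximumWeight(10L * 1024 * 1024)  // rough byte budget for the whole cache
            .weigher(WEIGHER)
            .build();
}
{code}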



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4998) The estimated instance size of GuidePostsInfo doesn't count the last update timestamp array

2018-10-28 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4998:


 Summary: The estimated instance size of GuidePostsInfo doesn't 
count the last update timestamp array
 Key: PHOENIX-4998
 URL: https://issues.apache.org/jira/browse/PHOENIX-4998
 Project: Phoenix
  Issue Type: Bug
Reporter: Bin Shi


GuidePostsCache (implemented with Guava Cache) uses the estimated instance size 
of GuidePostsInfo as the weight of a cache entry. The bug results in 
underestimating the weight of cache entries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-4927) Disentangle the granularity of guidepost data from that of client cached guide post data

2018-09-25 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi reassigned PHOENIX-4927:


Assignee: Bin Shi

> Disentangle the granularity of guidepost data from that of client cached 
> guide post data
> 
>
> Key: PHOENIX-4927
> URL: https://issues.apache.org/jira/browse/PHOENIX-4927
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The expected behaviors:
>  # It should be possible to have 10MB guide posts for precision without 
> forcing clients to cache that many guide posts.
>  # It should be possible to combine as many guide posts into chunks as needed 
> for a particular query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4927) Disentangle the granularity of guidepost data from that of client cached guide post data

2018-09-25 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4927:


 Summary: Disentangle the granularity of guidepost data from that 
of client cached guide post data
 Key: PHOENIX-4927
 URL: https://issues.apache.org/jira/browse/PHOENIX-4927
 Project: Phoenix
  Issue Type: Improvement
Reporter: Bin Shi


The expected behaviors:
 # It should be possible to have 10MB guide posts for precision without forcing 
clients to cache that many guide posts.
 # It should be possible to combine as many guide posts into chunks as needed 
for a particular query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4926) Disentangle the guidepost data from the actually materialized scans

2018-09-25 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4926:


 Summary: Disentangle the guidepost data from the actually 
materialized scans
 Key: PHOENIX-4926
 URL: https://issues.apache.org/jira/browse/PHOENIX-4926
 Project: Phoenix
  Issue Type: Improvement
Reporter: Bin Shi
Assignee: Bin Shi


Currently, when we enable stats for intra-region read parallelization, the 
granularity of guide posts is the same as the granularity of parallel scans, 
which blocks us from using smaller guide posts in query complexity estimation 
and query optimization for better accuracy. 

We can define a target intra-region parallelism so that each scan covers an 
integral multiple of the configured GUIDE_POSTS_WIDTH and the parallelism value 
is between 1 and the number of threads in the thread pool or RPC handlers on 
the region server.
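A small illustration of the arithmetic with made-up parameter names; it only 
shows how many guide-post-width chunks each scan could cover for a requested 
intra-region parallelism.

{code:java}
class IntraRegionParallelismSketch {
    /**
     * How many guide posts each scan should span so that a region is split into at
     * most "parallelism" scans and every scan covers an integral number of guide
     * posts. Names and bounds here are illustrative, not Phoenix's configuration.
     */
    static int guidePostsPerScan(long regionSizeBytes, long guidePostWidthBytes,
                                 int requestedParallelism, int maxHandlersPerRegionServer) {
        int parallelism = Math.max(1, Math.min(requestedParallelism, maxHandlersPerRegionServer));
        long guidePostsInRegion = Math.max(1L, regionSizeBytes / guidePostWidthBytes);
        // Ceiling division keeps each scan an integral multiple of GUIDE_POSTS_WIDTH.
        return (int) ((guidePostsInRegion + parallelism - 1) / parallelism);
    }
}
{code}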



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4925) Use SUM tree to organize Guide Post Info

2018-09-25 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4925:
-
Description: 
As reported, query compilation (for the sample queries shown below), 
especially deriving estimates and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
 a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
 b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
 c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
 d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
 e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
  
 By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes the time complexity of 
BaseResultIterators.getParallelScan(...) to be O(n), where n is the total count 
of guide posts.

According to PHOENIX-2417, to reduce the footprint in the client cache and over 
the wire, prefix encoding is used as the in-memory and over-the-wire encoding 
for guide post info.

We can use something like a Sum Tree (or even a Binary Indexed Tree) to address 
both memory and performance concerns. The guide posts are partitioned into k 
chunks (k=1024?), each chunk is encoded with prefix encoding, and the encoded 
data forms a leaf node of the tree. Each inner node contains summary info (the 
count of rows, the data size) of the subtree rooted at that node.

With this tree-like data structure, compared to the current data structure, the 
increase in size (mainly coming from the n/k-1 inner nodes) is negligible. The 
time complexity for queries a, b, c can be reduced to O(m), where m is the 
total count of regions; the time complexity for "EXPLAIN" queries a, b, c can 
be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can 
even be reduced to O(1). For queries d and e, the time complexity to find the 
start of the target scan ranges can be reduced to O(log(n/k)).

The tree can also integrate AVL and B+ characteristics to support partial 
load/unload when interacting with the stats client cache.

 

  was:
As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
 
By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
count of guide posts. 

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmition, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use something like Sum Tree (even Binary Indexed Tree) to address both 
memory and performance concerns. The guide posts are partitioned to k chunks 
(k=1024?), each chunk is encoded by prefix encoding and the encoded data is a 
leaf node of the tree. The inner node contains summary info (the count of rows, 
the data size) of the sub tree rooted at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).
 


> Use SUM tree to organize Guide Post Info
> 
>
> Key: PHOENIX-4925
> URL: https://issues.apache.org/jira/browse/PHOENIX-4925
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> As reported, Query compilation (for the sample queries showed below), 
> especially deriving estimation and generating parallel scans from guide 
> posts, becomes much slower after we introduced Phoenix Stats. 
>  a. SELECT 

[jira] [Created] (PHOENIX-4925) Use SUM tree to organize Guide Post Info

2018-09-25 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4925:


 Summary: Use SUM tree to organize Guide Post Info
 Key: PHOENIX-4925
 URL: https://issues.apache.org/jira/browse/PHOENIX-4925
 Project: Phoenix
  Issue Type: Improvement
Reporter: Bin Shi
Assignee: Bin Shi


As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
 
By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
count of guide posts. 

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmission, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use something like Sum Tree (even Binary Indexed Tree) to address both 
memory and performance concerns. The guide posts are partitioned to k chunks 
(k=1024?), each chunk is encoded by prefix encoding and the encoded data is a 
leaf node of the tree. The inner node contains summary info (the count of rows, 
the data size) of the sub tree rooted at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAIN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4924) Provide strong guarantee for Stats Write to prevent inconsistent and incomplete data issue

2018-09-25 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4924:


 Summary: Provide strong guarantee for Stats Write to prevent 
inconsistent and incomplete data issue
 Key: PHOENIX-4924
 URL: https://issues.apache.org/jira/browse/PHOENIX-4924
 Project: Phoenix
  Issue Type: Improvement
Reporter: Bin Shi
Assignee: Bin Shi


The Stats could be inconsistent and incomplete due to region servers going 
down, RPC failures when committing guidepost data to the stats table, and the 
lack of retries and atomic operations. The expected behavior from the Platform 
is “We 
need a strong guarantee that on stats write there are sufficient retries and 
monitoring in place so that stats writes are resilient such that we will not 
end up with inconsistent, partial or empty stats until the next run of UPDATE 
STATISTICS”. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4923) Create more integration test cases for Phoenix Stats

2018-09-25 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4923:


 Summary: Create more integration test cases for Phoenix Stats
 Key: PHOENIX-4923
 URL: https://issues.apache.org/jira/browse/PHOENIX-4923
 Project: Phoenix
  Issue Type: Test
Reporter: Bin Shi
Assignee: Bin Shi


For Phoenix Stats, we need to create more test cases for point lookups, serial 
plans and parallel plans.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4008) UPDATE STATISTIC should collect all versions of cells

2018-09-24 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4008:
-
Attachment: PHONEIX-4008.4.X-HBase-1.2.001.patch
PHONEIX-4008.4.X-HBase-1.3.001.patch
PHONEIX-4008.4.X-HBase-1.4.001.patch

> UPDATE STATISTIC should collect all versions of cells
> -
>
> Key: PHOENIX-4008
> URL: https://issues.apache.org/jira/browse/PHOENIX-4008
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Bin Shi
>Priority: Major
> Fix For: 4.15.0, 5.1.0
>
> Attachments: PHOENIX-4008_0918.patch, PHOENIX-4008_0920.patch, 
> PHONEIX-4008.4.X-HBase-1.2.001.patch, PHONEIX-4008.4.X-HBase-1.3.001.patch, 
> PHONEIX-4008.4.X-HBase-1.4.001.patch
>
>
> In order to truly measure the size of the data when calculating guide posts, 
> UPDATE STATISTIC should take into account all versions of cells. We should 
> also be setting the max versions on the scan.
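A one-line sketch of the scan setup this asks for, using the HBase 1.x-style 
client API (the surrounding statistics collector code is omitted):

{code:java}
import org.apache.hadoop.hbase.client.Scan;

class AllVersionsScanSketch {
    static Scan newStatsScan() {
        Scan scan = new Scan();
        // Read every version of every cell so the measured data size reflects all
        // row versions, not just the latest one.
        scan.setMaxVersions();
        return scan;
    }
}
{code}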



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4916) When collecting statistics, the estimated size of a guide post may only count part of cells of the last row

2018-09-22 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4916:
-
Description: 
In DefaultStatisticsCollector.collectStatistics(...), we iterate over all cells 
of the current row; once the accumulated estimated size plus the size of the 
current cell reaches the guide post width, all remaining cells are skipped. As 
a result, the estimated size of a guide post may count only part of the cells 
of the last row.

This problem can be ignored in clusters with real data, where the guide post 
width is much bigger than the row size, but it does have an impact on unit and 
integration tests, because we use a very small guide post width in tests, which 
makes the estimated size of the query inaccurate.

  was:
In DefaultStatisticsCollector.collectStatistics(...), it iterate all cells of 
the current row, once the accumulated estimated size plus the size of the 
current cell >= guide post width, it skipped all the remaining cells. The 
result is that  he estimated size of a guide post may only count part of cells 
of the last row.

This problem can be ignored in clusters with real data where the guide post 
width is much bigger than the row size, but it does have impact on unit test 
and iteration test, because we use very small guide post width in the test 
which results in inaccuracy of the estimated size of the query.


> When collecting statistics, the estimated size of a guide post may only count 
> part of cells of the last row
> ---
>
> Key: PHOENIX-4916
> URL: https://issues.apache.org/jira/browse/PHOENIX-4916
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> In DefaultStatisticsCollector.collectStatistics(...), we iterate over all 
> cells of the current row; once the accumulated estimated size plus the size of 
> the current cell reaches the guide post width, all remaining cells are 
> skipped. As a result, the estimated size of a guide post may count only part 
> of the cells of the last row.
> This problem can be ignored in clusters with real data, where the guide post 
> width is much bigger than the row size, but it does have an impact on unit and 
> integration tests, because we use a very small guide post width in tests, 
> which makes the estimated size of the query inaccurate.
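An illustrative sketch of the accounting described above, with hypothetical 
names: the whole row is added before the width check, so the last row of a 
guide post is never only partially counted.

{code:java}
import java.util.List;

class GuidePostSizeSketch {
    // Hypothetical stand-in for a cell; only its estimated size matters here.
    interface Cell {
        long estimatedSize();
    }

    /**
     * Adds every cell of the row to the running guide post size and only then lets
     * the caller compare the total against GUIDE_POSTS_WIDTH.
     */
    static long addRow(long accumulatedSize, List<Cell> rowCells) {
        for (Cell cell : rowCells) {
            accumulatedSize += cell.estimatedSize();
        }
        return accumulatedSize;
    }
}
{code}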



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4912) Make Table Sampling algorithm to accommodate to the imbalance row distribution across guide posts

2018-09-21 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4912:
-
Description: 
The current implementation of table sampling is based on the assumption that 
every two consecutive guide posts contain an equal number of rows, which isn't 
accurate in practice; once we collect multiple versions of cells and deleted 
rows, this will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) is as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: we have a group of scans, and each scan i is 
defined as <Ki, Ci>, where Ki is its start row key and Ci is its count of rows. 
Now we want to randomly pick X scans so that the sum of the row counts of the 
selected scans is close to Y, where Y = T * R / 100.00, T being the total count 
of rows over all scans and R the table sampling rate (0 <= R <= 100).

To solve the above problem, one algorithm we can consider is described below. 
The core idea is to adjust T, R and Y after each pick, so that the remaining 
problem is a smaller instance of the original one.
{code:java}
// Pseudocode. Ki = start row key of the current scan, Ci = its count of rows.
ArrayList<Scan> TableSampling(ArrayList<Scan> scans, double T, double R) {
    ArrayList<Scan> pickedScans = new ArrayList<>();
    double Y = T * R / 100.00;    // target number of rows to sample
    for (Scan scan : scans) {
        if (Y <= 0) break;        // already picked enough rows
        if (getHashCode(Ki) MOD 100 < R) {
            // Pick this scan, then adjust T, R, Y so the rest of the loop solves
            // the same (smaller) problem on the remaining scans.
            pickedScans.add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) {
                R = 100.00 * Y / T;
            }
        }
    }
    return pickedScans;
}
{code}

  was:
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contains the equal number of rows" which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, the thing will become worse.

In details, the current implementation of table sampling is (see 
BaseResultIterators.getParallelScan() which calls sampleScans(...) at the end 
of function) as described below:
 # Iterate all parallel scans generated;
 # For each scan, if getHashHode(start row key of the scan) MOD 100 < 
tableSamplingRate (See TableSamplerPredicate.java) then pick this scan; 
otherwise discard this scan.

The problem can be formalized as: We have a group of scans and each scan is 
defined as . 
Now we want to randomly pick X groups so that the sum of count of rows in the 
selected groups is close to Y, where Y = the total count of rows of all scans 
denoted as T * table sampling rate denoted as R (0 <= R <= 100).

To resolve the above problem, one of algorithms that we can consider are 
described below. The core idea is to adjust T, R, Y after each pick, so the new 
problem is a child problem of the original problem.
{code:java}
ArrayList TableSampling(ArrayList scans, T, R) {  
    ArrayList pickedScans = new ArrayList();
    Y = T * R / 100.00;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) { 
                R = 100.00 * Y / T;
            }
        }
    }
    return pickedScans;
}
{code}


> Make Table Sampling algorithm to accommodate to the imbalance row 
> distribution across guide posts
> -
>
> Key: PHOENIX-4912
> URL: https://issues.apache.org/jira/browse/PHOENIX-4912
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The current implementation of table sampling is based on the assumption that 
> every two consecutive guide posts contain an equal number of rows, which 
> isn't accurate in practice; once we collect multiple versions of cells and 
> deleted rows, this will get worse.
> In detail, the current implementation of table sampling (see 
> BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the 
> end of the function) is as described below:
>  # Iterate over all generated parallel scans;
>  # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
> tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
> discard it.
> The problem can be formalized as: we have a group of scans, and each scan is 
> defined as <Ki, Ci>. Now we want to randomly pick X scans so that the sum of count of rows 

[jira] [Created] (PHOENIX-4916) When collecting statistics, the estimated size of a guide post may only count part of cells of the last row

2018-09-21 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4916:


 Summary: When collecting statistics, the estimated size of a guide 
post may only count part of cells of the last row
 Key: PHOENIX-4916
 URL: https://issues.apache.org/jira/browse/PHOENIX-4916
 Project: Phoenix
  Issue Type: Bug
Reporter: Bin Shi
Assignee: Bin Shi


In DefaultStatisticsCollector.collectStatistics(...), we iterate over all cells 
of the current row; once the accumulated estimated size plus the size of the 
current cell >= the guide post width, all the remaining cells are skipped. The 
result is that the estimated size of a guide post may only count part of the 
cells of the last row.

This problem can be ignored in clusters with real data, where the guide post 
width is much bigger than the row size, but it does have an impact on unit and 
integration tests, because we use a very small guide post width in tests, which 
results in an inaccurate estimated size for the query.
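
A minimal sketch of the intended accumulation is shown below. It is not the 
actual DefaultStatisticsCollector code; the field names and the 
emitGuidePost(...) callback are placeholders. The point is that every cell of 
the current row is added before the guide post boundary is checked, so the last 
row is never partially counted.
{code:java}
// Sketch only: placeholder names, not the actual DefaultStatisticsCollector.
class GuidePostAccumulatorSketch {
    private final long guidePostWidth;
    private long accumulatedSize = 0;
    private long rowCount = 0;

    GuidePostAccumulatorSketch(long guidePostWidth) {
        this.guidePostWidth = guidePostWidth;
    }

    // cellSizes holds the estimated size of each cell of the current row.
    void addRow(long[] cellSizes) {
        long rowSize = 0;
        for (long cellSize : cellSizes) {
            rowSize += cellSize;            // no cell of the row is skipped
        }
        accumulatedSize += rowSize;         // the whole last row is counted first
        rowCount++;
        if (accumulatedSize >= guidePostWidth) {
            emitGuidePost(accumulatedSize, rowCount);
            accumulatedSize = 0;
            rowCount = 0;
        }
    }

    private void emitGuidePost(long size, long rows) {
        System.out.println("guide post: estimated size=" + size + ", rows=" + rows);
    }
}
{code}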



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4912) Make Table Sampling algorithm to accommodate to the imbalance row distribution across guide posts

2018-09-21 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4912:
-
Description: 
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R (0 <= R <= 100), with T 
denoting the total count of rows of all scans and R denoting the table sampling 
rate.

To resolve the above problem, one algorithm we can consider is described below. 
The core idea is to adjust T, R and Y after each pick, so that the remaining 
work is a smaller instance of the original problem.
{code:java}
ArrayList TableSampling(ArrayList scans, T, R) {  
    ArrayList pickedScans = new ArrayList();
    Y = T * R / 100.00;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) { 
                R = 100.00 * Y / T;
            }
        }
    }
    return pickedScans;
}
{code}

  was:
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:
{code:java}
ArrayList TableSampling(ArrayList scans, T, R) {  
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) { 
                R = Y / T;
             }
        }
    }
    return pickedScans;
}
{code}


> Make Table Sampling algorithm to accommodate to the imbalance row 
> distribution across guide posts
> -
>
> Key: PHOENIX-4912
> URL: https://issues.apache.org/jira/browse/PHOENIX-4912
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The current implementation of table sampling is based on the assumption 
> "Every two consecutive guide posts contain an equal number of rows", which 
> isn't accurate in practice, and once we collect multiple versions of cells 
> and the deleted rows, things will get worse.
> In detail, the current implementation of table sampling (see 
> BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
> of the function) works as described below:
>  # Iterate over all generated parallel scans;
>  # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
> tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
> discard it.
> The problem can be formalized as: We have a group of scans, and each scan is 
> defined as <the start row key of the scan Ki, the count of rows of the scan 
> Ci>. Now we want to randomly pick X scans so that the sum of the counts of 
> rows in the selected scans is close to Y, where Y = T * R (0 <= R <= 100), 
> with T denoting the total count of rows of all scans and R denoting the table 
> sampling rate.
> To resolve the above 

[jira] [Created] (PHOENIX-4914) There is one corner case in which BaseResultIterators.getParallelScans() returns wrong result of the last guide post info update timestamp.

2018-09-20 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4914:


 Summary: There is one corner case in which 
BaseResultIterators.getParallelScans() returns wrong result of the last guide 
post info update timestamp.
 Key: PHOENIX-4914
 URL: https://issues.apache.org/jira/browse/PHOENIX-4914
 Project: Phoenix
  Issue Type: Bug
Reporter: Bin Shi
Assignee: Bin Shi


When I add the following test case to testSelectQueriesWithFilters(...) in 
[ExplainPlanWithStatsEnabledIT.java|https://github.com/apache/phoenix/pull/347/files#diff-21d3742c352623e12ec4889b0ac4f5d2]
 in my clean local repository (without any local changes), the commented-out 
assertion at the end of the snippet will be hit, which indicates a bug in 
BaseResultIterators.getParallelScans() in the current code base.

// Query with multiple scan ranges, where each range's start key and end key
// both fall between data
sql = "SELECT a FROM " + tableName + " WHERE K <= 103 AND K >= 101 OR K <= 108 AND K >= 106";
rs = conn.createStatement().executeQuery(sql);
i = 0;
numRows = 6;
int[] result = new int[] { 101, 102, 103, 106, 107, 108 };
while (rs.next()) {
    assertEquals(result[i++], rs.getInt(1));
}
assertEquals(numRows, i);
info = getByteRowEstimates(conn, sql, binds);
// TODO: the original code before this change will hit the following assertion.
// Need to investigate it.
// assertTrue(info.getEstimateInfoTs() > 0);



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4913) UPDATE STATISTICS should run raw scan to collect the deleted rows

2018-09-20 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4913:
-
Description: 
In order to truly measure the size of the data when calculating guide posts, 
UPDATE STATISTICS should run a raw scan to take the deleted rows into account.

Deleted rows will contribute to the estimated size of a guide post, but they 
make no contribution to its count of rows.

  was:
In order to truly measure the size of the data when calculating guide posts, 
UPDATE STATISTICS should run a raw scan to take all versions of cells into 
account.

Deleted rows will contribute to the estimated size of a guide post, but they 
make no contribution to its count of rows.


> UPDATE STATISTICS should run raw scan to collect the deleted rows
> -
>
> Key: PHOENIX-4913
> URL: https://issues.apache.org/jira/browse/PHOENIX-4913
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> In order to truly measure the size of the data when calculating guide posts, 
> UPDATE STATISTICS should run a raw scan to take the deleted rows into account.
> Deleted rows will contribute to the estimated size of a guide post, but they 
> make no contribution to its count of rows.
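
For illustration, a minimal sketch of how such a raw, all-versions scan could 
be configured with the plain HBase 2.x client API is below; it is an 
assumption-level sketch, not the actual UPDATE STATISTICS code path. The delete 
markers and deleted cells returned by a raw scan would then be added to the 
guide post's estimated byte size without incrementing its row count.
{code:java}
import org.apache.hadoop.hbase.client.Scan;

// Sketch only: not the Phoenix statistics collector, just the scan settings
// the description calls for (raw scan, all versions) on the HBase 2.x client API.
public class RawStatsScanSketch {
    public static Scan newStatsScan(byte[] startRow, byte[] stopRow) {
        Scan scan = new Scan();
        scan.withStartRow(startRow);
        scan.withStopRow(stopRow);
        scan.setRaw(true);          // raw scan: delete markers and deleted cells are returned
        scan.readAllVersions();     // keep every cell version so data size is fully measured
        return scan;
    }
}
{code}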



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4913) UPDATE STATISTICS should run raw scan to collect the deleted rows

2018-09-20 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4913:


 Summary: UPDATE STATISTICS should run raw scan to collect the 
deleted rows
 Key: PHOENIX-4913
 URL: https://issues.apache.org/jira/browse/PHOENIX-4913
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 5.0.0
Reporter: Bin Shi
Assignee: Bin Shi


In order to truly measure the size of the data when calculating guide posts, 
UPDATE STATISTICS should run a raw scan to take all versions of cells into 
account.

Deleted rows will contribute to the estimated size of a guide post, but they 
make no contribution to its count of rows.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4912) Make Table Sampling algorithm to accommodate to the imbalance row distribution across guide posts

2018-09-20 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4912:
-
Description: 
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:

ArrayList TableSampling(ArrayList scans, T, R)
{
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) {
                R = Y / T;
            }
        }
    }
    return pickedScans;
}

  was:
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:

ArrayList TableSampling(ArrayList scans, T, R)
{
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) R = Y / T;
        }
    }
    return pickedScans;
}


> Make Table Sampling algorithm to accommodate to the imbalance row 
> distribution across guide posts
> -
>
> Key: PHOENIX-4912
> URL: https://issues.apache.org/jira/browse/PHOENIX-4912
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The current implementation of table sampling is based on the assumption 
> "Every two consecutive guide posts contain an equal number of rows", which 
> isn't accurate in practice, and once we collect multiple versions of cells 
> and the deleted rows, things will get worse.
> In detail, the current implementation of table sampling (see 
> BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
> of the function) works as described below:
>  # Iterate over all generated parallel scans;
>  # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
> tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
> discard it.
> The problem can be formalized as: We have a group of scans, and each scan is 
> defined as <the start row key of the scan Ki, the count of rows of the scan 
> Ci>. Now we want to randomly pick X scans so that the sum of the counts of 
> rows in the selected scans is close to Y, where Y = T * R, with T denoting 
> the total count of rows of all scans and R denoting the table sampling rate.
> To resolve the above problem, one algorithm we can consider is described 
> below:
> ArrayList TableSampling(ArrayList scans, T, R)
> {  
>     ArrayList pickedScans = new ArrayList();
>     Y = T * R;
>     for (scan in scans) {
>         if (Y <= 0) break;
>         

[jira] [Updated] (PHOENIX-4912) Make Table Sampling algorithm to accommodate to the imbalance row distribution across guide posts

2018-09-20 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4912:
-
Description: 
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:

ArrayList TableSampling(ArrayList scans, T, R)
{
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) R = Y / T;
        }
    }
    return pickedScans;
}

  was:
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:

ArrayList TableSampling(ArrayList scans, T, R)
{
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) {
                R = Y / T;
            }
        }
    }
    return pickedScans;
}


> Make Table Sampling algorithm to accommodate to the imbalance row 
> distribution across guide posts
> -
>
> Key: PHOENIX-4912
> URL: https://issues.apache.org/jira/browse/PHOENIX-4912
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The current implementation of table sampling is based on the assumption 
> "Every two consecutive guide posts contain an equal number of rows", which 
> isn't accurate in practice, and once we collect multiple versions of cells 
> and the deleted rows, things will get worse.
> In detail, the current implementation of table sampling (see 
> BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
> of the function) works as described below:
>  # Iterate over all generated parallel scans;
>  # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
> tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
> discard it.
> The problem can be formalized as: We have a group of scans, and each scan is 
> defined as <the start row key of the scan Ki, the count of rows of the scan 
> Ci>. Now we want to randomly pick X scans so that the sum of the counts of 
> rows in the selected scans is close to Y, where Y = T * R, with T denoting 
> the total count of rows of all scans and R denoting the table sampling rate.
> To resolve the above problem, one algorithm we can consider is described 
> below:
> ArrayList TableSampling(ArrayList scans, T, R)
> {  
>     ArrayList pickedScans = new ArrayList();
>     Y = T * R;
>     for (scan in scans) {
>         if (Y <= 0) break;
>       

[jira] [Updated] (PHOENIX-4912) Make Table Sampling algorithm to accommodate to the imbalance row distribution across guide posts

2018-09-20 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4912:
-
Description: 
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:

ArrayList TableSampling(ArrayList scans, T, R)
{
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) {
                R = Y / T;
            }
        }
    }
    return pickedScans;
}

  was:
The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:

ArrayList TableSampling(ArrayList scans, T, R)
{
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) {
                R = Y / T;
            }
        }
    }
    return pickedScans;
}


> Make Table Sampling algorithm to accommodate to the imbalance row 
> distribution across guide posts
> -
>
> Key: PHOENIX-4912
> URL: https://issues.apache.org/jira/browse/PHOENIX-4912
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0
>Reporter: Bin Shi
>Assignee: Bin Shi
>Priority: Major
>
> The current implementation of table sampling is based on the assumption 
> "Every two consecutive guide posts contain an equal number of rows", which 
> isn't accurate in practice, and once we collect multiple versions of cells 
> and the deleted rows, things will get worse.
> In detail, the current implementation of table sampling (see 
> BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
> of the function) works as described below:
>  # Iterate over all generated parallel scans;
>  # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
> tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
> discard it.
> The problem can be formalized as: We have a group of scans, and each scan is 
> defined as <the start row key of the scan Ki, the count of rows of the scan 
> Ci>. Now we want to randomly pick X scans so that the sum of the counts of 
> rows in the selected scans is close to Y, where Y = T * R, with T denoting 
> the total count of rows of all scans and R denoting the table sampling rate.
> To resolve the above problem, one algorithm we can consider is described 
> below:
> ArrayList TableSampling(ArrayList scans, T, R)
> {  
>     ArrayList pickedScans = new ArrayList();
>     Y = T * R;
>     for (scan in scans) {

[jira] [Created] (PHOENIX-4912) Make Table Sampling algorithm to accommodate to the imbalance row distribution across guide posts

2018-09-20 Thread Bin Shi (JIRA)
Bin Shi created PHOENIX-4912:


 Summary: Make Table Sampling algorithm to accommodate to the 
imbalance row distribution across guide posts
 Key: PHOENIX-4912
 URL: https://issues.apache.org/jira/browse/PHOENIX-4912
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.0.0
Reporter: Bin Shi
Assignee: Bin Shi


The current implementation of table sampling is based on the assumption "Every 
two consecutive guide posts contain an equal number of rows", which isn't 
accurate in practice, and once we collect multiple versions of cells and the 
deleted rows, things will get worse.

In detail, the current implementation of table sampling (see 
BaseResultIterators.getParallelScan(), which calls sampleScans(...) at the end 
of the function) works as described below:
 # Iterate over all generated parallel scans;
 # For each scan, if getHashCode(start row key of the scan) MOD 100 < 
tableSamplingRate (see TableSamplerPredicate.java), pick this scan; otherwise 
discard it.

The problem can be formalized as: We have a group of scans, and each scan is 
defined as <the start row key of the scan Ki, the count of rows of the scan 
Ci>. Now we want to randomly pick X scans so that the sum of the counts of rows 
in the selected scans is close to Y, where Y = T * R, with T denoting the total 
count of rows of all scans and R denoting the table sampling rate.

To resolve the above problem, one algorithm we can consider is described below:

ArrayList TableSampling(ArrayList scans, T, R)
{
    ArrayList pickedScans = new ArrayList();
    Y = T * R;
    for (scan in scans) {
        if (Y <= 0) break;
        if (getHashCode(Ki) MOD 100 < R) {
            // then pick this scan, and adjust T, R, Y accordingly
            pickedScans.Add(scan);
            T -= Ci;
            Y -= Ci;
            if (T != 0 && Y > 0) {
                R = Y / T;
            }
        }
    }
    return pickedScans;
}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4008) UPDATE STATISTIC should collect all versions of cells

2018-09-20 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4008:
-
Attachment: PHOENIX-4008_0920.patch

> UPDATE STATISTIC should collect all versions of cells
> -
>
> Key: PHOENIX-4008
> URL: https://issues.apache.org/jira/browse/PHOENIX-4008
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-4008_0918.patch, PHOENIX-4008_0920.patch
>
>
> In order to truly measure the size of the data when calculating guide posts, 
> UPDATE STATISTIC should take into account all versions of cells. We should 
> also be setting the max versions on the scan.
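
A hedged sketch of setting the scan's max versions from the column family's 
configured MAX_VERSIONS is below, using only the HBase 2.x client API as an 
illustration of the point above; it is not the Phoenix UPDATE STATISTICS 
implementation.
{code:java}
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.TableDescriptor;

// Sketch only: read the family's configured MAX_VERSIONS and apply it to the
// statistics scan, so all stored versions are visible instead of the default of 1.
public class StatsScanVersionsSketch {
    public static void applyMaxVersions(Scan scan, TableDescriptor table, byte[] family) {
        ColumnFamilyDescriptor cf = table.getColumnFamily(family);
        scan.readVersions(cf.getMaxVersions());
    }
}
{code}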



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4008) UPDATE STATISTIC should collect all versions of cells

2018-09-18 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4008:
-
Description: In order to truly measure the size of the data when calculating 
guide posts, UPDATE STATISTIC should take into account all versions of cells. 
We should also be setting the max versions on the scan.  (was: In order to 
truly measure the size of the data when calculating guide posts, UPDATE 
STATISTIC should run a raw scan to take into account all versions of cells. We 
should also be setting the max versions on the scan.)
Summary: UPDATE STATISTIC should collect all versions of cells  (was: 
UPDATE STATISTIC should run raw scan with all versions of cells)

> UPDATE STATISTIC should collect all versions of cells
> -
>
> Key: PHOENIX-4008
> URL: https://issues.apache.org/jira/browse/PHOENIX-4008
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Bin Shi
>Priority: Major
>
> In order to truly measure the size of the data when calculating guide posts, 
> UPDATE STATISTIC should take into account all versions of cells. We should 
> also be setting the max versions on the scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4594) Perform binary search on guideposts during query compilation

2018-09-17 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi updated PHOENIX-4594:
-
Attachment: PHOENIX-4594_0917.patch

> Perform binary search on guideposts during query compilation
> 
>
> Key: PHOENIX-4594
> URL: https://issues.apache.org/jira/browse/PHOENIX-4594
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-4594-0913.patch, PHOENIX-4594_0917.patch
>
>
> If there are many guideposts, performance will suffer during query 
> compilation because we do a linear search of the guideposts to find the 
> intersection with the scan ranges. Instead, in 
> BaseResultIterators.getParallelScans() we should populate an array of 
> guideposts and perform a binary search. 
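
For illustration, a minimal sketch of the proposed lookup is below: the guide 
post row keys are materialized into a sorted array, and a binary search finds 
the first guide post at or after a scan range's start key. It is an 
assumption-level sketch (the real code would first decode the prefix-encoded 
guide posts), not the actual BaseResultIterators change.
{code:java}
import java.util.Arrays;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch only: binary search over a sorted array of guide post row keys,
// replacing the linear walk; not the actual BaseResultIterators implementation.
public class GuidePostSearchSketch {
    /** Returns the index of the first guide post whose key is >= startKey. */
    public static int firstGuidePostAtOrAfter(byte[][] sortedGuidePosts, byte[] startKey) {
        int idx = Arrays.binarySearch(sortedGuidePosts, startKey, Bytes.BYTES_COMPARATOR);
        // When the key is absent, binarySearch returns (-(insertionPoint) - 1).
        return idx >= 0 ? idx : -(idx + 1);
    }
}
{code}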



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-4594) Perform binary search on guideposts during query compilation

2018-09-13 Thread Bin Shi (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bin Shi reassigned PHOENIX-4594:


  Assignee: Bin Shi  (was: Abhishek Singh Chouhan)
Attachment: PHOENIX-4594-0913.patch

Please review. Thanks!

> Perform binary search on guideposts during query compilation
> 
>
> Key: PHOENIX-4594
> URL: https://issues.apache.org/jira/browse/PHOENIX-4594
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Bin Shi
>Priority: Major
> Attachments: PHOENIX-4594-0913.patch
>
>
> If there are many guideposts, performance will suffer during query 
> compilation because we do a linear search of the guideposts to find the 
> intersection with the scan ranges. Instead, in 
> BaseResultIterators.getParallelScans() we should populate an array of 
> guideposts and perform a binary search. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)