[jira] [Commented] (PHOENIX-1315) Optimize query for Pig loader

2014-10-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161560#comment-14161560
 ] 

James Taylor commented on PHOENIX-1315:
---

Thanks, [~maghamravikiran]. Please resolve the issue as fixed.

 Optimize query for Pig loader
 -

 Key: PHOENIX-1315
 URL: https://issues.apache.org/jira/browse/PHOENIX-1315
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor
Assignee: maghamravikiran
 Fix For: 4.2, 3.2

 Attachments: PHOENIX-1315.patch, PHOENIX-1315_v2.patch, 
 PHOENIX-1315_v3.patch, PHOENIX-1315_v4.patch


 I came across this with a recent change I was making. Why is the call to 
 queryPlan.iterator() necessary in PhoenixInputFormat?
 {code}
 private QueryPlan getQueryPlan(final JobContext context) throws IOException {
     Preconditions.checkNotNull(context);
     if (queryPlan == null) {
         try {
             final Connection connection = getConnection();
             final String selectStatement = getConf().getSelectStatement();
             Preconditions.checkNotNull(selectStatement);
             final Statement statement = connection.createStatement();
             final PhoenixStatement pstmt = statement.unwrap(PhoenixStatement.class);
             this.queryPlan = pstmt.compileQuery(selectStatement);
             // FIXME: why is getting the iterator necessary here, as it will
             // cause the query to run.
             this.queryPlan.iterator();
         } catch (Exception exception) {
             LOG.error(String.format("Failed to get the query plan with error [%s]",
                     exception.getMessage()));
             throw new RuntimeException(exception);
         }
     }
     return queryPlan;
 }
 {code}
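
 A minimal sketch of the concern raised in the FIXME, assuming a simplified 
 stand-in for a query plan. SketchPlan and PlanCache are hypothetical names, 
 not Phoenix's QueryPlan API. The point is only that compiling and caching a 
 plan need not touch the data, while asking for an iterator does.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a compiled plan; not Phoenix's QueryPlan.
class SketchPlan {
    private final String sql;
    private int executions = 0;

    SketchPlan(String sql) {
        this.sql = sql;
    }

    String sql() {
        return sql;
    }

    // In this sketch, requesting an iterator is what "runs" the query.
    Iterator<String> iterator() {
        executions++;
        List<String> rows = Arrays.asList("row1", "row2");
        return rows.iterator();
    }

    int executions() {
        return executions;
    }
}

class PlanCache {
    private SketchPlan plan;

    // Compile once and cache; deliberately does not call plan.iterator().
    SketchPlan getPlan(String sql) {
        if (plan == null) {
            plan = new SketchPlan(sql);
        }
        return plan;
    }

    public static void main(String[] args) {
        PlanCache cache = new PlanCache();
        SketchPlan p = cache.getPlan("SELECT * FROM T");
        System.out.println("executions after compile: " + p.executions());
        p.iterator();
        System.out.println("executions after iterator(): " + p.executions());
    }
}
```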



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1315) Optimize query for Pig loader

2014-10-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161563#comment-14161563
 ] 

Hudson commented on PHOENIX-1315:
-

FAILURE: Integrated in Phoenix | Master #407 (See 
[https://builds.apache.org/job/Phoenix-master/407/])
PHOENIX-1315-Test Load from Index table. (ravimagham: rev 
e35503374393b0428f4e6603c8e05d87a073e3c3)
* phoenix-pig/src/it/java/org/apache/phoenix/pig/PhoenixHBaseLoaderIT.java




[jira] [Commented] (PHOENIX-1317) Cleanup non phoenix-core pom files

2014-10-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161574#comment-14161574
 ] 

Hudson commented on PHOENIX-1317:
-

FAILURE: Integrated in Phoenix | 3.0 | Hadoop1 #243 (See 
[https://builds.apache.org/job/Phoenix-3.0-hadoop1/243/])
PHOENIX-1317-Test on loading data from Index table (ravimagham: rev 
4b0d3ba199c0a9c64d247376a033c7761843a5c7)
* phoenix-pig/src/it/java/org/apache/phoenix/pig/PhoenixHBaseLoaderIT.java


 Cleanup non phoenix-core pom files
 --

 Key: PHOENIX-1317
 URL: https://issues.apache.org/jira/browse/PHOENIX-1317
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 3.1, 4.1
Reporter: James Taylor
Assignee: maghamravikiran
 Attachments: 0001-PHOENIX-1317-4.1.0.patch


 The phoenix-core pom is in much better shape after PHOENIX-1272, but the non 
 phoenix-core poms need to be updated as well. For one particular issue, take 
 a look at the following comment in BIGTOP-1420: 
 https://issues.apache.org/jira/browse/BIGTOP-1420?focusedCommentId=14125245&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14125245





[jira] [Commented] (PHOENIX-1321) Cleanup setting of timestamps when collecting and using stats

2014-10-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161614#comment-14161614
 ] 

Hudson commented on PHOENIX-1321:
-

SUCCESS: Integrated in Phoenix | 3.0 | Hadoop1 #244 (See 
[https://builds.apache.org/job/Phoenix-3.0-hadoop1/244/])
PHOENIX-1321 Cleanup setting of timestamps when collecting and using stats 
(jtaylor: rev 12fa6f7004fe70a657ebaea3d745296611b2b80e)
* phoenix-core/src/it/java/org/apache/phoenix/end2end/MultiCfQueryExecIT.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
* phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/StatsCollectorIT.java
* phoenix-core/src/main/java/org/apache/phoenix/query/QueryServicesOptions.java
* phoenix-core/src/test/java/org/apache/phoenix/query/QueryServicesTestImpl.java
* phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stat/StatisticsUtils.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/KeyOnlyIT.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/ParallelIteratorsIT.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stat/StatisticsScanner.java
* phoenix-core/src/main/java/org/apache/phoenix/compile/ExpressionCompiler.java
* phoenix-core/src/main/java/org/apache/phoenix/query/QueryServices.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/index/SaltedIndexIT.java
* phoenix-core/src/main/java/org/apache/phoenix/util/MetaDataUtil.java
* phoenix-core/src/test/java/org/apache/phoenix/util/TestUtil.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stat/StatisticsTable.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stat/StatisticsCollector.java
* phoenix-core/src/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/BaseTenantSpecificTablesIT.java


 Cleanup setting of timestamps when collecting and using stats
 -

 Key: PHOENIX-1321
 URL: https://issues.apache.org/jira/browse/PHOENIX-1321
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: James Taylor
 Attachments: PHOENIX-1321_4.patch


 We're currently not using the max timestamp that was passed through the Scan 
 when we do an ANALYZE. In the same way, we're not using the client timestamp 
 when we read the stats and cache them on PTable. The tricky thing is what 
 timestamp to use for the stats when a split or compaction occurs, because in 
 those cases we don't have a user supplied timestamp (if they're managing 
 timestamps themselves).





[jira] [Updated] (PHOENIX-1297) Adding utility methods to get primary key information from the optimized query plan

2014-10-07 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-1297:
--
Attachment: PHOENIX-1297_v2.patch

[~jamestaylor] - attached is the patch for master branch. Please review. Thanks!

 Adding utility methods to get primary key information from the optimized 
 query plan
 ---

 Key: PHOENIX-1297
 URL: https://issues.apache.org/jira/browse/PHOENIX-1297
 Project: Phoenix
  Issue Type: Task
Affects Versions: 5.0.0, 4.2, 3.2
Reporter: Samarth Jain
Assignee: Samarth Jain
 Attachments: PHOENIX-1297.patch, PHOENIX-1297_v2.patch








[jira] [Assigned] (PHOENIX-1327) Disallow creating arrays of fixed width base type without the max length being specified

2014-10-07 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain reassigned PHOENIX-1327:
-

Assignee: Samarth Jain

 Disallow creating arrays of fixed width base type without the max length 
 being specified
 

 Key: PHOENIX-1327
 URL: https://issues.apache.org/jira/browse/PHOENIX-1327
 Project: Phoenix
  Issue Type: Bug
Reporter: Samarth Jain
Assignee: Samarth Jain
 Fix For: 5.0.0, 4.2, 3.2


 Today, we allow a user to specify an array whose base type is of fixed width as:
 CREATE TABLE foo (k BINARY_ARRAY NOT NULL PRIMARY KEY)
 This shouldn't be allowed because, for fixed width data types like CHAR and 
 BINARY, specifying a max length is mandatory. 
 These alternate statements properly enforce the max length constraint:
 CREATE TABLE foo (k BINARY ARRAY NOT NULL PRIMARY KEY)
 CREATE TABLE foo (k BINARY[] NOT NULL PRIMARY KEY)
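
 The rule described above could be sketched as a small check, assuming the 
 column definition has already been parsed into a base type name and an 
 optional max length. ArrayTypeCheck and validateArrayBaseType are 
 hypothetical names for illustration, not Phoenix internals.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical validator; type names follow the issue text, not Phoenix internals.
class ArrayTypeCheck {
    private static final Set<String> FIXED_WIDTH_NEEDS_LENGTH =
            new HashSet<>(Arrays.asList("CHAR", "BINARY"));

    // maxLength == null means the column definition gave no max length.
    static void validateArrayBaseType(String baseType, Integer maxLength) {
        String normalized = baseType.toUpperCase(Locale.ROOT);
        if (FIXED_WIDTH_NEEDS_LENGTH.contains(normalized) && maxLength == null) {
            throw new IllegalArgumentException(
                    "Max length is required for arrays of " + normalized);
        }
    }

    public static void main(String[] args) {
        validateArrayBaseType("BINARY", 10);    // accepted: length given
        validateArrayBaseType("VARCHAR", null); // accepted: variable width
        try {
            validateArrayBaseType("BINARY", null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```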





[jira] [Resolved] (PHOENIX-1327) Disallow creating arrays of fixed width base type without the max length being specified

2014-10-07 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain resolved PHOENIX-1327.
---
Resolution: Duplicate

Resolved as part of PHOENIX-1297

 Disallow creating arrays of fixed width base type without the max length 
 being specified
 

 Key: PHOENIX-1327
 URL: https://issues.apache.org/jira/browse/PHOENIX-1327
 Project: Phoenix
  Issue Type: Bug
Reporter: Samarth Jain
 Fix For: 5.0.0, 4.2, 3.2


 Today, we allow a user to specify an array whose base type is of fixed width as:
 CREATE TABLE foo (k BINARY_ARRAY NOT NULL PRIMARY KEY)
 This shouldn't be allowed because, for fixed width data types like CHAR and 
 BINARY, specifying a max length is mandatory. 
 These alternate statements properly enforce the max length constraint:
 CREATE TABLE foo (k BINARY ARRAY NOT NULL PRIMARY KEY)
 CREATE TABLE foo (k BINARY[] NOT NULL PRIMARY KEY)





[jira] [Resolved] (PHOENIX-1320) Update stats atomically

2014-10-07 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor resolved PHOENIX-1320.
---
   Resolution: Fixed
Fix Version/s: 3.2
   4.2
   5.0.0

 Update stats atomically
 ---

 Key: PHOENIX-1320
 URL: https://issues.apache.org/jira/browse/PHOENIX-1320
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: James Taylor
 Fix For: 5.0.0, 4.2, 3.2

 Attachments: PHOENIX-1320.patch


 To prevent partially updated stats or a mix of old stats and new stats in the 
 event of a write failure, commit the stats atomically.
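
 As an in-memory analogy only (the thread does not show how the patch commits 
 stats to the stats table), the all-or-nothing idea can be sketched by 
 building a complete snapshot first and publishing it in a single step. 
 StatsHolder and its methods are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// In-memory analogy; not how Phoenix writes to its stats table.
class StatsHolder {
    private final AtomicReference<Map<String, Long>> current =
            new AtomicReference<>(Collections.<String, Long>emptyMap());

    Map<String, Long> read() {
        return current.get();
    }

    // Build the full snapshot first, then publish it with one reference swap.
    // If copying fails, the previously published snapshot stays in place.
    void replaceAll(Map<String, Long> newStats) {
        Map<String, Long> snapshot =
                Collections.unmodifiableMap(new HashMap<>(newStats));
        current.set(snapshot);
    }

    public static void main(String[] args) {
        StatsHolder holder = new StatsHolder();
        Map<String, Long> stats = new HashMap<>();
        stats.put("guidepost-1", 100L);
        stats.put("guidepost-2", 200L);
        holder.replaceAll(stats);
        System.out.println(holder.read());
    }
}
```

 Readers that call read() see either the old map or the new one, never a mix 
 of the two.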





[jira] [Resolved] (PHOENIX-1321) Cleanup setting of timestamps when collecting and using stats

2014-10-07 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor resolved PHOENIX-1321.
---
   Resolution: Fixed
Fix Version/s: 3.2
   4.2
   5.0.0

 Cleanup setting of timestamps when collecting and using stats
 -

 Key: PHOENIX-1321
 URL: https://issues.apache.org/jira/browse/PHOENIX-1321
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: James Taylor
 Fix For: 5.0.0, 4.2, 3.2

 Attachments: PHOENIX-1321_4.patch


 We're currently not using the max timestamp that was passed through the Scan 
 when we do an ANALYZE. In the same way, we're not using the client timestamp 
 when we read the stats and cache them on PTable. The tricky thing is what 
 timestamp to use for the stats when a split or compaction occurs, because in 
 those cases we don't have a user supplied timestamp (if they're managing 
 timestamps themselves).





[jira] [Updated] (PHOENIX-1328) Update ANALYZE syntax to collect stats on index tables and all tables

2014-10-07 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-1328:
--
Description: Based on the discussion in PHOENIX-1309 we will now modify the 
ANALYZE query to collect the stats for index table and all the tables 
associated with the main table (includes index).  (was: Based on the discussion 
in Phoenix-1309 we will now modify the ANALYZE query to collect the stats for 
index table and all the tables associated with the main table (includes index).)

 Update ANALYZE syntax to collect stats on index tables and all tables
 -

 Key: PHOENIX-1328
 URL: https://issues.apache.org/jira/browse/PHOENIX-1328
 Project: Phoenix
  Issue Type: Sub-task
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

 Based on the discussion in PHOENIX-1309 we will now modify the ANALYZE query 
 to collect the stats for index table and all the tables associated with the 
 main table (includes index).





[jira] [Commented] (PHOENIX-1328) Update ANALYZE syntax to collect stats on index tables and all tables

2014-10-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161649#comment-14161649
 ] 

James Taylor commented on PHOENIX-1328:
---

You can do an UPDATE STATISTICS on a table/index, its indexes, or both the 
table and indexes (default). Like this:
- UPDATE STATISTICS table INDEX    -- Updates the statistics of all indexes on the table
- UPDATE STATISTICS table ALL      -- Updates both the table and index statistics
- UPDATE STATISTICS table          -- Same as ALL
- UPDATE STATISTICS table COLUMNS  -- Updates only the table statistics
- UPDATE STATISTICS index          -- Updates only the index statistics
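
The scopes listed above can be modelled roughly as follows. This is a sketch 
with hypothetical names (StatsScope, StatsTargets), not the Phoenix parser; it 
assumes the statement has already been reduced to a target name, its indexes, 
and an optional keyword.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical model of the scopes in the list above.
enum StatsScope {
    COLUMNS(true, false),
    INDEX(false, true),
    ALL(true, true);

    final boolean updateTable;
    final boolean updateIndexes;

    StatsScope(boolean updateTable, boolean updateIndexes) {
        this.updateTable = updateTable;
        this.updateIndexes = updateIndexes;
    }

    // A missing keyword behaves like ALL.
    static StatsScope fromKeyword(String keyword) {
        return keyword == null
                ? ALL
                : StatsScope.valueOf(keyword.toUpperCase(Locale.ROOT));
    }
}

class StatsTargets {
    // Returns the names whose statistics would be refreshed.
    static List<String> targets(String table, List<String> indexes, StatsScope scope) {
        List<String> result = new ArrayList<>();
        if (scope.updateTable) {
            result.add(table);
        }
        if (scope.updateIndexes) {
            result.addAll(indexes);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> indexes = Arrays.asList("T_IDX1", "T_IDX2");
        System.out.println(targets("T", indexes, StatsScope.fromKeyword(null)));
        System.out.println(targets("T", indexes, StatsScope.fromKeyword("INDEX")));
        System.out.println(targets("T", indexes, StatsScope.fromKeyword("COLUMNS")));
    }
}
```

In this model, naming an index directly is just the default case applied to a 
target that has no indexes of its own, so only that index is updated.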



[jira] [Updated] (PHOENIX-1322) Add documentation for UPDATE STATISTICS command

2014-10-07 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-1322:
--
Issue Type: Sub-task  (was: Bug)
Parent: PHOENIX-1177

 Add documentation for UPDATE STATISTICS command
 ---

 Key: PHOENIX-1322
 URL: https://issues.apache.org/jira/browse/PHOENIX-1322
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: ramkrishna.s.vasudevan

 Four places need to be updated:
 - Add a new webpage and add it to the Using menu 
 (/site/source/src/site/site.xml). The webpage can talk about the new ANALYZE 
 table command and give a couple of examples. It'd be good to document that 
 stats are updated automatically during splits and compaction. Also, mention 
 the new property values you added to control the number of bytes before a 
 guidepost is put in and the minimum time before another analyze may be done. 
 Don't talk about implementation, though, other than why we're doing this 
 (i.e. to improve parallelization).
 - Add the new ANALYZE call with a short explanation to 
 ./phoenix-docs/src/docsrc/help/phoenix.csv. This will cause it to appear 
 here: http://phoenix.apache.org/language/index.html
 - Add an item at the top for Statistics Collection with a short explanation 
 here: site/source/src/site/markdown/recent.md
 - Remove the first item from Cost-based Query Optimization (or change the 
 font to strike through with a note that it's implemented in 3.2/4.2) here: 
 site/source/src/site/markdown/roadmap.md





[jira] [Updated] (PHOENIX-1030) Change Expression.isDeterministic() to return a enum of values ALWAYS, PER_STATEMENT, PER_ROW

2014-10-07 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-1030:

Attachment: PHOENIX-1030-3.0.patch

[~jamestaylor]

I have attached a patch that works for all versions. I also modified NULL and 
TYPED_NULL expressions to be statically initialized using all the possible 
values of the Determinism enum. 

Thanks,
Thomas

 Change Expression.isDeterministic() to return a enum of values ALWAYS, 
 PER_STATEMENT, PER_ROW
 -

 Key: PHOENIX-1030
 URL: https://issues.apache.org/jira/browse/PHOENIX-1030
 Project: Phoenix
  Issue Type: Improvement
Reporter: Thomas D'Silva
Assignee: Thomas D'Silva
 Attachments: PHOENIX-1030-3.0.patch, PHOENIX-1030-3.0.patch, 
 PHOENIX-1030-3.0.patch, PHOENIX-1030-3.0.patch, PHOENIX-1030-4.0.patch, 
 PHOENIX-1030-4.0.patch, PHOENIX-1030-master.patch


 Change Expression.isDeterministic() to return an enum with three values:
 DETERMINISTIC - the expression returns the same output every time given the 
 same input.
 UNDETERMINISTIC_ROW - the expression should be computed for every row.
 UNDETERMINISTIC_STMT - the expression should be computed only once for a 
 given statement.
 See PHOENIX-1001
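
 A minimal sketch of such an enum, using the value names from the issue title 
 (ALWAYS, PER_STATEMENT, PER_ROW). The combine() method and the DeterminismDemo 
 class are assumptions added for illustration: they treat a compound 
 expression as only as deterministic as its least deterministic child. The 
 committed code may differ.

```java
// Sketch only; value names come from the issue title, combine() is assumed.
enum Determinism {
    ALWAYS,        // same output for the same input, every time
    PER_STATEMENT, // computed once for a given statement
    PER_ROW;       // computed again for every row

    // The less deterministic of the two values wins.
    Determinism combine(Determinism other) {
        return this.ordinal() >= other.ordinal() ? this : other;
    }
}

class DeterminismDemo {
    // Determinism of a compound expression, derived from its children.
    static Determinism of(Determinism... children) {
        Determinism result = Determinism.ALWAYS;
        for (Determinism child : children) {
            result = result.combine(child);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(of(Determinism.ALWAYS, Determinism.PER_STATEMENT));
        System.out.println(of(Determinism.PER_ROW, Determinism.PER_STATEMENT));
    }
}
```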





[jira] [Commented] (PHOENIX-1328) Update ANALYZE syntax to collect stats on index tables and all tables

2014-10-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161662#comment-14161662
 ] 

ramkrishna.s.vasudevan commented on PHOENIX-1328:
-

bq. UPDATE STATISTICS index --- Updates only the index statistics
How do you identify that this is a specific index? From the name given? 
Because UPDATE STATISTICS table will also follow the same syntax, right?




[jira] [Commented] (PHOENIX-1328) Update ANALYZE syntax to collect stats on index tables and all tables

2014-10-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161688#comment-14161688
 ] 

ramkrishna.s.vasudevan commented on PHOENIX-1328:
-

Working on this. We need to collect all the indexes for the given table anyway 
and issue an update stats in a future call, right?
That would ensure that the stats are collected in parallel across all the 
tables in question. 
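
One way to read the proposal above, assuming "future call" refers to 
java.util.concurrent.Future, is to submit one task per table and wait for all 
of them. ParallelStats and collect() are hypothetical placeholders for the 
real per-table stats collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; collect() stands in for the real per-table stats work.
class ParallelStats {
    static String collect(String tableName) {
        return "stats(" + tableName + ")";
    }

    // Submits one task per table, then waits for all of them to finish.
    static List<String> collectAll(List<String> tables) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, tables.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (final String table : tables) {
                Callable<String> task = () -> collect(table);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks; keeps submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(collectAll(Arrays.asList("T", "T_IDX1", "T_IDX2")));
    }
}
```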



[jira] [Resolved] (PHOENIX-1300) Allow sub-queries to choose different execution path other than hash-join

2014-10-07 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue resolved PHOENIX-1300.
--
   Resolution: Fixed
Fix Version/s: 5.0.0
   4.0.0
   3.0.0

Covered by fix for PHOENIX-167

 Allow sub-queries to choose different execution path other than hash-join
 -

 Key: PHOENIX-1300
 URL: https://issues.apache.org/jira/browse/PHOENIX-1300
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0

   Original Estimate: 240h
  Remaining Estimate: 240h

 We can take a different approach (like PHOENIX-1179) for sub-queries where 
 the required hash-set cannot fit into memory.





[jira] [Updated] (PHOENIX-1300) Allow sub-queries to choose different execution path other than hash-join

2014-10-07 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated PHOENIX-1300:
-
Fix Version/s: (was: 4.0.0)
   (was: 3.0.0)
   3.2
   4.2

 Allow sub-queries to choose different execution path other than hash-join
 -

 Key: PHOENIX-1300
 URL: https://issues.apache.org/jira/browse/PHOENIX-1300
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 5.0.0, 4.2, 3.2

   Original Estimate: 240h
  Remaining Estimate: 240h

 We can take a different approach (like PHOENIX-1179) for sub-queries where 
 the required hash-set cannot fit into memory.





[jira] [Commented] (PHOENIX-1328) Update ANALYZE syntax to collect stats on index tables and all tables

2014-10-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162112#comment-14162112
 ] 

James Taylor commented on PHOENIX-1328:
---

For a first pass, it's fine if you just collect the stats for the table and/or 
indexes serially in MetaDataClient. Just loop over the indexes after you've 
resolved the table, like this:
{code}
public MutationState updateStatistics(UpdateStatisticsStatement updateStatisticsStmt)
        throws SQLException {
    // Check before updating the stats if we have reached the configured
    // time to reupdate the stats once again
    long msMinBetweenUpdates = connection.getQueryServices().getProps()
            .getLong(QueryServices.MIN_STATS_UPDATE_FREQ_MS_ATTRIB,
                    QueryServicesOptions.DEFAULT_MIN_STATS_UPDATE_FREQ_MS);
    ColumnResolver resolver = FromCompiler.getResolver(updateStatisticsStmt, connection);
    PTable table = resolver.getTables().get(0).getTable();
    if (updateStatisticsStmt.updateColumns()) {
        doUpdateStatistics(table);
    }
    if (updateStatisticsStmt.updateIndexes()) {
        for (PTable index : table.getIndexes()) {
            doUpdateStatistics(index);
        }
    }
}

private void doUpdateStatistics(PTable table) {
    // TODO: refactored code here
}
{code}



[jira] [Commented] (PHOENIX-1317) Cleanup non phoenix-core pom files

2014-10-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162118#comment-14162118
 ] 

Andrew Purtell commented on PHOENIX-1317:
-

+1



[jira] [Commented] (PHOENIX-1030) Change Expression.isDeterministic() to return a enum of values ALWAYS, PER_STATEMENT, PER_ROW

2014-10-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162274#comment-14162274
 ] 

James Taylor commented on PHOENIX-1030:
---

Thanks, [~tdsilva]. Here's some feedback on some minor stuff:
- I don't think you need a loop here, as you should be able to index into 
BOOLEAN_EXPRESSIONS using the child.getDeterminism().ordinal() value.
{code}
 public static boolean isFalse(Expression child) {
-    return child == FALSE_EXPRESSION || child == ND_FALSE_EXPRESSION;
+    for (Determinism determinism : Determinism.values()) {
+        if (child == BOOLEAN_EXPRESSIONS[determinism.ordinal()])
+            return true;
+    }
+    return false;
 }
 
 public static boolean isTrue(Expression child) {
-    return child == TRUE_EXPRESSION || child == ND_TRUE_EXPRESSION;
+    for (Determinism determinism : Determinism.values()) {
+        if (child == BOOLEAN_EXPRESSIONS[Determinism.values().length + determinism.ordinal()])
+            return true;
+    }
+    return false;
 }
{code}
- How about some static helper functions for these?
{code}
NULL_EXPRESSIONS[determinism.ordinal()]
BOOLEAN_EXPRESSIONS[Determinism.values().length+determinism.ordinal()]
TYPED_NULL_EXPRESSIONS[type.ordinal()+PDataType.values().length*determinism.ordinal()]
{code}
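
The helper functions suggested above might look like the following sketch. 
ExpressionIndex and its method names are hypothetical; the index arithmetic 
is taken from the expressions quoted in this comment, and the nested enum is 
a stand-in so the example compiles on its own.

```java
// Hypothetical helpers; index arithmetic follows the expressions quoted above.
class ExpressionIndex {
    // Stand-in enum so the sketch is self-contained.
    enum Determinism { ALWAYS, PER_STATEMENT, PER_ROW }

    private static final int DETERMINISM_COUNT = Determinism.values().length;

    // NULL_EXPRESSIONS is indexed directly by determinism.
    static int nullIndex(Determinism d) {
        return d.ordinal();
    }

    // BOOLEAN_EXPRESSIONS layout assumed: all FALSE entries, then all TRUE entries.
    static int falseIndex(Determinism d) {
        return d.ordinal();
    }

    static int trueIndex(Determinism d) {
        return DETERMINISM_COUNT + d.ordinal();
    }

    // TYPED_NULL_EXPRESSIONS: one block of typeCount entries per determinism value.
    static int typedNullIndex(int typeOrdinal, int typeCount, Determinism d) {
        return typeOrdinal + typeCount * d.ordinal();
    }

    public static void main(String[] args) {
        System.out.println(falseIndex(Determinism.PER_ROW));
        System.out.println(trueIndex(Determinism.ALWAYS));
        System.out.println(typedNullIndex(4, 10, Determinism.PER_STATEMENT));
    }
}
```

With helpers like these, isFalse(child) could reduce to a single comparison 
against the entry at falseIndex(child.getDeterminism()), which is the 
loop-free form suggested in the first bullet.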




[jira] [Updated] (PHOENIX-167) Support semi/anti-joins

2014-10-07 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated PHOENIX-167:

Fix Version/s: (was: 4.0.0)
   (was: 3.0.0)
   3.2
   4.2

 Support semi/anti-joins
 ---

 Key: PHOENIX-167
 URL: https://issues.apache.org/jira/browse/PHOENIX-167
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Maryann Xue
  Labels: enhancement
 Fix For: 5.0.0, 4.2, 3.2

 Attachments: 167-2.patch, 167.patch


 A semi-join between two tables returns rows from the first table where one or 
 more matches are found in the second table. The difference between a 
 semi-join and a conventional join is that rows in the first table will be 
 returned at most once. Even if the second table contains two matches for a 
 row in the first table, only one copy of the row will be returned. Semi-joins 
 are written using the EXISTS or IN constructs.
 An anti-join is the opposite of a semi-join and is written using the NOT 
 EXISTS or NOT IN constructs.
 There's a pretty good write-up [here] 
 (http://www.dbspecialists.com/files/presentations/semijoins.html) on 
 semi/anti joins.
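
 The semantics described above can be sketched over plain in-memory lists of 
 join keys. SemiAntiJoin is a hypothetical class for illustration, not 
 Phoenix's join implementation, and it ignores SQL NULL handling (NOT IN 
 behaves differently when NULLs are present).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// In-memory illustration of semi-join and anti-join semantics.
class SemiAntiJoin {
    // Rows of `left` with at least one match in `right`; each left row is
    // returned at most once, however many matches `right` contains.
    static List<Integer> semiJoin(List<Integer> left, List<Integer> right) {
        Set<Integer> keys = new HashSet<>(right);
        List<Integer> out = new ArrayList<>();
        for (Integer row : left) {
            if (keys.contains(row)) {
                out.add(row);
            }
        }
        return out;
    }

    // Rows of `left` with no match in `right`.
    static List<Integer> antiJoin(List<Integer> left, List<Integer> right) {
        Set<Integer> keys = new HashSet<>(right);
        List<Integer> out = new ArrayList<>();
        for (Integer row : left) {
            if (!keys.contains(row)) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> left = Arrays.asList(1, 2, 3);
        List<Integer> right = Arrays.asList(2, 2, 3, 4);
        System.out.println("semi: " + semiJoin(left, right));
        System.out.println("anti: " + antiJoin(left, right));
    }
}
```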





[jira] [Created] (PHOENIX-1329) Correctly support varbinary arrays

2014-10-07 Thread Jesse Yates (JIRA)
Jesse Yates created PHOENIX-1329:


 Summary: Correctly support varbinary arrays
 Key: PHOENIX-1329
 URL: https://issues.apache.org/jira/browse/PHOENIX-1329
 Project: Phoenix
  Issue Type: Bug
Reporter: Jesse Yates
 Fix For: 5.0.0, 4.3


Arrays of binary data can contain 0x00, which Phoenix uses as the field 
separator. This leads Phoenix to return arrays incorrectly, truncating them 
prematurely.
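
The failure mode can be illustrated with a naive separator-based encoding. 
SeparatorDemo is a hypothetical example and not Phoenix's actual array 
format; it only shows why a 0x00 byte inside an element is indistinguishable 
from a 0x00 separator when decoding.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a naive separator-based encoding.
class SeparatorDemo {
    static final byte SEPARATOR = 0x00;

    // Writes each element followed by the separator byte.
    static byte[] encode(List<byte[]> elements) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] element : elements) {
            out.write(element, 0, element.length);
            out.write(SEPARATOR);
        }
        return out.toByteArray();
    }

    // Splits on the separator; cannot tell data bytes from separators.
    static List<byte[]> decode(byte[] encoded) {
        List<byte[]> elements = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        for (byte b : encoded) {
            if (b == SEPARATOR) {
                elements.add(current.toByteArray());
                current.reset();
            } else {
                current.write(b);
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        List<byte[]> original = Arrays.asList(new byte[] {1, 0, 2}, new byte[] {3});
        List<byte[]> decoded = decode(encode(original));
        System.out.println("original elements: " + original.size());
        System.out.println("decoded elements:  " + decoded.size());
        System.out.println("first decoded:     " + Arrays.toString(decoded.get(0)));
    }
}
```

Here the first element {1, 0, 2} comes back cut short as {1}, and the 
two-element array decodes as three elements.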





[jira] [Updated] (PHOENIX-1329) Correctly support varbinary arrays

2014-10-07 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated PHOENIX-1329:
-
Attachment: phoenix-1329-bug.patch

Attaching patch to _demonstrate_ the issue. It's going to take an encoding 
change to actually do this correctly.

 Correctly support varbinary arrays
 --

 Key: PHOENIX-1329
 URL: https://issues.apache.org/jira/browse/PHOENIX-1329
 Project: Phoenix
  Issue Type: Bug
Reporter: Jesse Yates
 Fix For: 5.0.0, 4.3

 Attachments: phoenix-1329-bug.patch


 Arrays of binary data can contain 0x00, which Phoenix uses as the field 
 separator. This leads Phoenix to return arrays incorrectly, truncating them 
 prematurely.





[jira] [Commented] (PHOENIX-1297) Adding utility methods to get primary key information from the optimized query plan

2014-10-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162746#comment-14162746
 ] 

James Taylor commented on PHOENIX-1297:
---

Looks good, [~samarthjain]. Want me to wait for any other changes, or should I 
pull this in?

 Adding utility methods to get primary key information from the optimized 
 query plan
 ---

 Key: PHOENIX-1297
 URL: https://issues.apache.org/jira/browse/PHOENIX-1297
 Project: Phoenix
  Issue Type: Task
Affects Versions: 5.0.0, 4.2, 3.2
Reporter: Samarth Jain
Assignee: Samarth Jain
 Attachments: PHOENIX-1297.patch, PHOENIX-1297_v2.patch, 
 PHOENIX-1297_v3.patch








[jira] [Commented] (PHOENIX-1329) Correctly support varbinary arrays

2014-10-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162763#comment-14162763
 ] 

James Taylor commented on PHOENIX-1329:
---

Our arrays were really designed to store arrays of other primitive types, not 
arbitrary arrays of arbitrary bytes (i.e. VARBINARY VARBINARY [] aren't really 
supported and we should flag it as an error). We should brainstorm about this - 
maybe for your use case you can just use a VARBINARY and serialize the raw 
bytes yourself? If you're planning on querying the data, then the story might 
be different, but otherwise, VARBINARY is the way to go.
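
To make the failure mode concrete, here is a minimal sketch of a separator-based array encoding. It is a toy model for illustration only, not Phoenix's actual array format: the class SeparatorArrayDemo and its encode/decode methods are hypothetical names, and the only thing taken from the issue is the use of 0x00 as the separator byte.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy encoding for illustration only; NOT Phoenix's actual array format.
final class SeparatorArrayDemo {
    static final byte SEPARATOR = 0x00;

    // Concatenate the elements, writing a 0x00 byte after each one.
    static byte[] encode(List<byte[]> elements) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] element : elements) {
            out.write(element, 0, element.length);
            out.write(SEPARATOR);
        }
        return out.toByteArray();
    }

    // Split on 0x00: every separator byte ends the current element,
    // including a 0x00 that was part of the element's own data.
    static List<byte[]> decode(byte[] encoded) {
        List<byte[]> elements = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < encoded.length; i++) {
            if (encoded[i] == SEPARATOR) {
                elements.add(Arrays.copyOfRange(encoded, start, i));
                start = i + 1;
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        List<byte[]> original = Arrays.asList(new byte[] {1, 0, 2}, new byte[] {3});
        List<byte[]> decoded = decode(encode(original));
        System.out.println("elements written: " + original.size());
        System.out.println("elements read:    " + decoded.size());
        System.out.println("first element read: " + Arrays.toString(decoded.get(0)));
    }
}
```

In this toy model the first element {1, 0, 2} comes back as {1}, because the embedded 0x00 is indistinguishable from a separator. That is the same kind of premature shortening the issue describes.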

 Correctly support varbinary arrays
 --

 Key: PHOENIX-1329
 URL: https://issues.apache.org/jira/browse/PHOENIX-1329
 Project: Phoenix
  Issue Type: Bug
Reporter: Jesse Yates
 Fix For: 5.0.0, 4.3

 Attachments: phoenix-1329-bug.patch


 Stored arrays of binary data can contain 0x00, which Phoenix uses as the 
 field separator. This leads Phoenix to return arrays incorrectly - shortening 
 them prematurely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-1302) Query against tenant specific view should use index

2014-10-07 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-1302:
--
Assignee: James Taylor

 Query against tenant specific view should use index
 ---

 Key: PHOENIX-1302
 URL: https://issues.apache.org/jira/browse/PHOENIX-1302
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 5.0.0, 4.2, 3.2
Reporter: Samarth Jain
Assignee: James Taylor

 Test that can be added in QueryOptimizerTest.java
 {code}
 @Test
 public void testAssertQueryAgainstTenantSpecificViewGoesThroughIndex() throws Exception {
     Connection conn = DriverManager.getConnection(getUrl(), new Properties());

     // create table
     conn.createStatement().execute("create table "
         + "XYZ.ABC"
         + "(organization_id char(15) not null, \n"
         + " entity_id char(15) not null,\n"
         + " a_string_array varchar(100) array[] not null,\n"
         + " b_string varchar(100),\n"
         + " a_integer integer,\n"
         + " a_date date,\n"
         + " CONSTRAINT pk PRIMARY KEY (organization_id, entity_id, a_string_array)\n"
         + ") " + "MULTI_TENANT=true");

     // create index
     conn.createStatement().execute(
         "CREATE INDEX ABC_IDX ON XYZ.ABC (a_integer) INCLUDE (a_date)");

     conn.close();

     // switch to a tenant specific connection
     conn = DriverManager.getConnection(getUrl(tenantId));

     // create a tenant specific view
     conn.createStatement().execute("CREATE VIEW ABC_VIEW AS SELECT * FROM XYZ.ABC");

     // query against the tenant specific view
     String sql = "SELECT a_date FROM ABC_VIEW where a_integer = ?";
     PreparedStatement stmt = conn.prepareStatement(sql);
     stmt.setInt(1, 1000);
     QueryPlan plan = stmt.unwrap(PhoenixPreparedStatement.class).optimizeQuery();
     assertEquals("Query should use index", PTableType.INDEX,
         plan.getTableRef().getTable().getType());
 }
 {code}
 Error:
 java.lang.AssertionError: Query should use index expected:<INDEX> but 
 was:<VIEW>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1302) Query against tenant specific view should use index

2014-10-07 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14162824#comment-14162824
 ] 

Samarth Jain commented on PHOENIX-1302:
---

Tests added in QueryOptimizerTest with @Ignore annotation.

 Query against tenant specific view should use index
 ---

 Key: PHOENIX-1302
 URL: https://issues.apache.org/jira/browse/PHOENIX-1302
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 5.0.0, 4.2, 3.2
Reporter: Samarth Jain
Assignee: James Taylor

 Test that can be added in QueryOptimizerTest.java
 {code}
 @Test
 public void testAssertQueryAgainstTenantSpecificViewGoesThroughIndex() throws Exception {
     Connection conn = DriverManager.getConnection(getUrl(), new Properties());

     // create table
     conn.createStatement().execute("create table "
         + "XYZ.ABC"
         + "(organization_id char(15) not null, \n"
         + " entity_id char(15) not null,\n"
         + " a_string_array varchar(100) array[] not null,\n"
         + " b_string varchar(100),\n"
         + " a_integer integer,\n"
         + " a_date date,\n"
         + " CONSTRAINT pk PRIMARY KEY (organization_id, entity_id, a_string_array)\n"
         + ") " + "MULTI_TENANT=true");

     // create index
     conn.createStatement().execute(
         "CREATE INDEX ABC_IDX ON XYZ.ABC (a_integer) INCLUDE (a_date)");

     conn.close();

     // switch to a tenant specific connection
     conn = DriverManager.getConnection(getUrl(tenantId));

     // create a tenant specific view
     conn.createStatement().execute("CREATE VIEW ABC_VIEW AS SELECT * FROM XYZ.ABC");

     // query against the tenant specific view
     String sql = "SELECT a_date FROM ABC_VIEW where a_integer = ?";
     PreparedStatement stmt = conn.prepareStatement(sql);
     stmt.setInt(1, 1000);
     QueryPlan plan = stmt.unwrap(PhoenixPreparedStatement.class).optimizeQuery();
     assertEquals("Query should use index", PTableType.INDEX,
         plan.getTableRef().getTable().getType());
 }
 {code}
 Error:
 java.lang.AssertionError: Query should use index expected:<INDEX> but 
 was:<VIEW>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-1297) Adding utility methods to get primary key information from the optimized query plan

2014-10-07 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-1297:
--
Attachment: PHOENIX-1297_v4.patch

Updated patch that takes into consideration the tenant id of the connection 
while determining the offset. Changed encode and decode pk methods to take into 
consideration the view index id too.

 Adding utility methods to get primary key information from the optimized 
 query plan
 ---

 Key: PHOENIX-1297
 URL: https://issues.apache.org/jira/browse/PHOENIX-1297
 Project: Phoenix
  Issue Type: Task
Affects Versions: 5.0.0, 4.2, 3.2
Reporter: Samarth Jain
Assignee: Samarth Jain
 Attachments: PHOENIX-1297.patch, PHOENIX-1297_v2.patch, 
 PHOENIX-1297_v3.patch, PHOENIX-1297_v4.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-1330) Flag VARBINARY VARBINARY ARRAY declaration in DDL as an error

2014-10-07 Thread James Taylor (JIRA)
James Taylor created PHOENIX-1330:
-

 Summary: Flag VARBINARY VARBINARY ARRAY declaration in DDL as an 
error
 Key: PHOENIX-1330
 URL: https://issues.apache.org/jira/browse/PHOENIX-1330
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor


As [~jesse_yates] pointed out in PHOENIX-1329, our variable length array 
encoding does not handle arrays of arbitrary variable length data. We should 
flag attempts to declare this at DDL time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-1331) DropIndexDuringUpsertIT.testWriteFailureDropIndex fails on Mac

2014-10-07 Thread James Taylor (JIRA)
James Taylor created PHOENIX-1331:
-

 Summary: DropIndexDuringUpsertIT.testWriteFailureDropIndex fails 
on Mac
 Key: PHOENIX-1331
 URL: https://issues.apache.org/jira/browse/PHOENIX-1331
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor


The DropIndexDuringUpsertIT.testWriteFailureDropIndex() test consistently fails 
by timing out on my Mac laptop and Mac desktop with the following exception:
{code}
testWriteFailureDropIndex(org.apache.phoenix.end2end.index.DropIndexDuringUpsertIT)
  Time elapsed: 341.902 sec   ERROR!
java.lang.Exception: test timed out after 30 milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at 
org.apache.phoenix.end2end.index.DropIndexDuringUpsertIT.testWriteFailureDropIndex(DropIndexDuringUpsertIT.java:150)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-945) Support correlated subqueries in comparison without ANY/SOME/ALL

2014-10-07 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated PHOENIX-945:

Attachment: 945.patch

 Support correlated subqueries in comparison without ANY/SOME/ALL
 

 Key: PHOENIX-945
 URL: https://issues.apache.org/jira/browse/PHOENIX-945
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0

 Attachments: 945.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Example:
  SELECT employee_number, name
FROM employees AS Bob
WHERE salary > (
  SELECT AVG(salary)
FROM employees
WHERE department = Bob.department);
 Basically we can optimize these queries into join queries, like:
  SELECT employees.employee_number, employees.name
FROM employees INNER JOIN
  (SELECT department, AVG(salary) AS department_average
FROM employees
GROUP BY department) AS temp ON employees.department = temp.department
WHERE employees.salary > temp.department_average;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (PHOENIX-1332) Support correlated subqueries in comparison with ANY/SOME/ALL

2014-10-07 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue reassigned PHOENIX-1332:


Assignee: Maryann Xue

 Support correlated subqueries in comparison with ANY/SOME/ALL
 -

 Key: PHOENIX-1332
 URL: https://issues.apache.org/jira/browse/PHOENIX-1332
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0


 Support grammar like:
 select * from OrderTable o where quantity = ALL(select quantity from 
 OrderTable where item_id = o.item_id)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-1332) Support correlated subqueries in comparison with ANY/SOME/ALL

2014-10-07 Thread Maryann Xue (JIRA)
Maryann Xue created PHOENIX-1332:


 Summary: Support correlated subqueries in comparison with 
ANY/SOME/ALL
 Key: PHOENIX-1332
 URL: https://issues.apache.org/jira/browse/PHOENIX-1332
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue


Support grammar like:

select * from OrderTable o where quantity = ALL(select quantity from 
OrderTable where item_id = o.item_id)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (PHOENIX-1179) Support many-to-many joins

2014-10-07 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue reassigned PHOENIX-1179:


Assignee: Maryann Xue

 Support many-to-many joins
 --

 Key: PHOENIX-1179
 URL: https://issues.apache.org/jira/browse/PHOENIX-1179
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0


 Enhance our join capabilities to support many-to-many joins where the size of 
 both sides of the join are too big to fit into memory (and thus cannot use 
 our hash join mechanism). One technique would be to order both sides of the 
 join by their join key and merge sort the results on the client.
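
 The merge technique mentioned above can be sketched as follows. This is a minimal in-memory illustration, not Phoenix code: the class MergeJoinSketch, the mergeJoin method, and the {key, value} int-array row layout are all assumptions made for the example. It assumes both inputs are already sorted by join key and handles the many-to-many case by pairing each left row with the whole run of equal keys on the right.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sort-merge join over in-memory rows; hypothetical names throughout.
final class MergeJoinSketch {

    // Each row is {key, value}. Both lists must already be sorted by key.
    // Returns joined rows of the form {key, leftValue, rightValue}.
    static List<int[]> mergeJoin(List<int[]> left, List<int[]> right) {
        List<int[]> out = new ArrayList<>();
        int l = 0;
        int r = 0;
        while (l < left.size() && r < right.size()) {
            int leftKey = left.get(l)[0];
            int rightKey = right.get(r)[0];
            if (leftKey < rightKey) {
                l++;                       // no match for this left row
            } else if (leftKey > rightKey) {
                r++;                       // no match for this right row
            } else {
                // Find the end of the run of equal keys on the right side.
                int runEnd = r;
                while (runEnd < right.size() && right.get(runEnd)[0] == leftKey) {
                    runEnd++;
                }
                // Pair every left row with this key against the whole right run.
                while (l < left.size() && left.get(l)[0] == leftKey) {
                    for (int j = r; j < runEnd; j++) {
                        out.add(new int[] {leftKey, left.get(l)[1], right.get(j)[1]});
                    }
                    l++;
                }
                r = runEnd;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> left = Arrays.asList(
            new int[] {1, 10}, new int[] {2, 20}, new int[] {2, 21}, new int[] {4, 40});
        List<int[]> right = Arrays.asList(
            new int[] {2, 200}, new int[] {2, 201}, new int[] {3, 300}, new int[] {4, 400});
        for (int[] row : mergeJoin(left, right)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

 Only the current run of equal keys has to be revisited, so a streaming version would need to buffer one run at a time rather than an entire side of the join, which is what makes this approach attractive when neither side fits in memory.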



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-945) Support correlated subqueries in comparison without ANY/SOME/ALL

2014-10-07 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14162897#comment-14162897
 ] 

James Taylor commented on PHOENIX-945:
--

Wow, +1. You implemented correlated subquery support by adding about 50 lines 
of code?!  That's pretty awesome! What kind of limitations are there outside of 
only allowing the correlation in a comparison expression?

Minor nit: might be worth having a copy constructor to help with readability:
{code}
+subquery = NODE_FACTORY.select(subquery.getFrom(), subquery.getHint(), 
subquery.isDistinct(), 
+selectNodes, where, groupbyNodes, subquery.getHaving(), 
subquery.getOrderBy(), 
+subquery.getLimit(), subquery.getBindCount(), true, 
subquery.hasSequence());
+
{code}

 Support correlated subqueries in comparison without ANY/SOME/ALL
 

 Key: PHOENIX-945
 URL: https://issues.apache.org/jira/browse/PHOENIX-945
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0

 Attachments: 945.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Example:
  SELECT employee_number, name
FROM employees AS Bob
WHERE salary > (
  SELECT AVG(salary)
FROM employees
WHERE department = Bob.department);
 Basically we can optimize these queries into join queries, like:
  SELECT employees.employee_number, employees.name
FROM employees INNER JOIN
  (SELECT department, AVG(salary) AS department_average
FROM employees
GROUP BY department) AS temp ON employees.department = temp.department
WHERE employees.salary > temp.department_average;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-945) Support correlated subqueries in comparison without ANY/SOME/ALL

2014-10-07 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14162940#comment-14162940
 ] 

Maryann Xue commented on PHOENIX-945:
-

The limitations are:
1. The correlation condition in the inner query has to be what we currently 
allow for ON conditions in joins.
2. The inner query must be a non-group-by aggregate query, such as select 
max(c1) from table1 where correlation_condition.

At least half of the cases in limitation 2 will benefit from PHOENIX-1299, 
which will convert ANY/SOME/ALL queries into exactly the queries covered by 
this fix.

Besides, I just opened PHOENIX-1332, which will eliminate the second limitation 
completely. But this would require a bit of work. With those cases for which 
the optimization of PHOENIX-1299 is not enough, it will depend on the 
completion of PHOENIX-944, plus a special aggregate function which will allow 
returning values in the same group as an array.

 Support correlated subqueries in comparison without ANY/SOME/ALL
 

 Key: PHOENIX-945
 URL: https://issues.apache.org/jira/browse/PHOENIX-945
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0

 Attachments: 945.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Example:
  SELECT employee_number, name
FROM employees AS Bob
WHERE salary > (
  SELECT AVG(salary)
FROM employees
WHERE department = Bob.department);
 Basically we can optimize these queries into join queries, like:
  SELECT employees.employee_number, employees.name
FROM employees INNER JOIN
  (SELECT department, AVG(salary) AS department_average
FROM employees
GROUP BY department) AS temp ON employees.department = temp.department
WHERE employees.salary > temp.department_average;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-945) Support correlated subqueries in comparison without ANY/SOME/ALL

2014-10-07 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated PHOENIX-945:

Attachment: 945-2.patch

Minor change: added copy constructors in ParseNodeFactory

 Support correlated subqueries in comparison without ANY/SOME/ALL
 

 Key: PHOENIX-945
 URL: https://issues.apache.org/jira/browse/PHOENIX-945
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0

 Attachments: 945-2.patch, 945.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Example:
  SELECT employee_number, name
FROM employees AS Bob
WHERE salary > (
  SELECT AVG(salary)
FROM employees
WHERE department = Bob.department);
 Basically we can optimize these queries into join queries, like:
  SELECT employees.employee_number, employees.name
FROM employees INNER JOIN
  (SELECT department, AVG(salary) AS department_average
FROM employees
GROUP BY department) AS temp ON employees.department = temp.department
WHERE employees.salary > temp.department_average;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (PHOENIX-1267) Set scan.setSmall(true) when appropriate

2014-10-07 Thread jay wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jay wong reassigned PHOENIX-1267:
-

Assignee: jay wong

 Set scan.setSmall(true) when appropriate
 

 Key: PHOENIX-1267
 URL: https://issues.apache.org/jira/browse/PHOENIX-1267
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor
Assignee: jay wong
 Attachments: smallscan.patch


 There's a nice optimization that has been in HBase for a while now to set a 
 scan as small. This prevents extra RPC calls, I believe. We should add a 
 hint for queries that forces it to be set/not set, and make our best guess on 
 when it should default to true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1267) Set scan.setSmall(true) when appropriate

2014-10-07 Thread jay wong (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14163003#comment-14163003
 ] 

jay wong commented on PHOENIX-1267:
---

I was on holiday for the past several days, so sorry for the late reply.

I see what you mean. A hint is normally the more structured and better 
approach, and I think using a hint to control the small scan is a good idea.

The small scan is set to true by default when both the start key and stop key 
are set. If we have an ORDER BY query and small is true, the result will be an 
infinite loop.

So I think the small scan is not only a query optimization for the user: it 
can also cause a bug. 
So I think smallScanForbidden is needed as well.






 Set scan.setSmall(true) when appropriate
 

 Key: PHOENIX-1267
 URL: https://issues.apache.org/jira/browse/PHOENIX-1267
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor
Assignee: jay wong
 Attachments: smallscan.patch


 There's a nice optimization that has been in HBase for a while now to set a 
 scan as small. This prevents extra RPC calls, I believe. We should add a 
 hint for queries that forces it to be set/not set, and make our best guess on 
 when it should default to true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-945) Support correlated subqueries in comparison without ANY/SOME/ALL

2014-10-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14163014#comment-14163014
 ] 

Hudson commented on PHOENIX-945:


SUCCESS: Integrated in Phoenix | Master #410 (See 
[https://builds.apache.org/job/Phoenix-master/410/])
PHOENIX-945 Support correlated subqueries in comparison without ANY/SOME/ALL 
(maryannxue: rev 5282a8a09fec1ea7a6241565ff034246e3b30b92)
* phoenix-core/src/main/java/org/apache/phoenix/compile/SubqueryRewriter.java
* phoenix-core/src/main/java/org/apache/phoenix/compile/StatementNormalizer.java
* phoenix-core/src/main/java/org/apache/phoenix/compile/ExpressionCompiler.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/SubqueryIT.java
* phoenix-core/src/main/java/org/apache/phoenix/parse/ParseNodeFactory.java


 Support correlated subqueries in comparison without ANY/SOME/ALL
 

 Key: PHOENIX-945
 URL: https://issues.apache.org/jira/browse/PHOENIX-945
 Project: Phoenix
  Issue Type: Sub-task
Affects Versions: 3.0.0, 4.0.0, 5.0.0
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 3.0.0, 4.0.0, 5.0.0

 Attachments: 945-2.patch, 945.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Example:
  SELECT employee_number, name
FROM employees AS Bob
WHERE salary > (
  SELECT AVG(salary)
FROM employees
WHERE department = Bob.department);
 Basically we can optimize these queries into join queries, like:
  SELECT employees.employee_number, employees.name
FROM employees INNER JOIN
  (SELECT department, AVG(salary) AS department_average
FROM employees
GROUP BY department) AS temp ON employees.department = temp.department
WHERE employees.salary > temp.department_average;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-1333) Store statistics guideposts as VARBINARY

2014-10-07 Thread James Taylor (JIRA)
James Taylor created PHOENIX-1333:
-

 Summary: Store statistics guideposts as VARBINARY
 Key: PHOENIX-1333
 URL: https://issues.apache.org/jira/browse/PHOENIX-1333
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor
Assignee: ramkrishna.s.vasudevan
Priority: Critical


There's a potential problem with storing the guideposts as a VARBINARY ARRAY, 
as pointed out by PHOENIX-1329. We'd run into this issue if we're collecting 
stats for a table with a trailing VARBINARY row key column if the value 
contained embedded null bytes. Because of this, we're better off storing 
guideposts as VARBINARY and serializing/deserializing in the following manner:
<byte length as vint><bytes><byte length as vint><bytes>...

We should also store as a separate KeyValue column the total number of 
guideposts. So the schema of SYSTEM.STATS would look like this now instead:
{code}
public static final String CREATE_STATS_TABLE_METADATA =
    "CREATE TABLE " + SYSTEM_CATALOG_SCHEMA + ".\"" + SYSTEM_STATS_TABLE + "\"(\n" +
    // PK columns
    PHYSICAL_NAME + " VARCHAR NOT NULL," +
    COLUMN_FAMILY + " VARCHAR," +
    REGION_NAME + " VARCHAR," +
    GUIDE_POSTS + " VARBINARY," +
    GUIDE_POSTS_COUNT + " SMALLINT," +
    MIN_KEY + " VARBINARY," +
    MAX_KEY + " VARBINARY," +
    LAST_STATS_UPDATE_TIME + " DATE, " +
    "CONSTRAINT " + SYSTEM_TABLE_PK_NAME + " PRIMARY KEY ("
    + PHYSICAL_NAME + ","
    + COLUMN_FAMILY + "," + REGION_NAME + "))\n" +
    // TODO: should we support versioned stats?
    // Install split policy to prevent a physical table's stats from
    // being split across regions.
    HTableDescriptor.SPLIT_POLICY + "='" +
    MetaDataSplitPolicy.class.getName() + "'\n";
{code}

Then the serialization code in StatisticsTable.addStats() would need to change 
to populate the GUIDE_POSTS_COUNT and serialize the GUIDE_POSTS in the new 
format.

The deserialization code is isolated to StatisticsUtil.readStatisitics(). It 
would need to read the GUIDE_POSTS_COUNT first for estimated sizing, and then 
deserialize the GUIDE_POSTS in the new format.
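
As a rough sketch of the length-prefixed format proposed above, the following standalone example serializes a list of guideposts into a single byte array and reads it back using a separately stored count. Everything here is an assumption made for illustration: the class GuidePostCodec and its methods are hypothetical, and the variable-length integer shown is a simple 7-bits-per-byte encoding, which is not necessarily the vint format Phoenix or Hadoop would use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical codec; the vint here is a simple 7-bit varint chosen for the sketch.
final class GuidePostCodec {

    // Write a non-negative int, 7 bits per byte, high bit set on all but the last byte.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    static int readVInt(ByteArrayInputStream in) {
        int value = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new IllegalStateException("truncated vint");
            }
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }

    // <byte length as vint><bytes> repeated once per guidepost.
    static byte[] serialize(List<byte[]> guidePosts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] guidePost : guidePosts) {
            writeVInt(out, guidePost.length);
            out.write(guidePost, 0, guidePost.length);
        }
        return out.toByteArray();
    }

    // The count is stored separately (GUIDE_POSTS_COUNT in the proposed schema),
    // so the list can be sized up front.
    static List<byte[]> deserialize(byte[] data, int count) {
        List<byte[]> guidePosts = new ArrayList<>(count);
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        for (int i = 0; i < count; i++) {
            int length = readVInt(in);
            byte[] guidePost = new byte[length];
            if (length > 0 && in.read(guidePost, 0, length) != length) {
                throw new IllegalStateException("truncated guidepost");
            }
            guidePosts.add(guidePost);
        }
        return guidePosts;
    }

    public static void main(String[] args) {
        List<byte[]> original = Arrays.asList(new byte[] {1, 0, 2}, new byte[] {0, 0});
        List<byte[]> roundTripped = deserialize(serialize(original), original.size());
        for (byte[] guidePost : roundTripped) {
            System.out.println(Arrays.toString(guidePost));
        }
    }
}
```

Because each value carries its own length, embedded null bytes are preserved, which is the property the separator-based array encoding lacks.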



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)