[jira] [Commented] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search

2015-08-04 Thread Dan Meany (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653729#comment-14653729
 ] 

Dan Meany commented on PHOENIX-2152:


How I did it manually:

https://github.com/threedliteguy/General/wiki/Adding-spatial-data-queries-to-Phoenix-on-HBase


 Ability to create spatial indexes (geohash) with bounding box / radius search
 -

 Key: PHOENIX-2152
 URL: https://issues.apache.org/jira/browse/PHOENIX-2152
 Project: Phoenix
  Issue Type: New Feature
Reporter: Dan Meany
Priority: Minor
   Original Estimate: 672h
  Remaining Estimate: 672h

 Add the ability to create spatial indexes such as in Elasticsearch, MongoDB, 
 Oracle, etc.
 http://gis.stackexchange.com/questions/18330/would-it-be-possible-to-use-geohash-for-proximity-searches
 https://github.com/davetroy/geohash-js



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-1661) Implement built-in functions for JSON

2015-08-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653420#comment-14653420
 ] 

ASF GitHub Bot commented on PHOENIX-1661:
-

Github user ictwanglei closed the pull request at:

https://github.com/apache/phoenix/pull/93


 Implement built-in functions for JSON
 -

 Key: PHOENIX-1661
 URL: https://issues.apache.org/jira/browse/PHOENIX-1661
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: LeiWang
  Labels: JSON, Java, SQL, gsoc2015, mentor
 Attachments: PhoenixJSONSpecification-First-Draft.pdf


 Take a look at the JSON built-in functions that are implemented in Postgres 
 (http://www.postgresql.org/docs/9.3/static/functions-json.html) and implement 
 the same for Phoenix in Java following this guide: 
 http://phoenix-hbase.blogspot.com/2013/04/how-to-add-your-own-built-in-function.html
 Examples of functions include ARRAY_TO_JSON, ROW_TO_JSON, TO_JSON, etc. The 
 implementation of these built-in functions will be impacted by how JSON is 
 stored in Phoenix. See PHOENIX-628. An initial implementation could work off 
 of a simple text-based JSON representation and then when a native JSON type 
 is implemented, they could be reworked to be more efficient.
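
For intuition, the Postgres functions named above map SQL values to JSON text; a rough sketch of the intended semantics in Python (the actual Phoenix implementation would be Java built-in expressions; only the SQL function names come from the description):

```python
import json

def array_to_json(arr):
    # ARRAY_TO_JSON: a SQL array becomes a JSON array literal
    return json.dumps(arr)

def row_to_json(row):
    # ROW_TO_JSON: a row (modeled here as column-name -> value) becomes
    # a JSON object literal
    return json.dumps(row)
```

An initial text-based implementation could store exactly this serialized form, leaving room to swap in a more efficient native JSON type later, as the description suggests.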





[GitHub] phoenix pull request: PHOENIX-1661 Implement built-in functions fo...

2015-08-04 Thread ictwanglei
Github user ictwanglei closed the pull request at:

https://github.com/apache/phoenix/pull/93


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search

2015-08-04 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653798#comment-14653798
 ] 

James Taylor commented on PHOENIX-2152:
---

Awesome, [~danmeany]. Great use case for UDFs and secondary indexing. Would be 
interested to hear what's missing from Phoenix to make this easier/more 
seamless. 

 Ability to create spatial indexes (geohash) with bounding box / radius search
 -

 Key: PHOENIX-2152
 URL: https://issues.apache.org/jira/browse/PHOENIX-2152
 Project: Phoenix
  Issue Type: New Feature
Reporter: Dan Meany
Priority: Minor
   Original Estimate: 672h
  Remaining Estimate: 672h

 Add the ability to create spatial indexes such as in Elasticsearch, MongoDB, 
 Oracle, etc.
 http://gis.stackexchange.com/questions/18330/would-it-be-possible-to-use-geohash-for-proximity-searches
 https://github.com/davetroy/geohash-js





[jira] [Updated] (PHOENIX-2159) Grammar changes and DDL support for surfacing native HBase timestamp

2015-08-04 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-2159:
--
Attachment: PHOENIX-2159_v2.patch

Thanks for the review [~jamestaylor]. Attached is the updated patch with 
additional tests. 

 Grammar changes and DDL support for surfacing native HBase timestamp
 

 Key: PHOENIX-2159
 URL: https://issues.apache.org/jira/browse/PHOENIX-2159
 Project: Phoenix
  Issue Type: Sub-task
Reporter: Samarth Jain
Assignee: Samarth Jain
 Attachments: PHOENIX-2159.patch, PHOENIX-2159_v2.patch








[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY

2015-08-04 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654752#comment-14654752
 ] 

Maryann Xue commented on PHOENIX-953:
-

Yes, the very fundamental thing we need for unnest is an 
UnnestArrayResultIterator (maybe as an inner class of UnnestArrayQueryPlan), 
which takes an array and emits a new tuple for each of the array's elements.
The UnnestArrayQueryPlan may derive from DelegateQueryPlan, which has an input 
plan. For example, for select unnest(array1) from t, we'll get a ScanPlan on 
table 't' wrapped by an UnnestArrayQueryPlan.
I just implemented the extension for VALUES 
(https://docs.oracle.com/javadb/10.8.3.0/devguide/cdevtricks807365.html) on the 
calcite branch and committed a new class called LiteralResultIterationQueryPlan 
(which is an enhanced version of EmptyTableQueryPlan). I think this should be 
helpful for the no-FROM-clause UNNEST and might be exactly what it needs.
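
The iterator described here is essentially a flat-map over the input tuples; a minimal sketch (Python, with a positional index standing in for the array column reference; names are illustrative, not the actual Phoenix classes):

```python
def unnest_rows(rows, array_col_idx):
    # For each input tuple, emit one output tuple per element of the array
    # column, mirroring what an UnnestArrayResultIterator would do over the
    # tuples produced by the wrapped (delegate) plan.
    for row in rows:
        for element in row[array_col_idx]:
            yield row[:array_col_idx] + (element,) + row[array_col_idx + 1:]
```

For select unnest(array1) from t, the rows argument would be the ScanPlan's output, and the generator's output would be what the wrapping UnnestArrayQueryPlan returns.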

 Support UNNEST for ARRAY
 

 Key: PHOENIX-953
 URL: https://issues.apache.org/jira/browse/PHOENIX-953
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Dumindu Buddhika

 The UNNEST built-in function converts an array into a set of rows. This is 
 more than a built-in function, so should be considered an advanced project.
 For an example, see the following Postgres documentation: 
 http://www.postgresql.org/docs/8.4/static/functions-array.html
 http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html
 http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html
 So the UNNEST is a way of converting an array to a flattened table which 
 can then be filtered on, ordered, grouped, etc.





[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY

2015-08-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654772#comment-14654772
 ] 

ramkrishna.s.vasudevan commented on PHOENIX-953:


I think our initial scope is going to be UNNEST without a FROM clause and a 
simple SELECT UNNEST(ARRAY) FROM table.
bq.We should be able to handle UNNEST purely at the compile layer.
Fine.
bq.The inner select will return one row per array element. We may have a 
special QueryPlan derived from EmptyTableQueryPlan specific for this case (like 
UnnestArrayQueryPlan) that'll simply project all the elements of the array 
expression from the UNNEST call.
This would be for the UNNEST without FROM clause I think. 
bq.Yes, the very fundamental thing we need for unnest is an 
UnnestArrayResultIterator (maybe as an inner class of UnnestArrayQueryPlan),
True.  
I think the basic thing is more or less what we were discussing. I could help 
Dumindu in these patches. Reviews and suggestions welcome.
Thanks for the comments, James and Maryann. Anyway, we will come back here to 
discuss once we do the code development.
[~Dumindux]
You have any comments or questions?

 Support UNNEST for ARRAY
 

 Key: PHOENIX-953
 URL: https://issues.apache.org/jira/browse/PHOENIX-953
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Dumindu Buddhika

 The UNNEST built-in function converts an array into a set of rows. This is 
 more than a built-in function, so should be considered an advanced project.
 For an example, see the following Postgres documentation: 
 http://www.postgresql.org/docs/8.4/static/functions-array.html
 http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html
 http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html
 So the UNNEST is a way of converting an array to a flattened table which 
 can then be filtered on, ordered, grouped, etc.





[jira] [Commented] (PHOENIX-1609) MR job to populate index tables

2015-08-04 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654557#comment-14654557
 ] 

Thomas D'Silva commented on PHOENIX-1609:
-

[~maghamraviki...@gmail.com] [~jamestaylor]

I am trying to compare the performance of the map reduce index build vs the 
regular UPSERT SELECT based index build. On a 1 billion row table with 19 
columns, the regular index build takes 8.5 hours, compared to the map reduce 
index build which takes ~23 hours. Do you know if there are any special config 
settings I could use to speed up the MR index build?

 MR job to populate index tables 
 

 Key: PHOENIX-1609
 URL: https://issues.apache.org/jira/browse/PHOENIX-1609
 Project: Phoenix
  Issue Type: New Feature
Reporter: maghamravikiran
Assignee: maghamravikiran
 Fix For: 5.0.0, 4.4.0

 Attachments: 0001-PHOENIX-1609-4.0.patch, 
 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-wip.patch, 
 0001-PHOENIX_1609.patch, PHOENIX-1609-master.patch


 Often, we need to create new indexes on master tables way after the data 
 exists on the master tables.  It would be good to have a simple MR job given 
 by the phoenix code that users can call to have indexes in sync with the 
 master table. 
 Users can invoke the MR job using the following command 
 hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt 
 INDEX_TABLE -columns a,b,c
 Is this ideal? 





Re: [jira] [Created] (PHOENIX-1609) MR job to populate index tables

2015-08-04 Thread Ravi Kiran
Hi Thomas,
 Those numbers are disturbing. Can you please share the following:

A. How many mappers ran, and what is the setting of io.sort.mb?

B. How much data was written as mapper output? Also, how many threads were
used by the mapper to aggregate the mapper output?

C. From the job history, are there any stragglers, and what is the average
execution time of each map task?


D

On Tuesday, August 4, 2015, Thomas D'Silva (JIRA) j...@apache.org wrote:


 [
 https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654557#comment-14654557
 ]

 Thomas D'Silva commented on PHOENIX-1609:
 -

 [~maghamraviki...@gmail.com javascript:;] [~jamestaylor]

 I am trying to compare the performance of the map reduce index build vs
 the regular UPSERT SELECT based index build. On a 1 billion row table with
 19 columns, the regular index build takes 8.5 hours, compared to the map
 reduce index build which takes ~23 hours. Do you know if there are any
 special config settings I could use to speed up the MR index build?

  MR job to populate index tables
  
 
  Key: PHOENIX-1609
  URL: https://issues.apache.org/jira/browse/PHOENIX-1609
  Project: Phoenix
   Issue Type: New Feature
 Reporter: maghamravikiran
 Assignee: maghamravikiran
  Fix For: 5.0.0, 4.4.0
 
  Attachments: 0001-PHOENIX-1609-4.0.patch,
 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-wip.patch,
 0001-PHOENIX_1609.patch, PHOENIX-1609-master.patch
 
 
  Often, we need to create new indexes on master tables way after the data
 exists on the master tables.  It would be good to have a simple MR job
 given by the phoenix code that users can call to have indexes in sync with
 the master table.
  Users can invoke the MR job using the following command
  hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt
 INDEX_TABLE -columns a,b,c
  Is this ideal?






[jira] [Updated] (PHOENIX-1902) Do not perform conflict detection for append-only tables

2015-08-04 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-1902:

Assignee: James Taylor  (was: Thomas D'Silva)

 Do not perform conflict detection for append-only tables
 

 Key: PHOENIX-1902
 URL: https://issues.apache.org/jira/browse/PHOENIX-1902
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: James Taylor

 When a table is declared as write-once/append-only (IMMUTABLE_ROWS=true), 
 then we should disable the conflict detection being done by Tephra as there 
 can be no conflicts. This is a much lighter weight model that relies on 
 Tephra mainly to:
 - filter rows for failed (and unabortable) transactions.
 - not show transactional data until it has successfully been committed.
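
Conceptually, the optimization is just an early-out in the conflict check; a toy sketch (hypothetical names, not Tephra's actual API):

```python
def has_conflict(txn_write_set, immutable_rows, overlapping_commits):
    # For write-once/append-only tables (IMMUTABLE_ROWS=true) no two
    # transactions can write the same cell, so skip conflict detection.
    if immutable_rows:
        return False
    # Otherwise, a conflict is any overlap between this transaction's writes
    # and the writes of transactions that committed concurrently.
    return bool(txn_write_set & overlapping_commits)
```

The transaction manager is still needed for the two remaining jobs listed above: filtering rows from failed transactions and hiding uncommitted data.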





[jira] [Resolved] (PHOENIX-1902) Do not perform conflict detection for append-only tables

2015-08-04 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva resolved PHOENIX-1902.
-
Resolution: Fixed

 Do not perform conflict detection for append-only tables
 

 Key: PHOENIX-1902
 URL: https://issues.apache.org/jira/browse/PHOENIX-1902
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: James Taylor

 When a table is declared as write-once/append-only (IMMUTABLE_ROWS=true), 
 then we should disable the conflict detection being done by Tephra as there 
 can be no conflicts. This is a much lighter weight model that relies on 
 Tephra mainly to:
 - filter rows for failed (and unabortable) transactions.
 - not show transactional data until it has successfully been committed.





[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run

2015-08-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654587#comment-14654587
 ] 

Hudson commented on PHOENIX-2145:
-

FAILURE: Integrated in Phoenix-master #866 (See 
[https://builds.apache.org/job/Phoenix-master/866/])
PHOENIX-2145 Pherf - Make update stats optional and fix threads not exiting 
after performance run (mujtaba: rev d41a0fcbd75456bc68eaa673361290504abd9895)
* phoenix-pherf/src/main/java/org/apache/phoenix/pherf/PherfConstants.java
* phoenix-pherf/src/it/java/org/apache/phoenix/pherf/DataIngestIT.java
* 
phoenix-pherf/src/main/java/org/apache/phoenix/pherf/workload/QueryExecutor.java
* phoenix-pherf/src/main/java/org/apache/phoenix/pherf/Pherf.java
* phoenix-pherf/src/test/java/org/apache/phoenix/pherf/PherfTest.java
* phoenix-pherf/src/main/assembly/components-minimal.xml
* phoenix-pherf/pom.xml
* 
phoenix-pherf/src/main/java/org/apache/phoenix/pherf/workload/WriteWorkload.java
* phoenix-pherf/src/main/java/org/apache/phoenix/pherf/util/PhoenixUtil.java


 Pherf - Fix threads not exiting after performance run
 -

 Key: PHOENIX-2145
 URL: https://issues.apache.org/jira/browse/PHOENIX-2145
 Project: Phoenix
  Issue Type: Bug
Reporter: Mujtaba Chohan
Priority: Minor
 Attachments: PHOENIX-2145.patch


 Pherf does not exit when test run completes. Also need to fix other 
 miscellaneous issues:
 * Make update statistics optional as it does not work with large tables
 * Remove duplicate scenario files in project
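
A process that finishes its work but never exits usually has non-daemon worker threads still alive; the standard fix is to shut the pool down when the run completes. A generic sketch (Python's executor as a stand-in for the Java thread pools a tool like Pherf would use; not Pherf's actual code):

```python
from concurrent.futures import ThreadPoolExecutor

def run_workload(tasks):
    # The context manager calls shutdown(wait=True) on exit, so no worker
    # threads linger to keep the process alive after the run completes.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda task: task(), tasks))
```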





[jira] [Commented] (PHOENIX-2137) Range query on DECIMAL DESC sometimes incorrect

2015-08-04 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653923#comment-14653923
 ] 

Samarth Jain commented on PHOENIX-2137:
---

Patch looks good [~jamestaylor]. There are a few additional test cases that 
could be added (unless you know of existing tests that do the below already):
1) Test with SALT_BUCKETS and DESC in row key.
2) Test with DESC and fixed width row key and DESC and variable length row key 
to exercise the byte comparator logic.
3) Tests with NULLS first combined with the above two conditions.
4) Tests with combinations of using skip scan, NULLS first, DESC and 
SALT_BUCKETS.

 Range query on DECIMAL DESC sometimes incorrect
 ---

 Key: PHOENIX-2137
 URL: https://issues.apache.org/jira/browse/PHOENIX-2137
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor
Assignee: James Taylor
 Attachments: PHOENIX-2137.patch, PHOENIX-2137_wip.patch


 The following scenario is not working correctly:
 {code}
 create table t (k1 bigint not null, k2 decimal, constraint pk primary key 
 (k1,k2 desc));
 upsert into t values(1,1.01);
 upsert into t values(2,1.001);
 select * from t where k2 > 1.0; -- No rows, but should be both rows
 select * from t where k1 in (1,2) and k2 > 1.0; -- Same problem
 {code}
 The following queries do return the correct results:
 {code}
 select * from t where k2 > 1.0001;
 select * from t where k1 in (1,2) and k2 > 1.0001;
 {code}
 Note also that without the DESC declaration of k2, everything works fine.
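
The usual DESC trick is to store the one's-complement of each encoded byte, which reverses sort order for fixed-width values but not necessarily for variable-length ones: a value whose encoding is a strict byte prefix of another's still sorts first after inversion. A simplified illustration (naive string encodings, not Phoenix's actual PDecimal byte format):

```python
def desc_invert(encoded):
    # one's-complement every byte, the standard DESC sort-order transform
    return bytes(0xFF - b for b in encoded)

a, b = b"1.0", b"1.01"   # a's encoding is a strict prefix of b's
# Ascending byte order: a < b. After inversion the order should flip,
# but inverting a prefix yields a prefix, so a STILL sorts first.
```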





Issue with PhoenixHBaseStorage

2015-08-04 Thread Siddhi Mehta
Hello,

I am trying to run a pig job in mapreduce mode that tries to write to
HBase using PhoenixHBaseStorage.

I am seeing the reduce task fail with no suitable driver found for the
connection.

AttemptID:attempt_1436998373852_1140_r_00_1 Info:Error:
java.lang.RuntimeException: java.sql.SQLException: No suitable driver
found for jdbc:phoenix:remoteclusterZkQuorum:2181;
at 
org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:88)
at 
org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
at 
org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1576)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.sql.SQLException: No suitable driver found for
jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181;

at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at 
org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
at 
org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:80)
at 
org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:68)
at 
org.apache.phoenix.mapreduce.PhoenixRecordWriter.<init>(PhoenixRecordWriter.java:49)
at 
org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:55)


I checked that the PhoenixDriver is on the classpath for the reduce task by
adding a Class.forName("org.apache.phoenix.jdbc.PhoenixDriver") but the
write still fails.

Has anyone else encountered an issue while trying HBase writes via Pig in
mapreduce mode?

--Siddhi
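
The doubled jdbc:phoenix:jdbc:phoenix: prefix in the stack trace suggests the configured output quorum value already contains the JDBC prefix, which the connection utility then prepends again. A sketch of the failure mode and a defensive guard (hypothetical helper; the real logic lives in org.apache.phoenix.mapreduce.util.ConnectionUtil, in Java):

```python
PREFIX = "jdbc:phoenix:"

def build_output_url(quorum):
    # naive prefixing: doubles the scheme if the config value already has it
    return PREFIX + quorum

def build_output_url_safe(quorum):
    # guard that tolerates an already-prefixed configuration value
    return quorum if quorum.startswith(PREFIX) else PREFIX + quorum
```

Checking what the quorum-related properties are actually set to, as suggested later in the thread, distinguishes a misconfiguration from a driver-registration problem.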


Re: Issue with PhoenixHBaseStorage

2015-08-04 Thread Ravi Kiran
Hi Siddhi,

Which jars of phoenix did you register in your pig script? Can you
please share the version of phoenix you are working on? Also, do you
notice the same behavior with LOAD?

Thanks
Ravi



On Tue, Aug 4, 2015 at 12:47 PM, Siddhi Mehta sm26...@gmail.com wrote:

 Ahh. Sorry ignore the typo in my stacktrace. That is due to me trying to
 remove hostnames from stacktrace.

 The url is jdbc:phoenix:remoteclusterZkQuorum:2181

 --Siddhi

 On Tue, Aug 4, 2015 at 11:35 AM, Samarth Jain sama...@apache.org wrote:

  The jdbc url doesn't look correct - jdbc:phoenix:jdbc:phoenix:
  remoteclusterZkQuorum:2181
 
  It should be jdbc:phoenix:remoteclusterZkQuorum:2181
 
  Do you have the phoneix.mapreduce.output.cluster.quorum configured (take
  note of the typo)? Or hbase.zookeeper.quorum? If yes, what are the values
  set as?
 
 
 
 
 
  On Tue, Aug 4, 2015 at 11:19 AM, Siddhi Mehta sm26...@gmail.com wrote:
 
   Hello,
  
   I am trying to run a pig job with mapreduce mode that tries to write to
   Hbase using PhoenixHBaseStorage.
  
   I am seeing the reduce task fail with no suitable driver found for the
   connection.
  
   AttemptID:attempt_1436998373852_1140_r_00_1 Info:Error:
   java.lang.RuntimeException: java.sql.SQLException: No suitable driver
   found for jdbc:phoenix:remoteclusterZkQuorum:2181;
   at
  
 
 org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
   at
  
 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:88)
   at
  
 
 org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.init(ReduceTask.java:540)
   at
   org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at
  
 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1576)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
   Caused by: java.sql.SQLException: No suitable driver found for
   jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181;
  
   at java.sql.DriverManager.getConnection(DriverManager.java:689)
   at java.sql.DriverManager.getConnection(DriverManager.java:208)
   at
  
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
   at
  
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:80)
   at
  
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:68)
   at
  
 
 org.apache.phoenix.mapreduce.PhoenixRecordWriter.init(PhoenixRecordWriter.java:49)
   at
  
 
 org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:55)
  
  
   I checked that the PhoenixDriver is on the classpath for the reduce
 task
  by
   adding a Class.forName(org.apache.phoenix.jdbc.PhoenixDriver) but the
   write still fails.
  
   Anyone else encountered an issue while trying Hbase writes via Pig in
   mapreduce mode.
  
   --Siddhi
  
 



[jira] [Comment Edited] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run

2015-08-04 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654306#comment-14654306
 ] 

Mujtaba Chohan edited comment on PHOENIX-2145 at 8/4/15 8:45 PM:
-

I'm not removing but rather adding *.properties and *.sh to the classpath so we 
can execute Pherf from the IDE. I won't be removing prod scenarios.


was (Author: mujtabachohan):
I'm not removing but rather adding *.properties and *.sh to classpath so we can 
execute Pherf from IDE. 2. config/schema and config/scenario are included in 
the zip package so there is no need for redundant copy of prod scenario.

 Pherf - Fix threads not exiting after performance run
 -

 Key: PHOENIX-2145
 URL: https://issues.apache.org/jira/browse/PHOENIX-2145
 Project: Phoenix
  Issue Type: Bug
Reporter: Mujtaba Chohan
Priority: Minor
 Attachments: PHOENIX-2145.patch


 Pherf does not exit when test run completes. Also need to fix other 
 miscellaneous issues:
 * Make update statistics optional as it does not work with large tables
 * Remove duplicate scenario files in project





[jira] [Commented] (PHOENIX-1011) Warning from Global Memory Manager Orphaned chunks

2015-08-04 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654179#comment-14654179
 ] 

Samarth Jain commented on PHOENIX-1011:
---

[~Sudeten] - what version of phoenix are you running? This issue was fixed in 
versions 3.2.1 and 4.2.1.

 Warning from Global Memory Manager Orphaned chunks
 --

 Key: PHOENIX-1011
 URL: https://issues.apache.org/jira/browse/PHOENIX-1011
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 3.0.0
 Environment: Phoenix JDBC 3.0.0-incubating
Reporter: Nandanavanam Karthik
Assignee: Mujtaba Chohan

 I am seeing a warning message when running upsert queries using Phoenix JDBC 
 3.0.0-incubating version. I am not seeing any issues as such but the warning 
 is alarming.
 James Taylor asked me to create a Jira ticket for it. Below is the warning I 
 am seeing.
 [Finalizer] WARN org.apache.phoenix.memory.GlobalMemoryManager - Orphaned 
 chunk of 6462 bytes found during finalize
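
The warning indicates a memory chunk was garbage-collected without being explicitly freed; the finalizer reclaims it and logs. A toy model of that pattern (illustrative names, not the GlobalMemoryManager source):

```python
import warnings

class MemoryManager:
    def __init__(self, total_bytes):
        self.available = total_bytes

    def allocate(self, size):
        self.available -= size
        return Chunk(size, self)

class Chunk:
    def __init__(self, size, manager):
        self.size, self.manager, self.freed = size, manager, False

    def free(self):
        if not self.freed:
            self.freed = True
            self.manager.available += self.size

    def __del__(self):
        # mirrors "Orphaned chunk of N bytes found during finalize": the
        # chunk was dropped without free(), so warn and reclaim its memory
        if not self.freed:
            warnings.warn("Orphaned chunk of %d bytes found during finalize" % self.size)
            self.free()
```

The warning is benign in the sense that the memory is reclaimed, but it flags a call site that skipped the explicit free, which is why it disappears in the fixed versions mentioned above.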





Re: Issue with PhoenixHBaseStorage

2015-08-04 Thread Siddhi Mehta
Ahh. Sorry, ignore the typo in my stacktrace. That is due to me trying to
remove hostnames from the stacktrace.

The url is jdbc:phoenix:remoteclusterZkQuorum:2181

--Siddhi

On Tue, Aug 4, 2015 at 11:35 AM, Samarth Jain sama...@apache.org wrote:

 The jdbc url doesn't look correct - jdbc:phoenix:jdbc:phoenix:
 remoteclusterZkQuorum:2181

 It should be jdbc:phoenix:remoteclusterZkQuorum:2181

 Do you have the phoneix.mapreduce.output.cluster.quorum configured (take
 note of the typo)? Or hbase.zookeeper.quorum? If yes, what are the values
 set as?





 On Tue, Aug 4, 2015 at 11:19 AM, Siddhi Mehta sm26...@gmail.com wrote:

  Hello,
 
  I am trying to run a pig job with mapreduce mode that tries to write to
  Hbase using PhoenixHBaseStorage.
 
  I am seeing the reduce task fail with no suitable driver found for the
  connection.
 
  AttemptID:attempt_1436998373852_1140_r_00_1 Info:Error:
  java.lang.RuntimeException: java.sql.SQLException: No suitable driver
  found for jdbc:phoenix:remoteclusterZkQuorum:2181;
  at
 
 org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
  at
 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:88)
  at
 
 org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.init(ReduceTask.java:540)
  at
  org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at
 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1576)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
  Caused by: java.sql.SQLException: No suitable driver found for
  jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181;
 
  at java.sql.DriverManager.getConnection(DriverManager.java:689)
  at java.sql.DriverManager.getConnection(DriverManager.java:208)
  at
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
  at
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:80)
  at
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:68)
  at
 
 org.apache.phoenix.mapreduce.PhoenixRecordWriter.init(PhoenixRecordWriter.java:49)
  at
 
 org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:55)
 
 
  I checked that the PhoenixDriver is on the classpath for the reduce task
 by
  adding a Class.forName(org.apache.phoenix.jdbc.PhoenixDriver) but the
  write still fails.
 
  Anyone else encountered an issue while trying Hbase writes via Pig in
  mapreduce mode.
 
  --Siddhi
 



[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run

2015-08-04 Thread Cody Marcel (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654330#comment-14654330
 ] 

Cody Marcel commented on PHOENIX-2145:
--

Cool +1

 Pherf - Fix threads not exiting after performance run
 -

 Key: PHOENIX-2145
 URL: https://issues.apache.org/jira/browse/PHOENIX-2145
 Project: Phoenix
  Issue Type: Bug
Reporter: Mujtaba Chohan
Priority: Minor
 Attachments: PHOENIX-2145.patch


 Pherf does not exit when test run completes. Also need to fix other 
 miscellaneous issues:
 * Make update statistics optional as it does not work with large tables
 * Remove duplicate scenario files in project





[jira] [Comment Edited] (PHOENIX-953) Support UNNEST for ARRAY

2015-08-04 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654280#comment-14654280
 ] 

James Taylor edited comment on PHOENIX-953 at 8/4/15 8:13 PM:
--

Questions from [~ram_krish] & [~Dumindux]:
{quote}
- What is the initial scope of UNNEST we will target?
- As I can read from the description, UNNEST can be used as a full-table-like 
structure for doing JOINs, etc.
- There can be cases like
SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1);

- The UNNEST without 'FROM' clause.

Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will 
not implement UNNEST as a function I believe.
We will add an entry in the grammar file and have an expression for UNNEST and 
for the UNNEST expression we may need a new type of compilation and a new type 
of result iterator on the Column Projector right?

So the KV that is getting returned back to the client ( I mean per KV) we will 
need to iterate the value part of it. Am not sure whether the normal iterators 
would do this work. 
{quote}

Yes, we'll need to add UNNEST support to the grammar on par with other 
expressions in the term rule:
{code}
term returns [ParseNode ret]
:   e=literal_or_bind { $ret = e; }
|   field=identifier { $ret = factory.column(null,field,field); }
|   UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); }
...
{code}
We should be able to handle UNNEST purely at the compile layer. It should be 
equivalent to the following (which should already work):
{code}
SELECT ( SELECT a[1],a[2],a[3],a[4]... )
{code}
The inner select will return one row per array element. We may have a special 
QueryPlan derived from EmptyTableQueryPlan specific for this case (like 
UnnestArrayQueryPlan) that'll simply project all the elements of the array 
expression from the UNNEST call.

I don't think any runtime changes will be necessary. If there are places in 
which we don't support this, then those can be fixed in future work (and/or 
when we move to Calcite).

Thoughts, [~maryannxue]?


was (Author: jamestaylor):
Questions from [~ram_krish] & [~Dumindux]:
{quote}
- What is the initial scope of UNNEST we will target?
- As I can read from the description, UNNEST can be used as a full-table-like 
structure for doing JOINS etc.
- There can be cases like
SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1);

- The UNNEST without 'FROM' clause.

Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will 
not implement UNNEST as a function I believe.
We will add an entry in the grammar file and have an expression for UNNEST and 
for the UNNEST expression we may need a new type of compilation and a new type 
of result iterator on the Column Projector right?

So the KV that is getting returned back to the client ( I mean per KV) we will 
need to iterate the value part of it. Am not sure whether the normal iterators 
would do this work. 
{quote}

Yes, we'll need to add UNNEST support to the grammar on par with other 
expressions in the term rule:
{code}
term returns [ParseNode ret]
:   e=literal_or_bind { $ret = e; }
|   field=identifier { $ret = factory.column(null,field,field); }
|   UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); }
...
{code}
We should be able to handle UNNEST purely at the compile layer. It should be 
equivalent to the following (which should already work):
{code}
SELECT ( SELECT a[1],a[2],a[3],a[4]... )
{code}
The inner select will return one row per array element. We may have a special 
QueryPlan derived from EmptyTableQueryPlan specific for this case (like 
UnnestArrayQueryPlan) that'll simply project all the elements of the array 
expression from the UNNEST call.

I don't think any runtime changes will be necessary. If there are places in 
which we don't support this, then those can be fixed in future work (and/or 
when we move to Calcite).

Thoughts, [~maryannxue]?

 Support UNNEST for ARRAY
 

 Key: PHOENIX-953
 URL: https://issues.apache.org/jira/browse/PHOENIX-953
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Dumindu Buddhika

 The UNNEST built-in function converts an array into a set of rows. This is 
 more than a built-in function, so should be considered an advanced project.
 For an example, see the following Postgres documentation: 
 http://www.postgresql.org/docs/8.4/static/functions-array.html
 http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html
 http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html
 So the UNNEST is a way of converting an array to a flattened table which 
 can then be filtered on, ordered, grouped, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY

2015-08-04 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654280#comment-14654280
 ] 

James Taylor commented on PHOENIX-953:
--

Questions from [~ram_krish] & [~Dumindux]:
{quote}
- What is the initial scope of UNNEST we will target?
- As I can read from the description, UNNEST can be used as a full-table-like 
structure for doing JOINS etc.
- There can be cases like
SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1);

- The UNNEST without 'FROM' clause.

Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will 
not implement UNNEST as a function I believe.
We will add an entry in the grammar file and have an expression for UNNEST and 
for the UNNEST expression we may need a new type of compilation and a new type 
of result iterator on the Column Projector right?

So the KV that is getting returned back to the client ( I mean per KV) we will 
need to iterate the value part of it. Am not sure whether the normal iterators 
would do this work. 
{quote}

Yes, we'll need to add UNNEST support to the grammar on par with other 
expressions in the term rule:
{code}
term returns [ParseNode ret]
:   e=literal_or_bind { $ret = e; }
|   field=identifier { $ret = factory.column(null,field,field); }
|   UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); }
...
{code}
We should be able to handle UNNEST purely at the compile layer. It should be 
equivalent to the following (which should already work):
{code}
SELECT ( SELECT a[1],a[2],a[3],a[4]... )
{code}
The inner select will return one row per array element. We may have a special 
QueryPlan derived from EmptyTableQueryPlan specific for this case (like 
UnnestArrayQueryPlan) that'll simply project all the elements of the array 
expression from the UNNEST call.

I don't think any runtime changes will be necessary. If there are places in 
which we don't support this, then those can be fixed in future work (and/or 
when we move to Calcite).

Thoughts, [~maryannxue]?

 Support UNNEST for ARRAY
 

 Key: PHOENIX-953
 URL: https://issues.apache.org/jira/browse/PHOENIX-953
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Dumindu Buddhika

 The UNNEST built-in function converts an array into a set of rows. This is 
 more than a built-in function, so should be considered an advanced project.
 For an example, see the following Postgres documentation: 
 http://www.postgresql.org/docs/8.4/static/functions-array.html
 http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html
 http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html
 So the UNNEST is a way of converting an array to a flattened table which 
 can then be filtered on, ordered, grouped, etc.





[jira] [Commented] (PHOENIX-1011) Warning from Global Memory Manager Orphaned chunks

2015-08-04 Thread Kurt Woitke (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654173#comment-14654173
 ] 

Kurt Woitke commented on PHOENIX-1011:
--

We are having the same issue.  PLEASE ADDRESS THIS SOON.  Thanks.

 Warning from Global Memory Manager Orphaned chunks
 --

 Key: PHOENIX-1011
 URL: https://issues.apache.org/jira/browse/PHOENIX-1011
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 3.0.0
 Environment: Phoenix JDBC 3.0.0-incubating
Reporter: Nandanavanam Karthik
Assignee: Mujtaba Chohan

 I am seeing a warning message when running upsert queries using Phoenix JDBC 
 3.0.0-incubating version. I am not seeing any issues as such but the warning 
 is alarming.
 James Taylor asked me to create a Jira ticket for it. Below is the warning I 
 am seeing.
 [Finalizer] WARN org.apache.phoenix.memory.GlobalMemoryManager - Orphaned 
 chunk of 6462 bytes found during finalize





[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run

2015-08-04 Thread Cody Marcel (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654294#comment-14654294
 ] 

Cody Marcel commented on PHOENIX-2145:
--

Can you add the prod scenario back? It's there to smoke test a pherf install on 
a real cluster. test/resources are not packaged into the zip file.

 Pherf - Fix threads not exiting after performance run
 -

 Key: PHOENIX-2145
 URL: https://issues.apache.org/jira/browse/PHOENIX-2145
 Project: Phoenix
  Issue Type: Bug
Reporter: Mujtaba Chohan
Priority: Minor
 Attachments: PHOENIX-2145.patch


 Pherf does not exit when test run completes. Also need to fix other 
 miscellaneous issues:
 * Make update statistics optional as it does not work with large tables
 * Remove duplicate scenario files in project





[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run

2015-08-04 Thread Cody Marcel (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654296#comment-14654296
 ] 

Cody Marcel commented on PHOENIX-2145:
--

<include>**/*.properties</include>

Also this was to include pherf.properties on the cp from the Intellij. I'd like 
to keep that as well.


 Pherf - Fix threads not exiting after performance run
 -

 Key: PHOENIX-2145
 URL: https://issues.apache.org/jira/browse/PHOENIX-2145
 Project: Phoenix
  Issue Type: Bug
Reporter: Mujtaba Chohan
Priority: Minor
 Attachments: PHOENIX-2145.patch


 Pherf does not exit when test run completes. Also need to fix other 
 miscellaneous issues:
 * Make update statistics optional as it does not work with large tables
 * Remove duplicate scenario files in project





[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run

2015-08-04 Thread Mujtaba Chohan (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654306#comment-14654306
 ] 

Mujtaba Chohan commented on PHOENIX-2145:
-

I'm not removing but rather adding *.properties and *.sh to the classpath so we can 
execute Pherf from the IDE. 2. config/schema and config/scenario are included in 
the zip package, so there is no need for a redundant copy of the prod scenario.

 Pherf - Fix threads not exiting after performance run
 -

 Key: PHOENIX-2145
 URL: https://issues.apache.org/jira/browse/PHOENIX-2145
 Project: Phoenix
  Issue Type: Bug
Reporter: Mujtaba Chohan
Priority: Minor
 Attachments: PHOENIX-2145.patch


 Pherf does not exit when test run completes. Also need to fix other 
 miscellaneous issues:
 * Make update statistics optional as it does not work with large tables
 * Remove duplicate scenario files in project





[jira] [Comment Edited] (PHOENIX-953) Support UNNEST for ARRAY

2015-08-04 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654280#comment-14654280
 ] 

James Taylor edited comment on PHOENIX-953 at 8/4/15 8:15 PM:
--

Questions from [~ram_krish] & [~Dumindux]:
{quote}
- What is the initial scope of UNNEST we will target?
- As I can read from the description, UNNEST can be used as a full-table-like 
structure for doing JOINS etc.
- There can be cases like
SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1);

- The UNNEST without 'FROM' clause.

Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will 
not implement UNNEST as a function I believe.
We will add an entry in the grammar file and have an expression for UNNEST and 
for the UNNEST expression we may need a new type of compilation and a new type 
of result iterator on the Column Projector right?

So the KV that is getting returned back to the client ( I mean per KV) we will 
need to iterate the value part of it. Am not sure whether the normal iterators 
would do this work. 
{quote}

Yes, we'll need to add UNNEST support to the grammar on par with other 
expressions in the term rule:
{code}
term returns [ParseNode ret]
:   e=literal_or_bind { $ret = e; }
|   field=identifier { $ret = factory.column(null,field,field); }
|   UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); }
...
{code}
We should be able to handle UNNEST purely at the compile layer. It should be 
equivalent to the following (which should already work):
{code}
SELECT ( SELECT a[1],a[2],a[3],a[4]... )
{code}
The inner select will return one row per array element. We may have a special 
QueryPlan derived from EmptyTableQueryPlan specific for this case (like 
UnnestArrayQueryPlan) that'll simply project all the elements of the array 
expression from the UNNEST call.

I don't think any runtime changes will be necessary. If there are places in 
which we don't support this, then those can be fixed in future work (and/or 
when we move to Calcite).

First cut, I don't think we need to support the DISTINCT keyword. Not sure if 
our SELECT without FROM clause supports that currently. In theory, you might be 
able to put it in an outer SELECT instead.

Thoughts, [~maryannxue]?


was (Author: jamestaylor):
Questions from [~ram_krish] & [~Dumindux]:
{quote}
- What is the initial scope of UNNEST we will target?
- As I can read from the description, UNNEST can be used as a full-table-like 
structure for doing JOINS etc.
- There can be cases like
SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1);

- The UNNEST without 'FROM' clause.

Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will 
not implement UNNEST as a function I believe.
We will add an entry in the grammar file and have an expression for UNNEST and 
for the UNNEST expression we may need a new type of compilation and a new type 
of result iterator on the Column Projector right?

So the KV that is getting returned back to the client ( I mean per KV) we will 
need to iterate the value part of it. Am not sure whether the normal iterators 
would do this work. 
{quote}

Yes, we'll need to add UNNEST support to the grammar on par with other 
expressions in the term rule:
{code}
term returns [ParseNode ret]
:   e=literal_or_bind { $ret = e; }
|   field=identifier { $ret = factory.column(null,field,field); }
|   UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); }
...
{code}
We should be able to handle UNNEST purely at the compile layer. It should be 
equivalent to the following (which should already work):
{code}
SELECT ( SELECT a[1],a[2],a[3],a[4]... )
{code}
The inner select will return one row per array element. We may have a special 
QueryPlan derived from EmptyTableQueryPlan specific for this case (like 
UnnestArrayQueryPlan) that'll simply project all the elements of the array 
expression from the UNNEST call.

I don't think any runtime changes will be necessary. If there are places in 
which we don't support this, then those can be fixed in future work (and/or 
when we move to Calcite).

Thoughts, [~maryannxue]?

 Support UNNEST for ARRAY
 

 Key: PHOENIX-953
 URL: https://issues.apache.org/jira/browse/PHOENIX-953
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Dumindu Buddhika

 The UNNEST built-in function converts an array into a set of rows. This is 
 more than a built-in function, so should be considered an advanced project.
 For an example, see the following Postgres documentation: 
 http://www.postgresql.org/docs/8.4/static/functions-array.html
 http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html
 http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html
 So the UNNEST is a way of converting an array to a flattened table which 
 can then be filtered on, ordered, grouped, etc.

Re: Issue with PhoenixHBaseStorage

2015-08-04 Thread Siddhi Mehta
Hey Ravi,

I register both the phoenix-core jar as well as phoenix-pig jar.
Version: 4.5.0
I havent tried load. Will give it a try.


Thanks,
--Siddhi


On Tue, Aug 4, 2015 at 1:24 PM, Ravi Kiran maghamraviki...@gmail.com
wrote:

 Hi Siddhi,

 Which jars of phoenix did you register in your pig script. Can you
 please share the version of phoenix you are working on .  Also, do you
 notice the same behavior with LOAD also ?

 Thanks
 Ravi



 On Tue, Aug 4, 2015 at 12:47 PM, Siddhi Mehta sm26...@gmail.com wrote:

  Ahh. Sorry ignore the typo in my stacktrace. That is due to me trying to
  remove hostnames from stacktrace.
 
  The url is jdbc:phoenix:remoteclusterZkQuorum:2181
 
  --Siddhi
 
  On Tue, Aug 4, 2015 at 11:35 AM, Samarth Jain sama...@apache.org
 wrote:
 
   The jdbc url doesn't look correct - jdbc:phoenix:jdbc:phoenix:
   remoteclusterZkQuorum:2181
  
   It should be jdbc:phoenix:remoteclusterZkQuorum:2181
  
   Do you have the phoneix.mapreduce.output.cluster.quorum configured
 (take
   note of the typo)? Or hbase.zookeeper.quorum? If yes, what are the
 values
   set as?
  
  
  
  
  
   On Tue, Aug 4, 2015 at 11:19 AM, Siddhi Mehta sm26...@gmail.com
 wrote:
  
Hello,
   
I am trying to run a pig job with mapreduce mode that tries to write
 to
Hbase using PhoenixHBaseStorage.
   
I am seeing the reduce task fail with no suitable driver found for
 the
connection.
   
AttemptID:attempt_1436998373852_1140_r_00_1 Info:Error:
java.lang.RuntimeException: java.sql.SQLException: No suitable driver
found for jdbc:phoenix:remoteclusterZkQuorum:2181;
at
   
  
 
 org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
at
   
  
 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:88)
at
   
  
 
  org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
at
   
 org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at
 org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at
 org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
   
  
 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1576)
at
 org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.sql.SQLException: No suitable driver found for
jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181;
   
at
 java.sql.DriverManager.getConnection(DriverManager.java:689)
at
 java.sql.DriverManager.getConnection(DriverManager.java:208)
at
   
  
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
at
   
  
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:80)
at
   
  
 
 org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:68)
at
   
  
 
  org.apache.phoenix.mapreduce.PhoenixRecordWriter.<init>(PhoenixRecordWriter.java:49)
at
   
  
 
 org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:55)
   
   
I checked that the PhoenixDriver is on the classpath for the reduce
  task
   by
adding a Class.forName(org.apache.phoenix.jdbc.PhoenixDriver) but
 the
write still fails.
   
Anyone else encountered an issue while trying Hbase writes via Pig in
mapreduce mode.
   
--Siddhi
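The root cause visible in the stack trace above is a URL that ends up as `jdbc:phoenix:jdbc:phoenix:...`, i.e. the scheme prefix prepended twice, which no registered driver accepts. A minimal sketch of the kind of normalization guard that avoids this (illustrative Python; `normalize_phoenix_url` is a name invented here, not a Phoenix API):

```python
PREFIX = "jdbc:phoenix:"

def normalize_phoenix_url(quorum_or_url):
    """Return a JDBC URL with exactly one 'jdbc:phoenix:' scheme prefix,
    accepting either a bare ZK quorum ('host:2181') or a full URL."""
    rest = quorum_or_url
    while rest.startswith(PREFIX):  # strip however many prefixes were prepended
        rest = rest[len(PREFIX):]
    return PREFIX + rest

assert normalize_phoenix_url("zkhost:2181") == "jdbc:phoenix:zkhost:2181"
# The failure mode from this thread: the scheme was prepended a second time.
assert normalize_phoenix_url("jdbc:phoenix:jdbc:phoenix:zkhost:2181") == "jdbc:phoenix:zkhost:2181"
```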
   
  
 



[jira] [Commented] (PHOENIX-1011) Warning from Global Memory Manager Orphaned chunks

2015-08-04 Thread Kurt Woitke (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654324#comment-14654324
 ] 

Kurt Woitke commented on PHOENIX-1011:
--

We are running version 4.1 with Hadoop and HBase 0.98.  Can we upgrade 
Phoenix easily without creating any problems with the rest of the software that 
we are using?  What version can we upgrade to? 

 Warning from Global Memory Manager Orphaned chunks
 --

 Key: PHOENIX-1011
 URL: https://issues.apache.org/jira/browse/PHOENIX-1011
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 3.0.0
 Environment: Phoenix JDBC 3.0.0-incubating
Reporter: Nandanavanam Karthik
Assignee: Mujtaba Chohan

 I am seeing a warning message when running upsert queries using Phoenix JDBC 
 3.0.0-incubating version. I am not seeing any issues as such but the warning 
 is alarming.
 James Taylor asked me to create a Jira ticket for it. Below is the warning I 
 am seeing.
 [Finalizer] WARN org.apache.phoenix.memory.GlobalMemoryManager - Orphaned 
 chunk of 6462 bytes found during finalize





Re: [DISCUSS] drop 5.0.0 label for JIRAs

2015-08-04 Thread Nick Dimiduk
Sorry if this has already been resolved, I'm just getting back from an
extended absence.

I'm +1 for this as well. Can we remove 5.0.0 label from JIRA as well?
Presumably there's nothing with just the 5.0.0 label, so a simple removal
should suffice. I haven't checked though...

On Thu, Jul 9, 2015 at 5:21 AM, Josh Mahonin jmaho...@interset.com wrote:

 +1

 I think the version / branching system is pretty confusing. Any steps to
 minimize that sound good to me.

 On Wed, Jul 8, 2015 at 7:55 PM, James Taylor jamestay...@apache.org
 wrote:

  Given that 5.0.0 is not imminent and everything from the 4.x release will
  appear in it, I propose that we no longer label our JIRAs with 5.0.0 in
 the
  fixVersion field. We can continue to use 4.4.1 and 4.5.0 as an indicator
 for
  the target release of the fix.
 
  Thoughts?
 
  Thanks,
  James
 



[jira] [Comment Edited] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search

2015-08-04 Thread Dan Meany (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654732#comment-14654732
 ] 

Dan Meany edited comment on PHOENIX-2152 at 8/5/15 3:01 AM:


Looking at the long-term big picture, I wonder if it makes sense to try to move in 
the direction of supporting the same functions as PostGIS (http://postgis.net/)

http://gis.stackexchange.com/questions/112426/postgres-performance-with-radius-searches

http://revenant.ca/www/postgis/workshop/index.html

In the intermediate term, a bundle of spatial functions (or just the DISTANCE 
one) packaged natively with Phoenix reduces the complexity of the user having 
to deploy/update their own udf jars and set the hbase-site flag to allow udfs 
on all clients/servers.
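As a concrete reference point for the DISTANCE function mentioned above, here is a minimal great-circle (haversine) distance sketch. This is illustrative Python, not a Phoenix UDF, and the function name and radius constant are assumptions of this sketch:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points
    given in decimal degrees (mean Earth radius 6371 km)."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Same point gives zero; antipodal points give half the circumference.
assert haversine_km(0, 0, 0, 0) == 0.0
assert abs(haversine_km(0, 0, 0, 180) - math.pi * 6371.0) < 1e-6
```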









was (Author: danmeany):
Looking at the long-term big picture, I wonder if it makes sense to try to move in 
the direction of supporting the same functions as PostGIS (http://postgis.net/)

http://gis.stackexchange.com/questions/112426/postgres-performance-with-radius-searches

http://revenant.ca/www/postgis/workshop/index.html

In the intermediate term, a bundle of spatial functions (or just the DISTANCE 
one) packaged natively with Phoenix reduces the complexity of the user having 
to deploying/update their own udf jars and setting the hbase-site flag to allow 
udfs on all clients/servers.








 Ability to create spatial indexes (geohash) with bounding box / radius search
 -

 Key: PHOENIX-2152
 URL: https://issues.apache.org/jira/browse/PHOENIX-2152
 Project: Phoenix
  Issue Type: New Feature
Reporter: Dan Meany
Priority: Minor
   Original Estimate: 672h
  Remaining Estimate: 672h

 Add the ability to create spatial indexes such as in Elastic Search, MongoDB, 
 Oracle, etc.
 http://gis.stackexchange.com/questions/18330/would-it-be-possible-to-use-geohash-for-proximity-searches
 https://github.com/davetroy/geohash-js





[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY

2015-08-04 Thread Dumindu Buddhika (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654835#comment-14654835
 ] 

Dumindu Buddhika commented on PHOENIX-953:
--

Thanks for the comments James and Maryann. I understand the basic idea of the 
implementation now. I will try to come up with a patch for the initial scope 
with help from Ram.

 Support UNNEST for ARRAY
 

 Key: PHOENIX-953
 URL: https://issues.apache.org/jira/browse/PHOENIX-953
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor
Assignee: Dumindu Buddhika

 The UNNEST built-in function converts an array into a set of rows. This is 
 more than a built-in function, so should be considered an advanced project.
 For an example, see the following Postgres documentation: 
 http://www.postgresql.org/docs/8.4/static/functions-array.html
 http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html
 http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html
 So the UNNEST is a way of converting an array to a flattened table which 
 can then be filtered on, ordered, grouped, etc.





Re: [jira] [Created] (PHOENIX-1609) MR job to populate index tables

2015-08-04 Thread Thomas D'Silva

 A. How many mappers ran and what is the setting of io.sort.mb

64 mappers ran and io.sort.mb=256
We don't set the number of reducers so it used a default of 1, should
this be increased?

 B. How much data was written as mapper output . Also how many threads were
 used by the mapper to aggregate the mapper output
Do you know how to get this information?

 C. From the job history,  are there any stragglers and the average
 execution time of each map task

Average Map Time: 56mins, 30sec
Average Shuffle Time: 1hrs, 43mins, 16sec
Average Merge Time: 2sec
Average Reduce Time: 16hrs, 24mins, 59sec



 D

 On Tuesday, August 4, 2015, Thomas D'Silva (JIRA) j...@apache.org wrote:


 [
 https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654557#comment-14654557
 ]

 Thomas D'Silva commented on PHOENIX-1609:
 -

 [~maghamraviki...@gmail.com] [~jamestaylor]

 I am trying to compare the performance of the map reduce index build vs
 the regular UPSERT SELECT based index build. One a 1 billion row table with
 19 columns the regular index build takes 8.5 hours compared to the map
 reduce index build which takes ~23 hours. Do you know if there are any
 special config settings I could use to speed up the MR index build ?

  MR job to populate index tables
  
 
  Key: PHOENIX-1609
  URL: https://issues.apache.org/jira/browse/PHOENIX-1609
  Project: Phoenix
   Issue Type: New Feature
 Reporter: maghamravikiran
 Assignee: maghamravikiran
  Fix For: 5.0.0, 4.4.0
 
  Attachments: 0001-PHOENIX-1609-4.0.patch,
 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-wip.patch,
 0001-PHOENIX_1609.patch, PHOENIX-1609-master.patch
 
 
  Often, we need to create new indexes on master tables way after the data
 exists on the master tables.  It would be good to have a simple MR job
 given by the phoenix code that users can call to have indexes in sync with
 the master table.
  Users can invoke the MR job using the following command
  hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt
 INDEX_TABLE -columns a,b,c
  Is this ideal?






[jira] [Commented] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search

2015-08-04 Thread Dan Meany (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654732#comment-14654732
 ] 

Dan Meany commented on PHOENIX-2152:


Looking at the long-term big picture, I wonder if it makes sense to try to move in 
the direction of supporting the same functions as PostGIS (http://postgis.net/)

http://gis.stackexchange.com/questions/112426/postgres-performance-with-radius-searches

http://revenant.ca/www/postgis/workshop/index.html

In the intermediate term, a bundle of spatial functions (or just the DISTANCE 
one) packaged natively with Phoenix reduces the complexity of the user having 
to deploy/update their own udf jars and set the hbase-site flag to allow 
udfs on all clients/servers.








 Ability to create spatial indexes (geohash) with bounding box / radius search
 -

 Key: PHOENIX-2152
 URL: https://issues.apache.org/jira/browse/PHOENIX-2152
 Project: Phoenix
  Issue Type: New Feature
Reporter: Dan Meany
Priority: Minor
   Original Estimate: 672h
  Remaining Estimate: 672h

 Add the ability to create spatial indexes such as in Elastic Search, MongoDB, 
 Oracle, etc.
 http://gis.stackexchange.com/questions/18330/would-it-be-possible-to-use-geohash-for-proximity-searches
 https://github.com/davetroy/geohash-js


