[jira] [Commented] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search
[ https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653729#comment-14653729 ] Dan Meany commented on PHOENIX-2152: How I did it manually: https://github.com/threedliteguy/General/wiki/Adding-spatial-data-queries-to-Phoenix-on-HBase Ability to create spatial indexes (geohash) with bounding box / radius search - Key: PHOENIX-2152 URL: https://issues.apache.org/jira/browse/PHOENIX-2152 Project: Phoenix Issue Type: New Feature Reporter: Dan Meany Priority: Minor Original Estimate: 672h Remaining Estimate: 672h Add the ability to create spatial indexes such as in Elasticsearch, MongoDB, Oracle, etc. http://gis.stackexchange.com/questions/18330/would-it-be-possible-to-use-geohash-for-proximity-searches https://github.com/davetroy/geohash-js -- This message was sent by Atlassian JIRA (v6.3.4#6332)
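The wiki write-up linked above stores a geohash alongside the coordinates so that proximity queries become prefix/range scans on an indexed column. As a rough illustration of why that works (a generic geohash encoder sketched here, not code from the wiki or from Phoenix), nearby points share a common geohash prefix, so a bounding-box search reduces to a range scan on the geohash column:

```python
# Minimal geohash encoder (standard base32 alphabet), for illustration only.
# Nearby coordinates yield strings with a shared prefix, which is what makes
# a secondary index on the geohash column usable for bounding-box queries.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=8):
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    even = True          # bits alternate: even -> longitude, odd -> latitude
    bit_count, ch, out = 0, 0, []
    while len(out) < precision:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                ch = (ch << 1) | 1
                lon_lo = mid
            else:
                ch = ch << 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                ch = (ch << 1) | 1
                lat_lo = mid
            else:
                ch = ch << 1
                lat_hi = mid
        even = not even
        bit_count += 1
        if bit_count == 5:   # every 5 bits becomes one base32 character
            out.append(BASE32[ch])
            bit_count, ch = 0, 0
    return "".join(out)
```

For example, `geohash(57.64911, 10.40744, 11)` yields the well-known test vector `u4pruydqqvj`; truncating the precision widens the cell, which is how the radius of the search is controlled.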
[jira] [Commented] (PHOENIX-1661) Implement built-in functions for JSON
[ https://issues.apache.org/jira/browse/PHOENIX-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653420#comment-14653420 ] ASF GitHub Bot commented on PHOENIX-1661: - Github user ictwanglei closed the pull request at: https://github.com/apache/phoenix/pull/93 Implement built-in functions for JSON - Key: PHOENIX-1661 URL: https://issues.apache.org/jira/browse/PHOENIX-1661 Project: Phoenix Issue Type: Sub-task Reporter: James Taylor Assignee: LeiWang Labels: JSON, Java, SQL, gsoc2015, mentor Attachments: PhoenixJSONSpecification-First-Draft.pdf Take a look at the JSON built-in functions that are implemented in Postgres (http://www.postgresql.org/docs/9.3/static/functions-json.html) and implement the same for Phoenix in Java following this guide: http://phoenix-hbase.blogspot.com/2013/04/how-to-add-your-own-built-in-function.html Examples of functions include ARRAY_TO_JSON, ROW_TO_JSON, TO_JSON, etc. The implementation of these built-in functions will be impacted by how JSON is stored in Phoenix. See PHOENIX-628. An initial implementation could work off of a simple text-based JSON representation and then when a native JSON type is implemented, they could be reworked to be more efficient.
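For reference, the Postgres semantics being mirrored are simple to state: ARRAY_TO_JSON serializes a SQL array as a JSON array, and ROW_TO_JSON serializes a row as a JSON object keyed by column name. A hedged sketch of those semantics in plain Python (not the proposed Phoenix implementation, whose storage format is still open per PHOENIX-628):

```python
import json

# Sketch of the Postgres behavior the proposed Phoenix built-ins would mirror.
# Function names match the SQL functions; this is not Phoenix code.

def array_to_json(arr):
    # ARRAY_TO_JSON: SQL array -> JSON array text
    return json.dumps(arr, separators=(",", ":"))

def row_to_json(column_names, row):
    # ROW_TO_JSON: row -> JSON object keyed by column name
    return json.dumps(dict(zip(column_names, row)), separators=(",", ":"))
```

So `ROW_TO_JSON` over a row `(1, 'ann')` with columns `(id, name)` would produce `{"id":1,"name":"ann"}`, which is the behavior an initial text-based Phoenix implementation could target.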
[GitHub] phoenix pull request: PHOENIX-1661 Implement built-in functions fo...
Github user ictwanglei closed the pull request at: https://github.com/apache/phoenix/pull/93 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search
[ https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653798#comment-14653798 ] James Taylor commented on PHOENIX-2152: --- Awesome, [~danmeany]. Great use case for UDFs and secondary indexing. Would be interested to hear what's missing from Phoenix to make this easier/more seamless.
[jira] [Updated] (PHOENIX-2159) Grammar changes and DDL support for surfacing native HBase timestamp
[ https://issues.apache.org/jira/browse/PHOENIX-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samarth Jain updated PHOENIX-2159: -- Attachment: PHOENIX-2159_v2.patch Thanks for the review [~jamestaylor]. Attached is the updated patch with additional tests. Grammar changes and DDL support for surfacing native HBase timestamp Key: PHOENIX-2159 URL: https://issues.apache.org/jira/browse/PHOENIX-2159 Project: Phoenix Issue Type: Sub-task Reporter: Samarth Jain Assignee: Samarth Jain Attachments: PHOENIX-2159.patch, PHOENIX-2159_v2.patch
[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY
[ https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654752#comment-14654752 ] Maryann Xue commented on PHOENIX-953: - Yes, the very fundamental thing we need for unnest is an UnnestArrayResultIterator (maybe as an inner class of UnnestArrayQueryPlan), which takes an array and emits a new tuple for each of the array's elements. The UnnestArrayQueryPlan may derive from DelegateQueryPlan, which has an input plan. For example, for select unnest(array1) from t, we'll get a ScanPlan on table 't' wrapped by an UnnestArrayQueryPlan. I just implemented the extension for VALUES (https://docs.oracle.com/javadb/10.8.3.0/devguide/cdevtricks807365.html) on the calcite branch and committed a new class called LiteralResultIterationQueryPlan (which is an enhanced version of EmptyTableQueryPlan). I think this should be helpful for the no-FROM-clause UNNEST and might be exactly what it needs. Support UNNEST for ARRAY Key: PHOENIX-953 URL: https://issues.apache.org/jira/browse/PHOENIX-953 Project: Phoenix Issue Type: Sub-task Reporter: James Taylor Assignee: Dumindu Buddhika The UNNEST built-in function converts an array into a set of rows. This is more than a built-in function, so should be considered an advanced project. For an example, see the following Postgres documentation: http://www.postgresql.org/docs/8.4/static/functions-array.html http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html So UNNEST is a way of converting an array to a flattened table which can then be filtered on, ordered, grouped, etc.
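The iterator Maryann describes is easy to picture outside Phoenix: wrap the input plan's rows and emit one output tuple per element of the array column. A short sketch with hypothetical names (the real class would implement Phoenix's iterator interfaces; `unnest_rows` and the `"unnest"` output column are illustrative only):

```python
# Sketch of what an UnnestArrayResultIterator boils down to: for each row
# produced by the inner plan (e.g. a ScanPlan on table t), emit one tuple
# per element of the unnested array column. Not the actual Phoenix API.
def unnest_rows(rows, array_col):
    for row in rows:
        for element in row[array_col]:
            out = {k: v for k, v in row.items() if k != array_col}
            out["unnest"] = element   # flattened value replaces the array
            yield out
```

So `SELECT UNNEST(array1) FROM t` corresponds to wrapping the scan over `t` with this flattening step, which is exactly the UnnestArrayQueryPlan-over-ScanPlan shape described above.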
[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY
[ https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654772#comment-14654772 ] ramkrishna.s.vasudevan commented on PHOENIX-953: I think our initial scope is going to be UNNEST without a FROM clause and a simple 'SELECT UNNEST(ARRAY) from 'table''. bq. We should be able to handle UNNEST purely at the compile layer. Fine. bq. The inner select will return one row per array element. We may have a special QueryPlan derived from EmptyTableQueryPlan specific for this case (like UnnestArrayQueryPlan) that'll simply project all the elements of the array expression from the UNNEST call. This would be for the UNNEST without FROM clause, I think. bq. Yes, the very fundamental thing we need for unnest is an UnnestArrayResultIterator (maybe as an inner class of UnnestArrayQueryPlan) True. I think the basic thing is more or less what we were discussing. I could help Dumindu with these patches. Reviews and suggestions welcome. Thanks for the comments James and Maryann. Anyway, will come back here to discuss once we do the code development. [~Dumindux] Do you have any comments or questions?
[jira] [Commented] (PHOENIX-1609) MR job to populate index tables
[ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654557#comment-14654557 ] Thomas D'Silva commented on PHOENIX-1609: - [~maghamraviki...@gmail.com] [~jamestaylor] I am trying to compare the performance of the MapReduce index build vs the regular UPSERT SELECT based index build. On a 1 billion row table with 19 columns, the regular index build takes 8.5 hours, compared to ~23 hours for the MapReduce index build. Do you know if there are any special config settings I could use to speed up the MR index build? MR job to populate index tables Key: PHOENIX-1609 URL: https://issues.apache.org/jira/browse/PHOENIX-1609 Project: Phoenix Issue Type: New Feature Reporter: maghamravikiran Assignee: maghamravikiran Fix For: 5.0.0, 4.4.0 Attachments: 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-wip.patch, 0001-PHOENIX_1609.patch, PHOENIX-1609-master.patch Often, we need to create new indexes on master tables way after the data exists on the master tables. It would be good to have a simple MR job given by the phoenix code that users can call to have indexes in sync with the master table. Users can invoke the MR job using the following command: hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt INDEX_TABLE -columns a,b,c Is this ideal?
Re: [jira] [Created] (PHOENIX-1609) MR job to populate index tables
Hi Thomas, Those numbers are disturbing. Can you please share the following:
A. How many mappers ran, and what is the setting of io.sort.mb?
B. How much data was written as mapper output? Also, how many threads were used by the mapper to aggregate the mapper output?
C. From the job history, are there any stragglers, and what is the average execution time of each map task?
D
On Tuesday, August 4, 2015, Thomas D'Silva (JIRA) j...@apache.org wrote: [~maghamraviki...@gmail.com] [~jamestaylor] I am trying to compare the performance of the map reduce index build vs the regular UPSERT SELECT based index build. On a 1 billion row table with 19 columns the regular index build takes 8.5 hours compared to the map reduce index build which takes ~23 hours. Do you know if there are any special config settings I could use to speed up the MR index build?
[jira] [Updated] (PHOENIX-1902) Do not perform conflict detection for append-only tables
[ https://issues.apache.org/jira/browse/PHOENIX-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-1902: Assignee: James Taylor (was: Thomas D'Silva) Do not perform conflict detection for append-only tables Key: PHOENIX-1902 URL: https://issues.apache.org/jira/browse/PHOENIX-1902 Project: Phoenix Issue Type: Sub-task Reporter: James Taylor Assignee: James Taylor When a table is declared as write-once/append-only (IMMUTABLE_ROWS=true), then we should disable the conflict detection being done by Tephra as there can be no conflicts. This is a much lighter weight model that relies on Tephra mainly to: - filter rows for failed (and unabortable) transactions. - not show transactional data until it has successfully been committed.
[jira] [Resolved] (PHOENIX-1902) Do not perform conflict detection for append-only tables
[ https://issues.apache.org/jira/browse/PHOENIX-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva resolved PHOENIX-1902. - Resolution: Fixed
[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run
[ https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654587#comment-14654587 ] Hudson commented on PHOENIX-2145: - FAILURE: Integrated in Phoenix-master #866 (See [https://builds.apache.org/job/Phoenix-master/866/]) PHOENIX-2145 Pherf - Make update stats optional and fix threads not exiting after performance run (mujtaba: rev d41a0fcbd75456bc68eaa673361290504abd9895) * phoenix-pherf/src/main/java/org/apache/phoenix/pherf/PherfConstants.java * phoenix-pherf/src/it/java/org/apache/phoenix/pherf/DataIngestIT.java * phoenix-pherf/src/main/java/org/apache/phoenix/pherf/workload/QueryExecutor.java * phoenix-pherf/src/main/java/org/apache/phoenix/pherf/Pherf.java * phoenix-pherf/src/test/java/org/apache/phoenix/pherf/PherfTest.java * phoenix-pherf/src/main/assembly/components-minimal.xml * phoenix-pherf/pom.xml * phoenix-pherf/src/main/java/org/apache/phoenix/pherf/workload/WriteWorkload.java * phoenix-pherf/src/main/java/org/apache/phoenix/pherf/util/PhoenixUtil.java Pherf - Fix threads not exiting after performance run - Key: PHOENIX-2145 URL: https://issues.apache.org/jira/browse/PHOENIX-2145 Project: Phoenix Issue Type: Bug Reporter: Mujtaba Chohan Priority: Minor Attachments: PHOENIX-2145.patch Pherf does not exit when test run completes. Also need to fix other miscellaneous issues: * Make update statistics optional as it does not work with large tables * Remove duplicate scenario files in project
[jira] [Commented] (PHOENIX-2137) Range query on DECIMAL DESC sometimes incorrect
[ https://issues.apache.org/jira/browse/PHOENIX-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653923#comment-14653923 ] Samarth Jain commented on PHOENIX-2137: --- Patch looks good [~jamestaylor]. There are a few additional test cases that could be added (unless you know of existing tests that do the below already): 1) Test with SALT_BUCKETS and DESC in row key. 2) Test with DESC and fixed width row key, and DESC and variable length row key, to exercise the byte comparator logic. 3) Tests with NULLS FIRST combined with the above two conditions. 4) Tests with combinations of skip scan, NULLS FIRST, DESC and SALT_BUCKETS. Range query on DECIMAL DESC sometimes incorrect --- Key: PHOENIX-2137 URL: https://issues.apache.org/jira/browse/PHOENIX-2137 Project: Phoenix Issue Type: Bug Reporter: James Taylor Assignee: James Taylor Attachments: PHOENIX-2137.patch, PHOENIX-2137_wip.patch The following scenario is not working correctly:
{code}
create table t (k1 bigint not null, k2 decimal, constraint pk primary key (k1, k2 desc));
upsert into t values(1, 1.01);
upsert into t values(2, 1.001);
select * from t where k2 > 1.0; -- No rows, but should be both rows
select * from t where k1 in (1,2) and k2 > 1.0; -- Same problem
{code}
The following queries do return the correct results:
{code}
select * from t where k2 > 1.0001;
select * from t where k1 in (1,2) and k2 > 1.0001;
{code}
Note also that without the DESC declaration of k2, everything works fine.
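For context on where such a bug can hide: Phoenix stores a DESC column by complementing the encoded bytes, so an ascending byte-wise scan returns values in descending order, and every range boundary must be translated into the inverted byte space. A toy illustration with fixed-width integers follows (this is not Phoenix's actual DECIMAL codec, which is variable length, which is exactly where the issue bites):

```python
# Toy model of DESC storage: complement the encoded bytes so unsigned
# byte order is reversed. Not Phoenix's real DECIMAL encoding.
def invert(b: bytes) -> bytes:
    return bytes(255 - x for x in b)

def enc(n: int) -> bytes:
    # stand-in encoding: fixed-width big-endian unsigned int
    return n.to_bytes(4, "big")

vals = [10, 1001, 1010]  # think of 1.0, 1.001, 1.01 scaled to integers
asc = sorted(enc(v) for v in vals)
stored = sorted(invert(enc(v)) for v in vals)  # byte order of the DESC column
# an ascending scan over the inverted bytes visits values largest-first,
# so a lower bound like "k2 > 1.0" must become an *upper* bound in the
# stored space -- mistranslating that boundary yields "no rows" results
assert [invert(b) for b in stored] == list(reversed(asc))
```

The design choice (complement rather than a separate descending comparator) keeps HBase's single ascending scan machinery, at the cost of this boundary-translation step for every DESC range predicate.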
Issue with PhoenixHBaseStorage
Hello, I am trying to run a Pig job in mapreduce mode that writes to HBase using PhoenixHBaseStorage. I am seeing the reduce task fail with "no suitable driver found" for the connection:

AttemptID:attempt_1436998373852_1140_r_00_1 Info:Error: java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:remoteclusterZkQuorum:2181;
    at org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:88)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1576)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181;
    at java.sql.DriverManager.getConnection(DriverManager.java:689)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
    at org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:80)
    at org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:68)
    at org.apache.phoenix.mapreduce.PhoenixRecordWriter.<init>(PhoenixRecordWriter.java:49)
    at org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:55)

I checked that the PhoenixDriver is on the classpath for the reduce task by adding a Class.forName("org.apache.phoenix.jdbc.PhoenixDriver"), but the write still fails. Has anyone else encountered this issue while writing to HBase via Pig in mapreduce mode? --Siddhi
Re: Issue with PhoenixHBaseStorage
Hi Siddhi, Which jars of Phoenix did you register in your pig script? Can you please share the version of Phoenix you are working on. Also, do you notice the same behavior with LOAD? Thanks Ravi On Tue, Aug 4, 2015 at 12:47 PM, Siddhi Mehta sm26...@gmail.com wrote: Ahh. Sorry, ignore the typo in my stacktrace. That is due to me trying to remove hostnames from the stacktrace. The url is jdbc:phoenix:remoteclusterZkQuorum:2181 --Siddhi On Tue, Aug 4, 2015 at 11:35 AM, Samarth Jain sama...@apache.org wrote: The jdbc url doesn't look correct - jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181 It should be jdbc:phoenix:remoteclusterZkQuorum:2181 Do you have the phoneix.mapreduce.output.cluster.quorum configured (take note of the typo)? Or hbase.zookeeper.quorum? If yes, what are the values set as?
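The "Caused by" line is the tell: the URL came out as jdbc:phoenix:jdbc:phoenix:..., i.e. the driver prefix was applied to a value that was already a full JDBC URL, so DriverManager finds no registered driver that accepts it. A hedged sketch of the kind of guard that avoids the double prefix (an illustrative helper, not the actual ConnectionUtil code):

```python
# Illustrative helper, not Phoenix's ConnectionUtil: build the Phoenix JDBC
# URL from either a bare ZooKeeper quorum or an already-complete URL,
# adding the jdbc:phoenix: prefix at most once.
PREFIX = "jdbc:phoenix:"

def phoenix_url(quorum_or_url: str) -> str:
    if quorum_or_url.startswith(PREFIX):
        return quorum_or_url           # already a full JDBC URL: leave it alone
    return PREFIX + quorum_or_url      # bare quorum: prefix it once
```

With a guard like this, passing either `zk1:2181` or `jdbc:phoenix:zk1:2181` as the output-cluster setting would yield the same valid URL, rather than a doubly-prefixed one that no driver recognizes.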
[jira] [Comment Edited] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run
[ https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654306#comment-14654306 ] Mujtaba Chohan edited comment on PHOENIX-2145 at 8/4/15 8:45 PM: - I'm not removing but rather adding *.properties and *.sh to classpath so we can execute Pherf from IDE. Won't be removing prod scenarios. was (Author: mujtabachohan): I'm not removing but rather adding *.properties and *.sh to classpath so we can execute Pherf from IDE. 2. config/schema and config/scenario are included in the zip package so there is no need for redundant copy of prod scenario.
[jira] [Commented] (PHOENIX-1011) Warning from Global Memory Manager Orphaned chunks
[ https://issues.apache.org/jira/browse/PHOENIX-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654179#comment-14654179 ] Samarth Jain commented on PHOENIX-1011: --- [~Sudeten] - what version of phoenix are you running? This issue was fixed in 3.2.1 and 4.2.1 versions. Warning from Global Memory Manager Orphaned chunks -- Key: PHOENIX-1011 URL: https://issues.apache.org/jira/browse/PHOENIX-1011 Project: Phoenix Issue Type: Bug Affects Versions: 3.0.0 Environment: Phoenix JDBC 3.0.0-incubating Reporter: Nandanavanam Karthik Assignee: Mujtaba Chohan I am seeing a warning message when running upsert queries using Phoenix JDBC 3.0.0-incubating version. I am not seeing any issues as such but the warning is alarming. James Taylor asked me to create a Jira ticket for it. Below is the warning I am seeing. [Finalizer] WARN org.apache.phoenix.memory.GlobalMemoryManager - Orphaned chunk of 6462 bytes found during finalize
Re: Issue with PhoenixHBaseStorage
Ahh. Sorry, ignore the typo in my stacktrace. That is due to me trying to remove hostnames from the stacktrace. The url is jdbc:phoenix:remoteclusterZkQuorum:2181 --Siddhi On Tue, Aug 4, 2015 at 11:35 AM, Samarth Jain sama...@apache.org wrote: The jdbc url doesn't look correct - jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181 It should be jdbc:phoenix:remoteclusterZkQuorum:2181 Do you have the phoneix.mapreduce.output.cluster.quorum configured (take note of the typo)? Or hbase.zookeeper.quorum? If yes, what are the values set as?
[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run
[ https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654330#comment-14654330 ] Cody Marcel commented on PHOENIX-2145: -- Cool +1
[jira] [Comment Edited] (PHOENIX-953) Support UNNEST for ARRAY
[ https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654280#comment-14654280 ] James Taylor edited comment on PHOENIX-953 at 8/4/15 8:13 PM: -- Questions from [~ram_krish] [~Dumindux]: {quote} - What is the initial scope of UNNEST we will target? - As I can read from the description, UNNEST can be used as a full table-like structure for doing JOINs etc. - There can be cases like SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1); - The UNNEST without 'FROM' clause. Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will not implement UNNEST as a function I believe. We will add an entry in the grammar file and have an expression for UNNEST, and for the UNNEST expression we may need a new type of compilation and a new type of result iterator on the Column Projector, right? So for the KV that is getting returned back to the client (I mean per KV) we will need to iterate the value part of it. Am not sure whether the normal iterators would do this work. {quote} Yes, we'll need to add UNNEST support to the grammar on par with other expressions in the term rule: {code} term returns [ParseNode ret] : e=literal_or_bind { $ret = e; } | field=identifier { $ret = factory.column(null,field,field); } | UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); } ... {code} We should be able to handle UNNEST purely at the compile layer. It should be equivalent to the following (which should already work): {code} SELECT ( SELECT a[1],a[2],a[3],a[4]... ) {code} The inner select will return one row per array element. We may have a special QueryPlan derived from EmptyTableQueryPlan specific for this case (like UnnestArrayQueryPlan) that'll simply project all the elements of the array expression from the UNNEST call. I don't think any runtime changes will be necessary.
If there are places in which we don't support this, then those can be fixed in future work (and/or when we move to Calcite). Thoughts, [~maryannxue]?
Support UNNEST for ARRAY Key: PHOENIX-953 URL: https://issues.apache.org/jira/browse/PHOENIX-953 Project: Phoenix Issue Type: Sub-task Reporter: James Taylor Assignee: Dumindu Buddhika The UNNEST built-in function converts an array into a set of rows. This is more than a built-in function, so should be considered an advanced project. For an example, see the following Postgres documentation: http://www.postgresql.org/docs/8.4/static/functions-array.html http://www.anicehumble.com/2011/07/postgresql-unnest-function-do-many.html http://tech.valgog.com/2010/05/merging-and-manipulating-arrays-in.html So the UNNEST is a way of converting an array to a flattened table which can then be filtered on, ordered, grouped, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY
[ https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654280#comment-14654280 ] James Taylor commented on PHOENIX-953: -- Questions from [~ram_krish] [~Dumindux]:
{quote}
- What is the initial scope of UNNEST we will target?
- As I can read from the description, UNNEST can be used as a full table-like structure for doing JOINs etc.
- There can be cases like SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1);
- The UNNEST without 'FROM' clause.
Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will not implement UNNEST as a function, I believe. We will add an entry in the grammar file and have an expression for UNNEST, and for the UNNEST expression we may need a new type of compilation and a new type of result iterator on the Column Projector, right? So for the KV that is getting returned back to the client (I mean per KV) we will need to iterate the value part of it. Am not sure whether the normal iterators would do this work.
{quote}
Yes, we'll need to add UNNEST support to the grammar on par with other expressions in the term rule:
{code}
term returns [ParseNode ret]
    : e=literal_or_bind { $ret = e; }
    | field=identifier { $ret = factory.column(null,field,field); }
    | UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); }
    ...
{code}
We should be able to handle UNNEST purely at the compile layer. It should be equivalent to the following (which should already work):
{code}
SELECT ( SELECT a[1],a[2],a[3],a[4]... )
{code}
The inner select will return one row per array element. We may have a special QueryPlan derived from EmptyTableQueryPlan specific to this case (like UnnestArrayQueryPlan) that'll simply project all the elements of the array expression from the UNNEST call. I don't think any runtime changes will be necessary.
If there are places in which we don't support this, then those can be fixed in future work (and/or when we move to Calcite). Thoughts, [~maryannxue]?
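The flattening semantics proposed in the thread above (one output row per array element, equivalent to projecting a[1], a[2], ...) can be sketched outside Phoenix. This is plain illustrative Python, not Phoenix code:

```python
def unnest(rows, array_column):
    """Flatten rows on an array-valued column: one output row per element.

    Mirrors what the proposed compile-time rewrite of UNNEST(a) into
    SELECT a[1], a[2], ... would produce for each input row.
    """
    for row in rows:
        for element in row[array_column]:
            flattened = dict(row)
            flattened[array_column] = element
            yield flattened

rows = [{"id": 1, "stuff": ["a", "b"]},
        {"id": 2, "stuff": ["c"]}]
flat = list(unnest(rows, "stuff"))
# flat == [{"id": 1, "stuff": "a"}, {"id": 1, "stuff": "b"}, {"id": 2, "stuff": "c"}]
```

The flattened rows can then be filtered, ordered, and grouped like any table, which is exactly the use case described in the issue.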
[jira] [Commented] (PHOENIX-1011) Warning from Global Memory Manager Orphaned chunks
[ https://issues.apache.org/jira/browse/PHOENIX-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654173#comment-14654173 ] Kurt Woitke commented on PHOENIX-1011: -- We are having the same issue. PLEASE ADDRESS THIS SOON. Thanks. Warning from Global Memory Manager Orphaned chunks -- Key: PHOENIX-1011 URL: https://issues.apache.org/jira/browse/PHOENIX-1011 Project: Phoenix Issue Type: Bug Affects Versions: 3.0.0 Environment: Phoenix JDBC 3.0.0-incubating Reporter: Nandanavanam Karthik Assignee: Mujtaba Chohan I am seeing a warning message when running upsert queries using Phoenix JDBC 3.0.0-incubating version. I am not seeing any issues as such but the warning is alarming. James Taylor asked me to create a Jira ticket for it. Below is the warning I am seeing. [Finalizer] WARN org.apache.phoenix.memory.GlobalMemoryManager - Orphaned chunk of 6462 bytes found during finalize -- This message was sent by Atlassian JIRA (v6.3.4#6332)
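The warning quoted above comes from a finalizer-based leak check: if a memory chunk is garbage-collected before being explicitly freed, the finalizer logs it as orphaned, which usually indicates a statement or result set that was never closed. A rough sketch of the pattern in Python, using weakref.finalize in place of Java's finalize(); this is illustrative only, not Phoenix's GlobalMemoryManager code:

```python
import gc
import weakref

warnings = []

def _on_collect(freed_flag, size):
    # Runs when a Chunk is garbage-collected; mirrors the finalize() check
    # that logs "Orphaned chunk of N bytes found during finalize".
    if not freed_flag[0]:
        warnings.append("Orphaned chunk of %d bytes found during finalize" % size)

class Chunk:
    def __init__(self, size):
        self._freed = [False]  # boxed flag so the callback sees later updates
        weakref.finalize(self, _on_collect, self._freed, size)

    def free(self):
        self._freed[0] = True

leaked = Chunk(6462)   # never freed: flagged when collected
ok = Chunk(100)
ok.free()              # released properly: no warning
del leaked, ok
gc.collect()
```

After collection, `warnings` holds a single entry for the 6462-byte chunk, so the warning is harmless to data but signals an unreleased resource.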
[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run
[ https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654294#comment-14654294 ] Cody Marcel commented on PHOENIX-2145: -- Can you add the prod scenario back? It's there to smoke test a pherf install on a real cluster. test/resources are not packaged into the zip file. Pherf - Fix threads not exiting after performance run - Key: PHOENIX-2145 URL: https://issues.apache.org/jira/browse/PHOENIX-2145 Project: Phoenix Issue Type: Bug Reporter: Mujtaba Chohan Priority: Minor Attachments: PHOENIX-2145.patch Pherf does not exit when test run completes. Also need to fix other miscellaneous issues: * Make update statistics optional as it does not work with large tables * Remove duplicate scenario files in project -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run
[ https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654296#comment-14654296 ] Cody Marcel commented on PHOENIX-2145: -- {code}<include>**/*.properties</include>{code} Also this was to include pherf.properties on the classpath from IntelliJ. I'd like to keep that as well.
[jira] [Commented] (PHOENIX-2145) Pherf - Fix threads not exiting after performance run
[ https://issues.apache.org/jira/browse/PHOENIX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654306#comment-14654306 ] Mujtaba Chohan commented on PHOENIX-2145: - 1. I'm not removing but rather adding *.properties and *.sh to the classpath so we can execute Pherf from the IDE. 2. config/schema and config/scenario are included in the zip package, so there is no need for a redundant copy of the prod scenario.
[jira] [Comment Edited] (PHOENIX-953) Support UNNEST for ARRAY
[ https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654280#comment-14654280 ] James Taylor edited comment on PHOENIX-953 at 8/4/15 8:15 PM: -- Questions from [~ram_krish] [~Dumindux]:
{quote}
- What is the initial scope of UNNEST we will target?
- As I can read from the description, UNNEST can be used as a full table-like structure for doing JOINs etc.
- There can be cases like SELECT ARRAY(SELECT DISTINCT UNNEST(stuff) FROM Foo where id = 1);
- The UNNEST without 'FROM' clause.
Coming to the implementation for a SELECT UNNEST (ARRAY) from TABLE, we will not implement UNNEST as a function, I believe. We will add an entry in the grammar file and have an expression for UNNEST, and for the UNNEST expression we may need a new type of compilation and a new type of result iterator on the Column Projector, right? So for the KV that is getting returned back to the client (I mean per KV) we will need to iterate the value part of it. Am not sure whether the normal iterators would do this work.
{quote}
Yes, we'll need to add UNNEST support to the grammar on par with other expressions in the term rule:
{code}
term returns [ParseNode ret]
    : e=literal_or_bind { $ret = e; }
    | field=identifier { $ret = factory.column(null,field,field); }
    | UNNEST LPAREN e=expression RPAREN { $ret = factory.unnest(e); }
    ...
{code}
We should be able to handle UNNEST purely at the compile layer. It should be equivalent to the following (which should already work):
{code}
SELECT ( SELECT a[1],a[2],a[3],a[4]... )
{code}
The inner select will return one row per array element. We may have a special QueryPlan derived from EmptyTableQueryPlan specific to this case (like UnnestArrayQueryPlan) that'll simply project all the elements of the array expression from the UNNEST call. I don't think any runtime changes will be necessary.
If there are places in which we don't support this, then those can be fixed in future work (and/or when we move to Calcite). First cut, I don't think we need to support the DISTINCT keyword. Not sure if our SELECT without FROM clause supports that currently. In theory, you might be able to put it in an outer SELECT instead. Thoughts, [~maryannxue]?
Re: Issue with PhoenixHBaseStorage
Hey Ravi, I register both the phoenix-core jar as well as the phoenix-pig jar. Version: 4.5.0. I haven't tried LOAD. Will give it a try. Thanks, --Siddhi

On Tue, Aug 4, 2015 at 1:24 PM, Ravi Kiran maghamraviki...@gmail.com wrote: Hi Siddhi, Which jars of Phoenix did you register in your pig script? Can you please share the version of Phoenix you are working on. Also, do you notice the same behavior with LOAD also? Thanks Ravi

On Tue, Aug 4, 2015 at 12:47 PM, Siddhi Mehta sm26...@gmail.com wrote: Ahh. Sorry, ignore the typo in my stacktrace. That is due to me trying to remove hostnames from the stacktrace. The url is jdbc:phoenix:remoteclusterZkQuorum:2181 --Siddhi

On Tue, Aug 4, 2015 at 11:35 AM, Samarth Jain sama...@apache.org wrote: The jdbc url doesn't look correct - jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181 It should be jdbc:phoenix:remoteclusterZkQuorum:2181 Do you have the phoneix.mapreduce.output.cluster.quorum configured (take note of the typo)? Or hbase.zookeeper.quorum? If yes, what are the values set as?

On Tue, Aug 4, 2015 at 11:19 AM, Siddhi Mehta sm26...@gmail.com wrote: Hello, I am trying to run a pig job with mapreduce mode that tries to write to HBase using PhoenixHBaseStorage. I am seeing the reduce task fail with no suitable driver found for the connection.
AttemptID:attempt_1436998373852_1140_r_00_1 Info:Error: java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:remoteclusterZkQuorum:2181;
at org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:88)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.init(ReduceTask.java:540)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1576)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:phoenix:jdbc:phoenix:remoteclusterZkQuorum:2181;
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:80)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getOutputConnection(ConnectionUtil.java:68)
at org.apache.phoenix.mapreduce.PhoenixRecordWriter.init(PhoenixRecordWriter.java:49)
at org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:55)
I checked that the PhoenixDriver is on the classpath for the reduce task by adding a Class.forName("org.apache.phoenix.jdbc.PhoenixDriver") but the write still fails. Has anyone else encountered an issue while trying HBase writes via Pig in mapreduce mode? --Siddhi
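The doubled prefix in the Caused-by line (jdbc:phoenix:jdbc:phoenix:...) is the classic symptom of a URL builder prepending the protocol to a string that already carries it; DriverManager then finds no driver whose acceptsURL() matches the malformed URL, even with the driver class on the classpath. A hypothetical sketch of a defensive builder (the function name is made up for illustration, not Phoenix's actual ConnectionUtil code):

```python
PREFIX = "jdbc:phoenix:"

def build_phoenix_url(quorum):
    """Prepend the Phoenix JDBC prefix unless the string already carries it,
    so a fully-formed URL passed in by mistake is not double-prefixed."""
    if quorum.startswith(PREFIX):
        return quorum
    return PREFIX + quorum

assert build_phoenix_url("zk1:2181") == "jdbc:phoenix:zk1:2181"
assert build_phoenix_url("jdbc:phoenix:zk1:2181") == "jdbc:phoenix:zk1:2181"
```

The practical workaround on the user side is the same: pass only the bare quorum string in the configuration, never a string that already starts with jdbc:phoenix:.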
[jira] [Commented] (PHOENIX-1011) Warning from Global Memory Manager Orphaned chunks
[ https://issues.apache.org/jira/browse/PHOENIX-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654324#comment-14654324 ] Kurt Woitke commented on PHOENIX-1011: -- We are running version 4.1 with Hadoop and HBase 0.98. Can we upgrade Phoenix easily without creating any problems with the rest of the software that we are using? What version can we upgrade to?
Re: [DISCUSS] drop 5.0.0 label for JIRAs
Sorry if this has already been resolved, I'm just getting back from an extended absence. I'm +1 for this as well. Can we remove the 5.0.0 label from JIRA as well? Presumably there's nothing with just the 5.0.0 label, so a simple removal should suffice. I haven't checked though... On Thu, Jul 9, 2015 at 5:21 AM, Josh Mahonin jmaho...@interset.com wrote: +1 I think the version / branching system is pretty confusing. Any steps to minimize that sound good to me. On Wed, Jul 8, 2015 at 7:55 PM, James Taylor jamestay...@apache.org wrote: Given that 5.0.0 is not imminent and everything from the 4.x release will appear in it, I propose that we no longer label our JIRAs with 5.0.0 in the fixVersion field. We can continue to use 4.4.1 and 4.5.0 as an indicator of the target release of the fix. Thoughts? Thanks, James
[jira] [Comment Edited] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search
[ https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654732#comment-14654732 ] Dan Meany edited comment on PHOENIX-2152 at 8/5/15 3:01 AM: Looking at the long-term big picture, I wonder if it makes sense to try to move in the direction of supporting the same functions as PostGIS (http://postgis.net/) http://gis.stackexchange.com/questions/112426/postgres-performance-with-radius-searches http://revenant.ca/www/postgis/workshop/index.html In the intermediate term, a bundle of spatial functions (or just the DISTANCE one) packaged natively with Phoenix reduces the complexity of the user having to deploy/update their own UDF jars and set the hbase-site flag to allow UDFs on all clients/servers. Ability to create spatial indexes (geohash) with bounding box / radius search - Key: PHOENIX-2152 URL: https://issues.apache.org/jira/browse/PHOENIX-2152 Project: Phoenix Issue Type: New Feature Reporter: Dan Meany Priority: Minor Original Estimate: 672h Remaining Estimate: 672h Add the ability to create spatial indexes such as in Elasticsearch, MongoDB, Oracle, etc. http://gis.stackexchange.com/questions/18330/would-it-be-possible-to-use-geohash-for-proximity-searches https://github.com/davetroy/geohash-js -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-953) Support UNNEST for ARRAY
[ https://issues.apache.org/jira/browse/PHOENIX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654835#comment-14654835 ] Dumindu Buddhika commented on PHOENIX-953: -- Thanks for the comments James and Maryann. I understand the basic idea of the implementation now. I will try to come up with a patch for the initial scope with help from Ram.
Re: [jira] [Created] (PHOENIX-1609) MR job to populate index tables
A. How many mappers ran and what is the setting of io.sort.mb
64 mappers ran and io.sort.mb=256. We don't set the number of reducers, so it used a default of 1 - should this be increased?
B. How much data was written as mapper output. Also how many threads were used by the mapper to aggregate the mapper output
Do you know how to get this information?
C. From the job history, are there any stragglers and the average execution time of each map task
Average Map Time: 56mins, 30sec
Average Shuffle Time: 1hrs, 43mins, 16sec
Average Merge Time: 2sec
Average Reduce Time: 16hrs, 24mins, 59sec
D
On Tuesday, August 4, 2015, Thomas D'Silva (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654557#comment-14654557 ] Thomas D'Silva commented on PHOENIX-1609: - [~maghamraviki...@gmail.com] [~jamestaylor] I am trying to compare the performance of the map reduce index build vs the regular UPSERT SELECT based index build. On a 1 billion row table with 19 columns, the regular index build takes 8.5 hours compared to the map reduce index build which takes ~23 hours. Do you know if there are any special config settings I could use to speed up the MR index build? MR job to populate index tables Key: PHOENIX-1609 URL: https://issues.apache.org/jira/browse/PHOENIX-1609 Project: Phoenix Issue Type: New Feature Reporter: maghamravikiran Assignee: maghamravikiran Fix For: 5.0.0, 4.4.0 Attachments: 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-wip.patch, 0001-PHOENIX_1609.patch, PHOENIX-1609-master.patch Often, we need to create new indexes on master tables way after the data exists on the master tables. It would be good to have a simple MR job given by the phoenix code that users can call to have indexes in sync with the master table.
Users can invoke the MR job using the following command:
{code}
hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt INDEX_TABLE -columns a,b,c
{code}
Is this ideal? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2152) Ability to create spatial indexes (geohash) with bounding box / radius search
[ https://issues.apache.org/jira/browse/PHOENIX-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654732#comment-14654732 ] Dan Meany commented on PHOENIX-2152: Looking at the long-term big picture, I wonder if it makes sense to try to move in the direction of supporting the same functions as PostGIS (http://postgis.net/) http://gis.stackexchange.com/questions/112426/postgres-performance-with-radius-searches http://revenant.ca/www/postgis/workshop/index.html In the intermediate term, a bundle of spatial functions (or just the DISTANCE one) packaged natively with Phoenix reduces the complexity of the user having to deploy/update their own UDF jars and set the hbase-site flag to allow UDFs on all clients/servers.
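As a concrete starting point for the DISTANCE piece discussed above, here is a sketch of a great-circle distance function of the kind a radius search would filter on, written in plain Python rather than as a Phoenix UDF; the function name and the 6371 km mean Earth radius are illustrative assumptions:

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# A geohash row key makes the radius search cheap: scan the key ranges whose
# geohash prefixes cover the bounding box, then filter the candidates
# precisely with a distance function like this one.
nyc_to_la = haversine_km(40.7128, -74.0060, 34.0522, -118.2437)
```

This two-phase approach (coarse prefix scan, exact distance filter) is the same pattern the linked geohash proximity-search discussion describes.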