[jira] [Commented] (PHOENIX-2298) Problem storing with pig on a salted table
[ https://issues.apache.org/jira/browse/PHOENIX-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939863#comment-14939863 ] Guillaume salou commented on PHOENIX-2298: -- Thanks, I am using 4.3.0-1 > Problem storing with pig on a salted table > -- > > Key: PHOENIX-2298 > URL: https://issues.apache.org/jira/browse/PHOENIX-2298 > Project: Phoenix > Issue Type: Bug >Reporter: Guillaume salou > > When I try to upsert via pigStorage on a salted table I get this error. > Store ... using org.apache.phoenix.pig.PhoenixHBaseStorage(); > first field of the table : > CurrentTime() asINTERNALTS:datetime, > This date is not used in the primary key of the table. > Works perfectly on a non salted table. > Caused by: java.lang.RuntimeException: Unable to process column _SALT:BINARY, > innerMessage=org.apache.phoenix.schema.TypeMismatchException: ERROR 203 > (22005): Type mismatch. BINARY cannot be coerced to DATE > at > org.apache.phoenix.pig.writable.PhoenixPigDBWritable.write(PhoenixPigDBWritable.java:66) > at > org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:78) > at > org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:39) > at > org.apache.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:182) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > at > org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48) > at > 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) > at > org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.phoenix.schema.ConstraintViolationException: > org.apache.phoenix.schema.TypeMismatchException: ERROR 203 (22005): Type > mismatch. BINARY cannot be coerced to DATE > at > org.apache.phoenix.schema.types.PDataType.throwConstraintViolationException(PDataType.java:282) > at org.apache.phoenix.schema.types.PDate.toObject(PDate.java:77) > at > org.apache.phoenix.pig.util.TypeUtil.castPigTypeToPhoenix(TypeUtil.java:208) > at > org.apache.phoenix.pig.writable.PhoenixPigDBWritable.convertTypeSpecificValue(PhoenixPigDBWritable.java:79) > at > org.apache.phoenix.pig.writable.PhoenixPigDBWritable.write(PhoenixPigDBWritable.java:59) > ... 21 more > Caused by: org.apache.phoenix.schema.TypeMismatchException: ERROR 203 > (22005): Type mismatch. BINARY cannot be coerced to DATE > at > org.apache.phoenix.exception.SQLExceptionCode$1.newException(SQLExceptionCode.java:68) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:133) > ... 
26 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: phoenix query server connection properties
Hi Jan, I just commented on the Calcite thread. Phoenix needs to know the connection properties and, for each subsequent RPC, needs to know from which connection you came. Otherwise we wouldn't know your tenant ID when the statement is run. Something like a connection or session ID would need to be passed through each RPC (which would perhaps originally be generated and returned from the hypothetical createConnection RPC). Thanks for bringing this up. James On Thursday, October 1, 2015, Jan Van Besien wrote: > Hi, > > We are working on a calcite/avatica based "thin" JDBC driver very > similar to what Phoenix has done for its QueryServer, and I am > looking for some feedback/options. > > Avatica in its current state doesn't have an RPC call for "create > connection". As a consequence, connection properties (i.e. the > Properties instance passed through the > DriverManager.getConnection(url, props)) are currently not RPC-ed from > the client to the server. > > For Phoenix, this means properties such as TenantId, CurrentSCN etc do > not work with the thin driver. I saw this question being asked in > PHOENIX-1824, so I am not sure whether you were aware of this problem. > I've tested it with the phoenix sandbox on master with a multi-tenant > table to be sure. > > There currently is a discussion ongoing on the calcite dev mailing > list on this topic as well (with subject "avatica jdbc URL connection > properties"). > > Our understanding of the problem is that we need to extend the RPC > with a "create connection", but this doesn't seem to be > straightforward in the current Avatica design. > > It would be interesting to hear your thoughts on this subject. > > Thanks > Jan >
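The "hypothetical createConnection RPC" above can be sketched as a toy server-side registry: the server generates an ID per connection, stores the client's Properties under it, and each later RPC presents that ID so the server can recover e.g. the TenantId. All names here (createConnection, connectionId, tenantIdFor) are assumptions for illustration, not the real Avatica or Phoenix API.

```java
import java.util.Map;
import java.util.Properties;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Toy model of a hypothetical "createConnection" RPC. The server keeps the
 * connection Properties keyed by a generated session ID; every subsequent
 * RPC must carry that ID, otherwise the server cannot know, for example,
 * which tenant a statement belongs to.
 */
public class ConnectionRegistry {
    private final Map<String, Properties> connections = new ConcurrentHashMap<>();

    /** Hypothetical createConnection RPC: registers props, returns an ID. */
    public String createConnection(Properties props) {
        String id = UUID.randomUUID().toString();
        connections.put(id, (Properties) props.clone());
        return id;
    }

    /** Any subsequent RPC looks its connection state up by ID. */
    public String tenantIdFor(String connectionId) {
        Properties props = connections.get(connectionId);
        if (props == null) {
            throw new IllegalStateException("unknown connection: " + connectionId);
        }
        return props.getProperty("TenantId");
    }
}
```

Without the ID round trip there is simply nothing on the server side to associate a later executeStatement call with the Properties the client passed to DriverManager.getConnection.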
[jira] [Commented] (PHOENIX-2298) Problem storing with pig on a salted table
[ https://issues.apache.org/jira/browse/PHOENIX-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939840#comment-14939840 ] maghamravikiran commented on PHOENIX-2298: -- This issue was fixed in PHOENIX-2181. Can you share the Phoenix version you are using. > Problem storing with pig on a salted table > -- > > Key: PHOENIX-2298 > URL: https://issues.apache.org/jira/browse/PHOENIX-2298 > Project: Phoenix > Issue Type: Bug >Reporter: Guillaume salou > > When I try to upsert via pigStorage on a salted table I get this error. > Store ... using org.apache.phoenix.pig.PhoenixHBaseStorage(); > first field of the table : > CurrentTime() asINTERNALTS:datetime, > This date is not used in the primary key of the table. > Works perfectly on a non salted table. > Caused by: java.lang.RuntimeException: Unable to process column _SALT:BINARY, > innerMessage=org.apache.phoenix.schema.TypeMismatchException: ERROR 203 > (22005): Type mismatch. BINARY cannot be coerced to DATE > at > org.apache.phoenix.pig.writable.PhoenixPigDBWritable.write(PhoenixPigDBWritable.java:66) > at > org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:78) > at > org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:39) > at > org.apache.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:182) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > at > org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106) > at > 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) > at > org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.phoenix.schema.ConstraintViolationException: > org.apache.phoenix.schema.TypeMismatchException: ERROR 203 (22005): Type > mismatch. BINARY cannot be coerced to DATE > at > org.apache.phoenix.schema.types.PDataType.throwConstraintViolationException(PDataType.java:282) > at org.apache.phoenix.schema.types.PDate.toObject(PDate.java:77) > at > org.apache.phoenix.pig.util.TypeUtil.castPigTypeToPhoenix(TypeUtil.java:208) > at > org.apache.phoenix.pig.writable.PhoenixPigDBWritable.convertTypeSpecificValue(PhoenixPigDBWritable.java:79) > at > org.apache.phoenix.pig.writable.PhoenixPigDBWritable.write(PhoenixPigDBWritable.java:59) > ... 21 more > Caused by: org.apache.phoenix.schema.TypeMismatchException: ERROR 203 > (22005): Type mismatch. 
BINARY cannot be coerced to DATE > at > org.apache.phoenix.exception.SQLExceptionCode$1.newException(SQLExceptionCode.java:68) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:133) > ... 26 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort
[ https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-2270: -- Description: Phoenix should have a physical operator that executes a sort on the server-side which Drill can leverage when re-ordering is necessary. Unlike PHOENIX-2269 which is clearly going to be more efficient to let Phoenix handle, the sort is more of a gray area. Phoenix will be faster in the way it does the scan within the coprocessor, but it still needs to return the same number of rows. This process puts a pretty heavy burden on the region server as well. We should measure performance with and without Phoenix doing the sort. One potential scenario that may be a win for Phoenix is if the rows are already partially sorted and Phoenix can take advantage of this (which is not currently the case). (was: Phoenix should have a physical operator that executes a sort on the server-side which Drill can leverage when re-ordering is necessary. Unlike PHOENIX-2269 which is clearly going to be more efficient to let Phoenix handle, the sort is more of a gray area. Phoenix will be faster in the way it does the scan within the coprocessor, but it still needs to return the same number of rows. This process puts a pretty heavy burden on the region server as well. We should measure performance with and without Phoenix doing the sort.) > Implement Drill-specific rule for first level server-side sort > -- > > Key: PHOENIX-2270 > URL: https://issues.apache.org/jira/browse/PHOENIX-2270 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > Phoenix should have a physical operator that executes a sort on the > server-side which Drill can leverage when re-ordering is necessary. Unlike > PHOENIX-2269 which is clearly going to be more efficient to let Phoenix > handle, the sort is more of a gray area. 
Phoenix will be faster in the way it > does the scan within the coprocessor, but it still needs to return the same > number of rows. This process puts a pretty heavy burden on the region server > as well. We should measure performance with and without Phoenix doing the > sort. One potential scenario that may be a win for Phoenix is if the rows are > already partially sorted and Phoenix can take advantage of this (which is not > currently the case). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
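The split that keeps coming up in this thread (server-side partial sort per region, client-side merge) has the classic k-way merge shape: each scan returns its rows already sorted, so the client needs only a priority-queue merge instead of a full re-sort. A toy Java sketch, illustrative only and not the actual Phoenix ScanPlan/merge-sort code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/**
 * Toy illustration of a partial-sort / merge-sort split: each "region
 * server" contributes an already-sorted list, and the client does a k-way
 * merge with a min-heap. Heap entries are {value, scanIndex, position}.
 */
public class MergeSortedScans {
    public static List<Integer> merge(List<List<Integer>> sortedScans) {
        PriorityQueue<int[]> heap =
                new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[0]));
        // Seed the heap with the first element of each non-empty scan.
        for (int i = 0; i < sortedScans.size(); i++) {
            if (!sortedScans.get(i).isEmpty()) {
                heap.add(new int[] { sortedScans.get(i).get(0), i, 0 });
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);
            List<Integer> scan = sortedScans.get(top[1]);
            int next = top[2] + 1;
            if (next < scan.size()) {
                heap.add(new int[] { scan.get(next), top[1], next });
            }
        }
        return out;
    }
}
```

The merge touches each row once and keeps only one heap entry per scan, which is why a merge on the Drill side would be much cheaper than a real sort over the combined result.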
[jira] [Updated] (PHOENIX-2269) Implement Drill-specific rule for first level server-side aggregation
[ https://issues.apache.org/jira/browse/PHOENIX-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-2269: -- Description: Drill has already modeled aggregation as a distributed, multi-step process that includes an initial aggregation step followed by a shuffle. Phoenix should have a physical operator that executes the initial aggregation step on the server-side. On the Phoenix side, we may be able to use our same AggregatePlan since we grab the underlying Scan from the QueryPlan as part of the information passed to the DrillBit. As long as the TupleProjector contains the correct KeyValueSchema to interpret the results coming back from the Scan, we have everything we need. We just need to make sure that Drill uses our physical operator. (was: This includes two sub-tasks: 1. Model server-side partial aggregate and client-side merge aggregate as Phoenix physical operators. 2. Split Phoenix AggregatePlan into two parts accordingly.) > Implement Drill-specific rule for first level server-side aggregation > - > > Key: PHOENIX-2269 > URL: https://issues.apache.org/jira/browse/PHOENIX-2269 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > Drill has already modeled aggregation as a distributed, multi-step process > that includes an initial aggregation step followed by a shuffle. Phoenix > should have a physical operator that executes the initial aggregation step on > the server-side. On the Phoenix side, we may be able to use our same > AggregatePlan since we grab the underlying Scan from the QueryPlan as part of > the information passed to the DrillBit. As long as the TupleProjector > contains the correct KeyValueSchema to interpret the results coming back from > the Scan, we have everything we need. We just need to make sure that Drill > uses our physical operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
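The distributed multi-step aggregation described above (initial per-node aggregation step, then a shuffle and merge) can be sketched generically. This toy Java example is illustrative only; it uses none of the real Drill or Phoenix APIs:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy illustration of two-phase aggregation: each "server" computes a
 * partial COUNT per key over its own rows, and a final merge step sums
 * the partials. This mirrors the shape of the plan discussed above
 * (initial aggregation, then shuffle/merge), not any real Drill code.
 */
public class TwoPhaseAggregation {

    /** Phase 1: partial aggregation, run independently on each partition. */
    public static Map<String, Long> partialCount(List<String> rows) {
        Map<String, Long> counts = new HashMap<>();
        for (String key : rows) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    /** Phase 2: merge the partial results after the shuffle. */
    public static Map<String, Long> merge(List<Map<String, Long>> partials) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((k, v) -> total.merge(k, v, Long::sum));
        }
        return total;
    }
}
```

In the proposal above, phase 1 is what Phoenix's AggregatePlan already executes inside the coprocessor; the point of the Drill rule is to make Drill's planner hand exactly that step to Phoenix and keep the merge for itself.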
[jira] [Assigned] (PHOENIX-1519) Determine the usefulness of tracing
[ https://issues.apache.org/jira/browse/PHOENIX-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishani reassigned PHOENIX-1519: - Assignee: Nishani > Determine the usefulness of tracing > --- > > Key: PHOENIX-1519 > URL: https://issues.apache.org/jira/browse/PHOENIX-1519 > Project: Phoenix > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Nishani > > In order to help prioritize the child JIRAs under PHOENIX-1121, we need to do > a usability study around tracing. What problem(s) are we trying to solve and > what's the delta between the existing tracing feature and the required > tracing features. Are we capturing enough information to be useful? Is it > structured in such a way as to lend itself to analysis? > For example, is the main use case of tracing to determine the top 10 queries > that take the longest amount of time in production? Would we then want > tracing on all the time, such that having a way to enable/disable tracing > (PHOENIX-1433, PHOENIX-1191, PHOENIX-1518, PHOENIX-1115) becomes moot? > Or is it more of something that would be used in a perf testing environment > where we'd need the information generated to be only for a particular query > (in which case the above JIRAs are more important)? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort
[ https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-2270: -- Summary: Implement Drill-specific rule for first level server-side sort (was: Split sort into server-side partial sort and client-side merge sort) > Implement Drill-specific rule for first level server-side sort > -- > > Key: PHOENIX-2270 > URL: https://issues.apache.org/jira/browse/PHOENIX-2270 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > This has two parts: > 1. Model server-side partial sort and client-side merge sort as Phoenix > physical operators (rels). > 2. Split Phoenix ScanPlan and AggregatePlan with orderBy accordingly. > This might help PHOENIX-2262 too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort
[ https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-2270: -- Description: Phoenix should have a physical operator that executes a sort on the server-side which Drill can leverage when re-ordering is necessary. Unlike PHOENIX-2269 which is clearly going to be more efficient to let Phoenix handle, the sort is more of a gray area. Phoenix will be faster in the way it does the scan within the coprocessor, but it still needs to return the same number of rows. This process puts a pretty heavy burden on the region server as well. We should measure performance with and without Phoenix doing the sort. (was: Phoenix should have a physical operator that executes a sort on the server-side which Drill can leverage when re-ordering is necessary. ) > Implement Drill-specific rule for first level server-side sort > -- > > Key: PHOENIX-2270 > URL: https://issues.apache.org/jira/browse/PHOENIX-2270 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > Phoenix should have a physical operator that executes a sort on the > server-side which Drill can leverage when re-ordering is necessary. Unlike > PHOENIX-2269 which is clearly going to be more efficient to let Phoenix > handle, the sort is more of a gray area. Phoenix will be faster in the way it > does the scan within the coprocessor, but it still needs to return the same > number of rows. This process puts a pretty heavy burden on the region server > as well. We should measure performance with and without Phoenix doing the > sort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2269) Implement Drill-specific rule for first level server-side aggregation
[ https://issues.apache.org/jira/browse/PHOENIX-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-2269: -- Summary: Implement Drill-specific rule for first level server-side aggregation (was: Split aggregate into server-side partial aggregate and client-side merge aggregate) > Implement Drill-specific rule for first level server-side aggregation > - > > Key: PHOENIX-2269 > URL: https://issues.apache.org/jira/browse/PHOENIX-2269 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > This includes two sub-tasks: > 1. Model server-side partial aggregate and client-side merge aggregate as > Phoenix physical operators. > 2. Split Phoenix AggregatePlan into two parts accordingly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort
[ https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-2270: -- Description: Phoenix should have a physical operator that executes a sort on the server-side which Drill can leverage when re-ordering is necessary. (was: This has two parts: 1. Model server-side partial sort and client-side merge sort as Phoenix physical operators (rels). 2. Split Phoenix ScanPlan and AggregatePlan with orderBy accordingly. This might help PHOENIX-2262 too.) > Implement Drill-specific rule for first level server-side sort > -- > > Key: PHOENIX-2270 > URL: https://issues.apache.org/jira/browse/PHOENIX-2270 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > Phoenix should have a physical operator that executes a sort on the > server-side which Drill can leverage when re-ordering is necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-2300) Support the use of materialized views to leverage secondary indexes
James Taylor created PHOENIX-2300: - Summary: Support the use of materialized views to leverage secondary indexes Key: PHOENIX-2300 URL: https://issues.apache.org/jira/browse/PHOENIX-2300 Project: Phoenix Issue Type: Sub-task Reporter: James Taylor In the Phoenix/Calcite world, secondary indexes are modeled as materialized views. We need to figure out in the Phoenix/Drill world how to get Drill to use materialized views. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2300) Support the use of secondary indexes in Drill
[ https://issues.apache.org/jira/browse/PHOENIX-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-2300: -- Summary: Support the use of secondary indexes in Drill (was: Support the use of materialized views to leverage secondary indexes) > Support the use of secondary indexes in Drill > - > > Key: PHOENIX-2300 > URL: https://issues.apache.org/jira/browse/PHOENIX-2300 > Project: Phoenix > Issue Type: Sub-task >Reporter: James Taylor > > In the Phoenix/Calcite world, secondary indexes are modeled as materialized > views. We need to figure out in the Phoenix/Drill world how to get Drill to > use materialized views. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort
[ https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940330#comment-14940330 ] Maryann Xue commented on PHOENIX-2270: -- [~jnadeau] Implemented this in https://github.com/jacques-n/drill/pull/4, but not sure if the sort on the Drill side is a merge instead of a real sort. Could you please verify? The query plan is (also printed by the test case): {code}
00-00    Screen
00-01      Project(B=[$0], E1=[$1], E2=[$2], R=[$3])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$1], dir0=[DESC])
00-04            Project(B=[$0], E1=[$1], E2=[$2], R=[$3])
                   PhoenixServerSort(sort0=[$1], dir0=[DESC])
                   PhoenixTableScan(table=[[PHOENIX, A, BEER]], filter=[>=($1, 1)])
00-05            Phoenix
{ "head" : { "version" : 1, "generator" : { "type" : "ExplainHandler", "info" : "" }, "type" : "APACHE_DRILL_PHYSICAL", "options" : [ { "kind" : "LONG", "type" : "SESSION", "name" : "planner.width.max_per_node", "num_val" : 2 } ], "queue" : 0, "resultMode" : "EXEC" }, "graph" : [ { "pop" : "jdbc-scan", "@id" : 0, "scans" : [ "CgMKATASMQoNc2NhblByb2plY3RvchIgAP8EBAQBGAIBAhkCATACRTECGQIBMAJFMgIZAgEwAVISJQoKY29sdW1uSW5mbxIXBAFCAkUxAkUyAVISFwoSX05vbkFnZ3JlZ2F0ZVF1ZXJ5EgEBEiUKBV9Ub3BOEhyMAUAAAP8AAQAARRkCAP8EBAEABSQyLkUxKvABCilvcmcuYXBhY2hlLmhhZG9vcC5oYmFzZS5maWx0ZXIuRmlsdGVyTGlzdBLCAQgBElMKPG9yZy5hcGFjaGUucGhvZW5peC5maWx0ZXIuU2luZ2xlQ0ZDUUtleVZhbHVlQ29tcGFyaXNvbkZpbHRlchITFwQCAhkCATACRTEDBYEGAxJpCjBvcmcuYXBhY2hlLnBob2VuaXguZmlsdGVyLkNvbHVtblByb2plY3Rpb25GaWx0ZXISNQAAABUfiwgAMwAAId/b9AEBFR+LCAAzAAAh39v0AQAAMgwIABD//384AUAB" ], "config" : { "type" : "phoenix", "url" : "jdbc:phoenix:localhost", "enabled" : true }, "table" : "A.BEER", "userName" : "", "cost" : 0.0 }, { "pop" : "project", "@id" : 4, "exprs" : [ { "ref" : "`B`", "expr" : "`B`" }, { "ref" : "`E1`", "expr" : "`E1`" }, { "ref" : "`E2`", "expr" : "`E2`" }, { "ref" : "`R`", "expr" : "`R`" } ], "child" : 0, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 50.0 }, { "pop" : "external-sort", "@id" : 3, 
"child" : 4, "orderings" : [ { "expr" : "`E1`", "order" : "DESC", "nullDirection" : "UNSPECIFIED" } ], "reverse" : false, "initialAllocation" : 2000, "maxAllocation" : 100, "cost" : 50.0 }, { "pop" : "selection-vector-remover", "@id" : 2, "child" : 3, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 50.0 }, { "pop" : "project", "@id" : 1, "exprs" : [ { "ref" : "`B`", "expr" : "`B`" }, { "ref" : "`E1`", "expr" : "`E1`" }, { "ref" : "`E2`", "expr" : "`E2`" }, { "ref" : "`R`", "expr" : "`R`" } ], "child" : 2, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 50.0 }, { "pop" : "screen", "@id" : 0, "child" : 1, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 50.0 } ] } {code} > Implement Drill-specific rule for first level server-side sort > -- > > Key: PHOENIX-2270 > URL: https://issues.apache.org/jira/browse/PHOENIX-2270 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > Phoenix should have a physical operator that executes a sort on the > server-side which Drill can leverage when re-ordering is necessary. Unlike > PHOENIX-2269 which is clearly going to be more efficient to let Phoenix > handle, the sort is more of a gray area. Phoenix will be faster in the way it > does the scan within the coprocessor, but it still needs to return the same > number of rows. This process puts a pretty heavy burden on the region server > as well. We should measure performance with and without Phoenix doing the > sort. One potential scenario that may be a win for Phoenix is if the rows are > already partially sorted and Phoenix can take advantage of this (which is not > currently the case). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2300) Support the use of secondary indexes in Drill
[ https://issues.apache.org/jira/browse/PHOENIX-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940401#comment-14940401 ] Julian Hyde commented on PHOENIX-2300: -- Phoenix introduces indexes into the plan via a planning step that models indexes as a kind of materialized view. We need to package up Phoenix's planning process into a series of steps that Drill can call, and this would be one of the steps. Calcite has an abstraction called [Program|http://calcite.incubator.apache.org/apidocs/org/apache/calcite/tools/Program.html] that is basically a step in a multi-phase planning process. I think Drill should extend its storage plugin SPI and allow plugins to declare programs to be run at various points in the planning process. Drill would define multiple phases analogous to how Maven defines phases 'generate-sources', 'compile', 'test' etc. > Support the use of secondary indexes in Drill > - > > Key: PHOENIX-2300 > URL: https://issues.apache.org/jira/browse/PHOENIX-2300 > Project: Phoenix > Issue Type: Sub-task >Reporter: James Taylor > Labels: drill > > In the Phoenix/Calcite world, secondary indexes are modeled as materialized > views. We need to figure out in the Phoenix/Drill world how to get Drill to > use materialized views. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
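The proposed SPI (plugins declaring programs to run at named planning phases, analogous to Maven lifecycle phases) might look roughly like the toy sketch below. Every name here is hypothetical; a real version would run Calcite Programs over RelNode trees with a RelOptPlanner, not string-to-string functions, and the phase names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/**
 * Toy sketch of a phased planner SPI: a storage plugin registers planning
 * steps ("programs") against named phases, and the engine threads the plan
 * through every registered program, phase by phase. One such program could
 * be Phoenix's index-as-materialized-view rewrite.
 */
public class PhasedPlanner {
    public enum Phase { LOGICAL, INDEX_REWRITE, PHYSICAL }

    // Each phase holds an ordered list of plan-to-plan transformations.
    private final Map<Phase, List<UnaryOperator<String>>> programs =
            new EnumMap<>(Phase.class);

    /** A plugin declares a program to be run at a given phase. */
    public void register(Phase phase, UnaryOperator<String> program) {
        programs.computeIfAbsent(phase, p -> new ArrayList<>()).add(program);
    }

    /** The engine runs all phases in order, threading the plan through. */
    public String plan(String initialPlan) {
        String plan = initialPlan;
        for (Phase phase : Phase.values()) {
            for (UnaryOperator<String> program :
                    programs.getOrDefault(phase, Collections.emptyList())) {
                plan = program.apply(plan);
            }
        }
        return plan;
    }
}
```

The design point is that the engine owns the phase ordering while plugins only contribute steps, which is what would let Drill call into Phoenix's index-selection logic at a well-defined moment in planning.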
[jira] [Created] (PHOENIX-2301) NullPointerException when upserting into a char array column
Julian Jaffe created PHOENIX-2301: - Summary: NullPointerException when upserting into a char array column Key: PHOENIX-2301 URL: https://issues.apache.org/jira/browse/PHOENIX-2301 Project: Phoenix Issue Type: Bug Reporter: Julian Jaffe Attempting to upsert into a char array causes an NPE. Minimum example: {code:sql}
0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testIntArray" INTEGER[], CONSTRAINT "test_pk" PRIMARY KEY("testIntArray")) DEFAULT_COLUMN_FAMILY='T';
No rows affected (1.28 seconds)
0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY[1, 2, 3]);
1 row affected (0.184 seconds)
0: jdbc:phoenix:xx> SELECT * FROM TEST;
+---------------+
| testIntArray  |
+---------------+
| [1, 2, 3]     |
+---------------+
1 row selected (0.308 seconds)
0: jdbc:phoenix:xx> DROP TABLE IF EXISTS TEST;
No rows affected (3.348 seconds)
0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testCharArray" CHAR(3)[], CONSTRAINT "test_pk" PRIMARY KEY("testCharArray")) DEFAULT_COLUMN_FAMILY='T';
No rows affected (1.446 seconds)
0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY['aaa', 'bbb', 'ccc']);
java.lang.NullPointerException
	at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1123)
	at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:338)
	at org.apache.phoenix.schema.types.PCharArray.toObject(PCharArray.java:64)
	at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:967)
	at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1008)
	at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1004)
	at org.apache.phoenix.util.SchemaUtil.toString(SchemaUtil.java:381)
	at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:572)
	at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:117)
	at org.apache.phoenix.compile.UpsertCompiler.access$400(UpsertCompiler.java:98)
	at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:821)
	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:319)
	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:311)
	at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
	at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:309)
	at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1432)
	at sqlline.Commands.execute(Commands.java:822)
	at sqlline.Commands.sql(Commands.java:732)
	at sqlline.SqlLine.dispatch(SqlLine.java:808)
	at sqlline.SqlLine.begin(SqlLine.java:681)
	at sqlline.SqlLine.start(SqlLine.java:398)
	at sqlline.SqlLine.main(SqlLine.java:292)
0: jdbc:phoenix:xx> SELECT * FROM TEST;
+----------------+
| testCharArray  |
+----------------+
+----------------+
No rows selected (0.169 seconds)
0: jdbc:phoenix:xx> SELECT "testCharArray" FROM TEST;
+----------------+
| testCharArray  |
+----------------+
+----------------+
No rows selected (0.182 seconds)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2285) phoenix.query.timeoutMs doesn't allow callers to set the timeout to less than 1 second
[ https://issues.apache.org/jira/browse/PHOENIX-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940685#comment-14940685 ] Jan Fernando commented on PHOENIX-2285: --- I'll get the repo setup and tackle this tomorrow. > phoenix.query.timeoutMs doesn't allow callers to set the timeout to less than > 1 second > -- > > Key: PHOENIX-2285 > URL: https://issues.apache.org/jira/browse/PHOENIX-2285 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.5.2 >Reporter: Jan Fernando >Assignee: Jan Fernando > Attachments: PHOENIX-2285-v1.txt, PHOENIX-2285-v2.txt > > > When creating a Phoenix JDBC connection I have a use case where I want to > override the default value of phoenix.query.timeoutMs to a value of 200 ms. > Currently if you set phoenix.query.timeoutMs to less than 1000 ms, the > timeout gets rounded up to 1000ms. This is because in > PhoenixStatement.getDefaultQueryTimeout() we convert the value of > phoenix.query.timeoutMs to seconds in order to be compliant with JDBC. In > BaseResultIterators we then convert it back to millis. As a result of the > conversion we lose the millisecond fidelity. > A possible solution is to store the timeout value stored on the > PhoenixStatement in both seconds and milliseconds. Then, in > BaseResultIterators when we read the value from the statement we can check if > the value exists in millisecond fidelity and if so use that value. Otherwise > we would use the value in second granularity and convert. > This would allow Phoenix to remain JDBC compatible with second level > granularity for setting query timeouts on statements, but allow millisecond > granularity of timeouts by explicitly setting phoenix.query.timeoutMs on > connection properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
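The fidelity loss described above is a plain round trip through whole seconds: JDBC's Statement.setQueryTimeout works in seconds, so a sub-second millisecond setting cannot survive millis -> seconds -> millis. A minimal sketch of the assumed conversion logic (not the actual Phoenix code):

```java
/**
 * Minimal illustration of the rounding problem described in PHOENIX-2285.
 * The real conversion lives in PhoenixStatement/BaseResultIterators; this
 * is an assumed equivalent for demonstration only.
 */
public class TimeoutRounding {
    /** JDBC only carries whole seconds; round up so we never time out early. */
    public static int millisToJdbcSeconds(long millis) {
        return (int) Math.ceil(millis / 1000.0);
    }

    /** Converting back recovers only second granularity. */
    public static long jdbcSecondsToMillis(int seconds) {
        return seconds * 1000L;
    }
}
```

A requested timeout of 200 ms becomes 1 second on the way out and 1000 ms on the way back, which is exactly why the issue proposes keeping the millisecond value alongside the JDBC-visible seconds.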
[jira] [Commented] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort
[ https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940626#comment-14940626 ] Maryann Xue commented on PHOENIX-2270: -- [~jamestaylor] Like your idea of perf measuring this, and we could probably find out which cases may benefit from Phoenix partial sort and which may not, and model the rels' cost accordingly. > Implement Drill-specific rule for first level server-side sort > -- > > Key: PHOENIX-2270 > URL: https://issues.apache.org/jira/browse/PHOENIX-2270 > Project: Phoenix > Issue Type: Sub-task >Reporter: Maryann Xue >Assignee: Maryann Xue > Labels: calcite, drill > > Phoenix should have a physical operator that executes a sort on the > server-side which Drill can leverage when re-ordering is necessary. Unlike > PHOENIX-2269, which is clearly going to be more efficient to let Phoenix > handle, the sort is more of a gray area. Phoenix will be faster in the way it > does the scan within the coprocessor, but it still needs to return the same > number of rows. This process puts a pretty heavy burden on the region server > as well. We should measure performance with and without Phoenix doing the > sort. One potential scenario that may be a win for Phoenix is if the rows are > already partially sorted and Phoenix can take advantage of this (which is not > currently the case). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
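The "already partially sorted" win the issue alludes to is the classic k-way merge: if each region (or guidepost chunk) returns its rows in order, the client only needs an O(n log k) merge over k sorted streams rather than an O(n log n) full sort. A self-contained sketch of that merge, with all names invented for illustration (this is not the Phoenix operator):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative k-way merge over per-region streams that are already
// sorted -- the "partial sort" case the comment discusses. All class
// and method names here are hypothetical.
public class PartialSortMerge {
    static List<Integer> mergeSortedRuns(List<List<Integer>> runs) {
        // Min-heap holds one cursor per run: {value, runIndex, offset}.
        // Total cost is O(n log k) for n rows across k runs.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>(Comparator.comparingInt(a -> a[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[]{runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);
            int run = top[1], next = top[2] + 1;
            if (next < runs.get(run).size()) {
                heap.add(new int[]{runs.get(run).get(next), run, next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Three "regions", each pre-sorted on the sort key.
        List<List<Integer>> runs = Arrays.asList(
                Arrays.asList(1, 4, 7),
                Arrays.asList(2, 5),
                Arrays.asList(3, 6, 8));
        System.out.println(mergeSortedRuns(runs)); // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```

Whether this beats pushing the full sort into the coprocessor is exactly what the proposed perf measurement would decide, since the merge saves region-server CPU but still ships every row.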
[jira] [Commented] (PHOENIX-2301) NullPointerException when upserting into a char array column
[ https://issues.apache.org/jira/browse/PHOENIX-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940759#comment-14940759 ] James Taylor commented on PHOENIX-2301: --- [~jja...@marinsoftware.com] - what version of Phoenix are you currently using? [~Dumindux] - do you think you'd have a few spare cycles to take a look? > NullPointerException when upserting into a char array column > > > Key: PHOENIX-2301 > URL: https://issues.apache.org/jira/browse/PHOENIX-2301 > Project: Phoenix > Issue Type: Bug >Reporter: Julian Jaffe > > Attempting to upsert into a char array causes an NPE. Minimum example: > {code:sql} > 0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testIntArray" > INTEGER[], CONSTRAINT "test_pk" PRIMARY KEY("testIntArray")) > DEFAULT_COLUMN_FAMILY='T'; > No rows affected (1.28 seconds) > 0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY[1, 2, 3]); > 1 row affected (0.184 seconds) > 0: jdbc:phoenix:xx> SELECT * FROM TEST; > +--+ > | testIntArray | > +--+ > | [1, 2, 3]| > +--+ > 1 row selected (0.308 seconds) > 0: jdbc:phoenix:xx> DROP TABLE IF EXISTS TEST; > No rows affected (3.348 seconds) > 0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testCharArray" > CHAR(3)[], CONSTRAINT "test_pk" PRIMARY KEY("testCharArray")) > DEFAULT_COLUMN_FAMILY='T'; > No rows affected (1.446 seconds) > 0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY['aaa', 'bbb', 'ccc']); > java.lang.NullPointerException > at > org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1123) > at > org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:338) > at > org.apache.phoenix.schema.types.PCharArray.toObject(PCharArray.java:64) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:967) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1008) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1004) > at 
org.apache.phoenix.util.SchemaUtil.toString(SchemaUtil.java:381) > at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:572) > at > org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:117) > at > org.apache.phoenix.compile.UpsertCompiler.access$400(UpsertCompiler.java:98) > at > org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:821) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:319) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:311) > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:309) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1432) > at sqlline.Commands.execute(Commands.java:822) > at sqlline.Commands.sql(Commands.java:732) > at sqlline.SqlLine.dispatch(SqlLine.java:808) > at sqlline.SqlLine.begin(SqlLine.java:681) > at sqlline.SqlLine.start(SqlLine.java:398) > at sqlline.SqlLine.main(SqlLine.java:292) > 0: jdbc:phoenix:xx> SELECT * FROM TEST; > +---+ > | testCharArray | > +---+ > +---+ > No rows selected (0.169 seconds) > 0: jdbc:phoenix:xx> SELECT "testCharArray" FROM TEST; > +---+ > | testCharArray | > +---+ > +---+ > No rows selected (0.182 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
phoenix query server connection properties
Hi,

We are working on a Calcite/Avatica-based "thin" JDBC driver, very similar to what Phoenix has done for its QueryServer, and I am looking for some feedback/options.

Avatica in its current state doesn't have an RPC call for "create connection". As a consequence, connection properties (i.e. the Properties instance passed through DriverManager.getConnection(url, props)) are currently not RPC-ed from the client to the server. For Phoenix, this means properties such as TenantId, CurrentSCN etc. do not work with the thin driver. I saw this question being asked in PHOENIX-1824, so I am not sure whether you were aware of this problem. I've tested it with the Phoenix sandbox on master with a multi-tenant table to be sure.

There is currently a discussion ongoing on the Calcite dev mailing list on this topic as well (with the subject "avatica jdbc URL connection properties"). Our understanding of the problem is that we need to extend the RPC protocol with a "create connection" call, but this doesn't seem to be straightforward in the current Avatica design.

It would be interesting to hear your thoughts on this subject.

Thanks
Jan
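The gap Jan describes can be modeled in a few lines: the client collects a Properties instance, but the wire protocol exposes no message that could carry it, so the server-side connection never sees those settings. A toy sketch under that assumption (all class names invented; this is not Avatica's actual API):

```java
import java.util.Properties;

// Toy model of the missing "create connection" RPC described above.
// Every name here is invented for illustration.
public class ThinDriverSketch {
    // What the wire protocol can express today: per-statement calls
    // only, with no message that transmits connection Properties.
    interface WireProtocol {
        String execute(String sql); // note: no Properties anywhere
    }

    static class Server implements WireProtocol {
        // The server materializes its JDBC connection with empty
        // properties, because nothing on the wire delivers them.
        final Properties serverSideProps = new Properties();

        public String execute(String sql) {
            return "executed with TenantId="
                    + serverSideProps.getProperty("TenantId");
        }
    }

    public static void main(String[] args) {
        Properties clientProps = new Properties();
        clientProps.setProperty("TenantId", "acme"); // set by the caller...
        WireProtocol server = new Server();
        // ...but no RPC exists to transmit clientProps, so the tenant
        // scoping is silently lost on the server side:
        System.out.println(server.execute("SELECT 1"));
        // prints: executed with TenantId=null
    }
}
```

Extending the protocol with an explicit open-connection message whose payload includes the key/value pairs from the client's Properties is the shape of fix the thread is asking about.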