[jira] [Commented] (PHOENIX-2298) Problem storing with pig on a salted table

2015-10-01 Thread Guillaume salou (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939863#comment-14939863
 ] 

Guillaume salou commented on PHOENIX-2298:
--

Thanks, I am using 4.3.0-1

> Problem storing with pig on a salted table
> --
>
> Key: PHOENIX-2298
> URL: https://issues.apache.org/jira/browse/PHOENIX-2298
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Guillaume salou
>
> When I try to upsert via pigStorage on a salted table I get this error.
> Store ... using org.apache.phoenix.pig.PhoenixHBaseStorage();
> first field of the table:
> CurrentTime() as INTERNALTS:datetime,
> This date is not used in the primary key of the table.
> Works perfectly on a non salted table.
> Caused by: java.lang.RuntimeException: Unable to process column _SALT:BINARY, 
> innerMessage=org.apache.phoenix.schema.TypeMismatchException: ERROR 203 
> (22005): Type mismatch. BINARY cannot be coerced to DATE
>   at 
> org.apache.phoenix.pig.writable.PhoenixPigDBWritable.write(PhoenixPigDBWritable.java:66)
>   at 
> org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:78)
>   at 
> org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:39)
>   at 
> org.apache.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:182)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.phoenix.schema.ConstraintViolationException: 
> org.apache.phoenix.schema.TypeMismatchException: ERROR 203 (22005): Type 
> mismatch. BINARY cannot be coerced to DATE
>   at 
> org.apache.phoenix.schema.types.PDataType.throwConstraintViolationException(PDataType.java:282)
>   at org.apache.phoenix.schema.types.PDate.toObject(PDate.java:77)
>   at 
> org.apache.phoenix.pig.util.TypeUtil.castPigTypeToPhoenix(TypeUtil.java:208)
>   at 
> org.apache.phoenix.pig.writable.PhoenixPigDBWritable.convertTypeSpecificValue(PhoenixPigDBWritable.java:79)
>   at 
> org.apache.phoenix.pig.writable.PhoenixPigDBWritable.write(PhoenixPigDBWritable.java:59)
>   ... 21 more
> Caused by: org.apache.phoenix.schema.TypeMismatchException: ERROR 203 
> (22005): Type mismatch. BINARY cannot be coerced to DATE
>   at 
> org.apache.phoenix.exception.SQLExceptionCode$1.newException(SQLExceptionCode.java:68)
>   at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:133)
>   ... 26 more
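One way to picture the mismatch in the stack trace above: on a salted table the internal _SALT column comes first in the table metadata, so if the writer pairs Pig tuple fields with the raw column list, every field shifts by one and the first field (a datetime) lands on _SALT:BINARY. This is an illustrative Python sketch of that failure mode, not Phoenix source code; the column and field names are hypothetical.

```python
# Hypothetical illustration (not Phoenix code): pairing Pig fields with the
# raw column list of a salted table misaligns every field by one, so the
# datetime field is written against _SALT:BINARY.
columns = ["_SALT", "INTERNALTS", "HOST"]          # salted-table metadata
pig_fields = ["2015-10-01 00:00:00", "web01"]      # tuple from the Pig script

buggy = list(zip(columns, pig_fields))             # datetime lands on _SALT
fixed = list(zip([c for c in columns if c != "_SALT"], pig_fields))

assert buggy[0] == ("_SALT", "2015-10-01 00:00:00")   # the reported error
assert fixed[0] == ("INTERNALTS", "2015-10-01 00:00:00")
```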



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: phoenix query server connection properties

2015-10-01 Thread James Taylor
Hi Jan,

I just commented on the Calcite thread. Phoenix needs to know the
connection properties and, for each subsequent RPC, needs to know from which
connection you came. Otherwise we wouldn't know your tenant ID when the
statement is run. Something like a connection or session ID would need to
be passed through each RPC (and would perhaps originally be generated and
returned from the hypothetical createConnection RPC).

Thanks for bringing this up.

James

On Thursday, October 1, 2015, Jan Van Besien  wrote:

> Hi,
>
> We are working on a calcite/avatica based "thin" JDBC driver very
> similar to what Phoenix has done for its QueryServer, and I am
> looking for some feedback/options.
>
> Avatica in its current state doesn't have an RPC call for "create
> connection". As a consequence, connection properties (i.e. the
> Properties instance passed through the
> DriverManager.getConnection(url, props)) are currently not RPC-ed from
> the client to the server.
>
> For Phoenix, this means properties such as TenantId, CurrentSCN etc do
> not work with the thin driver. I saw this question being asked in
> PHOENIX-1824, so I am not sure whether you were aware of this problem.
> I've tested it with the phoenix sandbox on master with a multi-tenant
> table to be sure.
>
> There currently is a discussion ongoing on the calcite dev mailing
> list on this topic as well (with subject "avatica jdbc URL connection
> properties").
>
> Our understanding of the problem is that we need to extend the RPC
> with a "create connection", but this doesn't seem to be
> straightforward in the current Avatica design.
>
> It would be interesting to hear your thoughts on this subject.
>
> Thanks
> Jan
>
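The createConnection/session-ID scheme discussed in this thread can be sketched as a toy server; all names here are hypothetical (Avatica has no such RPC at this point), and this is an illustration of the idea, not the Avatica API.

```python
import uuid

class ToySessionServer:
    """Hypothetical sketch: a createConnection RPC stores the connection
    properties server-side and returns an ID the client echoes on every
    subsequent call, so per-connection state like TenantId is recoverable."""
    def __init__(self):
        self._connections = {}

    def create_connection(self, properties):
        conn_id = str(uuid.uuid4())
        self._connections[conn_id] = dict(properties)
        return conn_id

    def execute(self, conn_id, sql):
        # Each RPC carries the connection ID, so the server knows
        # which connection (and tenant) the statement belongs to.
        props = self._connections[conn_id]
        return {"sql": sql, "tenant": props.get("TenantId")}

server = ToySessionServer()
cid = server.create_connection({"TenantId": "acme"})
assert server.execute(cid, "SELECT 1")["tenant"] == "acme"
```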


[jira] [Commented] (PHOENIX-2298) Problem storing with pig on a salted table

2015-10-01 Thread maghamravikiran (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939840#comment-14939840
 ] 

maghamravikiran commented on PHOENIX-2298:
--

This issue was fixed in PHOENIX-2181. Can you share the Phoenix version you are 
using?

> Problem storing with pig on a salted table
> --
>
> Key: PHOENIX-2298
> URL: https://issues.apache.org/jira/browse/PHOENIX-2298
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Guillaume salou
>
> When I try to upsert via pigStorage on a salted table I get this error.
> Store ... using org.apache.phoenix.pig.PhoenixHBaseStorage();
> first field of the table:
> CurrentTime() as INTERNALTS:datetime,
> This date is not used in the primary key of the table.
> Works perfectly on a non salted table.





[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2270:
--
Description: Phoenix should have a physical operator that executes a sort 
on the server-side which Drill can leverage when re-ordering is necessary. 
Unlike PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
handle, the sort is more of a gray area. Phoenix will be faster in the way it 
does the scan within the coprocessor, but it still needs to return the same 
number of rows. This process puts a pretty heavy burden on the region server as 
well. We should measure performance with and without Phoenix doing the sort. 
One potential scenario that may be a win for Phoenix is if the rows are already 
partially sorted and Phoenix can take advantage of this (which is not currently 
the case).  (was: Phoenix should have a physical operator that executes a sort 
on the server-side which Drill can leverage when re-ordering is necessary. 
Unlike PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
handle, the sort is more of a gray area. Phoenix will be faster in the way it 
does the scan within the coprocessor, but it still needs to return the same 
number of rows. This process puts a pretty heavy burden on the region server as 
well. We should measure performance with and without Phoenix doing the sort.)

> Implement Drill-specific rule for first level server-side sort
> --
>
> Key: PHOENIX-2270
> URL: https://issues.apache.org/jira/browse/PHOENIX-2270
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> Phoenix should have a physical operator that executes a sort on the 
> server-side which Drill can leverage when re-ordering is necessary. Unlike 
> PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
> handle, the sort is more of a gray area. Phoenix will be faster in the way it 
> does the scan within the coprocessor, but it still needs to return the same 
> number of rows. This process puts a pretty heavy burden on the region server 
> as well. We should measure performance with and without Phoenix doing the 
> sort. One potential scenario that may be a win for Phoenix is if the rows are 
> already partially sorted and Phoenix can take advantage of this (which is not 
> currently the case).





[jira] [Updated] (PHOENIX-2269) Implement Drill-specific rule for first level server-side aggregation

2015-10-01 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2269:
--
Description: Drill has already modeled aggregation as a distributed, 
multi-step process that includes an initial aggregation step followed by a 
shuffle. Phoenix should have a physical operator that executes the initial 
aggregation step on the server-side. On the Phoenix side, we may be able to use 
our same AggregatePlan since we grab the underlying Scan from the QueryPlan as 
part of the information passed to the DrillBit. As long as the TupleProjector 
contains the correct KeyValueSchema to interpret the results coming back from 
the Scan, we have everything we need. We just need to make sure that Drill uses 
our physical operator.  (was: This includes two sub-tasks:

1. Model server-side partial aggregate and client-side merge aggregate as 
Phoenix physical operators.

2. Split Phoenix AggregatePlan into two parts accordingly.)
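The distributed, multi-step aggregation the description refers to (partial aggregation near the data, then a merge after the shuffle) can be sketched generically; this is an illustration of the pattern, not Phoenix's AggregatePlan or Drill's operators.

```python
# Generic two-phase aggregation sketch: each shard computes a partial
# aggregate locally (the server-side step), and the partials are merged
# after the shuffle (the client/coordinator-side step).
from collections import Counter

shards = [["a", "b", "a"], ["b", "b", "c"]]       # data per region/fragment
partials = [Counter(shard) for shard in shards]   # initial aggregation step
merged = sum(partials, Counter())                 # merge after the shuffle

assert merged == Counter(a=2, b=3, c=1)
```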

> Implement Drill-specific rule for first level server-side aggregation
> -
>
> Key: PHOENIX-2269
> URL: https://issues.apache.org/jira/browse/PHOENIX-2269
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> Drill has already modeled aggregation as a distributed, multi-step process 
> that includes an initial aggregation step followed by a shuffle. Phoenix 
> should have a physical operator that executes the initial aggregation step on 
> the server-side. On the Phoenix side, we may be able to use our same 
> AggregatePlan since we grab the underlying Scan from the QueryPlan as part of 
> the information passed to the DrillBit. As long as the TupleProjector 
> contains the correct KeyValueSchema to interpret the results coming back from 
> the Scan, we have everything we need. We just need to make sure that Drill 
> uses our physical operator.





[jira] [Assigned] (PHOENIX-1519) Determine the usefulness of tracing

2015-10-01 Thread Nishani (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishani  reassigned PHOENIX-1519:
-

Assignee: Nishani 

> Determine the usefulness of tracing
> ---
>
> Key: PHOENIX-1519
> URL: https://issues.apache.org/jira/browse/PHOENIX-1519
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Nishani 
>
> In order to help prioritize the child JIRAs under PHOENIX-1121, we need to do 
> a usability study around tracing. What problem(s) are we trying to solve and 
> what's the delta between the existing tracing feature and the required 
> tracing features. Are we capturing enough information to be useful? Is it 
> structured in such a way as to lend itself to analysis? 
> For example, is the main use case of tracing to determine the top 10 queries 
> that take the longest amount of time in production? Would we then want 
> tracing on all the time, so that having a way to enable/disable tracing 
> (PHOENIX-1433, PHOENIX-1191, PHOENIX-1518, PHOENIX-1115) becomes moot?
> Or is it more of something that would be used in a perf testing environment 
> where we'd need the information generated to be only for a particular query 
> (in which case the above JIRAs are more important)?





[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2270:
--
Summary: Implement Drill-specific rule for first level server-side sort  
(was: Split sort into server-side partial sort and client-side merge sort)

> Implement Drill-specific rule for first level server-side sort
> --
>
> Key: PHOENIX-2270
> URL: https://issues.apache.org/jira/browse/PHOENIX-2270
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> This has two parts:
> 1. Model server-side partial sort and client-side merge sort as Phoenix 
> physical operators (rels).
> 2. Split Phoenix ScanPlan and AggregatePlan with orderBy accordingly.
> This might help PHOENIX-2262 too.





[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2270:
--
Description: Phoenix should have a physical operator that executes a sort 
on the server-side which Drill can leverage when re-ordering is necessary. 
Unlike PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
handle, the sort is more of a gray area. Phoenix will be faster in the way it 
does the scan within the coprocessor, but it still needs to return the same 
number of rows. This process puts a pretty heavy burden on the region server as 
well. We should measure performance with and without Phoenix doing the sort.  
(was: Phoenix should have a physical operator that executes a sort on the 
server-side which Drill can leverage when re-ordering is necessary. )

> Implement Drill-specific rule for first level server-side sort
> --
>
> Key: PHOENIX-2270
> URL: https://issues.apache.org/jira/browse/PHOENIX-2270
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> Phoenix should have a physical operator that executes a sort on the 
> server-side which Drill can leverage when re-ordering is necessary. Unlike 
> PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
> handle, the sort is more of a gray area. Phoenix will be faster in the way it 
> does the scan within the coprocessor, but it still needs to return the same 
> number of rows. This process puts a pretty heavy burden on the region server 
> as well. We should measure performance with and without Phoenix doing the 
> sort.





[jira] [Updated] (PHOENIX-2269) Implement Drill-specific rule for first level server-side aggregation

2015-10-01 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2269:
--
Summary: Implement Drill-specific rule for first level server-side 
aggregation  (was: Split aggregate into server-side partial aggregate and 
client-side merge aggregate)

> Implement Drill-specific rule for first level server-side aggregation
> -
>
> Key: PHOENIX-2269
> URL: https://issues.apache.org/jira/browse/PHOENIX-2269
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> This includes two sub-tasks:
> 1. Model server-side partial aggregate and client-side merge aggregate as 
> Phoenix physical operators.
> 2. Split Phoenix AggregatePlan into two parts accordingly.





[jira] [Updated] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2270:
--
Description: Phoenix should have a physical operator that executes a sort 
on the server-side which Drill can leverage when re-ordering is necessary.   
(was: This has two parts:

1. Model server-side partial sort and client-side merge sort as Phoenix 
physical operators (rels).

2. Split Phoenix ScanPlan and AggregatePlan with orderBy accordingly.

This might help PHOENIX-2262 too.)

> Implement Drill-specific rule for first level server-side sort
> --
>
> Key: PHOENIX-2270
> URL: https://issues.apache.org/jira/browse/PHOENIX-2270
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> Phoenix should have a physical operator that executes a sort on the 
> server-side which Drill can leverage when re-ordering is necessary. 





[jira] [Created] (PHOENIX-2300) Support the use of materialized views to leverage secondary indexes

2015-10-01 Thread James Taylor (JIRA)
James Taylor created PHOENIX-2300:
-

 Summary: Support the use of materialized views to leverage 
secondary indexes
 Key: PHOENIX-2300
 URL: https://issues.apache.org/jira/browse/PHOENIX-2300
 Project: Phoenix
  Issue Type: Sub-task
Reporter: James Taylor


In the Phoenix/Calcite world, secondary indexes are modeled as materialized 
views. We need to figure out in the Phoenix/Drill world how to get Drill to use 
materialized views.





[jira] [Updated] (PHOENIX-2300) Support the use of secondary indexes in Drill

2015-10-01 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2300:
--
Summary: Support the use of secondary indexes in Drill  (was: Support the 
use of materialized views to leverage secondary indexes)

> Support the use of secondary indexes in Drill
> -
>
> Key: PHOENIX-2300
> URL: https://issues.apache.org/jira/browse/PHOENIX-2300
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: James Taylor
>
> In the Phoenix/Calcite world, secondary indexes are modeled as materialized 
> views. We need to figure out in the Phoenix/Drill world how to get Drill to 
> use materialized views.





[jira] [Commented] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940330#comment-14940330
 ] 

Maryann Xue commented on PHOENIX-2270:
--

[~jnadeau] Implemented this in https://github.com/jacques-n/drill/pull/4, but 
not sure if the sort on the Drill side is a merge instead of a real sort. Could 
you please verify?
The query plan is (also printed by the test case):
{code}
00-00    Screen
00-01      Project(B=[$0], E1=[$1], E2=[$2], R=[$3])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$1], dir0=[DESC])
00-04            Project(B=[$0], E1=[$1], E2=[$2], R=[$3])
                   PhoenixServerSort(sort0=[$1], dir0=[DESC])
                     PhoenixTableScan(table=[[PHOENIX, A, BEER]], filter=[>=($1, 1)])
00-05  Phoenix
{
  "head" : {
"version" : 1,
"generator" : {
  "type" : "ExplainHandler",
  "info" : ""
},
"type" : "APACHE_DRILL_PHYSICAL",
"options" : [ {
  "kind" : "LONG",
  "type" : "SESSION",
  "name" : "planner.width.max_per_node",
  "num_val" : 2
} ],
"queue" : 0,
"resultMode" : "EXEC"
  },
  "graph" : [ {
"pop" : "jdbc-scan",
"@id" : 0,
"scans" : [ 
"CgMKATASMQoNc2NhblByb2plY3RvchIgAP8EBAQBGAIBAhkCATACRTECGQIBMAJFMgIZAgEwAVISJQoKY29sdW1uSW5mbxIXBAFCAkUxAkUyAVISFwoSX05vbkFnZ3JlZ2F0ZVF1ZXJ5EgEBEiUKBV9Ub3BOEhyMAUAAAP8AAQAARRkCAP8EBAEABSQyLkUxKvABCilvcmcuYXBhY2hlLmhhZG9vcC5oYmFzZS5maWx0ZXIuRmlsdGVyTGlzdBLCAQgBElMKPG9yZy5hcGFjaGUucGhvZW5peC5maWx0ZXIuU2luZ2xlQ0ZDUUtleVZhbHVlQ29tcGFyaXNvbkZpbHRlchITFwQCAhkCATACRTEDBYEGAxJpCjBvcmcuYXBhY2hlLnBob2VuaXguZmlsdGVyLkNvbHVtblByb2plY3Rpb25GaWx0ZXISNQAAABUfiwgAMwAAId/b9AEBFR+LCAAzAAAh39v0AQAAMgwIABD//384AUAB"
 ],
"config" : {
  "type" : "phoenix",
  "url" : "jdbc:phoenix:localhost",
  "enabled" : true
},
"table" : "A.BEER",
"userName" : "",
"cost" : 0.0
  }, {
"pop" : "project",
"@id" : 4,
"exprs" : [ {
  "ref" : "`B`",
  "expr" : "`B`"
}, {
  "ref" : "`E1`",
  "expr" : "`E1`"
}, {
  "ref" : "`E2`",
  "expr" : "`E2`"
}, {
  "ref" : "`R`",
  "expr" : "`R`"
} ],
"child" : 0,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 50.0
  }, {
"pop" : "external-sort",
"@id" : 3,
"child" : 4,
"orderings" : [ {
  "expr" : "`E1`",
  "order" : "DESC",
  "nullDirection" : "UNSPECIFIED"
} ],
"reverse" : false,
"initialAllocation" : 2000,
"maxAllocation" : 100,
"cost" : 50.0
  }, {
"pop" : "selection-vector-remover",
"@id" : 2,
"child" : 3,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 50.0
  }, {
"pop" : "project",
"@id" : 1,
"exprs" : [ {
  "ref" : "`B`",
  "expr" : "`B`"
}, {
  "ref" : "`E1`",
  "expr" : "`E1`"
}, {
  "ref" : "`E2`",
  "expr" : "`E2`"
}, {
  "ref" : "`R`",
  "expr" : "`R`"
} ],
"child" : 2,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 50.0
  }, {
"pop" : "screen",
"@id" : 0,
"child" : 1,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 50.0
  } ]
}
{code} 

> Implement Drill-specific rule for first level server-side sort
> --
>
> Key: PHOENIX-2270
> URL: https://issues.apache.org/jira/browse/PHOENIX-2270
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> Phoenix should have a physical operator that executes a sort on the 
> server-side which Drill can leverage when re-ordering is necessary. Unlike 
> PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
> handle, the sort is more of a gray area. Phoenix will be faster in the way it 
> does the scan within the coprocessor, but it still needs to return the same 
> number of rows. This process puts a pretty heavy burden on the region server 
> as well. We should measure performance with and without Phoenix doing the 
> sort. One potential scenario that may be a win for Phoenix is if the rows are 
> already partially sorted and Phoenix can take advantage of this (which is not 
> currently the case).





[jira] [Commented] (PHOENIX-2300) Support the use of secondary indexes in Drill

2015-10-01 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940401#comment-14940401
 ] 

Julian Hyde commented on PHOENIX-2300:
--

Phoenix introduces indexes into the plan via a planning step that models 
indexes as a kind of materialized view. We need to package up Phoenix's 
planning process into a series of steps that Drill can call, and this would be 
one of the steps.

Calcite has an abstraction called 
[Program|http://calcite.incubator.apache.org/apidocs/org/apache/calcite/tools/Program.html]
 that is basically a step in a multi-phase planning process. I think Drill 
should extend its storage plugin SPI and allow plugins to declare programs to 
be run at various points in the planning process. Drill would define multiple 
phases analogous to how Maven defines phases 'generate-sources', 'compile', 
'test' etc.
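The phased-program idea Julian describes can be sketched as a toy planner; the phase names and registration API below are purely illustrative, not Drill's actual storage plugin SPI or Calcite's Program interface.

```python
# Toy sketch of phase-keyed planner programs, analogous to Maven lifecycle
# phases: plugins register programs per phase and the planner runs each
# phase's programs in order over the plan. Names are hypothetical.
PHASES = ["logical", "index-rewrite", "physical"]

class Planner:
    def __init__(self):
        self.programs = {phase: [] for phase in PHASES}

    def register(self, phase, program):
        # A "program" is one step of a multi-phase planning process.
        self.programs[phase].append(program)

    def plan(self, rel):
        for phase in PHASES:
            for program in self.programs[phase]:
                rel = program(rel)
        return rel

planner = Planner()
# A plugin-supplied program run at the index-rewrite point of planning.
planner.register("index-rewrite", lambda rel: rel + " +use-index")
assert planner.plan("scan") == "scan +use-index"
```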

> Support the use of secondary indexes in Drill
> -
>
> Key: PHOENIX-2300
> URL: https://issues.apache.org/jira/browse/PHOENIX-2300
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: James Taylor
>  Labels: drill
>
> In the Phoenix/Calcite world, secondary indexes are modeled as materialized 
> views. We need to figure out in the Phoenix/Drill world how to get Drill to 
> use materialized views.





[jira] [Created] (PHOENIX-2301) NullPointerException when upserting into a char array column

2015-10-01 Thread Julian Jaffe (JIRA)
Julian Jaffe created PHOENIX-2301:
-

 Summary: NullPointerException when upserting into a char array 
column
 Key: PHOENIX-2301
 URL: https://issues.apache.org/jira/browse/PHOENIX-2301
 Project: Phoenix
  Issue Type: Bug
Reporter: Julian Jaffe


Attempting to upsert into a char array causes an NPE. Minimum example:

{code:sql}
0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testIntArray" 
INTEGER[], CONSTRAINT "test_pk" PRIMARY KEY("testIntArray")) 
DEFAULT_COLUMN_FAMILY='T';
No rows affected (1.28 seconds)
0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY[1, 2, 3]);
1 row affected (0.184 seconds)
0: jdbc:phoenix:xx> SELECT * FROM TEST;
+--+
|   testIntArray   |
+--+
| [1, 2, 3]|
+--+
1 row selected (0.308 seconds)
0: jdbc:phoenix:xx> DROP TABLE IF EXISTS TEST;
No rows affected (3.348 seconds)
0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testCharArray" 
CHAR(3)[], CONSTRAINT "test_pk" PRIMARY KEY("testCharArray")) 
DEFAULT_COLUMN_FAMILY='T';
No rows affected (1.446 seconds)
0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY['aaa', 'bbb', 'ccc']);
java.lang.NullPointerException
at 
org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1123)
at 
org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:338)
at 
org.apache.phoenix.schema.types.PCharArray.toObject(PCharArray.java:64)
at 
org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:967)
at 
org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1008)
at 
org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1004)
at org.apache.phoenix.util.SchemaUtil.toString(SchemaUtil.java:381)
at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:572)
at 
org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:117)
at 
org.apache.phoenix.compile.UpsertCompiler.access$400(UpsertCompiler.java:98)
at 
org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:821)
at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:319)
at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:311)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:309)
at 
org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1432)
at sqlline.Commands.execute(Commands.java:822)
at sqlline.Commands.sql(Commands.java:732)
at sqlline.SqlLine.dispatch(SqlLine.java:808)
at sqlline.SqlLine.begin(SqlLine.java:681)
at sqlline.SqlLine.start(SqlLine.java:398)
at sqlline.SqlLine.main(SqlLine.java:292)
0: jdbc:phoenix:xx> SELECT * FROM TEST;
+---+
| testCharArray |
+---+
+---+
No rows selected (0.169 seconds)
0: jdbc:phoenix:xx> SELECT "testCharArray" FROM TEST;
+---+
| testCharArray |
+---+
+---+
No rows selected (0.182 seconds)
{code}





[jira] [Commented] (PHOENIX-2285) phoenix.query.timeoutMs doesn't allow callers to set the timeout to less than 1 second

2015-10-01 Thread Jan Fernando (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940685#comment-14940685
 ] 

Jan Fernando commented on PHOENIX-2285:
---

I'll get the repo set up and tackle this tomorrow.

> phoenix.query.timeoutMs doesn't allow callers to set the timeout to less than 
> 1 second
> --
>
> Key: PHOENIX-2285
> URL: https://issues.apache.org/jira/browse/PHOENIX-2285
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.5.2
>Reporter: Jan Fernando
>Assignee: Jan Fernando
> Attachments: PHOENIX-2285-v1.txt, PHOENIX-2285-v2.txt
>
>
> When creating a Phoenix JDBC connection I have a use case where I want to 
> override the default value of phoenix.query.timeoutMs to a value of 200 ms. 
> Currently if you set phoenix.query.timeoutMs to less than 1000 ms, the 
> timeout gets rounded up to 1000ms. This is because in 
> PhoenixStatement.getDefaultQueryTimeout() we convert the value of 
> phoenix.query.timeoutMs to seconds in order to be compliant with JDBC. In 
> BaseResultIterators we then convert it back to millis. As a result of the 
> conversion we lose the millisecond fidelity.
> A possible solution is to store the timeout value stored on the 
> PhoenixStatement in both seconds and milliseconds. Then, in 
> BaseResultIterators when we read the value from the statement we can check if 
> the value exists in millisecond fidelity and if so use that value. Otherwise 
> we would use the value in second granularity and convert. 
> This would allow Phoenix to remain JDBC compatible with second level 
> granularity for setting query timeouts on statements, but allow millisecond 
> granularity of timeouts by explicitly setting phoenix.query.timeoutMs on 
> connection properties.
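The rounding loss and the proposed dual-fidelity fix described above can be sketched as follows (class and method names are invented for illustration; this is not Phoenix's actual code):

```java
// Sketch of the fidelity loss described in PHOENIX-2285 (hypothetical
// names, not the actual Phoenix implementation).
public class TimeoutFidelity {

    // JDBC-style round trip: millis -> whole seconds -> millis.
    // Anything under 1000 ms rounds up to a full second.
    static int toJdbcSeconds(long timeoutMs) {
        return (int) Math.ceil(timeoutMs / 1000.0);
    }

    static long jdbcRoundTripMs(long timeoutMs) {
        return toJdbcSeconds(timeoutMs) * 1000L;
    }

    // Proposed fix: keep the original millisecond value alongside the
    // JDBC-visible seconds, and prefer it when present.
    static long effectiveTimeoutMs(Long timeoutMsOverride, int jdbcSeconds) {
        return timeoutMsOverride != null ? timeoutMsOverride : jdbcSeconds * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(jdbcRoundTripMs(200));        // 1000 -- fidelity lost
        System.out.println(effectiveTimeoutMs(200L, 1)); // 200  -- fidelity kept
    }
}
```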





[jira] [Commented] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940626#comment-14940626
 ] 

Maryann Xue commented on PHOENIX-2270:
--

[~jamestaylor] I like your idea of perf-measuring this; we could probably
find out which cases benefit from a Phoenix partial sort and which do not,
and model the rels' cost accordingly.

> Implement Drill-specific rule for first level server-side sort
> --
>
> Key: PHOENIX-2270
> URL: https://issues.apache.org/jira/browse/PHOENIX-2270
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>  Labels: calcite, drill
>
> Phoenix should have a physical operator that executes a sort on the 
> server-side which Drill can leverage when re-ordering is necessary. Unlike 
> PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
> handle, the sort is more of a gray area. Phoenix will be faster in the way it 
> does the scan within the coprocessor, but it still needs to return the same 
> number of rows. This process puts a pretty heavy burden on the region server 
> as well. We should measure performance with and without Phoenix doing the 
> sort. One potential scenario that may be a win for Phoenix is if the rows are 
> already partially sorted and Phoenix can take advantage of this (which is not 
> currently the case).
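The "already partially sorted" win discussed above can be illustrated generically: when each scanner already returns rows in sorted order, a global order needs only a k-way merge over the sorted runs rather than a full re-sort of all rows (a generic sketch, not Phoenix code):

```java
import java.util.*;

// Generic illustration of the partial-sort advantage: merging k sorted
// runs with a priority queue is O(n log k), versus O(n log n) for a full
// re-sort of all rows.
public class KWayMerge {

    static List<Integer> mergeSortedRuns(List<List<Integer>> runs) {
        // Heap entries: {value, runIndex, offsetWithinRun}
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            out.add(e[0]);
            int next = e[2] + 1;
            if (next < runs.get(e[1]).size()) {
                heap.add(new int[] {runs.get(e[1]).get(next), e[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> runs = Arrays.asList(
                Arrays.asList(1, 4, 9),
                Arrays.asList(2, 3, 8),
                Arrays.asList(5, 6, 7));
        System.out.println(mergeSortedRuns(runs)); // [1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```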





[jira] [Commented] (PHOENIX-2301) NullPointerException when upserting into a char array column

2015-10-01 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940759#comment-14940759
 ] 

James Taylor commented on PHOENIX-2301:
---

[~jja...@marinsoftware.com] - what version of Phoenix are you currently using?

[~Dumindux] - do you think you'd have a few spare cycles to take a look?

> NullPointerException when upserting into a char array column
> 
>
> Key: PHOENIX-2301
> URL: https://issues.apache.org/jira/browse/PHOENIX-2301
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Julian Jaffe
>
> Attempting to upsert into a char array causes an NPE. Minimum example:
> {code:sql}
> 0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testIntArray" 
> INTEGER[], CONSTRAINT "test_pk" PRIMARY KEY("testIntArray")) 
> DEFAULT_COLUMN_FAMILY='T';
> No rows affected (1.28 seconds)
> 0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY[1, 2, 3]);
> 1 row affected (0.184 seconds)
> 0: jdbc:phoenix:xx> SELECT * FROM TEST;
> +--+
> |   testIntArray   |
> +--+
> | [1, 2, 3]|
> +--+
> 1 row selected (0.308 seconds)
> 0: jdbc:phoenix:xx> DROP TABLE IF EXISTS TEST;
> No rows affected (3.348 seconds)
> 0: jdbc:phoenix:xx> CREATE TABLE IF NOT EXISTS TEST("testCharArray" 
> CHAR(3)[], CONSTRAINT "test_pk" PRIMARY KEY("testCharArray")) 
> DEFAULT_COLUMN_FAMILY='T';
> No rows affected (1.446 seconds)
> 0: jdbc:phoenix:xx> UPSERT INTO TEST VALUES (ARRAY['aaa', 'bbb', 'ccc']);
> java.lang.NullPointerException
>   at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1123)
>   at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:338)
>   at org.apache.phoenix.schema.types.PCharArray.toObject(PCharArray.java:64)
>   at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:967)
>   at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1008)
>   at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1004)
>   at org.apache.phoenix.util.SchemaUtil.toString(SchemaUtil.java:381)
>   at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:572)
>   at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:117)
>   at org.apache.phoenix.compile.UpsertCompiler.access$400(UpsertCompiler.java:98)
>   at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:821)
>   at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:319)
>   at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:311)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:309)
>   at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1432)
>   at sqlline.Commands.execute(Commands.java:822)
>   at sqlline.Commands.sql(Commands.java:732)
>   at sqlline.SqlLine.dispatch(SqlLine.java:808)
>   at sqlline.SqlLine.begin(SqlLine.java:681)
>   at sqlline.SqlLine.start(SqlLine.java:398)
>   at sqlline.SqlLine.main(SqlLine.java:292)
> 0: jdbc:phoenix:xx> SELECT * FROM TEST;
> +---+
> | testCharArray |
> +---+
> +---+
> No rows selected (0.169 seconds)
> 0: jdbc:phoenix:xx> SELECT "testCharArray" FROM TEST;
> +---+
> | testCharArray |
> +---+
> +---+
> No rows selected (0.182 seconds)
> {code}





phoenix query server connection properties

2015-10-01 Thread Jan Van Besien
Hi,

We are working on a calcite/avatica based "thin" JDBC driver very
similar to what Phoenix has done for its QueryServer, and I am
looking for some feedback/opinions.

Avatica in its current state doesn't have an RPC call for "create
connection". As a consequence, connection properties (i.e. the
Properties instance passed through the
DriverManager.getConnection(url, props)) are currently not RPC-ed from
the client to the server.

For Phoenix, this means properties such as TenantId, CurrentSCN etc do
not work with the thin driver. I saw this question being asked in
PHOENIX-1824, so I am not sure whether you were aware of this problem.
I've tested it with the phoenix sandbox on master with a multi-tenant
table to be sure.

There currently is a discussion ongoing on the calcite dev mailing
list on this topic as well (with subject "avatica jdbc URL connection
properties").

Our understanding of the problem is that we need to extend the RPC
with a "create connection", but this doesn't seem to be
straightforward in the current Avatica design.
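One shape such a "create connection" RPC could take is sketched below. All names here are invented for illustration; this is not an existing Avatica API. The point is that the Properties passed to DriverManager.getConnection() would travel to the server instead of being dropped on the client:

```java
import java.util.*;

// Hypothetical wire shape for a "create connection" RPC (invented names;
// not part of Avatica's protocol at the time of writing). Flattening the
// JDBC Properties into the request payload would let the server apply
// e.g. TenantId or CurrentSCN when it opens the real connection.
public class OpenConnectionSketch {

    static Map<String, String> buildRequest(String connectionId, Properties props) {
        Map<String, String> request = new LinkedHashMap<>();
        request.put("request", "openConnection");   // new RPC verb
        request.put("connectionId", connectionId);  // client-chosen id
        for (String name : props.stringPropertyNames()) {
            request.put("info." + name, props.getProperty(name));
        }
        return request;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("TenantId", "acme");
        props.setProperty("CurrentSCN", "1443657600000");
        System.out.println(buildRequest("conn-1", props));
    }
}
```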

It would be interesting to hear your thoughts on this subject.

Thanks
Jan