Re: Runtime DDL supported?

2018-02-27 Thread Miles Spielberg
As we discussed, indexes across views are stored in a single HBase table
associated with the original table (_IDX_). That's grand for
limiting the number of HBase tables created, but I just realized that the
actual index data within is differentiated by the 16-bit "viewIndexId",
which limits us to 64K indexes across all views for a given table. That's
concerning for our use case, especially if its a cumulative autoincrement
across all CREATE INDEX and DROP INDEX operations over the lifetime of the
base table.

Is there any workaround for this? A quick grep across the source indicates
that the length of viewIndexId is currently hard-coded.

At least, this limitation should probably be added to the list of caveats
and warnings at https://phoenix.apache.org/views.html.

Miles Spielberg
Staff Software Engineer


O. 650.485.1102
900 Jefferson Ave
Redwood City, CA 94063

On Thu, Feb 22, 2018 at 7:42 AM, James Taylor 
wrote:

> Another option would be to use dynamic columns[1] when querying across
> views. You’d have to disable column encoding [2] in this case.
>
> [1] http://phoenix.apache.org/dynamic_columns.html
> [2] http://phoenix.apache.org/columnencoding.html
>
> On Wed, Feb 21, 2018 at 11:57 PM Miles Spielberg  wrote:
>
>> I believe each query in a UNION needs to have the same result tuple
>> format, which would work in this toy example, but in the general case each
>> view would have a different schema. We could make the result tuples conform
>> with each other by selecting NULL literals for every column except those in
>> a view. It would get quite verbose though. Assuming f1,f2,f3 all have
>> incompatible types, were you suggesting something like this?
>>
>> Select f1, null, null from v1 where PK=?
>> Union all
>> Select null, f2, null from v2 where PK=?
>> Union all
>> Select null, null, f3 from v3 where PK=?
>>
>> We might just run separate parallel queries against each view and merge
>> the results client side. I would guess this should perform well since the
>> block cache can be leveraged for queries after the first.
>>
>> We could also use the HBase API to run a point row get. We'd have to
>> reimplement decoding for Phoenix's column values, which is not ideal but
>> quite doable.
>>
>> Sent from my iPhone
>>
>> On Feb 21, 2018, at 9:09 PM, James Taylor  wrote:
>>
>> Have you tried a UNION ALL query on (f1, f2, f3) instead? It seems
>> you’re on a good track with multiple views over a single (or handful) of
>> physical table(s).
>>
>> On Wed, Feb 21, 2018 at 6:45 PM Miles Spielberg  wrote:
>>
>>> I've done some experimentation with views, with a schema resembling this:
>>>
>>> create table t1(

 pk bigint not null primary key

 );


> create view v1(

 f1 varchar

 ) AS SELECT * FROM t1;

 create INDEX v1_f1 ON v1(f1);


> create view v2(

 f2 varchar

 ) AS SELECT * FROM t1;

 create INDEX v2_f2 ON v2(f2);


> create view v3(

 f3 varchar

 ) AS SELECT * FROM t1;

 create INDEX v3_f3 ON v3(f3);


>>> Most of the time we'll be accessing data via the indexed views, but we'd
>>> also like to be able to query all columns (f1, f2, f3) for a given pk. At
>>> the HBase level, this should be doable as a point get on t1. The SQL-y
>>> way to express this would probably be with JOINs, but the EXPLAIN plan is
>>> not encouraging.
>>>
>>> > explain SELECT * from t1 left join v1 on v1.pk=t1.pk left join v2 on
 v2.pk=t1.pk left  join v3 on v3.pk=t1.pk where t1.pk=12345;
 | CLIENT 1-CHUNK 1 ROWS 281 BYTES PARALLEL 1-WAY ROUND ROBIN POINT
 LOOKUP ON 1 KEY OVER T1
 | PARALLEL LEFT-JOIN TABLE 0
 | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER T1
 | PARALLEL LEFT-JOIN TABLE 1
 | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER T1
 | PARALLEL LEFT-JOIN TABLE 2
 | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER T1

>>>
>>> This is pushing me back towards a design of having a single table,
>>> except for the issue of proliferating physical HBase tables for the
>>> indexes. Would you advise having a single table + a single view on it
>>> containing all columns, to coerce Phoenix to consolidate the indexes into a
>>> single physical table? Are there other alternatives we should be
>>> considering?
>>>
>>> Miles Spielberg
>>> Staff Software Engineer
>>>
>>>
>>> O. 650.485.1102 <(650)%20485-1102>
>>> 900 Jefferson Ave
>>> 
>>> Redwood City
>>> ,
>>> CA 94063
>>> 
>>>
>>> On Fri, Feb 16, 2018 at 3:27 PM, James Taylor 
>>> 

Re: Secondary index question

2018-02-27 Thread Miles Spielberg
I believe this is the same bug we encountered with local indexes where Phoenix 
4.13.1 failed to join back to the main table if the local index is not covering 
for all columns referenced in the WHERE.

> On Feb 27, 2018, at 9:36 AM, James Taylor  wrote:
> 
> Please file a JIRA and include the Phoenix and HBase version. Sounds like 
> you’ve found a bug.
> 
>> On Tue, Feb 27, 2018 at 9:21 AM Jonathan Leech  wrote:
>> I’ve done what you’re looking for by selecting the pk from the index in a 
>> nested query and filtering the other column separately.
>> 
>> > On Feb 27, 2018, at 6:39 AM, Alexey Karpov  wrote:
>> >
>> > Thanks for quick answer, but my case is a slightly different. I've seen 
>> > these
>> > links and already use local index. All cases, described in faq, index_usage
>> > and any other, I've found in this user list, are about SELECT clause. In
>> > WHERE clause there is always field from the index.
>> >
>> > In my case in WHERE clause I have one field from the index and one not from
>> > the index, combined with AND operator:
>> > SELECT * from test WHERE name = 'a' *AND description = 'b'*
>> > name - from the index
>> > description - not from the index
>> >
>> > Without filter on description (only on name) Phoenix uses index, as 
>> > expected
>> > for local index. But with additional filter Phoenix decides to do a full
>> > scan. And my question is: Is there any way to make Phoenix use index in 
>> > such
>> > types of queries, without include all fields in index ?
>> >
>> > Hint does not help:
>> > SELECT /*+ INDEX(test ix_test_name) */ name FROM test WHERE name = 'a' AND
>> > description= 'b'
>> > ColumnNotFoundException: ERROR 504 (42703): Undefined column.
>> > columnName=DESCRIPTION.
>> >
>> > Thanks,
>> > Alexey.
>> >
>> >
>> >
>> > --
>> > Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/


Re: Secondary index question

2018-02-27 Thread James Taylor
Please file a JIRA and include the Phoenix and HBase version. Sounds like
you’ve found a bug.

On Tue, Feb 27, 2018 at 9:21 AM Jonathan Leech  wrote:

> I’ve done what you’re looking for by selecting the pk from the index in a
> nested query and filtering the other column separately.
>
> > On Feb 27, 2018, at 6:39 AM, Alexey Karpov  wrote:
> >
> > Thanks for quick answer, but my case is a slightly different. I've seen
> these
> > links and already use local index. All cases, described in faq,
> index_usage
> > and any other, I've found in this user list, are about SELECT clause. In
> > WHERE clause there is always field from the index.
> >
> > In my case in WHERE clause I have one field from the index and one not
> from
> > the index, combined with AND operator:
> > SELECT * from test WHERE name = 'a' *AND description = 'b'*
> > name - from the index
> > description - not from the index
> >
> > Without filter on description (only on name) Phoenix uses index, as
> expected
> > for local index. But with additional filter Phoenix decides to do a full
> > scan. And my question is: Is there any way to make Phoenix use index in
> such
> > types of queries, without include all fields in index ?
> >
> > Hint does not help:
> > SELECT /*+ INDEX(test ix_test_name) */ name FROM test WHERE name = 'a'
> AND
> > description= 'b'
> > ColumnNotFoundException: ERROR 504 (42703): Undefined column.
> > columnName=DESCRIPTION.
> >
> > Thanks,
> > Alexey.
> >
> >
> >
> > --
> > Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/
>


Re: Secondary index question

2018-02-27 Thread Jonathan Leech
I’ve done what you’re looking for by selecting the pk from the index in a 
nested query and filtering the other column separately.

> On Feb 27, 2018, at 6:39 AM, Alexey Karpov  wrote:
> 
> Thanks for quick answer, but my case is a slightly different. I've seen these
> links and already use local index. All cases, described in faq, index_usage
> and any other, I've found in this user list, are about SELECT clause. In
> WHERE clause there is always field from the index.
> 
> In my case in WHERE clause I have one field from the index and one not from
> the index, combined with AND operator:
> SELECT * from test WHERE name = 'a' *AND description = 'b'*
> name - from the index
> description - not from the index
> 
> Without filter on description (only on name) Phoenix uses index, as expected
> for local index. But with additional filter Phoenix decides to do a full
> scan. And my question is: Is there any way to make Phoenix use index in such
> types of queries, without include all fields in index ?
> 
> Hint does not help:
> SELECT /*+ INDEX(test ix_test_name) */ name FROM test WHERE name = 'a' AND
> description= 'b'
> ColumnNotFoundException: ERROR 504 (42703): Undefined column.
> columnName=DESCRIPTION.
> 
> Thanks,
> Alexey.
> 
> 
> 
> --
> Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/


Re: Secondary index question

2018-02-27 Thread Alexey Karpov
Thanks for quick answer, but my case is a slightly different. I've seen these
links and already use local index. All cases, described in faq, index_usage
and any other, I've found in this user list, are about SELECT clause. In
WHERE clause there is always field from the index.

In my case in WHERE clause I have one field from the index and one not from
the index, combined with AND operator:
SELECT * from test WHERE name = 'a' *AND description = 'b'*
name - from the index
description - not from the index

Without filter on description (only on name) Phoenix uses index, as expected
for local index. But with additional filter Phoenix decides to do a full
scan. And my question is: Is there any way to make Phoenix use index in such
types of queries, without include all fields in index ?

Hint does not help:
SELECT /*+ INDEX(test ix_test_name) */ name FROM test WHERE name = 'a' AND
description= 'b'
ColumnNotFoundException: ERROR 504 (42703): Undefined column.
columnName=DESCRIPTION.

Thanks,
Alexey.



--
Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/