Re: Runtime DDL supported?

2018-02-16 Thread James Taylor
All indexes on views are stored in a single physical table, so you'll be ok
in that regard.

If you could file bugs for any local index issues, we'd really appreciate
it. We've been steadily improving local indexes (see PHOENIX-3941 for some
recent perf improvements - applicable for multi-tenant tables in particular
- these will appear in our 4.14 release). Handling non covered columns is
pretty isolated, so we should be able to fix bugs you find. Plus, there's a
workaround - you can cover your indexes until any issues are fixed.

Global, mutable indexes have had many improvements over the last several
releases too, but there's more operational overhead if/when a data table
gets out of sync with it's index table (plus some amount of configurable
eventual consistency or index disablement). With local indexes (and HBase
1.3), this isn't possible.

Thanks,
James

On Fri, Feb 16, 2018 at 3:10 PM, Miles Spielberg  wrote:

> Hi James,
>
> Thanks for the tips around reducing the number of physical tables while
> still maintaining the appearance of multiple tables via view definitions.
> In our use case we don't anticipate having much if any immutable data, so
> unfortunately I don't expect to be able to take advantage of Phoenix's
> optimizations there.
>
> We're expecting many indexes, mostly likely with several per logical
> per-tenant table. Given that global indexes are implemented as physical
> HBase tables, will the view-oriented optimizations help very much? We've
> done some experiments with local indexes on 4.13.2 and found bugs,
> particularly with the rewrite optimization to read non-covered columns from
> the main table, so we're not confident in using local indexes to optimize
> queries. (I've looked through the 5.0-alpha release notes and couldn't find
> anything related to this issue, so if desired I'll collect info for a
> separate bug report.)
>
> Miles Spielberg
> Staff Software Engineer
>
>
> O. 650.485.1102 <(650)%20485-1102>
> 900 Jefferson Ave
> 
> Redwood City, CA 94063
> 
>
> On Fri, Feb 16, 2018 at 2:49 PM, James Taylor 
> wrote:
>
>> Hi Miles,
>> You'll be fine if you use views [1] and multi-tenancy [2] to limit the
>> number of physical HBase tables. Make sure you read about the limitations
>> of views too [3].
>>
>> Here's the way I've seen this modeled successfully:
>> - create one schema per use case. This will let you leverage some nice
>> features in HBase for quotas and throttling. If you'll have a single use
>> case, you don't have to worry about it. Read about namespaces here [4] and
>> make sure to enable them before you start creating tables.
>> - define an immutable, multi-tenant base table that has TENANT_ID +
>> TYPE_ID primary key. There are optimizations Phoenix does over immutable
>> tables that you'll want to leverage (assuming you have use cases that fit
>> into this category). This Phoenix table will be backed by a physical
>> HBase table, but you won't execute Phoenix DML against it. Think of it as a
>> kind of "abstract" type. Instead, you'll create updatable views over it.
>> - define a regular/mutable, multi-tenant base table that has TENANT_ID +
>> TYPE_ID primary key. Same deal as above, but this would be the base table
>> for any tables in which the rows change in place.
>> - define global views per "logical" table (against either your immutable
>> base table or mutable base table depending on the functionality needed)
>> with each view having a WHERE TYPE_ID='your type identifier' clause which
>> adds specific columns to the primary key. This view will be updatable (i.e.
>> you can execute DML against it). The columns you add to your PK will depend
>> on your most common query patterns.
>> - optionally define indexes on these global views.
>> - each tenant can further extend or just use the global views.
>>
>> FYI, lots of good performance/tuning tips can be found here[5].
>>
>> Thanks,
>> James
>>
>>
>> [1] https://phoenix.apache.org/views.html
>> [2] https://phoenix.apache.org/multi-tenancy.html
>> [3] https://phoenix.apache.org/views.html#Limitations
>> [4] https://phoenix.apache.org/namspace_mapping.html
>> [5] https://phoenix.apache.org/tuning_guide.html
>>
>> On Fri, Feb 16, 2018 at 11:47 AM, Miles Spielberg  wrote:
>>
>>> We're looking at employing Phoenix in a multi-tenant use case where
>>> tenants can create their own tables and indexes, running into totals of
>>> tens-of-thousands of each. Is this a supported scenario, or are we headed
>>> for trouble?
>>>
>>
>>
>


Re: Runtime DDL supported?

2018-02-16 Thread Miles Spielberg
Hi James,

Thanks for the tips around reducing the number of physical tables while
still maintaining the appearance of multiple tables via view definitions.
In our use case we don't anticipate having much if any immutable data, so
unfortunately I don't expect to be able to take advantage of Phoenix's
optimizations there.

We're expecting many indexes, mostly likely with several per logical
per-tenant table. Given that global indexes are implemented as physical
HBase tables, will the view-oriented optimizations help very much? We've
done some experiments with local indexes on 4.13.2 and found bugs,
particularly with the rewrite optimization to read non-covered columns from
the main table, so we're not confident in using local indexes to optimize
queries. (I've looked through the 5.0-alpha release notes and couldn't find
anything related to this issue, so if desired I'll collect info for a
separate bug report.)

Miles Spielberg
Staff Software Engineer


O. 650.485.1102
900 Jefferson Ave
Redwood City, CA 94063

On Fri, Feb 16, 2018 at 2:49 PM, James Taylor 
wrote:

> Hi Miles,
> You'll be fine if you use views [1] and multi-tenancy [2] to limit the
> number of physical HBase tables. Make sure you read about the limitations
> of views too [3].
>
> Here's the way I've seen this modeled successfully:
> - create one schema per use case. This will let you leverage some nice
> features in HBase for quotas and throttling. If you'll have a single use
> case, you don't have to worry about it. Read about namespaces here [4] and
> make sure to enable them before you start creating tables.
> - define an immutable, multi-tenant base table that has TENANT_ID +
> TYPE_ID primary key. There are optimizations Phoenix does over immutable
> tables that you'll want to leverage (assuming you have use cases that fit
> into this category). This Phoenix table will be backed by a physical
> HBase table, but you won't execute Phoenix DML against it. Think of it as a
> kind of "abstract" type. Instead, you'll create updatable views over it.
> - define a regular/mutable, multi-tenant base table that has TENANT_ID +
> TYPE_ID primary key. Same deal as above, but this would be the base table
> for any tables in which the rows change in place.
> - define global views per "logical" table (against either your immutable
> base table or mutable base table depending on the functionality needed)
> with each view having a WHERE TYPE_ID='your type identifier' clause which
> adds specific columns to the primary key. This view will be updatable (i.e.
> you can execute DML against it). The columns you add to your PK will depend
> on your most common query patterns.
> - optionally define indexes on these global views.
> - each tenant can further extend or just use the global views.
>
> FYI, lots of good performance/tuning tips can be found here[5].
>
> Thanks,
> James
>
>
> [1] https://phoenix.apache.org/views.html
> [2] https://phoenix.apache.org/multi-tenancy.html
> [3] https://phoenix.apache.org/views.html#Limitations
> [4] https://phoenix.apache.org/namspace_mapping.html
> [5] https://phoenix.apache.org/tuning_guide.html
>
> On Fri, Feb 16, 2018 at 11:47 AM, Miles Spielberg  wrote:
>
>> We're looking at employing Phoenix in a multi-tenant use case where
>> tenants can create their own tables and indexes, running into totals of
>> tens-of-thousands of each. Is this a supported scenario, or are we headed
>> for trouble?
>>
>
>


Re: Runtime DDL supported?

2018-02-16 Thread James Taylor
Hi Miles,
You'll be fine if you use views [1] and multi-tenancy [2] to limit the
number of physical HBase tables. Make sure you read about the limitations
of views too [3].

Here's the way I've seen this modeled successfully:
- create one schema per use case. This will let you leverage some nice
features in HBase for quotas and throttling. If you'll have a single use
case, you don't have to worry about it. Read about namespaces here [4] and
make sure to enable them before you start creating tables.
- define an immutable, multi-tenant base table that has TENANT_ID + TYPE_ID
primary key. There are optimizations Phoenix does over immutable tables
that you'll want to leverage (assuming you have use cases that fit into
this category). This Phoenix table will be backed by a physical HBase
table, but you won't execute Phoenix DML against it. Think of it as a kind
of "abstract" type. Instead, you'll create updatable views over it.
- define a regular/mutable, multi-tenant base table that has TENANT_ID +
TYPE_ID primary key. Same deal as above, but this would be the base table
for any tables in which the rows change in place.
- define global views per "logical" table (against either your immutable
base table or mutable base table depending on the functionality needed)
with each view having a WHERE TYPE_ID='your type identifier' clause which
adds specific columns to the primary key. This view will be updatable (i.e.
you can execute DML against it). The columns you add to your PK will depend
on your most common query patterns.
- optionally define indexes on these global views.
- each tenant can further extend or just use the global views.

FYI, lots of good performance/tuning tips can be found here[5].

Thanks,
James


[1] https://phoenix.apache.org/views.html
[2] https://phoenix.apache.org/multi-tenancy.html
[3] https://phoenix.apache.org/views.html#Limitations
[4] https://phoenix.apache.org/namspace_mapping.html
[5] https://phoenix.apache.org/tuning_guide.html

On Fri, Feb 16, 2018 at 11:47 AM, Miles Spielberg  wrote:

> We're looking at employing Phoenix in a multi-tenant use case where
> tenants can create their own tables and indexes, running into totals of
> tens-of-thousands of each. Is this a supported scenario, or are we headed
> for trouble?
>


Runtime DDL supported?

2018-02-16 Thread Miles Spielberg
We're looking at employing Phoenix in a multi-tenant use case where tenants
can create their own tables and indexes, running into totals of
tens-of-thousands of each. Is this a supported scenario, or are we headed
for trouble?