[jira] [Updated] (PHOENIX-4658) IllegalStateException: requestSeek cannot be called on ReversedKeyValueHeap
[ https://issues.apache.org/jira/browse/PHOENIX-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Toshihiro Suzuki updated PHOENIX-4658:
--------------------------------------
    Attachment: PHOENIX-4658.patch

> IllegalStateException: requestSeek cannot be called on ReversedKeyValueHeap
> ---------------------------------------------------------------------------
>
>          Key: PHOENIX-4658
>          URL: https://issues.apache.org/jira/browse/PHOENIX-4658
>      Project: Phoenix
>   Issue Type: Bug
>     Reporter: Toshihiro Suzuki
>     Assignee: Toshihiro Suzuki
>     Priority: Major
>      Fix For: 4.14.0
>  Attachments: PHOENIX-4658.patch, PHOENIX-4658.patch, PHOENIX-4658.patch
>
> Steps to reproduce are as follows:
> 1. Create a table with multiple column families (default column family and "FAM")
> {code}
> CREATE TABLE TBL (
>   COL1 VARCHAR NOT NULL,
>   COL2 VARCHAR NOT NULL,
>   COL3 VARCHAR,
>   FAM.COL4 VARCHAR,
>   CONSTRAINT TRADE_EVENT_PK PRIMARY KEY (COL1, COL2)
> )
> {code}
> 2. Upsert a row
> {code}
> UPSERT INTO TBL (COL1, COL2) VALUES ('AAA', 'BBB')
> {code}
> 3. Query the table in DESC order
> {code}
> SELECT * FROM TBL WHERE COL2 = 'BBB' ORDER BY COL1 DESC
> {code}
> Following the above steps, we hit the exception below:
> {code}
> java.util.concurrent.ExecutionException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: TBL,,1521251842845.153781990c0fb4bc34e3f2c721a6f415.: requestSeek cannot be called on ReversedKeyValueHeap
> 	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:96)
> 	at org.apache.phoenix.util.ServerUtil.throwIOException(ServerUtil.java:62)
> 	at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:212)
> 	at org.apache.phoenix.coprocessor.DelegateRegionScanner.nextRaw(DelegateRegionScanner.java:82)
> 	at org.apache.phoenix.coprocessor.DelegateRegionScanner.nextRaw(DelegateRegionScanner.java:82)
> 	at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.nextRaw(BaseScannerRegionObserver.java:294)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2808)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3045)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36613)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2352)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
> Caused by: java.lang.IllegalStateException: requestSeek cannot be called on ReversedKeyValueHeap
> 	at org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.requestSeek(ReversedKeyValueHeap.java:65)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.joinedHeapMayHaveData(HRegion.java:6485)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6412)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6126)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6112)
> 	at org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:175)
> 	... 10 more
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Comment Edited] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query
[ https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408984#comment-16408984 ]

James Taylor edited comment on PHOENIX-4666 at 3/22/18 3:20 AM:
----------------------------------------------------------------

I like the simplicity of your design, [~ortutay], in using the hash as the cache ID. You could just assume that the cache is already available and react to the exception you get back by generating the cache if it's not. That way you'd need no mapping at all (and no central place to check if the cache ID maps to an existing cache). That flow is already there, but you'd need to add the logic to generate the cache, as the current code assumes that it has already built the cache (i.e. this handles the situation in which a region splits and the new RS doesn't have the cache yet).

Some considerations:
- at a minimum, we could have a global config for the TTL of the cache when this feature is enabled (so that it'd be a different config than the standard TTL config).
- at the finest granularity, you could even create a new hint that specifies the TTL, so you could specify it per query.
- we'd want to make it clear in documentation that the cache data would be stale once generated (until the TTL expires it).
- might consider having a new table-level property on which this feature could be enabled (or a table-specific TTL could be specified)
- might consider in the future using a format like Apache Arrow to represent the hash join cache data
- might consider off-heap memory for the hash join cache
- persistent cache could be future work (or you could put interfaces in place that could be replaced)

was (Author: jamestaylor):
I like the simplicity of your design, [~ortutay], in using the hash as the cache ID. You could just assume that the cache is already available and react to the exception you get back by generating the cache if it's not. That way you'd need no mapping at all (and no central place to check if the cache ID maps to an existing cache). That flow is already there, but you'd need to add the logic to generate the cache, as the current code assumes that it has already built the cache (i.e. this handles the situation in which a region splits and the new RS doesn't have the cache yet).

Some considerations:
- at a minimum, we could have a global config for the TTL of the cache when this feature is enabled (so that it'd be a different config than the standard TTL config).
- at the finest granularity, you could even create a new hint that specifies the TTL, so you could specify it per query.
- we'd want to make it clear in documentation that the cache data would be stale once generated (until the TTL expires it).
- might consider having a new table-level property on which this feature could be enabled (or a table-specific TTL could be specified)

> Add a subquery cache that persists beyond the life of a query
> -------------------------------------------------------------
>
>          Key: PHOENIX-4666
>          URL: https://issues.apache.org/jira/browse/PHOENIX-4666
>      Project: Phoenix
>   Issue Type: Improvement
>     Reporter: Marcell Ortutay
>     Priority: Major
>
> The user list thread for additional context is here:
> [https://lists.apache.org/thread.html/e62a6f5d79bdf7cd238ea79aed8886816d21224d12b0f1fe9b6bb075@%3Cuser.phoenix.apache.org%3E]
>
> A Phoenix query may contain expensive subqueries, and moreover those expensive subqueries may be used across multiple different queries. While whole-result caching is possible at the application level, it is not possible to cache subresults in the application. This can cause bad performance for queries in which the subquery is the most expensive part of the query, and the application is powerless to do anything at the query level. It would be good if Phoenix provided a way to cache subquery results, as it would provide a significant performance gain.
> An illustrative example:
> {code}
> SELECT * FROM table1 JOIN (SELECT id_1 FROM large_table WHERE x = 10) expensive_result ON table1.id_1 = expensive_result.id_2 AND table1.id_1 = \{id}
> {code}
> In this case, the subquery "expensive_result" is expensive to compute, but it doesn't change between queries. The rest of the query does because of the \{id} parameter. This means the application can't cache it, but it would be good if there was a way to cache expensive_result.
> Note that there is currently a coprocessor-based "server cache", but the data in this "cache" is not persisted across queries. It is deleted after a TTL expires (30sec by default), or when the query completes.
> This issue is fairly high priority for us at 23andMe and we'd be happy to provide a patch with some guidance from Phoenix maintainers. We are currently putting together a design document for a solution, and we'll post it to this Jira ticket for review in a few days.
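To make the hash-as-cache-ID idea discussed above concrete, here is a minimal sketch of deriving a deterministic cache ID from the subquery text, so the same subquery hashes to the same ID across different client queries. This is my own illustration under stated assumptions, not Phoenix code: the class name, method name, and choice of SHA-256 are all hypothetical (Phoenix today generates a random per-query cache ID instead).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: derive a deterministic cache ID from the subquery SQL,
// so identical subqueries issued by any client map to the same server cache.
public class SubqueryCacheId {
    public static byte[] cacheId(String subquerySql) {
        try {
            // Hash the (lightly normalized) statement text. A real version would
            // normalize more aggressively (whitespace, case, bind parameters).
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(subquerySql.trim().getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is guaranteed to exist on every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

Two clients running the same subquery would then probe the server-side cache under the same ID, and only rebuild and re-send the cache when the server reports a miss.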
[jira] [Commented] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query
[ https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408984#comment-16408984 ]

James Taylor commented on PHOENIX-4666:
---------------------------------------

I like the simplicity of your design, [~ortutay], in using the hash as the cache ID. You could just assume that the cache is already available and react to the exception you get back by generating the cache if it's not. That way you'd need no mapping at all (and no central place to check if the cache ID maps to an existing cache). That flow is already there, but you'd need to add the logic to generate the cache, as the current code assumes that it has already built the cache (i.e. this handles the situation in which a region splits and the new RS doesn't have the cache yet).

Some considerations:
- at a minimum, we could have a global config for the TTL of the cache when this feature is enabled (so that it'd be a different config than the standard TTL config).
- at the finest granularity, you could even create a new hint that specifies the TTL, so you could specify it per query.
- we'd want to make it clear in documentation that the cache data would be stale once generated (until the TTL expires it).
- might consider having a new table-level property on which this feature could be enabled (or a table-specific TTL could be specified)
[jira] [Commented] (PHOENIX-4579) Add a config to conditionally create Phoenix meta tables on first client connection
[ https://issues.apache.org/jira/browse/PHOENIX-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408939#comment-16408939 ]

ASF GitHub Bot commented on PHOENIX-4579:
-----------------------------------------

Github user ChinmaySKulkarni commented on the issue:

    https://github.com/apache/phoenix/pull/295

    @JamesRTaylor Oops, done. Thanks!

> Add a config to conditionally create Phoenix meta tables on first client connection
> -----------------------------------------------------------------------------------
>
>          Key: PHOENIX-4579
>          URL: https://issues.apache.org/jira/browse/PHOENIX-4579
>      Project: Phoenix
>   Issue Type: New Feature
>     Reporter: Mujtaba Chohan
>     Assignee: Chinmay Kulkarni
>     Priority: Major
>  Attachments: PHOENIX-4579.patch
>
> Currently we create/modify Phoenix meta tables on first client connection. Adding a property to make it configurable (with default true, as it is currently implemented).
> With this property set to false, it will avoid the lockstep upgrade requirement for all clients when changing meta properties using PHOENIX-4575, as this property can be flipped back on once all the clients are upgraded.
[GitHub] phoenix issue #295: PHOENIX-4579: Add a config to conditionally create Phoen...
Github user ChinmaySKulkarni commented on the issue:

    https://github.com/apache/phoenix/pull/295

    @JamesRTaylor Oops, done. Thanks!

---
[jira] [Commented] (PHOENIX-4579) Add a config to conditionally create Phoenix meta tables on first client connection
[ https://issues.apache.org/jira/browse/PHOENIX-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408899#comment-16408899 ]

ASF GitHub Bot commented on PHOENIX-4579:
-----------------------------------------

Github user JamesRTaylor commented on the issue:

    https://github.com/apache/phoenix/pull/295

    Please amend your commit message to be in the form: PHOENIX-4579. That way, PR comments will automatically appear as JIRA comments.
[GitHub] phoenix issue #295: WIP: Added check for system catalog timestamp while doin...
Github user ChinmaySKulkarni commented on the issue:

    https://github.com/apache/phoenix/pull/295

    @twdsilva @JamesRTaylor please take a look and refer to my comments on the JIRA. Thanks!

---
[jira] [Commented] (PHOENIX-4579) Add a config to conditionally create Phoenix meta tables on first client connection
[ https://issues.apache.org/jira/browse/PHOENIX-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408871#comment-16408871 ]

Chinmay Kulkarni commented on PHOENIX-4579:
-------------------------------------------

[~tdsilva] and [~jamestaylor] please take a look at my PR: [https://github.com/apache/phoenix/pull/295] and we can discuss if the approach seems correct.

*Work done in this PR:*
# Modify the _GetVersionResponse_ protobuf declaration to return an optional long field corresponding to the system catalog timestamp.
# Use the timestamp returned from the _getVersion()_ call inside _checkClientServerCompatibility()_ to decide whether or not the system catalog table needs to be updated. Of course, if this is the first connection to the server, there is no entry for system catalog inside system catalog, so this timestamp does not exist yet and is thus ignored.
# Check for the presence of SYSTEM:CATALOG or SYSTEM.CATALOG and check client-server compatibility in both cases, instead of only when namespace mapping is enabled.
# If the physical system catalog table exists, we no longer try to create it, and thus the code that updates the HBase metadata will no longer run. *Is this a problem?*
# In case we detect that system catalog needs to be upgraded: if {{phoenix.autoupgrade.enabled}} is true, we upgrade system tables; otherwise, we just log an error asking the user to manually run "EXECUTE UPGRADE". Before this, we have set {{upgradeRequired}} to true, so the user will be blocked from executing any statements except for "EXECUTE UPGRADE".

*Testing plan:*
* When namespace mapping is enabled, check that SYSTEM:CATALOG is created when connecting for the first time.
* When namespace mapping is disabled, check that SYSTEM.CATALOG is created when connecting for the first time.
* Check that subsequent connections between a jar-compatible client and server do not try to create system catalog again.
* Connect a lower-version client to a newer-version server -> should connect as is, without trying to upgrade system catalog.
* Connect a new client to a newly upgraded cluster which has not been connected to yet, i.e. system catalog on this cluster is still old. In this case, system catalog should get upgraded only if {{phoenix.autoupgrade.enabled}} is true.
* Ensure that the "EXECUTE UPGRADE" command works after these changes.

Please let me know if you guys have any suggestions. Thanks.
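The timestamp-comparison decision described in the work items and testing plan above can be sketched roughly as follows. This is an illustrative outline only, not the PR's actual code: the class, method, and enum names are hypothetical stand-ins for the logic inside _checkClientServerCompatibility()_.

```java
// Hypothetical sketch of the upgrade decision: compare the SYSTEM.CATALOG
// timestamp returned by getVersion() against the timestamp the client expects.
public class CatalogUpgradeDecision {
    public enum Action { NONE, AUTO_UPGRADE, REQUIRE_EXECUTE_UPGRADE }

    public static Action decide(Long serverCatalogTimestamp,
                                long clientCatalogTimestamp,
                                boolean autoUpgradeEnabled) {
        // First connection: no entry for system catalog inside system catalog yet,
        // so the timestamp is absent and ignored.
        if (serverCatalogTimestamp == null) {
            return Action.NONE;
        }
        // Server catalog is older than what this client ships with: an upgrade is
        // needed, either automatically or via a manual "EXECUTE UPGRADE".
        if (serverCatalogTimestamp < clientCatalogTimestamp) {
            return autoUpgradeEnabled ? Action.AUTO_UPGRADE
                                      : Action.REQUIRE_EXECUTE_UPGRADE;
        }
        // Lower-version client against a newer server: connect as is,
        // without trying to upgrade system catalog.
        return Action.NONE;
    }
}
```

The key design point is that the server only reports its catalog timestamp; the client decides whether to upgrade, which is what lets old clients keep connecting without a lockstep upgrade.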
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
It is just a bit confusing to people who want to attend these events because it is placed in the agenda section... It looks like HBaseCon/PhoenixCon is part of the DataWorks Summit, so maybe some attendees may want to use the ticket for the DataWorks Summit to enter HBaseCon/PhoenixCon? So advertising is good, but maybe move it to another section on the page?

Thanks.

2018-03-22 0:29 GMT+08:00 Josh Elser:
> Hey Duo,
>
> Thanks for digging into this. I am not surprised by it -- last I talked to the folks in charge of the website, they mentioned that they would cross-advertise for us as well. Seems like their web-staff is a bit faster than I am though. I was planning to point them to our event page after we had made our announcement, and hopefully they will just link back to us.
>
> Any specific concerns? I think free advertising is good, but the intent is not for HBaseCon/PhoenixCon to be "a part of" DataWorks Summit. I think perhaps adding something like "HBaseCon and PhoenixCon (community events)" would help? Give me some concrete suggestions please :)
>
> On 3/21/18 4:13 AM, 张铎(Duo Zhang) wrote:
>> https://dataworkssummit.com/san-jose-2018/
>>
>> Here, in the Agenda section, HBaseCon and PhoenixCon are also included.
>>
>> Monday, June 18
>> 8:30 AM - 5:00 PM   Pre-event Training
>> 8:30 AM - 5:00 PM   HBaseCon and PhoenixCon
>> 12:00 PM – 7:00 PM  Registration
>> 6:00 PM – 8:00 PM   Meetups
>>
>> Is this intentional?
>>
>> 2018-03-21 14:59 GMT+08:00 Stack:
>>> On Tue, Mar 20, 2018 at 7:51 PM, Josh Elser wrote:
>>>
>>> Hi all,
>>>
>>> I've published a new website for the upcoming event in June in California at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are identical. I've not yet updated any links on either website to link to the new page. I'd appreciate if folks can give their feedback on anything outwardly wrong, incorrect, etc.
>>>
>>> If folks are happy, then I'll work on linking from the main websites, and coordinating an official announcement via mail lists, social media, etc. The website is generated from [3]. If you really want to be my best-friend, let me know about the above things which are wrong via pull-request ;)
>>>
>>> - Josh
>>>
>>> [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/
>>> [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/
>>> [3] https://github.com/joshelser/hbasecon-jekyll
>>>
>>> Thanks Josh for doing this.
>>>
>>> Do they have to be conflated so? Each community is doing their own conference. This page/announcement makes it look like they have been squashed together.
>>>
>>> Thanks,
>>> S
[jira] [Assigned] (PHOENIX-4668) Remove unnecessary table descriptor modification for SPLIT_POLICY column
[ https://issues.apache.org/jira/browse/PHOENIX-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinmay Kulkarni reassigned PHOENIX-4668:
-----------------------------------------
    Assignee: Chinmay Kulkarni

> Remove unnecessary table descriptor modification for SPLIT_POLICY column
> -------------------------------------------------------------------------
>
>          Key: PHOENIX-4668
>          URL: https://issues.apache.org/jira/browse/PHOENIX-4668
>      Project: Phoenix
>   Issue Type: Improvement
>     Reporter: Chinmay Kulkarni
>     Assignee: Chinmay Kulkarni
>     Priority: Major
>
> Inside _ConnectionQueryServicesImpl.ensureTableCreated()_, we modify the table descriptor with newDesc.setValue(HTableDescriptor.SPLIT_POLICY, MetaDataSplitPolicy.class.getName()). However, this is already specified in the CREATE statement DDL for system tables, so we can remove it.
[jira] [Created] (PHOENIX-4668) Remove unnecessary table descriptor modification for SPLIT_POLICY column
Chinmay Kulkarni created PHOENIX-4668:
--------------------------------------

         Summary: Remove unnecessary table descriptor modification for SPLIT_POLICY column
             Key: PHOENIX-4668
             URL: https://issues.apache.org/jira/browse/PHOENIX-4668
         Project: Phoenix
      Issue Type: Improvement
        Reporter: Chinmay Kulkarni

Inside _ConnectionQueryServicesImpl.ensureTableCreated()_, we modify the table descriptor with newDesc.setValue(HTableDescriptor.SPLIT_POLICY, MetaDataSplitPolicy.class.getName()). However, this is already specified in the CREATE statement DDL for system tables, so we can remove it.
[jira] [Updated] (PHOENIX-4667) Create index on a view should return error if any of the REPLICATION_SCOPE/TTL/KEEP_DELETED_CELLS attributes are set
[ https://issues.apache.org/jira/browse/PHOENIX-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshita Malhotra updated PHOENIX-4667:
--------------------------------------
    Description: As the physical view index table is shared, create index on a view statements should return an error if the user tries to set attributes which affect the physical table, such as REPLICATION_SCOPE, TTL, KEEP_DELETED_CELLS etc.
  (was: As the physical view index table is shared, create index on a view statements should return error if the user tries to set attributes which affect the physical table such as SOR settings, TTL, KEEP_DELETED_CELLS etc.)
        Summary: Create index on a view should return error if any of the REPLICATION_SCOPE/TTL/KEEP_DELETED_CELLS attributes are set
  (was: Create index on a view should return error if any of the SOR/TTL/KEEP_DELETED_CELLS attributes are set)

> Create index on a view should return error if any of the REPLICATION_SCOPE/TTL/KEEP_DELETED_CELLS attributes are set
> ---------------------------------------------------------------------------------------------------------------------
>
>          Key: PHOENIX-4667
>          URL: https://issues.apache.org/jira/browse/PHOENIX-4667
>      Project: Phoenix
>   Issue Type: Bug
>     Reporter: Akshita Malhotra
>     Priority: Minor
>       Labels: index, schema
>      Fix For: 4.13.0, 4.14.0
>
> As the physical view index table is shared, create index on a view statements should return an error if the user tries to set attributes which affect the physical table, such as REPLICATION_SCOPE, TTL, KEEP_DELETED_CELLS etc.
[jira] [Created] (PHOENIX-4667) Create index on a view should return error if any of the SOR/TTL/KEEP_DELETED_CELLS attributes are set
Akshita Malhotra created PHOENIX-4667:
--------------------------------------

         Summary: Create index on a view should return error if any of the SOR/TTL/KEEP_DELETED_CELLS attributes are set
             Key: PHOENIX-4667
             URL: https://issues.apache.org/jira/browse/PHOENIX-4667
         Project: Phoenix
      Issue Type: Bug
        Reporter: Akshita Malhotra
         Fix For: 4.13.0, 4.14.0

As the physical view index table is shared, create index on a view statements should return an error if the user tries to set attributes which affect the physical table, such as SOR settings, TTL, KEEP_DELETED_CELLS etc.
[jira] [Comment Edited] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query
[ https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408720#comment-16408720 ]

Marcell Ortutay edited comment on PHOENIX-4666 at 3/21/18 10:46 PM:
--------------------------------------------------------------------

Thanks for the input [~maryannxue]. My current implementation is here: [https://github.com/ortutay23andme/phoenix/tree/4.7.0-HBase-1.1], and in particular this is my hacky patch: [https://github.com/ortutay23andme/phoenix/commit/04c96f672eb4bcdccec27f124373be766f8dd5af]. (Implemented on 4.7 for unrelated reasons, but the same idea I think is transferable to HEAD.)

Instead of a random cache ID, it takes a hash of the query statement and uses that as the cache ID. Each Phoenix client maintains its own memory of which cache IDs have already been executed (this is not ideal, but it was easy to implement this way).

If I'm understanding your proposal, the Phoenix client would attempt to use a cache ID with the expectation that it exists on region servers. The region server would throw an exception if the cache ID is not found, which indicates to the Phoenix client that it should evaluate the subquery as usual.

was (Author: ortutay):
Thanks for the input [~maryannxue]. My current implementation is here: [https://github.com/ortutay23andme/phoenix/tree/4.7.0-HBase-1.1], and in particular this is my hacky patch: [https://github.com/ortutay23andme/phoenix/commit/04c96f672eb4bcdccec27f124373be766f8dd5af].

Instead of a random cache ID, it takes a hash of the query statement and uses that as the cache ID. Each Phoenix client maintains its own memory of which cache IDs have already been executed (this is not ideal, but it was easy to implement this way).

If I'm understanding your proposal, the Phoenix client would attempt to use a cache ID with the expectation that it exists on region servers. The region server would throw an exception if the cache ID is not found, which indicates to the Phoenix client that it should evaluate the subquery as usual.
[jira] [Commented] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query
[ https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408720#comment-16408720 ]

Marcell Ortutay commented on PHOENIX-4666:
------------------------------------------

Thanks for the input [~maryannxue]. My current implementation is here: [https://github.com/ortutay23andme/phoenix/tree/4.7.0-HBase-1.1], and in particular this is my hacky patch: [https://github.com/ortutay23andme/phoenix/commit/04c96f672eb4bcdccec27f124373be766f8dd5af].

Instead of a random cache ID, it takes a hash of the query statement and uses that as the cache ID. Each Phoenix client maintains its own memory of which cache IDs have already been executed (this is not ideal, but it was easy to implement this way).

If I'm understanding your proposal, the Phoenix client would attempt to use a cache ID with the expectation that it exists on region servers. The region server would throw an exception if the cache ID is not found, which indicates to the Phoenix client that it should evaluate the subquery as usual.
[jira] [Commented] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query
[ https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408676#comment-16408676 ] Maryann Xue commented on PHOENIX-4666: -- Here's my initial thought: # As opposed to maintaining a map from a subquery to a persistent cache, we could have a map from "cache_id" to a persistent cache. What we are doing now for "cache_id" should be good enough - a client-side-generated unique id. # We'll have a certain property to indicate the use of a persistent cache, so that a HashJoinPlan will re-use the "cache_id" (after it is generated on the first call) without having to re-send and re-build the caches on region servers. # The user will have to hold on to the compiled statement and re-execute it. # There's no need for centralized cache management at this point. A region server will throw a specific exception if it is unable to find the requested cache; then, on the client side, the sub-query will be re-evaluated and the cache will be broadcast to all region servers again with the same cache-id. Not sure if this is similar to your version, but it would be nice to see whatever you already have and we can go from there. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
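The miss-and-resend protocol proposed in the comment above (the server throws when a cache ID is unknown; the client re-evaluates the subquery and broadcasts it under the same ID, then retries) could be sketched as below. All types and method names here are hypothetical stand-ins, not Phoenix API:

```java
import java.util.function.Supplier;

// Sketch of the retry-on-miss protocol: attempt the join against an
// existing cache ID; if the region server reports the cache missing,
// rebuild and resend it once under the same ID, then retry the scan.
// Server, CacheNotFoundException, and the method names are illustrative.
public class PersistentCacheClient {
    // Thrown by the (hypothetical) server call when the cache ID is unknown.
    public static class CacheNotFoundException extends Exception {}

    public interface Server {
        String scanWithCache(byte[] cacheId) throws CacheNotFoundException;
        void sendCache(byte[] cacheId, byte[] subqueryResult);
    }

    public static String execute(Server server, byte[] cacheId,
                                 Supplier<byte[]> evalSubquery) {
        try {
            return server.scanWithCache(cacheId);          // fast path: cache hit
        } catch (CacheNotFoundException miss) {
            server.sendCache(cacheId, evalSubquery.get()); // re-evaluate and broadcast
            try {
                return server.scanWithCache(cacheId);      // retry with the same ID
            } catch (CacheNotFoundException stillMissing) {
                throw new IllegalStateException("cache missing after resend", stillMissing);
            }
        }
    }
}
```

The key property is that the expensive `evalSubquery` step only runs on a miss, so repeated executions of the compiled statement pay it at most once per cache lifetime.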
[jira] [Commented] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query
[ https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408666#comment-16408666 ] Marcell Ortutay commented on PHOENIX-4666: -- As I mentioned above, we're working on a design proposal for this internally at 23andMe, and there's one big decision I wanted to get feedback on. There is currently a "server cache" that is used by the hash join process in Phoenix. Hash join tables are broadcast to all region servers that need them, and the hash joining happens via coprocessor. This cache is deleted after the query ends. My first thought for a persistent cache was to re-use the server cache, extending the TTL and changing the key ("cacheId") generation. I implemented this as a hacky proof-of-concept and it worked quite well; performance was much improved. However, I'm wondering if a separate cache makes more sense. The current server cache has a different use case than a persistent cache, and as such it may be a good idea to separate the two. Some ways in which they differ: - A persistent cache performs eviction when there is no space left. The server cache raises an exception, and the user must do a sort-merge join instead. - Users may want to configure the two differently, e.g. allocate more space for the persistent cache than for the server cache, and set a higher TTL. - The server cache data must be available on all region servers doing the hash join. In contrast, the persistent cache only needs one copy of the data across the system (i.e. across all region servers) until the data is needed. Doing this would be more space efficient, but result in more network transfer. - You could in theory have a pluggable backend for the persistent cache, e.g. memcached. That said, there are advantages to keeping it all in the server cache: - Simpler implementation; it does not add a new system to Phoenix. - Faster in the case of a cache hit, since there is no network transfer involved. Would love to get some feedback / opinions on this, thanks! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
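The first difference listed in the comment above, evicting under memory pressure rather than raising an exception, is essentially an LRU policy. A minimal sketch using a plain Java `LinkedHashMap` follows; capacity is counted in entries for simplicity (a real cache would track bytes), and this is an illustration of the policy, not Phoenix's implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: LRU eviction, the behaviour a persistent subresult cache would
// want, in contrast to the existing server cache, which fails when full.
// Entry-count capacity is a simplification; real caches budget bytes.
public class LruSubresultCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruSubresultCache(int maxEntries) {
        super(16, 0.75f, true); // access-order: get() refreshes recency
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least recently used entry
    }
}
```

With eviction in place a full cache degrades to extra recomputation instead of a failed query, which is the operational difference the comment is pointing at.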
[jira] [Commented] (PHOENIX-4661) Repeatedly issuing DROP TABLE fails with "java.lang.IllegalArgumentException: Table qualifier must not be empty"
[ https://issues.apache.org/jira/browse/PHOENIX-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408657#comment-16408657 ] Sergey Soldatov commented on PHOENIX-4661: -- I have a strong feeling that the changes in PhoenixAccessController.java are not related to this issue and need to be reverted. > Repeatedly issuing DROP TABLE fails with "java.lang.IllegalArgumentException: > Table qualifier must not be empty" > > > Key: PHOENIX-4661 > URL: https://issues.apache.org/jira/browse/PHOENIX-4661 > Project: Phoenix > Issue Type: Bug >Reporter: Josh Elser >Assignee: Ankit Singhal >Priority: Major > Fix For: 4.14.0, 5.0.0 > > Attachments: PHOENIX-4661.patch, PHOENIX-4661_v1.patch, > PHOENIX-4661_v2.patch > > > Noticed this when trying to run the python tests against a 5.0 install > {code:java} > > create table josh(pk varchar not null primary key); > > drop table if exists josh; > > drop table if exists josh;{code} > We'd expect the first two commands to successfully execute, and the third to > do nothing. 
However, the third command fails: > {code:java} > org.apache.phoenix.exception.PhoenixIOException: > org.apache.hadoop.hbase.DoNotRetryIOException: JOSH: Table qualifier must not > be empty > at > org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:98) > at > org.apache.phoenix.coprocessor.MetaDataEndpointImpl.dropTable(MetaDataEndpointImpl.java:2034) > at > org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16297) > at > org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8005) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2394) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2376) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41556) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: java.lang.IllegalArgumentException: Table qualifier must not be > empty > at > org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:186) > at > org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:156) > at org.apache.hadoop.hbase.TableName.(TableName.java:346) > at > org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:382) > at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:443) > at > org.apache.phoenix.coprocessor.MetaDataEndpointImpl.dropTable(MetaDataEndpointImpl.java:1989) > ... 
9 more > at > org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:122) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1301) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1264) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.dropTable(ConnectionQueryServicesImpl.java:1515) > at > org.apache.phoenix.schema.MetaDataClient.dropTable(MetaDataClient.java:2877) > at > org.apache.phoenix.schema.MetaDataClient.dropTable(MetaDataClient.java:2804) > at > org.apache.phoenix.jdbc.PhoenixStatement$ExecutableDropTableStatement$1.execute(PhoenixStatement.java:1117) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:396) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:379) > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:366) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1758) > at sqlline.Commands.execute(Commands.java:822) > at sqlline.Commands.sql(Commands.java:732) > at sqlline.SqlLine.dispatch(SqlLine.java:813) > at sqlline.SqlLine.begin(SqlLine.java:686) > at sqlline.SqlLine.start(SqlLine.java:398) > at sqlline.SqlLine.main(SqlLine.java:291) > Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: > org.apache.hadoop.hbase.DoNotRetryIOException: JOSH: Table qualifier must not > be empty > at >
[jira] [Created] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query
Marcell Ortutay created PHOENIX-4666: Summary: Add a subquery cache that persists beyond the life of a query Key: PHOENIX-4666 URL: https://issues.apache.org/jira/browse/PHOENIX-4666 Project: Phoenix Issue Type: Improvement Reporter: Marcell Ortutay -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4661) Repeatedly issuing DROP TABLE fails with "java.lang.IllegalArgumentException: Table qualifier must not be empty"
[ https://issues.apache.org/jira/browse/PHOENIX-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408638#comment-16408638 ] Josh Elser commented on PHOENIX-4661: - Thanks for the heads-up, [~pboado]. Maybe [~ankit.singhal] can look at this tomorrow? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (PHOENIX-4661) Repeatedly issuing DROP TABLE fails with "java.lang.IllegalArgumentException: Table qualifier must not be empty"
[ https://issues.apache.org/jira/browse/PHOENIX-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser reopened PHOENIX-4661: - -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4661) Repeatedly issuing DROP TABLE fails with "java.lang.IllegalArgumentException: Table qualifier must not be empty"
[ https://issues.apache.org/jira/browse/PHOENIX-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408632#comment-16408632 ] Pedro Boado commented on PHOENIX-4661: -- Hi guys, I'm getting errors in branches 4.x-HBase-1.2 and 4.x-cdh5.11.2 (I haven't checked others) for (at least) SystemTablePermissionsIT and ChangePermissionsIT after this commit - same errors as in the jenkins compilation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4662) NullPointerException in TableResultIterator.java on cache resend
[ https://issues.apache.org/jira/browse/PHOENIX-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated PHOENIX-4662: Fix Version/s: 5.0.0 4.14.0 > NullPointerException in TableResultIterator.java on cache resend > > > Key: PHOENIX-4662 > URL: https://issues.apache.org/jira/browse/PHOENIX-4662 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.12.0 >Reporter: Csaba Skrabak >Assignee: Csaba Skrabak >Priority: Major > Fix For: 4.14.0, 5.0.0 > > Attachments: PHOENIX-4662.patch > > > In the fix for PHOENIX-4010, there is a potential null dereference. Turned > out when we ran a previous version of HashJoinIT with PHOENIX-4010 backported. > The caches field is initialized to null and may be dereferenced after > "Retrying when Hash Join cache is not found on the server ,by sending the > cache again". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4662) NullPointerException in TableResultIterator.java on cache resend
[ https://issues.apache.org/jira/browse/PHOENIX-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408535#comment-16408535 ] Josh Elser commented on PHOENIX-4662: - Looks pretty straightforward -- we have a constructor which can set {{caches}} to null :) Will run some tests locally and commit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
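The null-dereference pattern behind PHOENIX-4662 (one constructor leaves the {{caches}} field null, and the cache-resend retry path later dereferences it) can be sketched as below, together with the obvious defensive fix of initializing lazily before use. The class and method names are illustrative, not the actual TableResultIterator code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of PHOENIX-4662: a field that one constructor legally leaves
// null is dereferenced later, on the "resend the hash join cache" retry
// path. Names are illustrative, not actual Phoenix classes.
public class CacheHolder {
    private List<byte[]> caches; // may be null depending on constructor used

    public CacheHolder(List<byte[]> caches) {
        this.caches = caches; // null is a legal argument here
    }

    // Defensive fix: initialize lazily instead of dereferencing null
    // when the cache has to be resent.
    public void addCache(byte[] cache) {
        if (caches == null) {
            caches = new ArrayList<>();
        }
        caches.add(cache);
    }

    public int cacheCount() {
        return caches == null ? 0 : caches.size();
    }
}
```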
[jira] [Created] (PHOENIX-4665) Document python driver on website
Josh Elser created PHOENIX-4665: --- Summary: Document python driver on website Key: PHOENIX-4665 URL: https://issues.apache.org/jira/browse/PHOENIX-4665 Project: Phoenix Issue Type: Task Reporter: Josh Elser Assignee: Josh Elser Fix For: 4.14.0, 5.0.0 Prior to these releases, we should make sure that the Python driver has documentation on the website, not just the docs in the source code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PHOENIX-4664) Time Python driver tests failing
Josh Elser created PHOENIX-4664: --- Summary: Time Python driver tests failing Key: PHOENIX-4664 URL: https://issues.apache.org/jira/browse/PHOENIX-4664 Project: Phoenix Issue Type: Bug Reporter: Josh Elser Assignee: Josh Elser Fix For: 4.14.0, 5.0.0 {noformat} test_time (phoenixdb.tests.test_types.TypesTest) ... FAIL test_timestamp (phoenixdb.tests.test_types.TypesTest) ... FAIL {noformat} These two tests seem to be failing. Ankit thought it might be related to timezones. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (PHOENIX-4636) Include python-phoenixdb into Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser resolved PHOENIX-4636. - Resolution: Fixed Fix Version/s: 5.0.0 4.14.0 > Include python-phoenixdb into Phoenix > - > > Key: PHOENIX-4636 > URL: https://issues.apache.org/jira/browse/PHOENIX-4636 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal >Assignee: Josh Elser >Priority: Major > Fix For: 4.14.0, 5.0.0 > > > Include [https://github.com/lalinsky/python-phoenixdb] in Phoenix. > Details about the library can be found at:- > [http://python-phoenixdb.readthedocs.io/en/latest/] > Discussion thread:- > [https://www.mail-archive.com/dev@phoenix.apache.org/msg45424.html] > commit:- > [https://github.com/lalinsky/python-phoenixdb/commit/1bb34488dd530ca65f91b29ef16aa7b71f26b806] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4636) Include python-phoenixdb into Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408455#comment-16408455 ] Josh Elser commented on PHOENIX-4636: - Forgot that I wanted to write a standalone program and update the readme. This all went well, so I'll push this into the docs and commit. Thanks, Ankit! > Include python-phoenixdb into Phoenix > - > > Key: PHOENIX-4636 > URL: https://issues.apache.org/jira/browse/PHOENIX-4636 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal >Assignee: Josh Elser >Priority: Major > > Include [https://github.com/lalinsky/python-phoenixdb] in Phoenix. > Details about the library can be found at:- > [http://python-phoenixdb.readthedocs.io/en/latest/] > Discussion thread:- > [https://www.mail-archive.com/dev@phoenix.apache.org/msg45424.html] > commit:- > [https://github.com/lalinsky/python-phoenixdb/commit/1bb34488dd530ca65f91b29ef16aa7b71f26b806]
[jira] [Commented] (PHOENIX-4651) Support ALTER TABLE ... MODIFY COLUMN
[ https://issues.apache.org/jira/browse/PHOENIX-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408401#comment-16408401 ] Thomas D'Silva commented on PHOENIX-4651: - If we are renaming a column on a table that uses column encoding, then we just need to change the metadata. If we are changing data types, we would need to rewrite data, even if the table uses column encoding. FYI [~karanmehta93] > Support ALTER TABLE ... MODIFY COLUMN > - > > Key: PHOENIX-4651 > URL: https://issues.apache.org/jira/browse/PHOENIX-4651 > Project: Phoenix > Issue Type: New Feature >Affects Versions: 4.10.0 >Reporter: Jepson >Priority: Critical > Original Estimate: 504h > Remaining Estimate: 504h > > Modifying the column type length is very inconvenient: drop first, then add. > Such as: > alter table jydw.test drop column name; > alter table jydw.test add name varchar(256); > The ALTER TABLE ... MODIFY COLUMN SQL is not supported.
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
On 3/21/18 12:44 PM, Stack wrote: On Wed, Mar 21, 2018 at 9:26 AM, Josh Elser wrote: On 3/21/18 2:59 AM, Stack wrote: On Tue, Mar 20, 2018 at 7:51 PM, Josh Elser wrote: Hi all, I've published a new website for the upcoming event in June in California at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are identical. I've not yet updated any links on either website to link to the new page. I'd appreciate if folks can give their feedback on anything outwardly wrong, incorrect, etc. If folks are happy, then I'll work on linking from the main websites, and coordinating an official announcement via mail lists, social media, etc. The website is generated from [3]. If you really want to be my best-friend, let me know about the above things which are wrong via pull-request ;) - Josh [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ [3] https://github.com/joshelser/hbasecon-jekyll Thanks Josh for doing this. Do they have to be conflated so? Each community is doing their own conference. This page/announcement makes it look like they have been squashed together. Thanks, S You have any concrete suggestions I can change? You have hbasecon+phoenixcon which to me reads as a combined conference. I thought we wanted to avoid this sort of messaging. Probably best to have separate announcement pages. I was originally intending to have separate pages, but I scrapped that because: * I didn't have the time to make two sites (one took longer than I expected) * I wasn't seeing content differentiation between the two I'm hoping that, without getting word-y, there's a way that I can better express this? I definitely struggled with how to refer to these. Would something like having the HBase "version" read only "HBaseCon", and the Phoenix "version", "PhoenixCon" make you happier? Does the "About" section read as to what you were expecting or would you like to see more separation there too?
[jira] [Commented] (PHOENIX-4659) Use coprocessor API to write local transactional indexes
[ https://issues.apache.org/jira/browse/PHOENIX-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408364#comment-16408364 ] Thomas D'Silva commented on PHOENIX-4659: - +1 > Use coprocessor API to write local transactional indexes > > > Key: PHOENIX-4659 > URL: https://issues.apache.org/jira/browse/PHOENIX-4659 > Project: Phoenix > Issue Type: Bug >Reporter: James Taylor >Assignee: James Taylor >Priority: Major > Fix For: 4.14.0, 5.0.0 > > Attachments: PHOENIX-4659_v1.patch > > > Instead of writing local indexes through a separate thread pool, use the > coprocessor API to write them so that they are row-level transactionally > consistent.
[jira] [Commented] (PHOENIX-4370) Surface hbase metrics from perconnection to global metrics
[ https://issues.apache.org/jira/browse/PHOENIX-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408362#comment-16408362 ] Ethan Wang commented on PHOENIX-4370: - I noticed that Jenkins is SUCCESS in Phoenix-4.x-HBase-1.3 integration, but FAILURE at PreCommit-PHOENIX. Is this a release blocker? [~mujtabachohan] please advise > Surface hbase metrics from perconnection to global metrics > -- > > Key: PHOENIX-4370 > URL: https://issues.apache.org/jira/browse/PHOENIX-4370 > Project: Phoenix > Issue Type: Bug >Reporter: Ethan Wang >Assignee: Ethan Wang >Priority: Major > Attachments: PHOENIX-4370-v1.patch > > > Surface hbase metrics from perconnection to global metrics > Currently in phoenix client side, HBASE metrics are recorded and surfaced at > Per Connection level. PHOENIX-4370 allows it to be aggregated at global level, > i.e., aggregated across all connections within one JVM so that users can > evaluate it as stable metrics periodically. > COUNT_RPC_CALLS("rp", "Number of RPC calls"), > COUNT_REMOTE_RPC_CALLS("rr", "Number of remote RPC calls"), > COUNT_MILLS_BETWEEN_NEXTS("n", "Sum of milliseconds between sequential > next calls"), > COUNT_NOT_SERVING_REGION_EXCEPTION("nsr", "Number of > NotServingRegionException caught"), > COUNT_BYTES_REGION_SERVER_RESULTS("rs", "Number of bytes in Result > objects from region servers"), > COUNT_BYTES_IN_REMOTE_RESULTS("rrs", "Number of bytes in Result objects > from remote region servers"), > COUNT_SCANNED_REGIONS("rg", "Number of regions scanned"), > COUNT_RPC_RETRIES("rpr", "Number of RPC retries"), > COUNT_REMOTE_RPC_RETRIES("rrr", "Number of remote RPC retries"), > COUNT_ROWS_SCANNED("ws", "Number of rows scanned"), > COUNT_ROWS_FILTERED("wf", "Number of rows filtered");
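The per-connection-to-global aggregation described here can be sketched in illustrative Python (the class, function, and locking details are placeholders for the idea, not the actual Phoenix client API):

```python
from collections import Counter
from threading import Lock

# Sketch: each connection keeps its own scan-metric counters; on close they
# are folded into a single process-wide aggregate shared by all connections,
# which a user can then read as a stable, periodically evaluated metric.
GLOBAL_METRICS = Counter()
_global_lock = Lock()

class ConnectionMetrics:
    def __init__(self):
        self.counters = Counter()

    def increment(self, metric, delta=1):
        self.counters[metric] += delta

    def close(self):
        # Fold per-connection values into the global aggregate atomically.
        with _global_lock:
            GLOBAL_METRICS.update(self.counters)

c1, c2 = ConnectionMetrics(), ConnectionMetrics()
c1.increment("COUNT_RPC_CALLS", 5)
c2.increment("COUNT_RPC_CALLS", 3)
c2.increment("COUNT_ROWS_SCANNED", 100)
c1.close(); c2.close()
assert GLOBAL_METRICS["COUNT_RPC_CALLS"] == 8
assert GLOBAL_METRICS["COUNT_ROWS_SCANNED"] == 100
```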
[jira] [Commented] (PHOENIX-4619) Process transactional updates to local index on server-side
[ https://issues.apache.org/jira/browse/PHOENIX-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408343#comment-16408343 ] Thomas D'Silva commented on PHOENIX-4619: - +1 LGTM minor nit: In MutationState fix comment: {code} final Iterator indexIterator = // Only maintain tables with immutable rows through this client-side mechanism {code} > Process transactional updates to local index on server-side > --- > > Key: PHOENIX-4619 > URL: https://issues.apache.org/jira/browse/PHOENIX-4619 > Project: Phoenix > Issue Type: Bug >Reporter: James Taylor >Assignee: James Taylor >Priority: Major > Fix For: 4.14.0, 5.0.0 > > Attachments: PHOENIX-4619_v1.patch > > > For local indexes, we'll want to continue to process updates on the > server-side. After PHOENIX-4278, updates even for local indexes are occurring > on the client-side. The reason is that we know the updates to the index table > will be a local write and we can generate the write on the server side. > Having a separate RPC and sending the updates across the wire would be > tremendously inefficient. On top of that, we need the region boundary > information which we have already in the coprocessor, but would need to > retrieve it on the client side (with a likely race condition too if a split > occurs after we retrieve it). > To fix this, we need to modify PhoenixTxnIndexMutationGenerator such that it > can be used on the server-side as well.
The main change will be to change this > method signature to pass through an IndexMaintainer instead of a PTable > (which isn't available on the server-side): > {code} > public List getIndexUpdates(final PTable table, PTable index, > List dataMutations) throws IOException, SQLException { > {code} > I think this can be changed to the following instead and be used both client > and server side: > {code} > public List getIndexUpdates(final IndexMaintainer maintainer, > byte[] dataTableName, List dataMutations) throws IOException, > SQLException { > {code} > We can tweak the code that makes PhoenixTransactionalIndexer a noop for > clients >= 4.14 to have it execute if the index is a local index. The one > downside is that if there's a mix of local and global indexes on the same > table, the index update calculation will be done twice. I think having a mix > of index types would be rare, though, and we should advise against it. > There's also this code in UngroupedAggregateRegionObserver which needs to be > updated to write shadow cells for Omid: > {code} > } else if (buildLocalIndex) { > for (IndexMaintainer maintainer : > indexMaintainers) { > if (!results.isEmpty()) { > result.getKey(ptr); > ValueGetter valueGetter = > > maintainer.createGetterFromKeyValues( > > ImmutableBytesPtr.copyBytesIfNecessary(ptr), > results); > Put put = > maintainer.buildUpdateMutation(kvBuilder, > valueGetter, ptr, > results.get(0).getTimestamp(), > > env.getRegion().getRegionInfo().getStartKey(), > > env.getRegion().getRegionInfo().getEndKey()); > indexMutations.add(put); > } > } > result.setKeyValues(results); > {code} > This is the code that builds a local index initially (unlike the global index > code path which executes an UPSERT SELECT on the client side to do this > initial population).
[jira] [Commented] (PHOENIX-4636) Include python-phoenixdb into Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408228#comment-16408228 ] Josh Elser commented on PHOENIX-4636: - bq. These tests might be failing because of differences in timezone handling between Phoenix and Python's datetime module. (probably they were passing on lalinsky's CI because of the GMT timezone) Ahh! That would make sense. bq. +1, I would also suggest the same, we can create another Jira to track the issue. Sounds good. Will push this up. Very exciting :) > Include python-phoenixdb into Phoenix > - > > Key: PHOENIX-4636 > URL: https://issues.apache.org/jira/browse/PHOENIX-4636 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal >Assignee: Josh Elser >Priority: Major > > Include [https://github.com/lalinsky/python-phoenixdb] in Phoenix. > Details about the library can be found at:- > [http://python-phoenixdb.readthedocs.io/en/latest/] > Discussion thread:- > [https://www.mail-archive.com/dev@phoenix.apache.org/msg45424.html] > commit:- > [https://github.com/lalinsky/python-phoenixdb/commit/1bb34488dd530ca65f91b29ef16aa7b71f26b806]
[jira] [Commented] (PHOENIX-4636) Include python-phoenixdb into Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408220#comment-16408220 ] Ankit Singhal commented on PHOENIX-4636: {quote}test_time (phoenixdb.tests.test_types.TypesTest) ... FAIL{quote} {quote}test_timestamp (phoenixdb.tests.test_types.TypesTest) ... FAIL{quote} These tests might be failing because of differences in timezone handling between Phoenix and Python's datetime module. (probably they were passing on lalinsky's CI because of the GMT timezone) {quote}but I'm thinking I might just push it and deal with them later. {quote} +1, I would also suggest the same, we can create another Jira to track the issue. > Include python-phoenixdb into Phoenix > - > > Key: PHOENIX-4636 > URL: https://issues.apache.org/jira/browse/PHOENIX-4636 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal >Assignee: Josh Elser >Priority: Major > > Include [https://github.com/lalinsky/python-phoenixdb] in Phoenix. > Details about the library can be found at:- > [http://python-phoenixdb.readthedocs.io/en/latest/] > Discussion thread:- > [https://www.mail-archive.com/dev@phoenix.apache.org/msg45424.html] > commit:- > [https://github.com/lalinsky/python-phoenixdb/commit/1bb34488dd530ca65f91b29ef16aa7b71f26b806]
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
On Wed, Mar 21, 2018 at 9:26 AM, Josh Elser wrote: > > > On 3/21/18 2:59 AM, Stack wrote: > >> On Tue, Mar 20, 2018 at 7:51 PM, Josh Elser wrote: >> >> Hi all, >>> >>> I've published a new website for the upcoming event in June in California >>> at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are >>> identical. >>> >>> I've not yet updated any links on either website to link to the new page. >>> I'd appreciate if folks can give their feedback on anything outwardly >>> wrong, incorrect, etc. If folks are happy, then I'll work on linking from >>> the main websites, and coordinating an official announcement via mail >>> lists, social media, etc. >>> >>> The website is generated from [3]. If you really want to be my >>> best-friend, let me know about the above things which are wrong via >>> pull-request ;) >>> >>> - Josh >>> >>> [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ >>> [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ >>> [3] https://github.com/joshelser/hbasecon-jekyll >>> >>> >> >> Thanks Josh for doing this. >> >> Do they have to be conflated so? Each community is doing their own >> conference. This page/announcement makes it look like they have been >> squashed together. >> >> Thanks, >> S >> > > You have any concrete suggestions I can change? You have hbasecon+phoenixcon which to me reads as a combined conference. I thought we wanted to avoid this sort of messaging. Probably best to have separate announcement pages. > Was trying to write this to respect the concerns you gave early on. My > intent was to use a theme: "two events, one physical location". > > Using the same CFP and website content is mostly in hopes of saving myself > time (as I wanted to get this done this past weekend and failed..) > Understood. If a CFP is wanted, can use the easychair system for CFPs. Thanks, S
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
Hey Duo, Thanks for digging into this. I am not surprised by it -- last I talked to the folks in charge of the website, they mentioned that they would cross-advertise for us as well. Seems like their web-staff is a bit faster than I am though. I was planning to point them to our event page after we had made our announcement, and hopefully they will just link back to us. Any specific concerns? I think free-advertising is good, but the intent is not for HBaseCon/PhoenixCon to be "a part of" DataWorks Summit. I think perhaps adding something like "HBaseCon and PhoenixCon (community events)" would help? Give me some concrete suggestions please :) On 3/21/18 4:13 AM, 张铎(Duo Zhang) wrote: https://dataworkssummit.com/san-jose-2018/ Here, in the Agenda section, HBaseCon and PhoenixCon are also included. Monday, June 18 8:30 AM - 5:00 PM Pre-event Training 8:30 AM - 5:00 PM HBaseCon and PhoenixCon 12:00 PM – 7:00 PM Registration 6:00 PM – 8:00 PM Meetups Is this intentional? 2018-03-21 14:59 GMT+08:00 Stack: On Tue, Mar 20, 2018 at 7:51 PM, Josh Elser wrote: Hi all, I've published a new website for the upcoming event in June in California at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are identical. I've not yet updated any links on either website to link to the new page. I'd appreciate if folks can give their feedback on anything outwardly wrong, incorrect, etc. If folks are happy, then I'll work on linking from the main websites, and coordinating an official announcement via mail lists, social media, etc. The website is generated from [3]. If you really want to be my best-friend, let me know about the above things which are wrong via pull-request ;) - Josh [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ [3] https://github.com/joshelser/hbasecon-jekyll Thanks Josh for doing this. Do they have to be conflated so? Each community is doing their own conference. 
This page/announcement makes it look like they have been squashed together. Thanks, S
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
On 3/21/18 2:59 AM, Stack wrote: On Tue, Mar 20, 2018 at 7:51 PM, Josh Elser wrote: Hi all, I've published a new website for the upcoming event in June in California at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are identical. I've not yet updated any links on either website to link to the new page. I'd appreciate if folks can give their feedback on anything outwardly wrong, incorrect, etc. If folks are happy, then I'll work on linking from the main websites, and coordinating an official announcement via mail lists, social media, etc. The website is generated from [3]. If you really want to be my best-friend, let me know about the above things which are wrong via pull-request ;) - Josh [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ [3] https://github.com/joshelser/hbasecon-jekyll Thanks Josh for doing this. Do they have to be conflated so? Each community is doing their own conference. This page/announcement makes it look like they have been squashed together. Thanks, S You have any concrete suggestions I can change? Was trying to write this to respect the concerns you gave early on. My intent was to use a theme: "two events, one physical location". Using the same CFP and website content is mostly in hopes of saving myself time (as I wanted to get this done this past weekend and failed..)
[jira] [Updated] (PHOENIX-4651) Support ALTER TABLE ... MODIFY COLUMN
[ https://issues.apache.org/jira/browse/PHOENIX-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-4651: -- Summary: Support ALTER TABLE ... MODIFY COLUMN (was: alter table test modify column is not support) > Support ALTER TABLE ... MODIFY COLUMN > - > > Key: PHOENIX-4651 > URL: https://issues.apache.org/jira/browse/PHOENIX-4651 > Project: Phoenix > Issue Type: New Feature >Affects Versions: 4.10.0 >Reporter: Jepson >Priority: Critical > Original Estimate: 504h > Remaining Estimate: 504h > > Modifying the column type length is very inconvenient: drop first, then add. > Such as: > alter table jydw.test drop column name; > alter table jydw.test add name varchar(256); > The ALTER TABLE ... MODIFY COLUMN SQL is not supported.
[jira] [Commented] (PHOENIX-4651) alter table test modify column is not support
[ https://issues.apache.org/jira/browse/PHOENIX-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408179#comment-16408179 ] James Taylor commented on PHOENIX-4651: --- Now that we have an indirection between the column name and the column qualifier through column encoding, we should be able to support modifying existing columns too. For example, to change a data type, you'd need to: - create a new column of the new type - run an async MR job to convert from old to new type - upon successful completion, remove the old column and rename the new one > alter table test modify column is not support > - > > Key: PHOENIX-4651 > URL: https://issues.apache.org/jira/browse/PHOENIX-4651 > Project: Phoenix > Issue Type: New Feature >Affects Versions: 4.10.0 >Reporter: Jepson >Priority: Critical > Original Estimate: 504h > Remaining Estimate: 504h > > Modifying the column type length is very inconvenient: drop first, then add. > Such as: > alter table jydw.test drop column name; > alter table jydw.test add name varchar(256); > The ALTER TABLE ... MODIFY COLUMN SQL is not supported.
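The three steps above can be simulated with a toy Python model (purely illustrative: Phoenix would do the conversion server-side via an async MR job, and the table and column names here are invented):

```python
# Toy simulation of an online type change: add a shadow column of the new
# type, convert existing values asynchronously, then drop the old column
# and rename the shadow column to the original name.
table = {
    "columns": {"AMOUNT": str},                    # old type (VARCHAR-like)
    "rows": [{"AMOUNT": "10"}, {"AMOUNT": "25"}],
}

# Step 1: create a new column of the new type.
table["columns"]["AMOUNT_NEW"] = int

# Step 2: an offline/MR-style pass converts old values to the new type.
for row in table["rows"]:
    row["AMOUNT_NEW"] = int(row["AMOUNT"])

# Step 3: on successful completion, remove the old column and rename the new one.
del table["columns"]["AMOUNT"]
table["columns"]["AMOUNT"] = table["columns"].pop("AMOUNT_NEW")
for row in table["rows"]:
    del row["AMOUNT"]
    row["AMOUNT"] = row.pop("AMOUNT_NEW")

assert table["rows"] == [{"AMOUNT": 10}, {"AMOUNT": 25}]
assert table["columns"] == {"AMOUNT": int}
```

The final rename is cheap precisely because of the name-to-qualifier indirection: only metadata changes in that step.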
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
Odd, it's working for me, Ram. Let me add both contact emails -- you are right. I had originally intended to make two websites, but got bogged down in the amount of time it took to just make one :) On 3/21/18 2:12 AM, ramkrishna vasudevan wrote: Hi I think [2] does not work. Seems to be a broken link. In the CONTACTs section should it be both dev@hbase and dev@phoenix? Rest looks good to me. Regards Ram On Wed, Mar 21, 2018 at 9:42 AM, Yu Li wrote: Great to know and thanks for the efforts sir. Minor: in the CfP sector, first line, "The event's call for proposals is available *on on* EasyChair", the double "on" should be merged (smile) Best Regards, Yu On 21 March 2018 at 10:51, Josh Elser wrote: Hi all, I've published a new website for the upcoming event in June in California at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are identical. I've not yet updated any links on either website to link to the new page. I'd appreciate if folks can give their feedback on anything outwardly wrong, incorrect, etc. If folks are happy, then I'll work on linking from the main websites, and coordinating an official announcement via mail lists, social media, etc. The website is generated from [3]. If you really want to be my best-friend, let me know about the above things which are wrong via pull-request ;) - Josh [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ [3] https://github.com/joshelser/hbasecon-jekyll
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
Thanks, Yu! Fixed locally and will push it up shortly. On 3/21/18 12:12 AM, Yu Li wrote: Great to know and thanks for the efforts sir. Minor: in the CfP sector, first line, "The event's call for proposals is available *on on* EasyChair", the double "on" should be merged (smile) Best Regards, Yu On 21 March 2018 at 10:51, Josh Elser wrote: Hi all, I've published a new website for the upcoming event in June in California at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are identical. I've not yet updated any links on either website to link to the new page. I'd appreciate if folks can give their feedback on anything outwardly wrong, incorrect, etc. If folks are happy, then I'll work on linking from the main websites, and coordinating an official announcement via mail lists, social media, etc. The website is generated from [3]. If you really want to be my best-friend, let me know about the above things which are wrong via pull-request ;) - Josh [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ [3] https://github.com/joshelser/hbasecon-jekyll
[jira] [Resolved] (PHOENIX-4576) Fix LocalIndexSplitMergeIT tests failing in master branch
[ https://issues.apache.org/jira/browse/PHOENIX-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla resolved PHOENIX-4576. -- Resolution: Fixed [~elserj] It's needed in 5.x but it needs PHOENIX-4440 so will commit along with it now. > Fix LocalIndexSplitMergeIT tests failing in master branch > - > > Key: PHOENIX-4576 > URL: https://issues.apache.org/jira/browse/PHOENIX-4576 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla >Priority: Major > Fix For: 4.14.0 > > Attachments: PHOENIX-4576.patch, PHOENIX-4576_addendum.patch, > PHOENIX-4576_v2.patch > > > Currently LocalIndexSplitMergeIT#testLocalIndexScanAfterRegionsMerge is > failing in master branch.
[jira] [Created] (PHOENIX-4663) Ability to left join by index on rhs
yuriy bo created PHOENIX-4663: - Summary: Ability to left join by index on rhs Key: PHOENIX-4663 URL: https://issues.apache.org/jira/browse/PHOENIX-4663 Project: Phoenix Issue Type: Improvement Reporter: yuriy bo It's not possible to execute the following OLTP query using indexes: select * from lhs left join rhs on lhs.id = rhs.parent.id where lhs.user_id = 123; There is a global index on lhs.user_id and there is a global index on rhs.parent.id. The condition on lhs.user_id is very selective, so I would like the following execution plan: filter table LHS by index USER_ID and then scan RHS by index PARENT_ID. I can't force Phoenix to use the "Foreign Key to Primary Key Join Optimization" on the RHS table. The problem seems to be related to the fact that a LEFT JOIN builds the RHS table into a hash table and then uses the LHS table for the scan, so this prevents the "Foreign Key to Primary Key Join Optimization". If I use an INNER JOIN instead of the LEFT JOIN and swap the tables' join order, then I get the plan I need and the query executes in milliseconds. For example, it would work if RIGHT JOIN would build the RHS table (as opposed to LHS now) as a hash and then skip-scan LHS by ids.
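The plan the reporter asks for (filter LHS first via its selective USER_ID index, then probe RHS via its PARENT_ID index, preserving LEFT JOIN semantics) can be sketched in illustrative Python with toy data structures; this models the idea only, not Phoenix's actual join machinery, and the data and key names are invented:

```python
# Toy tables.
lhs = [{"id": 1, "user_id": 123}, {"id": 2, "user_id": 456}]
rhs = [{"parent_id": 1, "v": "a"}, {"parent_id": 1, "v": "b"},
       {"parent_id": 9, "v": "c"}]

# Secondary indexes modeled as dicts: indexed value -> matching rows.
lhs_by_user = {}
for r in lhs:
    lhs_by_user.setdefault(r["user_id"], []).append(r)
rhs_by_parent = {}
for r in rhs:
    rhs_by_parent.setdefault(r["parent_id"], []).append(r)

def left_join_via_indexes(user_id):
    out = []
    # 1. Apply the selective filter on LHS through the USER_ID index,
    #    instead of building the whole RHS into a hash table up front.
    for l in lhs_by_user.get(user_id, []):
        # 2. Probe RHS through the PARENT_ID index (a skip-scan analogue);
        #    emit a NULL row when nothing matches, as LEFT JOIN requires.
        matches = rhs_by_parent.get(l["id"], [None])
        for m in matches:
            out.append((l["id"], m["v"] if m else None))
    return out

assert left_join_via_indexes(123) == [(1, "a"), (1, "b")]
assert left_join_via_indexes(456) == [(2, None)]   # unmatched LHS row kept
```

The contrast with the current behavior is the build side: when the LEFT JOIN always hashes RHS, the selective LHS predicate cannot drive an index probe into RHS.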
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
https://dataworkssummit.com/san-jose-2018/ Here, in the Agenda section, HBaseCon and PhoenixCon are also included. Monday, June 18 8:30 AM - 5:00 PM Pre-event Training 8:30 AM - 5:00 PM HBaseCon and PhoenixCon 12:00 PM – 7:00 PM Registration 6:00 PM – 8:00 PM Meetups Is this intentional? 2018-03-21 14:59 GMT+08:00 Stack: > On Tue, Mar 20, 2018 at 7:51 PM, Josh Elser wrote: > > > Hi all, > > > > I've published a new website for the upcoming event in June in California > > at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are > > identical. > > > > I've not yet updated any links on either website to link to the new page. > > I'd appreciate if folks can give their feedback on anything outwardly > > wrong, incorrect, etc. If folks are happy, then I'll work on linking from > > the main websites, and coordinating an official announcement via mail > > lists, social media, etc. > > > > The website is generated from [3]. If you really want to be my > > best-friend, let me know about the above things which are wrong via > > pull-request ;) > > > > - Josh > > > > [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ > > [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ > > [3] https://github.com/joshelser/hbasecon-jekyll > > > > > Thanks Josh for doing this. > > Do they have to be conflated so? Each community is doing their own > conference. This page/announcement makes it look like they have been > squashed together. > > Thanks, > S >
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
On Tue, Mar 20, 2018 at 7:51 PM, Josh Elser wrote: > Hi all, > > I've published a new website for the upcoming event in June in California > at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are > identical. > > I've not yet updated any links on either website to link to the new page. > I'd appreciate if folks can give their feedback on anything outwardly > wrong, incorrect, etc. If folks are happy, then I'll work on linking from > the main websites, and coordinating an official announcement via mail > lists, social media, etc. > > The website is generated from [3]. If you really want to be my > best-friend, let me know about the above things which are wrong via > pull-request ;) > > - Josh > > [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ > [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ > [3] https://github.com/joshelser/hbasecon-jekyll > Thanks Josh for doing this. Do they have to be conflated so? Each community is doing their own conference. This page/announcement makes it look like they have been squashed together. Thanks, S
Re: Almost ready for HBaseCon+PhoenixCon 2018 SanJose CFP
Hi I think [2] does not work. Seems to be a broken link. In the CONTACTs section should it be both dev@hbase and dev@phoenix? Rest looks good to me. Regards Ram On Wed, Mar 21, 2018 at 9:42 AM, Yu Li wrote: > Great to know and thanks for the efforts sir. > > Minor: in the CfP sector, first line, "The event's call for proposals is > available *on on* EasyChair", the double "on" should be merged (smile) > > Best Regards, > Yu > > On 21 March 2018 at 10:51, Josh Elser wrote: > > > Hi all, > > > > I've published a new website for the upcoming event in June in California > > at [1][2] for the HBase and Phoenix websites, respectively. 1 & 2 are > > identical. > > > > I've not yet updated any links on either website to link to the new page. > > I'd appreciate if folks can give their feedback on anything outwardly > > wrong, incorrect, etc. If folks are happy, then I'll work on linking from > > the main websites, and coordinating an official announcement via mail > > lists, social media, etc. > > > > The website is generated from [3]. If you really want to be my > > best-friend, let me know about the above things which are wrong via > > pull-request ;) > > > > - Josh > > > > [1] https://hbase.apache.org/hbasecon-phoenixcon-2018/ > > [2] https://phoenix.apache.org/hbasecon-phoenixcon-2018/ > > [3] https://github.com/joshelser/hbasecon-jekyll > > >