[jira] [Commented] (PHOENIX-4641) Perform index maintenance on server-side for transactional local indexes

2018-03-12 Thread Ohad Shacham (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16395424#comment-16395424
 ] 

Ohad Shacham commented on PHOENIX-4641:
---

FYI [~jamestaylor]. I created OMID-93 and will open a pull request tomorrow.

> Perform index maintenance on server-side for transactional local indexes
> 
>
> Key: PHOENIX-4641
> URL: https://issues.apache.org/jira/browse/PHOENIX-4641
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Priority: Major
>
> PHOENIX-4278 changed index maintenance for transactional tables to be 
> performed on the client side. For local indexes, this is not ideal and not 
> really necessary as the updates to the indexes will all be local. By doing 
> this on the client side, we'd incur extra overhead:
> - extra RPCs for updates to local index tables separate from RPCs for data 
> tables
> - related to this, more network bandwidth would be used
> - calculation on client-side to determine region start key (it is somewhat 
> unclear whether there's a race condition with a split occurring while this is 
> being determined)
> - the updates to local indexes would no longer be row-level atomic with data 
> table HBase updates (though they'd be atomic because they're transactional)
> With Tephra, we can do the index maintenance on the server side without 
> further changes. For Omid, it's more difficult since we must:
> - perform all writes
> - write to the commit table
> - write the shadow cells (which requires knowing the index updates)
> If there will already be an API to write the shadow cells (required for the 
> initial population of local indexes), then perhaps we could piggyback on 
> that. On the client-side, we could do the following:
> - perform all writes
> - write to the commit table
> - perform writes again, but with a flag set to indicate that only the shadow 
> cells need to be written (note we already have the 
> mutation.setAttribute(REPLAY_WRITES, REPLAY_ONLY_INDEX_WRITES) option that 
> will help with this). In this case, we'd execute the logic to compute the 
> index updates twice, but on the plus side, we wouldn't incur the other 
> overhead mentioned before.
> All in all, it's unclear if this is worth doing. It doesn't make a lot of 
> sense to use local indexes for transactional tables, since one of the biggest 
> benefits of local indexes, row-level atomicity between index and table rows, 
> is already achieved more generally by transactions.
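
The client-side sequence proposed in the description (perform all writes, write to the commit table, then replay the writes with a flag so that only shadow cells are written) could look roughly like the sketch below. This is only an illustration of the ordering, not Phoenix or Omid code: the `Mutation` class is a hypothetical stand-in for an HBase mutation (with string attributes for simplicity), `send` and the recorded step names are invented for the example, and only the attribute names `REPLAY_WRITES` and `REPLAY_ONLY_INDEX_WRITES` come from the issue text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReplayCommitSketch {
    // Attribute name and value as written in the issue; the string contents are placeholders.
    static final String REPLAY_WRITES = "REPLAY_WRITES";
    static final String REPLAY_ONLY_INDEX_WRITES = "REPLAY_ONLY_INDEX_WRITES";

    /** Hypothetical stand-in for an HBase mutation: a row key plus string attributes. */
    static final class Mutation {
        final String row;
        final Map<String, String> attributes = new HashMap<>();

        Mutation(String row) { this.row = row; }

        void setAttribute(String name, String value) { attributes.put(name, value); }
    }

    /** Pretends to send a batch to the server and records what the server would do. */
    static void send(List<Mutation> batch, List<String> steps) {
        for (Mutation m : batch) {
            boolean shadowOnly =
                REPLAY_ONLY_INDEX_WRITES.equals(m.attributes.get(REPLAY_WRITES));
            steps.add((shadowOnly ? "shadow-cells-only:" : "write:") + m.row);
        }
    }

    /** Runs the three steps in order and returns the recorded sequence. */
    static List<String> commit(List<Mutation> batch, long startTs, long commitTs) {
        List<String> steps = new ArrayList<>();
        // 1. Perform all writes (the server computes the index updates here).
        send(batch, steps);
        // 2. Write to the commit table.
        steps.add("commit-table:" + startTs + "->" + commitTs);
        // 3. Flag every mutation and replay the batch, so the server recomputes
        //    the index updates but writes only the shadow cells.
        for (Mutation m : batch) {
            m.setAttribute(REPLAY_WRITES, REPLAY_ONLY_INDEX_WRITES);
        }
        send(batch, steps);
        return steps;
    }
}
```

The sketch makes the trade-off from the description visible: the batch is sent twice, so the index-update logic runs twice on the server, but no separate RPCs for index rows are needed.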



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4641) Perform index maintenance on server-side for transactional local indexes

2018-03-06 Thread Ohad Shacham (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388464#comment-16388464
 ] 

Ohad Shacham commented on PHOENIX-4641:
---

 

I can easily build such an API. I assume it will get a mutation list and return 
a list of puts?
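
To make the question concrete, the shape of such an API might be something like the sketch below: it takes the cells that were written and returns one extra put per cell carrying the commit timestamp. Everything here is an assumption for discussion, not the actual Omid interface: the `Cell` class, the `shadowCellsFor` method, and the `SHADOW_SUFFIX` marker are hypothetical, and the layout (same row and timestamp as the data cell, marked qualifier, commit timestamp as the value) is a simplified reading of how shadow cells work.

```java
import java.util.ArrayList;
import java.util.List;

final class ShadowCellSketch {
    // Hypothetical marker appended to a qualifier to name its shadow cell.
    static final String SHADOW_SUFFIX = "#shadow";

    /** Hypothetical, simplified cell: row, qualifier, timestamp, value. */
    static final class Cell {
        final String row;
        final String qualifier;
        final long timestamp;
        final String value;

        Cell(String row, String qualifier, long timestamp, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * For each written cell, builds a shadow cell at the same row and
     * timestamp whose value is the commit timestamp of the transaction.
     */
    static List<Cell> shadowCellsFor(List<Cell> writes, long commitTs) {
        List<Cell> puts = new ArrayList<>(writes.size());
        for (Cell c : writes) {
            puts.add(new Cell(c.row, c.qualifier + SHADOW_SUFFIX,
                              c.timestamp, Long.toString(commitTs)));
        }
        return puts;
    }
}
```

With an API of this shape, the caller would only need the index updates it has already computed plus the commit timestamp.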



[jira] [Commented] (PHOENIX-4641) Perform index maintenance on server-side for transactional local indexes

2018-03-06 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388175#comment-16388175
 ] 

James Taylor commented on PHOENIX-4641:
---

FYI, [~ohads]. Will there be an API that will generate the shadow cells given 
the index updates? If so, this would be pretty easy to implement.
