[
https://issues.apache.org/jira/browse/PHOENIX-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824474#comment-15824474
]
Ankit Singhal commented on PHOENIX-3271:
----------------------------------------
Ping [~jamestaylor]. Can you please take a look? Tests were passing locally.
> Distribute UPSERT SELECT across cluster
> ---------------------------------------
>
> Key: PHOENIX-3271
> URL: https://issues.apache.org/jira/browse/PHOENIX-3271
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Ankit Singhal
> Fix For: 4.10.0
>
> Attachments: PHOENIX-3271.patch, PHOENIX-3271_v1.patch,
> PHOENIX-3271_v2.patch, PHOENIX-3271_v3.patch
>
>
> Based on some informal testing we've done, it seems that creation of a local
> index is orders of magnitude faster than creation of global indexes (17
> seconds versus 10-20 minutes - though more data is written in the global
> index case). Under the covers, a global index is created through the running
> of an UPSERT SELECT. Also, UPSERT SELECT provides an easy way of copying a
> table. In both of these cases, the data being upserted must all flow back to
> the same client which can become a bottleneck for a large table. Instead,
> what can be done is to push each separate, chunked UPSERT SELECT call out to
> a different region server for execution there. One way we could implement
> this would be to have an endpoint coprocessor push the chunked UPSERT SELECT
> out to each region server and return the number of rows that were upserted
> back to the client.
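
The proposal in the description can be illustrated with a toy in-memory model.
This is only a sketch under stated assumptions, not Phoenix or HBase code: each
"region" is a plain dict, a thread stands in for a region server, and the names
upsert_select_on_region and distributed_upsert_select are hypothetical. As a
simplification, each worker writes into its own output bucket, whereas a real
global index write would be routed to whichever region hosts the index key. The
point shown is that rows are written where they are read and only the row
counts travel back to the client.

```python
from concurrent.futures import ThreadPoolExecutor


def upsert_select_on_region(source_region, target_region, transform):
    """Run one chunk of the UPSERT SELECT locally; return rows upserted."""
    count = 0
    for key, value in source_region.items():
        # transform stands in for the SELECT projection (hypothetical helper).
        new_key, new_value = transform(key, value)
        target_region[new_key] = new_value
        count += 1
    return count


def distributed_upsert_select(source_regions, transform):
    """Fan one chunk out per region and sum the per-region row counts."""
    # One output bucket per worker (a simplification of real index routing).
    target_regions = [{} for _ in source_regions]
    workers = max(1, len(source_regions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(
            pool.map(
                lambda pair: upsert_select_on_region(pair[0], pair[1], transform),
                zip(source_regions, target_regions),
            )
        )
    # Only the counts reach the "client"; the row data never does.
    return sum(counts), target_regions
```

In this model the client-side cost is one small integer per region, which is
the bottleneck reduction the description is after.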