[jira] [Updated] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-18 Thread Vincent Poon (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated PHOENIX-4704:
--
Attachment: PHOENIX-4704.master.v2.patch

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Vincent Poon
> Assignee: Vincent Poon
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4704.master.v1.patch, 
> PHOENIX-4704.master.v2.patch
>
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot, as all 
> data region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first run an MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we run the actual index build MR job.
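
The presplit idea in the description can be illustrated with a minimal sketch. It assumes a sample of index row keys has already been gathered (for example, by the proposed stats job) and picks split points at evenly spaced quantiles of that sample. The class name SplitPointSketch, the method chooseSplitPoints, and the use of String keys instead of HBase byte[] row keys are all assumptions made for illustration; this is not the code in the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Phoenix or the attached patch.
class SplitPointSketch {

    /**
     * Returns at most (numRegions - 1) split points taken at evenly spaced
     * quantiles of the sampled keys. Keys are Strings here for readability;
     * real index row keys would be byte arrays.
     */
    public static List<String> chooseSplitPoints(List<String> sampledKeys, int numRegions) {
        List<String> sorted = new ArrayList<>(sampledKeys);
        Collections.sort(sorted);

        List<String> splits = new ArrayList<>();
        if (numRegions < 2 || sorted.isEmpty()) {
            return splits; // nothing to split
        }
        for (int i = 1; i < numRegions; i++) {
            // Index of the i-th quantile boundary in the sorted sample.
            int idx = (int) ((long) i * sorted.size() / numRegions);
            String candidate = sorted.get(idx);
            // A split at the smallest key would leave the first region empty.
            if (candidate.equals(sorted.get(0))) {
                continue;
            }
            // Skewed samples can yield the same boundary twice; keep it once.
            if (!splits.isEmpty() && splits.get(splits.size() - 1).equals(candidate)) {
                continue;
            }
            splits.add(candidate);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        System.out.println(chooseSplitPoints(sample, 4)); // prints [c, e, g]
    }
}
```

With split points like these, the index table could be created with several regions up front, so the data region mappers spread their writes instead of all targeting one region. How many regions to target, and how the sample is collected, are left open here, as they are in the description.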



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread James Taylor (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-4704:
--
Fix Version/s: 5.0.0

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Vincent Poon
> Assignee: Vincent Poon
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4704.master.v1.patch
>
>





[jira] [Updated] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread James Taylor (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-4704:
--
Fix Version/s: 4.14.0

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Vincent Poon
> Assignee: Vincent Poon
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4704.master.v1.patch
>
>





[jira] [Updated] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread Vincent Poon (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Poon updated PHOENIX-4704:
--
Attachment: PHOENIX-4704.master.v1.patch

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Vincent Poon
> Assignee: Vincent Poon
>Priority: Major
> Attachments: PHOENIX-4704.master.v1.patch
>
>


