[jira] [Reopened] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-11-07 Thread Gary Tully (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Tully reopened AMQ-7082:
-

Reopening to take a look at the latest additions to the fix.

> KahaDB index, recover free pages in parallel with start
> ---
>
> Key: AMQ-7082
> URL: https://issues.apache.org/jira/browse/AMQ-7082
> Project: ActiveMQ
> Issue Type: Bug
> Components: KahaDB
> Affects Versions: 5.15.0
> Reporter: Gary Tully
> Assignee: Gary Tully
> Priority: Major
> Fix For: 5.16.0, 5.15.8
>
>
> AMQ-6590 fixes free page loss through recovery. The recovery process can be 
> time-consuming, which prevents fast failover. Doing recovery on shutdown is 
> preferable, but it is still not ideal because it will hold onto the KahaDB 
> lock. It can also stall shutdown unexpectedly.
> AMQ-7080 is going to tackle checkpointing the free list. This should help 
> avoid the need for recovery, but recovery may still be necessary. If the perf 
> hit is significant, checkpointing may need to be optional.
> There will still be the need to walk the index to find the free list.
> It is possible to run with no free list and grow, and we can do that while we 
> recover the free list in parallel, then merge the two at a safe point. This 
> we can do at startup.
> In cases where the disk is the bottleneck this won't help much, but it will 
> help failover and it will help shutdown. With a bit of luck the recovery will 
> complete before we stop.
>  
> Initially I thought this would be too complex, but if we concede some growth 
> while we recover, i.e. start with an empty free list, it should be 
> straightforward to merge with a recovered one.
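
The approach in the description (start with an empty free list, grow while recovery runs, merge at a safe point) could be sketched roughly as below. This is a toy model only: GrowingPageAllocator and its methods are hypothetical names for illustration, not KahaDB's PageFile API, and the "safe point" is assumed here to be a shared lock with allocate/free.

```java
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical toy allocator; it only illustrates the idea from the description.
final class GrowingPageAllocator {
    // Starts empty: until recovery finishes we simply do not know the old free pages.
    private final TreeSet<Long> freeList = new TreeSet<>();
    private long nextFreePageId;

    GrowingPageAllocator(long nextFreePageIdAtStart) {
        this.nextFreePageId = nextFreePageIdAtStart;
    }

    // Reuse a known free page if there is one, otherwise grow the file.
    synchronized long allocate() {
        Long reused = freeList.pollFirst();
        return reused != null ? reused : nextFreePageId++;
    }

    synchronized void free(long pageId) {
        freeList.add(pageId);
    }

    // Assumed "safe point": merging under the same lock as allocate/free,
    // so no allocation can observe a half-merged free list.
    synchronized void mergeRecovered(Collection<Long> recovered) {
        freeList.addAll(recovered);
    }
}
```

In this sketch the cost of starting early is the growth conceded before mergeRecovered is called; after the merge, recovered pages are reused ahead of any further growth.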



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-11-06 Thread Gary Tully (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Tully reopened AMQ-7082:
-

[~alanprot] pointed out an issue that can cause the concurrent free page 
recovery to walk over newly allocated and freed pages, or pages in the process 
of being used.

The recovery processing needs to terminate at the nextFreePage id that exists 
at start. Everything after that is not recoverable; it is in use!
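
A minimal sketch of that bound, assuming the nextFreePage id is captured once before the broker starts allocating. BoundedFreePageScan and the isFreeOnDisk predicate are hypothetical stand-ins (the predicate represents reading a page and checking whether it is marked free); they are not the actual KahaDB code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical illustration of a recovery scan that stops at a start-time snapshot.
final class BoundedFreePageScan {
    // Scans page ids in [0, nextFreePageIdAtStart) and collects those reported free.
    // Pages at or beyond the snapshot were allocated after start and are never visited.
    static List<Long> recover(long nextFreePageIdAtStart, LongPredicate isFreeOnDisk) {
        List<Long> recovered = new ArrayList<>();
        for (long pageId = 0; pageId < nextFreePageIdAtStart; pageId++) {
            if (isFreeOnDisk.test(pageId)) {
                recovered.add(pageId);
            }
        }
        return recovered;
    }
}
```

The important detail is that the upper bound is a value passed in, not a live read of the allocator's counter, so growth that happens while the scan runs cannot widen the range.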

> KahaDB index, recover free pages in parallel with start
> ---
>
> Key: AMQ-7082
> URL: https://issues.apache.org/jira/browse/AMQ-7082
> Project: ActiveMQ
> Issue Type: Bug
> Components: KahaDB
> Affects Versions: 5.15.0
> Reporter: Gary Tully
> Assignee: Gary Tully
> Priority: Major
> Fix For: 5.16.0, 5.15.7
>
>





[jira] [Reopened] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-10-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré reopened AMQ-7082:
---

Just reopening this one to include it in the next 5.15.7 release.

> KahaDB index, recover free pages in parallel with start
> ---
>
> Key: AMQ-7082
> URL: https://issues.apache.org/jira/browse/AMQ-7082
> Project: ActiveMQ
> Issue Type: Bug
> Components: KahaDB
> Affects Versions: 5.15.0
> Reporter: Gary Tully
> Assignee: Gary Tully
> Priority: Major
> Fix For: 5.16.0, 5.15.7
>
>


