[jira] [Commented] (IGNITE-17215) Write ClusterSnapshotRecord to WAL

2022-07-25 Thread Maksim Timonin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570796#comment-17570796
 ] 

Maksim Timonin commented on IGNITE-17215:
-

[~xtern] hi! I found the reason, this is actually a bug, introduced in 
https://issues.apache.org/jira/browse/IGNITE-17272. I prepared a fix in 
https://issues.apache.org/jira/browse/IGNITE-17408. 

> Write ClusterSnapshotRecord to WAL
> --
>
> Key: IGNITE-17215
> URL: https://issues.apache.org/jira/browse/IGNITE-17215
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maksim Timonin
>Assignee: Maksim Timonin
>Priority: Major
>  Labels: IEP-89
> Fix For: 2.14
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For PITR [1] process of recovering based on ClusterSnapshot + archived WALs.
> It's required to have a point in WAL which splits whole WAL on 2 areas:
>  # Before this point all data changes are contained within ClusterSnapshot, 
> and no need to recover them from WAL archived files.
>  # After this point all data need to be recovered from WAL archived files.
> It's proposed to write ClusterSnapshotRecord while the checkpoint is running 
> (cp#writeLock has acquired). ClusterSnapshot process guarantees:
>  # there is no active transactions (or any data changes) in moment of running 
> checkpoint.
>  # ClusterSnapshot contains all data pages that will be persisted within this 
> checkpoint process.
> Then every logical record after begin CheckointRecord doesn't belong to 
> ClusterSnapshot. Then it's safe to write ClusterSnapshotRecord within the 
> checkpoint process.
> [1] 
> [https://cwiki.apache.org/confluence/pages/editpage.action?pageId=211884314]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-17215) Write ClusterSnapshotRecord to WAL

2022-07-21 Thread Maksim Timonin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17569287#comment-17569287
 ] 

Maksim Timonin commented on IGNITE-17215:
-

Hi [~xtern] thanks for catching that, I'll have a look

> Write ClusterSnapshotRecord to WAL
> --
>
> Key: IGNITE-17215
> URL: https://issues.apache.org/jira/browse/IGNITE-17215
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maksim Timonin
>Assignee: Maksim Timonin
>Priority: Major
>  Labels: IEP-89
> Fix For: 2.14
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For PITR [1] process of recovering based on ClusterSnapshot + archived WALs.
> It's required to have a point in WAL which splits whole WAL on 2 areas:
>  # Before this point all data changes are contained within ClusterSnapshot, 
> and no need to recover them from WAL archived files.
>  # After this point all data need to be recovered from WAL archived files.
> It's proposed to write ClusterSnapshotRecord while the checkpoint is running 
> (cp#writeLock has acquired). ClusterSnapshot process guarantees:
>  # there is no active transactions (or any data changes) in moment of running 
> checkpoint.
>  # ClusterSnapshot contains all data pages that will be persisted within this 
> checkpoint process.
> Then every logical record after begin CheckointRecord doesn't belong to 
> ClusterSnapshot. Then it's safe to write ClusterSnapshotRecord within the 
> checkpoint process.
> [1] 
> [https://cwiki.apache.org/confluence/pages/editpage.action?pageId=211884314]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-17215) Write ClusterSnapshotRecord to WAL

2022-07-12 Thread Pavel Pereslegin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565890#comment-17565890
 ] 

Pavel Pereslegin commented on IGNITE-17215:
---

[~timonin.maksim], [~NSAmelchev],

Could you please take a look why this test started to fail?
https://ci2.ignite.apache.org/project.html?projectId=IgniteTests24Java8=2699781523141512829=testDetails_IgniteTests24Java8=%3Cdefault%3E

{noformat}
lass org.apache.ignite.IgniteException: Failed to read WAL record at position: 
54496847, size: 67108669, expectedPtr: WALPointer [idx=0, fileOff=54496847, 
len=0]
at 
org.apache.ignite.internal.util.lang.GridIteratorAdapter.hasNext(GridIteratorAdapter.java:48)
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.AbstractSnapshotSelfTest.checkSnpWalRecords(AbstractSnapshotSelfTest.java:682)
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManagerSelfTest.testSnapshotLocalPartitionMultiCpWithLoad(IgniteSnapshotManagerSelfTest.java:220)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2428)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to read WAL 
record at position: 54496847, size: 67108669, expectedPtr: WALPointer [idx=0, 
fileOff=54496847, len=0]
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.handleRecordException(AbstractWalRecordsIterator.java:327)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.reader.StandaloneWalRecordsIterator.handleRecordException(StandaloneWalRecordsIterator.java:389)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advanceRecord(AbstractWalRecordsIterator.java:291)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.reader.StandaloneWalRecordsIterator.advanceRecord(StandaloneWalRecordsIterator.java:286)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:176)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:135)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:52)
at 
org.apache.ignite.internal.util.GridCloseableIteratorAdapter.nextX(GridCloseableIteratorAdapter.java:44)
at 
org.apache.ignite.internal.util.lang.GridIteratorAdapter.next(GridIteratorAdapter.java:35)
... 14 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to read WAL 
record at position: 54496847, size: 67108669, expectedPtr: WALPointer [idx=0, 
fileOff=54496847, len=0]
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readWithCrc(RecordV1Serializer.java:396)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer.readRecord(RecordV2Serializer.java:244)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advanceRecord(AbstractWalRecordsIterator.java:273)
... 20 more
Caused by: java.nio.InvalidMarkException
at java.base/java.nio.Buffer.reset(Buffer.java:399)
at java.base/java.nio.ByteBuffer.reset(ByteBuffer.java:1197)
at java.base/java.nio.MappedByteBuffer.reset(MappedByteBuffer.java:253)
at java.base/java.nio.MappedByteBuffer.reset(MappedByteBuffer.java:67)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer$1.readWithHeaders(RecordV2Serializer.java:129)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readWithCrc(RecordV1Serializer.java:374)
... 22 more

[jira] [Commented] (IGNITE-17215) Write ClusterSnapshotRecord to WAL

2022-07-08 Thread Maksim Timonin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564227#comment-17564227
 ] 

Maksim Timonin commented on IGNITE-17215:
-

[~NSAmelchev] thanks for review. Merged to master

> Write ClusterSnapshotRecord to WAL
> --
>
> Key: IGNITE-17215
> URL: https://issues.apache.org/jira/browse/IGNITE-17215
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maksim Timonin
>Assignee: Maksim Timonin
>Priority: Major
>  Labels: IEP-89
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For PITR [1] process of recovering based on ClusterSnapshot + archived WALs.
> It's required to have a point in WAL which splits whole WAL on 2 areas:
>  # Before this point all data changes are contained within ClusterSnapshot, 
> and no need to recover them from WAL archived files.
>  # After this point all data need to be recovered from WAL archived files.
> It's proposed to write ClusterSnapshotRecord while the checkpoint is running 
> (cp#writeLock has acquired). ClusterSnapshot process guarantees:
>  # there is no active transactions (or any data changes) in moment of running 
> checkpoint.
>  # ClusterSnapshot contains all data pages that will be persisted within this 
> checkpoint process.
> Then every logical record after begin CheckointRecord doesn't belong to 
> ClusterSnapshot. Then it's safe to write ClusterSnapshotRecord within the 
> checkpoint process.
> [1] 
> [https://cwiki.apache.org/confluence/pages/editpage.action?pageId=211884314]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-17215) Write ClusterSnapshotRecord to WAL

2022-07-08 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564225#comment-17564225
 ] 

Ignite TC Bot commented on IGNITE-17215:


{panel:title=Branch: [pull/10105/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10105/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=6672103buildTypeId=IgniteTests24Java8_RunAll]

> Write ClusterSnapshotRecord to WAL
> --
>
> Key: IGNITE-17215
> URL: https://issues.apache.org/jira/browse/IGNITE-17215
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maksim Timonin
>Assignee: Maksim Timonin
>Priority: Major
>  Labels: IEP-89
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For PITR [1] process of recovering based on ClusterSnapshot + archived WALs.
> It's required to have a point in WAL which splits whole WAL on 2 areas:
>  # Before this point all data changes are contained within ClusterSnapshot, 
> and no need to recover them from WAL archived files.
>  # After this point all data need to be recovered from WAL archived files.
> It's proposed to write ClusterSnapshotRecord while the checkpoint is running 
> (cp#writeLock has acquired). ClusterSnapshot process guarantees:
>  # there is no active transactions (or any data changes) in moment of running 
> checkpoint.
>  # ClusterSnapshot contains all data pages that will be persisted within this 
> checkpoint process.
> Then every logical record after begin CheckointRecord doesn't belong to 
> ClusterSnapshot. Then it's safe to write ClusterSnapshotRecord within the 
> checkpoint process.
> [1] 
> [https://cwiki.apache.org/confluence/pages/editpage.action?pageId=211884314]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-17215) Write ClusterSnapshotRecord to WAL

2022-07-05 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17562486#comment-17562486
 ] 

Ignite TC Bot commented on IGNITE-17215:


{panel:title=Branch: [pull/10105/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10105/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=6641005buildTypeId=IgniteTests24Java8_RunAll]

> Write ClusterSnapshotRecord to WAL
> --
>
> Key: IGNITE-17215
> URL: https://issues.apache.org/jira/browse/IGNITE-17215
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maksim Timonin
>Assignee: Maksim Timonin
>Priority: Major
>  Labels: IEP-89
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For PITR [1] process of recovering based on ClusterSnapshot + archived WALs.
> It's required to have a point in WAL which splits whole WAL on 2 areas:
>  # Before this point all data changes are contained within ClusterSnapshot, 
> and no need to recover them from WAL archived files.
>  # After this point all data need to be recovered from WAL archived files.
> It's proposed to write ClusterSnapshotRecord at the moment checkpoint process 
> starts (cp#writeLock has acquired). ClusterSnapshot process guarantees:
>  # there is no active transactions (or any data changes) in moment of writing 
> begin CheckpointRecord.
>  # ClusterSnapshot consist of data pages that are materialized within this 
> checkpoint process.
> Then every logical record after begin CheckointRecord doesn't belong to 
> ClusterSnapshot. Then it's safe to write ClusterSnapshotRecord align with 
> CheckpointRecord.
>  
> [1] 
> [https://cwiki.apache.org/confluence/pages/editpage.action?pageId=211884314]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)