[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans
[ https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124926#comment-16124926 ]

ASF subversion and git services commented on JENA-1379:
---

Commit 16cacfe1732d8afa72b46f48ec7dc1c324752e00 in jena's branch refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=16cacfe ]

JENA-1379: Merge commit 'refs/pull/272/head' of github.com:apache/jena

This closes #272.

> Replace TDB NodeTableTrans
> --
>
> Key: JENA-1379
> URL: https://issues.apache.org/jira/browse/JENA-1379
> Project: Apache Jena
> Issue Type: Bug
> Components: TDB
> Affects Versions: Jena 3.4.0
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
>
> TDB {{NodeTableTrans}} is complicated. It combines an existing {{NodeTable}}
> with an additional index (often in-memory) and a journal-like {{ObjectFile}}
> to hold new nodes added in a transaction. It has to maintain a mapping
> between the new nodes in the journal-ObjectFile and the eventual location on
> the main node file. On commit, it writes the journal-ObjectFile nodes to
> the underlying index. There is a problem that writing the index isn't done
> completely safely. The window of vulnerability is quite small though
> (coordinating the index update and the object file update).
>
> {{NodeTableBuilder}} is part of the way TDB datasets get built. A simpler
> design is to make {{NodeTable}}s be built from the basic components on
> `BlockMgr`s and `ObjectFile`s (the two units of storage in TDB) in a fixed
> fashion. The potential flexibility of the current design has never been
> exploited.
>
> There are two parts to this change: they are independent.
> # a transactional index (based on the same machinery as the tuple indexes)
> and directly appending to the object file of the {{NodeTable}}.
> # independent transactional object file.
>
> Directly appending is safe because these files only grow. Only nodes in the
> associated index are accessible. Abort resets the append point; a crash
> during a write transaction can, at worst, create unused junk in the object
> file but this is a trade-off of speed and recovery. A journalled addition
> object file would avoid junk in some crash situations, though it imposes a
> copy cost. It is proposed to go for simple+speed. "Simpler" is easier to make
> crash-safe.
>
> The alternative here is not to keep the existing code - there is some unused
> (and hence not deployment-tested) code in {{ObjectFileTransComplex}} (working
> name) for a more complicated journalled object file.
>
> The on-disk format is not changed. Switching from Jena 3.4.0 or earlier to
> Jena 3.5.0 should be safe for valid databases. Going backwards should also
> work if the database (not tested). The safest way is to require that
> recovery is done with the same version of TDB with a test in new code that
> notices and exits if it encounters old files.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
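
The append-only behaviour described in the issue (direct appends, visibility only through the index, abort resetting the append point, and a crash leaving at worst unused junk) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `AppendOnlyNodeStore` and all of its methods are hypothetical names invented here, not TDB classes, and the `crash()` method models just one possible outcome, where the junk is left in place rather than truncated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the proposed design; these are NOT TDB classes.
class AppendOnlyNodeStore {
    private final List<String> objectFile = new ArrayList<>();     // stands in for the object file
    private final Map<String, Integer> index = new HashMap<>();    // committed node -> position
    private final Map<String, Integer> txnIndex = new HashMap<>(); // entries added by the open transaction
    private int appendPoint = 0;   // length of the object file at the last commit
    private boolean inTxn = false;

    void begin() {
        if (inTxn) throw new IllegalStateException("already in a transaction");
        inTxn = true;
    }

    // Append directly to the object file; the node is reachable only via an index.
    int add(String node) {
        if (!inTxn) throw new IllegalStateException("no transaction");
        Integer existing = lookup(node);
        if (existing != null) return existing;
        objectFile.add(node);
        int pos = objectFile.size() - 1;
        txnIndex.put(node, pos);
        return pos;
    }

    // Committed nodes are always visible; uncommitted ones only inside the transaction.
    Integer lookup(String node) {
        Integer pos = index.get(node);
        if (pos == null && inTxn) pos = txnIndex.get(node);
        return pos;
    }

    void commit() {
        if (!inTxn) throw new IllegalStateException("no transaction");
        index.putAll(txnIndex);
        appendPoint = objectFile.size();
        txnIndex.clear();
        inTxn = false;
    }

    // Abort resets the append point: everything written since the last commit is dropped.
    void abort() {
        if (!inTxn) throw new IllegalStateException("no transaction");
        objectFile.subList(appendPoint, objectFile.size()).clear();
        txnIndex.clear();
        inTxn = false;
    }

    // Assumed crash outcome: index entries are lost, appended entries stay as unused junk.
    void crash() {
        txnIndex.clear();
        inTxn = false;
        appendPoint = objectFile.size();
    }

    int fileLength() { return objectFile.size(); }

    public static void main(String[] args) {
        AppendOnlyNodeStore store = new AppendOnlyNodeStore();
        store.begin(); store.add("<http://example/a>"); store.commit();
        store.begin(); store.add("<http://example/b>"); store.abort();
        System.out.println(store.fileLength());                 // length after the abort
        System.out.println(store.lookup("<http://example/b>")); // aborted node is not visible
    }
}
```

In this model an aborted node never becomes visible and costs no space, while a node written before a simulated crash occupies space but stays unreachable because no index entry points at it.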
[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans
[ https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124927#comment-16124927 ]

ASF GitHub Bot commented on JENA-1379:
--

Github user asfgit closed the pull request at:

    https://github.com/apache/jena/pull/272
[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans
[ https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124924#comment-16124924 ]

ASF subversion and git services commented on JENA-1379:
---

Commit 3d406fe16973bea5af6f51bde10f583546f3f6c3 in jena's branch refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=3d406fe ]

JENA-1379: Build NodeTables from ObjectFiles and BlockMgrs.

Remove NodeTableBuilder
Remove NodeTableTrans
[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans
[ https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124925#comment-16124925 ]

ASF subversion and git services commented on JENA-1379:
---

Commit afc9b8c038eff18c3654d47e112a33b3a653f66f in jena's branch refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=afc9b8c ]

JENA-1379: Check for old-style dat-jrnl files
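
The check for old-style "dat-jrnl" files named in this commit could look roughly like the following. This is a sketch using only `java.nio.file`, not the code from the commit: the class and method names are hypothetical, and it assumes the old journal files can be recognised by a `.dat-jrnl` file name suffix and that their mere presence is enough to refuse to open the database.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Jena.
class OldJournalCheck {
    // Assumption: old-style node journals are files whose names end in ".dat-jrnl".
    static List<Path> findOldJournals(Path dbDir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dbDir, "*.dat-jrnl")) {
            for (Path p : files) {
                found.add(p);
            }
        }
        return found;
    }

    // Notice old files and stop, so that recovery is done with the older TDB version.
    static void requireNoOldJournals(Path dbDir) throws IOException {
        List<Path> found = findOldJournals(dbDir);
        if (!found.isEmpty()) {
            throw new IllegalStateException(
                "Old-style journal files present; recover with the older TDB version first: " + found);
        }
    }
}
```

A caller would run `OldJournalCheck.requireNoOldJournals(dir)` before opening the database directory and treat the exception as a reason to exit.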
[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans
[ https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115436#comment-16115436 ]

ASF GitHub Bot commented on JENA-1379:
--

GitHub user afs opened a pull request:

    https://github.com/apache/jena/pull/272

    JENA-1379: Better (simpler, more robust) transactional NodeTables

    See [JENA-1379](https://issues.apache.org/jira/browse/JENA-1379) for more details.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/afs/jena tdb-nodetable-txn

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #272

commit e6e1b16aaca2c433120d61f2d7ad4edaaa1e22cf
Author: Andy Seaborne
Date: 2017-08-04T16:14:33Z

    Build from ObjectFiles and BlockMgrs.

    Remove NodeTableBuilder
    Remove NodeTableTrans

> Replace TDB NodeTableTrans
> --
>
> Key: JENA-1379
> URL: https://issues.apache.org/jira/browse/JENA-1379
> Project: Apache Jena
> Issue Type: Bug
> Components: TDB
> Affects Versions: Jena 3.4.0
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
>
> TDB {{NodeTableTrans}} is complicated. It combines an existing {{NodeTable}}
> with an additional index (often in-memory) and a journal-like {{ObjectFile}}
> to hold new nodes added in a transaction. It has to maintain a mapping
> between the new nodes in the journal-ObjectFile and the eventual location on
> the main node file. On commit, it writes the journal-ObjectFile nodes to
> the underlying index. There is a problem that writing the index isn't done
> completely safely. The window of vulnerability is quite small though
> (coordinating the index update and the object file update).
>
> {{NodeTableBuilder}} is part of the way TDB datasets get built. A simpler
> design is to make {{NodeTable}}s be built from the basic components on
> `BlockMgr`s and `ObjectFile`s (the two units of storage in TDB) in a fixed
> fashion. The potential flexibility of the current design has never been
> exploited.
>
> There are two parts to this change: they are independent.
> # a transactional index (based on the same machinery as the tuple indexes)
> and directly appending to the object file of the {{NodeTable}}.
> # independent transactional object file.
>
> Directly appending is safe because these files only grow. Only nodes in the
> associated index are accessible. Abort resets the append point; a crash
> during a write transaction can, at worst, create unused junk in the object
> file but this is a trade-off of speed and recovery. A journalled addition
> object file would avoid junk in some crash situations, though it imposes a
> copy cost. It is proposed to go for simple+speed. "Simpler" is easier to make
> crash-safe.
>
> The alternative here is not to keep the existing code - there is some unused
> (and hence not deployment-tested) code in {{ObjectFileTransComplex}} (working
> name) for a more complicated journalled object file.
>
> The on-disk format is not changed except that existing (up to Jena 3.4.0)
> "dat-jrnl" files do not exist. Presence of these files indicates crash
> recovery is needed. The safest way is to require that recovery is done with
> the same version of TDB with a test in new code that notices and exits if it
> encounters old files. Oddly, old code should recover new version datasets
> correctly! All the work has been moved to the main index journal.