[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans

2017-08-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124926#comment-16124926
 ] 

ASF subversion and git services commented on JENA-1379:
---

Commit 16cacfe1732d8afa72b46f48ec7dc1c324752e00 in jena's branch 
refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=16cacfe ]

JENA-1379: Merge commit 'refs/pull/272/head' of github.com:apache/jena

This closes #272.


> Replace TDB NodeTableTrans
> --
>
>                 Key: JENA-1379
>                 URL: https://issues.apache.org/jira/browse/JENA-1379
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB
>    Affects Versions: Jena 3.4.0
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>
> TDB {{NodeTableTrans}} is complicated. It combines an existing {{NodeTable}} 
> with an additional index (often in-memory) and a journal-like {{ObjectFile}} 
> to hold new nodes added in a transaction. It has to maintain a mapping 
> between the new nodes in the journal-ObjectFile and their eventual location in
> the main node file. On commit, it writes the journal-ObjectFile nodes to the
> underlying index. The problem is that writing the index isn't done
> completely safely, although the window of vulnerability (coordinating the
> index update and the object file update) is quite small.
> {{NodeTableBuilder}} is part of the way TDB datasets get built. A simpler
> design is to build {{NodeTable}}s from the basic components,
> {{BlockMgr}}s and {{ObjectFile}}s (the two units of storage in TDB), in a
> fixed fashion. The potential flexibility of the current design has never
> been exploited.
> There are two independent parts to this change:
> # a transactional index (based on the same machinery as the tuple indexes)
> and directly appending to the object file of the {{NodeTable}}.
> # an independent transactional object file.
> Directly appending is safe because these files only grow, and only nodes in
> the associated index are accessible. Abort resets the append point; a crash
> during a write transaction can, at worst, leave unused junk in the object
> file, but this is a trade-off between speed and recovery. A journalled
> object file for additions would avoid junk in some crash situations, though
> it imposes a copy cost. It is proposed to go for simple+speed. "Simpler" is
> easier to make crash-safe.
> The alternative here is not to keep the existing code - there is some unused
> (and hence not deployment-tested) code in {{ObjectFileTransComplex}} (working
> name) for a more complicated journalled object file.
> The on-disk format is not changed. Switching from Jena 3.4.0 or earlier to
> Jena 3.5.0 should be safe for valid databases. Going backwards should also
> work if the database (not tested).  The safest way is to require that
> recovery is done with the same version of TDB, with a test in the new code
> that notices old files and exits if it encounters them.
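
The direct-append scheme in the description above can be illustrated with a small sketch. This is not Jena's code (Jena is written in Java). The class name `AppendOnlyObjectFile`, the length-prefixed record layout, and the file name `nodes.dat` are all assumptions made for the example. It only shows why resetting the append point on abort is enough when the file never shrinks and records are reachable only through offsets handed out by `append`.

```python
import os
import struct
import tempfile


class AppendOnlyObjectFile:
    """Toy append-only file of length-prefixed records (hypothetical, not Jena's ObjectFile)."""

    def __init__(self, path):
        # Create the file if it is missing, then open it read/write without truncating.
        open(path, "ab").close()
        self._f = open(path, "r+b")
        self._committed = os.path.getsize(path)  # append point at the last commit
        self._end = self._committed              # current append point

    def append(self, data: bytes) -> int:
        """Write one record at the append point and return its offset."""
        offset = self._end
        self._f.seek(offset)
        self._f.write(struct.pack(">I", len(data)) + data)
        self._end = offset + 4 + len(data)
        return offset

    def read(self, offset: int) -> bytes:
        """Read the record that starts at the given offset."""
        self._f.seek(offset)
        (length,) = struct.unpack(">I", self._f.read(4))
        return self._f.read(length)

    def commit(self):
        # Make the appended bytes durable, then advance the committed append point.
        self._f.flush()
        os.fsync(self._f.fileno())
        self._committed = self._end

    def abort(self):
        # Reset the append point. Bytes past it are unreachable junk that the
        # next append simply overwrites; the file itself is never truncated.
        self._end = self._committed

    def close(self):
        self._f.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "nodes.dat")
    objects = AppendOnlyObjectFile(path)
    index = {}  # stands in for the transactional index: node -> offset

    index["<http://example/a>"] = objects.append(b"<http://example/a>")
    objects.commit()

    objects.append(b"<http://example/b>")  # write transaction that is aborted
    objects.abort()                        # no index entry was ever published

    index["<http://example/c>"] = objects.append(b"<http://example/c>")
    objects.commit()
    print(index)
    objects.close()
```

In this sketch a crash between `append` and `commit` leaves bytes beyond the committed append point. Because no index entry refers to them, they are the "unused junk" the description mentions, not corruption.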



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans

2017-08-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124927#comment-16124927
 ] 

ASF GitHub Bot commented on JENA-1379:
--

Github user asfgit closed the pull request at:

https://github.com/apache/jena/pull/272




[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans

2017-08-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124924#comment-16124924
 ] 

ASF subversion and git services commented on JENA-1379:
---

Commit 3d406fe16973bea5af6f51bde10f583546f3f6c3 in jena's branch 
refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=3d406fe ]

JENA-1379: Build NodeTables from ObjectFiles and BlockMgrs.

Remove NodeTableBuilder
Remove NodeTableTrans




[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans

2017-08-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124925#comment-16124925
 ] 

ASF subversion and git services commented on JENA-1379:
---

Commit afc9b8c038eff18c3654d47e112a33b3a653f66f in jena's branch 
refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=afc9b8c ]

JENA-1379: Check for old-style dat-jrnl files




[jira] [Commented] (JENA-1379) Replace TDB NodeTableTrans

2017-08-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115436#comment-16115436
 ] 

ASF GitHub Bot commented on JENA-1379:
--

GitHub user afs opened a pull request:

https://github.com/apache/jena/pull/272

JENA-1379: Better (simpler, more robust) transactional NodeTables

See [JENA-1379](https://issues.apache.org/jira/browse/JENA-1379) for more 
details.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/afs/jena tdb-nodetable-txn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/jena/pull/272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #272


commit e6e1b16aaca2c433120d61f2d7ad4edaaa1e22cf
Author: Andy Seaborne 
Date:   2017-08-04T16:14:33Z

Build from ObjectFiles and BlockMgrs.

Remove NodeTableBuilder
Remove NodeTableTrans




> Replace TDB NodeTableTrans
> --
>
>                 Key: JENA-1379
>                 URL: https://issues.apache.org/jira/browse/JENA-1379
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB
>    Affects Versions: Jena 3.4.0
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>
> The on-disk format is not changed, except that the "dat-jrnl" files used up
> to Jena 3.4.0 no longer exist. The presence of one indicates that crash
> recovery is needed. The safest way is to require that recovery is done with
> the same version of TDB, with a test in the new code that notices old files
> and exits if it encounters them. Oddly, old code should recover new-version
> datasets correctly! All the work has been moved to the main index journal.
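
The start-up test mentioned in the description (notice old-style "dat-jrnl" files and stop) could look something like the sketch below. It is only an illustration in Python, not the check that was committed to Jena. The function names, the rule of matching on a file-name suffix, and the choice to treat any such file as a reason to stop are assumptions made for the example.

```python
from pathlib import Path
import tempfile

# Assumption: old-style node journals can be recognised by this file-name suffix.
OLD_JOURNAL_SUFFIX = "dat-jrnl"


def find_old_node_journals(db_dir):
    """Return the names of old-style journal files found directly in db_dir."""
    return sorted(
        p.name
        for p in Path(db_dir).iterdir()
        if p.is_file() and p.name.endswith(OLD_JOURNAL_SUFFIX)
    )


def check_no_old_journals(db_dir):
    """Refuse to open a database that still has old-style journal files."""
    leftovers = find_old_node_journals(db_dir)
    if leftovers:
        raise RuntimeError(
            "Old-style journal files found: "
            + ", ".join(leftovers)
            + ". Recover the database with the version that wrote it."
        )


if __name__ == "__main__":
    db = Path(tempfile.mkdtemp())
    (db / "nodes.dat").write_bytes(b"")
    check_no_old_journals(db)  # passes: no old journals present

    (db / "nodes.dat-jrnl").write_bytes(b"pending")
    try:
        check_no_old_journals(db)
    except RuntimeError as err:
        print(err)
```

Failing fast here matches the "safest way" in the description: the new code does not try to interpret an old journal and leaves recovery to the version that wrote it.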


