[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932288#comment-15932288 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

[~mreutegg] - right, thanks for the reminder.

> VersionGC: reset _deletedOnce for documents that have been resurrected
> ----------------------------------------------------------------------
>
> Key: OAK-5704
> URL: https://issues.apache.org/jira/browse/OAK-5704
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Labels: candidate_oak_1_6
> Fix For: 1.7.0, 1.8
>
> Attachments: OAK-5704-2.diff, OAK-5704-3.diff, OAK-5704-4.diff, OAK-5704-5.patch, OAK-5704.diff

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932262#comment-15932262 ]

Marcel Reutegger commented on OAK-5704:
---------------------------------------

The modCount field is an implementation detail and e.g. the MemoryDocumentStore does not use it. I'm also sceptical whether relying only on the modCount field is a good idea in this case. The two other implementations, at least, initialize modCount with 1, which was already debated in the context of VersionGC in OAK-4494.
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929728#comment-15929728 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

We currently reset the flag if the modification date did not change. For stores that have a modCount, wouldn't checking that be better?
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902874#comment-15902874 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

trunk: [r1786130|http://svn.apache.org/r1786130] [r1784204|http://svn.apache.org/r1784204] [r1784128|http://svn.apache.org/r1784128]
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882474#comment-15882474 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

trunk: [r1784204|http://svn.apache.org/r1784204] [r1784128|http://svn.apache.org/r1784128]
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881257#comment-15881257 ]

Marcel Reutegger commented on OAK-5704:
---------------------------------------

Applied to trunk: http://svn.apache.org/r1784204
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881105#comment-15881105 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

Oops. Please apply.
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880417#comment-15880417 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

trunk: [r1784128|http://svn.apache.org/r1784128]

> VersionGC: reset _deletedOnce for documents that have been resurrected
> ----------------------------------------------------------------------
>
> Key: OAK-5704
> URL: https://issues.apache.org/jira/browse/OAK-5704
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Labels: candidate_oak_1_6
> Fix For: 1.7.0, 1.8
>
> Attachments: OAK-5704-2.diff, OAK-5704-3.diff, OAK-5704-4.diff, OAK-5704.diff
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880394#comment-15880394 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

On second thought, I prefer to leave things as proposed. The reason is that changing it would make the statistics bookkeeping differ from that of the other operations.

> VersionGC: reset _deletedOnce for documents that have been resurrected
> ----------------------------------------------------------------------
>
> Key: OAK-5704
> URL: https://issues.apache.org/jira/browse/OAK-5704
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Labels: candidate_oak_1_6
> Fix For: 1.8
>
> Attachments: OAK-5704-2.diff, OAK-5704-3.diff, OAK-5704-4.diff, OAK-5704.diff
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880360#comment-15880360 ]

Marcel Reutegger commented on OAK-5704:
---------------------------------------

You are right, it doesn't really make sense to collect those IDs when we are going to update the documents one by one. It may make sense to introduce a batch update operation with conditions later, but for now I think it is OK with single document calls.
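
The single-document approach discussed above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class `ConditionalResetSketch`, its methods, and the map-based store are hypothetical stand-ins, and only the field names `_deletedOnce` and `_modified` come from the thread. It assumes, as stated elsewhere in this thread, that the flag is reset only if the modification date did not change since the document was scanned.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model for illustration; none of these names are Oak APIs.
final class ConditionalResetSketch {
    static final String DELETED_ONCE = "_deletedOnce";
    static final String MODIFIED = "_modified";

    /** Creates a document with the given _modified value and _deletedOnce set to true. */
    static Map<String, Object> deletedOnceDoc(long modified) {
        Map<String, Object> doc = new HashMap<>();
        doc.put(MODIFIED, modified);
        doc.put(DELETED_ONCE, Boolean.TRUE);
        return doc;
    }

    /**
     * Removes _deletedOnce from one document, but only if _modified still has the
     * value seen during the scan. A real store would have to perform the check and
     * the update atomically; this single-threaded sketch does not.
     */
    static boolean resetIfUnmodified(Map<String, Map<String, Object>> store,
                                     String id, long modifiedAtScan) {
        Map<String, Object> doc = store.get(id);
        if (doc == null || !Long.valueOf(modifiedAtScan).equals(doc.get(MODIFIED))) {
            return false; // document is gone or was changed in the meantime
        }
        doc.remove(DELETED_ONCE);
        return true;
    }

    /** Issues one conditional call per candidate and returns how many documents were reset. */
    static int resetAll(Map<String, Map<String, Object>> store, Map<String, Long> candidates) {
        int updated = 0;
        for (Map.Entry<String, Long> candidate : candidates.entrySet()) {
            if (resetIfUnmodified(store, candidate.getKey(), candidate.getValue())) {
                updated++;
            }
        }
        return updated;
    }
}
```

Because every call carries its own condition, there is nothing to gain from collecting the operations first; the loop can issue each update as soon as a candidate is found and still keep a simple count for the statistics.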
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880347#comment-15880347 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

While reviewing the patch, I noticed that there really isn't any point in collecting 450 UpdateOps, as we need to submit them one-by-one anyway. [~mreutegg] - should I remove that part, reducing the overall complexity of the change a bit?
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880139#comment-15880139 ]

Marcel Reutegger commented on OAK-5704:
---------------------------------------

bq. should we worry about code duplication wrt parsing id/modified (which is now in multiple places...)?

I noticed it as well when I reviewed the last patch but didn't mention it. So, looks like it bothers both of us and we should fix it.

> VersionGC: reset _deletedOnce for documents that have been resurrected
> ----------------------------------------------------------------------
>
> Key: OAK-5704
> URL: https://issues.apache.org/jira/browse/OAK-5704
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Labels: candidate_oak_1_6
> Fix For: 1.8
>
> Attachments: OAK-5704-2.diff, OAK-5704-3.diff, OAK-5704.diff
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875920#comment-15875920 ]

Julian Reschke commented on OAK-5704:
-------------------------------------

bq. The patch adds the current document id to the info message logged in collectDeletedDocuments(). I'm not sure how useful this is. The documents can be in arbitrary order and might give a false sense of progress.

Ah. It was supposed to give a sense of progress; I assumed that the query indeed returns the documents sorted by {{_id}} - it does in RDB, and also everywhere else in the document store API.

bq. In resetDeletedOnce(), I would rather log "Proceeding to reset ...".

Ack.

bq. The UpdateOp sets the _deletedOnce flag to false. I would prefer a new remove() method on UpdateOp. At least with MongoDB there is a sparse index on _deletedOnce and we are only interested in documents that have this field set to true. Documents with _deletedOnce set to false would bloat the index. With MongoDB 3.2 we could work around this with a partial index, but I think it would be cleaner to remove the field.

Ack.

bq. The UpdateOp also updates the _modified field. This field is related to revisioned entries on the document. I think it would be better to leave the value as is, because there is no actual modification on the node related to this update.

Yes, I wasn't sure about whether we should update _modified.

> VersionGC: reset _deletedOnce for documents that have been resurrected
> ----------------------------------------------------------------------
>
> Key: OAK-5704
> URL: https://issues.apache.org/jira/browse/OAK-5704
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Attachments: OAK-5704.diff
[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected
[ https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875908#comment-15875908 ]

Marcel Reutegger commented on OAK-5704:
---------------------------------------

I like the idea of resetting the deletedOnce flag. A couple of comments about the patch:

- The patch adds the current document id to the info message logged in collectDeletedDocuments(). I'm not sure how useful this is. The documents can be in arbitrary order and might give a false sense of progress.
- In resetDeletedOnce(), I would rather log "Proceeding to reset ...".
- The UpdateOp sets the _deletedOnce flag to false. I would prefer a new remove() method on UpdateOp. At least with MongoDB there is a sparse index on _deletedOnce and we are only interested in documents that have this field set to true. Documents with _deletedOnce set to false would bloat the index. With MongoDB 3.2 we could work around this with a partial index, but I think it would be cleaner to remove the field.
- The UpdateOp also updates the _modified field. This field is related to revisioned entries on the document. I think it would be better to leave the value as is, because there is no actual modification on the node related to this update.
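
The sparse-index argument in the review above can be illustrated with a small model. This is only a sketch under stated assumptions: `SparseIndexSketch` and its methods are hypothetical and are not part of Oak or the MongoDB driver. The one fact it relies on is that a sparse index holds an entry for every document in which the indexed field exists, whatever its value, so a document with `_deletedOnce` set to false still occupies an index entry while a document without the field does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model for illustration; none of these names are Oak or MongoDB APIs.
final class SparseIndexSketch {
    static final String DELETED_ONCE = "_deletedOnce";
    static final String MODIFIED = "_modified";

    /** Creates a document with the given _modified value and _deletedOnce set to true. */
    static Map<String, Object> deletedOnceDoc(long modified) {
        Map<String, Object> doc = new HashMap<>();
        doc.put(MODIFIED, modified);
        doc.put(DELETED_ONCE, Boolean.TRUE);
        return doc;
    }

    /** Models the size of a sparse index: one entry per document that has the field at all. */
    static long sparseIndexEntries(Map<String, Map<String, Object>> docs) {
        return docs.values().stream()
                .filter(doc -> doc.containsKey(DELETED_ONCE))
                .count();
    }

    /** Reset variant from the first patch: the field stays present, so it stays indexed. */
    static void resetBySettingFalse(Map<String, Object> doc) {
        doc.put(DELETED_ONCE, Boolean.FALSE);
    }

    /** Reset variant suggested in the review: the field is removed; _modified is left as is. */
    static void resetByRemoving(Map<String, Object> doc) {
        doc.remove(DELETED_ONCE);
    }
}
```

In this model, resetting a document with `resetBySettingFalse` leaves the index entry count unchanged, whereas `resetByRemoving` reduces it by one. Neither variant touches `_modified`, in line with the last review point.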