[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-03-20 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932288#comment-15932288
 ] 

Julian Reschke commented on OAK-5704:
-

[~mreutegg] - right, thanks for the reminder.

> VersionGC: reset _deletedOnce for documents that have been resurrected
> --
>
> Key: OAK-5704
> URL: https://issues.apache.org/jira/browse/OAK-5704
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_6
> Fix For: 1.7.0, 1.8
>
> Attachments: OAK-5704-2.diff, OAK-5704-3.diff, OAK-5704-4.diff, 
> OAK-5704-5.patch, OAK-5704.diff
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-03-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932262#comment-15932262
 ] 

Marcel Reutegger commented on OAK-5704:
---

The modCount field is an implementation detail; the MemoryDocumentStore, for 
example, does not use it. I'm also sceptical whether relying only on the 
modCount field is a good idea for this case. At least the two other 
implementations initialize modCount with 1, which was already debated in the 
context of VersionGC in OAK-4494.



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-03-17 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929728#comment-15929728
 ] 

Julian Reschke commented on OAK-5704:
-

We currently reset the flag if the modification date did not change. For stores 
that have a modCount, wouldn't checking that be better?
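
A minimal sketch of the "reset only if unchanged" condition described above. The class, its in-memory map, and the method names are hypothetical stand-ins and not the Oak DocumentStore API; the field names {{_modified}} and {{_deletedOnce}} are the ones used in this issue. A modCount-based check would have the same shape, comparing a counter instead of a timestamp.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a document store; not the Oak API. */
final class ResetIfUnchangedSketch {

    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    /** Stores a document with the given _modified value and optional flag. */
    synchronized void put(String id, long modified, boolean deletedOnce) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("_modified", modified);
        if (deletedOnce) {
            doc.put("_deletedOnce", Boolean.TRUE);
        }
        docs.put(id, doc);
    }

    /**
     * Clears _deletedOnce only if _modified still equals the value that was
     * observed when the candidate was collected. Returns true if the flag
     * was actually cleared.
     */
    synchronized boolean resetIfUnchanged(String id, long observedModified) {
        Map<String, Object> doc = docs.get(id);
        if (doc == null) {
            return false;
        }
        Object current = doc.get("_modified");
        if (!(current instanceof Long) || (Long) current != observedModified) {
            // The document changed since it was scanned; leave it alone.
            return false;
        }
        return doc.remove("_deletedOnce") != null;
    }

    synchronized boolean hasDeletedOnce(String id) {
        Map<String, Object> doc = docs.get(id);
        return doc != null && doc.containsKey("_deletedOnce");
    }
}
```

The condition makes the reset safe against a concurrent change: if the document was touched between the scan and the update, the update is simply skipped and a later GC run can look at it again.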



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-03-09 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902874#comment-15902874
 ] 

Julian Reschke commented on OAK-5704:
-

trunk: [r1786130|http://svn.apache.org/r1786130] 
[r1784204|http://svn.apache.org/r1784204] 
[r1784128|http://svn.apache.org/r1784128]




[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-24 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882474#comment-15882474
 ] 

Julian Reschke commented on OAK-5704:
-

trunk: [r1784204|http://svn.apache.org/r1784204] 
[r1784128|http://svn.apache.org/r1784128]




[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-23 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881257#comment-15881257
 ] 

Marcel Reutegger commented on OAK-5704:
---

Applied to trunk: http://svn.apache.org/r1784204



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-23 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881105#comment-15881105
 ] 

Julian Reschke commented on OAK-5704:
-

Oops. Please apply.



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-23 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880417#comment-15880417
 ] 

Julian Reschke commented on OAK-5704:
-

trunk: [r1784128|http://svn.apache.org/r1784128]




[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-23 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880394#comment-15880394
 ] 

Julian Reschke commented on OAK-5704:
-

On second thought, I prefer to leave things as proposed. Changing it would make 
the way statistics are kept for this operation differ from the other 
operations.



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-23 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880360#comment-15880360
 ] 

Marcel Reutegger commented on OAK-5704:
---

You are right, it doesn't really make sense to collect those IDs when we are 
going to update the documents one by one. It may make sense to introduce a 
batch update operation later with conditions, but for now I think it is OK with 
single document calls.
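
A minimal sketch of the single-document approach discussed here: each candidate is submitted as soon as it is read, with no intermediate list of IDs or update operations. The {{Candidate}} class and the {{conditionalReset}} callback are hypothetical; the callback stands in for whatever single-document conditional update the store offers.

```java
import java.util.Iterator;
import java.util.function.BiPredicate;

/** Hypothetical sketch; none of these types are part of Oak. */
final class OneByOneResetSketch {

    /** A document id together with the _modified value seen during the scan. */
    static final class Candidate {
        final String id;
        final long modified;

        Candidate(String id, long modified) {
            this.id = id;
            this.modified = modified;
        }
    }

    /**
     * Applies the conditional reset to each candidate as it is produced and
     * returns how many resets succeeded. Nothing is buffered, because every
     * update is a separate call to the store anyway.
     */
    static int resetAll(Iterator<Candidate> candidates,
                        BiPredicate<String, Long> conditionalReset) {
        int updated = 0;
        while (candidates.hasNext()) {
            Candidate c = candidates.next();
            if (conditionalReset.test(c.id, c.modified)) {
                updated++;
            }
        }
        return updated;
    }
}
```

If a conditional batch update were introduced later, only {{resetAll}} would need to change to group candidates before submitting them.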



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-23 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880347#comment-15880347
 ] 

Julian Reschke commented on OAK-5704:
-

While reviewing the patch, I noticed that there really isn't any point in 
collecting 450 UpdateOps, as we need to submit them one-by-one anyway. 
[~mreutegg] - should I remove that part, reducing the overall complexity of the 
change a bit?



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-23 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880139#comment-15880139
 ] 

Marcel Reutegger commented on OAK-5704:
---

bq. should we worry about code duplication wrt parsing id/modified (which is 
now in multiple places...)?

I noticed that as well when I reviewed the last patch but didn't mention it. 
So it looks like it bothers both of us, and we should fix it.
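
One way to remove that kind of duplication is a single small helper that every caller uses. The sketch below is only an illustration: the class name {{IdAndModified}} and the map-based document are hypothetical, and the thread does not show what the parsing in the patch actually looks like.

```java
import java.util.Map;

/** Hypothetical value holder for the two fields; not taken from the patch. */
final class IdAndModified {

    final String id;
    final long modified;

    private IdAndModified(String id, long modified) {
        this.id = id;
        this.modified = modified;
    }

    /**
     * Extracts _id and _modified from a document represented as a map.
     * Throws IllegalArgumentException if either field is missing or has an
     * unexpected type, so callers do not repeat these checks.
     */
    static IdAndModified from(Map<String, Object> doc) {
        Object id = doc.get("_id");
        Object modified = doc.get("_modified");
        if (!(id instanceof String)) {
            throw new IllegalArgumentException("missing or invalid _id");
        }
        if (!(modified instanceof Number)) {
            throw new IllegalArgumentException("missing or invalid _modified for " + id);
        }
        return new IdAndModified((String) id, ((Number) modified).longValue());
    }
}
```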



[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-21 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875920#comment-15875920
 ] 

Julian Reschke commented on OAK-5704:
-

bq. The patch adds the current document id to the info message logged in 
collectDeletedDocuments(). I'm not sure how useful this is. The documents can 
be in arbitrary order and might give a false sense of progress.

Ah. It was supposed to give a sense of progress; I assumed that the query 
indeed returns the documents sorted by {{_id}} - it does in RDB, and also 
everywhere else in the document store API.

bq. In resetDeletedOnce(), I would rather log "Proceeding to reset ...".

Ack.

bq. The UpdateOp sets the _deletedOnce flag to false. I would prefer a new 
remove() method on UpdateOp. At least with MongoDB there is a sparse index on 
_deletedOnce and we are only interested in documents that have this field set 
to true. Documents with _deletedOnce set to false would bloat the index. With 
MongoDB 3.2 we could work around this with a partial index, but I think it 
would be cleaner to remove the field.

Ack.

bq. The UpdateOp also updates the _modified field. This field is related to 
revisioned entries on the document. I think it would be better to leave the 
value as is, because there is no actual modification on the node related to 
this update.

Yes, I wasn't sure about whether we should update _modified.




[jira] [Commented] (OAK-5704) VersionGC: reset _deletedOnce for documents that have been resurrected

2017-02-21 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875908#comment-15875908
 ] 

Marcel Reutegger commented on OAK-5704:
---

I like the idea of resetting the deletedOnce flag. A couple of comments about 
the patch:

- The patch adds the current document id to the info message logged in 
collectDeletedDocuments(). I'm not sure how useful this is. The documents can 
be in arbitrary order and might give a false sense of progress.
- In resetDeletedOnce(), I would rather log "Proceeding to reset ...".
- The UpdateOp sets the _deletedOnce flag to false. I would prefer a new 
remove() method on UpdateOp. At least with MongoDB there is a sparse index on 
_deletedOnce and we are only interested in documents that have this field set 
to true. Documents with _deletedOnce set to false would bloat the index. With 
MongoDB 3.2 we could work around this with a partial index, but I think it 
would be cleaner to remove the field.
- The UpdateOp also updates the _modified field. This field is related to 
revisioned entries on the document. I think it would be better to leave the 
value as is, because there is no actual modification on the node related to 
this update.
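
The index argument in the third point can be illustrated with a small model. A sparse index holds an entry for every document in which the indexed field is present, whatever its value, so a document with {{_deletedOnce}} set to false still occupies an entry while a document without the field does not. The class and helper names below are hypothetical, and the list of maps only models the index; it is not MongoDB or Oak code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of sparse-index membership; not MongoDB or Oak code. */
final class SparseIndexSketch {

    /** Counts the documents a sparse index on the given field would contain. */
    static long sparseIndexEntries(List<Map<String, Object>> docs, String field) {
        return docs.stream().filter(d -> d.containsKey(field)).count();
    }

    /** Creates a document that has been deleted once. */
    static Map<String, Object> deletedOnceDoc(String id) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("_id", id);
        doc.put("_deletedOnce", Boolean.TRUE);
        return doc;
    }

    /** Variant A: keep the field and set it to false (entry stays indexed). */
    static void setFlagFalse(Map<String, Object> doc) {
        doc.put("_deletedOnce", Boolean.FALSE);
    }

    /** Variant B: remove the field (entry drops out of the sparse index). */
    static void removeFlag(Map<String, Object> doc) {
        doc.remove("_deletedOnce");
    }
}
```

With two flagged documents, applying variant A to one leaves the entry count at two, while applying variant B to the other reduces it to one.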
