[jira] [Commented] (OAK-8520) [Direct Binary Access] Avoid overwriting existing binaries via direct binary upload
[ https://issues.apache.org/jira/browse/OAK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903251#comment-16903251 ] Matt Ryan commented on OAK-8520:

I created JCR-4463 suggesting a possible documentation change to the {{JackrabbitValueFactory.completeBinaryUpload()}} method reflecting this behavior. The documentation does not claim that the behavior is different from what is implemented; rather, it is simply not clear on the point. Making it clear would be helpful.

> [Direct Binary Access] Avoid overwriting existing binaries via direct binary upload
> ---
>
> Key: OAK-8520
> URL: https://issues.apache.org/jira/browse/OAK-8520
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: blob-cloud, blob-cloud-azure, blob-plugins
> Reporter: Matt Ryan
> Assignee: Matt Ryan
> Priority: Major
> Fix For: 1.18.0, 1.10.4
>
> Since direct binary upload generates a unique blob ID for each upload, it is generally impossible to overwrite an existing binary. However, if a client issues the {{completeBinaryUpload()}} call more than once with the same upload token, it is possible to overwrite an existing binary.
> One use case where this can happen is when a client call to complete the upload times out. Lacking a successful return, the client could assume that it needs to repeat the call to complete the upload. If the binary was already uploaded, the subsequent call to complete the upload would overwrite the binary with new content generated from any uncommitted uploaded blocks. In practice there are usually no uncommitted blocks, so this produces a zero-length binary.
> There may be a use case for a zero-length binary, so simply failing in such a case is not sufficient.
> One easy way to handle this would be to check for the existence of the binary before completing the upload. This would have the effect of making uploaded binaries unmodifiable by the client. In such a case the implementation could throw an exception indicating that the binary already exists and cannot be written again.
--
This message was sent by Atlassian JIRA (v7.6.14#76016)
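The existence-check option described in the issue can be sketched as follows. This is a hypothetical, simplified in-memory model, not Oak's actual blob store API: {{SketchBlobStore}}, {{initiateUpload}}, and the token/staging maps are all illustrative; only the {{completeBinaryUpload()}} name comes from the real interface.

```java
import java.util.HashMap;
import java.util.Map;

class BinaryAlreadyExistsException extends RuntimeException {
    BinaryAlreadyExistsException(String blobId) {
        super("Binary already exists and cannot be written again: " + blobId);
    }
}

class SketchBlobStore {
    private final Map<String, byte[]> blobs = new HashMap<>();
    // Maps each upload token to the blob ID it was issued for.
    private final Map<String, String> uploadTokens = new HashMap<>();
    // Uncommitted blocks staged per upload token (usually empty on a retry).
    private final Map<String, byte[]> stagedBlocks = new HashMap<>();

    String initiateUpload(String blobId, byte[] content) {
        String token = "token-" + blobId;
        uploadTokens.put(token, blobId);
        stagedBlocks.put(token, content);
        return token;
    }

    // Refuses to complete if the blob already exists, so a duplicate
    // completeBinaryUpload() cannot replace it with a zero-length binary
    // built from the (by now empty) staged block list.
    String completeBinaryUpload(String uploadToken) {
        String blobId = uploadTokens.get(uploadToken);
        if (blobs.containsKey(blobId)) {
            throw new BinaryAlreadyExistsException(blobId);
        }
        byte[] content = stagedBlocks.remove(uploadToken);
        blobs.put(blobId, content == null ? new byte[0] : content);
        return blobId;
    }

    byte[] read(String blobId) {
        return blobs.get(blobId);
    }
}
```

With this check in place, a second completion attempt for the same token throws rather than silently committing an empty block list over the existing content.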
[ https://issues.apache.org/jira/browse/OAK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903247#comment-16903247 ] Matt Ryan commented on OAK-8520:

Added to 1.10.4 in revision [r1864728|https://svn.apache.org/viewvc?view=revision&revision=1864728].
[ https://issues.apache.org/jira/browse/OAK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901583#comment-16901583 ] Alexander Klimetschek commented on OAK-8520:

[~mattvryan] Awesome, thanks! Looks like it was even simpler than I thought.
[ https://issues.apache.org/jira/browse/OAK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901564#comment-16901564 ] Matt Ryan commented on OAK-8520:

[~alexander.klimetschek] I can align with the proposal to make it idempotent and return the existing blob if it already exists. The change I added was implemented in line with your suggestion.
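The idempotent behavior described in the comment above can be sketched roughly like this, again against a hypothetical in-memory store rather than Oak's real blob store classes; the key difference from the throw-on-exists variant is that a repeated completion returns the already-committed binary unchanged.

```java
import java.util.HashMap;
import java.util.Map;

class IdempotentBlobStore {
    private final Map<String, byte[]> blobs = new HashMap<>();
    private final Map<String, String> tokenToBlobId = new HashMap<>();
    private final Map<String, byte[]> stagedBlocks = new HashMap<>();

    String initiateUpload(String blobId, byte[] content) {
        String token = "token-" + blobId;
        tokenToBlobId.put(token, blobId);
        stagedBlocks.put(token, content);
        return token;
    }

    // Idempotent: if the blob for this token was already committed, return
    // it as-is. The staged block list (usually empty on a retry) is never
    // re-committed, so the existing binary cannot be wiped out.
    byte[] completeBinaryUpload(String uploadToken) {
        String blobId = tokenToBlobId.get(uploadToken);
        byte[] existing = blobs.get(blobId);
        if (existing != null) {
            return existing;  // 2nd, 3rd, ... call: same binary, no overwrite
        }
        byte[] content = stagedBlocks.remove(uploadToken);
        blobs.put(blobId, content == null ? new byte[0] : content);
        return blobs.get(blobId);
    }
}
```

Calling {{completeBinaryUpload()}} twice with the same token then yields the same binary both times, which is what makes a client-side retry after a timeout safe.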
[ https://issues.apache.org/jira/browse/OAK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901562#comment-16901562 ] Matt Ryan commented on OAK-8520:

This bug fix should also be backported to version 1.10.4, in my opinion. I've sent a proposal to the mailing list for a vote.
[ https://issues.apache.org/jira/browse/OAK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901503#comment-16901503 ] Matt Ryan commented on OAK-8520:

Fixed in revision [r1864570|https://svn.apache.org/viewvc?view=revision&revision=1864570].
[ https://issues.apache.org/jira/browse/OAK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900629#comment-16900629 ] Alexander Klimetschek commented on OAK-8520:

IMO this should be classified as a bug and given a higher priority. The blob store implementation has to ensure the immutability of blobs and not wipe them out if applications (accidentally) call completeBinaryUpload() twice. AFAICS {{completeBinaryUpload()}} should simply be an idempotent operation, returning the existing blob as a {{Binary}} if called a 2nd (or 3rd, or 4th...) time with the same upload token. Then clients can safely retry the request that leads the application code to call {{completeBinaryUpload()}} and try writing the same JCR structure.
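The safe-retry pattern this enables can be sketched with a small generic wrapper. The {{RetryingCompleter}} class below is purely illustrative (not part of Oak or Jackrabbit); the point is that retrying the completion call is only safe because the underlying operation is idempotent.

```java
import java.util.concurrent.Callable;

class RetryingCompleter {
    // Retries the completion call up to maxAttempts times (assumes
    // maxAttempts >= 1). Safe only because a repeated completion returns
    // the already-committed binary instead of overwriting it.
    static <T> T completeWithRetry(Callable<T> complete, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return complete.call();
            } catch (Exception e) {  // e.g. a network timeout on the call
                last = new RuntimeException("attempt " + attempt + " failed", e);
            }
        }
        throw last;
    }
}
```

A client would wrap its call to {{completeBinaryUpload()}} in such a helper; a timeout on the first attempt is then recovered by the second attempt returning the same binary.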