Re: Slowness in Direct Binary Upload

Matt Ryan Mon, 27 Jul 2020 07:57:52 -0700

Hi Tanvi,

I recommend starting by breaking down the upload performance from the
client perspective.  For example, if you are using a browser client, you
should be able to break the upload down into three parts - upload
initiation, upload, and upload completion.  Try to obtain timing
performance for each step.  Note of course that the "upload" portion (2nd
step) should be directly between your browser client and S3, not going
through Oak at all.  The first and third steps are probably going to a web
endpoint that interacts directly with the JCR API (like a servlet).
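
As a rough illustration of the timing breakdown, here is a minimal sketch
that assumes a scripted client rather than a browser.  The three callables
(initiate, upload_parts, complete) are hypothetical placeholders for
whatever your client does in each step; they are not part of any Oak API.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(phase, timings):
    """Add the wall-clock duration of the enclosed block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = timings.get(phase, 0.0) + (time.perf_counter() - start)


def timed_upload(initiate, upload_parts, complete):
    """Run the three steps in order and return the seconds spent in each.

    The callables are hypothetical stand-ins for your own client code:
    initiate()           -> calls your "init" endpoint, returns upload info
    upload_parts(upload) -> sends the bytes directly to the upload URIs
    complete(upload)     -> calls your "complete" endpoint
    """
    timings = {}
    with timed("init", timings):
        upload = initiate()
    with timed("upload", timings):
        upload_parts(upload)
    with timed("complete", timings):
        complete(upload)
    return timings


def slowest_phase(timings):
    """Return the name of the phase that took the longest."""
    return max(timings, key=timings.get)
```

Running timed_upload once per file and logging the result should show
quickly whether the time goes into "init", "upload", or "complete".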


If this doesn't describe your environment, please provide a description of
how you are using the direct upload API so we can help you better.

Once you know which step is the slow one, you'll have information to
diagnose further.  What you should expect:
- The "init" step does not interact with S3's REST API if I understand the
AWS SDK implementation correctly, so it should be quite quick.
- The "complete" step usually takes a bit longer because it requires a few
S3 REST API calls - in my testing this is still usually on the order of
hundreds of milliseconds.

If your findings are similar, with the "upload" step taking most of the
time, you'll need to look directly at your upload code and how to optimize
that; Oak should not be in the path at all if you are using the direct
upload API.

I'd also recommend taking a careful look at the documentation at [0] (and
the JavaDoc referenced from that page).  In particular, pay attention to
the recommended upload algorithm described at [1].  I recommend using the
"maxPartSize" value in the returned BinaryUpload as your part size, at
least initially, if you aren't already doing that.
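
To make the part-size suggestion concrete, here is a small sketch of how a
client might split a file into parts.  It reflects my reading of the
algorithm in the BinaryUpload JavaDoc and is not code from Oak; the
function name and parameters are made up for the example, with the size
limits and the number of upload URIs assumed to come from the BinaryUpload
your "init" endpoint returned.

```python
def plan_parts(file_size, min_part_size, max_part_size, num_uris):
    """Return a list of (offset, length) byte ranges, one per upload URI used.

    Assumed inputs (all hypothetical names): file_size in bytes, the min and
    max part sizes reported for the upload, and the number of upload URIs.
    """
    if file_size <= 0 or max_part_size <= 0 or num_uris <= 0:
        raise ValueError("sizes and URI count must be positive")

    # Number of parts needed when every part is as large as allowed.
    parts_needed = -(-file_size // max_part_size)  # ceiling division
    if parts_needed > num_uris:
        # Not enough URIs were issued for a file this large; the client
        # would have to initiate a new upload with the correct size.
        raise ValueError("file too large for the provided upload URIs")

    if file_size < min_part_size:
        # Small file: send everything in one request to the first URI.
        return [(0, file_size)]

    # Default choice: use the maximum part size; the last part may be smaller.
    part_size = max_part_size
    return [
        (offset, min(part_size, file_size - offset))
        for offset in range(0, file_size, part_size)
    ]
```

For example, a 25-byte file with a maximum part size of 10 yields three
parts of 10, 10 and 5 bytes, each sent to its own upload URI in order.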

I look forward to hearing what you find out.


[0] -
https://jackrabbit.apache.org/oak/docs/features/direct-binary-access.html
[1] -
https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/api/binary/BinaryUpload.html


-MR



On Mon, Jul 27, 2020 at 6:27 AM Tanvi Shah <[email protected]>
wrote:

> Hi,
>
> We are using the Direct Binary S3 Upload feature to save files in the
> repository.
>
> When we store more than 10 files, the direct binary feature (upload
> token) takes around 1 hour to save them, whereas without the direct
> binary approach the same files upload in a few seconds.
>
>
> Could you please let us know what the issue could be?
>
>
> Also, we have multiple instances of the application, and all of them use
> the same Postgres database and S3 bucket. Will this cause any issues?
>
>
> Thank you in advance.
