[HDF Forum] [HSDS] How to issue requests to HSDS Lambda?

John Readey Wed, 31 Aug 2022 12:42:54 -0700


Glad it worked with the smaller selection!

One problem with the Lambda implementation for HSDS is that it only supported
JSON responses. For data selections, converting binary data to JSON adds a lot
of overhead and memory usage.

To improve this I pushed out an HSDS update yesterday that enables hex-encoded
responses. You just need to add a header specifying "octet-stream". Here's an
example event:

{
  "method": "GET",
  "path": "/datasets/d-096b7930-5dc5b556-dbc8-00c5ad-8aca89/value",
  "headers": {
    "accept": "application/octet-stream"
  },
  "params": {
    "domain": "/nrel/nsrdb/v3/nsrdb_2000.h5",
    "select": "[0:1000,0:1000]",
    "bucket": "nrel-pds-hsds"
  }
}

You'll still get a JSON response from Lambda, but the body key will have a
hex-encoded value (i.e. it will use twice as many bytes as the binary
equivalent).
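For anyone scripting this, here is a minimal Python sketch of building such an event and decoding the hex-encoded body. It assumes boto3 is installed and AWS credentials are configured; the helper names (build_event, decode_body, invoke) are made up for illustration and are not part of HSDS, and the only thing assumed about the response is that it is JSON with a "body" key, as described above.

```python
import json


def build_event(dataset_id, domain, select, bucket):
    """Build an HSDS Lambda event that asks for an octet-stream response."""
    return {
        "method": "GET",
        "path": f"/datasets/{dataset_id}/value",
        "headers": {"accept": "application/octet-stream"},
        "params": {"domain": domain, "select": select, "bucket": bucket},
    }


def decode_body(payload):
    """Turn the JSON returned by Lambda into raw bytes.

    Assumes the "body" value is a hex string, two characters per byte.
    """
    response = json.loads(payload)
    return bytes.fromhex(response["body"])


def invoke(function_name, event):
    """Invoke the Lambda function and return the decoded bytes."""
    import boto3  # third-party; imported here so the helpers above work without it

    client = boto3.client("lambda")
    result = client.invoke(
        FunctionName=function_name,
        Payload=json.dumps(event).encode("utf-8"),
    )
    return decode_body(result["Payload"].read())
```

Calling invoke("hslambda", build_event(...)) with the values from the event above should give back the raw bytes of the selection, which can then be interpreted with the dataset's dtype (for example via numpy.frombuffer).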

The above request took 6.3 seconds to run and consumed 268 MB of memory.

I was hopeful that a larger selection would work as well, but with a
[1000,10000] selection I get an AWS error:

"Response payload size exceeded maximum allowed payload size (6291556 bytes)."

So it looks like there's no support yet for responses larger than 6 MB.
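A rough way to check ahead of time whether a selection is likely to fit is sketched below. The limit is the number from the error message; the element size is an assumption you would need to replace with the real item size of your dataset, and the small JSON wrapper around the body is ignored.

```python
# Limit taken from the AWS error message quoted above.
MAX_PAYLOAD_BYTES = 6291556


def hex_payload_size(shape, itemsize):
    """Bytes needed for a hex-encoded selection (two hex characters per byte)."""
    count = 1
    for extent in shape:
        count *= extent
    return count * itemsize * 2


def fits_in_lambda_response(shape, itemsize):
    """True if the hex-encoded selection is within the payload limit."""
    return hex_payload_size(shape, itemsize) <= MAX_PAYLOAD_BYTES


# itemsize=2 is a hypothetical element size, used only for illustration.
print(fits_in_lambda_response((1000, 1000), itemsize=2))
print(fits_in_lambda_response((1000, 10000), itemsize=2))
```

With a 2-byte element the first selection encodes to 4,000,000 bytes and the second to 40,000,000, so only the first stays under the limit.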

What would be nice would be if AWS Lambda supported true binary responses and
HTTP streaming as discussed in my blog from last week:
https://www.hdfgroup.org/2022/08/hsds-streaming/. Amazon has been adding new
features to Lambda each year, so we can be hopeful!

BTW, you can now use h5pyd with Lambda. You just need to set up your .hscfg
like this:

hs_endpoint = http+lambda://hslambda
hs_username = hslambda
hs_password = lambda
hs_api_key = None

Where the endpoint is "http+lambda://" plus the name of your lambda function
("hslambda" in my case). Other than that h5pyd programs should work the same
as with a regular HSDS server (if not as fast).
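As an illustration, a read through h5pyd might look like the sketch below. It passes the endpoint and credentials as keyword arguments to h5pyd.File instead of relying on .hscfg; the helper names and the dataset name argument are hypothetical, and h5pyd needs to be installed for read_selection to run.

```python
def lambda_endpoint(function_name):
    """Build the h5pyd endpoint string for a given Lambda function name."""
    return "http+lambda://" + function_name


def read_selection(domain, dataset_name, rows, cols, function_name, bucket=None):
    """Read dataset[rows, cols] from an HSDS domain through Lambda.

    dataset_name is whatever dataset exists in your domain; rows and cols
    are ordinary Python slices or integers.
    """
    import h5pyd  # third-party; imported lazily so lambda_endpoint works without it

    with h5pyd.File(
        domain,
        "r",
        endpoint=lambda_endpoint(function_name),
        username="hslambda",  # placeholder credentials, as in the .hscfg above
        password="lambda",
        bucket=bucket,
    ) as f:
        return f[dataset_name][rows, cols]


print(lambda_endpoint("hslambda"))
```

For example, read_selection("/nrel/nsrdb/v3/nsrdb_2000.h5", name, slice(0, 1000), slice(0, 1000), "hslambda", bucket="nrel-pds-hsds") would request the same region as the event shown earlier, with name set to a dataset in that file.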

Anyway, would just setting up an HSDS server be the best approach in your case?
In my view Lambda works best for moderately sized selections when there is a
fairly large reduction in the amount of data returned vs. the number of chunks
touched. E.g. with the NSRDB data above, if I have a selection of
[1234, 0:2018392] it hits 10 GB of chunk data to return a 4 MB response. In this case,
it's a big advantage to run Lambda in the same AWS region as the S3 store vs.
having to move the entire 10 GB out of Amazon (say you were using the ros3 VFD
on your laptop).

---
[Visit
Topic](https://forum.hdfgroup.org/t/how-to-issue-requests-to-hsds-lambda/10136/13)
or reply to this email to respond.
