Glad it worked with the smaller selection! 

One problem with the Lambda implementation for HSDS is that it only supported 
JSON responses.  For data selections, converting binary data to JSON adds a lot 
of overhead and memory usage.

To improve this I pushed out an HSDS update yesterday that enables hex-encoded 
responses.  You just need to add an "accept" header with the value 
"application/octet-stream".  Here's an example event:

    {
      "method": "GET",
      "path": "/datasets/d-096b7930-5dc5b556-dbc8-00c5ad-8aca89/value",
      "headers": {
        "accept": "application/octet-stream"
      },
      "params": {
        "domain": "/nrel/nsrdb/v3/nsrdb_2000.h5",
        "select": "[0:1000,0:1000]",
        "bucket": "nrel-pds-hsds"
      }
    }

You'll still get a JSON response from Lambda, but the body key will have a 
hex-encoded value (i.e. it will use twice as many bytes as the binary 
equivalent).
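
As a rough sketch of the round trip described above, the helpers below build the event, invoke the function with boto3, and turn the hex-encoded body back into bytes. The helper names are made up for illustration, the response is assumed to carry the data under a "body" key as described, and the int16 element type is only an assumption; check the dataset's actual dtype before unpacking.

```python
import json
import struct


def build_event(dataset_id, domain, select, bucket):
    """Build an HSDS Lambda event that asks for a hex-encoded binary body."""
    return {
        "method": "GET",
        "path": f"/datasets/{dataset_id}/value",
        "headers": {"accept": "application/octet-stream"},
        "params": {"domain": domain, "select": select, "bucket": bucket},
    }


def invoke(function_name, event):
    """Invoke the Lambda function and return its parsed JSON response.

    boto3 is imported lazily so the other helpers work without AWS access.
    """
    import boto3

    client = boto3.client("lambda")
    result = client.invoke(FunctionName=function_name, Payload=json.dumps(event))
    return json.loads(result["Payload"].read())


def decode_body(response):
    """Convert the hex-encoded "body" string back into raw bytes."""
    return bytes.fromhex(response["body"])


def unpack_int16(raw):
    """Interpret raw bytes as little-endian int16 values (ASSUMED dtype)."""
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}h", raw))
```

With a deployed function this would be used as `decode_body(invoke("hslambda", event))`. The decoded byte string is half the length of the hex text in the response.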

The above request took 6.3 seconds to run and consumed 268 MB of memory.

I was hopeful that a larger selection would work as well, but with a 
[1000,10000] selection I get an AWS error:

"Response payload size exceeded maximum allowed payload size (6291556 bytes)."

So it looks like there's no support yet for responses larger than 6MB.  
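
Until larger responses are supported, one possible workaround is to split a big selection into row bands that each fit under the limit. The sketch below estimates the hex-encoded body size and yields suitable bands. The 2-byte element size is an assumption (pass the real item size of your dataset), the helper names are hypothetical, and the JSON wrapper around the body also counts toward the limit, so a somewhat smaller `limit` is safer in practice.

```python
MAX_PAYLOAD_BYTES = 6291556  # limit quoted in the AWS error message


def hex_body_bytes(rows, cols, itemsize):
    """Size of the hex-encoded body: two characters per binary byte."""
    return rows * cols * itemsize * 2


def split_rows(row_start, row_stop, cols, itemsize, limit=MAX_PAYLOAD_BYTES):
    """Yield (start, stop) row bands whose hex-encoded body fits under limit."""
    bytes_per_row = cols * itemsize * 2
    rows_per_request = limit // bytes_per_row
    if rows_per_request < 1:
        raise ValueError("a single row already exceeds the payload limit")
    for start in range(row_start, row_stop, rows_per_request):
        yield start, min(start + rows_per_request, row_stop)


def select_string(row_band, col_start, col_stop):
    """Format a band as an HSDS "select" parameter, e.g. "[0:157,0:10000]"."""
    return f"[{row_band[0]}:{row_band[1]},{col_start}:{col_stop}]"


# Assuming 2-byte elements: [1000,1000] fits, [1000,10000] does not.
small = hex_body_bytes(1000, 1000, 2)
large = hex_body_bytes(1000, 10000, 2)
bands = list(split_rows(0, 1000, 10000, 2))
```

Each band can then be sent as its own event and the decoded pieces concatenated on the client side, at the cost of one invocation per band.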

It would be nice if AWS Lambda supported true binary responses and HTTP 
streaming, as discussed in my blog post from last week: 
https://www.hdfgroup.org/2022/08/hsds-streaming/.  Amazon has been adding new 
features to Lambda each year, so we can be hopeful!

BTW, you can now use h5pyd with Lambda.  You just need to set up your .hscfg 
like this:

    hs_endpoint = http+lambda://hslambda
    hs_username = hslambda
    hs_password = lambda
    hs_api_key = None

The endpoint is "http+lambda://" plus the name of your Lambda function 
("hslambda" in my case).  Other than that, h5pyd programs should work the same 
as with a regular HSDS server (though not as fast).
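
A minimal sketch of what that could look like from Python is below. It passes the endpoint explicitly through the `endpoint` argument of `h5pyd.File` instead of relying on hs_endpoint, and assumes the username and password still come from .hscfg. The helper names and the `dataset_name` argument are placeholders; use whichever dataset in the domain you want to read.

```python
def lambda_endpoint(function_name):
    """Build the h5pyd endpoint string for a Lambda-hosted HSDS."""
    return f"http+lambda://{function_name}"


def read_selection(domain, dataset_name, rows, cols,
                   function_name="hslambda", bucket=None):
    """Read dataset[rows, cols] from an HSDS domain via h5pyd.

    rows and cols are slice objects. h5pyd is imported lazily so that
    lambda_endpoint() can be used without h5pyd installed.
    """
    import h5pyd

    endpoint = lambda_endpoint(function_name)
    with h5pyd.File(domain, "r", endpoint=endpoint, bucket=bucket) as f:
        # Slicing works as in h5py and returns the selected data.
        return f[dataset_name][rows, cols]
```

For the example above, the call would be along the lines of `read_selection("/nrel/nsrdb/v3/nsrdb_2000.h5", dataset_name, slice(0, 1000), slice(0, 1000), bucket="nrel-pds-hsds")`.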

Anyway, would just setting up an HSDS server be the best approach in your case? 
In my view Lambda works best for moderately sized selections when there is a 
fairly large reduction in the amount of data returned vs. the number of chunks 
touched.  E.g. with the NSRDB data above, if I have a selection of 
[1234, 0:2018392] it hits 10 GB of chunk data to return a 4MB response.  In this case, 
it's a big advantage to run Lambda in the same AWS region as the S3 store vs. 
having to move the entire 10 GB out of Amazon (say you were using the ros3 VFD 
on your laptop).




