Riley Kuttruff created SDAP-440:
-----------------------------------

             Summary: Switch handling of tile data to L2 format
                 Key: SDAP-440
                 URL: https://issues.apache.org/jira/browse/SDAP-440
             Project: Apache Science Data Analytics Platform
          Issue Type: Task
            Reporter: Riley Kuttruff

In our current design, when tiles are loaded, the data is shaped like gridded (L3/L4) data. This is fine for L3/L4 tiles. The problem is with L2 (swath) tiles. The swath -> grid-like transform turns the m x n data array of an L2 tile into an (m * n) x (m * n) array, with the original data values occupying the diagonal and the rest of the array locations unused. This is EXTREMELY memory-inefficient, and L2 tile sizes >15x15 can very easily consume memory by the gigabyte.

Proposed solution: Instead of handling loaded tiles in gridded format, handle them in swath format. This would remove the issues caused by having to transform L2 tiles, but would still require expanding the latitude and longitude (and time?) arrays to match the shape of the data array. That expansion would require SIGNIFICANTLY less extra memory (I even believe numpy can do it with constant extra memory rather than the expected O(n)).

The problem with this is that we would need to individually adapt each of SDAP's algorithms to work with swath-formatted data rather than grid-formatted data. The scale of this required change has caused us to hold off on this implementation.

Plan: I plan to mitigate that issue by adapting the NexusTileService to be (temporarily) configurable, allowing a choice in how returned tile data is formatted (the default will be gridded). We can then roll out the changes for the various algorithms and switch over their NT
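
To make the memory argument in the description concrete, here is a minimal sketch in NumPy. It is not SDAP's actual code: `gridded_from_swath` and `expand_to_swath` are hypothetical helper names, filling the unused cells with NaN is an assumption, and the single-timestamp `time` array is made up for illustration. It compares the footprint of one 15x15 tile in the diagonal layout with the swath layout, and shows that `np.broadcast_to` expands a coordinate array as a read-only view without copying data.

```python
import numpy as np


def gridded_from_swath(data):
    """Hypothetical sketch of the swath -> grid-like transform.

    Places the m*n swath values on the diagonal of an (m*n) x (m*n) array.
    Filling the unused cells with NaN is an assumption for this example.
    """
    flat = data.ravel()
    k = flat.size
    grid = np.full((k, k), np.nan, dtype=data.dtype)
    idx = np.arange(k)
    grid[idx, idx] = flat
    return grid


def expand_to_swath(values, shape):
    """Expand a coordinate array to the data shape as a read-only view."""
    return np.broadcast_to(values, shape)


m, n = 15, 15
data = np.arange(m * n, dtype=np.float64).reshape(m, n)

grid = gridded_from_swath(data)

# Hypothetical: one timestamp shared by every pixel in the tile.
time = np.array([1_700_000_000.0])
time_2d = expand_to_swath(time, data.shape)

print(data.nbytes)   # 1800 bytes for the swath-shaped data
print(grid.nbytes)   # 405000 bytes for the diagonal layout of the same tile
print(time_2d.shape, time_2d.strides)   # (15, 15) (0, 0): no data was copied
print(np.shares_memory(time, time_2d))  # True: the view reuses the original buffer
```

The diagonal layout grows with the square of the pixel count, while the broadcast view adds only array metadata. The view is read-only, so any algorithm that writes into the expanded coordinate arrays would need an explicit copy first.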