[jira] [Updated] (SDAP-440) Switch handling of tile data to L2 format

Riley Kuttruff (Jira) Thu, 16 Feb 2023 10:42:08 -0800


     [ https://issues.apache.org/jira/browse/SDAP-440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Riley Kuttruff updated SDAP-440:
--------------------------------
    Description: 
In our current design, when tiles are loaded the data is formatted to be shaped 
like gridded data (L3/L4). This is obviously fine for L4/L3 tiles. The problem 
is with L2 (swath) tiles. The swath -> grid-like transform requires 
transforming an m x n data array for the L2 tile to an (m * n) x (m * n) array 
with the original data values occupying the diagonal of the array and the rest 
of the array locations unused. It goes without saying that this is EXTREMELY 
inefficient for memory, and L2 tile sizes >15x15 can very easily consume memory 
by the gigabyte.
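
To make the cost concrete, here is a minimal sketch of the diagonal layout described above, assuming 8-byte floats and NaN as the fill for unused cells. The function names are hypothetical illustrations and are not taken from the SDAP code base.

```python
import numpy as np


def swath_nbytes(m, n, itemsize=8):
    """Bytes needed to hold an m x n swath tile as-is."""
    return m * n * itemsize


def gridded_nbytes(m, n, itemsize=8):
    """Bytes needed once the tile is expanded to (m * n) x (m * n)."""
    cells = m * n
    return cells * cells * itemsize


def swath_to_diagonal_grid(data):
    """Place each swath value on the diagonal of an (m * n) x (m * n) array.

    Assumption: unused cells are filled with NaN. The real transform may
    use a masked array or another fill value.
    """
    flat = np.asarray(data, dtype=float).ravel()
    grid = np.full((flat.size, flat.size), np.nan)
    np.fill_diagonal(grid, flat)
    return grid


if __name__ == "__main__":
    for side in (15, 30, 100):
        print(side, swath_nbytes(side, side), gridded_nbytes(side, side))
```

Under these assumptions a single 15x15 tile grows from 1,800 bytes to 405,000 bytes, and a 100x100 tile would need 800,000,000 bytes, so the total depends heavily on how many tiles are held at once.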

 

Proposed solution: Instead of handling loaded tiles in gridded format, handle 
them in swath format. This would remove the issues from having to transform L2 
tiles, but would still require expanding the latitude and longitude (and time?) 
arrays to match shape with the data array. This would require SIGNIFICANTLY 
less extra memory to do (I even believe numpy can do it with constant extra 
memory rather than the expected O(n)).
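
The constant-extra-memory expansion mentioned above can be sketched with `numpy.broadcast_to`, which returns a read-only view instead of a copy. This assumes 1-D latitude and longitude arrays, with latitude varying along the first data axis and longitude along the second. The `expand_coords` helper is a hypothetical name, not an existing SDAP function.

```python
import numpy as np


def expand_coords(lat, lon, data_shape):
    """Expand 1-D lat/lon arrays to the 2-D shape of the data array.

    np.broadcast_to returns read-only views over the original buffers,
    so no per-element copy is allocated.
    Assumption: lat has length data_shape[0], lon has length data_shape[1].
    """
    lat2d = np.broadcast_to(lat[:, np.newaxis], data_shape)
    lon2d = np.broadcast_to(lon[np.newaxis, :], data_shape)
    return lat2d, lon2d


lat = np.linspace(-10.0, 10.0, 4)
lon = np.linspace(100.0, 120.0, 5)
data = np.zeros((4, 5))

lat2d, lon2d = expand_coords(lat, lon, data.shape)
# Both results have the data's shape but share memory with the 1-D inputs.
print(lat2d.shape, np.shares_memory(lat2d, lat))
print(lon2d.shape, np.shares_memory(lon2d, lon))
```

The views are read-only, so any algorithm that needs to modify the coordinates would have to copy them first.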

The problem with this is that we would need to individually adapt each of 
SDAP's algorithms to work with swath formatted data rather than grid formatted 
data. The scale of this required change has caused us to hold off on this 
implementation.

 

Plan: I plan to mitigate that issue by adapting the NexusTileService to be 
(temporarily) configurable to allow choice in how returned tile data is 
formatted (default will be gridded). We can then roll out the changes for the 
various algorithms and switch over their NTS to serve swath data. Upon 
completion of the rollout, we can (optionally) remove the configuration option 
from the NTS or switch its default to swath.
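
One way the temporary option could look is sketched below. This is an illustration only: `TileFormat`, `TileServiceSketch`, and `format_tile` are hypothetical names, not the actual NexusTileService API, and the gridded branch assumes the diagonal layout with NaN fill described earlier.

```python
from enum import Enum

import numpy as np


class TileFormat(Enum):
    """Hypothetical option controlling how returned tile data is shaped."""
    GRIDDED = "gridded"
    SWATH = "swath"


class TileServiceSketch:
    """Stand-in for a tile service; not the real NexusTileService."""

    def __init__(self, tile_format=TileFormat.GRIDDED):
        # Gridded stays the default so algorithms that have not been
        # adapted yet keep working unchanged.
        self.tile_format = tile_format

    def format_tile(self, data):
        data = np.asarray(data, dtype=float)
        if self.tile_format is TileFormat.SWATH:
            # Swath: return the m x n array without any transform.
            return data
        # Gridded: expand to (m * n) x (m * n) with values on the diagonal.
        flat = data.ravel()
        grid = np.full((flat.size, flat.size), np.nan)
        np.fill_diagonal(grid, flat)
        return grid


tile = np.arange(6).reshape(2, 3)
legacy = TileServiceSketch().format_tile(tile)
adapted = TileServiceSketch(TileFormat.SWATH).format_tile(tile)
print(legacy.shape, adapted.shape)
```

An algorithm that has been adapted would construct its service with the swath option, while everything else continues to receive gridded data until the rollout is complete.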



> Switch handling of tile data to L2 format
> -----------------------------------------
>
>                 Key: SDAP-440
>                 URL: https://issues.apache.org/jira/browse/SDAP-440
>             Project: Apache Science Data Analytics Platform
>          Issue Type: Task
>            Reporter: Riley Kuttruff
>            Priority: Major
>
> In our current design, when tiles are loaded the data is formatted to be 
> shaped like gridded data (L3/L4). This is obviously fine for L4/L3 tiles. The 
> problem is with L2 (swath) tiles. The swath -> grid-like transform requires 
> transforming an m x n data array for the L2 tile to an (m * n) x (m * n) 
> array with the original data values occupying the diagonal of the array and 
> the rest of the array locations unused. It goes without saying that this is 
> EXTREMELY inefficient for memory, and L2 tile sizes >15x15 can very easily 
> consume memory by the gigabyte.
>  
> Proposed solution: Instead of handling loaded tiles in gridded format, handle 
> them in swath format. This would remove the issues from having to transform 
> L2 tiles, but would still require expanding the latitude and longitude (and 
> time?) arrays to match shape with the data array. This would require 
> SIGNIFICANTLY less extra memory to do (I even believe numpy can do it with 
> constant extra memory rather than the expected O(n)).
> The problem with this is that we would need to individually adapt each of 
> SDAP's algorithms to work with swath formatted data rather than grid 
> formatted data. The scale of this required change has caused us to hold off 
> on this implementation.
>  
> Plan: I plan to mitigate that issue by adapting the NexusTileService to be 
> (temporarily) configurable to allow choice in how returned tile data is 
> formatted (default will be gridded). We can then roll out the changes for the 
> various algorithms and switch over their NTS to serve swath data. Upon 
> completion of the rollout, we can (optionally) remove the configuration 
> option from the NTS or switch its default to swath.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
