Hi everyone,

We've been running into major performance issues when loading swath-formatted tiles, due to how SDAP transforms them to work with the existing algorithm implementations. I estimate the transformation wastes memory on the order of n^4 in the tile size. Stepheny proposed changing how SDAP's algorithms consume tile data so that they handle swath data natively, with gridded data requiring only a much cheaper transformation. [1]
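For anyone curious where an n^4 figure can come from, here is a hedged, purely illustrative sketch (not SDAP code, and the variable names are my own): a swath tile carries per-pixel lat/lon arrays, so an n x n tile holds n^2 observations, each with its own coordinates. Naively regridding onto axes built from every distinct coordinate can produce up to n^2 x n^2 = n^4 grid cells, nearly all fill values.

```python
import numpy as np

# Hypothetical illustration of the swath -> grid memory blowup.
# An n x n swath tile stores a latitude and longitude per pixel.
n = 32
lat = np.linspace(-60, 60, n * n).reshape(n, n)     # per-pixel latitudes (all distinct)
lon = np.linspace(-180, 180, n * n).reshape(n, n)   # per-pixel longitudes (all distinct)
data = np.arange(n * n, dtype=float).reshape(n, n)  # n^2 actual observations

# A naive regrid builds 1-D coordinate axes from every unique value,
# so the output grid has up to n^2 lats x n^2 lons = n^4 cells.
grid_lat = np.unique(lat)   # up to n^2 entries
grid_lon = np.unique(lon)   # up to n^2 entries
gridded_cells = grid_lat.size * grid_lon.size

print(f"swath cells:   {data.size}")        # n^2 observations
print(f"gridded cells: {gridded_cells}")    # up to n^4 cells, mostly fill values
```

In the worst case (no repeated coordinates, as above) only 1 in n^2 grid cells holds real data; the rest is wasted memory, which is the cost the proposed change avoids.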
I have been working on a solution that would let us roll this adaptation out gradually: unadapted algorithms would continue to use the old method, so we don't have to make the change all at once [2]. I have also successfully adapted the /match_spark [3] and /cdmssubset [no PR yet] endpoints to work with these changes, though I have yet to fully compare memory usage between the two approaches.

A concern was raised about this change, so I'd like to ask whether any current SDAP-implementing projects would be adversely affected by it. In other words, do any SDAP algorithms *absolutely require* gridded-format data? Concerns were raised specifically for time-series analyses.

As a side question: is anyone familiar with the TimeSeriesTile format [4], or can it be safely removed?

Thanks,
Riley

[1] https://issues.apache.org/jira/browse/SDAP-440
[2] https://github.com/apache/incubator-sdap-nexus/pull/229
[3] https://github.com/apache/incubator-sdap-nexus/pull/230
[4] https://github.com/apache/incubator-sdap-nexusproto/blob/da59f5136b7b979b040ea870be80d3e734ed1cd8/src/main/proto/DataTile.proto#L97-L109