Hi everyone,

We've been running into major performance issues when loading swath-formatted tiles, due to how SDAP transforms them to work with the existing algorithm implementations. I estimate the transformation wastes memory on the order of n^4 in the tile size. Stepheny proposed changing how SDAP's algorithms consume tile data so that they handle swath data natively, with gridded data requiring only a much cheaper transformation. [1]
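For anyone curious where an n^4 figure can come from, here is a hedged, purely illustrative sketch (not SDAP code, and the variable names are my own): a swath tile carries per-pixel lat/lon arrays, so an n x n tile holds n^2 observations, each with its own coordinates. Naively regridding onto axes built from every distinct coordinate can produce up to n^2 x n^2 = n^4 grid cells, nearly all fill values.

```python
import numpy as np

# Hypothetical illustration of the swath -> grid memory blowup.
# An n x n swath tile stores a latitude and longitude per pixel.
n = 32
lat = np.linspace(-60, 60, n * n).reshape(n, n)     # per-pixel latitudes (all distinct)
lon = np.linspace(-180, 180, n * n).reshape(n, n)   # per-pixel longitudes (all distinct)
data = np.arange(n * n, dtype=float).reshape(n, n)  # n^2 actual observations

# A naive regrid builds 1-D coordinate axes from every unique value,
# so the output grid has up to n^2 lats x n^2 lons = n^4 cells.
grid_lat = np.unique(lat)   # up to n^2 entries
grid_lon = np.unique(lon)   # up to n^2 entries
gridded_cells = grid_lat.size * grid_lon.size

print(f"swath cells:   {data.size}")        # n^2 observations
print(f"gridded cells: {gridded_cells}")    # up to n^4 cells, mostly fill values
```

In the worst case (no repeated coordinates, as above) only 1 in n^2 grid cells holds real data; the rest is wasted memory, which is the cost the proposed change avoids.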
I have been working on a solution that would let us roll this adaptation out gradually: unadapted algorithms would continue to use the old method, so we don't have to make the change all at once [2]. I have also successfully adapted the /match_spark [3] and /cdmssubset [no PR yet] endpoints to work with these changes, though I have yet to fully compare memory usage between the two approaches.

A concern was raised about this change, so I'd like to ask whether any current SDAP-implementing projects would be adversely affected by it. In other words, do any SDAP algorithms *absolutely require* gridded-format data? Concerns were raised specifically for time-series analyses.

As a side question: is anyone familiar with the TimeSeriesTile format [4], or can it be safely removed?

Thanks,
Riley

[1] https://issues.apache.org/jira/browse/SDAP-440
[2] https://github.com/apache/incubator-sdap-nexus/pull/229
[3] https://github.com/apache/incubator-sdap-nexus/pull/230
[4] https://github.com/apache/incubator-sdap-nexusproto/blob/da59f5136b7b979b040ea870be80d3e734ed1cd8/src/main/proto/DataTile.proto#L97-L109