skorper opened a new pull request, #235: URL: https://github.com/apache/incubator-sdap-nexus/pull/235
Fixed a bug where satellite-to-satellite queries fail when using an L2 dataset. The cause was that secondary satellite tile masks were being combined incorrectly, which masked the entire secondary tile. In a NumPy masked array, `True` in the mask means the value is invalid, so a `logical_or` against an entirely masked array yields an entirely masked array.

This issue cropped up when running sat-to-sat matchup with VIIRS as the secondary dataset. VIIRS contains many null values for some variables; in many cases the entire variable in a tile is masked, which triggers the problem above.

Simplifying the problem, our old logic was like this:

```python
>>> import numpy as np
>>> a = np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[0, 0, 1, 0])
>>> b = np.ma.masked_array([5.0, 6.0, 7.0, 8.0], mask=[1, 1, 1, 1])
>>> np.logical_or(a, b)
masked_array(data=[--, --, --, --],
             mask=[ True,  True,  True,  True],
       fill_value=1e+20,
            dtype=bool)
```

where `True` in the mask means drop the value and `False` means keep the value. This is not what we want! We want the inverse logic, where a masked array "or'd" against an entirely masked array keeps the values that are valid in the first array. Our new logic is like this:

```python
>>> np.logical_not(np.logical_and(a.mask, b.mask))
array([ True,  True, False,  True])
```

where `True` means keep the value and `False` means drop the value.

In addition to the above, made a few small changes:

1. Only query the insitu API for the schema if `parameter_s` is provided.
2. Retrieve tiles one by one rather than all at once when finding/retrieving data for secondary tiles.
3. If no secondary tiles are found (in sat-to-sat matchup), handle it gracefully and return `[]` rather than letting an error be raised.
4. Fixed a bug where only the first two variables were considered in the tile mask computed in `get_indices`.

Tested like so:

- Tested Shawn's `ASCATB-L2-Coastal` -> `VIIRS_NPP-2018_Heatwave` query locally. It works!
- Manually ran Riley's regression tests; all passed. NOTE: Only the sat-to-sat tests were run for now. See below.

Please note the following needs to be done before this PR is approved/merged:

1. Run the full regression test suite. This is not currently possible because the insitu API is down.
2. Run benchmarks for change (2) above, to ensure that retrieving sat tiles one by one is faster than retrieving them all at once.
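For reviewers who want to experiment with the mask logic described in this PR, here is a minimal, self-contained sketch. `combine_keep_mask` is a hypothetical helper name, not a function from this change. It assumes every variable is a masked array of the same shape, and it generalizes the two-array expression to any number of variables by reducing over all masks, so no variable is silently ignored.

```python
import numpy as np


def combine_keep_mask(variables):
    """Return a boolean array where True means "keep this position".

    Hypothetical helper: a position is dropped only when it is masked
    in every variable, mirroring not(and(a.mask, b.mask)) for N inputs.
    """
    # getmaskarray always returns a full boolean array, even when a
    # variable has no masked values (its mask is np.ma.nomask).
    masks = [np.ma.getmaskarray(v) for v in variables]
    # reduce applies logical_and across all masks, not just the first two.
    masked_everywhere = np.logical_and.reduce(masks)
    return np.logical_not(masked_everywhere)


a = np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[0, 0, 1, 0])
b = np.ma.masked_array([5.0, 6.0, 7.0, 8.0], mask=[1, 1, 1, 1])
c = np.ma.masked_array([9.0, 10.0, 11.0, 12.0], mask=[1, 0, 1, 1])

keep_ab = combine_keep_mask([a, b])
keep_bc = combine_keep_mask([b, c])
print(keep_ab.tolist())  # [True, True, False, True]
print(keep_bc.tolist())  # [False, True, False, False]
```

Because `b` is entirely masked, it contributes nothing to either result: `keep_ab` matches the valid positions of `a`, and `keep_bc` matches the valid positions of `c`.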
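The outstanding benchmark comparing one-by-one tile retrieval with bulk retrieval could be structured roughly as follows. This is only a sketch of a timing harness: `fetch_all_at_once` and `fetch_one_by_one` are hypothetical stand-ins that return dummy dictionaries, and they would need to be replaced with the real tile retrieval calls before the numbers mean anything.

```python
import timeit


def fetch_all_at_once(tile_ids):
    # Stand-in for a single bulk request; replace with the real call.
    return [{"tile_id": t} for t in tile_ids]


def fetch_one_by_one(tile_ids):
    # Stand-in for one request per tile; replace with the real call.
    tiles = []
    for t in tile_ids:
        tiles.extend(fetch_all_at_once([t]))
    return tiles


def compare(tile_ids, repeat=5):
    """Return the best wall-clock time in seconds for each strategy."""
    results = {}
    strategies = (
        ("all_at_once", fetch_all_at_once),
        ("one_by_one", fetch_one_by_one),
    )
    for name, fn in strategies:
        # number=1 times a single full retrieval; min() reduces noise.
        timings = timeit.repeat(lambda: fn(tile_ids), number=1, repeat=repeat)
        results[name] = min(timings)
    return results


if __name__ == "__main__":
    print(compare(list(range(100))))
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; for network-bound retrieval, the median or the full distribution may be more informative.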