GitHub user agoodm commented on the issue: https://github.com/apache/climate/pull/374

@huikyole Not sure I agree with your first suggestion. As I understood it, `data_source` was originally made a separate directory because there were multiple modules, one for loading from each data source. The modules in the base ocw directory each represent an individual step in the workflow, e.g. dataset processing, running evaluations, and plotting. The main step that was missing previously was loading the datasets, which is exactly what this module aims to do. So for now I think leaving it here is appropriate. I think @lewismc should share his thoughts on this too, though.

I absolutely agree with your second suggestion. I originally set it up this way because, to my recollection, the rest of the OCW codebase (particularly metrics and evaluations) was designed around a single "reference" dataset. Given that Loikith et al. 2013 uses two reanalysis datasets, we should get rid of this rigid assumption, not only for `dataset_loader.py` but potentially for `evaluation.py` as well. For now, changing the former obviously takes precedence, but we should consider exploring the latter as well.
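
To make the second point concrete, here is a minimal sketch of what dropping the one-reference assumption could look like. Everything in it (`DatasetStub`, `LoaderSketch`, `add_references`, the dataset names) is hypothetical and only illustrates the idea; it is not the actual `dataset_loader.py` interface.

```python
# Hypothetical sketch: none of these names come from the OCW codebase.
from dataclasses import dataclass, field
from typing import List, Sequence, Union


@dataclass
class DatasetStub:
    """Stand-in for a loaded dataset; it only carries a name here."""
    name: str


@dataclass
class LoaderSketch:
    """Holds any number of reference datasets alongside the target datasets."""
    references: List[DatasetStub] = field(default_factory=list)
    targets: List[DatasetStub] = field(default_factory=list)

    def add_references(
        self, refs: Union[DatasetStub, Sequence[DatasetStub]]
    ) -> None:
        # Accept either a single dataset or a sequence of them, so callers
        # written for the old "one reference" style would keep working.
        if isinstance(refs, DatasetStub):
            refs = [refs]
        self.references.extend(refs)


loader = LoaderSketch()
loader.add_references(DatasetStub("reanalysis_a"))    # single reference
loader.add_references([DatasetStub("reanalysis_b")])  # list of references
reference_names = [d.name for d in loader.references]
```

Storing references as a list from the start means code that consumes them can loop over however many there are, rather than special-casing the two-reanalysis setup.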