GitHub user agoodm opened a pull request: https://github.com/apache/climate/pull/451

    CLIMATE-926 - Metadata Extractors

    For now this is only being used for our soon to be shown CORDEX examples, but I do plan to make this a more permanent part of OCW with unit tests and documentation.

    Example usage in an ipython shell:

```python
In [1]: from metadata_extractor import CORDEXMetadataExtractor, obs4MIPSMetadataExtractor

In [2]: obs_extractor = obs4MIPSMetadataExtractor('/proj3/data/obs4mips')

In [3]: models_extractor = CORDEXMetadataExtractor('/proj3/data/CORDEX/NAM-44/*')

In [4]: obs_extractor.instruments
Out[4]:
{'AIRS',
 'AMSRE',
 'ATSR',
 'AVISO',
 'CALIOP',
 'CERES-EBAF',
 'GPCP-1DD',
 'GPCP-SG',
 'HIRS',
 'MISR',
 'MLS',
 'MODIS',
 'MODIS-Level',
 'MODIS-level',
 'OISST',
 'QuikSCAT',
 'RO',
 'SSMI-MERIS',
 'SSMI-NH',
 'SSMI-SH',
 'TES',
 'TRMM-L3',
 'ghgcci'}

In [5]: models_extractor.models
Out[5]: {'DMI-HIRHAM5', 'MOHC-HadRM3P', 'SMHI-RCA4', 'UQAM-CRCM5'}

In [6]: obs_extractor.query(variable='pr')
Out[6]:
[{'filename': '/proj3/data/obs4mips/pr_GPCP-SG_L3_v2.2_*.nc',
  'instrument': 'GPCP-SG',
  'processing_level': 'L3',
  'variable': 'pr',
  'version': 'v2.2'},
 {'filename': '/proj3/data/obs4mips/pr_GPCP-1DD_L3_v1.2_*.nc',
  'instrument': 'GPCP-1DD',
  'processing_level': 'L3',
  'variable': 'pr',
  'version': 'v1.2'},
 {'filename': '/proj3/data/obs4mips/pr_TRMM-L3_*.nc',
  'instrument': 'TRMM-L3',
  'variable': 'pr'}]

In [7]: obs_extractor.group(models_extractor, 'variable')[0]
Out[7]:
({'filename': '/proj3/data/obs4mips/rlut_CERES-EBAF_L3B_Ed2-8_*.nc',
  'instrument': 'CERES-EBAF',
  'processing_level': 'L3B',
  'variable': 'rlut',
  'version': 'Ed2-8'},
 [{'domain': 'NAM-44',
   'driving_model': 'ECMWF-ERAINT',
   'ensemble': 'r1i1p1',
   'experiment': 'evaluation',
   'filename': '/proj3/data/CORDEX/NAM-44/rlut/rlut_NAM-44_ECMWF-ERAINT_evaluation_r1i1p1_MOHC-HadRM3P_v1_mon_*.nc',
   'model': 'MOHC-HadRM3P',
   'time_step': 'mon',
   'variable': 'rlut',
   'version': 'v1'},
  {'domain': 'NAM-44',
   'driving_model': 'ECMWF-ERAINT',
   'ensemble': 'r1i1p1',
   'experiment': 'evaluation',
   'filename': '/proj3/data/CORDEX/NAM-44/rlut/rlut_NAM-44_ECMWF-ERAINT_evaluation_r1i1p1_UQAM-CRCM5_v1_mon_*.nc',
   'model': 'UQAM-CRCM5',
   'time_step': 'mon',
   'variable': 'rlut',
   'version': 'v1'},
  {'domain': 'NAM-44',
   'driving_model': 'ECMWF-ERAINT',
   'ensemble': 'r1i1p1',
   'experiment': 'evaluation',
   'filename': '/proj3/data/CORDEX/NAM-44/rlut/rlut_NAM-44_ECMWF-ERAINT_evaluation_r1i1p1_SMHI-RCA4_v1_mon_*.nc',
   'model': 'SMHI-RCA4',
   'time_step': 'mon',
   'variable': 'rlut',
   'version': 'v1'},
  {'domain': 'NAM-44',
   'driving_model': 'ECMWF-ERAINT',
   'ensemble': 'r1i1p1',
   'experiment': 'evaluation',
   'filename': '/proj3/data/CORDEX/NAM-44/rlut/rlut_NAM-44_ECMWF-ERAINT_evaluation_r1i1p1_DMI-HIRHAM5_v1_mon_*.nc',
   'model': 'DMI-HIRHAM5',
   'time_step': 'mon',
   'variable': 'rlut',
   'version': 'v1'}])
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/agoodm/climate CLIMATE-926

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/climate/pull/451.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #451

----
commit 8217d12f06987d852f9294da94a5af243116e751
Author: Alex Goodman <ago...@users.noreply.github.com>
Date:   2017-09-25T17:35:20Z

    CLIMATE-926 - Metadata Extractors

commit ec81037a21dbf2cf1999422127bbe924e1072c4a
Author: Alex Goodman <ago...@users.noreply.github.com>
Date:   2017-09-25T17:41:07Z

    Fix model attribute in CORDEXMetadataExtractor

----

---
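
For readers who want a feel for how the `query` and `group` calls in the example session might behave, here is a minimal sketch. It is not the code from this pull request: the class `FilenameMetadataExtractor`, the field tuples, the sample paths, and the trailing time-range field are all hypothetical, and it assumes every file name consists of underscore-separated fields in a fixed order. Names with fewer fields, such as the `pr_TRMM-L3_*.nc` entry in the session, would need extra handling that is omitted here.

```python
import os

# Assumed field orders, inferred from the file names shown in the session.
# The final 'time_range' field is an assumption standing in for the '*' part.
OBS_FIELDS = ('variable', 'instrument', 'processing_level', 'version',
              'time_range')
MODEL_FIELDS = ('variable', 'domain', 'driving_model', 'experiment',
                'ensemble', 'model', 'version', 'time_step', 'time_range')


class FilenameMetadataExtractor:
    """Hypothetical sketch: derive metadata from underscore-separated names."""

    def __init__(self, filenames, fields):
        self.fields = fields
        self.data = [self._parse(name) for name in filenames]

    def _parse(self, filename):
        # Drop the directory and the extension, then split into fields.
        stem = os.path.splitext(os.path.basename(filename))[0]
        meta = dict(zip(self.fields, stem.split('_')))
        meta['filename'] = filename
        return meta

    def query(self, **kwargs):
        # Keep only the records whose fields match every keyword given.
        return [m for m in self.data
                if all(m.get(k) == v for k, v in kwargs.items())]

    def group(self, other, field):
        # Pair each record with the records in `other` that share `field`;
        # records without a match are skipped.
        pairs = []
        for meta in self.data:
            value = meta.get(field)
            if value is None:
                continue
            matches = other.query(**{field: value})
            if matches:
                pairs.append((meta, matches))
        return pairs


# Made-up paths that follow the naming pattern shown in the session.
obs = FilenameMetadataExtractor(
    ['/data/obs/pr_GPCP-SG_L3_v2.2_197901-201312.nc',
     '/data/obs/rlut_CERES-EBAF_L3B_Ed2-8_200003-201404.nc'],
    OBS_FIELDS)
models = FilenameMetadataExtractor(
    ['/data/models/rlut_NAM-44_ECMWF-ERAINT_evaluation_r1i1p1_'
     'SMHI-RCA4_v1_mon_198001-201012.nc',
     '/data/models/rlut_NAM-44_ECMWF-ERAINT_evaluation_r1i1p1_'
     'UQAM-CRCM5_v1_mon_198001-201012.nc'],
    MODEL_FIELDS)

pr_obs = obs.query(variable='pr')
grouped = obs.group(models, 'variable')
print(pr_obs[0]['instrument'])
print([m['model'] for m in grouped[0][1]])
```

In this sketch `group` returns a list of `(record, matching_records)` tuples, which mirrors the shape of the `Out[7]` result above; whether the actual extractor builds its result the same way is not stated in this message.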