[ https://issues.apache.org/jira/browse/SDAP-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph C. Jacob updated SDAP-326:
---------------------------------
    Component/s: collection-ingester

> Make ingest processors optional in incubator-sdap-ingestor
> ----------------------------------------------------------
>
>                 Key: SDAP-326
>                 URL: https://issues.apache.org/jira/browse/SDAP-326
>             Project: Apache Science Data Analytics Platform
>          Issue Type: Task
>          Components: collection-ingester, granule-ingester
>            Reporter: Joseph C. Jacob
>            Priority: Major
>
> h3. The Problem:
> The old *incubator-sdap-ningesterpy* / *incubator-sdap-ningester* required
> that the processors to be applied to each dataset at ingest time be listed in
> the configuration file for that dataset. The new *incubator-sdap-ingester*
> applies these processors automatically and has no mechanism to change that
> behavior via a data collection config setting. This is a problem for the
> processor that converts any variable with units "kelvin" to units "celsius",
> because some variables have units of "kelvin" but represent a difference from
> a norm and should not be transformed.
> Currently, "*kelvintocelsius*" is the only processor that has been identified
> as one that we need to be able to turn off. However, the same issue may apply
> to any units conversion or to other processors added in the future.
> h3. The Details:
> In particular, for the *{{MUR25-JPL-L4-GLOB-v4.2}}* dataset, we commonly
> ingest both *{{analysed_sst}}* and *{{sst_anomaly}}*, both of which natively
> have units of kelvin. However, *{{sst_anomaly}}* represents a difference from
> some norm and should not be subject to the “subtract 273.15” operation. An
> *{{sst_anomaly}}* of 0 in kelvin is still a 0 degree “anomaly” or
> “difference” in degrees Celsius. So, we need to restrict which variables get
> this operation applied to them.
> h3. Proposed Solution:
> I propose to solve this in a way that is not specific to the
> *kelvintocelsius* processor. That processor is currently the only one we have
> identified as needing an off switch, but there may be others in the future.
> The proposed solution is to add a keyword in the *collections-config* where
> we can list any processors to be turned OFF for a dataset. Then we would just
> need to check that a processor is not in this list before applying it. This
> approach would work for the *kelvintocelsius* processor and for any other
> processor that is already supported or is added in the future.
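
The check described under "Proposed Solution" could look roughly like the sketch below. It is only an illustration under stated assumptions: the config key name `disabled_processors`, the `DEFAULT_PROCESSORS` registry, and the `apply_processors` helper are all hypothetical and are not taken from the incubator-sdap-ingester code base. Only the processor name *kelvintocelsius* and the "subtract 273.15" operation come from the issue text.

```python
from typing import Callable, Dict, Iterable, List

# A "processor" here is simply a function that transforms a variable's values.
Processor = Callable[[List[float]], List[float]]


def kelvin_to_celsius(values: List[float]) -> List[float]:
    """Apply the "subtract 273.15" operation to every value."""
    return [v - 273.15 for v in values]


# Hypothetical registry of processors that would otherwise run automatically.
DEFAULT_PROCESSORS: Dict[str, Processor] = {
    "kelvintocelsius": kelvin_to_celsius,
}


def apply_processors(values: List[float],
                     disabled: Iterable[str] = ()) -> List[float]:
    """Run every default processor whose name is not in the disabled list."""
    skip = {name.lower() for name in disabled}
    for name, processor in DEFAULT_PROCESSORS.items():
        if name in skip:
            continue  # turned OFF for this dataset in its collection config
        values = processor(values)
    return values


# Hypothetical collection entries; "disabled_processors" is an assumed key name.
collections = [
    {"variable": "analysed_sst"},
    {"variable": "sst_anomaly", "disabled_processors": ["kelvintocelsius"]},
]

for entry in collections:
    result = apply_processors([273.15], entry.get("disabled_processors", ()))
    print(entry["variable"], result)
```

Because the list names processors to turn off rather than processors to run, a collection that omits the key keeps the current automatic behavior, and any processor added later can be disabled the same way without further code changes.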