[ 
https://issues.apache.org/jira/browse/SDAP-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph C. Jacob updated SDAP-326:
---------------------------------
    Component/s: collection-ingester

> Make ingest processors optional in incubator-sdap-ingestor
> ----------------------------------------------------------
>
>                 Key: SDAP-326
>                 URL: https://issues.apache.org/jira/browse/SDAP-326
>             Project: Apache Science Data Analytics Platform
>          Issue Type: Task
>          Components: collection-ingester, granule-ingester
>            Reporter: Joseph C. Jacob
>            Priority: Major
>
> h3. The Problem:
> The old *incubator-sdap-ningesterpy* / *incubator-sdap-ningester* required 
> that we list the processors to be applied to each dataset at ingest time in 
> the configuration file for the dataset.  The new *incubator-sdap-ingester* 
> applies these processors automatically and has no mechanism to change the 
> behavior via a data collection config setting.  This is a problem with the 
> processor that converts any variable with units "kelvin" to units "celsius" 
> because some variables are in units "kelvin", but represent a difference from 
> a norm and should not be transformed.
> Currently, "*kelvintocelsius*" is the only processor that has been identified 
> as one that we need to be able to turn off.  However, this may apply to any 
> units conversion or to other processors added in the future.
> h3. The Details:
> In particular, for the *{{MUR25-JPL-L4-GLOB-v4.2}}* dataset, we commonly 
> ingest both *{{analysed_sst}}* and *{{sst_anomaly}}*, both of which 
> natively have units of kelvin, but *{{sst_anomaly}}* represents a 
> difference from some norm and should not be subject to the “subtract 273.15” 
> operation.  An *{{sst_anomaly}}* of 0 kelvin is still a 0-degree 
> “anomaly” or “difference” in degrees Celsius.  So, we need to restrict 
> which variables get this operation applied to them.
> h3. Proposed Solution:
> I propose to solve this in a way that is not specific to the *kelvintocelsius* 
> processor.  Currently that processor is the only one that has been identified 
> as one that we need to be able to turn off, but there may be others in the 
> future.  The proposed solution is to add a keyword in the 
> *collections-config* where we can list any processors to be turned OFF for a 
> dataset.  Then we would just need to check that a processor is not in this 
> list before applying it.  This approach would work for the *kelvintocelsius* 
> processor and any other processor that is already supported or is added in 
> the future.
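
The check proposed above could look roughly like the following. This is only a minimal sketch and not the actual incubator-sdap-ingester code: the config key name "disabled-processors", the processor registry, and the dictionary layout used for variables are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical data layout: {variable name: {"units": str, "values": list of floats}}
Variables = Dict[str, dict]
Processor = Callable[[Variables], Variables]


def kelvin_to_celsius(variables: Variables) -> Variables:
    """Convert every variable whose units are "kelvin" to "celsius"."""
    converted = {}
    for name, var in variables.items():
        if var["units"].lower() == "kelvin":
            converted[name] = {
                "units": "celsius",
                "values": [v - 273.15 for v in var["values"]],
            }
        else:
            converted[name] = var
    return converted


# Hypothetical registry of processors that are applied automatically.
DEFAULT_PROCESSORS: Dict[str, Processor] = {
    "kelvintocelsius": kelvin_to_celsius,
}


def run_processors(
    variables: Variables,
    collection_config: dict,
    processors: Dict[str, Processor] = DEFAULT_PROCESSORS,
) -> Variables:
    """Apply each registered processor unless the collection turns it off."""
    # "disabled-processors" is an assumed key name, not an existing setting.
    disabled = set(collection_config.get("disabled-processors", []))
    for name, processor in processors.items():
        if name in disabled:
            continue  # the collection asked for this processor to be skipped
        variables = processor(variables)
    return variables
```

With this sketch, a collection entry that lists kelvintocelsius under the assumed key keeps its values in kelvin, while an entry without the key is still converted automatically. Because the check is keyed on the processor name, nothing in it is specific to the kelvin conversion.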



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
