[ 
https://issues.apache.org/jira/browse/OODT-667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated OODT-667:
-----------------------------------

    Fix Version/s:     (was: 0.7)
                   0.8

- push to 0.8, get ready for 0.7 release.

> CAS-PGE no longer respects writers and file tags from earlier pgeConfig.xml 
> files
> ---------------------------------------------------------------------------------
>
>                 Key: OODT-667
>                 URL: https://issues.apache.org/jira/browse/OODT-667
>             Project: OODT
>          Issue Type: Bug
>          Components: pge wrapper framework
>    Affects Versions: 0.4, 0.5, 0.6
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>              Labels: back, compat, config, files, fix, pge, regex
>             Fix For: 0.8
>
>
> It's been a long-standing bug since Apache OODT 0.3 (0.4 and beyond) that the 
> updates to CAS-PGE, which simplified its crawling system for met extraction 
> based on files and regExp tags and unified it with the 
> AutoDetectProductCrawler, have caused cas-pge to no longer honor the 
> following blocks from pgeConfig.xml files:
> {code:xml}
> <output>
>   <dir>
>     <files regExp="someRegExp" metWriter="some.class" args="some args"/>
>     <!--...-->
>   </dir>
> </output>
> {code}
> This was a conscious decision, discussed by Brian Foster, myself, and 
> others on several occasions:
> https://issues.apache.org/jira/browse/OODT-426
> http://markmail.org/message/oe5tmutu374wqldb
> I support Brian's implementation but I think we took a step back in not 
> offering backwards compatibility that simply:
> 1. still reads the pgeConfig.xml files tags above and then;
> 2. constructs the appropriate AutoDetectProductCrawler and 
> RenamingConventions and other plumbing behind the scenes.
> Note that one of the key features that becomes important in these situations 
> is to have CAS-PGE job directories contain the metadata files serialized for 
> offline inspection in case there are errors. We have currently lost support for 
> that (as evidenced by the removal of the met key MET_FILE_EXT). I am also 
> going to add that back in, and simply subclass AutoDetectProductCrawler in 
> cas-pge, and then override its crawling step to also serialize the met files 
> it generates. 
> That will get us back to full forwards and backwards compat support starting 
> in 0.7 for *all* versions of CAS-PGE pgeConfig.xml files. Wish me luck!
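
For illustration only, step 1 of the description above (still reading the 
legacy files tags) might look roughly like the Java sketch below. It is not 
the CAS-PGE implementation: LegacyFilesTagReader and FilesTag are hypothetical 
names, only the JDK's built-in DOM parser is used, and the attribute names 
(regExp, metWriter, args) are taken from the XML snippet in the description 
rather than from any particular pgeConfig.xml schema version.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper (not part of OODT): collects legacy files tags. */
final class LegacyFilesTagReader {

  /** One legacy files entry; field names mirror the XML attributes. */
  static final class FilesTag {
    final String regExp;
    final String metWriter;
    final String args;

    FilesTag(String regExp, String metWriter, String args) {
      this.regExp = regExp;
      this.metWriter = metWriter;
      this.args = args;
    }
  }

  /**
   * Parses the given XML text and returns every element named "files",
   * wherever it appears in the document. A missing attribute is read as
   * the empty string, which is what Element.getAttribute returns.
   */
  static List<FilesTag> read(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      NodeList nodes = doc.getElementsByTagName("files");
      List<FilesTag> tags = new ArrayList<>();
      for (int i = 0; i < nodes.getLength(); i++) {
        Element e = (Element) nodes.item(i);
        tags.add(new FilesTag(
            e.getAttribute("regExp"),
            e.getAttribute("metWriter"),
            e.getAttribute("args")));
      }
      return tags;
    } catch (ParserConfigurationException | SAXException | IOException ex) {
      throw new IllegalArgumentException("Could not parse pgeConfig XML", ex);
    }
  }
}
```

Each FilesTag returned this way would then be the input to step 2, i.e. 
deciding which crawler and naming plumbing to build behind the scenes; that 
mapping is deliberately left out here because it depends on CAS-PGE internals.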



--
This message was sent by Atlassian JIRA
(v6.2#6252)
