[
https://issues.apache.org/jira/browse/OODT-667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris A. Mattmann updated OODT-667:
-----------------------------------
Fix Version/s: (was: 0.7)
0.8
- push to 0.8, get ready for 0.7 release.
> CAS-PGE no longer respects writers and file tags from earlier pgeConfig.xml
> files
> ---------------------------------------------------------------------------------
>
> Key: OODT-667
> URL: https://issues.apache.org/jira/browse/OODT-667
> Project: OODT
> Issue Type: Bug
> Components: pge wrapper framework
> Affects Versions: 0.4, 0.5, 0.6
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Labels: back, compat, config, files, fix, pge, regex
> Fix For: 0.8
>
>
> It's been a long-standing bug post Apache OODT 0.3 (0.4 and beyond) that the
> updates to CAS-PGE, which simplified its crawling system for met extraction
> based on files and regExp tags and unified it with the
> AutoDetectProductCrawler, have caused CAS-PGE to no longer honor the
> following blocks from pgeConfig.xml files:
> {code:xml}
> <output>
> <dir>
> <files regExp="someRegExp" metWriter="some.class" args="some args"/>
> <!--...-->
> </dir>
> </output>
> {code}
> This was a conscious decision, discussed by Brian Foster, myself, and
> others on several occasions:
> https://issues.apache.org/jira/browse/OODT-426
> http://markmail.org/message/oe5tmutu374wqldb
> I support Brian's implementation but I think we took a step back in not
> offering backwards compatibility that simply:
> 1. still reads the pgeConfig.xml files tags above and then;
> 2. constructs the appropriate AutoDetectCrawler and RenamingConventions and
> other plumbing behind the scenes.
> Note one of the key features that becomes important in these situations is to
> have CAS-PGE job directories contain the metadata files serialized for
> offline inspection in case there are errors. We have currently lost support
> for that (as evidenced by the removal of the met key MET_FILE_EXT). I am also
> going to add that back in, and simply subclass AutoDetectProductCrawler in
> cas-pge, and then override its crawling step to also serialize the met files
> it generates.
> That will get us back to full forwards and backwards compat support starting
> in 0.7 for *all* versions of CAS-PGE pgeConfig.xml files. Wish me luck!
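
As a rough illustration of step 1 in the description above (reading the legacy files tags), here is a minimal Python sketch. It is not OODT's Java implementation. The names LegacyFilesRule, read_legacy_files_rules and SAMPLE_CONFIG are hypothetical, the sample XML is just the fragment quoted in the issue, and the args attribute is kept as a raw string because its internal format is not stated here. Step 2 (building the crawler and renaming conventions from these rules) is left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacyFilesRule:
    """One legacy <files .../> entry (hypothetical helper type)."""
    reg_exp: str     # value of the regExp attribute
    met_writer: str  # value of the metWriter attribute (a class name)
    args: str        # raw args attribute; its format is not assumed here


def read_legacy_files_rules(xml_text):
    """Collect every <files> tag found directly under a <dir> element."""
    root = ET.fromstring(xml_text.strip())
    rules = []
    for dir_elem in root.iter("dir"):
        for files_elem in dir_elem.findall("files"):
            rules.append(
                LegacyFilesRule(
                    reg_exp=files_elem.get("regExp", ""),
                    met_writer=files_elem.get("metWriter", ""),
                    args=files_elem.get("args", ""),
                )
            )
    return rules


# The fragment quoted in the issue, used here as sample input.
SAMPLE_CONFIG = """
<output>
  <dir>
    <files regExp="someRegExp" metWriter="some.class" args="some args"/>
  </dir>
</output>
"""

if __name__ == "__main__":
    for rule in read_legacy_files_rules(SAMPLE_CONFIG):
        print(rule.reg_exp, rule.met_writer, rule.args)
```

A backwards-compatibility layer along the lines proposed in the issue would take each such rule and configure the newer crawler plumbing from it; how that mapping should look is a design decision for the actual fix.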