[ 
https://issues.apache.org/jira/browse/OODT-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147207#comment-14147207
 ] 

Lewis John McGibbney edited comment on OODT-754 at 9/25/14 1:11 AM:
--------------------------------------------------------------------

[~rickdn] this is an excellent idea. [~skhudiky] and I were discussing 
this today, and it is certainly a shortcoming of other extractor 
implementations that they do not account for the following case.
Say you have a file named AAAA-BB-CCCCCC-DD.png which you wish to 
consider as a product:
 * AAAA represents the instrument/device which produced the picture
 * BB is an identifier for the project the picture was produced for
 * CCCCCC is the date, e.g. YYMMDD
 * DD is the number of products produced on that date for that project by that 
instrument.

What happens if DD > 99?
The counter no longer fits in two characters, so the FileNameExtractor (or 
whatever it is called) policy breaks and we begin ingesting incorrect 
information.
The extractor you describe on the wiki makes cases like the above much easier 
to deal with.
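
To make the failure concrete, here is a rough sketch in Python (not OODT code). 
Everything in it is an assumption made for illustration: the function names, 
the metadata keys, the sample filename and the regular expression are all 
hypothetical, and it does not reproduce the actual FileNameExtractor or 
ProdTypePatternMetExtractor configuration. It only contrasts fixed-offset 
slicing with pattern-based matching on the naming scheme above.

```python
import re


def extract_fixed_width(filename):
    """Hypothetical fixed-offset policy: every field has a constant width."""
    stem = filename.rsplit(".", 1)[0]
    return {
        "Instrument": stem[0:4],   # AAAA
        "Project": stem[5:7],      # BB
        "Date": stem[8:14],        # CCCCCC (YYMMDD)
        "Sequence": stem[15:17],   # DD, silently truncated once it exceeds 99
    }


# Hypothetical pattern: the last field is "one or more digits", not "two digits".
PATTERN = re.compile(
    r"^(?P<Instrument>[A-Za-z0-9]{4})-"
    r"(?P<Project>[A-Za-z0-9]{2})-"
    r"(?P<Date>\d{6})-"
    r"(?P<Sequence>\d+)\.png$"
)


def extract_by_pattern(filename):
    """Pattern-based extraction: field boundaries come from the delimiters."""
    match = PATTERN.match(filename)
    if match is None:
        # Refuse to guess rather than ingest incorrect metadata.
        raise ValueError("filename does not match pattern: " + filename)
    return match.groupdict()


name = "CAM1-P7-140925-100.png"  # made-up example with a three-digit counter
print(extract_fixed_width(name)["Sequence"])  # prints "10" (wrong)
print(extract_by_pattern(name)["Sequence"])   # prints "100"
```

The point of the sketch is only that a pattern keyed on the delimiters keeps 
working when a field grows, and fails loudly on names it does not recognise.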
Thanks 


> contribute ProdTypePatternMetExtractor
> --------------------------------------
>
>                 Key: OODT-754
>                 URL: https://issues.apache.org/jira/browse/OODT-754
>             Project: OODT
>          Issue Type: New Feature
>          Components: metadata container
>            Reporter: Ricky Nguyen
>            Assignee: Ricky Nguyen
>             Fix For: 0.8
>
>
> There has been renewed interest in implementing the 
> ProdTypePatternMetExtractor proposed 
> [here|https://cwiki.apache.org/confluence/display/OODT/MetExtractors+for+Crawler].
> I was going to add it to the "metadata" module under the 
> "org.apache.oodt.cas.metadata.extractors" package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
