[
https://issues.apache.org/jira/browse/FALCON-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118722#comment-14118722
]
Venkatesh Seetharam commented on FALCON-580:
--------------------------------------------
bq. What are the effects of not mandating the pattern to be present?
1. How will eviction ever work? It will be broken or, at best, will not evict
anything.
2. DistCp becomes very inefficient as time passes and data grows, since it now
needs to check every file on each side before copying.
3. Hive will also be broken. I wonder whether we will be exporting the entire
table, and the import may fail.
This is at best non-deterministic. I always advise Falcon users to use a
dated pattern.
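To illustrate the eviction point, here is a minimal sketch (not Falcon's actual eviction code) of how a retention job might recover an instance time from a concrete path. The variable names come from the feed path example in the issue; the helper names `instance_time` and `evictable`, the regex fragments, and the assumption that each variable appears at most once in the template are all hypothetical.

```python
import re
from datetime import datetime, timedelta

# Regex fragment for each path variable (an assumed subset, for illustration only).
VARIABLE_PATTERNS = {
    "YEAR": r"(?P<YEAR>\d{4})",
    "MONTH": r"(?P<MONTH>\d{2})",
    "DAY": r"(?P<DAY>\d{2})",
    "HOUR": r"(?P<HOUR>\d{2})",
    "MINUTE": r"(?P<MINUTE>\d{2})",
}


def instance_time(template, path):
    """Return the time encoded in a concrete path, or None if it cannot be recovered."""
    # re.split with one capture group alternates literal text and variable names.
    parts = re.split(r"\$\{(\w+)\}", template)
    regex = "".join(
        VARIABLE_PATTERNS[part] if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    match = re.fullmatch(regex, path)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    if "YEAR" not in fields:
        return None  # no dated pattern: the path carries no instance time
    return datetime(
        fields["YEAR"],
        fields.get("MONTH", 1),
        fields.get("DAY", 1),
        fields.get("HOUR", 0),
        fields.get("MINUTE", 0),
    )


def evictable(template, paths, now, retention):
    """Return the paths whose recovered instance time is older than the retention limit."""
    cutoff = now - retention
    result = []
    for path in paths:
        when = instance_time(template, path)
        if when is not None and when < cutoff:
            result.append(path)
    return result
```

With a template such as `/hdfsDataLocation/${YEAR}/${MONTH}/${DAY}`, old instances can be selected by date. With the undated template `/hdfsDataLocation`, `instance_time` returns `None` for every path, so `evictable` never selects anything.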
bq. I had originally considered maintaining a property at feed level to
indicate if it is time-based or not, but haven't actually felt a need for it
till now. Wanted to avoid the complexity associated with that additional
property on the feed. Depending on what problems you are facing by not having
the time part on the feed, we can discuss the way forward.
I can't imagine how this would have helped.
> Mandate date pattern for the feed path in the xsd
> -------------------------------------------------
>
> Key: FALCON-580
> URL: https://issues.apache.org/jira/browse/FALCON-580
> Project: Falcon
> Issue Type: Bug
> Reporter: Sowmya Ramesh
>
> The granularity of the date pattern in the feed path should be at least that
> of the feed's frequency. This should be mandated in the feed xsd.
> e.g.:
> {noformat}
> Valid format: <location type="data"
> path="/hdfsDataLocation/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
> Invalid format: <location type="data" path="/hdfsDataLocation"/>
> {noformat}
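The granularity rule in the description could be checked roughly as follows. This is a sketch under stated assumptions, not the validation Falcon performs: the frequency syntax (`hours(1)`, `minutes(5)`, and so on), the unit-to-variable mapping, and the function names are assumptions made for illustration.

```python
import re

# Path variables ordered from coarsest to finest.
GRANULARITY = ["YEAR", "MONTH", "DAY", "HOUR", "MINUTE"]

# Assumed mapping from a frequency unit to the finest variable the path must contain.
REQUIRED_FOR_UNIT = {
    "months": "MONTH",
    "days": "DAY",
    "hours": "HOUR",
    "minutes": "MINUTE",
}


def finest_variable(path):
    """Return the finest-grained date variable used in the path, or None if there is none."""
    found = set(re.findall(r"\$\{(\w+)\}", path))
    present = [name for name in GRANULARITY if name in found]
    return present[-1] if present else None


def path_matches_frequency(path, frequency):
    """Check that the path's date pattern is at least as fine as the feed frequency."""
    parsed = re.fullmatch(r"(\w+)\(\d+\)", frequency)
    if parsed is None or parsed.group(1) not in REQUIRED_FOR_UNIT:
        raise ValueError("unsupported frequency: " + frequency)
    finest = finest_variable(path)
    if finest is None:
        return False  # no date pattern at all, as in the invalid example
    required = REQUIRED_FOR_UNIT[parsed.group(1)]
    return GRANULARITY.index(finest) >= GRANULARITY.index(required)
```

Under these assumptions, the valid example path passes for a frequency of `minutes(5)`, while `/hdfsDataLocation` fails for any frequency, and a path ending at `${DAY}` fails for `hours(1)`.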