Hello,
Running Falcon 6.1...
My use case is to have a top-level directory where files are deposited,
such as /incoming. Files should be processed or evicted based on their HDFS
timestamps. Falcon appears to be forcing me into an ingestion pattern of
writing files to a path of /incoming/[${YEAR}, ${MONTH}, ${DAY}, ${HOUR},
or ${MINUTE}]. Is this absolutely the case, without exception or option?
I am learning Apache Falcon, and I am confused by how the <location
type="data"/> path of a Feed Entity is coupled to the
FALCON_FEED_RETENTION_... Oozie workflow that is created and scheduled.
From the documentation and hands-on testing, it appears that I cannot create a
location data path without applying a ${YEAR}, ${MONTH}, ${DAY}, ${HOUR},
or ${MINUTE} template to the physical path.
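That is, a location along these lines seems to be what is expected (the
date components after folder1 are only illustrative, not something I have
deployed):
<location type='data' path='/user/greenema/folder1/${YEAR}/${MONTH}/${DAY}/${HOUR}'/>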
My preference is to create a feed that looks like this:
<feed xmlns='uri:falcon:feed:0.1' name='greenema-folder1'>
<frequency>minutes(5)</frequency>
<timezone>GMT-06:00</timezone>
<clusters>
<cluster name='MyCluster' type='source'>
<validity start='2015-01-16T19:54Z' end='2030-01-17T19:54Z'/>
<retention limit='hours(5)' action='delete'/>
<locations>
<location type='data'>
</location>
......
</locations>
</cluster>
</clusters>
<locations>
.....
<location type='data' path='/user/greenema/folder1'>
</location>
......
</locations>
</feed>
With the current definition I get the following error from
org.apache.falcon.entity.FileSystemStorage.fileSystemEvictor:
Launcher exception: org.apache.falcon.FalconException: Couldn't evict feed from fileSystem
org.apache.oozie.action.hadoop.JavaMainException: org.apache.falcon.FalconException: Couldn't evict feed from fileSystem
    at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:59)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
    at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:35)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.falcon.FalconException: Couldn't evict feed from fileSystem
    at org.apache.falcon.entity.FileSystemStorage.evict(FileSystemStorage.java:306)
    at org.apache.falcon.retention.FeedEvictor.run(FeedEvictor.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.falcon.retention.FeedEvictor.main(FeedEvictor.java:52)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:56)
    ... 15 more
Caused by: java.io.IOException: Unable to resolve pattern for feedPath: /user/greenema/folder1
    at org.apache.falcon.entity.FeedHelper.getFeedBasePath(FeedHelper.java:435)
    at org.apache.falcon.entity.FileSystemStorage.fileSystemEvictor(FileSystemStorage.java:331)
    at org.apache.falcon.entity.FileSystemStorage.evict(FileSystemStorage.java:300)
    ... 23 more
--
Any help or clarification to the Entity Definition documentation on what is
*required* of a path definition would be very helpful. Thanks!
Mark Greene