-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18626/
-----------------------------------------------------------

(Updated March 10, 2014, 5:48 p.m.)

Review request for Falcon and Srikanth Sundarrajan.

Repository: falcon-git

Description
-------

When an HCatalog-based feed is scheduled in Falcon, retention looks only at the first partition key that matches one of the date patterns yyyy | MM | dd | HH | mm. As a result, the partition filter it computes contains only one of these patterns. However, if the HCatalog table is defined so that the date spans multiple partition keys (year/month/day/hour/minute), feed retention doesn't delete any partitions that are more granular than the first level (year).

Diffs (updated)
-----

  common/src/main/java/org/apache/falcon/catalog/AbstractCatalogService.java fc9c3b1
  common/src/main/java/org/apache/falcon/catalog/HiveCatalogService.java 3c3660e
  common/src/main/java/org/apache/falcon/entity/common/FeedDataPath.java 4031e14
  retention/src/main/java/org/apache/falcon/retention/FeedEvictor.java 13c447c
  webapp/src/test/java/org/apache/falcon/catalog/HiveCatalogServiceIT.java fd004a1
  webapp/src/test/java/org/apache/falcon/lifecycle/TableStorageFeedEvictorIT.java 770780e

Diff: https://reviews.apache.org/r/18626/diff/

Testing
-------

- Added new integration tests in TableStorageFeedEvictorIT.java to test retention for an HCatalog feed whose date spans multiple partition columns (year/month/day).
- Verified the retention behavior on a test cluster with an HCatalog-based feed partitioned by year/month/day/hour/minute/country.

Thanks,

Satish Mittal
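
P.S. For reviewers who want to see the multi-key case from the description in concrete terms: when the date is spread over several partition keys, a filter on the first key alone cannot select "everything older than the cutoff"; the keys have to be compared in order, most significant first. The following is only a minimal sketch of that idea, not the code in the diff. The class and method names are hypothetical, and it assumes the partition values are zero-padded strings so that string comparison matches date order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names are not taken from the Falcon patch.
class PartitionFilterSketch {

    // Builds a filter that matches partitions strictly older than the cutoff.
    // The map must list the dated partition keys from most to least
    // significant (for example year, month, day) with their cutoff values.
    public static String buildFilter(Map<String, String> cutoff) {
        List<String> clauses = new ArrayList<>();
        List<String> equalities = new ArrayList<>();
        for (Map.Entry<String, String> e : cutoff.entrySet()) {
            // All more significant keys equal the cutoff, this key is smaller.
            List<String> terms = new ArrayList<>(equalities);
            terms.add(e.getKey() + " < '" + e.getValue() + "'");
            clauses.add("(" + String.join(" and ", terms) + ")");
            equalities.add(e.getKey() + " = '" + e.getValue() + "'");
        }
        return String.join(" or ", clauses);
    }

    public static void main(String[] args) {
        Map<String, String> cutoff = new LinkedHashMap<>();
        cutoff.put("year", "2014");
        cutoff.put("month", "03");
        cutoff.put("day", "10");
        // Prints: (year < '2014') or (year = '2014' and month < '03')
        //         or (year = '2014' and month = '03' and day < '10')
        // (on a single line)
        System.out.println(buildFilter(cutoff));
    }
}
```

With only the first key, the same cutoff would reduce to (year < '2014'), which leaves every older partition inside 2014 untouched; that is the behavior the description reports.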
