[
https://issues.apache.org/jira/browse/FALCON-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791939#comment-13791939
]
Venkatesh Seetharam commented on FALCON-94:
-------------------------------------------
Thanks [~sriksun] for taking time to review. My comments are below.
bq. Generic CatalogService (AbstractCatalogService) seems to return
HCatPartition, which would mean that the CatalogService is tied to HCatalog.
You are correct; I was too lazy to roll my own. Done by introducing a
CatalogPartition object.
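For illustration, a minimal sketch of what an engine-neutral partition holder
could look like. The fields and accessors here (databaseName, tableName,
values, location) are assumptions made for the example and are not taken from
the attached patch; the point is only that the generic CatalogService can
return this type instead of HCatPartition.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical engine-neutral partition holder; all field names are illustrative.
final class CatalogPartition {
    private final String databaseName;
    private final String tableName;
    private final List<String> values;   // partition column values, in key order
    private final String location;       // where the partition data lives

    CatalogPartition(String databaseName, String tableName,
                     List<String> values, String location) {
        this.databaseName = databaseName;
        this.tableName = tableName;
        // Defensive copy so callers cannot mutate the partition afterwards.
        this.values = Collections.unmodifiableList(new ArrayList<String>(values));
        this.location = location;
    }

    String getDatabaseName() { return databaseName; }
    String getTableName() { return tableName; }
    List<String> getValues() { return values; }
    String getLocation() { return location; }
}
```

An HCatalog-backed service would copy the relevant fields out of each
HCatPartition into this object, so callers never see the HCatalog type.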
bq. We are requiring some hcatalog related jars to be copied to oozie shared
lib directory.
Good questions. Yes. HCatalog and Pig-hcat adaptors need to be made available.
bq. I am assuming this will be covered in documentation.
Yes, it's mentioned in the docs, but covered in detail in the Oozie documentation. :-)
bq. Also this would mean that Off the shelf Oozie patched with falcon config
wont work any more.
The sharelib tar file is created as part of the Oozie bundle and is available
for users to upload to HDFS.
bq. We further require the shared lib dirs to be setup and the contents copied.
Oozie needs to be set up in any case, both for the DB and with the Hadoop and
HCatalog jars in libext. Setting up the sharelibs will be one additional
step.
bq. Are there any challenges in making these jars available in the retention
lib path by default and not requiring shared libs ?
The only challenge I faced was the tens of jar files needed for HCatalog and
its dependencies; the same goes for Pig and Hive:
HCatalog needs 47 jar files
Pig needs 24 jar files
Hive needs 57 jar files
I was not sure whether I should create an uber jar and distribute it with
Falcon, so I decided to skip that and use the Oozie sharelib.
Does that make sense?
bq. Is "feedStorageType" a new property added to oozie coordinator for
retention. If so, can this be prefixed with "falcon.".
Yes. I'll address this for all added properties in a separate jira: FALCON-144
bq. Can ${ & ?{ be defined as constants?
Done.
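As a rough sketch of the idea only: the constant names and the helper below
are hypothetical, and the assumption that one marker is rewritten into the
other is mine, not something stated in this comment or taken from the patch.

```java
// Hypothetical constants for the two expression markers; names are illustrative.
final class ExprMarkers {
    static final String DOLLAR_EXPR_START = "${";
    static final String QUESTION_EXPR_START = "?{";

    private ExprMarkers() { }

    // Assumed usage: rewrite every "${" in a template to "?{" (literal replace, no regex).
    static String toQuestionForm(String template) {
        return template.replace(DOLLAR_EXPR_START, QUESTION_EXPR_START);
    }
}
```

With the literals named once, call sites can refer to the constants instead of
repeating the raw strings.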
bq. behaviors for listing partitions and drop partitions are listed there, why
do we need to hardcode the eviction behavior and instance deletion discovery
for filesystem need to happen in FeedEvictor ?
Table eviction is NOT implemented in CatalogStorage but in FeedEvictor.
bq. Why can't eviction be implemented in appropriate Storage implementation.
That way FeedEvictor would simpler and lot cleaner. Thoughts ?
Very good thought. Putting the behavior on the storage makes sense. The same
applies to replication, where import and export can be behaviors on the
storage, which also makes them portable across workflow engine implementations.
Opened FALCON-145 to track this.
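A minimal sketch of the direction being proposed, under stated assumptions:
the interface name, the evict signature and the in-memory implementation are
all hypothetical and only illustrate eviction living on the storage, with the
evictor reduced to a thin delegate. It is not the design tracked in FALCON-145.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: each storage type owns its own eviction logic.
interface EvictableStorage {
    // Removes instances created before cutoffMillis and returns their ids.
    List<String> evict(long cutoffMillis);
}

// Stand-in for a filesystem- or catalog-backed storage; keeps instances in memory.
final class InMemoryStorage implements EvictableStorage {
    private final Map<String, Long> instances = new LinkedHashMap<String, Long>();

    void addInstance(String id, long createdMillis) {
        instances.put(id, createdMillis);
    }

    int size() {
        return instances.size();
    }

    @Override
    public List<String> evict(long cutoffMillis) {
        List<String> evicted = new ArrayList<String>();
        Iterator<Map.Entry<String, Long>> it = instances.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() < cutoffMillis) {
                evicted.add(entry.getKey());
                it.remove();   // safe removal while iterating
            }
        }
        return evicted;
    }
}

// The evictor no longer branches on storage type; it only delegates.
final class EvictorSketch {
    static List<String> run(EvictableStorage storage, long cutoffMillis) {
        return storage.evict(cutoffMillis);
    }
}
```

A table-backed implementation would drop partitions where this one removes map
entries, without the evictor changing.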
> Retention to handle hive table eviction
> ---------------------------------------
>
> Key: FALCON-94
> URL: https://issues.apache.org/jira/browse/FALCON-94
> Project: Falcon
> Issue Type: Sub-task
> Affects Versions: 0.3
> Reporter: Venkatesh Seetharam
> Assignee: Venkatesh Seetharam
> Attachments: FALCON-94.patch, FALCON-94-r1.patch, FALCON-94-r2.patch
>
>
> Must handle both hive managed and external tables.
--
This message was sent by Atlassian JIRA
(v6.1#6144)