[
https://issues.apache.org/jira/browse/FALCON-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279720#comment-14279720
]
Sowmya Ramesh commented on FALCON-634:
--------------------------------------
[~sriksun]
{quote}
CLI can query the server for recipe artifacts, use them in the client process
to build the recipe
{quote}
I am not convinced that recipe artifacts should be packaged with the server and
deployed either locally on the server or on HDFS.
A recipe is a client-side concept. If we deploy the artifacts on HDFS or on the
server's local FS, every recipe submission will require all the recipe artifacts
to be copied from the remote m/c [HDFS, or the server if it is running on a
different m/c than the client] to the client m/c to build the recipe.
If recipe artifacts are packaged with the client, the list and describe recipe
functionality can still be implemented. client.properties can be updated with
the path where the artifacts are installed, and a one-time deployment can be
done as part of client installation.
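As a sketch, such a client.properties entry might look like the following (the property name here is hypothetical, only to illustrate the idea; the actual key would be decided as part of the implementation):
{code}
# Hypothetical key pointing at the locally installed recipe artifacts
falcon.recipe.path=/usr/local/falcon/client/recipes
{code}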
Below are the pros and cons of packaging artifacts with the server versus the client.
*Packaging artifacts with Server:*
h4. Pros
1. On addition of new recipe support, the user has to upgrade only the server to
the latest version
h4. Cons
1. Every time a recipe is submitted, all the template files have to be copied
from the remote m/c to the client
2. The user has to copy the property template file from the remote m/c [HDFS, or
local FS on the server if server and client are running on different m/c's] to
update it with the required values
*Packaging artifacts with Client:*
h4. Pros
1. The user has to copy the property template file only from the local FS to
update it with the required values. Better usability, as there is no need to SCP
or copy from HDFS
2. Downloading template files from a remote m/c on every recipe submission is
not required
h4. Cons
1. On addition of new recipe artifacts, all clients have to be upgraded to use
this functionality
2. The size of the client jar will increase, but the increase will be minimal
Permissions for these artifacts should be read-only, as these are just
templates. The user is expected to make a copy of the .properties template and
edit it accordingly before recipe submission.
Every recipe instance will have its own set of properties. The user is expected
to copy the .properties template, update it with the required values, and pass
the path in the CLI
{code}
e.g. falcon recipe -name hdfs-replication -propertyFilePath <path>
{code}
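For illustration, an edited copy of such a .properties file might look like the following (the property names below are made up for this sketch and are not the actual hdfs-replication template keys):
{code}
# Illustrative properties for one recipe instance;
# the actual keys depend on the recipe's property template
sourceCluster=primaryCluster
targetCluster=backupCluster
sourceDir=/apps/falcon/demo/input
targetDir=/apps/falcon/demo/input
{code}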
Also, users may have a use case requiring edits to the process template or WF
template, to add additional elements for a given recipe instance. If the recipe
tool uses templates from the shared location and we allow editing those
templates, then all clients/recipe instances are forced to use the edited
template, which may not be intended. An option should be provided to override
the recipe artifact location: the user can make a copy of the templates, edit
them, and pass the path in the CLI.
{code}
e.g. falcon recipe -name hdfs-replication -propertyFilePath <path> -location
<pathToTemplates> [-location is optional and should be used only for overriding
the artifact path]
{code}
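Since cooking a recipe is essentially substituting name-value properties into the templates, the idea can be sketched as follows (the ##key## placeholder convention and the helper names are assumptions for illustration, not Falcon's actual implementation):
{code}
import re

def load_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def cook(template, props):
    """Replace every ##key## token with its value from the properties."""
    return re.sub(r"##(.+?)##", lambda m: props[m.group(1)], template)

properties = load_properties("sourceDir=/data/in\ntargetDir=/data/out\n")
template = "<process><arg>##sourceDir##</arg><arg>##targetDir##</arg></process>"
# prints <process><arg>/data/in</arg><arg>/data/out</arg></process>
print(cook(template, properties))
{code}
The cooked output is an ordinary process entity, which is all the server ever sees.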
> Add recipes in Falcon
> ---------------------
>
> Key: FALCON-634
> URL: https://issues.apache.org/jira/browse/FALCON-634
> Project: Falcon
> Issue Type: Improvement
> Affects Versions: 0.6
> Reporter: Venkatesh Seetharam
> Labels: recipes
>
> Falcon offers many services OOTB and caters to a wide array of use cases.
> However, there have been many asks that do not fit the functionality offered
> by Falcon. I'm proposing that we add recipes to Falcon, similar to
> recipes in Whirr and other management solutions such as Puppet and Chef.
> Overview:
> A recipe essentially is a static process template with parameterized workflow
> to realize a specific use case. For example:
> * replicating directories from one HDFS cluster to another (not timed
> partitions)
> * replicating hive metadata (database, table, views, etc.)
> * replicating between HDFS and Hive - either way
> * anonymization of data based on schema
> * data masking
> * etc.
> Proposal:
> Falcon provides a Process abstraction that encapsulates the configuration
> for a user workflow with scheduling controls. All recipes can be modeled
> as a Process within Falcon which executes the user workflow
> periodically. The process and its associated workflow are parameterized. The
> user will provide a properties file with name-value pairs that are
> substituted by Falcon before scheduling it.
> This is a client-side concept. The server does not know about a recipe but
> only accepts the cooked recipe as a process entity.
> The CLI would look something like this:
> falcon -recipe $recipe_name -properties $properties_file
> Recipes will reside inside addons (contrib) with source code and will have an
> option to package them.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)