[ https://issues.apache.org/jira/browse/FALCON-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279720#comment-14279720 ]

Sowmya Ramesh commented on FALCON-634:
--------------------------------------

[~sriksun]
{quote}
CLI can query the server for recipe artifacts, use them in the client process 
to build the recipe 
{quote}

I am not convinced that recipe artifacts should be packaged with the server and 
deployed either locally on the server or on HDFS.

A recipe is a client-side concept. If we deploy the artifacts on HDFS or on the 
server's local FS, every recipe submission will require all the recipe artifacts 
to be copied from the remote machine [HDFS, or the server if it is running on a 
different machine than the client] to the client machine to build the recipe.

If recipe artifacts are packaged with the client, the list and describe recipe 
functionality can still be implemented. client.properties can be updated with 
the path where the artifacts will be installed (see the sketch below), and a 
one-time deployment can be done as part of client installation.
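As a rough illustration, the entry in client.properties could look like the 
following; the property name and install path here are hypothetical, not 
existing Falcon configuration:

{code}
# client.properties -- hypothetical entry pointing at locally installed recipe artifacts
falcon.recipe.path=/usr/lib/falcon/recipes
{code}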

Below, I am listing the pros and cons of packaging artifacts with the client or 
the server.

*Packaging artifacts with Server:*

h4. Pros
1. When support for a new recipe is added, the user has to upgrade only the 
server to the latest version

h4. Cons
1. Every time a recipe is submitted, all the template files have to be copied 
from the remote machine to the client
2. The user has to copy the property template file from the remote machine 
[HDFS, or the server's local FS if the server and client are running on 
different machines] in order to update it with the required values

*Packaging artifacts with Client:*

h4. Pros
1. The user copies the property template file from the local FS in order to 
update it with the required values. Better usability, as there is no need to 
SCP or copy from HDFS
2. Downloading template files from a remote machine on every recipe submission 
is not required

h4. Cons
1. When new recipe artifacts are added, all the clients have to be upgraded to 
use the new functionality
2. The size of the client jar will increase, but the increase will be minimal

Permissions for these artifacts should be read-only, as these are just the 
templates. Every recipe instance will have its own set of properties; the user 
is expected to make a copy of the .properties template, update it with the 
required values, and pass the path on the CLI
{code}
e.g. falcon recipe -name hdfs-replication -propertyFilePath <path>
{code}
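Put together, the end-to-end flow for one recipe instance could look like this 
(the install path and file names are illustrative, not fixed by Falcon):

{code}
# copy the read-only template, fill in the required values, then submit
cp /usr/lib/falcon/recipes/hdfs-replication/hdfs-replication.properties /tmp/my-replication.properties
vi /tmp/my-replication.properties
falcon recipe -name hdfs-replication -propertyFilePath /tmp/my-replication.properties
{code}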

Also, users may have a use case that requires editing the process template or 
WF template to add additional elements for a given recipe instance. If the 
recipe tool uses templates from the shared location and we allow editing those 
templates, then all clients/recipe instances are forced to use the edited 
template, which may not be intended. An option should be provided to override 
the recipe artifact location: the user can make a copy of the templates, edit 
them, and pass the path in the CLI.

{code}
e.g. falcon recipe -name hdfs-replication -propertyFilePath <path> -location <pathToTemplates>
{code}
-location is optional and should be used only for overriding the artifact path.
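A sketch of that override flow, reusing the hypothetical install path and 
template file names from above:

{code}
# copy the shared templates, edit the copy, and point the CLI at it
cp -r /usr/lib/falcon/recipes/hdfs-replication /tmp/my-hdfs-replication
vi /tmp/my-hdfs-replication/hdfs-replication-template.xml
falcon recipe -name hdfs-replication -propertyFilePath /tmp/my-replication.properties -location /tmp/my-hdfs-replication
{code}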

> Add recipes in Falcon
> ---------------------
>
>                 Key: FALCON-634
>                 URL: https://issues.apache.org/jira/browse/FALCON-634
>             Project: Falcon
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Venkatesh Seetharam
>              Labels: recipes
>
> Falcon offers many services OOTB and caters to a wide array of use cases. 
> However, there have been many asks that do not fit the functionality offered 
> by Falcon. I'm proposing that we add recipes to Falcon, similar to 
> recipes in Whirr and other management solutions such as Puppet and Chef.
> Overview:
> A recipe essentially is a static process template with parameterized workflow 
> to realize a specific use case. For example:
> * replicating directories from one HDFS cluster to another (not timed 
> partitions)
> * replicating hive metadata (database, table, views, etc.)
> * replicating between HDFS and Hive - either way
> * anonymization of data based on schema
> * data masking
> * etc.
> Proposal:
> Falcon provides a Process abstraction that encapsulates the configuration 
> for a user workflow with scheduling controls. All recipes can be modeled 
> as a Process within Falcon which executes the user workflow 
> periodically. The process and its associated workflow are parameterized. The 
> user will provide a properties file with name-value pairs that are 
> substituted by Falcon before scheduling it.
> This is a client-side concept. The server does not know about a recipe but 
> only accepts the cooked recipe as a process entity. 
> The CLI would look something like this:
> falcon -recipe $recipe_name -properties $properties_file
> Recipes will reside inside addons (contrib) with source code and will have an 
> option to package them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
