[
https://issues.apache.org/jira/browse/AMBARI-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Lysnichenko updated AMBARI-4481:
---------------------------------------
Description:
h1. Proposal:
h2. General concept
The Ambari server shares some files under /var/lib/ambari-server/resources/ via
HTTP. These files are accessible via URLs like
http://hostname:8080/resources/jdk-6u31-linux-x64.bin . Among these files are
service scripts, templates and hooks. The agent keeps a cache of these files.
The cache directory structure mirrors the contents of the stacks folder on the
server. For example:
$ ls /var/lib/ambari-agent/cache
{code}
└── stacks
    └── HDP
        ├── 2.0.7
        │   ├── Accumulo
        │   └── Flume
        └── 2.0.8
            ├── Accumulo
            ├── Flume
            └── YetAnotherService
{code}
If the files for some service, component and stack version are not available in
the cache, the agent downloads the appropriate files on first use.
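For illustration, here is a minimal sketch of the on-first-use download described above (the function name, cache root constant and URL layout are assumptions for the example, not the actual agent code):
{code}
import os
import urllib2  # the agent scripts run on Python 2

CACHE_ROOT = "/var/lib/ambari-agent/cache"
RESOURCES_URL = "http://{0}:8080/resources"  # server hostname filled in at runtime

def ensure_cached(server_host, stack, version, service, file_name):
    """Return the local path of a stack file, downloading it on first use."""
    rel_path = os.path.join("stacks", stack, version, service, file_name)
    local_path = os.path.join(CACHE_ROOT, rel_path)
    if os.path.exists(local_path):
        return local_path  # cache hit, nothing to download
    # Cache miss: fetch the file from the server's /resources/ location
    url = RESOURCES_URL.format(server_host) + "/" + rel_path.replace(os.sep, "/")
    target_dir = os.path.dirname(local_path)
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    response = urllib2.urlopen(url)
    with open(local_path, "wb") as out:
        out.write(response.read())
    return local_path
{code}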
h2. Packaging files into archives:
The trouble is that with the current Jetty configuration, ambari-server does not
allow directory listing. We have two options:
- To speed up the download and avoid the need to list script files explicitly,
the proposal is to pack the "hooks" and "packages" directories into gz archives.
- We may set the "dirAllowed" servlet option for /resources/*; in this case the
agent will download all files one by one. The user will not have to run
additional commands to have the stack files updated (improved usability), but a
separate request will be sent for every file being downloaded. This way of
fetching files seems too slow, especially on big clusters.
As the second option is not really applicable, I'm going to implement the first
one.
Implementation steps:
- on server startup, a python script iterates over the "hooks"/"packages"
directories and computes directory md5 hashes. Files and directories are
processed in alphabetical order.
- if the directory archive does not exist or the md5 hash differs from the
previously computed one, the archive is regenerated
- the md5 hash of the directory is saved to
Archives are created/updated by the command "ambari-server
create-stack-archieves". During ambari-server setup, this command is executed
automatically. The original (uncompressed) folders are not deleted, so the user
may change stack files and rerun the command to update the archives.
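Below is a minimal sketch of the archive generation logic described above; the helper names and the location where the hash is stored are assumptions for the example, not the final implementation:
{code}
import hashlib
import os
import tarfile

def directory_md5(dir_path):
    """Compute an md5 hash of a directory tree, visiting entries in alphabetical order."""
    digest = hashlib.md5()
    for root, dirs, files in os.walk(dir_path):
        dirs.sort()  # deterministic traversal order
        for name in sorted(files):
            file_path = os.path.join(root, name)
            # hash both the relative path and the file contents
            digest.update(os.path.relpath(file_path, dir_path).encode("utf-8"))
            with open(file_path, "rb") as f:
                digest.update(f.read())
    return digest.hexdigest()

def update_archive(dir_path):
    """Regenerate <dir>.tar.gz if the directory changed since the previous run."""
    archive_path = dir_path + ".tar.gz"
    hash_path = dir_path + ".hash"  # assumed location for the saved md5 hash
    current_hash = directory_md5(dir_path)
    previous_hash = None
    if os.path.isfile(hash_path):
        with open(hash_path) as f:
            previous_hash = f.read().strip()
    if previous_hash != current_hash or not os.path.isfile(archive_path):
        with tarfile.open(archive_path, "w:gz") as tar:
            tar.add(dir_path, arcname=os.path.basename(dir_path))
        with open(hash_path, "w") as f:
            f.write(current_hash)
{code}
The "ambari-server create-stack-archieves" command would then simply call update_archive() for every "hooks" and "packages" directory found under the stacks tree.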
was:
h1. Proposal:
h2. General concept
The Ambari server shares some files under /var/lib/ambari-server/resources/ via
HTTP. These files are accessible via URLs like
http://hostname:8080/resources/jdk-6u31-linux-x64.bin . Among these files are
service scripts, templates and hooks. The agent keeps a cache of these files.
The cache directory structure mirrors the contents of the stacks folder on the
server. For example:
$ ls /var/lib/ambari-agent/cache
{code}
└── stacks
    └── HDP
        ├── 2.0.7
        │   ├── Accumulo
        │   └── Flume
        └── 2.0.8
            ├── Accumulo
            ├── Flume
            └── YetAnotherService
{code}
If the files for some service, component and stack version are not available in
the cache, the agent downloads the appropriate files on first use.
h2. Implementation details:
1. The trouble is that with the current Jetty configuration, ambari-server does
not allow directory listing. We have two options:
- To speed up the download and avoid the need to list script files explicitly,
the proposal is to pack the "hooks" and "packages" directories into gz archives.
Archives are created/updated by the command "ambari-server
create-stack-archieves". During ambari-server setup, this command is executed
automatically. The original (uncompressed) folders are not deleted, so the user
may change stack files and rerun the command to update the archives.
- We may set the "dirAllowed" servlet option for /resources/*; in this case the
agent will download all files one by one. The user will not have to run
additional commands to have the stack files updated (improved usability), but a
separate request will be sent for every file being downloaded. This way of
fetching files seems too slow, especially on big clusters.
As the second option is not really applicable, I'm going to implement the first
one.
> Add to the agent ability to download service scripts and hooks
> --------------------------------------------------------------
>
> Key: AMBARI-4481
> URL: https://issues.apache.org/jira/browse/AMBARI-4481
> Project: Ambari
> Issue Type: Task
> Components: agent, controller
> Affects Versions: 1.5.0
> Reporter: Dmitry Lysnichenko
> Assignee: Dmitry Lysnichenko
> Fix For: 1.5.0
>
>
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)