GitHub user hys9958 opened a pull request:

    https://github.com/apache/tajo/pull/275

    TAJO-1199: EMR bootstrap script for Tajo

    Bootstrap Action Arguments:
    ==========================
    
    Usage: install-tajo.sh [OPTIONS]
    
        -t [S3_PATH_TO_TAJO_BIN_TARBALL]
           Ex: s3://[your_bucket]/[your_path]/tajo-{version}.tar.gz
           Default: http://d3kp3z3ppbkcio.cloudfront.net/tajo-0.9.0/tajo-0.9.0.tar.gz
        -c [S3_PATH_TO_TAJO_CONF_DIR] 
           Ex: s3://[your_bucket]/[your_path]/conf
        -l [S3_PATH_TO_THIRD_PARTY_JARS_DIR]
           Ex: s3://[your_bucket]/[your_path]/lib
        -h
           Display help message
        -T [LOCAL_PATH_TO_TEST_ROOT] (only used for local test)
           Ex: /[LOCAL_PATH_TO_TEST_ROOT]
        -H [LOCAL_PATH_TO_HADOOP_HOME_FOR_TEST] (only used for local test)
           Ex: /[LOCAL_PATH_TO_HADOOP_HOME_FOR_TEST]
    
    Note that all arguments are optional. ``-T`` and ``-H`` are only for local testing.
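
As an illustration of the option list above, here is a minimal sketch of how such a script might parse its arguments with `getopts`. This is not the code from the pull request: the function names (`parse_args`, `is_local_test`) and variable names are hypothetical, and the rule that local-test mode needs both `-T` and `-H` is an assumption based on the description of those options.

```shell
# Hypothetical sketch: parse the documented options with getopts.
# Variable and function names are illustrative, not taken from install-tajo.sh.

DEFAULT_TARBALL="http://d3kp3z3ppbkcio.cloudfront.net/tajo-0.9.0/tajo-0.9.0.tar.gz"

parse_args() {
    # Reset state so the function can be called more than once.
    TAJO_TARBALL="$DEFAULT_TARBALL"
    CONF_DIR=""
    LIB_DIR=""
    TEST_ROOT=""
    TEST_HADOOP_HOME=""
    SHOW_HELP=0
    OPTIND=1
    # The leading ':' enables silent error handling, so the ':' and '?'
    # cases below can report problems themselves.
    while getopts ":t:c:l:hT:H:" opt; do
        case "$opt" in
            t) TAJO_TARBALL="$OPTARG" ;;
            c) CONF_DIR="$OPTARG" ;;
            l) LIB_DIR="$OPTARG" ;;
            h) SHOW_HELP=1 ;;
            T) TEST_ROOT="$OPTARG" ;;
            H) TEST_HADOOP_HOME="$OPTARG" ;;
            :) echo "Option -$OPTARG requires an argument" >&2; return 1 ;;
            \?) echo "Unknown option: -$OPTARG" >&2; return 1 ;;
        esac
    done
    return 0
}

# Assumption: local-test mode is active only when both -T and -H are given.
is_local_test() {
    [ -n "$TEST_ROOT" ] && [ -n "$TEST_HADOOP_HOME" ]
}
```

Because every option is optional, calling `parse_args` with no arguments leaves the tarball at the default URL and the other values empty.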
    
    Sample Commands:
    ================
    
    Launching a Tajo cluster with the default configuration
    -------------------------------------------------------
     * It uses EMR HDFS as ```tajo.rootdir```, which includes the warehouse directory.
     * It uses the default heap and concurrency settings.
     * It is suitable for a simple test.
     
    ```
    $ aws emr create-cluster    \
        --name="[CLUSTER_NAME]"  \
        --ami-version=3.3        \
        --ec2-attributes KeyName=[KEY_PAIR_NAME] \
        --instance-groups \
            InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge \
            InstanceGroupType=CORE,InstanceCount=1,InstanceType=c3.xlarge \
        --bootstrap-action Name="Install tajo",Path=s3://[your_bucket]/[your_path]/install-tajo.sh
    ```
    
    Launching a Tajo cluster with additional configurations
    -------------------------------------------------------
    
     * To use your own Tajo tarball, use ```-t``` to specify its S3 URL.
     * To change ```tajo.rootdir```, create your own ```tajo-site.xml``` and use the ```-c``` option to specify the S3 URL of the config directory.
       * You can find appropriate config templates in tajo-emr/template.
     * To use RDS, you need the appropriate JDBC jars, such as mysql-connector.jar. The ```-l``` option lets you specify the S3 URL of a directory containing third-party jars.
    
     
    ```
    $ aws emr create-cluster \
        --name="[CLUSTER_NAME]" \
        --ami-version=3.3 \
        --ec2-attributes KeyName=[KEY_PAIR_NAME] \
        --instance-groups \
            InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge \
            InstanceGroupType=CORE,InstanceCount=1,InstanceType=c3.xlarge \
        --bootstrap-action Name="Install tajo",Path=s3://[your_bucket]/[your_path]/install-tajo.sh,Args=["-t","s3://[your_bucket]/tajo-0.9.0.tar.gz","-c","s3://[your_bucket]/conf","-l","s3://[your_bucket]/lib"]
    ```
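
The `Args=[...]` list in the bootstrap action is easy to mistype by hand. The helper below is a hypothetical sketch (the name `build_bootstrap_action` is not part of the pull request or of the AWS CLI) that assembles the value from a script path and a list of options. It emits the unquoted form that the AWS CLI receives once the shell has removed the double quotes used in the command above, so it assumes that no argument contains a comma.

```shell
# build_bootstrap_action SCRIPT_S3_PATH [ARG...]
# Hypothetical helper: prints a value suitable for --bootstrap-action.
# Assumes that none of the arguments contain commas.
build_bootstrap_action() {
    script_path="$1"
    shift
    action="Name=Install tajo,Path=${script_path}"
    args=""
    for a in "$@"; do
        # Join arguments with commas, without a leading comma.
        args="${args:+${args},}${a}"
    done
    if [ -n "$args" ]; then
        action="${action},Args=[${args}]"
    fi
    printf '%s\n' "$action"
}
```

The result would be passed as a single quoted word, for example `--bootstrap-action "$(build_bootstrap_action s3://[your_bucket]/[your_path]/install-tajo.sh -t s3://[your_bucket]/tajo-0.9.0.tar.gz)"`.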
    
    How to test the bootstrap on a local machine
    ============================================
    ```install-tajo.sh``` allows users to test the bootstrap on a local machine without EMR instances. To do so, use the ```-T``` and ```-H``` options.
     * ```-T``` - Test root directory, used temporarily during the test.
     * ```-H``` - Hadoop directory that stands in for the EMR Hadoop home.
    
    ```   
    $ ./install-tajo.sh \
        -t /[your_local_binary_path]/tajo-0.9.0.tar.gz \
        -c /[your_test_conf_dir]/conf \
        -l /[your_test_lib_dir]/lib \
        -T /[LOCAL_PATH_TO_TEST_ROOT] \
        -H /[LOCAL_PATH_TO_HADOOP_HOME_FOR_TEST]
    ```
    
    Running with AWS RDS
    ====================
    Tajo can use RDS as its catalog store. First, make sure you have a running RDS instance. Then create your own ```catalog-site.xml```; see the [Catalog configuration documentation](http://tajo.apache.org/docs/current/configuration/catalog_configuration.html) in the Tajo docs.
    
    Then use the ```-c``` option to supply your custom ```catalog-site.xml``` file.
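
For illustration, a ```catalog-site.xml``` for a MySQL-backed RDS instance could be generated with a small shell function like the one below. This is a sketch under assumptions: the function names are hypothetical, and the property names and the ```MySQLStore``` class are given as recalled from the Tajo catalog configuration documentation linked above, so please check them against that page before use.

```shell
# emit_property NAME VALUE
# Prints one Hadoop-style <property> element.
emit_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# write_catalog_site HOST PORT DATABASE USER PASSWORD
# Hypothetical helper: prints a catalog-site.xml for a MySQL catalog store.
# Property names are assumptions to verify against the Tajo documentation.
# Values are not XML-escaped, so avoid '&' and '<' in the arguments.
write_catalog_site() {
    host="$1"; port="$2"; db="$3"; user="$4"; pass="$5"
    printf '%s\n' '<?xml version="1.0"?>' '<configuration>'
    emit_property tajo.catalog.store.class org.apache.tajo.catalog.store.MySQLStore
    emit_property tajo.catalog.jdbc.connection.id "$user"
    emit_property tajo.catalog.jdbc.connection.password "$pass"
    emit_property tajo.catalog.jdbc.uri \
        "jdbc:mysql://${host}:${port}/${db}?createDatabaseIfNotExist=true"
    printf '%s\n' '</configuration>'
}
```

The output can be redirected into a ```conf``` directory, which is then uploaded to S3 and passed with ```-c```; the matching JDBC driver jar goes in the directory passed with ```-l```.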

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hys9958/tajo tajo-1199

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/275.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #275
    
----
commit 0b4b135c81ca3548e78d622c26027808883b9c9f
Author: hys9958 <[email protected]>
Date:   2014-12-01T07:06:43Z

    TAJO-1199: EMR bootstrap script for Tajo

----

