Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tajo Wiki" for change notification.
The "Configuration" page has been changed by HyunsikChoi:
https://wiki.apache.org/tajo/Configuration?action=diff&rev1=12&rev2=13

- = tajo-env.sh =
- == Heap Memory Setting ==
- The following configuration settings in '''''conf/tajo-env.sh''''' allow Tajo Master and each worker to use specified heap memory respectively.
-  * TAJO_MASTER_HEAPSIZE
-  * TAJO_WORKER_HEAPSIZE
-
- If you want to adjust heap memory size for either, set TAJO_MASTER_HEAPSIZE or TAJO_WORKER_HEAPSIZE variable in '''''conf/tajo-env.sh''''' with a proper size as follows:
- {{{
- TAJO_MASTER_HEAPSIZE=2000
- TAJO_WORKER_HEAPSIZE=8000
- }}}
-
- The default size is 1000 (1GB).
-
- = tajo-site.xml =
- == Preliminary ==
+ = Preliminary =
+ == catalog-site.xml and tajo-site.xml ==
  Tajo's configuration is based on Hadoop's configuration system. Tajo provides two config files:
   * catalog-site.xml - configuration for the catalog server.
@@ -32, +18 @@
  Tajo has a variety of internal configs. If you don't set a config explicitly, its default value is used. Tajo is designed to need only a few configs in usual cases, so you may not need to be concerned with the configuration.
- == Setting up tajo-site.xml ==
+ By default, there is no tajo-site.xml in the ${TAJO}/conf directory. If you want to set some configs, first copy $TAJO_HOME/conf/tajo-site.xml.templete to tajo-site.xml. Then, add the configs to your tajo-site.xml.
- Copy $TAJO_HOME/conf/tajo-site.xml.templete to tajo-site.xml. Then, add the following configs to your tajo-site.xml.
+ == tajo-env.sh ==
+ '''''tajo-env.sh''''' is a shell script file. The main purpose of this file is to set shell environment variables for the TajoMaster and TajoWorker Java programs. You can set a variable as follows:
+ {{{
+ VARIABLE=value
+ }}}
+ If a value is a literal string, quote it as follows:
+ {{{
+ VARIABLE='value'
+ }}}
+
+ = TajoMaster Configuration =
+
+ == Tajo Rootdir Setting ==
+ Tajo uses HDFS as a primary storage layer. So, one Tajo cluster instance should have one tajo rootdir. A user can specify the tajo rootdir as follows:
  {{{
  <property>
  <name>tajo.rootdir</name>
- <value>hdfs://hostname:port/tajo</value>
+ <value>hdfs://namenode_hostname:port/path</value>
  </property>
  }}}
-  * tajo.rootdir - Specify the root directory of tajo. This parameter should be a url form (e.g., hdfs://namenode_hostname:port/path). The default value is file///tmp/tajo-${user.name}/.
+ Tajo rootdir must be in URL form, like scheme://hostname:port/path. The current implementation only supports the hdfs:// and file:// schemes. The default value is ''file:///tmp/tajo-${user.name}/''.
- == Setting up catalog-site.xml ==
+ == TajoMaster Heap Memory Size ==
+ The environment variable ''TAJO_MASTER_HEAPSIZE'' in '''''conf/tajo-env.sh''''' allows Tajo Master to use the specified heap memory size.
+
+ If you want to adjust the heap memory size, set the TAJO_MASTER_HEAPSIZE variable in '''''conf/tajo-env.sh''''' to a proper size as follows:
+ {{{
+ TAJO_MASTER_HEAPSIZE=2000
+ }}}
+
+ The default size is 1000 (1GB).
+
+
+ = Tajo Worker Configuration =
+
+ == Worker Heap Memory Size ==
+ The environment variable ''TAJO_WORKER_HEAPSIZE'' in '''''conf/tajo-env.sh''''' allows Tajo Worker to use the specified heap memory size.
+
+ If you want to adjust the heap memory size, set the TAJO_WORKER_HEAPSIZE variable in '''''conf/tajo-env.sh''''' to a proper size as follows:
+ {{{
+ TAJO_WORKER_HEAPSIZE=8000
+ }}}
+
+ The default size is 1000 (1GB).
+ == Temporary Data Directory ==
+ TajoWorker stores temporary data on the local file system because it uses out-of-core algorithms. It is possible to specify one or more directories where this temporary data will be stored.
+
+ '''''tajo-site.xml'''''
+ {{{
+ <property>
+ <name>tajo.worker.tmpdir.locations</name>
+ <value>/disk1/tmpdir,/disk2/tmpdir,/disk3/tmpdir</value>
+ </property>
+ }}}
+
+ = Catalog Configuration =
  If you want to customize the catalog service, copy $TAJO_HOME/conf/catalog-site.xml.templete to catalog-site.xml. Then, add the following configs to catalog-site.xml. Note that the default configs are enough to launch a Tajo cluster in most cases.
   * tajo.catalog.master.addr - If you want to launch a catalog server separately, specify this address. This config has the form hostname:port. Its default value is 0.0.0.0:9002.
@@ -56, +88 @@
   * tajo.catalog.store.MemStore - this is the in-memory storage. It is only used in unit tests to shorten their duration.
  <<Anchor(DefaultPortNumbers)>>
- == Configuration Properties for RPC/Http Services and Default Addresses ==
+ = RPC/Http Service Configuration and Default Addresses =
- === Tajo Master ===
+ == Tajo Master ==
  ||Service Name||Config Property Name||Description||default address||
  ||Tajo Master Umbilical Rpc||tajo.master.umbilical-rpc.address|| ||localhost:26001||
  ||Tajo Master Client Rpc||tajo.master.client-rpc.address|| ||localhost:26002||
  ||Tajo Master Info Http||tajo.master.info-http.address|| ||0.0.0.0:26080||
  ||Tajo Catalog Client Rpc||tajo.catalog.client-rpc.address|| ||localhost:26005||
- === Worker ===
+ == Worker ==
  ||Service Name||Config Property Name||Description||default address||
  ||Tajo Worker Peer Rpc||tajo.worker.peer-rpc.address|| ||0.0.0.0:28091||
  ||Tajo Worker Client Rpc||tajo.worker.client-rpc.address|| ||0.0.0.0:28092||
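
For readers who want to see how the settings in this revision fit together, here is a minimal shell sketch, not an official setup procedure. It writes a small tajo-site.xml into a scratch directory and sets the two heap-size variables in the style of conf/tajo-env.sh. The CONF_DIR variable and the `<configuration>` root element (the usual Hadoop-style wrapper) are assumptions that do not appear on the page. The property names and values are copied from the page, including the `namenode_hostname:port/path` placeholder, which has to be replaced with a real address before use.

```shell
#!/bin/sh
# Sketch only: CONF_DIR is a hypothetical scratch directory standing in for
# the real Tajo conf directory, so nothing in an actual install is touched.
CONF_DIR="$(mktemp -d)"

# Assumed Hadoop-style layout: <property> entries inside a <configuration> root.
cat > "$CONF_DIR/tajo-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>tajo.rootdir</name>
    <value>hdfs://namenode_hostname:port/path</value>
  </property>
  <property>
    <name>tajo.worker.tmpdir.locations</name>
    <value>/disk1/tmpdir,/disk2/tmpdir,/disk3/tmpdir</value>
  </property>
</configuration>
EOF

# Heap sizes as they would appear in conf/tajo-env.sh (values from the page;
# the page gives the default as 1000, i.e. 1GB).
TAJO_MASTER_HEAPSIZE=2000
TAJO_WORKER_HEAPSIZE=8000

echo "wrote $CONF_DIR/tajo-site.xml"
echo "master heap: $TAJO_MASTER_HEAPSIZE, worker heap: $TAJO_WORKER_HEAPSIZE"
```

In a real deployment the file would go into the Tajo conf directory instead of a temporary one, and the heap variables would be placed in tajo-env.sh, not in an ad hoc script.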
