Re: Re: about mr accelerator question.

Vladimir Ozerov Thu, 24 Mar 2016 06:00:59 -0700

Hi,

The possible speedup depends greatly on the nature of your task. Typically,
the more MR tasks you have and the more intensively you work with actual
data, the bigger the improvement. Please give more details on what kind of
jobs you run, and I will probably be able to suggest something.


One possible change you can make to your config: switch the temporary file
system paths used by your jobs to PRIMARY mode. This way all temp data will
reside only in memory and will not hit HDFS.
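
As a rough sketch, assuming your jobs write their temporary data under /tmp
(a placeholder path, substitute whatever your jobs actually use), the change
could look like the fragment below, placed inside the existing "igfs"
FileSystemConfiguration bean. The pathModes map takes file system paths as
keys and IGFS modes as values; please check the exact key format against the
documentation for your Ignite version.

```xml
<!-- Sketch only: goes inside the FileSystemConfiguration bean named "igfs". -->
<property name="pathModes">
    <map>
        <!-- "/tmp" is an assumed location of job temp data, not taken from your config. -->
        <entry key="/tmp" value="PRIMARY"/>
    </map>
</property>
```

Paths not listed in the map keep using the file system's default mode, so
your input and output data would still go through HDFS as before.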

Vladimir.

On Wed, Mar 23, 2016 at 8:48 AM, [email protected] <[email protected]>
wrote:

> I am glad to tell you the problem has been solved, thanks a lot. But the
> performance improved by only 300%. Are there any other good ideas for the
> config?
>
> Another problem is that I am not able to track jobs the way I can with the
> YARN framework, so I can't count the jobs or view the state of those that
> have finished. Do you have any suggestions?
>
> the ignite config is
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <!--
>   Licensed to the Apache Software Foundation (ASF) under one or more
>   contributor license agreements.  See the NOTICE file distributed with
>   this work for additional information regarding copyright ownership.
>   The ASF licenses this file to You under the Apache License, Version 2.0
>   (the "License"); you may not use this file except in compliance with
>   the License.  You may obtain a copy of the License at
>
>        http://www.apache.org/licenses/LICENSE-2.0
>
>   Unless required by applicable law or agreed to in writing, software
>   distributed under the License is distributed on an "AS IS" BASIS,
>   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>   See the License for the specific language governing permissions and
>   limitations under the License.
> -->
>
> <!--
>     Ignite Spring configuration file.
>
>
>     When starting a standalone Ignite node, you need to execute the following 
> command:
>     {IGNITE_HOME}/bin/ignite.{bat|sh} path-to-this-file/default-config.xml
>
>
>     When starting Ignite from Java IDE, pass path to this file into Ignition:
>     Ignition.start("path-to-this-file/default-config.xml");
> -->
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xmlns:util="http://www.springframework.org/schema/util"
>        xsi:schemaLocation="http://www.springframework.org/schema/beans
>        http://www.springframework.org/schema/beans/spring-beans.xsd
>        http://www.springframework.org/schema/util
>        http://www.springframework.org/schema/util/spring-util.xsd">
>
>     <!--
>         Optional description.
>     -->
>     <description>
>
>         Spring file for Ignite node configuration with IGFS and Apache Hadoop 
> map-reduce support enabled.
>         Ignite node will start with this configuration by default.
>     </description>
>
>     <!--
>
>         Initialize property configurer so we can reference environment 
> variables.
>     -->
>
>     <bean id="propertyConfigurer" 
> class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
>
>         <property name="systemPropertiesModeName" 
> value="SYSTEM_PROPERTIES_MODE_FALLBACK"/>
>         <property name="searchSystemEnvironment" value="true"/>
>     </bean>
>
>     <!--
>         Abstract IGFS file system configuration to be used as a template.
>     -->
>
>     <bean id="igfsCfgBase" 
> class="org.apache.ignite.configuration.FileSystemConfiguration" 
> abstract="true">
>         <!-- Must correlate with cache affinity mapper. -->
>         <property name="blockSize" value="#{128 * 1024}"/>
>         <property name="perNodeBatchSize" value="512"/>
>         <property name="perNodeParallelBatchCount" value="16"/>
>
>         <property name="prefetchBlocks" value="32"/>
>     </bean>
>
>     <bean class="org.apache.ignite.configuration.CacheConfiguration">
>   <!-- Store cache entries on-heap. -->
>   <property name="memoryMode" value="ONHEAP_TIERED"/>
>
>   <!-- Enable off-heap memory with max size of 14 gigabytes (0 for unlimited). -->
>   <property name="offHeapMaxMemory" value="#{14 * 1024L * 1024L * 1024L}"/>
>   <!-- Configure eviction policy. -->
>   <property name="evictionPolicy">
>
>     <bean 
> class="org.apache.ignite.cache.eviction.igfs.IgfsPerBlockLruEvictionPolicy">
>       <!-- Evict to off-heap after cache size reaches maxSize. -->
>       <property name="maxSize" value="800000"/>
>     </bean>
>   </property>
>   </bean>
>
>     <!--
>
>         Abstract cache configuration for IGFS file data to be used as a 
> template.
>     -->
>
>     <bean id="dataCacheCfgBase" 
> class="org.apache.ignite.configuration.CacheConfiguration" abstract="true">
>         <property name="cacheMode" value="PARTITIONED"/>
>         <property name="atomicityMode" value="TRANSACTIONAL"/>
>         <property name="writeSynchronizationMode" value="FULL_SYNC"/>
>         <property name="backups" value="0"/>
>         <property name="affinityMapper">
>
>             <bean class="org.apache.ignite.igfs.IgfsGroupDataBlocksKeyMapper">
>
>                 <!-- How many sequential blocks will be stored on the same 
> node. -->
>                 <constructor-arg value="512"/>
>             </bean>
>         </property>
>     </bean>
>
>     <!--
>
>         Abstract cache configuration for IGFS metadata to be used as a 
> template.
>     -->
>
>     <bean id="metaCacheCfgBase" 
> class="org.apache.ignite.configuration.CacheConfiguration" abstract="true">
>         <property name="cacheMode" value="REPLICATED"/>
>         <property name="atomicityMode" value="TRANSACTIONAL"/>
>         <property name="writeSynchronizationMode" value="FULL_SYNC"/>
>     </bean>
>
>     <!--
>         Configuration of Ignite node.
>     -->
>
>     <bean id="grid.cfg" 
> class="org.apache.ignite.configuration.IgniteConfiguration">
>         <!--
>             Apache Hadoop Accelerator configuration.
>         -->
>         <property name="hadoopConfiguration">
>
>             <bean class="org.apache.ignite.configuration.HadoopConfiguration">
>
>                 <!-- Information about finished jobs will be kept for 30 
> seconds. -->
>                 <property name="finishedJobInfoTtl" value="30000"/>
>             </bean>
>         </property>
>
>         <!--
>
>             This port will be used by Apache Hadoop client to connect to 
> Ignite node as if it was a job tracker.
>         -->
>         <property name="connectorConfiguration">
>
>             <bean 
> class="org.apache.ignite.configuration.ConnectorConfiguration">
>                 <property name="port" value="11211"/>
>             </bean>
>         </property>
>
>         <!--
>
>             Configure one IGFS file system instance named "igfs" on this node.
>         -->
>         <property name="fileSystemConfiguration">
>             <list>
>
>                 <bean 
> class="org.apache.ignite.configuration.FileSystemConfiguration" 
> parent="igfsCfgBase">
>                     <property name="name" value="igfs"/>
>
>                     <!-- Caches with these names must be configured. -->
>                     <property name="metaCacheName" value="igfs-meta"/>
>                     <property name="dataCacheName" value="igfs-data"/>
>
>
>                     <!-- Configure TCP endpoint for communication with the 
> file system instance. -->
>                     <property name="ipcEndpointConfiguration">
>
>                         <bean 
> class="org.apache.ignite.igfs.IgfsIpcEndpointConfiguration">
>                             <property name="type" value="TCP" />
>                             <property name="host" value="0.0.0.0" />
>                             <property name="port" value="10500" />
>                         </bean>
>                     </property>
>
>                     <!-- Sample secondary file system configuration.
>                         'uri'      - the URI of the secondary file system.
>
>                         'cfgPath'  - optional configuration path of the 
> secondary file system,
>
>                             e.g. /opt/foo/core-site.xml. Typically left to be 
> null.
>
>                         'userName' - optional user name to access the 
> secondary file system on behalf of. Use it
>
>                             if Hadoop client and the Ignite node are running 
> on behalf of different users.
>                     -->
>                     <property name="secondaryFileSystem">
>
>                         <bean 
> class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
>
>                             <constructor-arg name="uri" 
> value="hdfs://*.*.*.*:9000"/>
>
>                             <constructor-arg 
> name="cfgPath"><null/></constructor-arg>
>
>                             <constructor-arg name="userName" 
> value="client-user-name"/>
>                         </bean>
>                     </property>
>                 </bean>
>             </list>
>         </property>
>
>         <!--
>             Caches needed by IGFS.
>         -->
>         <property name="cacheConfiguration">
>             <list>
>                 <!-- File system metadata cache. -->
>
>                 <bean 
> class="org.apache.ignite.configuration.CacheConfiguration" 
> parent="metaCacheCfgBase">
>                     <property name="name" value="igfs-meta"/>
>                 </bean>
>
>                 <!-- File system files data cache. -->
>
>                 <bean 
> class="org.apache.ignite.configuration.CacheConfiguration" 
> parent="dataCacheCfgBase">
>                     <property name="name" value="igfs-data"/>
>                 </bean>
>             </list>
>         </property>
>
>         <!--
>             Disable events.
>         -->
>         <property name="includeEventTypes">
>             <list>
>
>                 <util:constant 
> static-field="org.apache.ignite.events.EventType.EVT_TASK_FAILED"/>
>
>                 <util:constant 
> static-field="org.apache.ignite.events.EventType.EVT_TASK_FINISHED"/>
>
>                 <util:constant 
> static-field="org.apache.ignite.events.EventType.EVT_JOB_MAPPED"/>
>             </list>
>         </property>
>
>         <!--
>
>             TCP discovery SPI can be configured with list of addresses if 
> multicast is not available.
>         -->
>         <property name="discoverySpi">
>
>             <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>                 <property name="ipFinder">
>
>                     <bean 
> class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
>                         <property name="addresses">
>                             <list>
>                                 <value>*.*.*.*</value>
>                                 <value>*.*.*.*:47500..47509</value>
>                             </list>
>                         </property>
>                     </bean>
>                 </property>
>             </bean>
>         </property>
>     </bean>
> </beans>
>
> ------------------------------
> [email protected]
> 北京润通丰华科技有限公司
> 李宜明 like wind exist
> Tel: 13811682465
>
>
> *From:* Vladimir Ozerov <[email protected]>
> *Date:* 2016-03-17 13:37
> *To:* user <[email protected]>
> *Subject:* Re: about mr accelerator question.
> Hi,
>
> The fact that you can work with a 29G cluster with only 8G of memory might
> be caused by the following things:
> 1) Your job doesn't use all data from the cluster and hence caches only
> part of it. This is the most likely case.
> 2) You have an eviction policy configured for the IGFS data cache.
> 3) Or maybe you use offheap.
> Please provide the full XML configuration and we will be able to figure it
> out.
>
> Anyway, your initial question was about out-of-memory. Could you provide
> the exact error message? Is it about heap memory or maybe permgen?
>
> As for execution time, this depends on your workload. If there are lots of
> map tasks and very active work with data, you will see an improvement in
> speed. If there are lots of operations on the file system (e.g. mkdirs,
> move, etc.) and very few map tasks, chances are there will be no speedup
> at all. Provide more details on the job you test and the type of data you
> use and we will be able to give you more ideas on what to do.
>
> Vladimir.
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/about-mr-accelerator-question-tp3502p3552.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>
>
