[ https://issues.apache.org/jira/browse/IGNITE-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732279#comment-15732279 ]
ASF GitHub Bot commented on IGNITE-4270:
----------------------------------------
GitHub user devozerov opened a pull request:
https://github.com/apache/ignite/pull/1334
IGNITE-4270
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gridgain/apache-ignite ignite-4270
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/ignite/pull/1334.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1334
----
commit 6e62ae1b8ad2cb843945ad9961d16ed898d656a5
Author: devozerov <[email protected]>
Date: 2016-11-17T15:50:05Z
Fix.
commit c173c1c744a95d5bd82ab4924405ce8582fa03a2
Author: devozerov <[email protected]>
Date: 2016-11-17T15:58:43Z
Fix.
commit be74c2961f2f80acd3b29f299468ffbd93082161
Author: devozerov <[email protected]>
Date: 2016-11-17T16:22:49Z
Fixing write.
commit 65b59b571b7c601aa82199324a6a33da8ddf233f
Author: devozerov <[email protected]>
Date: 2016-11-18T09:37:54Z
Optimizing shuffle key write (offheap -> heap).
commit a435b6eb893be80d2d40ceeebff18d79ea4bbabb
Author: devozerov <[email protected]>
Date: 2016-11-18T09:51:45Z
Fix.
commit 797841da9c445cbdf3296a8068df79963d7b9408
Author: devozerov <[email protected]>
Date: 2016-11-21T08:48:44Z
Reworked shuffle message to take advantage of direct marshalling.
commit c2bff238eaca73274b52bf18453e155dbd450a51
Author: devozerov <[email protected]>
Date: 2016-11-21T09:15:52Z
Removed sleep from shuffle job thread.
commit 5c733d5a9613a1893cdf2f66f5df27db61618d22
Author: devozerov <[email protected]>
Date: 2016-11-21T14:34:02Z
Other fixes.
commit 84b23474ba9917f5160691fa677d7736a602b329
Author: devozerov <[email protected]>
Date: 2016-11-22T13:58:38Z
Switched to hashmap for mapper.
commit ed323764236b91561b522d64aded3c9e7cf86039
Author: devozerov <[email protected]>
Date: 2016-11-22T14:46:36Z
Reverted sorted map -> hash map switch.
commit b173659677e1992c64b69ba7604096791f38e700
Author: devozerov <[email protected]>
Date: 2016-11-23T10:53:37Z
Added mapper index to task info.
commit 152e23336a89a5db9d6ed8887ff87d5b760adc09
Author: devozerov <[email protected]>
Date: 2016-11-23T11:15:23Z
WIP.
commit 2c502135247127129cf2572d32ab717acac9ac2c
Author: devozerov <[email protected]>
Date: 2016-11-23T11:52:07Z
Merge branch 'ignite-1.6.10-hadoop-debug' into ignite-1.7.4-hadoop-debug
commit 512d2edf5203d127ca645ad2df606ea5f310d7f2
Author: devozerov <[email protected]>
Date: 2016-11-23T12:01:25Z
IGNITE-4286: Hadoop: Introduced plain property names.
commit f321e2aaa09dc7824fb157a344a8c1f15628b81d
Author: devozerov <[email protected]>
Date: 2016-11-23T12:08:34Z
IGNITE-4274: Hadoop: introduced "ignite.shuffle.message.size" property to
control approximate shuffle message size.
commit 4eb63be40166140bd778b98875efa7f496d88f96
Author: devozerov <[email protected]>
Date: 2016-11-23T12:17:35Z
IGNITE-4271: Hadoop: shuffle messages now use direct marshalling. This
closes #1266.
commit 2cc438a25de5b95f3083aac2cc07efe923ea848c
Author: devozerov <[email protected]>
Date: 2016-11-23T12:18:58Z
IGNITE-4281: Hadoop: decoupled local and remote reduce maps. This closes
#1264.
commit a3b754da76056beb310e923a563ffee93a2913e7
Author: devozerov <[email protected]>
Date: 2016-11-23T12:23:22Z
Merge branch 'ignite-1.7.4-hadoop' into ignite-4270
# Conflicts:
#
modules/hadoop/src/main/java/org/apache/ignite/internal/processors/hadoop/shuffle/HadoopShuffle.java
#
modules/hadoop/src/main/java/org/apache/ignite/internal/processors/hadoop/shuffle/HadoopShuffleJob.java
commit da5739075487ce9f7ff9019a0cc77a9c40facc50
Author: devozerov <[email protected]>
Date: 2016-11-23T12:23:47Z
Fixes after merge.
commit 6258d1c245ce312a2dee4a3ab34161281d6e9fee
Author: devozerov <[email protected]>
Date: 2016-11-23T12:25:11Z
Added relevant property.
commit 8240813a326b6c6ac862d1642dea903ce28670a0
Author: devozerov <[email protected]>
Date: 2016-11-24T08:08:32Z
WIP.
commit c421db1790d3c8528ade44c6089b19855a692254
Author: devozerov <[email protected]>
Date: 2016-11-24T08:14:32Z
Merge branch 'ignite-1.7.4' into ignite-1.7.4-hadoop-debug
commit b73fa8c4e9f52e96bd806e1cdbec1bed3623a54f
Author: devozerov <[email protected]>
Date: 2016-11-24T08:53:29Z
IGNITE-4295: Replaced "unsafe" byte[]->byte[] copying with System.arraycopy.
commit 2c61b6622f77b52ad1f568f0d7fc28c2c40bf027
Author: devozerov <[email protected]>
Date: 2016-11-24T09:01:10Z
Merge branch 'ignite-4295' into ignite-1.7.4-hadoop-debug
commit 1469f72ea5994bcb05ea20fa42db0d006161d832
Author: devozerov <[email protected]>
Date: 2016-11-24T10:09:49Z
Experimental commit for threshold-based per-byte memory copy.
commit 3ecd2773f5663a10164cbcc87c1ee2a152a48e93
Author: devozerov <[email protected]>
Date: 2016-11-24T10:27:02Z
Fix to default threshold value.
commit bd045d1760188a2118f8f28bfe451a2a3c90b076
Author: devozerov <[email protected]>
Date: 2016-11-23T12:01:25Z
IGNITE-4286: Hadoop: Introduced plain property names.
commit 5547fcba07d36e34a0453168143cd0a5bd70380d
Author: devozerov <[email protected]>
Date: 2016-11-23T12:08:34Z
IGNITE-4274: Hadoop: introduced "ignite.shuffle.message.size" property to
control approximate shuffle message size.
commit 967579d7553de0ac0bba04538ac811c4492dd0f9
Author: devozerov <[email protected]>
Date: 2016-11-24T11:25:16Z
Fixes after merge.
commit bb3c07088bdcc59bea0b954276f21fece012959f
Author: devozerov <[email protected]>
Date: 2016-11-24T11:28:33Z
Merge commit 'f321e2aa' into ignite-1.7.4-hadoop-debug
----
> Hadoop: optionally stripe mapper output for every partition.
> ------------------------------------------------------------
>
> Key: IGNITE-4270
> URL: https://issues.apache.org/jira/browse/IGNITE-4270
> Project: Ignite
> Issue Type: Sub-task
> Components: hadoop
> Affects Versions: 1.8
> Reporter: Vladimir Ozerov
> Assignee: Vladimir Ozerov
> Labels: performance
> Fix For: 2.0
>
>
> Currently we have R maps for M mappers, where R is the number of reducers. As
> a result, many mappers write to the same concurrent offheap data structure,
> losing time to contention.
> Let's add an option to create R * M maps, so that every mapper has a dedicated
> map for every reducer. This will eliminate almost all concurrency overhead.
> Design:
> 1) Every mapper works with its own set of "remote" output maps;
> 2) These maps are essentially not "maps", but IO messages, which we fill up
> to a certain threshold;
> 3) Once filled, a message is sent to the remote node;
> 4) The async shuffle thread is no longer needed in this architecture.
> As a result we decrease concurrency, remove the slowdown from a single shuffle
> thread which is not able to send messages fast enough, and remove
> unnecessary intermediate sorting.
> NB! Be careful with the "combiner" case and with "external" execution.
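
The design steps quoted above can be illustrated with a minimal sketch. This is not the code from the pull request: the class name StripedMapperOutput, the byte-count threshold, and the sender callback are all hypothetical and chosen only to show the idea of one buffer per reducer, owned by a single mapper, that is sent as soon as it fills up.

```java
import java.io.ByteArrayOutputStream;
import java.util.function.BiConsumer;

/**
 * Hypothetical sketch (not Ignite code): the output of ONE mapper, striped
 * into one message buffer per reducer. With M mappers this gives R * M buffers.
 */
final class StripedMapperOutput {
    /** One buffer per reducer (partition); the array index is the reducer number. */
    private final ByteArrayOutputStream[] bufs;

    /** Approximate message size in bytes that triggers a send (assumed unit). */
    private final int threshold;

    /** Assumed transport hook: receives (reducer, message bytes). */
    private final BiConsumer<Integer, byte[]> sender;

    StripedMapperOutput(int reducers, int threshold, BiConsumer<Integer, byte[]> sender) {
        this.bufs = new ByteArrayOutputStream[reducers];
        for (int i = 0; i < reducers; i++)
            bufs[i] = new ByteArrayOutputStream();
        this.threshold = threshold;
        this.sender = sender;
    }

    /**
     * Appends a serialized record to the buffer of the target reducer and sends
     * the buffer once it reaches the threshold. No locking is needed because
     * only the owning mapper thread ever calls this method.
     */
    void write(int reducer, byte[] record) {
        ByteArrayOutputStream buf = bufs[reducer];
        buf.write(record, 0, record.length);
        if (buf.size() >= threshold)
            flush(reducer);
    }

    /** Sends whatever is buffered for the given reducer, if anything. */
    void flush(int reducer) {
        ByteArrayOutputStream buf = bufs[reducer];
        if (buf.size() == 0)
            return;
        sender.accept(reducer, buf.toByteArray());
        buf.reset();
    }

    /** Sends all partially filled buffers when the mapper finishes. */
    void close() {
        for (int r = 0; r < bufs.length; r++)
            flush(r);
    }
}
```

In this sketch the mapper thread itself performs the send, which is why no separate async shuffle thread appears. A real implementation would also have to handle the "combiner" and "external" execution cases mentioned in the description, which are left out here.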