[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-10-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196305#comment-16196305
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4445


> always use HybridMemorySegment
> --
>
> Key: FLINK-7310
> URL: https://issues.apache.org/jira/browse/FLINK-7310
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
> Fix For: 1.4.0
>
>
> For future changes to the network buffers (sending our own off-heap buffers 
> through to netty), we cannot use {{HeapMemorySegment}} anymore and need to 
> rely on {{HybridMemorySegment}} instead.
> We should thus drop any code that loads the {{HeapMemorySegment}} (it is 
> still available if needed) in favour of the {{HybridMemorySegment}} which is 
> able to work on both heap and off-heap memory.
> FYI: For the performance penalty of this change compared to using 
> {{HeapMemorySegment}} alone, see this interesting blog article (from 2015):
> https://flink.apache.org/news/2015/09/16/off-heap-memory.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-10-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195032#comment-16195032
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4445
  
Agree with @KurtYoung.
Merging this...




[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-09-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154711#comment-16154711
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/4445
  
I would bet on deserialization as the cause. The sorter suffers a larger 
regression than the hash join because it performs more deserializations 
when comparing records.

Despite the regression we will face, I think it's still worthwhile since we 
can avoid an extra copy from network to runtime. It would be better to take 
the extra copy into account in the benchmark, but it's OK if we don't. 

+1 to merge this.




[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-08-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143910#comment-16143910
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4445
  
Thanks!

I am currently trying to pinpoint what part of the code exactly suffers 
most from the regression. If that is for example specific to the 
microbenchmark, we can merge this without concern...




[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123110#comment-16123110
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/4445
  
FYI: I just rebased this PR onto current `master` to make it mergeable and 
to support further extensions.




[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-08-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116530#comment-16116530
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/4445
  
I think the implementation of the change is good, but the performance 
impact seems noticeable, at least in some cases. I think the additional bounds 
checking in the hybrid case shows. Out of curiosity I deactivated the index 
bounds checks and this closed all gaps between `HeapMemorySegment` and 
`HybridMemorySegment` in the benchmarks that @NicoK mentioned.

If @StephanEwen has no concerns about the performance regression, I think 
this could be merged.
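
To make the "additional bounds checking" concrete, here is a minimal sketch of the two access styles being compared. It is not Flink's code: `BoundsCheckSketch`, its fields and `rawRead` are hypothetical names, and the raw read is stood in for by a plain array access (a real implementation would presumably use something like `sun.misc.Unsafe`, where the explicit check is the only guard).

```java
// Hypothetical sketch of the two access styles; not Flink's actual classes.
final class BoundsCheckSketch {
    private final byte[] memory;
    private final long address;      // assumed start position of the segment
    private final long addressLimit; // first position past the end of the segment

    BoundsCheckSketch(int size) {
        this.memory = new byte[size];
        this.address = 0L;
        this.addressLimit = address + size;
    }

    void put(int index, byte value) {
        memory[index] = value;
    }

    // Heap-only style: no explicit check, the JVM's array bounds check is the guard.
    byte getHeapStyle(int index) {
        return memory[index];
    }

    // Hybrid style: an explicit range check on every access, then a raw read.
    byte getHybridStyle(int index) {
        final long pos = address + index;
        if (index >= 0 && pos < addressLimit) {
            return rawRead(pos);
        }
        throw new IndexOutOfBoundsException("index " + index);
    }

    // Stand-in for an unchecked raw memory read.
    private byte rawRead(long pos) {
        return memory[(int) pos];
    }

    // Returns true if reading the given index is rejected by the hybrid-style check.
    static boolean rejects(BoundsCheckSketch segment, int index) {
        try {
            segment.getHybridStyle(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        BoundsCheckSketch segment = new BoundsCheckSketch(16);
        segment.put(3, (byte) 7);
        System.out.println(segment.getHeapStyle(3) + " " + segment.getHybridStyle(3));
        System.out.println(rejects(segment, 16) + " " + rejects(segment, -1));
    }
}
```

The explicit comparison in `getHybridStyle` runs on every single-byte access, which is the kind of per-access overhead the experiment above switched off.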




[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-08-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114322#comment-16114322
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/4445
  
In a non-exhaustive mini benchmark, I ran `HashVsSortMiniBenchmark` and got 
the following results:

# Best out of 5 (in ms)

Test | `master` | `FLINK-7310`
---- | -- | --
Hash Build First | 5541 | 5629
Sort-Merge | 6194 | 6816
Hash Build Second | 3563 | 3629

# All results (in ms)

## `master`

Test | 1 | 2 | 3 | 4 | 5
---- | - | - | - | - | -
Hash Build First | 5772.0 | 5541.0 | 5707.0 | 5733.0 | 5751.0
Sort-Merge | 6704.0 | 7146.0 | 6194.0 | 6915.0 | 6445.0
Hash Build Second | 3834.0 | 3805.0 | 3811.0 | 3587.0 | 3563.0

## `FLINK-7310`

Test | 1 | 2 | 3 | 4 | 5
---- | - | - | - | - | -
Hash Build First | 5816.0 | 5770.0 | 5629.0 | 5656.0 | 5745.0
Sort-Merge | 7284.0 | 7233.0 | 6816.0 | 6861.0 | 7218.0
Hash Build Second | 3802.0 | 3836.0 | 3629.0 | 3782.0 | 3804.0
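
For reference, the best-of-5 figures and the relative gap between the two branches can be recomputed from the per-run numbers above. This is a minimal sketch in Java; the class, field and method names are made up for illustration and are not part of `HashVsSortMiniBenchmark`.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper that recomputes "best out of 5" and the relative slowdown.
final class BestOfFive {
    // Per-run times in ms, copied from the tables above.
    static final double[] MASTER_HASH_FIRST = {5772.0, 5541.0, 5707.0, 5733.0, 5751.0};
    static final double[] MASTER_SORT_MERGE = {6704.0, 7146.0, 6194.0, 6915.0, 6445.0};
    static final double[] MASTER_HASH_SECOND = {3834.0, 3805.0, 3811.0, 3587.0, 3563.0};
    static final double[] BRANCH_HASH_FIRST = {5816.0, 5770.0, 5629.0, 5656.0, 5745.0};
    static final double[] BRANCH_SORT_MERGE = {7284.0, 7233.0, 6816.0, 6861.0, 7218.0};
    static final double[] BRANCH_HASH_SECOND = {3802.0, 3836.0, 3629.0, 3782.0, 3804.0};

    // Fastest run of a series.
    static double best(double[] runs) {
        return Arrays.stream(runs).min().getAsDouble();
    }

    // How much slower the candidate's best run is than the baseline's best run, in percent.
    static double slowdownPercent(double[] baseline, double[] candidate) {
        return (best(candidate) / best(baseline) - 1.0) * 100.0;
    }

    static void report(String name, double[] baseline, double[] candidate) {
        System.out.println(String.format(Locale.ROOT, "%s: %.0f -> %.0f ms (+%.1f%%)",
                name, best(baseline), best(candidate), slowdownPercent(baseline, candidate)));
    }

    public static void main(String[] args) {
        report("Hash Build First", MASTER_HASH_FIRST, BRANCH_HASH_FIRST);
        report("Sort-Merge", MASTER_SORT_MERGE, BRANCH_SORT_MERGE);
        report("Hash Build Second", MASTER_HASH_SECOND, BRANCH_HASH_SECOND);
    }
}
```

On these numbers the best Sort-Merge run is roughly 10% slower on the branch, while both hash join variants stay within about 2%.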




[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-08-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108870#comment-16108870
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4445
  
These changes look good to me!

This change does have a potential performance impact. It would be good to 
understand how large the impact of only using the `HybridMemorySegment` 
is now.
We could run something like a Hash Join Performance test with key/value 
pairs of String keys (which are the most performance sensitive to serialize / 
deserialize with individual byte operations) and see if this has a measurable 
impact there.




[jira] [Commented] (FLINK-7310) always use HybridMemorySegment

2017-08-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108578#comment-16108578
 ] 

ASF GitHub Bot commented on FLINK-7310:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4445

[FLINK-7310][core] always use the HybridMemorySegment

## What is the purpose of the change

Since we'd like to use our own off-heap buffers for network communication, 
we cannot use `HeapMemorySegment` anymore and need to rely on 
`HybridMemorySegment`. We thus drop any code that loads the `HeapMemorySegment` 
(it is still available if needed) in favour of the `HybridMemorySegment` which 
is able to work on both heap and off-heap memory.

For the performance penalty of this change compared to using 
`HeapMemorySegment` alone, see this interesting blog article (from 2015):
https://flink.apache.org/news/2015/09/16/off-heap-memory.html

## Brief change log

  - drop any use of the `HeapMemorySegment` (however, for now, keep the 
class and its factory)
  - integrate `HybridMemorySegmentFactory` into `MemorySegmentFactory` 
(with hard-coded use of `HybridMemorySegment`)
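
The idea of a single segment type that serves both heap and off-heap memory, created through one factory, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `SketchSegment` and its methods are hypothetical, and the sketch is backed by `java.nio.ByteBuffer` purely to stay self-contained, which is not how `HybridMemorySegment` is implemented.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a segment type that can sit on heap or off-heap memory.
// NOT Flink's HybridMemorySegment; it only mirrors the idea using ByteBuffer.
final class SketchSegment {
    private final ByteBuffer buffer;

    private SketchSegment(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    // Factory method: heap-backed, the bytes live in a garbage-collected byte[].
    static SketchSegment allocateHeap(int size) {
        return new SketchSegment(ByteBuffer.allocate(size));
    }

    // Factory method: off-heap, the bytes live in native memory outside the Java heap.
    static SketchSegment allocateOffHeap(int size) {
        return new SketchSegment(ByteBuffer.allocateDirect(size));
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }

    // Absolute accessors; ByteBuffer rejects out-of-range indices itself.
    void putLong(int index, long value) {
        buffer.putLong(index, value);
    }

    long getLong(int index) {
        return buffer.getLong(index);
    }

    // The calling code is identical for both kinds of memory.
    static long roundTrip(SketchSegment segment, long value) {
        segment.putLong(8, value);
        return segment.getLong(8);
    }

    public static void main(String[] args) {
        SketchSegment heap = allocateHeap(32);
        SketchSegment offHeap = allocateOffHeap(32);
        System.out.println(heap.isOffHeap() + " " + roundTrip(heap, 42L));
        System.out.println(offHeap.isOffHeap() + " " + roundTrip(offHeap, 42L));
    }
}
```

Callers only ever see one concrete type here, whichever kind of memory backs it.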

## Verifying this change

This change is already covered by existing tests, such as the memory-backend 
specific tests under `flink/core/memory`, and in fact by all other tests that 
run programs on Flink. So far, the `HybridMemorySegment` was not exercised much 
in integration tests because most tests used on-heap memory and thus 
`HeapMemorySegment`. Since we now only use `HybridMemorySegment`, many more 
tests cover it.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (yes)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-7310

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4445.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4445


commit c62e793712effbfb53ea6442b5d714a68081f7ec
Author: Nico Kruber 
Date:   2017-07-31T10:06:14Z

[hotfix] fix some typos

commit d3c4e231a96b6ae133576a74646294749ab3809a
Author: Nico Kruber 
Date:   2017-07-31T12:18:42Z

[FLINK-7310][core] always use the HybridMemorySegment

Since we'd like to use our own off-heap buffers for network communication, we
cannot use HeapMemorySegment anymore and need to rely on HybridMemorySegment.
We thus drop any code that loads the HeapMemorySegment (it is still available
if needed) in favour of the HybridMemorySegment which is able to work on both
heap and off-heap memory.

For the performance penalty of this change compared to using HeapMemorySegment
alone, see this interesting blog article (from 2015):
https://flink.apache.org/news/2015/09/16/off-heap-memory.html



