Hello,

> Second, additional classes will turn performance critical callsites 
> megamorphic.

Yes, this is a completely valid point, thanks for raising this issue,
Greg. We were planning to have an offline discussion about this with
Pattarawat tomorrow. We have a few options:
1. We could fuse the QuickSort and NormalizedKeySorter into a single
class when generating the code.
2. As you said, duplicating the QuickSort class might also be a nice
solution, if we can find a simple way to pull it off (rough sketch below).
3. We might run some benchmarks and find that the effect of this
issue is negligible, in which case we could ignore it. But I suspect
that this is not the case, so we'll probably have to go with one of
the above options.
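
Regarding option 2, here is a rough sketch of how Greg's Janino idea could
look (the class and method names are made up for illustration, and this is
untested): generate a copy of the sort class source with a unique name per
IndexedSortable type and compile it with Janino's SimpleCompiler, so that
every copy gets its own monomorphic compare/swap callsites:

  import org.codehaus.janino.SimpleCompiler;

  // Hypothetical helper: compiles a uniquely renamed copy of the sort class.
  public final class SorterDuplicator {

      // quickSortSource: Java source of the sort class, containing the
      //                  placeholder CLASS_NAME where the class name goes.
      // uniqueName:      a name that is unique per IndexedSortable type.
      public static Class<?> compileRenamedCopy(String quickSortSource, String uniqueName)
              throws Exception {
          String source = quickSortSource.replace("CLASS_NAME", uniqueName);
          SimpleCompiler compiler = new SimpleCompiler();
          compiler.cook(source);
          return compiler.getClassLoader().loadClass(uniqueName);
      }
  }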

Best,
Gábor



On Thu, Mar 23, 2017 at 6:12 PM, Greg Hogan <c...@greghogan.com> wrote:
> I would be more than happy to shepherd and review this PR.
>
> I have two discussion points. First, a strategy for developing with 
> templates. IntelliJ has a FreeMarker plugin, but we lose formatting and code 
> completion. To minimize this issue, we can retain the untemplated code in an 
> abstract class which is then concretely subclassed by the template.
>
> Second, additional classes will turn performance critical callsites 
> megamorphic. Stephan noted this issue in his work on MemorySegment.
>   http://flink.apache.org/news/2015/09/16/off-heap-memory.html
>
> For example, QuickSort calls IndexedSortable#compare and 
> IndexedSortable#swap. With multiple compiled implementations of the sorter 
> template these callsites can no longer be inlined (the same is true with 
> NormalizedKeySorter and FixedLengthRecordSorter if the latter was 
> instrumented).
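>
> To make this concrete, here is a minimal sketch (simplified stand-ins, not 
> the actual Flink classes) of the kind of interface callsite involved. As long 
> as only one implementation is ever hot at this callsite, the JIT can 
> devirtualize and inline compare/swap; once a second (e.g. generated) 
> implementation shows up, the callsite goes polymorphic and the inlining is 
> lost:
>
>   // Simplified stand-in for IndexedSortable (illustration only).
>   interface Sortable {
>       int compare(int i, int j);
>       void swap(int i, int j);
>   }
>
>   // Simplified stand-in for the sort loop in QuickSort.
>   final class SimpleSorter {
>       void insertionSort(Sortable s, int from, int to) {
>           for (int i = from + 1; i < to; i++) {
>               for (int j = i; j > from && s.compare(j - 1, j) > 0; j--) {
>                   // Virtual calls: the performance-critical callsites.
>                   s.swap(j - 1, j);
>               }
>           }
>       }
>   }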
>
> I have not found a way to duplicate a Java class at runtime, but we may be 
> able to use Janino to compile a class which is then uniquely renamed: each 
> IndexedSortable type would map to a different QuickSort type (same bytecode, 
> but uniquely optimized). This should also boost the performance of runtime 
> operators calling user-defined functions.
>
> Given the code already written, I expect we can refactor, review, and 
> benchmark for the 1.3 release.
>
> Greg
>
>
>> On Mar 21, 2017, at 3:46 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>
>> Hi Pat,
>>
>> thanks a lot for this great proposal! I think it is very well structured and 
>> has the right level of detail.
>> The improvements shown in your performance benchmarks look very promising, and I 
>> think code-gen'd sorters would be a very nice improvement.
>> I like that you plan to add a switch to activate this feature.
>>
>> In order to move on, we will need a committer who "champions" your FLIP, 
>> reviews the pull request, and eventually merges it.
>>
>> @Greg and @Stephan, what do you think about this proposal?
>>
>> Best, Fabian
>>
>>
>> 2017-03-14 16:10 GMT+01:00 Pattarawat Chormai <pat.chor...@gmail.com>:
>> Hi all,
>>
>> I would like to initiate a discussion about applying code generation to 
>> NormalizedKeySorter. The goal is to improve sorting performance by 
>> generating a NormalizedKeySorter tailored to the underlying data. The 
>> generated sorter will contain only the necessary code in important methods, 
>> such as swap and compare, hence improving sorting performance.
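>>
>> To illustrate what "only necessary code" means, here is a hand-written 
>> example (not the actual generated code from the pull request): when the 
>> normalized key is known at generation time to be exactly 8 bytes, the 
>> generic byte-by-byte comparison loop can be replaced by a single unsigned 
>> long comparison, for example:
>>
>>   final class CompareSketch {
>>       // Generic comparison: loops over a variable-length key, byte by byte.
>>       static int compareGeneric(byte[] a, int offA, byte[] b, int offB, int len) {
>>           for (int i = 0; i < len; i++) {
>>               int cmp = (a[offA + i] & 0xff) - (b[offB + i] & 0xff);
>>               if (cmp != 0) {
>>                   return cmp;
>>               }
>>           }
>>           return 0;
>>       }
>>
>>       // Specialized comparison a generated sorter could use for a fixed
>>       // 8-byte key: no loop, no per-byte branch.
>>       static int compareFixed8(long keyA, long keyB) {
>>           return Long.compareUnsigned(keyA, keyB);
>>       }
>>   }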
>>
>> Details of the implementation are described in FLIP-18: Code Generation 
>> for improving sorting performance. 
>> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-18%3A+Code+Generation+for+improving+sorting+performance>
>>
>>
>> Also, because we’re doing it as a course project at TUB, we have already 
>> completed the implementation and made a pull request 
>> <https://github.com/apache/flink/pull/3511> to the Flink repo.
>>
>> From our evaluation, we have found that the pull request reduces sorting 
>> time by around 7-10%, and together with FLINK-3722 
>> <https://issues.apache.org/jira/browse/FLINK-3722> the sorting time is 
>> reduced by 12-20%.
>>
>>
>>
>> Please take a look at the document and the pull request, and let me know if 
>> you have any suggestions.
>>
>> Best,
>> Pat
>>
>
