[ 
https://issues.apache.org/jira/browse/TRAFODION-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546971#comment-15546971
 ] 

ASF GitHub Bot commented on TRAFODION-2259:
-------------------------------------------

Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/743#discussion_r81871556
  
    --- Diff: core/sql/sort/Topn.cpp ---
    @@ -0,0 +1,337 @@
    +/**********************************************************************
    +// @@@ START COPYRIGHT @@@
    +//
    +// Licensed to the Apache Software Foundation (ASF) under one
    +// or more contributor license agreements.  See the NOTICE file
    +// distributed with this work for additional information
    +// regarding copyright ownership.  The ASF licenses this file
    +// to you under the Apache License, Version 2.0 (the
    +// "License"); you may not use this file except in compliance
    +// with the License.  You may obtain a copy of the License at
    +//
    +//   http://www.apache.org/licenses/LICENSE-2.0
    +//
    +// Unless required by applicable law or agreed to in writing,
    +// software distributed under the License is distributed on an
    +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +// KIND, either express or implied.  See the License for the
    +// specific language governing permissions and limitations
    +// under the License.
    +//
    +// @@@ END COPYRIGHT @@@
    +**********************************************************************/
    +/* -*-C++-*-
    
+******************************************************************************
    +*
    +* File:         TopN.cpp
    +*                               
    +* Description:  This file contains the implementation of all member functions
    +*               of the class TopN.
    +*               
    +* 1. Sort initially maintains a TopN array of elements to begin with.
    +* 2. Read records into the TopN array.
    +* 3. Once the TopN array is full, heapify the array into a max heap. The top node in the heap is always the highest node.
    +* 4. Each subsequent record read is either discarded (if greater than the top node) or replaces the top node (if less than the top node). If the top node is replaced, re-balance the heap.
    +* 5. Repeat step 4 until the last record is read.
    +* 6. Sort the final heap using heap sort.
    
+*******************************************************************************/
    +
    +#include <iostream>
    +#include <fstream>
    +
    +#ifndef DEBUG
    +#undef NDEBUG
    +#define NDEBUG
    +#endif
    +#include "ex_stdh.h"
    +#include "Topn.h"
    +#include "ScratchSpace.h"
    +#include "logmxevent.h"
    +#include "SortUtil.h"
    +#include "ex_ex.h"
    +#include "ExStats.h"
    +
    +//------------------------------------------------------------------------
    +// Class Constructor.
    +//------------------------------------------------------------------------
    +TopN::TopN(ULng32 runsize, ULng32 sortmaxmem, ULng32  recsize,
    +             NABoolean doNotallocRec, ULng32  keysize, 
    +             SortScratchSpace* scratch, NABoolean iterSort,
    +             CollHeap* heap, SortError* sorterror, Lng32 explainNodeId, SortUtil* sortutil):
    +             SortAlgo(runsize, recsize, doNotallocRec, keysize, scratch, explainNodeId),
    +             loopIndex_(0), heap_(heap), sortError_(sorterror),
    +             sortUtil_(sortutil)
    +{
    +   //runsize is TopN size. Fixed.
    +   allocRunSize_ = runsize;
    +   
    +   //Actual run size after all elements read.
    +   runSize_ = 0;   
    +   
    +   isHeapified_ = FALSE;
    +   
    +  //allocateMemory failureIsFatal is defaulted to TRUE means allocation failure results in 
    +  //longjump to jump handler defined in ex_sort.cpp. Only applicable on NSK.
    --- End diff --
    
    Is there a reason to perpetuate NSK-isms?


> Sort TopN operator
> ------------------
>
>                 Key: TRAFODION-2259
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2259
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: sql-exe
>    Affects Versions: 2.1-incubating
>            Reporter: Prashanth Vasudev
>            Assignee: Prashanth Vasudev
>
> The sort operator consumes all records before producing sorted records. For 
> certain use cases where only the top N records are required, sort today still 
> consumes all records into memory and overflows (spills) to disk. This impacts 
> performance. 
> If topN is pushed down to sort, only the required memory needs to be allocated, 
> and sort holds only topN records in memory. Once all the records are read, 
> the sorted records in topN are returned. 
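
For readers following the review, the bounded-heap approach described in the issue and in the Topn.cpp header comment can be sketched roughly as follows. This is only an illustration written against the C++ standard library, not the code in the pull request: the function name `topNSmallest`, the use of `int` records, and `std::vector` storage are all assumptions made for the example, whereas the real class works on sort records and keys.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Topn.cpp): returns the n smallest values
// of `input` in ascending order while holding at most n values at a time.
std::vector<int> topNSmallest(const std::vector<int>& input, std::size_t n)
{
    std::vector<int> heap;
    if (n == 0) return heap;          // nothing to keep
    heap.reserve(n);                  // fixed TopN capacity
    bool heapified = false;

    for (int rec : input) {
        if (heap.size() < n) {
            // Read records into the array until it is full.
            heap.push_back(rec);
            continue;
        }
        if (!heapified) {
            // Array is full: build a max-heap, so front() is the largest.
            std::make_heap(heap.begin(), heap.end());
            heapified = true;
        }
        // Discard records that are not smaller than the current top;
        // otherwise replace the top and restore the heap property.
        if (rec < heap.front()) {
            std::pop_heap(heap.begin(), heap.end());   // largest moves to back()
            heap.back() = rec;                         // overwrite it
            std::push_heap(heap.begin(), heap.end());  // re-balance
        }
    }

    // Fewer than n records (or exactly n) may have arrived without heapifying.
    if (!heapified) std::make_heap(heap.begin(), heap.end());
    assert(std::is_heap(heap.begin(), heap.end()));

    // Heap sort of a max-heap yields ascending order.
    std::sort_heap(heap.begin(), heap.end());
    return heap;
}
```

Calling `topNSmallest({5, 1, 9, 3, 7, 2}, 3)` keeps only three values in memory at any point and returns them sorted. The sketch defers heapification until the array is full, mirroring the `isHeapified_` flag visible in the constructor, though whether the actual implementation behaves the same way in every edge case is not shown in the quoted diff.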



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
