Dileep created KYLIN-1641:
-----------------------------

             Summary: Spark - pagination
                 Key: KYLIN-1641
                 URL: https://issues.apache.org/jira/browse/KYLIN-1641
             Project: Kylin
          Issue Type: Improvement
            Reporter: Dileep



Issue: We have inserted around 10 million records into Hive and show the
results in a web interface through a Spark DataFrame. We cannot fetch all
10 million records and paginate in the front end, so we paginate in the
Spark DataFrame using the following approach:
  # rows of pages 1..pagenumber
  df1 = df.limit(rowsperPage * pagenumber)
  # rows of pages 1..pagenumber-1
  df2 = df1.limit(rowsperPage * (pagenumber - 1))
  # rows of the requested page only
  df1.subtract(df2).collect()

This works fine, but as the page number increases (up to the last page) it
slows down and the results do not get back to the front end.
We just want to check whether what we are doing is right, or whether there
is another solution for this problem.
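
One possible alternative, given only as a minimal sketch: the approach above
recomputes a limit of rowsperPage * pagenumber rows for every request, and
subtract is a distinct set difference, so the work likely grows with the page
number. If each row carried a stable, precomputed index (for example one
assigned once with RDD zipWithIndex or a row_number() window over an ordered
column), a page could be selected with a simple range filter instead. The
column name row_idx, the DataFrame name indexed_df and the helper functions
below are hypothetical and are not part of the current code.

```python
def page_bounds(rows_per_page, page_number):
    """Return the half-open [start, end) row-index range for a 1-based page."""
    if rows_per_page <= 0 or page_number < 1:
        raise ValueError("rows_per_page must be > 0 and page_number >= 1")
    start = rows_per_page * (page_number - 1)
    return start, start + rows_per_page


def page_filter(rows_per_page, page_number, index_col="row_idx"):
    """Build a SQL predicate that selects one page by a precomputed index.

    `row_idx` is an assumed column name; it must hold a 0-based, gap-free
    index that was assigned once and stays stable between requests.
    """
    start, end = page_bounds(rows_per_page, page_number)
    return f"{index_col} >= {start} AND {index_col} < {end}"


# Assumed usage with a DataFrame `indexed_df` that already has row_idx:
#   rows = indexed_df.filter(page_filter(rowsperPage, pagenumber)).collect()
print(page_filter(100, 3))  # prints: row_idx >= 200 AND row_idx < 300
```

With such an index in place, the predicate has the same size for the first
and the last page, so the filter itself does not depend on the page number.
Whether this is faster in practice would still depend on how the indexed
data is stored and cached.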

Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
