[ 
https://issues.apache.org/jira/browse/SPARK-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126133#comment-15126133
 ] 

Vishal Gupta commented on SPARK-13083:
--------------------------------------

cc [~marmbrus] , [~rxin] 

> Small Spark SQL queries get blocked if there is a long-running query over a 
> lot of partitions
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13083
>                 URL: https://issues.apache.org/jira/browse/SPARK-13083
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.5.1
>            Reporter: Vishal Gupta
>
> Steps to reproduce :
> a) Run a first query doing count(*) over a lot of partitions (~4500 partitions) 
> in S3.
> b) The spark-job for the first query starts running.
> c) Run a second query, "show tables", against the same Spark application (I did it 
> using Zeppelin).
> d) As soon as the second query "show tables" is submitted, it starts showing 
> up in the "Spark Application UI" > "SQL".
> e) At this point there is only one active job running in the application 
> which corresponds to the first query.
> f) Only after the job for the first query is near completion, the job for 
> "show tables" starts appearing in "Spark Application UI" > "Jobs". 
> g) As soon as the job for "show tables" starts, it completes very fast and 
> gives the results.
> Sometimes step (c) has to be performed after 1-2 minutes of execution of the 
> long-running query. But after this point, jobs do not get started for any 
> number of smaller queries submitted to the Spark application until the 
> long-running query is near completion. 
> They seem to be blocked on the long-running query. Ideally, they should have 
> started running, as all the settings are for the fair scheduler.
> I am running spark-1.5.1. In addition, I have the following configs:
> {code}
> spark.scheduler.mode FAIR
> spark.scheduler.allocation.file /usr/lib/spark/conf/fairscheduler.xml
> {code}
> /usr/lib/spark/conf/fairscheduler.xml has the following contents:
> {code}
> <?xml version="1.0"?>
> <allocations>
>   <pool name="default">
>     <schedulingMode>FAIR</schedulingMode>
>   </pool>
> </allocations>
> {code}
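> For comparison, a minimal sketch of an allocation file with a second pool 
> reserved for short queries. The pool name "short_queries" and the weight and 
> minShare values are illustrative assumptions, not part of the setup above. 
> Jobs only run in that pool if the submitting thread sets the 
> spark.scheduler.pool local property to its name. Whether this changes the 
> blocking described above is untested.
> {code}
> <?xml version="1.0"?>
> <allocations>
>   <pool name="default">
>     <schedulingMode>FAIR</schedulingMode>
>   </pool>
>   <!-- hypothetical pool for short queries; name and values are illustrative -->
>   <pool name="short_queries">
>     <schedulingMode>FAIR</schedulingMode>
>     <weight>2</weight>
>     <minShare>2</minShare>
>   </pool>
> </allocations>
> {code}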



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
