[ https://issues.apache.org/jira/browse/SPARK-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126905#comment-15126905 ]
Michael Armbrust commented on SPARK-13083:
------------------------------------------
You also need to ensure the queries are running in different pools if you want
them to get a fair share of the resources.
http://spark.apache.org/docs/latest/job-scheduling.html#fair-scheduler-pools
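
As a rough illustration of the point above, an allocation file that defines more than one pool might look like the sketch below. The pool names "long_running" and "interactive" and the weight/minShare values are assumptions chosen for this example, not settings taken from the report. Per the linked documentation, a job is placed in a pool by setting the thread-local property spark.scheduler.pool (for example via SparkContext.setLocalProperty) in the thread that submits it; jobs submitted without it go to the default pool.

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Hypothetical pool for long scans, e.g. the count(*) over ~4500 partitions -->
  <pool name="long_running">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <!-- Hypothetical pool for short interactive queries, e.g. "show tables" -->
  <pool name="interactive">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <!-- minShare is a number of CPU cores; 2 is an arbitrary illustrative value -->
    <minShare>2</minShare>
  </pool>
</allocations>
```

With a file like this, the long query would be submitted from a thread whose spark.scheduler.pool is "long_running" and the short queries from a thread using "interactive". Whether a front end such as Zeppelin lets you set that property per query is not covered here.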
> Small Spark SQL queries get blocked if there is a long-running query over a
> lot of partitions
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-13083
> URL: https://issues.apache.org/jira/browse/SPARK-13083
> Project: Spark
> Issue Type: Bug
> Affects Versions: 1.5.1
> Reporter: Vishal Gupta
>
> Steps to reproduce :
> a) Run a first query doing count(*) over a lot of partitions (~4500 partitions)
> in S3.
> b) The spark-job for the first query starts running.
> c) Submit a second query, "show tables", to the same Spark application (I did it
> using Zeppelin).
> d) As soon as the second query "show tables" is submitted, it starts showing
> up in the "Spark Application UI" > "SQL".
> e) At this point there is only one active job running in the application
> which corresponds to the first query.
> f) Only after the job for the first query is near completion, the job for
> "show tables" starts appearing in "Spark Application UI" > "Jobs".
> g) As soon as the job for "show tables" starts, it completes very fast and
> gives the results.
> Sometimes step (c) has to be performed after 1-2 minutes of execution of the
> long-running query. But after this point, jobs do not get started for any
> number of smaller queries submitted to the Spark application until the
> long-running query is near completion.
> They seem to be blocked on the long-running query. Ideally, they should have
> started running, as all the settings are for the fair scheduler.
> I am running spark-1.5.1. In addition, I have the following configs:
> {code}
> spark.scheduler.mode FAIR
> spark.scheduler.allocation.file /usr/lib/spark/conf/fairscheduler.xml
> {code}
> /usr/lib/spark/conf/fairscheduler.xml has the following contents:
> {code}
> <?xml version="1.0"?>
> <allocations>
>   <pool name="default">
>     <schedulingMode>FAIR</schedulingMode>
>   </pool>
> </allocations>
> {code}