Re: Disable queuing of spark job on Mesos cluster if sufficient resources are not found

2017-05-30 Thread Michael Gummelt
The driver will remain in the queue indefinitely, unless you issue a kill
command at /v1/submissions/kill/
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala#L64
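For reference, two ways to issue that kill (a sketch; <submission-id> is the id shown on the dispatcher UI, and <dispatcher-host>:7077 stands in for wherever your dispatcher's REST endpoint listens):

  # spark-submit has built-in kill support for cluster-mode submissions
  spark-submit --master mesos://<dispatcher-host>:7077 --kill <submission-id>

  # or hit the REST endpoint directly
  curl -X POST http://<dispatcher-host>:7077/v1/submissions/kill/<submission-id>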

On Mon, May 29, 2017 at 1:15 AM, Mevada, Vatsal <mev...@sky.optymyze.com>
wrote:

> Is there any configurable timeout which controls queuing of the driver in
> Mesos cluster mode or the driver will remain in queue for indefinite until
> it find resource on cluster?
>
>
>
> *From:* Michael Gummelt [mailto:mgumm...@mesosphere.io]
> *Sent:* Friday, May 26, 2017 11:33 PM
> *To:* Mevada, Vatsal <mev...@sky.optymyze.com>
> *Cc:* user@spark.apache.org
> *Subject:* Re: Disable queuing of spark job on Mesos cluster if
> sufficient resources are not found
>
>
>
> Nope, sorry.
>
>
>
> On Fri, May 26, 2017 at 4:38 AM, Mevada, Vatsal <mev...@sky.optymyze.com>
> wrote:
>
> Hello,
>
> I am using Mesos with cluster deployment mode to submit my jobs.
>
> When sufficient resources are not available on Mesos cluster, I can see
> that my jobs are queuing up on Mesos dispatcher UI.
>
> Is it possible to tweak some configuration so that my job submission fails
> gracefully(instead of queuing up) if sufficient resources are not found on
> Mesos cluster?
>
> Regards,
>
> Vatsal
>
>
>
>
> --
>
> Michael Gummelt
>
> Software Engineer
>
> Mesosphere
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Disable queuing of spark job on Mesos cluster if sufficient resources are not found

2017-05-26 Thread Michael Gummelt
Nope, sorry.

On Fri, May 26, 2017 at 4:38 AM, Mevada, Vatsal <mev...@sky.optymyze.com>
wrote:

> Hello,
>
> I am using Mesos with cluster deployment mode to submit my jobs.
>
> When sufficient resources are not available on Mesos cluster, I can see
> that my jobs are queuing up on Mesos dispatcher UI.
>
> Is it possible to tweak some configuration so that my job submission fails
> gracefully(instead of queuing up) if sufficient resources are not found on
> Mesos cluster?
>
> Regards,
>
> Vatsal
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: One question / kerberos, yarn-cluster -> connection to hbase

2017-05-24 Thread Michael Gummelt
What version of Spark are you using?  Can you provide your logs with DEBUG
logging enabled?  You should see these logs:
https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L475
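If it helps, turning those on is usually just a matter of adding a logger line to the log4j.properties your driver uses (a sketch, assuming Spark's default log4j 1.x configuration):

  # conf/log4j.properties (or the file passed via -Dlog4j.configuration)
  log4j.logger.org.apache.spark.deploy.yarn=DEBUG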

On Wed, May 24, 2017 at 10:07 AM, Sudhir Jangir <sud...@infoobjects.com>
wrote:

> Facing one issue with Kerberos enabled Hadoop/CDH cluster.
>
>
>
> We are trying to run a streaming job on yarn-cluster, which interacts with
> Kafka (direct stream), and hbase.
>
>
>
> Somehow, we are not able to connect to hbase in the cluster mode. We use
> keytab to login to hbase.
>
>
>
> This is what we do:
>
> spark-submit --master yarn-cluster --keytab "dev.keytab" --principal "d...@io-int.com" \
>   --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j_executor_conf.properties -XX:+UseG1GC" \
>   --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j_driver_conf.properties -XX:+UseG1GC" \
>   --conf spark.yarn.stagingDir=hdfs:///tmp/spark/ \
>   --files "job.properties,log4j_driver_conf.properties,log4j_executor_conf.properties" \
>   service-0.0.1-SNAPSHOT.jar job.properties
>
>
>
> To connect to hbase:
>
>   def getHbaseConnection(properties: SerializedProperties): (Connection, UserGroupInformation) = {
>     val config = HBaseConfiguration.create();
>     config.set("hbase.zookeeper.quorum", HBASE_ZOOKEEPER_QUORUM_VALUE);
>     config.set("hbase.zookeeper.property.clientPort", "2181");
>     config.set("hadoop.security.authentication", "kerberos");
>     config.set("hbase.security.authentication", "kerberos");
>     config.set("hbase.cluster.distributed", "true");
>     config.set("hbase.rpc.protection", "privacy");
>     config.set("hbase.regionserver.kerberos.principal", "hbase/_h...@io-int.com");
>     config.set("hbase.master.kerberos.principal", "hbase/_h...@io-int.com");
>
>     UserGroupInformation.setConfiguration(config);
>
>     var ugi: UserGroupInformation = null;
>     if (SparkFiles.get(properties.keytab) != null
>         && (new java.io.File(SparkFiles.get(properties.keytab)).exists)) {
>       ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(properties.kerberosPrincipal,
>         SparkFiles.get(properties.keytab));
>     } else {
>       ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(properties.kerberosPrincipal,
>         properties.keytab);
>     }
>
>     val connection = ConnectionFactory.createConnection(config);
>     return (connection, ugi);
>   }
>
>
>
> and we connect to hbase:
>
> ….foreachRDD { rdd =>
>   if (!rdd.isEmpty()) {
>     // var ugi: UserGroupInformation = Utils.getHbaseConnection(properties)._2
>     rdd.foreachPartition { partition =>
>       val connection = Utils.getHbaseConnection(propsObj)._1
>       val table = …
>       partition.foreach { json =>
>         …
>       }
>       table.put(puts)
>       table.close()
>       connection.close()
>     }
>   }
> }
>
>
>
>
>
> Keytab file is not getting copied to yarn staging/temp directory, we are
> not getting that in SparkFiles.get… and if we pass keytab with --files,
> spark-submit is failing because it’s there in --keytab already.
>
>
>
> Thanks,
>
> Sudhir
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Not able pass 3rd party jars to mesos executors

2017-05-18 Thread Michael Gummelt
No, --jars doesn't work in cluster mode on Mesos.  We need to document that
better.  Do you have some problem that can't be solved by bundling your
dependency into your application (i.e. uberjar)?
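A sketch of the uberjar route, with placeholder names: build one assembly and submit only that jar. Note that in Mesos cluster mode the jar must sit at a URI the agents can fetch (http://, hdfs://, s3://, ...):

  sbt assembly        # or: mvn package with the shade plugin
  spark-submit \
    --master mesos://<dispatcher-host>:7077 \
    --deploy-mode cluster \
    --class com.example.Main \
    http://<artifact-host>/my-app-assembly-0.1.0.jar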

On Tue, May 16, 2017 at 10:00 PM, Satya Narayan1 <
satyanarayan.pa...@gmail.com> wrote:

> Hi , Is anyone able to use --jars with spark-submit in mesos  cluster mode.
>
> We have tried giving local file, hdfs file, file from http server , --jars
> didnt work with any of the approach
>
>
> Saw couple of similar open question with no answer
> http://stackoverflow.com/questions/33978672/spark-mesos-cluster-mode-who-uploads-the-jar
>
>
>   mesos cluster mode  with jar upload capability is very limiting.
> wondering
> anyone has any solution to this.
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Not-able-pass-3rd-party-jars-to-mesos-executors-tp26918p28689.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -----
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Spark declines mesos offers

2017-04-24 Thread Michael Gummelt
Have you run with debug logging?  There are some hints in the debug logs:
https://github.com/apache/spark/blob/branch-2.1/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala#L316
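A minimal way to surface those lines, assuming the stock log4j setup, is to add this to the driver's log4j.properties:

  log4j.logger.org.apache.spark.scheduler.cluster.mesos=DEBUG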

On Mon, Apr 24, 2017 at 4:53 AM, Pavel Plotnikov <
pavel.plotni...@team.wrike.com> wrote:

> Hi, everyone! I run spark 2.1.0 jobs on the top of Mesos cluster in
> coarse-grained mode with dynamic resource allocation. And sometimes spark
> mesos scheduler declines mesos offers despite the fact that not all
> available resources were used (I have less workers than the possible
> maximum) and the maximum threshold in the spark configuration is not
> reached and the queue have lot of pending tasks.
>
> May be I have wrong spark or mesos configuration? Does anyone have the
> same problems?
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Spark on Mesos with Docker in bridge networking mode

2017-02-17 Thread Michael Gummelt
There's a JIRA here: https://issues.apache.org/jira/browse/SPARK-11638

I haven't had time to look at it.
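In the meantime, if you're on Spark 2.1+, one thing to try (a sketch, not a confirmed fix for SPARK-11638) is binding the driver inside the container while advertising the host address, alongside the host/port settings you already pass:

  --conf "spark.driver.bindAddress=0.0.0.0" \
  --conf "spark.driver.host=${SPARK_PUBLIC_DNS}" \
  --conf "spark.driver.port=${PORT_SPARKDRIVER}" \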

On Thu, Feb 16, 2017 at 11:00 AM, cherryii <cherr...@adobe.com> wrote:

> I'm getting errors when I try to run my docker container in bridge
> networking
> mode on mesos.
> Here is my spark submit script
>
> /spark/bin/spark-submit \
>  --class com.package.MySparkJob \
>  --name My-Spark-Job \
>  --files /path/config.cfg, ${JAR} \
>  --master ${SPARK_MASTER_HOST} \
>  --deploy-mode client \
>  --supervise \
>  --total-executor-cores ${SPARK_EXECUTOR_TOTAL_CORES} \
>  --driver-cores ${SPARK_DRIVER_CORES} \
>  --driver-memory ${SPARK_DRIVER_MEMORY} \
>  --num-executors ${SPARK_NUM_EXECUTORS} \
>  --executor-cores ${SPARK_EXECUTOR_CORES} \
>  --executor-memory ${SPARK_EXECUTOR_MEMORY} \
>  --driver-class-path ${JAR} \
>  --conf "spark.mesos.executor.docker.image=${SPARK_MESOS_EXECUTOR_DOCKER_IMAGE}" \
>  --conf "spark.mesos.executor.docker.volumes=${SPARK_MESOS_EXECUTOR_DOCKER_VOLUMES}" \
>  --conf "spark.mesos.uris=${SPARK_MESOS_URIS}" \
>  --conf "spark.executorEnv.OBERON_DB_PASS=${OBERON_DB_PASS}" \
>  --conf "spark.executorEnv.S3_SECRET_ACCESS_KEY=${S3_SECRET_ACCESS_KEY}" \
>  --conf "spark.executorEnv.S3_ACCESS_KEY=${S3_ACCESS_KEY}" \
>  --conf "spark.mesos.executor.home=${SPARK_HOME}" \
>  --conf "spark.executorEnv.MESOS_NATIVE_JAVA_LIBRARY=${SPARK_MESOS_LIB}" \
>  --conf "spark.files.overwrite=true" \
>  --conf "spark.shuffle.service.enabled=false" \
>  --conf "spark.dynamicAllocation.enabled=false" \
>  --conf "spark.ui.port=${PORT_SPARKUI}" \
>  --conf "spark.driver.host=${SPARK_PUBLIC_DNS}" \
>  --conf "spark.driver.port=${PORT_SPARKDRIVER}" \
>  --conf "spark.driver.blockManager.port=${PORT_SPARKBLOCKMANAGER}" \
>  --conf "spark.jars=${JAR}" \
>  --conf "spark.executor.extraClassPath=${JAR}" \
>  ${JAR}
>
> Here is the error I'm seeing:
> java.net.BindException: Cannot assign requested address: Service
> 'sparkDriver' failed after 16 retries! Consider explicitly setting the
> appropriate port for the service 'sparkDriver' (for example spark.ui.port
> for SparkUI) to an available port or increasing spark.port.maxRetries.
> at sun.nio.ch.Net.bind0(Native Method)
> at sun.nio.ch.Net.bind(Net.java:433)
> at sun.nio.ch.Net.bind(Net.java:425)
> at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
> at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
> at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
> at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
> at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
> at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
> at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
> at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:745)
>
> I was trying to follow instructions here:
> https://github.com/apache/spark/pull/15120
> So in my Marathon json I'm defining the ports to use for the spark driver,
> spark ui and block manager.
>
> Can anyone help me get this running in bridge networking mode?
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-Mesos-with-Docker-in-bridge-networking-mode-tp28397.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Dynamic resource allocation to Spark on Mesos

2017-02-09 Thread Michael Gummelt
> by specifying a larger heap size than default on each worker node.

I don't follow.  Which heap?  Are you specifying a large heap size on the
executors?  If so, do you mean you somehow launch the shuffle service when
you launch executors?  Or something else?
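For reference, launching it directly usually looks something like this (a sketch; the heap size is arbitrary):

  # start the Mesos external shuffle service as a standalone daemon on each node
  SPARK_DAEMON_MEMORY=2g $SPARK_HOME/sbin/start-mesos-shuffle-service.sh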

On Wed, Feb 8, 2017 at 5:50 PM, Sun Rui <sunrise_...@163.com> wrote:

> Michael,
> No. We directly launch the external shuffle service by specifying a larger
> heap size than default on each worker node. It is observed that the
> processes are quite stable.
>
> On Feb 9, 2017, at 05:21, Michael Gummelt <mgumm...@mesosphere.io> wrote:
>
> Sun, are you using marathon to run the shuffle service?
>
> On Tue, Feb 7, 2017 at 7:36 PM, Sun Rui <sunrise_...@163.com> wrote:
>
>> Yi Jan,
>>
>> We have been using Spark on Mesos with dynamic allocation enabled, which
>> works and improves the overall cluster utilization.
>>
>> In terms of job, do you mean jobs inside a Spark application or jobs
>> among different applications? Maybe you can read
>> http://spark.apache.org/docs/latest/job-scheduling.html for help.
>>
>> On Jan 31, 2017, at 03:34, Michael Gummelt <mgumm...@mesosphere.io>
>> wrote:
>>
>>
>>
>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>
>>> Tasks begin scheduling as soon as the first executor comes up
>>>
>>>
>>> Thanks all for the clarification. Is this the default behavior of Spark
>>> on Mesos today? I think this is what we are looking for because sometimes a
>>> job can take up lots of resources and later jobs could not get all the
>>> resources that it asks for. If a Spark job starts with only a subset of
>>> resources that it asks for, does it know to expand its resources later when
>>> more resources become available?
>>>
>>
>> Yes.
>>
>>
>>>
>>> Launch each executor with at least 1GB RAM, but if mesos offers 2GB at
>>>> some moment, then launch an executor with 2GB RAM
>>>
>>>
>>> This is less useful in our use case. But I am also quite interested in
>>> cases in which this could be helpful. I think this will also help with
>>> overall resource utilization on the cluster if when another job starts up
>>> that has a hard requirement on resources, the extra resources to the first
>>> job can be flexibly re-allocated to the second job.
>>>
>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <mgumm...@mesosphere.io
>>> > wrote:
>>>
>>>> We've talked about that, but it hasn't become a priority because we
>>>> haven't had a driving use case.  If anyone has a good argument for
>>>> "variable" resource allocation like this, please let me know.
>>>>
>>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com> w
>>>> rote:
>>>>
>>>>> An alternative behavior is to launch the job with the best resource
>>>>>> offer Mesos is able to give
>>>>>
>>>>>
>>>>> Michael has just made an excellent explanation about dynamic
>>>>> allocation support in mesos. But IIUC, what you want to achieve is
>>>>> something like (using RAM as an example) : "Launch each executor with at
>>>>> least 1GB RAM, but if mesos offers 2GB at some moment, then launch an
>>>>> executor with 2GB RAM".
>>>>>
>>>>> I wonder what's benefit of that? To reduce the "resource
>>>>> fragmentation"?
>>>>>
>>>>> Anyway, that is not supported at this moment. In all the supported
>>>>> cluster managers of spark (mesos, yarn, standalone, and the up-to-coming
>>>>> spark on kubernetes), you have to specify the cores and memory of each
>>>>> executor.
>>>>>
>>>>> It may not be supported in the future, because only mesos has the
>>>>> concepts of offers because of its two-level scheduling model.
>>>>>
>>>>>
>>>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>
>>>>>> Dear Spark Users,
>>>>>>
>>>>>> Currently is there a way to dynamically allocate resources to Spark
>>>>>> on Mesos? Within Spark we can specify the CPU cores, memory before 
>>>>>> running
>>>>>> job. The way I understand is that the Spark job will not run if the 
>>>>>> 

Re: Dynamic resource allocation to Spark on Mesos

2017-02-08 Thread Michael Gummelt
Sun, are you using marathon to run the shuffle service?

On Tue, Feb 7, 2017 at 7:36 PM, Sun Rui <sunrise_...@163.com> wrote:

> Yi Jan,
>
> We have been using Spark on Mesos with dynamic allocation enabled, which
> works and improves the overall cluster utilization.
>
> In terms of job, do you mean jobs inside a Spark application or jobs among
> different applications? Maybe you can read http://spark.apache.org/docs/latest/job-scheduling.html for help.
>
> On Jan 31, 2017, at 03:34, Michael Gummelt <mgumm...@mesosphere.io> wrote:
>
>
>
> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>
>> Tasks begin scheduling as soon as the first executor comes up
>>
>>
>> Thanks all for the clarification. Is this the default behavior of Spark
>> on Mesos today? I think this is what we are looking for because sometimes a
>> job can take up lots of resources and later jobs could not get all the
>> resources that it asks for. If a Spark job starts with only a subset of
>> resources that it asks for, does it know to expand its resources later when
>> more resources become available?
>>
>
> Yes.
>
>
>>
>> Launch each executor with at least 1GB RAM, but if mesos offers 2GB at
>>> some moment, then launch an executor with 2GB RAM
>>
>>
>> This is less useful in our use case. But I am also quite interested in
>> cases in which this could be helpful. I think this will also help with
>> overall resource utilization on the cluster if when another job starts up
>> that has a hard requirement on resources, the extra resources to the first
>> job can be flexibly re-allocated to the second job.
>>
>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <mgumm...@mesosphere.io>
>>  wrote:
>>
>>> We've talked about that, but it hasn't become a priority because we
>>> haven't had a driving use case.  If anyone has a good argument for
>>> "variable" resource allocation like this, please let me know.
>>>
>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com> w
>>> rote:
>>>
>>>> An alternative behavior is to launch the job with the best resource
>>>>> offer Mesos is able to give
>>>>
>>>>
>>>> Michael has just made an excellent explanation about dynamic allocation
>>>> support in mesos. But IIUC, what you want to achieve is something like
>>>> (using RAM as an example) : "Launch each executor with at least 1GB RAM,
>>>> but if mesos offers 2GB at some moment, then launch an executor with 2GB
>>>> RAM".
>>>>
>>>> I wonder what's benefit of that? To reduce the "resource fragmentation"?
>>>>
>>>> Anyway, that is not supported at this moment. In all the supported
>>>> cluster managers of spark (mesos, yarn, standalone, and the up-to-coming
>>>> spark on kubernetes), you have to specify the cores and memory of each
>>>> executor.
>>>>
>>>> It may not be supported in the future, because only mesos has the
>>>> concepts of offers because of its two-level scheduling model.
>>>>
>>>>
>>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>
>>>>> Dear Spark Users,
>>>>>
>>>>> Currently is there a way to dynamically allocate resources to Spark on
>>>>> Mesos? Within Spark we can specify the CPU cores, memory before running
>>>>> job. The way I understand is that the Spark job will not run if the 
>>>>> CPU/Mem
>>>>> requirement is not met. This may lead to decrease in overall utilization 
>>>>> of
>>>>> the cluster. An alternative behavior is to launch the job with the best
>>>>> resource offer Mesos is able to give. Is this possible with the current
>>>>> implementation?
>>>>>
>>>>> Thanks
>>>>> Ji
>>>>>
>>>>> The information in this email is confidential and may be legally
>>>>> privileged. It is intended solely for the addressee. Access to this email
>>>>> by anyone else is unauthorized. If you are not the intended recipient, any
>>>>> disclosure, copying, distribution or any action taken or omitted to be
>>>>> taken in reliance on it, is prohibited and may be unlawful.
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Michael Gummelt
>>> Software Engineer
>>> Mesosphere
>>>
>>
>>
>> The information in this email is confidential and may be legally
>> privileged. It is intended solely for the addressee. Access to this email
>> by anyone else is unauthorized. If you are not the intended recipient, any
>> disclosure, copying, distribution or any action taken or omitted to be
>> taken in reliance on it, is prohibited and may be unlawful.
>>
>
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Launching an Spark application in a subset of machines

2017-02-07 Thread Michael Gummelt
> Looking into Mesos attributes this seems the perfect fit for it. Is that
correct?

Yes.
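A sketch of what that looks like, with a made-up attribute (tag the chosen agents with the attribute on the Mesos side, then constrain executors to it):

  spark-submit \
    --master mesos://<mesos-master>:5050 \
    --conf "spark.mesos.constraints=rack:r1" \
    ...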

On Tue, Feb 7, 2017 at 3:43 AM, Muhammad Asif Abbasi <asif.abb...@gmail.com>
wrote:

> YARN provides the concept of node labels. You should explore the
> "spark.yarn.executor.nodeLabelConfiguration" property.
>
>
> Cheers,
> Asif Abbasi
>
> On Tue, 7 Feb 2017 at 10:21, Alvaro Brandon <alvarobran...@gmail.com>
> wrote:
>
>> Hello all:
>>
>> I have the following scenario.
>> - I have a cluster of 50 machines with Hadoop and Spark installed on
>> them.
>> - I want to launch one Spark application through spark submit. However I
>> want this application to run on only a subset of these machines,
>> disregarding data locality. (e.g. 10 machines)
>>
>> Is this possible?. Is there any option in the standalone scheduler, YARN
>> or Mesos that allows such thing?.
>>
>>
>>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Dynamic resource allocation to Spark on Mesos

2017-02-02 Thread Michael Gummelt
Yes, that's expected.  spark.executor.cores sizes a single executor.  It
doesn't limit the number of executors.  For that, you need spark.cores.max
(--total-executor-cores).

And rdd.parallelize does not specify the number of executors.  It specifies
the number of partitions, which relates to the number of tasks, not
executors.  Unless you're running with dynamic allocation enabled, the
number of executors for your job is static, and determined at start time.
It's not influenced by your job itself.
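For example (numbers are arbitrary), this caps the job at 16 cores total, 4 per executor, so at most 4 executors:

  spark-submit --master mesos://<mesos-master>:5050 \
    --conf spark.executor.cores=4 \
    --conf spark.cores.max=16 \
    ...

and rdd = sc.parallelize(collection, 64) then just gives you 64 partitions (64 tasks) spread over those executors.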

On Thu, Feb 2, 2017 at 2:42 PM, Ji Yan <ji...@drive.ai> wrote:

> I tried setting spark.executor.cores per executor, but Spark seems to be
> spinning up as many executors as possible up to spark.cores.max or however
> many cpu cores available on the cluster, and this may be undesirable
> because the number of executors in rdd.parallelize(collection, # of
> partitions) is being overriden
>
> On Thu, Feb 2, 2017 at 1:30 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> As of Spark 2.0, Mesos mode does support setting cores on the executor
>> level, but you might need to set the property directly (--conf
>> spark.executor.cores=).  I've written about this here:
>> https://docs.mesosphere.com/1.8/usage/service-guides/spark/job-scheduling/.  That doc is for DC/OS, but the configuration is the same.
>>
>> On Thu, Feb 2, 2017 at 1:06 PM, Ji Yan <ji...@drive.ai> wrote:
>>
>>> I was mainly confused why this is the case with memory, but with cpu
>>> cores, it is not specified on per executor level
>>>
>>> On Thu, Feb 2, 2017 at 1:02 PM, Michael Gummelt <mgumm...@mesosphere.io>
>>> wrote:
>>>
>>>> It sounds like you've answered your own question, right?
>>>> --executor-memory means the memory per executor.  If you have no executor
>>>> w/ 200GB memory, then the driver will accept no offers.
>>>>
>>>> On Thu, Feb 2, 2017 at 1:01 PM, Ji Yan <ji...@drive.ai> wrote:
>>>>
>>>>> sorry, to clarify, i was using --executor-memory for memory,
>>>>> and --total-executor-cores for cpu cores
>>>>>
>>>>> On Thu, Feb 2, 2017 at 12:56 PM, Michael Gummelt <
>>>>> mgumm...@mesosphere.io> wrote:
>>>>>
>>>>>> What CLI args are your referring to?  I'm aware of spark-submit's
>>>>>> arguments (--executor-memory, --total-executor-cores, and 
>>>>>> --executor-cores)
>>>>>>
>>>>>> On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <ji...@drive.ai> wrote:
>>>>>>
>>>>>>> I have done a experiment on this today. It shows that only CPUs are
>>>>>>> tolerant of insufficient cluster size when a job starts. On my cluster, 
>>>>>>> I
>>>>>>> have 180Gb of memory and 64 cores, when I run spark-submit ( on mesos )
>>>>>>> with --cpu_cores set to 1000, the job starts up with 64 cores. but when 
>>>>>>> I
>>>>>>> set --memory to 200Gb, the job fails to start with "Initial job has
>>>>>>> not accepted any resources; check your cluster UI to ensure that workers
>>>>>>> are registered and have sufficient resources"
>>>>>>>
>>>>>>> Also it is confusing to me that --cpu_cores specifies the number of
>>>>>>> cpu cores across all executors, but --memory specifies per executor 
>>>>>>> memory
>>>>>>> requirement.
>>>>>>>
>>>>>>> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <
>>>>>>> mgumm...@mesosphere.io> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>>>>
>>>>>>>>> Tasks begin scheduling as soon as the first executor comes up
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks all for the clarification. Is this the default behavior of
>>>>>>>>> Spark on Mesos today? I think this is what we are looking for because
>>>>>>>>> sometimes a job can take up lots of resources and later jobs could 
>>>>>>>>> not get
>>>>>>>>> all the resources that it asks for. If a Spark job starts with only a
>>>>>>>>> subset of resources that it asks for, does

Re: Dynamic resource allocation to Spark on Mesos

2017-02-02 Thread Michael Gummelt
As of Spark 2.0, Mesos mode does support setting cores on the executor
level, but you might need to set the property directly (--conf
spark.executor.cores=).  I've written about this here:
https://docs.mesosphere.com/1.8/usage/service-guides/spark/job-scheduling/.
That doc is for DC/OS, but the configuration is the same.

On Thu, Feb 2, 2017 at 1:06 PM, Ji Yan <ji...@drive.ai> wrote:

> I was mainly confused why this is the case with memory, but with cpu
> cores, it is not specified on per executor level
>
> On Thu, Feb 2, 2017 at 1:02 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> It sounds like you've answered your own question, right?
>> --executor-memory means the memory per executor.  If you have no executor
>> w/ 200GB memory, then the driver will accept no offers.
>>
>> On Thu, Feb 2, 2017 at 1:01 PM, Ji Yan <ji...@drive.ai> wrote:
>>
>>> sorry, to clarify, i was using --executor-memory for memory,
>>> and --total-executor-cores for cpu cores
>>>
>>> On Thu, Feb 2, 2017 at 12:56 PM, Michael Gummelt <mgumm...@mesosphere.io
>>> > wrote:
>>>
>>>> What CLI args are your referring to?  I'm aware of spark-submit's
>>>> arguments (--executor-memory, --total-executor-cores, and --executor-cores)
>>>>
>>>> On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <ji...@drive.ai> wrote:
>>>>
>>>>> I have done a experiment on this today. It shows that only CPUs are
>>>>> tolerant of insufficient cluster size when a job starts. On my cluster, I
>>>>> have 180Gb of memory and 64 cores, when I run spark-submit ( on mesos )
>>>>> with --cpu_cores set to 1000, the job starts up with 64 cores. but when I
>>>>> set --memory to 200Gb, the job fails to start with "Initial job has
>>>>> not accepted any resources; check your cluster UI to ensure that workers
>>>>> are registered and have sufficient resources"
>>>>>
>>>>> Also it is confusing to me that --cpu_cores specifies the number of
>>>>> cpu cores across all executors, but --memory specifies per executor memory
>>>>> requirement.
>>>>>
>>>>> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <
>>>>> mgumm...@mesosphere.io> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>>
>>>>>>> Tasks begin scheduling as soon as the first executor comes up
>>>>>>>
>>>>>>>
>>>>>>> Thanks all for the clarification. Is this the default behavior of
>>>>>>> Spark on Mesos today? I think this is what we are looking for because
>>>>>>> sometimes a job can take up lots of resources and later jobs could not 
>>>>>>> get
>>>>>>> all the resources that it asks for. If a Spark job starts with only a
>>>>>>> subset of resources that it asks for, does it know to expand its 
>>>>>>> resources
>>>>>>> later when more resources become available?
>>>>>>>
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Launch each executor with at least 1GB RAM, but if mesos offers 2GB
>>>>>>>> at some moment, then launch an executor with 2GB RAM
>>>>>>>
>>>>>>>
>>>>>>> This is less useful in our use case. But I am also quite interested
>>>>>>> in cases in which this could be helpful. I think this will also help 
>>>>>>> with
>>>>>>> overall resource utilization on the cluster if when another job starts 
>>>>>>> up
>>>>>>> that has a hard requirement on resources, the extra resources to the 
>>>>>>> first
>>>>>>> job can be flexibly re-allocated to the second job.
>>>>>>>
>>>>>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <
>>>>>>> mgumm...@mesosphere.io> wrote:
>>>>>>>
>>>>>>>> We've talked about that, but it hasn't become a priority because we
>>>>>>>> haven't had a driving use case.  If anyone has a good argument for
>>>>>>>> "variable" resource allocation like this, please le

Re: Dynamic resource allocation to Spark on Mesos

2017-02-02 Thread Michael Gummelt
It sounds like you've answered your own question, right?  --executor-memory
means the memory per executor.  If you have no executor w/ 200GB memory,
then the driver will accept no offers.

On Thu, Feb 2, 2017 at 1:01 PM, Ji Yan <ji...@drive.ai> wrote:

> sorry, to clarify, i was using --executor-memory for memory,
> and --total-executor-cores for cpu cores
>
> On Thu, Feb 2, 2017 at 12:56 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> What CLI args are your referring to?  I'm aware of spark-submit's
>> arguments (--executor-memory, --total-executor-cores, and --executor-cores)
>>
>> On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <ji...@drive.ai> wrote:
>>
>>> I have done a experiment on this today. It shows that only CPUs are
>>> tolerant of insufficient cluster size when a job starts. On my cluster, I
>>> have 180Gb of memory and 64 cores, when I run spark-submit ( on mesos )
>>> with --cpu_cores set to 1000, the job starts up with 64 cores. but when I
>>> set --memory to 200Gb, the job fails to start with "Initial job has not
>>> accepted any resources; check your cluster UI to ensure that workers are
>>> registered and have sufficient resources"
>>>
>>> Also it is confusing to me that --cpu_cores specifies the number of cpu
>>> cores across all executors, but --memory specifies per executor memory
>>> requirement.
>>>
>>> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <
>>> mgumm...@mesosphere.io> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>
>>>>> Tasks begin scheduling as soon as the first executor comes up
>>>>>
>>>>>
>>>>> Thanks all for the clarification. Is this the default behavior of
>>>>> Spark on Mesos today? I think this is what we are looking for because
>>>>> sometimes a job can take up lots of resources and later jobs could not get
>>>>> all the resources that it asks for. If a Spark job starts with only a
>>>>> subset of resources that it asks for, does it know to expand its resources
>>>>> later when more resources become available?
>>>>>
>>>>
>>>> Yes.
>>>>
>>>>
>>>>>
>>>>> Launch each executor with at least 1GB RAM, but if mesos offers 2GB at
>>>>>> some moment, then launch an executor with 2GB RAM
>>>>>
>>>>>
>>>>> This is less useful in our use case. But I am also quite interested in
>>>>> cases in which this could be helpful. I think this will also help with
>>>>> overall resource utilization on the cluster if when another job starts up
>>>>> that has a hard requirement on resources, the extra resources to the first
>>>>> job can be flexibly re-allocated to the second job.
>>>>>
>>>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <
>>>>> mgumm...@mesosphere.io> wrote:
>>>>>
>>>>>> We've talked about that, but it hasn't become a priority because we
>>>>>> haven't had a driving use case.  If anyone has a good argument for
>>>>>> "variable" resource allocation like this, please let me know.
>>>>>>
>>>>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> An alternative behavior is to launch the job with the best resource
>>>>>>>> offer Mesos is able to give
>>>>>>>
>>>>>>>
>>>>>>> Michael has just made an excellent explanation about dynamic
>>>>>>> allocation support in mesos. But IIUC, what you want to achieve is
>>>>>>> something like (using RAM as an example) : "Launch each executor with at
>>>>>>> least 1GB RAM, but if mesos offers 2GB at some moment, then launch an
>>>>>>> executor with 2GB RAM".
>>>>>>>
>>>>>>> I wonder what's benefit of that? To reduce the "resource
>>>>>>> fragmentation"?
>>>>>>>
>>>>>>> Anyway, that is not supported at this moment. In all the supported
>>>>>>> cluster managers of spark (mesos, yarn, standalone, and the up-to-coming
>>>>>>> spark on kubernetes), you have to spec

Re: Dynamic resource allocation to Spark on Mesos

2017-02-02 Thread Michael Gummelt
What CLI args are you referring to?  I'm aware of spark-submit's arguments
(--executor-memory, --total-executor-cores, and --executor-cores)

On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <ji...@drive.ai> wrote:

> I have done a experiment on this today. It shows that only CPUs are
> tolerant of insufficient cluster size when a job starts. On my cluster, I
> have 180Gb of memory and 64 cores, when I run spark-submit ( on mesos )
> with --cpu_cores set to 1000, the job starts up with 64 cores. but when I
> set --memory to 200Gb, the job fails to start with "Initial job has not
> accepted any resources; check your cluster UI to ensure that workers are
> registered and have sufficient resources"
>
> Also it is confusing to me that --cpu_cores specifies the number of cpu
> cores across all executors, but --memory specifies per executor memory
> requirement.
>
> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>>
>>
>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>
>>> Tasks begin scheduling as soon as the first executor comes up
>>>
>>>
>>> Thanks all for the clarification. Is this the default behavior of Spark
>>> on Mesos today? I think this is what we are looking for because sometimes a
>>> job can take up lots of resources and later jobs could not get all the
>>> resources that it asks for. If a Spark job starts with only a subset of
>>> resources that it asks for, does it know to expand its resources later when
>>> more resources become available?
>>>
>>
>> Yes.
>>
>>
>>>
>>> Launch each executor with at least 1GB RAM, but if mesos offers 2GB at
>>>> some moment, then launch an executor with 2GB RAM
>>>
>>>
>>> This is less useful in our use case. But I am also quite interested in
>>> cases in which this could be helpful. I think this will also help with
>>> overall resource utilization on the cluster if when another job starts up
>>> that has a hard requirement on resources, the extra resources to the first
>>> job can be flexibly re-allocated to the second job.
>>>
>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <mgumm...@mesosphere.io
>>> > wrote:
>>>
>>>> We've talked about that, but it hasn't become a priority because we
>>>> haven't had a driving use case.  If anyone has a good argument for
>>>> "variable" resource allocation like this, please let me know.
>>>>
>>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com>
>>>> wrote:
>>>>
>>>>> An alternative behavior is to launch the job with the best resource
>>>>>> offer Mesos is able to give
>>>>>
>>>>>
>>>>> Michael has just made an excellent explanation about dynamic
>>>>> allocation support in mesos. But IIUC, what you want to achieve is
>>>>> something like (using RAM as an example) : "Launch each executor with at
>>>>> least 1GB RAM, but if mesos offers 2GB at some moment, then launch an
>>>>> executor with 2GB RAM".
>>>>>
>>>>> I wonder what's benefit of that? To reduce the "resource
>>>>> fragmentation"?
>>>>>
>>>>> Anyway, that is not supported at this moment. In all the supported
>>>>> cluster managers of spark (mesos, yarn, standalone, and the up-to-coming
>>>>> spark on kubernetes), you have to specify the cores and memory of each
>>>>> executor.
>>>>>
>>>>> It may not be supported in the future, because only mesos has the
>>>>> concepts of offers because of its two-level scheduling model.
>>>>>
>>>>>
>>>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>
>>>>>> Dear Spark Users,
>>>>>>
>>>>>> Currently is there a way to dynamically allocate resources to Spark
>>>>>> on Mesos? Within Spark we can specify the CPU cores, memory before 
>>>>>> running
>>>>>> job. The way I understand is that the Spark job will not run if the 
>>>>>> CPU/Mem
>>>>>> requirement is not met. This may lead to decrease in overall utilization 
>>>>>> of
>>>>>> the cluster. An alternative behavior is to launch the job with the best
>>>>>> resource offer Mesos 

Re: Dynamic resource allocation to Spark on Mesos

2017-01-30 Thread Michael Gummelt
On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:

> Tasks begin scheduling as soon as the first executor comes up
>
>
> Thanks all for the clarification. Is this the default behavior of Spark on
> Mesos today? I think this is what we are looking for because sometimes a
> job can take up lots of resources and later jobs could not get all the
> resources that it asks for. If a Spark job starts with only a subset of
> resources that it asks for, does it know to expand its resources later when
> more resources become available?
>

Yes.


>
> Launch each executor with at least 1GB RAM, but if mesos offers 2GB at
>> some moment, then launch an executor with 2GB RAM
>
>
> This is less useful in our use case. But I am also quite interested in
> cases in which this could be helpful. I think this will also help with
> overall resource utilization on the cluster if when another job starts up
> that has a hard requirement on resources, the extra resources to the first
> job can be flexibly re-allocated to the second job.
>
> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> We've talked about that, but it hasn't become a priority because we
>> haven't had a driving use case.  If anyone has a good argument for
>> "variable" resource allocation like this, please let me know.
>>
>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com>
>> wrote:
>>
>>> An alternative behavior is to launch the job with the best resource
>>>> offer Mesos is able to give
>>>
>>>
>>> Michael has just made an excellent explanation about dynamic allocation
>>> support in mesos. But IIUC, what you want to achieve is something like
>>> (using RAM as an example) : "Launch each executor with at least 1GB RAM,
>>> but if mesos offers 2GB at some moment, then launch an executor with 2GB
>>> RAM".
>>>
>>> I wonder what's benefit of that? To reduce the "resource fragmentation"?
>>>
>>> Anyway, that is not supported at this moment. In all the supported
>>> cluster managers of spark (mesos, yarn, standalone, and the up-to-coming
>>> spark on kubernetes), you have to specify the cores and memory of each
>>> executor.
>>>
>>> It may not be supported in the future, because only mesos has the
>>> concepts of offers because of its two-level scheduling model.
>>>
>>>
>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>>>
>>>> Dear Spark Users,
>>>>
>>>> Currently is there a way to dynamically allocate resources to Spark on
>>>> Mesos? Within Spark we can specify the CPU cores, memory before running
>>>> job. The way I understand is that the Spark job will not run if the CPU/Mem
>>>> requirement is not met. This may lead to decrease in overall utilization of
>>>> the cluster. An alternative behavior is to launch the job with the best
>>>> resource offer Mesos is able to give. Is this possible with the current
>>>> implementation?
>>>>
>>>> Thanks
>>>> Ji
>>>>
>>>> The information in this email is confidential and may be legally
>>>> privileged. It is intended solely for the addressee. Access to this email
>>>> by anyone else is unauthorized. If you are not the intended recipient, any
>>>> disclosure, copying, distribution or any action taken or omitted to be
>>>> taken in reliance on it, is prohibited and may be unlawful.
>>>>
>>>
>>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>
> The information in this email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful.
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Dynamic resource allocation to Spark on Mesos

2017-01-28 Thread Michael Gummelt
We've talked about that, but it hasn't become a priority because we haven't
had a driving use case.  If anyone has a good argument for "variable"
resource allocation like this, please let me know.

On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com> wrote:

> An alternative behavior is to launch the job with the best resource offer
>> Mesos is able to give
>
>
> Michael has just made an excellent explanation about dynamic allocation
> support in mesos. But IIUC, what you want to achieve is something like
> (using RAM as an example) : "Launch each executor with at least 1GB RAM,
> but if mesos offers 2GB at some moment, then launch an executor with 2GB
> RAM".
>
> I wonder what's benefit of that? To reduce the "resource fragmentation"?
>
> Anyway, that is not supported at this moment. In all the supported cluster
> managers of spark (mesos, yarn, standalone, and the up-to-coming spark on
> kubernetes), you have to specify the cores and memory of each executor.
>
> It may not be supported in the future, because only mesos has the concepts
> of offers because of its two-level scheduling model.
>
>
> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>
>> Dear Spark Users,
>>
>> Currently is there a way to dynamically allocate resources to Spark on
>> Mesos? Within Spark we can specify the CPU cores, memory before running
>> job. The way I understand is that the Spark job will not run if the CPU/Mem
>> requirement is not met. This may lead to decrease in overall utilization of
>> the cluster. An alternative behavior is to launch the job with the best
>> resource offer Mesos is able to give. Is this possible with the current
>> implementation?
>>
>> Thanks
>> Ji
>>
>> The information in this email is confidential and may be legally
>> privileged. It is intended solely for the addressee. Access to this email
>> by anyone else is unauthorized. If you are not the intended recipient, any
>> disclosure, copying, distribution or any action taken or omitted to be
>> taken in reliance on it, is prohibited and may be unlawful.
>>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Dynamic resource allocation to Spark on Mesos

2017-01-27 Thread Michael Gummelt
> The way I understand is that the Spark job will not run if the CPU/Mem
requirement is not met.

Spark jobs will still run if they only have a subset of the requested
resources.  Tasks begin scheduling as soon as the first executor comes up.
Dynamic allocation yields increased utilization by only allocating as many
executors as a job needs, rather than a single static amount set up front.

Dynamic Allocation is supported in Spark on Mesos, but we here at
Mesosphere haven't been testing it much, and I'm not sure what the
community adoption is.  So I can't yet speak to its robustness, but we will
be investing in it soon.  Many users want it.
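For reference, the usual knobs look like this (a sketch; it assumes the external shuffle service is already running on every agent, which dynamic allocation requires):

  spark-submit --master mesos://<mesos-master>:5050 \
    --conf spark.shuffle.service.enabled=true \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.dynamicAllocation.minExecutors=1 \
    --conf spark.dynamicAllocation.maxExecutors=20 \
    ...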

On Fri, Jan 27, 2017 at 9:35 AM, Ji Yan <ji...@drive.ai> wrote:

> Dear Spark Users,
>
> Currently is there a way to dynamically allocate resources to Spark on
> Mesos? Within Spark we can specify the CPU cores, memory before running
> job. The way I understand is that the Spark job will not run if the CPU/Mem
> requirement is not met. This may lead to decrease in overall utilization of
> the cluster. An alternative behavior is to launch the job with the best
> resource offer Mesos is able to give. Is this possible with the current
> implementation?
>
> Thanks
> Ji
>
> The information in this email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful.
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: spark locality

2017-01-12 Thread Michael Gummelt
If the executor reports a different hostname inside the CNI container, then
no, I don't think so.

On Thu, Jan 12, 2017 at 2:28 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> So even if I make the Spark executors run on the same node as Casssandra
> nodes, I am not sure each worker will connect to c* nodes on the same mesos
> agent ?
>
> 2017-01-12 21:13 GMT+01:00 Michael Gummelt <mgumm...@mesosphere.io>:
>
>> The code in there w/ docs that reference CNI doesn't actually run when
>> CNI is in effect, and doesn't have anything to do with locality.  It's just
>> making Spark work in a no-DNS environment
>>
>> On Thu, Jan 12, 2017 at 12:04 PM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>>> I have found this but I am not sure how it can help...
>>> https://github.com/mesosphere/spark-build/blob/a9efef8850976f787956660262f3b77cd636f3f5/conf/spark-env.sh
>>>
>>>
>>> 2017-01-12 20:16 GMT+01:00 Michael Gummelt <mgumm...@mesosphere.io>:
>>>
>>>> That's a good point. I hadn't considered the locality implications of
>>>> CNI yet.  I think tasks are placed based on the hostname reported by the
>>>> executor, which in a CNI container will be different than the
>>>> HDFS/Cassandra hostname.  I'm not aware of anyone running Spark+CNI in prod
>>>> yet, either.
>>>>
>>>> However, locality in Mesos isn't great right now anyway.  Executors are
>>>> placed w/o regard to locality.  Locality is only taken into account when
>>>> tasks are assigned to executors.  So if you get a locality-poor executor
>>>> placement, you'll also have locality poor task placement.  It could be
>>>> better.
>>>>
>>>> On Thu, Jan 12, 2017 at 7:55 AM, vincent gromakowski <
>>>> vincent.gromakow...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>> Does anyone have experience running Spark on Mesos with CNI (ip per
>>>>> container) ?
>>>>> How would Spark use IP or hostname for data locality with backend
>>>>> framework like HDFS or Cassandra ?
>>>>>
>>>>> V
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Michael Gummelt
>>>> Software Engineer
>>>> Mesosphere
>>>>
>>>
>>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: spark locality

2017-01-12 Thread Michael Gummelt
The code in there w/ docs that reference CNI doesn't actually run when CNI
is in effect, and doesn't have anything to do with locality.  It's just
making Spark work in a no-DNS environment

On Thu, Jan 12, 2017 at 12:04 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> I have found this but I am not sure how it can help...
> https://github.com/mesosphere/spark-build/blob/a9efef8850976f787956660262f3b77cd636f3f5/conf/spark-env.sh
>
>
> 2017-01-12 20:16 GMT+01:00 Michael Gummelt <mgumm...@mesosphere.io>:
>
>> That's a good point. I hadn't considered the locality implications of CNI
>> yet.  I think tasks are placed based on the hostname reported by the
>> executor, which in a CNI container will be different than the
>> HDFS/Cassandra hostname.  I'm not aware of anyone running Spark+CNI in prod
>> yet, either.
>>
>> However, locality in Mesos isn't great right now anyway.  Executors are
>> placed w/o regard to locality.  Locality is only taken into account when
>> tasks are assigned to executors.  So if you get a locality-poor executor
>> placement, you'll also have locality poor task placement.  It could be
>> better.
>>
>> On Thu, Jan 12, 2017 at 7:55 AM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>>> Hi all,
>>> Does anyone have experience running Spark on Mesos with CNI (ip per
>>> container) ?
>>> How would Spark use IP or hostname for data locality with backend
>>> framework like HDFS or Cassandra ?
>>>
>>> V
>>>
>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: spark locality

2017-01-12 Thread Michael Gummelt
That's a good point. I hadn't considered the locality implications of CNI
yet.  I think tasks are placed based on the hostname reported by the
executor, which in a CNI container will be different than the
HDFS/Cassandra hostname.  I'm not aware of anyone running Spark+CNI in prod
yet, either.

However, locality in Mesos isn't great right now anyway.  Executors are
placed w/o regard to locality.  Locality is only taken into account when
tasks are assigned to executors.  So if you get a locality-poor executor
placement, you'll also have locality poor task placement.  It could be
better.

On Thu, Jan 12, 2017 at 7:55 AM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Hi all,
> Does anyone have experience running Spark on Mesos with CNI (ip per
> container) ?
> How would Spark use IP or hostname for data locality with backend
> framework like HDFS or Cassandra ?
>
> V
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Could not parse Master URL for Mesos on Spark 2.1.0

2017-01-10 Thread Michael Gummelt
Oh, interesting.  I've never heard of that sort of architecture.  And I'm
not sure exactly how the JNI bindings do the native library discovery, but
I know the MESOS_NATIVE_JAVA_LIBRARY env var has always been the documented
discovery method, so I'd definitely always provide that if I were you.  I'm
not sure why the bindings can't discover based on the standard shared
library mechanisms (LD_LIBRARY_PATH, ld.so.conf).  That's a question for
the Mesos team.
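In other words, something like this in whatever environment launches the driver (the path is a placeholder for wherever your Mesos build installs the library):

  export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so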

On Tue, Jan 10, 2017 at 12:46 PM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> nop, there is no "distribution", no spark-submit at the start of my
> process.
> But I found the problem, the behavior when loading mesos native dependency
> changed, and the static initialization block inside 
> org.apache.mesos.MesosSchedulerDriver
> needed the specific reference to libmesos-1.0.0.so.
>
> So just for the record, setting the env variable
> MESOS_NATIVE_JAVA_LIBRARY="//
> libmesos-1.0.0.so" fixed the whole thing.
>
> Thanks for the help !
>
> @michael if you want to talk about the setup we're using, we can talk
> about it directly.
>
>
>
> On Tue, Jan 10, 2017 9:31 PM, Michael Gummelt mgumm...@mesosphere.io
> wrote:
>
>> What do you mean your driver has all the dependencies packaged?  What are
>> "all the dependencies"?  Is the distribution you use to launch your driver
>> built with -Pmesos?
>>
>> On Tue, Jan 10, 2017 at 12:18 PM, Olivier Girardot <
>> o.girar...@lateral-thoughts.com> wrote:
>>
>> Hi Michael,
>> I did so, but it's not exactly the problem, you see my driver has all the
>> dependencies packaged, and only the executors fetch via the
>> spark.executor.uri the tgz,
>> The strange thing is that I see in my classpath the
>> org.apache.mesos:mesos-1.0.0-shaded-protobuf dependency packaged in the
>> final dist of my app…
>> So everything should work in theory.
>>
>>
>>
>> On Tue, Jan 10, 2017 7:22 PM, Michael Gummelt mgumm...@mesosphere.io
>> wrote:
>>
>> Just build with -Pmesos http://spark.apache.org/docs/latest/building-spark.html#building-with-mesos-support
>>
>> On Tue, Jan 10, 2017 at 8:56 AM, Olivier Girardot <
>> o.girar...@lateral-thoughts.com> wrote:
>>
>> I had the same problem, added spark-mesos as dependency and now I get :
>> [2017-01-10 17:45:16,575] {bash_operator.py:77} INFO - Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.mesos.MesosSchedulerDriver
>> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.cluster.mesos.MesosSchedulerUtils$class.createSchedulerDriver(MesosSchedulerUtils.scala:105)
>> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.createSchedulerDriver(MesosCoarseGrainedSchedulerBackend.scala:48)
>> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.start(MesosCoarseGrainedSchedulerBackend.scala:155)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
>> [2017-01-10 17:45:16,578] {bash_operator.py:77} INFO - at scala.Option.getOrElse(Option.scala:121)
>> [2017-01-10 17:45:16,578] {bash_operator.py:77} INFO - at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
>>
>> Is there any other dependency to add for spark 2.1.0 ?
>>
>>
>>
>> On Tue, Jan 10, 2017 1:26 AM, Abhishek Bhandari abhi10...@gmail.com
>> wrote:
>>
>> Glad that you found it.
>>
>> On Mon, Jan 9, 2017 at 3:29 PM, Richard Siebeling <rsiebel...@gmail.com>
>> wrote:
>>
>> Probably found it, it turns out that Mesos should be explicitly added
>> while building Spark, I assumed I could use the o

Re: Could not parse Master URL for Mesos on Spark 2.1.0

2017-01-10 Thread Michael Gummelt
What do you mean your driver has all the dependencies packaged?  What are
"all the dependencies"?  Is the distribution you use to launch your driver
built with -Pmesos?

On Tue, Jan 10, 2017 at 12:18 PM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> Hi Michael,
> I did so, but it's not exactly the problem, you see my driver has all the
> dependencies packaged, and only the executors fetch via the
> spark.executor.uri the tgz,
> The strange thing is that I see in my classpath the
> org.apache.mesos:mesos-1.0.0-shaded-protobuf dependency packaged in the
> final dist of my app…
> So everything should work in theory.
>
>
>
> On Tue, Jan 10, 2017 7:22 PM, Michael Gummelt mgumm...@mesosphere.io
> wrote:
>
>> Just build with -Pmesos http://spark.apache.org/docs/latest/building-spark.html#building-with-mesos-support
>>
>> On Tue, Jan 10, 2017 at 8:56 AM, Olivier Girardot <
>> o.girar...@lateral-thoughts.com> wrote:
>>
>> I had the same problem, added spark-mesos as dependency and now I get :
>> [2017-01-10 17:45:16,575] {bash_operator.py:77} INFO - Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.mesos.MesosSchedulerDriver
>> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.cluster.mesos.MesosSchedulerUtils$class.createSchedulerDriver(MesosSchedulerUtils.scala:105)
>> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.createSchedulerDriver(MesosCoarseGrainedSchedulerBackend.scala:48)
>> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.start(MesosCoarseGrainedSchedulerBackend.scala:155)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
>> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
>> [2017-01-10 17:45:16,578] {bash_operator.py:77} INFO - at scala.Option.getOrElse(Option.scala:121)
>> [2017-01-10 17:45:16,578] {bash_operator.py:77} INFO - at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
>>
>> Is there any other dependency to add for spark 2.1.0 ?
>>
>>
>>
>> On Tue, Jan 10, 2017 1:26 AM, Abhishek Bhandari abhi10...@gmail.com
>> wrote:
>>
>> Glad that you found it.
>>
>> On Mon, Jan 9, 2017 at 3:29 PM, Richard Siebeling <rsiebel...@gmail.com>
>> wrote:
>>
>> Probably found it, it turns out that Mesos should be explicitly added
>> while building Spark, I assumed I could use the old build command that I
>> used for building Spark 2.0.0... Didn't see the two lines added in the
>> documentation...
>>
>> Maybe these kind of changes could be added in the changelog under changes
>> of behaviour or changes in the build process or something like that,
>>
>> kind regards,
>> Richard
>>
>>
>> On 9 January 2017 at 22:55, Richard Siebeling <rsiebel...@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> I'm setting up Apache Spark 2.1.0 on Mesos and I am getting a "Could not
>> parse Master URL: 'mesos://xx.xx.xxx.xxx:5050'" error.
>> Mesos is running fine (both the master as the slave, it's a single
>> machine configuration).
>>
>> I really don't understand why this is happening since the same
>> configuration but using a Spark 2.0.0 is running fine within Vagrant.
>> Could someone please help?
>>
>> thanks in advance,
>> Richard
>>
>>
>>
>>
>>
>>
>>
>> --
>> *Abhishek J Bhandari*
>> Mobile No. +1 510 493 6205 <(510)%20493-6205> (USA)
>> Mobile No. +91 96387 93021 <+91%2096387%2093021> (IND)
>> *R & D Department*
>> *Valent Software Inc. CA*
>> Email: *abhis...@valent-software.com <abhis...@valent-software.com>*
>>
>>
>>
>> *Olivier Girardot* | Associé
>> o.girar...@lateral-thoughts.com
>> +33 6 24 09 17 94
>>
>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>
> *Olivier Girardot* | Associé
> o.girar...@lateral-thoughts.com
> +33 6 24 09 17 94
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Could not parse Master URL for Mesos on Spark 2.1.0

2017-01-10 Thread Michael Gummelt
Just build with -Pmesos
http://spark.apache.org/docs/latest/building-spark.html#building-with-mesos-support
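
For 2.1, a build along these lines should work (the Hadoop profile here is only an example; use whatever matches your cluster):

./build/mvn -Pmesos -Phadoop-2.7 -DskipTests clean package

# or, for a distributable tarball:
./dev/make-distribution.sh --name with-mesos --tgz -Pmesos -Phadoop-2.7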

On Tue, Jan 10, 2017 at 8:56 AM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> I had the same problem, added spark-mesos as dependency and now I get :
> [2017-01-10 17:45:16,575] {bash_operator.py:77} INFO - Exception in thread
> "main" java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.mesos.MesosSchedulerDriver
> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at
> org.apache.spark.scheduler.cluster.mesos.MesosSchedulerUtils$class.
> createSchedulerDriver(MesosSchedulerUtils.scala:105)
> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at
> org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBac
> kend.createSchedulerDriver(MesosCoarseGrainedSchedulerBackend.scala:48)
> [2017-01-10 17:45:16,576] {bash_operator.py:77} INFO - at
> org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBac
> kend.start(MesosCoarseGrainedSchedulerBackend.scala:155)
> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at
> org.apache.spark.scheduler.TaskSchedulerImpl.start(
> TaskSchedulerImpl.scala:156)
> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at
> org.apache.spark.SparkContext.(SparkContext.scala:509)
> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at
> org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at
> org.apache.spark.sql.SparkSession$Builder$$anonfun$
> 6.apply(SparkSession.scala:868)
> [2017-01-10 17:45:16,577] {bash_operator.py:77} INFO - at
> org.apache.spark.sql.SparkSession$Builder$$anonfun$
> 6.apply(SparkSession.scala:860)
> [2017-01-10 17:45:16,578] {bash_operator.py:77} INFO - at
> scala.Option.getOrElse(Option.scala:121)
> [2017-01-10 17:45:16,578] {bash_operator.py:77} INFO - at
> org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.
> scala:860)
>
> Is there any other dependency to add for spark 2.1.0 ?
>
>
>
> On Tue, Jan 10, 2017 1:26 AM, Abhishek Bhandari abhi10...@gmail.com wrote:
>
>> Glad that you found it.
>>
>> On Mon, Jan 9, 2017 at 3:29 PM, Richard Siebeling <rsiebel...@gmail.com>
>> wrote:
>>
>> Probably found it, it turns out that Mesos should be explicitly added
>> while building Spark, I assumed I could use the old build command that I
>> used for building Spark 2.0.0... Didn't see the two lines added in the
>> documentation...
>>
>> Maybe these kind of changes could be added in the changelog under changes
>> of behaviour or changes in the build process or something like that,
>>
>> kind regards,
>> Richard
>>
>>
>> On 9 January 2017 at 22:55, Richard Siebeling <rsiebel...@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> I'm setting up Apache Spark 2.1.0 on Mesos and I am getting a "Could not
>> parse Master URL: 'mesos://xx.xx.xxx.xxx:5050'" error.
>> Mesos is running fine (both the master as the slave, it's a single
>> machine configuration).
>>
>> I really don't understand why this is happening since the same
>> configuration but using a Spark 2.0.0 is running fine within Vagrant.
>> Could someone please help?
>>
>> thanks in advance,
>> Richard
>>
>>
>>
>>
>>
>>
>>
>> --
>> *Abhishek J Bhandari*
>> Mobile No. +1 510 493 6205 <(510)%20493-6205> (USA)
>> Mobile No. +91 96387 93021 <+91%2096387%2093021> (IND)
>> *R & D Department*
>> *Valent Software Inc. CA*
>> Email: *abhis...@valent-software.com <abhis...@valent-software.com>*
>>
>
>
> *Olivier Girardot* | Associé
> o.girar...@lateral-thoughts.com
> +33 6 24 09 17 94
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Spark/Mesos with GPU support

2016-12-30 Thread Michael Gummelt
I've cc'd Tim and Kevin, who worked on GPU support.
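
One thing worth checking in the meantime (an assumption on my part, based on the 2.1 configuration docs rather than anything I've run): GPU scheduling in Mesos coarse-grained mode is opt-in via spark.mesos.gpus.max, e.g.

spark-submit \
  --master mesos://<master>:5050 \
  --conf spark.mesos.gpus.max=2 \
  --conf spark.cores.max=8 \
  ...

If it's left at its default (0, if I remember the docs correctly), executors shouldn't request any GPUs at all.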

On Wed, Dec 28, 2016 at 11:22 AM, Ji Yan <ji...@drive.ai> wrote:

> Dear Spark Users,
>
> Has anyone had successful experience running Spark on Mesos with GPU
> support? We have a Mesos cluster that can see and offer nvidia GPU
> resources. With Spark, it seems that the GPU support with Mesos (
> https://github.com/apache/spark/pull/14644) has only recently been merged
> into Spark Master which is not found in 2.0.2 release yet. We have a custom
> built Spark from 2.1-rc5 which is confirmed to have the above change.
> However when we try to run any code from Spark on this Mesos setup, the
> spark program hangs and keeps saying
>
> “WARN TaskSchedulerImpl: Initial job has not accepted any resources;
> check your cluster UI to ensure that workers are registered and have
> sufficient resources”
>
> We are pretty sure that the cluster has enough resources as there is
> nothing running on it. If we disable the GPU support in configuration and
> restart mesos and retry the same program, it would work.
>
> Any comment/advice on this greatly appreciated
>
> Thanks,
> Ji
>
>
> The information in this email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful.
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-26 Thread Michael Gummelt
In fine-grained mode (which is deprecated), Spark tasks (which are threads)
were implemented as Mesos tasks.  When a Mesos task starts and stops, its
underlying cgroup, and therefore the resources it's consuming on the
cluster, grows or shrinks based on the resources allocated to the tasks,
which in Spark is just CPU.  This is what I mean by CPU usage "elastically
growing".

However, all Mesos tasks are run by an "executor", which has its own
resource allocation.  In Spark, the executor is the JVM, and all memory is
allocated to the executor, because JVMs can't relinquish memory.  If memory
were allocated to the tasks, then the cgroup's memory allocation would
shrink when the task terminated, but the JVM's memory consumption would
stay constant, and the JVM would OOM.

And, without dynamic allocation, executors never terminate during the
duration of a Spark job, because even if they're idle (no tasks), they
still may be hosting shuffle files.  That's why dynamic allocation depends
on an external shuffle service.  Since executors never terminate, and all
memory is allocated to the executors, Spark jobs even in fine-grained mode
only grow in memory allocation, they don't shrink.
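
If you do move to coarse-grained mode with dynamic allocation, the rough shape of the setup (a sketch with placeholder hosts; paths depend on where your distribution lives) is to run the external shuffle service on every agent and enable two flags:

# on each Mesos agent (commonly launched via Marathon):
$SPARK_HOME/sbin/start-mesos-shuffle-service.sh

# then submit with:
spark-submit \
  --master mesos://<master>:5050 \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  ...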

On Mon, Dec 26, 2016 at 12:39 PM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi Michael,
>
> That caught my attention...
>
> Could you please elaborate on "elastically grow and shrink CPU usage"
> and how it really works under the covers? It seems that CPU usage is
> just a "label" for an executor on Mesos. Where's this in the code?
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Mon, Dec 26, 2016 at 6:25 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
> >> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
> >> allocation
> >
> > Maybe for CPU, but definitely not for memory.  Executors never shut down
> in
> > fine-grained mode, which means you only elastically grow and shrink CPU
> > usage, not memory.
> >
> > On Sat, Dec 24, 2016 at 10:14 PM, Davies Liu <davies@gmail.com>
> wrote:
> >>
> >> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
> >> allocation, but have to pay a little more overhead for launching a
> >> task, which should be OK if the task is not trivial.
> >>
> >> Since the direct result (up to 1M by default) will also go through
> >> mesos, it's better to tune it lower, otherwise mesos could become the
> >> bottleneck.
> >>
> >> spark.task.maxDirectResultSize
> >>
> >> On Mon, Dec 19, 2016 at 3:23 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> >> wrote:
> >> > Tim,
> >> >
> >> > We will try to run the application in coarse grain mode, and share the
> >> > findings with you.
> >> >
> >> > Regards
> >> > Sumit Chawla
> >> >
> >> >
> >> > On Mon, Dec 19, 2016 at 3:11 PM, Timothy Chen <tnac...@gmail.com>
> wrote:
> >> >
> >> >> Dynamic allocation works with Coarse grain mode only, we wasn't aware
> >> >> a need for Fine grain mode after we enabled dynamic allocation
> support
> >> >> on the coarse grain mode.
> >> >>
> >> >> What's the reason you're running fine grain mode instead of coarse
> >> >> grain + dynamic allocation?
> >> >>
> >> >> Tim
> >> >>
> >> >> On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane
> >> >> <mehdi.mezi...@ldmobile.net> wrote:
> >> >> > We will be interested by the results if you give a try to Dynamic
> >> >> allocation
> >> >> > with mesos !
> >> >> >
> >> >> >
> >> >> > - Original Message -
> >> >> > From: "Michael Gummelt" <mgumm...@mesosphere.io>
> >> >> > To: "Sumit Chawla" <sumitkcha...@gmail.com>
> >> >> > Cc: u...@mesos.apache.org, d...@mesos.apache.org, "User"
> >> >> > <user@spark.apache.org>, d...@spark.apache.org
> >> >> > Sent: Monday, 19 December 2016 22:42:55 GMT +01:00 Amsterdam / Berlin /
> >> >> > Berne / Rome / Stockholm / Vienna
> >> >> > Subject: Re: Mesos Spark Fine Grained Execution - CPU count
> >> >> >
> >> >> >

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-26 Thread Michael Gummelt
> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
allocation

Maybe for CPU, but definitely not for memory.  Executors never shut down in
fine-grained mode, which means you only elastically grow and shrink CPU
usage, not memory.

On Sat, Dec 24, 2016 at 10:14 PM, Davies Liu <davies@gmail.com> wrote:

> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
> allocation, but have to pay a little more overhead for launching a
> task, which should be OK if the task is not trivial.
>
> Since the direct result (up to 1M by default) will also go through
> mesos, it's better to tune it lower, otherwise mesos could become the
> bottleneck.
>
> spark.task.maxDirectResultSize
>
> On Mon, Dec 19, 2016 at 3:23 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
> > Tim,
> >
> > We will try to run the application in coarse grain mode, and share the
> > findings with you.
> >
> > Regards
> > Sumit Chawla
> >
> >
> > On Mon, Dec 19, 2016 at 3:11 PM, Timothy Chen <tnac...@gmail.com> wrote:
> >
> >> Dynamic allocation works with coarse-grained mode only; we weren't aware of
> >> a need for fine-grained mode after we enabled dynamic allocation support
> >> on the coarse-grained mode.
> >>
> >> What's the reason you're running fine grain mode instead of coarse
> >> grain + dynamic allocation?
> >>
> >> Tim
> >>
> >> On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane
> >> <mehdi.mezi...@ldmobile.net> wrote:
> >> > We will be interested by the results if you give a try to Dynamic
> >> allocation
> >> > with mesos !
> >> >
> >> >
> >> > - Original Message -
> >> > From: "Michael Gummelt" <mgumm...@mesosphere.io>
> >> > To: "Sumit Chawla" <sumitkcha...@gmail.com>
> >> > Cc: u...@mesos.apache.org, d...@mesos.apache.org, "User"
> >> > <user@spark.apache.org>, d...@spark.apache.org
> >> > Sent: Monday, 19 December 2016 22:42:55 GMT +01:00 Amsterdam / Berlin /
> >> > Berne / Rome / Stockholm / Vienna
> >> > Subject: Re: Mesos Spark Fine Grained Execution - CPU count
> >> >
> >> >
> >> >> Is this problem of idle executors sticking around solved in Dynamic
> >> >> Resource Allocation?  Is there some timeout after which Idle
> executors
> >> can
> >> >> just shutdown and cleanup its resources.
> >> >
> >> > Yes, that's exactly what dynamic allocation does.  But again I have no
> >> idea
> >> > what the state of dynamic allocation + mesos is.
> >> >
> >> > On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit <sumitkcha...@gmail.com
> >
> >> > wrote:
> >> >>
> >> >> Great.  Makes much better sense now.  What will be reason to have
> >> >> spark.mesos.mesosExecutor.cores more than 1, as this number doesn't
> >> include
> >> >> the number of cores for tasks.
> >> >>
> >> >> So in my case it seems like 30 CPUs are allocated to executors.  And
> >> there
> >> >> are 48 tasks so 48 + 30 =  78 CPUs.  And i am noticing this gap of
> 30 is
> >> >> maintained till the last task exits.  This explains the gap.   Thanks
> >> >> everyone.  I am still not sure how this number 30 is calculated.  (
> Is
> >> it
> >> >> dynamic based on current resources, or is it some configuration.  I
> >> have 32
> >> >> nodes in my cluster).
> >> >>
> >> >> Is this problem of idle executors sticking around solved in Dynamic
> >> >> Resource Allocation?  Is there some timeout after which Idle
> executors
> >> can
> >> >> just shutdown and cleanup its resources.
> >> >>
> >> >>
> >> >> Regards
> >> >> Sumit Chawla
> >> >>
> >> >>
> >> >> On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt <
> >> mgumm...@mesosphere.io>
> >> >> wrote:
> >> >>>
> >> >>> >  I should preassume that No of executors should be less than
> number
> >> of
> >> >>> > tasks.
> >> >>>
> >> >>> No.  Each executor runs 0 or more tasks.
> >> >>>
> >> >>> Each executor consumes 1 CPU, and each task running on that executor
> >>

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
> Is this problem of idle executors sticking around solved in Dynamic
Resource Allocation?  Is there some timeout after which Idle executors can
just shutdown and cleanup its resources.

Yes, that's exactly what dynamic allocation does.  But again I have no idea
what the state of dynamic allocation + mesos is.

On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit <sumitkcha...@gmail.com>
wrote:

> Great.  Makes much better sense now.  What will be reason to have
> spark.mesos.mesosExecutor.cores more than 1, as this number doesn't
> include the number of cores for tasks.
>
> So in my case it seems like 30 CPUs are allocated to executors.  And there
> are 48 tasks so 48 + 30 =  78 CPUs.  And i am noticing this gap of 30 is
> maintained till the last task exits.  This explains the gap.   Thanks
> everyone.  I am still not sure how this number 30 is calculated.  ( Is it
> dynamic based on current resources, or is it some configuration.  I have 32
> nodes in my cluster).
>
> Is this problem of idle executors sticking around solved in Dynamic
> Resource Allocation?  Is there some timeout after which Idle executors can
> just shutdown and cleanup its resources.
>
>
> Regards
> Sumit Chawla
>
>
> On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> >  I should preassume that No of executors should be less than number of
>> tasks.
>>
>> No.  Each executor runs 0 or more tasks.
>>
>> Each executor consumes 1 CPU, and each task running on that executor
>> consumes another CPU.  You can customize this via
>> spark.mesos.mesosExecutor.cores (https://github.com/apache/spa
>> rk/blob/v1.6.3/docs/running-on-mesos.md) and spark.task.cpus (
>> https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md)
>>
>> On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit <sumitkcha...@gmail.com>
>> wrote:
>>
>>> Ah thanks. looks like i skipped reading this *"Neither will executors
>>> terminate when they’re idle."*
>>>
>>> So in my job scenario,  I should preassume that No of executors should
>>> be less than number of tasks. Ideally one executor should execute 1 or more
>>> tasks.  But i am observing something strange instead.  I start my job with
>>> 48 partitions for a spark job. In mesos ui i see that number of tasks is
>>> 48, but no. of CPUs is 78 which is way more than 48.  Here i am assuming
>>> that 1 CPU is 1 executor.   I am not specifying any configuration to set
>>> number of cores per executor.
>>>
>>> Regards
>>> Sumit Chawla
>>>
>>>
>>> On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere <
>>> jo...@mesosphere.io> wrote:
>>>
>>>> That makes sense. From the documentation it looks like the executors
>>>> are not supposed to terminate:
>>>> http://spark.apache.org/docs/latest/running-on-mesos.html#fi
>>>> ne-grained-deprecated
>>>>
>>>>> Note that while Spark tasks in fine-grained will relinquish cores as
>>>>> they terminate, they will not relinquish memory, as the JVM does not give
>>>>> memory back to the Operating System. Neither will executors terminate when
>>>>> they’re idle.
>>>>
>>>>
>>>> I suppose your task to executor CPU ratio is low enough that it looks
>>>> like most of the resources are not being reclaimed. If your tasks were
>>>> using significantly more CPU the amortized cost of the idle executors would
>>>> not be such a big deal.
>>>>
>>>>
>>>> —
>>>> *Joris Van Remoortere*
>>>> Mesosphere
>>>>
>>>> On Mon, Dec 19, 2016 at 11:26 AM, Timothy Chen <tnac...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Chawla,
>>>>>
>>>>> One possible reason is that Mesos fine grain mode also takes up cores
>>>>> to run the executor per host, so if you have 20 agents running Fine
>>>>> grained executor it will take up 20 cores while it's still running.
>>>>>
>>>>> Tim
>>>>>
>>>>> On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit <sumitkcha...@gmail.com>
>>>>> wrote:
>>>>> > Hi
>>>>> >
>>>>> > I am using Spark 1.6. I have one query about Fine Grained model in
>>>>> Spark.
>>>>> > I have a simple Spark application which transforms A -> B.  Its a
>>>>> single
>>>>> > stage application.  To begin the program, It starts with 48
>>>>> partitions.
>>>>> > When the program starts running, in mesos UI it shows 48 tasks and
>>>>> 48 CPUs
>>>>> > allocated to job.  Now as the tasks get done, the number of active
>>>>> tasks
>>>>> > number starts decreasing.  How ever, the number of CPUs does not
>>>>> decrease
>>>>> > propotionally.  When the job was about to finish, there was a single
>>>>> > remaininig task, however CPU count was still 20.
>>>>> >
>>>>> > My questions, is why there is no one to one mapping between tasks
>>>>> and cpus
>>>>> > in Fine grained?  How can these CPUs be released when the job is
>>>>> done, so
>>>>> > that other jobs can start.
>>>>> >
>>>>> >
>>>>> > Regards
>>>>> > Sumit Chawla
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
>  I should presume that the number of executors should be less than the number
of tasks.

No.  Each executor runs 0 or more tasks.

Each executor consumes 1 CPU, and each task running on that executor
consumes another CPU.  You can customize this via
spark.mesos.mesosExecutor.cores (
https://github.com/apache/spark/blob/v1.6.3/docs/running-on-mesos.md) and
spark.task.cpus (
https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md)
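
Concretely, in fine-grained mode something like this trims the per-executor overhead (0.1 is just an illustrative value):

spark-submit \
  --master mesos://<master>:5050 \
  --conf spark.mesos.mesosExecutor.cores=0.1 \
  --conf spark.task.cpus=1 \
  ...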

On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit <sumitkcha...@gmail.com>
wrote:

> Ah, thanks. Looks like I skipped reading this: *"Neither will executors
> terminate when they’re idle."*
>
> So in my job scenario, I should presume that the number of executors should be
> less than the number of tasks. Ideally one executor should execute 1 or more
> tasks.  But I am observing something strange instead.  I start my job with
> 48 partitions for a Spark job. In the Mesos UI I see that the number of tasks is
> 48, but the number of CPUs is 78, which is way more than 48.  Here I am assuming
> that 1 CPU is 1 executor.   I am not specifying any configuration to set the
> number of cores per executor.
>
> Regards
> Sumit Chawla
>
>
> On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere <
> jo...@mesosphere.io> wrote:
>
>> That makes sense. From the documentation it looks like the executors are
>> not supposed to terminate:
>> http://spark.apache.org/docs/latest/running-on-mesos.html#fi
>> ne-grained-deprecated
>>
>>> Note that while Spark tasks in fine-grained will relinquish cores as
>>> they terminate, they will not relinquish memory, as the JVM does not give
>>> memory back to the Operating System. Neither will executors terminate when
>>> they’re idle.
>>
>>
>> I suppose your task to executor CPU ratio is low enough that it looks
>> like most of the resources are not being reclaimed. If your tasks were
>> using significantly more CPU the amortized cost of the idle executors would
>> not be such a big deal.
>>
>>
>> —
>> *Joris Van Remoortere*
>> Mesosphere
>>
>> On Mon, Dec 19, 2016 at 11:26 AM, Timothy Chen <tnac...@gmail.com> wrote:
>>
>>> Hi Chawla,
>>>
>>> One possible reason is that Mesos fine grain mode also takes up cores
>>> to run the executor per host, so if you have 20 agents running Fine
>>> grained executor it will take up 20 cores while it's still running.
>>>
>>> Tim
>>>
>>> On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit <sumitkcha...@gmail.com>
>>> wrote:
>>> > Hi
>>> >
>>> > I am using Spark 1.6. I have one query about Fine Grained model in
>>> Spark.
>>> > I have a simple Spark application which transforms A -> B.  Its a
>>> single
>>> > stage application.  To begin the program, It starts with 48 partitions.
>>> > When the program starts running, in mesos UI it shows 48 tasks and 48
>>> CPUs
>>> > allocated to job.  Now as the tasks get done, the number of active
>>> tasks
>>> > number starts decreasing.  How ever, the number of CPUs does not
>>> decrease
>>> > propotionally.  When the job was about to finish, there was a single
>>> > remaininig task, however CPU count was still 20.
>>> >
>>> > My questions, is why there is no one to one mapping between tasks and
>>> cpus
>>> > in Fine grained?  How can these CPUs be released when the job is done,
>>> so
>>> > that other jobs can start.
>>> >
>>> >
>>> > Regards
>>> > Sumit Chawla
>>>
>>
>>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
Yea, the idea is to use dynamic allocation.  I can't speak to how well it
works with Mesos, though.

On Mon, Dec 19, 2016 at 11:01 AM, Mehdi Meziane <mehdi.mezi...@ldmobile.net>
wrote:

> I think that what you are looking for is Dynamic resource allocation:
> http://spark.apache.org/docs/latest/job-scheduling.html#
> dynamic-resource-allocation
>
> Spark provides a mechanism to dynamically adjust the resources your
> application occupies based on the workload. This means that your
> application may give resources back to the cluster if they are no longer
> used and request them again later when there is demand. This feature is
> particularly useful if multiple applications share resources in your Spark
> cluster.
>
> - Original Message -
> From: "Sumit Chawla" <sumitkcha...@gmail.com>
> To: "Michael Gummelt" <mgumm...@mesosphere.io>
> Cc: u...@mesos.apache.org, "Dev" <d...@mesos.apache.org>, "User" <
> user@spark.apache.org>, "dev" <d...@spark.apache.org>
> Sent: Monday, 19 December 2016 19:35:51 GMT +01:00 Amsterdam / Berlin /
> Berne / Rome / Stockholm / Vienna
> Subject: Re: Mesos Spark Fine Grained Execution - CPU count
>
>
> But coarse grained does the exact same thing which i am trying to avert
> here.  At the cost of lower startup, it keeps the resources reserved till
> the entire duration of the job.
>
> Regards
> Sumit Chawla
>
>
> On Mon, Dec 19, 2016 at 10:06 AM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> Hi
>>
>> I don't have a lot of experience with the fine-grained scheduler.  It's
>> deprecated and fairly old now.  CPUs should be relinquished as tasks
>> complete, so I'm not sure why you're seeing what you're seeing.  There have
>> been a few discussions on the spark list regarding deprecating the
>> fine-grained scheduler, and no one seemed too dead-set on keeping it.  I'd
>> recommend you move over to coarse-grained.
>>
>> On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit <sumitkcha...@gmail.com>
>> wrote:
>>
>>> Hi
>>>
>>> I am using Spark 1.6. I have one query about Fine Grained model in
>>> Spark.  I have a simple Spark application which transforms A -> B.  Its a
>>> single stage application.  To begin the program, It starts with 48
>>> partitions.  When the program starts running, in mesos UI it shows 48 tasks
>>> and 48 CPUs allocated to job.  Now as the tasks get done, the number of
>>> active tasks number starts decreasing.  How ever, the number of CPUs does
>>> not decrease propotionally.  When the job was about to finish, there was a
>>> single remaininig task, however CPU count was still 20.
>>>
>>> My questions, is why there is no one to one mapping between tasks and
>>> cpus in Fine grained?  How can these CPUs be released when the job is done,
>>> so that other jobs can start.
>>>
>>>
>>> Regards
>>> Sumit Chawla
>>>
>>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Michael Gummelt
Hi

I don't have a lot of experience with the fine-grained scheduler.  It's
deprecated and fairly old now.  CPUs should be relinquished as tasks
complete, so I'm not sure why you're seeing what you're seeing.  There have
been a few discussions on the spark list regarding deprecating the
fine-grained scheduler, and no one seemed too dead-set on keeping it.  I'd
recommend you move over to coarse-grained.

On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit <sumitkcha...@gmail.com>
wrote:

> Hi
>
> I am using Spark 1.6. I have one query about the fine-grained model in Spark.
> I have a simple Spark application which transforms A -> B.  It's a single-
> stage application.  To begin the program, it starts with 48 partitions.
> When the program starts running, the Mesos UI shows 48 tasks and 48 CPUs
> allocated to the job.  Now as the tasks get done, the number of active tasks
> starts decreasing.  However, the number of CPUs does not decrease
> proportionally.  When the job was about to finish, there was a single
> remaining task, yet the CPU count was still 20.
>
> My question is: why is there no one-to-one mapping between tasks and CPUs
> in fine-grained mode?  How can these CPUs be released when the job is done, so
> that other jobs can start.
>
>
> Regards
> Sumit Chawla
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: driver in queued state and not started

2016-12-06 Thread Michael Gummelt
Client mode or cluster mode?

On Mon, Dec 5, 2016 at 10:05 PM, Yu Wei <yu20...@hotmail.com> wrote:

> Hi Guys,
>
>
> I tried to run spark on mesos cluster.
>
> However, when I tried to submit jobs via spark-submit. The driver is in
> "Queued state" and not started.
>
>
> Which should I check?
>
>
>
> Thanks,
>
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: two spark-shells spark on mesos not working

2016-11-22 Thread Michael Gummelt
What are the full driver logs?  If you enable DEBUG logging, it should give
you more information about the rejected offers.  This can also happen if
offers are being accepted, but tasks immediately die for some reason.  You
should check the Mesos UI for failed tasks.  If they exist, please include
those logs here as well.
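
To enable DEBUG logging on the driver, a quick sketch (Spark 2.0 still uses log4j 1.x properties):

cd $SPARK_HOME/conf
cp log4j.properties.template log4j.properties
# then raise the root level in log4j.properties:
#   log4j.rootCategory=DEBUG, console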

On Tue, Nov 22, 2016 at 4:52 AM, John Yost <hokiege...@gmail.com> wrote:

> Hi Everyone,
>
> There is probably an obvious answer to this, but not sure what it is. :)
>
> I am attempting to launch 2..n spark shells using Mesos as the master
> (this is to support 1..n researchers running pyspark stuff on our data). I
> can launch two or more spark shells without any problem. But, when I
> attempt any kind of operation that requires a Spark executor outside the
> driver program such as:
>
> val numbers = Range(1, 1000)
> val pNumbers = sc.parallelize(numbers)
> pNumbers.take(5)
>
> I get the dreaded message:
> TaskSchedulerImpl: Initial job has not accepted any resources; check your
> cluster UI to ensure that workers are registered and sufficient resources
>
> I confirmed that both spark shells are listed as separate, uniquely-named
> Mesos frameworks and that there are plenty of CPU core and memory resources
> on our cluster.
>
> I am using Spark 2.0.1 on Mesos 0.28.1. Any ideas that y'all may have
> would be very much appreciated.
>
> Thanks! :)
>
> --John
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Two questions about running spark on mesos

2016-11-14 Thread Michael Gummelt
1. I had never even heard of conf/slaves until this email, and I only see
it referenced in the docs next to Spark Standalone, so I doubt that works.

2. Yes.  See the --kill option in spark-submit.
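
Roughly (the submission id below is made up; the real one comes from the dispatcher UI or from the output of the original cluster-mode submit):

spark-submit \
  --master mesos://<dispatcher-host>:7077 \
  --kill driver-20161114123456-0001

# there is also a --status flag that takes the same id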

Also, we're considering dropping the Spark dispatcher in DC/OS in favor of
Metronome, which will be our consolidated method of running any one-off
jobs.  The dispatcher is really just a less-maintained and more
feature-sparse Metronome.  If I were you, I would look into running
Metronome rather than the dispatcher (or just run DC/OS).

On Mon, Nov 14, 2016 at 3:10 AM, Yu Wei <yu20...@hotmail.com> wrote:

> Hi Guys,
>
>
> Two questions about running spark on mesos.
>
> 1, Does spark configuration of conf/slaves still work when running spark
> on mesos?
>
> According to my observations, it seemed that conf/slaves still took
> effect when running spark-shell.
>
> However, it doesn't take effect when deploying in cluster mode.
>
> Is this expected behavior?
>
>Or did I miss anything?
>
>
> 2, Could I kill submitted jobs when running spark on mesos in cluster mode?
>
> I launched spark on mesos in cluster mode. Then submitted a long
> running job succeeded.
>
> Then I want to kill the job.
> How could I do that? Is there any similar commands as launching spark
> on yarn?
>
>
> Thanks,
>
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: sanboxing spark executors

2016-11-04 Thread Michael Gummelt
Mesos will let you run in docker containers, so you get filesystem
isolation, and we're about to merge CNI support:
https://github.com/apache/spark/pull/15740, which would allow you to set up
network policies.  Though you might be able to achieve whatever network
isolation you need without CNI, depending on your requirements.
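
For the Docker route, the relevant setting is along these lines (the image name is a placeholder):

spark-submit \
  --master mesos://<master>:5050 \
  --conf spark.mesos.executor.docker.image=your-org/spark-sandbox:2.0.1 \
  ...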

As far as unauthenticated HDFS clusters, I would recommend against running
untrusted code on the same network as your secure HDFS cluster.

On Fri, Nov 4, 2016 at 4:13 PM, blazespinnaker <blazespinna...@gmail.com>
wrote:

> In particular, we need to make sure the RDDs execute the lambda functions
> securely as they are provided by user code.
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/sanboxing-spark-executors-tp28014p28024.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Submit job with driver options in Mesos Cluster mode

2016-10-31 Thread Michael Gummelt
Can you check if this JIRA is relevant?
https://issues.apache.org/jira/browse/SPARK-2608

If not, can you make a new one?

On Thu, Oct 27, 2016 at 10:27 PM, Rodrick Brown <rodr...@orchard-app.com>
wrote:

> Try setting the values in $SPARK_HOME/conf/spark-defaults.conf
>
> i.e.
>
> $ egrep 'spark.(driver|executor).extra' /data/orchard/spark-2.0.1/
> conf/spark-defaults.conf
> spark.executor.extraJavaOptions -Duser.timezone=UTC
> -Xloggc:garbage-collector.log
> spark.driver.extraJavaOptions   -Duser.timezone=UTC
> -Xloggc:garbage-collector.log
>
> --
>
> [image: Orchard Platform] <http://www.orchardplatform.com/>
>
> Rodrick Brown / DevOPs Engineer
> +1 917 445 6839 / rodr...@orchardplatform.com
> <char...@orchardplatform.com>
>
> Orchard Platform
> 101 5th Avenue, 4th Floor, New York, NY 10003
> http://www.orchardplatform.com
>
> Orchard Blog <http://www.orchardplatform.com/blog/> | Marketplace Lending
> Meetup <http://www.meetup.com/Peer-to-Peer-Lending-P2P/>
>
> On Oct 6, 2016, at 12:20 PM, vonnagy <i...@vadio.com> wrote:
>
> I am trying to submit a job to spark running in a Mesos cluster. We need to
> pass custom java options to the driver and executor for configuration, but
> the driver task never includes the options. Here is an example submit.
>
> GC_OPTS="-XX:+UseConcMarkSweepGC
> -verbose:gc -XX:+PrintGCTimeStamps -Xloggc:$appdir/gc.out
> -XX:MaxPermSize=512m
> -XX:+CMSClassUnloadingEnabled "
>
> EXEC_PARAMS="-Dloglevel=DEBUG -Dkafka.broker-address=${KAFKA_ADDRESS}
> -Dredis.master=${REDIS_MASTER} -Dredis.port=${REDIS_PORT}"
>
> spark-submit \
>  --name client-events-intake \
>  --class ClientEventsApp \
>  --deploy-mode cluster \
>  --driver-java-options "${EXEC_PARAMS} ${GC_OPTS}" \
>  --conf "spark.ui.killEnabled=true" \
>  --conf "spark.mesos.coarse=true" \
>  --conf "spark.driver.extraJavaOptions=${EXEC_PARAMS}" \
>  --conf "spark.executor.extraJavaOptions=${EXEC_PARAMS}" \
>  --master mesos://someip:7077 \
>  --verbose \
>  some.jar
>
> When the driver task runs in Mesos it is creating the following command:
>
> sh -c 'cd spark-1*;  bin/spark-submit --name client-events-intake --class
> ClientEventsApp --master mesos://someip:5050 --driver-cores 1.0
> --driver-memory 512M ../some.jar '
>
> There are no options for the driver here, thus the driver app blows up
> because it can't find the java options. However, the environment variables
> contain the executor options:
>
> SPARK_EXECUTOR_OPTS -> -Dspark.executor.extraJavaOptions=-Dloglevel=DEBUG
> ...
>
> Any help would be great. I know that we can set some "spark.*" settings in
> default configs, but these are not necessarily spark related. This is not
> an
> issue when running the same logic outside of a Mesos cluster in Spark
> standalone mode.
>
> Thanks!
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/Submit-job-with-driver-options-in-
> Mesos-Cluster-mode-tp27853.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
>
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: How to make Mesos Cluster Dispatcher of Spark 1.6.1 load my config files?

2016-10-19 Thread Michael Gummelt
See https://issues.apache.org/jira/browse/SPARK-13258 for an explanation
and workaround.

On Wed, Oct 19, 2016 at 1:35 AM, Chanh Le <giaosu...@gmail.com> wrote:

> Thank you Daniel,
> Actually I tried this before but this way is still not flexible way if you
> are running multiple jobs at the time and may different dependencies
> between each job configuration so I gave up.
>
> Another simple solution is set the command bellow as a service and I am
> using it.
>
> /build/analytics/spark-1.6.1-bin-hadoop2.6/bin/spark-submit \
>>
>>
>>
>>
>> *--files /build/analytics/kafkajobs/prod.conf \--conf
>> 'spark.executor.extraJavaOptions=-Dconfig.fuction.conf' \--conf
>> 'spark.driver.extraJavaOptions=-Dconfig.file=/build/analytics/kafkajobs/prod.conf'
>> \--conf
>> 'spark.driver.extraClassPath=/build/analytics/spark-1.6.1-bin-hadoop2.6/lib/postgresql-9.3-1102.jdbc41.jar'
>> \--conf
>> 'spark.executor.extraClassPath=/build/analytics/spark-1.6.1-bin-hadoop2.6/lib/postgresql-9.3-1102.jdbc41.jar'
>> \*
>> --class com.ants.util.kafka.PersistenceData \
>>
>> *--master mesos://10.199.0.19:5050 \*--executor-memory 5G \
>> --driver-memory 2G \
>> --total-executor-cores 4 \
>> --jars /build/analytics/kafkajobs/spark-streaming-kafka_2.10-1.6.2.jar \
>> /build/analytics/kafkajobs/kafkajobs-prod.jar
>>
>
> [Unit]
> Description=Mesos Cluster Dispatcher
>
> [Service]
> ExecStart=/build/analytics/kafkajobs/persist-job.sh
> PIDFile=/var/run/spark-persist.pid
> [Install]
> WantedBy=multi-user.target
>
>
> Regards,
> Chanh
>
> On Oct 19, 2016, at 2:15 PM, Daniel Carroza <dcarr...@stratio.com> wrote:
>
> Hi Chanh,
>
> I found a workaround that works to me:
> http://stackoverflow.com/questions/29552799/spark-
> unable-to-find-jdbc-driver/40114125#40114125
>
> Regards,
> Daniel
>
> El jue., 6 oct. 2016 a las 6:26, Chanh Le (<giaosu...@gmail.com>)
> escribió:
>
>> Hi everyone,
>> I have the same config in both mode and I really want to change config
>> whenever I run so I created a config file and run my application with it.
>> My problem is:
>> It’s works with these config without using Mesos Cluster Dispatcher.
>>
>> /build/analytics/spark-1.6.1-bin-hadoop2.6/bin/spark-submit \
>>
>>
>>
>>
>> *--files /build/analytics/kafkajobs/prod.conf \--conf
>> 'spark.executor.extraJavaOptions=-Dconfig.fuction.conf' \--conf
>> 'spark.driver.extraJavaOptions=-Dconfig.file=/build/analytics/kafkajobs/prod.conf'
>> \--conf
>> 'spark.driver.extraClassPath=/build/analytics/spark-1.6.1-bin-hadoop2.6/lib/postgresql-9.3-1102.jdbc41.jar'
>> \--conf
>> 'spark.executor.extraClassPath=/build/analytics/spark-1.6.1-bin-hadoop2.6/lib/postgresql-9.3-1102.jdbc41.jar'
>> \*
>> --class com.ants.util.kafka.PersistenceData \
>>
>> *--master mesos://10.199.0.19:5050 \*--executor-memory 5G \
>> --driver-memory 2G \
>> --total-executor-cores 4 \
>> --jars /build/analytics/kafkajobs/spark-streaming-kafka_2.10-1.6.2.jar \
>> /build/analytics/kafkajobs/kafkajobs-prod.jar
>>
>>
>> And it’s didn't work with these:
>>
>> /build/analytics/spark-1.6.1-bin-hadoop2.6/bin/spark-submit \
>>
>>
>>
>>
>> *--files /build/analytics/kafkajobs/prod.conf \--conf
>> 'spark.executor.extraJavaOptions=-Dconfig.fuction.conf' \--conf
>> 'spark.driver.extraJavaOptions=-Dconfig.file=/build/analytics/kafkajobs/prod.conf'
>> \--conf
>> 'spark.driver.extraClassPath=/build/analytics/spark-1.6.1-bin-hadoop2.6/lib/postgresql-9.3-1102.jdbc41.jar'
>> \--conf
>> 'spark.executor.extraClassPath=/build/analytics/spark-1.6.1-bin-hadoop2.6/lib/postgresql-9.3-1102.jdbc41.jar'
>> \*
>> --class com.ants.util.kafka.PersistenceData \
>>
>>
>> *--master mesos://10.199.0.19:7077 \--deploy-mode cluster \--supervise \*
>> --executor-memory 5G \
>> --driver-memory 2G \
>> --total-executor-cores 4 \
>> --jars /build/analytics/kafkajobs/spark-streaming-kafka_2.10-1.6.2.jar \
>> /build/analytics/kafkajobs/kafkajobs-prod.jar
>>
>> It threw me an error: *Exception in thread "main" java.sql.SQLException:
>> No suitable driver found for jdbc:postgresql://psqlhost:5432/kafkajobs*
>> which means my —conf didn’t work and those config I put in 
>> */build/analytics/kafkajobs/prod.conf
>> *wasn’t loaded. It only loaded thing I put in application.conf (default
>> config).
>>
>> How to make MCD load my config?
>>
>> Regards,
>> Chanh
>>
>> --
> Daniel Carroza Santana
> Vía de las Dos Castillas, 33, Ática 4, 3ª Planta.
> 28224 Pozuelo de Alarcón. Madrid.
> Tel: +34 91 828 64 73 // *@stratiobd <https://twitter.com/StratioBD>*
>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: No way to set mesos cluster driver memory overhead?

2016-10-13 Thread Michael Gummelt
We see users run both in the dispatcher and marathon.  I generally prefer
marathon, because there's a higher likelihood it's going to have some
feature you need that the dispatcher lacks (like in this case).

It doesn't look like we support overhead for the driver.

On Thu, Oct 13, 2016 at 10:42 AM, drewrobb <drewr...@gmail.com> wrote:

> When using spark on mesos and deploying a job in cluster mode using
> dispatcher, there appears to be no memory overhead configuration for the
> launched driver processes ("--driver-memory" is the same as Xmx which is
> the
> same as the memory quota). This makes it almost a guarantee that a long
> running driver will be OOM killed by mesos. Yarn cluster mode has an
> equivalent option -- spark.yarn.driver.memoryOverhead. Is there some way
> to
> configure driver memory overhead that I'm missing?
>
> Bigger picture question-- Is it even best practice to deploy long running
> spark streaming jobs using dispatcher? I could alternatively launch the
> driver by itself using marathon for example, where it would be trivial to
> grant the process additional memory.
>
> Thanks!
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/No-way-to-set-mesos-cluster-driver-
> memory-overhead-tp27897.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: spark on mesos memory sizing with offheap

2016-10-13 Thread Michael Gummelt
It doesn't look like we are.  Can you file a JIRA?  A workaround is to set
spark.mesos.executor.memoryOverhead to be at least spark.memory.offHeap.size.
This is how the container is sized:
https://github.com/apache/spark/blob/master/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala#L366
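
A sketch of that workaround for the example in your mail (3g heap + 1g off-heap); the overhead value is in MB and the numbers are only illustrative:

spark-submit \
  --master mesos://<master>:5050 \
  --conf spark.executor.memory=3g \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=1g \
  --conf spark.mesos.executor.memoryOverhead=1408 \
  ...
# 1408 MB = 1 GB of off-heap plus the 384 MB default overhead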

On Thu, Oct 13, 2016 at 7:23 AM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Hi,
> I am trying to understand how mesos allocate memory when offheap is
> enabled but it seems that the framework is only taking the heap + 400 MB
> overhead into consideration for resources allocation.
> Example: spark.executor.memory=3g spark.memory.offheap.size=1g ==> mesos
> report 3.4g allocated for the executor
> Is there any configuration to use both heap and offheap for mesos
> allocation ?
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Sending extraJavaOptions for Spark 1.6.1 on mesos 0.28.2 in cluster mode

2016-09-20 Thread Michael Gummelt
Probably this: https://issues.apache.org/jira/browse/SPARK-13258

As described in the JIRA, the workaround is to use SPARK_JAVA_OPTS
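
A rough sketch of that workaround (the -D values are the ones from your mail; whether you export this in spark-env.sh or in the dispatcher's environment depends on your setup):

export SPARK_JAVA_OPTS="-Dsome.url=http://some-url"
spark-submit \
  --deploy-mode cluster \
  --master mesos://<dispatcher-host>:7077 \
  ... your-app.jar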

On Mon, Sep 19, 2016 at 5:07 PM, sagarcasual . <sagarcas...@gmail.com>
wrote:

> Hello,
> I have my Spark application running in cluster mode in CDH with
> extraJavaOptions.
> However when I am attempting a same application to run with apache mesos,
> it does not recognize the properties below at all and code returns null
> that reads them.
>
> --conf spark.driver.extraJavaOptions=-Dsome.url=http://some-url \
> --conf spark.executor.extraJavaOptions=-Dsome.url=http://some-url
>
> I tried option specified in http://stackoverflow.com/
> questions/35872093/missing-java-system-properties-when-
> running-spark-streaming-on-mesos-cluster?noredirect=1=1
>
> and still got no change in the result.
>
> Any idea how to achieve this in Mesos.
>
> -Regards
> Sagar
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: very high maxresults setting (no collect())

2016-09-19 Thread Michael Gummelt
When you say "started seeing", do you mean after a Spark version upgrade?
After running a new job?

On Mon, Sep 19, 2016 at 2:05 PM, Adrian Bridgett <adr...@opensignal.com>
wrote:

> Hi,
>
> We've recently started seeing a huge increase in
> spark.driver.maxResultSize - we are starting to set it at 3GB (and increase
> our driver memory a lot to 12GB or so).  This is on v1.6.1 with Mesos
> scheduler.
>
> All the docs I can see is that this is to do with .collect() being called
> on a large RDD (which isn't the case AFAIK - certainly nothing in the code)
> and it's rather puzzling me as to what's going on.  I thought that the
> number of tasks was coming into it (about 14000 tasks in each of about a
> dozen stages).  Adding a coalesce seemed to help but now we are hitting the
> problem again after a few minor code tweaks.
>
> What else could be contributing to this?   Thoughts I've had:
> - number of tasks
> - metrics?
> - um, a bit stuck!
>
> The code looks like this:
> df=
> df.persist()
> val rows = df.count()
>
> // actually we loop over this a few times
> val output = df.groupBy("id").agg(
>   avg($"score").as("avg_score"),
>   count($"id").as("rows")
> ).
> select(
>   $"id",
>   $"avg_score",
>   $"rows"
> ).sort($"id")
> output.coalesce(1000).write.format("com.databricks.spark.csv
> ").save('/tmp/...')
>
> Cheers for any help/pointers!  There are a couple of memory leak tickets
> fixed in v1.6.2 that may affect the driver so I may try an upgrade (the
> executors are fine).
>
> Adrian
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: No SparkR on Mesos?

2016-09-07 Thread Michael Gummelt
Quite possibly.  I've never used it.  I know Python was "unsupported" for a
while, which turned out to mean there was a silly conditional that would
fail the submission, even though all the support was there.  Could be the
same for R.  Can you submit a JIRA?

On Wed, Sep 7, 2016 at 5:02 AM, Peter Griessl <grie...@ihs.ac.at> wrote:

> Hello,
>
>
>
> does SparkR really not work (yet?) on Mesos (Spark 2.0 on Mesos 1.0)?
>
>
>
> $ /opt/spark/bin/sparkR
>
>
>
> R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
>
> Copyright (C) 2016 The R Foundation for Statistical Computing
>
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> Launching java with spark-submit command /opt/spark/bin/spark-submit
> "sparkr-shell" /tmp/RtmpPYVJxF/backend_port338581f434
>
> Error: *SparkR is not supported for Mesos cluster*.
>
> Error in sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap,  :
>
>   JVM is not ready after 10 seconds
>
>
>
>
>
> I couldn’t find any information on this subject in the docs – am I missing
> something?
>
>
>
> Thanks for any hints,
>
> Peter
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Mesos coarse-grained problem with spark.shuffle.service.enabled

2016-09-07 Thread Michael Gummelt
The shuffle service is run out of band from any specific Spark job, and you
only run one on any given node.  You need to get the Spark distribution on
each node somehow, then run the shuffle service out of that distribution.
The most common way I see people doing this is via Marathon (using the
"uris" field in the marathon app to download the Spark distribution).

On Wed, Sep 7, 2016 at 2:16 AM, Tamas Szuromi <
tamas.szur...@odigeo.com.invalid> wrote:

> Hello,
>
> For a while, we're using Spark on Mesos with fine-grained mode in
> production.
> Since Spark 2.0 the fine-grained mode is deprecated so we'd shift to
> dynamic allocation.
>
> When I tried to setup the dynamic allocation I run into the following
> problem:
> So I set spark.shuffle.service.enabled = true and 
> spark.dynamicAllocation.enabled
> = true as the documentation said. We're using Spark on Mesos
> with spark.executor.uri where we download the pipeline's
> corresponding Spark version from HDFS. The documentation also says In Mesos
> coarse-grained mode, run $SPARK_HOME/sbin/start-mesos-shuffle-service.sh
> on all slave nodes. But how is it possible to launch it before start the
> application, if the given Spark will be downloaded to the Mesos executor
> after executor launch but it's looking for the started external shuffle
> service in advance?
>
> Is it possible I can't use spark.executor.uri and spark.dynamicAllocation.
> enabled together?
>
> Thanks in advance!
>
> Tamas
>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Please assist: Building Docker image containing spark 2.0

2016-08-26 Thread Michael Gummelt
Run with "-X -e" like the error message says. See what comes out.

On Fri, Aug 26, 2016 at 2:23 PM, Tal Grynbaum <tal.grynb...@gmail.com>
wrote:

> Did you specify -Dscala-2.10
> As in
> ./dev/change-scala-version.sh 2.10 ./build/mvn -Pyarn -Phadoop-2.4
> -Dscala-2.10 -DskipTests clean package
> If you're building with scala 2.10
>
> On Sat, Aug 27, 2016, 00:18 Marco Mistroni <mmistr...@gmail.com> wrote:
>
>> Hello Michael
>> uhm i celebrated too soon
>> Compilation of spark on docker image went near the end and then it
>> errored out with this message
>>
>> INFO] BUILD FAILURE
>> [INFO] 
>> 
>> [INFO] Total time: 01:01 h
>> [INFO] Finished at: 2016-08-26T21:12:25+00:00
>> [INFO] Final Memory: 69M/324M
>> [INFO] 
>> 
>> [ERROR] Failed to execute goal 
>> net.alchim31.maven:scala-maven-plugin:3.2.2:compile
>> (scala-compile-first) on project spark-mllib_2.11: Execution
>> scala-compile-first of goal 
>> net.alchim31.maven:scala-maven-plugin:3.2.2:compile
>> failed. CompileFailed -> [Help 1]
>> [ERROR]
>> [ERROR] To see the full stack trace of the errors, re-run Maven with the
>> -e switch.
>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>> [ERROR]
>> [ERROR] For more information about the errors and possible solutions,
>> please read the following articles:
>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/
>> PluginExecutionException
>> [ERROR]
>> [ERROR] After correcting the problems, you can resume the build with the
>> command
>> [ERROR]   mvn  -rf :spark-mllib_2.11
>> The command '/bin/sh -c ./build/mvn -Pyarn -Phadoop-2.4
>> -Dhadoop.version=2.4.0 -DskipTests clean package' returned a non-zero code:
>> 1
>>
>> what am i forgetting?
>> once again, last command i launched on the docker file is
>>
>>
>> RUN ./build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests
>> clean package
>>
>> kr
>>
>>
>>
>> On Fri, Aug 26, 2016 at 6:18 PM, Michael Gummelt <mgumm...@mesosphere.io>
>> wrote:
>>
>>> :)
>>>
>>> On Thu, Aug 25, 2016 at 2:29 PM, Marco Mistroni <mmistr...@gmail.com>
>>> wrote:
>>>
>>>> No i wont accept that :)
>>>> I can't believe i have wasted 3 hrs for a space!
>>>>
>>>> Many thanks MIchael!
>>>>
>>>> kr
>>>>
>>>> On Thu, Aug 25, 2016 at 10:01 PM, Michael Gummelt <
>>>> mgumm...@mesosphere.io> wrote:
>>>>
>>>>> You have a space between "build" and "mvn"
>>>>>
>>>>> On Thu, Aug 25, 2016 at 1:31 PM, Marco Mistroni <mmistr...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> HI all
>>>>>>  sorry for the partially off-topic, i hope there's someone on the
>>>>>> list who has tried the same and encountered similar issuse
>>>>>>
>>>>>> Ok so i have created a Docker file to build an ubuntu container which
>>>>>> inlcudes spark 2.0, but somehow when it gets to the point where it has to
>>>>>> kick off  ./build/mvn command, it errors out with the following
>>>>>>
>>>>>> ---> Running in 8c2aa6d59842
>>>>>> /bin/sh: 1: ./build: Permission denied
>>>>>> The command '/bin/sh -c ./build mvn -Pyarn -Phadoop-2.4
>>>>>> -Dhadoop.version=2.4.0 -DskipTests clean package' returned a non-zero 
>>>>>> code:
>>>>>> 126
>>>>>>
>>>>>> I am puzzled as i am root when i build the container, so i should not
>>>>>> encounter this issue (btw, if instead of running mvn from the build
>>>>>> directory  i use the mvn which i installed on the container, it works 
>>>>>> fine
>>>>>> but it's  painfully slow)
>>>>>>
>>>>>> here are the details of my Spark command( scala 2.10, java 1.7 , mvn
>>>>>> 3.3.9 and git have already been installed)
>>>>>>
>>>>>> # Spark
>>>>>> RUN echo "Installing Apache spark 2.0"
>>>>>> RUN git clone git://github.com/apache/spark.git
>>>>>> WORKDIR /spark
>>>>>> RUN ./build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0
>>>>>> -DskipTests clean package
>>>>>>
>>>>>>
>>>>>> Could anyone assist pls?
>>>>>>
>>>>>> kindest regarsd
>>>>>>  Marco
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Michael Gummelt
>>>>> Software Engineer
>>>>> Mesosphere
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Michael Gummelt
>>> Software Engineer
>>> Mesosphere
>>>
>>
>>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Please assist: Building Docker image containing spark 2.0

2016-08-26 Thread Michael Gummelt
:)

On Thu, Aug 25, 2016 at 2:29 PM, Marco Mistroni <mmistr...@gmail.com> wrote:

> No i wont accept that :)
> I can't believe i have wasted 3 hrs for a space!
>
> Many thanks MIchael!
>
> kr
>
> On Thu, Aug 25, 2016 at 10:01 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> You have a space between "build" and "mvn"
>>
>> On Thu, Aug 25, 2016 at 1:31 PM, Marco Mistroni <mmistr...@gmail.com>
>> wrote:
>>
>>> HI all
>>>  sorry for the partially off-topic, i hope there's someone on the list
>>> who has tried the same and encountered similar issuse
>>>
>>> Ok so i have created a Docker file to build an ubuntu container which
>>> inlcudes spark 2.0, but somehow when it gets to the point where it has to
>>> kick off  ./build/mvn command, it errors out with the following
>>>
>>> ---> Running in 8c2aa6d59842
>>> /bin/sh: 1: ./build: Permission denied
>>> The command '/bin/sh -c ./build mvn -Pyarn -Phadoop-2.4
>>> -Dhadoop.version=2.4.0 -DskipTests clean package' returned a non-zero code:
>>> 126
>>>
>>> I am puzzled as i am root when i build the container, so i should not
>>> encounter this issue (btw, if instead of running mvn from the build
>>> directory  i use the mvn which i installed on the container, it works fine
>>> but it's  painfully slow)
>>>
>>> here are the details of my Spark command( scala 2.10, java 1.7 , mvn
>>> 3.3.9 and git have already been installed)
>>>
>>> # Spark
>>> RUN echo "Installing Apache spark 2.0"
>>> RUN git clone git://github.com/apache/spark.git
>>> WORKDIR /spark
>>> RUN ./build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests
>>> clean package
>>>
>>>
>>> Could anyone assist pls?
>>>
>>> kindest regarsd
>>>  Marco
>>>
>>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: zookeeper mesos logging in spark

2016-08-26 Thread Michael Gummelt
These are the libmesos logs.  Maybe look here
http://mesos.apache.org/documentation/latest/logging/
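
If the goal is just to quiet the console: libmesos logs through glog, so setting its level in the driver's environment before spark-submit should suppress the "I0826 ... sched.cpp" lines (an assumption on my part; the ZOO_INFO lines come from the ZooKeeper C client and may need their own knob):

# glog levels: 0 = INFO, 1 = WARNING, 2 = ERROR, 3 = FATAL
export GLOG_minloglevel=1
spark-submit --master mesos://<master>:5050 ...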

On Fri, Aug 26, 2016 at 8:31 AM, aecc <alessandroa...@gmail.com> wrote:

> Hi,
>
> Everytime I run my spark application using mesos, I get logs in my console
> in the form:
>
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> I0826 15:25:30.949254 960752 sched.cpp:222] Version: 0.28.2
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@log_env
> 2016-08-26 15:25:30,949:960521(0x7f6bccff9700):ZOO_INFO@zookeep
> 2016-08-26 15:25:30,951:960521(0x7f6bb4ff9700):ZOO_INFO@check_e
> 2016-08-26 15:25:30,952:960521(0x7f6bb4ff9700):ZOO_INFO@check_e
> I0826 15:25:30.952505 960729 group.cpp:349] Group process (grou
> I0826 15:25:30.952570 960729 group.cpp:831] Syncing group opera
> I0826 15:25:30.952592 960729 group.cpp:427] Trying to create pa
> I0826 15:25:30.954211 960722 detector.cpp:152] Detected a new l
> I0826 15:25:30.954320 960744 group.cpp:700] Trying to get '/mes
> I0826 15:25:30.955345 960724 detector.cpp:479] A new leading ma
> I0826 15:25:30.955451 960724 sched.cpp:326] New master detected
> I0826 15:25:30.955567 960724 sched.cpp:336] No credentials prov
> I0826 15:25:30.956478 960732 sched.cpp:703] Framework registere
>
> Anybody know how to disable them through spark-submit ?
>
> Cheers and many thanks
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/zookeeper-mesos-logging-in-spark-tp27607.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere
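
A hedged follow-up sketch for reducing this output: libmesos logs through
glog, so lowering glog's log level via an environment variable before
launching the driver may quiet the INFO lines; whether it also silences the
embedded ZooKeeper client messages is not guaranteed, and the master URL and
jar below are placeholders:

  # 0 = INFO, 1 = WARNING, 2 = ERROR; glog reads this from the environment
  export GLOG_minloglevel=1
  spark-submit --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos my-app.jar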


Re: What do I lose if I run spark without using HDFS or Zookeeper?

2016-08-25 Thread Michael Gummelt
> You would lose the ability to process data closest to where it resides if
you do not use hdfs.

This isn't true.  Many other data sources (e.g. Cassandra) support locality.

On Thu, Aug 25, 2016 at 3:36 PM, ayan guha <guha.a...@gmail.com> wrote:

> At the core of it map reduce relies heavily on data locality. You would
> lose the ability to process data closest to where it resides if you do not
> use hdfs.
> S3 or NFS will not be able to provide that.
> On 26 Aug 2016 07:49, "kant kodali" <kanth...@gmail.com> wrote:
>
>> Yeah, so it seems like it's a work in progress. At the very least Mesos took
>> the initiative to provide alternatives to ZK. I am just really looking
>> forward to this.
>>
>> https://issues.apache.org/jira/browse/MESOS-3797
>>
>>
>>
>> On Thu, Aug 25, 2016 2:00 PM, Michael Gummelt mgumm...@mesosphere.io
>> wrote:
>>
>>> Mesos also uses ZK for leader election.  There seems to be some effort
>>> in supporting etcd, but it's in progress: https://issues.apache.org/jira
>>> /browse/MESOS-1806
>>>
>>> On Thu, Aug 25, 2016 at 1:55 PM, kant kodali <kanth...@gmail.com> wrote:
>>>
>>> @Ofir @Sean very good points.
>>>
>>> @Mike We don't use Kafka or Hive, and I understand that Zookeeper can do
>>> many things, but for our use case all we need is high availability, and
>>> given the frustrations of the DevOps people here in our company, who have
>>> extensive experience managing large clusters, we would be very happy to
>>> avoid Zookeeper. I also heard that Mesos can provide high availability
>>> through etcd and consul, and if that is true I will be left with the
>>> following stack:
>>>
>>> Spark + Mesos scheduler + distributed file system (or, to be precise,
>>> distributed storage, since S3 is an object store), so I guess this will be
>>> HDFS for us, plus etcd & consul. Now the big question for me is how do I
>>> set all this up.
>>>
>>>
>>>
>>> On Thu, Aug 25, 2016 1:35 PM, Ofir Manor ofir.ma...@equalum.io wrote:
>>>
>>> Just to add one concrete example regarding HDFS dependency.
>>> Have a look at checkpointing https://spark.ap
>>> ache.org/docs/1.6.2/streaming-programming-guide.html#checkpointing
>>> For example, for Spark Streaming, you can not do any window operation in
>>> a cluster without checkpointing to HDFS (or S3).
>>>
>>> Ofir Manor
>>>
>>> Co-Founder & CTO | Equalum
>>>
>>> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>>>
>>> On Thu, Aug 25, 2016 at 11:13 PM, Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>> Hi Kant,
>>>
>>> I trust the following would be of use.
>>>
>>> Big Data depends on the Hadoop ecosystem from whichever angle one looks at
>>> it.
>>>
>>> At the heart of it, and with reference to the points you raised about HDFS,
>>> one needs a working knowledge of the Hadoop core system, including HDFS,
>>> the MapReduce algorithm and YARN, whether one uses them or not. After all,
>>> Big Data is all about horizontal scaling with a master and nodes (as opposed
>>> to vertical scaling like SQL Server running on a host) and distributed data
>>> (by default data is replicated three times on different nodes for
>>> scalability and availability).
>>>
>>> Other members including Sean described the limits on how far one can operate
>>> Spark in its own space. If you are going to deal with data (data in motion
>>> and data at rest), then you will need to interact with some form of storage,
>>> and HDFS and compatible file systems like S3 are the natural choices.
>>>
>>> Zookeeper is not just about high availability. It is used in Spark
>>> Streaming with Kafka, it is also used with Hive for concurrency. It is also
>>> a distributed locking system.
>>>
>>> HTH
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn * 
>>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this em
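
To make the checkpointing point quoted above concrete, a minimal Spark
Streaming sketch in Scala; the checkpoint bucket, socket source and batch
interval are placeholders, the only requirement being that the checkpoint
directory live on fault-tolerant storage such as HDFS or S3:

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val conf = new SparkConf().setAppName("windowed-counts")
  val ssc = new StreamingContext(conf, Seconds(10))
  // Window operations require a checkpoint directory on HDFS/S3-like storage
  ssc.checkpoint("s3a://my-bucket/spark/checkpoints")

  val words = ssc.socketTextStream("localhost", 9999)
  val counts = words.map(w => (w, 1L))
    .reduceByKeyAndWindow((a: Long, b: Long) => a + b, Seconds(60), Seconds(10))
  counts.print()

  ssc.start()
  ssc.awaitTermination()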

Re: Please assist: Building Docker image containing spark 2.0

2016-08-25 Thread Michael Gummelt
You have a space between "build" and "mvn"

On Thu, Aug 25, 2016 at 1:31 PM, Marco Mistroni <mmistr...@gmail.com> wrote:

> Hi all
>  sorry for the partially off-topic question, I hope there's someone on the
> list who has tried the same and encountered similar issues
>
> Ok so I have created a Docker file to build an Ubuntu container which
> includes Spark 2.0, but somehow when it gets to the point where it has to
> kick off  ./build/mvn command, it errors out with the following
>
> ---> Running in 8c2aa6d59842
> /bin/sh: 1: ./build: Permission denied
> The command '/bin/sh -c ./build mvn -Pyarn -Phadoop-2.4
> -Dhadoop.version=2.4.0 -DskipTests clean package' returned a non-zero code:
> 126
>
> I am puzzled as I am root when I build the container, so I should not
> encounter this issue (btw, if instead of running mvn from the build
> directory I use the mvn which I installed on the container, it works fine
> but it's painfully slow)
>
> here are the details of my Spark command( scala 2.10, java 1.7 , mvn 3.3.9
> and git have already been installed)
>
> # Spark
> RUN echo "Installing Apache spark 2.0"
> RUN git clone git://github.com/apache/spark.git
> WORKDIR /spark
> RUN ./build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests
> clean package
>
>
> Could anyone assist pls?
>
> kindest regards
>  Marco
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: What do I lose if I run spark without using HDFS or Zookeeper?

2016-08-25 Thread Michael Gummelt
e driver and send it out to the workers, and collect data back
>> from the workers. You can't read or write data in a distributed way. There
>> are use cases for this, but pretty limited (unless you're running on 1
>> machine).
>>
>> I can't really imagine a serious use of (distributed) Spark without
>> (distributed) storage; in a way, I don't think many apps exist that don't
>> read/write data.
>>
>> The premise here is not just replication, but partitioning data across
>> compute resources. With a distributed file system, your big input exists
>> across a bunch of machines and you can send the work to the pieces of data.
>>
>> On Thu, Aug 25, 2016 at 7:57 PM, kant kodali <kanth...@gmail.com> wrote:
>>
>> @Mich I understand why I would need Zookeeper. It is there for fault
>> tolerance: given that Spark is a master-slave architecture, when a master
>> goes down Zookeeper will run a leader election algorithm to elect a new
>> leader. However, DevOps hate Zookeeper; they would be much happier to go
>> with etcd & consul, and it looks like with the Mesos scheduler we should be
>> able to drop Zookeeper.
>>
>> As for HDFS, I am still trying to understand why I would need it for Spark.
>> I understand the purpose of distributed file systems in general, but I don't
>> understand it in the context of Spark, since many people say you can run a
>> distributed Spark cluster in standalone mode, but I am not sure what the
>> pros/cons are if we do it that way. In the Hadoop world I understand that
>> one of the reasons HDFS is there is replication; in other words, if we write
>> some data to HDFS it will store that block across different nodes such that
>> if one of the nodes goes down it can still retrieve that block from other
>> nodes. In the context of Spark I am not really sure, because 1) I am new and
>> 2) the Spark paper says it doesn't replicate data; instead it stores the
>> lineage (all the transformations) such that it can reconstruct it.
>>
>>
>>
>>
>>
>>
>> On Thu, Aug 25, 2016 9:18 AM, Mich Talebzadeh mich.talebza...@gmail.com
>> wrote:
>>
>> You can use Spark on Oracle as a query tool.
>>
>> It all depends on the mode of the operation.
>>
>> If you are running Spark with yarn-client/cluster then you will need YARN.
>> It comes as part of Hadoop core (HDFS, Map-reduce and Yarn).
>>
>> I have not gone and installed Yarn without installing Hadoop.
>>
>> What is the overriding reason to have Spark on its own?
>>
>>  You can use Spark in Local or Standalone mode if you do not want Hadoop
>> core.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>> On 24 August 2016 at 21:54, kant kodali <kanth...@gmail.com> wrote:
>>
>> What do I lose if I run Spark without using HDFS or Zookeeper? Which of
>> them is almost a must in practice?
>>
>>
>>
>>
>>
>>
>>
>>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: 2.0.1/2.1.x release dates

2016-08-19 Thread Michael Gummelt
Adrian,

We haven't had any reports of hangs on Mesos in 2.0, so it's likely that if
you wait until the release, your problem still won't be solved unless you
file a bug.  Can you create a JIRA so we can look into it?

On Thu, Aug 18, 2016 at 2:40 AM, Sean Owen <so...@cloudera.com> wrote:

> Historically, minor releases happen every ~4 months, and maintenance
> releases are a bit ad hoc but come about a month after the minor
> release. It's up to the release manager to decide to do them but maybe
> realistic to expect 2.0.1 in early September.
>
> On Thu, Aug 18, 2016 at 10:35 AM, Adrian Bridgett <adr...@opensignal.com>
> wrote:
> > Just wondering if there were any rumoured release dates for either of the
> > above.  I'm seeing some odd hangs with 2.0.0 and mesos (and I know that
> the
> > mesos integration has had a bit of updating in 2.1.x).   Looking at JIRA,
> > there's no suggested release date and issues seem to be added to a
> release
> > version once resolved so the usual trick of looking at the
> > resolved/unresolved ratio isn't helping :-)  The wiki only mentions
> 2.0.0 so
> > no joy there either.
> >
> > Still doing testing but then I don't want to test with 2.1.x if it's
> going
> > to be under heavy development for a while longer.
> >
> > Thanks for any info,
> >
> > Adrian
> >
> > -
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Attempting to accept an unknown offer

2016-08-19 Thread Michael Gummelt
ched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910493
>>>>>
>>>>> W0816 23:17:01.985124 16360 sched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910494
>>>>>
>>>>> W0816 23:17:01.985339 16360 sched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910495
>>>>>
>>>>> W0816 23:17:01.985508 16360 sched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910496
>>>>>
>>>>> W0816 23:17:01.985651 16360 sched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910497
>>>>>
>>>>> W0816 23:17:01.985801 16360 sched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910498
>>>>>
>>>>> W0816 23:17:01.985961 16360 sched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910499
>>>>>
>>>>> W0816 23:17:01.986121 16360 sched.cpp:1195] Attempting to accept an
>>>>> unknown offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910500
>>>>>
>>>>> 2016-08-16 23:18:41,877:16226(0x7f71271b6
>>>>> 700):ZOO_WARN@zookeeper_interest@1557: Exceeded deadline by 13ms
>>>>>
>>>>> 2016-08-16 23:21:12,007:16226(0x7f71271b6
>>>>> 700):ZOO_WARN@zookeeper_interest@1557: Exceeded deadline by 11ms
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: mesos or kubernetes ?

2016-08-13 Thread Michael Gummelt
DC/OS Spark *is* Apache Spark on Mesos, along with some packaging that
makes it easy to install and manage on DC/OS.

For example:

$ dcos package install spark
$ dcos spark run --submit-args="--class SparkPi ..."

The single-command install runs the cluster dispatcher and the
history server in your cluster via Marathon, so it's HA.
It provides a local CLI that your end users can use to submit jobs.
And it's integrated with other DC/OS packages like HDFS.

It sort of does for Spark what e.g. CDH does for Hadoop.

On Sat, Aug 13, 2016 at 1:35 PM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> I'm wondering why not DC/OS (with Mesos)?
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Sat, Aug 13, 2016 at 11:24 AM, guyoh <g12...@gmail.com> wrote:
> > My company is trying to decide whether to use kubernetes or mesos. Since we
> > are planning to use Spark in the near future, I was wondering what the best
> > choice is for us.
> > Thanks,
> > Guy
> >
> >
> >
> > --
> > View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/mesos-or-kubernetes-tp27530.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >
> > -
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: mesos or kubernetes ?

2016-08-13 Thread Michael Gummelt
Spark has a first-class scheduler for Mesos, whereas it doesn't for
Kubernetes.  Running Spark on Kubernetes means running Spark in standalone
mode, wrapped in a Kubernetes service:
https://github.com/kubernetes/kubernetes/tree/master/examples/spark

So you're effectively comparing standalone vs. Mesos.  For basic purposes,
standalone works fine.  Mesos adds support for things like docker images,
security, resource reservations via roles, targeting specific nodes via
attributes, etc.

The main benefit of Mesos, however, is that you can share the same
infrastructure with other, non-Spark services.  We have users, for example,
running Spark on the same cluster as HDFS, Cassandra, Kafka, web apps,
Jenkins, etc.  You can do this with Kubernetes to some extent, but running
in standalone means that the Spark "partition" isn't elastic.  You must
statically partition to exclusively run Spark.

On Sat, Aug 13, 2016 at 11:24 AM, guyoh <g12...@gmail.com> wrote:

> My company is trying to decide whether to use kubernetes or mesos. Since we
> are planning to use Spark in the near future, I was wondering what the best
> choice is for us.
> Thanks,
> Guy
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/mesos-or-kubernetes-tp27530.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Spark on mesos in docker not getting parameters

2016-08-09 Thread Michael Gummelt
> However, they are missing in subsequent child processes and the final java
process started doesn't contain them either.

I don't see any evidence of this in your process list.  `launcher.Main` is
not the final java process.  `launcher.Main` prints a java command, which
`spark-class` then runs.  That command is the final java process.
`launcher.Main` should take the contents of SPARK_EXECUTOR_OPTS and include
those opts in the command which it prints out.

If you could include the process listing for that final command, and you
observe it doesn't contain the aws system properties from
SPARK_EXECUTOR_OPTS, then I would see something wrong.

On Tue, Aug 9, 2016 at 10:13 AM, Jim Carroll <jimfcarr...@gmail.com> wrote:

> I'm running spark 2.0.0 on Mesos using spark.mesos.executor.docker.image
> to
> point to a docker container that I built with the Spark installation.
>
> Everything is working except that the Spark client process that's started
> inside the container doesn't get any of the parameters I set in the Spark
> config in the driver.
>
> I set spark.executor.extraJavaOptions and spark.executor.extraClassPath in
> the driver and they don't get passed all the way through. Here is a capture
> of the chain of processes that are started on the mesos slave, in the
> docker
> container:
>
> root  1064  1051  0 12:46 ?00:00:00 docker -H
> unix:///var/run/docker.sock run --cpu-shares 8192 --memory 4723834880 -e
> SPARK_CLASSPATH=[path to my jar] -e SPARK_EXECUTOR_OPTS=
> -Daws.accessKeyId=[myid] -Daws.secretKey=[mykey] -e SPARK_USER=root -e
> SPARK_EXECUTOR_MEMORY=4096m -e MESOS_SANDBOX=/mnt/mesos/sandbox -e
> MESOS_CONTAINER_NAME=mesos-90e2c720-1e45-4dbc-8271-
> f0c47a33032a-S0.772f8080-6278-4a35-9e57-0009787ac605
> -v
> /tmp/mesos/slaves/90e2c720-1e45-4dbc-8271-f0c47a33032a-
> S0/frameworks/f5794f8a-b56f-4958-b906-f05c426dcef0-0001/
> executors/0/runs/772f8080-6278-4a35-9e57-0009787ac605:/mnt/mesos/sandbox
> --net host --entrypoint /bin/sh --name
> mesos-90e2c720-1e45-4dbc-8271-f0c47a33032a-S0.772f8080-6278-
> 4a35-9e57-0009787ac605
> [my docker image] -c  "/opt/spark/./bin/spark-class"
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
> spark://CoarseGrainedScheduler@192.168.10.145:46121 --executor-id 0
> --hostname 192.168.10.145 --cores 8 --app-id
> f5794f8a-b56f-4958-b906-f05c426dcef0-0001
>
> root  1193  1175  0 12:46 ?00:00:00 /bin/sh -c
> "/opt/spark/./bin/spark-class"
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
> spark://CoarseGrainedScheduler@192.168.10.145:46121 --executor-id 0
> --hostname 192.168.10.145 --cores 8 --app-id
> f5794f8a-b56f-4958-b906-f05c426dcef0-0001
>
> root  1208  1193  0 12:46 ?00:00:00 bash
> /opt/spark/./bin/spark-class
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
> spark://CoarseGrainedScheduler@192.168.10.145:46121 --executor-id 0
> --hostname 192.168.10.145 --cores 8 --app-id
> f5794f8a-b56f-4958-b906-f05c426dcef0-0001
>
> root  1213  1208  0 12:46 ?00:00:00 bash
> /opt/spark/./bin/spark-class
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
> spark://CoarseGrainedScheduler@192.168.10.145:46121 --executor-id 0
> --hostname 192.168.10.145 --cores 8 --app-id
> f5794f8a-b56f-4958-b906-f05c426dcef0-0001
>
> root  1215  1213  0 12:46 ?00:00:00
> /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Xmx128m -cp /opt/spark/jars/*
> org.apache.spark.launcher.Main
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
> spark://CoarseGrainedScheduler@192.168.10.145:46121 --executor-id 0
> --hostname 192.168.10.145 --cores 8 --app-id
> f5794f8a-b56f-4958-b906-f05c426dcef0-0001
>
> Notice, in the initial process started by mesos both the SPARK_CLASSPATH is
> set to the value of spark.executor.extraClassPath and the -D options are
> set
> as I set them on spark.executor.extraJavaOptions (in this case, to my aws
> creds) in the drive configuration.
>
> However, they are missing in subsequent child processes and the final java
> process started doesn't contain them either.
>
> I "fixed" the classpath problem by putting my jar in /opt/spark/jars
> (/opt/spark is the location I have spark installed in the docker
> container).
>
> Can someone tell me what I'm missing?
>
> Thanks
> Jim
>
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/Spark-on-mesos-in-docker-not-
> getting-parameters-tp27500.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere
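
For completeness, a hedged driver-side configuration sketch of the settings
being discussed; the master URL, image name, credentials and classpath are
placeholders, and the extraClassPath entry has to exist inside the executor
image:

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setMaster("mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos")
    .setAppName("docker-executor-example")
    .set("spark.mesos.executor.docker.image", "my-registry/my-spark:latest")
    // These surface as SPARK_EXECUTOR_OPTS / SPARK_CLASSPATH in the container,
    // which launcher.Main folds into the final executor java command.
    .set("spark.executor.extraJavaOptions",
         "-Daws.accessKeyId=<id> -Daws.secretKey=<key>")
    .set("spark.executor.extraClassPath", "/opt/myapp/myapp.jar")
  val sc = new SparkContext(conf)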


Re: Spark Job Doesn't End on Mesos

2016-08-09 Thread Michael Gummelt
Is this a new issue?
What version of Spark?
What version of Mesos/libmesos?
Can you run the job with debug logging turned on and attach the output?
Do you see the corresponding message in the mesos master that indicates it
received the teardown?

On Tue, Aug 9, 2016 at 1:28 AM, Todd Leo <todd.f@gmail.com> wrote:

> Hi,
>
> I’m running Spark jobs on Mesos. When the job finishes, *SparkContext* is
> manually closed by sc.stop(). Then Mesos log shows:
>
> I0809 15:48:34.132014 11020 sched.cpp:1589] Asked to stop the driver
> I0809 15:48:34.132181 11277 sched.cpp:831] Stopping framework 
> '20160808-170425-2365980426-5050-4372-0034'
>
> However, the process still doesn't quit. This is critical, because I'd
> like to use SparkLauncher to submit such jobs. If my jobs don't end, they
> will pile up and fill up the memory. Please help. :-|
>
> —
> BR,
> Todd Leo
> ​
>



-- 
Michael Gummelt
Software Engineer
Mesosphere
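
Since SparkLauncher is the intended submission path here, a hedged sketch of
tracking job completion through SparkAppHandle rather than waiting on the
child process to exit; the jar, class and master are placeholders:

  import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

  val handle = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MyJob")
    .setMaster("mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos")
    .startApplication(new SparkAppHandle.Listener {
      override def stateChanged(h: SparkAppHandle): Unit = {
        // isFinal covers FINISHED, FAILED, KILLED and LOST
        if (h.getState.isFinal) println(s"Application ended in state ${h.getState}")
      }
      override def infoChanged(h: SparkAppHandle): Unit = ()
    })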


Re: standalone mode only supports FIFO scheduler across applications ? still in spark 2.0 time ?

2016-08-03 Thread Michael Gummelt
DC/OS was designed to reduce the operational cost of maintaining a cluster,
and DC/OS Spark runs well on it.

On Sat, Jul 16, 2016 at 11:11 AM, Teng Qiu <teng...@gmail.com> wrote:

> Hi Mark, thanks. We just want to keep our system as simple as
> possible: using YARN means we need to maintain a full-size Hadoop
> cluster. We are using S3 as the storage layer, so HDFS is not needed and a
> Hadoop cluster is a little bit overkill. Mesos is an option, but
> still, it brings extra operational costs.
>
> So... any suggestion from you?
>
> Thanks
>
>
> 2016-07-15 18:51 GMT+02:00 Mark Hamstra <m...@clearstorydata.com>:
> > Nothing has changed in that regard, nor is there likely to be "progress",
> > since more sophisticated or capable resource scheduling at the
> Application
> > level is really beyond the design goals for standalone mode.  If you want
> > more in the way of multi-Application resource scheduling, then you
> should be
> > looking at Yarn or Mesos.  Is there some reason why neither of those
> options
> > can work for you?
> >
> > On Fri, Jul 15, 2016 at 9:15 AM, Teng Qiu <teng...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >>
> >>
> http://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/spark-standalone.html#resource-scheduling
> >> The standalone cluster mode currently only supports a simple FIFO
> >> scheduler across applications.
> >>
> >> Is this sentence still true? Any progress on this? It would be really
> >> helpful. Is there some roadmap?
> >>
> >> Thanks
> >>
> >> Teng
> >>
> >> ---------
> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >>
> >
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere
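
For what it's worth, even under the FIFO policy the standalone docs linked
above allow capping each application so that several can run side by side; a
hedged sketch, with placeholder master, sizes and class names:

  spark-submit \
    --master spark://master-host:7077 \
    --conf spark.cores.max=8 \
    --conf spark.executor.memory=4g \
    --class com.example.MyJob my-app.jar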


Re: Executors assigned to STS and number of workers in Stand Alone Mode

2016-08-03 Thread Michael Gummelt
> but Spark on Mesos is certainly lagging behind Spark on YARN regarding
the features Spark uses off the scheduler backends -- security, data
locality, queues, etc.

If by security you mean Kerberos, we'll be upstreaming that to Apache Spark
soon.  It's been in DC/OS Spark for a while:
https://github.com/mesosphere/spark/commit/73ba2ab8d97510d5475ef9a48c673ce34f7173fa

Locality is implemented in a scheduler-independent way:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L327,
but it is possible that the offer model could result in different
placement.  I haven't seen any analysis to that effect.

YARN queues are very similar to Mesos quota and roles, which Spark
supports.  We'll also be adding support for revocable resources sometime
soon, which solves the head-of-line (HoL) blocking problem, where one Spark
app eats up your cluster while others wait.  I don't think YARN has a solution
for this, but I could be wrong.

So, yea, there are some differences, but I think the biggest feature gap
right now is really just Kerberos, which will be added soon.

There are also other Mesos-specific features we'll be adding soon, such as
GPU support, CNI, and virtual networking, but the biggest advantage of running
on Mesos is that you can run multi-tenant alongside other Mesos frameworks.








On Mon, Jul 25, 2016 at 2:04 PM, Jacek Laskowski <ja...@japila.pl> wrote:

> On Mon, Jul 25, 2016 at 10:57 PM, Mich Talebzadeh
> <mich.talebza...@gmail.com> wrote:
>
> > Yarn promises the best resource management I believe. Having said that I
> have not used Mesos myself.
>
> I'm glad you've mentioned it.
>
> I think Cloudera (and Hortonworks?) guys are doing a great job with
> bringing all the features of YARN to Spark and I think Spark on YARN
> shines features-wise.
>
> I'm not in a position to compare YARN vs Mesos for their resource
> management, but Spark on Mesos is certainly lagging behind Spark on
> YARN regarding the features Spark uses off the scheduler backends --
> security, data locality, queues, etc. (or I might be simply biased
> after having spent months with Spark on YARN mostly?).
>
> Jacek
>
> ---------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere
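
As a small illustration of the roles/quota point above, a hedged
spark-submit sketch that registers the framework under a Mesos role; the
role name, master and resource numbers are placeholders, and the role must
already be configured on the Mesos master:

  spark-submit \
    --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos \
    --conf spark.mesos.role=analytics \
    --conf spark.cores.max=16 \
    --class com.example.MyJob my-app.jar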


Re: how to use spark.mesos.constraints

2016-08-03 Thread Michael Gummelt
If you run your jobs with debug logging on in Mesos, it should print why
the offer is being declined:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala#L301

On Tue, Jul 26, 2016 at 6:38 PM, Rodrick Brown <rodr...@orchardplatform.com>
wrote:

> The shuffle service has nothing to do with constraints; it is, however,
> advised to run the mesos-shuffle-service on each of your agent nodes running
> Spark.
>
> Here is the command I use to run a typical Spark job on my cluster using
> constraints (this is generated by another script we run, but it should
> give you a clear idea).
>
> Jobs not being accepted by any resources could mean what you're asking for
> is way larger than the resources you have available.
>
> /usr/bin/timeout 3600 /opt/spark-1.6.1/bin/spark-submit
> --master "mesos://zk://prod-zk-1:2181,prod-zk-2:2181,prod-zk-3:2181/mesos"
> --conf spark.ui.port=40046
> --conf spark.mesos.coarse=true
> --conf spark.sql.broadcastTimeout=3600
> --conf spark.cores.max=5
> --conf spark.mesos.constraints="rack:spark"
> --conf spark.sql.tungsten.enabled=true
> --conf spark.shuffle.service.enabled=true
> --conf spark.dynamicAllocation.enabled=true
> --conf spark.mesos.executor.memoryOverhead=3211
> --class
> com.orchard.dataloader.library.originators..LoadAccountDetail_LC
> --total-executor-cores 5
> --driver-memory 5734M
> --executor-memory 8028M
> --jars /data/orchard/etc/config/load-accountdetail-accumulo-prod.jar
> /data/orchard/jars/dataloader-library-assembled.jar 1
>
> Nodes used for my spark jobs are all using the constraint 'rack:spark'
>
> I hope this helps!
>
>
> On Tue, Jul 26, 2016 at 7:10 PM, Jia Yu <jiayu198...@gmail.com> wrote:
>
>> Hi,
>>
>> I am also trying to use spark.mesos.constraints but it gives me the
>> same error: the job has not been accepted by any resources.
>>
>> I suspect that I should start some additional service like
>> ./sbin/start-mesos-shuffle-service.sh. Am I correct?
>>
>> Thanks,
>> Jia
>>
>> On Tue, Dec 1, 2015 at 5:14 PM, rarediel <bryce.ag...@gettyimages.com>
>> wrote:
>>
>>> I am trying to add Mesos constraints to my spark-submit command in my
>>> marathon file; I am also setting spark.mesos.coarse=true.
>>>
>>> Here is an example of a constraint I am trying to set:
>>>
>>>  --conf spark.mesos.constraint=cpus:2
>>>
>>> I want to use the constraints to control the number of executors created
>>> so I can control the total memory of my spark job.
>>>
>>> I've tried many variations of resource constraints, but no matter which
>>> resource or what number, range, etc. I use, I always get the error "Initial
>>> job has not accepted any resources; check your cluster UI...".  My cluster
>>> has the available resources.  Are there any examples I can look at where
>>> people use resource constraints?
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/how-to-use-spark-mesos-constraints-tp25541.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>>
>>
>
>
> --
>
> [image: Orchard Platform] <http://www.orchardplatform.com/>
>
> *Rodrick Brown */ *DevOPs*
>
> 9174456839 / rodr...@orchardplatform.com
>
> Orchard Platform
> 101 5th Avenue, 4th Floor, New York, NY
>
> *NOTICE TO RECIPIENTS*: This communication is confidential and intended
> for the use of the addressee only. If you are not an intended recipient of
> this communication, please delete it immediately and notify the sender by
> return email. Unauthorized reading, dissemination, distribution or copying
> of this communication is prohibited. This communication does not constitute
> an offer to sell or a solicitation of an indication of interest to purchase
> any loan, security or any other financial product or instrument, nor is it
> an offer to sell or a solicitation of an indication of interest to purchase
> any products or services to any persons who are prohibited from receiving
> such information under applicable law. The contents of this communication
> may not be accurate or complete and are subject to change without notice.
> As such, Orchard App, Inc. (including its subsidiaries and affiliates,
> "Orchard") makes no representation regarding the accuracy or completeness
> of the information contained herein. The intended recipient is advised to
> consult its own professional advisors, including those specializing in
> legal, tax and accounting matters. Orchard does not provide legal, tax or
> accounting advice.
>



-- 
Michael Gummelt
Software Engineer
Mesosphere
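
To act on the debug-logging suggestion at the top of this thread, a hedged
log4j.properties sketch that enables DEBUG only for the Mesos scheduler
backend package referenced above, so the reasons for declined offers show up
without flooding the rest of the log; ship it to the driver via
--driver-java-options "-Dlog4j.configuration=file:log4j.properties" or place
it on the driver classpath:

  # Keep everything else at the usual level
  log4j.rootCategory=INFO, console
  log4j.appender.console=org.apache.log4j.ConsoleAppender
  log4j.appender.console.target=System.err
  log4j.appender.console.layout=org.apache.log4j.PatternLayout
  log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
  # DEBUG for MesosCoarseGrainedSchedulerBackend and friends
  log4j.logger.org.apache.spark.scheduler.cluster.mesos=DEBUG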