Linear Regression with SGD

2015-06-09 Thread Stephen Carman
Hi User group,

We are using spark Linear Regression with SGD as the optimization technique and 
we are achieving very sub-optimal results.

Can anyone shed some light on why this implementation seems to produce such 
poor results vs our own implementation?

We are using a very small dataset, but we have to use a very large number of 
iterations to achieve results similar to our implementation. We’ve tried 
normalizing the data, not normalizing the data, and tuning every parameter. Our 
implementation is a closed-form solution, so we are guaranteed convergence, 
while the Spark one is not; that is understandable, but why is it so far off?
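
For concreteness, here is a minimal sketch of the kind of setup we have been 
trying (this is not our actual code; the fitSgd name, the iteration count, and 
the step size are just illustrative, and the feature scaling step is one of the 
things we have toggled):

import org.apache.spark.mllib.feature.StandardScaler
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}
import org.apache.spark.rdd.RDD

// data is whatever RDD[LabeledPoint] we load elsewhere
def fitSgd(data: RDD[LabeledPoint]) = {
  // SGD is very sensitive to feature scale and step size, so standardize first
  // (withMean = true assumes dense feature vectors)
  val scaler = new StandardScaler(withMean = true, withStd = true).fit(data.map(_.features))
  val scaled = data.map(p => LabeledPoint(p.label, scaler.transform(p.features))).cache()

  val lr = new LinearRegressionWithSGD()
  lr.setIntercept(true)
  lr.optimizer
    .setNumIterations(1000) // illustrative; we have had to push this far higher than expected
    .setStepSize(0.1)       // illustrative
  lr.run(scaled)
}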

Has anyone experienced this?

Steve Carman, M.S.
Artificial Intelligence Engineer
Coldlight-PTC
scar...@coldlight.com


Where does partitioning and data loading happen?

2015-05-27 Thread Stephen Carman
A colleague and I were having a discussion and we were disagreeing about 
something in Spark/Mesos that perhaps someone can shed some light on.

We have a Mesos cluster that runs Spark via a sparkHome, rather than downloading 
an executor package and such.

My colleague says that if we have parquet files in S3, the slaves should know 
what data is in their partition and pull from S3 only the partitions of parquet 
data they need, but this seems inherently wrong to me, as I have no idea how 
Spark or Mesos could know, on the slave, which partitions to pull. It makes much 
more sense to me for the partitioning to be decided on the driver and then 
distributed to the slaves, so the slaves don’t have to worry about these 
details. If that were the case, there would be some data loading done on the 
driver, correct? Or does Spark/Mesos do some magic to pass a reference so the 
slaves know what to pull, per se?

So I guess in summation, where does partitioning and data loading happen? On 
the driver or on the executor?
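
For concreteness, a rough sketch of the kind of job I mean (the S3 path and 
names are made up, and sqlContext is the usual SQLContext created on the 
driver):

// Driver side: define the DataFrame over the parquet data.
val df = sqlContext.parquetFile("s3n://some-bucket/some-table/")

// The closure below runs as tasks on the slaves; each task only ever sees
// the rows of its own partition.
val rowsPerPartition = df.rdd
  .mapPartitionsWithIndex { (idx, rows) => Iterator((idx, rows.size)) }
  .collect()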

Thanks,
Steve


RE: swap tuple

2015-05-14 Thread Stephen Carman
Yeah, I wouldn't try to modify the current one, since RDDs are supposed to be 
immutable; just create a new one...

val newRdd = oldRdd.map(r => (r._2, r._1))

or something of that nature...

Steve

From: Evo Eftimov [evo.efti...@isecc.com]
Sent: Thursday, May 14, 2015 1:24 PM
To: 'Holden Karau'; 'Yasemin Kaya'
Cc: user@spark.apache.org
Subject: RE: swap tuple

Where is the “Tuple” supposed to be in <String, String> - you can refer to a 
“Tuple” if it was e.g. <String, Tuple2<String, String>>

From: holden.ka...@gmail.com [mailto:holden.ka...@gmail.com] On Behalf Of 
Holden Karau
Sent: Thursday, May 14, 2015 5:56 PM
To: Yasemin Kaya
Cc: user@spark.apache.org
Subject: Re: swap tuple

Can you paste your code? transformations return a new RDD rather than modifying 
an existing one, so if you were to swap the values of the tuple using a map you 
would get back a new RDD and then you would want to try and print this new RDD 
instead of the original one.
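
For instance, a quick Scala sketch (sc is an existing SparkContext and the data 
is made up; the same idea applies to the Java API):

import org.apache.spark.rdd.RDD

val pairs: RDD[(String, String)] = sc.parallelize(Seq(("a", "1"), ("b", "2")))

// This line alone changes nothing you can observe: the swapped RDD is
// returned and then discarded, while pairs stays exactly as it was.
pairs.map(_.swap)

// Keep the new RDD that the transformation returns and print that instead.
val swapped = pairs.map(_.swap)
swapped.collect().foreach(println)   // (1,a) and (2,b)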

On Thursday, May 14, 2015, Yasemin Kaya godo...@gmail.com wrote:
Hi,

I have a JavaPairRDD<String, String> and I want to swap tuple._1() and 
tuple._2(). I use tuple.swap(), but it doesn't actually change the JavaPairRDD; 
when I print the JavaPairRDD, the values are the same.

Can anyone help me with that?

Thank you.
Have nice day.

yasemin

--
hiç ender hiç


--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
Linked In: https://www.linkedin.com/in/holdenkarau



Re: Spark on Mesos

2015-05-13 Thread Stephen Carman
Sander,

I eventually solved this problem via the --[no-]switch_user flag, which is set 
to true by default. I set it to false, which has the user that owns the process 
run the job; otherwise it was my username (scarman) running the job, which would 
fail because my username obviously didn’t exist there. When run as root, it ran 
totally fine with no problems whatsoever.

Hopefully this works for you too,

Steve
 On May 13, 2015, at 11:45 AM, Sander van Dijk sgvand...@gmail.com wrote:

 Hey all,

 I seem to be experiencing the same thing as Stephen. I run Spark 1.2.1 with 
 Mesos 0.22.1, with Spark coming from the spark-1.2.1-bin-hadoop2.4.tgz 
 prebuilt package, and Mesos installed from the Mesosphere repositories. I 
 have been running with Spark standalone successfully for a while and am now 
 trying to set up Mesos. Mesos is up and running, the UI at port 5050 reports 
 all slaves alive. I then run Spark shell with: `spark-shell --master 
 mesos://1.1.1.1:5050` (with 1.1.1.1 the master's ip address), which starts up 
 fine, with output:

 I0513 15:02:45.340287 28804 sched.cpp:448] Framework registered with 
 20150512-150459-2618695596-5050-3956-0009 15/05/13 15:02:45 INFO 
 mesos.MesosSchedulerBackend: Registered as framework ID 
 20150512-150459-2618695596-5050-3956-0009

 and the framework shows up in the Mesos UI. Then, when I try to run something 
 (e.g. 'val rdd = sc.textFile(path); rdd.count'), it fails with lost executors. 
 In /var/log/mesos-slave.ERROR on the slave instances there are entries like:

 E0513 14:57:01.198995 13077 slave.cpp:3112] Container 
 'eaf33d36-dde5-498a-9ef1-70138810a38c' for executor 
 '20150512-145720-2618695596-5050-3082-S10' of framework 
 '20150512-150459-2618695596-5050-3956-0009' failed to start: Failed to 
 execute mesos-fetcher: Failed to chown work directory

 From what I can find, the work directory is in /tmp/mesos, where indeed I see 
 a directory structure with executor and framework IDs, with stdout and stderr 
 files of size 0 at the leaves. Everything there is owned by root, but I 
 assume the processes are also run by root, so any chowning in there should be 
 possible.

 I was thinking maybe it fails to fetch the Spark executor package? I uploaded 
 spark-1.2.1-bin-hadoop2.4.tgz to hdfs, SPARK_EXECUTOR_URI is set in 
 spark-env.sh, and in the Environment section of the web UI I see this picked 
 up in the spark.executor.uri parameter. I checked and the URI is reachable by 
 the slaves: an `hdfs dfs -stat $SPARK_EXECUTOR_URI` is successful.

 Any pointers?

 Many thanks,
 Sander

 On Fri, May 1, 2015 at 8:35 AM Tim Chen t...@mesosphere.io wrote:
 Hi Stephen,

 It looks like the Mesos slave was most likely not able to launch some Mesos 
 helper processes (probably the fetcher?).

 How did you install Mesos? Did you build from source yourself?

 Please install Mesos through a package, or if you build from source, actually 
 run make install and run from the installed binary.

 Tim


Re: Spark on Mesos

2015-05-13 Thread Stephen Carman
Yup, exactly as Tim mentioned on it too. I went back and tried what you just 
suggested and that was also perfectly fine.

Steve

On May 13, 2015, at 1:58 PM, Tim Chen t...@mesosphere.io wrote:

Hi Stephen,

You probably didn't run the Spark driver/shell as root; the Mesos scheduler will 
pick up your local user, try to impersonate that same user, and chown the 
directory before executing any task.

If you run the Spark driver as root it should resolve the problem. No switch 
user can also work, as it won't try to switch user for you.

Tim


Re: Spark on Mesos

2015-04-27 Thread Stephen Carman
So I installed Spark on each of the slaves: 1.3.1 built with Hadoop 2.6. I just 
basically got the pre-built package from the Spark website…

I placed those compiled spark installs on each slave at /opt/spark

My spark properties seem to be getting picked up on my side fine…

[Screen Shot 2015-04-27 at 10.30.01 AM.png]
The framework is registered in Mesos and it shows up just fine; it doesn’t 
matter whether I turn the executor URI off or not, I always get the same error…

org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in 
stage 0.0 failed 4 times, most recent failure: Lost task 6.3 in stage 0.0 (TID 
23, 10.253.1.117): ExecutorLostFailure (executor 
20150424-104711-1375862026-5050-20113-S1 lost)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

These boxes are totally open to one another, so they shouldn’t have any firewall 
issues. Everything seems to show up in Mesos and Spark just fine, but actually 
running stuff totally blows up.

There is nothing in stderr or stdout; it downloads the package and untars it, 
but doesn’t seem to do much after that. Any insights?

Steve


On Apr 24, 2015, at 5:50 PM, Yang Lei genia...@gmail.com wrote:

SPARK_PUBLIC_DNS, SPARK_LOCAL_IP, SPARK_LOCAL_HOST



Spark on Mesos

2015-04-24 Thread Stephen Carman
So I can’t for the life of me get something even simple working for Spark on 
Mesos.

I installed a 3 master, 3 slave mesos cluster, which is all configured, but I 
can’t for the life of me even get the spark shell to work properly.

I get errors like this
org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in 
stage 0.0 failed 4 times, most recent failure: Lost task 5.3 in stage 0.0 (TID 
23, 10.253.1.117): ExecutorLostFailure (executor 
20150424-104711-1375862026-5050-20113-S1 lost)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

I tried both mesos 0.21 and 0.22 and they both produce the same error…

My version of Spark is 1.3.1 with Hadoop 2.6; I just downloaded the pre-built 
package from the site. Or is that wrong and I have to build it myself?

I have MESOS_NATIVE_JAVA_LIBRARY, the Spark executor URI, and the Mesos master 
set in my spark-env.sh; to the best of my abilities they seem correct.

Does anyone have any insight into this at all? I’m running this on Red Hat 7 
with 8 CPU cores and 14 GB of RAM per slave, so 24 cores and 42 GB of RAM total.

Anyone have any idea at all what is going on here?

Thanks,
Steve

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark Memory Utilities

2015-04-03 Thread Stephen Carman
I noticed Spark has some nice memory tracking estimators in it, but they are 
private. We have some custom implementations of RDD and PairRDD to suit our 
internal needs, and it’d be fantastic if we could just leverage the memory 
estimates that already exist in Spark.

Is there any chance they could be made public inside the library, or that some 
interface to them could be exposed so that child classes can make use of them?
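
In the meantime, one possible workaround (an untested sketch; the shim object 
and method names are made up) is a thin accessor placed in an org.apache.spark 
sub-package, since, as far as I can tell, the estimator lives at 
org.apache.spark.util.SizeEstimator and is private[spark], so it is visible from 
there:

package org.apache.spark.util.shim

import org.apache.spark.util.SizeEstimator

object MemoryEstimate {
  // SizeEstimator.estimate walks the object graph and returns an approximate
  // size in bytes; this just re-exposes it to our own RDD code.
  def bytesOf(obj: AnyRef): Long = SizeEstimator.estimate(obj)
}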

Thanks,

Stephen Carman, M.S.
AI Engineer, Coldlight Solutions, LLC
Cell - 267 240 0363

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org