security testing on Spark?
Hi all,

Does anyone know of any effort from the community on security testing Spark clusters? For example:
- Static source code analysis to find security flaws
- Penetration testing to identify ways to compromise a Spark cluster
- Fuzzing to crash Spark

Thanks,
Judy
thrift server reliability issue
Hi everyone,

I found a Thrift server reliability issue on Spark 1.3.1 that causes the Thrift server to fail. When the Thrift server has too little memory allocated to the driver to process a request, its Spark SQL session exits with an OutOfMemoryError, causing the Thrift server to stop working. Is this a known issue?

Thanks,
Judy

--
Full stack trace of the out-of-memory error:

2015-07-08 03:30:18,011 ERROR actor.ActorSystemImpl (Slf4jLogger.scala:apply$mcV$sp(66)) - Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-6] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: Java heap space
    at org.spark_project.protobuf.ByteString.toByteArray(ByteString.java:515)
    at akka.remote.serialization.MessageContainerSerializer.fromBinary(MessageContainerSerializer.scala:64)
    at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
    at scala.util.Try$.apply(Try.scala:161)
    at akka.serialization.Serialization.deserialize(Serialization.scala:98)
    at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
    at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:58)
    at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:58)
    at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:76)
    at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:937)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:415)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
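For anyone hitting the same failure: a common mitigation (assuming the OutOfMemoryError simply reflects an undersized driver heap rather than a leak) is to start the Thrift server with a larger driver heap. sbin/start-thriftserver.sh accepts the usual spark-submit options, so for example:

    ./sbin/start-thriftserver.sh --driver-memory 4g

The 4g value is illustrative; size it to your workload. Equivalently, set spark.driver.memory in conf/spark-defaults.conf.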
RE: spark slave cannot execute without admin permission on windows
Update to the thread. Upon investigation, this is a bug on Windows: Windows does not grant the user read permission on the copied jar files by default. I have created a pull request for SPARK-5914 (https://issues.apache.org/jira/browse/SPARK-5914) to grant read permission to the jar owner (the slave service account in this case). With this fix, the slave is able to run without admin permission.

FYI: the master and thrift server work fine with only user permission, so no issue there.

From: Judy Nash [mailto:judyn...@exchange.microsoft.com]
Sent: Thursday, February 19, 2015 12:26 AM
To: Akhil Das; dev@spark.apache.org
Cc: u...@spark.apache.org
Subject: RE: spark slave cannot execute without admin permission on windows

+ dev mailing list

If this is supposed to work, is there a regression then? The Spark core code shows that the permission on files copied to \work is set to a+x at line 442 of Utils.scala (https://github.com/apache/spark/blob/b271c265b742fa6947522eda4592e9e6a7fd1f3a/core/src/main/scala/org/apache/spark/util/Utils.scala).

The example jar I used had all permissions, including Read & Execute, prior to spark-submit:

[screenshot: jar permissions before spark-submit, including Read & Execute]

However, after being copied to the worker node's \work folder, only limited permissions were left on the jar, with no execute right:

[screenshot: jar permissions after copy to \work, execute permission missing]

From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Wednesday, February 18, 2015 10:40 PM
To: Judy Nash
Cc: u...@spark.apache.org
Subject: Re: spark slave cannot execute without admin permission on windows

You don't need admin permission; just make sure all those jars have execute permission (read/write access).

Thanks
Best Regards

On Thu, Feb 19, 2015 at 11:30 AM, Judy Nash <judyn...@exchange.microsoft.com> wrote:

Hi,

Is it possible to configure Spark to run without admin permission on Windows?

My current setup runs master and slave successfully with admin permission. However, if I downgrade the permission level from admin to user, SparkPi fails with the following exception on the slave node:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 9, workernode0.jnashsparkcurr2.d10.internal.cloudapp.net): java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi$$anonfun$1
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)

Upon investigation, it appears that the SparkPi jar under spark_home\worker\appname\*.jar does not have execute permission set, causing Spark to be unable to find the class.

Advice would be very much appreciated.

Thanks,
Judy
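For readers who want the gist of the fix in code: below is a minimal, illustrative Scala sketch of the kind of change described above. It is not the actual SPARK-5914 patch, and grantOwnerReadExecute is a hypothetical helper name; it grants the jar's owner read and execute permission via the standard java.io.File API after the file has been copied into the worker's \work directory.

    import java.io.File

    // Hypothetical helper sketching the fix described above (not the actual
    // SPARK-5914 patch): after an application jar is copied into the worker's
    // \work directory, grant the owner read and execute permission so a
    // non-admin slave service account can load classes from it.
    def grantOwnerReadExecute(jar: File): Unit = {
      // ownerOnly = true limits the permission change to the owning account.
      val readOk = jar.setReadable(true, true)
      val execOk = jar.setExecutable(true, true)
      if (!readOk || !execOk) {
        // Some filesystems do not support these calls and return false;
        // the real patch attached to SPARK-5914 is the authoritative fix.
        System.err.println(s"Warning: could not update permissions on $jar")
      }
    }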
New Metrics Sink class not packaged in spark-assembly jar
Hello,

I am working on SPARK-5708 (https://issues.apache.org/jira/browse/SPARK-5708) - Add Slf4jSink to Spark Metrics Sink. I wrote a new Slf4jSink class (see attached patch), but the new class is not packaged as part of the spark-assembly jar. Do I need to update the build config somewhere to have this packaged?

Currently packaged classes:

[screenshot: classes currently packaged in the assembly jar]

I must have missed something basic, but I can't figure out what.

Thanks!
Judy
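For context, here is a rough Scala sketch of what such a sink typically looks like. This is an approximation built from the pattern of Spark's existing metrics sinks and the Dropwizard Slf4jReporter API, not the exact patch attached to SPARK-5708; property names like "period" and "unit" follow the convention of the other sinks.

    package org.apache.spark.metrics.sink

    import java.util.Properties
    import java.util.concurrent.TimeUnit

    import com.codahale.metrics.{MetricRegistry, Slf4jReporter}

    import org.apache.spark.SecurityManager

    // Approximate sketch of an SLF4J metrics sink (not the exact SPARK-5708
    // patch): periodically reports all registered metrics through SLF4J.
    private[spark] class Slf4jSink(
        val property: Properties,
        val registry: MetricRegistry,
        securityMgr: SecurityManager)
      extends Sink {

      // Poll period and unit come from the sink's properties, with defaults.
      val pollPeriod: Int =
        Option(property.getProperty("period")).map(_.toInt).getOrElse(10)
      val pollUnit: TimeUnit = Option(property.getProperty("unit"))
        .map(u => TimeUnit.valueOf(u.toUpperCase))
        .getOrElse(TimeUnit.SECONDS)

      val reporter: Slf4jReporter = Slf4jReporter.forRegistry(registry)
        .convertDurationsTo(TimeUnit.MILLISECONDS)
        .convertRatesTo(TimeUnit.SECONDS)
        .build()

      override def start(): Unit = reporter.start(pollPeriod, pollUnit)
      override def stop(): Unit = reporter.stop()
      override def report(): Unit = reporter.report()
    }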
RE: New Metrics Sink class not packaged in spark-assembly jar
Thanks Patrick! That was the issue. I built the jars in a Windows environment with mvn but forgot to run make-distributions.ps1 afterward, so I was looking at old jars.

From: Patrick Wendell [mailto:pwend...@gmail.com]
Sent: Monday, February 9, 2015 10:43 PM
To: Judy Nash
Cc: dev@spark.apache.org
Subject: Re: New Metrics Sink class not packaged in spark-assembly jar

Actually, to correct myself, the assembly jar is in assembly/target/scala-2.11 (I think).

On Mon, Feb 9, 2015 at 10:42 PM, Patrick Wendell <pwend...@gmail.com> wrote:

Hi Judy,

If you have added source files in the sink/ source folder, they should appear in the assembly jar when you build. One thing I noticed is that you are looking inside the /dist folder. That only gets populated if you run make-distribution. The normal development process is just to run mvn package and then look at the assembly jar contained in core/target.

- Patrick

On Mon, Feb 9, 2015 at 10:02 PM, Judy Nash <judyn...@exchange.microsoft.com> wrote:

Hello,

I am working on SPARK-5708 (https://issues.apache.org/jira/browse/SPARK-5708) - Add Slf4jSink to Spark Metrics Sink. I wrote a new Slf4jSink class (see attached patch), but the new class is not packaged as part of the spark-assembly jar. Do I need to update the build config somewhere to have this packaged?

Currently packaged classes:

[screenshot: classes currently packaged in the assembly jar]

I must have missed something basic, but I can't figure out what.

Thanks!
Judy
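A quick sanity check for anyone in the same spot (the path below assumes a default Scala 2.10 build layout; adjust it to your build profile) is to list the contents of the freshly built assembly jar and grep for the new class:

    jar tf assembly/target/scala-2.10/spark-assembly-*.jar | grep Slf4jSink

If nothing matches, the class was not packaged into the jar you are inspecting, which usually means you are looking at a stale or /dist copy.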
RE: build in IntelliJ IDEA
Thanks Josh. That was the issue.

From: Josh Rosen [mailto:rosenvi...@gmail.com]
Sent: Friday, December 5, 2014 3:21 PM
To: Judy Nash; dev@spark.apache.org
Subject: Re: build in IntelliJ IDEA

If you go to “File > Project Structure” and click on “Project” under the “Project Settings” heading, do you see an entry for “Project SDK”? If not, you should click “New…” and configure a JDK; by default, I think IntelliJ should figure out a correct path to your system JDK, so you should just be able to hit “OK” and then rebuild your project. For reference, here’s a screenshot showing what my version of that window looks like: http://i.imgur.com/hRfQjIi.png

On December 5, 2014 at 1:52:35 PM, Judy Nash (judyn...@exchange.microsoft.com) wrote:

Hi everyone,

I have a newbie question on using IntelliJ to build and debug. I followed this wiki to set up IntelliJ: https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-BuildingSparkinIntelliJIDEA

Afterward I tried to build via the toolbar (Build > Rebuild Project). The action fails with the error message: "Cannot start compiler: the SDK is not specified."

What SDK do I need to specify to get the build working?

Thanks,
Judy