spark 1.3.0 strange log message

2015-04-23 Thread Henry Hung
Dear All, when using spark 1.3.0 spark-submit with stdout and stderr redirected to a log file, I saw some strange lines inside that look like this: [Stage 0:(0 + 2) / 120] [Stage 0:(2 + 2) /
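Those bracketed lines are Spark's console progress bar, new in 1.3, which reports (finished + running) / total tasks per stage on stderr. A hedged remedy (the class name and jar below are placeholders) is to disable it with the `spark.ui.showConsoleProgress` setting:

```shell
# Turn off the stage progress bar so it does not interleave with the log file;
# --class and app.jar are placeholders for your application.
bin/spark-submit \
  --conf spark.ui.showConsoleProgress=false \
  --class com.example.Main app.jar > app.log 2>&1
```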

how to process a file in spark standalone cluster without distributed storage (i.e. HDFS/EC2)?

2015-02-06 Thread Henry Hung
Hi All, sc.textFile will not work because the file is not distributed to the other workers, so I tried to read the file first using FileUtils.readLines and then use sc.parallelize, but readLines failed with an OOM (the file is large). Is there a way to split local files and upload those partitions to
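One way around the OOM, sketched below under the assumption that the driver has local access to the file: stream it lazily with an iterator and hand it to Spark in fixed-size chunks, rather than `FileUtils.readLines`, which materializes every line at once. The chunk size of 3 is illustrative.

```scala
import java.nio.file.Files
import scala.io.Source

// Create a small sample file as a stand-in for the large local file.
val path = Files.createTempFile("sample", ".txt")
Files.write(path, (1 to 10).map(i => s"line$i").mkString("\n").getBytes("UTF-8"))

// Stream the file lazily and process it in fixed-size chunks instead of
// loading every line into memory at once (the cause of the OOM).
val src = Source.fromFile(path.toFile)
val chunkSizes =
  try src.getLines().grouped(3).map(_.size).toList // 3 lines/chunk; tune for memory
  finally src.close()
// On a Spark driver, each chunk could then be shipped with sc.parallelize(chunk).
println(chunkSizes)
```

Each chunk stays bounded in size, so only one chunk is ever resident on the driver at a time.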

RE: how to process a file in spark standalone cluster without distributed storage (i.e. HDFS/EC2)?

2015-02-06 Thread Henry Hung
Hi All, I already find a solution to solve this problem. Please ignore my question... Thanx Best regards, Henry From: MA33 YTHung1 Sent: Friday, February 6, 2015 4:34 PM To: user@spark.apache.org Subject: how to process a file in spark standalone cluster without distributed storage (i.e.

broadcasting object issue

2014-12-22 Thread Henry Hung
Hi All, I have a problem with broadcasting a serializable class object that is returned by another, non-serializable class. Here is the sample code: class A extends java.io.Serializable { def halo(): String = "halo" } class B { def getA() = new A } val list = List(1) val b = new B val a = b.getA
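The thread excerpt is truncated before the error, but a common cause of this pattern is that the closure shipped to executors captures not just the serializable `a` but also the non-serializable factory `b`. A minimal pure-JVM sketch (no Spark needed; `payload` and `factory` are stand-ins for `A` and `B`) shows the difference with plain Java serialization, the same check Spark performs:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

def serialize(o: AnyRef): Array[Byte] = {
  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  try { oos.writeObject(o); bos.toByteArray } finally oos.close()
}

def demo(): (Boolean, Boolean) = {
  val payload = "halo"      // stand-in for the serializable A
  val factory = new Object  // stand-in for the non-serializable B
  // Closure referencing only serializable values: safe to ship/broadcast.
  val good: Int => String = i => s"$payload-$i"
  val goodOk = serialize(good).nonEmpty
  // Closure that also drags in the factory: NotSerializableException.
  val bad: Int => String = i => s"${factory.hashCode}-$i"
  val badFails =
    try { serialize(bad); false }
    catch { case _: NotSerializableException => true }
  (goodOk, badFails)
}
println(demo())
```

The usual fix is to extract the serializable value into a local `val` first and reference only that inside the closure or `sc.broadcast` call.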

RE: how to build spark 1.1.0 to include org.apache.commons.math3 ?

2014-10-19 Thread Henry Hung
YTHung1 Cc: user@spark.apache.org Subject: Re: how to build spark 1.1.0 to include org.apache.commons.math3 ? It doesn't contain commons math3 since Spark does not depend on it. Its tests do, but tests are not built into the Spark assembly. On Thu, Oct 16, 2014 at 9:57 PM, Henry Hung ythu
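Since the reply above notes that the Spark assembly intentionally omits commons-math3 (only Spark's tests use it), a hedged alternative to rebuilding Spark is to declare the dependency in the application's own build and let it be packaged or passed with `--jars`; the version number below is illustrative:

```scala
// build.sbt of your application -- not a Spark rebuild (version illustrative)
libraryDependencies += "org.apache.commons" % "commons-math3" % "3.3"
```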

RE: error when maven build spark 1.1.0 with message You have 1 Scalastyle violation

2014-10-17 Thread Henry Hung
an exception input length = 2: error file=D:\tools\spark-1.1.0\mllib\src\main\scala\org\apache\spark\mllib\optimization\Gradient.scala message=Input length = 2 Best regards, Henry Hung From: MA33 YTHung1 Sent: Friday, October 17, 2014 1:05 PM To: user@spark.apache.org

how to build spark 1.1.0 to include org.apache.commons.math3 ?

2014-10-16 Thread Henry Hung
Hi All, I tried to build spark 1.1.0 using sbt with the command: sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly, but the resulting spark-assembly-1.1.0-hadoop2.2.0.jar is still missing the Apache commons-math3 classes. How do I add math3 to the package? Best regards, Henry

error when maven build spark 1.1.0 with message You have 1 Scalastyle violation

2014-10-16 Thread Henry Hung
Hi All, I'm using windows 8.1 to build spark 1.1.0 using this command: C:\apache-maven-3.0.5\bin\mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package -e Below is the error message: [ERROR] Failed to execute goal org.scalastyle:scalastyle-maven-plugin:0.4.0:check (default)
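A hedged guess at the cause, given the Windows build environment and the "Input length = 2" error on Gradient.scala (a file containing non-ASCII characters): Scalastyle is reading the source with the Windows default codepage rather than UTF-8. One sketch of a workaround is to force the JVM file encoding before invoking Maven:

```shell
REM Windows cmd: force UTF-8 so Scalastyle can decode non-ASCII source files
set MAVEN_OPTS=-Xmx2g -Dfile.encoding=UTF-8
C:\apache-maven-3.0.5\bin\mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
```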

RE: error when maven build spark 1.1.0 with message You have 1 Scalastyle violation

2014-10-16 Thread Henry Hung
Hi All, Another piece of information, somehow the Gradient.scala throws an exception input length = 2: error file=D:\tools\spark-1.1.0\mllib\src\main\scala\org\apache\spark\mllib\optimization\Gradient.scala message=Input length = 2 Best regards, Henry Hung From: MA33 YTHung1 Sent: Friday

adding element into MutableList throws an error type mismatch

2014-10-15 Thread Henry Hung
Hi All, Could someone shed light on why adding an element into a MutableList can result in a type mismatch, even when I'm sure that the class type is right? Below is the sample code I ran in the spark 1.0.2 console; at the end there is a type mismatch error: Welcome to
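The excerpt is truncated before the actual error, but a frequent cause in the spark-shell is re-defining the element class after the list was created: the REPL then has two distinct classes with the same name, and the old list rejects new-class elements. Defining the class and the collection together (e.g. inside one `:paste` block) avoids it. A minimal sketch of the correct pattern, using `ListBuffer` since `MutableList` is deprecated in newer Scala, with a hypothetical `Record` element type:

```scala
import scala.collection.mutable.ListBuffer

case class Record(id: Int, name: String) // hypothetical element type
val buf = ListBuffer.empty[Record]
buf += Record(1, "a") // compiles: element type matches the buffer's parameter
buf += Record(2, "b")
val names = buf.map(_.name).toList
println(names)
```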

How to get SparkContext inside mapPartitions?

2014-09-30 Thread Henry Hung
Hi All, A noob question: how to get SparkContext inside mapPartitions? Example: let's say I have rddObjects that can be split into different partitions to be assigned to multiple executors, to speed up exporting data from a database. Variable sc is created in the main program using these
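The short answer: you can't, and shouldn't. SparkContext exists only on the driver and is not serializable, so it cannot be shipped inside mapPartitions. Instead, capture plain serializable settings in the closure and open resources per partition. A pure-Scala sketch of that shape (the JDBC URL is hypothetical; in Spark this would be `rdd.mapPartitions(exportPartition(jdbcUrl))`):

```scala
// Capture only plain, serializable parameters (a URL string), never the
// SparkContext; open the real connection inside the partition function.
def exportPartition(jdbcUrl: String)(rows: Iterator[Int]): Iterator[String] = {
  // here you would open a connection with jdbcUrl, once per partition
  rows.map(i => s"$jdbcUrl#$i")
}

val jdbcUrl = "jdbc:mysql://host/db" // hypothetical
val out = exportPartition(jdbcUrl)(Iterator(1, 2)).toList
println(out)
```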

RE: how to correctly run scala script using spark-shell through stdin (spark v1.0.0)

2014-08-27 Thread Henry Hung
the script. Best regards, Henry Hung From: MA33 YTHung1 Sent: Thursday, August 28, 2014 10:01 AM To: user@spark.apache.org Subject: how to correctly run scala script using spark-shell through stdin (spark v1.0.0) Hi All, Right now I'm trying to execute a script using this command: nohup
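Piping a script over stdin can race with REPL startup and loses the terminal. A hedged alternative (file names are placeholders) is the `-i` flag, which spark-shell inherits from the Scala REPL and which loads the file after the shell has initialized:

```shell
# Load the script via -i instead of stdin; names are placeholders.
nohup bin/spark-shell -i my_script.scala > my_script.log 2>&1 &
```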

a noob question for how to implement setup and cleanup in Spark map

2014-08-18 Thread Henry Hung
tell me how to do it in Spark? Best regards, Henry Hung The privileged confidential information contained in this email is intended for use only by the addressees as indicated by the original sender of this email. If you are not the addressee indicated

RE: a noob question for how to implement setup and cleanup in Spark map

2014-08-18 Thread Henry Hung
in the setup() once and run() will execute SQL query, then cleanup() will close the connection. Could someone tell me how to do it in Spark? Best regards, Henry Hung
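Hadoop's setup()/run()/cleanup() lifecycle maps naturally onto `rdd.mapPartitions`: open the resource once per partition, process the whole iterator, then close it. A pure-Scala stand-in for that pattern (an `ArrayList` plays the role of the connection; in Spark this would be `rdd.mapPartitions(withConnection)`):

```scala
// setup(): acquire the resource; run(): process the iterator;
// cleanup(): release it in a finally block, once per partition.
def withConnection(rows: Iterator[Int]): Iterator[String] = {
  val conn = new java.util.ArrayList[String]() // stand-in for getConnection()
  try {
    // Materialize before returning, or cleanup would run before consumption.
    rows.map { r => conn.add(s"q$r"); s"result-$r" }.toList.iterator
  } finally conn.clear()                       // stand-in for conn.close()
}

val results = withConnection(Iterator(1, 2)).toList
println(results)
```

The eager `toList` matters: mapPartitions returns a lazy iterator, and without materializing, the finally block would close the connection before the caller consumes any rows.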

RE: a noob question for how to implement setup and cleanup in Spark map

2014-08-18 Thread Henry Hung
=+D4vj=JfG2tP9eqn5RPko=dr...@mail.gmail.com%3E On Mon, Aug 18, 2014 at 8:04 AM, Henry Hung ythu...@winbond.com wrote: Hi All, Please ignore my question, I found a way to implement it via old archive mails: http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3CCAF