Re: Spark on Windows

2015-04-17 Thread Arun Lists
. With Regards Sree On Thursday, April 16, 2015 9:07 PM, Arun Lists lists.a...@gmail.com wrote: Here is what I got from the engineer who worked on building Spark and using it on Windows: 1) Hadoop winutils.exe is needed on Windows, even for local files – and you have to set

Spark on Windows

2015-04-16 Thread Arun Lists
We run Spark on Mac and Linux but also need to run it on Windows 8.1 and Windows Server. We ran into problems with the Scala 2.10 binary bundle for Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we are on Scala 2.11.6 (we built Spark from the sources). On Windows, however

Re: Spark on Windows

2015-04-16 Thread Arun Lists
errors are you seeing? Matei On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote: We run Spark on Mac and Linux but also need to run it on Windows 8.1 and Windows Server. We ran into problems with the Scala 2.10 binary bundle for Spark 1.3.0 but managed to get it working

Re: Registering classes with KryoSerializer

2015-04-14 Thread Arun Lists
) perhaps the class is package private or something, and the repl somehow subverts it ... On Tue, Apr 14, 2015 at 5:44 PM, Arun Lists lists.a...@gmail.com wrote: Hi Imran, Thanks for the response! However, I am still not there yet. In the Scala interpreter, I can do: scala> classOf

Re: Registering classes with KryoSerializer

2015-04-14 Thread Arun Lists
-by-one: scala> classOf[scala.reflect.ClassTag$$anon$1] res0: Class[scala.reflect.ClassTag[T]{def unapply(x$1: scala.runtime.BoxedUnit): Option[_]; def arrayClass(x$1: Class[_]): Class[_]}] = class scala.reflect.ClassTag$$anon$1 On Mon, Apr 13, 2015 at 6:09 PM, Arun Lists lists.a...@gmail.com

Registering classes with KryoSerializer

2015-04-13 Thread Arun Lists
Hi, I am trying to register classes with KryoSerializer. This has worked with other programs. Usually the error messages are helpful in indicating which classes need to be registered. But with my current program, I get the following cryptic error message: *Caused by:

Reading file with Unicode characters

2015-04-08 Thread Arun Lists
Hi, Does SparkContext's textFile() method handle files with Unicode characters? How about files in UTF-8 format? Going further, is it possible to specify encodings to the method? If not, what should one do if the files to be read are in some encoding? Thanks, arun

Re: Reading file with Unicode characters

2015-04-08 Thread Arun Lists
Thanks! arun On Wed, Apr 8, 2015 at 10:51 AM, java8964 java8...@hotmail.com wrote: Spark uses the Hadoop TextInputFormat to read the file. Since Hadoop supports almost only Linux, UTF-8 is the only encoding supported, as it is the one on Linux. If you have other encoding data,
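The reply above is cut off, so the following is not necessarily what the poster went on to suggest. It is a minimal sketch of one commonly used workaround: read the raw bytes through Hadoop's TextInputFormat and decode them yourself. The helper name textFileWithEncoding is hypothetical. The sketch assumes an encoding in which the newline is the single byte 0x0A (for example ISO-8859-1 or windows-1252); it would not split lines correctly for UTF-16.

```scala
import java.nio.charset.Charset

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.TextInputFormat
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object EncodedTextFile {
  // Hypothetical helper: reads `path` line by line and decodes each line
  // with `charsetName` instead of the UTF-8 decoding that textFile() applies.
  def textFileWithEncoding(sc: SparkContext, path: String, charsetName: String): RDD[String] = {
    // Fail fast on the driver if the charset name is unknown.
    Charset.forName(charsetName)
    sc.hadoopFile[LongWritable, Text, TextInputFormat](path).map { case (_, line) =>
      // Text reuses its backing array, so only the first getLength bytes are valid.
      new String(line.getBytes, 0, line.getLength, charsetName)
    }
  }
}
```

A call such as EncodedTextFile.textFileWithEncoding(sc, "data/latin1.txt", "ISO-8859-1") would then stand in for sc.textFile; the path here is only an illustration.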

Specifying Spark property from command line?

2015-04-07 Thread Arun Lists
Hi, Is it possible to specify a Spark property like spark.local.dir from the command line when running an application using spark-submit? Thanks, arun

Error when running Spark on Windows 8.1

2015-04-07 Thread Arun Lists
Hi, We are trying to run a Spark application using spark-submit on Windows 8.1. The application runs successfully to completion on MacOS 10.10 and on Ubuntu Linux. On Windows, we get the following error messages (see below). It appears that Spark is trying to delete some temporary directory that

Re: Specifying Spark property from command line?

2015-04-07 Thread Arun Lists
I just figured this out from the documentation: --conf spark.local.dir=C:\Temp On Tue, Apr 7, 2015 at 5:00 PM, Arun Lists lists.a...@gmail.com wrote: Hi, Is it possible to specify a Spark property like spark.local.dir from the command line when running an application using spark-submit

Registering classes with KryoSerializer

2015-03-30 Thread Arun Lists
I am trying to register classes with KryoSerializer. I get the following error message: How do I find out what class is being referred to by: *OpenHashMap$mcI$sp ?* *com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Class is not registered:

ClassNotFoundException when registering classes with Kryo

2015-02-01 Thread Arun Lists
Here is the relevant snippet of code in my main program: === sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer") sparkConf.set("spark.kryo.registrationRequired", "true") val summaryDataClass = classOf[SummaryData] val summaryViewClass
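For readers following along, here is a rough, self-contained sketch of the kind of configuration the snippet above describes. It is not the poster's actual code: the fields of SummaryData are invented, SummaryView is a class name guessed from the variable summaryViewClass, and the application name is a placeholder. SparkConf.registerKryoClasses is available from Spark 1.2.0 onward.

```scala
import org.apache.spark.SparkConf

// Hypothetical stand-ins for the poster's classes; the fields are invented.
case class SummaryData(id: Long, total: Double)
case class SummaryView(label: String)

object KryoRegistrationSketch {
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("kryo-registration-sketch") // placeholder name
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // With this flag set, serializing an unregistered class raises an error
      // instead of silently writing the full class name.
      .set("spark.kryo.registrationRequired", "true")
      .registerKryoClasses(Array[Class[_]](
        classOf[SummaryData],
        classOf[SummaryView],
        // Array types are separate classes and need their own registration.
        classOf[Array[SummaryData]]
      ))
}
```

With registrationRequired enabled, classes that Spark or Scala create internally may also show up in "Class is not registered" errors and need to be added to the same list.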

Re: ClassNotFoundException when registering classes with Kryo

2015-02-01 Thread Arun Lists
been fixed in https://github.com/apache/spark/pull/4258 but not yet been merged. Best Regards, Shixiong Zhu 2015-02-02 10:08 GMT+08:00 Arun Lists lists.a...@gmail.com: Here is the relevant snippet of code in my main program: === sparkConf.set

Re: Reading resource files in a Spark application

2015-01-14 Thread Arun Lists
in a JAR file will necessarily be specific to where the JAR is on the local filesystem and that is not portable or the right way to read a resource. But you didn't specify the problem here. On Jan 14, 2015 5:15 AM, Arun Lists lists.a...@gmail.com wrote: I experimented with using getResourceAsStream

Re: Reading resource files in a Spark application

2015-01-13 Thread Arun Lists
I experimented with using getResourceAsStream(cls, fileName) instead of cls.getResource(fileName).toURI. That works! I have no idea why the latter method does not work in Spark. Any explanations would be welcome. Thanks, arun On Tue, Jan 13, 2015 at 6:35 PM, Arun Lists lists.a...@gmail.com wrote
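A likely explanation, consistent with the reply in this thread: once the application is packaged, the resource lives inside a JAR, so cls.getResource(fileName).toURI yields a jar: URI, and new File(uri) rejects any URI that is not a plain file: URI. Reading through a stream avoids the filesystem entirely. A minimal sketch, assuming a text resource encoded in UTF-8; the object and method names are hypothetical:

```scala
import java.io.{FileNotFoundException, InputStream}

import scala.io.{Codec, Source}

object ResourceReader {
  // Reads a classpath resource as lines of text. Works the same whether the
  // resource sits in a directory (as under sbt) or inside a JAR (as under
  // spark-submit), because it never converts the resource to a java.io.File.
  def readLines(cls: Class[_], fileName: String): List[String] = {
    val stream: InputStream = cls.getResourceAsStream(fileName)
    // getResourceAsStream returns null when the resource is missing.
    if (stream == null) throw new FileNotFoundException(s"resource not found: $fileName")
    try Source.fromInputStream(stream)(Codec.UTF8).getLines().toList
    finally stream.close()
  }
}
```

Note that a fileName without a leading slash is resolved relative to the package of cls, while a leading slash makes it relative to the classpath root.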

Re: Running Spark application from command line

2015-01-13 Thread Arun Lists
you're running with Scala 2.11 too? On Tue, Jan 13, 2015 at 6:58 AM, Arun Lists lists.a...@gmail.com wrote: I have a Spark application that was assembled using sbt 0.13.7, Scala 2.11, and Spark 1.2.0. In build.sbt, I am running on Mac OSX Yosemite. I use "provided" for the Spark

Reading resource files in a Spark application

2015-01-13 Thread Arun Lists
In some classes, I initialize some values from resource files using the following snippet: new File(cls.getResource(fileName).toURI) This works fine in SBT. When I run it using spark-submit, I get a bunch of errors because the classes cannot be initialized. What can I do to make such

Running Spark application from command line

2015-01-12 Thread Arun Lists
I have a Spark application that was assembled using sbt 0.13.7, Scala 2.11, and Spark 1.2.0. In build.sbt, I am running on Mac OSX Yosemite. I use "provided" for the Spark dependencies. I can run the application fine within sbt. I run into problems when I try to run it from the command line. Here