Hi TD,
"You can always run two jobs on the same cached RDD, and they can run in
parallel (assuming you launch the 2 jobs from two different threads)"
Is this a correct way to launch jobs from two different threads?
val threadA = new Thread(new Runnable {
  def run() {
    for (i <- 0 until e
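For what it's worth, the two-thread pattern the quote describes can be sketched in plain Scala (all names hypothetical; in Spark each run() body would invoke an action such as rdd.count() on the shared cached RDD, which the simple sums stand in for here):

```scala
import java.util.concurrent.ConcurrentLinkedQueue

// Sketch of launching two "jobs" from two threads. In Spark, each run()
// would submit an action on the same cached RDD; plain sums stand in here.
object TwoJobs {
  def runBoth(): Int = {
    val results = new ConcurrentLinkedQueue[Long]()
    val threadA = new Thread(new Runnable {
      def run(): Unit = results.add((1L to 100L).sum)  // stand-in for job A, e.g. rdd.count()
    })
    val threadB = new Thread(new Runnable {
      def run(): Unit = results.add((1L to 50L).sum)   // stand-in for job B
    })
    threadA.start(); threadB.start()  // both jobs submitted concurrently
    threadA.join(); threadB.join()    // wait for both to finish
    results.size                      // number of completed jobs
  }

  def main(args: Array[String]): Unit =
    println(runBoth())  // prints 2
}
```

The key points are that each thread calls start() (not run()) and that the main thread joins both before reading results.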
Hi,
I just wrote an application that intends to submit its actions (jobs) via
independent threads, keeping in view the point: "Second, within each
Spark application, multiple “jobs” (Spark actions) may be running
concurrently if they were submitted by different threads", mentioned in:
https://spa
Hi,
I run into a Task not Serializable exception with the following code. When I
remove the threads and run, it works, but with threads I run into the Task
not Serializable exception.
object SparkKart extends Serializable {
  def parseVector(line: String): Vector[Double] = {
    DenseVector(line.split('
I could trace where the problem is: if I run without any threads, it works
fine. When I allocate threads, I run into the NotSerializable problem. But I
need to have threads in my code.
Any help please!!!
This is my code:
object SparkKart {
  def parseVector(line: String): Vector[Double] = {
    Dens
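A common cause of this exception is the closure handed to the RDD operation capturing a non-serializable enclosing instance (here, plausibly the threads' outer object). A plain-Scala sketch of the mechanism, under that assumption (all names hypothetical; no Spark needed to demonstrate it):

```scala
import java.io._

// Spark must serialize every closure it ships to executors. A closure that
// reads a field of its enclosing object captures that whole object; copying
// the needed value into a local val first avoids dragging the outer object in.
object ClosureDemo {
  class Outer {                                   // NOT Serializable, like a thread wrapper
    val factor = 3
    def badClosure: Int => Int = x => x * factor  // reads this.factor -> captures `this`
    def goodClosure: Int => Int = {
      val f = factor                              // copy to a local first
      x => x * f                                  // captures only an Int
    }
  }

  // Returns true iff the object survives Java serialization.
  def serializes(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch { case _: NotSerializableException => false }

  def main(args: Array[String]): Unit = {
    val o = new Outer
    println(serializes(o.badClosure))   // false: drags in the non-serializable Outer
    println(serializes(o.goodClosure))  // true: only an Int is captured
  }
}
```

The same fix applies inside a Runnable: bind whatever the RDD closure needs to local vals before calling the Spark action.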
Hi,
I have a file containing data in the following way:
0.0 0.0 0.0
0.1 0.1 0.1
0.2 0.2 0.2
9.0 9.0 9.0
9.1 9.1 9.1
9.2 9.2 9.2
Now I do the following:
val kPoints = data.takeSample(withReplacement = false, 4, 42).toArray
val thread1 = new Thread(new Runnable {
  def run() {
    v
Are you replicating any RDDs?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/java-io-IOException-Filesystem-closed-tp20150p21749.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi,
I have an HDFS file of size 598MB. I create an RDD over this file and cache
it in RAM on a 7-node cluster with 2G RAM each. I find that each partition
gets replicated three or even four times in the cluster, even though I don't
specify any replication in code. Total partitions are 5 for the RDD created, but cached partit
Hi,
My Spark cluster contains a mix of machines: Pentium 4, dual-core, and
quad-core. I am trying to run a character frequency count application. The
application contains several threads, each submitting a job (action) that
counts the frequency of a single character. But my problem is, I get
dif
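A plain-Scala sketch of the per-character-thread pattern described above (names hypothetical; in Spark each run() would instead submit an action such as lines.flatMap(_.toSeq).filter(_ == c).count() on a shared cached RDD):

```scala
import java.util.concurrent.ConcurrentHashMap

// One thread per character; each "job" counts that character's frequency.
object CharFreq {
  def countAll(text: String, chars: Seq[Char]): Map[Char, Long] = {
    val freq = new ConcurrentHashMap[Char, Long]()
    val threads = chars.map { c =>
      new Thread(new Runnable {
        // Stand-in for the Spark action each thread would submit.
        def run(): Unit = freq.put(c, text.count(_ == c).toLong)
      })
    }
    threads.foreach(_.start())  // submit all jobs concurrently
    threads.foreach(_.join())   // wait for every job to finish
    chars.map(c => c -> freq.get(c)).toMap
  }

  def main(args: Array[String]): Unit =
    println(countAll("abracadabra", Seq('a', 'b')))
}
```

On heterogeneous hardware the individual jobs will finish in different orders and take different times, but after join() the aggregated counts are deterministic.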
Hi,
Can someone please suggest a real-life application implemented in Spark
(e.g. gene sequencing) along the lines of the code below. Basically, the
application should have jobs submitted via as many threads as possible. I
need this kind of Spark application for benchmarking.
val threadA
Hi,
I have a doubt: assume that an RDD is stored across multiple nodes and
one of the nodes fails, so a partition is lost. Now, I know that when this
node is back, the lineage is used to recompute that lost
partition alone.
1) How does it get the source data (original data be
Hi,
I keep facing this error when I run my application:
java.io.IOException: Connection from s1/- closed +details
java.io.IOException: Connection from s1/:43741 closed
at
org.apache.spark.network.client.TransportResponseHandler.channelUnregistered(TransportResponseHandler.java:9
When I increase the executor memory (spark.executor.memory), it runs
smoothly without any errors.
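For reference, a sketch of passing the larger executor heap at submit time rather than hard-coding it (the class name, jar, and 4g value are hypothetical placeholders):

```shell
# Hypothetical values: raise the executor heap at submit time
./bin/spark-submit \
  --class org.example.MyApp \
  --conf spark.executor.memory=4g \
  myapp.jar
```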
On Sat, Jan 24, 2015 at 9:29 PM, Rapelly Kartheek
wrote:
> Hi,
> While running spark application, I get the following Exception leading to
> several failed stages.
>
> Exception in thread "Thread-46" org.apache.sp
Hi,
While running spark application, I get the following Exception leading to
several failed stages.
Exception in thread "Thread-46" org.apache.spark.SparkException: Job
aborted due to stage failure: Task 0 in stage 11.0 failed 4 times, most
recent failure: Lost task 0.3 in stage 11.0 (TID 262, s
-- Forwarded message --
From: Rapelly Kartheek
Date: Mon, Jan 19, 2015 at 3:03 PM
Subject: UnknownhostException : home
To: "user@spark.apache.org"
Hi,
I get the following exception when I run my application:
karthik@karthik:~/spark-1.2.0$ ./bin/spark-submit --class
org.apache.
Hi,
This is what I am trying to do:
karthik@s4:~/spark-1.2.0$ SPARK_HADOOP_VERSION=2.3.0 sbt/sbt clean
Using /usr/lib/jvm/java-7-oracle as default JAVA_HOME.
Note, this will be overridden by -java-home if it is set.
[info] Loading project definition from
/home/karthik/spark-1.2.0/project/project
C
The problem is that my network cannot access github.com for cloning
some dependencies, as GitHub is blocked in India. What are other
possible workarounds for this problem?
Thank you!
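Two commonly suggested workarounds, sketched below; both addresses are hypothetical placeholders, not real endpoints:

```shell
# Route git's HTTPS traffic through a proxy that can reach github.com:
git config --global http.proxy http://proxy.example.com:8080

# Or rewrite the blocked host to a reachable mirror of the same repos:
git config --global url."https://mirror.example.com/".insteadOf "https://github.com/"
```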
On Sun, Jan 4, 2015 at 9:45 PM, Rapelly Kartheek
wrote:
> Hi,
>
> I get the following error when I build sp
Hi,
I get the following error when I build spark-1.2.0 using sbt:
[error] Nonzero exit code (128): git clone
https://github.com/ScrapCodes/sbt-pom-reader.git
/home/karthik/.sbt/0.13/staging/ad8e8574a5bcb2d22d23/sbt-pom-reader
[error] Use 'last' for the full log.
Any help please?
Thanks
--
V
Hi Deng,
Thank you. That works perfectly:)
Regards
Karthik.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-access-application-name-in-the-spark-framework-code-tp19719p19723.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Does the SparkContext exist when this part (AskDriverWithReply()) of the
scheduler code gets executed?
On Sun, Oct 12, 2014 at 1:54 PM, rapelly kartheek
wrote:
> Hi Sean,
> I tried even with sc as: sc.parallelize(data). But. I get the error: value
> sc not found.
>
> On Sun, Oct 12, 2014 at 1:47 PM
Hi Sean,
I tried even with sc as: sc.parallelize(data). But I get the error: value
sc not found.
On Sun, Oct 12, 2014 at 1:47 PM, sowen [via Apache Spark User List] <
ml-node+s1001560n16233...@n3.nabble.com> wrote:
> It is a method of the class, not a static method of the object. Since a
> Spark
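The distinction in the quoted answer, that parallelize is an instance method of SparkContext rather than a static member, can be illustrated with a plain-Scala analogy (class and method names hypothetical):

```scala
// parallelize lives on a SparkContext *instance*: without a constructed
// `sc` in scope, the name cannot resolve. Plain-Scala analogy:
class Ctx {
  def parallelizeLike(data: Seq[Int]): Seq[Int] = data  // instance method: needs `new Ctx`
}
object Ctx {
  def staticLike(): String = "no instance needed"       // object (static-like) method
}

object Demo {
  def main(args: Array[String]): Unit = {
    val sc = new Ctx                               // construct the instance first
    println(sc.parallelizeLike(Seq(1, 2, 3)).sum)  // prints 6
    println(Ctx.staticLike())
    // Ctx.parallelizeLike(Seq(1))  // would not compile: not a member of object Ctx
  }
}
```

In Spark the analogue of `new Ctx` is constructing (or being handed) a SparkContext; "value sc not found" simply means no such instance is in scope at that point.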
When I see the storage details of the RDD in the web UI, I find that each
block is replicated twice, and not on a single node; all the nodes in the
cluster are hosting some block or the other.
Why this difference? The trace of the replicate() method shows only one
node, but the web UI shows multiple nod
Thank you yuanbosoft.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/RDDs-tp13343p13444.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Thank you Raymond and Tobias.
Yeah, I am very clear about what I was asking; I was talking about the
"replicated" RDD only. Now that I've got my understanding about jobs and
applications validated, I wanted to know if we can replicate an RDD and run
two jobs (that need the same RDD) of an application in par
Thank you Andrew for the updated link.
regards
Karthik
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Scheduling-in-spark-tp9035p9717.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Thank you so much for the link, Sujeet.
regards
Karthik
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Scheduling-in-spark-tp9035p9716.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.