Re: Spark SQL table Join, one task is taking long

2014-12-04 Thread Veeranagouda Mukkanagoudar
Have you tried joins on regular RDDs instead of SchemaRDDs? We have found that they are 10 times faster than joins between SchemaRDDs.

val largeRDD = ...
val smallRDD = ...
largeRDD.join(smallRDD)  // the other way, the JOIN would run for a long time

The only limitation I see with that implementation is regular RDD suppor…
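A minimal sketch of that suggestion, assuming Spark 1.x where a SchemaRDD is an RDD of Row and the pair-RDD implicits come from SparkContext._; the table and column names are made up for illustration:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._   // pair-RDD implicits (needed before Spark 1.3)
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("PlainRddJoin"))
val sqlContext = new SQLContext(sc)

// Pull the join key out of each SchemaRDD row so the join runs on plain pair RDDs.
val largeRDD = sqlContext.sql("SELECT key, payload FROM large_table")
  .map(row => (row.getString(0), row.getString(1)))
val smallRDD = sqlContext.sql("SELECT key, label FROM small_table")
  .map(row => (row.getString(0), row.getString(1)))

// Regular RDD join, as suggested above.
val joined = largeRDD.join(smallRDD)
joined.take(10).foreach(println)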

Re: spark RDD join Error

2014-09-04 Thread Veeranagouda Mukkanagoudar
Thanks a lot, that fixed the issue :)

On Thu, Sep 4, 2014 at 4:51 PM, Zhan Zhang wrote:
> Try this:
> import org.apache.spark.SparkContext._
>
> Thanks.
>
> Zhan Zhang
>
> On Sep 4, 2014, at 4:36 PM, Veeranagouda Mukkanagoudar wrote:
>> I am plannin…
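For context, in the Spark releases of that era (before 1.3) join on an RDD of pairs only resolves once the implicit conversion to PairRDDFunctions is in scope, which is exactly what that import provides. A minimal sketch, assuming an existing SparkContext named sc:

import org.apache.spark.SparkContext._   // brings RDD[(K, V)] => PairRDDFunctions into scope

val rddA = sc.parallelize(Seq(("a", 1), ("b", 2)))
val rddB = sc.parallelize(Seq(("a", 10), ("b", 20)))
rddA.join(rddB)   // compiles once the implicit is imported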

spark RDD join Error

2014-09-04 Thread Veeranagouda Mukkanagoudar
I am planning to use the RDD join operation; to test it out I was trying to compile some test code, but I am getting the following compilation error:

*value join is not a member of org.apache.spark.rdd.RDD[(String, Int)]*
*[error] rddA.join(rddB).map { case (k, (a, b)) => (k, a+b) }*

Code:

import org.apac…
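A self-contained version of the test with the missing import that the reply above points out, assuming Spark 1.x run in local mode; the object and app names are illustrative:

import org.apache.spark.{SparkConf, SparkContext}
// Before Spark 1.3, this import supplies the implicit conversion to
// PairRDDFunctions, which is where join is defined.
import org.apache.spark.SparkContext._

object JoinTest {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("JoinTest").setMaster("local[*]"))

    val rddA = sc.parallelize(Seq(("a", 1), ("b", 2)))
    val rddB = sc.parallelize(Seq(("a", 10), ("b", 20)))

    // With the implicit in scope this compiles and sums the matching values per key.
    rddA.join(rddB).map { case (k, (a, b)) => (k, a + b) }.collect().foreach(println)

    sc.stop()
  }
}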

Re: confirm subscribe to user@spark.apache.org

2014-07-11 Thread Veeranagouda Mukkanagoudar
> Subject: please grant me subscriber access
> From: Veeranagouda Mukkanagoudar
> To: user-subscr...@spark.apache.org