Unsubscribe

2016-08-15 Thread Sarath Chandra

Issue with wholeTextFiles

2016-03-21 Thread Sarath Chandra
I'm using Hadoop 1.0.4 and Spark 1.2.0. I'm facing a strange issue. I have a requirement to read a small file from HDFS and all its content has to be read in one shot. So I'm using the Spark context's wholeTextFiles API, passing the HDFS URL for the file. When I try this from a Spark shell it works
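For reference, a minimal sketch of the pattern described above, assuming the Spark 1.2 Scala API; the HDFS path and application name are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    object WholeFileRead {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("WholeFileRead"))
        // wholeTextFiles returns an RDD of (path, fullContent) pairs, so each
        // small file is read in one shot rather than line by line
        val files = sc.wholeTextFiles("hdfs://namenode:54310/user/hduser/small-file.xml")
        files.collect().foreach { case (path, content) =>
          println(s"$path -> ${content.length} characters")
        }
        sc.stop()
      }
    }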

Re: Assign unique link ID

2015-10-31 Thread Sarath Chandra
u have > used for joining. So, records 1 and 4 should generate same hash value. > 3. group by using this new id (you have already linked the records) and > pull out required fields. > > Please let the group know if it works... > > Best > Ayan > > On Sat, Oct 31, 2015 a
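A minimal sketch of the linking approach suggested in this reply (hash the join columns into a link ID, then group on it), assuming an existing SparkContext named sc; the field positions and file path below are illustrative assumptions, not taken from the thread:

    import java.security.MessageDigest

    val rows = sc.textFile("hdfs://namenode:54310/user/hduser/recon_data.psv")
      .map(_.split("\\|", -1))

    // Build the same link ID for any records that agree on the join columns.
    // Indices 2, 3, 5 and 13 (TRN_NO, DATE1, BRANCH, AMOUNT) are assumed here.
    val linked = rows.map { f =>
      val joinKey = Seq(f(2), f(3), f(5), f(13)).mkString("|")
      val linkId = MessageDigest.getInstance("MD5")
        .digest(joinKey.getBytes("UTF-8"))
        .map("%02x".format(_)).mkString
      (linkId, f)
    }.groupByKey()   // records from S1 and S2 that match end up under one link ID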

Assign unique link ID

2015-10-31 Thread Sarath Chandra
Hi All, I have a hive table where data from 2 different sources (S1 and S2) get accumulated. Sample data below - *RECORD_ID|SOURCE_TYPE|TRN_NO|DATE1|DATE2|BRANCH|REF1|REF2|REF3|REF4|REF5|REF6|DC_FLAG|AMOUNT|CURRENCY* *1|S1|55|19-Oct-2015|19-Oct-2015|25602|999||41106|47311|379|9|004|999|99

Re: PermGen Space Error

2015-07-29 Thread Sarath Chandra
erify your executor/driver actually started with this option to > rule out a config problem. > > On Wed, Jul 29, 2015 at 10:45 AM, Sarath Chandra > wrote: > > Yes. > > > > As mentioned in my mail at the end, I tried with both 256 and 512 > opt

Re: PermGen Space Error

2015-07-29 Thread Sarath Chandra
ingle node mesos cluster on my laptop having 4 CPUs and 12GB RAM. On Wed, Jul 29, 2015 at 2:49 PM, fightf...@163.com wrote: > Hi, Sarath > > Did you try to use and increase spark.executor.extraJavaOptions > -XX:PermSize= -XX:MaxPermSize= > > > --------

PermGen Space Error

2015-07-29 Thread Sarath Chandra
Dear All, I'm using - => Spark 1.2.0 => Hive 0.13.1 => Mesos 0.18.1 => Spring => JDK 1.7 I've written a Scala program which => instantiates a Spark and Hive context => parses an XML file which provides the where clauses for queries => generates full-fledged Hive queries to be run on hi
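One way to raise the JDK 7 permanent-generation limits this error points at is via the extraJavaOptions settings; a sketch with illustrative sizes (for the driver these options generally have to be set before the JVM starts, e.g. in spark-defaults.conf or via --driver-java-options, rather than in code):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("HiveQueryRunner")   // hypothetical app name
      // Illustrative PermGen sizes; tune to the actual class footprint.
      .set("spark.executor.extraJavaOptions", "-XX:PermSize=256m -XX:MaxPermSize=512m")
      .set("spark.driver.extraJavaOptions", "-XX:PermSize=256m -XX:MaxPermSize=512m")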

Re: Unable to submit spark job to mesos cluster

2015-03-04 Thread Sarath Chandra
(Test.java:7)* Regards, Sarath. Thanks & Regards, *Sarath Chandra Josyam* Sr. Technical Architect *Algofusion Technologies India Pvt. Ltd.* Email: sarathchandra.jos...@algofusiontech.com Phone: +91-80-65330112/113 Mobile: +91 8762491331 On Wed, Mar 4, 2015 at 5:08 PM, Sarath Chandra < s

Unable to submit spark job to mesos cluster

2015-03-04 Thread Sarath Chandra
Hi, I have a cluster running on CDH5.2.1 and I have a Mesos cluster (version 0.18.1). Through an Oozie Java action I want to submit a Spark job to the Mesos cluster. Before configuring it as an Oozie job, I'm testing the Java action from the command line and getting the exception below. While running I'm poin
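A minimal sketch of pointing a Spark 1.2 application at a Mesos master from code; the master host/port and the executor archive location are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MesosJob")
      .setMaster("mesos://mesos-master:5050")   // placeholder Mesos master
      // Executors fetch the Spark distribution from this (placeholder) location;
      // the Mesos native library must also be visible via MESOS_NATIVE_LIBRARY.
      .set("spark.executor.uri",
        "hdfs://namenode:54310/frameworks/spark-1.2.0-bin-hadoop1.tgz")
    val sc = new SparkContext(conf)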

Parallel spark jobs on mesos cluster

2014-09-30 Thread Sarath Chandra
Hi All, I have a requirement to process a set of files in parallel. So I'm submitting Spark jobs using Java's ExecutorService. But when I do it this way, 1 or more jobs fail with status "EXITED". Earlier I tried with a standalone Spark cluster, setting the job scheduling to "Fair Scheduling"
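A common way to run such jobs concurrently is to share one SparkContext across threads and enable the FAIR scheduler; a minimal sketch, with placeholder paths and a hypothetical per-file transformation:

    import java.util.concurrent.{Executors, TimeUnit}
    import org.apache.spark.{SparkConf, SparkContext}

    object ParallelFiles {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("ParallelFiles")
          .set("spark.scheduler.mode", "FAIR")   // concurrent jobs share executors fairly
        val sc = new SparkContext(conf)

        val pool = Executors.newFixedThreadPool(4)
        val inputs = Seq(
          "hdfs://namenode:54310/in/file1.txt",
          "hdfs://namenode:54310/in/file2.txt")
        inputs.foreach { path =>
          pool.submit(new Runnable {
            def run(): Unit =
              sc.textFile(path).map(_.toUpperCase).saveAsTextFile(path + ".out")
          })
        }
        pool.shutdown()
        pool.awaitTermination(1, TimeUnit.HOURS)
        sc.stop()
      }
    }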

Parallel spark jobs on standalone cluster

2014-09-25 Thread Sarath Chandra
Hi All, I have a Java program which submits a Spark job to a standalone Spark cluster (2 nodes; 10 cores (6+4); 12GB (8+4)). It is called by another Java program through ExecutorService, which invokes it multiple times with different sets of arguments and parameters. I have set spark memory us

Worker state is 'killed'

2014-09-21 Thread Sarath Chandra
Hi All, I'm executing a simple job in Spark which reads a file on HDFS, processes the lines and saves the processed lines back to HDFS. All 3 stages are happening correctly and I'm able to see the processed file on HDFS. But on the Spark UI, the worker state is shown as "killed". And I'm

Saving RDD with array of strings

2014-09-21 Thread Sarath Chandra
Hi All, If my RDD contains an array/sequence of strings, how can I save them as an HDFS file with each string on a separate line? For example, if I write code as below, the output should get saved as an HDFS file with one string per line ... ... var newLines = lines.map(line => myfunc(line)); newLines.s
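If myfunc returns a sequence of strings per input line, the usual fix is to flatten before saving, so saveAsTextFile writes one string per output line instead of each sequence's toString; a sketch reusing the lines and myfunc names from the message, with placeholder paths:

    val lines = sc.textFile("hdfs://namenode:54310/user/hduser/input.txt")
    // flatMap unrolls the Seq[String] returned by myfunc into individual elements
    val newLines = lines.flatMap(line => myfunc(line))
    newLines.saveAsTextFile("hdfs://namenode:54310/user/hduser/output")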

Re: Task not serializable

2014-09-10 Thread Sarath Chandra
w your > code since it may not be doing what you think. > > If you instantiate an object, it happens every time your function is > called. map() is called once per data element; mapPartitions() once > per partition. It depends. > > On Wed, Sep 10, 2014 at 3:25 PM, Sarath Ch
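A sketch of the mapPartitions pattern this reply describes: the helper is constructed inside the closure, once per partition, so it is never serialized from the driver (SomeUnserializableManagerClass is the name used in the thread; its transform method is hypothetical):

    val processed = lines.mapPartitions { iter =>
      val manager = new SomeUnserializableManagerClass()   // built on the worker, once per partition
      iter.map(line => manager.transform(line))            // hypothetical per-line method
    }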

Re: Task not serializable

2014-09-10 Thread Sarath Chandra
ableManagerClass in the function and therefore on the > worker. > > mapPartitions is better if this creation is expensive. > > On Fri, Sep 5, 2014 at 3:06 PM, Sarath Chandra > wrote: > > Hi, > > > > I'm trying to migrate a map-reduce program to work with spark

Re: Task not serializable

2014-09-06 Thread Sarath Chandra
> In the first instance, you create the object on the driver and try to > serialize and copy it to workers. In the second, you're creating > SomeUnserializableManagerClass in the function and therefore on the > worker. > > mapPartitions is better if this creation is expensive.

Re: Task not serializable

2014-09-05 Thread Sarath Chandra
> You can bring those classes out of the library and Serialize it > (implements Serializable). It is not the right way of doing it though it > solved a few of my similar problems. > > Thanks > Best Regards > > > On Fri, Sep 5, 2014 at 7:36 PM, Sarath Chandra < > sa

Task not serializable

2014-09-05 Thread Sarath Chandra
Hi, I'm trying to migrate a map-reduce program to work with Spark. I migrated the program from Java to Scala. The map-reduce program basically loads an HDFS file and, for each line in the file, applies several transformation functions available in various external libraries. When I execute this o

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
n Thu, Jul 17, 2014 at 1:13 PM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > No Sonal, I'm not doing any explicit call to stop context. > > If you see my previous post to Michael, the commented portion of the code > is my requirement. When I run this over s

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
gards, > Sonal > Nube Technologies <http://www.nubetech.co> > > <http://in.linkedin.com/in/sonalgoyal> > > > > > On Thu, Jul 17, 2014 at 12:51 PM, Sarath Chandra < > sarathchandra.jos...@algofusiontech.com> wrote: > >> Hi Michael, Soumya,

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
Hi Michael, Soumya, Can you please check and let me know what the issue is? What am I missing? Let me know if you need any logs to analyze. ~Sarath On Wed, Jul 16, 2014 at 8:24 PM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Hi Michael, > > Tried it.

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
ATH $CONFIG_OPTS test.Test4 spark://master:7077 "/usr/local/spark-1.0.1-bin-hadoop1" hdfs://master:54310/user/hduser/file1.csv hdfs://master:54310/user/hduser/file2.csv* ~Sarath On Wed, Jul 16, 2014 at 8:14 PM, Michael Armbrust wrote: > What if you just run something like: > *sc.te

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
2014 at 7:59 PM, Soumya Simanta wrote: > > > Can you try submitting a very simple job to the cluster. > > On Jul 16, 2014, at 10:25 AM, Sarath Chandra < > sarathchandra.jos...@algofusiontech.com> wrote: > > Yes it is appearing on the Spark UI, and remains there wit

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Sarath On Wed, Jul 16, 2014 at 7:48 PM, Soumya Simanta wrote: > When you submit your job, it should appear on the Spark UI. Same with the > REPL. Make sure your job is submitted to the cluster properly. > > > On Wed, Jul 16, 2014 at 10:08 AM, Sarath Chandra < > sarathchandra.j

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
anything going wrong, all are info messages. What else do I need to check? ~Sarath On Wed, Jul 16, 2014 at 7:23 PM, Soumya Simanta wrote: > Check your executor logs for the output or if your data is not big collect > it in the driver and print it. > > > > On Jul 16, 2014, at 9:21 AM

Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Hi All, I'm trying to do a simple record matching between 2 files and wrote the following code - *import org.apache.spark.sql.SQLContext;* *import org.apache.spark.rdd.RDD* *object SqlTest {* * case class Test(fld1:String, fld2:String, fld3:String, fld4:String, fld4:String, fld5:Double, fld6:String)
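A minimal sketch of record matching between two pipe-delimited HDFS files with the Spark SQL API of that era (1.0.x); the field names, join columns, and record layout are assumptions, while the file paths follow the ones quoted later in the thread:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object SqlTestSketch {
      case class Rec(fld1: String, fld2: String, amount: Double)

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("SqlTestSketch"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.createSchemaRDD   // implicit RDD[Product] -> SchemaRDD (1.0.x API)

        def load(path: String) =
          sc.textFile(path).map(_.split("\\|", -1)).map(f => Rec(f(0), f(1), f(2).toDouble))

        load("hdfs://master:54310/user/hduser/file1.csv").registerAsTable("t1")
        load("hdfs://master:54310/user/hduser/file2.csv").registerAsTable("t2")

        // Records are considered a match when the assumed key columns agree.
        val matched = sqlContext.sql(
          "SELECT t1.fld1, t1.fld2 FROM t1 JOIN t2 ON t1.fld1 = t2.fld1 AND t1.amount = t2.amount")
        matched.collect().foreach(println)
        sc.stop()
      }
    }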