Re: Remote jar file

2014-12-11 Thread rahulkumar-aws
Put the jar file in HDFS. The URL must be globally visible inside your cluster, for instance an hdfs:// path or a file:// path that is present on all nodes.
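For instance (a minimal sketch, not from the thread; the master URL, class name, and jar path are placeholders), spark-submit can take the application jar straight from an hdfs:// URL:

    ./bin/spark-submit \
      --class com.example.MyApp \
      --master spark://master:7077 \
      hdfs:///user/jobs/my-app.jar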

Re: Spark Streaming in Production

2014-12-12 Thread rahulkumar-aws
Run the Spark cluster managed by Apache Mesos. Mesos can run in high-availability mode, in which multiple Mesos masters run simultaneously.
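When multiple masters are coordinated through ZooKeeper, the driver points at the ZooKeeper ensemble rather than a single master. A hedged sketch (hostnames, class, and jar are placeholders):

    ./bin/spark-submit \
      --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos \
      --class com.example.StreamingJob \
      my-streaming-job.jar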

Re: Access to s3 from spark

2014-12-12 Thread rahulkumar-aws
Try either of the following:

1. Set the access key and secret key on the SparkContext's Hadoop configuration (the sparkContext.set(...) calls in the original post are not a real API; the Hadoop properties the s3n:// connector actually reads are shown in the sketch below).

2. Set the access key and secret key in the environment before starting your application:

    export AWS_ACCESS_KEY_ID=yourAccessKey
    export AWS_SECRET_ACCESS_KEY=yourSecretKey
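A minimal sketch of option 1, assuming the 2014-era s3n:// filesystem; the app name, bucket, and path are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("s3Read"))

    // Hadoop configuration properties read by the s3n:// connector
    sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
    sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))

    val lines = sc.textFile("s3n://my-bucket/path/data.txt")
    println(lines.count())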

Not able to run spark job from code on EC2 with spark 1.2.0

2015-01-17 Thread rahulkumar-aws
Hi, I am trying to run a simple count on an S3 bucket, but with Spark 1.2.0 on EC2 it fails to run. I started my cluster using the ec2 script that came with Spark 1.2.0. Some part of the code: … It works with Spark 1.1.1 but not with 1.2.0.

Re: EC2 Having script run at startup

2015-03-25 Thread rahulkumar-aws
You can use the AWS user-data feature; try it and see if it helps: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
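A minimal sketch of a user-data script (everything below is a placeholder example, not from the thread); EC2 runs it once, as root, on first boot:

    #!/bin/bash
    # Example EC2 user-data: executed once, as root, when the instance first boots.
    yum install -y htop
    echo "startup script ran at $(date)" >> /var/log/startup-marker.log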

Re: Header in each output files.

2015-06-19 Thread rahulkumar-aws
Just check this Stack Overflow link; it may help: http://stackoverflow.com/questions/26157456/add-a-header-before-text-file-on-save-in-spark
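A minimal sketch of the idea (not from the thread): prepend a header line to every partition file produced by saveAsTextFile; the header, data, and output path are placeholders:

    val header = "id,name,score"
    val data = sc.parallelize(Seq("1,alice,90", "2,bob,85"))

    // mapPartitions lets us emit the header once at the top of each output file
    data
      .mapPartitions(rows => Iterator(header) ++ rows)
      .saveAsTextFile("hdfs:///tmp/with-headers")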

Re: Error when connecting to Spark SQL via Hive JDBC driver

2015-06-19 Thread rahulkumar-aws
It looks like your Spark-Hive jars are not compatible with Spark; compile the Spark source with the Hive 13 flags:

    mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package

That should solve your problem.

Re: Spark application in production without HDFS

2015-06-15 Thread rahulkumar-aws
Hi, if your data is not too huge you can use either Cloudera's or HDP's free stack. Cloudera Express is 100% open source and free.

Re: Limit Spark Shuffle Disk Usage

2015-06-15 Thread rahulkumar-aws
Check this link: https://forums.databricks.com/questions/277/how-do-i-avoid-the-no-space-left-on-device-error.html Hope this solves your problem.
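One commonly suggested knob (a hedged sketch, not from the thread): point spark.local.dir, the directory where shuffle and spill files land, at a volume with more space; the path is a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("shuffleHeavyJob")
      .set("spark.local.dir", "/mnt/large-disk/spark-tmp") // shuffle/spill directory
    val sc = new SparkContext(conf)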

Re: How to run multiple Spark jobs as a workflow that takes input from a Streaming job in Oozie

2015-12-20 Thread rahulkumar-aws
Use Spark Job Server: https://github.com/spark-jobserver/spark-jobserver (see the sketch below). Additionally: 1. You can also write your own job server with Spray (a Scala REST framework). 2. Create a Thrift server and pass the state of each job (as a Thrift object) between your different jobs.
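A quick sketch of the spark-jobserver REST flow, adapted from its README (8090 is the default port; the jar name, app name, and class are placeholders):

    # upload the application jar under an app name
    curl --data-binary @target/my-spark-job.jar localhost:8090/jars/myapp

    # trigger a job by app name and main class; returns a job id to poll
    curl -d "" 'localhost:8090/jobs?appName=myapp&classPath=com.example.MyJob'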

Re: How to give name to Spark jobs shown in Spark UI

2016-07-26 Thread rahulkumar-aws
You can set the name in SparkConf(), or if you are using spark-submit, set the --name flag:

    val sparkConf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("saveFileJob")
    val sc = new SparkContext(sparkConf)

or with spark-submit:

    ./bin/spark-submit --name "FileSaveJob" …

Re: Guide Step by step Spark streaming

2016-09-15 Thread rahulkumar-aws
Your project is really nice, and you can simulate various domain use cases with it. As this is your college project I can't help you with the coding, but I can share a presentation of mine (https://goo.gl/XUJd3b) that gives a graphical diagram of a real-life system.

Re: Spark job within Web application

2016-09-15 Thread rahulkumar-aws
Hi, from your code it looks like you are trying to call Spark code inside your servlet. My suggestion: make it a separate system and communicate using Thrift or Protobuf libraries. I have built various distributed web apps using Play Framework, Akka, Spray, and Jersey, and this approach works well. One way to decouple the two is sketched below.
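A minimal sketch (not from the thread) of keeping Spark out of the servlet: launch the job as a separate process with org.apache.spark.launcher.SparkLauncher (available since Spark 1.4); the jar path, main class, and master URL are placeholders:

    import org.apache.spark.launcher.SparkLauncher

    object JobTrigger {
      // Called from the web layer: runs the Spark job in its own JVM
      // instead of creating a SparkContext inside the servlet.
      def runCountJob(): Int = {
        val process = new SparkLauncher()
          .setAppResource("/opt/jobs/my-spark-job.jar")
          .setMainClass("com.example.CountJob")
          .setMaster("spark://master:7077")
          .launch()
        process.waitFor() // exit code of the launched spark-submit process
      }
    }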