How to sort an RDD filled with existing data structures?

2014-09-24 Thread Tao Xiao
Hi, I have the following RDD: val conf = new SparkConf() .setAppName("<< Testing Sorting >>") val sc = new SparkContext(conf) val L = List( (new Student("XiaoTao", 80, 29), "I'm Xiaotao"), (new Student("CCC", 100, 24), "I'm CCC"), (new Student("Jack", 90,
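For context, a minimal sketch of how such an RDD can be sorted, assuming a simple Student(name, score, age) class and an ordering by score then age; the last Student literal is completed with made-up numbers because the original snippet is cut off:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._   // brings the sortByKey conversion into scope on older Spark

    case class Student(name: String, score: Int, age: Int)

    object SortStudents {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("<< Testing Sorting >>")
        val sc = new SparkContext(conf)

        // sortByKey needs an implicit Ordering for the key type;
        // here we sort by score, breaking ties on age.
        implicit val studentOrdering: Ordering[Student] =
          Ordering.by(s => (s.score, s.age))

        val L = List(
          (Student("XiaoTao", 80, 29), "I'm Xiaotao"),
          (Student("CCC", 100, 24), "I'm CCC"),
          (Student("Jack", 90, 25), "I'm Jack"))   // the numbers after 90 are invented

        val sorted = sc.parallelize(L).sortByKey()
        sorted.collect().foreach(println)
        sc.stop()
      }
    }

sortByKey picks up whatever implicit Ordering[Student] is in scope, so changing the sort criterion only means changing studentOrdering.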

Re: All-time stream re-processing

2014-09-24 Thread Tobias Pfeiffer
Hi, On Wed, Sep 24, 2014 at 7:23 PM, Dibyendu Bhattacharya < dibyendu.bhattach...@gmail.com> wrote: > So you have a single Kafka topic which has very high retention period ( > that decides the storage capacity of a given Kafka topic) and you want to > process all historical data first using Camus

Re: sortByKey trouble

2014-09-24 Thread david
Thanks, I've already tried this solution but it does not compile (in Eclipse). I'm surprised to see that in spark-shell, sortByKey works fine on both forms: (String,String,String,String) and (String,(String,String,String)).

java.lang.NumberFormatException while starting spark-worker

2014-09-24 Thread jishnu.prathap
Hi, I am getting this weird error while starting the Worker. -bash-4.1$ spark-class org.apache.spark.deploy.worker.Worker spark://osebi-UServer:59468 Spark assembly has been built with Hive, including Datanucleus jars on classpath 14/09/24 16:22:04 INFO worker.Worker: Registered signal handle

Does Spark Driver works with HDFS in HA mode

2014-09-24 Thread Petr Novak
Hello, if our Hadoop cluster is configured with HA and "fs.defaultFS" points to a nameservice instead of a namenode hostname - hdfs:/// - then our Spark job fails with an exception. Is there anything to configure, or is it not implemented? Exception in thread "main" org.apache.spark.SparkException: Job
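For reference, a hedged sketch of one way to make an HA nameservice visible to the driver, assuming a nameservice called "mycluster" with namenodes nn1/nn2 (all hostnames and names below are placeholders). In many setups it is enough to point HADOOP_CONF_DIR at the cluster's core-site.xml and hdfs-site.xml; passing the same settings as spark.hadoop.* properties is an alternative:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("HdfsHaExample")
      .set("spark.hadoop.fs.defaultFS", "hdfs://mycluster")
      .set("spark.hadoop.dfs.nameservices", "mycluster")
      .set("spark.hadoop.dfs.ha.namenodes.mycluster", "nn1,nn2")
      .set("spark.hadoop.dfs.namenode.rpc-address.mycluster.nn1", "namenode1:8020")
      .set("spark.hadoop.dfs.namenode.rpc-address.mycluster.nn2", "namenode2:8020")
      .set("spark.hadoop.dfs.client.failover.proxy.provider.mycluster",
           "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")

    val sc = new SparkContext(conf)
    // The path now refers to the nameservice, not a specific namenode.
    val lines = sc.textFile("hdfs://mycluster/user/me/input.txt")
    println(lines.count())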

find subgraph in Graphx

2014-09-24 Thread uuree
Hi, I want to extract a subgraph using two clauses such as (?person worksAt MIT && MIT locatesIn USA); I have this query. My data is constructed as triplets, so I assume GraphX will suit it best. But I found out that the subgraph method only supports checking a single edge, and it seems to me that I cann
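A hedged sketch of one way to approximate such a two-clause pattern, assuming vertex attributes hold entity labels and edge attributes hold predicate names (the toy data and identifiers below are illustrative, not from the original post):

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx._

    def usaWorkers(sc: SparkContext): Graph[String, String] = {
      // Toy triples: (subject, predicate, object).
      val vertices = sc.parallelize(Seq((1L, "Alice"), (2L, "MIT"), (3L, "USA")))
      val edges = sc.parallelize(Seq(Edge(1L, 2L, "worksAt"), Edge(2L, 3L, "locatesIn")))
      val graph = Graph(vertices, edges)

      // Second clause first: ids of organisations with a "locatesIn" edge to "USA".
      val orgsInUsa: Set[VertexId] = graph.triplets
        .filter(t => t.attr == "locatesIn" && t.dstAttr == "USA")
        .map(_.srcId)
        .collect()
        .toSet   // assumed small enough to ship inside the closure

      // First clause: subgraph only evaluates one triplet at a time, so the
      // second clause was pre-computed above and is checked against that set.
      graph.subgraph(epred = t => t.attr == "worksAt" && orgsInUsa.contains(t.dstId))
    }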

Spark Streaming

2014-09-24 Thread Reddy Raja
Given this program, I have the following queries: val sparkConf = new SparkConf().setAppName("NetworkWordCount") sparkConf.set("spark.master", "local[2]") val ssc = new StreamingContext(sparkConf, Seconds(10)) val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.
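The snippet is cut off; below is a minimal sketch completing it along the lines of the standard NetworkWordCount example (everything after socketTextStream is assumed, not taken from the original mail):

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object NetworkWordCount {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf().setAppName("NetworkWordCount")
        sparkConf.set("spark.master", "local[2]")
        val ssc = new StreamingContext(sparkConf, Seconds(10))

        // Read lines from a socket, split into words, and count per batch.
        val lines = ssc.socketTextStream(args(0), args(1).toInt,
          StorageLevel.MEMORY_AND_DISK_SER)
        val words = lines.flatMap(_.split(" "))
        val wordCounts = words.map(w => (w, 1)).reduceByKey(_ + _)
        wordCounts.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }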

All-time stream re-processing

2014-09-24 Thread Tobias Pfeiffer
Hi, I have a setup (in mind) where data is written to Kafka and this data is persisted in HDFS (e.g., using Camus) so that I have an all-time archive of all stream data ever received. Now I want to process that all-time archive and when I am done with that, continue with the live stream, using Spa

Re: sortByKey trouble

2014-09-24 Thread Liquan Pei
Hi David, Can you try val rddToSave = file.map(l => l.split("\\|")).map(r => (r(34)+"-"+r(3), (r(4), r(10), r(12)))) ? That should work. Liquan On Wed, Sep 24, 2014 at 1:29 AM, david wrote: > Hi, > > Does anybody know how to use sortbykey in scala on a RDD like : > > val rddToSave = fi

RE: sortByKey trouble

2014-09-24 Thread Shao, Saisai
Hi, sortByKey is only for RDD[(K, V)]; each tuple can only have two members, and Spark will sort by the first member. If you want to use sortByKey, you have to change your RDD[(String, String, String, String)] into RDD[(String, (String, String, String))]. Thanks Jerry -Original Message- From
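A short sketch of the restructuring Jerry describes, with a placeholder input path since `file` is defined elsewhere in the original thread:

    import org.apache.spark.SparkContext._   // pair-RDD functions (needed on older Spark)

    val file = sc.textFile("input.txt")      // placeholder path
    // Key on the combined field and wrap the rest in a nested tuple:
    // the result is RDD[(String, (String, String, String))].
    val rddToSave = file.map(_.split("\\|"))
      .map(r => (r(34) + "-" + r(3), (r(4), r(10), r(12))))

    val sorted = rddToSave.sortByKey()       // sorts by the String key
    sorted.take(5).foreach(println)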

sortByKey trouble

2014-09-24 Thread david
Hi, Does anybody know how to use sortByKey in Scala on an RDD like: val rddToSave = file.map(l => l.split("\\|")).map(r => (r(34)+"-"+r(3), r(4), r(10), r(12))) because I received an error "sortByKey is not a member of org.apache.spark.rdd.RDD[(String,String,String,String)]". What I t

Re: Converting one RDD to another

2014-09-24 Thread Sean Owen
top returns the specified number of "largest" elements in your RDD. They are returned to the driver as an Array. If you want to make an RDD out of them again, call SparkContext.parallelize(...). Make sure this is what you mean though. On Wed, Sep 24, 2014 at 5:33 AM, Deep Pradhan wrote: > Hi, > I
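A tiny illustration of what this means in practice (the values are made up):

    val rdd = sc.parallelize(Seq(5, 1, 9, 3, 7))
    val largestThree: Array[Int] = rdd.top(3)           // Array(9, 7, 5), held on the driver
    val largestThreeRdd = sc.parallelize(largestThree)  // back into an RDD if that is really needed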

Re: HdfsWordCount only counts some of the words

2014-09-24 Thread Sean Owen
If you look at the code for HdfsWordCount, you see it calls print(), which by default prints only 10 elements from each RDD. If you are just talking about the console output, then it is not expected to print all words to begin with. On Wed, Sep 24, 2014 at 2:29 AM, SK wrote: > > I execute it as follow
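A small sketch of the difference, assuming a wordCounts DStream as in the word-count examples; printing every element is only sensible for small test data:

    wordCounts.print()                  // shows only the first few elements of each batch
    wordCounts.foreachRDD { rdd =>
      rdd.collect().foreach(println)    // prints all elements, collected to the driver
    }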

[Streaming] Non-blocking recommendation in custom receiver documentation and KinesisReceiver's worker.run blocking call

2014-09-24 Thread Aniket Bhatnagar
Hi all, Reading through Spark Streaming's custom receiver documentation, it is recommended that the onStart and onStop methods should not block indefinitely. However, looking at the source code of KinesisReceiver, the onStart method calls worker.run, which blocks until the worker is shut down (via a call to o
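For comparison, a minimal sketch of the non-blocking pattern the custom receiver guide recommends; the receiver name and data source below are made up:

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    class MyReceiver extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

      def onStart(): Unit = {
        // Spawn the long-running work on its own thread and return immediately.
        new Thread("MyReceiver worker") {
          override def run(): Unit = receive()
        }.start()
      }

      def onStop(): Unit = {
        // Nothing to do here: the worker thread checks isStopped() and exits on its own.
      }

      private def receive(): Unit = {
        while (!isStopped()) {
          store("some record")   // placeholder for reading from the real source
          Thread.sleep(1000)
        }
      }
    }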
