from:"\"Fridtjof Sander\""

Re: Can not control bucket files number if it was specified

2016-09-19 Thread Fridtjof Sander

I didn't follow all of this thread, but if you want to have exactly one bucket-output-file per RDD-partition, you have to repartition (shuffle) your data on the bucket-key. If you don't repartition (shuffle), you may have records with different bucket-keys in the same RDD-partition, leading to t
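The effect described above can be sketched without Spark. This is a plain Scala simulation, not Spark's writer: it assumes one output file per distinct (RDD partition, bucket) pair, and `bucketOf` is a hypothetical stand-in for the real bucket hash.

```scala
object BucketFileCount {
  // Hypothetical bucket function for illustration; Spark uses its own hash.
  def bucketOf(key: Int, numBuckets: Int): Int =
    java.lang.Math.floorMod(key, numBuckets)

  // Assumption: each partition writes one file per bucket it contains,
  // so the file count is the number of distinct (partition, bucket) pairs.
  def fileCount(partitions: Seq[Seq[Int]], numBuckets: Int): Int =
    partitions.zipWithIndex.flatMap { case (keys, p) =>
      keys.map(k => (p, bucketOf(k, numBuckets)))
    }.distinct.size

  def main(args: Array[String]): Unit = {
    val keys = 0 until 100
    val numBuckets = 4
    // Not repartitioned: every partition holds keys from every bucket.
    val unshuffled = keys.grouped(25).toSeq
    // Repartitioned on the bucket key: one bucket per partition.
    val shuffled = keys.groupBy(bucketOf(_, numBuckets)).values.toSeq
    println(fileCount(unshuffled, numBuckets)) // 16
    println(fileCount(shuffled, numBuckets))   // 4
  }
}
```

Under these assumptions, shuffling on the bucket key brings the file count down from partitions times buckets to exactly one file per bucket.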

Re: How to make the result of sortByKey distributed evenly?

2016-09-06 Thread Fridtjof Sander

Your data has only two keys, and basically all values are assigned to only one of them. There is no better way to distribute the keys than the one Spark uses. What you have to do is sort and range-partition on different keys. Try invoking sortBy() on a non-pair RDD. This will t
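To see why two keys cannot be spread evenly, here is a simplified range-partitioning model in plain Scala. It is not Spark's RangePartitioner; the split-point rule, the record layout, and the composite (key, value) sort key are assumptions made for illustration.

```scala
object RangePartitionSkew {
  // Simplified range partitioning: split points are taken at evenly spaced
  // ranks of the sorted keys. A record lands in the partition whose index is
  // the number of split points strictly below its key, so equal keys always
  // share a partition.
  def partitionSizes[K](keys: Seq[K], numPartitions: Int)(implicit ord: Ordering[K]): List[Int] = {
    val sorted = keys.sorted
    val bounds = (1 until numPartitions)
      .map(i => sorted(i * sorted.size / numPartitions))
      .distinct
    val sizes = Array.fill(numPartitions)(0)
    keys.foreach(k => sizes(bounds.count(b => ord.lt(b, k))) += 1)
    sizes.toList
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical data: one record under key "a", 999 under key "b".
    val records = (0 until 1000).map(i => (if (i == 0) "a" else "b", i))
    // Partitioning on the key alone cannot split the "b" records.
    println(partitionSizes(records.map(_._1), 4)) // List(1000, 0, 0, 0)
    // Partitioning on the whole record gives many distinct sort keys.
    println(partitionSizes(records, 4))           // List(251, 250, 250, 249)
  }
}
```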

Re: Splitting columns from a text file

2016-09-05 Thread Fridtjof Sander

Ask yourself how to access the third element in an array in Scala. On 05.09.2016 at 16:14, Ashok Kumar wrote: Hi, I want to filter them for values. This is what is in array 74,20160905-133143,98.11218069128827594148 I want to filter anything > 50.0 in the third column Thanks On Monday,
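A minimal sketch of the hint, in plain Scala: arrays are zero-indexed, so the third column is `fields(2)`. The first sample line is the one quoted in the question; the second line and the helper name `thirdColumnAbove` are made up for illustration.

```scala
object FilterThirdColumn {
  // Keep only the lines whose third comma-separated field exceeds the threshold.
  def thirdColumnAbove(lines: Seq[String], threshold: Double): Seq[String] =
    lines.filter { line =>
      val fields = line.split(",")
      // Index 2 is the third element; skip lines with fewer than three fields.
      fields.length > 2 && fields(2).toDouble > threshold
    }

  def main(args: Array[String]): Unit = {
    val lines = Seq(
      "74,20160905-133143,98.11218069128827594148", // from the question
      "75,20160905-133144,12.5"                     // hypothetical second row
    )
    thirdColumnAbove(lines, 50.0).foreach(println)
  }
}
```

The same predicate could be passed to a filter on an RDD of lines, assuming every line has a numeric third field.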

Re: seeing this message repeatedly.

2016-09-04 Thread Fridtjof Sander

Have you followed this? http://spark.apache.org/docs/latest/spark-standalone.html It sounds more like your master is not connected to any executor. Hence, no resources are available. On 04.09.16 at 05:34, kant kodali wrote: I don't think my driver program which is running on my local machine

Re: any idea what this error could be?

2016-09-03 Thread Fridtjof Sander

>compile group: 'org.apache.spark' name: 'spark-streaming_2.10' version: '2.0.0' >on the executor side I don't know what jars are being used but I have installed using this zip file spark-2.0.0-bin-hadoop2.7.tgz

Re: any idea what this error could be?

2016-09-03 Thread Fridtjof Sander

There is an InvalidClassException complaining about non-matching serialVersionUIDs. Shouldn't that be caused by different jars on executors and driver? On 03.09.2016 at 1:04 PM, "Tal Grynbaum" wrote: > My guess is that you're running out of memory somewhere. Try to increase > the driver memor
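One way to check the serialVersionUID theory is to print the UID a JVM actually sees for a class and compare the value between driver and executor classpaths. The sketch below uses the standard `java.io.ObjectStreamClass` API; the `Event` class and its UID of 42 are hypothetical stand-ins for whichever class the exception names.

```scala
import java.io.ObjectStreamClass

// Hypothetical serializable class with a pinned serialVersionUID.
@SerialVersionUID(42L)
case class Event(id: Int, payload: String)

object SerialUidCheck {
  // Returns the serialVersionUID that Java serialization uses for this class.
  // Note: lookup returns null for classes that are not serializable.
  def uidOf(cls: Class[_]): Long =
    ObjectStreamClass.lookup(cls).getSerialVersionUID

  def main(args: Array[String]): Unit =
    println(uidOf(classOf[Event])) // 42
}
```

If the same class name yields different values on two machines, the jars on those machines differ.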

Re: Grouping on bucketed and sorted columns

2016-09-02 Thread Fridtjof Sander

ntation available in 2.0.0. I would highly appreciate some feedback to my thoughts and questions On 31.08.2016 at 14:45, Fridtjof Sander wrote: Hi Spark users, I'm currently investigating spark's bucketing and partitioning capabilities and I have some questions: Let /T/ be a table

Grouping on bucketed and sorted columns

2016-08-31 Thread Fridtjof Sander

Hi Spark users, I'm currently investigating spark's bucketing and partitioning capabilities and I have some questions: Let /T/ be a table that is bucketed and sorted by /T.id/ and partitioned by /T.date/. Before persisting, /T/ has been repartitioned by /T.id/ to get only one file per bucket

Re: Isotonic Regression, run method overloaded Error

2016-07-11 Thread Fridtjof Sander

IsotonicRegression().setIsotonic(true) val model = ir.fit(dataset) val predictions = model .transform(dataset) .select("prediction").rdd.map { case Row(pred) => pred }.collect() assert(predictions === Array(1, 2, 2, 2, 6, 16.5, 16.5, 17, 18)) | Thanks Yanbo 2016-07-11 6:14 GMT-0

Re: Isotonic Regression, run method overloaded Error

2016-07-11 Thread Fridtjof Sander

Hi Swaroop, from my understanding, Isotonic Regression is currently limited to data with 1 feature plus weight and label. Also the entire data is required to fit into memory of a single machine. I did some work on the latter issue but discontinued the project, because I felt no one really need
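For orientation, a single-machine isotonic fit over exactly that shape of data (label, one feature, weight) can be sketched with the pool-adjacent-violators approach. This is a plain Scala illustration, not Spark's implementation; the input labels are chosen for illustration and all weights are assumed positive.

```scala
object IsotonicSketch {
  // Weighted pool-adjacent-violators for a non-decreasing fit.
  // Each point is (label, feature, weight); weights must be positive.
  def fit(points: Seq[(Double, Double, Double)]): Seq[Double] = {
    // Blocks hold (weighted label sum, total weight, point count),
    // with the most recently added block at the head of the list.
    var blocks = List.empty[(Double, Double, Int)]
    for ((label, _, weight) <- points.sortBy(_._2)) {
      var cur = (label * weight, weight, 1)
      // Merge backwards while the previous block's mean exceeds the current mean.
      while (blocks.nonEmpty && blocks.head._1 / blocks.head._2 > cur._1 / cur._2) {
        val prev = blocks.head
        cur = (prev._1 + cur._1, prev._2 + cur._2, prev._3 + cur._3)
        blocks = blocks.tail
      }
      blocks = cur :: blocks
    }
    // Expand each block back to one fitted value per point, in feature order.
    blocks.reverse.flatMap { case (sum, w, n) => Seq.fill(n)(sum / w) }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical input: labels indexed by feature 0..8, unit weights.
    val labels = Seq(1.0, 2.0, 3.0, 1.0, 6.0, 17.0, 16.0, 17.0, 18.0)
    val points = labels.zipWithIndex.map { case (y, i) => (y, i.toDouble, 1.0) }
    println(fit(points)) // List(1.0, 2.0, 2.0, 2.0, 6.0, 16.5, 16.5, 17.0, 18.0)
  }
}
```

The whole input is held in one in-memory list here, which mirrors the single-machine limitation mentioned above.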

10 matches
