Hi;
I am trying multiclass text classification with the RandomForest classifier on
my local computer (16 GB RAM, 4 physical cores).
When I run with the parameters below, I am getting a
"java.lang.OutOfMemoryError: GC overhead limit exceeded" error.
spark-submit --driver-memory 1G --driver-memory
Hi;
I have 2 DataFrames. I am trying a cross join to compute vector distances,
so that I can then choose the most similar vectors.
The cross join is too slow. How can I increase the speed, or do you have any
suggestion for this comparison?
val result = myDict.join(mainDataset).map(x => {
Spark is not running on Mesos; it is running in client mode only.
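One common way to speed this up, assuming one of the two DataFrames (here myDict) is small enough to fit in memory, is to broadcast the small side so the cross join avoids a full shuffle, and to compare squared distances (skipping the sqrt preserves the ordering). This is only a sketch; the column names (id, features, dictId, dictFeatures) are assumptions, since the original schema is not shown:

```scala
import org.apache.spark.sql.functions.broadcast

// Hypothetical sketch: broadcast the small dictionary side so each partition
// of mainDataset is paired with it locally, instead of a shuffled cartesian.
val result = mainDataset.join(broadcast(myDict))
  .rdd
  .map { row =>
    val a = row.getAs[Seq[Double]]("features")      // assumed column from mainDataset
    val b = row.getAs[Seq[Double]]("dictFeatures")  // assumed column from myDict
    // Squared Euclidean distance: same ordering as the true distance, cheaper.
    val dist = a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
    (row.getAs[String]("id"), row.getAs[String]("dictId"), dist)
  }
```

Picking the most similar vector per id is then a reduceByKey on the distance, which stays distributed instead of collecting the full cross product.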
From: Rodrick Brown [mailto:rodr...@orchardplatform.com]
Sent: Monday, November 7, 2016 8:15 PM
To: Kürşat Kurt <kur...@kursatkurt.com>
Cc: Sean Owen <so...@cloudera.com>; User <user@spark.apache.org>
Subject: Re: Out
To: Kürşat Kurt <kur...@kursatkurt.com>; user@spark.apache.org
Subject: Re: Out of memory at 60GB free memory.
You say "out of memory" and you allocate a huge amount of driver memory, but
it's your executor that's running out of memory. You want --executor-memory.
You ca
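Concretely, that advice means moving memory from the driver to the executors. The command below is a placeholder sketch, not the original submission: the master URL and the memory figures are assumptions, and it presumes a cluster deployment, since under --master local[*] everything runs inside the driver JVM and only --driver-memory applies.

```bash
# Assumption: a standalone cluster master. The executors, not the driver,
# hold the partitions being trained on, so give them the memory.
./spark-submit --class main.scala.Test1 \
  --master spark://master:7077 \
  --driver-memory 4g \
  --executor-memory 8g \
  /home/user1/project_2.11-1.0.jar
```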
Any idea about this?
From: Kürşat Kurt [mailto:kur...@kursatkurt.com]
Sent: Sunday, October 30, 2016 7:59 AM
To: 'Jörn Franke' <jornfra...@gmail.com>
Cc: 'user@spark.apache.org' <user@spark.apache.org>
Subject: RE: Out Of Memory issue
Hi Jörn;
I am reading 300.000 l
Hi;
While training a NaiveBayes classifier, I am getting an OOM.
What is wrong with these parameters?
Here is the spark-submit command: ./spark-submit --class main.scala.Test1
--master local[*] --driver-memory 60g /home/user1/project_2.11-1.0.jar
PS: The OS is Ubuntu 14.04 and the system has
Hi;
I am trying to train a Random Forest classifier.
I have a predefined classification set (classifications.csv, ~300.000 lines).
While fitting, I am getting a "Size exceeds Integer.MAX_VALUE" error.
Here is the code:
object Test1 {
var savePath = "c:/Temp/SparkModel/"
var
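The "Size exceeds Integer.MAX_VALUE" error usually means a single partition or cached block grew past the ~2 GB limit of a java.nio.ByteBuffer, which Spark uses per block. A common workaround is to split the input into more, smaller partitions before fitting. This is a hedged sketch, not the original code; the variable names and the partition count are assumptions to be tuned:

```scala
// Hypothetical sketch: repartition so no single block approaches the 2 GB
// ByteBuffer limit that triggers "Size exceeds Integer.MAX_VALUE".
val data = spark.read.csv("classifications.csv")
val repartitioned = data.repartition(200)  // more partitions => smaller blocks
// ...then fit the RandomForest model on `repartitioned` as before.
```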