On 08/13/2016 08:24 PM, guyoh wrote:
> My company is trying to decide whether to use Kubernetes or Mesos. Since we
> are planning to use Spark in the near future, I was wondering what is the
> best choice for us.
> Thanks,
> Guy
>
Both Kubernetes and Mesos enable you to share your infrastructure
Hi,
I am wondering if it is possible for the Spark standalone master UI to
proxy the app/driver UI and worker UI. The reason for this is that
currently, if you want to access the UI of a driver or worker to see
logs, you need access to its IP:port, which makes it harder to open up
from a networking perspective.
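For anyone finding this thread later: newer Spark releases added exactly this capability, where the standalone master UI reverse-proxies worker and application UIs so only the master's address needs to be reachable. A minimal sketch of the relevant settings (option names as introduced around Spark 2.1; the hostname below is a placeholder, so verify against the configuration docs for your version):

```
# spark-defaults.conf -- sketch, assuming Spark 2.1 or later
# Have the master UI proxy worker and application UIs
spark.ui.reverseProxy      true
# Public URL through which clients reach the proxy (hypothetical host)
spark.ui.reverseProxyUrl   http://spark-master.example.com:8080
```

With this enabled, worker and app UI links in the master page are rewritten to go through the master, so only one address has to be opened in the firewall.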
On 09/10/2015 07:42 AM, Tathagata Das wrote:
> Rewriting is necessary. You will have to convert RDD/DStream operations
> to DataFrame operations. So get the RDDs in DStream, using
> transform/foreachRDD, convert to DataFrames and then do DataFrame
> operations.
Are there any plans for 1.6 or later?
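The transform/foreachRDD pattern Tathagata describes can be sketched in PySpark roughly as follows. This is a sketch against the Spark 1.x APIs, not a tested program: it needs a running Spark Streaming deployment, and the socket source, schema, and names (`lines`, `process`) are illustrative assumptions.

```
# Sketch: per batch, convert the DStream's RDD into a DataFrame and
# run DataFrame operations on it (Spark 1.x-era API).
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.sql import SQLContext, Row

sc = SparkContext(appName="dstream-to-dataframe-sketch")
ssc = StreamingContext(sc, 10)          # 10-second batches
sqlContext = SQLContext(sc)

lines = ssc.socketTextStream("localhost", 9999)  # hypothetical source

def process(time, rdd):
    if rdd.isEmpty():
        return
    # Inside foreachRDD we have a plain RDD, so we can build a
    # DataFrame from it and use DataFrame operations from here on.
    df = sqlContext.createDataFrame(rdd.map(lambda w: Row(word=w)))
    df.groupBy("word").count().show()

lines.foreachRDD(process)
ssc.start()
ssc.awaitTermination()
```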
On 09/05/2015 11:22 AM, Reynold Xin wrote:
> Try increasing the shuffle memory fraction (by default it is only 16%).
> Again, if you run Spark 1.5, this will probably run a lot faster,
> especially if you increase the shuffle memory fraction ...
Hi Reynold,
Does 1.5 have better join/cogroup performance?
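For reference, the fraction Reynold mentions is governed by `spark.shuffle.memoryFraction` on the 1.x legacy memory manager; its default of 0.2 multiplied by the 0.8 safety fraction gives the "only 16%" figure. A sketch of raising it (a tuning suggestion, not a universal recommendation; check that it fits alongside the storage fraction in your heap):

```
# spark-defaults.conf -- sketch for Spark 1.x legacy memory management
# Raise shuffle memory from the 0.2 default. Together with
# spark.storage.memoryFraction (0.6 default) this must fit in the heap.
spark.shuffle.memoryFraction   0.4
```

The same setting can be passed per job with `--conf spark.shuffle.memoryFraction=0.4` on spark-submit. Note that Spark 1.6+ replaced this scheme with unified memory management, where this knob no longer applies.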
>
>
> On Tue, Jul 15, 2014 at 11:02 PM, Vinod Kone <vi...@twitter.com> wrote:
>
>
> On Fri, Jul 4, 2014 at 2:05 AM, Gurvinder Singh
> <gurvinder.si...@uninett.no> wrote:
>
> ERROR storage.BlockManagerMasterActor: Got two different block manager registrations
I want to add that there is a regression when using PySpark to read data
from HDFS: its performance during map tasks has dropped to roughly half
(1x -> 0.5x). I have tested 1.0.2 and the performance was fine, but the
1.1 release candidate has this issue. I tested by setting the following
properties for this operation.
Any suggestion/help in this regard will be helpful.
- Gurvinder
On 08/14/2014 10:27 AM, Gurvinder Singh wrote:
> Hi,
>
> I am running Spark from git directly. I recently compiled the newer
> Aug 13 version and it has a performance drop of 2-3x in reads from
> HDFS
Hi,
I am running Spark from git directly. I recently compiled the newer
Aug 13 version and it has a performance drop of 2-3x in reads from HDFS
compared to the git version of Aug 1. So I am wondering which commit
could have caused such an issue in read performance. The performance is
almost the same
where to look for changing the Mesos setting in this
case.
- Gurvinder
>
> On Sun, Aug 3, 2014 at 11:35 PM, Gurvinder Singh
> <gurvinder.si...@uninett.no> wrote:
>
> On 08/03/2014 02:33 AM, Michael Armbrust wrote:
> > I am not a Mesos expert... but it sounds
It has the
exact size set in the -Xms/-Xmx params. Do you know if I can somehow find
which class or thread inside the Spark JVM process is using how much
memory, and see what makes it reach the memory limit in the cacheTable
case but not in the cached-RDD case.
- Gurvinder
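One general JVM technique for this (not Spark-specific): take a per-class heap histogram or a full heap dump of the executor process during each run and compare the cacheTable job against the plain RDD-cache job. A sketch, assuming JDK tools are on the PATH and `<pid>` is the executor JVM's process id:

```
# Per-class live object counts and sizes, largest first
jmap -histo:live <pid> | head -n 30

# Full heap dump for offline analysis in a tool such as Eclipse MAT
jmap -dump:live,format=b,file=executor.hprof <pid>

# Per-thread stacks, to see which threads are doing the allocating
jstack <pid>
```

Per-thread memory attribution is not directly available from these tools, but diffing the two histograms usually shows which classes account for the extra usage in the cacheTable case.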
>
> On Fri, Aug 1, 2014 at 12:07 A
ed for SchemaRDDs. It is something similar to MEMORY_ONLY_SER
> but not quite. You can specify the persistence level on the
> SchemaRDD itself and register that as a temporary table, however it
> is likely you will not get as good performance.
>
>
> On Thu, Jul 31, 2014 at
Hi,
I am wondering how I can specify the persistence level in cacheTable, as
it takes only the table name as a parameter. It should be possible to
specify the persistence level.
- Gurvinder
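The workaround Michael describes elsewhere in this thread (persist with an explicit level yourself, then register as a temporary table) looks roughly like this. For readability the sketch uses the 1.3+ DataFrame names; the 2014-era SchemaRDD API is analogous. The data and table name are illustrative.

```
# Sketch (Spark 1.x API): pick a persistence level manually instead of
# sqlContext.cacheTable, which accepts only a table name.
from pyspark import SparkContext, StorageLevel
from pyspark.sql import SQLContext, Row

sc = SparkContext(appName="persist-level-sketch")
sqlContext = SQLContext(sc)

people = sqlContext.createDataFrame(
    sc.parallelize([Row(name="a", age=1), Row(name="b", age=2)]))

# Explicit storage level, then register as a temp table. Note this
# bypasses the columnar in-memory format that cacheTable uses, so
# scans may be slower, as Michael warns above.
people.persist(StorageLevel.MEMORY_ONLY_SER)
people.registerTempTable("people")
sqlContext.sql("SELECT name FROM people WHERE age > 1").show()
```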
On 07/06/2014 05:19 AM, Nicholas Chammas wrote:
> On Fri, Jul 4, 2014 at 3:33 PM, Gurvinder Singh
> <gurvinder.si...@uninett.no> wrote:
>
> csv =
> sc.newAPIHadoopFile(opts.input, "com.hadoop.mapreduce.LzoTextInputFormat",
> "org.apache.hadoop.io.LongWritable", "org.apache.hadoop.io.Text").count()
- Gurvinder
On 07/03/2014 06:24 PM, Gurvinder Singh wrote:
> Hi all,
>
> I am trying to read LZO files. It seems Spark recognizes that the
> input file is compressed and got the decompressor, as
>
> 14/07/03 18:11:01 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
is not set.
>
> Just to mention again: pyspark works fine, as does spark-shell;
> only when we run the compiled jar does SPARK_HOME seem to cause some
> Java runtime issues such that we get a ClassCastException.
>
> Thanks,
> Gurvinder
> On 07/01/2014 09:28 AM, Gurvinder Singh wrote:
We are getting this issue when we are running jobs with close to 1000
workers. Spark is the git version and Mesos is 0.19.0.
ERROR storage.BlockManagerMasterActor: Got two different block manager
registrations on 201407031041-1227224054-5050-24004-0
Googling about it, it seems that Mesos is st
fine as well as spark-shell;
only when we run the compiled jar does SPARK_HOME seem to cause some
Java runtime issues such that we get a ClassCastException.
Thanks,
Gurvinder
On 07/01/2014 09:28 AM, Gurvinder Singh wrote:
> Hi,
>
> I am having an issue running the Scala example code. I have t
Hi all,
I am trying to read LZO files. It seems Spark recognizes that the
input file is compressed and got the decompressor, as
14/07/03 18:11:01 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
14/07/03 18:11:01 INFO lzo.LzoCodec: Successfully loaded & initialized
native-lzo library [h
Hi,
I am having an issue running the Scala example code. I have tested the
Python example code and it runs successfully, but when I run the Scala
code I get this error:
java.lang.ClassCastException: cannot assign instance of
org.apache.spark.examples.SparkPi$$anonfun$1 to field
org.apache.spark.rdd.Ma