date:20151023

slightly more informative error message in MLUtils.loadLibSVMFile

2015-10-23 Thread Robert Dodier

Hi, MLUtils.loadLibSVMFile verifies that indices are 1-based and increasing, and otherwise triggers an error. I'd like to suggest that the error message be a little more informative. I ran into this when loading a malformed file. Exactly what gets printed isn't too crucial, maybe you would want to
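To illustrate the kind of message being suggested, here is a minimal sketch in plain Python. It is not the Spark implementation; `parse_libsvm_line` and its `line_number` parameter are hypothetical names used only for this example. It checks that indices are one-based and increasing, and reports the offending index and record when they are not.

```python
def parse_libsvm_line(line, line_number=None):
    """Parse one 'label index:value ...' record (hypothetical helper).

    Raises ValueError naming the offending index and record if the
    indices are not one-based and strictly increasing.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty record")
    where = f"line {line_number}" if line_number is not None else "input"
    label = float(tokens[0])
    indices, values = [], []
    previous = None  # last index seen in this record, if any
    for token in tokens[1:]:
        index_text, value_text = token.split(":", 1)
        index = int(index_text)
        if index < 1:
            raise ValueError(
                f"{where}: index {index} is not one-based in record {line!r}"
            )
        if previous is not None and index <= previous:
            raise ValueError(
                f"{where}: index {index} does not increase after index "
                f"{previous} in record {line!r}"
            )
        indices.append(index)
        values.append(float(value_text))
        previous = index
    return label, indices, values
```

Calling `parse_libsvm_line("1 3:2.0 2:0.5", line_number=7)` raises an error that names line 7, the two conflicting indices, and the record itself, which is usually enough to locate the problem in a malformed file.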

Spark.Executor.Cores question

2015-10-23 Thread mkhaitman

Regarding the 'spark.executor.cores' config option in a standalone Spark environment, I'm curious about whether there's a way to enforce the following logic:
- Max cores per executor = 4
- Max executors PER application PER worker = 1
In order to force better balance across all workers, I want
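The arithmetic behind the question can be sketched as follows. This is only an illustration under a stated assumption: that, with 'spark.executor.cores' set, a worker may host as many executors for one application as its cores allow. The helper `max_executors_per_worker` and the 16-core worker are hypothetical and not taken from the thread.

```python
def max_executors_per_worker(worker_cores: int, executor_cores: int) -> int:
    """Upper bound on executors of a given size that fit on one worker.

    Hypothetical helper: assumes cores are the only limiting resource.
    """
    if worker_cores < 0 or executor_cores < 1:
        raise ValueError("worker_cores must be >= 0 and executor_cores >= 1")
    return worker_cores // executor_cores


# With 4 cores per executor, a hypothetical 16-core worker has room for
# up to 4 executors, rather than the single executor the poster wants.
upper_bound = max_executors_per_worker(worker_cores=16, executor_cores=4)
```

Under that assumption, capping cores per executor alone does not cap executors per worker, which is why the second constraint is being asked about.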

RE: Dataframe nested schema inference from Json without type conflicts

2015-10-23 Thread Ewan Leith

Hi all, It’s taken us a while, but one of my colleagues has made the pull request on GitHub for our proposed solution to this: https://issues.apache.org/jira/browse/SPARK-10947 https://github.com/apache/spark/pull/9249 It adds a parameter to the JSON read options to force all primitives as a S

Re: Re: repartitionAndSortWithinPartitions task shuffle phase is very slow

2015-10-23 Thread 周千昊

We have not tried that yet; however, both the MR and Spark implementations were tested with the same number of partitions on the same cluster. 250635...@qq.com <250635...@qq.com> wrote on Fri, Oct 23, 2015 at 5:21 PM: > Hi, > > Not an expert on this kind of implementation. But referring to the > performance result, > > i

Re: repartitionAndSortWithinPartitions task shuffle phase is very slow

2015-10-23 Thread Li Yang

Any advice on how to tune the repartitionAndSortWithinPartitions stage? Any particular metrics or parameters to look into? Basically Spark and MR shuffle the same amount of data, because we kinda copied the MR implementation into Spark. Let us know if more info is needed. On Fri, Oct 23, 2015 at 10:24


5 matches
