[jira] [Commented] (SPARK-18788) Add getNumPartitions() to SparkR

2016-12-09 Thread Raela Wang (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-18788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736836#comment-15736836 ]

Raela Wang commented on SPARK-18788:


Yes, RDD support in SparkR has been removed, which is why I think it is worth 
wrapping this and adding it to the current SparkR API. It would also be really 
useful to users who are trying out the new UDFs (dapply applies a function to 
each partition of a SparkDataFrame... but how many partitions do I have?).
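
To illustrate why the partition count matters for per-partition UDFs, here is a
toy, pure-Python sketch. It is not SparkR and not a Spark API: the helpers
split_into_partitions, apply_per_partition and get_num_partitions are
hypothetical stand-ins. The point is only that a dapply-style function runs
once per partition, so the number of partitions decides how many times it is
called and how many result chunks come back.

```python
def split_into_partitions(rows, num_partitions):
    """Split rows into contiguous chunks (hypothetical stand-in for a partitioned dataset)."""
    n = len(rows)
    bounds = [(i * n) // num_partitions for i in range(num_partitions + 1)]
    return [rows[bounds[i]:bounds[i + 1]] for i in range(num_partitions)]


def apply_per_partition(partitions, func):
    """Call func once per partition and concatenate whatever each call returns."""
    out = []
    for part in partitions:
        out.extend(func(part))
    return out


def get_num_partitions(partitions):
    """Report how many partitions there are (hypothetical helper, not a Spark call)."""
    return len(partitions)


rows = list(range(10))
partitions = split_into_partitions(rows, 3)

# The per-partition function emits one summary row (row count, sum) per partition,
# so the output has exactly get_num_partitions(partitions) rows.
summaries = apply_per_partition(partitions, lambda part: [(len(part), sum(part))])
```

Without a way to ask for the partition count up front, the size of summaries
can only be discovered after running the function.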

> Add getNumPartitions() to SparkR
> 
>
> Key: SPARK-18788
> URL: https://issues.apache.org/jira/browse/SPARK-18788
> Project: Spark
> Issue Type: New Feature
> Components: SparkR
> Reporter: Raela Wang
> Priority: Minor
>
> It would be really convenient to have getNumPartitions() in SparkR; it was 
> available in the RDD API. Today it is only reachable through private functions:
> rdd <- SparkR:::toRDD(df)  # df is an existing SparkDataFrame
> SparkR:::getNumPartitions(rdd)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-18788) Add getNumPartitions() to SparkR

2016-12-08 Thread Raela Wang (JIRA)
Raela Wang created SPARK-18788:
--

         Summary: Add getNumPartitions() to SparkR
             Key: SPARK-18788
             URL: https://issues.apache.org/jira/browse/SPARK-18788
         Project: Spark
      Issue Type: New Feature
      Components: SparkR
        Reporter: Raela Wang
        Priority: Minor


It would be really convenient to have getNumPartitions() in SparkR; it was 
available in the RDD API. Today it is only reachable through private functions:

# df is an existing SparkDataFrame
rdd <- SparkR:::toRDD(df)
SparkR:::getNumPartitions(rdd)






[jira] [Created] (SPARK-13274) Fix Aggregator Links on GroupedDataset Scala API

2016-02-10 Thread Raela Wang (JIRA)
Raela Wang created SPARK-13274:
--

         Summary: Fix Aggregator Links on GroupedDataset Scala API
             Key: SPARK-13274
             URL: https://issues.apache.org/jira/browse/SPARK-13274
         Project: Spark
      Issue Type: Documentation
      Components: Documentation
        Reporter: Raela Wang
        Priority: Trivial


Update the Scala API docs for GroupedDataset: the links in flatMapGroups() and 
mapGroups() point to the wrong Aggregator.






[jira] [Commented] (SPARK-11223) PySpark CrossValidatorModel does not output metrics for every param in paramGrid

2015-10-30 Thread Raela Wang (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983112#comment-14983112 ]

Raela Wang commented on SPARK-11223:


This wasn't exactly for debugging purposes. It would be convenient for users to 
be able to access the errors for every model created, so that they can detect 
overfitting.
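
As a rough sketch of what is being asked for, the pure-Python example below
keeps the averaged validation error for every parameter combination in the
grid instead of reporting only the best model. It does not use the PySpark
CrossValidator API; cross_validate, k_fold_indices, fit_slope and mse are
hypothetical names, and the toy data and model are assumptions chosen only to
make the example runnable.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(xs, ys, param_grid, fit, error, k=3):
    """Return (params, average error) for EVERY combination in param_grid."""
    names = sorted(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_errors = []
        for train, test in k_fold_indices(len(xs), k):
            model = fit([xs[i] for i in train], [ys[i] for i in train], **params)
            fold_errors.append(
                error(model, [xs[i] for i in test], [ys[i] for i in test])
            )
        results.append((params, mean(fold_errors)))
    return results


def fit_slope(xs, ys, reg):
    """Toy model: least-squares slope through the origin with an L2 penalty reg."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + reg)


def mse(slope, xs, ys):
    """Mean squared error of the toy slope model."""
    return mean([(slope * x - y) ** 2 for x, y in zip(xs, ys)])


# Toy data: y = 2x exactly, so the unregularized fit is perfect.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 * x for x in xs]

results = cross_validate(xs, ys, {"reg": [0.0, 1.0, 10.0]}, fit_slope, mse, k=3)
best = min(results, key=lambda r: r[1])
```

With the full results list available, a user can compare the error of each
parameter setting against the others rather than seeing only best.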

> PySpark CrossValidatorModel does not output metrics for every param in 
> paramGrid
> 
>
> Key: SPARK-11223
> URL: https://issues.apache.org/jira/browse/SPARK-11223
> Project: Spark
> Issue Type: Improvement
> Components: ML, PySpark
> Reporter: Raela Wang
> Priority: Minor
>







[jira] [Created] (SPARK-11223) CrossValidatorModel does not output metrics for every param in paramGrid

2015-10-20 Thread Raela Wang (JIRA)
Raela Wang created SPARK-11223:
--

         Summary: CrossValidatorModel does not output metrics for every param in paramGrid
             Key: SPARK-11223
             URL: https://issues.apache.org/jira/browse/SPARK-11223
         Project: Spark
      Issue Type: Improvement
      Components: ML, PySpark
        Reporter: Raela Wang
        Priority: Minor









[jira] [Created] (SPARK-10759) Missing Python code example in ML Programming guide

2015-09-22 Thread Raela Wang (JIRA)
Raela Wang created SPARK-10759:
--

         Summary: Missing Python code example in ML Programming guide
             Key: SPARK-10759
             URL: https://issues.apache.org/jira/browse/SPARK-10759
         Project: Spark
      Issue Type: Improvement
      Components: Documentation
Affects Versions: 1.5.0
        Reporter: Raela Wang
        Priority: Minor


The following sections of the ML guide are missing a Python code example:

http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-cross-validation

http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-train-validation-split


