[jira] [Updated] (SPARK-26393) Different behaviors of date_add when calling it inside expr

2018-12-18 Thread Ahmed Kamal` (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Kamal` updated SPARK-26393:
-
Description: 
When calling date_add from pyspark.sql.functions directly, without using expr:
{code:java}
df.withColumn("added", F.date_add(F.to_date(F.lit('1998-9-26')), F.col('days'))).toPandas(){code}
it raises `TypeError: Column is not iterable`, because date_add only accepts an integer, not a Column.

But when the same call is written inside an expr:
{code:java}
df.withColumn("added", F.expr("date_add(to_date('1998-9-26'), days)")).toPandas(){code}
it works fine.

Shouldn't the two behave the same way? It seems logical to accept a Column here as well.

A Python notebook demonstrating the issue:

[https://gist.github.com/AhmedKamal20/fec10337e815baa44f115d307e3b07eb]
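
Until date_add accepts a Column, the expr form is the practical workaround. As a reference point, here is a plain-Python sketch (no Spark required; the rows stand in for the hypothetical df from the example above) of the per-row semantics both calls should produce:

```python
from datetime import date, timedelta

# Hypothetical rows of df: each row carries a 'days' value.
rows = [{"days": 1}, {"days": 7}, {"days": 30}]

start = date(1998, 9, 26)  # what to_date('1998-9-26') yields

# date_add(start, days) evaluated per row, as the expr() form does it.
added = [start + timedelta(days=r["days"]) for r in rows]

print([d.isoformat() for d in added])
# ['1998-09-27', '1998-10-03', '1998-10-26']
```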

  was:
When calling date_add from pyspark.sql.functions directly, without using expr:
{code:java}
df.withColumn("added", F.date_add(F.to_date(F.lit('1998-9-26')), F.col('days'))).toPandas(){code}
it raises `TypeError: Column is not iterable`, because date_add only accepts an integer, not a Column.

But when the same call is written inside an expr:
{code:java}
df.withColumn("added", F.expr("date_add(to_date('1998-9-26'), days)")).toPandas()
{code}
it works fine.

Shouldn't the two behave the same way? It seems logical to accept a Column here as well.

A Python notebook demonstrating the issue:

https://gist.github.com/AhmedKamal20/fec10337e815baa44f115d307e3b07eb


> Different behaviors of date_add when calling it inside expr
> ---
>
> Key: SPARK-26393
> URL: https://issues.apache.org/jira/browse/SPARK-26393
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.3.2
>Reporter: Ahmed Kamal`
>Priority: Minor
>
> When calling date_add from pyspark.sql.functions directly, without using expr:
> {code:java}
> df.withColumn("added", F.date_add(F.to_date(F.lit('1998-9-26')), F.col('days'))).toPandas(){code}
> it raises `TypeError: Column is not iterable`, because date_add only accepts an integer, not a Column.
> But when the same call is written inside an expr:
> {code:java}
> df.withColumn("added", F.expr("date_add(to_date('1998-9-26'), days)")).toPandas(){code}
> it works fine.
> Shouldn't the two behave the same way? It seems logical to accept a Column here as well.
> A Python notebook demonstrating the issue:
> [https://gist.github.com/AhmedKamal20/fec10337e815baa44f115d307e3b07eb]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26393) Different behaviors of date_add when calling it inside expr

2018-12-18 Thread Ahmed Kamal` (JIRA)
Ahmed Kamal` created SPARK-26393:


 Summary: Different behaviors of date_add when calling it inside expr
 Key: SPARK-26393
 URL: https://issues.apache.org/jira/browse/SPARK-26393
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 2.3.2
Reporter: Ahmed Kamal`


When calling date_add from pyspark.sql.functions directly, without using expr:
{code:java}
df.withColumn("added", F.date_add(F.to_date(F.lit('1998-9-26')), F.col('days'))).toPandas(){code}
it raises `TypeError: Column is not iterable`, because date_add only accepts an integer, not a Column.

But when the same call is written inside an expr:
{code:java}
df.withColumn("added", F.expr("date_add(to_date('1998-9-26'), days)")).toPandas()
{code}
it works fine.

Shouldn't the two behave the same way? It seems logical to accept a Column here as well.

A Python notebook demonstrating the issue:

https://gist.github.com/AhmedKamal20/fec10337e815baa44f115d307e3b07eb






[jira] [Commented] (SPARK-14516) Clustering evaluator

2016-04-13 Thread Ahmed Kamal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239652#comment-15239652
 ] 

Ahmed Kamal commented on SPARK-14516:
-

I will go through the MLlib code to familiarize myself with its structure. Did 
we agree on the metrics that would be added?
[~podongfeng] Please let me know how you would share your current 
state/design with me? A Google document may be a good way, I think.

> Clustering evaluator
> 
>
> Key: SPARK-14516
> URL: https://issues.apache.org/jira/browse/SPARK-14516
> Project: Spark
>  Issue Type: Brainstorming
>  Components: ML
>Reporter: zhengruifeng
>Priority: Minor
>
> MLlib does not have any general-purpose clustering metrics with a ground 
> truth.
> In 
> [Scikit-Learn](http://scikit-learn.org/stable/modules/classes.html#clustering-metrics),
>  there are several kinds of metrics for this.
> It may be meaningful to add some clustering metrics to MLlib.
> This should be added as a {{ClusteringEvaluator}} class extending 
> {{Evaluator}} in spark.ml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-14516) What about adding general clustering metrics?

2016-04-12 Thread Ahmed Kamal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236736#comment-15236736
 ] 

Ahmed Kamal commented on SPARK-14516:
-

[~srowen] I guess this could be a good starter issue for me in Spark. I can 
start working on silhouette if I get assigned.
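
For reference, the silhouette coefficient mentioned above can be sketched in a few lines of plain Python (a naive O(n²) illustration with made-up sample points, not the distributed design a Spark evaluator would need; it assumes every cluster has at least two points):

```python
from math import dist  # Euclidean distance, Python 3.8+

points = [(0.0, 0.0), (0.1, 0.2), (4.0, 4.0), (4.2, 3.9)]
labels = [0, 0, 1, 1]

def silhouette(points, labels):
    scores = []
    for i, p in enumerate(points):
        # a: mean distance to the other points in the same cluster
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        a = sum(own) / len(own)
        # b: lowest mean distance to the points of any other cluster
        b = min(
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        # per-point silhouette in [-1, 1]; close to 1 = well clustered
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Two tight, well-separated clusters should score close to 1.
print(round(silhouette(points, labels), 3))
```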

> What about adding general clustering metrics?
> -
>
> Key: SPARK-14516
> URL: https://issues.apache.org/jira/browse/SPARK-14516
> Project: Spark
>  Issue Type: Brainstorming
>  Components: ML, MLlib
>Reporter: zhengruifeng
>Priority: Minor
>
> ML/MLlib don't have any general-purpose clustering metrics with a ground 
> truth.
> In 
> [Scikit-Learn](http://scikit-learn.org/stable/modules/classes.html#clustering-metrics),
>  there are several kinds of metrics for this.
> It may be meaningful to add some clustering metrics to ML/MLlib.






[jira] [Commented] (SPARK-13769) Java Doc needs update in SparkSubmit.scala

2016-03-08 Thread Ahmed Kamal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186689#comment-15186689
 ] 

Ahmed Kamal commented on SPARK-13769:
-

I have created a pull request to fix this issue:

https://github.com/apache/spark/pull/11600

> Java Doc needs update in SparkSubmit.scala
> --
>
> Key: SPARK-13769
> URL: https://issues.apache.org/jira/browse/SPARK-13769
> Project: Spark
>  Issue Type: Bug
>Reporter: Ahmed Kamal
>Priority: Minor
>
> The java doc here 
> (https://github.com/apache/spark/blob/e97fc7f176f8bf501c9b3afd8410014e3b0e1602/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L51)
> needs to be updated from "The latter two operations are currently supported 
> only for standalone cluster mode." to "The latter two operations are 
> currently supported only for standalone and mesos cluster mode."






[jira] [Created] (SPARK-13769) Java Doc needs update in SparkSubmit.scala

2016-03-08 Thread Ahmed Kamal (JIRA)
Ahmed Kamal created SPARK-13769:
---

 Summary: Java Doc needs update in SparkSubmit.scala
 Key: SPARK-13769
 URL: https://issues.apache.org/jira/browse/SPARK-13769
 Project: Spark
  Issue Type: Bug
Reporter: Ahmed Kamal
Priority: Minor


The java doc here 
(https://github.com/apache/spark/blob/e97fc7f176f8bf501c9b3afd8410014e3b0e1602/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L51)
needs to be updated from "The latter two operations are currently supported 
only for standalone cluster mode." to "The latter two operations are currently 
supported only for standalone and mesos cluster mode."






[jira] [Comment Edited] (SPARK-12528) Make Apache Spark’s gateway hidden REST API (in standalone cluster mode) public API

2016-03-02 Thread Ahmed Kamal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175619#comment-15175619
 ] 

Ahmed Kamal edited comment on SPARK-12528 at 3/2/16 1:54 PM:
-

As mentioned in this issue's design document 
(https://issues.apache.org/jira/browse/SPARK-5338), the REST API supports 
Mesos too. Why doesn't Spark also make the API support YARN? YARN already has 
a REST API for job submission and monitoring, so I imagine this shouldn't be 
difficult, and it would become a standard way to submit jobs in a 
language-independent and cluster-independent way.
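
For context, a minimal sketch of what a standardized, language-independent submission could look like against the standalone master's REST endpoint (default port 6066). The field names follow the commonly described CreateSubmissionRequest shape, and the host, jar path, and class name here are hypothetical; verify the exact protocol against your Spark version. The request is only constructed, not sent:

```python
import json
from urllib.request import Request

# Hypothetical submission payload for the standalone REST gateway.
payload = {
    "action": "CreateSubmissionRequest",
    "clientSparkVersion": "2.3.2",
    "appResource": "file:/path/to/app.jar",   # hypothetical jar
    "mainClass": "com.example.Main",          # hypothetical class
    "appArgs": ["arg1"],
    "sparkProperties": {
        "spark.app.name": "rest-submit-demo",
        "spark.master": "spark://master-host:6066",  # hypothetical host
    },
    "environmentVariables": {"SPARK_ENV_LOADED": "1"},
}

req = Request(
    "http://master-host:6066/v1/submissions/create",  # hypothetical host
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would submit it against a live master.
print(req.get_method(), req.full_url)
```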


was (Author: akamal):
As mentioned in this issue's design document, the REST API supports Mesos too 
(https://issues.apache.org/jira/browse/SPARK-5338). Why doesn't Spark also 
make the API support YARN? YARN already has a REST API for job submission and 
monitoring, so I imagine this shouldn't be difficult, and it would become a 
standard way to submit jobs in a language-independent and cluster-independent way.

> Make Apache Spark’s gateway hidden REST API (in standalone cluster mode) 
> public API
> ---
>
> Key: SPARK-12528
> URL: https://issues.apache.org/jira/browse/SPARK-12528
> Project: Spark
>  Issue Type: Improvement
>  Components: Deploy
>Affects Versions: 2.0.0
>Reporter: Youcef HILEM
>Priority: Minor
>
> Spark has a hidden REST API which handles application submission, status 
> checking and cancellation (https://issues.apache.org/jira/browse/SPARK-5388).
> There is enough interest in using this API to justify making it public:
> - https://github.com/ywilkof/spark-jobs-rest-client
> - https://github.com/yohanliyanage/jenkins-spark-deploy
> - https://github.com/spark-jobserver/spark-jobserver
> - http://stackoverflow.com/questions/28992802/triggering-spark-jobs-with-rest
> - http://stackoverflow.com/questions/34225879/how-to-submit-a-job-via-rest-api
> - http://arturmkrtchyan.com/apache-spark-hidden-rest-api






[jira] [Commented] (SPARK-12528) Make Apache Spark’s gateway hidden REST API (in standalone cluster mode) public API

2016-03-02 Thread Ahmed Kamal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175619#comment-15175619
 ] 

Ahmed Kamal commented on SPARK-12528:
-

As mentioned in this issue's design document, the REST API supports Mesos too 
(https://issues.apache.org/jira/browse/SPARK-5338). Why doesn't Spark also 
make the API support YARN? YARN already has a REST API for job submission and 
monitoring, so I imagine this shouldn't be difficult, and it would become a 
standard way to submit jobs in a language-independent and cluster-independent way.

> Make Apache Spark’s gateway hidden REST API (in standalone cluster mode) 
> public API
> ---
>
> Key: SPARK-12528
> URL: https://issues.apache.org/jira/browse/SPARK-12528
> Project: Spark
>  Issue Type: Improvement
>  Components: Deploy
>Affects Versions: 2.0.0
>Reporter: Youcef HILEM
>Priority: Minor
>
> Spark has a hidden REST API which handles application submission, status 
> checking and cancellation (https://issues.apache.org/jira/browse/SPARK-5388).
> There is enough interest in using this API to justify making it public:
> - https://github.com/ywilkof/spark-jobs-rest-client
> - https://github.com/yohanliyanage/jenkins-spark-deploy
> - https://github.com/spark-jobserver/spark-jobserver
> - http://stackoverflow.com/questions/28992802/triggering-spark-jobs-with-rest
> - http://stackoverflow.com/questions/34225879/how-to-submit-a-job-via-rest-api
> - http://arturmkrtchyan.com/apache-spark-hidden-rest-api


