[jira] [Commented] (SPARK-30629) cleanClosure on recursive call leads to node stack overflow

2020-01-25 Thread Hossein Falaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023715#comment-17023715 ] Hossein Falaki commented on SPARK-30629: Yes, this is a good example. There must

[jira] [Commented] (SPARK-30629) cleanClosure on recursive call leads to node stack overflow

2020-01-25 Thread Hossein Falaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023712#comment-17023712 ] Hossein Falaki commented on SPARK-30629: Oh yes. I think we can disable that spe

[jira] [Commented] (SPARK-30629) cleanClosure on recursive call leads to node stack overflow

2020-01-25 Thread Hossein Falaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023710#comment-17023710 ] Hossein Falaki commented on SPARK-30629: [~zero323] that indefinite recursive ca

[jira] [Updated] (SPARK-29777) SparkR::cleanClosure aggressively removes a function required by user function

2019-11-06 Thread Hossein Falaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-29777: --- Description: Following code block reproduces the issue: {code} df <- createDataFrame(data.fr

[jira] [Updated] (SPARK-29777) SparkR::cleanClosure aggressively removes a function required by user function

2019-11-06 Thread Hossein Falaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-29777: --- Description: Following code block reproduces the issue: {code:java} library(SparkR) sparkR.s

[jira] [Updated] (SPARK-29777) SparkR::cleanClosure aggressively removes a function required by user function

2019-11-06 Thread Hossein Falaki (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-29777: --- Description: Following code block reproduces the issue: {code:java} library(SparkR) sparkR.s

[jira] [Created] (SPARK-29777) SparkR::cleanClosure aggressively removes a function required by user function

2019-11-06 Thread Hossein Falaki (Jira)
Hossein Falaki created SPARK-29777: -- Summary: SparkR::cleanClosure aggressively removes a function required by user function Key: SPARK-29777 URL: https://issues.apache.org/jira/browse/SPARK-29777 Pr

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-14 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513008#comment-16513008 ] Hossein Falaki commented on SPARK-24359: [~shivaram] I like that. > SPIP: ML Pi

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-14 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512959#comment-16512959 ] Hossein Falaki commented on SPARK-24359: Considering that I am volunteering myse

[jira] [Comment Edited] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501254#comment-16501254 ] Hossein Falaki edited comment on SPARK-24359 at 6/5/18 3:51 AM: --

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501254#comment-16501254 ] Hossein Falaki commented on SPARK-24359: [~shivaram] what prevents us from creat

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-01 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498842#comment-16498842 ] Hossein Falaki commented on SPARK-24359: Yes. My bad, I meant releasing an updat

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-26 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Description: h1. Background and motivation SparkR supports calling MLlib functionality with

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-26 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Attachment: SparkML_ ML Pipelines in R-v3.pdf > SPIP: ML Pipelines in R > ---

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-26 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491907#comment-16491907 ] Hossein Falaki commented on SPARK-24359: Thank you [~josephkb] and [~felixcheung]

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-24 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489944#comment-16489944 ] Hossein Falaki commented on SPARK-24359: Thank you guys for feedback. I updated t

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-24 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Description: h1. Background and motivation SparkR supports calling MLlib functionality with

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-24 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Attachment: SparkML_ ML Pipelines in R-v2.pdf > SPIP: ML Pipelines in R > ---

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-23 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487815#comment-16487815 ] Hossein Falaki commented on SPARK-24359: Thanks [~shivaram] and [~zero323]. It se

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486799#comment-16486799 ] Hossein Falaki commented on SPARK-24359: Thanks for reviewing [~felixcheung]. #

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Description: h1. Background and motivation SparkR supports calling MLlib functionality with

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Description: h1. Background and motivation SparkR supports calling MLlib functionality with

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Description: h1. Background and motivation SparkR supports calling MLlib functionality with

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Description: h1. Background and motivation SparkR supports calling MLlib functionality with

[jira] [Created] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-24359: -- Summary: SPIP: ML Pipelines in R Key: SPARK-24359 URL: https://issues.apache.org/jira/browse/SPARK-24359 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Attachment: SparkML_ ML Pipelines in R.pdf > SPIP: ML Pipelines in R > --

[jira] [Commented] (SPARK-23114) Spark R 2.3 QA umbrella

2018-01-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334885#comment-16334885 ] Hossein Falaki commented on SPARK-23114: [~felixcheung] I don't have any datasets

[jira] [Commented] (SPARK-17762) invokeJava fails when serialized argument list is larger than INT_MAX (2,147,483,647) bytes

2018-01-02 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309053#comment-16309053 ] Hossein Falaki commented on SPARK-17762: I think SPARK-17790 is one place where t

[jira] [Commented] (SPARK-22766) Install R linter package in spark lib directory

2017-12-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16301751#comment-16301751 ] Hossein Falaki commented on SPARK-22766: [~felixcheung] This just makes installat

[jira] [Commented] (SPARK-22812) Failing cran-check on master

2017-12-16 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293940#comment-16293940 ] Hossein Falaki commented on SPARK-22812: Do you know what is being checked in tha

[jira] [Updated] (SPARK-22812) Failing cran-check on master

2017-12-16 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-22812: --- Priority: Minor (was: Major) > Failing cran-check on master > -

[jira] [Created] (SPARK-22812) Failing cran-check on master

2017-12-15 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-22812: -- Summary: Failing cran-check on master Key: SPARK-22812 URL: https://issues.apache.org/jira/browse/SPARK-22812 Project: Spark Issue Type: Bug C

[jira] [Created] (SPARK-22766) Install R linter package in spark lib directory

2017-12-12 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-22766: -- Summary: Install R linter package in spark lib directory Key: SPARK-22766 URL: https://issues.apache.org/jira/browse/SPARK-22766 Project: Spark Issue Typ

[jira] [Commented] (SPARK-22344) Prevent R CMD check from using /tmp

2017-10-26 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221460#comment-16221460 ] Hossein Falaki commented on SPARK-22344: I don't have a solid pointer as to why we

[jira] [Commented] (SPARK-15799) Release SparkR on CRAN

2017-10-24 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217279#comment-16217279 ] Hossein Falaki commented on SPARK-15799: Is there a ticket to follow up on new po

[jira] [Commented] (SPARK-17902) collect() ignores stringsAsFactors

2017-10-18 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209958#comment-16209958 ] Hossein Falaki commented on SPARK-17902: A simple unit test we could add would be

[jira] [Commented] (SPARK-15799) Release SparkR on CRAN

2017-10-12 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202385#comment-16202385 ] Hossein Falaki commented on SPARK-15799: Congrats everyone. Thanks for the hard w

[jira] [Commented] (SPARK-15799) Release SparkR on CRAN

2017-09-25 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16179754#comment-16179754 ] Hossein Falaki commented on SPARK-15799: It seems we can trivially use {{with}} i

[jira] [Commented] (SPARK-15799) Release SparkR on CRAN

2017-09-25 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16179716#comment-16179716 ] Hossein Falaki commented on SPARK-15799: Hi guys, are there any updates on this?

[jira] [Created] (SPARK-21940) Support timezone for timestamps in SparkR

2017-09-06 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-21940: -- Summary: Support timezone for timestamps in SparkR Key: SPARK-21940 URL: https://issues.apache.org/jira/browse/SPARK-21940 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-15799) Release SparkR on CRAN

2017-08-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137639#comment-16137639 ] Hossein Falaki commented on SPARK-15799: Hi [~felixcheung] would you please share

[jira] [Resolved] (SPARK-21450) List of NA is flattened inside a SparkR struct type

2017-07-17 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki resolved SPARK-21450. Resolution: Not A Bug > List of NA is flattened inside a SparkR struct type > -

[jira] [Commented] (SPARK-21450) List of NA is flattened inside a SparkR struct type

2017-07-17 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091194#comment-16091194 ] Hossein Falaki commented on SPARK-21450: Thanks guys. I will close it as not an i

[jira] [Commented] (SPARK-21450) List of NA is flattened inside a SparkR struct type

2017-07-17 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091149#comment-16091149 ] Hossein Falaki commented on SPARK-21450: Yes, it was a copy/paste typo. Here is a

[jira] [Updated] (SPARK-21450) List of NA is flattened inside a SparkR struct type

2017-07-17 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-21450: --- Description: Consider the following two cases copied from {{test_sparkSQL.R}}: {code} df <-

[jira] [Updated] (SPARK-21450) List of NA is flattened inside a SparkR struct type

2017-07-17 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-21450: --- Description: Consider the following two cases copied from {{test_sparkSQL.R}}: {code} df <-

[jira] [Updated] (SPARK-21450) List of NA is flattened inside a SparkR struct type

2017-07-17 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-21450: --- Description: Consider the following two cases copied from {{test_sparkSQL.R}}: {code} df <-

[jira] [Created] (SPARK-21450) List of NA is flattened inside a SparkR struct type

2017-07-17 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-21450: -- Summary: List of NA is flattened inside a SparkR struct type Key: SPARK-21450 URL: https://issues.apache.org/jira/browse/SPARK-21450 Project: Spark Issue

[jira] [Commented] (SPARK-21263) NumberFormatException is not thrown while converting an invalid string to float/double

2017-07-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074109#comment-16074109 ] Hossein Falaki commented on SPARK-21263: [~sowen] note that user specified the mo

[jira] [Updated] (SPARK-20684) expose createGlobalTempView and dropGlobalTempView in SparkR

2017-05-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-20684: --- Summary: expose createGlobalTempView and dropGlobalTempView in SparkR (was: expose createGlo

[jira] [Commented] (SPARK-20684) expose createGlobalTempView in SparkR

2017-05-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005552#comment-16005552 ] Hossein Falaki commented on SPARK-20684: Yes I agree. > expose createGlobalTempV

[jira] [Created] (SPARK-20684) expose createGlobalTempView in SparkR

2017-05-09 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-20684: -- Summary: expose createGlobalTempView in SparkR Key: SPARK-20684 URL: https://issues.apache.org/jira/browse/SPARK-20684 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-20661) SparkR tableNames() test fails

2017-05-08 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-20661: -- Summary: SparkR tableNames() test fails Key: SPARK-20661 URL: https://issues.apache.org/jira/browse/SPARK-20661 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-20088) Do not create new SparkContext in SparkR createSparkContext

2017-03-24 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-20088: -- Summary: Do not create new SparkContext in SparkR createSparkContext Key: SPARK-20088 URL: https://issues.apache.org/jira/browse/SPARK-20088 Project: Spark

[jira] [Created] (SPARK-20007) Make SparkR apply() functions robust to workers that return empty data.frame

2017-03-17 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-20007: -- Summary: Make SparkR apply() functions robust to workers that return empty data.frame Key: SPARK-20007 URL: https://issues.apache.org/jira/browse/SPARK-20007 Proj

[jira] [Commented] (SPARK-18924) Improve collect/createDataFrame performance in SparkR

2016-12-19 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15762687#comment-15762687 ] Hossein Falaki commented on SPARK-18924: Would be good to think about this along

[jira] [Updated] (SPARK-18011) SparkR serialize "NA" throws exception

2016-10-19 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-18011: --- Description: For some versions of R, if Date has "NA" field, backend will throw negative ind

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-15 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15578607#comment-15578607 ] Hossein Falaki commented on SPARK-17878: I think moving it to another ticket is a

[jira] [Commented] (SPARK-17916) CSV data source treats empty string as null no matter what nullValue option is

2016-10-13 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573692#comment-15573692 ] Hossein Falaki commented on SPARK-17916: Thanks for linking it. Yes they are very

[jira] [Created] (SPARK-17919) Make timeout to RBackend configurable in SparkR

2016-10-13 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17919: -- Summary: Make timeout to RBackend configurable in SparkR Key: SPARK-17919 URL: https://issues.apache.org/jira/browse/SPARK-17919 Project: Spark Issue Typ

[jira] [Created] (SPARK-17916) CSV data source treats empty string as null no matter what nullValue option is

2016-10-13 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17916: -- Summary: CSV data source treats empty string as null no matter what nullValue option is Key: SPARK-17916 URL: https://issues.apache.org/jira/browse/SPARK-17916 Pr

[jira] [Commented] (SPARK-17902) collect() ignores stringsAsFactors

2016-10-13 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572689#comment-15572689 ] Hossein Falaki commented on SPARK-17902: Thanks for the pointer [~shivaram]. I wi

[jira] [Created] (SPARK-17902) collect() ignores stringsAsFactors

2016-10-12 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17902: -- Summary: collect() ignores stringsAsFactors Key: SPARK-17902 URL: https://issues.apache.org/jira/browse/SPARK-17902 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567180#comment-15567180 ] Hossein Falaki commented on SPARK-17878: Sure. If passing a list is possible it i

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567104#comment-15567104 ] Hossein Falaki commented on SPARK-17878: That would require API change in SparkSQ

[jira] [Created] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17878: -- Summary: Support for multiple null values when reading CSV data Key: SPARK-17878 URL: https://issues.apache.org/jira/browse/SPARK-17878 Project: Spark Is

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-11 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566541#comment-15566541 ] Hossein Falaki commented on SPARK-17781: [~shivaram] Thanks for looking into it.

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563944#comment-15563944 ] Hossein Falaki commented on SPARK-17781: Yes, but somehow inside {{worker.R}} Dat

[jira] [Commented] (SPARK-17811) SparkR cannot parallelize data.frame with NA or NULL in Date columns

2016-10-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563506#comment-15563506 ] Hossein Falaki commented on SPARK-17811: Thanks [~wm624] for looking into it. I s

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563193#comment-15563193 ] Hossein Falaki commented on SPARK-17781: I investigated the issue. The root cause

[jira] [Created] (SPARK-17811) SparkR cannot parallelize data.frame with NA or NULL in Date columns

2016-10-06 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17811: -- Summary: SparkR cannot parallelize data.frame with NA or NULL in Date columns Key: SPARK-17811 URL: https://issues.apache.org/jira/browse/SPARK-17811 Project: Spa

[jira] [Commented] (SPARK-17774) Add support for head on DataFrame Column

2016-10-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15550488#comment-15550488 ] Hossein Falaki commented on SPARK-17774: I agree. I think if we decouple {{head}}

[jira] [Updated] (SPARK-17790) Support for parallelizing data.frame larger than 2GB

2016-10-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17790: --- Issue Type: Sub-task (was: Story) Parent: SPARK-6235 > Support for parallelizing dat

[jira] [Updated] (SPARK-17790) Support for parallelizing R data.frame larger than 2GB

2016-10-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17790: --- Summary: Support for parallelizing R data.frame larger than 2GB (was: Support for paralleliz

[jira] [Commented] (SPARK-17790) Support for parallelizing data.frame larger than 2GB

2016-10-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549996#comment-15549996 ] Hossein Falaki commented on SPARK-17790: Thanks for pointing it out. SPARK-6235 s

[jira] [Updated] (SPARK-17790) Support for parallelizing data.frame larger than 2GB

2016-10-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17790: --- Summary: Support for parallelizing data.frame larger than 2GB (was: Support for parallelizin

[jira] [Commented] (SPARK-17790) Support for parallelizing/creating DataFrame on data larger than 2GB

2016-10-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549932#comment-15549932 ] Hossein Falaki commented on SPARK-17790: [~shivaram] and [~mengxr] just double ch

[jira] [Created] (SPARK-17790) Support for parallelizing/creating DataFrame on data larger than 2GB

2016-10-05 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17790: -- Summary: Support for parallelizing/creating DataFrame on data larger than 2GB Key: SPARK-17790 URL: https://issues.apache.org/jira/browse/SPARK-17790 Project: Spa

[jira] [Commented] (SPARK-17774) Add support for head on DataFrame Column

2016-10-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549913#comment-15549913 ] Hossein Falaki commented on SPARK-17774: I strongly feel {{head}} should work, bu

[jira] [Updated] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17781: --- Description: When we ship a SparkDataFrame to workers for dapply family functions, inside th

[jira] [Updated] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17781: --- Affects Version/s: 2.0.0 > datetime is serialized as double inside dapply() > ---

[jira] [Updated] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17781: --- Affects Version/s: (was: 2.0.1) > datetime is serialized as double inside dapply() >

[jira] [Updated] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17781: --- Description: When we ship a SparkDataFrame to workers for dapply family functions, inside th

[jira] [Updated] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17781: --- Description: When we ship a SparkDataFrame to workers for dapply family functions, inside th

[jira] [Commented] (SPARK-17774) Add support for head on DataFrame Column

2016-10-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15547232#comment-15547232 ] Hossein Falaki commented on SPARK-17774: Putting implementation aside, throwing a

[jira] [Created] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-04 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17781: -- Summary: datetime is serialized as double inside dapply() Key: SPARK-17781 URL: https://issues.apache.org/jira/browse/SPARK-17781 Project: Spark Issue Ty

[jira] [Updated] (SPARK-17774) Add support for head on DataFrame Column

2016-10-03 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17774: --- Description: There was a lot of discussion on SPARK-9325. To summarize the conversation on t

[jira] [Updated] (SPARK-17774) Add support for head on DataFrame Column

2016-10-03 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17774: --- Description: There was a lot of discussion on SPARK-9325. To summarize the conversation on t

[jira] [Created] (SPARK-17774) Add support for head on DataFrame Column

2016-10-03 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17774: -- Summary: Add support for head on DataFrame Column Key: SPARK-17774 URL: https://issues.apache.org/jira/browse/SPARK-17774 Project: Spark Issue Type: Sub-

[jira] [Updated] (SPARK-17762) invokeJava fails when serialized argument list is larger than INT_MAX (2,147,483,647) bytes

2016-10-02 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17762: --- Summary: invokeJava fails when serialized argument list is larger than INT_MAX (2,147,483,647

[jira] [Created] (SPARK-17762) invokeJava serialized argument list is larger than INT_MAX (2,147,483,647) bytes

2016-10-02 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17762: -- Summary: invokeJava serialized argument list is larger than INT_MAX (2,147,483,647) bytes Key: SPARK-17762 URL: https://issues.apache.org/jira/browse/SPARK-17762

[jira] [Updated] (SPARK-17442) Additional arguments in write.df are not passed to data source

2016-09-07 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17442: --- Target Version/s: 2.0.1 > Additional arguments in write.df are not passed to data source > --

[jira] [Updated] (SPARK-17442) Additional arguments in write.df are not passed to data source

2016-09-07 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17442: --- Priority: Blocker (was: Critical) > Additional arguments in write.df are not passed to data

[jira] [Updated] (SPARK-17442) Additional arguments in write.df are not passed to data source

2016-09-07 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-17442: --- Priority: Critical (was: Major) > Additional arguments in write.df are not passed to data so

[jira] [Created] (SPARK-17442) Additional arguments in write.df are not passed to data source

2016-09-07 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17442: -- Summary: Additional arguments in write.df are not passed to data source Key: SPARK-17442 URL: https://issues.apache.org/jira/browse/SPARK-17442 Project: Spark

[jira] [Commented] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-05 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410105#comment-15410105 ] Hossein Falaki commented on SPARK-16883: Thanks [~shivaram]! This may require cha

[jira] [Commented] (SPARK-16896) Loading csv with duplicate column names

2016-08-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408728#comment-15408728 ] Hossein Falaki commented on SPARK-16896: I suggest we generally follow the restri

[jira] [Commented] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408722#comment-15408722 ] Hossein Falaki commented on SPARK-16903: Thanks for the info. That makes me doubt

[jira] [Commented] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408621#comment-15408621 ] Hossein Falaki commented on SPARK-16883: I think that is because we are not conve

[jira] [Created] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-16903: -- Summary: nullValue in first field is not respected by CSV source when read Key: SPARK-16903 URL: https://issues.apache.org/jira/browse/SPARK-16903 Project: Spark
