[jira] [Commented] (SPARK-21551) pyspark's collect fails when getaddrinfo is too slow

2017-10-17 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207220#comment-16207220 ] Frank Rosner commented on SPARK-21551: -- Do you guys mind if I backport this also to 2.0.x, 2.1.x,

[jira] [Commented] (SPARK-18649) sc.textFile(my_file).collect() raises socket.timeout on large files

2017-10-17 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207217#comment-16207217 ] Frank Rosner commented on SPARK-18649: -- Looks like in SPARK-21551 they increased the hard coded

[jira] [Commented] (SPARK-20489) Different results in local mode and yarn mode when working with dates (silent corruption due to system timezone setting)

2017-05-15 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010251#comment-16010251 ] Frank Rosner commented on SPARK-20489: -- The problem is that {{java.util.Date}} is an instant, not a

[jira] [Comment Edited] (SPARK-20489) Different results in local mode and yarn mode when working with dates (race condition with SimpleDateFormat?)

2017-04-28 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988350#comment-15988350 ] Frank Rosner edited comment on SPARK-20489 at 4/28/17 7:25 AM: --- In

[jira] [Commented] (SPARK-20489) Different results in local mode and yarn mode when working with dates (race condition with SimpleDateFormat?)

2017-04-28 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988350#comment-15988350 ] Frank Rosner commented on SPARK-20489: -- In

[jira] [Comment Edited] (SPARK-20489) Different results in local mode and yarn mode when working with dates (race condition with SimpleDateFormat?)

2017-04-28 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988350#comment-15988350 ] Frank Rosner edited comment on SPARK-20489 at 4/28/17 7:24 AM: --- In

[jira] [Commented] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575232#comment-15575232 ] Frank Rosner commented on SPARK-17933: -- Thanks [~srowen]. I know a lot of discussions about the

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Attachment: (was: screenshot-1.png) > Shuffle fails when driver is on one of the same

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Attachment: screenshot-1.png > Shuffle fails when driver is on one of the same machines as

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Description: h4. Problem When I run a job that requires some shuffle, some tasks fail because

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Description: h4. Problem When I run a job that requires some shuffle, some tasks fail because

[jira] [Created] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
Frank Rosner created SPARK-17933: Summary: Shuffle fails when driver is on one of the same machines as executor Key: SPARK-17933 URL: https://issues.apache.org/jira/browse/SPARK-17933 Project: Spark

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Attachment: screenshot-2.png screenshot-1.png > Shuffle fails when driver is on

[jira] [Commented] (SPARK-12823) Cannot create UDF with StructType input

2016-01-20 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108689#comment-15108689 ] Frank Rosner commented on SPARK-12823: -- Ok :( :D > Cannot create UDF with StructType input >

[jira] [Commented] (SPARK-12823) Cannot create UDF with StructType input

2016-01-20 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109034#comment-15109034 ] Frank Rosner commented on SPARK-12823: -- Ok :( :D > Cannot create UDF with StructType input >

[jira] [Issue Comment Deleted] (SPARK-12823) Cannot create UDF with StructType input

2016-01-20 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-12823: - Comment: was deleted (was: Ok :( :D) > Cannot create UDF with StructType input >

[jira] [Commented] (SPARK-12823) Cannot create UDF with StructType input

2016-01-20 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108342#comment-15108342 ] Frank Rosner commented on SPARK-12823: -- Any thoughts on this one [~srowen]? > Cannot create UDF

[jira] [Updated] (SPARK-12823) Cannot create UDF with StructType input

2016-01-18 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-12823: - Description: h5. Problem It is not possible to apply a UDF to a column that has a struct data

[jira] [Updated] (SPARK-12823) Cannot create UDF with StructType input

2016-01-18 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-12823: - Description: h5. Problem It is not possible to apply a UDF to a column that has a struct data

[jira] [Updated] (SPARK-12823) Cannot create UDF with StructType input

2016-01-18 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-12823: - Description: h5. Problem It is not possible to apply a UDF to a column that has a struct data

[jira] [Updated] (SPARK-12823) Cannot create UDF with StructType input

2016-01-18 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-12823: - Description: h5. Problem It is not possible to apply a UDF to a column that has a struct data

[jira] [Created] (SPARK-12823) Cannot create UDF with StructType input

2016-01-14 Thread Frank Rosner (JIRA)
Frank Rosner created SPARK-12823: Summary: Cannot create UDF with StructType input Key: SPARK-12823 URL: https://issues.apache.org/jira/browse/SPARK-12823 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-11258) Remove quadratic runtime complexity for converting a Spark DataFrame into an R data.frame

2015-10-23 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971025#comment-14971025 ] Frank Rosner commented on SPARK-11258: -- Actually I am pretty confused now. Thinking about it, having

[jira] [Updated] (SPARK-11258) Converting a Spark DataFrame into an R data.frame is slow / requires a lot of memory

2015-10-23 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-11258: - Description: h4. Problem We tried to collect a DataFrame with > 1 million rows and a few

[jira] [Updated] (SPARK-11258) Converting a Spark DataFrame into an R data.frame is slow / requires a lot of memory

2015-10-23 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-11258: - Description: h4. Problem We tried to collect a DataFrame with > 1 million rows and a few

[jira] [Updated] (SPARK-11258) Converting a Spark DataFrame into an R data.frame is slow / requires a lot of memory

2015-10-23 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-11258: - Summary: Converting a Spark DataFrame into an R data.frame is slow / requires a lot of memory

[jira] [Commented] (SPARK-11258) Converting a Spark DataFrame into an R data.frame is slow / requires a lot of memory

2015-10-23 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971155#comment-14971155 ] Frank Rosner commented on SPARK-11258: -- I adjusted the description to be more general. I will see if

[jira] [Created] (SPARK-11258) Remove quadratic runtime complexity for converting a Spark DataFrame into an R data.frame

2015-10-22 Thread Frank Rosner (JIRA)
Frank Rosner created SPARK-11258: Summary: Remove quadratic runtime complexity for converting a Spark DataFrame into an R data.frame Key: SPARK-11258 URL: https://issues.apache.org/jira/browse/SPARK-11258

[jira] [Updated] (SPARK-11258) Remove quadratic runtime complexity for converting a Spark DataFrame into an R data.frame

2015-10-22 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-11258: - Description: h4. Introduction We tried to collect a DataFrame with > 1 million rows and a few

[jira] [Commented] (SPARK-10493) reduceByKey not returning distinct results

2015-09-08 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735598#comment-14735598 ] Frank Rosner commented on SPARK-10493: -- Thanks for submitting the issue, [~glenn.strycker] :) Can

[jira] [Commented] (SPARK-9971) MaxFunction not working correctly with columns containing Double.NaN

2015-08-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696889#comment-14696889 ] Frank Rosner commented on SPARK-9971: - Ok so shall we close it as a wontfix?

[jira] [Created] (SPARK-9971) MaxFunction not working correctly with columns containing Double.NaN

2015-08-14 Thread Frank Rosner (JIRA)
Frank Rosner created SPARK-9971: --- Summary: MaxFunction not working correctly with columns containing Double.NaN Key: SPARK-9971 URL: https://issues.apache.org/jira/browse/SPARK-9971 Project: Spark

[jira] [Updated] (SPARK-9971) MaxFunction not working correctly with columns containing Double.NaN

2015-08-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-9971: Priority: Minor (was: Major) MaxFunction not working correctly with columns containing Double.NaN

[jira] [Updated] (SPARK-9971) MaxFunction not working correctly with columns containing Double.NaN

2015-08-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-9971: Description: h4. Problem Description When using the {{max}} function on a {{DoubleType}} column

[jira] [Commented] (SPARK-9971) MaxFunction not working correctly with columns containing Double.NaN

2015-08-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696760#comment-14696760 ] Frank Rosner commented on SPARK-9971: - I would like to provide a patch to make the

[jira] [Commented] (SPARK-9971) MaxFunction not working correctly with columns containing Double.NaN

2015-08-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696766#comment-14696766 ] Frank Rosner commented on SPARK-9971: - [~srowen] I get your point. But given your

[jira] [Comment Edited] (SPARK-9971) MaxFunction not working correctly with columns containing Double.NaN

2015-08-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696760#comment-14696760 ] Frank Rosner edited comment on SPARK-9971 at 8/14/15 9:34 AM: --

[jira] [Commented] (SPARK-6480) histogram() bucket function is wrong in some simple edge cases

2015-03-26 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381588#comment-14381588 ] Frank Rosner commented on SPARK-6480: - [~srowen] will do today! histogram() bucket

[jira] [Commented] (SPARK-6480) histogram() bucket function is wrong in some simple edge cases

2015-03-24 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377611#comment-14377611 ] Frank Rosner commented on SPARK-6480: - Thanks for picking it up [~srowen]!

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2015-01-21 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285381#comment-14285381 ] Frank Rosner commented on SPARK-2620: - Why is the Spark REPL wrapping user code into

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2015-01-19 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282323#comment-14282323 ] Frank Rosner commented on SPARK-2620: - The issue is caused by the fact that pattern