[jira] [Created] (SPARK-21231) Conda install of packages during Jenkins testing is causing intermittent failure

2017-06-27 Thread holdenk (JIRA)
holdenk created SPARK-21231: --- Summary: Conda install of packages during Jenkins testing is causing intermittent failure Key: SPARK-21231 URL: https://issues.apache.org/jira/browse/SPARK-21231 Project: Spark

[jira] [Assigned] (SPARK-21278) Upgrade to Py4J 0.10.6

2017-07-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21278: --- Assignee: Dongjoon Hyun > Upgrade to Py4J 0.10.6 > -- > > Key: S

[jira] [Updated] (SPARK-21278) Upgrade to Py4J 0.10.6

2017-07-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21278: Fix Version/s: 2.3.0 > Upgrade to Py4J 0.10.6 > -- > > Key: SPARK-21278

[jira] [Resolved] (SPARK-21278) Upgrade to Py4J 0.10.6

2017-07-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21278. - Resolution: Fixed Issue resolved by pull request 18546 [https://github.com/apache/spark/pull/18546] > Up

[jira] [Created] (SPARK-21384) Spark 2.2 + YARN without spark.yarn.jars / spark.yarn.archive fails

2017-07-11 Thread holdenk (JIRA)
holdenk created SPARK-21384: --- Summary: Spark 2.2 + YARN without spark.yarn.jars / spark.yarn.archive fails Key: SPARK-21384 URL: https://issues.apache.org/jira/browse/SPARK-21384 Project: Spark Is

[jira] [Commented] (SPARK-21425) LongAccumulator, DoubleAccumulator not threadsafe

2017-07-16 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088842#comment-16088842 ] holdenk commented on SPARK-21425: - Thanks for finding this. I've given some thought in th

[jira] [Created] (SPARK-21434) Add PySpark pip documentation

2017-07-17 Thread holdenk (JIRA)
holdenk created SPARK-21434: --- Summary: Add PySpark pip documentation Key: SPARK-21434 URL: https://issues.apache.org/jira/browse/SPARK-21434 Project: Spark Issue Type: Improvement Compone

[jira] [Created] (SPARK-21436) Take advantage of known partioner for distinct on RDDs

2017-07-17 Thread holdenk (JIRA)
holdenk created SPARK-21436: --- Summary: Take advantage of known partioner for distinct on RDDs Key: SPARK-21436 URL: https://issues.apache.org/jira/browse/SPARK-21436 Project: Spark Issue Type: Impr

[jira] [Updated] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21394: Affects Version/s: (was: 2.2.0) > Reviving broken callable objects in UDF in PySpark >

[jira] [Assigned] (SPARK-21432) Reviving broken partial functions in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21432: --- Assignee: Hyukjin Kwon > Reviving broken partial functions in UDF in PySpark > -

[jira] [Resolved] (SPARK-21432) Reviving broken partial functions in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21432. - Resolution: Fixed > Reviving broken partial functions in UDF in PySpark > ---

[jira] [Assigned] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21394: --- Assignee: Hyukjin Kwon > Reviving broken callable objects in UDF in PySpark > --

[jira] [Resolved] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21394. - Resolution: Fixed > Reviving broken callable objects in UDF in PySpark >

[jira] [Updated] (SPARK-21432) Reviving broken partial functions in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21432: Fix Version/s: 2.3.0 > Reviving broken partial functions in UDF in PySpark > --

[jira] [Updated] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21394: Fix Version/s: 2.3.0 > Reviving broken callable objects in UDF in PySpark > ---

[jira] [Commented] (SPARK-7146) Should ML sharedParams be a public API?

2017-07-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095299#comment-16095299 ] holdenk commented on SPARK-7146: So it seems like there is a (more recent) agreement that

[jira] [Created] (SPARK-21489) Update release docs to point out Python 2.6 support is removed.

2017-07-20 Thread holdenk (JIRA)
holdenk created SPARK-21489: --- Summary: Update release docs to point out Python 2.6 support is removed. Key: SPARK-21489 URL: https://issues.apache.org/jira/browse/SPARK-21489 Project: Spark Issue

[jira] [Resolved] (SPARK-21489) Update release docs to point out Python 2.6 support is removed.

2017-07-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21489. - Resolution: Resolved Assignee: Hyukjin Kwon Fix Version/s: 2.2.1 Fixed in https://github

[jira] [Assigned] (SPARK-21434) Add PySpark pip documentation

2017-07-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21434: --- Assignee: holdenk > Add PySpark pip documentation > - > >

[jira] [Resolved] (SPARK-21434) Add PySpark pip documentation

2017-07-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21434. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 Issue resolved by pull request 186

[jira] [Resolved] (SPARK-20090) Add StructType.fieldNames to Python API

2017-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20090. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18618 [https://github.com/ap

[jira] [Assigned] (SPARK-20090) Add StructType.fieldNames to Python API

2017-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20090: --- Assignee: Hyukjin Kwon > Add StructType.fieldNames to Python API > -

[jira] [Commented] (SPARK-21573) Tests failing with run-tests.py SyntaxError occasionally in Jenkins

2017-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106024#comment-16106024 ] holdenk commented on SPARK-21573: - Yes we did drop 2.6 support. We should change the scri

[jira] [Commented] (SPARK-16020) Fix complete mode aggregation with console sink

2016-06-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353579#comment-15353579 ] holdenk commented on SPARK-16020: - Do we know why this bug happened? > Fix complete mode

[jira] [Commented] (SPARK-13233) Python Dataset

2016-06-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356670#comment-15356670 ] holdenk commented on SPARK-13233: - [~maver1ck] not really sure what the API plan is here

[jira] [Commented] (SPARK-13233) Python Dataset

2016-06-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356683#comment-15356683 ] holdenk commented on SPARK-13233: - The ability to intermix functional transformations eas

[jira] [Created] (SPARK-16407) Allow users to supply custom StreamSinkProviders

2016-07-06 Thread holdenk (JIRA)
holdenk created SPARK-16407: --- Summary: Allow users to supply custom StreamSinkProviders Key: SPARK-16407 URL: https://issues.apache.org/jira/browse/SPARK-16407 Project: Spark Issue Type: Improvemen

[jira] [Commented] (SPARK-15581) MLlib 2.1 Roadmap

2016-07-07 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366520#comment-15366520 ] holdenk commented on SPARK-15581: - What do we think of Streaming ML Pipelines being on th

[jira] [Created] (SPARK-16424) Add support for Structured Streaming to the ML Pipeline API

2016-07-07 Thread holdenk (JIRA)
holdenk created SPARK-16424: --- Summary: Add support for Structured Streaming to the ML Pipeline API Key: SPARK-16424 URL: https://issues.apache.org/jira/browse/SPARK-16424 Project: Spark Issue Type

[jira] [Commented] (SPARK-15581) MLlib 2.1 Roadmap

2016-07-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368320#comment-15368320 ] holdenk commented on SPARK-15581: - Yah - the more I look at it the more rough it seems -

[jira] [Created] (SPARK-16454) Consider adding a per-batch transform for structured streaming

2016-07-08 Thread holdenk (JIRA)
holdenk created SPARK-16454: --- Summary: Consider adding a per-batch transform for structured streaming Key: SPARK-16454 URL: https://issues.apache.org/jira/browse/SPARK-16454 Project: Spark Issue T

[jira] [Updated] (SPARK-16424) Add support for Structured Streaming to the ML Pipeline API

2016-07-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16424: Description: For Spark 2.1 we should consider adding support for machine learning on top of the structured

[jira] [Commented] (SPARK-14813) ML 2.0 QA: API: Python API coverage

2016-07-11 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371624#comment-15371624 ] holdenk commented on SPARK-14813: - Yup, auditing is done and once 2.0 is out we will go b

[jira] [Updated] (SPARK-15581) MLlib 2.1 Roadmap

2016-07-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-15581: Description: This is a master list for MLlib improvements we are working on for the next release. Please v

[jira] [Commented] (SPARK-16589) Chained cartesian produces incorrect number of records

2016-07-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15390299#comment-15390299 ] holdenk commented on SPARK-16589: - Yah I think we should explore whats going on a bit mor

[jira] [Created] (SPARK-16720) Loading CSV file with 2k+ columns and writing result with one selected column fails during attribute resolution

2016-07-25 Thread holdenk (JIRA)
holdenk created SPARK-16720: --- Summary: Loading CSV file with 2k+ columns and writing result with one selected column fails during attribute resolution Key: SPARK-16720 URL: https://issues.apache.org/jira/browse/SPARK-16

[jira] [Updated] (SPARK-16720) Loading CSV file with 2k+ columns fails during attribute resolution on action

2016-07-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16720: Summary: Loading CSV file with 2k+ columns fails during attribute resolution on action (was: Loading CSV f

[jira] [Commented] (SPARK-15130) PySpark shared params should include default values to match Scala

2016-07-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392716#comment-15392716 ] holdenk commented on SPARK-15130: - Now that 2.0 is ready to go out, maybe we can decide w

[jira] [Created] (SPARK-16773) Post Spark 2.0 deprecation cleanup

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16773: --- Summary: Post Spark 2.0 deprecation cleanup Key: SPARK-16773 URL: https://issues.apache.org/jira/browse/SPARK-16773 Project: Spark Issue Type: Improvement Co

[jira] [Created] (SPARK-16774) Fix use of deprecated TimeStamp constructor

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16774: --- Summary: Fix use of deprecated TimeStamp constructor Key: SPARK-16774 URL: https://issues.apache.org/jira/browse/SPARK-16774 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-16775) Reduce internal warnings from deprecated accumulator API

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16775: --- Summary: Reduce internal warnings from deprecated accumulator API Key: SPARK-16775 URL: https://issues.apache.org/jira/browse/SPARK-16775 Project: Spark Issue Type: Su

[jira] [Created] (SPARK-16776) Fix Kafka deprecation warnings

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16776: --- Summary: Fix Kafka deprecation warnings Key: SPARK-16776 URL: https://issues.apache.org/jira/browse/SPARK-16776 Project: Spark Issue Type: Sub-task Component

[jira] [Updated] (SPARK-16775) Reduce internal warnings from deprecated accumulator API

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16775: Component/s: (was: ML) (was: SQL) (was: MLlib) > Reduce inter

[jira] [Created] (SPARK-16777) Parquet schema converter depends on deprecated APIs

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16777: --- Summary: Parquet schema converter depends on deprecated APIs Key: SPARK-16777 URL: https://issues.apache.org/jira/browse/SPARK-16777 Project: Spark Issue Type: Sub-tas

[jira] [Updated] (SPARK-16775) Reduce internal warnings from deprecated accumulator API

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16775: Component/s: SQL > Reduce internal warnings from deprecated accumulator API > -

[jira] [Created] (SPARK-16778) Fix use of deprecated SQLContext constructor

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16778: --- Summary: Fix use of deprecated SQLContext constructor Key: SPARK-16778 URL: https://issues.apache.org/jira/browse/SPARK-16778 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-16773) Post Spark 2.0 deprecation & warnings cleanup

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16773: Summary: Post Spark 2.0 deprecation & warnings cleanup (was: Post Spark 2.0 deprecation cleanup) > Post S

[jira] [Created] (SPARK-16779) Fix unnecessary use of postfix operations

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16779: --- Summary: Fix unnecessary use of postfix operations Key: SPARK-16779 URL: https://issues.apache.org/jira/browse/SPARK-16779 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-16779) Fix unnecessary use of postfix operations

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397984#comment-15397984 ] holdenk commented on SPARK-16779: - I'm sort of on the fence with fixing as well - but we

[jira] [Comment Edited] (SPARK-16779) Fix unnecessary use of postfix operations

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397984#comment-15397984 ] holdenk edited comment on SPARK-16779 at 7/28/16 6:36 PM: -- I'm s

[jira] [Commented] (SPARK-16774) Fix use of deprecated TimeStamp constructor

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398424#comment-15398424 ] holdenk commented on SPARK-16774: - While diving into this (relatedly I hate timezones) -

[jira] [Updated] (SPARK-16774) Fix use of deprecated TimeStamp constructor (also providing incorrect results)

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16774: Summary: Fix use of deprecated TimeStamp constructor (also providing incorrect results) (was: Fix use of d

[jira] [Updated] (SPARK-16774) Fix use of deprecated TimeStamp constructor (also providing incorrect results)

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16774: Description: The TimeStamp constructor we use inside of DateTime utils has been deprecated since JDK 1.1 -

[jira] [Updated] (SPARK-16774) Fix use of deprecated TimeStamp constructor

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-16774: Description: The TimeStamp constructor we use inside of DateTime utils has been deprecated since JDK 1.1 -

[jira] [Comment Edited] (SPARK-16774) Fix use of deprecated TimeStamp constructor (also providing incorrect results)

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398424#comment-15398424 ] holdenk edited comment on SPARK-16774 at 7/28/16 11:38 PM: --- Whi

[jira] [Created] (SPARK-16788) Investigate JSR-310 & scala-time alternatives to our own datetime utils

2016-07-28 Thread holdenk (JIRA)
holdenk created SPARK-16788: --- Summary: Investigate JSR-310 & scala-time alternatives to our own datetime utils Key: SPARK-16788 URL: https://issues.apache.org/jira/browse/SPARK-16788 Project: Spark

[jira] [Commented] (SPARK-16788) Investigate JSR-310 & scala-time alternatives to our own datetime utils

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398632#comment-15398632 ] holdenk commented on SPARK-16788: - cc [~davies] [~ckadner] :) > Investigate JSR-310 & sc

[jira] [Commented] (SPARK-16777) Parquet schema converter depends on deprecated APIs

2016-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398827#comment-15398827 ] holdenk commented on SPARK-16777: - That's a good point, thanks for the comment/note :) I

[jira] [Commented] (SPARK-16777) Parquet schema converter depends on deprecated APIs

2016-07-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398869#comment-15398869 ] holdenk commented on SPARK-16777: - Go for it :) Please CC me on the PR so I can do a code

[jira] [Commented] (SPARK-16776) Fix Kafka deprecation warnings

2016-07-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399918#comment-15399918 ] holdenk commented on SPARK-16776: - Is this one you wanted to take on as well? > Fix Kafk

[jira] [Commented] (SPARK-16779) Fix unnecessary use of postfix operations

2016-07-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400092#comment-15400092 ] holdenk commented on SPARK-16779: - I've gone ahead and done a more full-scope version - l

[jira] [Created] (SPARK-16814) Fix deprecated use of ParquetWriter in Parquet test suites

2016-07-30 Thread holdenk (JIRA)
holdenk created SPARK-16814: --- Summary: Fix deprecated use of ParquetWriter in Parquet test suites Key: SPARK-16814 URL: https://issues.apache.org/jira/browse/SPARK-16814 Project: Spark Issue Type:

[jira] [Created] (SPARK-27095) We depend on silently accepting failures in setup-integration-test-env.sh

2019-03-07 Thread holdenk (JIRA)
holdenk created SPARK-27095: --- Summary: We depend on silently accepting failures in setup-integration-test-env.sh Key: SPARK-27095 URL: https://issues.apache.org/jira/browse/SPARK-27095 Project: Spark

[jira] [Resolved] (SPARK-9792) PySpark DenseMatrix, SparseMatrix should override __eq__

2019-04-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-9792. Resolution: Fixed Fix Version/s: 3.0.0 > PySpark DenseMatrix, SparseMatrix should override __eq__ >

[jira] [Commented] (SPARK-18073) Migrate wiki to spark.apache.org web site

2016-10-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602152#comment-15602152 ] holdenk commented on SPARK-18073: - I like the idea of migrating everything off of the wik

[jira] [Reopened] (SPARK-1267) Add a pip installer for PySpark

2016-10-26 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reopened SPARK-1267: re-opening after discussion on mailing list and PR thread. > Add a pip installer for PySpark >

[jira] [Created] (SPARK-18128) Add support for publishing to PyPI

2016-10-26 Thread holdenk (JIRA)
holdenk created SPARK-18128: --- Summary: Add support for publishing to PyPI Key: SPARK-18128 URL: https://issues.apache.org/jira/browse/SPARK-18128 Project: Spark Issue Type: Improvement Co

[jira] [Created] (SPARK-18129) Sign pip artifacts

2016-10-26 Thread holdenk (JIRA)
holdenk created SPARK-18129: --- Summary: Sign pip artifacts Key: SPARK-18129 URL: https://issues.apache.org/jira/browse/SPARK-18129 Project: Spark Issue Type: Improvement Components: PySpar

[jira] [Created] (SPARK-18136) Make PySpark pip install works on windows

2016-10-27 Thread holdenk (JIRA)
holdenk created SPARK-18136: --- Summary: Make PySpark pip install works on windows Key: SPARK-18136 URL: https://issues.apache.org/jira/browse/SPARK-18136 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-17602) PySpark - Performance Optimization Large Size of Broadcast Variable

2016-10-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15611611#comment-15611611 ] holdenk commented on SPARK-17602: - This certainly looks interesting, do you maybe have so

[jira] [Updated] (SPARK-18128) Add support for publishing to PyPI

2016-10-27 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-18128: Description: After SPARK-1267 is done we should add support for publishing to PyPI similar to how we publi

[jira] [Commented] (SPARK-2868) Support named accumulators in Python

2016-11-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625621#comment-15625621 ] holdenk commented on SPARK-2868: or maybe [~rxin] or [~squito] who have been doing some ot

[jira] [Closed] (SPARK-3981) Consider a better approach to initialize SerDe on executors

2016-11-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-3981. -- Resolution: Won't Fix I'm closing this as a "Won't Fix" for now since we are moving over to the ML APIs. If thi

[jira] [Closed] (SPARK-7638) Python API for pmml.export

2016-11-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk closed SPARK-7638. -- Resolution: Won't Fix We are moving away from the MLlib APIs, so any new functionality should be done against t

[jira] [Commented] (SPARK-7146) Should ML sharedParams be a public API?

2016-11-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625685#comment-15625685 ] holdenk commented on SPARK-7146: I think it might be reasonable to just expose it as Scala

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15631463#comment-15631463 ] holdenk commented on SPARK-18128: - Extracted from the discussion around SPARK-1267: Peop

[jira] [Commented] (SPARK-15581) MLlib 2.1 Roadmap

2016-11-03 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634568#comment-15634568 ] holdenk commented on SPARK-15581: - This sounds like really good suggestions - I think som

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637104#comment-15637104 ] holdenk commented on SPARK-18128: - Good call - so publishing to PyPI test has worked fine

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637111#comment-15637111 ] holdenk commented on SPARK-18128: - Sure > Add support for publishing to PyPI > -

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637110#comment-15637110 ] holdenk commented on SPARK-18128: - When I e-mailed [~prabinb] earlier this week I got an

[jira] [Updated] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-18128: Issue Type: Sub-task (was: Improvement) Parent: SPARK-18267 > Add support for publishing to PyPI >

[jira] [Created] (SPARK-18418) Make release script hadoop profiles aren't correctly specified.

2016-11-11 Thread holdenk (JIRA)
holdenk created SPARK-18418: --- Summary: Make release script hadoop profiles aren't correctly specified. Key: SPARK-18418 URL: https://issues.apache.org/jira/browse/SPARK-18418 Project: Spark Issue

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2016-11-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674308#comment-15674308 ] holdenk commented on SPARK-2620: I don't think its been resolved, does your code need to b

[jira] [Commented] (SPARK-12469) Data Property Accumulators for Spark (formerly Consistent Accumulators)

2016-11-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687563#comment-15687563 ] holdenk commented on SPARK-12469: - Hi [~rxin]/[~squito] if we want to try and get this in

[jira] [Commented] (SPARK-12469) Data Property Accumulators for Spark (formerly Consistent Accumulators)

2016-11-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687604#comment-15687604 ] holdenk commented on SPARK-12469: - In some ways I agree, on the other hand its slipped 2.

[jira] [Commented] (SPARK-12469) Data Property Accumulators for Spark (formerly Consistent Accumulators)

2016-11-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687763#comment-15687763 ] holdenk commented on SPARK-12469: - Cool - I'll bug y'all after the 2.1 release is out so

[jira] [Created] (SPARK-18576) Expose basic TaskContext info in PySpark

2016-11-24 Thread holdenk (JIRA)
holdenk created SPARK-18576: --- Summary: Expose basic TaskContext info in PySpark Key: SPARK-18576 URL: https://issues.apache.org/jira/browse/SPARK-18576 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18541) Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata management in pyspark SQL API

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695922#comment-15695922 ] holdenk commented on SPARK-18541: - Making it easier for PySpark SQL users to specify meta

[jira] [Updated] (SPARK-18532) Code generation memory issue

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-18532: Component/s: (was: Spark Core) SQL > Code generation memory issue > --

[jira] [Updated] (SPARK-18502) Spark does not handle columns that contain backquote (`)

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-18502: Component/s: (was: Spark Core) SQL > Spark does not handle columns that contain backqu

[jira] [Commented] (SPARK-18405) Add yarn-cluster mode support to Spark Thrift Server

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695936#comment-15695936 ] holdenk commented on SPARK-18405: - Even in cluster mode you could overwhelm the node runn

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695939#comment-15695939 ] holdenk commented on SPARK-18128: - Thanks! :) I'll start working on this issue once we st

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695941#comment-15695941 ] holdenk commented on SPARK-18128: - Thanks! :) I'll start working on this issue once we st

[jira] [Updated] (SPARK-18108) Partition discovery fails with explicitly written long partitions

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-18108: Component/s: (was: Spark Core) SQL > Partition discovery fails with explicitly written

[jira] [Commented] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695956#comment-15695956 ] holdenk commented on SPARK-17788: - This is semi-expected behaviour of the range partition

[jira] [Updated] (SPARK-17788) RangePartitioner results in few very large tasks and many small to empty tasks

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-17788: Target Version/s: (was: 2.1.0) > RangePartitioner results in few very large tasks and many small to empty

[jira] [Commented] (SPARK-636) Add mechanism to run system management/configuration tasks on all workers

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695967#comment-15695967 ] holdenk commented on SPARK-636: --- If you have a logging system you want to initialize wouldn't

[jira] [Commented] (SPARK-5190) Allow spark listeners to be added before spark context gets initialized.

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696003#comment-15696003 ] holdenk commented on SPARK-5190: This seems to be fixed, but we forgot to close (cc [~josh

[jira] [Resolved] (SPARK-3348) Support user-defined SparkListeners properly

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-3348. Resolution: Duplicate > Support user-defined SparkListeners properly > -

[jira] [Commented] (SPARK-5997) Increase partition count without performing a shuffle

2016-11-25 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696016#comment-15696016 ] holdenk commented on SPARK-5997: That could work, although we'd probably want a different
