[jira] [Commented] (SPARK-17944) sbin/start-* scripts use of `hostname -f` fail for Solaris

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576323#comment-15576323 ] Sean Owen commented on SPARK-17944: --- Yeah, I think Solaris is the odd man out here then. Linux and OS X

[jira] [Commented] (SPARK-17944) sbin/start-* scripts use of `hostname -f` fail for Solaris

2016-10-14 Thread Erik O'Shaughnessy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576367#comment-15576367 ] Erik O'Shaughnessy commented on SPARK-17944: I'm sure there are situations where Linux and OS

[jira] [Created] (SPARK-17945) Writing to S3 should allow setting object metadata

2016-10-14 Thread Jeff Schobelock (JIRA)
Jeff Schobelock created SPARK-17945: --- Summary: Writing to S3 should allow setting object metadata Key: SPARK-17945 URL: https://issues.apache.org/jira/browse/SPARK-17945 Project: Spark

[jira] [Updated] (SPARK-17944) sbin/start-* scripts use of `hostname -f` fail with Solaris

2016-10-14 Thread Erik O'Shaughnessy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik O'Shaughnessy updated SPARK-17944: --- Summary: sbin/start-* scripts use of `hostname -f` fail with Solaris (was:

[jira] [Assigned] (SPARK-17947) Document the impact of `spark.sql.debug`

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17947: Assignee: (was: Apache Spark) > Document the impact of `spark.sql.debug` >

[jira] [Assigned] (SPARK-17947) Document the impact of `spark.sql.debug`

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17947: Assignee: Apache Spark > Document the impact of `spark.sql.debug` >

[jira] [Created] (SPARK-17947) Just document the impact of `spark.sql.debug`

2016-10-14 Thread Xiao Li (JIRA)
Xiao Li created SPARK-17947: --- Summary: Just document the impact of `spark.sql.debug` Key: SPARK-17947 URL: https://issues.apache.org/jira/browse/SPARK-17947 Project: Spark Issue Type:

[jira] [Updated] (SPARK-17947) Document the impact of `spark.sql.debug`

2016-10-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-17947: Summary: Document the impact of `spark.sql.debug` (was: Just document the impact of `spark.sql.debug`) >

[jira] [Commented] (SPARK-17947) Document the impact of `spark.sql.debug`

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576587#comment-15576587 ] Apache Spark commented on SPARK-17947: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Resolved] (SPARK-17863) SELECT distinct does not work if there is a order by clause

2016-10-14 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-17863. -- Resolution: Fixed Fix Version/s: 2.1.0 2.0.2 Issue resolved by pull request

[jira] [Updated] (SPARK-17863) SELECT distinct does not work if there is a order by clause

2016-10-14 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17863: - Assignee: Davies Liu > SELECT distinct does not work if there is a order by clause >

[jira] [Comment Edited] (SPARK-17942) OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=

2016-10-14 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576375#comment-15576375 ] Harish edited comment on SPARK-17942 at 10/14/16 8:20 PM: -- --conf

[jira] [Commented] (SPARK-17942) OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=

2016-10-14 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576375#comment-15576375 ] Harish commented on SPARK-17942: --conf "spark.executor.extraJavaOptions=-XX:ReservedCodeCacheSize=600m"

[jira] [Resolved] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-17620. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15190

[jira] [Updated] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17620: - Fix Version/s: (was: 2.1.0) > hive.default.fileformat=orc does not set OrcSerde >

[jira] [Assigned] (SPARK-17946) Python crossJoin API similar to Scala

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17946: Assignee: Apache Spark > Python crossJoin API similar to Scala >

[jira] [Assigned] (SPARK-17946) Python crossJoin API similar to Scala

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17946: Assignee: (was: Apache Spark) > Python crossJoin API similar to Scala >

[jira] [Assigned] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17620: Assignee: Dilip Biswal (was: Apache Spark) > hive.default.fileformat=orc does not set

[jira] [Commented] (SPARK-17636) Parquet filter push down doesn't handle struct fields

2016-10-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576513#comment-15576513 ] Cheng Lian commented on SPARK-17636: [~MasterDDT], yes, just as what [~hyukjin.kwon] explained

[jira] [Updated] (SPARK-17941) Logistic regression test suites should use weights when comparing to glmnet

2016-10-14 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-17941: Assignee: Seth Hendrickson > Logistic regression test suites should use weights when comparing to glmnet >

[jira] [Closed] (SPARK-17941) Logistic regression test suites should use weights when comparing to glmnet

2016-10-14 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai closed SPARK-17941. --- Resolution: Fixed Fix Version/s: 2.1.0 > Logistic regression test suites should use weights when

[jira] [Created] (SPARK-17946) Python crossJoin API similar to Scala

2016-10-14 Thread Srinath (JIRA)
Srinath created SPARK-17946: --- Summary: Python crossJoin API similar to Scala Key: SPARK-17946 URL: https://issues.apache.org/jira/browse/SPARK-17946 Project: Spark Issue Type: Bug

[jira] [Reopened] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai reopened SPARK-17620: -- > hive.default.fileformat=orc does not set OrcSerde > - >

[jira] [Commented] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576492#comment-15576492 ] Yin Huai commented on SPARK-17620: -- The PR somehow breaks the build and it has been reverted. >

[jira] [Assigned] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17620: Assignee: Apache Spark (was: Dilip Biswal) > hive.default.fileformat=orc does not set

[jira] [Commented] (SPARK-17946) Python crossJoin API similar to Scala

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576495#comment-15576495 ] Apache Spark commented on SPARK-17946: -- User 'srinathshankar' has created a pull request for this

[jira] [Updated] (SPARK-17636) Parquet filter push down doesn't handle struct fields

2016-10-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-17636: --- Description: There's a *PushedFilters* for a simple numeric field, but not for a numeric field

[jira] [Commented] (SPARK-9783) Use SqlNewHadoopRDD in JSONRelation to eliminate extra refresh() call

2016-10-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576523#comment-15576523 ] Cheng Lian commented on SPARK-9783: --- Yes, I'm closing this. Thanks! > Use SqlNewHadoopRDD in

[jira] [Closed] (SPARK-9783) Use SqlNewHadoopRDD in JSONRelation to eliminate extra refresh() call

2016-10-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian closed SPARK-9783. - Resolution: Not A Problem This issue is no longer a problem since we re-implemented the JSON data source

[jira] [Updated] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-17620: Assignee: Dilip Biswal > hive.default.fileformat=orc does not set OrcSerde >

[jira] [Commented] (SPARK-10954) Parquet version in the "created_by" metadata field of Parquet files written by Spark 1.5 and 1.6 is wrong

2016-10-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576623#comment-15576623 ] Cheng Lian commented on SPARK-10954: [~hyukjin.kwon], yes, confirmed. Thanks! > Parquet version in

[jira] [Resolved] (SPARK-16063) Add storageLevel to Dataset

2016-10-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-16063. -- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13780

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Description: h4. Problem When I run a job that requires some shuffle, some tasks fail because

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Attachment: screenshot-2.png screenshot-1.png > Shuffle fails when driver is on

[jira] [Created] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
Frank Rosner created SPARK-17933: Summary: Shuffle fails when driver is on one of the same machines as executor Key: SPARK-17933 URL: https://issues.apache.org/jira/browse/SPARK-17933 Project: Spark

[jira] [Resolved] (SPARK-4257) Spark master can only be accessed by hostname

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4257. -- Resolution: Not A Problem > Spark master can only be accessed by hostname >

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-10-14 Thread Low Chin Wei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574502#comment-15574502 ] Low Chin Wei commented on SPARK-13747: -- I encounter this in 2.0.1, is there any workaround like

[jira] [Resolved] (SPARK-17572) Write.df is failing on spark cluster

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17572. --- Resolution: Not A Problem > Write.df is failing on spark cluster >

[jira] [Resolved] (SPARK-17903) MetastoreRelation should talk to external catalog instead of hive client

2016-10-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17903. - Resolution: Fixed Fix Version/s: 2.1.0 > MetastoreRelation should talk to external

[jira] [Commented] (SPARK-17917) Convert 'Initial job has not accepted any resources..' logWarning to a SparkListener event

2016-10-14 Thread Mario Briggs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574639#comment-15574639 ] Mario Briggs commented on SPARK-17917: -- >> I don't have a strong feeling on this partly because I'm

[jira] [Resolved] (SPARK-17898) --repositories needs username and password

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17898. --- Resolution: Not A Problem These are Gradle questions, really. It should be perfectly possible to

[jira] [Resolved] (SPARK-17555) ExternalShuffleBlockResolver fails randomly with External Shuffle Service and Dynamic Resource Allocation on Mesos running under Marathon

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17555. --- Resolution: Not A Problem > ExternalShuffleBlockResolver fails randomly with External Shuffle

[jira] [Created] (SPARK-17934) Support percentile scale in ml.feature

2016-10-14 Thread Lei Wang (JIRA)
Lei Wang created SPARK-17934: Summary: Support percentile scale in ml.feature Key: SPARK-17934 URL: https://issues.apache.org/jira/browse/SPARK-17934 Project: Spark Issue Type: New Feature

[jira] [Resolved] (SPARK-5113) Audit and document use of hostnames and IP addresses in Spark

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5113. -- Resolution: Won't Fix > Audit and document use of hostnames and IP addresses in Spark >

[jira] [Assigned] (SPARK-17929) Deadlock when AM restart and send RemoveExecutor on reset

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17929: Assignee: (was: Apache Spark) > Deadlock when AM restart and send RemoveExecutor on

[jira] [Resolved] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-650. - Resolution: Duplicate > Add a "setup hook" API for running initialization code on each executor >

[jira] [Resolved] (SPARK-16720) Loading CSV file with 2k+ columns fails during attribute resolution on action

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16720. --- Resolution: Not A Problem > Loading CSV file with 2k+ columns fails during attribute resolution on

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574540#comment-15574540 ] Apache Spark commented on SPARK-16845: -- User 'lw-lin' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-16575) partition calculation mismatch with sc.binaryFiles

2016-10-14 Thread Tarun Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572950#comment-15572950 ] Tarun Kumar edited comment on SPARK-16575 at 10/14/16 8:38 AM: --- [~rxin] I

[jira] [Commented] (SPARK-17934) Support percentile scale in ml.feature

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574787#comment-15574787 ] Sean Owen commented on SPARK-17934: --- This might be on the border of things that are widely used enough

[jira] [Resolved] (SPARK-10872) Derby error (XSDB6) when creating new HiveContext after restarting SparkContext

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10872. --- Resolution: Not A Problem I don't think this is something to be fixed if I understand it correctly,

[jira] [Commented] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574802#comment-15574802 ] Sean Owen commented on SPARK-17933: --- I think this is related to a lot of existing JIRAs concerning the

[jira] [Assigned] (SPARK-17929) Deadlock when AM restart and send RemoveExecutor on reset

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17929: Assignee: Apache Spark > Deadlock when AM restart and send RemoveExecutor on reset >

[jira] [Commented] (SPARK-17929) Deadlock when AM restart and send RemoveExecutor on reset

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574801#comment-15574801 ] Apache Spark commented on SPARK-17929: -- User 'scwf' has created a pull request for this issue:

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Description: h4. Problem When I run a job that requires some shuffle, some tasks fail because

[jira] [Commented] (SPARK-12776) Implement Python API for Datasets

2016-10-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576739#comment-15576739 ] Michael Armbrust commented on SPARK-12776: -- I would love to see better support here, but I don't

[jira] [Commented] (SPARK-17813) Maximum data per trigger

2016-10-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576826#comment-15576826 ] Michael Armbrust commented on SPARK-17813: -- I think it's okay to ignore compacted topics, at

[jira] [Created] (SPARK-17948) WARN CodeGenerator: Error calculating stats of compiled class

2016-10-14 Thread Harish (JIRA)
Harish created SPARK-17948: -- Summary: WARN CodeGenerator: Error calculating stats of compiled class Key: SPARK-17948 URL: https://issues.apache.org/jira/browse/SPARK-17948 Project: Spark Issue

[jira] [Created] (SPARK-17949) Introduce a JVM object based aggregate operator

2016-10-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-17949: --- Summary: Introduce a JVM object based aggregate operator Key: SPARK-17949 URL: https://issues.apache.org/jira/browse/SPARK-17949 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17709) spark 2.0 join - column resolution error

2016-10-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576763#comment-15576763 ] Xiao Li commented on SPARK-17709: - That is what I said above. The deduplication is not triggered. It

[jira] [Assigned] (SPARK-17748) One-pass algorithm for linear regression with L1 and elastic-net penalties

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17748: Assignee: Apache Spark (was: Seth Hendrickson) > One-pass algorithm for linear

[jira] [Commented] (SPARK-17748) One-pass algorithm for linear regression with L1 and elastic-net penalties

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576770#comment-15576770 ] Apache Spark commented on SPARK-17748: -- User 'sethah' has created a pull request for this issue:

[jira] [Commented] (SPARK-17620) hive.default.fileformat=orc does not set OrcSerde

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576787#comment-15576787 ] Apache Spark commented on SPARK-17620: -- User 'dilipbiswal' has created a pull request for this

[jira] [Resolved] (SPARK-17942) OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=

2016-10-14 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish resolved SPARK-17942. Resolution: Works for Me > OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using >

[jira] [Commented] (SPARK-17812) More granular control of starting offsets (assign)

2016-10-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576840#comment-15576840 ] Michael Armbrust commented on SPARK-17812: -- That sounds pretty good to me, with one question:

[jira] [Resolved] (SPARK-17948) WARN CodeGenerator: Error calculating stats of compiled class

2016-10-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-17948. Resolution: Duplicate > WARN CodeGenerator: Error calculating stats of compiled class >

[jira] [Commented] (SPARK-16002) Sleep when no new data arrives to avoid 100% CPU usage

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577198#comment-15577198 ] Apache Spark commented on SPARK-16002: -- User 'lw-lin' has created a pull request for this issue:

[jira] [Updated] (SPARK-17851) Make sure all test sqls in catalyst pass checkAnalyze

2016-10-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-17851: - Description: Currently we have several tens of test SQLs in catalyst that will fail at

[jira] [Commented] (SPARK-636) Add mechanism to run system management/configuration tasks on all workers

2016-10-14 Thread Luis Ramos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575005#comment-15575005 ] Luis Ramos commented on SPARK-636: -- I feel like the broadcasting mechanism doesn't get me "close" enough

[jira] [Commented] (SPARK-17930) The SerializerInstance instance used when deserializing a TaskResult is not reused

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575123#comment-15575123 ] Sean Owen commented on SPARK-17930: --- But this code is only ever called once per object. Reuse doesn't

[jira] [Commented] (SPARK-17868) Do not use bitmasks during parsing and analysis of CUBE/ROLLUP/GROUPING SETS

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575400#comment-15575400 ] Apache Spark commented on SPARK-17868: -- User 'jiangxb1987' has created a pull request for this

[jira] [Assigned] (SPARK-17868) Do not use bitmasks during parsing and analysis of CUBE/ROLLUP/GROUPING SETS

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17868: Assignee: (was: Apache Spark) > Do not use bitmasks during parsing and analysis of

[jira] [Resolved] (SPARK-14634) Add BisectingKMeansSummary

2016-10-14 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-14634. - Resolution: Fixed Assignee: zhengruifeng Fix Version/s: 2.1.0 > Add

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Attachment: (was: screenshot-1.png) > Shuffle fails when driver is on one of the same

[jira] [Updated] (SPARK-17935) Add KafkaForeachWriter in external kafka-0.8.0 for structured streaming module

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17935: -- Target Version/s: (was: 2.0.0) Fix Version/s: (was: 2.0.0) > Add KafkaForeachWriter in

[jira] [Resolved] (SPARK-15402) PySpark ml.evaluation should support save/load

2016-10-14 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-15402. - Resolution: Fixed Assignee: Yanbo Liang Fix Version/s: 2.1.0 > PySpark

[jira] [Resolved] (SPARK-17787) spark submit throws error while using kafka Appender log4j:ERROR Could not instantiate class [kafka.producer.KafkaLog4jAppender]. java.lang.ClassNotFoundException: kafk

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17787. --- Resolution: Not A Problem > spark submit throws error while using kafka Appender log4j:ERROR Could

[jira] [Updated] (SPARK-17855) Spark worker throw Exception when uber jar's http url contains query string

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17855: -- Assignee: Hao Ren > Spark worker throw Exception when uber jar's http url contains query string >

[jira] [Commented] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575232#comment-15575232 ] Frank Rosner commented on SPARK-17933: -- Thanks [~srowen]. I know a lot of discussions about the

[jira] [Resolved] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17936. --- Resolution: Duplicate Duplicate of several JIRAs -- have a look through first. > "CodeGenerator -

[jira] [Updated] (SPARK-17933) Shuffle fails when driver is on one of the same machines as executor

2016-10-14 Thread Frank Rosner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Rosner updated SPARK-17933: - Attachment: screenshot-1.png > Shuffle fails when driver is on one of the same machines as

[jira] [Updated] (SPARK-17937) Clarify Kafka offset semantics for Structured Streaming

2016-10-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-17937: --- Description: Possible events for which offsets are needed: # New partition is discovered #

[jira] [Created] (SPARK-17940) Typo in LAST function error message

2016-10-14 Thread Shuai Lin (JIRA)
Shuai Lin created SPARK-17940: - Summary: Typo in LAST function error message Key: SPARK-17940 URL: https://issues.apache.org/jira/browse/SPARK-17940 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-17935) Add KafkaForeachWriter in external kafka-0.8.0 for structured streaming module

2016-10-14 Thread zhangxinyu (JIRA)
zhangxinyu created SPARK-17935: -- Summary: Add KafkaForeachWriter in external kafka-0.8.0 for structured streaming module Key: SPARK-17935 URL: https://issues.apache.org/jira/browse/SPARK-17935 Project:

[jira] [Updated] (SPARK-17935) Add KafkaForeachWriter in external kafka-0.8.0 for structured streaming module

2016-10-14 Thread zhangxinyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangxinyu updated SPARK-17935: --- Description: Now spark already supports kafkaInputStream. It would be useful that we add

[jira] [Comment Edited] (SPARK-636) Add mechanism to run system management/configuration tasks on all workers

2016-10-14 Thread Luis Ramos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575005#comment-15575005 ] Luis Ramos edited comment on SPARK-636 at 10/14/16 11:07 AM: - I feel like the

[jira] [Comment Edited] (SPARK-636) Add mechanism to run system management/configuration tasks on all workers

2016-10-14 Thread Luis Ramos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575005#comment-15575005 ] Luis Ramos edited comment on SPARK-636 at 10/14/16 11:11 AM: - I feel like the

[jira] [Resolved] (SPARK-17855) Spark worker throw Exception when uber jar's http url contains query string

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17855. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15420

[jira] [Resolved] (SPARK-17777) Spark Scheduler Hangs Indefinitely

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17777. --- Resolution: Not A Problem > Spark Scheduler Hangs Indefinitely > --

[jira] [Resolved] (SPARK-10954) Parquet version in the "created_by" metadata field of Parquet files written by Spark 1.5 and 1.6 is wrong

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10954. --- Resolution: Not A Problem > Parquet version in the "created_by" metadata field of Parquet files

[jira] [Resolved] (SPARK-10681) DateTimeUtils needs a method to parse string to SQL's timestamp value

2016-10-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10681. --- Resolution: Not A Problem > DateTimeUtils needs a method to parse string to SQL's timestamp value >

[jira] [Assigned] (SPARK-17940) Typo in LAST function error message

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17940: Assignee: Apache Spark > Typo in LAST function error message >

[jira] [Updated] (SPARK-17937) Clarify Kafka offset semantics for Structured Streaming

2016-10-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-17937: --- Description: Possible events for which offsets are needed: # New partition is discovered #

[jira] [Updated] (SPARK-17937) Clarify Kafka offset semantics for Structured Streaming

2016-10-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-17937: --- Description: Possible events for which offsets are needed: # New partition is discovered #

[jira] [Updated] (SPARK-17624) Flaky test? StateStoreSuite maintenance

2016-10-14 Thread Adam Roberts (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Roberts updated SPARK-17624: - Affects Version/s: 2.1.0 > Flaky test? StateStoreSuite maintenance >

[jira] [Commented] (SPARK-17624) Flaky test? StateStoreSuite maintenance

2016-10-14 Thread Adam Roberts (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575683#comment-15575683 ] Adam Roberts commented on SPARK-17624: -- Having another look at this now as it still fails

[jira] [Commented] (SPARK-17940) Typo in LAST function error message

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575694#comment-15575694 ] Apache Spark commented on SPARK-17940: -- User 'lins05' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17940) Typo in LAST function error message

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17940: Assignee: (was: Apache Spark) > Typo in LAST function error message >

[jira] [Created] (SPARK-17941) Logistic regression test suites should use weights when comparing to glmnet

2016-10-14 Thread Seth Hendrickson (JIRA)
Seth Hendrickson created SPARK-17941: Summary: Logistic regression test suites should use weights when comparing to glmnet Key: SPARK-17941 URL: https://issues.apache.org/jira/browse/SPARK-17941

[jira] [Assigned] (SPARK-17941) Logistic regression test suites should use weights when comparing to glmnet

2016-10-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17941: Assignee: Apache Spark > Logistic regression test suites should use weights when
