[jira] [Commented] (SPARK-1022) Add unit tests for kafka streaming
[ https://issues.apache.org/jira/browse/SPARK-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083928#comment-14083928 ]

Apache Spark commented on SPARK-1022:

User 'jerryshao' has created a pull request for this issue: https://github.com/apache/spark/pull/1751

Add unit tests for kafka streaming
Key: SPARK-1022
URL: https://issues.apache.org/jira/browse/SPARK-1022
Project: Spark
Issue Type: Bug
Reporter: Patrick Wendell
Assignee: Saisai Shao

It would be nice if we could add unit tests to verify elements of kafka's stream. Right now we do integration tests only which makes it hard to upgrade versions of kafka. The place to start here would be to look at how kafka tests itself and see if the functionality can be exposed to third party users.

--
This message was sent by Atlassian JIRA (v6.2#6252)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1997) Update breeze to version 0.8.1
[ https://issues.apache.org/jira/browse/SPARK-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083929#comment-14083929 ]

Sean Owen commented on SPARK-1997:

Was scalalogging a problem per se? The issue was that Spark used a different version, but now it doesn't use it at all, and there is no conflict. Unless I misunderstand, it would be fine to use breeze 0.8.1 + Scala 2.10 in the current Spark code.

Update breeze to version 0.8.1
Key: SPARK-1997
URL: https://issues.apache.org/jira/browse/SPARK-1997
Project: Spark
Issue Type: Sub-task
Components: MLlib
Reporter: Guoqiang Li
Assignee: Guoqiang Li
Fix For: 1.1.0

{{breeze 0.7}} does not support {{scala 2.11}}.
[jira] [Commented] (SPARK-1449) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/SPARK-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083933#comment-14083933 ]

Sean Owen commented on SPARK-1449:

Sebb, is this just a matter of svn co https://dist.apache.org/repos/dist/release/spark/ and svn rm'ing the 0.9.1 and 1.0.0 releases? I'd do it, but I don't think I have access. [~pwendell] maybe this can be a step in the release process, if it is not already? It may well be, and these older ones were just missed last time.

Please delete old releases from mirroring system
Key: SPARK-1449
URL: https://issues.apache.org/jira/browse/SPARK-1449
Project: Spark
Issue Type: Task
Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.9.1
Reporter: Sebb

To reduce the load on the ASF mirrors, projects are required to delete old releases [1]. Please can you remove all non-current releases? Thanks!
[Note that older releases are always available from the ASF archive server]
Any links to older releases on download pages should first be adjusted to point to the archive server.
[1] http://www.apache.org/dev/release.html#when-to-archive
[jira] [Commented] (SPARK-1449) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/SPARK-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083942#comment-14083942 ]

Sebb commented on SPARK-1449:

No need to check out the directory tree (which is large); you can remove files directly from SVN using svn delete (del, remove, rm).
By default all members of the Spark PMC [1] will have karma to update the dist/release/spark tree. In particular, whoever uploaded the last release should have ensured that previous releases were tidied up a few days after uploading the latest release ...
The PMC can vote to ask Infra if they wish the dist/release/spark tree to be updateable by non-PMC members as well.
[1] http://people.apache.org/committers-by-project.html#spark-pmc

Please delete old releases from mirroring system
Key: SPARK-1449
URL: https://issues.apache.org/jira/browse/SPARK-1449
Project: Spark
Issue Type: Task
Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.9.1
Reporter: Sebb
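Sebb's point that files can be removed by URL, without a checkout, can be sketched as a small dry-run helper. This is only an illustration: the base URL comes from the thread, but the release directory names (`spark-0.9.1`, `spark-1.0.0`) and the helper name `svn_delete_command` are assumptions, and the script prints the commands rather than running them.

```python
import shlex
from typing import List

# Base URL as quoted in the thread.
DIST_RELEASE_URL = "https://dist.apache.org/repos/dist/release/spark"


def svn_delete_command(release_dir: str, message: str) -> List[str]:
    """Build an `svn delete` command that targets a repository URL directly."""
    url = f"{DIST_RELEASE_URL}/{release_dir}"
    # Deleting by URL is committed immediately, so a log message is supplied.
    return ["svn", "delete", url, "-m", message]


if __name__ == "__main__":
    # Hypothetical directory names; check the actual listing before deleting anything.
    for release_dir in ["spark-0.9.1", "spark-1.0.0"]:
        cmd = svn_delete_command(release_dir, f"Remove old release {release_dir}")
        print(shlex.join(cmd))  # dry run: print only, nothing is executed
```

Running the printed commands would still require the PMC karma described above.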
[jira] [Commented] (SPARK-2803) add Kafka stream feature for fetch messages from specified starting offset position
[ https://issues.apache.org/jira/browse/SPARK-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083973#comment-14083973 ]

pengyanhong commented on SPARK-2803:

Resolved this issue in pull request #1602.

add Kafka stream feature for fetch messages from specified starting offset position
Key: SPARK-2803
URL: https://issues.apache.org/jira/browse/SPARK-2803
Project: Spark
Issue Type: New Feature
Components: Input/Output
Reporter: pengyanhong
Labels: patch

There are some use cases in which we want to fetch messages from a specified offset position:
* replay messages
* deal with transactions
* skip a bulk of incorrect messages
* fetch messages at random according to index
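The use cases listed in SPARK-2803 all reduce to reading a log from a caller-chosen offset. A minimal sketch of that idea follows, using a toy in-memory log. It is not the Kafka API and not the code in the pull request; `PartitionLog` and its methods are hypothetical names introduced only for illustration.

```python
from typing import List, Tuple


class PartitionLog:
    """Toy append-only log standing in for one partition (not the Kafka API)."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def append(self, message: str) -> int:
        """Store a message and return the offset it was assigned."""
        self._messages.append(message)
        return len(self._messages) - 1

    def fetch(self, start_offset: int, max_messages: int) -> List[Tuple[int, str]]:
        """Return up to max_messages (offset, message) pairs starting at start_offset."""
        if start_offset < 0 or start_offset > len(self._messages):
            raise ValueError(f"offset {start_offset} is out of range")
        end = min(start_offset + max_messages, len(self._messages))
        return [(i, self._messages[i]) for i in range(start_offset, end)]


if __name__ == "__main__":
    log = PartitionLog()
    for i in range(5):
        log.append(f"m{i}")
    print(log.fetch(0, 5))  # replay everything from the beginning
    print(log.fetch(3, 5))  # skip the first three messages, e.g. a bad batch
```

Because the consumer supplies the offset, replaying, skipping, and indexed access are all the same call with a different `start_offset`.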
[jira] [Created] (SPARK-2814) HiveThriftServer throws NPE when executing native commands
Cheng Lian created SPARK-2814:

Summary: HiveThriftServer throws NPE when executing native commands
Key: SPARK-2814
URL: https://issues.apache.org/jira/browse/SPARK-2814
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.1.0
Reporter: Cheng Lian

After [PR #1686|https://github.com/apache/spark/pull/1686], {{HiveThriftServer2}} throws an exception when executing native commands. The reason is that initialization of {{HiveContext.sessionState.out}} and {{HiveContext.sessionState.err}} was made lazy, while {{HiveThriftServer2}} uses an overridden version of {{HiveContext}} that doesn't know how to initialize these two streams.

Reproduction steps:
# Start HiveThriftServer2
# Connect to it via beeline
# Execute `set;`

Exception thrown:
{code}
== HIVE FAILURE OUTPUT ==
== END HIVE FAILURE OUTPUT ==
14/08/03 21:30:55 ERROR SparkSQLOperationManager: Error executing query:
java.lang.NullPointerException
    at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:210)
    at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:173)
    at org.apache.spark.sql.hive.HiveContext.set(HiveContext.scala:144)
    at org.apache.spark.sql.execution.SetCommand.sideEffectResult$lzycompute(commands.scala:59)
    at org.apache.spark.sql.execution.SetCommand.sideEffectResult(commands.scala:50)
    ...
{code}
[jira] [Commented] (SPARK-2814) HiveThriftServer throws NPE when executing native commands
[ https://issues.apache.org/jira/browse/SPARK-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083974#comment-14083974 ]

Apache Spark commented on SPARK-2814:

User 'liancheng' has created a pull request for this issue: https://github.com/apache/spark/pull/1753

HiveThriftServer throws NPE when executing native commands
Key: SPARK-2814
URL: https://issues.apache.org/jira/browse/SPARK-2814
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.1.0
Reporter: Cheng Lian
[jira] [Updated] (SPARK-2814) HiveThriftServer throws NPE when executing native commands
[ https://issues.apache.org/jira/browse/SPARK-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Lian updated SPARK-2814:

Description:
After [PR #1686|https://github.com/apache/spark/pull/1686], {{HiveThriftServer2}} throws an exception when executing native commands. The reason is that initialization of {{HiveContext.sessionState.out}} and {{HiveContext.sessionState.err}} was made lazy, while {{HiveThriftServer2}} uses an overridden version of {{HiveContext}} that doesn't know how to initialize these two streams. When {{HiveContext.runHive}} tries to write to {{HiveContext.sessionState.out}}, an NPE is thrown.

Reproduction steps:
# Start HiveThriftServer2
# Connect to it via beeline
# Execute `set;`

Exception thrown:
{code}
== HIVE FAILURE OUTPUT ==
== END HIVE FAILURE OUTPUT ==
14/08/03 21:30:55 ERROR SparkSQLOperationManager: Error executing query:
java.lang.NullPointerException
    at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:210)
    at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:173)
    at org.apache.spark.sql.hive.HiveContext.set(HiveContext.scala:144)
    at org.apache.spark.sql.execution.SetCommand.sideEffectResult$lzycompute(commands.scala:59)
    at org.apache.spark.sql.execution.SetCommand.sideEffectResult(commands.scala:50)
    ...
{code}

HiveThriftServer throws NPE when executing native commands
Key: SPARK-2814
URL: https://issues.apache.org/jira/browse/SPARK-2814
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.1.0
Reporter: Cheng Lian
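The failure mode described in SPARK-2814 (a lazily initialized stream that an overriding subclass never sets up) can be illustrated with a small Python analogue. This is a sketch under stated assumptions, not Spark's code: `Context`, `ServerContext`, and `SessionState` are hypothetical stand-ins, and Python raises `AttributeError` on `None` where the JVM raises a `NullPointerException`.

```python
import io
from typing import Optional


class SessionState:
    """Holds the output stream that command execution writes to."""

    def __init__(self) -> None:
        self.out: Optional[io.StringIO] = None  # unset until someone initializes it


class Context:
    """Stand-in for the base context: builds its session state, stream included, lazily."""

    def __init__(self) -> None:
        self._session_state: Optional[SessionState] = None

    @property
    def session_state(self) -> SessionState:
        if self._session_state is None:
            state = SessionState()
            state.out = io.StringIO()  # lazy initialization of the stream
            self._session_state = state
        return self._session_state

    def run(self, command: str) -> str:
        # Writes to the stream; fails if the stream was never initialized.
        self.session_state.out.write(f"ran {command}\n")
        return self.session_state.out.getvalue()


class ServerContext(Context):
    """Stand-in for the overridden context: supplies its own state but never sets `out`."""

    @property
    def session_state(self) -> SessionState:
        if self._session_state is None:
            self._session_state = SessionState()  # stream is left as None
        return self._session_state


if __name__ == "__main__":
    print(Context().run("set"), end="")
    try:
        ServerContext().run("set")
    except AttributeError as exc:
        print(f"server context failed: {exc}")
```

In this sketch the fix would be for the overriding property to initialize `out` as well, which mirrors the diagnosis given in the issue description.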
[jira] [Created] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
pengyanhong created SPARK-2815:

Summary: Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
Key: SPARK-2815
URL: https://issues.apache.org/jira/browse/SPARK-2815
Project: Spark
Issue Type: Bug
Components: Build
Affects Versions: 1.1.0
Reporter: pengyanhong
Priority: Blocker

Compilation fails when building with SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0 SPARK_YARN=true SPARK_HIVE=true sbt/sbt assembly. The build ends with this error message:
[error] (yarn-stable/compile:compile) Compilation failed

The detailed errors on the console are:
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:26: object api is not a member of package org.apache.hadoop.yarn.client
[error] import org.apache.hadoop.yarn.client.api.YarnClient
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:40: not found: value YarnClient
[error] val yarnClient = YarnClient.createYarnClient
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:32: object api is not a member of package org.apache.hadoop.yarn.client
[error] import org.apache.hadoop.yarn.client.api.AMRMClient
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:33: object api is not a member of package org.apache.hadoop.yarn.client
[error] import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:36: object util is not a member of package org.apache.hadoop.yarn.webapp
[error] import org.apache.hadoop.yarn.webapp.util.WebAppUtils
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:64: value RM_AM_MAX_ATTEMPTS is not a member of object org.apache.hadoop.yarn.conf.YarnConfiguration
[error] YarnConfiguration.RM_AM_MAX_ATTEMPTS, YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS)
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:66: not found: type AMRMClient
[error] private var amClient: AMRMClient[ContainerRequest] = _
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:92: not found: value AMRMClient
[error] amClient = AMRMClient.createAMRMClient()
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:137: not found: value WebAppUtils
[error] val proxy = WebAppUtils.getProxyHostAndPort(conf)
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:40: object api is not a member of package org.apache.hadoop.yarn.client
[error] import org.apache.hadoop.yarn.client.api.AMRMClient
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:618: not found: type AMRMClient
[error] amClient: AMRMClient[ContainerRequest],
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:596: not found: type AMRMClient
[error] amClient: AMRMClient[ContainerRequest],
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:577: not found: type AMRMClient
[error] amClient: AMRMClient[ContainerRequest],
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:410: value CONTAINER_ID is not a member of object org.apache.hadoop.yarn.api.ApplicationConstants.Environment
[error] val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
[error] ^
[error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:128: value setTokens is not a member of org.apache.hadoop.yarn.api.records.ContainerLaunchContext
[error] amContainer.setTokens(ByteBuffer.wrap(dob.getData()))
[error] ^
[error]
[jira] [Commented] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083998#comment-14083998 ]

Guoqiang Li commented on SPARK-2815:

I also encountered this bug. PRed: https://github.com/apache/spark/pull/1754

Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
Key: SPARK-2815
URL: https://issues.apache.org/jira/browse/SPARK-2815
Project: Spark
Issue Type: Bug
Components: Build
Affects Versions: 1.1.0
Reporter: pengyanhong
Assignee: Guoqiang Li
Priority: Blocker
[jira] [Commented] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084000#comment-14084000 ]

Apache Spark commented on SPARK-2815:

User 'witgo' has created a pull request for this issue: https://github.com/apache/spark/pull/1754

Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
Key: SPARK-2815
URL: https://issues.apache.org/jira/browse/SPARK-2815
Project: Spark
Issue Type: Bug
Components: Build
Affects Versions: 1.1.0
Reporter: pengyanhong
Assignee: Guoqiang Li
Priority: Blocker
[jira] [Commented] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084006#comment-14084006 ]

Guoqiang Li commented on SPARK-2815:

[~pengyanhong] You can try this first {{./sbt/sbt clean assembly -Pyarn-alpha -Phive -Dhadoop.version=2.0.0-cdh4.5.0}}

Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
Key: SPARK-2815
URL: https://issues.apache.org/jira/browse/SPARK-2815
Project: Spark
Issue Type: Bug
Components: Build
Affects Versions: 1.1.0
Reporter: pengyanhong
Assignee: Guoqiang Li
Priority: Blocker
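The suggested command swaps the stable YARN module for the `yarn-alpha` profile, which is consistent with the missing `org.apache.hadoop.yarn.client.api` classes in the error log. A rough sketch of choosing the profile from the Hadoop version follows. The 2.2 cut-over used here is an assumption based on my reading of the Spark 1.x build documentation, not something stated in this thread, and the helper names are hypothetical.

```python
import re
from typing import List


def yarn_profile(hadoop_version: str) -> str:
    """Pick a YARN build profile from the Hadoop version string."""
    match = re.match(r"(\d+)\.(\d+)", hadoop_version)
    if match is None:
        raise ValueError(f"cannot parse Hadoop version: {hadoop_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: Hadoop releases before 2.2 ship the older (alpha) YARN API.
    return "yarn" if (major, minor) >= (2, 2) else "yarn-alpha"


def sbt_assembly_command(hadoop_version: str, with_hive: bool = True) -> List[str]:
    """Build the sbt invocation in the same shape as the command quoted above."""
    cmd = ["./sbt/sbt", "clean", "assembly", f"-P{yarn_profile(hadoop_version)}"]
    if with_hive:
        cmd.append("-Phive")
    cmd.append(f"-Dhadoop.version={hadoop_version}")
    return cmd


if __name__ == "__main__":
    print(" ".join(sbt_assembly_command("2.0.0-cdh4.5.0")))
```

For the version in this issue, the sketch reproduces the command Guoqiang Li suggested.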
[jira] [Comment Edited] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084006#comment-14084006 ]

Guoqiang Li edited comment on SPARK-2815 at 8/3/14 3:10 PM:

[~pengyanhong] You can try this {{./sbt/sbt clean assembly -Pyarn-alpha -Phive -Dhadoop.version=2.0.0-cdh4.5.0}}

was (Author: gq):
[~pengyanhong] You can try this first {{./sbt/sbt clean assembly -Pyarn-alpha -Phive -Dhadoop.version=2.0.0-cdh4.5.0}}

Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
Key: SPARK-2815
URL: https://issues.apache.org/jira/browse/SPARK-2815
Project: Spark
Issue Type: Bug
Components: Build
Affects Versions: 1.1.0
Reporter: pengyanhong
Assignee: Guoqiang Li
Priority: Blocker
[jira] [Commented] (SPARK-1981) Add AWS Kinesis streaming support
[ https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084025#comment-14084025 ] Nicholas Chammas commented on SPARK-1981: - Word. Thanks for the clarification! Add AWS Kinesis streaming support - Key: SPARK-1981 URL: https://issues.apache.org/jira/browse/SPARK-1981 Project: Spark Issue Type: New Feature Components: Streaming Reporter: Chris Fregly Assignee: Chris Fregly Fix For: 1.1.0 Add AWS Kinesis support to Spark Streaming. Initial discussion occurred here: https://github.com/apache/spark/pull/223 I discussed this with Parviz from AWS recently and we agreed that I would take this over. Look for a new PR that takes into account all the feedback from the earlier PR including spark-1.0-compliant implementation, AWS-license-aware build support, tests, comments, and style guide compliance.
[jira] [Commented] (SPARK-1335) Also increase perm gen / code cache for scalatest when invoked via Maven build
[ https://issues.apache.org/jira/browse/SPARK-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084026#comment-14084026 ] Guoqiang Li commented on SPARK-1335: The problem also appears in branch 1.1. The following command fails: {{mvn -Pyarn-alpha -Phive -Dhadoop.version=2.0.0-cdh4.5.0 -DskipTests package}}. I'm on Java 6 / OSX 10.9.4. Also increase perm gen / code cache for scalatest when invoked via Maven build -- Key: SPARK-1335 URL: https://issues.apache.org/jira/browse/SPARK-1335 Project: Spark Issue Type: Bug Components: Build Affects Versions: 0.9.0 Reporter: Sean Owen Assignee: Sean Owen Fix For: 1.0.0 I am observing build failures when the Maven build reaches tests in the new SQL components. (I'm on Java 7 / OSX 10.9). The failure is the usual complaint from scala, that it's out of permgen space, or that the JIT is out of code cache space. I see that various build scripts increase both of these for SBT. This change simply adds these settings to scalatest's arguments. Works for me and seems a bit more consistent. (In the PR I'm going to tack on some other little changes too -- see PR.)
[jira] [Commented] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084034#comment-14084034 ] Guoqiang Li commented on SPARK-2815: Currently {{yarn-alpha}} does not support version {{2.0.0-cdh4.5.0}}, but seems to support version {{2.0.0-cdh4.2.0}} {{2.0.0-cdh4.5.0}} get following error: {noformat} [ERROR] /Users/witgo/work/code/java/spark/yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:36: object AMResponse is not a member of package org.apache.hadoop.yarn.api.records [ERROR] import org.apache.hadoop.yarn.api.records.{AMResponse, ApplicationAttemptId} [ERROR]^ [ERROR] /Users/witgo/work/code/java/spark/yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:114: value getAMResponse is not a member of org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse [ERROR] val amResp = allocateExecutorResources(executorsToRequest).getAMResponse {noformat} Compilation failed upon the hadoop version 2.0.0-cdh4.5.0 - Key: SPARK-2815 URL: https://issues.apache.org/jira/browse/SPARK-2815 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.1.0 Reporter: pengyanhong Assignee: Guoqiang Li Priority: Blocker compile fail via SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0 SPARK_YARN=true SPARK_HIVE=true sbt/sbt assembly, finally get error message : [error] (yarn-stable/compile:compile) Compilation failed, the following is the detail error on console: [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:26: object api is not a member of package org.apache.hadoop.yarn.client [error] import org.apache.hadoop.yarn.client.api.YarnClient [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:40: not found: value YarnClient [error] val yarnClient = YarnClient.createYarnClient [error]^ [error] 
/Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:32: object api is not a member of package org.apache.hadoop.yarn.client [error] import org.apache.hadoop.yarn.client.api.AMRMClient [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:33: object api is not a member of package org.apache.hadoop.yarn.client [error] import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:36: object util is not a member of package org.apache.hadoop.yarn.webapp [error] import org.apache.hadoop.yarn.webapp.util.WebAppUtils [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:64: value RM_AM_MAX_ATTEMPTS is not a member of object org.apache.hadoop.yarn.conf.YarnConfiguration [error] YarnConfiguration.RM_AM_MAX_ATTEMPTS, YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS) [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:66: not found: type AMRMClient [error] private var amClient: AMRMClient[ContainerRequest] = _ [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:92: not found: value AMRMClient [error] amClient = AMRMClient.createAMRMClient() [error]^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:137: not found: value WebAppUtils [error] val proxy = WebAppUtils.getProxyHostAndPort(conf) [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:40: object api is not a member of package org.apache.hadoop.yarn.client [error] import 
org.apache.hadoop.yarn.client.api.AMRMClient [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:618: not found: type AMRMClient [error] amClient: AMRMClient[ContainerRequest], [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:596: not found: type AMRMClient [error]
[jira] [Resolved] (SPARK-2712) Add a small note that mvn package must happen before test
[ https://issues.apache.org/jira/browse/SPARK-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2712. Resolution: Fixed Issue resolved by pull request 1615 [https://github.com/apache/spark/pull/1615] Add a small note that mvn package must happen before test - Key: SPARK-2712 URL: https://issues.apache.org/jira/browse/SPARK-2712 Project: Spark Issue Type: Documentation Components: Documentation Affects Versions: 0.9.1, 1.0.0, 1.1.1 Environment: all Reporter: Stephen Boesch Assignee: Stephen Boesch Priority: Trivial Labels: documentation Fix For: 1.1.0 Original Estimate: 0h Remaining Estimate: 0h Add to the building-with-maven.md: Requirement: build packages before running tests Tests must be run AFTER the package target has already been executed. The following is an example of a correct (build, test) sequence: mvn -Pyarn -Phadoop-2.3 -DskipTests -Phive clean package mvn -Pyarn -Phadoop-2.3 -Phive test BTW Reynold Xin requested this tiny doc improvement.
[jira] [Updated] (SPARK-2246) Add user-data option to EC2 scripts
[ https://issues.apache.org/jira/browse/SPARK-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2246: --- Assignee: Allan Douglas R. de Oliveira Add user-data option to EC2 scripts --- Key: SPARK-2246 URL: https://issues.apache.org/jira/browse/SPARK-2246 Project: Spark Issue Type: Improvement Components: EC2 Reporter: Allan Douglas R. de Oliveira Assignee: Allan Douglas R. de Oliveira EC2 servers can use a user-data script for custom startup/initialization of machines. The EC2 scripts should provide an option to set this.
[jira] [Resolved] (SPARK-2197) Spark invoke DecisionTree by Java
[ https://issues.apache.org/jira/browse/SPARK-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2197. -- Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1740 [https://github.com/apache/spark/pull/1740] Spark invoke DecisionTree by Java - Key: SPARK-2197 URL: https://issues.apache.org/jira/browse/SPARK-2197 Project: Spark Issue Type: Bug Components: MLlib Reporter: wulin Assignee: Joseph K. Bradley Fix For: 1.1.0 Strategy strategy = new Strategy(Algo.Classification(), new Impurity() { @Override public double calculate(double arg0, double arg1, double arg2) { return Gini.calculate(arg0, arg1, arg2); } @Override public double calculate(double arg0, double arg1) { return Gini.calculate(arg0, arg1); } }, 5, 100, QuantileStrategy.Sort(), null, 256); DecisionTree decisionTree = new DecisionTree(strategy); final DecisionTreeModel decisionTreeModel = decisionTree.train(labeledPoints.rdd()); I tried to run this on Spark, but got the following error on the console: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.spark.mllib.regression.LabeledPoint; at org.apache.spark.mllib.tree.DecisionTree$.findSplitsBins(DecisionTree.scala:990) at org.apache.spark.mllib.tree.DecisionTree.train(DecisionTree.scala:56) at org.project.modules.spark.java.SparkDecisionTree.main(SparkDecisionTree.java:75) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Looking at the source code, I found that the line val numFeatures = input.take(1)(0).features.size is the problem.
[jira] [Commented] (SPARK-1997) Update breeze to version 0.8.1
[ https://issues.apache.org/jira/browse/SPARK-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084057#comment-14084057 ] Xiangrui Meng commented on SPARK-1997: -- It's fine within Spark. If we add breeze-0.8.1 with scalalogging-2.1.1, users may have trouble using Spark with their own library if it depends on scalalogging-1.0.1. This is why we removed scalalogging dependency from Spark SQL, so there is no reason to add it back, no matter which version it is. David already merged the PR that removes scalalogging from breeze. We are now waiting for him to help cut a new release of breeze, without scalalogging. Update breeze to version 0.8.1 -- Key: SPARK-1997 URL: https://issues.apache.org/jira/browse/SPARK-1997 Project: Spark Issue Type: Sub-task Components: MLlib Reporter: Guoqiang Li Assignee: Guoqiang Li Fix For: 1.1.0 {{breeze 0.7}} does not support {{scala 2.11}} .
[jira] [Resolved] (SPARK-2784) Make language configurable using SQLConf instead of hql/sql functions
[ https://issues.apache.org/jira/browse/SPARK-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2784. - Resolution: Fixed Fix Version/s: 1.1.0 Make language configurable using SQLConf instead of hql/sql functions - Key: SPARK-2784 URL: https://issues.apache.org/jira/browse/SPARK-2784 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Assignee: Michael Armbrust Priority: Blocker Fix For: 1.1.0
[jira] [Resolved] (SPARK-2752) spark sql cli should not exit when get a exception
[ https://issues.apache.org/jira/browse/SPARK-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2752. - Resolution: Fixed Target Version/s: 1.1.0 spark sql cli should not exit when get a exception -- Key: SPARK-2752 URL: https://issues.apache.org/jira/browse/SPARK-2752 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.0.0 Reporter: wangfei Fix For: 1.1.0
[jira] [Updated] (SPARK-2360) CSV import to SchemaRDDs
[ https://issues.apache.org/jira/browse/SPARK-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2360: Target Version/s: 1.2.0 (was: 1.1.0) CSV import to SchemaRDDs Key: SPARK-2360 URL: https://issues.apache.org/jira/browse/SPARK-2360 Project: Spark Issue Type: New Feature Components: SQL Reporter: Michael Armbrust Assignee: Hossein Falaki Priority: Minor I think the first step is to design the interface that we want to present to users. Mostly this is defining options when importing. Off the top of my head: - What is the separator? - Provide column names or infer them from the first row. - How to handle multiple files with possibly different schemas. - Do we have a method to let users specify the datatypes of the columns, or are they just strings? - What types of quoting / escaping do we want to support?
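The questions listed in SPARK-2360 map fairly directly onto a small set of import options. As a rough illustration only, the following plain-Python sketch uses the standard `csv` module; `CsvOptions`, `read_rows` and every field name are hypothetical and are not a proposed Spark API. It does not address multiple files with different schemas.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class CsvOptions:
    """Hypothetical option bundle covering the questions raised in the issue."""
    separator: str = ","
    quote_char: str = '"'
    header: bool = True                       # take column names from the first row
    column_names: Optional[List[str]] = None  # used only when header is False
    # Per-column converters; columns without an entry stay strings.
    column_types: Dict[str, Callable[[str], object]] = field(default_factory=dict)


def read_rows(text: str, options: CsvOptions) -> List[Dict[str, object]]:
    """Parse CSV text into a list of {column name: value} records."""
    reader = csv.reader(io.StringIO(text),
                        delimiter=options.separator,
                        quotechar=options.quote_char)
    rows = list(reader)
    if not rows:
        return []
    if options.header:
        names, data = rows[0], rows[1:]
    elif options.column_names is not None:
        names, data = options.column_names, rows
    else:
        # No names available: fall back to positional names c0, c1, ...
        names, data = [f"c{i}" for i in range(len(rows[0]))], rows
    records = []
    for row in data:
        record = {}
        for name, value in zip(names, row):
            convert = options.column_types.get(name, str)
            record[name] = convert(value)
        records.append(record)
    return records
```

With `CsvOptions(separator=";", column_types={"age": int})`, a quoted field that contains the separator is kept intact and the `age` column is converted to an integer.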
[jira] [Resolved] (SPARK-1740) Pyspark cancellation kills unrelated pyspark workers
[ https://issues.apache.org/jira/browse/SPARK-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1740. --- Resolution: Fixed Fix Version/s: 1.1.0 Pyspark cancellation kills unrelated pyspark workers Key: SPARK-1740 URL: https://issues.apache.org/jira/browse/SPARK-1740 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 1.0.0 Reporter: Aaron Davidson Assignee: Davies Liu Priority: Critical Fix For: 1.1.0 PySpark cancellation calls SparkEnv#destroyPythonWorker. Since there is one python worker per process, this would seem like a sensible thing to do. Unfortunately, this method actually destroys a python daemon, and all associated workers, which generally means that we can cause failures in unrelated Pyspark jobs. The severity of this bug is limited by the fact that the Pyspark daemon is easily recreated, so the tasks will succeed after being restarted.
[jira] [Commented] (SPARK-1981) Add AWS Kinesis streaming support
[ https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084164#comment-14084164 ] Apache Spark commented on SPARK-1981: - User 'cfregly' has created a pull request for this issue: https://github.com/apache/spark/pull/1757 Add AWS Kinesis streaming support - Key: SPARK-1981 URL: https://issues.apache.org/jira/browse/SPARK-1981 Project: Spark Issue Type: New Feature Components: Streaming Reporter: Chris Fregly Assignee: Chris Fregly Fix For: 1.1.0 Add AWS Kinesis support to Spark Streaming. Initial discussion occurred here: https://github.com/apache/spark/pull/223 I discussed this with Parviz from AWS recently and we agreed that I would take this over. Look for a new PR that takes into account all the feedback from the earlier PR including spark-1.0-compliant implementation, AWS-license-aware build support, tests, comments, and style guide compliance.
[jira] [Updated] (SPARK-2810) update scala-maven-plugin to version 3.2.0
[ https://issues.apache.org/jira/browse/SPARK-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2810: --- Assignee: Anand Avati update scala-maven-plugin to version 3.2.0 -- Key: SPARK-2810 URL: https://issues.apache.org/jira/browse/SPARK-2810 Project: Spark Issue Type: Sub-task Components: Build, Spark Core Reporter: Anand Avati Assignee: Anand Avati Fix For: 1.1.0 Needed for Scala 2.11 'compiler-interface'
[jira] [Resolved] (SPARK-2810) update scala-maven-plugin to version 3.2.0
[ https://issues.apache.org/jira/browse/SPARK-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2810. Resolution: Fixed Fix Version/s: 1.1.0 Target Version/s: 1.1.0 Fixed by: https://github.com/apache/spark/pull/1711 update scala-maven-plugin to version 3.2.0 -- Key: SPARK-2810 URL: https://issues.apache.org/jira/browse/SPARK-2810 Project: Spark Issue Type: Sub-task Components: Build, Spark Core Reporter: Anand Avati Assignee: Anand Avati Fix For: 1.1.0 Needed for Scala 2.11 'compiler-interface'
[jira] [Commented] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084195#comment-14084195 ] Apache Spark commented on SPARK-2583: - User 'JoshRosen' has created a pull request for this issue: https://github.com/apache/spark/pull/1758 ConnectionManager cannot distinguish whether error occurred or not -- Key: SPARK-2583 URL: https://issues.apache.org/jira/browse/SPARK-2583 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Kousuke Saruta Assignee: Kousuke Saruta Priority: Critical ConnectionManager#handleMessage sends an empty message to the peer whether or not an error occurred in onReceiveCallback, so the peer cannot tell the two cases apart. {code} val ackMessage = if (onReceiveCallback != null) { logDebug("Calling back") onReceiveCallback(bufferMessage, connectionManagerId) } else { logDebug("Not calling back as callback is null") None } if (ackMessage.isDefined) { if (!ackMessage.get.isInstanceOf[BufferMessage]) { logDebug("Response to " + bufferMessage + " is not a buffer message, it is of type " + ackMessage.get.getClass) } else if (!ackMessage.get.asInstanceOf[BufferMessage].hasAckId) { logDebug("Response to " + bufferMessage + " does not have ack id set") ackMessage.get.asInstanceOf[BufferMessage].ackId = bufferMessage.id } } // We have no way to tell peer whether error occurred or not sendMessage(connectionManagerId, ackMessage.getOrElse { Message.createBufferMessage(bufferMessage.id) }) } {code}
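One way to let the peer tell an error apart from a normal empty acknowledgement is to carry an explicit error flag in the ack. The sketch below is plain Python rather than ConnectionManager code; `Ack`, `has_error` and `handle_message` are hypothetical names chosen for illustration, and this is not necessarily the approach taken in pull request 1758.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ack:
    """Hypothetical acknowledgement: an id, an optional reply, and an error flag."""
    ack_id: int
    payload: Optional[bytes] = None
    has_error: bool = False


def handle_message(msg_id: int,
                   body: bytes,
                   on_receive: Optional[Callable[[bytes], Optional[bytes]]]) -> Ack:
    """Run the callback and always answer, marking the ack when the callback failed."""
    if on_receive is None:
        # No callback registered: plain empty ack, not an error.
        return Ack(ack_id=msg_id)
    try:
        reply = on_receive(body)
    except Exception:
        # The callback failed: still acknowledge, but say so explicitly.
        return Ack(ack_id=msg_id, has_error=True)
    return Ack(ack_id=msg_id, payload=reply)
```

A receiver of such an ack can then check `has_error` instead of guessing from an empty payload, which is the ambiguity the issue describes.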
[jira] [Closed] (SPARK-2744) The configuration spark.history.retainedApplications is invalid
[ https://issues.apache.org/jira/browse/SPARK-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] meiyoula closed SPARK-2744. --- Resolution: Not a Problem The configuration spark.history.retainedApplications is invalid - Key: SPARK-2744 URL: https://issues.apache.org/jira/browse/SPARK-2744 Project: Spark Issue Type: Bug Components: Spark Core Reporter: meiyoula Labels: historyserver When I set it in spark-env.sh like this: export SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS -Dspark.history.ui.port=5678 -Dspark.history.retainedApplications=1", the history server web UI retains more than one application.
[jira] [Created] (SPARK-2816) Type-safe SQL queries
Michael Armbrust created SPARK-2816: --- Summary: Type-safe SQL queries Key: SPARK-2816 URL: https://issues.apache.org/jira/browse/SPARK-2816 Project: Spark Issue Type: New Feature Components: SQL Reporter: Michael Armbrust Assignee: Michael Armbrust
[jira] [Commented] (SPARK-2816) Type-safe SQL queries
[ https://issues.apache.org/jira/browse/SPARK-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084257#comment-14084257 ] Apache Spark commented on SPARK-2816: - User 'marmbrus' has created a pull request for this issue: https://github.com/apache/spark/pull/1759 Type-safe SQL queries - Key: SPARK-2816 URL: https://issues.apache.org/jira/browse/SPARK-2816 Project: Spark Issue Type: New Feature Components: SQL Reporter: Michael Armbrust Assignee: Michael Armbrust
[jira] [Created] (SPARK-2817) add show create table support
Yi Tian created SPARK-2817: -- Summary: add show create table support Key: SPARK-2817 URL: https://issues.apache.org/jira/browse/SPARK-2817 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.0.0 Reporter: Yi Tian Priority: Minor In the Spark SQL component, the show create table syntax has been disabled. We think it is a useful function for describing a Hive table.
[jira] [Commented] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084287#comment-14084287 ] pengyanhong commented on SPARK-2815: I changed the YarnAllocationHandler.scala file as below: import org.apache.hadoop.yarn.api.records,ApplicationAttemptId val amResp = allocateExecutorResources(executorsToRequest) then compile successfully and it can work on YARN cluster, but i am not sure whether there are potential problems. Compilation failed upon the hadoop version 2.0.0-cdh4.5.0 - Key: SPARK-2815 URL: https://issues.apache.org/jira/browse/SPARK-2815 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.1.0 Reporter: pengyanhong Assignee: Guoqiang Li Priority: Blocker compile fail via SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0 SPARK_YARN=true SPARK_HIVE=true sbt/sbt assembly, finally get error message : [error] (yarn-stable/compile:compile) Compilation failed, the following is the detail error on console: [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:26: object api is not a member of package org.apache.hadoop.yarn.client [error] import org.apache.hadoop.yarn.client.api.YarnClient [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:40: not found: value YarnClient [error] val yarnClient = YarnClient.createYarnClient [error]^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:32: object api is not a member of package org.apache.hadoop.yarn.client [error] import org.apache.hadoop.yarn.client.api.AMRMClient [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:33: object api is not a member of package org.apache.hadoop.yarn.client [error] import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest [error] ^ [error] 
/Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:36: object util is not a member of package org.apache.hadoop.yarn.webapp [error] import org.apache.hadoop.yarn.webapp.util.WebAppUtils [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:64: value RM_AM_MAX_ATTEMPTS is not a member of object org.apache.hadoop.yarn.conf.YarnConfiguration [error] YarnConfiguration.RM_AM_MAX_ATTEMPTS, YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS) [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:66: not found: type AMRMClient [error] private var amClient: AMRMClient[ContainerRequest] = _ [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:92: not found: value AMRMClient [error] amClient = AMRMClient.createAMRMClient() [error]^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:137: not found: value WebAppUtils [error] val proxy = WebAppUtils.getProxyHostAndPort(conf) [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:40: object api is not a member of package org.apache.hadoop.yarn.client [error] import org.apache.hadoop.yarn.client.api.AMRMClient [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:618: not found: type AMRMClient [error] amClient: AMRMClient[ContainerRequest], [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:596: not found: type AMRMClient [error] amClient: AMRMClient[ContainerRequest], [error] ^ [error] 
/Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:577: not found: type AMRMClient [error] amClient: AMRMClient[ContainerRequest], [error] ^ [error] /Users/pengyanhong/git/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:410: value CONTAINER_ID is not a member of object
[jira] [Resolved] (SPARK-2272) Feature scaling which standardizes the range of independent variables or features of data.
[ https://issues.apache.org/jira/browse/SPARK-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2272. -- Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1207 [https://github.com/apache/spark/pull/1207] Feature scaling which standardizes the range of independent variables or features of data. -- Key: SPARK-2272 URL: https://issues.apache.org/jira/browse/SPARK-2272 Project: Spark Issue Type: New Feature Components: MLlib Reporter: DB Tsai Assignee: DB Tsai Fix For: 1.1.0 Feature scaling is a method used to standardize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data preprocessing step. In this work, a trait called `VectorTransformer` is defined for generic transformation of a vector. It contains two methods: `apply`, which applies the transformation to a vector, and `unapply`, which applies the inverse transformation. There are three concrete implementations of `VectorTransformer`, and they all can be easily extended with PMML transformation support. 1) `VectorStandardizer` - Standardizes a vector given the mean and variance. Since the standardization will densify the output, the output is always in dense vector format. 2) `VectorRescaler` - Rescales a vector into a target range specified by a tuple of two double values or two vectors as the new target minimum and maximum. Since the rescaling will subtract the minimum of each column first, the output will always be a dense vector regardless of the input vector type. 3) `VectorDivider` - Transforms a vector by dividing it by a constant or by another vector on an element-by-element basis. This transformation will preserve the type of the input vector without densifying the result.
Utility helper methods are implemented that take an RDD[Vector] as input and return the transformed RDD[Vector] together with the transformer, for dividing, rescaling, normalization, and standardization.
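The `apply` / `unapply` pairing described in SPARK-2272 can be illustrated with a minimal sketch in plain Python, assuming dense lists of floats instead of MLlib vectors. The classes `Standardizer` and `Rescaler` below are hypothetical stand-ins that mirror the description; they are not the code from pull request 1207.

```python
import math
from typing import List, Sequence


class Standardizer:
    """Hypothetical stand-in for `VectorStandardizer`: (x - mean) / std per column."""

    def __init__(self, mean: Sequence[float], variance: Sequence[float]):
        self.mean = list(mean)
        # Assumption: constant columns (zero variance) are left unscaled.
        self.std = [math.sqrt(v) if v > 0 else 1.0 for v in variance]

    def apply(self, vector: Sequence[float]) -> List[float]:
        return [(x - m) / s for x, m, s in zip(vector, self.mean, self.std)]

    def unapply(self, vector: Sequence[float]) -> List[float]:
        return [y * s + m for y, m, s in zip(vector, self.mean, self.std)]


class Rescaler:
    """Hypothetical stand-in for `VectorRescaler`: map [col_min, col_max] to [lo, hi]."""

    def __init__(self, col_min: Sequence[float], col_max: Sequence[float],
                 lo: float = 0.0, hi: float = 1.0):
        if hi <= lo:
            raise ValueError("target range must satisfy lo < hi")
        self.col_min = list(col_min)
        # Assumption: columns with max == min get a span of 1 to avoid division by zero.
        self.span = [(mx - mn) if mx > mn else 1.0 for mn, mx in zip(col_min, col_max)]
        self.lo, self.hi = lo, hi

    def apply(self, vector: Sequence[float]) -> List[float]:
        scale = self.hi - self.lo
        return [self.lo + scale * (x - mn) / sp
                for x, mn, sp in zip(vector, self.col_min, self.span)]

    def unapply(self, vector: Sequence[float]) -> List[float]:
        scale = self.hi - self.lo
        return [mn + sp * (y - self.lo) / scale
                for y, mn, sp in zip(vector, self.col_min, self.span)]
```

Both transforms subtract a per-column offset, so a zero entry generally becomes non-zero; that is why the description says their output is always dense, whereas a pure division keeps zeros as zeros.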
[jira] [Commented] (SPARK-2817) add show create table support
[ https://issues.apache.org/jira/browse/SPARK-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084308#comment-14084308 ] Apache Spark commented on SPARK-2817: - User 'tianyi' has created a pull request for this issue: https://github.com/apache/spark/pull/1760 add show create table support Key: SPARK-2817 URL: https://issues.apache.org/jira/browse/SPARK-2817 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.0.0 Reporter: Yi Tian Priority: Minor In the Spark SQL component, the SHOW CREATE TABLE syntax has been disabled. We think it is a useful function for describing a Hive table.
[jira] [Created] (SPARK-2818) Improve joinning RDDs that transformed from the same cached RDD
Lu Lu created SPARK-2818: Summary: Improve joinning RDDs that transformed from the same cached RDD Key: SPARK-2818 URL: https://issues.apache.org/jira/browse/SPARK-2818 Project: Spark Issue Type: Improvement Reporter: Lu Lu
[jira] [Created] (SPARK-2819) Difficult to turn on intercept with linear models
Sandy Ryza created SPARK-2819: - Summary: Difficult to turn on intercept with linear models Key: SPARK-2819 URL: https://issues.apache.org/jira/browse/SPARK-2819 Project: Spark Issue Type: Improvement Components: MLlib Reporter: Sandy Ryza
If I want to train a logistic regression model with default parameters and include an intercept, I can run:
  val alg = new LogisticRegressionWithSGD()
  alg.setIntercept(true)
  alg.run(data)
but if I want to set a parameter like numIterations, I need to use
  LogisticRegressionWithSGD.train(data, 50)
and have no opportunity to turn on the intercept.