[jira] [Updated] (SPARK-24093) Make some fields of KafkaStreamWriter/InternalRowMicroBatchWriter visible to outside of the classes
[ https://issues.apache.org/jira/browse/SPARK-24093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-24093: - Description: To make third parties able to get the information of the streaming writer, for example, the "writer" and the "topic" which streaming data are written into, this jira is created to make the relevant fields of KafkaStreamWriter and InternalRowMicroBatchWriter visible to outside of the classes. (was: We are working on the Spark Atlas Connector ([https://github.com/hortonworks-spark/spark-atlas-connector]) and adding support for Spark Streaming. As SAC needs the information of the "writer" and the "topic" which streaming data are written into, this jira is created to make the relevant fields of KafkaStreamWriter and InternalRowMicroBatchWriter visible to outside of the classes.) > Make some fields of KafkaStreamWriter/InternalRowMicroBatchWriter visible to > outside of the classes > --- > > Key: SPARK-24093 > URL: https://issues.apache.org/jira/browse/SPARK-24093 > Project: Spark > Issue Type: Wish > Components: Structured Streaming > Affects Versions: 2.3.0 > Reporter: Weiqing Yang > Priority: Minor > > To make third parties able to get the information of the streaming writer, for > example, the "writer" and the "topic" which streaming data are > written into, this jira is created to make the relevant fields of > KafkaStreamWriter and InternalRowMicroBatchWriter visible to outside of the > classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-24093) Make some fields of KafkaStreamWriter/InternalRowMicroBatchWriter visible to outside of the classes
Weiqing Yang created SPARK-24093: Summary: Make some fields of KafkaStreamWriter/InternalRowMicroBatchWriter visible to outside of the classes Key: SPARK-24093 URL: https://issues.apache.org/jira/browse/SPARK-24093 Project: Spark Issue Type: Wish Components: Structured Streaming Affects Versions: 2.3.0 Reporter: Weiqing Yang We are working on the Spark Atlas Connector ([https://github.com/hortonworks-spark/spark-atlas-connector]) and adding support for Spark Streaming. As SAC needs the information of the "writer" and the "topic" which streaming data are written into, this jira is created to make the relevant fields of KafkaStreamWriter and InternalRowMicroBatchWriter visible to outside of the classes.
[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS
[ https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122234#comment-16122234 ] Weiqing Yang commented on SPARK-21697: -- Thanks for filing this issue! > NPE & ExceptionInInitializerError trying to load UTF from HDFS > -- > > Key: SPARK-21697 > URL: https://issues.apache.org/jira/browse/SPARK-21697 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 2.1.1 > Environment: Spark Client mode, Hadoop 2.6.0 > Reporter: Steve Loughran > Priority: Minor > > Reported on [the > PR|https://github.com/apache/spark/pull/17342#issuecomment-321438157] for > SPARK-12868: trying to load a UDF from HDFS is triggering an > {{ExceptionInInitializerError}}, caused by an NPE which should only happen if > the commons-logging {{LOG}} field is null. > Hypothesis: the commons-logging scan for {{commons-logging.properties}} is > happening in the classpath with the HDFS JAR; this is triggering a download of > the JAR, which needs to force in commons-logging, and, as that's not inited yet, > NPEs
[jira] [Commented] (SPARK-6628) ClassCastException occurs when executing sql statement "insert into" on hbase table
[ https://issues.apache.org/jira/browse/SPARK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011556#comment-16011556 ] Weiqing Yang commented on SPARK-6628: - Hi [~srowen], I just submitted a PR for this. Could you please help review it? Thanks. > ClassCastException occurs when executing sql statement "insert into" on hbase > table > --- > > Key: SPARK-6628 > URL: https://issues.apache.org/jira/browse/SPARK-6628 > Project: Spark > Issue Type: Bug > Components: SQL > Reporter: meiyoula > > Error: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 1 in stage 3.0 failed 4 times, most recent failure: Lost task 1.3 in > stage 3.0 (TID 12, vm-17): java.lang.ClassCastException: > org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat cannot be cast to > org.apache.hadoop.hive.ql.io.HiveOutputFormat > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.outputFormat$lzycompute(hiveWriterContainers.scala:72) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.outputFormat(hiveWriterContainers.scala:71) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.getOutputName(hiveWriterContainers.scala:91) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.initWriters(hiveWriterContainers.scala:115) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.executorSideSetup(hiveWriterContainers.scala:84) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:112) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) > at org.apache.spark.scheduler.Task.run(Task.scala:56) > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[jira] [Commented] (SPARK-6628) ClassCastException occurs when executing sql statement "insert into" on hbase table
[ https://issues.apache.org/jira/browse/SPARK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011490#comment-16011490 ] Weiqing Yang commented on SPARK-6628: - We met with this issue too. The major issue is: {code} org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat {code} cannot be cast to {code} org.apache.hadoop.hive.ql.io.HiveOutputFormat {code} The reason is: {code} public interface HiveOutputFormat<K, V> extends OutputFormat<K, V> {...} public class HiveHBaseTableOutputFormat extends TableOutputFormat<ImmutableBytesWritable> implements OutputFormat<ImmutableBytesWritable, Object> {...} {code} From the two snippets above, we can see that HiveHBaseTableOutputFormat and HiveOutputFormat each extend/implement OutputFormat separately, so they are sibling types and cannot be cast to each other. Spark initializes the output format in SparkHiveWriterContainer in Spark 1.6, 2.0 and 2.1 (or in HiveFileFormat in Spark 2.2/master): {code} @transient private lazy val outputFormat = jobConf.value.getOutputFormat.asInstanceOf[HiveOutputFormat[AnyRef, Writable]] {code} Notice: this file output format is {color:red}HiveOutputFormat{color}. However, when users write data into HBase, the output format is HiveHBaseTableOutputFormat, which is not an instance of HiveOutputFormat. I am going to submit a PR for this.
> ClassCastException occurs when executing sql statement "insert into" on hbase > table > --- > > Key: SPARK-6628 > URL: https://issues.apache.org/jira/browse/SPARK-6628 > Project: Spark > Issue Type: Bug > Components: SQL > Reporter: meiyoula > > Error: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 1 in stage 3.0 failed 4 times, most recent failure: Lost task 1.3 in > stage 3.0 (TID 12, vm-17): java.lang.ClassCastException: > org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat cannot be cast to > org.apache.hadoop.hive.ql.io.HiveOutputFormat > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.outputFormat$lzycompute(hiveWriterContainers.scala:72) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.outputFormat(hiveWriterContainers.scala:71) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.getOutputName(hiveWriterContainers.scala:91) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.initWriters(hiveWriterContainers.scala:115) > at > org.apache.spark.sql.hive.SparkHiveWriterContainer.executorSideSetup(hiveWriterContainers.scala:84) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:112) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) > at org.apache.spark.scheduler.Task.run(Task.scala:56) > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
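The root-cause analysis in the comment above (two classes that each extend/implement OutputFormat independently cannot be cast to one another) can be reproduced with a minimal standalone sketch. The type names below are hypothetical stand-ins, not the actual Hive classes:

```java
// Toy model of the Hive hierarchy described above: two sibling types that
// share only the common OutputFormat supertype. Names are illustrative.
interface OutputFormat {}
interface HiveOutputFormat extends OutputFormat {}
class HiveHBaseTableOutputFormatToy implements OutputFormat {}

public class CastDemo {
    public static void main(String[] args) {
        OutputFormat fmt = new HiveHBaseTableOutputFormatToy();
        try {
            // Compiles (downcast from the shared interface), but fails at
            // runtime: the object's class is not a HiveOutputFormat.
            HiveOutputFormat hive = (HiveOutputFormat) fmt;
            System.out.println("cast succeeded");
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: sibling implementations cannot be cast to each other");
        }
    }
}
```

This mirrors why `asInstanceOf[HiveOutputFormat[AnyRef, Writable]]` in the Spark snippet throws when the configured output format is HiveHBaseTableOutputFormat.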
[jira] [Commented] (SPARK-15857) Add Caller Context in Spark
[ https://issues.apache.org/jira/browse/SPARK-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864491#comment-15864491 ] Weiqing Yang commented on SPARK-15857: -- Thanks. [~zsxwing] > Add Caller Context in Spark > --- > > Key: SPARK-15857 > URL: https://issues.apache.org/jira/browse/SPARK-15857 > Project: Spark > Issue Type: New Feature > Reporter: Weiqing Yang > > Hadoop has implemented a log-tracing feature, the caller context (Jira: > HDFS-9184 and YARN-4349). The motivation is to better diagnose and understand > how specific applications are impacting parts of the Hadoop system and what potential > problems they may be creating (e.g. overloading the NN). As HDFS mentioned in > HDFS-9184, for a given HDFS operation, it's very helpful to track which upper-level > job issues it. The upper-level callers may be specific Oozie tasks, MR > jobs, Hive queries or Spark jobs. > Hadoop ecosystem projects like MapReduce, Tez (TEZ-2851), Hive (HIVE-12249, > HIVE-12254) and Pig (PIG-4714) have implemented their caller contexts. Those > systems invoke the HDFS client API and the YARN client API to set up the caller context, > and also expose an API for passing a caller context in. > Lots of Spark applications run on YARN/HDFS. Spark can also implement > its caller context by invoking the HDFS/YARN API, and also expose an API so that its > upstream applications can set up their own caller contexts. In the end, the Spark > caller context written into the YARN log / HDFS log can be associated with the task id, > stage id, job id and app id. That is also very helpful for Spark users to > identify tasks, especially if Spark supports a multi-tenant environment in the > future.
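The description above says the caller context written into YARN/HDFS logs associates an operation with the task, stage, job and app ids. As a rough sketch of the kind of context string involved (the field tags and format here are hypothetical, not Spark's actual implementation), one might assemble it like this:

```java
public class CallerContextSketch {
    // Builds an audit-log context string from Spark ids; any missing id is
    // simply skipped. Tags (JId/SId/TId) are illustrative, not Spark's.
    static String build(String appId, Integer jobId, Integer stageId, Long taskId) {
        StringBuilder sb = new StringBuilder("SPARK");
        if (appId != null) sb.append("_").append(appId);
        if (jobId != null) sb.append("_JId_").append(jobId);
        if (stageId != null) sb.append("_SId_").append(stageId);
        if (taskId != null) sb.append("_TId_").append(taskId);
        return sb.toString();
    }

    public static void main(String[] args) {
        // e.g. SPARK_application_1523000000000_0001_JId_2_SId_5_TId_42
        System.out.println(build("application_1523000000000_0001", 2, 5, 42L));
    }
}
```

A string of this shape would then be handed to the Hadoop client-side caller-context API so it lands in the NN audit log alongside the HDFS operation.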
[jira] [Resolved] (SPARK-15857) Add Caller Context in Spark
[ https://issues.apache.org/jira/browse/SPARK-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang resolved SPARK-15857. -- Resolution: Fixed > Add Caller Context in Spark > --- > > Key: SPARK-15857 > URL: https://issues.apache.org/jira/browse/SPARK-15857 > Project: Spark > Issue Type: New Feature > Reporter: Weiqing Yang > > Hadoop has implemented a log-tracing feature, the caller context (Jira: > HDFS-9184 and YARN-4349). The motivation is to better diagnose and understand > how specific applications are impacting parts of the Hadoop system and what potential > problems they may be creating (e.g. overloading the NN). As HDFS mentioned in > HDFS-9184, for a given HDFS operation, it's very helpful to track which upper-level > job issues it. The upper-level callers may be specific Oozie tasks, MR > jobs, Hive queries or Spark jobs. > Hadoop ecosystem projects like MapReduce, Tez (TEZ-2851), Hive (HIVE-12249, > HIVE-12254) and Pig (PIG-4714) have implemented their caller contexts. Those > systems invoke the HDFS client API and the YARN client API to set up the caller context, > and also expose an API for passing a caller context in. > Lots of Spark applications run on YARN/HDFS. Spark can also implement > its caller context by invoking the HDFS/YARN API, and also expose an API so that its > upstream applications can set up their own caller contexts. In the end, the Spark > caller context written into the YARN log / HDFS log can be associated with the task id, > stage id, job id and app id. That is also very helpful for Spark users to > identify tasks, especially if Spark supports a multi-tenant environment in the > future.
[jira] [Updated] (SPARK-18746) Add implicit encoders for BigDecimal, timestamp and date
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18746: - Description: Run the code below in spark-shell, there will be an error: {code} scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) <console>:24: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. spark.createDataset(Seq(new java.math.BigDecimal(10))) ^ scala> {code} In this PR, implicit encoders for java.math.BigDecimal, timestamp and date will be added. was: Run the code below in spark-shell, there will be an error: {code} scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) <console>:24: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. spark.createDataset(Seq(new java.math.BigDecimal(10))) ^ scala> {code} To fix the error above, an implicit encoder for java.math.BigDecimal will be added in the PR. Also, > Add implicit encoders for BigDecimal, timestamp and date > > > Key: SPARK-18746 > URL: https://issues.apache.org/jira/browse/SPARK-18746 > Project: Spark > Issue Type: Bug > Components: SQL > Reporter: Weiqing Yang > > Run the code below in spark-shell, there will be an error: > {code} > scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) > <console>:24: error: Unable to find encoder for type stored in a Dataset. > Primitive types (Int, String, etc) and Product types (case classes) are > supported by importing spark.implicits._ Support for serializing other types > will be added in future releases. > spark.createDataset(Seq(new java.math.BigDecimal(10))) > ^ > scala> > {code} > In this PR, implicit encoders for java.math.BigDecimal, timestamp and date will be > added.
[jira] [Updated] (SPARK-18746) Add implicit encoders for BigDecimal, timestamp and date
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18746: - Description: Run the code below in spark-shell, there will be an error: {code} scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) <console>:24: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. spark.createDataset(Seq(new java.math.BigDecimal(10))) ^ scala> {code} In this PR, implicit encoders for BigDecimal, timestamp and date will be added. was: Run the code below in spark-shell, there will be an error: {code} scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) <console>:24: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. spark.createDataset(Seq(new java.math.BigDecimal(10))) ^ scala> {code} In this PR, implicit encoders for java.math.BigDecimal, timestamp and date will be added. > Add implicit encoders for BigDecimal, timestamp and date > > > Key: SPARK-18746 > URL: https://issues.apache.org/jira/browse/SPARK-18746 > Project: Spark > Issue Type: Bug > Components: SQL > Reporter: Weiqing Yang > > Run the code below in spark-shell, there will be an error: > {code} > scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) > <console>:24: error: Unable to find encoder for type stored in a Dataset. > Primitive types (Int, String, etc) and Product types (case classes) are > supported by importing spark.implicits._ Support for serializing other types > will be added in future releases. > spark.createDataset(Seq(new java.math.BigDecimal(10))) > ^ > scala> > {code} > In this PR, implicit encoders for BigDecimal, timestamp and date will be > added.
[jira] [Updated] (SPARK-18746) Add implicit encoders for BigDecimal, timestamp and date
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18746: - Description: Run the code below in spark-shell, there will be an error: {code} scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) <console>:24: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. spark.createDataset(Seq(new java.math.BigDecimal(10))) ^ scala> {code} To fix the error above, an implicit encoder for java.math.BigDecimal will be added in the PR, along with encoders for timestamp and date. was: Run the code below in spark-shell, there will be an error: {code} scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) <console>:24: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. spark.createDataset(Seq(new java.math.BigDecimal(10))) ^ scala> {code} To fix the error above, {{newBigDecimalEncoder}} will be added in the PR. > Add implicit encoders for BigDecimal, timestamp and date > > > Key: SPARK-18746 > URL: https://issues.apache.org/jira/browse/SPARK-18746 > Project: Spark > Issue Type: Bug > Components: SQL > Reporter: Weiqing Yang > > Run the code below in spark-shell, there will be an error: > {code} > scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) > <console>:24: error: Unable to find encoder for type stored in a Dataset. > Primitive types (Int, String, etc) and Product types (case classes) are > supported by importing spark.implicits._ Support for serializing other types > will be added in future releases. > spark.createDataset(Seq(new java.math.BigDecimal(10))) > ^ > scala> > {code} > To fix the error above, an implicit encoder for java.math.BigDecimal will be > added in the PR, along with encoders for timestamp and date.
[jira] [Updated] (SPARK-18746) Add implicit encoders for BigDecimal, timestamp and date
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18746: - Summary: Add implicit encoders for BigDecimal, timestamp and date (was: Add newBigDecimalEncoder) > Add implicit encoders for BigDecimal, timestamp and date > > > Key: SPARK-18746 > URL: https://issues.apache.org/jira/browse/SPARK-18746 > Project: Spark > Issue Type: Bug > Components: SQL > Reporter: Weiqing Yang > > Run the code below in spark-shell, there will be an error: > {code} > scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) > <console>:24: error: Unable to find encoder for type stored in a Dataset. > Primitive types (Int, String, etc) and Product types (case classes) are > supported by importing spark.implicits._ Support for serializing other types > will be added in future releases. > spark.createDataset(Seq(new java.math.BigDecimal(10))) > ^ > scala> > {code} > To fix the error above, {{newBigDecimalEncoder}} will be added in the PR.
[jira] [Updated] (SPARK-18697) Upgrade sbt plugins
[ https://issues.apache.org/jira/browse/SPARK-18697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18697: - Description: For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt plugins will be upgraded: {code} sbteclipse-plugin: 4.0.0 -> 5.0.1 sbt-mima-plugin: 0.1.11 -> 0.1.12 org.ow2.asm/asm: 5.0.3 -> 5.1 org.ow2.asm/asm-commons: 5.0.3 -> 5.1 {code} was: For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt plugins will be upgraded: {code} sbteclipse-plugin: 4.0.0 -> 5.0.1 sbt-mima-plugin: 0.1.11 -> 0.1.12 org.ow2.asm/asm: 5.0.3 -> 5.1 org.ow2.asm/asm-commons: 5.0.3 -> 5.1 {code} All other plugins are up-to-date. > Upgrade sbt plugins > --- > > Key: SPARK-18697 > URL: https://issues.apache.org/jira/browse/SPARK-18697 > Project: Spark > Issue Type: Improvement > Components: Build > Reporter: Weiqing Yang > Assignee: Weiqing Yang > Priority: Trivial > > For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt > plugins will be upgraded: > {code} > sbteclipse-plugin: 4.0.0 -> 5.0.1 > sbt-mima-plugin: 0.1.11 -> 0.1.12 > org.ow2.asm/asm: 5.0.3 -> 5.1 > org.ow2.asm/asm-commons: 5.0.3 -> 5.1 > {code}
[jira] [Updated] (SPARK-18697) Upgrade sbt plugins
[ https://issues.apache.org/jira/browse/SPARK-18697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18697: - Description: For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt plugins will be upgraded: {code} sbteclipse-plugin: 4.0.0 -> 5.0.1 sbt-mima-plugin: 0.1.11 -> 0.1.12 org.ow2.asm/asm: 5.0.3 -> 5.1 org.ow2.asm/asm-commons: 5.0.3 -> 5.1 {code} All other plugins are up-to-date. was: For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt plugins will be upgraded: {code} sbt-assembly: 0.11.2 -> 0.14.3 sbteclipse-plugin: 4.0.0 -> 5.0.1 sbt-mima-plugin: 0.1.11 -> 0.1.12 org.ow2.asm/asm: 5.0.3 -> 5.1 org.ow2.asm/asm-commons: 5.0.3 -> 5.1 {code} All other plugins are up-to-date. > Upgrade sbt plugins > --- > > Key: SPARK-18697 > URL: https://issues.apache.org/jira/browse/SPARK-18697 > Project: Spark > Issue Type: Improvement > Components: Build > Reporter: Weiqing Yang > Assignee: Weiqing Yang > Priority: Trivial > > For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt > plugins will be upgraded: > {code} > sbteclipse-plugin: 4.0.0 -> 5.0.1 > sbt-mima-plugin: 0.1.11 -> 0.1.12 > org.ow2.asm/asm: 5.0.3 -> 5.1 > org.ow2.asm/asm-commons: 5.0.3 -> 5.1 > {code} > All other plugins are up-to-date.
[jira] [Created] (SPARK-18746) Add newBigDecimalEncoder
Weiqing Yang created SPARK-18746: Summary: Add newBigDecimalEncoder Key: SPARK-18746 URL: https://issues.apache.org/jira/browse/SPARK-18746 Project: Spark Issue Type: Bug Components: SQL Reporter: Weiqing Yang Run the code below in spark-shell, there will be an error: {code} scala> spark.createDataset(Seq(new java.math.BigDecimal(10))) <console>:24: error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. spark.createDataset(Seq(new java.math.BigDecimal(10))) ^ scala> {code} To fix the error above, {{newBigDecimalEncoder}} will be added in the PR.
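The error above occurs because Spark has no encoder registered for java.math.BigDecimal. An encoder is essentially a serializer/deserializer pair between a JVM type and Spark's internal representation. The toy interface below illustrates that contract in self-contained form; it is not Spark's actual Encoder API:

```java
import java.math.BigDecimal;

// Toy stand-in for the encoder contract: convert a JVM object to a storable
// representation and back. Illustrative only, not Spark's real API.
interface ToyEncoder<T> {
    String serialize(T value);
    T deserialize(String stored);
}

class BigDecimalToyEncoder implements ToyEncoder<BigDecimal> {
    public String serialize(BigDecimal value) { return value.toPlainString(); }
    public BigDecimal deserialize(String stored) { return new BigDecimal(stored); }
}

public class EncoderSketch {
    public static void main(String[] args) {
        ToyEncoder<BigDecimal> enc = new BigDecimalToyEncoder();
        BigDecimal original = new BigDecimal(10);
        // Round trip through the encoder preserves the value.
        BigDecimal roundTripped = enc.deserialize(enc.serialize(original));
        System.out.println(original.equals(roundTripped)); // prints true
    }
}
```

The fix proposed in the issue registers such an encoder implicitly, so that `spark.createDataset(Seq(new java.math.BigDecimal(10)))` resolves without the user supplying one.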
[jira] [Commented] (SPARK-18696) Upgrade sbt plugins
[ https://issues.apache.org/jira/browse/SPARK-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15718550#comment-15718550 ] Weiqing Yang commented on SPARK-18696: -- Oh, yes, thanks for closing this. > Upgrade sbt plugins > --- > > Key: SPARK-18696 > URL: https://issues.apache.org/jira/browse/SPARK-18696 > Project: Spark > Issue Type: Improvement > Components: Build > Reporter: Weiqing Yang > Priority: Minor > > For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt > plugins will be upgraded: > {code} > sbt-assembly: 0.11.2 -> 0.14.3 > sbteclipse-plugin: 4.0.0 -> 5.0.1 > sbt-mima-plugin: 0.1.11 -> 0.1.12 > org.ow2.asm/asm: 5.0.3 -> 5.1 > org.ow2.asm/asm-commons: 5.0.3 -> 5.1 > {code} > All other plugins are up-to-date.
[jira] [Updated] (SPARK-18697) Upgrade sbt plugins
[ https://issues.apache.org/jira/browse/SPARK-18697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18697: - Target Version/s: (was: 2.2.0) > Upgrade sbt plugins > --- > > Key: SPARK-18697 > URL: https://issues.apache.org/jira/browse/SPARK-18697 > Project: Spark > Issue Type: Improvement > Components: Build > Reporter: Weiqing Yang > Priority: Trivial > > For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt > plugins will be upgraded: > {code} > sbt-assembly: 0.11.2 -> 0.14.3 > sbteclipse-plugin: 4.0.0 -> 5.0.1 > sbt-mima-plugin: 0.1.11 -> 0.1.12 > org.ow2.asm/asm: 5.0.3 -> 5.1 > org.ow2.asm/asm-commons: 5.0.3 -> 5.1 > {code} > All other plugins are up-to-date.
[jira] [Commented] (SPARK-18697) Upgrade sbt plugins
[ https://issues.apache.org/jira/browse/SPARK-18697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717193#comment-15717193 ] Weiqing Yang commented on SPARK-18697: -- Will submit a PR after SPARK-18638 ([PR #16069|https://github.com/apache/spark/pull/16069#issuecomment-264080711]) is fixed. > Upgrade sbt plugins > --- > > Key: SPARK-18697 > URL: https://issues.apache.org/jira/browse/SPARK-18697 > Project: Spark > Issue Type: Improvement > Components: Build > Reporter: Weiqing Yang > Priority: Minor > > For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt > plugins will be upgraded: > {code} > sbt-assembly: 0.11.2 -> 0.14.3 > sbteclipse-plugin: 4.0.0 -> 5.0.1 > sbt-mima-plugin: 0.1.11 -> 0.1.12 > org.ow2.asm/asm: 5.0.3 -> 5.1 > org.ow2.asm/asm-commons: 5.0.3 -> 5.1 > {code} > All other plugins are up-to-date.
[jira] [Created] (SPARK-18697) Upgrade sbt plugins
Weiqing Yang created SPARK-18697: Summary: Upgrade sbt plugins Key: SPARK-18697 URL: https://issues.apache.org/jira/browse/SPARK-18697 Project: Spark Issue Type: Improvement Components: Build Reporter: Weiqing Yang Priority: Minor For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt plugins will be upgraded: {code} sbt-assembly: 0.11.2 -> 0.14.3 sbteclipse-plugin: 4.0.0 -> 5.0.1 sbt-mima-plugin: 0.1.11 -> 0.1.12 org.ow2.asm/asm: 5.0.3 -> 5.1 org.ow2.asm/asm-commons: 5.0.3 -> 5.1 {code} All other plugins are up-to-date.
[jira] [Created] (SPARK-18696) Upgrade sbt plugins
Weiqing Yang created SPARK-18696: Summary: Upgrade sbt plugins Key: SPARK-18696 URL: https://issues.apache.org/jira/browse/SPARK-18696 Project: Spark Issue Type: Improvement Components: Build Reporter: Weiqing Yang Priority: Minor For 2.2.x, it's better to make sbt plugins up-to-date. The following sbt plugins will be upgraded: {code} sbt-assembly: 0.11.2 -> 0.14.3 sbteclipse-plugin: 4.0.0 -> 5.0.1 sbt-mima-plugin: 0.1.11 -> 0.1.12 org.ow2.asm/asm: 5.0.3 -> 5.1 org.ow2.asm/asm-commons: 5.0.3 -> 5.1 {code} All other plugins are up-to-date.
[jira] [Updated] (SPARK-18638) Upgrade sbt, zinc and maven plugins
[ https://issues.apache.org/jira/browse/SPARK-18638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18638: - Description: v2.1.0-rc1 has been out. For 2.2.x, it is better to keep sbt up-to-date, and upgrade it from 0.13.11 to 0.13.13. The release notes since the last version we used are: https://github.com/sbt/sbt/releases/tag/v0.13.12 and https://github.com/sbt/sbt/releases/tag/v0.13.13. Both releases include some regression fixes. This jira will also update Zinc and Maven plugins. {code} sbt: 0.13.11 -> 0.13.13, zinc: 0.3.9 -> 0.3.11, maven-assembly-plugin: 2.6 -> 3.0.0 maven-compiler-plugin: 3.5.1 -> 3.6. maven-jar-plugin: 2.6 -> 3.0.2 maven-javadoc-plugin: 2.10.3 -> 2.10.4 maven-source-plugin: 2.4 -> 3.0.1 org.codehaus.mojo:build-helper-maven-plugin: 1.10 -> 1.12 org.codehaus.mojo:exec-maven-plugin: 1.4.0 -> 1.5.0 {code} was: v2.1.0-rc1 has been out. For 2.2.x, it is better to keep sbt up-to-date, and upgrade it from 0.13.11 to 0.13.13. The release notes since the last version we used are: https://github.com/sbt/sbt/releases/tag/v0.13.12 and https://github.com/sbt/sbt/releases/tag/v0.13.13. Both releases include some regression fixes. > Upgrade sbt, zinc and maven plugins > --- > > Key: SPARK-18638 > URL: https://issues.apache.org/jira/browse/SPARK-18638 > Project: Spark > Issue Type: Improvement > Components: Build > Reporter: Weiqing Yang > Priority: Minor > > v2.1.0-rc1 has been out. For 2.2.x, it is better to keep sbt up-to-date, and > upgrade it from 0.13.11 to 0.13.13. The release notes since the last version > we used are: https://github.com/sbt/sbt/releases/tag/v0.13.12 and > https://github.com/sbt/sbt/releases/tag/v0.13.13. Both releases include some > regression fixes. This jira will also update Zinc and Maven plugins. > {code} > sbt: 0.13.11 -> 0.13.13, > zinc: 0.3.9 -> 0.3.11, > maven-assembly-plugin: 2.6 -> 3.0.0 > maven-compiler-plugin: 3.5.1 -> 3.6. > maven-jar-plugin: 2.6 -> 3.0.2 > maven-javadoc-plugin: 2.10.3 -> 2.10.4 > maven-source-plugin: 2.4 -> 3.0.1 > org.codehaus.mojo:build-helper-maven-plugin: 1.10 -> 1.12 > org.codehaus.mojo:exec-maven-plugin: 1.4.0 -> 1.5.0 > {code}
[jira] [Updated] (SPARK-18638) Upgrade sbt, zinc and maven plugins
[ https://issues.apache.org/jira/browse/SPARK-18638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18638: - Summary: Upgrade sbt, zinc and maven plugins (was: Upgrade sbt to 0.13.13) > Upgrade sbt, zinc and maven plugins > --- > > Key: SPARK-18638 > URL: https://issues.apache.org/jira/browse/SPARK-18638 > Project: Spark > Issue Type: Improvement > Components: Build >Reporter: Weiqing Yang >Priority: Minor > > v2.1.0-rc1 has been out. For 2.2.x, it is better to keep sbt up-to-date, and > upgrade it from 0.13.11 to 0.13.13. The release notes since the last version > we used are: https://github.com/sbt/sbt/releases/tag/v0.13.12 and > https://github.com/sbt/sbt/releases/tag/v0.13.13. Both releases include some > regression fixes.
[jira] [Created] (SPARK-18638) Upgrade sbt to 0.13.13
Weiqing Yang created SPARK-18638: Summary: Upgrade sbt to 0.13.13 Key: SPARK-18638 URL: https://issues.apache.org/jira/browse/SPARK-18638 Project: Spark Issue Type: Improvement Components: Build Reporter: Weiqing Yang Priority: Minor v2.1.0-rc1 has been out. For 2.2.x, it is better to keep sbt up-to-date, and upgrade it from 0.13.11 to 0.13.13. The release notes since the last version we used are: https://github.com/sbt/sbt/releases/tag/v0.13.12 and https://github.com/sbt/sbt/releases/tag/v0.13.13. Both releases include some regression fixes.
[jira] [Created] (SPARK-18629) Fix numPartition of JDBCSuite Testcase
Weiqing Yang created SPARK-18629: Summary: Fix numPartition of JDBCSuite Testcase Key: SPARK-18629 URL: https://issues.apache.org/jira/browse/SPARK-18629 Project: Spark Issue Type: Bug Components: SQL Reporter: Weiqing Yang Priority: Minor When running any one of the test cases in JDBCSuite, you will get the following warning. {code} 10:34:26.389 WARN org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation: The number of partitions is reduced because the specified number of partitions is less than the difference between upper bound and lower bound. Updated number of partitions: 3; Input number of partitions: 4; Lower bound: 1; Upper bound: 4.{code} This jira is to fix it.
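The clamping rule described by the warning above can be sketched as a small standalone function (an illustrative sketch only; `effectiveNumPartitions` is a hypothetical name, not Spark's actual JDBCRelation code):

```scala
// Sketch of the rule behind the JDBCRelation warning: range partitioning
// over [lowerBound, upperBound] cannot usefully produce more partitions
// than (upperBound - lowerBound), so the requested count is clamped down.
def effectiveNumPartitions(requested: Int, lowerBound: Long, upperBound: Long): Int = {
  val maxUseful = (upperBound - lowerBound).toInt
  math.min(requested, maxUseful)
}

// Matches the warning's numbers: 4 requested, bounds [1, 4] -> reduced to 3.
assert(effectiveNumPartitions(4, 1L, 4L) == 3)
```

Choosing a requested partition count that does not exceed the difference between the bounds (or widening the bounds) is the kind of test-case fix this jira proposes.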
[jira] [Closed] (SPARK-18521) Add `NoRedundantStringInterpolator` Scala rule
[ https://issues.apache.org/jira/browse/SPARK-18521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang closed SPARK-18521. Resolution: Won't Fix > Add `NoRedundantStringInterpolator` Scala rule > -- > > Key: SPARK-18521 > URL: https://issues.apache.org/jira/browse/SPARK-18521 > Project: Spark > Issue Type: Improvement >Reporter: Weiqing Yang > > Currently the s string interpolator is used in many cases in which there is > no embedded variable reference in the processed string literals. > For example: > core/src/main/scala/org/apache/spark/deploy/Client.scala > {code} > logError(s"Error processing messages, exiting.") > {code} > examples/src/main/scala/org/apache/spark/examples/graphx/SynthBenchmark.scala > {code} > println(s"Creating graph...") > {code} > examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala > {code} > println(s"Corpus summary:") > {code} > sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLViewSuite.scala > {code} > test(s"correctly handle CREATE OR REPLACE TEMPORARY VIEW") { > {code} > We can add a new scala style rule 'NoRedundantStringInterpolator' to prevent > unnecessary string interpolators.
[jira] [Created] (SPARK-18521) Add `NoRedundantStringInterpolator` Scala rule
Weiqing Yang created SPARK-18521: Summary: Add `NoRedundantStringInterpolator` Scala rule Key: SPARK-18521 URL: https://issues.apache.org/jira/browse/SPARK-18521 Project: Spark Issue Type: Improvement Reporter: Weiqing Yang Currently the s string interpolator is used in many cases in which there is no embedded variable reference in the processed string literals. For example: core/src/main/scala/org/apache/spark/deploy/Client.scala {code} logError(s"Error processing messages, exiting.") {code} examples/src/main/scala/org/apache/spark/examples/graphx/SynthBenchmark.scala {code} println(s"Creating graph...") {code} examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala {code} println(s"Corpus summary:") {code} sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLViewSuite.scala {code} test(s"correctly handle CREATE OR REPLACE TEMPORARY VIEW") { {code} We can add a new scala style rule 'NoRedundantStringInterpolator' to prevent unnecessary string interpolators.
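The core of the proposed check can be sketched as follows (an illustrative predicate only, not an actual scalastyle rule implementation; a real rule would inspect the parsed source tree rather than match raw text):

```scala
// Flag string literals that use the s interpolator but contain no
// $-reference: a plain "..." literal would suffice for them.
def isRedundantInterpolator(literal: String): Boolean =
  literal.startsWith("s\"") && !literal.contains("$")

// Examples from the description above: these interpolators are redundant.
assert(isRedundantInterpolator("s\"Creating graph...\""))
assert(isRedundantInterpolator("s\"Corpus summary:\""))
// A literal that actually interpolates a variable is fine.
assert(!isRedundantInterpolator("s\"value = $x\""))
```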
[jira] [Updated] (SPARK-18417) Define 'spark.yarn.am.port' in yarn config object
[ https://issues.apache.org/jira/browse/SPARK-18417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-18417: - Description: Usually Yarn configurations are defined in yarn/config.scala, and are used everywhere. So we should make 'spark.yarn.am.port' defined in yarn config.scala as well, making code easier to maintain. (was: Usually Yarn configurations are defined in yarn/config.scala, and then use them everywhere. So we should define 'spark.yarn.am.port' in yarn config.scala too, that will make code easier to maintain.) > Define 'spark.yarn.am.port' in yarn config object > - > > Key: SPARK-18417 > URL: https://issues.apache.org/jira/browse/SPARK-18417 > Project: Spark > Issue Type: Improvement > Components: YARN >Reporter: Weiqing Yang >Priority: Minor > > Usually Yarn configurations are defined in yarn/config.scala, and are used > everywhere. So we should make 'spark.yarn.am.port' defined in yarn > config.scala as well, making code easier to maintain.
[jira] [Created] (SPARK-18417) Define 'spark.yarn.am.port' in yarn config object
Weiqing Yang created SPARK-18417: Summary: Define 'spark.yarn.am.port' in yarn config object Key: SPARK-18417 URL: https://issues.apache.org/jira/browse/SPARK-18417 Project: Spark Issue Type: Improvement Components: YARN Reporter: Weiqing Yang Priority: Minor Usually Yarn configurations are defined in yarn/config.scala, and then use them everywhere. So we should define 'spark.yarn.am.port' in yarn config.scala too, that will make code easier to maintain.
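For illustration, the shape of an entry in yarn/config.scala can be sketched with a simplified, self-contained stand-in for Spark's internal ConfigBuilder API (the real API lives in org.apache.spark.internal.config and has more features; only the key 'spark.yarn.am.port' comes from this jira, and the default of 0, meaning an ephemeral port, is an assumption):

```scala
// Minimal stand-in for Spark's ConfigBuilder/ConfigEntry pattern.
case class ConfigEntry[T](key: String, default: T)

class IntConfBuilder(key: String) {
  def createWithDefault(default: Int): ConfigEntry[Int] = ConfigEntry(key, default)
}

class ConfigBuilder(key: String) {
  def intConf: IntConfBuilder = new IntConfBuilder(key)
}

// The entry this jira proposes to centralize in yarn/config.scala.
val AM_PORT = new ConfigBuilder("spark.yarn.am.port").intConf.createWithDefault(0)

assert(AM_PORT.key == "spark.yarn.am.port")
```

Callers would then reference AM_PORT instead of repeating the raw configuration string, which is what makes the code easier to maintain.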
[jira] [Commented] (SPARK-17714) ClassCircularityError is thrown when using org.apache.spark.util.Utils.classForName
[ https://issues.apache.org/jira/browse/SPARK-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572580#comment-15572580 ] Weiqing Yang commented on SPARK-17714: -- Not yet, need to investigate more. Could we pull in people more familiar with the Repl classloader stuff? Thanks. > ClassCircularityError is thrown when using > org.apache.spark.util.Utils.classForName > > > Key: SPARK-17714 > URL: https://issues.apache.org/jira/browse/SPARK-17714 > Project: Spark > Issue Type: Bug >Reporter: Weiqing Yang > > This jira is a follow-up to > [SPARK-15857|https://issues.apache.org/jira/browse/SPARK-15857]. > Task invokes CallerContext.setCurrentContext() to set its callerContext to > HDFS. In setCurrentContext(), it tries to look up the class > {{org.apache.hadoop.ipc.CallerContext}} by using > {{org.apache.spark.util.Utils.classForName}}. This causes > ClassCircularityError to be thrown when running ReplSuite in master Maven > builds (the same tests pass in the SBT build). A hotfix > [SPARK-17710|https://issues.apache.org/jira/browse/SPARK-17710] has been made > by using Class.forName instead, but it needs further investigation. 
> Error: > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.3/2000/testReport/junit/org.apache.spark.repl/ReplSuite/simple_foreach_with_accumulator/ > {code} > scala> accum: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 0, > name: None, value: 0) > scala> org.apache.spark.SparkException: Job aborted due to stage failure: > Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in > stage 0.0 (TID 0, localhost): java.lang.ClassCircularityError: > io/netty/util/internal/_matchers_/org/apache/spark/network/protocol/MessageMatcher > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) > at > io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) > at > io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) > at > io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) > at > io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) > at > org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) > at org.apache.spark.network.TransportContext.(TransportContext.java:78) > at > org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) > at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) > at > org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) > at > org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) > at > org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) > at > org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) > at > 
org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) > at > io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) > at > io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) > at > io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) > at > io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) > at > org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) > at org.apache.spark.network.TransportContext.(TransportContext.java:78) > at > org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) > at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) > at > org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) > at > org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) > at > org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) > at > org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) > at >
[jira] [Commented] (SPARK-16757) Set up caller context to HDFS and Yarn
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531253#comment-15531253 ] Weiqing Yang commented on SPARK-16757: -- [SPARK-17714|https://issues.apache.org/jira/browse/SPARK-17714] has been created for further investigation. > Set up caller context to HDFS and Yarn > -- > > Key: SPARK-16757 > URL: https://issues.apache.org/jira/browse/SPARK-16757 > Project: Spark > Issue Type: Sub-task >Reporter: Weiqing Yang >Assignee: Weiqing Yang > Fix For: 2.1.0 > > > In this jira, Spark will invoke hadoop caller context api to set up its > caller context to HDFS.
[jira] [Updated] (SPARK-17714) ClassCircularityError is thrown when using org.apache.spark.util.Utils.classForName
[ https://issues.apache.org/jira/browse/SPARK-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-17714: - Description: This jira is a follow-up to [SPARK-15857|https://issues.apache.org/jira/browse/SPARK-15857]. Task invokes CallerContext.setCurrentContext() to set its callerContext to HDFS. In setCurrentContext(), it tries to look up the class {{org.apache.hadoop.ipc.CallerContext}} by using {{org.apache.spark.util.Utils.classForName}}. This causes ClassCircularityError to be thrown when running ReplSuite in master Maven builds (the same tests pass in the SBT build). A hotfix [SPARK-17710|https://issues.apache.org/jira/browse/SPARK-17710] has been made by using Class.forName instead, but it needs further investigation. Error: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.3/2000/testReport/junit/org.apache.spark.repl/ReplSuite/simple_foreach_with_accumulator/ {code} scala> accum: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 0, name: None, value: 0) scala> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassCircularityError: io/netty/util/internal/_matchers_/org/apache/spark/network/protocol/MessageMatcher at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) at io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) at 
org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) at org.apache.spark.network.TransportContext.(TransportContext.java:78) at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) at io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) at org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) at org.apache.spark.network.TransportContext.(TransportContext.java:78) at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) at 
org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.util.Utils$.classForName(Utils.scala:225) at org.apache.spark.util.CallerContext.setCurrentContext(Utils.scala:2492) at org.apache.spark.scheduler.Task.run(Task.scala:96) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at
[jira] [Updated] (SPARK-17714) ClassCircularityError is thrown when using org.apache.spark.util.Utils.classForName
[ https://issues.apache.org/jira/browse/SPARK-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-17714: - Description: This jira is related to [SPARK-15857|https://issues.apache.org/jira/browse/SPARK-15857]. Task invokes CallerContext.setCurrentContext() to set its callerContext to HDFS. In setCurrentContext(), it tries to look up the class {{org.apache.hadoop.ipc.CallerContext}} by using {{org.apache.spark.util.Utils.classForName}}. This causes ClassCircularityError to be thrown when running ReplSuite in master Maven builds (the same tests pass in the SBT build). A hotfix [SPARK-17710|https://issues.apache.org/jira/browse/SPARK-17710] has been made by using Class.forName instead, but it needs further investigation. Error: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.3/2000/testReport/junit/org.apache.spark.repl/ReplSuite/simple_foreach_with_accumulator/ {code} scala> accum: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 0, name: None, value: 0) scala> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassCircularityError: io/netty/util/internal/_matchers_/org/apache/spark/network/protocol/MessageMatcher at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) at io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) at 
org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) at org.apache.spark.network.TransportContext.(TransportContext.java:78) at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) at io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) at org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) at org.apache.spark.network.TransportContext.(TransportContext.java:78) at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) at 
org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.util.Utils$.classForName(Utils.scala:225) at org.apache.spark.util.CallerContext.setCurrentContext(Utils.scala:2492) at org.apache.spark.scheduler.Task.run(Task.scala:96) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at
[jira] [Created] (SPARK-17714) ClassCircularityError is thrown when using org.apache.spark.util.Utils.classForName
Weiqing Yang created SPARK-17714: Summary: ClassCircularityError is thrown when using org.apache.spark.util.Utils.classForName Key: SPARK-17714 URL: https://issues.apache.org/jira/browse/SPARK-17714 Project: Spark Issue Type: Bug Reporter: Weiqing Yang This jira is a follow-up to [SPARK-15857|https://issues.apache.org/jira/browse/SPARK-15857]. Task invokes CallerContext.setCurrentContext() to set its callerContext to HDFS. In setCurrentContext(), it tries to look up the class {{org.apache.hadoop.ipc.CallerContext}} by using {{org.apache.spark.util.Utils.classForName}}. This causes ClassCircularityError to be thrown when running ReplSuite in master Maven builds (the same tests pass in the SBT build). A hotfix [SPARK-17710|https://issues.apache.org/jira/browse/SPARK-17710] has been made by using Class.forName instead, but it needs further investigation. Error: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.3/2000/testReport/junit/org.apache.spark.repl/ReplSuite/simple_foreach_with_accumulator/ {code} scala> accum: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 0, name: None, value: 0) scala> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassCircularityError: io/netty/util/internal/_matchers_/org/apache/spark/network/protocol/MessageMatcher at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) at 
io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) at org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) at org.apache.spark.network.TransportContext.(TransportContext.java:78) at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62) at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54) at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42) at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78) at io.netty.handler.codec.MessageToMessageEncoder.(MessageToMessageEncoder.java:59) at org.apache.spark.network.protocol.MessageEncoder.(MessageEncoder.java:34) at org.apache.spark.network.TransportContext.(TransportContext.java:78) at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354) at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324) at 
org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57) at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:162) at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.util.Utils$.classForName(Utils.scala:225) at org.apache.spark.util.CallerContext.setCurrentContext(Utils.scala:2492) at org.apache.spark.scheduler.Task.run(Task.scala:96) at
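The shape of the SPARK-17710 hotfix mentioned in the description can be sketched like this (an illustrative helper; only the Hadoop class name comes from the description, and the helper name is hypothetical):

```scala
// Probe for the Hadoop caller-context class with plain Class.forName, as the
// hotfix does, instead of Utils.classForName (whose lookup goes through the
// REPL's ExecutorClassLoader and triggered the ClassCircularityError above).
def callerContextAvailable(): Boolean =
  try {
    Class.forName("org.apache.hadoop.ipc.CallerContext")
    true
  } catch {
    case _: ClassNotFoundException => false
    case _: LinkageError => false // e.g. ClassCircularityError
  }

// Without Hadoop on the classpath this reports false, and the caller-context
// feature is simply skipped rather than crashing the task.
println(callerContextAvailable())
```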
[jira] [Commented] (SPARK-17710) ReplSuite fails with ClassCircularityError in master Maven builds
[ https://issues.apache.org/jira/browse/SPARK-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531062#comment-15531062 ] Weiqing Yang commented on SPARK-17710: -- A PR https://github.com/apache/spark/pull/15286 has been created to resolve this. > ReplSuite fails with ClassCircularityError in master Maven builds > - > > Key: SPARK-17710 > URL: https://issues.apache.org/jira/browse/SPARK-17710 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.1.0 >Reporter: Josh Rosen >Priority: Critical > > The master Maven build is currently broken because ReplSuite consistently > fails with ClassCircularityErrors. See > https://spark-tests.appspot.com/jobs/spark-master-test-maven-hadoop-2.3 for a > timeline of the failure. > Here's the first build where this failed: > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.3/2000/ > This appears to correspond to > https://github.com/apache/spark/commit/6a68c5d7b4eb07e4ed6b702dd1536cd08d9bba7d > The same tests pass in the SBT build.
[jira] [Updated] (SPARK-16757) Set up caller context to HDFS and Yarn
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-16757: - Summary: Set up caller context to HDFS and Yarn (was: Set up caller context to HDFS) > Set up caller context to HDFS and Yarn > -- > > Key: SPARK-16757 > URL: https://issues.apache.org/jira/browse/SPARK-16757 > Project: Spark > Issue Type: Sub-task >Reporter: Weiqing Yang > > In this jira, Spark will invoke hadoop caller context api to set up its > caller context to HDFS.
[jira] [Resolved] (SPARK-16758) Set up caller context to YARN
[ https://issues.apache.org/jira/browse/SPARK-16758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang resolved SPARK-16758. -- Resolution: Duplicate The code for this jira has been put into the PR of SPARK-16757. > Set up caller context to YARN > - > > Key: SPARK-16758 > URL: https://issues.apache.org/jira/browse/SPARK-16758 > Project: Spark > Issue Type: Sub-task >Reporter: Weiqing Yang > > In this jira, Spark will invoke hadoop caller context api to set up its > caller context to YARN.
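A reflection-based sketch of "invoke hadoop caller context api" might look like the following (the Builder and setCurrent names mirror Hadoop's org.apache.hadoop.ipc.CallerContext API; the wrapper itself is an assumption, not Spark's actual code, and it degrades gracefully when Hadoop is not on the classpath):

```scala
// Try to build a Hadoop CallerContext for the given string and install it for
// the current thread; return false when the API is unavailable (older Hadoop
// releases, or no Hadoop on the classpath).
def setCallerContext(context: String): Boolean =
  try {
    val builderClass = Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
    val builder = builderClass.getConstructor(classOf[String]).newInstance(context)
    val callerContext = builderClass.getMethod("build").invoke(builder)
    val contextClass = Class.forName("org.apache.hadoop.ipc.CallerContext")
    contextClass.getMethod("setCurrent", contextClass).invoke(null, callerContext)
    true
  } catch {
    case _: Exception => false
  }

// Without Hadoop on the classpath this is a no-op that returns false; the
// context string itself ("SPARK_..." here) is purely illustrative.
println(setCallerContext("SPARK_myAppId"))
```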
[jira] [Comment Edited] (SPARK-17220) Upgrade Py4J to 0.10.3
[ https://issues.apache.org/jira/browse/SPARK-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435470#comment-15435470 ] Weiqing Yang edited comment on SPARK-17220 at 8/24/16 6:56 PM: --- Oh. I see. Thanks for resolving this. was (Author: weiqingyang): Oh. I see. Thanks for resolved this. > Upgrade Py4J to 0.10.3 > -- > > Key: SPARK-17220 > URL: https://issues.apache.org/jira/browse/SPARK-17220 > Project: Spark > Issue Type: Improvement >Reporter: Weiqing Yang >Priority: Minor > > Py4J 0.10.3 has landed. It includes some important bug fixes. For example: > Both sides: fixed memory leak issue with ClientServer and potential deadlock > issue by creating a memory leak test suite. (Py4J 0.10.2) > Both sides: added more memory leak tests and fixed a potential memory leak > related to listeners. (Py4J 0.10.3) > So it's time to upgrade py4j from 0.10.1 to 0.10.3. The changelog is > available at https://www.py4j.org/changelog.html
[jira] [Commented] (SPARK-17220) Upgrade Py4J to 0.10.3
[ https://issues.apache.org/jira/browse/SPARK-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435470#comment-15435470 ] Weiqing Yang commented on SPARK-17220: -- Oh. I see. Thanks for resolving this. > Upgrade Py4J to 0.10.3 > -- > > Key: SPARK-17220 > URL: https://issues.apache.org/jira/browse/SPARK-17220 > Project: Spark > Issue Type: Improvement >Reporter: Weiqing Yang >Priority: Minor > > Py4J 0.10.3 has landed. It includes some important bug fixes. For example: > Both sides: fixed memory leak issue with ClientServer and potential deadlock > issue by creating a memory leak test suite. (Py4J 0.10.2) > Both sides: added more memory leak tests and fixed a potential memory leak > related to listeners. (Py4J 0.10.3) > So it's time to upgrade py4j from 0.10.1 to 0.10.3. The changelog is > available at https://www.py4j.org/changelog.html
[jira] [Updated] (SPARK-17220) Upgrade Py4J to 0.10.3
[ https://issues.apache.org/jira/browse/SPARK-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-17220: - Description: Py4J 0.10.3 has landed. It includes some important bug fixes. For example: Both sides: fixed memory leak issue with ClientServer and potential deadlock issue by creating a memory leak test suite. (Py4J 0.10.2) Both sides: added more memory leak tests and fixed a potential memory leak related to listeners. (Py4J 0.10.3) So it's time to upgrade py4j from 0.10.1 to 0.10.3. The changelog is available at https://www.py4j.org/changelog.html was: Py4J 0.10.3 has landed. It includes some important bug fixes. For example: Both sides: fixed memory leak issue with ClientServer and potential deadlock issue by creating a memory leak test suite. Both sides: added more memory leak tests and fixed a potential memory leak related to listeners. So it's time to upgrade py4j to 0.10.3. The changelog is available at https://www.py4j.org/changelog.html > Upgrade Py4J to 0.10.3 > -- > > Key: SPARK-17220 > URL: https://issues.apache.org/jira/browse/SPARK-17220 > Project: Spark > Issue Type: Improvement >Reporter: Weiqing Yang >Priority: Minor > > Py4J 0.10.3 has landed. It includes some important bug fixes. For example: > Both sides: fixed memory leak issue with ClientServer and potential deadlock > issue by creating a memory leak test suite. (Py4J 0.10.2) > Both sides: added more memory leak tests and fixed a potential memory leak > related to listeners. (Py4J 0.10.3) > So it's time to upgrade py4j from 0.10.1 to 0.10.3. The changelog is > available at https://www.py4j.org/changelog.html
[jira] [Created] (SPARK-17220) Upgrade Py4J to 0.10.3
Weiqing Yang created SPARK-17220: Summary: Upgrade Py4J to 0.10.3 Key: SPARK-17220 URL: https://issues.apache.org/jira/browse/SPARK-17220 Project: Spark Issue Type: Improvement Reporter: Weiqing Yang Priority: Minor Py4J 0.10.3 has landed. It includes some important bug fixes. For example: Both sides: fixed memory leak issue with ClientServer and potential deadlock issue by creating a memory leak test suite. Both sides: added more memory leak tests and fixed a potential memory leak related to listeners. So it's time to upgrade py4j to 0.10.3. The changelog is available at https://www.py4j.org/changelog.html
[jira] [Commented] (SPARK-16757) Set up caller context to HDFS
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429237#comment-15429237 ] Weiqing Yang commented on SPARK-16757: -- Thanks, [~srowen]. When Spark applications run on HDFS, if Spark reads data from HDFS or writes data into HDFS, a corresponding operation record with the Spark caller context will be written into hdfs-audit.log. The Spark caller context consists of JobID_stageID_stageAttemptId_taskID_attemptNumber and the application's name. That can help users better diagnose and understand how specific applications impact parts of the Hadoop system and what potential problems they may be creating (e.g. overloading the NN). As mentioned in HDFS-9184, for a given HDFS operation, it's very helpful to track which upper-level job issued it. > Set up caller context to HDFS > - > > Key: SPARK-16757 > URL: https://issues.apache.org/jira/browse/SPARK-16757 > Project: Spark > Issue Type: Sub-task >Reporter: Weiqing Yang > > In this jira, Spark will invoke hadoop caller context api to set up its > caller context to HDFS.
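As a rough illustration of the comment above, the sketch below assembles a Spark-style caller context string from the job/stage/task identifiers mentioned. The class name, field tags, and separators here are illustrative assumptions, not the merged Spark format; with Hadoop 2.8+ on the classpath, such a string could then be handed to org.apache.hadoop.ipc.CallerContext so it appears in hdfs-audit.log.

```java
// Illustrative sketch only: the field tags and separators are assumptions,
// not the format actually merged into Spark.
class SparkCallerContext {
    // Builds a context string from the identifiers the comment describes,
    // e.g. "SPARK_myApp_JID_1_SID_2_SAID_0_TID_3_TAN_0".
    static String build(String appName, int jobId, int stageId,
                        int stageAttemptId, long taskId, int attemptNumber) {
        return "SPARK_" + appName
                + "_JID_" + jobId
                + "_SID_" + stageId
                + "_SAID_" + stageAttemptId
                + "_TID_" + taskId
                + "_TAN_" + attemptNumber;
    }

    public static void main(String[] args) {
        String ctx = build("myApplicationTest", 1, 2, 0, 3L, 0);
        System.out.println(ctx);
        // With Hadoop 2.8+ available, the string could be registered via:
        // org.apache.hadoop.ipc.CallerContext.setCurrent(
        //     new org.apache.hadoop.ipc.CallerContext.Builder(ctx).build());
    }
}
```

Once set, HDFS appends the current caller context to the corresponding audit-log record, which is what makes the upper-level job traceable.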
[jira] [Commented] (SPARK-16757) Set up caller context to HDFS
[ https://issues.apache.org/jira/browse/SPARK-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423730#comment-15423730 ] Weiqing Yang commented on SPARK-16757: -- Hi, [~srowen] Could you help review this PR please? > Set up caller context to HDFS > - > > Key: SPARK-16757 > URL: https://issues.apache.org/jira/browse/SPARK-16757 > Project: Spark > Issue Type: Sub-task >Reporter: Weiqing Yang > > In this jira, Spark will invoke hadoop caller context api to set up its > caller context to HDFS.
[jira] [Closed] (SPARK-16760) Pass 'jobId' to Task
[ https://issues.apache.org/jira/browse/SPARK-16760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang closed SPARK-16760. Resolution: Duplicate The code for this jira has been put into the PR of SPARK-16757. > Pass 'jobId' to Task > > > Key: SPARK-16760 > URL: https://issues.apache.org/jira/browse/SPARK-16760 > Project: Spark > Issue Type: Sub-task >Reporter: Weiqing Yang > > In the end, the spark caller context written into HDFS log will associate > with task id, stage id, job id, app id, etc, but now Task does not know any > job information, so job id will be passed to Task in the patch of this jira. > That is good for Spark users to identify tasks especially if Spark supports > multi-tenant environment in the future.
[jira] [Commented] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420116#comment-15420116 ] Weiqing Yang commented on SPARK-16966: -- [~srowen] Thanks for the new PR and review. > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang >Assignee: Sean Owen > Fix For: 2.0.1, 2.1.0 > > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Commented] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414778#comment-15414778 ] Weiqing Yang commented on SPARK-16966: -- Thanks for the feedback. I will update my PR to remove those three lines of code. > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Created] (SPARK-16986) "Started" time, "Completed" time and "Last Updated" time in history server UI are not user local time
Weiqing Yang created SPARK-16986: Summary: "Started" time, "Completed" time and "Last Updated" time in history server UI are not user local time Key: SPARK-16986 URL: https://issues.apache.org/jira/browse/SPARK-16986 Project: Spark Issue Type: Bug Components: Web UI Reporter: Weiqing Yang Priority: Minor Currently, the "Started" time, "Completed" time and "Last Updated" time in the history server UI are shown in GMT. They should be shown in the user's local time.
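A minimal sketch of the GMT-to-local conversion this issue asks for; the class and method names are hypothetical, and the actual Spark fix may render the times differently (e.g. client-side in the browser):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: convert a GMT/UTC instant (as recorded in the event
// logs) into the viewing user's time zone before display.
class LocalTimeDisplay {
    static String toLocal(String gmtIso, ZoneId zone) {
        return Instant.parse(gmtIso)
                .atZone(zone)
                .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
    }

    public static void main(String[] args) {
        // Render a "Started" time for a user in US Pacific time instead of GMT.
        System.out.println(toLocal("2016-08-09T16:34:00Z",
                ZoneId.of("America/Los_Angeles")));
    }
}
```

Doing the conversion per viewer (rather than per server) is the point of the issue: two users of the same history server should each see their own local time.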
[jira] [Commented] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413921#comment-15413921 ] Weiqing Yang commented on SPARK-16966: -- Yes, it will be the name of the class executing. > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Comment Edited] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413852#comment-15413852 ] Weiqing Yang edited comment on SPARK-16966 at 8/9/16 5:04 PM: -- Thanks for the quick feedback. If "--name" is not configured and appName() is not called, "spark.app.name" will be "mainClass" --- // Set name from main class if not given name = Option(name).orElse(Option(mainClass)).orNull -- If "mainClass" is always there, I think removing those three lines of code will be safe, but for pyspark and sparkr, the mainclass might be none, is it safe to remove those three lines of code? What do you think? [~srowen][~jerryshao] was (Author: weiqingyang): Thanks for the quick feedback. If "--name" is not configured and appName() does not be called, "spark.app.name" will be "mainClass" --- // Set name from main class if not given name = Option(name).orElse(Option(mainClass)).orNull -- If "mainClass" is always there, I think removing those three lines of code will be safe, but for pyspark and sparkr, the mainclass might be none, is it safe to remove those three lines of code? What do you think? [~srowen][~jerryshao] > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. 
> The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Commented] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413852#comment-15413852 ] Weiqing Yang commented on SPARK-16966: -- Thanks for the quick feedback. If "--name" is not configured and appName() is not called, "spark.app.name" will be "mainClass" --- // Set name from main class if not given name = Option(name).orElse(Option(mainClass)).orNull -- If "mainClass" is always there, I think removing those three lines of code will be safe, but for pyspark and sparkr, the mainclass might be none, is it safe to remove those three lines of code? What do you think? [~srowen][~jerryshao] > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Comment Edited] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413810#comment-15413810 ] Weiqing Yang edited comment on SPARK-16966 at 8/9/16 4:39 PM: -- In the tests, I modified org.apache.spark.examples.SparkKMeans (code is from spark master branch), commented its appName() call. val spark = SparkSession .builder // .appName("SparkKMeans") .getOrCreate() was (Author: weiqingyang): In the tests, I modified org.apache.spark.examples.SparkKMeans to comment its appName() call. val spark = SparkSession .builder // .appName("SparkKMeans") .getOrCreate() > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Comment Edited] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413810#comment-15413810 ] Weiqing Yang edited comment on SPARK-16966 at 8/9/16 4:34 PM: -- In the tests, I modified org.apache.spark.examples.SparkKMeans to comment its appName() call. val spark = SparkSession .builder // .appName("SparkKMeans") .getOrCreate() was (Author: weiqingyang): In the tests, I modified org.apache.spark.examples.SparkKMeans example to comment its appName() call. val spark = SparkSession .builder // .appName("SparkKMeans") .getOrCreate() > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Commented] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413810#comment-15413810 ] Weiqing Yang commented on SPARK-16966: -- In the tests, I modified org.apache.spark.examples.SparkKMeans example to comment its appName() call. val spark = SparkSession .builder // .appName("SparkKMeans") .getOrCreate() > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Comment Edited] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413810#comment-15413810 ] Weiqing Yang edited comment on SPARK-16966 at 8/9/16 4:35 PM: -- In the tests, I modified org.apache.spark.examples.SparkKMeans to comment its appName() call. val spark = SparkSession .builder // .appName("SparkKMeans") .getOrCreate() was (Author: weiqingyang): In the tests, I modified org.apache.spark.examples.SparkKMeans to comment its appName() call. val spark = SparkSession .builder // .appName("SparkKMeans") .getOrCreate() > App Name is a randomUUID even when "spark.app.name" exists > -- > > Key: SPARK-16966 > URL: https://issues.apache.org/jira/browse/SPARK-16966 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Weiqing Yang > > When submitting an application with "--name": > ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 > --num-executors 1 --master yarn --deploy-mode client --class > org.apache.spark.examples.SparkKMeans > examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar > hdfs://localhost:9000/lr_big.txt 2 5 > In the history server UI: > App ID: application_1470694797714_0016 > App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b > The App Name should not be a randomUUID > "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was > myApplicationTest. > The application "org.apache.spark.examples.SparkKMeans" above did not invoke > ".appName()".
[jira] [Created] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists
Weiqing Yang created SPARK-16966: Summary: App Name is a randomUUID even when "spark.app.name" exists Key: SPARK-16966 URL: https://issues.apache.org/jira/browse/SPARK-16966 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Weiqing Yang When submitting an application with "--name": ./bin/spark-submit --name myApplicationTest --verbose --executor-cores 3 --num-executors 1 --master yarn --deploy-mode client --class org.apache.spark.examples.SparkKMeans examples/target/original-spark-examples_2.11-2.1.0-SNAPSHOT.jar hdfs://localhost:9000/lr_big.txt 2 5 In the history server UI: App ID: application_1470694797714_0016 App Name: 70c06dc5-1b99-4b4a-a826-ea27497e977b The App Name should not be a randomUUID "70c06dc5-1b99-4b4a-a826-ea27497e977b" since the "spark.app.name" was myApplicationTest. The application "org.apache.spark.examples.SparkKMeans" above did not invoke ".appName()".
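The name fallback quoted in the discussion above (`name = Option(name).orElse(Option(mainClass)).orNull`) can be sketched in Java as follows; `resolveAppName` and the final random-UUID branch are illustrative assumptions modeling the pre-fix behavior this issue reports, not actual Spark source:

```java
import java.util.Optional;
import java.util.UUID;

// Illustrative model of the app-name resolution discussed in this issue:
// prefer the --name flag, then the main class, and only as a last resort
// fall back to a random UUID (the behavior the bug report shows).
class AppNameResolution {
    static String resolveAppName(String nameFlag, String mainClass) {
        return Optional.ofNullable(nameFlag)
                .orElseGet(() -> Optional.ofNullable(mainClass)
                        .orElseGet(() -> UUID.randomUUID().toString()));
    }

    public static void main(String[] args) {
        // With --name set, the UUID branch should never be reached.
        System.out.println(resolveAppName("myApplicationTest",
                "org.apache.spark.examples.SparkKMeans"));
    }
}
```

The bug, in these terms, was that the UUID branch was taken even though `nameFlag` ("spark.app.name") was present.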
[jira] [Updated] (SPARK-16945) Fix Java Lint errors
[ https://issues.apache.org/jira/browse/SPARK-16945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang updated SPARK-16945: - Component/s: Build > Fix Java Lint errors > > > Key: SPARK-16945 > URL: https://issues.apache.org/jira/browse/SPARK-16945 > Project: Spark > Issue Type: Task > Components: Build >Reporter: Weiqing Yang >Priority: Minor > > There are following errors when running dev/lint-java: > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/VariableLengthRowBasedKeyValueBatch.java:[42,10] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/VariableLengthRowBasedKeyValueBatch.java:[97,10] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/VariableLengthRowBasedKeyValueBatch.java:[113,10] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/RowBasedKeyValueBatch.java:[126,11] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[36,11] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[46,10] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[74,10] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[93,13] > (modifier) RedundantModifier: Redundant 'final' modifier. 
> [ERROR] > src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[106,10] > (modifier) RedundantModifier: Redundant 'final' modifier. > [ERROR] > src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java:[224] > (sizes) LineLength: Line is longer than 100 characters (found 104).
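For context, a hypothetical example of the RedundantModifier pattern flagged above: when a class is itself declared `final`, a `final` modifier on its methods is redundant because the methods cannot be overridden anyway. The class below is illustrative only, not the actual Spark source:

```java
// Hypothetical illustration of the Checkstyle RedundantModifier warning:
// in a final class, per-method 'final' adds nothing and gets flagged.
final class KeyValueBatchExample {
    private final int rows = 0; // 'final' on a field is NOT redundant

    // Flagged form:  public final int numRows() { return rows; }
    // Clean form (method-level 'final' dropped):
    public int numRows() {
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(new KeyValueBatchExample().numRows());
    }
}
```

The fix for each listed error is simply to drop the redundant keyword; behavior is unchanged.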
[jira] [Created] (SPARK-16945) Fix Java Lint errors
Weiqing Yang created SPARK-16945: Summary: Fix Java Lint errors Key: SPARK-16945 URL: https://issues.apache.org/jira/browse/SPARK-16945 Project: Spark Issue Type: Task Reporter: Weiqing Yang Priority: Minor There are following errors when running dev/lint-java: [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/VariableLengthRowBasedKeyValueBatch.java:[42,10] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/VariableLengthRowBasedKeyValueBatch.java:[97,10] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/VariableLengthRowBasedKeyValueBatch.java:[113,10] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/RowBasedKeyValueBatch.java:[126,11] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[36,11] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[46,10] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[74,10] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[93,13] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/sql/catalyst/expressions/FixedLengthRowBasedKeyValueBatch.java:[106,10] (modifier) RedundantModifier: Redundant 'final' modifier. [ERROR] src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java:[224] (sizes) LineLength: Line is longer than 100 characters (found 104). 
[jira] [Commented] (SPARK-15857) Add Caller Context in Spark
[ https://issues.apache.org/jira/browse/SPARK-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396677#comment-15396677 ] Weiqing Yang commented on SPARK-15857: -- That single PR is for subtasks 1, 2 and 4. Based on the review feedback, I will make a new and smaller PR to each sub-task. That will make review easier. I will close that PR, and submit the design doc before any new PR. > Add Caller Context in Spark > --- > > Key: SPARK-15857 > URL: https://issues.apache.org/jira/browse/SPARK-15857 > Project: Spark > Issue Type: New Feature >Reporter: Weiqing Yang > > Hadoop has implemented a feature of log tracing – caller context (Jira: > HDFS-9184 and YARN-4349). The motivation is to better diagnose and understand > how specific applications impacting parts of the Hadoop system and potential > problems they may be creating (e.g. overloading NN). As HDFS mentioned in > HDFS-9184, for a given HDFS operation, it's very helpful to track which upper > level job issues it. The upper level callers may be specific Oozie tasks, MR > jobs, hive queries, Spark jobs. > Hadoop ecosystems like MapReduce, Tez (TEZ-2851), Hive (HIVE-12249, > HIVE-12254) and Pig(PIG-4714) have implemented their caller contexts. Those > systems invoke HDFS client API and Yarn client API to setup caller context, > and also expose an API to pass in caller context into it. > Lots of Spark applications are running on Yarn/HDFS. Spark can also implement > its caller context via invoking HDFS/Yarn API, and also expose an API to its > upstream applications to set up their caller contexts. In the end, the spark > caller context written into Yarn log / HDFS log can associate with task id, > stage id, job id and app id. That is also very good for Spark users to > identify tasks especially if Spark supports multi-tenant environment in the > future. 
[jira] [Commented] (SPARK-15857) Add Caller Context in Spark
[ https://issues.apache.org/jira/browse/SPARK-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396583#comment-15396583 ] Weiqing Yang commented on SPARK-15857: -- To make review easier, subtasks have been created. > Add Caller Context in Spark > --- > > Key: SPARK-15857 > URL: https://issues.apache.org/jira/browse/SPARK-15857 > Project: Spark > Issue Type: New Feature >Reporter: Weiqing Yang > > Hadoop has implemented a feature of log tracing – caller context (Jira: > HDFS-9184 and YARN-4349). The motivation is to better diagnose and understand > how specific applications impacting parts of the Hadoop system and potential > problems they may be creating (e.g. overloading NN). As HDFS mentioned in > HDFS-9184, for a given HDFS operation, it's very helpful to track which upper > level job issues it. The upper level callers may be specific Oozie tasks, MR > jobs, hive queries, Spark jobs. > Hadoop ecosystems like MapReduce, Tez (TEZ-2851), Hive (HIVE-12249, > HIVE-12254) and Pig(PIG-4714) have implemented their caller contexts. Those > systems invoke HDFS client API and Yarn client API to setup caller context, > and also expose an API to pass in caller context into it. > Lots of Spark applications are running on Yarn/HDFS. Spark can also implement > its caller context via invoking HDFS/Yarn API, and also expose an API to its > upstream applications to set up their caller contexts. In the end, the spark > caller context written into Yarn log / HDFS log can associate with task id, > stage id, job id and app id. That is also very good for Spark users to > identify tasks especially if Spark supports multi-tenant environment in the > future.
[jira] [Created] (SPARK-16760) Pass 'jobId' to Task
Weiqing Yang created SPARK-16760: Summary: Pass 'jobId' to Task Key: SPARK-16760 URL: https://issues.apache.org/jira/browse/SPARK-16760 Project: Spark Issue Type: Sub-task Reporter: Weiqing Yang In the end, the Spark caller context written into the HDFS log will be associated with the task id, stage id, job id, app id, etc. Currently, however, Task does not know any job information, so the job id will be passed to Task in the patch for this jira. That will help Spark users identify tasks, especially if Spark supports a multi-tenant environment in the future.
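As a rough illustration of the intent (all names below are hypothetical, not the actual Spark patch): once a task knows its job id, a caller-context string covering all four identifiers can be assembled.

```scala
// Hypothetical sketch: a task identity that also carries the job id, so the
// caller context written into the HDFS audit log can reference all four ids.
// The field layout and string format are assumptions for illustration only.
case class TaskIdentity(appId: String, jobId: Int, stageId: Int, taskId: Long) {
  // Assemble a caller-context string from the identifiers.
  def callerContext: String =
    s"SPARK_${appId}_JId_${jobId}_SId_${stageId}_TId_${taskId}"
}

val id = TaskIdentity("application_1465778870517_0001", jobId = 3, stageId = 7, taskId = 42L)
println(id.callerContext)
```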
[jira] [Created] (SPARK-16759) Spark expose an API to pass in Caller Context into it
Weiqing Yang created SPARK-16759: Summary: Spark expose an API to pass in Caller Context into it Key: SPARK-16759 URL: https://issues.apache.org/jira/browse/SPARK-16759 Project: Spark Issue Type: Sub-task Reporter: Weiqing Yang The API should expose a way for upstream components to inject a caller context. The caller context will be in the form of a tuple (caller context type, caller context id). The initial implementation will require support for at least one primary caller context. Future versions may need secondary caller contexts to also be provided via the same interface.
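A minimal sketch of what such an API could look like (the object and method names here are hypothetical, not the interface that was eventually merged): the caller context is a (type, id) tuple, one primary context is required, and secondary contexts could later be supplied through the same interface.

```scala
// Hypothetical sketch of the proposed API shape: a caller context is a
// (type, id) tuple; one primary context is required, secondary ones optional.
object UpstreamCallerContext {
  private var contexts: List[(String, String)] = Nil

  // Set the primary caller context; replaces anything previously set.
  def setPrimary(ctxType: String, ctxId: String): Unit =
    contexts = List((ctxType, ctxId))

  // Future versions may allow secondary contexts via the same interface.
  def addSecondary(ctxType: String, ctxId: String): Unit =
    contexts = contexts :+ ((ctxType, ctxId))

  def current: List[(String, String)] = contexts
}

// Example: a Hive query as primary caller, an Oozie task as secondary.
UpstreamCallerContext.setPrimary("HIVE_QUERY_ID", "hive_20160613_0001")
UpstreamCallerContext.addSecondary("OOZIE_TASK_ID", "oozie_0007")
```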
[jira] [Created] (SPARK-16758) Set up caller context to YARN
Weiqing Yang created SPARK-16758: Summary: Set up caller context to YARN Key: SPARK-16758 URL: https://issues.apache.org/jira/browse/SPARK-16758 Project: Spark Issue Type: Sub-task Reporter: Weiqing Yang In this jira, Spark will invoke the Hadoop caller context API to set up its caller context to YARN.
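Conceptually, setting up a caller context amounts to attaching a short descriptive string to each RPC; Hadoop's org.apache.hadoop.ipc.CallerContext truncates it to a configured maximum size (128 bytes by default, via hadoop.caller.context.max.size). A simplified stand-alone sketch of that truncation behavior, not Hadoop's actual implementation:

```scala
// Simplified sketch of what "setting up a caller context" means: a short
// string attached to each RPC, truncated to a configured maximum length.
// (Hadoop truncates by bytes; characters are used here for simplicity.)
def buildCallerContext(context: String, maxSize: Int = 128): String =
  if (context.length <= maxSize) context else context.take(maxSize)

// An over-long context is cut down to the configured limit.
val ctx = buildCallerContext("SPARK_" + "x" * 200)
println(ctx.length)
```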
[jira] [Created] (SPARK-16757) Set up caller context to HDFS
Weiqing Yang created SPARK-16757: Summary: Set up caller context to HDFS Key: SPARK-16757 URL: https://issues.apache.org/jira/browse/SPARK-16757 Project: Spark Issue Type: Sub-task Reporter: Weiqing Yang In this jira, Spark will invoke the Hadoop caller context API to set up its caller context to HDFS.
[jira] [Commented] (SPARK-16595) Spark History server Rest Api gives Application not found error for yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-16595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390162#comment-15390162 ] Weiqing Yang commented on SPARK-16595: -- This issue could not be reproduced. > Spark History server Rest Api gives Application not found error for > yarn-cluster mode > - > > Key: SPARK-16595 > URL: https://issues.apache.org/jira/browse/SPARK-16595 > Project: Spark > Issue Type: Bug > Affects Versions: 1.6.1 > Reporter: Yesha Vora > > Scenario: > * Start SparkPi application in Spark1 using yarn-cluster mode > (application_1468686376753_0041) > * After the application finishes, validate that the application exists in the respective Spark > History server. > {code} > Error loading url > http://xx.xx.xx.xx:18080/api/v1/applications/application_1468686376753_0041/1/executors > HTTP Code: 404 > HTTP Data: no such app: application_1468686376753_0041{code} > {code:title=spark HS log} > 16/07/16 15:55:29 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468678823755_0049.inprogress > 16/07/16 15:56:20 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468678823755_0049 > 16/07/16 16:23:14 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468678823755_0061.inprogress > 16/07/16 16:24:14 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468678823755_0061 > 16/07/16 17:42:32 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/local-1468690940553.inprogress > 16/07/16 17:43:22 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/local-1468690940553 > 16/07/16 17:43:44 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/local-1468691017376.inprogress > 16/07/16 17:44:34 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/local-1468691017376 > 
16/07/16 18:53:10 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468686376753_0041_1.inprogress > 16/07/16 19:03:26 INFO PackagesResourceConfig: Scanning for root resource and > provider classes in the packages: > org.apache.spark.status.api.v1 > 16/07/16 19:03:35 INFO ScanningResourceConfig: Root resource classes found: > class org.apache.spark.status.api.v1.ApiRootResource > 16/07/16 19:03:35 INFO ScanningResourceConfig: Provider classes found: > class org.apache.spark.status.api.v1.JacksonMessageWriter > 16/07/16 19:03:35 INFO WebApplicationImpl: Initiating Jersey application, > version 'Jersey: 1.9 09/02/2011 11:17 AM' > 16/07/16 19:03:36 INFO SecurityManager: Changing view acls to: spark > 16/07/16 19:03:36 INFO SecurityManager: Changing modify acls to: spark > 16/07/16 19:03:36 INFO SecurityManager: SecurityManager: authentication > disabled; ui acls disabled; users with view permissions: Set(spark); users > with modify permissions: Set(spark) > 16/07/16 19:03:36 INFO ApplicationCache: Failed to load application attempt > application_1468686376753_0041/Some(1) > 16/07/16 19:04:21 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468686376753_0043.inprogress > 16/07/16 19:12:02 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468686376753_0043 > 16/07/16 19:16:11 INFO SecurityManager: Changing view acls to: spark > 16/07/16 19:16:11 INFO SecurityManager: Changing modify acls to: spark > 16/07/16 19:16:11 INFO SecurityManager: SecurityManager: authentication > disabled; ui acls disabled; users with view permissions: Set(spark); users > with modify permissions: Set(spark) > 16/07/16 19:16:11 INFO FsHistoryProvider: Replaying log path: > hdfs://xx.xx.xx.xx:8020/spark-history/application_1468686376753_0043 > 16/07/16 19:16:22 INFO SecurityManager: Changing acls enabled to: false > 16/07/16 19:16:22 INFO SecurityManager: 
Changing admin acls to: > 16/07/16 19:16:22 INFO SecurityManager: Changing view acls to: hrt_qa{code} > {code} > hdfs@xxx:/var/log/spark$ hdfs dfs -ls /spark-history/ > Found 8 items > -rwxrwx--- 3 hrt_qa hadoop 28793 2016-07-16 15:56 > /spark-history/application_1468678823755_0049 > -rwxrwx--- 3 hrt_qa hadoop 28763 2016-07-16 16:24 > /spark-history/application_1468678823755_0061 > -rwxrwx--- 3 hrt_qa hadoop 58868885 2016-07-16 18:59 > /spark-history/application_1468686376753_0041_1 > -rwxrwx--- 3 hrt_qa hadoop 58841982 2016-07-16 19:11 > /spark-history/application_1468686376753_0043 > -rwxrwx--- 3 hive hadoop 5823 2016-07-16 11:38 > /spark-history/local-1468666932940 > -rwxrwx--- 3 hive hadoop 5757 2016-07-16 22:44 >
[jira] [Reopened] (SPARK-15923) Spark Application rest api returns "no such app: "
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang reopened SPARK-15923: -- Reopening the jira to update monitoring.md. > Spark Application rest api returns "no such app: " > - > > Key: SPARK-15923 > URL: https://issues.apache.org/jira/browse/SPARK-15923 > Project: Spark > Issue Type: Bug > Affects Versions: 1.6.1 > Reporter: Yesha Vora > > Env: secure cluster > Scenario: > * Run the SparkPi application in yarn-client or yarn-cluster mode > * After the application finishes, check the Spark HS rest api to get details like > jobs / executors etc. > {code} > http://:18080/api/v1/applications/application_1465778870517_0001/1/executors{code} > > The rest api returns HTTP Code: 404 and prints "HTTP Data: no such app: > application_1465778870517_0001"
[jira] [Closed] (SPARK-15923) Spark Application rest api returns "no such app: "
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiqing Yang closed SPARK-15923. Resolution: Won't Fix > Spark Application rest api returns "no such app: " > - > > Key: SPARK-15923 > URL: https://issues.apache.org/jira/browse/SPARK-15923 > Project: Spark > Issue Type: Bug > Affects Versions: 1.6.1 > Reporter: Yesha Vora > > Env: secure cluster > Scenario: > * Run the SparkPi application in yarn-client or yarn-cluster mode > * After the application finishes, check the Spark HS rest api to get details like > jobs / executors etc. > {code} > http://:18080/api/v1/applications/application_1465778870517_0001/1/executors{code} > > The rest api returns HTTP Code: 404 and prints "HTTP Data: no such app: > application_1465778870517_0001"
[jira] [Comment Edited] (SPARK-15923) Spark Application rest api returns "no such app: "
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363056#comment-15363056 ] Weiqing Yang edited comment on SPARK-15923 at 7/5/16 7:23 PM: -- [~jerryshao] Thanks for the feedback. After discussing with Yesha, I will close this jira. was (Author: weiqingyang): [~jerryshao]Thanks for the feedback. After discussed with Yesha, and will close this jira. > Spark Application rest api returns "no such app: " > - > > Key: SPARK-15923 > URL: https://issues.apache.org/jira/browse/SPARK-15923 > Project: Spark > Issue Type: Bug > Affects Versions: 1.6.1 > Reporter: Yesha Vora > > Env: secure cluster > Scenario: > * Run the SparkPi application in yarn-client or yarn-cluster mode > * After the application finishes, check the Spark HS rest api to get details like > jobs / executors etc. > {code} > http://:18080/api/v1/applications/application_1465778870517_0001/1/executors{code} > > The rest api returns HTTP Code: 404 and prints "HTTP Data: no such app: > application_1465778870517_0001"
[jira] [Commented] (SPARK-15923) Spark Application rest api returns "no such app: "
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363056#comment-15363056 ] Weiqing Yang commented on SPARK-15923: -- [~jerryshao]Thanks for the feedback. After discussed with Yesha, and will close this jira. > Spark Application rest api returns "no such app: " > - > > Key: SPARK-15923 > URL: https://issues.apache.org/jira/browse/SPARK-15923 > Project: Spark > Issue Type: Bug > Affects Versions: 1.6.1 > Reporter: Yesha Vora > > Env: secure cluster > Scenario: > * Run the SparkPi application in yarn-client or yarn-cluster mode > * After the application finishes, check the Spark HS rest api to get details like > jobs / executors etc. > {code} > http://:18080/api/v1/applications/application_1465778870517_0001/1/executors{code} > > The rest api returns HTTP Code: 404 and prints "HTTP Data: no such app: > application_1465778870517_0001"
[jira] [Commented] (SPARK-15923) Spark Application rest api returns "no such app: "
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359912#comment-15359912 ] Weiqing Yang commented on SPARK-15923: -- Hi [~tgraves] [~ste...@apache.org], could you help review this, please? Thanks. > Spark Application rest api returns "no such app: " > - > > Key: SPARK-15923 > URL: https://issues.apache.org/jira/browse/SPARK-15923 > Project: Spark > Issue Type: Bug > Affects Versions: 1.6.1 > Reporter: Yesha Vora > > Env: secure cluster > Scenario: > * Run the SparkPi application in yarn-client or yarn-cluster mode > * After the application finishes, check the Spark HS rest api to get details like > jobs / executors etc. > {code} > http://:18080/api/v1/applications/application_1465778870517_0001/1/executors{code} > > The rest api returns HTTP Code: 404 and prints "HTTP Data: no such app: > application_1465778870517_0001"
[jira] [Commented] (SPARK-15923) Spark Application rest api returns "no such app: "
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359872#comment-15359872 ] Weiqing Yang commented on SPARK-15923: -- Debugged in the cluster. This issue happens whether the cluster is secure or unsecured, and only applications in yarn-client mode are affected. Detailed description:
1. yarn-client mode: applications in yarn-client mode do not have an 'attemptId' in their records, e.g.: "id": "application_1465778870517_0001", "name": "Spark Pi", "attempts": [ {"startTime": "2016-06-13T01:07:16.958GMT", "endTime" : "2016-06-13T01:09:29.668GMT", "sparkUser" : "hrt_qa", "completed" : true } ]. When checking the web UI for executors' information, the link used is http://:18080/history/application_1465778870517_0001/executors/, which shows all the executors' information. Note: it does not have an attemptId inside the link. On the other hand, calling the rest API http://:18080/api/v1/applications/application_1465778870517_0001/1/executors, which has attemptId "1" inside, produces errors like "no such app" and "INFO ApplicationCache: Failed to load application attempt application_1465778870517_0001/Some(1)". The rest API "http://:18080/api/v1/applications/application_1465778870517_0001/executors", which has no attemptId inside, shows all the executors' information instead.
2. yarn-cluster mode: applications in yarn-cluster mode do have an 'attemptId' in their records, e.g.: "id" : "application_1465778870517_0002", "name" : "Spark Pi", "attempts" : [ {"attemptId": "1", "startTime" : "2016-06-13T01:12:48.797GMT", "endTime" : "2016-06-13T01:14:26.900GMT", "sparkUser" : "hrt_qa", "completed" : true } ]. We can check executor information through both the web UI and the rest API, since both of them include attemptId "1": http://:18080/history/application_1465778870517_0002/1/executors/ and http://:18080/api/v1/applications/application_1465778870517_0002/1/executors.
Summary: when checking job/executor information through the rest APIs, an "attemptId" is included in the path; in yarn-client mode, however, there is no attempt id. I am going to make a pull request for review. > Spark Application rest api returns "no such app: " > - > > Key: SPARK-15923 > URL: https://issues.apache.org/jira/browse/SPARK-15923 > Project: Spark > Issue Type: Bug > Affects Versions: 1.6.1 > Reporter: Yesha Vora > > Env: secure cluster > Scenario: > * Run the SparkPi application in yarn-client or yarn-cluster mode > * After the application finishes, check the Spark HS rest api to get details like > jobs / executors etc. > {code} > http://:18080/api/v1/applications/application_1465778870517_0001/1/executors{code} > > The rest api returns HTTP Code: 404 and prints "HTTP Data: no such app: > application_1465778870517_0001"
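The path difference described above can be sketched with a small helper (for illustration only, not part of Spark): in yarn-cluster mode the attempt id appears as an extra path segment, while in yarn-client mode it must be omitted or the History Server answers 404 "no such app".

```scala
// Build the executors REST URL, with the attempt id as an optional
// path segment (present for yarn-cluster mode, absent for yarn-client).
def executorsUrl(historyServer: String, appId: String, attemptId: Option[String]): String =
  (Seq(historyServer, "api/v1", "applications", appId) ++ attemptId :+ "executors")
    .mkString("/")

// "hs" is a placeholder host name for illustration.
val clusterMode = executorsUrl("http://hs:18080", "application_1465778870517_0002", Some("1"))
val clientMode  = executorsUrl("http://hs:18080", "application_1465778870517_0001", None)
```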
[jira] [Commented] (SPARK-15857) Add Caller Context in Spark
[ https://issues.apache.org/jira/browse/SPARK-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323655#comment-15323655 ] Weiqing Yang commented on SPARK-15857: -- I will attach the design doc soon. > Add Caller Context in Spark > --- > > Key: SPARK-15857 > URL: https://issues.apache.org/jira/browse/SPARK-15857 > Project: Spark > Issue Type: New Feature > Reporter: Weiqing Yang > > Hadoop has implemented a log-tracing feature – caller context (JIRA: > HDFS-9184 and YARN-4349). The motivation is to better diagnose and understand > how specific applications impact parts of the Hadoop system and what potential > problems they may be creating (e.g. overloading the NN). As mentioned in > HDFS-9184, for a given HDFS operation it is very helpful to track which upper-level > job issued it. The upper-level callers may be specific Oozie tasks, MR > jobs, Hive queries, or Spark jobs. > Hadoop ecosystem components like MapReduce, Tez (TEZ-2851), Hive (HIVE-12249, > HIVE-12254) and Pig (PIG-4714) have implemented their own caller contexts. Those > systems invoke the HDFS client API and the YARN client API to set up the caller context, > and also expose an API for passing a caller context into them. > Lots of Spark applications run on YARN/HDFS. Spark can also implement > its caller context by invoking the HDFS/YARN API, and expose an API so that its > upstream applications can set up their caller contexts. In the end, the Spark > caller context written into the YARN log / HDFS log can be associated with the task id, > stage id, job id and app id. That will also make it easier for Spark users to > identify tasks, especially if Spark supports a multi-tenant environment in the > future.
[jira] [Created] (SPARK-15857) Add Caller Context in Spark
Weiqing Yang created SPARK-15857: Summary: Add Caller Context in Spark Key: SPARK-15857 URL: https://issues.apache.org/jira/browse/SPARK-15857 Project: Spark Issue Type: New Feature Reporter: Weiqing Yang Hadoop has implemented a log-tracing feature – caller context (JIRA: HDFS-9184 and YARN-4349). The motivation is to better diagnose and understand how specific applications impact parts of the Hadoop system and what potential problems they may be creating (e.g. overloading the NN). As mentioned in HDFS-9184, for a given HDFS operation it is very helpful to track which upper-level job issued it. The upper-level callers may be specific Oozie tasks, MR jobs, Hive queries, or Spark jobs. Hadoop ecosystem components like MapReduce, Tez (TEZ-2851), Hive (HIVE-12249, HIVE-12254) and Pig (PIG-4714) have implemented their own caller contexts. Those systems invoke the HDFS client API and the YARN client API to set up the caller context, and also expose an API for passing a caller context into them. Lots of Spark applications run on YARN/HDFS. Spark can also implement its caller context by invoking the HDFS/YARN API, and expose an API so that its upstream applications can set up their caller contexts. In the end, the Spark caller context written into the YARN log / HDFS log can be associated with the task id, stage id, job id and app id. That will also make it easier for Spark users to identify tasks, especially if Spark supports a multi-tenant environment in the future.
[jira] [Commented] (SPARK-15707) Make Code Neat - Use map instead of if check
[ https://issues.apache.org/jira/browse/SPARK-15707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310794#comment-15310794 ] Weiqing Yang commented on SPARK-15707: -- I am going to send a pull request for review. > Make Code Neat - Use map instead of if check > > > Key: SPARK-15707 > URL: https://issues.apache.org/jira/browse/SPARK-15707 > Project: Spark > Issue Type: Improvement > Components: SQL > Reporter: Weiqing Yang > Priority: Trivial > > In the forType function of object RandomDataGenerator, there is a piece of code > as follows: > > if (maybeSqlTypeGenerator.isDefined) { > val sqlTypeGenerator = maybeSqlTypeGenerator.get > val generator = () => { > …. > } > Some(generator) > } else { > None > } > --- > It is better to use maybeSqlTypeGenerator.map instead of the 'if … else …' above, > since 'map' already handles both cases itself. That will make the code neater.
[jira] [Created] (SPARK-15707) Make Code Neat - Use map instead of if check
Weiqing Yang created SPARK-15707: Summary: Make Code Neat - Use map instead of if check Key: SPARK-15707 URL: https://issues.apache.org/jira/browse/SPARK-15707 Project: Spark Issue Type: Improvement Components: SQL Reporter: Weiqing Yang Priority: Trivial In the forType function of object RandomDataGenerator, there is a piece of code as follows:

if (maybeSqlTypeGenerator.isDefined) {
  val sqlTypeGenerator = maybeSqlTypeGenerator.get
  val generator = () => {
    ….
  }
  Some(generator)
} else {
  None
}

---
It is better to use maybeSqlTypeGenerator.map instead of the 'if … else …' above, since 'map' already handles both cases itself. That will make the code neater.
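The suggested refactoring can be shown with a self-contained sketch (an Int generator stands in for the real RandomDataGenerator types, and the wrapping logic is a trivial placeholder):

```scala
// Stand-in for the pattern in RandomDataGenerator.forType:
// the isDefined / get / else-None dance...
def forTypeVerbose(maybeGen: Option[() => Int]): Option[() => Int] =
  if (maybeGen.isDefined) {
    val sqlTypeGenerator = maybeGen.get
    val generator = () => sqlTypeGenerator() + 0 // placeholder for the wrapping logic
    Some(generator)
  } else {
    None
  }

// ...collapses to a single map, which handles the None case itself:
def forTypeNeat(maybeGen: Option[() => Int]): Option[() => Int] =
  maybeGen.map { sqlTypeGenerator =>
    () => sqlTypeGenerator() + 0
  }
```

Both versions are equivalent; map simply folds the presence check into the library call.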