[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11863 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60550/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11863 **[Test build #60550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60550/consoleFull)** for PR 11863 at commit [`c715872`](https://github.com/apache/spark/commit/c7158724a39649a0f8d84e0e02e4d71f58a4f433). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11863 Merged build finished. Test FAILed.
[GitHub] spark issue #12506: [SPARK-14736][core] Deadlock in registering applications...
Github user squito commented on the issue: https://github.com/apache/spark/pull/12506 unfortunately I'm not very knowledgeable here either. I agree that this change looks reasonable, but also wish there was a test case for it. I don't think there is any good integration test framework, and it seems there aren't any tests for recovery state now, so you'd have to build that out yourself. One possibility -- "local-cluster" mode is closely related to a standalone cluster -- maybe that could be used to create a test?
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13565 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60548/ Test PASSed.
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13565 Merged build finished. Test PASSed.
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13565 **[Test build #60548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60548/consoleFull)** for PR 13565 at commit [`e35a7d3`](https://github.com/apache/spark/commit/e35a7d3d0696fd0c864ee8fa9c9ded95abffba2c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #12739: [SPARK-14955] [SQL] avoid stride value equals to zero
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12739 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60549/ Test PASSed.
[GitHub] spark issue #12739: [SPARK-14955] [SQL] avoid stride value equals to zero
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12739 Merged build finished. Test PASSed.
[GitHub] spark issue #12739: [SPARK-14955] [SQL] avoid stride value equals to zero
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12739 **[Test build #60549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60549/consoleFull)** for PR 12739 at commit [`b1b5be2`](https://github.com/apache/spark/commit/b1b5be286a593aec164d13db1a43e66affecd7f6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13677: [SPARK 15926] Improve readability of DAGScheduler stage ...
Github user squito commented on the issue: https://github.com/apache/spark/pull/13677 everything proposed here makes sense to me, I think just needs the one minor reordering I mentioned. about creating extra stages, I looked into this some here: https://issues.apache.org/jira/browse/SPARK-10193. Actually seems fairly straight-forward, though I never followed up on profiling the extra memory involved. (happy to let someone else takeover, I doubt I'll get back to it anytime soon ...)
[GitHub] spark pull request #13677: [SPARK 15926] Improve readability of DAGScheduler...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/13677#discussion_r67101246

--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -374,17 +349,35 @@ class DAGScheduler(
   }

   /**
+   * Create a ResultStage associated with the provided jobId.
+   */
+  private def createResultStage(
+      rdd: RDD[_],
+      func: (TaskContext, Iterator[_]) => _,
+      partitions: Array[Int],
+      jobId: Int,
+      callSite: CallSite): ResultStage = {
+    val id = nextStageId.getAndIncrement()
+    val stage = new ResultStage(
+      id, rdd, func, partitions, getOrCreateParentStages(rdd, jobId), jobId, callSite)

--- End diff --

from a quick check, I think that might solve all the failing tests
[GitHub] spark pull request #13677: [SPARK 15926] Improve readability of DAGScheduler...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/13677#discussion_r67099857

--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -374,17 +349,35 @@ class DAGScheduler(
   }

   /**
+   * Create a ResultStage associated with the provided jobId.
+   */
+  private def createResultStage(
+      rdd: RDD[_],
+      func: (TaskContext, Iterator[_]) => _,
+      partitions: Array[Int],
+      jobId: Int,
+      callSite: CallSite): ResultStage = {
+    val id = nextStageId.getAndIncrement()
+    val stage = new ResultStage(
+      id, rdd, func, partitions, getOrCreateParentStages(rdd, jobId), jobId, callSite)

--- End diff --

`getOrCreateParentStages` should be called before getting the id for the result stage, otherwise the result stage will get numbered below the dependent stages.
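The numbering issue in this comment can be sketched with a toy counter. This is illustrative only: `StageNumbering`, `createParentStages`, and the two-parent assumption are hypothetical stand-ins, not the real DAGScheduler API.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Toy model of stage-id allocation, illustrating the review comment: if the
// result stage takes an id from the counter *before* its parent stages are
// created, the parents end up with larger ids than the stage that depends
// on them.
object StageNumbering {
  val nextStageId = new AtomicInteger(0)

  // Pretend the RDD lineage requires creating two parent stages.
  def createParentStages(): Seq[Int] =
    Seq.fill(2)(nextStageId.getAndIncrement())

  // Problematic order: result stage id allocated first.
  def resultStageIdFirst(): (Int, Seq[Int]) = {
    val id = nextStageId.getAndIncrement()
    (id, createParentStages())
  }

  // Suggested order: parents created before the result stage id is taken.
  def parentsFirst(): (Int, Seq[Int]) = {
    val parents = createParentStages()
    (nextStageId.getAndIncrement(), parents)
  }
}
```

With a fresh counter, the first ordering hands the result stage id 0 while its parents get 1 and 2; creating the parents first gives the result stage the highest id, which is the reordering the reviewer asks for.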
[GitHub] spark pull request #13668: [SPARK-15915][SQL] Logical plans should use subqu...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13668#discussion_r67099374

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LocalRelation.scala ---
@@ -56,10 +57,12 @@ case class LocalRelation(output: Seq[Attribute], data: Seq[InternalRow] = Nil)

   override protected def stringArgs = Iterator(output)

-  override def sameResult(plan: LogicalPlan): Boolean = plan match {
-    case LocalRelation(otherOutput, otherData) =>
-      otherOutput.map(_.dataType) == output.map(_.dataType) && otherData == data
-    case _ => false
+  override def sameResult(plan: LogicalPlan): Boolean = {
+    EliminateSubQueries(plan) match {

--- End diff --

Maybe we should execute `EliminateSubQueries` when looking up the cache entry? 1.6 is not like 2.0: we don't have the `canonicalized` method and can't memoize the result of `EliminateSubQueries`.
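The normalization being discussed (stripping subquery wrappers before comparing plans) can be shown with a minimal sketch. The `Plan`, `Subquery`, and `Relation` types below are hypothetical stand-ins, not the real Catalyst classes.

```scala
// Minimal stand-in for logical plan nodes; not the Catalyst API.
sealed trait Plan
case class Subquery(alias: String, child: Plan) extends Plan
case class Relation(name: String) extends Plan

object PlanCompare {
  // Recursively drop subquery wrappers, mirroring what an
  // EliminateSubQueries-style rule does before plans are compared.
  def eliminateSubQueries(p: Plan): Plan = p match {
    case Subquery(_, child) => eliminateSubQueries(child)
    case other => other
  }

  // Comparing normalized plans lets a relation wrapped in a subquery alias
  // match the bare relation it was built from, so a cache lookup keyed on
  // sameResult still hits.
  def sameResult(a: Plan, b: Plan): Boolean =
    eliminateSubQueries(a) == eliminateSubQueries(b)
}
```

Running the normalization inside `sameResult`, or once at cache-lookup time as the comment suggests, is the same idea; the lookup-time variant just avoids re-normalizing the same plan for every candidate entry.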
[GitHub] spark issue #13631: [SPARK-15911][SQL] Remove the additional Project to be c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13631 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60546/ Test PASSed.
[GitHub] spark issue #13631: [SPARK-15911][SQL] Remove the additional Project to be c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13631 Merged build finished. Test PASSed.
[GitHub] spark issue #13631: [SPARK-15911][SQL] Remove the additional Project to be c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13631 **[Test build #60546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60546/consoleFull)** for PR 13631 at commit [`3030144`](https://github.com/apache/spark/commit/3030144b3bbe4becaa438e6155e90b53aaeb94cf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13651 **[Test build #60551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60551/consoleFull)** for PR 13651 at commit [`3735411`](https://github.com/apache/spark/commit/3735411d8dc6210098939721f63aeecbe93cb873).
[GitHub] spark pull request #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pat...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13676#discussion_r67097541

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala ---
@@ -479,7 +354,287 @@ private[hive] trait HiveInspectors {
   }

   /**
-   * Builds specific unwrappers ahead of time according to object inspector
+   * Strictly follows the following order in unwrapping (constant OI has the higher priority):
+   *   Constant Null object inspector =>
+   *     return null
+   *   Constant object inspector =>
+   *     extract the value from constant object inspector
+   *   If object inspector prefers writable =>
+   *     extract writable from `data` and then get the catalyst type from the writable
+   *   Extract the java object directly from the object inspector
+   *
+   * NOTICE: the complex data type requires recursive unwrapping.
+   */
+  def unwrapperFor(objectInspector: ObjectInspector): Any => Any =
+    objectInspector match {
+      case coi: ConstantObjectInspector if coi.getWritableConstantValue == null =>
+        data: Any => null
+      case poi: WritableConstantStringObjectInspector =>
+        data: Any =>
+          UTF8String.fromString(poi.getWritableConstantValue.toString)
+      case poi: WritableConstantHiveVarcharObjectInspector =>
+        data: Any =>
+          UTF8String.fromString(poi.getWritableConstantValue.getHiveVarchar.getValue)
+      case poi: WritableConstantHiveCharObjectInspector =>
+        data: Any =>
+          UTF8String.fromString(poi.getWritableConstantValue.getHiveChar.getValue)
+      case poi: WritableConstantHiveDecimalObjectInspector =>
+        data: Any =>
+          HiveShim.toCatalystDecimal(
+            PrimitiveObjectInspectorFactory.javaHiveDecimalObjectInspector,
+            poi.getWritableConstantValue.getHiveDecimal)
+      case poi: WritableConstantTimestampObjectInspector =>
+        data: Any => {
+          val t = poi.getWritableConstantValue
+          t.getSeconds * 100L + t.getNanos / 1000L
+        }
+      case poi: WritableConstantIntObjectInspector =>
+        data: Any =>
+          poi.getWritableConstantValue.get()
+      case poi: WritableConstantDoubleObjectInspector =>
+        data: Any =>
+          poi.getWritableConstantValue.get()
+      case poi: WritableConstantBooleanObjectInspector =>
+        data: Any =>
+          poi.getWritableConstantValue.get()
+      case poi: WritableConstantLongObjectInspector =>
+        data: Any =>
+          poi.getWritableConstantValue.get()
+      case poi: WritableConstantFloatObjectInspector =>
+        data: Any =>
+          poi.getWritableConstantValue.get()
+      case poi: WritableConstantShortObjectInspector =>
+        data: Any =>
+          poi.getWritableConstantValue.get()
+      case poi: WritableConstantByteObjectInspector =>
+        data: Any =>
+          poi.getWritableConstantValue.get()
+      case poi: WritableConstantBinaryObjectInspector =>
+        data: Any => {
+          val writable = poi.getWritableConstantValue
+          val temp = new Array[Byte](writable.getLength)
+          System.arraycopy(writable.getBytes, 0, temp, 0, temp.length)
+          temp
+        }
+      case poi: WritableConstantDateObjectInspector =>
+        data: Any =>
+          DateTimeUtils.fromJavaDate(poi.getWritableConstantValue.get())
+      case mi: StandardConstantMapObjectInspector =>
+        val keyUnwrapper = unwrapperFor(mi.getMapKeyObjectInspector)
+        val valueUnwrapper = unwrapperFor(mi.getMapValueObjectInspector)
+        data: Any => {
+          // take the value from the map inspector object, rather than the input data
+          val keyValues = mi.getWritableConstantValue.asScala.toSeq
+          val keys = keyValues.map(kv => keyUnwrapper(kv._1)).toArray
+          val values = keyValues.map(kv => valueUnwrapper(kv._2)).toArray
+          ArrayBasedMapData(keys, values)
+        }
+      case li: StandardConstantListObjectInspector =>
+        val unwrapper = unwrapperFor(li.getListElementObjectInspector)
+        data: Any => {
+          // take the value from the list inspector object, rather than the input data
+          val values = li.getWritableConstantValue.asScala
+            .map(unwrapper)
+            .toArray
+          new GenericArrayData(values)
+        }
+      case poi: VoidObjectInspector =>
+        data: Any =>
+          null // always be null for void object inspector
+      case pi: PrimitiveObjectInspector => pi match {
+        // We think HiveVarchar/HiveChar is also a String
+        case hvoi:
[GitHub] spark pull request #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pat...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13676#discussion_r67097521

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala ---
@@ -243,137 +243,12 @@ private[hive] trait HiveInspectors {
    * @param data the data in Hive type
    * @param oi the ObjectInspector associated with the Hive Type
    * @return convert the data into catalyst type
-   * TODO return the function of (data => Any) instead for performance consideration
    *
-   * Strictly follows the following order in unwrapping (constant OI has the higher priority):
-   * Constant Null object inspector =>
-   *    return null
-   * Constant object inspector =>
-   *    extract the value from constant object inspector
-   * Check whether the `data` is null =>
-   *    return null if true
-   * If object inspector prefers writable =>
-   *    extract writable from `data` and then get the catalyst type from the writable
-   * Extract the java object directly from the object inspector
-   *
-   * NOTICE: the complex data type requires recursive unwrapping.
+   * Use unwrapperFor's (data => Any) instead for performance consideration.
    */
-  def unwrap(data: Any, oi: ObjectInspector): Any = oi match {
-    case coi: ConstantObjectInspector if coi.getWritableConstantValue == null => null
-    case poi: WritableConstantStringObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.toString)
-    case poi: WritableConstantHiveVarcharObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.getHiveVarchar.getValue)
-    case poi: WritableConstantHiveCharObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.getHiveChar.getValue)
-    case poi: WritableConstantHiveDecimalObjectInspector =>
-      HiveShim.toCatalystDecimal(
-        PrimitiveObjectInspectorFactory.javaHiveDecimalObjectInspector,
-        poi.getWritableConstantValue.getHiveDecimal)
-    case poi: WritableConstantTimestampObjectInspector =>
-      val t = poi.getWritableConstantValue
-      t.getSeconds * 100L + t.getNanos / 1000L
-    case poi: WritableConstantIntObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantDoubleObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantBooleanObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantLongObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantFloatObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantShortObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantByteObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantBinaryObjectInspector =>
-      val writable = poi.getWritableConstantValue
-      val temp = new Array[Byte](writable.getLength)
-      System.arraycopy(writable.getBytes, 0, temp, 0, temp.length)
-      temp
-    case poi: WritableConstantDateObjectInspector =>
-      DateTimeUtils.fromJavaDate(poi.getWritableConstantValue.get())
-    case mi: StandardConstantMapObjectInspector =>
-      // take the value from the map inspector object, rather than the input data
-      val keyValues = mi.getWritableConstantValue.asScala.toSeq
-      val keys = keyValues.map(kv => unwrap(kv._1, mi.getMapKeyObjectInspector)).toArray
-      val values = keyValues.map(kv => unwrap(kv._2, mi.getMapValueObjectInspector)).toArray
-      ArrayBasedMapData(keys, values)
-    case li: StandardConstantListObjectInspector =>
-      // take the value from the list inspector object, rather than the input data
-      val values = li.getWritableConstantValue.asScala
-        .map(unwrap(_, li.getListElementObjectInspector))
-        .toArray
-      new GenericArrayData(values)
-    // if the value is null, we don't care about the object inspector type
-    case _ if data == null => null
-    case poi: VoidObjectInspector => null // always be null for void object inspector
-    case pi: PrimitiveObjectInspector => pi match {
-      // We think HiveVarchar/HiveChar is also a String
-      case hvoi: HiveVarcharObjectInspector if hvoi.preferWritable() =>
-        UTF8String.fromString(hvoi.getPrimitiveWritableObject(data).getHiveVarchar.getValue)
-      case hvoi: HiveVarcharObjectInspector =>
-        UTF8String.fromString(hvoi.getPrimitiveJavaObject(data).getValue)
-      case hvoi: HiveCharObjectInspector if hvoi.preferWritable() =>
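The refactoring quoted in this review follows a general pattern: pay for the type dispatch once when the unwrapper is built, then apply a plain closure per value. A minimal sketch of that pattern, where the `TypeDesc` hierarchy and `Unwrappers` object are hypothetical stand-ins rather than the HiveInspectors API:

```scala
// Toy type descriptors standing in for Hive ObjectInspectors.
sealed trait TypeDesc
case object IntDesc extends TypeDesc
case object StrDesc extends TypeDesc

object Unwrappers {
  // Slow path: the pattern match runs again for every single value.
  def unwrap(v: Any, desc: TypeDesc): Any = desc match {
    case IntDesc => v.asInstanceOf[Number].intValue()
    case StrDesc => v.toString
  }

  // Fast path: match once per descriptor and return a closure that is
  // applied per value with no further dispatch.
  def unwrapperFor(desc: TypeDesc): Any => Any = desc match {
    case IntDesc => (v: Any) => v.asInstanceOf[Number].intValue()
    case StrDesc => (v: Any) => v.toString
  }
}
```

A column reader would call `unwrapperFor` once per column and then map the returned function over every row, which is the performance consideration the removed TODO in the old `unwrap` doc comment referred to.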
[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11863 **[Test build #60550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60550/consoleFull)** for PR 11863 at commit [`c715872`](https://github.com/apache/spark/commit/c7158724a39649a0f8d84e0e02e4d71f58a4f433).
[GitHub] spark issue #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... statemen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13678 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60545/ Test FAILed.
[GitHub] spark issue #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... statemen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13678 Merged build finished. Test FAILed.
[GitHub] spark issue #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... statemen...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13678 **[Test build #60545 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60545/consoleFull)** for PR 13678 at commit [`2639bc4`](https://github.com/apache/spark/commit/2639bc4e3f468678b089e40d36ea3e6d60cdc0db). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #12739: [SPARK-14955] [SQL] avoid stride value equals to zero
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12739 **[Test build #60549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60549/consoleFull)** for PR 12739 at commit [`b1b5be2`](https://github.com/apache/spark/commit/b1b5be286a593aec164d13db1a43e66affecd7f6).
[GitHub] spark issue #13646: [SPARK-15927] Eliminate redundant DAGScheduler code.
Github user squito commented on the issue: https://github.com/apache/spark/pull/13646 @kayousterhout btw I tracked down the test issue. It wasn't safe to check the map output status inside the backend in those tests -- I addressed that here: https://github.com/apache/spark/pull/13565/commits/91ea3df33aaebb20b2df0cbbe56dbc99975200ea. Also the real problem was somewhat hidden b/c the assertion was failing in another thread, which I fixed here: https://github.com/apache/spark/pull/13565/commits/e35a7d3d0696fd0c864ee8fa9c9ded95abffba2c I ran the tests over 15k times on my laptop and still never triggered it, though. wish I had a better way to find these issues.
[GitHub] spark issue #13620: [SPARK-15590] [WEBUI] Paginate Job Table in Jobs tab
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13620 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60543/ Test PASSed.
[GitHub] spark issue #13620: [SPARK-15590] [WEBUI] Paginate Job Table in Jobs tab
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13620 Merged build finished. Test PASSed.
[GitHub] spark issue #13620: [SPARK-15590] [WEBUI] Paginate Job Table in Jobs tab
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13620 **[Test build #60543 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60543/consoleFull)** for PR 13620 at commit [`41bb189`](https://github.com/apache/spark/commit/41bb18976bb04e0af43c3464300fae204a111f07). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13565 **[Test build #60548 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60548/consoleFull)** for PR 13565 at commit [`e35a7d3`](https://github.com/apache/spark/commit/e35a7d3d0696fd0c864ee8fa9c9ded95abffba2c).
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60544/ Test PASSed.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Merged build finished. Test PASSed.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60544 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60544/consoleFull)** for PR 13673 at commit [`90ed61c`](https://github.com/apache/spark/commit/90ed61ca579cfc94eee4492f56ad3c43c54e57dd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Merged build finished. Test PASSed.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60542/ Test PASSed.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60542 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60542/consoleFull)** for PR 13673 at commit [`6d1501c`](https://github.com/apache/spark/commit/6d1501c35bdf8ff7c2ee5af3c373e8ebbdbe). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pat...
Github user lianhuiwang commented on a diff in the pull request: https://github.com/apache/spark/pull/13676#discussion_r67094026

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala ---

@@ -243,137 +243,12 @@ private[hive] trait HiveInspectors {
    * @param data the data in Hive type
    * @param oi the ObjectInspector associated with the Hive Type
    * @return convert the data into catalyst type
-   * TODO return the function of (data => Any) instead for performance consideration
    *
-   * Strictly follows the following order in unwrapping (constant OI has the higher priority):
-   *  Constant Null object inspector =>
-   *    return null
-   *  Constant object inspector =>
-   *    extract the value from constant object inspector
-   *  Check whether the `data` is null =>
-   *    return null if true
-   *  If object inspector prefers writable =>
-   *    extract writable from `data` and then get the catalyst type from the writable
-   *  Extract the java object directly from the object inspector
-   *
-   * NOTICE: the complex data type requires recursive unwrapping.
+   * Use unwrapperFor's (data => Any) instead for performance consideration.
    */
-  def unwrap(data: Any, oi: ObjectInspector): Any = oi match {
-    case coi: ConstantObjectInspector if coi.getWritableConstantValue == null => null
-    case poi: WritableConstantStringObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.toString)
-    case poi: WritableConstantHiveVarcharObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.getHiveVarchar.getValue)
-    case poi: WritableConstantHiveCharObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.getHiveChar.getValue)
-    case poi: WritableConstantHiveDecimalObjectInspector =>
-      HiveShim.toCatalystDecimal(
-        PrimitiveObjectInspectorFactory.javaHiveDecimalObjectInspector,
-        poi.getWritableConstantValue.getHiveDecimal)
-    case poi: WritableConstantTimestampObjectInspector =>
-      val t = poi.getWritableConstantValue
-      t.getSeconds * 1000000L + t.getNanos / 1000L
-    case poi: WritableConstantIntObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantDoubleObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantBooleanObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantLongObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantFloatObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantShortObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantByteObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantBinaryObjectInspector =>
-      val writable = poi.getWritableConstantValue
-      val temp = new Array[Byte](writable.getLength)
-      System.arraycopy(writable.getBytes, 0, temp, 0, temp.length)
-      temp
-    case poi: WritableConstantDateObjectInspector =>
-      DateTimeUtils.fromJavaDate(poi.getWritableConstantValue.get())
-    case mi: StandardConstantMapObjectInspector =>
-      // take the value from the map inspector object, rather than the input data
-      val keyValues = mi.getWritableConstantValue.asScala.toSeq
-      val keys = keyValues.map(kv => unwrap(kv._1, mi.getMapKeyObjectInspector)).toArray
-      val values = keyValues.map(kv => unwrap(kv._2, mi.getMapValueObjectInspector)).toArray
-      ArrayBasedMapData(keys, values)
-    case li: StandardConstantListObjectInspector =>
-      // take the value from the list inspector object, rather than the input data
-      val values = li.getWritableConstantValue.asScala
-        .map(unwrap(_, li.getListElementObjectInspector))
-        .toArray
-      new GenericArrayData(values)
-    // if the value is null, we don't care about the object inspector type
-    case _ if data == null => null
-    case poi: VoidObjectInspector => null // always be null for void object inspector
-    case pi: PrimitiveObjectInspector => pi match {
-      // We think HiveVarchar/HiveChar is also a String
-      case hvoi: HiveVarcharObjectInspector if hvoi.preferWritable() =>
-        UTF8String.fromString(hvoi.getPrimitiveWritableObject(data).getHiveVarchar.getValue)
-      case hvoi: HiveVarcharObjectInspector =>
-        UTF8String.fromString(hvoi.getPrimitiveJavaObject(data).getValue)
-      case hvoi: HiveCharObjectInspector if hvoi.preferWritable() =>
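The hunk above removes the per-row `unwrap(data, oi)` pattern match in favor of `unwrapperFor`, which inspects the `ObjectInspector` once and returns a `(data => Any)` closure. A toy sketch of that refactoring pattern — the inspector types below are simplified stand-ins, not Spark's or Hive's actual classes:

```scala
// Toy illustration of the unwrap -> unwrapperFor refactoring; the
// Inspector hierarchy here is a simplified stand-in, not Hive's.
sealed trait Inspector
case object IntInspector extends Inspector
case object StringInspector extends Inspector

object UnwrapSketch {
  // Before: the match on the inspector runs for every single value.
  def unwrap(data: Any, oi: Inspector): Any = oi match {
    case IntInspector    => data.asInstanceOf[Int]
    case StringInspector => data.toString
  }

  // After: match once per column, then apply the returned closure per row.
  def unwrapperFor(oi: Inspector): Any => Any = oi match {
    case IntInspector    => (d: Any) => d.asInstanceOf[Int]
    case StringInspector => (d: Any) => d.toString
  }

  def main(args: Array[String]): Unit = {
    val rows: Seq[Any] = Seq(1, 2, 3)
    val unwrapInt = unwrapperFor(IntInspector) // inspector resolved once
    println(rows.map(unwrapInt))               // hot loop avoids the match
  }
}
```

The saving is that the inspector dispatch runs once per column instead of once per value, which is why the old Scaladoc TODO ("return the function of (data => Any) instead for performance consideration") is resolved by this change.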
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60547/ Test FAILed.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60547/consoleFull)** for PR 13673 at commit [`d1c5036`](https://github.com/apache/spark/commit/d1c50360ef654e3fb13f1cefa07eacdcff2124ba). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Merged build finished. Test FAILed.
[GitHub] spark issue #13677: [SPARK 15926] Improve readability of DAGScheduler stage ...
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/13677 > One goal of this change is to make it clearer which functions may create new stages (as opposed to looking up stages that already exist). Something that I have been looking at of late, and I know that @squito has looked at some, too. In short, I'm pretty confident that we are doing some silliness around creating new stages instead of reusing already existing stages, then recognizing that all the tasks for the "new" stages are already completed (at least we're smart enough to reuse the map outputs), so the "new" stages just become "skipped". I'll take a closer look at this tomorrow, and may have a follow-on PR in the not-too-distant future.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60547/consoleFull)** for PR 13673 at commit [`d1c5036`](https://github.com/apache/spark/commit/d1c50360ef654e3fb13f1cefa07eacdcff2124ba).
[GitHub] spark issue #13631: [SPARK-15911][SQL] Remove the additional Project to be c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13631 **[Test build #60546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60546/consoleFull)** for PR 13631 at commit [`3030144`](https://github.com/apache/spark/commit/3030144b3bbe4becaa438e6155e90b53aaeb94cf).
[GitHub] spark issue #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... statemen...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13678 cc @Sephiroth-Lin
[GitHub] spark issue #13671: [SPARK-15952] [SQL] fix "show databases" ordering issue
Github user bomeng commented on the issue: https://github.com/apache/spark/pull/13671 thanks for merging!
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Merged build finished. Test PASSed.
[GitHub] spark issue #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... statemen...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13678 **[Test build #60545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60545/consoleFull)** for PR 13678 at commit [`2639bc4`](https://github.com/apache/spark/commit/2639bc4e3f468678b089e40d36ea3e6d60cdc0db).
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13673 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60541/ Test PASSed.
[GitHub] spark pull request #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pat...
Github user dafrista commented on a diff in the pull request: https://github.com/apache/spark/pull/13676#discussion_r67092384 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- (quoting the same removed `unwrap` hunk shown in the earlier comment on this diff)
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60541 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60541/consoleFull)** for PR 13673 at commit [`010423f`](https://github.com/apache/spark/commit/010423f5ba300c9229b82343463147d733920a4a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... s...
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/13678

[SPARK-15824][SQL] Execute WITH ... INSERT ... statements immediately

## What changes were proposed in this pull request?
We currently execute `INSERT` commands immediately when they are issued. This is not the case as soon as we use a `WITH` to define common table expressions, for example:
```sql
WITH tbl AS (SELECT * FROM x WHERE id = 10) INSERT INTO y SELECT * FROM tbl
```
This PR fixes this problem. This PR closes https://github.com/apache/spark/pull/13561 (which fixes an instance of this problem in the ThriftServer).

## How was this patch tested?
Added a test to `InsertSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hvanhovell/spark SPARK-15824

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13678.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #13678

commit 2639bc4e3f468678b089e40d36ea3e6d60cdc0db
Author: Herman van Hovell
Date: 2016-06-15T02:40:25Z

Check the analyzed plan for side effects instead of the logical plan.
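The idea in the commit message ("check the analyzed plan for side effects instead of the logical plan") can be sketched with a toy plan model. The class names below are illustrative stand-ins, not Spark's actual `LogicalPlan` hierarchy: a command check against the raw parse tree misses an `INSERT` nested under a CTE node, while the same check against the analyzed plan, where the `WITH` has been resolved away, finds it.

```scala
// Toy logical-plan model; not Spark's actual classes.
sealed trait Plan
case class Select(table: String) extends Plan
case class Insert(table: String, query: Plan) extends Plan
case class With(ctes: Map[String, Plan], child: Plan) extends Plan

object EagerExecSketch {
  // Grossly simplified "analysis": resolving the CTE surfaces the child.
  def analyze(plan: Plan): Plan = plan match {
    case With(_, child) => analyze(child)
    case other          => other
  }

  // The eager-execution check: is this plan a side-effecting command?
  def isCommand(plan: Plan): Boolean = plan.isInstanceOf[Insert]

  def main(args: Array[String]): Unit = {
    // WITH tbl AS (SELECT * FROM x ...) INSERT INTO y SELECT * FROM tbl
    val parsed = With(Map("tbl" -> Select("x")), Insert("y", Select("tbl")))
    println(isCommand(parsed))          // false: the WITH node hides the INSERT
    println(isCommand(analyze(parsed))) // true: the analyzed plan exposes it
  }
}
```

This is why checking the unresolved parse tree made plain `INSERT` eager but left `WITH ... INSERT` lazy, and why the fix moves the check to after analysis.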
[GitHub] spark pull request #13665: [SPARK-15935][PySpark]Fix a wrong format tag in t...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13665
[GitHub] spark issue #13663: [SPARK-15950][SQL] Eliminate unreachable code at project...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13663 @cloud-fan and @davies, thank you for your comments. A global variable is used only for item 1. I will address item 1 in another PR using a different approach that avoids a global variable. This PR will focus on items 2 and 3, which can be addressed without a global variable.
[GitHub] spark issue #13665: [SPARK-15935][PySpark]Fix a wrong format tag in the erro...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/13665 Thanks. Merging into master and 2.0.
[GitHub] spark pull request #13631: [SPARK-15911][SQL] Remove the additional Project ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/13631#discussion_r67091914

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---

@@ -470,20 +470,16 @@ class Analyzer(
           // Assume partition columns are correctly placed at the end of the child's output
           i.copy(table = EliminateSubqueryAliases(table))
         } else {
-          // Set up the table's partition scheme with all dynamic partitions by moving partition
-          // columns to the end of the column list, in partition order.
-          val (inputPartCols, columns) = child.output.partition { attr =>
-            tablePartitionNames.contains(attr.name)
-          }
           // All partition columns are dynamic because this InsertIntoTable had no partitioning
-          val partColumns = tablePartitionNames.map { name =>
-            inputPartCols.find(_.name == name).getOrElse(
-              throw new AnalysisException(s"Cannot find partition column $name"))
+          tablePartitionNames.filterNot { name =>
+            child.output.exists(_.name == name)

--- End diff --

hmm, indeed. As we use ordering, not names, this check is not needed anymore. I will remove it.
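The exchange above can be sketched concretely. The column names below are hypothetical and the code mirrors the idea rather than the Analyzer's actual implementation: once the child's output is assumed to carry the partition columns at the end, in partition order, a positional slice replaces the by-name lookup, and the `Cannot find partition column` check becomes redundant.

```scala
object PartitionColsSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical child output: data columns first, partition columns last.
    val childOutput    = Seq("id", "value", "year", "month")
    val partitionNames = Seq("year", "month")

    // By name: look up each partition column, failing if one is absent.
    val byName = partitionNames.map { n =>
      childOutput.find(_ == n)
        .getOrElse(sys.error(s"Cannot find partition column $n"))
    }

    // By position: rely on ordering and just take the trailing columns.
    val byPosition = childOutput.takeRight(partitionNames.length)

    println(byName == byPosition) // both approaches yield the same columns here
  }
}
```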
[GitHub] spark issue #13677: [SPARK 15926] Improve readability of DAGScheduler stage ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13677 Merged build finished. Test FAILed.
[GitHub] spark issue #13677: [SPARK 15926] Improve readability of DAGScheduler stage ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13677 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60540/ Test FAILed.
[GitHub] spark issue #13677: [SPARK 15926] Improve readability of DAGScheduler stage ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13677 **[Test build #60540 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60540/consoleFull)** for PR 13677 at commit [`1b7a338`](https://github.com/apache/spark/commit/1b7a3386fb631f1a41ea73d6bfdc9961ed8e4234). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pat...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13676#discussion_r67091024

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala ---
@@ -243,137 +243,12 @@ private[hive] trait HiveInspectors {
    * @param data the data in Hive type
    * @param oi the ObjectInspector associated with the Hive Type
    * @return convert the data into catalyst type
-   * TODO return the function of (data => Any) instead for performance consideration
    *
-   * Strictly follows the following order in unwrapping (constant OI has the higher priority):
-   * Constant Null object inspector =>
-   *   return null
-   * Constant object inspector =>
-   *   extract the value from constant object inspector
-   * Check whether the `data` is null =>
-   *   return null if true
-   * If object inspector prefers writable =>
-   *   extract writable from `data` and then get the catalyst type from the writable
-   * Extract the java object directly from the object inspector
-   *
-   * NOTICE: the complex data type requires recursive unwrapping.
+   * Use unwrapperFor's (data => Any) instead for performance consideration.
    */
-  def unwrap(data: Any, oi: ObjectInspector): Any = oi match {
-    case coi: ConstantObjectInspector if coi.getWritableConstantValue == null => null
-    case poi: WritableConstantStringObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.toString)
-    case poi: WritableConstantHiveVarcharObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.getHiveVarchar.getValue)
-    case poi: WritableConstantHiveCharObjectInspector =>
-      UTF8String.fromString(poi.getWritableConstantValue.getHiveChar.getValue)
-    case poi: WritableConstantHiveDecimalObjectInspector =>
-      HiveShim.toCatalystDecimal(
-        PrimitiveObjectInspectorFactory.javaHiveDecimalObjectInspector,
-        poi.getWritableConstantValue.getHiveDecimal)
-    case poi: WritableConstantTimestampObjectInspector =>
-      val t = poi.getWritableConstantValue
-      t.getSeconds * 1000000L + t.getNanos / 1000L
-    case poi: WritableConstantIntObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantDoubleObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantBooleanObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantLongObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantFloatObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantShortObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantByteObjectInspector =>
-      poi.getWritableConstantValue.get()
-    case poi: WritableConstantBinaryObjectInspector =>
-      val writable = poi.getWritableConstantValue
-      val temp = new Array[Byte](writable.getLength)
-      System.arraycopy(writable.getBytes, 0, temp, 0, temp.length)
-      temp
-    case poi: WritableConstantDateObjectInspector =>
-      DateTimeUtils.fromJavaDate(poi.getWritableConstantValue.get())
-    case mi: StandardConstantMapObjectInspector =>
-      // take the value from the map inspector object, rather than the input data
-      val keyValues = mi.getWritableConstantValue.asScala.toSeq
-      val keys = keyValues.map(kv => unwrap(kv._1, mi.getMapKeyObjectInspector)).toArray
-      val values = keyValues.map(kv => unwrap(kv._2, mi.getMapValueObjectInspector)).toArray
-      ArrayBasedMapData(keys, values)
-    case li: StandardConstantListObjectInspector =>
-      // take the value from the list inspector object, rather than the input data
-      val values = li.getWritableConstantValue.asScala
-        .map(unwrap(_, li.getListElementObjectInspector))
-        .toArray
-      new GenericArrayData(values)
-    // if the value is null, we don't care about the object inspector type
-    case _ if data == null => null
-    case poi: VoidObjectInspector => null // always be null for void object inspector
-    case pi: PrimitiveObjectInspector => pi match {
-      // We think HiveVarchar/HiveChar is also a String
-      case hvoi: HiveVarcharObjectInspector if hvoi.preferWritable() =>
-        UTF8String.fromString(hvoi.getPrimitiveWritableObject(data).getHiveVarchar.getValue)
-      case hvoi: HiveVarcharObjectInspector =>
-        UTF8String.fromString(hvoi.getPrimitiveJavaObject(data).getValue)
-      case hvoi: HiveCharObjectInspector if hvoi.preferWritable() =>
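The refactor quoted above replaces a per-value pattern match (`unwrap(data, oi)`) with a converter resolved once per ObjectInspector (`unwrapperFor`), so the inspector dispatch is paid once per column instead of once per row. A minimal sketch of that "resolve once, apply per row" pattern — the `Inspector` types and `Unwrappers` object below are simplified stand-ins, not Spark's actual API:

```scala
// Sketch of the pattern behind unwrapperFor: dispatch on the inspector
// once, return a closure, and apply only that closure per row.
sealed trait Inspector
case object IntInspector extends Inspector
case object StringInspector extends Inspector

object Unwrappers {
  // Resolved once per column: the match happens here, not on every value.
  def unwrapperFor(oi: Inspector): Any => Any = oi match {
    case IntInspector    => (v: Any) => v.asInstanceOf[Int]
    case StringInspector => (v: Any) => v.asInstanceOf[String].trim
  }

  // Per-row work is just function application.
  def unwrapRow(row: Seq[Any], unwrappers: Seq[Any => Any]): Seq[Any] =
    row.zip(unwrappers).map { case (v, f) => f(v) }
}
```

In the hot path this trades a match over dozens of inspector types for a virtual call, which is the performance consideration the updated scaladoc refers to.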
[GitHub] spark issue #13668: [SPARK-15915][SQL] Logical plans should use subqueries e...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13668 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60539/ Test PASSed.
[GitHub] spark issue #13668: [SPARK-15915][SQL] Logical plans should use subqueries e...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13668 Merged build finished. Test PASSed.
[GitHub] spark issue #13668: [SPARK-15915][SQL] Logical plans should use subqueries e...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13668 **[Test build #60539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60539/consoleFull)** for PR 13668 at commit [`e51774f`](https://github.com/apache/spark/commit/e51774f1eb73ddcb820619addd8be18894bba5e6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #13662: [SPARK-15945] [MLLIB] Conversion between old/new ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13662
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60544 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60544/consoleFull)** for PR 13673 at commit [`90ed61c`](https://github.com/apache/spark/commit/90ed61ca579cfc94eee4492f56ad3c43c54e57dd).
[GitHub] spark issue #13665: [SPARK-15935][PySpark]Fix a wrong format tag in the erro...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13665 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60537/ Test PASSed.
[GitHub] spark issue #13665: [SPARK-15935][PySpark]Fix a wrong format tag in the erro...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13665 Merged build finished. Test PASSed.
[GitHub] spark issue #13620: [SPARK-15590] [WEBUI] Paginate Job Table in Jobs tab
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13620 **[Test build #60543 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60543/consoleFull)** for PR 13620 at commit [`41bb189`](https://github.com/apache/spark/commit/41bb18976bb04e0af43c3464300fae204a111f07).
[GitHub] spark issue #13665: [SPARK-15935][PySpark]Fix a wrong format tag in the erro...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13665 **[Test build #60537 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60537/consoleFull)** for PR 13665 at commit [`152d222`](https://github.com/apache/spark/commit/152d2223bc38b92488e289e65533b4bbec1cf0a6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #13620: [SPARK-15590] [WEBUI] Paginate Job Table in Jobs ...
Github user nblintao commented on a diff in the pull request: https://github.com/apache/spark/pull/13620#discussion_r67087787

--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala ---
@@ -369,3 +361,246 @@ private[ui] class AllJobsPage(parent: JobsTab) extends WebUIPage("") {
     }
   }
 }
+
+private[ui] class JobTableRowData(
+    val jobData: JobUIData,
+    val lastStageName: String,
+    val lastStageDescription: String,
+    val duration: Long,
+    val formattedDuration: String,
+    val submissionTime: Long,
+    val formattedSubmissionTime: String,
+    val jobDescription: NodeSeq,
+    val detailUrl: String)
+
+private[ui] class JobDataSource(
+    jobs: Seq[JobUIData],
+    stageIdToInfo: HashMap[Int, StageInfo],
+    stageIdToData: HashMap[(Int, Int), StageUIData],
+    basePath: String,
+    currentTime: Long,
+    pageSize: Int,
+    sortColumn: String,
+    desc: Boolean) extends PagedDataSource[JobTableRowData](pageSize) {
+
+  // Convert JobUIData to JobTableRowData which contains the final contents to show in the table
+  // so that we can avoid creating duplicate contents during sorting the data
+  private val data = jobs.map(jobRow).sorted(ordering(sortColumn, desc))
+
+  private var _slicedJobIds: Set[Int] = null
+
+  override def dataSize: Int = data.size
+
+  override def sliceData(from: Int, to: Int): Seq[JobTableRowData] = {
+    val r = data.slice(from, to)
+    _slicedJobIds = r.map(_.jobData.jobId).toSet
+    r
+  }
+
+  def slicedJobIds: Set[Int] = _slicedJobIds
+
+  private def getLastStageNameAndDescription(job: JobUIData): (String, String) = {
+    val lastStageInfo = Option(job.stageIds)
+      .filter(_.nonEmpty)
+      .flatMap { ids => stageIdToInfo.get(ids.max) }
+    val lastStageData = lastStageInfo.flatMap { s =>
+      stageIdToData.get((s.stageId, s.attemptId))
+    }
+    val name = lastStageInfo.map(_.name).getOrElse("(Unknown Stage Name)")
+    val description = lastStageData.flatMap(_.description).getOrElse("")
+    (name, description)
+  }
+
+  private def jobRow(jobData: JobUIData): JobTableRowData = {
+    val (lastStageName, lastStageDescription) = getLastStageNameAndDescription(jobData)
+    val duration: Option[Long] = {
+      jobData.submissionTime.map { start =>
+        val end = jobData.completionTime.getOrElse(System.currentTimeMillis())
+        end - start
+      }
+    }
+    val formattedDuration = duration.map(d => UIUtils.formatDuration(d)).getOrElse("Unknown")
+    val submissionTime = jobData.submissionTime
+    val formattedSubmissionTime = submissionTime.map(UIUtils.formatDate).getOrElse("Unknown")
+    val jobDescription = UIUtils.makeDescription(lastStageDescription, basePath, plainText = false)
+
+    val detailUrl = "%s/jobs/job?id=%s".format(basePath, jobData.jobId)
+
+    new JobTableRowData(
+      jobData,
+      lastStageName,
+      lastStageDescription,
+      duration.getOrElse(-1),
+      formattedDuration,
+      submissionTime.getOrElse(-1),
+      formattedSubmissionTime,
+      jobDescription,
+      detailUrl
+    )
+  }
+
+  /**
+   * Return Ordering according to sortColumn and desc
+   */
+  private def ordering(sortColumn: String, desc: Boolean): Ordering[JobTableRowData] = {
+    val ordering = sortColumn match {
+      case "Job Id" | "Job Id (Job Group)" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.Int.compare(x.jobData.jobId, y.jobData.jobId)
+      }
+      case "Description" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.String.compare(x.lastStageDescription, y.lastStageDescription)
+      }
+      case "Submitted" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.Long.compare(x.submissionTime, y.submissionTime)
+      }
+      case "Duration" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.Long.compare(x.duration, y.duration)
+      }
+      case "Stages: Succeeded/Total" | "Tasks (for all stages): Succeeded/Total" =>
+        throw new IllegalArgumentException(s"Unsortable column: $sortColumn")
+      case unknownColumn => throw new IllegalArgumentException(s"Unknown column: $unknownColumn")
+    }
+    if (desc) {
+      ordering.reverse
+    } else {
+      ordering
+    }
+  }
[GitHub] spark pull request #13620: [SPARK-15590] [WEBUI] Paginate Job Table in Jobs ...
Github user nblintao commented on a diff in the pull request: https://github.com/apache/spark/pull/13620#discussion_r67087737

--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala ---
@@ -369,3 +361,246 @@ private[ui] class AllJobsPage(parent: JobsTab) extends WebUIPage("") {
     }
   }
 }
+
+private[ui] class JobTableRowData(
+    val jobData: JobUIData,
+    val lastStageName: String,
+    val lastStageDescription: String,
+    val duration: Long,
+    val formattedDuration: String,
+    val submissionTime: Long,
+    val formattedSubmissionTime: String,
+    val jobDescription: NodeSeq,
+    val detailUrl: String)
+
+private[ui] class JobDataSource(
+    jobs: Seq[JobUIData],
+    stageIdToInfo: HashMap[Int, StageInfo],
+    stageIdToData: HashMap[(Int, Int), StageUIData],
+    basePath: String,
+    currentTime: Long,
+    pageSize: Int,
+    sortColumn: String,
+    desc: Boolean) extends PagedDataSource[JobTableRowData](pageSize) {
+
+  // Convert JobUIData to JobTableRowData which contains the final contents to show in the table
+  // so that we can avoid creating duplicate contents during sorting the data
+  private val data = jobs.map(jobRow).sorted(ordering(sortColumn, desc))
+
+  private var _slicedJobIds: Set[Int] = null
+
+  override def dataSize: Int = data.size
+
+  override def sliceData(from: Int, to: Int): Seq[JobTableRowData] = {
+    val r = data.slice(from, to)
+    _slicedJobIds = r.map(_.jobData.jobId).toSet
+    r
+  }
+
+  def slicedJobIds: Set[Int] = _slicedJobIds
+
+  private def getLastStageNameAndDescription(job: JobUIData): (String, String) = {
+    val lastStageInfo = Option(job.stageIds)
+      .filter(_.nonEmpty)
+      .flatMap { ids => stageIdToInfo.get(ids.max) }
+    val lastStageData = lastStageInfo.flatMap { s =>
+      stageIdToData.get((s.stageId, s.attemptId))
+    }
+    val name = lastStageInfo.map(_.name).getOrElse("(Unknown Stage Name)")
+    val description = lastStageData.flatMap(_.description).getOrElse("")
+    (name, description)
+  }
+
+  private def jobRow(jobData: JobUIData): JobTableRowData = {
+    val (lastStageName, lastStageDescription) = getLastStageNameAndDescription(jobData)
+    val duration: Option[Long] = {
+      jobData.submissionTime.map { start =>
+        val end = jobData.completionTime.getOrElse(System.currentTimeMillis())
+        end - start
+      }
+    }
+    val formattedDuration = duration.map(d => UIUtils.formatDuration(d)).getOrElse("Unknown")
+    val submissionTime = jobData.submissionTime
+    val formattedSubmissionTime = submissionTime.map(UIUtils.formatDate).getOrElse("Unknown")
+    val jobDescription = UIUtils.makeDescription(lastStageDescription, basePath, plainText = false)
+
+    val detailUrl = "%s/jobs/job?id=%s".format(basePath, jobData.jobId)
+
+    new JobTableRowData(
+      jobData,
+      lastStageName,
+      lastStageDescription,
+      duration.getOrElse(-1),
+      formattedDuration,
+      submissionTime.getOrElse(-1),
+      formattedSubmissionTime,
+      jobDescription,
+      detailUrl
+    )
+  }
+
+  /**
+   * Return Ordering according to sortColumn and desc
+   */
+  private def ordering(sortColumn: String, desc: Boolean): Ordering[JobTableRowData] = {
+    val ordering = sortColumn match {
+      case "Job Id" | "Job Id (Job Group)" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.Int.compare(x.jobData.jobId, y.jobData.jobId)
+      }
+      case "Description" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.String.compare(x.lastStageDescription, y.lastStageDescription)
+      }
+      case "Submitted" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.Long.compare(x.submissionTime, y.submissionTime)
+      }
+      case "Duration" => new Ordering[JobTableRowData] {
+        override def compare(x: JobTableRowData, y: JobTableRowData): Int =
+          Ordering.Long.compare(x.duration, y.duration)
+      }
+      case "Stages: Succeeded/Total" | "Tasks (for all stages): Succeeded/Total" =>
+        throw new IllegalArgumentException(s"Unsortable column: $sortColumn")
+      case unknownColumn => throw new IllegalArgumentException(s"Unknown column: $unknownColumn")
+    }
+    if (desc) {
+      ordering.reverse
+    } else {
+      ordering
+    }
+  }
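The paged-table code quoted above follows a common pattern: convert rows to a presentation type once, sort them with an `Ordering` keyed by column name, then serve each page as a slice of the sorted sequence. A self-contained sketch of that pattern under simplified, hypothetical names (`JobRow`, `PagedJobSource` — not Spark's actual classes):

```scala
// Minimal sketch of a sorted, paged data source: sort once up front,
// then each page request is just a slice.
case class JobRow(id: Int, durationMs: Long)

class PagedJobSource(rows: Seq[JobRow], pageSize: Int, sortColumn: String, desc: Boolean) {

  // Column name -> Ordering; unknown columns fail fast, as in the PR.
  private def ordering(col: String): Ordering[JobRow] = col match {
    case "Job Id"   => Ordering.by((r: JobRow) => r.id)
    case "Duration" => Ordering.by((r: JobRow) => r.durationMs)
    case other      => throw new IllegalArgumentException(s"Unknown column: $other")
  }

  // Sorted exactly once, so re-rendering a page never re-sorts.
  private val data: Seq[JobRow] = {
    val ord = ordering(sortColumn)
    rows.sorted(if (desc) ord.reverse else ord)
  }

  def dataSize: Int = data.size

  // 1-based page index, mirroring typical UI pagination.
  def page(n: Int): Seq[JobRow] = data.slice((n - 1) * pageSize, n * pageSize)
}
```

Sorting eagerly in the constructor is the same design choice as `JobDataSource`: it avoids rebuilding row contents during sorting at the cost of doing the conversion even for rows the user never pages to.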
[GitHub] spark pull request #13671: [SPARK-15952] [SQL] fix "show databases" ordering...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13671
[GitHub] spark issue #13671: [SPARK-15952] [SQL] fix "show databases" ordering issue
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13671 Merging in master/2.0.
[GitHub] spark issue #13662: [SPARK-15945] [MLLIB] Conversion between old/new vector ...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13662 It looks like the merge script is not happy, I will retry later.
[GitHub] spark issue #13662: [SPARK-15945] [MLLIB] Conversion between old/new vector ...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13662 LGTM, merged into master and branch-2.0. Thanks!
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60542 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60542/consoleFull)** for PR 13673 at commit [`6d1501c`](https://github.com/apache/spark/commit/6d1501c35bdf8ff7c2ee5af3c373e8ebbdbe).
[GitHub] spark pull request #13498: [SPARK-15011][SQL] Re-enable 'analyze MetastoreRe...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13498
[GitHub] spark pull request #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed Contin...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/13673#discussion_r67086434

--- Diff: python/pyspark/sql/readwriter.py ---
@@ -28,7 +28,7 @@
 from pyspark.sql.types import *
 from pyspark.sql import utils

-__all__ = ["DataFrameReader", "DataFrameWriter"]
+__all__ = ["DataFrameReader", "DataFrameWriter", "DataStreamReader", "DataStreamWriter"]
--- End diff --

Based on comment in #13653
[GitHub] spark pull request #13653: [SPARK-15933][SQL][STREAMING] Refactored DF reade...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/13653#discussion_r67086340

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala ---
@@ -40,6 +40,8 @@ class StreamingAggregationSuite extends StreamTest with BeforeAndAfterAll {

   import testImplicits._

+
--- End diff --

Done in #13673
[GitHub] spark pull request #13653: [SPARK-15933][SQL][STREAMING] Refactored DF reade...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/13653#discussion_r67086347

--- Diff: python/pyspark/sql/readwriter.py ---
@@ -905,6 +764,503 @@ def jdbc(self, url, table, mode=None, properties=None):
         self._jwrite.mode(mode).jdbc(url, table, jprop)

+class DataStreamReader(object):
--- End diff --

Done in #13673
[GitHub] spark issue #13498: [SPARK-15011][SQL] Re-enable 'analyze MetastoreRelations...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13498 Merging in master/2.0.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/13673 LGTM
[GitHub] spark issue #13674: [MINOR][DOCS][SQL] Fix some comments about types(TypeCoe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13674 Merged build finished. Test PASSed.
[GitHub] spark issue #13674: [MINOR][DOCS][SQL] Fix some comments about types(TypeCoe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13674 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60534/ Test PASSed.
[GitHub] spark issue #13674: [MINOR][DOCS][SQL] Fix some comments about types(TypeCoe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13674 **[Test build #60534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60534/consoleFull)** for PR 13674 at commit [`421bf89`](https://github.com/apache/spark/commit/421bf890809edf96ee9a9dc8978e99abd91f0c7a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13651 Merged build finished. Test FAILed.
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13651 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60536/ Test FAILed.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13673 **[Test build #60541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60541/consoleFull)** for PR 13673 at commit [`010423f`](https://github.com/apache/spark/commit/010423f5ba300c9229b82343463147d733920a4a).
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13651 **[Test build #60536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60536/consoleFull)** for PR 13651 at commit [`ff3aa3b`](https://github.com/apache/spark/commit/ff3aa3b891960755dc6134939d6eb8c64e5120e9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13673: [WIP][SPARK-15953][SQL][STREAMING] Renamed ContinuousQue...
Github user tdas commented on the issue: https://github.com/apache/spark/pull/13673 @zsxwing @marmbrus Could you take a look?
[GitHub] spark issue #13677: [SPARK 15926] Improve readability of DAGScheduler stage ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13677 **[Test build #60540 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60540/consoleFull)** for PR 13677 at commit [`1b7a338`](https://github.com/apache/spark/commit/1b7a3386fb631f1a41ea73d6bfdc9961ed8e4234).
[GitHub] spark pull request #13653: [SPARK-15933][SQL][STREAMING] Refactored DF reade...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13653
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13651 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60535/ Test FAILed.
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13651 Merged build finished. Test FAILed.
[GitHub] spark issue #5227: [SPARK-6435] spark-shell --jars option does not add all j...
Github user liukun1016 commented on the issue: https://github.com/apache/spark/pull/5227 The same issue happens on Mac OS: if multiple jar paths are specified, only the first one is visible. I don't know whether this issue was ever filed and resolved.
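For context on the comment above, `--jars` expects a single comma-separated list of paths; a classpath-style separator or repeated flags can leave only the first jar visible. The jar paths below are hypothetical, and this is a usage sketch rather than a diagnosis of the reported bug:

```
# --jars takes one comma-separated list of jar paths:
spark-shell --jars /path/to/first.jar,/path/to/second.jar

# A common mistake is a colon-separated, classpath-style value,
# which Spark treats as a single (nonexistent) path:
#   spark-shell --jars /path/to/first.jar:/path/to/second.jar
```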
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13651 **[Test build #60535 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60535/consoleFull)** for PR 13651 at commit [`c7e0b33`](https://github.com/apache/spark/commit/c7e0b33307582505c5d72ed16a29184860c2eee0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13677: [SPARK 15926] Improve readability of DAGScheduler...
GitHub user kayousterhout opened a pull request: https://github.com/apache/spark/pull/13677 [SPARK 15926] Improve readability of DAGScheduler stage creation methods

## What changes were proposed in this pull request?

This pull request refactors parts of the DAGScheduler to improve readability, focusing on the code around stage creation. One goal of this change is to make it clearer which functions may create new stages (as opposed to looking up stages that already exist). There are no functionality changes in this pull request. In more detail:

* shuffleToMapStage was renamed to shuffleIdToMapStage (when reading the existing code I have sometimes struggled to remember what the key is -- is it a stage? A stage id? This change is intended to avoid that confusion).
* Cleaned up the code to create shuffle map stages. Previously, creating a shuffle map stage involved 3 different functions (newOrUsedShuffleStage, newShuffleMapStage, and getShuffleMapStage), and it wasn't clear what the purpose of each function was. With the new code, a single function (getOrCreateShuffleMapStage) is responsible for getting a stage (if it already exists) or creating new shuffle map stages and any missing ancestor stages, and it delegates to createShuffleMapStage when new stages need to be created. There's some remaining confusion here because the getOrCreateParentStages call in createShuffleMapStage may recursively create ancestor stages; this is an issue I plan to fix in a future pull request, because it's trickier to fix and involves a slight functionality change.
* newResultStage was renamed to createResultStage, for consistency with naming around shuffle map stages.
* getParentStages has been renamed to getOrCreateParentStages, to make it clear that this function will sometimes create missing ancestor stages.
* The only *slight* functionality change is that on line 478, updateJobIdStageIdMaps now uses a stage's parents instance variable rather than re-calculating them (I couldn't see any reason why they'd need to be re-calculated, and suspect this is just leftover from older code).
* getAncestorShuffleDependencies was renamed to getMissingAncestorShuffleDependencies, to make it clear that this only returns dependencies that have not yet been run.

cc @squito @markhamstra @JoshRosen (who requested more DAG scheduler commenting long ago -- an issue this pull request tries, in part, to address). FYI @rxin

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kayousterhout/spark-1 SPARK-15926

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13677.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #13677

commit 4dca8001cd987016fd683988632337ec64b67444 Author: Kay Ousterhout Date: 2016-06-15T00:34:50Z Renamed shuffleToMapStaqe in DAGScheduler. This commit renames shuffleToMapStage to shuffleIdToMapStage to make it clear that the shuffle id (and not the shuffle dependency, or the shuffle map stage, or the shuffle map stage id) is the key in the hash map.
commit 1468b91cc3ca493bc18e16525fe39444616f2bb2 Author: Kay Ousterhout Date: 2016-06-10T21:17:10Z Cleaned up the code to create new shuffle map stages

commit a1a51ac8be0e5c22917464e2e9389fe33351f0b0 Author: Kay Ousterhout Date: 2016-06-10T23:10:12Z Various comment and naming cleanups

commit 671d5de33c21dc279e6dda83dfa8366b8adf6f9c Author: Kay Ousterhout Date: 2016-06-11T00:22:59Z More cleanup of shuffle map stage creation

commit 43841a7350af38551e9c349ab2f22c7e7e0a444d Author: Kay Ousterhout Date: 2016-06-13T19:38:07Z Some more naming and commenting improvements

commit 49156f245baab7282575fe422f8e10906b571958 Author: Kay Ousterhout Date: 2016-06-15T00:42:04Z Moved createShuffleMapStage to be a non-nested function. I think having it nested made the code harder to read.

commit 1b7a3386fb631f1a41ea73d6bfdc9961ed8e4234 Author: Kay Ousterhout Date: 2016-06-15T00:48:37Z Renamed newResultStage to createResultStage and added commenting
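The get-or-create split the PR describes can be sketched with a small, self-contained model. `TinyScheduler`, `ShuffleMapStage`, and the field names below are illustrative stand-ins, not the actual DAGScheduler code; the point is the naming convention: `getOrCreateShuffleMapStage` is the only lookup entry point, keyed by shuffle id, and it delegates to `createShuffleMapStage` only when no stage exists yet:

```scala
import scala.collection.mutable

// Simplified stand-in for a shuffle map stage (not Spark's real class).
case class ShuffleMapStage(id: Int, shuffleId: Int)

class TinyScheduler {
  // Keyed by shuffle id -- the rename the PR argues for, so the key's
  // meaning is obvious at the call site.
  private val shuffleIdToMapStage = mutable.HashMap[Int, ShuffleMapStage]()
  private var nextStageId = 0

  // Look up an existing stage, or create one if none exists yet.
  def getOrCreateShuffleMapStage(shuffleId: Int): ShuffleMapStage =
    shuffleIdToMapStage.getOrElseUpdate(shuffleId, createShuffleMapStage(shuffleId))

  // Only ever creates; never consulted for lookups directly.
  private def createShuffleMapStage(shuffleId: Int): ShuffleMapStage = {
    val stage = ShuffleMapStage(nextStageId, shuffleId)
    nextStageId += 1
    stage
  }
}
```

Calling `getOrCreateShuffleMapStage` twice with the same shuffle id returns the same stage instance; a new shuffle id produces a fresh stage.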
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13651 Merged build finished. Test FAILed.
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13651 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60533/ Test FAILed.
[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13651 **[Test build #60533 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60533/consoleFull)** for PR 13651 at commit [`1787c05`](https://github.com/apache/spark/commit/1787c058ba34a5a9bfe2054aae76b6f87af14309). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13668: [SPARK-15915][SQL] Logical plans should use subqueries e...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13668 **[Test build #60539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60539/consoleFull)** for PR 13668 at commit [`e51774f`](https://github.com/apache/spark/commit/e51774f1eb73ddcb820619addd8be18894bba5e6).