[GitHub] AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464304557 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102409/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464304555 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464304557 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102409/ Test FAILed.
[GitHub] AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464304555 Merged build finished. Test FAILed.
[GitHub] SparkQA removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
SparkQA removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464278279 **[Test build #102409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102409/testReport)** for PR 23602 at commit [`3aad18a`](https://github.com/apache/spark/commit/3aad18a4ba96b5717c16ebc8a0d23b0a3986c634).
[GitHub] SparkQA commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
SparkQA commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464304377 **[Test build #102409 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102409/testReport)** for PR 23602 at commit [`3aad18a`](https://github.com/apache/spark/commit/3aad18a4ba96b5717c16ebc8a0d23b0a3986c634). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] HyukjinKwon closed pull request #23800: [SPARK-26673][FollowUp][SQL] File source V2: remove duplicated broadcast object in FileWriterFactory
HyukjinKwon closed pull request #23800: [SPARK-26673][FollowUp][SQL] File source V2: remove duplicated broadcast object in FileWriterFactory URL: https://github.com/apache/spark/pull/23800
[GitHub] HyukjinKwon commented on issue #23800: [SPARK-26673][FollowUp][SQL] File source V2: remove duplicated broadcast object in FileWriterFactory
HyukjinKwon commented on issue #23800: [SPARK-26673][FollowUp][SQL] File source V2: remove duplicated broadcast object in FileWriterFactory URL: https://github.com/apache/spark/pull/23800#issuecomment-464301095 Merged to master.
[GitHub] HyukjinKwon commented on a change in pull request #23799: [SPARK-26892]Fix saveAsTextFile throws NullPointerException when null row present
HyukjinKwon commented on a change in pull request #23799: [SPARK-26892]Fix saveAsTextFile throws NullPointerException when null row present
URL: https://github.com/apache/spark/pull/23799#discussion_r257449106

## File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala ##
@@ -1507,7 +1507,8 @@ abstract class RDD[T: ClassTag](
     val r = this.mapPartitions { iter =>
       val text = new Text()
       iter.map { x =>
-        text.set(x.toString)
+        val value = if (x != null) x.toString else "Null"
+        text.set(value)

Review comment:
I would simply just add an assert or require.
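The trade-off behind this review comment can be sketched in Python (the patch itself is Scala, where `require` would play the role of the explicit check). This is only an illustration under stated assumptions: `to_text_lenient` and `to_text_strict` are hypothetical names and are not part of Spark. The sketch shows that substituting the string "Null" makes a missing row indistinguishable from a real row whose text is "Null", whereas a fail-fast check reports the bad input.

```python
def to_text_lenient(rows):
    """Hypothetical helper: mirror the patch by writing "Null" for missing rows."""
    return ["Null" if row is None else str(row) for row in rows]


def to_text_strict(rows):
    """Hypothetical helper: mirror the review suggestion by failing fast."""
    lines = []
    for index, row in enumerate(rows):
        if row is None:
            # Comparable in spirit to a Scala `require(x != null, ...)`.
            raise ValueError(f"row {index} is None; cannot convert to text")
        lines.append(str(row))
    return lines


# A missing row and a literal "Null" string produce the same output line.
print(to_text_lenient([None, "Null"]))

try:
    to_text_strict(["a", None])
except ValueError as error:
    print("rejected:", error)
```

Which behavior is preferable depends on whether null rows are considered valid input for `saveAsTextFile`; the thread does not settle that here.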
[GitHub] HyukjinKwon commented on a change in pull request #23797: [WIP][SPARK-26856][PYSPARK] Python support for from_avro and to_avro APIs
HyukjinKwon commented on a change in pull request #23797: [WIP][SPARK-26856][PYSPARK] Python support for from_avro and to_avro APIs
URL: https://github.com/apache/spark/pull/23797#discussion_r257448655

## File path: docs/sql-data-sources-avro.md ##
@@ -137,6 +137,37 @@ StreamingQuery query = output
   .option("topic", "topic2")
   .start();
+{% endhighlight %}
+
+
+{% highlight python %}
+from pyspark.sql.functions import from_avro, to_avro
+
+# `from_avro` requires Avro schema in JSON string format.
+jsonFormatSchema = open("examples/src/main/resources/user.avsc", "r").read()
+
+df = spark
+  .readStream
+  .format("kafka")
+  .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
+  .option("subscribe", "topic1")
+  .load()
+
+# 1. Decode the Avro data into a struct;
+# 2. Filter by column `favorite_color`;
+# 3. Encode the column `name` in Avro format.
+output = df
+  .select(from_avro("value", jsonFormatSchema).alias("user"))
+  .where("user.favorite_color == \"red\"")

Review comment:
not a big deal but maybe `'user.favorite_color == "red"'`
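As a quick check of the suggestion above, the two spellings denote the same Python string, so switching to single quotes only removes the escapes. The variable names below are illustrative and do not come from the PR.

```python
# Escaped double quotes inside a double-quoted literal, as in the diff.
escaped = "user.favorite_color == \"red\""

# The reviewer's alternative: single-quote the literal, no escapes needed.
single_quoted = 'user.favorite_color == "red"'

print(escaped == single_quoted)
```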
[GitHub] HyukjinKwon commented on a change in pull request #23797: [WIP][SPARK-26856][PYSPARK] Python support for from_avro and to_avro APIs
HyukjinKwon commented on a change in pull request #23797: [WIP][SPARK-26856][PYSPARK] Python support for from_avro and to_avro APIs
URL: https://github.com/apache/spark/pull/23797#discussion_r257448637

## File path: docs/sql-data-sources-avro.md ##
@@ -137,6 +137,37 @@ StreamingQuery query = output
   .option("topic", "topic2")
   .start();
+{% endhighlight %}
+
+
+{% highlight python %}
+from pyspark.sql.functions import from_avro, to_avro
+
+# `from_avro` requires Avro schema in JSON string format.
+jsonFormatSchema = open("examples/src/main/resources/user.avsc", "r").read()
+
+df = spark
+  .readStream

Review comment:
nit: I think it needs `\` for each line.
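The nit above can be checked without Spark: Python ends a statement at the newline unless the line ends with `\` or the expression sits inside brackets. The sketch below only compiles small source snippets (it never evaluates `spark`), and the helper name `parses` is an assumption made for illustration.

```python
def parses(source):
    """Return True if `source` compiles as a Python module."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


# The chained call as written in the docs diff: no continuation markers.
bare = "\n".join(["df = spark", "    .readStream", '    .format("kafka")'])

# The reviewer's fix: a trailing backslash on every continued line.
backslashes = "\n".join(["df = spark \\", "    .readStream \\", '    .format("kafka")'])

# A common alternative: wrap the whole chain in parentheses.
parenthesized = "\n".join(["df = (", "    spark", "    .readStream", '    .format("kafka")', ")"])

for name, source in [("bare", bare), ("backslashes", backslashes), ("parenthesized", parenthesized)]:
    print(name, parses(source))
```

Either continuation style works; the parenthesized form is not what the reviewer asked for, just another option that avoids trailing backslashes.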
[GitHub] HyukjinKwon commented on a change in pull request #23797: [WIP][SPARK-26856][PYSPARK] Python support for from_avro and to_avro APIs
HyukjinKwon commented on a change in pull request #23797: [WIP][SPARK-26856][PYSPARK] Python support for from_avro and to_avro APIs
URL: https://github.com/apache/spark/pull/23797#discussion_r257448673

## File path: python/pyspark/sql/functions.py ##
@@ -2402,6 +2402,64 @@ def to_csv(col, options={}):
     return Column(jc)
+@since(3.0)
+def from_avro(col, jsonFormatSchema, options={}):
+    """
+    Converts a binary column of avro format into its corresponding catalyst value. The specified
+    schema must match the read data, otherwise the behavior is undefined: it may fail or return
+    arbitrary result.
+
+    Avro is built-in but external data source module since Spark 2.4. Please deploy the application
+    as per the deployment section of "Apache Avro Data Source Guide".
+
+    :param data: the binary column.
+    :param jsonFormatSchema: the avro schema in JSON string format.
+    :param options: options to control how the Avro record is parsed.
+
+    >>> from pyspark.sql import Row
+    >>> from pyspark.sql.functions import from_avro, to_avro
+    >>> data = [(1, Row(name='Alice', age=2))]
+    >>> df = spark.createDataFrame(data, ("key", "value"))
+    >>> avroDf = df.select(to_avro(df.value).alias("avro"))
+    >>> avroDf.collect()
+    [Row(avro=bytearray(b'\\x00\\x00\\x04\\x00\\nAlice'))]
+    >>> jsonFormatSchema = '''{"type":"record","name":"topLevelRecord","fields":
+    ... [{"name":"avro","type":[{"type":"record","name":"value","namespace":"topLevelRecord",
+    ... "fields":[{"name":"age","type":["long","null"]},
+    ... {"name":"name","type":["string","null"]}]},"null"]}]}'''
+    >>> avroDf.select(from_avro(avroDf.avro, jsonFormatSchema).alias("value")).collect()
+    [Row(value=Row(avro=Row(age=2, name=u'Alice')))]
+    """
+
+    sc = SparkContext._active_spark_context
+    jc = sc._jvm.org.apache.spark.sql.avro.functions.from_avro(_to_java_column(col),
+                                                               jsonFormatSchema, options)

Review comment:
I believe the version below complies with PEP8.

```python
jc = sc._jvm.org.apache.spark.sql.avro.functions.from_avro(
    _to_java_column(col), jsonFormatSchema, options)
```
[GitHub] HeartSaVioR commented on issue #23706: [SPARK-26790][CORE] Change approach for retrieving executor logs and attributes: self-retrieve
HeartSaVioR commented on issue #23706: [SPARK-26790][CORE] Change approach for retrieving executor logs and attributes: self-retrieve URL: https://github.com/apache/spark/pull/23706#issuecomment-464294044 Thanks all for reviewing and merging!
[GitHub] BryanCutler commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data.
BryanCutler commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data.
URL: https://github.com/apache/spark/pull/23795#discussion_r257447143

## File path: python/pyspark/sql/types.py ##
@@ -1681,38 +1681,53 @@ def from_arrow_schema(arrow_schema):
          for field in arrow_schema])
-def _check_series_convert_date(series, data_type):
-    """
-    Cast the series to datetime.date if it's a date type, otherwise returns the original series.
+def _arrow_column_to_pandas(column, data_type):
+    """ Convert Arrow Column to pandas Series.
+
+    If the given column is a date type column, creates a series of datetime.date directly instead
+    of creating datetime64[ns] as intermediate data.
-    :param series: pandas.Series
-    :param data_type: a Spark data type for the series
+    :param series: pyarrow.lib.Column
+    :param data_type: a Spark data type for the column
     """
-    import pyarrow
+    import pandas as pd
+    import pyarrow as pa
     from distutils.version import LooseVersion
-    # As of Arrow 0.12.0, date_as_objects is True by default, see ARROW-3910
-    if LooseVersion(pyarrow.__version__) < LooseVersion("0.12.0") and type(data_type) == DateType:
-        return series.dt.date
+    # Since Arrow 0.11.0, support date_as_object to return datetime.date instead of np.datetime64.
+    if LooseVersion(pa.__version__) < LooseVersion("0.11.0"):
+        if type(data_type) == DateType:
+            return pd.Series(column.to_pylist(), name=column.name)
+        else:
+            return column.to_pandas()
     else:
-        return series
+        return column.to_pandas(date_as_object=True)
+
+def _arrow_table_to_pandas(table, schema):
+    """ Convert Arrow Table to pandas DataFrame.
-def _check_dataframe_convert_date(pdf, schema):
-    """ Correct date type value to use datetime.date.
+    If the given table contains a date type column, use `_arrow_column_to_pandas` for pyarrow<0.11
+    or use `date_as_object` option for pyarrow>=0.11 to avoid creating datetime64[ns] as
+    intermediate data.
     Pandas DataFrame created from PyArrow uses datetime64[ns] for date type values, but we should
     use datetime.date to match the behavior with when Arrow optimization is disabled.
-    :param pdf: pandas.DataFrame
-    :param schema: a Spark schema of the pandas.DataFrame
+    :param table: pyarrow.lib.Table
+    :param schema: a Spark schema of the pyarrow.lib.Table
     """
-    import pyarrow
+    import pandas as pd
+    import pyarrow as pa
     from distutils.version import LooseVersion
-    # As of Arrow 0.12.0, date_as_objects is True by default, see ARROW-3910
-    if LooseVersion(pyarrow.__version__) < LooseVersion("0.12.0"):
-        for field in schema:
-            pdf[field.name] = _check_series_convert_date(pdf[field.name], field.dataType)
-    return pdf
+    # Since Arrow 0.11.0, support date_as_object to return datetime.date instead of np.datetime64.
+    if LooseVersion(pa.__version__) < LooseVersion("0.11.0"):

Review comment:
It would be nice to bump to 0.12.0 because I think that would allow us to clean up the code the most, but since it's a raised error if the user doesn't have that version, it might be too restrictive. Let's definitely make a JIRA to discuss more.
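The version gate discussed in this review can be sketched without installing pyarrow. This is a simplified illustration, not the PR's code: `parse_version` and `pick_date_conversion` are hypothetical helpers, the returned strings only name the branch that the diff above would take, and a tuple comparison stands in for `distutils.version.LooseVersion`, since `distutils` was removed from the standard library in Python 3.12.

```python
def parse_version(text):
    """Hypothetical helper: turn a plain numeric version such as "0.11.0" into a tuple.

    Pre-release suffixes (for example "rc1") are not handled in this sketch.
    """
    return tuple(int(part) for part in text.split(".")[:3])


def pick_date_conversion(arrow_version, is_date_column):
    """Name the conversion path the diff would choose for one column."""
    if parse_version(arrow_version) < (0, 11, 0):
        if is_date_column:
            # Older Arrow: build datetime.date objects via a Python list.
            return "pd.Series(column.to_pylist(), name=column.name)"
        return "column.to_pandas()"
    # Arrow 0.11.0 or newer: ask for datetime.date objects directly.
    return "column.to_pandas(date_as_object=True)"


for version in ["0.9.0", "0.10.0", "0.11.0", "0.12.1"]:
    print(version, "->", pick_date_conversion(version, is_date_column=True))

# Comparing raw strings would misorder these versions, hence the parsing step.
print("0.9.0" < "0.11.0", parse_version("0.9.0") < parse_version("0.11.0"))
```

Raising the minimum supported version, as the comment proposes to discuss, would let the first branch be deleted entirely.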
[GitHub] AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464291740 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102408/ Test FAILed.
[GitHub] AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464291739 Build finished. Test FAILed.
[GitHub] SparkQA commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
SparkQA commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464291581 **[Test build #102408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102408/testReport)** for PR 17968 at commit [`311c94a`](https://github.com/apache/spark/commit/311c94a3d608b0b86f3ce39415639ec260e5af37). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464291739 Build finished. Test FAILed.
[GitHub] SparkQA removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
SparkQA removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271730 **[Test build #102408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102408/testReport)** for PR 17968 at commit [`311c94a`](https://github.com/apache/spark/commit/311c94a3d608b0b86f3ce39415639ec260e5af37).
[GitHub] AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464291740 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102408/ Test FAILed.
[GitHub] BryanCutler commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data.
BryanCutler commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data.
URL: https://github.com/apache/spark/pull/23795#discussion_r257446732

## File path: python/pyspark/sql/types.py ##
@@ -1681,38 +1681,53 @@ def from_arrow_schema(arrow_schema):
          for field in arrow_schema])
-def _check_series_convert_date(series, data_type):
-    """
-    Cast the series to datetime.date if it's a date type, otherwise returns the original series.
+def _arrow_column_to_pandas(column, data_type):
+    """ Convert Arrow Column to pandas Series.
+
+    If the given column is a date type column, creates a series of datetime.date directly instead
+    of creating datetime64[ns] as intermediate data.

Review comment:
It would be nice to say that for dates this will return `datetime.date`, but yeah maybe move the part about datetime[64] as intermediate to an internal comment. `_arrow_table_to_pandas` has a comment that the reason for this is to match pyspark w/o arrow, but maybe it would be good to add here as well.
[GitHub] yucai commented on a change in pull request #21149: [SPARK-24076][SQL] Use different seed in HashAggregate to avoid hash conflict
yucai commented on a change in pull request #21149: [SPARK-24076][SQL] Use different seed in HashAggregate to avoid hash conflict
URL: https://github.com/apache/spark/pull/21149#discussion_r257445460

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala ##
@@ -755,7 +755,10 @@ case class HashAggregateExec(
     }
     // generate hash code for key
-    val hashExpr = Murmur3Hash(groupingExpressions, 42)
+    // SPARK-24076: HashAggregate uses the same hash algorithm on the same expressions
+    // as ShuffleExchange, it may lead to bad hash conflict when shuffle.partitions=8192*n,
+    // pick a different seed to avoid this conflict
+    val hashExpr = Murmur3Hash(groupingExpressions, 48)

Review comment:
@cloud-fan you mean `unsafeRowKeys.hashCode()`, right? I think it is a good idea, unsafe row has [null bit set] etc., the result should be different, we don't need weird `48` also. Do you want me to create a followup PR?
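The conflict described in the code comment above can be reproduced with a toy model. Everything here is an assumption made for illustration: an MD5-based `seeded_hash` stands in for Murmur3, the shuffle is modeled as `hash % NUM_PARTITIONS`, the aggregation hash map as `hash % NUM_BUCKETS`, and the sizes are far smaller than the `8192*n` partitions mentioned in the patch so that the sketch runs quickly. It is not Spark's implementation.

```python
import hashlib

NUM_PARTITIONS = 64   # stand-in for spark.sql.shuffle.partitions
NUM_BUCKETS = 1024    # stand-in for the hash map capacity (a multiple of NUM_PARTITIONS)


def seeded_hash(key, seed):
    """Toy seeded hash (not Murmur3): first 4 bytes of MD5 over "seed:key"."""
    digest = hashlib.md5(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def buckets_used(keys, seed):
    """Count the distinct hash map buckets hit by `keys` under `seed`."""
    return len({seeded_hash(key, seed) % NUM_BUCKETS for key in keys})


# Keys that a shuffle using seed 42 would route to partition 0.
partition_keys = [k for k in range(100_000) if seeded_hash(k, 42) % NUM_PARTITIONS == 0]

# Reusing seed 42 inside the partition: every hash is a multiple of 64, so at most
# NUM_BUCKETS // NUM_PARTITIONS buckets can ever be hit.
same_seed = buckets_used(partition_keys, 42)

# A different seed (48 in the patch) decorrelates the two hashes.
other_seed = buckets_used(partition_keys, 48)

print(len(partition_keys), same_seed, other_seed)
```

The follow-up idea in the comment, hashing the unsafe row bytes instead, would likewise break the correlation without relying on a hand-picked seed.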
[GitHub] SparkQA commented on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc.
SparkQA commented on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc. URL: https://github.com/apache/spark/pull/23750#issuecomment-464281896 **[Test build #102410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102410/testReport)** for PR 23750 at commit [`edfe3d7`](https://github.com/apache/spark/commit/edfe3d7f1771ef72b6eae1e31840aca8b49eebf3).
[GitHub] AmplabJenkins removed a comment on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc.
AmplabJenkins removed a comment on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc. URL: https://github.com/apache/spark/pull/23750#issuecomment-464281636 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc.
AmplabJenkins commented on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc. URL: https://github.com/apache/spark/pull/23750#issuecomment-464281636 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc.
AmplabJenkins removed a comment on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc. URL: https://github.com/apache/spark/pull/23750#issuecomment-464281638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7990/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc.
AmplabJenkins commented on issue #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc. URL: https://github.com/apache/spark/pull/23750#issuecomment-464281638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7990/ Test PASSed.
[GitHub] dilipbiswal commented on issue #23780: [SPARK-26864][SQL][BACKPORT-2.4] Query may return incorrect result when python udf is used as a join condition and the udf uses attributes from both leg
dilipbiswal commented on issue #23780: [SPARK-26864][SQL][BACKPORT-2.4] Query may return incorrect result when python udf is used as a join condition and the udf uses attributes from both legs of left semi join URL: https://github.com/apache/spark/pull/23780#issuecomment-464281489 @cloud-fan Can this be merged now?
[GitHub] dilipbiswal commented on a change in pull request #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc.
dilipbiswal commented on a change in pull request #23750: [SPARK-19712][SQL] Pushing Left Semi and Left Anti joins through Project, Aggregate, Window, Union etc. URL: https://github.com/apache/spark/pull/23750#discussion_r257444587 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -1188,6 +1189,190 @@ object PushDownPredicate extends Rule[LogicalPlan] with PredicateHelper { } } +object PushDownLeftSemiAntiJoin extends Rule[LogicalPlan] with PredicateHelper { + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +// Similar to the above Filter over Project +// LeftSemi/LeftAnti over Project +case join @ Join(p @ Project(pList, gChild), rightOp, LeftSemiOrAnti(joinType), joinCond, hint) + if pList.forall(_.deterministic) && !ScalarSubquery.hasScalarSubquery(pList) && +canPushThroughCondition(Seq(gChild), joinCond, rightOp) => + if (joinCond.isEmpty) { +// No join condition, just push down the Join below Project +Project(pList, Join(gChild, rightOp, joinType, joinCond, hint)) + } else { +// Create a map of Aliases to their values from the child projection. +// e.g., 'SELECT a + b AS c, d ...' produces Map(c -> a + b). 
+val aliasMap = AttributeMap(pList.collect { + case a: Alias => (a.toAttribute, a.child) +}) +val newJoinCond = if (aliasMap.nonEmpty) { + Option(replaceAlias(joinCond.get, aliasMap)) +} else { + joinCond +} +Project(pList, Join(gChild, rightOp, joinType, newJoinCond, hint)) + } + +// Similar to the above Filter over Aggregate +// LeftSemi/LeftAnti over Aggregate +case join @ Join(aggregate: Aggregate, rightOp, LeftSemiOrAnti(joinType), joinCond, hint) + if aggregate.aggregateExpressions.forall(_.deterministic) +&& aggregate.groupingExpressions.nonEmpty => + if (joinCond.isEmpty) { +// No join condition, just push down Join below Aggregate +aggregate.copy(child = Join(aggregate.child, rightOp, joinType, joinCond, hint)) + } else { +// Find all the aliased expressions in the aggregate list that don't include any actual +// AggregateExpression, and create a map from the alias to the expression +val aliasMap = AttributeMap(aggregate.aggregateExpressions.collect { + case a: Alias if a.child.find(_.isInstanceOf[AggregateExpression]).isEmpty => +(a.toAttribute, a.child) +}) + +// For each join condition, expand the alias and +// check if the condition can be evaluated using +// attributes produced by the aggregate operator's child operator. + +val (pushDown, stayUp) = splitConjunctivePredicates(joinCond.get).partition { cond => + val replaced = replaceAlias(cond, aliasMap) + cond.references.nonEmpty && Review comment: @maropu Thanks for reviewing. I have addressed your comments. Please look through it when you get a chance. Thanks.
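The alias-expansion step in the diff above can be illustrated with a small Python sketch. It is not Catalyst code: expressions are modeled as nested tuples, attributes as plain strings, and `replace_alias` is a hypothetical stand-in for the `replaceAlias` helper the diff calls. The point is only that a join condition written against the Project's output (`c`) has to be rewritten in terms of the Project's input (`a + b`) before the join can move below the Project.

```python
def replace_alias(expr, alias_map):
    """Return expr with every aliased attribute swapped for its definition."""
    if isinstance(expr, str):
        # Leaf: an attribute name. Substitute it if the projection defined it.
        return alias_map.get(expr, expr)
    # Inner node: (operator, operand, operand, ...). Rewrite each operand.
    op, *args = expr
    return (op, *(replace_alias(a, alias_map) for a in args))


# 'SELECT a + b AS c, d ...' yields the alias map {c: a + b}.
alias_map = {"c": ("+", "a", "b")}

# Join condition c = x, expressed against the Project's output columns.
join_cond = ("=", "c", "x")

# The same condition, expressed against the Project's child.
pushed = replace_alias(join_cond, alias_map)
```

A condition that mentions no alias, such as `d = x`, passes through unchanged.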
[GitHub] AmplabJenkins commented on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests
AmplabJenkins commented on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests URL: https://github.com/apache/spark/pull/23804#issuecomment-464279496 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102404/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests
AmplabJenkins removed a comment on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests URL: https://github.com/apache/spark/pull/23804#issuecomment-464279496 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102404/ Test PASSed.
[GitHub] HyukjinKwon commented on issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hadoop 3 profile
HyukjinKwon commented on issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hadoop 3 profile URL: https://github.com/apache/spark/pull/21588#issuecomment-464279463 Ping for what? The Hive upgrade that blocks this PR is in progress: https://github.com/apache/spark/pull/23788 Please give your input here and in the discussion thread @wangyum pointed out above.
[GitHub] AmplabJenkins commented on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests
AmplabJenkins commented on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests URL: https://github.com/apache/spark/pull/23804#issuecomment-464279495 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests
AmplabJenkins removed a comment on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests URL: https://github.com/apache/spark/pull/23804#issuecomment-464279495 Merged build finished. Test PASSed.
[GitHub] SparkQA removed a comment on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests
SparkQA removed a comment on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests URL: https://github.com/apache/spark/pull/23804#issuecomment-464227566 **[Test build #102404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102404/testReport)** for PR 23804 at commit [`16caf67`](https://github.com/apache/spark/commit/16caf6733c893204fab2df4603c7abf0c3106bf7).
[GitHub] SparkQA commented on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests
SparkQA commented on issue #23804: [WIP][SPARK-26896] JDK 11 module adjustments for running tests URL: https://github.com/apache/spark/pull/23804#issuecomment-464279361 **[Test build #102404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102404/testReport)** for PR 23804 at commit [`16caf67`](https://github.com/apache/spark/commit/16caf6733c893204fab2df4603c7abf0c3106bf7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] SparkQA commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
SparkQA commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464278279 **[Test build #102409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102409/testReport)** for PR 23602 at commit [`3aad18a`](https://github.com/apache/spark/commit/3aad18a4ba96b5717c16ebc8a0d23b0a3986c634).
[GitHub] AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464278156 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7989/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464278155 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins removed a comment on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464278155 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
AmplabJenkins commented on issue #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#issuecomment-464278156 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7989/ Test PASSed.
[GitHub] SongYadong commented on issue #23794: [SPARK-26884][CORE] Let task acquire memory accurately when using spilled memory
SongYadong commented on issue #23794: [SPARK-26884][CORE] Let task acquire memory accurately when using spilled memory URL: https://github.com/apache/spark/pull/23794#issuecomment-464275258 Thanks for the review. It's true that the memory manager will try to grant the right amount. But reaching the spill path means the memory manager probably cannot grant the needed memory right now. If we request the still-unsatisfied amount after spilling (when `released` < `required - got`), the memory manager makes a redundant effort to obtain memory and may even block temporarily. With accurate control, I think acquiring memory can return quickly.
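One possible reading of the comment above, sketched in Python: after a spill, request only what the spill actually freed, capped by what is still missing, rather than the full shortfall. The helper `amount_to_request` and the numbers are hypothetical and are not taken from the PR or from Spark's `TaskMemoryManager`; only the names `required`, `got`, and `released` come from the comment.

```python
def amount_to_request(required, got, released):
    """Bytes to ask the memory manager for after one spill (hypothetical helper).

    required: bytes the task needs in total
    got:      bytes already granted
    released: bytes the last spill freed
    """
    still_needed = required - got
    # Never ask for more than the spill freed, and never for a negative amount.
    return max(0, min(released, still_needed))


# The case the comment singles out: released < required - got.
# Asking for the full shortfall (60) cannot succeed from the 25 bytes just
# freed, so the sketch asks for 25 instead.
partial = amount_to_request(required=100, got=40, released=25)

# When the spill frees more than the shortfall, only the shortfall is requested.
full = amount_to_request(required=100, got=40, released=80)
```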
[GitHub] liupc commented on a change in pull request #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame
liupc commented on a change in pull request #23602: [SPARK-26674][CORE]Consolidate CompositeByteBuf when reading large frame URL: https://github.com/apache/spark/pull/23602#discussion_r257441590 ## File path: common/network-common/src/main/java/org/apache/spark/network/util/TransportFrameDecoder.java ## @@ -123,30 +140,54 @@ private long decodeFrameSize() { private ByteBuf decodeNext() { long frameSize = decodeFrameSize(); -if (frameSize == UNKNOWN_FRAME_SIZE || totalSize < frameSize) { +if (frameSize == UNKNOWN_FRAME_SIZE) { return null; } -// Reset size for next frame. -nextFrameSize = UNKNOWN_FRAME_SIZE; - -Preconditions.checkArgument(frameSize < MAX_FRAME_SIZE, "Too large frame: %s", frameSize); -Preconditions.checkArgument(frameSize > 0, "Frame length should be positive: %s", frameSize); +if (frameBuf == null) { + Preconditions.checkArgument(frameSize < MAX_FRAME_SIZE, + "Too large frame: %s", frameSize); + Preconditions.checkArgument(frameSize > 0, + "Frame length should be positive: %s", frameSize); + frameRemainingBytes = (int) frameSize; -// If the first buffer holds the entire frame, return it. -int remaining = (int) frameSize; -if (buffers.getFirst().readableBytes() >= remaining) { - return nextBufferForFrame(remaining); + // If buffers is empty, then return immediately for more input data. + if (buffers.isEmpty()) { +return null; + } + // Otherwise, if the first buffer holds the entire frame, we attempt to + // build frame with it and return. + if (buffers.getFirst().readableBytes() >= frameRemainingBytes) { +// Reset buf and size for next frame. +frameBuf = null; +nextFrameSize = UNKNOWN_FRAME_SIZE; +return nextBufferForFrame(frameRemainingBytes); + } + // Other cases, create a composite buffer to manage all the buffers. + frameBuf = buffers.getFirst().alloc().compositeBuffer(Integer.MAX_VALUE); } -// Otherwise, create a composite buffer. 
-CompositeByteBuf frame = buffers.getFirst().alloc().compositeBuffer(Integer.MAX_VALUE); -while (remaining > 0) { - ByteBuf next = nextBufferForFrame(remaining); - remaining -= next.readableBytes(); - frame.addComponent(next).writerIndex(frame.writerIndex() + next.readableBytes()); +while (frameRemainingBytes > 0 && !buffers.isEmpty()) { + ByteBuf next = nextBufferForFrame(frameRemainingBytes); + frameRemainingBytes -= next.readableBytes(); + frameBuf.addComponent(true, next); } -assert remaining == 0; +// If the delta size of frameBuf exceeds the threshold, then we do consolidation +// to reduce memory consumption. +if (frameBuf.capacity() - consolidatedFrameBufSize > consolidateThreshold) { + int newNumComponents = frameBuf.numComponents() - consolidatedNumComponents; + frameBuf.consolidate(consolidatedNumComponents, newNumComponents); + consolidatedFrameBufSize = frameBuf.capacity(); + consolidatedNumComponents = frameBuf.numComponents(); +} +if (frameRemainingBytes > 0) { + return null; +} + +// Reset buf and size for next frame. +ByteBuf frame = frameBuf; +frameBuf = null; +nextFrameSize = UNKNOWN_FRAME_SIZE; Review comment: Yes, I can add some code to test multiple messages, and we just need to do the same check for consolidated buf capacity. I think this is more result oriented.
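The threshold-driven consolidation in the diff above can be sketched in Python, with a list of `bytes` objects standing in for Netty's `CompositeByteBuf`. The class `FrameAssembler` and its fields are hypothetical names chosen to mirror `consolidatedFrameBufSize`, `consolidatedNumComponents`, and `consolidateThreshold`; the sketch tracks byte counts rather than buffer capacity, and it silently drops any bytes in a chunk beyond the end of the frame, which the real decoder does not do.

```python
class FrameAssembler:
    """Collects chunks of one frame, merging new chunks once enough pile up."""

    def __init__(self, frame_size, consolidate_threshold):
        self.remaining = frame_size        # bytes of the frame still missing
        self.threshold = consolidate_threshold
        self.components = []               # chunks held so far
        self.consolidated_count = 0        # components present after last merge
        self.consolidated_size = 0         # bytes held at the last merge

    def size(self):
        return sum(len(c) for c in self.components)

    def feed(self, chunk):
        """Add one chunk; return the whole frame when complete, else None."""
        take = chunk[:self.remaining]
        self.components.append(take)
        self.remaining -= len(take)

        # Merge only the components added since the last merge, and only when
        # they add up to more than the threshold.
        if self.size() - self.consolidated_size > self.threshold:
            fresh = self.components[self.consolidated_count:]
            self.components[self.consolidated_count:] = [b"".join(fresh)]
            self.consolidated_count = len(self.components)
            self.consolidated_size = self.size()

        if self.remaining > 0:
            return None                    # wait for more input
        return b"".join(self.components)


assembler = FrameAssembler(frame_size=10, consolidate_threshold=4)
first = assembler.feed(b"abc")    # 3 new bytes, below threshold: no merge
second = assembler.feed(b"def")   # 6 new bytes, above threshold: merge
components_after_merge = len(assembler.components)
frame = assembler.feed(b"ghij")   # frame complete
```

Merging only the components added since the previous merge keeps each byte from being copied more than once by consolidation.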
[GitHub] asfgit closed pull request #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
asfgit closed pull request #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339
[GitHub] holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464273651 Merged to master
[GitHub] AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464273374 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464273377 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102407/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464273377 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102407/ Test PASSed.
[GitHub] SparkQA removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
SparkQA removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464270324 **[Test build #102407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102407/testReport)** for PR 18339 at commit [`ea267c6`](https://github.com/apache/spark/commit/ea267c68c805951c5ee2fb4fccd9f8fb4a288297).
[GitHub] AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464273374 Merged build finished. Test PASSed.
[GitHub] SparkQA commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
SparkQA commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464273285 **[Test build #102407 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102407/testReport)** for PR 18339 at commit [`ea267c6`](https://github.com/apache/spark/commit/ea267c68c805951c5ee2fb4fccd9f8fb4a288297). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464273171 Looks like Jenkins listened and everything passed, so I will merge to master.
[GitHub] edwinalu commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics.
edwinalu commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics. URL: https://github.com/apache/spark/pull/23767#discussion_r257440553 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -2380,10 +2381,14 @@ class SparkContext(config: SparkConf) extends Logging { /** Reports heartbeat metrics for the driver. */ private def reportHeartBeat(): Unit = { -val driverUpdates = _heartbeater.getCurrentMetrics() Review comment: Would it be useful to poll more frequently for driver metrics as well?
[GitHub] edwinalu commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics.
edwinalu commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics. URL: https://github.com/apache/spark/pull/23767#discussion_r257440322 ## File path: core/src/main/scala/org/apache/spark/executor/Executor.scala ## @@ -840,8 +952,25 @@ private[spark] class Executor( val accumUpdates = new ArrayBuffer[(Long, Seq[AccumulatorV2[_, _]])]() val curGCTime = computeTotalGcTime() -// get executor level memory metrics -val executorUpdates = heartbeater.getCurrentMetrics() +// if not polling in a separater poller, poll here +if (poller == null) { + poll() +} + +// build the executor level memory metrics +val executorUpdates = new HashMap[StageKey, ExecutorMetrics] + +def peaksForStage(k: StageKey, v: AtomicLong): (StageKey, AtomicLongArray) = + if (v.get() > 0) (k, stageMetricPeaks.get(k)) else null + +def addPeaks(nested: (StageKey, AtomicLongArray)): Unit = { + val (k, v) = nested + executorUpdates.put(k, new ExecutorMetrics(v)) + // at the same time, reset the peaks in stageMetricPeaks + stageMetricPeaks.put(k, new AtomicLongArray(ExecutorMetricType.numMetrics)) +} + +activeStages.forEach[(StageKey, AtomicLongArray)](LONG_MAX_VALUE, peaksForStage, addPeaks) Review comment: There's a corner case: if the task fails, the metrics may not get sent.
[GitHub] edwinalu commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics.
edwinalu commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics. URL: https://github.com/apache/spark/pull/23767#discussion_r257440251 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -2380,10 +2381,14 @@ class SparkContext(config: SparkConf) extends Logging { /** Reports heartbeat metrics for the driver. */ private def reportHeartBeat(): Unit = { -val driverUpdates = _heartbeater.getCurrentMetrics() +val currentMetrics = ExecutorMetrics.getCurrentMetrics(env.memoryManager) +val driverUpdates = new HashMap[(Int, Int), ExecutorMetrics] +// In the driver, we do not track per-stage metrics, so use a dummy stage +// for the key +driverUpdates.put((-1, -1), new ExecutorMetrics(currentMetrics)) Review comment: Yes, the stage information is added in onExecutorMetricsUpdate, so it is not needed here.
[GitHub] holdenk commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data.
holdenk commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data. URL: https://github.com/apache/spark/pull/23795#discussion_r257439925 ## File path: python/pyspark/sql/types.py ## @@ -1681,38 +1681,53 @@ def from_arrow_schema(arrow_schema): for field in arrow_schema]) -def _check_series_convert_date(series, data_type): -""" -Cast the series to datetime.date if it's a date type, otherwise returns the original series. +def _arrow_column_to_pandas(column, data_type): +""" Convert Arrow Column to pandas Series. + +If the given column is a date type column, creates a series of datetime.date directly instead +of creating datetime64[ns] as intermediate data. -:param series: pandas.Series -:param data_type: a Spark data type for the series +:param series: pyarrow.lib.Column +:param data_type: a Spark data type for the column """ -import pyarrow +import pandas as pd +import pyarrow as pa from distutils.version import LooseVersion -# As of Arrow 0.12.0, date_as_objects is True by default, see ARROW-3910 -if LooseVersion(pyarrow.__version__) < LooseVersion("0.12.0") and type(data_type) == DateType: -return series.dt.date +# Since Arrow 0.11.0, support date_as_object to return datetime.date instead of np.datetime64. Review comment: Include a comment about the overflow here so we know why we are avoiding `np.datetime64`.
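The overflow the review comment refers to arises because `datetime64[ns]` stores nanoseconds since 1970-01-01 in a signed 64-bit integer, which spans only about 292 years on either side of the epoch. The sketch below checks that range with the standard library alone, so it does not depend on NumPy or pandas; `fits_in_int64_nanoseconds` is a hypothetical helper, and it ignores the fact that NumPy reserves the minimum int64 value for NaT.

```python
from datetime import date

INT64_MAX = 2**63 - 1
INT64_MIN = -2**63
EPOCH = date(1970, 1, 1)
NS_PER_DAY = 86_400 * 10**9


def fits_in_int64_nanoseconds(d):
    """True if midnight on date d is representable as int64 ns since the epoch."""
    ns = (d - EPOCH).days * NS_PER_DAY
    return INT64_MIN <= ns <= INT64_MAX


# An ordinary date is fine.
recent_ok = fits_in_int64_nanoseconds(date(2019, 2, 15))

# Both ends of Python's own date range fall outside the int64 ns window,
# although datetime.date represents them without trouble.
min_ok = fits_in_int64_nanoseconds(date(1, 1, 1))
max_ok = fits_in_int64_nanoseconds(date(9999, 12, 31))

# The window closes during 2262-04-11.
last_day_ok = fits_in_int64_nanoseconds(date(2262, 4, 11))
next_day_ok = fits_in_int64_nanoseconds(date(2262, 4, 12))
```

Any date column containing values outside this window cannot pass through a nanosecond-resolution intermediate without overflowing, which is the reason to build `datetime.date` objects directly.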
[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464271644 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102403/ Test PASSed.
[GitHub] holdenk commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data.
holdenk commented on a change in pull request #23795: [SPARK-26887][SQL][PYTHON] Create datetime.date directly instead of creating datetime64[ns] as intermediate data. URL: https://github.com/apache/spark/pull/23795#discussion_r257439790 ## File path: python/pyspark/sql/types.py ## @@ -1681,38 +1681,53 @@ def from_arrow_schema(arrow_schema): for field in arrow_schema]) -def _check_series_convert_date(series, data_type): -""" -Cast the series to datetime.date if it's a date type, otherwise returns the original series. +def _arrow_column_to_pandas(column, data_type): +""" Convert Arrow Column to pandas Series. + +If the given column is a date type column, creates a series of datetime.date directly instead +of creating datetime64[ns] as intermediate data. Review comment: minor: I think these details belong in an internal comment rather than in the docstring.
[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464271641 Merged build finished. Test PASSed.
[GitHub] SparkQA commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
SparkQA commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271730 **[Test build #102408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102408/testReport)** for PR 17968 at commit [`311c94a`](https://github.com/apache/spark/commit/311c94a3d608b0b86f3ce39415639ec260e5af37).
[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464271641 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7988/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271529 Build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464271644 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102403/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271529 Build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7988/ Test PASSed.
[GitHub] SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464192922 **[Test build #102403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102403/testReport)** for PR 19045 at commit [`46b5725`](https://github.com/apache/spark/commit/46b5725f763e1858704c408b7a55f49f717790b0).
[GitHub] wypoon commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics.
wypoon commented on a change in pull request #23767: [SPARK-26329][CORE][WIP] Faster polling of executor memory metrics. URL: https://github.com/apache/spark/pull/23767#discussion_r257439819 ## File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ## @@ -40,7 +41,7 @@ private[spark] case class Heartbeat( executorId: String, accumUpdates: Array[(Long, Seq[AccumulatorV2[_, _]])], // taskId -> accumulator updates blockManagerId: BlockManagerId, -executorUpdates: ExecutorMetrics) // executor level updates +executorUpdates: Map[(Int, Int), ExecutorMetrics]) // executor level updates Review comment: Sure, will do.
[GitHub] SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464271447 **[Test build #102403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102403/testReport)** for PR 19045 at commit [`46b5725`](https://github.com/apache/spark/commit/46b5725f763e1858704c408b7a55f49f717790b0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
AmplabJenkins removed a comment on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-453656628 Can one of the admins verify this patch?
[GitHub] holdenk commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
holdenk commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271175 jenkins ok to test
[GitHub] holdenk commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical
holdenk commented on issue #17968: [SPARK-9792] Make DenseMatrix equality semantical URL: https://github.com/apache/spark/pull/17968#issuecomment-464271163 Jenkins OK to test. Are you still actively working on this, and if so, would you update it to master?
[GitHub] holdenk commented on issue #20028: [SPARK-19053][ML]Supporting multiple evaluation metrics in DataFrame-based API
holdenk commented on issue #20028: [SPARK-19053][ML]Supporting multiple evaluation metrics in DataFrame-based API URL: https://github.com/apache/spark/pull/20028#issuecomment-464271070 Is this still being actively worked on?
[GitHub] AmplabJenkins removed a comment on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle
AmplabJenkins removed a comment on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle URL: https://github.com/apache/spark/pull/23792#issuecomment-464270877 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102402/ Test PASSed.
[GitHub] holdenk commented on issue #23793: [SPARK-24736][k8s] Let spark-submit handle dependency resolution.
holdenk commented on issue #23793: [SPARK-24736][k8s] Let spark-submit handle dependency resolution. URL: https://github.com/apache/spark/pull/23793#issuecomment-464270862 So I'm a little confused here, since if we look at the YARN cluster manager we also see similar logic around setting the PYTHONPATH. Have you tested this with a zip file or egg as a dependency? I don't think Python will by default expand all zip files in the pwd. cc @ifilonenko
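Note on the zip-file question above: the following is a minimal sketch of standard CPython import behaviour, not a test of the pull request or of spark-submit. A zip archive that merely sits in a directory on `sys.path` is not searched; the archive path itself has to be an entry on `sys.path` (which is what adding it to `PYTHONPATH` achieves). The module name `_zip_demo_dep` and the archive name `deps.zip` are hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile

MODULE = "_zip_demo_dep"  # hypothetical module name used only for this demo


def can_import(name):
    """Try a fresh import of `name` and report whether it succeeded."""
    importlib.invalidate_caches()
    sys.modules.pop(name, None)
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# Build a zip archive containing a single one-line module.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "deps.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr(MODULE + ".py", "VALUE = 42\n")

# Only the directory holding the archive is on sys.path: the zip is not expanded.
sys.path.insert(0, workdir)
dir_only = can_import(MODULE)

# The archive itself is on sys.path: zipimport can now find the module.
sys.path.insert(0, zip_path)
with_zip = can_import(MODULE)

print("directory only:", dir_only)
print("zip on sys.path:", with_zip)
```

The directory entry stands in for the working directory here; the same lookup rules apply to the script directory that Python places at the front of `sys.path`.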
[GitHub] AmplabJenkins removed a comment on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle
AmplabJenkins removed a comment on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle URL: https://github.com/apache/spark/pull/23792#issuecomment-464270873 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle
AmplabJenkins commented on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle URL: https://github.com/apache/spark/pull/23792#issuecomment-464270877 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102402/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle
AmplabJenkins commented on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle URL: https://github.com/apache/spark/pull/23792#issuecomment-464270873 Merged build finished. Test PASSed.
[GitHub] SparkQA removed a comment on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle
SparkQA removed a comment on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle URL: https://github.com/apache/spark/pull/23792#issuecomment-464192851 **[Test build #102402 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102402/testReport)** for PR 23792 at commit [`c50f10d`](https://github.com/apache/spark/commit/c50f10d37d66af6fa60b561c0f139bbf558eccfd).
[GitHub] SparkQA commented on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle
SparkQA commented on issue #23792: [SPARK-26882] Check the Kubernetes integration tests scalatyle URL: https://github.com/apache/spark/pull/23792#issuecomment-464270679 **[Test build #102402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102402/testReport)** for PR 23792 at commit [`c50f10d`](https://github.com/apache/spark/commit/c50f10d37d66af6fa60b561c0f139bbf558eccfd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] AmplabJenkins removed a comment on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite
AmplabJenkins removed a comment on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite URL: https://github.com/apache/spark/pull/23807#issuecomment-464270406 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102406/ Test FAILed.
[GitHub] SparkQA removed a comment on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite
SparkQA removed a comment on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite URL: https://github.com/apache/spark/pull/23807#issuecomment-464256809 **[Test build #102406 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102406/testReport)** for PR 23807 at commit [`799a01a`](https://github.com/apache/spark/commit/799a01ac76763549439e3dd32b9dfdd841d10313).
[GitHub] AmplabJenkins commented on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite
AmplabJenkins commented on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite URL: https://github.com/apache/spark/pull/23807#issuecomment-464270406 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102406/ Test FAILed.
[GitHub] AmplabJenkins commented on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite
AmplabJenkins commented on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite URL: https://github.com/apache/spark/pull/23807#issuecomment-464270404 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins removed a comment on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite
AmplabJenkins removed a comment on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite URL: https://github.com/apache/spark/pull/23807#issuecomment-464270404 Merged build finished. Test FAILed.
[GitHub] SparkQA commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
SparkQA commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464270324 **[Test build #102407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102407/testReport)** for PR 18339 at commit [`ea267c6`](https://github.com/apache/spark/commit/ea267c68c805951c5ee2fb4fccd9f8fb4a288297).
[GitHub] AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464270146 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7987/ Test PASSed.
[GitHub] SparkQA commented on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite
SparkQA commented on issue #23807: [SPARK-26897][SQL][TEST] Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite URL: https://github.com/apache/spark/pull/23807#issuecomment-464270362 **[Test build #102406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102406/testReport)** for PR 23807 at commit [`799a01a`](https://github.com/apache/spark/commit/799a01ac76763549439e3dd32b9dfdd841d10313). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins removed a comment on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464270142 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464270142 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
AmplabJenkins commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464270146 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7987/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464269503 Merged build finished. Test PASSed.
[GitHub] holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464269764 Jenkins retest this please
[GitHub] holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway
holdenk commented on issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway URL: https://github.com/apache/spark/pull/18339#issuecomment-464269807 @parente if you could merge in master that would trigger a Jenkins run.
[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464269507 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102401/ Test PASSed.
[GitHub] holdenk commented on a change in pull request #23741: [SPARK-22798][PYTHON][ML]Add multiple column support to PySpark StringIndexer
holdenk commented on a change in pull request #23741: [SPARK-22798][PYTHON][ML]Add multiple column support to PySpark StringIndexer
URL: https://github.com/apache/spark/pull/23741#discussion_r257438402

## File path: python/pyspark/ml/wrapper.py ##

@@ -87,9 +87,19 @@ def _new_java_array(pylist, java_class):
       - bool -> sc._gateway.jvm.java.lang.Boolean
     """

Review comment: Just a gentle ping on doing this part
[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464269503 Merged build finished. Test PASSed.
[GitHub] holdenk commented on a change in pull request #23741: [SPARK-22798][PYTHON][ML]Add multiple column support to PySpark StringIndexer
holdenk commented on a change in pull request #23741: [SPARK-22798][PYTHON][ML]Add multiple column support to PySpark StringIndexer
URL: https://github.com/apache/spark/pull/23741#discussion_r257438503

## File path: python/pyspark/ml/wrapper.py ##

@@ -87,9 +87,19 @@ def _new_java_array(pylist, java_class):
       - bool -> sc._gateway.jvm.java.lang.Boolean
     """
     sc = SparkContext._active_spark_context
-    java_array = sc._gateway.new_array(java_class, len(pylist))
-    for i in xrange(len(pylist)):
-        java_array[i] = pylist[i]
+    java_array = None
+    if len(pylist) > 0 and isinstance(pylist[0], list):
+        inner_array_length = 0
+        for i in xrange(len(pylist)):
+            inner_array_length = max(inner_array_length, len(pylist[i]))
+        java_array = sc._gateway.new_array(java_class, len(pylist), inner_array_length)

Review comment: I think we now have this in https://github.com/apache/spark/pull/23741/files#diff-898790f48e214f86080160b45fcf81cfR102
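For readers following the diff above: the added lines size a two-dimensional Java array by taking the longest inner list as the inner dimension. A minimal sketch of just that sizing step, in plain Python and without a Py4J gateway, might look like the following. The helper name `array_dimensions` is hypothetical and is not part of PySpark or of the pull request.

```python
def array_dimensions(pylist):
    """Return the dimensions needed to hold pylist in a rectangular array.

    Hypothetical helper, not part of PySpark: it mirrors only the sizing
    logic from the diff above and never touches the Py4J gateway.
    """
    if len(pylist) > 0 and isinstance(pylist[0], list):
        # Nested input: the inner dimension must fit the longest inner list.
        inner_array_length = max(len(inner) for inner in pylist)
        return (len(pylist), inner_array_length)
    # Flat (or empty) input: a single dimension is enough.
    return (len(pylist),)


print(array_dimensions([["a", "b"], ["c"], ["d", "e", "f"]]))  # (3, 3)
print(array_dimensions([1, 2, 3]))  # (3,)
```

In the diff these two values are what get passed to `sc._gateway.new_array` after `java_class`. Note that with ragged input the shorter rows would not fill every slot of the rectangular array, which is probably worth keeping in mind when reviewing the copy loop.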
[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464269507 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102401/ Test PASSed.
[GitHub] SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown
SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown URL: https://github.com/apache/spark/pull/19045#issuecomment-464189231 **[Test build #102401 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102401/testReport)** for PR 19045 at commit [`25dc907`](https://github.com/apache/spark/commit/25dc90775a50cc462cd5f325c3b3eada5def1808).