MaxGekk closed pull request #38000: [SPARK-40540][SQL] Migrate compilation
errors onto error classes: _LEGACY_ERROR_TEMP_1100-1199
URL: https://github.com/apache/spark/pull/38000
MaxGekk commented on PR #38000:
URL: https://github.com/apache/spark/pull/38000#issuecomment-1260417870
Only one test failed:
```
DAGSchedulerSuite.SPARK-40096: Send finalize events even if shuffle merger
blocks indefinitely with registerMergeResults is false
```
wbo4958 commented on PR #37855:
URL: https://github.com/apache/spark/pull/37855#issuecomment-1260417639
> Good catch! seems we can also simply switch to `XORShiftRandom` which
always [hash the
HyukjinKwon commented on code in PR #38013:
URL: https://github.com/apache/spark/pull/38013#discussion_r981964990
##
examples/src/main/python/sql/streaming/structured_network_wordcount_session_window.py:
##
@@ -0,0 +1,130 @@
+#
+# Licensed to the Apache Software Foundation
chaoqin-li1123 commented on code in PR #38013:
URL: https://github.com/apache/spark/pull/38013#discussion_r981960624
##
examples/src/main/python/sql/streaming/structured_network_wordcount_session_window.py:
##
@@ -0,0 +1,130 @@
+#
+# Licensed to the Apache Software Foundation
chaoqin-li1123 commented on code in PR #38013:
URL: https://github.com/apache/spark/pull/38013#discussion_r981960722
##
examples/src/main/python/sql/streaming/structured_network_wordcount_session_window.py:
##
@@ -0,0 +1,130 @@
+#
+# Licensed to the Apache Software Foundation
amaliujia commented on code in PR #38006:
URL: https://github.com/apache/spark/pull/38006#discussion_r981950320
##
connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##
@@ -185,11 +186,10 @@ object SparkConnectService {
/**
* Starts
cloud-fan commented on code in PR #38006:
URL: https://github.com/apache/spark/pull/38006#discussion_r981940234
##
connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##
@@ -185,11 +186,10 @@ object SparkConnectService {
/**
* Starts
zhengruifeng opened a new pull request, #38026:
URL: https://github.com/apache/spark/pull/38026
### What changes were proposed in this pull request?
Implement `min_count` in `GroupBy.max`
### Why are the changes needed?
for API coverage
### Does this PR introduce
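For context on the semantics being added, `min_count` makes a group produce a missing value when it has fewer than `min_count` non-NA entries. A minimal sketch using plain pandas, which already exposes the parameter (the PR adds the same option to `pyspark.pandas`); the data and column names are made up for illustration:
```python
import pandas as pd

# Toy data: group "b" has no non-NA values at all.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1.0, 2.0, None]})

# With min_count=2, a group needs at least two non-NA values to yield a max;
# otherwise the result is NaN.
print(df.groupby("key")["val"].max(min_count=2))
# key
# a    2.0
# b    NaN
# Name: val, dtype: float64
```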
amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981927808
##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
amaliujia commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r981925368
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/LogicalWriteInfo.java:
##
@@ -45,4 +45,18 @@ public interface LogicalWriteInfo {
* the schema
amaliujia commented on PR #38004:
URL: https://github.com/apache/spark/pull/38004#issuecomment-1260363294
Thanks for the link to the implementation!
amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981920306
##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981922371
##
repl/pom.xml:
##
@@ -58,6 +58,11 @@
spark-sql_${scala.binary.version}
${project.version}
+
+ org.apache.spark
+
amaliujia commented on code in PR #38006:
URL: https://github.com/apache/spark/pull/38006#discussion_r981918310
##
connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##
@@ -189,7 +189,7 @@ object SparkConnectService {
*/
def
zhengruifeng commented on code in PR #37995:
URL: https://github.com/apache/spark/pull/37995#discussion_r981915981
##
python/pyspark/pandas/series.py:
##
@@ -6442,6 +6445,8 @@ def argmin(self, axis: Axis = None, skipna: bool = True)
-> int:
raise ValueError("axis
LuciferYang commented on code in PR #38025:
URL: https://github.com/apache/spark/pull/38025#discussion_r981915685
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/parquet/ParquetWrite.scala:
##
@@ -94,7 +94,7 @@ case class ParquetWrite(
LuciferYang opened a new pull request, #38025:
URL: https://github.com/apache/spark/pull/38025
### What changes were proposed in this pull request?
This PR makes a similar change to https://github.com/apache/spark/pull/24808
### Why are the changes needed?
The print
zhengruifeng commented on code in PR #37995:
URL: https://github.com/apache/spark/pull/37995#discussion_r981914861
##
python/pyspark/pandas/series.py:
##
@@ -6442,6 +6445,8 @@ def argmin(self, axis: Axis = None, skipna: bool = True)
-> int:
raise ValueError("axis
cloud-fan commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981913805
##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
cloud-fan commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981913250
##
repl/pom.xml:
##
@@ -58,6 +58,11 @@
spark-sql_${scala.binary.version}
${project.version}
+
+ org.apache.spark
+
yaooqinn commented on code in PR #38024:
URL: https://github.com/apache/spark/pull/38024#discussion_r981911877
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala:
##
@@ -253,7 +259,7 @@ class FileScanRDD(
null
cloud-fan commented on code in PR #38006:
URL: https://github.com/apache/spark/pull/38006#discussion_r981911829
##
connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##
@@ -189,7 +189,7 @@ object SparkConnectService {
*/
def
LuciferYang opened a new pull request, #37654:
URL: https://github.com/apache/spark/pull/37654
### What changes were proposed in this pull request?
This PR is a refactoring; the main change is to extract a common
`ParquetUtils.prepareWrite` method to eliminate duplicate code in
cloud-fan commented on PR #37654:
URL: https://github.com/apache/spark/pull/37654#issuecomment-1260343604
@sadikovi can you take a look?
LuciferYang closed pull request #37654: [SPARK-40216][SQL] Extract common
`ParquetUtils.prepareWrite` method to deduplicate code in `ParquetFileFormat`
and `ParquetWrite`
URL: https://github.com/apache/spark/pull/37654
zhengruifeng commented on PR #38014:
URL: https://github.com/apache/spark/pull/38014#issuecomment-1260336014
If we want to keep only one badge, I think we can use the `Pypi downloads` badge
linking to `pypi`, like [numpy](https://github.com/numpy/numpy) /
yaooqinn commented on PR #35594:
URL: https://github.com/apache/spark/pull/35594#issuecomment-1260335417
+1, and sorry for not merging it after my approval
LuciferYang commented on code in PR #38024:
URL: https://github.com/apache/spark/pull/38024#discussion_r981889549
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala:
##
@@ -253,7 +259,7 @@ class FileScanRDD(
null
AngersZh opened a new pull request, #35594:
URL: https://github.com/apache/spark/pull/35594
### What changes were proposed in this pull request?
The Spark SQL CLI currently always uses a shutdown hook to stop SparkSQLEnv:
```
// Clean up after we exit
```
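As a rough analogue of the shutdown-hook pattern discussed here (the real code registers a JVM shutdown hook to tear down SparkSQLEnv), a minimal Python sketch showing that cleanup registered on exit still runs while the exit code chosen by the caller is preserved; the function name is made up for illustration:
```python
import atexit
import sys

def stop_sql_env():
    # Stand-in for stopping SparkSQLEnv: runs when the interpreter exits.
    print("stopping SQL environment")

atexit.register(stop_sql_env)

# The hook runs on exit, and the exit code passed here (2) is what the
# process reports, independent of the cleanup work.
sys.exit(2)
```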
cloud-fan commented on PR #35594:
URL: https://github.com/apache/spark/pull/35594#issuecomment-1260321186
@AngersZh can you rebase? We should merge this PR.
zhengruifeng commented on PR #37855:
URL: https://github.com/apache/spark/pull/37855#issuecomment-1260319666
Good catch!
seems we can also simply switch to `XORShiftRandom` which always [hash the
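The point about `XORShiftRandom` is that it hashes the user-supplied seed before using it, so adjacent raw seeds do not yield correlated streams. A conceptual Python sketch of that idea (not Spark's actual implementation; `mix_seed` is a hypothetical helper):
```python
import hashlib
import random

def mix_seed(seed: int) -> int:
    # Hash the raw seed so that nearby seeds (0, 1, 2, ...) diverge strongly.
    digest = hashlib.sha256(seed.to_bytes(8, "little", signed=True)).digest()
    return int.from_bytes(digest[:8], "little")

# Two adjacent raw seeds produce unrelated streams once the seed is mixed.
a = random.Random(mix_seed(0))
b = random.Random(mix_seed(1))
print(a.random(), b.random())
```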
caican00 commented on PR #37876:
URL: https://github.com/apache/spark/pull/37876#issuecomment-1260294346
> I haven't thought of a better way yet
Thanks. I'll share it with you if I can think of a better way
yaooqinn commented on PR #38024:
URL: https://github.com/apache/spark/pull/38024#issuecomment-1260291484
cc @cloud-fan @dongjoon-hyun @HyukjinKwon @wangyum thanks.
yaooqinn opened a new pull request, #38024:
URL: https://github.com/apache/spark/pull/38024
### What changes were proposed in this pull request?
Let's take a look at the case below: the left and the right are visiting the
same table and its partitions, and both of them
HyukjinKwon commented on code in PR #38013:
URL: https://github.com/apache/spark/pull/38013#discussion_r981863988
##
examples/src/main/python/sql/streaming/structured_network_wordcount_session_window.py:
##
@@ -0,0 +1,130 @@
+#
+# Licensed to the Apache Software Foundation
HyukjinKwon commented on code in PR #38018:
URL: https://github.com/apache/spark/pull/38018#discussion_r981856947
##
python/pyspark/pandas/frame.py:
##
@@ -5317,6 +5317,12 @@ def to_orc(
... '%s/to_orc/foo.orc' % path,
... mode = 'overwrite',
zhengruifeng commented on PR #38017:
URL: https://github.com/apache/spark/pull/38017#issuecomment-1260274298
Thank you @HyukjinKwon @dongjoon-hyun @itholic
HyukjinKwon closed pull request #38017: [SPARK-40579][PS] `GroupBy.first`
should skip NULLs
URL: https://github.com/apache/spark/pull/38017
Yikun closed pull request #35088: [SPARK-37758][PYTHON][BUILD] Enable PySpark
test scheduled job on ARM runner
URL: https://github.com/apache/spark/pull/35088
HyukjinKwon closed pull request #38016: [SPARK-40578][PS] Fix
`IndexesTest.test_to_frame` when pandas 1.5.0
URL: https://github.com/apache/spark/pull/38016
HyukjinKwon commented on PR #38017:
URL: https://github.com/apache/spark/pull/38017#issuecomment-1260272833
Merged to master.
HyukjinKwon commented on PR #38016:
URL: https://github.com/apache/spark/pull/38016#issuecomment-1260272532
Merged to master.
mridulm commented on PR #37638:
URL: https://github.com/apache/spark/pull/37638#issuecomment-1260268934
+CC @otterc, @Ngone51
mridulm commented on code in PR #37638:
URL: https://github.com/apache/spark/pull/37638#discussion_r981843162
##
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java:
##
@@ -593,6 +607,9 @@ public void onData(String streamId,
github-actions[bot] commented on PR #35319:
URL: https://github.com/apache/spark/pull/35319#issuecomment-1260238704
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #34856:
URL: https://github.com/apache/spark/pull/34856#issuecomment-1260238760
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #35088:
URL: https://github.com/apache/spark/pull/35088#issuecomment-1260238718
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #34903:
URL: https://github.com/apache/spark/pull/34903#issuecomment-1260238745
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #35337:
URL: https://github.com/apache/spark/pull/35337#issuecomment-1260238691
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #35569: [SPARK-38250][CORE] Check
existence before deleting stagingDir in HadoopMapReduceCommitProtocol
URL: https://github.com/apache/spark/pull/35569
github-actions[bot] commented on PR #35371:
URL: https://github.com/apache/spark/pull/35371#issuecomment-1260238670
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #35548:
URL: https://github.com/apache/spark/pull/35548#issuecomment-1260238660
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #35549:
URL: https://github.com/apache/spark/pull/35549#issuecomment-1260238647
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #35734: [SPARK-32432][SQL] Add support
for reading ORC/Parquet files of SymlinkTextInputFormat table And Fix Analyze
for SymlinkTextInputFormat table
URL: https://github.com/apache/spark/pull/35734
github-actions[bot] closed pull request #35594: [SPARK-38270][SQL] Spark SQL
CLI's AM should keep same exit code with client side
URL: https://github.com/apache/spark/pull/35594
github-actions[bot] closed pull request #35638: [SPARK-38296][SQL] Support
error class AnalysisExceptions in FunctionRegistry
URL: https://github.com/apache/spark/pull/35638
github-actions[bot] commented on PR #35550:
URL: https://github.com/apache/spark/pull/35550#issuecomment-1260238637
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #35608: [SPARK-32838][SQL] Static
partition overwrite could use staging dir insert
URL: https://github.com/apache/spark/pull/35608
github-actions[bot] closed pull request #35748: [SPARK-38431][SQL]Support to
delete matched rows from jdbc tables
URL: https://github.com/apache/spark/pull/35748
github-actions[bot] closed pull request #35744: [SPARK-37383][SQL][WEBUI]Show
the parsing time for each phase of a SQL on spark ui
URL: https://github.com/apache/spark/pull/35744
github-actions[bot] closed pull request #36889: [SPARK-21195][CORE] Dynamically
register metrics from sources as they are reported
URL: https://github.com/apache/spark/pull/36889
github-actions[bot] closed pull request #35867:
[SPARK-38559][SQL][WEBUI]Display the number of empty partitions on spark ui
URL: https://github.com/apache/spark/pull/35867
github-actions[bot] closed pull request #35990: [SPARK-38639][SQL] Support
ignoreCorruptRecord flag to ensure querying broken sequence file table smoothly
URL: https://github.com/apache/spark/pull/35990
chaoqin-li1123 commented on code in PR #38013:
URL: https://github.com/apache/spark/pull/38013#discussion_r981833418
##
examples/src/main/python/sql/streaming/structured_network_wordcount_session_window.py:
##
@@ -0,0 +1,128 @@
+#
+# Licensed to the Apache Software Foundation
aokolnychyi commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r981811676
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/LogicalWriteInfo.java:
##
@@ -45,4 +45,18 @@ public interface LogicalWriteInfo {
* the schema
aokolnychyi commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r981809103
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/LogicalWriteInfo.java:
##
@@ -45,4 +45,18 @@ public interface LogicalWriteInfo {
* the schema
aokolnychyi commented on PR #38004:
URL: https://github.com/apache/spark/pull/38004#issuecomment-1260176172
@amaliujia, I have linked https://github.com/apache/spark/pull/38005 that
adds test coverage and implementation. I've split this work to reduce the scope
of each PR and simplify
dongjoon-hyun closed pull request #38011: [SPARK-40574][DOCS] Enhance DROP
TABLE documentation
URL: https://github.com/apache/spark/pull/38011
dongjoon-hyun commented on PR #38021:
URL: https://github.com/apache/spark/pull/38021#issuecomment-1260138452
Welcome to the Apache Spark community, @danitico.
I added you to the Apache Spark contributor group and assigned SPARK-40583 to
you.
dongjoon-hyun closed pull request #38021: [SPARK-40583][DOCS] Fixing artifactId
name in `cloud-integration.md`
URL: https://github.com/apache/spark/pull/38021
amaliujia commented on code in PR #38023:
URL: https://github.com/apache/spark/pull/38023#discussion_r981760665
##
connect/src/main/protobuf/spark/connect/expressions.proto:
##
@@ -155,4 +156,7 @@ message Expression {
string expression = 1;
}
+ // represent * (e.g.
amaliujia commented on PR #38023:
URL: https://github.com/apache/spark/pull/38023#issuecomment-1260122735
@HyukjinKwon @cloud-fan @grundprinzip
amaliujia commented on code in PR #38023:
URL: https://github.com/apache/spark/pull/38023#discussion_r981761756
##
connect/src/main/protobuf/spark/connect/expressions.proto:
##
@@ -155,4 +156,7 @@ message Expression {
string expression = 1;
}
+ // represent * (e.g.
xinrong-meng commented on code in PR #38015:
URL: https://github.com/apache/spark/pull/38015#discussion_r981758988
##
python/pyspark/pandas/indexes/base.py:
##
@@ -1907,6 +1908,9 @@ def append(self, other: "Index") -> "Index":
)
index_fields =
amaliujia opened a new pull request, #38023:
URL: https://github.com/apache/spark/pull/38023
### What changes were proposed in this pull request?
Support `SELECT *` in an explicit way in the Connect proto.
### Why are the changes needed?
Current proto uses empty
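For reference, the client-side operation the proto change has to model is an explicit star projection; a minimal PySpark sketch of the kind of query involved (dataset and session setup are illustrative only):
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(3).selectExpr("id", "id * 2 AS doubled")

# An unqualified star keeps every column of the input relation; the proto
# change gives this projection an explicit representation rather than an
# empty expression list.
df.select("*").show()
```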
xinrong-meng commented on code in PR #38015:
URL: https://github.com/apache/spark/pull/38015#discussion_r981758420
##
python/pyspark/pandas/indexes/base.py:
##
@@ -1907,6 +1908,9 @@ def append(self, other: "Index") -> "Index":
)
index_fields =
xinrong-meng commented on code in PR #38018:
URL: https://github.com/apache/spark/pull/38018#discussion_r981752197
##
python/pyspark/pandas/frame.py:
##
@@ -5317,6 +5317,12 @@ def to_orc(
... '%s/to_orc/foo.orc' % path,
... mode = 'overwrite',
xinrong-meng commented on PR #38018:
URL: https://github.com/apache/spark/pull/38018#issuecomment-1260097710
pandas-on-Spark is more of a developers' shorthand used in the source
code, whereas `pandas API on Spark` is the official, user-facing name. Hope
that helps :) @bjornjorgensen
amaliujia commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r981745489
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/LogicalWriteInfo.java:
##
@@ -45,4 +45,18 @@ public interface LogicalWriteInfo {
* the schema
HeartSaVioR commented on code in PR #38013:
URL: https://github.com/apache/spark/pull/38013#discussion_r981724244
##
examples/src/main/python/sql/streaming/structured_network_wordcount_session_window.py:
##
@@ -0,0 +1,128 @@
+#
+# Licensed to the Apache Software Foundation
HeartSaVioR commented on PR #38013:
URL: https://github.com/apache/spark/pull/38013#issuecomment-1260062541
@chaoqin-li1123
https://github.com/chaoqin-li1123/spark/actions/runs/3138156803/jobs/5097193712
The linter is still complaining. Could you take a look?
You can install
attilapiros commented on code in PR #37990:
URL: https://github.com/apache/spark/pull/37990#discussion_r981696574
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala:
##
@@ -193,22 +197,19 @@
Kimahriman commented on PR #37770:
URL: https://github.com/apache/spark/pull/37770#issuecomment-1260009009
> also, what about adding some tests in
`python/pyspark/sql/tests/test_functions.py`?
I thought I had found all the places with explode tests to add inline as
well, but missed
amaliujia commented on PR #38003:
URL: https://github.com/apache/spark/pull/38003#issuecomment-1259981240
> The more realistic use case was using a non-deterministic UDF for
accumulator-type things, with the pushdown resulting in different values; the
rand was just the easiest way to test it.
MaxGekk commented on PR #38000:
URL: https://github.com/apache/spark/pull/38000#issuecomment-1259967353
cc @itholic
MaxGekk commented on PR #38000:
URL: https://github.com/apache/spark/pull/38000#issuecomment-1259967214
@srielau @cloud-fan @anchovYu @gatorsmile Please, review this PR.
amaliujia commented on code in PR #38007:
URL: https://github.com/apache/spark/pull/38007#discussion_r981636250
##
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##
@@ -2635,6 +2635,10 @@ class JDBCV2Suite extends QueryTest with
SharedSparkSession with
Kimahriman commented on PR #38003:
URL: https://github.com/apache/spark/pull/38003#issuecomment-1259938755
The more realistic use case was using a non-deterministic UDF for
accumulator-type things, with the pushdown resulting in different values; the
rand was just the easiest way to test it.
amaliujia commented on PR #38003:
URL: https://github.com/apache/spark/pull/38003#issuecomment-1259929204
Just curious whether this has ever been discussed:
The `rand()`, for example, can be evaluated before pushing down. This is
more like a query re-writing that such non-deterministic
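A small PySpark sketch of the kind of plan under discussion, assuming a filter built from `rand()` (a non-deterministic expression whose value changes if it is re-evaluated after being pushed down); the data is illustrative only:
```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.range(1000)

# rand() is non-deterministic: if the predicate were pushed below another
# operator and re-evaluated, a different subset of rows could survive.
sampled = df.filter(F.rand(seed=42) < 0.1)
print(sampled.count())
```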
amaliujia commented on PR #37993:
URL: https://github.com/apache/spark/pull/37993#issuecomment-1259922406
post + 1. Thanks for following up on this quickly!
chaoqin-li1123 commented on code in PR #38013:
URL: https://github.com/apache/spark/pull/38013#discussion_r981605902
##
examples/src/main/python/sql/streaming/structured_network_wordcount_session_window.py:
##
@@ -0,0 +1,114 @@
+#
+# Licensed to the Apache Software Foundation