Messages by Thread
-
Support Required: Issue with PySpark Code Execution Order
Karthick N
-
Spark K8s auto scaling using Keda or similar tools
Nimrod Ofek
-
[PySpark] [Beginner] [Debug] Does Spark ReadStream support reading from a MinIO bucket?
Kleckner, Jade
-
[ANNOUNCE] Debo CLI v0.1.0 – Unified Hadoop Ecosystem Management Tool
Surafel Temesgen
-
[SPARK-CORE] SerializationDebugger fails on Java 21
Clemens Ballarin
-
[SPARK-CONNECT] [SPARK-4.0] Encountered end-of-stream mid-frame
Manas Bhardwaj
-
GraphFrames is back with v0.9.2! pip install graphframes-py :)
Russell Jurney
-
Read duration, Write duration, Processing Time metrics
Melika Ghiasi
-
Compatibility Issue: DescribeTopicsResult.all() missing in Kafka 4.0.0 used with Spark 4.0.0
Sandeep Ballu
-
[Spark SQL]: Python Data Source API and spark.sql.execution.pyspark.python
Ilya
-
Regarding Obtaining Executor ID and GPU Binding in PySpark
wuchaowei
-
[Spark SQL]: Spark 4 logs warning and stack trace when loading dataframe from path containing wildcard
Glenn J
-
spark.api.mode property is not available in spark 4.0.0
Sangram Mohanty
-
Spark Job Stuck in Active State (v2.4.3, Cluster Mode)
Hitesh Vaghela
-
[Spark SQL]: Spark can't read views created via Trino using enableHiveSupport.
Tal Haimov
-
Clarification on failOnDataLoss Behavior in Spark Structured Streaming with Kafka
Nimrod Ofek
-
Question about Spark Tag in TreeNode
Yifan Li
-
[PR REVIEW] Glob based provider for history server
Gaurav Waghmare
-
Spark checkpointing in batch mode fault tolerance problem
Martin Aras
-
[ANNOUNCE] Apache Spark Kubernetes Operator 0.4.0 released
Dongjoon Hyun
-
[PYSPARK] createDataFrame throws exception with Python 3.12.3
Eyck Troschke
-
Technical Guidance: Dynamic Resource Allocation + External Shuffle Storage
Andrew M.
-
Inquiry About User Impersonation Support in Spark Thrift Server (Spark 1.x to 4.x)
Allen Chu
-
What is the current canonical way to join more than 2 watermarked streams (Spark 3.5.6)?
cheapsolutionarchit...@gmail.com
-
pyspark4.0.0 still includes "jackson-mapper-asl.jar" that was supposed to be removed according to release note
Haibo.Wang
-
Spark on Kubernetes, configmap add log4j2.properties data
melin li
-
[SQL]: Registering spark extensions which utilise DataSourceV2Strategy in Spark 4
Jack Buggins
-
[ANNOUNCE] Apache Sedona 1.7.2 released
Jia Yu
-
[ANNOUNCE] Apache Spark Kubernetes Operator 0.3.0 released
Dongjoon Hyun
-
[ANNOUNCE] Apache Spark Connect Swift Client 0.3.0 released
Dongjoon Hyun
-
Inquiry: Extending Spark ML Support via Spark Connect to Scala/Java APIs (SPARK-50812 Analogue)
Daniel Filev
-
[ANNOUNCE] Apache Spark 4.0.0 released
Wenchen Fan
-
[PYSPARK] df.collect throws exception for MapType with ArrayType as key
Eyck Troschke
-
Aligning pom.xml in Bundled PySpark JARs with Effective Runtime Dependencies for SCA Tools
Guzarevich, M. (Mikalai)
-
Reg: spark delta table read failing
Akram Shaik
-
[ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.2.0
Dongjoon Hyun
-
[ANNOUNCE] Announcing Apache Spark Connect Swift Client 0.2.0
Dongjoon Hyun
-
Structured streaming consumer group offset management in case of consumption of topic with same name from different Kafka clusters
megh vidani
-
user-unsubscribe
Sky Yin
-
Help requested: Spark security triage and followup
Apache Security Team
-
Apache Sedona + Iceberg GEO meetup in San Francisco
Jia Yu
-
[ANNOUNCE] Announcing Apache Spark Connect Swift Client 0.1.0
Dongjoon Hyun
-
[ANNOUNCE] Announcing Apache Spark Kubernetes Operator 0.1.0
Dongjoon Hyun
-
[ML] Does GeneralizedLinearRegression correctly handle interaction between two categorical values?
Emil Hofman
-
Fw: Reg: Supporting inheritance for datatypes in pyspark
Vaibhaw
-
[Spark SQL] spark.sql insert overwrite on existing partition not updating hive metastore partition transient_lastddltime and column_stats
Pradeep
-
Issue with Spark Operator
nilanjan sarkar
-
Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3
Sungwoo Park
-
Appreciate a second opinion – Metadata Analysis of PDF Files
Mich Talebzadeh
-
Checkpointing in foreachPartition in Spark batch
Abhishek Singla
-
Comparison between union and stack in pyspark
Dhruv Singla
-
Structured Streaming Initial Listing Issue
Anastasiia Sokhova
-
Parallelism for glue pyspark jobs
Perez
-
The use of Python ParamSpec in PySpark
Rafał Wojdyła
-
Spark Streaming Dataset with Multiple S3 Sources is too Slow
Jevon Cowell
-
Is "SORTED BY (col DESC)" Supported for Bucketed Table?
Joe Lee
-
kubernetes spark connect iceberg SparkWrite$WriterFactory not found
Razvan Mihai
-
High count of Active Jobs
nayan sharma
-
Announcing the Community Over Code 2025 Streaming Track
James Hughes
-
Kubeflow Spark-Operator
Hamish Whittal
-
Correctness Issue: UNIX_SECONDS() mismatch with TO_UTC_TIMESTAMP() result in Spark 3.5.1
Miguel Leite
-
Executors not getting released dynamically once task is over
Shivang Modi
-
Java coding with spark API
tim wade
-
Spark 3.3 job jar assembly with JDK 17 and JRE 11 runtime (java target/source = 8)
Kristopher Kane
-
Request for Support and Resources for Apache Spark User Groups in Bogotá and Mexico
Juan Diaz
-
Inquiry in regards to a New onQuery Method for StreamingQueryListener
Jevon Cowell
-
performance issue Spark 3.5.2 on kubernetes
Prem Sahoo
-
Spark Shuffle - in kubeflow spark operator installation on k8s
karan alang
-
Motif finding tutorial
Russell Jurney
-
Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance
Prem Gmail
-
High/Critical CVEs in jackson-mapper-asl (spark 3.5.5)
Mohammad, Ejas Ali
-
Spark Kubernetes Operator | Release Date
Dheeraj Panangat
-
[ANNOUNCE] Apache Sedona 1.7.1 released
Jia Yu
-
Multiple CVE issues in apache/spark-py:3.4.0 + Pyspark 3.4.0
Mohammad, Ejas Ali
-
[ANNOUNCE] Apache Celeborn 0.5.4 available
Nicholas
-
4.1.0 release timeline
Martin Bielik
-
[ANNOUNCE] Version 2.0.0-beta1 of hnswlib spark released
jelmer
-
[CONNECT] Question on Spark Connect in Cluster Deploy Mode
Yasukazu Nagatomi
-
Apply pivot only on some columns in pyspark
Dhruv Singla
-
[ANNOUNCE] Apache Spark 3.5.5 released
Dongjoon Hyun
-
Optimizing file size of an iceberg table
Pathum Wijethunge
-
Re: Apache - GSOC'25 projects / Contributions
Mich Talebzadeh
-
Kafka Connector: producer throttling
Abhishek Singla
-
GraphFrames Hackathon - NOW :)
Russell Jurney
-
Using storage decommissioning on K8S cluster
Enrico Minack
-
Spark connect: Table caching for global use?
Tim Robertson
-
END OF LIFE DETERMINATION
Izhar Mohammed
-
Doubt regarding year formatting
Dhruv Singla
-
Spark Website Styling Issues Partially Resolved
Gengliang Wang
-
Website Down
Will Dumas
-
Is SSL configuration being used for RPC communication?
Pablo Fernández
-
GraphFrames Hackathon on Friday, February 21
Russell Jurney
-
Drop Python 2 support from GraphFrames?
Russell Jurney