[jira] [Updated] (SPARK-41150) Document debugging with PySpark memory profiler

2022-11-15 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-41150: - Description: Document how to debug with PySpark memory profiler on https://spark.apache.org

[jira] [Resolved] (SPARK-49315) Generalize `relocateGeneratedCRD` Gradle Task to handle `*.spark.apache.org-v1.yml`

2024-08-19 Thread Dongjoon Hyun (Jira)
request 72 [https://github.com/apache/spark-kubernetes-operator/pull/72] > Generalize `relocateGeneratedCRD` Gradle Task to handle > `*.spark.apache.org-v1.yml` > --- > > K

[jira] [Resolved] (SPARK-49316) Generalize `printer-columns.sh` to handle `*.spark.apache.org-v1.yml` files

2024-08-19 Thread Dongjoon Hyun (Jira)
request 73 [https://github.com/apache/spark-kubernetes-operator/pull/73] > Generalize `printer-columns.sh` to handle `*.spark.apache.org-v1.yml` files > --- > > Key: SPARK-49316 >

[jira] [Resolved] (SPARK-10700) Spark R Documentation not available

2015-09-18 Thread Shivaram Venkataraman (JIRA)
Affects Versions: 1.5.0 >Reporter: Dev Lakhani >Assignee: Shivaram Venkataraman >Priority: Minor > > Documentation > https://spark.apache.org/docs/latest/api/R/glm.html referred to in > https://spark.apache.org/docs/latest/sparkr.html is not availa

[jira] [Commented] (SPARK-10759) Missing Python code example in ML Programming guide

2015-10-10 Thread Bhargav Mangipudi (JIRA)
Issue Type: Improvement > Components: Documentation >Affects Versions: 1.5.0 >Reporter: Raela Wang >Assignee: Lauren Moos >Priority: Minor > Labels: starter > > http://spark.apache.org/docs/latest/ml-guide.html#ex

[jira] [Commented] (SPARK-6306) Readme points to dead link

2015-03-12 Thread Theodore Vasiloudis (JIRA)
n >Reporter: Theodore Vasiloudis >Priority: Trivial > Fix For: 1.4.0 > > > The link to "Specifying the Hadoop Version" now points to > http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version. > T

Re: [PR] [SPARK-41794][SQL] Add `try_remainder` function and re-enable column tests [spark]

2024-05-08 Thread via GitHub
gengliangwang commented on PR #46434: URL: https://github.com/apache/spark/pull/46434#issuecomment-2100956904 @grundprinzip thanks for the work! Let's also mention the new function in https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html#useful-functions-for-ansi

Re: [PR] add possibility to set log filename & disable spark log rotation #47373 [spark]

2024-07-17 Thread via GitHub
Tocard commented on PR #47383: URL: https://github.com/apache/spark/pull/47383#issuecomment-2232827755 > Can we file a JIRA please? See also https://spark.apache.org/contributing.html as improvement? I was thinking you would do it, my bad. Anyway, waiting for approval. -- This is

Re: [PR] Multiple spark application can be submitted within the current process [spark]

2024-08-01 Thread via GitHub
://spark.apache.org/contributing.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact

Re: [PR] [SPARK-48714][PYTHON] Implement `DataFrame.mergeInto` in PySpark [spark]

2024-08-15 Thread via GitHub
ation](https://spark.apache.org/docs/3.1.3/api/python/reference/index.html). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[PR] [SPARK-49276] Use API Group `spark.apache.org` [spark-kubernetes-operator]

2024-08-17 Thread via GitHub
comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

Re: [PR] [SQL] Bind JDBC dialect to JDBCRDD at construction [spark]

2024-08-18 Thread via GitHub
HyukjinKwon commented on PR #45410: URL: https://github.com/apache/spark/pull/45410#issuecomment-2295156216 @johnnywalker Let's file a JIRA, see also https://spark.apache.org/contributing.html @urosstan-db I will leave this to you to approve or not. cc @cloud-fan too -- This

Re: [PR] [SPARK-49324] Add state transition e2e test for happy path [spark-kubernetes-operator]

2024-08-24 Thread via GitHub
: -apiVersion: v1 +apiVersion: spark.apache.org/v1alpha1 Review Comment: Removed this from this PR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e

[GitHub] [spark] lyssg commented on pull request #35667: [K8S] Avoid possible errors due to incorrect file size or type supplied in hadoop conf

2022-02-27 Thread GitBox
lyssg commented on pull request #35667: URL: https://github.com/apache/spark/pull/35667#issuecomment-1053761559 > @lyssg mind linking the JIRA into the PR title please? See also https://spark.apache.org/contributing.html Thanks, I will complete it. -- This is an automated mess

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #35110: [SPARK-37820][SQL] Replace ApacheCommonBase64 with JavaBase64 for string funcs

2022-01-06 Thread GitBox
dongjoon-hyun edited a comment on pull request #35110: URL: https://github.com/apache/spark/pull/35110#issuecomment-1006371310 Yes, here it is, `Running benchmarks in your forked repository`. It's an easy and nice tool which @HyukjinKwon contributed. - https://spark.apache.org/deve

[GitHub] [spark] yaooqinn commented on pull request #35110: [SPARK-37820][SQL] Replace ApacheCommonBase64 with JavaBase64 for string funcs

2022-01-06 Thread GitBox
kjinKwon contributed. > > * https://spark.apache.org/developer-tools.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: re

[GitHub] [spark] bjornjorgensen commented on pull request #35488: [SPARK-38183][PYTHON] Show warning when creating pandas-on-Spark session under ANSI mode.

2022-02-11 Thread GitBox
bjornjorgensen commented on pull request #35488: URL: https://github.com/apache/spark/pull/35488#issuecomment-1036045494 SQL ANSI mode 'spark.sql.ansi.enabled' is set to True. This is an experimental config. For more information spark.apache.org/docs/latest/sql-ref-ansi-compl

[GitHub] [spark] ahmed-mahran commented on pull request #38966: [SPARK-41008][MLLIB] Dedup isotonic regression duplicate features

2022-12-08 Thread GitBox
ahmed-mahran commented on PR #38966: URL: https://github.com/apache/spark/pull/38966#issuecomment-1342832587 I think we need to follow up with documentation updates https://spark.apache.org/docs/latest/mllib-isotonic-regression.html#:~:text=If%20the%20prediction,point%20are%20used

[GitHub] [spark] packyan commented on pull request #39021: [SPARK-41483] Last metrics system report should have a timeout, avoid to lead shutdown hook timeout

2022-12-11 Thread GitBox
packyan commented on PR #39021: URL: https://github.com/apache/spark/pull/39021#issuecomment-1345823687 > @packyan mind creating a JIRA and linking it to PR title? See also https://spark.apache.org/contributing.html Sorry, I will do it later. -- This is an automated message f

[GitHub] [spark] dongjoon-hyun commented on pull request #39371: [SPARK-41030][BUILD][3.2] Upgrade `Apache Ivy` to 2.5.1

2023-01-04 Thread GitBox
dongjoon-hyun commented on PR #39371: URL: https://github.com/apache/spark/pull/39371#issuecomment-1371344946 Before `v3.2.4`, - `v3.3.2` will arrive on Feb/March timeframe - `v3.4.0` feature freeze will start on January 16th and RC will start on [February](https://spark.apache.org

[GitHub] [spark] martin-g commented on pull request #36358: [SPARK-39023] [K8s] Add Executor Pod inter-pod anti-affinity

2022-04-26 Thread GitBox
martin-g commented on PR #36358: URL: https://github.com/apache/spark/pull/36358#issuecomment-1110582965 > BTW, I'd like to know about the one you mentioned by using the PodTemplate config. https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template -- Th

[GitHub] [spark] tanvn commented on pull request #36618: [SPARK-39237][DOCS][3.2] Update the ANSI SQL mode documentation

2022-05-21 Thread GitBox
tanvn commented on PR #36618: URL: https://github.com/apache/spark/pull/36618#issuecomment-1133825459 @gengliangwang Thank you for the update! May I know when the change will be reflected on https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html ? -- This is an

[GitHub] [spark] itholic commented on pull request #38446: [Spark-40974]When the value of quote or escape exists in the content of csv file, the character in the csv file will be misidentified

2022-10-31 Thread GitBox
itholic commented on PR #38446: URL: https://github.com/apache/spark/pull/38446#issuecomment-1297284288 And also checking https://spark.apache.org/contributing.html would be helpful! :-) -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] amaliujia commented on pull request #38506: [SPARK-41010][CONNECT][PYTHON] Complete Support for Except and Intersect in Python client

2022-11-03 Thread GitBox
amaliujia commented on PR #38506: URL: https://github.com/apache/spark/pull/38506#issuecomment-1302820909 Actually, I will follow https://spark.apache.org/contributing.html to add a short description for test cases in this PR. -- This is an automated message from the Apache Git Service

[GitHub] [spark] amaliujia commented on pull request #38488: [SPARK-41002][CONNECT][PYTHON] Compatible `take`, `head` and `first` API in Python client

2022-11-03 Thread GitBox
amaliujia commented on PR #38488: URL: https://github.com/apache/spark/pull/38488#issuecomment-1302821036 Actually, I will follow https://spark.apache.org/contributing.html to add a short description for test cases in this PR. -- This is an automated message from the Apache Git

[GitHub] [spark] itholic commented on pull request #38702: SPARK-41187 [Core] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happen

2022-11-17 Thread GitBox
itholic commented on PR #38702: URL: https://github.com/apache/spark/pull/38702#issuecomment-1319566712 Can we change the JIRA format in the title such as "[SPARK-41187][CORE] ...". Check the [Spark contribution guide](https://spark.apache.org/contributing.html) also wou

[GitHub] [spark] srowen commented on pull request #39566: Patched()Fix Protobuf Java vulnerable to Uncontrolled Resource Consumption

2023-01-24 Thread via GitHub
srowen commented on PR #39566: URL: https://github.com/apache/spark/pull/39566#issuecomment-1402320043 Hold up a sec. First please read https://spark.apache.org/contributing.html Where does this actually affect Spark? You have only updated a protobuf dependency in the Kinesis

[GitHub] [spark] zhengruifeng commented on pull request #40116: [WIP]SPARK-41391 Fix

2023-02-23 Thread via GitHub
zhengruifeng commented on PR #40116: URL: https://github.com/apache/spark/pull/40116#issuecomment-1441434811 I guess you may need to `Go to “Actions” tab on your forked repository and enable “Build and test” and “Report test results” workflows` https://spark.apache.org

[GitHub] [spark] jelmerk commented on pull request #40219: Disable substitution in values

2023-02-28 Thread via GitHub
jelmerk commented on PR #40219: URL: https://github.com/apache/spark/pull/40219#issuecomment-1448415053 > (Can you just file a JIRA and link it per https://spark.apache.org/contributing.html ?) Already did that a few mins ago and added it on top of the PR description -- This is

[GitHub] [spark] philwalk commented on pull request #38167: fix problems that affect windows shell environments (cygwin/msys2/mingw)

2022-10-10 Thread GitBox
philwalk commented on PR #38167: URL: https://github.com/apache/spark/pull/38167#issuecomment-1273490984 > https://spark.apache.org/contributing.html? e.g., let's file a JIRA and link it to the PR title. I'm looking into it now on the JIRA website. -- This is an auto

[GitHub] [spark] zhengruifeng commented on pull request #38213: fix runtime filter do not execute when no stats

2022-10-12 Thread GitBox
zhengruifeng commented on PR #38213: URL: https://github.com/apache/spark/pull/38213#issuecomment-1275837267 A JIRA ticket is needed; you can refer to https://spark.apache.org/contributing.html. If this is a bug fix, it's better to have a UT for the fix. -- This is an auto

[GitHub] [spark] HyukjinKwon commented on pull request #38371: Fix some comments in DAGSchedulerSuite

2022-10-24 Thread GitBox
https://spark.apache.org/contributing.html) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this

[GitHub] [spark] dongjoon-hyun commented on pull request #36011: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE Optimizer

2022-10-25 Thread GitBox
dongjoon-hyun commented on PR #36011: URL: https://github.com/apache/spark/pull/36011#issuecomment-1291093774 https://spark.apache.org/versioning-policy.html ![Screenshot 2022-10-25 at 1 22 56 PM](https://user-images.githubusercontent.com/9700541/197874273-40dc594a-3288-4537-a491

[GitHub] [spark] gengliangwang commented on pull request #37226: [MINOR][SQL] Simplify the description of built-in function.

2022-07-25 Thread GitBox
gengliangwang commented on PR #37226: URL: https://github.com/apache/spark/pull/37226#issuecomment-1194942336 Seems quite minor. The catalyst expressions are not public to users either (https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/index.html). I am +0 on this

[GitHub] [spark] zzzzming95 commented on pull request #37416: [SPARK-39743][DOCS] Updated some spark.io.compression configuration descriptions to clarify parameter applica…

2022-08-04 Thread GitBox
zzzzming95 commented on PR #37416: URL: https://github.com/apache/spark/pull/37416#issuecomment-1206001277 > @zzzzming95 mind filing a JIRA please? See also https://spark.apache.org/contributing.html Sorry, forgot to fill in the JIRA in the description. -- This is an automa

[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-20 Thread HyukjinKwon
#' @param dataType a character object describing the target data type. #' See -#' \href{https://spark.apache.org/docs/latest/sparkr.html#data-type-mapping-between-r-and-spark}{ -#'Spark Data Types} for available data types. +#'

[GitHub] [spark] HyukjinKwon commented on pull request #33619: [MINOR][DOC] Remove obsolete `contributing-to-spark.md`

2021-08-03 Thread GitBox
HyukjinKwon commented on pull request #33619: URL: https://github.com/apache/spark/pull/33619#issuecomment-892284220 I think actually this is to keep the legacy link not to break (https://spark.apache.org/docs/latest/contributing-to-spark.html). But it's 8 years ago .. I guess that

[GitHub] [spark] viirya commented on pull request #33763: [SPARK-36533][SS] Trigger.AvailableNow for running streaming queries like Trigger.Once in multiple batches

2021-08-24 Thread GitBox
viirya commented on pull request #33763: URL: https://github.com/apache/spark/pull/33763#issuecomment-905056479 The structured streaming programming guide: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers The source is at docs/structured

[GitHub] [spark] HyukjinKwon commented on pull request #34595: Result vector from pandas_udf was not the required length

2021-11-14 Thread GitBox
[`DataFrame.mapInPandas`](https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.DataFrame.mapInPandas.html). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e

[GitHub] [spark] AngersZhuuuu edited a comment on pull request #34715: [WIP][SPARK-37445][BUILD] Rename the maven hadoop profile to hadoop-3 and hadoop-2

2021-11-25 Thread GitBox
AngersZhuuuu edited a comment on pull request #34715: URL: https://github.com/apache/spark/pull/34715#issuecomment-979677759 Do we need to clarify this change in https://spark.apache.org/docs/latest/api/python/getting_started/install.html#manually-downloading ? cc @HyukjinKwon

[GitHub] [spark] maropu commented on pull request #32040: Add @ExpressionDescription to TimeWindow to generate missing documentation

2021-04-02 Thread GitBox
maropu commented on pull request #32040: URL: https://github.com/apache/spark/pull/32040#issuecomment-812766403 Probably, we can add it somewhere in the `SQL, DataFrames, and Datasets` doc: https://spark.apache.org/docs/3.1.1/sql-getting-started.html just like the doc of structured

[GitHub] [spark] HyukjinKwon commented on pull request #33839: [SPARK-36291][SQL] Refactor second set of 20 in QueryExecutionErrors to use error classes

2021-09-25 Thread GitBox
HyukjinKwon commented on pull request #33839: URL: https://github.com/apache/spark/pull/33839#issuecomment-927221521 @dgd-contributor, please contact me or priv...@spark.apache.org. As I shared in the email, the submissions from the specific shared account will not be accepted for now

[GitHub] SandishKumarHN commented on issue #23401: [SPARK-26513][Core] : Trigger GC on executor node idle

2018-12-31 Thread GitBox
SandishKumarHN commented on issue #23401: [SPARK-26513][Core] : Trigger GC on executor node idle URL: https://github.com/apache/spark/pull/23401#issuecomment-450690065 @srowen I did send out a proposal email to d...@spark.apache.org and u...@spark.apache.org and created JIRA and does

[GitHub] [spark] zhengruifeng commented on pull request #41444: [SPARK-43916][SQL][PYTHON][CONNECT] Add percentile like functions to Scala and Python API

2023-06-04 Thread via GitHub
zhengruifeng commented on PR #41444: URL: https://github.com/apache/spark/pull/41444#issuecomment-1575995156 Where are `percentile_cont` and `percentile_disc` from? I can not find them in https://spark.apache.org/docs/latest/api/sql/index.html and `FunctionRegistry` -- This is an

[GitHub] [spark] dongjoon-hyun commented on pull request #41469: [SPARK-43974][CONNECT][BUILD] Upgrade buf to v1.21.0

2023-06-06 Thread via GitHub
dongjoon-hyun commented on PR #41469: URL: https://github.com/apache/spark/pull/41469#issuecomment-1579770666 Thank you. Yes, I agree with you. Since the feature freeze is July 16th, maybe after July 10th? - https://spark.apache.org/versioning-policy.html -- This is an automated

[GitHub] [spark] zhengruifeng commented on pull request #41469: [SPARK-43974][CONNECT][BUILD] Upgrade buf to v1.21.0

2023-06-06 Thread via GitHub
zhengruifeng commented on PR #41469: URL: https://github.com/apache/spark/pull/41469#issuecomment-1579788969 > Thank you. Yes, I agree with you. Since the feature freeze is July 16th, maybe after July 10th? > > * https://spark.apache.org/versioning-policy.html

[GitHub] [spark] dongjoon-hyun commented on pull request #41238: [SPARK-43594][SQL] Add LocalDateTime to anyToMicros

2023-06-09 Thread via GitHub
dongjoon-hyun commented on PR #41238: URL: https://github.com/apache/spark/pull/41238#issuecomment-1585492657 Just FYI, @Fokko . Apache Spark 3.5.0 Feature Freeze is next month (July 16th) and you will have Apache Spark 3.5.0 in August. - https://spark.apache.org/versioning-policy.html

[GitHub] [spark] dongjoon-hyun commented on pull request #41374: [SPARK-43876][SQL] Enable fast hashmap for distinct queries

2023-06-20 Thread via GitHub
dongjoon-hyun commented on PR #41374: URL: https://github.com/apache/spark/pull/41374#issuecomment-1599198896 Merged to master for Apache Spark 3.5.0. Thank you, @wankunde . - https://spark.apache.org/versioning-policy.html (Apache Spark 3.5.0 on August 2023) -- This is an automated

Re: [PR] [SPARK-46894][PYTHON] Move PySpark error conditions into standalone JSON file [spark]

2024-03-27 Thread via GitHub
HyukjinKwon commented on PR #44920: URL: https://github.com/apache/spark/pull/44920#issuecomment-2024304980 Just to make sure, does it work when you install PySpark as a ZIP file? e.g., downloading it from https://spark.apache.org/downloads.html would install PySpark as a ZIP file

[GitHub] [spark] HyukjinKwon commented on pull request #42350: [SPARK-44662] Perf improvement in BroadcastHashJoin queries with stream side join key on non partition columns

2023-08-06 Thread via GitHub
HyukjinKwon commented on PR #42350: URL: https://github.com/apache/spark/pull/42350#issuecomment-1667012513 Please follow https://spark.apache.org/improvement-proposals.html. We should start the discussion and vote to pass with one of PMC members shepherded -- This is an automated

[GitHub] [spark] zhengruifeng commented on pull request #42469: [SPARK-44782][INFRA] Adjust PR template to Generative Tooling Guidance recommendations

2023-08-13 Thread via GitHub
zhengruifeng commented on PR #42469: URL: https://github.com/apache/spark/pull/42469#issuecomment-1676521158 I guess this should be documented in https://spark.apache.org/developer-tools.html instead of PR template? cc @HyukjinKwon @gatorsmile @srowen -- This is an automated

[GitHub] [spark] HyukjinKwon commented on pull request #38624: [SPARK-40559][PYTHON] Add applyInArrow to groupBy and cogroup

2023-08-21 Thread via GitHub
HyukjinKwon commented on PR #38624: URL: https://github.com/apache/spark/pull/38624#issuecomment-1687231741 > Where is this dev mailing list and how do I raise a discussion there? See https://spark.apache.org/news/spark-mailing-lists-moving-to-apache.html -- This is an automa

Re: [PR] [MINOR][DOCS] Clarify sort behaviour for structs [spark]

2023-11-21 Thread via GitHub
landlord-matt commented on PR #43871: URL: https://github.com/apache/spark/pull/43871#issuecomment-1820696947 Another alternative would be instead of writing it for this function, we could add something on this documentation page. I don't think it is open source though https://spark.apach

[GitHub] [spark] srowen commented on pull request #41033: Update bufbuild plugin references

2023-05-04 Thread via GitHub
srowen commented on PR #41033: URL: https://github.com/apache/spark/pull/41033#issuecomment-1534935590 Looks good, if you can resolve the conflicts and tests pass. File a little JIRA and update the title if you please too https://spark.apache.org/contributing.html -- This is an

[GitHub] [spark] HyukjinKwon commented on pull request #41084: [PYTHON] Remove deprecated use of typing.io

2023-05-07 Thread via GitHub
://spark.apache.org/contributing.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact

Re: [PR] [SPARK-43919][SQL] Extract JSON functionality out of Row [spark]

2024-01-16 Thread via GitHub
tfinn-ias commented on PR #41425: URL: https://github.com/apache/spark/pull/41425#issuecomment-1893858562 Hi, your current public documentation for the Java API lists these methods as available with no disclaimers about them being private or unstable: https://spark.apache.org/docs

Re: [PR] [MINOR][PYTHON] refactor PythonWrite to prepare for supporting python data source streaming write [spark]

2024-02-06 Thread via GitHub
xinrong-meng commented on PR #45049: URL: https://github.com/apache/spark/pull/45049#issuecomment-1930884879 Would you create a Spark JIRA https://issues.apache.org/jira/browse/SPARK and add it to the PR title? Please refer to https://spark.apache.org/contributing.html for details. Thanks

[GitHub] [spark] Yaohua628 commented on pull request #36069: [SPARK-38767][SQL] Support `ignoreCorruptFiles` and `ignoreMissingFiles` in Data Source options

2022-07-20 Thread GitBox
Yaohua628 commented on PR #36069: URL: https://github.com/apache/spark/pull/36069#issuecomment-1190879022 @cloud-fan @dongjoon-hyun Hi folks, do you know when the doc changes will be deployed? https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html? Thanks! -- This

[GitHub] [spark] zhengruifeng commented on pull request #42793: [SPARK-45065][PYTHON][PS] Support Pandas 2.1.0

2023-09-14 Thread via GitHub
zhengruifeng commented on PR #42793: URL: https://github.com/apache/spark/pull/42793#issuecomment-1719206074 not related to this PR itself, what is the policy to upgrade the minimum version of dependencies listed [here](https://spark.apache.org/docs/latest/api/python/getting_started

[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails

2023-09-20 Thread via GitHub
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1728702545 Please send the mail to priv...@spark.apache.org and look for action from PMC members. -- This is an automated message from the Apache Git Service. To respond to the message

[GitHub] [spark] dongjoon-hyun edited a comment on issue #24304: [MINOR][DOC] Fix html tag broken in configuration.md

2019-04-05 Thread GitBox
dongjoon-hyun edited a comment on issue #24304: [MINOR][DOC] Fix html tag broken in configuration.md URL: https://github.com/apache/spark/pull/24304#issuecomment-480325000 We need to fix the following, too. - https://spark.apache.org/docs/2.3.3/configuration.html#available-properties

[GitHub] [spark] dongjoon-hyun commented on issue #27987: [SPARK-31165] Correcting wrong paths in Dockerfile

2020-03-25 Thread GitBox
dongjoon-hyun commented on issue #27987: [SPARK-31165] Correcting wrong paths in Dockerfile URL: https://github.com/apache/spark/pull/27987#issuecomment-603929739 BTW, https://spark.apache.org/docs/latest/running-on-kubernetes.html might not be a good place to have that. Please consider

RE: If not stop StreamingContext gracefully, will checkpoint data be consistent?

2015-06-14 Thread Haopu Wang
-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Optimization module in Python mllib

2015-06-17 Thread Xiangrui Meng
iling list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: u

Re: PySpark/SQL Octet Length

2016-03-08 Thread Ross.Cramblit
park 1.5.2/Python 2.7. > > Thanks > > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe

Re: Does parallelize and collect preserve the original order of list?

2016-03-15 Thread Ted Yu
Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org >

Re: Is it a bug?

2016-05-09 Thread Daniel Haviv
> > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > > > - > > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > > For additional commands, e-mail: user-h...@spark.apache.org > > > > ----

Re: Filter out the elements from xml file in Spark

2016-05-19 Thread Mail.com
--- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apach

Re: Kafka connection logs in Spark

2016-05-26 Thread Cody Koeninger
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Serializing DataSets

2016-01-18 Thread Simon Hafner
------ >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> > ----- To unsub

Re: Spark SQL 1.3 not finding attribute in DF

2015-12-07 Thread Davies Liu
ute-in-DF-tp25599p25600.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ---

Is there more information about spark shuffle-service

2015-07-21 Thread JoneZhang
There is a saying "If the service is enabled, Spark executors will fetch shuffle files from the service instead of from each other. " in the wiki https://spark.apache.org/docs/1.3.0/job-scheduling.html#graceful-decommission-of-executors <https://spark.apache.org/docs/1.3.0/job-sc

Re: How to update python code in memory

2015-09-16 Thread Davies Liu
---- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Kafka message metadata with Dstreams

2016-08-25 Thread Cody Koeninger
http://spark.apache.org/docs/latest/api/java/index.html messageHandler receives a kafka MessageAndMetadata object. Alternatively, if you just need metadata information on a per-partition basis, you can use HasOffsetRanges http://spark.apache.org/docs/latest/streaming-kafka-integration.html

Re: Spark 1.1 (slow, working), Spark 1.2 (fast, freezing)

2015-01-21 Thread Davies Liu
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Is Ubuntu server or desktop better for spark cluster

2015-02-14 Thread Sean Owen
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: HDP 2.2 AM abort : Unable to find ExecutorLauncher class

2015-03-30 Thread Doug Balog
tion" message.) > > Mike Stone > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: use

Re: LDA code little error @Xiangrui Meng

2015-04-22 Thread Xiangrui Meng
the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscri

Re: [PySpark][Python 2.7.8][Spark 1.0.2] count() with TypeError: an integer is required

2014-08-23 Thread Eric Friedman
-- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: New features (Discretization) for v1.x in xiangrui.pdf

2014-09-02 Thread Xiangrui Meng
ing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr..

Re: Running spark-shell (or queries) over the network (not from master)

2014-09-05 Thread Ognen Duzlevski
st.1001560.n3.nabble.com/Running-spark-shell-or-queries-over-the-network-not-from-master-tp13543p13593.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.

Re: groupBy gives non deterministic results

2014-09-10 Thread redocpot
Hi, Xianjin I checked user@spark.apache.org, and found my post there: http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/browser I am using nabble to send this mail, which indicates that the mail will be sent from my email address to the u...@spark.incubator.apache.org mailing list

Re: Question About Submit Application

2014-09-25 Thread Marcelo Vanzin
-- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > -- Marcelo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Spark SQL DDL, DML commands

2014-10-16 Thread Yi Tian
he Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To u

RE: spark sql: timestamp in json - fails

2014-10-20 Thread Wang, Daoyuan
. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For

Re: spark-repl_1.2.0 was not uploaded to central maven repository.

2014-12-21 Thread Sean Owen
- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

textFile() ordering and header rows

2015-02-22 Thread Michael Malak
Since RDDs are generally unordered, aren't things like textFile().first() not guaranteed to return the first row (such as looking for a header row)? If so, doesn't that make the example in http://spark.apache.org/docs/1.2.1/quick-start.html#basics

[GitHub] spark issue #21419: Branch 2.2

2018-05-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21419 @gentlewangyu, please close this and read https://spark.apache.org/contributing.html. Questions should go to mailing list and issues should be filed in JIRA

[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...

2018-03-19 Thread foxish
Github user foxish commented on the issue: https://github.com/apache/spark/pull/20669 There's a section explaining it at the bottom of https://spark.apache.org/committers.html --- - To unsubscribe, e-mail: re

[GitHub] spark issue #20370: Changing JDBC relation to better process quotes

2018-01-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20370 @conorbmurphy Could you create a JIRA and follow [the instruction](https://spark.apache.org/contributing.html) to make a contribution

[GitHub] spark issue #20790: AccumulatorV2 subclass isZero scaladoc fix

2018-03-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20790 Wait .. I just found you opened a JIRA - SPARK-23642. Please link it by `[SPARK-23642][DOCS] ...`. see https://spark.apache.org/contributing.html

[GitHub] spark issue #21988: [SPARK-25003][PYSPARK][BRANCH-2.2] Use SessionExtensions...

2018-08-07 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/21988 we always open against master and backport if agreed upon. this is documented here https://spark.apache.org/contributing.html

[GitHub] spark issue #21884: k8s: explicitly expose ports on driver container

2018-07-26 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/21884 can you update the format of the title and description as described here "Pull Request" in https://spark.apache.org/contrib

[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.

2017-10-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18833 Can we document this difference in https://spark.apache.org/docs/latest/sql-programming-guide.html#compatibility-with-apache-hive

[GitHub] spark issue #19263: Optionally add block updates to log

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19263 @michaelmior would you please follow the instruction (https://spark.apache.org/contributing.html) to update PR title and create a corresponding JIRA, thanks

Python Kafka support?

2015-11-10 Thread Darren Govoni
Hi, I read on this page http://spark.apache.org/docs/latest/streaming-kafka-integration.html about Python support for "receiverless" Kafka integration (Approach 2) but it says it's incomplete as of version 1.4. Has this been updated in version 1.5.

Re: subscribe

2015-08-22 Thread Brandon White
https://www.youtube.com/watch?v=umDr0mPuyQc On Sat, Aug 22, 2015 at 8:01 AM, Ted Yu wrote: > See http://spark.apache.org/community.html > > Cheers > > On Sat, Aug 22, 2015 at 2:51 AM, Lars Hermes < > li...@hermes-it-consulting.de>

Re: unsubscribe

2019-06-12 Thread B2B Web ID
Hi, Sonu. You can send email to user-unsubscr...@spark.apache.org with subject "(send this email to unsubscribe)" to unsubscribe from this mailing list[1]. Regards. [1] https://spark.apache.org/community.html 2019-05-27 2:01 GMT+07.00, Sonu Jyotshna : > > -- -- Salam H

[53/56] spark-website git commit: Rebuild for 2.2.0

2017-07-11 Thread marmbrus
://spark.apache.org/releases/spark-release-2-2-0.html + weekly + + + https://spark.apache.org/news/spark-2-2-0-released.html + weekly + + https://spark.apache.org/releases/spark-release-2-1-1.html weekly @@ -644,11 +652,11 @@ weekly - https://spark.apache.org/graphx/ + https

Re: spark on kubernetes

2016-05-23 Thread Gurvinder Singh
kers UI is not possible as then I have to expose them too individually and given I can have multiple application it becomes hard to manage.
