Mailing lists matching spark.apache.org
commits spark.apache.org
dev spark.apache.org
issues spark.apache.org
reviews spark.apache.org
user spark.apache.org
[jira] [Updated] (SPARK-41150) Document debugging with PySpark memory profiler
[ https://issues.apache.org/jira/browse/SPARK-41150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-41150: - Description: Document how to debug with PySpark memory profiler on https://spark.apache.org
[jira] [Resolved] (SPARK-49315) Generalize `relocateGeneratedCRD` Gradle Task to handle `*.spark.apache.org-v1.yml`
request 72 [https://github.com/apache/spark-kubernetes-operator/pull/72] > Generalize `relocateGeneratedCRD` Gradle Task to handle > `*.spark.apache.org-v1.yml` > --- > > K
[jira] [Resolved] (SPARK-49316) Generalize `printer-columns.sh` to handle `*.spark.apache.org-v1.yml` files
request 73 [https://github.com/apache/spark-kubernetes-operator/pull/73] > Generalize `printer-columns.sh` to handle `*.spark.apache.org-v1.yml` files > --- > > Key: SPARK-49316 >
[jira] [Resolved] (SPARK-10700) Spark R Documentation not available
Affects Versions: 1.5.0 >Reporter: Dev Lakhani >Assignee: Shivaram Venkataraman >Priority: Minor > > Documentation > https://spark.apache.org/docs/latest/api/R/glm.html referred to in > https://spark.apache.org/docs/latest/sparkr.html is not availa
[jira] [Commented] (SPARK-10759) Missing Python code example in ML Programming guide
Issue Type: Improvement > Components: Documentation >Affects Versions: 1.5.0 >Reporter: Raela Wang >Assignee: Lauren Moos >Priority: Minor > Labels: starter > > http://spark.apache.org/docs/latest/ml-guide.html#ex
[jira] [Commented] (SPARK-6306) Readme points to dead link
n >Reporter: Theodore Vasiloudis >Priority: Trivial > Fix For: 1.4.0 > > > The link to "Specifying the Hadoop Version" now points to > http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version. > T
Re: [PR] [SPARK-41794][SQL] Add `try_remainder` function and re-enable column tests [spark]
gengliangwang commented on PR #46434: URL: https://github.com/apache/spark/pull/46434#issuecomment-2100956904 @grundprinzip thanks for the work! Let's also mention the new function in https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html#useful-functions-for-ansi
Re: [PR] add possibility to set log filename & disable spark log rotation #47373 [spark]
Tocard commented on PR #47383: URL: https://github.com/apache/spark/pull/47383#issuecomment-2232827755 > Can we file a JIRA please? See also https://spark.apache.org/contributing.html As an improvement? I was thinking you would do it, my bad. Anyway, waiting for approval.
Re: [PR] Multiple spark application can be submitted within the current process [spark]
://spark.apache.org/contributing.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact
Re: [PR] [SPARK-48714][PYTHON] Implement `DataFrame.mergeInto` in PySpark [spark]
ation](https://spark.apache.org/docs/3.1.3/api/python/reference/index.html).
[PR] [SPARK-49276] Use API Group `spark.apache.org` [spark-kubernetes-operator]
Re: [PR] [SQL] Bind JDBC dialect to JDBCRDD at construction [spark]
HyukjinKwon commented on PR #45410: URL: https://github.com/apache/spark/pull/45410#issuecomment-2295156216 @johnnywalker Let's file a JIRA, see also https://spark.apache.org/contributing.html @urosstan-db I will leave this to you to approve or not. cc @cloud-fan too
Re: [PR] [SPARK-49324] Add state transition e2e test for happy path [spark-kubernetes-operator]
: -apiVersion: v1 +apiVersion: spark.apache.org/v1alpha1 Review Comment: Removed this from this PR
[GitHub] [spark] lyssg commented on pull request #35667: [K8S] Avoid possible errors due to incorrect file size or type supplied in hadoop conf
lyssg commented on pull request #35667: URL: https://github.com/apache/spark/pull/35667#issuecomment-1053761559 > @lyssg mind linking the JIRA into the PR title please? See also https://spark.apache.org/contributing.html Thanks, I will complete it.
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #35110: [SPARK-37820][SQL] Replace ApacheCommonBase64 with JavaBase64 for string funcs
dongjoon-hyun edited a comment on pull request #35110: URL: https://github.com/apache/spark/pull/35110#issuecomment-1006371310 Yes, Here it is, `Running benchmarks in your forked repository`. It's easy and nice tool which @HyukjinKwon contributed. - https://spark.apache.org/deve
[GitHub] [spark] yaooqinn commented on pull request #35110: [SPARK-37820][SQL] Replace ApacheCommonBase64 with JavaBase64 for string funcs
kjinKwon contributed. > > * https://spark.apache.org/developer-tools.html
[GitHub] [spark] bjornjorgensen commented on pull request #35488: [SPARK-38183][PYTHON] Show warning when creating pandas-on-Spark session under ANSI mode.
bjornjorgensen commented on pull request #35488: URL: https://github.com/apache/spark/pull/35488#issuecomment-1036045494 SQL ANSI mode 'spark.sql.ansi.enabled' is set to True. This is an experimental config. For more information spark.apache.org/docs/latest/sql-ref-ansi-compl
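The warning quoted above points at the `spark.sql.ansi.enabled` flag. As a minimal, hedged sketch (the property name comes from the snippet; defaults differ across Spark versions), ANSI mode can be switched on via `spark-defaults.conf`:

```properties
# spark-defaults.conf — turn on ANSI SQL mode (experimental in Spark 3.x)
spark.sql.ansi.enabled    true
```

The same property can also be toggled per session, e.g. `SET spark.sql.ansi.enabled=true` from SQL.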
[GitHub] [spark] ahmed-mahran commented on pull request #38966: [SPARK-41008][MLLIB] Dedup isotonic regression duplicate features
ahmed-mahran commented on PR #38966: URL: https://github.com/apache/spark/pull/38966#issuecomment-1342832587 I think we need to follow up with documentation updates https://spark.apache.org/docs/latest/mllib-isotonic-regression.html#:~:text=If%20the%20prediction,point%20are%20used
[GitHub] [spark] packyan commented on pull request #39021: [SPARK-41483] Last metrics system report should have a timeout, avoid to lead shutdown hook timeout
packyan commented on PR #39021: URL: https://github.com/apache/spark/pull/39021#issuecomment-1345823687 > @packyan mind creating a JIRA and linking it to PR title? See also https://spark.apache.org/contributing.html Sorry, I will do it later.
[GitHub] [spark] dongjoon-hyun commented on pull request #39371: [SPARK-41030][BUILD][3.2] Upgrade `Apache Ivy` to 2.5.1
dongjoon-hyun commented on PR #39371: URL: https://github.com/apache/spark/pull/39371#issuecomment-1371344946 Before `v3.2.4`, - `v3.3.2` will arrive on Feb/March timeframe - `v3.4.0` feature freeze will start on January 16th and RC will start on [February](https://spark.apache.org
[GitHub] [spark] martin-g commented on pull request #36358: [SPARK-39023] [K8s] Add Executor Pod inter-pod anti-affinity
martin-g commented on PR #36358: URL: https://github.com/apache/spark/pull/36358#issuecomment-1110582965 > BTW, I'd like to know about the one you mentioned by using the PodTemplate config. https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template
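The pod template mechanism referenced in the comment above is configured through two properties (a hedged sketch; the property names are from the running-on-kubernetes docs, and the template path is a placeholder):

```properties
# spark-defaults.conf — supply a custom pod template for driver and executor pods
# /path/to/pod-template.yaml is a placeholder; the template can carry
# scheduling hints such as inter-pod anti-affinity for executors
spark.kubernetes.driver.podTemplateFile      /path/to/pod-template.yaml
spark.kubernetes.executor.podTemplateFile    /path/to/pod-template.yaml
```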
[GitHub] [spark] tanvn commented on pull request #36618: [SPARK-39237][DOCS][3.2] Update the ANSI SQL mode documentation
tanvn commented on PR #36618: URL: https://github.com/apache/spark/pull/36618#issuecomment-1133825459 @gengliangwang Thank you for the update! May I know when will the change be reflected on https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html ? -- This is an
[GitHub] [spark] itholic commented on pull request #38446: [Spark-40974]When the value of quote or escape exists in the content of csv file, the character in the csv file will be misidentified
itholic commented on PR #38446: URL: https://github.com/apache/spark/pull/38446#issuecomment-1297284288 And also checking https://spark.apache.org/contributing.html would be helpful! :-)
[GitHub] [spark] amaliujia commented on pull request #38506: [SPARK-41010][CONNECT][PYTHON] Complete Support for Except and Intersect in Python client
amaliujia commented on PR #38506: URL: https://github.com/apache/spark/pull/38506#issuecomment-1302820909 Actually I will follow https://spark.apache.org/contributing.html to add a short description for test cases in this PR.
[GitHub] [spark] amaliujia commented on pull request #38488: [SPARK-41002][CONNECT][PYTHON] Compatible `take`, `head` and `first` API in Python client
amaliujia commented on PR #38488: URL: https://github.com/apache/spark/pull/38488#issuecomment-1302821036 Actually I will follow https://spark.apache.org/contributing.html to add a short description for test cases in this PR.
[GitHub] [spark] itholic commented on pull request #38702: SPARK-41187 [Core] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happen
itholic commented on PR #38702: URL: https://github.com/apache/spark/pull/38702#issuecomment-1319566712 Can we change the JIRA format in the title such as "[SPARK-41187][CORE] ...". Check the [Spark contribution guide](https://spark.apache.org/contributing.html) also wou
[GitHub] [spark] srowen commented on pull request #39566: Patched()Fix Protobuf Java vulnerable to Uncontrolled Resource Consumption
srowen commented on PR #39566: URL: https://github.com/apache/spark/pull/39566#issuecomment-1402320043 Hold up a sec. First please read https://spark.apache.org/contributing.html Where does this actually affect Spark? You have only updated a protobuf dependency in the Kinesis
[GitHub] [spark] zhengruifeng commented on pull request #40116: [WIP]SPARK-41391 Fix
zhengruifeng commented on PR #40116: URL: https://github.com/apache/spark/pull/40116#issuecomment-1441434811 I guess you may need to `Go to “Actions” tab on your forked repository and enable “Build and test” and “Report test results” workflows` https://spark.apache.org
[GitHub] [spark] jelmerk commented on pull request #40219: Disable substitution in values
jelmerk commented on PR #40219: URL: https://github.com/apache/spark/pull/40219#issuecomment-1448415053 > (Can you just file a JIRA and link it per https://spark.apache.org/contributing.html ?) Already did that a few mins ago and added it on top of the PR description
[GitHub] [spark] philwalk commented on pull request #38167: fix problems that affect windows shell environments (cygwin/msys2/mingw)
philwalk commented on PR #38167: URL: https://github.com/apache/spark/pull/38167#issuecomment-1273490984 > https://spark.apache.org/contributing.html? e.g., let's file a JIRA and link it to the PR title. I'm looking into it now on the JIRA website.
[GitHub] [spark] zhengruifeng commented on pull request #38213: fix runtime filter do not execute when no stats
zhengruifeng commented on PR #38213: URL: https://github.com/apache/spark/pull/38213#issuecomment-1275837267 A JIRA ticket is needed, you can refer to https://spark.apache.org/contributing.html If this is a bug fix, it's better to have a UT for the fix.
[GitHub] [spark] HyukjinKwon commented on pull request #38371: Fix some comments in DAGSchedulerSuite
https://spark.apache.org/contributing.html)
[GitHub] [spark] dongjoon-hyun commented on pull request #36011: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE Optimizer
dongjoon-hyun commented on PR #36011: URL: https://github.com/apache/spark/pull/36011#issuecomment-1291093774 https://spark.apache.org/versioning-policy.html ![Screenshot 2022-10-25 at 1 22 56 PM](https://user-images.githubusercontent.com/9700541/197874273-40dc594a-3288-4537-a491
[GitHub] [spark] gengliangwang commented on pull request #37226: [MINOR][SQL] Simplify the description of built-in function.
gengliangwang commented on PR #37226: URL: https://github.com/apache/spark/pull/37226#issuecomment-1194942336 Seems quite minor. The catalyst expressions are not public to users either (https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/index.html). I am +0 on this
[GitHub] [spark] zzzzming95 commented on pull request #37416: [SPARK-39743][DOCS] Updated some spark.io.compression configuration descriptions to clarify parameter applica…
ming95 commented on PR #37416: URL: https://github.com/apache/spark/pull/37416#issuecomment-1206001277 > @ming95 mind filing a JIRA please? See also https://spark.apache.org/contributing.html Sorry, forgot to fill in the JIRA in the description.
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
#' @param dataType a character object describing the target data type. #' See -#' \href{https://spark.apache.org/docs/latest/sparkr.html#data-type-mapping-between-r-and-spark}{ -#'Spark Data Types} for available data types. +#'
[GitHub] [spark] HyukjinKwon commented on pull request #33619: [MINOR][DOC] Remove obsolete `contributing-to-spark.md`
HyukjinKwon commented on pull request #33619: URL: https://github.com/apache/spark/pull/33619#issuecomment-892284220 I think actually this is to keep the legacy link not to break (https://spark.apache.org/docs/latest/contributing-to-spark.html). But it's 8 years ago .. I guess that
[GitHub] [spark] viirya commented on pull request #33763: [SPARK-36533][SS] Trigger.AvailableNow for running streaming queries like Trigger.Once in multiple batches
viirya commented on pull request #33763: URL: https://github.com/apache/spark/pull/33763#issuecomment-905056479 The structured streaming programming guide: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers The source is at docs/structured
[GitHub] [spark] HyukjinKwon commented on pull request #34595: Result vector from pandas_udf was not the required length
[`DataFrame.mapInPandas`](https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.DataFrame.mapInPandas.html).
[GitHub] [spark] AngersZhuuuu edited a comment on pull request #34715: [WIP][SPARK-37445][BUILD] Rename the maven hadoop profile to hadoop-3 and hadoop-2
AngersZh edited a comment on pull request #34715: URL: https://github.com/apache/spark/pull/34715#issuecomment-979677759 Should we need to clarify this change in https://spark.apache.org/docs/latest/api/python/getting_started/install.html#manually-downloading ? cc @HyukjinKwon
[GitHub] [spark] maropu commented on pull request #32040: Add @ExpressionDescription to TimeWindow to generate missing documentation
maropu commented on pull request #32040: URL: https://github.com/apache/spark/pull/32040#issuecomment-812766403 Probably, we can add it somewhere in the `SQL, DataFrames, and Datasets` doc: https://spark.apache.org/docs/3.1.1/sql-getting-started.html just like the doc of structured
[GitHub] [spark] HyukjinKwon commented on pull request #33839: [SPARK-36291][SQL] Refactor second set of 20 in QueryExecutionErrors to use error classes
HyukjinKwon commented on pull request #33839: URL: https://github.com/apache/spark/pull/33839#issuecomment-927221521 @dgd-contributor, please contact me or priv...@spark.apache.org. As I shared in the email, the submissions from the specific shared account will not be accepted for now
[GitHub] SandishKumarHN commented on issue #23401: [SPARK-26513][Core] : Trigger GC on executor node idle
SandishKumarHN commented on issue #23401: [SPARK-26513][Core] : Trigger GC on executor node idle URL: https://github.com/apache/spark/pull/23401#issuecomment-450690065 @srowen I did send out a proposal email to d...@spark.apache.org and u...@spark.apache.org and created JIRA and does
[GitHub] [spark] zhengruifeng commented on pull request #41444: [SPARK-43916][SQL][PYTHON][CONNECT] Add percentile like functions to Scala and Python API
zhengruifeng commented on PR #41444: URL: https://github.com/apache/spark/pull/41444#issuecomment-1575995156 Where are `percentile_cont` and `percentile_disc` from? I can not find them in https://spark.apache.org/docs/latest/api/sql/index.html and `FunctionRegistry`
[GitHub] [spark] dongjoon-hyun commented on pull request #41469: [SPARK-43974][CONNECT][BUILD] Upgrade buf to v1.21.0
dongjoon-hyun commented on PR #41469: URL: https://github.com/apache/spark/pull/41469#issuecomment-1579770666 Thank you. Yes, I agree with you. Since the feature freeze is July 16th, maybe after July 10th? - https://spark.apache.org/versioning-policy.html
[GitHub] [spark] zhengruifeng commented on pull request #41469: [SPARK-43974][CONNECT][BUILD] Upgrade buf to v1.21.0
zhengruifeng commented on PR #41469: URL: https://github.com/apache/spark/pull/41469#issuecomment-1579788969 > Thank you. Yes, I agree with you. Since the feature freeze is July 16th, maybe after July 10th? > > * https://spark.apache.org/versioning-policy.html
[GitHub] [spark] dongjoon-hyun commented on pull request #41238: [SPARK-43594][SQL] Add LocalDateTime to anyToMicros
dongjoon-hyun commented on PR #41238: URL: https://github.com/apache/spark/pull/41238#issuecomment-1585492657 Just FYI, @Fokko . Apache Spark 3.5.0 Feature Freeze is next month (July 16th) and you will have Apache Spark 3.5.0 on August. - https://spark.apache.org/versioning-policy.html
[GitHub] [spark] dongjoon-hyun commented on pull request #41374: [SPARK-43876][SQL] Enable fast hashmap for distinct queries
dongjoon-hyun commented on PR #41374: URL: https://github.com/apache/spark/pull/41374#issuecomment-1599198896 Merged to master for Apache Spark 3.5.0. Thank you, @wankunde . - https://spark.apache.org/versioning-policy.html (Apache Spark 3.5.0 on August 2023)
Re: [PR] [SPARK-46894][PYTHON] Move PySpark error conditions into standalone JSON file [spark]
HyukjinKwon commented on PR #44920: URL: https://github.com/apache/spark/pull/44920#issuecomment-2024304980 Just to make sure, does it work when you install PySpark as a ZIP file? e.g., downloading it from https://spark.apache.org/downloads.html would install PySpark as a ZIP file
[GitHub] [spark] HyukjinKwon commented on pull request #42350: [SPARK-44662] Perf improvement in BroadcastHashJoin queries with stream side join key on non partition columns
HyukjinKwon commented on PR #42350: URL: https://github.com/apache/spark/pull/42350#issuecomment-1667012513 Please follow https://spark.apache.org/improvement-proposals.html. We should start the discussion and vote to pass, with one of the PMC members shepherding it.
[GitHub] [spark] zhengruifeng commented on pull request #42469: [SPARK-44782][INFRA] Adjust PR template to Generative Tooling Guidance recommendations
zhengruifeng commented on PR #42469: URL: https://github.com/apache/spark/pull/42469#issuecomment-1676521158 I guess this should be documented in https://spark.apache.org/developer-tools.html instead of the PR template? cc @HyukjinKwon @gatorsmile @srowen
[GitHub] [spark] HyukjinKwon commented on pull request #38624: [SPARK-40559][PYTHON] Add applyInArrow to groupBy and cogroup
HyukjinKwon commented on PR #38624: URL: https://github.com/apache/spark/pull/38624#issuecomment-1687231741 > Where is this dev mailing list and how do I raise a discussion there? See https://spark.apache.org/news/spark-mailing-lists-moving-to-apache.html
Re: [PR] [MINOR][DOCS] Clarify sort behaviour for structs [spark]
landlord-matt commented on PR #43871: URL: https://github.com/apache/spark/pull/43871#issuecomment-1820696947 Another alternative would be, instead of writing it for this function, we could add something on this documentation page. I don't think it is open source though https://spark.apach
[GitHub] [spark] srowen commented on pull request #41033: Update bufbuild plugin references
srowen commented on PR #41033: URL: https://github.com/apache/spark/pull/41033#issuecomment-1534935590 Looks good, if you can resolve the conflicts and tests pass. File a little JIRA and update the title if you please too https://spark.apache.org/contributing.html
[GitHub] [spark] HyukjinKwon commented on pull request #41084: [PYTHON] Remove deprecated use of typing.io
://spark.apache.org/contributing.html
Re: [PR] [SPARK-43919][SQL] Extract JSON functionality out of Row [spark]
tfinn-ias commented on PR #41425: URL: https://github.com/apache/spark/pull/41425#issuecomment-1893858562 Hi, your current public documentation for the Java API lists these methods as available with no disclaimers about them being private or unstable: https://spark.apache.org/docs
Re: [PR] [MINOR][PYTHON] refactor PythonWrite to prepare for supporting python data source streaming write [spark]
xinrong-meng commented on PR #45049: URL: https://github.com/apache/spark/pull/45049#issuecomment-1930884879 Would you create a Spark JIRA https://issues.apache.org/jira/browse/SPARK and add it to the PR title? Please refer to https://spark.apache.org/contributing.html for details. Thanks
[GitHub] [spark] Yaohua628 commented on pull request #36069: [SPARK-38767][SQL] Support `ignoreCorruptFiles` and `ignoreMissingFiles` in Data Source options
Yaohua628 commented on PR #36069: URL: https://github.com/apache/spark/pull/36069#issuecomment-1190879022 @cloud-fan @dongjoon-hyun Hi folks, do you know when the doc changes will be deployed? https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html? Thanks!
[GitHub] [spark] zhengruifeng commented on pull request #42793: [SPARK-45065][PYTHON][PS] Support Pandas 2.1.0
zhengruifeng commented on PR #42793: URL: https://github.com/apache/spark/pull/42793#issuecomment-1719206074 not related to this PR itself, what is the policy to upgrade the minimum version of dependencies listed [here](https://spark.apache.org/docs/latest/api/python/getting_started
[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1728702545 Please send the mail to priv...@spark.apache.org and look for action from PMC members.
[GitHub] [spark] dongjoon-hyun edited a comment on issue #24304: [MINOR][DOC] Fix html tag broken in configuration.md
dongjoon-hyun edited a comment on issue #24304: [MINOR][DOC] Fix html tag broken in configuration.md URL: https://github.com/apache/spark/pull/24304#issuecomment-480325000 We need to fix the followings, too. - https://spark.apache.org/docs/2.3.3/configuration.html#available-properties
[GitHub] [spark] dongjoon-hyun commented on issue #27987: [SPARK-31165] Correcting wrong paths in Dockerfile
dongjoon-hyun commented on issue #27987: [SPARK-31165] Correcting wrong paths in Dockerfile URL: https://github.com/apache/spark/pull/27987#issuecomment-603929739 BTW, https://spark.apache.org/docs/latest/running-on-kubernetes.html might be not a good place to have that. Please consider
RE: If not stop StreamingContext gracefully, will checkpoint data be consistent?
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org. For additional commands, e-mail: user-h...@spark.apache.org.
Re: Optimization module in Python mllib
Re: PySpark/SQL Octet Length
park 1.5.2/Python 2.7. > > Thanks
Re: Does parallelize and collect preserve the original order of list?
Re: Is it a bug?
Re: Filter out the elements from xml file in Spark
Re: Kafka connection logs in Spark
Re: Serializing DataSets
Re: Spark SQL 1.3 not finding attribute in DF
Is there more information about the Spark shuffle service?
There is a saying "If the service is enabled, Spark executors will fetch shuffle files from the service instead of from each other." in the wiki https://spark.apache.org/docs/1.3.0/job-scheduling.html#graceful-decommission-of-executors
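The external shuffle service quoted above is enabled through configuration; a minimal, hedged sketch (property names as in the job-scheduling docs; graceful decommissioning of executors additionally relies on dynamic allocation):

```properties
# spark-defaults.conf — executors fetch shuffle files from the external service
spark.shuffle.service.enabled        true
# dynamic allocation depends on the shuffle service so executors can be
# removed without losing their shuffle output
spark.dynamicAllocation.enabled      true
```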
Re: How to update python code in memory
Re: Kafka message metadata with Dstreams
http://spark.apache.org/docs/latest/api/java/index.html messageHandler receives a kafka MessageAndMetadata object. Alternatively, if you just need metadata information on a per-partition basis, you can use HasOffsetRanges http://spark.apache.org/docs/latest/streaming-kafka-integration.html
Re: Spark 1.1 (slow, working), Spark 1.2 (fast, freezing)
Re: Is Ubuntu server or desktop better for spark cluster
Re: HDP 2.2 AM abort : Unable to find ExecutorLauncher class
tion" message.) > > Mike Stone
Re: LDA code little error @Xiangrui Meng
Re: [PySpark][Python 2.7.8][Spark 1.0.2] count() with TypeError: an integer is required
Re: New features (Discretization) for v1.x in xiangrui.pdf
Re: Running spark-shell (or queries) over the network (not from master)
Re: groupBy gives non deterministic results
Hi, Xianjin I checked user@spark.apache.org, and found my post there: http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/browser I am using nabble to send this mail, which indicates that the mail will be sent from my email address to the u...@spark.incubator.apache.org mailing list
Re: Question About Submit Application
-- Marcelo
Re: Spark SQL DDL, DML commands
RE: spark sql: timestamp in json - fails
Re: spark-repl_1.2.0 was not uploaded to central maven repository.
textFile() ordering and header rows
Since RDDs are generally unordered, aren't things like textFile().first() not guaranteed to return the first row (such as looking for a header row)? If so, doesn't that make the example in http://spark.apache.org/docs/1.2.1/quick-start.html#basics
[GitHub] spark issue #21419: Branch 2.2
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21419 @gentlewangyu, please close this and read https://spark.apache.org/contributing.html. Questions should go to mailing list and issues should be filed in JIRA
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user foxish commented on the issue: https://github.com/apache/spark/pull/20669 There's a section explaining it at the bottom of https://spark.apache.org/committers.html
[GitHub] spark issue #20370: Changing JDBC relation to better process quotes
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20370 @conorbmurphy Could you create a JIRA and follow [the instruction](https://spark.apache.org/contributing.html) to make a contribution
[GitHub] spark issue #20790: AccumulatorV2 subclass isZero scaladoc fix
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20790 Wait .. I just found you opened a JIRA - SPARK-23642. Please link it by `[SPARK-23642][DOCS] ...`. see https://spark.apache.org/contributing.html
[GitHub] spark issue #21988: [SPARK-25003][PYSPARK][BRANCH-2.2] Use SessionExtensions...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/21988 we always open against master and backport if agreed upon. this is documented here https://spark.apache.org/contributing.html
[GitHub] spark issue #21884: k8s: explicitly expose ports on driver container
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/21884 can you update the format of the title and description as described here "Pull Request" in https://spark.apache.org/contrib
[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18833 Can we document this difference in https://spark.apache.org/docs/latest/sql-programming-guide.html#compatibility-with-apache-hive
[GitHub] spark issue #19263: Optionally add block updates to log
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19263 @michaelmior would you please follow the instruction (https://spark.apache.org/contributing.html) to update PR title and create a corresponding JIRA, thanks
Python Kafka support?
Hi, I read on this page http://spark.apache.org/docs/latest/streaming-kafka-integration.html about python support for "receiverless" kafka integration (Approach 2) but it says its incomplete as of version 1.4. Has this been updated in version 1.5.
Re: subscribe
https://www.youtube.com/watch?v=umDr0mPuyQc On Sat, Aug 22, 2015 at 8:01 AM, Ted Yu wrote: > See http://spark.apache.org/community.html > > Cheers > > On Sat, Aug 22, 2015 at 2:51 AM, Lars Hermes <li...@hermes-it-consulting.de>
Re: unsubscribe
Hi, Sonu. You can send email to user-unsubscr...@spark.apache.org with subject "(send this email to unsubscribe)" to unsubscribe from this mailing list[1]. Regards. [1] https://spark.apache.org/community.html 2019-05-27 2:01 GMT+07.00, Sonu Jyotshna : > > -- -- Salam H
[53/56] spark-website git commit: Rebuild for 2.2.0
Sitemap entries added for https://spark.apache.org/releases/spark-release-2-2-0.html and https://spark.apache.org/news/spark-2-2-0-released.html (weekly), alongside existing entries such as https://spark.apache.org/releases/spark-release-2-1-1.html and https://spark.apache.org/graphx/ (weekly).
Re: spark on kubernetes
kers UI is not possible as then I have to expose them too individually, and given I can have multiple applications it becomes hard to manage.