Mailing lists matching spark.apache.org

commits@spark.apache.org
dev@spark.apache.org
issues@spark.apache.org
reviews@spark.apache.org
user@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #40907: [PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns

2023-04-23 Thread via GitHub
HyukjinKwon commented on PR #40907: URL: https://github.com/apache/spark/pull/40907#issuecomment-1519352303 Please file a JIRA in ASF JIRA (at here https://issues.apache.org/jira/projects/SPARK/issues). See also https://spark.apache.org/contributing.html -- This is an automated message
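The change proposed in this PR can be sketched without Spark at all: Python's `__dir__` hook only needs to merge the column names into the default attribute listing so that `dir(df)` (and hence tab-completion) surfaces them. A minimal illustration, where `ToyDataFrame` is a made-up stand-in rather than the real `pyspark.sql.dataframe.DataFrame`:

```python
class ToyDataFrame:
    def __init__(self, columns):
        self.columns = list(columns)

    def __dir__(self):
        # Merge the default attribute listing with the column names so that
        # dir(df) -- and therefore interactive tab-completion -- lists them.
        return sorted(set(super().__dir__()) | set(self.columns))

df = ToyDataFrame(["age", "name"])
assert {"age", "name"} <= set(dir(df))
```

Returning a sorted, de-duplicated list matches what `dir()` itself does with the result, so completion tools see one stable listing.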

[GitHub] [spark] srowen commented on pull request #41199: Spark-43536 Fixing statsd sink reporter

2023-05-17 Thread via GitHub
srowen commented on PR #41199: URL: https://github.com/apache/spark/pull/41199#issuecomment-1551416331 See https://spark.apache.org/contributing.html and please fix up this PR. Needs more explanation too

Re: [PR] Support message format in connect [spark]

2024-01-12 Thread via GitHub
HyukjinKwon commented on PR #44714: URL: https://github.com/apache/spark/pull/44714#issuecomment-1890180706 Let's file a JIRA as well, see https://spark.apache.org/contributing.html. For review, I will defer to @heyihong who's the original author of this code path. --

Re: [PR] Change the signature of the hllInvalidLgK query execution error to take an integer as 4th argument [spark]

2024-02-02 Thread via GitHub
gengliangwang commented on PR #44995: URL: https://github.com/apache/spark/pull/44995#issuecomment-1924562289 @mkaravel Please create a JIRA for this as per https://spark.apache.org/contributing.html

Re: [PR] [SPARK-46964] Change the signature of the hllInvalidLgK query execution error to take an integer as 4th argument [spark]

2024-02-02 Thread via GitHub
mkaravel commented on PR #44995: URL: https://github.com/apache/spark/pull/44995#issuecomment-1924574757 > @mkaravel Please create a JIRA for this as per https://spark.apache.org/contributing.html Done.

[GitHub] [spark] dongjoon-hyun commented on pull request #36844: Update ExecutorClassLoader.scala

2022-06-11 Thread GitBox
dongjoon-hyun commented on PR #36844: URL: https://github.com/apache/spark/pull/36844#issuecomment-1153007121 On top of @wangyum 's comment, please file an Apache Spark JIRA issue . You can see more contributor's guide here. - https://spark.apache.org/contributing.html --

[GitHub] [spark] srowen commented on pull request #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-22 Thread GitBox
srowen commented on PR #36784: URL: https://github.com/apache/spark/pull/36784#issuecomment-1163429380 Sorry, see "Testing with GitHub actions workflow" under https://spark.apache.org/developer-tools.html

[GitHub] [spark] HyukjinKwon commented on pull request #37128: What do fit in BucketedRandomProjectionLSH in spark?

2022-07-08 Thread GitBox
HyukjinKwon commented on PR #37128: URL: https://github.com/apache/spark/pull/37128#issuecomment-1178673181 @MammadTavakoli Let's either file a JIRA in https://issues.apache.org/jira/projects/SPARK/issues or ask u...@spark.apache.org

[GitHub] [spark] c21 commented on pull request #37189: [SPARK-39777][DOCS] Remove Hive bucketing incompatiblity documentation

2022-07-14 Thread GitBox
c21 commented on PR #37189: URL: https://github.com/apache/spark/pull/37189#issuecomment-1184099758 The removed documentation is on https://spark.apache.org/docs/latest/sql-migration-guide.html: https://user-images.githubusercontent.com/4629931/178927331-80befc58-a40c-4241-bbe2

Re: [PR] Typo fixed yyy to yyyy [spark]

2023-10-18 Thread via GitHub
HyukjinKwon commented on PR #43442: URL: https://github.com/apache/spark/pull/43442#issuecomment-1769879428 Mind taking a look at https://github.com/apache/spark/pull/43442/checks?check_run_id=17836826969? Let's also file a JIRA, see also https://spark.apache.org/contributing

Re: ANOVA test in Spark

2016-05-13 Thread mylisttech
- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Spark Website

2016-07-13 Thread Pradeep Gollakota
Worked for me if I go to https://spark.apache.org/site/ but not https://spark.apache.org On Wed, Jul 13, 2016 at 11:48 AM, Maurin Lenglart wrote: > Same here > > > > *From: *Benjamin Kim > *Date: *Wednesday, July 13, 2016 at 11:47 AM > *To: *manish ranjan > *Cc: *user

DenseMatrix update

2016-02-05 Thread Zapper22
There was Update method in Spark 1.3.1 https://spark.apache.org/docs/1.3.1/api/java/org/apache/spark/mllib/linalg/DenseMatrix.html But in Spark 1.6.0, there is no Update method https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/mllib/linalg/DenseMatrix.html My idea is to store large
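For readers wondering what the removed method did: mllib's `DenseMatrix` stores its entries in a single column-major array, so `update(i, j, v)` amounted to writing one slot. A plain-Python stand-in (the `Dense` class here is hypothetical, not the Spark API) sketches that behavior, which can also serve as a workaround by rebuilding a matrix from a mutated values array:

```python
class Dense:
    """Toy stand-in for mllib.linalg.DenseMatrix (column-major storage)."""

    def __init__(self, num_rows, num_cols, values):
        assert len(values) == num_rows * num_cols
        self.num_rows, self.num_cols = num_rows, num_cols
        self.values = list(values)  # column-major, like mllib's layout

    def update(self, i, j, v):
        # What DenseMatrix.update(i, j, v) did in 1.3.x: one slot write.
        self.values[i + j * self.num_rows] = v

    def get(self, i, j):
        return self.values[i + j * self.num_rows]

m = Dense(2, 2, [1.0, 2.0, 3.0, 4.0])  # columns (1, 2) and (3, 4)
m.update(0, 1, 9.0)                    # overwrite the top-right entry
assert m.get(0, 1) == 9.0 and m.get(1, 1) == 4.0
```

In 1.6.0 the same effect can be had by copying `matrix.values`, editing the copy, and constructing a new `DenseMatrix` from it.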

Re: execute native system commands in Spark

2015-11-02 Thread Deenar Toraskar
PM, "patcharee" wrote: > > >Hi, > > > >Is it possible to execute native system commands (in parallel) Spark, > >like scala.sys.process ? > > > >Best, > >Patcharee > > > >---

Re: Distributing Python code packaged as tar balls

2015-11-13 Thread Davies Liu
s seem to be supported, I > have tried distributing tar balls unsuccessfully. > > Is it worth adding support for tar balls? > > Best regards, > Praveen Chundi

Re: Release data for spark 1.6?

2015-12-09 Thread Ted Yu

Re: use GraphX with Spark Streaming

2015-08-25 Thread ponkin
Hi, Sure you can. StreamingContext has property /def sparkContext: SparkContext/(see docs <http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.StreamingContext> ). Think about DStream - main abstraction in Spark Streaming, as a sequence of RDD. Each DStre

Re: reading multiple parquet file using spark sql

2015-09-01 Thread Cheng Lian

K Means Explanation

2015-09-23 Thread Tapan Sharma
tor center : model.clusterCenters()) { System.out.println(" " + center); } https://spark.apache.org/docs/1.3.0/mllib-clustering.html#k-means <https://spark.apache.org/docs/1.3.0/mllib-clustering.html#k-means> *How can I know the points contained in the particular cluster?*
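To answer the question in the thread under the usual assumption: `KMeansModel.predict` assigns each point to its nearest cluster center, so cluster membership is recovered by mapping `predict` over the data and grouping by the returned index. A plain-Python sketch of that assignment rule (no Spark required; names are illustrative):

```python
import math

def assign(points, centers):
    """Index of the nearest center for each point -- the same rule
    KMeansModel.predict applies when mapped over an RDD of points."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points]

centers = [(0.0, 0.0), (10.0, 10.0)]
points = [(0.1, 0.2), (9.5, 10.5), (0.3, 0.1)]
assert assign(points, centers) == [0, 1, 0]
```

In Spark itself the equivalent would be roughly `data.map(p => (model.predict(p), p)).groupByKey()`, which yields the points contained in each cluster.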

Re: reducing number of output files

2015-01-22 Thread DEVAN M.S.
at 10:46 PM, Kane Kim wrote: > > How I can reduce number of output files? Is there a parameter to > saveAsTextFile? > > > > Thanks.

[ANNOUNCE] Announcing Spark 1.3!

2015-03-13 Thread Patrick Wendell
atures, or download [2] the release today. For errata in the contributions or release notes, please e-mail me *directly* (not on-list). Thanks to everyone who helped work on this release! [1] http://spark.apache.org/releases/spark-release-1-3-0.html [2] http://spark.apache.org/down

Re: Spark Job History Server

2015-03-18 Thread Marcelo Vanzin
> But got Exception in thread "main" java.lang.ClassNotFoundException: > org.apache.spark.deploy.yarn.history.YarnHistoryProvider > > What class is really needed? How to fix it? > > Br, > Patcharee > > - >

Re: Spark Performance on Yarn

2015-04-22 Thread Ted Yu

Re: StackOverflow Error when run ALS with 100 iterations

2015-04-22 Thread Xiangrui Meng

Re: Problem reading from S3 in standalone application

2014-08-06 Thread Evan Sparks

Re: PySpark + executor lost

2014-08-07 Thread Davies Liu

Re: Spark SQL and running parquet tables?

2014-09-11 Thread Yin Huai
It is in SQLContext ( http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.SQLContext ). On Thu, Sep 11, 2014 at 3:21 PM, DanteSama wrote: > Michael Armbrust wrote > > You'll need to run parquetFile("path").registerTempTable("name") to

Re: Does Spark always wait for stragglers to finish running?

2014-09-15 Thread Du Li
There is a parameter spark.speculation that is turned off by default. Look at the configuration doc: http://spark.apache.org/docs/latest/configuration.html From: Pramod Biligiri mailto:pramodbilig...@gmail.com>> Date: Monday, September 15, 2014 at 3:30 PM To: "user@spark
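The knobs mentioned above live in the configuration file. A hedged illustration of a `spark-defaults.conf` fragment; the values shown are the documented defaults for the related settings, but check the configuration page for your Spark version:

```properties
# spark-defaults.conf -- speculative execution is off unless enabled
spark.speculation             true
spark.speculation.interval    100ms
spark.speculation.multiplier  1.5
spark.speculation.quantile    0.75
```

With these settings, tasks running much slower than the median of their stage are re-launched speculatively on other executors.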

Re: Avoid broacasting huge variables

2014-09-20 Thread Martin Goodson

Re: partitions number with variable number of cores

2014-10-03 Thread Gen
Maybe I am wrong, but how many resource that a spark application can use depends on the mode of deployment(the type of resource manager), you can take a look at https://spark.apache.org/docs/latest/job-scheduling.html <https://spark.apache.org/docs/latest/job-scheduling.html> . For you

Re: MLlib linking error Mac OS X

2014-10-17 Thread Xiangrui Meng

Re: using LogisticRegressionWithSGD.train in Python crashes with "Broken pipe"

2014-11-05 Thread Xiangrui Meng

[GitHub] [spark-website] panbingkun opened a new pull request, #474: [SPARK-44820][DOCS] Switch languages consistently across docs for all code snippets

2023-08-23 Thread via GitHub
://spark.apache.org/docs/2.0.0/structured-streaming-programming-guide.html But it was broken for later docs, for example the Spark 3.4.1 doc: https://spark.apache.org/docs/latest/quick-start.html We should fix this behavior change and possibly add test cases to prevent future

Re: Get size of rdd in memory

2015-02-02 Thread Cheng Lian
It's already fixed in the master branch. Sorry that we forgot to update this before releasing 1.2.0 and caused you trouble... Cheng On 2/2/15 2:03 PM, ankits wrote: Great, thank you very much. I was confused because this is in the docs: https://spark.apache.org/docs/1.2.0/sql-progra

Re: May we merge into branch-1.3 at this point?

2015-03-13 Thread Sean Owen
holas Chammas wrote: > Looks like the release is out: > http://spark.apache.org/releases/spark-release-1-3-0.html > > Though, interestingly, I think we are missing the appropriate v1.3.0 tag: > https://github.com/apache/spark/releases > > Nick > > On Fri, Mar 13, 2015 at 6:

[jira] [Commented] (SPARK-21593) Fix broken configuration page

2017-08-01 Thread Sean Owen (JIRA)
Spark 2.2.0 has broken menu list and named > anchors. > Compare [2.1.1 docs |https://spark.apache.org/docs/2.1.1/configuration.html] > with [Latest docs |https://spark.apache.org/docs/latest/configuration.html] > Or try this link [Configuration # Dynamic > Allocation|https://sp

[jira] [Updated] (SPARK-18279) ML programming guide should have R examples

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-18279: - Description: http://spark.apache.org/docs/latest/ml-classification-regression.html for example

[jira] [Updated] (SPARK-37335) Clarify output of FPGrowth

2021-11-15 Thread Nicholas Chammas (Jira)
documented, like {{{}lift{}}}: [https://spark.apache.org/docs/latest/ml-frequent-pattern-mining.html] We should offer a basic description of these columns. An _itemset_ should also be briefly defined. was: The association rules returned by FPGrow include more columns than are documented

[jira] [Assigned] (SPARK-25082) Documentation for Spark Function expm1 is incomplete

2018-08-15 Thread Apache Spark (JIRA)
Affects Versions: 2.0.0, 2.3.1 >Reporter: Alexander Belov >Priority: Trivial > Labels: documentation, easyfix > > The documentation for the function expm1 that takes in a string > public static  > [Column|https://spark.apache.org/docs/2.3.1/api

[jira] [Assigned] (SPARK-25082) Documentation for Spark Function expm1 is incomplete

2018-08-15 Thread Hyukjin Kwon (JIRA)
ects Versions: 2.0.0, 2.3.1 >Reporter: Alexander Belov >Assignee: Bo Meng >Priority: Trivial > Labels: documentation, easyfix > > The documentation for the function expm1 that takes in a string > public static  > [Column|https://spark.

[jira] [Commented] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-02-01 Thread Hyukjin Kwon (JIRA)
ion > Components: Documentation >Affects Versions: 2.4.0 >Reporter: Emmanuel Arias >Priority: Minor > > Hello! > I am new using Spark. Reading the documentation I think that is a little > confusing on Downloading section. > [tt

[jira] [Updated] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Sean Owen (JIRA)
Components: Documentation >Affects Versions: 2.4.0 >Reporter: Emmanuel Arias >Priority: Trivial > > Hello! > I am new using Spark. Reading the documentation I think that is a little > confusing on Downloading section. > [ttps://spark.apache.org/d

[jira] [Resolved] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Hyukjin Kwon (JIRA)
new using Spark. Reading the documentation I think that is a little > confusing on Downloading section. > [ttps://spark.apache.org/docs/latest/#downloading|https://spark.apache.org/docs/latest/#downloading] > write: "Scala and Java users can include Spark in their projects using

[jira] [Resolved] (SPARK-19445) Please remove tylerchap...@yahoo-inc.com subscription from u...@spark.apache.org

2017-02-03 Thread Sean Owen (JIRA)
> u...@spark.apache.org > > > Key: SPARK-19445 > URL: https://issues.apache.org/jira/browse/SPARK-19445 > Project: Spark > Issue Type: IT Help >

[jira] [Updated] (SPARK-25795) Fix CSV SparkR SQL Example

2018-10-21 Thread Dongjoon Hyun (JIRA)
v > {code} > > - > https://dist.apache.org/repos/dist/dev/spark/v2.4.0-rc3-docs/_site/sql-programming-guide.html#manually-specifying-options > - > http://spark.apache.org/docs/2.3.2/sql-programming-guide.html#manually-specifying-options > - > http://spark.apache.org/docs/2.3.1/sql

[jira] [Updated] (SPARK-25795) Fix CVS SparkR SQL Example

2018-10-21 Thread Dongjoon Hyun (JIRA)
} > > - > https://dist.apache.org/repos/dist/dev/spark/v2.4.0-rc3-docs/_site/sql-programming-guide.html#manually-specifying-options > - > http://spark.apache.org/docs/2.3.2/sql-programming-guide.html#manually-specifying-options > - > http://spark.apache.org/docs/2.3.1/sql-pro

[jira] [Assigned] (SPARK-25795) Fix CSV SparkR SQL Example

2018-10-21 Thread Dongjoon Hyun (JIRA)
ode} > > - > https://dist.apache.org/repos/dist/dev/spark/v2.4.0-rc3-docs/_site/sql-programming-guide.html#manually-specifying-options > - > http://spark.apache.org/docs/2.3.2/sql-programming-guide.html#manually-specifying-options > - > http://spark.apache.org/docs/2.3.1/sql-

[jira] [Updated] (SPARK-13322) AFTSurvivalRegression should support feature standardization

2016-02-16 Thread Yanbo Liang (JIRA)
@spark.apache.org/msg45643.html The lossSum has possibility of infinity because we do not standardize the feature before fitting model, we should support feature standardization. was: This bug is reported by Stuti Awasthi. https://www.mail-archive.com/user@spark.apache.org/msg45643.html The lossSum has

[jira] [Updated] (SPARK-14683) Configure external links in ScalaDoc

2016-04-16 Thread Yang Bo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Bo updated SPARK-14683: Description: Right now [Spark's Scaladoc|https://spark.apache.org/docs/latest/api/scala/] does not

[jira] [Commented] (SPARK-43322) Spark SQL docs for explode_outer and posexplode_outer omit behavior for null/empty

2023-05-19 Thread Sean R. Owen (Jira)
Project: Spark > Issue Type: Documentation > Components: SQL >Affects Versions: 3.4.0 >Reporter: Robert Juchnicki >Priority: Minor > > The Spark SQL documentation for > [explode_outer|https://spark.apache.org/doc

[jira] [Commented] (SPARK-40103) Support read/write.csv() in SparkR

2022-08-16 Thread Hyukjin Kwon (Jira)
df.read() to read the csv file. We need a more > high-level api for it. > Java: > [DataFrameReader.csv()|https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/DataFrameReader.html] > Scala: > [DataFrameReader.csv()|https://spark.apache.org/docs/latest/api/sca

[jira] [Comment Edited] (SPARK-40103) Support read/write.csv() in SparkR

2022-08-17 Thread deshanxiao (Jira)
port the DataFrameReader.csv API, only R is > missing. we need to use df.read() to read the csv file. We need a more > high-level api for it. > Java: > [DataFrameReader.csv()|https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/DataFrameReader.html] > Scala: > [DataFra

[jira] [Updated] (SPARK-18705) Docs for one-pass solver for linear regression with L1 and elastic-net penalties

2016-12-04 Thread Yanbo Liang (JIRA)
}}|http://spark.apache.org/docs/latest/ml-advanced.html#normal-equation-solver-for-weighted-least-squares] session. (was: Add document for one-pass solver for linear regression with L1 and elastic-net penalties at [{{Normal equation solver for weighted least squares}}|http://spark.apache.org

Re: [GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-16 Thread Sean Owen
user nchammas commented on the pull request: > > https://github.com/apache/spark/pull/2014#issuecomment-55770066 > > FYI: This page is 404-ing: > http://spark.apache.org/docs/latest/building-spark.html > > Is that temporary? > > > --- > If your project is set u

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-20 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13734 For other important issue about `see also`, all the previous doc look like that. http://spark.apache.org/docs/1.6.0/api/R/approxCountDistinct.html http://spark.apache.org/docs

[GitHub] [spark] MaxGekk commented on a diff in pull request #39281: [SPARK-41576][SQL] Assign name to _LEGACY_ERROR_TEMP_2051

2023-01-01 Thread GitBox
_FOUND" : { +"message" : [ + "Failed to find data source: . Please find packages at `https://spark.apache.org/third-party-projects.html`"; Review Comment: nit: ```suggestion "Failed to find the data source: . Please find packages

[GitHub] [spark] itholic opened a new pull request, #39852: [SPARK-42281][SQL] Update Debugging PySpark documents to show error message properly

2023-02-01 Thread via GitHub
itholic opened a new pull request, #39852: URL: https://github.com/apache/spark/pull/39852 ### What changes were proposed in this pull request? This PR proposes to update examples in [Debugging PySpark](https://spark.apache.org/docs/latest/api/python/development/debugging.html

[GitHub] [spark] derhagen opened a new pull request, #38389: Sphinx stubs

2022-10-25 Thread GitBox
derhagen opened a new pull request, #38389: URL: https://github.com/apache/spark/pull/38389 ### What changes were proposed in this pull request? The documentation under https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/data_types.html chops off the stubs on periods

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32285: [SPARK-35180][BUILD] Allow to build SparkR with SBT

2021-04-21 Thread GitBox
://spark.apache.org/docs/latest/building-spark.html#buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run +Build Spark with [Maven](https://spark.apache.org/docs/latest/building-spark.html#buildmvn) or [SBT](http://spark.apache.org

[GitHub] [spark] allisonwang-db opened a new pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-10-29 Thread GitBox
allisonwang-db opened a new pull request #34443: URL: https://github.com/apache/spark/pull/34443 ### What changes were proposed in this pull request? This PR improves error messages for SQL functions and operators when ANSI mode is enabled. See [SQL Functions](https://spark.apache.org

Re: [PR] [SPARK-47043][BUILD] add `jackson-core` and `jackson-annotations` dependencies to module `spark-common-utils` [spark]

2024-02-22 Thread via GitHub
dongjoon-hyun commented on PR #45103: URL: https://github.com/apache/spark/pull/45103#issuecomment-1960064027 Did you send an email to dev, @William1104 ? It seems that I missed it. > Let me send an email to [d...@spark.apache.org](mailto:d...@spark.apache.org) on this topic. Thank

[GitHub] [spark] gengliangwang commented on a diff in pull request #42428: [SPARK-44742][PYTHON][DOCS] Add Spark version drop down to the PySpark doc site

2023-08-10 Thread via GitHub
tps://github.com/apache/spark>`_ | `Issues <https://issues.apache.org/jira/projects/SPARK/issues>`_ | |examples|_ | `Community <https://spark.apache.org/community.html>`_ +|binder|_ | `GitHub <https://github.com/apache/spark>`_ | `Issues <https://issues.apache.org

Re: Example Page Java Function2

2015-06-03 Thread linkstar350 .
6:23 PM, linkstar350 . > wrote: >> Hi, I'm Taira. >> >> I notice that this example page may be a mistake. >> >> https://spark.apache.org/examples.html >> >> >> Word Count (Java) >> >> JavaRDD textFile = spark

Re: Dataframe Write : Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.

2015-06-13 Thread Will Briggs
The context that is created by spark-shell is actually an instance of HiveContext. If you want to use it programmatically in your driver, you need to make sure that your context is a HiveContext, and not a SQLContext. https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables

Re: Dataframe Write : Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.

2015-06-13 Thread pth001
://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables Hope this helps, Will On June 13, 2015, at 3:36 PM, pth001 wrote: Hi, I am using spark 0.14. I try to insert data into a hive table (in orc format) from DF. partitionedTestDF.write.format

Re: Mllib using model to predict probability

2016-05-04 Thread ndjido
You can user the BinaryClassificationEvaluator class to get both predicted classes (0/1) and probabilities. Check the following spark doc https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html . Cheers, Ardo Sent from my iPhone > On 05 May 2016, at 07:59, colin wrote: >

RE: G1 GC takes too much time

2016-05-29 Thread condor join
The following are the parameters: -XX:+UseG1GC -XX:+UnlockDiagnostivVMOptions -XX:G1SummarizeConcMark -XX:InitiatingHeapOccupancyPercent=35 spark.executor.memory=4G From: Ted Yu Sent: May 30, 2016 9:47:05 To: condor join Cc: user@spark.apache.org Subject: Re: G1 GC takes
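As the quoted flags contain transcription typos (`UnlockDiagnostivVMOptions` should be `UnlockDiagnosticVMOptions`, and `G1SummarizeConcMark` needs the `+` prefix), here is how that configuration would normally be spelled in `spark-defaults.conf`. This is an illustrative fragment, not the original poster's exact setup:

```properties
# spark-defaults.conf -- G1 GC tuning for executors
spark.executor.memory            4g
spark.executor.extraJavaOptions  -XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:InitiatingHeapOccupancyPercent=35
```

`G1SummarizeConcMark` is a diagnostic flag, which is why `UnlockDiagnosticVMOptions` must precede it.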

RE: Is it possible to use SparkSQL JDBC ThriftServer without Hive

2016-01-13 Thread Mohammed Guller
Hi Angela, Yes, you can use Spark SQL JDBC/ThriftServer without Hive. Mohammed -Original Message- From: angela.whelan [mailto:angela.whe...@synchronoss.com] Sent: Wednesday, January 13, 2016 3:37 AM To: user@spark.apache.org Subject: Is it possible to use SparkSQL JDBC ThriftServer

RE: submit spark job with spcified file for driver

2016-02-04 Thread Mohammed Guller
[mailto:alexey.yakubov...@searshc.com] Sent: Thursday, February 4, 2016 2:18 PM To: user@spark.apache.org Subject: submit spark job with spcified file for driver Is it possible to specify a file (with key-value properties) when submitting spark app with spark-submit? Some mails refers to the key

Re: How to delete a record from parquet files using dataframes

2016-02-24 Thread Jakob Odersky
You can `filter` (scaladoc <http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrame@filter%28String%29:DataFrame>) your dataframes before saving them to- or after reading them from parquet files On Wed, Feb 24, 2016 at 1:28 AM, Cheng Lian wrote: > Par

RE: Update edge weight in graphx

2016-03-01 Thread Mohammed Guller
Like RDDs, Graphs are also immutable. Mohammed Author: Big Data Analytics with Spark -Original Message- From: naveen.marri [mailto:naveenkumarmarri6...@gmail.com] Sent: Monday, February 29, 2016 9:11 PM To: user@spark.apache.org Subject: Update edge weight in graphx Hi, I&#

RE: textFile() and includePackage() not found

2015-09-27 Thread Sun, Rui
@spark.apache.org Subject: textFile() and includePackage() not found Error: no methods for 'textFile' when I run the following 2nd command after SparkR initialized sc <- sparkR.init(appName = "RwordCount") lines <- textFile(sc, args[[1]]) But the following command works: lines2 &

RE: Hive with apache spark

2015-10-11 Thread Cheng, Hao
Hive Server does, and you can load the Hive table as need. -Original Message- From: Hafiz Mujadid [mailto:hafizmujadi...@gmail.com] Sent: Monday, October 12, 2015 1:43 AM To: user@spark.apache.org Subject: Hive with apache spark Hi how can we read data from external hive server.

Re: JMX with Spark

2015-11-05 Thread Yogesh Vyas
https://spark.apache.org/docs/latest/monitoring.html > > Romi Kuntsman, Big Data Engineer > http://www.totango.com > > On Thu, Nov 5, 2015 at 2:08 PM, Yogesh Vyas wrote: >> >> Hi, >> How we can use JMX and JCo

Re: Receiver and Parallelization

2015-09-25 Thread Adrian Tanase
1) yes, just use .repartition on the inbound stream, this will shuffle data across your whole cluster and process in parallel as specified. 2) yes, although I’m not sure how to do it for a totally custom receiver. Does this help as a starting point? http://spark.apache.org/docs/latest/streaming

RE: Performance tuning in Spark SQL.

2015-03-02 Thread Cheng, Hao
DED] query" is your best friend to tuning your SQL itself. *... And, a real use case scenario probably be more helpful in answering your question. -Original Message- From: dubey_a [mailto:abhishek.du...@xoriant.com] Sent: Monday, March 2, 2015 6:02 PM To: user@spark.apa

Re: What happened to the Row class in 1.3.0?

2015-04-06 Thread Nan Zhu
to call Row.create(object[]) similarly to what's shown in this > programming guide > <https://spark.apache.org/docs/latest/sql-programming-guide.html#programmatically-specifying-the-schema> > > , but the create() method is no longer recognized. I tried to look up the >

Re: Spark on EC2

2014-09-18 Thread Burak Yavuz
To: user@spark.apache.org Sent: Thursday, September 18, 2014 11:48:03 AM Subject: Spark on EC2 Hello, I am trying to run a python script that makes use of the kmeans MLIB and I'm not getting anywhere. I'm using an c3.xlarge instance as master, and 10 c3.large instances as slaves. In the cod

spark git commit: Fix "Building Spark With Maven" link in README.md

2014-12-25 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 11dd99317 -> 08b18c7eb Fix "Building Spark With Maven" link in README.md Corrected link to the Building Spark with Maven page from its original (http://spark.apache.org/docs/latest/building-with-maven.html) to the curren

[spark] branch branch-2.3 updated: [R][BACKPORT-2.4] update package description

2019-02-21 Thread felixcheung
, "Venkataraman", role = c("aut", "cre"), email = "felixche...@apache.org"), person(family = "The Apache Software Foundation", role = c("aut", "cph"))) License: Apache License (== 2.0) -URL: http://www.a

Re: [ANNOUNCE] Apache Spark 3.1.2 released

2021-06-01 Thread Gengliang Wang
;>> > >>> 2021년 6월 2일 (수) 오전 9:59, Dongjoon Hyun < > > > dongjoon.hyun@ > > > >님이 작성: > >>> > >>>> We are happy to announce the availability of Spark 3.1.2! > >>>> > >>>> Spark 3.1.2 is a maintenance

Re: Welcoming three new PMC members

2022-08-10 Thread Maciej
Hi all, The Spark PMC recently voted to add three new PMC members. Join me in welcoming them to their new roles! New PMC members: Huaxin Gao, Gengliang Wang and Maxim Gekk

Re: Welcoming Felix Cheung as a committer

2016-08-08 Thread Felix Cheung
add Felix Cheung as a committer. Felix has been a major contributor to SparkR and we're excited to have him join officially. Congrats and welcome, Felix! Matei

Re: [VOTE] Designating maintainers for some Spark components

2014-11-05 Thread Jeremy Freeman
On Wed, Nov 5, 2014 at 8:55 PM, Nan Zhu wrote: Will these maintainers have a cleanup for those pending PRs upon we start

Re: Welcoming three new committers

2015-02-03 Thread Timothy Chen
on MLlib, and Sean on ML and many pieces throughout Spark Core. Join me in welcoming them as committers! Matei

Re: [discuss] Removing individual commit messages from the squash commit message

2015-07-18 Thread Patrick Wendell
eckpointing doesn't retain driver port issue. Anybody against removing those from the merge script so the log looks cleaner? If nobody feels strongly about this, we can just create a JIRA to

Re: Spark Implementation of XGBoost

2015-10-26 Thread YiZhi Liu
re sub-sampling are also employed to avoid overfitting. Thank you for testing. I am looking forward to your comments and suggestions. Bugs or improvements can be reported through GitHub. Many thanks! Meihua

Re: BUILD FAILURE...again?! :( Spark Project External Flume on fire

2016-01-11 Thread Jean-Baptiste Onofré

[jira] [Commented] (SPARK-17339) Fix SparkR tests on Windows

2016-09-07 Thread Hadoop QA (JIRA)

[jira] [Updated] (SPARK-44027) create *permanent* Spark View from DataFrame via PySpark & Scala DataFrame API

2023-06-12 Thread Martin Bode (Jira)
> * > [DataFrame.createGlobalTempView|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.createGlobalTempView.html#pyspark.sql.DataFrame.createGlobalTempView] > * > [DataFrame.createOrReplaceGlobalTempView|https://spark.a

[jira] [Created] (SPARK-40723) Add .asf.yaml to apache/spark-docker

2022-10-10 Thread Yikun Jiang (Jira)
the License for the specific language governing permissions and # limitations under the License. # https://cwiki.apache.org/confluence/display/INFRA/git+-+.asf.yaml+features --- github: description: "Official Dockerfile for Apache Spark" homepage: https://spark.apache.org/ labels:

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-01 Thread dongjoon-hyun
OCOL, AUTHORITY, FILE, USERINFO\n" + + "key specifies which query to extract\n" + + "Examples:\n" + + " > SELECT _FUNC_('http://spark.apache.org/path?query=1', " + + "'HOST') FROM src LIMIT 1;\n" + "

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread janplus
ion( + usage = "_FUNC_(url, partToExtract[, key]) - extracts a part from a URL", + extended = """Parts: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO. +Key specifies which query to extract. +Examples: + > SELECT _FU
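The semantics being documented in the two entries above can be approximated with the Python standard library. This is an illustrative analogue of the SQL function's common parts, not the actual Spark implementation:

```python
from urllib.parse import urlparse, parse_qs

def parse_url(url, part, key=None):
    """Rough analogue of Spark SQL's parse_url(url, partToExtract[, key])."""
    u = urlparse(url)
    if part == "HOST":
        return u.hostname
    if part == "PATH":
        return u.path
    if part == "PROTOCOL":
        return u.scheme
    if part == "QUERY":
        if key is None:
            return u.query
        # With a key, return that query parameter's value (or None).
        values = parse_qs(u.query).get(key)
        return values[0] if values else None
    raise ValueError("unsupported part: " + part)

assert parse_url("http://spark.apache.org/path?query=1", "HOST") == "spark.apache.org"
assert parse_url("http://spark.apache.org/path?query=1", "QUERY", "query") == "1"
```

This mirrors the example in the PR's `usage` string: `SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST')` yields `spark.apache.org`.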

Can not subscript to mailing list

2015-10-20 Thread jeff.sadow...@gmail.com
I am having issues subscribing to the user@spark.apache.org mailing list. I would like to be added to the mailing list so I can post some configuration questions I have to the list that I do not see asked on the list. When I tried adding myself I got an email titled "confirm subscribe to


RE: JMX with Spark

2015-11-05 Thread Liu shen
Hi, This article may help you. Expose your counter through akka actor https://tersesystems.com/2014/08/19/exposing-akka-actor-state-with-jmx/ Sent from Mail for Windows 10 From: Yogesh Vyas Sent: Nov 5, 2015 21:21 To: Romi Kuntsman Cc: user@spark.apache.org Subject: Re: JMX with Spark Hi

Re: use S3-Compatible Storage with spark

2015-07-20 Thread Schmirr Wurst
I wonder how to use S3-compatible storage in Spark? If I'm using the s3n:// URL schema, then it will point to Amazon; is there a way I can spec

Re: CSV escaping not working

2016-10-27 Thread Jain, Nishit
Do you mind sharing why should escaping not work without quotes? From: Koert Kuipers Date: Thursday, October 27, 2016 at 12:40 PM To: "Jain, Nishit" Cc: user@spark.apache.org

Re: Kryo On Spark 1.6.0

2017-01-10 Thread Richard Startin
Hi Enrico, Only set spark.kryo.registrationRequired if you want to forbid any classes you have not explicitly registered - see http://spark.apache.org/docs/latest/configuration.html
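A configuration fragment makes the trade-off concrete. This is an illustrative sketch (the `com.example.MyRecord` class name is a placeholder, not from the thread):

```properties
# spark-defaults.conf -- opt in to Kryo; with registrationRequired=true,
# serializing any class not listed below raises an error instead of
# silently falling back to writing full class names.
spark.serializer                  org.apache.spark.serializer.KryoSerializer
spark.kryo.registrationRequired   true
spark.kryo.classesToRegister      com.example.MyRecord
```

Leaving `spark.kryo.registrationRequired` at its default (`false`) keeps Kryo working for unregistered classes, at the cost of larger serialized output.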

RE: Spark Avarage

2015-04-06 Thread Cheng, Hao
The Dataframe API should be perfectly helpful in this case. https://spark.apache.org/docs/1.3.0/sql-programming-guide.html Some code snippet will like: val sqlContext = new org.apache.spark.sql.SQLContext(sc) // this is used to implicitly convert an RDD to a DataFrame. import
