Mailing lists matching spark.apache.org

commits@spark.apache.org
dev@spark.apache.org
issues@spark.apache.org
reviews@spark.apache.org
user@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #41426: [SPARK-43920][SQL][CONNECT] Create sql/api module

2023-06-01 Thread via GitHub
k Project SQL API +https://spark.apache.org/ + +sql-api + + + Review Comment: Should we add common/util as a dependency? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] dongjoon-hyun commented on pull request #41490: [SPARK-43990][BUILD] Upgrade `kubernetes-client` to 6.7.1

2023-06-07 Thread via GitHub
dongjoon-hyun commented on PR #41490: URL: https://github.com/apache/spark/pull/41490#issuecomment-1581304467 It's because our feature freeze is July 16th and I believe you want to have up-to-date `kubernetes-client` before branch cut. https://spark.apache.org/versioning-policy

[GitHub] [spark] itholic commented on pull request #41711: [SPARK-44155] Adding a dev utility to improve error messages based on LLM

2023-06-26 Thread via GitHub
prompt of ChatGPT to adhere to Apache Spark's [Error Message Guidelines](https://spark.apache.org/error-message-guidelines.html). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] [Work in Progress] Experimenting to move TransportCipher to GCM based on Google Tink [spark]

2024-03-05 Thread via GitHub
that context. If this ends up being a nontrivial and user facing change, would be good to go through [an SPIP](https://spark.apache.org/improvement-proposals.html) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-47543][CONNECT][PYTHON] Inferring `dict` as `MapType` from Pandas DataFrame to allow DataFrame creation. [spark]

2024-03-25 Thread via GitHub
): self.spark.createDataFrame(pdf).collect(), ) +def test_schema_inference_from_pandas_with_dict(self): +from pyspark.sql.connect import functions as CF Review Comment: Let's add a comment see https://spark.apache.org/contributing.html ``` def test_case

[GitHub] [spark] wangyum commented on a diff in pull request #42344: [SPARK-44675][INFRA] Increase ReservedCodeCacheSize for release build

2023-08-04 Thread via GitHub
MAVEN_OPTS="-Xss128m -Xmx12g -XX:ReservedCodeCacheSize=1g" Review Comment: `ReservedCodeCacheSize` is consistent with https://spark.apache.org/docs/latest/building-spark.html#setting-up-mavens-memory-usage. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] bjornjorgensen commented on pull request #41711: [SPARK-44155] Adding a dev utility to improve error messages based on LLM

2023-08-04 Thread via GitHub
he link is wrong for me.. it goes to https://github.com/apache/spark/pull/37113 This is the right link https://spark.apache.org/error-message-guidelines.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] srowen commented on pull request #42382: [ML] Remove usage of RDD APIs for load/save in spark-ml

2023-08-09 Thread via GitHub
srowen commented on PR #42382: URL: https://github.com/apache/spark/pull/42382#issuecomment-1671614095 See https://spark.apache.org/contributing.html - go ahead and make and link a JIRA. I think this would target Spark 4.0? Does this relate to Spark Connect, like does it improve

Re: [PR] [SPARK-45848][BUILD] Make `spark-version-info.properties` generated by `spark-build-info.ps1` include `docroot` [spark]

2023-11-09 Thread via GitHub
]::UtcNow | Get-Date -UFormat +%Y-%m-%dT%H:%M:%SZ) -url=$(git config --get remote.origin.url)" +url=$(git config --get remote.origin.url), +docroot=https://spark.apache.org/docs/latest"; Review Comment: cc @GauthamBanasandra -- This is an automated message from the Apache Git S

[PR] [SPARK-44496][SQL][FOLLOW-UP] CalendarIntervalType is also orderable [spark]

2023-11-14 Thread via GitHub
amaliujia opened a new pull request, #43805: URL: https://github.com/apache/spark/pull/43805 ### What changes were proposed in this pull request? CalendarIntervalType is also orderable https://spark.apache.org/docs/3.0.2/api/java/org/apache/spark/sql/types

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-14 Thread via GitHub
particular chunk of a stream. */ -public final class StreamChunkId implements Encodable { Review Comment: Although this is `public`, this is not a part of our public Java doc, right? - https://spark.apache.org/docs/latest/api/java/index.html -- This is an automated message from the Apache Git

Re: [PR] [SPARK-45848][BUILD] Make `spark-version-info.properties` generated by `spark-build-info.ps1` include `docroot` [spark]

2023-11-18 Thread via GitHub
=$([DateTime]::UtcNow | Get-Date -UFormat +%Y-%m-%dT%H:%M:%SZ) -url=$(git config --get remote.origin.url)" +url=$(git config --get remote.origin.url), +docroot=https://spark.apache.org/docs/latest"; Review Comment: LGTM -- This is an automated message from the Apache Git Service. To

Re: [PR] [SPARK-45861][PYTHON][DOCS] Add user guide for dataframe creation [spark]

2023-11-29 Thread via GitHub
SQL Review Comment: I think we should be consistent with the main PySpark doc page terminology now that we are changing this: https://spark.apache.org/docs/latest/api/python/index.html On the page, we say "Spark SQL and DataFrames" I suggest we do the same here. --

[PR] [SPARK-46196][PYTHON][DOCS] Add missing function descriptions [spark]

2023-11-30 Thread via GitHub
zhengruifeng opened a new pull request, #44104: URL: https://github.com/apache/spark/pull/44104 ### What changes were proposed in this pull request? Add missing function descriptions ### Why are the changes needed? they are missing in https://spark.apache.org/docs/latest

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
n " +] + }, + "_LEGACY_ERROR_TEMP_3052" : { Review Comment: Can we use `UNSUPPORTED_FEATURE`? - https://spark.apache.org/docs/latest/sql-error-conditions-unsupported-feature-error-class.html#unsupported_feature-error-class -- This is an automated message from the Apache

[GitHub] [spark] wangyum opened a new pull request, #40807: [SPARK-43139][SQL][DOCS] Fix incorrect column names in sql-ref-syntax-dml-insert-table.md

2023-04-15 Thread via GitHub
wangyum opened a new pull request, #40807: URL: https://github.com/apache/spark/pull/40807 ### What changes were proposed in this pull request? This PR fixes incorrect column names in [sql-ref-syntax-dml-insert-table.md](https://spark.apache.org/docs/3.4.0/sql-ref-syntax-dml-insert

[GitHub] [spark] zwangsheng commented on pull request #40771: [SPARK-35723] set k8s pod container request, limit memory separately.

2023-04-18 Thread via GitHub
-archive.com/search?l=d...@spark.apache.org&q=subject:%22spark+executor+pod+has+same+memory+value+for+request+and+limit%22&o=newest) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] zzzzming95 commented on pull request #41000: [SPARK-43327] Trigger `committer.setupJob` before plan execute in `FileFormatWriter#write`

2023-05-11 Thread via GitHub
3.2`. > > * https://spark.apache.org/versioning-policy.html OK, I see a similar implementation for Spark 3.3, and I will submit it to Spark 3.3. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] LuciferYang commented on a diff in pull request #41281: [WIP] update to secure version of fasterxml

2023-05-23 Thread via GitHub
](https://spark.apache.org/contributing.html) for information on how to get started contributing to the project. +## Review Comment: hmm... why we need this change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] ronandoolan2 commented on a diff in pull request #41281: [WIP] update to secure version of fasterxml

2023-05-23 Thread via GitHub
](https://spark.apache.org/contributing.html) for information on how to get started contributing to the project. +## Review Comment: I had trouble retriggering the job after enabling github actions on my fork, I'll remove this line again -- This is an automated message from the Apach

[GitHub] [spark] dongjoon-hyun commented on pull request #41422: [SPARK-43541][SQL][3.2] Propagate all `Project` tags in resolving of expressions and missing columns

2023-06-01 Thread via GitHub
dongjoon-hyun commented on PR #41422: URL: https://github.com/apache/spark/pull/41422#issuecomment-1572331320 Hi, @MaxGekk . Sorry but Apache Spark 3.2 is EOL according to our versioning policy. - https://spark.apache.org/versioning-policy.html > No more ... releases should

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages [spark]

2023-12-31 Thread via GitHub
bjornjorgensen commented on PR #44540: URL: https://github.com/apache/spark/pull/44540#issuecomment-1873036859 There are actually more places where this issue exists. Like in [GROUP BY Clause for spark 3.5](https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-groupby.html) where

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]

2024-01-03 Thread via GitHub
the shuffle service is running on YARN: -Property Name Default Meaning +Property Name Default Meaning Since Version spark.yarn.shuffle.stopOnFailure Review Comment: For a record: https://github.com/apache/spark/pull/14162 https://spark.apache.org/docs/2.4.0/running-on-yarn.html

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]

2024-01-03 Thread via GitHub
the shuffle service is running on YARN: -Property Name Default Meaning +Property Name Default Meaning Since Version spark.yarn.shuffle.stopOnFailure Review Comment: For a record: https://github.com/apache/spark/pull/14162 First: https://spark.apache.org/docs/2.1.0/running-on-yarn.html

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]

2024-01-03 Thread via GitHub
-71e304edb6adff7be2edd8855cd040b965240627aa6ebe5b5e941b2fc41e090dR105 https://spark.apache.org/docs/3.2.0/running-on-yarn.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr

[GitHub] [spark] dongjoon-hyun commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-10 Thread GitBox
ver, `branch-3.0` is EOL currently because Apache Spark 3.0.0 was two years ago. Please see here. - https://spark.apache.org/versioning-policy.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36844: Update ExecutorClassLoader.scala

2022-06-12 Thread GitBox
: ClassLoader, userClassPathFirst: Boolean) extends ClassLoader(null) with Logging { - val uri = new URI(classUri) Review Comment: @guolianwei please 1. file a JIRA 2. describe how you reproduced the issue and tested this patch 3. add a unit test. See also https://spark.apache.org

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
a timestamp column") { Review Comment: Let's add a prefix "SPARK-39469: ...", see also https://spark.apache.org/contributing.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37210: add ignore for the recently added and failing mypy error 'type-var'

2022-07-17 Thread GitBox
ef _from_java(java_stage: "JavaObject") -> "JP": +def _from_java(java_stage: "JavaObject") -> "JP": # type: ignore[type-var] Review Comment: @anilbey mind filing a JIRA and linking it into PR title? See also, https://spark.apache.org/contributi

[GitHub] [spark] cxzl25 commented on pull request #36238: [SPARK-38916][CORE] Tasks not killed caused by race conditions between killTask() and launchTask()

2022-07-20 Thread GitBox
https://issues.apache.org/jira/browse/SPARK-38916 https://spark.apache.org/releases/spark-release-3-2-2.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails

2023-09-18 Thread via GitHub
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1724683270 Maybe it's the first time you are contributing to Apache Spark? If so, congrats on your first contribution! https://spark.apache.org/contributing.html Please chec

Re: [PR] [SPARK-45425] Mapped TINYINT to ShortType for MsSqlServerDialect [spark]

2023-10-05 Thread via GitHub
{ None } else { sqlType match { - case java.sql.Types.SMALLINT => Some(ShortType) + case java.sql.Types.SMALLINT | java.sql.Types.TINYINT => Some(ShortType) Review Comment: From the doc https://spark.apache.org/docs/latest/sql-ref-datatypes.html, it seem
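
For context on the mapping being reviewed above, a minimal sketch of how a custom JDBC dialect maps server-side integer types to Catalyst types; the object name and URL prefix are illustrative, not the actual MsSqlServerDialect source:

```scala
import java.sql.Types

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types._

// Illustrative dialect (not the actual MsSqlServerDialect source): maps both
// SMALLINT and TINYINT to Catalyst ShortType, mirroring the diff above.
object SqlServerLikeDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:sqlserver")

  override def getCatalystType(
      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] =
    sqlType match {
      case Types.SMALLINT | Types.TINYINT => Some(ShortType)
      case _ => None // defer to Spark's default mapping
    }
}

// Registering the dialect makes JDBC reads with matching URLs use it.
JdbcDialects.registerDialect(SqlServerLikeDialect)
```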

Re: [PR] [SPARK-45575][SQL] Support time travel options for df read API [spark]

2023-10-18 Thread via GitHub
"OPTION" : { +"message" : [ + "Timestamp string in the options should be in the format of '-MM-dd HH:mm:ss[.us][zone_id]'." Review Comment: The given formatter might be insufficient for valid timestamps and the `[.us][zone_id]` part diff

Re: [PR] [SPARK-40559][PYTHON] Add applyInArrow to groupBy and cogroup [spark]

2023-10-26 Thread via GitHub
list (https://spark.apache.org/news/spark-mailing-lists-moving-to-apache.html), and see if people like it, or ping other committers if there is some support here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: Pandas timezone problems

2015-05-21 Thread Xiangrui Meng
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: [pyspark] Starting workers in a virtualenv

2015-05-21 Thread Davies Liu
the same machine > in local mode. > > Thanks in advance! > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional co

Re: If not stop StreamingContext gracefully, will checkpoint data be consistent?

2015-06-15 Thread Akhil Das
be > consistent? > Or I should always gracefully shutdown the application even in order to > use the checkpoint? > > Thank you very much! > > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.

Re: Spark or Storm

2015-06-16 Thread Will Briggs
to see what is equivalent of Bolt in storm inside spark. Any help will be appreciated on this ? Thanks , Ashish - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@s

Re: Not getting event logs >= spark 1.3.1

2015-06-16 Thread Tsai Li Ming
-- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Futures timed out after 10000 milliseconds

2015-07-05 Thread Sean Owen
abble.com/Futures-timed-out-after-1-milliseconds-tp23622p23629.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.o

Re: Is it possible to change the default port number 7077 for spark?

2015-07-13 Thread Arun Verma
r List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > > -- Thanks and Regards, Arun Verma

Re: it seem like the exactly once feature not work on spark1.4

2015-07-17 Thread Tathagata Das
shing out the data > > SparkStreaming + Kafka only provide an exactly-once guarantee on step 1 & 2 > We need to ensure exactly once on step 3 by myself. > > More details see base on > http://spark.apache.org/docs/latest/streaming-programming-guide.html > <http://spark.apache.or
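
To make the "step 3" point above concrete, a small, hedged sketch of an idempotent output pattern with `foreachRDD`; the sink it writes to is hypothetical and only stands in for a store that supports keyed upserts:

```scala
import org.apache.spark.streaming.dstream.DStream

// `counts` is assumed to come from the Kafka direct stream after some aggregation.
def writeIdempotently(counts: DStream[(String, Long)]): Unit =
  counts.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // A real job would open one connection per partition to a store that supports
      // keyed upserts (hypothetical here), so a replayed batch overwrites the same
      // keys instead of appending duplicates.
      records.foreach { case (key, value) =>
        println(s"upsert $key -> $value") // stand-in for sink.upsert(key, value)
      }
    }
  }
```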

Re: DataFrame more efficient than RDD?

2015-07-18 Thread Ted Yu
active; > > https://spark.apache.org/docs/1.4.0/sql-programming-guide.html#inferring-the-schema-using-reflection > <https://spark.apache.org/docs/1.4.0/sql-programming-guide.html#inferring-the-schema-using-reflection> > > > Is a DataFrame more efficient (space-wise) than an RDD for

Re: Unit testing framework for Spark Jobs?

2016-03-02 Thread Silvio Fiorito
nit-testing-framework-for-Spark-Jobs-tp26380.html >Sent from the Apache Spark User List mailing list archive at Nabble.com. > >----- >To unsubscribe, e-mail: user-unsubscr...@spark.apache.org

Re: SFTP Compressed CSV into Dataframe

2016-03-02 Thread Ewan Leith
emote file decompressed, read, and loaded. Can someone give me any hints? Thanks, Ben - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.

Re: How to reduce the Executor Computing Time.

2016-03-29 Thread Ted Yu
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-reduce-the-Executor-Computing-Time-tp26623.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail

Re: How to estimate the size of dataframe using pyspark?

2016-04-09 Thread ndjido
ow-to-estimate-the-size-of-dataframe-using-pyspark-tp26729.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For addi

Re: Save DataFrame to HBase

2016-04-21 Thread Zhan Zhang
6 at 6:52 AM, Benjamin Kim mailto:bbuil...@gmail.com>> wrote: Has anyone found an easy way to save a DataFrame into HBase? Thanks, Ben - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org<mailto:user-unsubscr...@

Re: How to insert data for 100 partitions at a time using Spark SQL

2016-05-22 Thread Jörn Franke
SQL-tp26997.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For a

Re: Spark 2.0 Preview After caching query didn't work and can't kill job.

2016-06-15 Thread Chanh Le
> I ran in cluster 5 nodes in spark-shell. > > Did anyone has this issue? > > > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > <mailto:user-unsubscr...@spark.apache.org

Re: Spark Website

2016-07-13 Thread Reynold Xin
Thanks for reporting. This is due to https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-12055 On Wed, Jul 13, 2016 at 11:52 AM, Pradeep Gollakota wrote: > Worked for me if I go to https://spark.apache.org/site/ but not > https://spark.apache.org > > On Wed, Jul 13

Re: Bzip2 to Parquet format

2016-07-24 Thread Andrew Ehrlich
the RDD[Row] to create a DataFrame. http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.types.StructType <http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.types.StructType> Once you have the DataFrame, save it to parquet with datafram
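
As a hedged illustration of the pattern linked above (build a `StructType`, pair it with an `RDD[Row]`, then write Parquet); the input path and two-column layout are assumptions, and this uses the current SparkSession API rather than the 1.x SQLContext of the thread's era:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types._

val spark = SparkSession.builder().appName("bzip2-csv-to-parquet").getOrCreate()

// Assumed layout: two comma-separated columns, id and name (placeholders).
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

val rows = spark.sparkContext
  .textFile("/data/input.csv.bz2")                 // bzip2 is decompressed transparently
  .map(_.split(","))
  .map(cols => Row(cols(0).trim.toInt, cols(1).trim))

val df = spark.createDataFrame(rows, schema)
df.write.parquet("/data/output.parquet")
```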

Re: frequent itemsets

2016-01-02 Thread Roberto Pagliari
Hi Yanbo, Unfortunately, I cannot share the data. I am using the code in the tutorial https://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html Did you ever try to run it when there are hundreds of millions of co-purchases of at least two products? I suspect AR does not handle that

Re: Serializing DataSets

2016-01-19 Thread Simon Hafner
; >> DataFrame and use the writers there? >> >> >> >> ----- >> >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> >> For additional commands, e-mail: user-h...@spark.apache.org >> >> >> > > > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Re: --driver-java-options not support multiple JVM configuration ?

2016-01-21 Thread Marcelo Vanzin
s \ > > You need quotes around "$sparkdriverextraJavaOptions". > > -- > Marcelo > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > -- Marcelo

Re: Spark with SAS

2016-02-03 Thread Benjamin Kim
ies in Spark? >> >> For example calling Spark Jobs from SAS using Spark SQL through Spark SQL's >> JDBC/ODBC library. >> >> Regards, >> Sourav > > ----- > To unsubs

RE: Union Parquet, DataFrame

2016-03-01 Thread Andres.Fernandez
Worked perfectly. Thanks very much Silvio. From: Silvio Fiorito [mailto:silvio.fior...@granturing.com] Sent: Tuesday, March 01, 2016 2:14 PM To: Fernandez, Andres; user@spark.apache.org Subject: Re: Union Parquet, DataFrame Just replied to your other email, but here’s the same thing: Just do
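
The quoted reply is cut off above; as a rough, hedged sketch of one common way to union two Parquet-backed DataFrames (the paths are placeholders, and this is not necessarily the answer that followed):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("union-parquet").getOrCreate()

val a = spark.read.parquet("/data/part1.parquet") // placeholder paths
val b = spark.read.parquet("/data/part2.parquet")

// union matches columns by position, so both inputs must share the same schema.
val combined = a.union(b)
combined.write.parquet("/data/combined.parquet")
```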

Spark mailing list confusion

2015-09-29 Thread Robineast
Does anyone have any idea why some topics on the mailing list end up on https://www.mail-archive.com/user@spark.apache.org e.g. this message thread <https://www.mail-archive.com/user@spark.apache.org/msg37855.html> , but not on http://apache-spark-user-list.1001560.n3.nabble.com ? Whilst

Re: spark-ec2 config files.

2015-10-05 Thread Renato Perini
scribe, e-mail: user-unsubscr...@spark.apache.org <mailto:user-unsubscr...@spark.apache.org> For additional commands, e-mail: user-h...@spark.apache.org <mailto:user-h...@spark.apache.org>

Re: API to run spark Jobs

2015-10-06 Thread shahid qadri
submit my spark app(python) to the cluster without using > spark-submit, actually i need to invoke jobs from UI > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > <mailto:user-unsubscr...@spark.ap

unsubscribe

2015-10-20 Thread Pete Zybrick
ew this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Filter-RDD-tp25133p25148.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ---

Re: Using spark in cluster mode

2015-10-21 Thread Jacek Laskowski
Hi, Start here -> http://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds and then hop to http://spark.apache.org/docs/latest/spark-standalone.html. Once done, be back with your questions. I think it's gonna help a lot. Pozdrawiam, Jacek

Re: out of memory error with Parquet

2015-11-13 Thread Josh Rosen
25381p25382.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >

Re: NoSuchMethodError

2015-11-15 Thread Fengdong Yu
ith version 1.5.1 > > Can anyone please help me out in resolving this ? > > Regards, > Yogesh > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@sp

Re: Spark UI - Streaming Tab

2015-12-04 Thread patcharee
anks, Patcharee - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org <mailto:user-unsubscr...@spark.apache.org> For additional commands, e-mail: user-h...@spark.apache.org <mailto:user-h...@spark.apache.org>

Re: Release data for spark 1.6?

2015-12-09 Thread Sri
t;> Thanks >> Sri >> >> >> >> -- >> View this message in context: >> http://apache-spark-user-list.1001560.n3.nabble.com/Release-data-for-spark-1-6-tp25654.html >> Sent from the Apache Spark User List mailing list archive at Nabble.com. >>

Re: Spark - Eclipse IDE - Maven

2015-07-28 Thread Petar Zecevic
0.n3.nabble.com/Spark-Eclipse-IDE-Maven-tp23977.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@s

RE: Spark Interview Questions

2015-07-29 Thread Mishra, Abhishek
: user-unsubscr...@spark.apache.org<mailto:user-unsubscr...@spark.apache.org> For additional commands, e-mail: user-h...@spark.apache.org<mailto:user-h...@spark.apache.org>

Re: Does RDD.cartesian involve shuffling?

2015-08-04 Thread Meihua Wu
esian involve shuffling? >> >> Thanks! >> >> - >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> > > &

Re: Spark SQL support for Hive 0.14

2015-08-04 Thread Steve Loughran
-spark-user-list.1001560.n3.nabble.com/Spark-SQL-support-for-Hive-0-14-tp24122.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.

Re: Ranger-like Security on Spark

2015-09-03 Thread Matei Zaharia
erberos my only option > then? > > Kind regards, Daniel. > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > -

[ANNOUNCE] Announcing Apache Spark 2.1.0

2016-12-29 Thread Yin Huai
Hi all, Apache Spark 2.1.0 is the second release of Spark 2.x line. This release makes significant strides in the production readiness of Structured Streaming, with added support for event time watermarks <https://spark.apache.org/docs/2.1.0/structured-streaming-programming-guide.html#handl

Re: Spark Team - Paco Nathan said that your team can help

2015-01-22 Thread Marco Shaw
ted. > > > Thanks and regards, > Sudipta > > > > > -- > Sudipta Banerjee > Consultant, Business Analytics and Cloud Based Architecture > Call me +919019578099 > > > ----- > T

Re: spark 1.2 ec2 launch script hang

2015-01-26 Thread Pete Zybrick
--- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: no option to add intercepts for StreamingLinearAlgorithm

2015-02-09 Thread Xiangrui Meng
pts-for-StreamingLinearAlgorithm-tp21526.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.o

Re: method newAPIHadoopFile

2015-02-25 Thread patcharee
, Patcharee - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apa

Re: Get importerror when i run pyspark with ipython=1

2015-02-26 Thread Jey Kottalam
-list.1001560.n3.nabble.com/Get-importerror-when-i-run-pyspark-with-ipython-1-tp21843.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsu

Re: Upgrade to Spark 1.2.1 using Guava

2015-02-27 Thread Pat Ferrel
ar blah -- Marcelo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubsc

Re: Training Random Forest

2015-03-05 Thread Xiangrui Meng
-user-list.1001560.n3.nabble.com/Training-Random-Forest-tp21935.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For

Re: Spark Streaming input data source list

2015-03-09 Thread Cui Lin
t; Cc: "user@spark.apache.org<mailto:user@spark.apache.org>" mailto:user@spark.apache.org>> Subject: Re: Spark Streaming input data source list Spark Streaming has StreamingContext.socketStream() http://spark.apache.org/docs/1.2.1/api/java/org/apache/spark/streaming/StreamingContext.html#s

Re: Reading a text file into RDD[Char] instead of RDD[String]

2015-03-19 Thread Manoj Awasthi
tring], > > > > Can anyone suggest the most efficient way to create the RDD[Char] ? I’m > sure I’ve missed something simple… > > > > Regards, > > Mike > > -

Mailing list schizophrenia?

2015-03-20 Thread Jim Kleckner
I notice that some people send messages directly to user@spark.apache.org and some via nabble, either using email or the web client. There are two index sites, one directly at apache.org and one at nabble. But messages sent directly to user@spark.apache.org only show up in the apache list

Re: Does HiveContext connect to HiveServer2?

2015-03-24 Thread Marcelo Vanzin
park-user-list.1001560.n3.nabble.com/Does-HiveContext-connect-to-HiveServer2-tp22200.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.ap

Re: Add row IDs column to data frame

2015-04-05 Thread Xiangrui Meng
D", rowDF("ID")) > > Thanks > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Add-row-IDs-column-to-data-frame-tp22385.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > >

Re: java.lang.ClassCastException: scala.Tuple2 cannot be cast to org.apache.spark.mllib.regression.LabeledPoint

2015-04-06 Thread Xiangrui Meng
eciated. > > Thanks! > > J > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: spark.dynamicAllocation.minExecutors

2015-04-16 Thread Marcelo Vanzin
gt;> >> + private val minNumExecutors = >> conf.getInt("spark.dynamicAllocation.minExecutors", 0) >> ... >> + if (maxNumExecutors == 0) { >> + throw new SparkException("spark.dynamicAllocation.maxExecutors cannot be >> 0!") > &g
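
The review fragment above concerns validating these settings; as a hedged sketch, this is roughly how the dynamic-allocation bounds under discussion are set on a SparkConf (the values are arbitrary examples):

```scala
import org.apache.spark.SparkConf

// Arbitrary example values, not recommendations.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")      // typically required alongside dynamic allocation
  .set("spark.dynamicAllocation.minExecutors", "1")  // defaults to 0 if unset
  .set("spark.dynamicAllocation.maxExecutors", "20") // must be non-zero, per the check in the diff
```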

Re: SPARKTA: a real-time aggregation engine based on Spark Streaming

2015-05-14 Thread Matei Zaharia
hive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > -

Re: Streaming + SQL : How to resgister a DStream content as a table and access it

2014-08-04 Thread Tathagata Das
t-as-a-table-and-access-it-tp11372.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For a

Re: How to implement multinomial logistic regression(softmax regression) in Spark?

2014-08-15 Thread DB Tsai
t.1001560.n3.nabble.com/How-to-implement-multinomial-logistic-regression-softmax-regression-in-Spark-tp11939p12175.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe,

Re: application as a service

2014-08-17 Thread Davies Liu
context: > http://apache-spark-user-list.1001560.n3.nabble.com/application-as-a-service-tp12253p12267.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-u

Re: Python script runs fine in local mode, errors in other modes

2014-08-19 Thread Davies Liu
list.1001560.n3.nabble.com/Python-script-runs-fine-in-local-mode-errors-in-other-modes-tp12390p12398.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsu

Re: Key-Value in PairRDD

2014-08-26 Thread Sean Owen
I'd suggest first reading the scaladoc for RDD and PairRDDFunctions to familiarize yourself with all the operations available: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD http://spark.apache.org/docs/latest/api/scala/index
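
A tiny, hedged example of the pair-RDD operations those scaladoc pages cover (the data and app name are made up):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("pair-rdd-demo").setMaster("local[*]"))

// A pair RDD is just an RDD[(K, V)]; PairRDDFunctions adds the *ByKey operations implicitly.
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

val sums   = pairs.reduceByKey(_ + _)              // (a,4), (b,2)
val groups = pairs.groupByKey().mapValues(_.toSeq) // (a, Seq(1, 3)), (b, Seq(2))

println(sums.collect().mkString(", "))
```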

Re: New features (Discretization) for v1.x in xiangrui.pdf

2014-09-03 Thread Xiangrui Meng
nabble.com/New-features-Discretization-for-v1-x-in-xiangrui-pdf-tp13256p13338.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spar

Re: Running spark-shell (or queries) over the network (not from master)

2014-09-05 Thread Ognen Duzlevski
master-tp13543p13595.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands,

Re: Efficient way to sum multiple columns

2014-09-15 Thread Xiangrui Meng
001560.n3.nabble.com/Efficient-way-to-sum-multiple-columns-tp14281.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@

Re: Null values in pyspark Row

2014-09-24 Thread Davies Liu
ull-values-in-pyspark-Row-tp15065.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache

Re: java.lang.OutOfMemoryError while running SVD MLLib example

2014-09-25 Thread Xiangrui Meng
user-list.1001560.n3.nabble.com/java-lang-OutOfMemoryError-while-running-SVD-MLLib-example-tp14972p15083.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr.

Re: Access by name in "tuples" in Scala with Spark

2014-09-26 Thread Sean Owen
60.n3.nabble.com/Access-by-name-in-tuples-in-Scala-with-Spark-tp15212.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark

Re: Trouble getting filtering on field correct

2014-10-03 Thread Davies Liu
sage in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Trouble-getting-filtering-on-field-correct-tp15728.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To u
