Mailing lists matching spark.apache.org

commits@spark.apache.org
dev@spark.apache.org
issues@spark.apache.org
reviews@spark.apache.org
user@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #41426: [SPARK-43920][SQL][CONNECT] Create sql/api module

2023-06-01 Thread via GitHub
k Project SQL API +https://spark.apache.org/ + +sql-api + + + Review Comment: Should we add common/util as a dependency? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] dongjoon-hyun commented on pull request #41490: [SPARK-43990][BUILD] Upgrade `kubernetes-client` to 6.7.1

2023-06-07 Thread via GitHub
dongjoon-hyun commented on PR #41490: URL: https://github.com/apache/spark/pull/41490#issuecomment-1581304467 It's because our feature freeze is July 16th and I believe you want to have up-to-date `kubernetes-client` before branch cut. https://spark.apache.org/versioning-policy

[GitHub] [spark] itholic commented on pull request #41711: [SPARK-44155] Adding a dev utility to improve error messages based on LLM

2023-06-26 Thread via GitHub
prompt of ChatGPT to adhere to Apache Spark's [Error Message Guidelines](https://spark.apache.org/error-message-guidelines.html). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] [Work in Progress] Experimenting to move TransportCipher to GCM based on Google Tink [spark]

2024-03-05 Thread via GitHub
that context. If this ends up being a nontrivial and user facing change, would be good to go through [an SPIP](https://spark.apache.org/improvement-proposals.html) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-47543][CONNECT][PYTHON] Inferring `dict` as `MapType` from Pandas DataFrame to allow DataFrame creation. [spark]

2024-03-25 Thread via GitHub
): self.spark.createDataFrame(pdf).collect(), ) +def test_schema_inference_from_pandas_with_dict(self): +from pyspark.sql.connect import functions as CF Review Comment: Let's add a comment see https://spark.apache.org/contributing.html ``` def test_case

[GitHub] [spark] wangyum commented on a diff in pull request #42344: [SPARK-44675][INFRA] Increase ReservedCodeCacheSize for release build

2023-08-04 Thread via GitHub
MAVEN_OPTS="-Xss128m -Xmx12g -XX:ReservedCodeCacheSize=1g" Review Comment: `ReservedCodeCacheSize` is consistent with https://spark.apache.org/docs/latest/building-spark.html#setting-up-mavens-memory-usage. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] bjornjorgensen commented on pull request #41711: [SPARK-44155] Adding a dev utility to improve error messages based on LLM

2023-08-04 Thread via GitHub
he link is wrong for me.. it goes to https://github.com/apache/spark/pull/37113 This is the right link https://spark.apache.org/error-message-guidelines.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] srowen commented on pull request #42382: [ML] Remove usage of RDD APIs for load/save in spark-ml

2023-08-09 Thread via GitHub
srowen commented on PR #42382: URL: https://github.com/apache/spark/pull/42382#issuecomment-1671614095 See https://spark.apache.org/contributing.html - go ahead and make and link a JIRA. I think this would target Spark 4.0? Does this relate to Spark Connect, like does it improve

Re: [PR] [SPARK-45848][BUILD] Make `spark-version-info.properties` generated by `spark-build-info.ps1` include `docroot` [spark]

2023-11-09 Thread via GitHub
]::UtcNow | Get-Date -UFormat +%Y-%m-%dT%H:%M:%SZ) -url=$(git config --get remote.origin.url)" +url=$(git config --get remote.origin.url), +docroot=https://spark.apache.org/docs/latest"; Review Comment: cc @GauthamBanasandra -- This is an automated message from the Apache Git S

[PR] [SPARK-44496][SQL][FOLLOW-UP] CalendarIntervalType is also orderable [spark]

2023-11-14 Thread via GitHub
amaliujia opened a new pull request, #43805: URL: https://github.com/apache/spark/pull/43805 ### What changes were proposed in this pull request? CalendarIntervalType is also orderable https://spark.apache.org/docs/3.0.2/api/java/org/apache/spark/sql/types

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-14 Thread via GitHub
particular chunk of a stream. */ -public final class StreamChunkId implements Encodable { Review Comment: Although this is `public`, this is not a part of our public Java doc, right? - https://spark.apache.org/docs/latest/api/java/index.html -- This is an automated message from the Apache Git

Re: [PR] [SPARK-45848][BUILD] Make `spark-version-info.properties` generated by `spark-build-info.ps1` include `docroot` [spark]

2023-11-18 Thread via GitHub
=$([DateTime]::UtcNow | Get-Date -UFormat +%Y-%m-%dT%H:%M:%SZ) -url=$(git config --get remote.origin.url)" +url=$(git config --get remote.origin.url), +docroot=https://spark.apache.org/docs/latest"; Review Comment: LGTM -- This is an automated message from the Apache Git Service. To

Re: [PR] [SPARK-45861][PYTHON][DOCS] Add user guide for dataframe creation [spark]

2023-11-29 Thread via GitHub
SQL Review Comment: I think we should be consistent with the main PySpark doc page terminology now that we are changing this: https://spark.apache.org/docs/latest/api/python/index.html On the page, we say "Spark SQL and DataFrames" I suggest we do the same here. --

[PR] [SPARK-46196][PYTHON][DOCS] Add missing function descriptions [spark]

2023-11-30 Thread via GitHub
zhengruifeng opened a new pull request, #44104: URL: https://github.com/apache/spark/pull/44104 ### What changes were proposed in this pull request? Add missing function descriptions ### Why are the changes needed? they are missing in https://spark.apache.org/docs/latest

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
n " +] + }, + "_LEGACY_ERROR_TEMP_3052" : { Review Comment: Can we use `UNSUPPORTED_FEATURE`? - https://spark.apache.org/docs/latest/sql-error-conditions-unsupported-feature-error-class.html#unsupported_feature-error-class -- This is an automated message from the Apache

[GitHub] [spark] wangyum opened a new pull request, #40807: [SPARK-43139][SQL][DOCS] Fix incorrect column names in sql-ref-syntax-dml-insert-table.md

2023-04-15 Thread via GitHub
wangyum opened a new pull request, #40807: URL: https://github.com/apache/spark/pull/40807 ### What changes were proposed in this pull request? This PR fixes incorrect column names in [sql-ref-syntax-dml-insert-table.md](https://spark.apache.org/docs/3.4.0/sql-ref-syntax-dml-insert

[GitHub] [spark] zwangsheng commented on pull request #40771: [SPARK-35723] set k8s pod container request, limit memory separately.

2023-04-18 Thread via GitHub
-archive.com/search?l=d...@spark.apache.org&q=subject:%22spark+executor+pod+has+same+memory+value+for+request+and+limit%22&o=newest) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] zzzzming95 commented on pull request #41000: [SPARK-43327] Trigger `committer.setupJob` before plan execute in `FileFormatWriter#write`

2023-05-11 Thread via GitHub
3.2`. > > * https://spark.apache.org/versioning-policy.html OK, I see a similar implementation for Spark 3.3, and I will submit it to Spark 3.3. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] LuciferYang commented on a diff in pull request #41281: [WIP] update to secure version of fasterxml

2023-05-23 Thread via GitHub
](https://spark.apache.org/contributing.html) for information on how to get started contributing to the project. +## Review Comment: hmm... why we need this change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] ronandoolan2 commented on a diff in pull request #41281: [WIP] update to secure version of fasterxml

2023-05-23 Thread via GitHub
](https://spark.apache.org/contributing.html) for information on how to get started contributing to the project. +## Review Comment: I had trouble retriggering the job after enabling github actions on my fork, I'll remove this line again -- This is an automated message from the Apach

[GitHub] [spark] dongjoon-hyun commented on pull request #41422: [SPARK-43541][SQL][3.2] Propagate all `Project` tags in resolving of expressions and missing columns

2023-06-01 Thread via GitHub
dongjoon-hyun commented on PR #41422: URL: https://github.com/apache/spark/pull/41422#issuecomment-1572331320 Hi, @MaxGekk . Sorry but Apache Spark 3.2 is EOL according to our versioning policy. - https://spark.apache.org/versioning-policy.html > No more ... releases should

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages [spark]

2023-12-31 Thread via GitHub
bjornjorgensen commented on PR #44540: URL: https://github.com/apache/spark/pull/44540#issuecomment-1873036859 There are actually more places where this issue exists. Like in [GROUP BY Clause for spark 3.5](https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-groupby.html) where

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]

2024-01-03 Thread via GitHub
the shuffle service is running on YARN: -Property Name Default Meaning +Property Name Default Meaning Since Version spark.yarn.shuffle.stopOnFailure Review Comment: For a record: https://github.com/apache/spark/pull/14162 https://spark.apache.org/docs/2.4.0/running-on-yarn.html

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]

2024-01-03 Thread via GitHub
the shuffle service is running on YARN: -Property Name Default Meaning +Property Name Default Meaning Since Version spark.yarn.shuffle.stopOnFailure Review Comment: For a record: https://github.com/apache/spark/pull/14162 First: https://spark.apache.org/docs/2.1.0/running-on-yarn.html

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]

2024-01-03 Thread via GitHub
-71e304edb6adff7be2edd8855cd040b965240627aa6ebe5b5e941b2fc41e090dR105 https://spark.apache.org/docs/3.2.0/running-on-yarn.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr

[GitHub] [spark] dongjoon-hyun commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-10 Thread GitBox
ver, `branch-3.0` is EOL currently because Apache Spark 3.0.0 was two years ago. Please see here. - https://spark.apache.org/versioning-policy.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36844: Update ExecutorClassLoader.scala

2022-06-12 Thread GitBox
: ClassLoader, userClassPathFirst: Boolean) extends ClassLoader(null) with Logging { - val uri = new URI(classUri) Review Comment: @guolianwei please 1. file a JIRA 2. describe how you reproduced the issue and tested this patch 3. add a unit test. See also https://spark.apache.org

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
a timestamp column") { Review Comment: Let's add a prefix "SPARK-39469: ...", see also https://spark.apache.org/contributing.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37210: add ignore for the recently added and failing mypy error 'type-var'

2022-07-17 Thread GitBox
ef _from_java(java_stage: "JavaObject") -> "JP": +def _from_java(java_stage: "JavaObject") -> "JP": # type: ignore[type-var] Review Comment: @anilbey mind filing a JIRA and linking it into PR title? See also, https://spark.apache.org/contributi

[GitHub] [spark] cxzl25 commented on pull request #36238: [SPARK-38916][CORE] Tasks not killed caused by race conditions between killTask() and launchTask()

2022-07-20 Thread GitBox
https://issues.apache.org/jira/browse/SPARK-38916 https://spark.apache.org/releases/spark-release-3-2-2.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails

2023-09-18 Thread via GitHub
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1724683270 Maybe it's the first time you are contributing to Apache Spark? If so, congrats on your first contribution! https://spark.apache.org/contributing.html Please chec

Re: [PR] [SPARK-45425] Mapped TINYINT to ShortType for MsSqlServerDialect [spark]

2023-10-05 Thread via GitHub
{ None } else { sqlType match { - case java.sql.Types.SMALLINT => Some(ShortType) + case java.sql.Types.SMALLINT | java.sql.Types.TINYINT => Some(ShortType) Review Comment: From the doc https://spark.apache.org/docs/latest/sql-ref-datatypes.html, it seem
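
For context on the mapping being reviewed above, a minimal sketch of how a custom JDBC dialect maps server-side integer types to Catalyst types; the object name and URL prefix are illustrative, not the actual MsSqlServerDialect source:

```scala
import java.sql.Types

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types._

// Illustrative dialect (not the actual MsSqlServerDialect source): maps both
// SMALLINT and TINYINT to Catalyst ShortType, mirroring the diff above.
object SqlServerLikeDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:sqlserver")

  override def getCatalystType(
      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] =
    sqlType match {
      case Types.SMALLINT | Types.TINYINT => Some(ShortType)
      case _ => None // defer to Spark's default mapping
    }
}

// Registering the dialect makes JDBC reads with matching URLs use it.
JdbcDialects.registerDialect(SqlServerLikeDialect)
```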

Re: [PR] [SPARK-45575][SQL] Support time travel options for df read API [spark]

2023-10-18 Thread via GitHub
"OPTION" : { +"message" : [ + "Timestamp string in the options should be in the format of '-MM-dd HH:mm:ss[.us][zone_id]'." Review Comment: The given formatter might be insufficient for valid timestamps and the `[.us][zone_id]` part diff

Re: [PR] [SPARK-40559][PYTHON] Add applyInArrow to groupBy and cogroup [spark]

2023-10-26 Thread via GitHub
list (https://spark.apache.org/news/spark-mailing-lists-moving-to-apache.html), and see if people like it, or ping other committers if there is some support here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: Pandas timezone problems

2015-05-21 Thread Xiangrui Meng
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: [pyspark] Starting workers in a virtualenv

2015-05-21 Thread Davies Liu
the same machine > in local mode. > > Thanks in advance! > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional co

Re: If not stop StreamingContext gracefully, will checkpoint data be consistent?

2015-06-15 Thread Akhil Das
be > consistent? > Or I should always gracefully shutdown the application even in order to > use the checkpoint? > > Thank you very much! > > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.

Re: Spark or Storm

2015-06-16 Thread Will Briggs
to see what is equivalent of Bolt in storm inside spark. Any help will be appreciated on this ? Thanks , Ashish - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@s

Re: Not getting event logs >= spark 1.3.1

2015-06-16 Thread Tsai Li Ming
-- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Futures timed out after 10000 milliseconds

2015-07-05 Thread Sean Owen
abble.com/Futures-timed-out-after-1-milliseconds-tp23622p23629.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.o

Re: Is it possible to change the default port number 7077 for spark?

2015-07-13 Thread Arun Verma
r List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > > -- Thanks and Regards, Arun Verma

Re: it seem like the exactly once feature not work on spark1.4

2015-07-17 Thread Tathagata Das
shing out the data > > SparkStreaming + Kafka only provide an exactly-once guarantee on step 1 & 2 > We need to ensure exactly once on step 3 by myself. > > More details see base on > http://spark.apache.org/docs/latest/streaming-programming-guide.html > <http://spark.apache.or
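
To make the "step 3" point above concrete, a small, hedged sketch of an idempotent output pattern with `foreachRDD`; the sink it writes to is hypothetical and only stands in for a store that supports keyed upserts:

```scala
import org.apache.spark.streaming.dstream.DStream

// `counts` is assumed to come from the Kafka direct stream after some aggregation.
def writeIdempotently(counts: DStream[(String, Long)]): Unit =
  counts.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // A real job would open one connection per partition to a store that supports
      // keyed upserts (hypothetical here), so a replayed batch overwrites the same
      // keys instead of appending duplicates.
      records.foreach { case (key, value) =>
        println(s"upsert $key -> $value") // stand-in for sink.upsert(key, value)
      }
    }
  }
```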

Re: DataFrame more efficient than RDD?

2015-07-18 Thread Ted Yu
active; > > https://spark.apache.org/docs/1.4.0/sql-programming-guide.html#inferring-the-schema-using-reflection > <https://spark.apache.org/docs/1.4.0/sql-programming-guide.html#inferring-the-schema-using-reflection> > > > Is a DataFrame more efficient (space-wise) than an RDD for

Re: Unit testing framework for Spark Jobs?

2016-03-02 Thread Silvio Fiorito
nit-testing-framework-for-Spark-Jobs-tp26380.html >Sent from the Apache Spark User List mailing list archive at Nabble.com. > >----- >To unsubscribe, e-mail: user-unsubscr...@spark.apache.org

Re: SFTP Compressed CSV into Dataframe

2016-03-02 Thread Ewan Leith
emote file decompressed, read, and loaded. Can someone give me any hints? Thanks, Ben - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.

Re: How to reduce the Executor Computing Time.

2016-03-29 Thread Ted Yu
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-reduce-the-Executor-Computing-Time-tp26623.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail

Re: How to estimate the size of dataframe using pyspark?

2016-04-09 Thread ndjido
ow-to-estimate-the-size-of-dataframe-using-pyspark-tp26729.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For addi

Re: Save DataFrame to HBase

2016-04-21 Thread Zhan Zhang
6 at 6:52 AM, Benjamin Kim mailto:bbuil...@gmail.com>> wrote: Has anyone found an easy way to save a DataFrame into HBase? Thanks, Ben - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org<mailto:user-unsubscr...@

Re: How to insert data for 100 partitions at a time using Spark SQL

2016-05-22 Thread Jörn Franke
SQL-tp26997.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For a

Re: Spark 2.0 Preview After caching query didn't work and can't kill job.

2016-06-15 Thread Chanh Le
> I ran in cluster 5 nodes in spark-shell. > > Did anyone has this issue? > > > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > <mailto:user-unsubscr...@spark.apache.org

Re: Spark Website

2016-07-13 Thread Reynold Xin
Thanks for reporting. This is due to https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-12055 On Wed, Jul 13, 2016 at 11:52 AM, Pradeep Gollakota wrote: > Worked for me if I go to https://spark.apache.org/site/ but not > https://spark.apache.org > > On Wed, Jul 13

Re: Bzip2 to Parquet format

2016-07-24 Thread Andrew Ehrlich
the RDD[Row] to create a DataFrame. http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.types.StructType <http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.types.StructType> Once you have the DataFrame, save it to parquet with datafram
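
As a hedged illustration of the pattern linked above (build a `StructType`, pair it with an `RDD[Row]`, then write Parquet); the input path and two-column layout are assumptions, and this uses the current SparkSession API rather than the 1.x SQLContext of the thread's era:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types._

val spark = SparkSession.builder().appName("bzip2-csv-to-parquet").getOrCreate()

// Assumed layout: two comma-separated columns, id and name (placeholders).
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

val rows = spark.sparkContext
  .textFile("/data/input.csv.bz2")                 // bzip2 is decompressed transparently
  .map(_.split(","))
  .map(cols => Row(cols(0).trim.toInt, cols(1).trim))

val df = spark.createDataFrame(rows, schema)
df.write.parquet("/data/output.parquet")
```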

Re: frequent itemsets

2016-01-02 Thread Roberto Pagliari
Hi Yanbo, Unfortunately, I cannot share the data. I am using the code in the tutorial https://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html Did you ever try to run it when there are hundreds of millions of co-purchases of at least two products? I suspect AR does not handle that

Re: Serializing DataSets

2016-01-19 Thread Simon Hafner
; >> DataFrame and use the writers there? >> >> >> >> ----- >> >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> >> For additional commands, e-mail: user-h...@spark.apache.org >> >> >> > > > - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: Re: --driver-java-options not support multiple JVM configuration ?

2016-01-21 Thread Marcelo Vanzin
s \ > > You need quotes around "$sparkdriverextraJavaOptions". > > -- > Marcelo > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > -- Marcelo

Re: Spark with SAS

2016-02-03 Thread Benjamin Kim
ies in Spark? >> >> For example calling Spark Jobs from SAS using Spark SQL through Spark SQL's >> JDBC/ODBC library. >> >> Regards, >> Sourav > > ----- > To unsubs

RE: Union Parquet, DataFrame

2016-03-01 Thread Andres.Fernandez
Worked perfectly. Thanks very much Silvio. From: Silvio Fiorito [mailto:silvio.fior...@granturing.com] Sent: Tuesday, March 01, 2016 2:14 PM To: Fernandez, Andres; user@spark.apache.org Subject: Re: Union Parquet, DataFrame Just replied to your other email, but here’s the same thing: Just do
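
The quoted reply is cut off above; as a rough, hedged sketch of one common way to union two Parquet-backed DataFrames (the paths are placeholders, and this is not necessarily the answer that followed):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("union-parquet").getOrCreate()

val a = spark.read.parquet("/data/part1.parquet") // placeholder paths
val b = spark.read.parquet("/data/part2.parquet")

// union matches columns by position, so both inputs must share the same schema.
val combined = a.union(b)
combined.write.parquet("/data/combined.parquet")
```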

Spark mailing list confusion

2015-09-29 Thread Robineast
Does anyone have any idea why some topics on the mailing list end up on https://www.mail-archive.com/user@spark.apache.org e.g. this message thread <https://www.mail-archive.com/user@spark.apache.org/msg37855.html> , but not on http://apache-spark-user-list.1001560.n3.nabble.com ? Whilst

Re: spark-ec2 config files.

2015-10-05 Thread Renato Perini
scribe, e-mail: user-unsubscr...@spark.apache.org <mailto:user-unsubscr...@spark.apache.org> For additional commands, e-mail: user-h...@spark.apache.org <mailto:user-h...@spark.apache.org>

Re: API to run spark Jobs

2015-10-06 Thread shahid qadri
submit my spark app(python) to the cluster without using > spark-submit, actually i need to invoke jobs from UI > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > <mailto:user-unsubscr...@spark.ap

unsubscribe

2015-10-20 Thread Pete Zybrick
ew this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Filter-RDD-tp25133p25148.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ---

Re: Using spark in cluster mode

2015-10-21 Thread Jacek Laskowski
Hi, Start here -> http://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds and then hop to http://spark.apache.org/docs/latest/spark-standalone.html. Once done, be back with your questions. I think it's gonna help a lot. Pozdrawiam, Jacek

Re: out of memory error with Parquet

2015-11-13 Thread Josh Rosen
25381p25382.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >

Re: NoSuchMethodError

2015-11-15 Thread Fengdong Yu
ith version 1.5.1 > > Can anyone please help me out in resolving this ? > > Regards, > Yogesh > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@sp

Re: Spark UI - Streaming Tab

2015-12-04 Thread patcharee
anks, Patcharee - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org <mailto:user-unsubscr...@spark.apache.org> For additional commands, e-mail: user-h...@spark.apache.org <mailto:user-h...@spark.apache.org>

Re: Release data for spark 1.6?

2015-12-09 Thread Sri
t;> Thanks >> Sri >> >> >> >> -- >> View this message in context: >> http://apache-spark-user-list.1001560.n3.nabble.com/Release-data-for-spark-1-6-tp25654.html >> Sent from the Apache Spark User List mailing list archive at Nabble.com. >>

Re: Spark - Eclipse IDE - Maven

2015-07-28 Thread Petar Zecevic
0.n3.nabble.com/Spark-Eclipse-IDE-Maven-tp23977.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@s

RE: Spark Interview Questions

2015-07-29 Thread Mishra, Abhishek
: user-unsubscr...@spark.apache.org<mailto:user-unsubscr...@spark.apache.org> For additional commands, e-mail: user-h...@spark.apache.org<mailto:user-h...@spark.apache.org>

Re: Does RDD.cartesian involve shuffling?

2015-08-04 Thread Meihua Wu
esian involve shuffling? >> >> Thanks! >> >> - >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> > > &

Re: Spark SQL support for Hive 0.14

2015-08-04 Thread Steve Loughran
-spark-user-list.1001560.n3.nabble.com/Spark-SQL-support-for-Hive-0-14-tp24122.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.

Re: Ranger-like Security on Spark

2015-09-03 Thread Matei Zaharia
erberos my only option > then? > > Kind regards, Daniel. > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > -

[ANNOUNCE] Announcing Apache Spark 2.1.0

2016-12-29 Thread Yin Huai
Hi all, Apache Spark 2.1.0 is the second release of Spark 2.x line. This release makes significant strides in the production readiness of Structured Streaming, with added support for event time watermarks <https://spark.apache.org/docs/2.1.0/structured-streaming-programming-guide.html#handl

Re: Spark Team - Paco Nathan said that your team can help

2015-01-22 Thread Marco Shaw
ted. > > > Thanks and regards, > Sudipta > > > > > -- > Sudipta Banerjee > Consultant, Business Analytics and Cloud Based Architecture > Call me +919019578099 > > > ----- > T

Re: spark 1.2 ec2 launch script hang

2015-01-26 Thread Pete Zybrick
--- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: no option to add intercepts for StreamingLinearAlgorithm

2015-02-09 Thread Xiangrui Meng
pts-for-StreamingLinearAlgorithm-tp21526.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.o

Re: method newAPIHadoopFile

2015-02-25 Thread patcharee
, Patcharee - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apa

Re: Get importerror when i run pyspark with ipython=1

2015-02-26 Thread Jey Kottalam
-list.1001560.n3.nabble.com/Get-importerror-when-i-run-pyspark-with-ipython-1-tp21843.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsu

Re: Upgrade to Spark 1.2.1 using Guava

2015-02-27 Thread Pat Ferrel
ar blah -- Marcelo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubsc

Re: Training Random Forest

2015-03-05 Thread Xiangrui Meng
-user-list.1001560.n3.nabble.com/Training-Random-Forest-tp21935.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For

Re: Spark Streaming input data source list

2015-03-09 Thread Cui Lin
t; Cc: "user@spark.apache.org<mailto:user@spark.apache.org>" mailto:user@spark.apache.org>> Subject: Re: Spark Streaming input data source list Spark Streaming has StreamingContext.socketStream() http://spark.apache.org/docs/1.2.1/api/java/org/apache/spark/streaming/StreamingContext.html#s

Re: Reading a text file into RDD[Char] instead of RDD[String]

2015-03-19 Thread Manoj Awasthi
tring], > > > > Can anyone suggest the most efficient way to create the RDD[Char] ? I’m > sure I’ve missed something simple… > > > > Regards, > > Mike > > -

Mailing list schizophrenia?

2015-03-20 Thread Jim Kleckner
I notice that some people send messages directly to user@spark.apache.org and some via nabble, either using email or the web client. There are two index sites, one directly at apache.org and one at nabble. But messages sent directly to user@spark.apache.org only show up in the apache list

Re: Does HiveContext connect to HiveServer2?

2015-03-24 Thread Marcelo Vanzin
park-user-list.1001560.n3.nabble.com/Does-HiveContext-connect-to-HiveServer2-tp22200.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.ap

Re: Add row IDs column to data frame

2015-04-05 Thread Xiangrui Meng
D", rowDF("ID")) > > Thanks > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Add-row-IDs-column-to-data-frame-tp22385.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > >

Re: java.lang.ClassCastException: scala.Tuple2 cannot be cast to org.apache.spark.mllib.regression.LabeledPoint

2015-04-06 Thread Xiangrui Meng
eciated. > > Thanks! > > J > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > ----- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: spark.dynamicAllocation.minExecutors

2015-04-16 Thread Marcelo Vanzin
gt;> >> + private val minNumExecutors = >> conf.getInt("spark.dynamicAllocation.minExecutors", 0) >> ... >> + if (maxNumExecutors == 0) { >> + throw new SparkException("spark.dynamicAllocation.maxExecutors cannot be >> 0!") > &g
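
The review fragment above concerns validating these settings; as a hedged sketch, this is roughly how the dynamic-allocation bounds under discussion are set on a SparkConf (the values are arbitrary examples):

```scala
import org.apache.spark.SparkConf

// Arbitrary example values, not recommendations.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")      // typically required alongside dynamic allocation
  .set("spark.dynamicAllocation.minExecutors", "1")  // defaults to 0 if unset
  .set("spark.dynamicAllocation.maxExecutors", "20") // must be non-zero, per the check in the diff
```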

Re: SPARKTA: a real-time aggregation engine based on Spark Streaming

2015-05-14 Thread Matei Zaharia
hive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > -

Re: Streaming + SQL : How to resgister a DStream content as a table and access it

2014-08-04 Thread Tathagata Das
t-as-a-table-and-access-it-tp11372.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For a

Re: How to implement multinomial logistic regression(softmax regression) in Spark?

2014-08-15 Thread DB Tsai
t.1001560.n3.nabble.com/How-to-implement-multinomial-logistic-regression-softmax-regression-in-Spark-tp11939p12175.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe,

Re: application as a service

2014-08-17 Thread Davies Liu
context: > http://apache-spark-user-list.1001560.n3.nabble.com/application-as-a-service-tp12253p12267.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-u

Re: Python script runs fine in local mode, errors in other modes

2014-08-19 Thread Davies Liu
list.1001560.n3.nabble.com/Python-script-runs-fine-in-local-mode-errors-in-other-modes-tp12390p12398.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsu

Re: Key-Value in PairRDD

2014-08-26 Thread Sean Owen
I'd suggest first reading the scaladoc for RDD and PairRDDFunctions to familiarize yourself with all the operations available: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD http://spark.apache.org/docs/latest/api/scala/index
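
A tiny, hedged example of the pair-RDD operations those scaladoc pages cover (the data and app name are made up):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("pair-rdd-demo").setMaster("local[*]"))

// A pair RDD is just an RDD[(K, V)]; PairRDDFunctions adds the *ByKey operations implicitly.
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

val sums   = pairs.reduceByKey(_ + _)              // (a,4), (b,2)
val groups = pairs.groupByKey().mapValues(_.toSeq) // (a, Seq(1, 3)), (b, Seq(2))

println(sums.collect().mkString(", "))
```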

Re: New features (Discretization) for v1.x in xiangrui.pdf

2014-09-03 Thread Xiangrui Meng
nabble.com/New-features-Discretization-for-v1-x-in-xiangrui-pdf-tp13256p13338.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spar

Re: Running spark-shell (or queries) over the network (not from master)

2014-09-05 Thread Ognen Duzlevski
master-tp13543p13595.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands,

Re: Efficient way to sum multiple columns

2014-09-15 Thread Xiangrui Meng
001560.n3.nabble.com/Efficient-way-to-sum-multiple-columns-tp14281.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@

Re: Null values in pyspark Row

2014-09-24 Thread Davies Liu
ull-values-in-pyspark-Row-tp15065.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark.apache

Re: java.lang.OutOfMemoryError while running SVD MLLib example

2014-09-25 Thread Xiangrui Meng
user-list.1001560.n3.nabble.com/java-lang-OutOfMemoryError-while-running-SVD-MLLib-example-tp14972p15083.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr.

Re: Access by name in "tuples" in Scala with Spark

2014-09-26 Thread Sean Owen
60.n3.nabble.com/Access-by-name-in-tuples-in-Scala-with-Spark-tp15212.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To unsubscribe, e-mail: user-unsubscr...@spark

Re: Trouble getting filtering on field correct

2014-10-03 Thread Davies Liu
sage in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Trouble-getting-filtering-on-field-correct-tp15728.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----- > To u
