Mailing lists matching spark.apache.org
commits@spark.apache.org
dev@spark.apache.org
issues@spark.apache.org
reviews@spark.apache.org
user@spark.apache.org
[GitHub] [spark] hvanhovell commented on a diff in pull request #41426: [SPARK-43920][SQL][CONNECT] Create sql/api module
+Spark Project SQL API +https://spark.apache.org/ +sql-api Review Comment: Should we add common/util as a dependency? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
[GitHub] [spark] dongjoon-hyun commented on pull request #41490: [SPARK-43990][BUILD] Upgrade `kubernetes-client` to 6.7.1
dongjoon-hyun commented on PR #41490: URL: https://github.com/apache/spark/pull/41490#issuecomment-1581304467 It's because our feature freeze is July 16th and I believe you want to have up-to-date `kubernetes-client` before branch cut. https://spark.apache.org/versioning-policy
[GitHub] [spark] itholic commented on pull request #41711: [SPARK-44155] Adding a dev utility to improve error messages based on LLM
prompt of ChatGPT to adhere to Apache Spark's [Error Message Guidelines](https://spark.apache.org/error-message-guidelines.html).
Re: [PR] [Work in Progress] Experimenting to move TransportCipher to GCM based on Google Tink [spark]
that context. If this ends up being a nontrivial and user-facing change, it would be good to go through [an SPIP](https://spark.apache.org/improvement-proposals.html).
Re: [PR] [SPARK-47543][CONNECT][PYTHON] Inferring `dict` as `MapType` from Pandas DataFrame to allow DataFrame creation. [spark]
): self.spark.createDataFrame(pdf).collect(), ) +def test_schema_inference_from_pandas_with_dict(self): +from pyspark.sql.connect import functions as CF Review Comment: Let's add a comment see https://spark.apache.org/contributing.html ``` def test_case
[GitHub] [spark] wangyum commented on a diff in pull request #42344: [SPARK-44675][INFRA] Increase ReservedCodeCacheSize for release build
MAVEN_OPTS="-Xss128m -Xmx12g -XX:ReservedCodeCacheSize=1g" Review Comment: `ReservedCodeCacheSize` is consistent with https://spark.apache.org/docs/latest/building-spark.html#setting-up-mavens-memory-usage.
[GitHub] [spark] bjornjorgensen commented on pull request #41711: [SPARK-44155] Adding a dev utility to improve error messages based on LLM
The link is wrong for me; it goes to https://github.com/apache/spark/pull/37113. This is the right link: https://spark.apache.org/error-message-guidelines.html
[GitHub] [spark] srowen commented on pull request #42382: [ML] Remove usage of RDD APIs for load/save in spark-ml
srowen commented on PR #42382: URL: https://github.com/apache/spark/pull/42382#issuecomment-1671614095 See https://spark.apache.org/contributing.html - go ahead and make and link a JIRA. I think this would target Spark 4.0? Does this relate to Spark Connect, like does it improve
Re: [PR] [SPARK-45848][BUILD] Make `spark-version-info.properties` generated by `spark-build-info.ps1` include `docroot` [spark]
]::UtcNow | Get-Date -UFormat +%Y-%m-%dT%H:%M:%SZ) -url=$(git config --get remote.origin.url)" +url=$(git config --get remote.origin.url), +docroot=https://spark.apache.org/docs/latest"; Review Comment: cc @GauthamBanasandra
[PR] [SPARK-44496][SQL][FOLLOW-UP] CalendarIntervalType is also orderable [spark]
amaliujia opened a new pull request, #43805: URL: https://github.com/apache/spark/pull/43805 ### What changes were proposed in this pull request? CalendarIntervalType is also orderable https://spark.apache.org/docs/3.0.2/api/java/org/apache/spark/sql/types
Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]
particular chunk of a stream. */ -public final class StreamChunkId implements Encodable { Review Comment: Although this is `public`, this is not a part of our public Java doc, right? - https://spark.apache.org/docs/latest/api/java/index.html
Re: [PR] [SPARK-45848][BUILD] Make `spark-version-info.properties` generated by `spark-build-info.ps1` include `docroot` [spark]
=$([DateTime]::UtcNow | Get-Date -UFormat +%Y-%m-%dT%H:%M:%SZ) -url=$(git config --get remote.origin.url)" +url=$(git config --get remote.origin.url), +docroot=https://spark.apache.org/docs/latest"; Review Comment: LGTM
Re: [PR] [SPARK-45861][PYTHON][DOCS] Add user guide for dataframe creation [spark]
SQL Review Comment: I think we should be consistent with the main PySpark doc page terminology now that we are changing this: https://spark.apache.org/docs/latest/api/python/index.html On that page we say "Spark SQL and DataFrames"; I suggest we do the same here.
[PR] [SPARK-46196][PYTHON][DOCS] Add missing function descriptions [spark]
zhengruifeng opened a new pull request, #44104: URL: https://github.com/apache/spark/pull/44104 ### What changes were proposed in this pull request? Add missing function descriptions ### Why are the changes needed? they are missing in https://spark.apache.org/docs/latest
Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]
n " +] + }, + "_LEGACY_ERROR_TEMP_3052" : { Review Comment: Can we use `UNSUPPORTED_FEATURE`? - https://spark.apache.org/docs/latest/sql-error-conditions-unsupported-feature-error-class.html#unsupported_feature-error-class
[GitHub] [spark] wangyum opened a new pull request, #40807: [SPARK-43139][SQL][DOCS] Fix incorrect column names in sql-ref-syntax-dml-insert-table.md
wangyum opened a new pull request, #40807: URL: https://github.com/apache/spark/pull/40807 ### What changes were proposed in this pull request? This PR fixes incorrect column names in [sql-ref-syntax-dml-insert-table.md](https://spark.apache.org/docs/3.4.0/sql-ref-syntax-dml-insert
[GitHub] [spark] zwangsheng commented on pull request #40771: [SPARK-35723] set k8s pod container request, limit memory separately.
-archive.com/search?l=d...@spark.apache.org&q=subject:%22spark+executor+pod+has+same+memory+value+for+request+and+limit%22&o=newest)
[GitHub] [spark] zzzzming95 commented on pull request #41000: [SPARK-43327] Trigger `committer.setupJob` before plan execute in `FileFormatWriter#write`
3.2`. > > * https://spark.apache.org/versioning-policy.html OK, I see a similar implementation for Spark 3.3, and I will submit it to Spark 3.3.
[GitHub] [spark] LuciferYang commented on a diff in pull request #41281: [WIP] update to secure version of fasterxml
](https://spark.apache.org/contributing.html) for information on how to get started contributing to the project. +## Review Comment: hmm... why do we need this change?
[GitHub] [spark] ronandoolan2 commented on a diff in pull request #41281: [WIP] update to secure version of fasterxml
](https://spark.apache.org/contributing.html) for information on how to get started contributing to the project. +## Review Comment: I had trouble retriggering the job after enabling GitHub Actions on my fork; I'll remove this line again.
[GitHub] [spark] dongjoon-hyun commented on pull request #41422: [SPARK-43541][SQL][3.2] Propagate all `Project` tags in resolving of expressions and missing columns
dongjoon-hyun commented on PR #41422: URL: https://github.com/apache/spark/pull/41422#issuecomment-1572331320 Hi, @MaxGekk . Sorry but Apache Spark 3.2 is EOL according to our versioning policy. - https://spark.apache.org/versioning-policy.html > No more ... releases should
Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages [spark]
bjornjorgensen commented on PR #44540: URL: https://github.com/apache/spark/pull/44540#issuecomment-1873036859 There are actually more places where this issue occurs, like in [GROUP BY Clause for spark 3.5](https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-groupby.html) where
Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]
the shuffle service is running on YARN: -Property NameDefaultMeaning +Property NameDefaultMeaningSince Version spark.yarn.shuffle.stopOnFailure Review Comment: For a record: https://github.com/apache/spark/pull/14162 https://spark.apache.org/docs/2.4.0/running-on-yarn.html
Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]
the shuffle service is running on YARN: -Property NameDefaultMeaning +Property NameDefaultMeaningSince Version spark.yarn.shuffle.stopOnFailure Review Comment: For a record: https://github.com/apache/spark/pull/14162 First: https://spark.apache.org/docs/2.1.0/running-on-yarn.html
Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-yarn` pages [spark]
-71e304edb6adff7be2edd8855cd040b965240627aa6ebe5b5e941b2fc41e090dR105 https://spark.apache.org/docs/3.2.0/running-on-yarn.html
[GitHub] [spark] dongjoon-hyun commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries
However, `branch-3.0` is EOL currently because Apache Spark 3.0.0 was two years ago. Please see here. - https://spark.apache.org/versioning-policy.html
[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36844: Update ExecutorClassLoader.scala
: ClassLoader, userClassPathFirst: Boolean) extends ClassLoader(null) with Logging { - val uri = new URI(classUri) Review Comment: @guolianwei please 1. file a JIRA 2. describe how you reproduced the issue and tested this patch 3. add a unit test. See also https://spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference
a timestamp column") { Review Comment: Let's add a prefix "SPARK-39469: ...", see also https://spark.apache.org/contributing.html
[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37210: add ignore for the recently added and failing mypy error 'type-var'
ef _from_java(java_stage: "JavaObject") -> "JP": +def _from_java(java_stage: "JavaObject") -> "JP": # type: ignore[type-var] Review Comment: @anilbey mind filing a JIRA and linking it into PR title? See also, https://spark.apache.org/contributi
[GitHub] [spark] cxzl25 commented on pull request #36238: [SPARK-38916][CORE] Tasks not killed caused by race conditions between killTask() and launchTask()
https://issues.apache.org/jira/browse/SPARK-38916 https://spark.apache.org/releases/spark-release-3-2-2.html
[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1724683270 Maybe it's the first time you are contributing to Apache Spark? If so, congrats on your first contribution! https://spark.apache.org/contributing.html Please chec
Re: [PR] [SPARK-45425] Mapped TINYINT to ShortType for MsSqlServerDialect [spark]
{ None } else { sqlType match { - case java.sql.Types.SMALLINT => Some(ShortType) + case java.sql.Types.SMALLINT | java.sql.Types.TINYINT => Some(ShortType) Review Comment: From the doc https://spark.apache.org/docs/latest/sql-ref-datatypes.html, it seem
Re: [PR] [SPARK-45575][SQL] Support time travel options for df read API [spark]
"OPTION" : { +"message" : [ + "Timestamp string in the options should be in the format of 'yyyy-MM-dd HH:mm:ss[.us][zone_id]'." Review Comment: The given formatter might be insufficient for valid timestamps and the `[.us][zone_id]` part diff
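For illustration only (this is not Spark's actual option parser, and the zone-id part is omitted), the base 'yyyy-MM-dd HH:mm:ss' pattern with an optional microsecond part can be checked in plain Python:

```python
from datetime import datetime

# Illustrative sketch: accept "yyyy-MM-dd HH:mm:ss" with an optional
# ".us" fractional part. The function name is ad-hoc, not Spark's API;
# zone-id handling is deliberately omitted.
def parse_option_timestamp(s: str) -> datetime:
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue
    raise ValueError(f"not in 'yyyy-MM-dd HH:mm:ss[.us]' form: {s!r}")

print(parse_option_timestamp("2023-10-17 12:30:00.123456"))
```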
Re: [PR] [SPARK-40559][PYTHON] Add applyInArrow to groupBy and cogroup [spark]
list (https://spark.apache.org/news/spark-mailing-lists-moving-to-apache.html), and see if people like it, or ping other committers if there is some support here.
Re: Pandas timezone problems
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: [pyspark] Starting workers in a virtualenv
the same machine in local mode. Thanks in advance!
Re: If not stop StreamingContext gracefully, will checkpoint data be consistent?
be consistent? Or should I always gracefully shut down the application in order to use the checkpoint? Thank you very much!
Re: Spark or Storm
to see what the equivalent of Storm's Bolt is inside Spark. Any help will be appreciated. Thanks, Ashish
Re: Not getting event logs >= spark 1.3.1
Re: Futures timed out after 10000 milliseconds
Re: Is it possible to change the default port number 7077 for spark?
Re: it seem like the exactly once feature not work on spark1.4
shing out the data > > SparkStreaming + Kafka only provide an exactly-once guarantee on steps 1 & 2 > We need to ensure exactly once on step 3 ourselves. > > More details, see > http://spark.apache.org/docs/latest/streaming-programming-guide.html
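The quoted point is that the direct approach covers receiving and processing (steps 1 and 2), while step 3, pushing out the data, must be made exactly-once by the application, usually via idempotent writes. A minimal plain-Python sketch of that idea (the store and offset bookkeeping here are illustrative assumptions, not a Kafka or Spark API):

```python
# Illustrative sketch: make the final write idempotent by keying each
# output batch on its input offset range, so a replayed batch is a
# no-op rather than a duplicate write.
store = {}             # stands in for an external sink
applied_offsets = set()

def write_batch(offset_range, rows):
    if offset_range in applied_offsets:
        return  # batch already applied; replay changes nothing
    for key, value in rows:
        store[key] = value
    applied_offsets.add(offset_range)

write_batch((0, 2), [("a", 1), ("b", 2)])
write_batch((0, 2), [("a", 1), ("b", 2)])  # replay after a failure
print(store)  # each row applied exactly once
```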
Re: DataFrame more efficient than RDD?
https://spark.apache.org/docs/1.4.0/sql-programming-guide.html#inferring-the-schema-using-reflection > > Is a DataFrame more efficient (space-wise) than an RDD for
Re: Unit testing framework for Spark Jobs?
Re: SFTP Compressed CSV into Dataframe
remote file decompressed, read, and loaded. Can someone give me any hints? Thanks, Ben
Re: How to reduce the Executor Computing Time.
Re: How to estimate the size of dataframe using pyspark?
Re: Save DataFrame to HBase
Benjamin Kim wrote: Has anyone found an easy way to save a DataFrame into HBase? Thanks, Ben
Re: How to insert data for 100 partitions at a time using Spark SQL
Re: Spark 2.0 Preview After caching query didn't work and can't kill job.
> I ran in a cluster of 5 nodes in spark-shell. > > Did anyone have this issue?
Re: Spark Website
Thanks for reporting. This is due to https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-12055 On Wed, Jul 13, 2016 at 11:52 AM, Pradeep Gollakota wrote: > Worked for me if I go to https://spark.apache.org/site/ but not > https://spark.apache.org > > On Wed, Jul 13
Re: Bzip2 to Parquet format
the RDD[Row] to create a DataFrame. http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.types.StructType Once you have the DataFrame, save it to parquet with datafram
Re: frequent itemsets
Hi Yanbo, Unfortunately, I cannot share the data. I am using the code in the tutorial https://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html Did you ever try run it when there are hundreds of millions of co-purchases of at least two products? I suspect AR does not handle that
Re: Serializing DataSets
DataFrame and use the writers there?
Re: Re: --driver-java-options not support multiple JVM configuration ?
You need quotes around "$sparkdriverextraJavaOptions". -- Marcelo
Re: Spark with SAS
ies in Spark? For example, calling Spark jobs from SAS using Spark SQL's JDBC/ODBC library. Regards, Sourav
RE: Union Parquet, DataFrame
Worked perfectly. Thanks very much Silvio. From: Silvio Fiorito [mailto:silvio.fior...@granturing.com] Sent: Tuesday, March 01, 2016 2:14 PM To: Fernandez, Andres; user@spark.apache.org Subject: Re: Union Parquet, DataFrame Just replied to your other email, but here’s the same thing: Just do
Spark mailing list confusion
Does anyone have any idea why some topics on the mailing list end up on https://www.mail-archive.com/user@spark.apache.org e.g. this message thread <https://www.mail-archive.com/user@spark.apache.org/msg37855.html> , but not on http://apache-spark-user-list.1001560.n3.nabble.com ? Whilst
Re: spark-ec2 config files.
Re: API to run spark Jobs
submit my spark app (python) to the cluster without using spark-submit; actually I need to invoke jobs from a UI.
unsubscribe
Re: Using spark in cluster mode
Hi, Start here -> http://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds and then hop to http://spark.apache.org/docs/latest/spark-standalone.html. Once done, be back with your questions. I think it's gonna help a lot. Regards, Jacek
Re: out of memory error with Parquet
Re: NoSuchMethodError
with version 1.5.1. Can anyone please help me out in resolving this? Regards, Yogesh
Re: Spark UI - Streaming Tab
Re: Release data for spark 1.6?
Re: Spark - Eclipse IDE - Maven
RE: Spark Interview Questions
Re: Does RDD.cartesian involve shuffling?
Re: Spark SQL support for Hive 0.14
Re: Ranger-like Security on Spark
Is Kerberos my only option then? Kind regards, Daniel.
[ANNOUNCE] Announcing Apache Spark 2.1.0
Hi all, Apache Spark 2.1.0 is the second release of Spark 2.x line. This release makes significant strides in the production readiness of Structured Streaming, with added support for event time watermarks <https://spark.apache.org/docs/2.1.0/structured-streaming-programming-guide.html#handl
Re: Spark Team - Paco Nathan said that your team can help
Re: spark 1.2 ec2 launch script hang
Re: no option to add intercepts for StreamingLinearAlgorithm
Re: method newAPIHadoopFile
Re: Get importerror when i run pyspark with ipython=1
Re: Upgrade to Spark 1.2.1 using Guava
Re: Training Random Forest
Re: Spark Streaming input data source list
Spark Streaming has StreamingContext.socketStream() http://spark.apache.org/docs/1.2.1/api/java/org/apache/spark/streaming/StreamingContext.html#s
Re: Reading a text file into RDD[Char] instead of RDD[String]
Can anyone suggest the most efficient way to create the RDD[Char]? I'm sure I've missed something simple… Regards, Mike
Mailing list schizophrenia?
I notice that some people send messages directly to user@spark.apache.org and some via nabble, either using email or the web client. There are two index sites, one directly at apache.org and one at nabble. But messages sent directly to user@spark.apache.org only show up in the apache list
Re: Does HiveContext connect to HiveServer2?
Re: Add row IDs column to data frame
Re: java.lang.ClassCastException: scala.Tuple2 cannot be cast to org.apache.spark.mllib.regression.LabeledPoint
Re: spark.dynamicAllocation.minExecutors
private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", 0) ... if (maxNumExecutors == 0) { throw new SparkException("spark.dynamicAllocation.maxExecutors cannot be 0!")
Re: SPARKTA: a real-time aggregation engine based on Spark Streaming
Re: Streaming + SQL : How to resgister a DStream content as a table and access it
Re: How to implement multinomial logistic regression(softmax regression) in Spark?
Re: application as a service
Re: Python script runs fine in local mode, errors in other modes
Re: Key-Value in PairRDD
I'd suggest first reading the scaladoc for RDD and PairRDDFunctions to familiarize yourself with all the operations available: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD http://spark.apache.org/docs/latest/api/scala/index
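For readers outside Spark, the semantics of two of the most common PairRDDFunctions operations can be sketched in plain Python (illustrative only; the function names here are ad-hoc, not Spark's API, and the real implementations are distributed):

```python
from collections import defaultdict

# Plain-Python sketch of PairRDD-style semantics on (key, value) pairs.
pairs = [("a", 1), ("b", 2), ("a", 3)]

def reduce_by_key(pairs, f):
    # Like reduceByKey: fold values per key with an associative function.
    acc = {}
    for k, v in pairs:
        acc[k] = f(acc[k], v) if k in acc else v
    return acc

def group_by_key(pairs):
    # Like groupByKey: collect all values per key.
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return dict(groups)

print(reduce_by_key(pairs, lambda a, b: a + b))  # {'a': 4, 'b': 2}
print(group_by_key(pairs))                       # {'a': [1, 3], 'b': [2]}
```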
Re: New features (Discretization) for v1.x in xiangrui.pdf
Re: Running spark-shell (or queries) over the network (not from master)
Re: Efficient way to sum multiple columns
Re: Null values in pyspark Row
Re: java.lang.OutOfMemoryError while running SVD MLLib example
Re: Access by name in "tuples" in Scala with Spark
Re: Trouble getting filtering on field correct