java.nio.BufferUnderflowException
Can you try reading the same data in Scala?
From: Liana Napalkova
Sent: Wednesday, January 10, 2018 12:04:00 PM
To: Timur Shenkao
Cc: user@spark.apache.org
Subject: Re: py4j.protocol.Py4JJavaError:
Yes, it is an issue with the newer release of testthat.
As a workaround, could you install an earlier version with devtools? Will follow
up with a fix.
_
From: Hyukjin Kwon
Sent: Wednesday, February 14, 2018 6:49 PM
Subject: Re: SparkR test script
Generally that would be the approach.
But since you effectively double the number of edges, this will likely
affect the scale at which your job will run.
From: xiaobo
Sent: Monday, February 19, 2018 3:22:02 AM
To: user@spark.apache.org
Subject:
No, it does not support bidirectional edges as of now.
_
From: xiaobo <guxiaobo1...@qq.com>
Sent: Tuesday, February 20, 2018 4:35 AM
Subject: Re: [graphframes]how Graphframes Deal With BidirectionalRelationships
To: Felix Cheung <felixcheun...@hotmail.co
Hi - I’m maintaining it. As of now there is an issue with 2.2 that breaks
personalized page rank, and that’s largely the reason there isn’t a release for
2.2 support.
There are attempts to address this issue - if you are interested we would love
for your help.
7 9:13 PM
Subject: Re: Passing an array of more than 22 elements in a UDF
To: Felix Cheung <felixcheun...@hotmail.com>
Cc: ayan guha <guha.a...@gmail.com>, user <user@spark.apache.org>
What's the privilege of using that specific version for this? Please throw some
light onto it.
I think you are looking for spark.executor.extraJavaOptions
https://spark.apache.org/docs/latest/configuration.html#runtime-environment
From: Christopher Piggott
Sent: Tuesday, December 26, 2017 8:00:56 AM
To: user@spark.apache.org
Subject:
I'm not sure we have completed support for Java 10
From: Rahul Agrawal
Sent: Thursday, June 21, 2018 7:22:42 AM
To: user@spark.apache.org
Subject: Spark 2.3.1 not working on Java 10
Dear Team,
I have installed Java 10, Scala 2.12.6 and spark 2.3.1 in my
It’s best to start with Structured Streaming
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#tab_python_0
https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#tab_python_0
_
From: Aakash Basu
Instead of writing to the console, you need to write to memory for it to be queryable:
.format("memory")
.queryName("tableName")
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#output-sinks
From: Aakash Basu
You might want to check with the spark-on-k8s
Or try using Kubernetes from the official Spark 2.3.0 release. (Yes, we don't
have an official Docker image yet, but you can build one with the script.)
From: Rico Bergmann
Sent: Wednesday, April
If your data can be split into groups and you can call into your favorite R
package on each group of data (in parallel):
https://spark.apache.org/docs/latest/sparkr.html#run-a-given-function-on-a-large-dataset-grouping-by-input-columns-and-using-gapply-or-gapplycollect
There is a proposal to expose them. See SPARK-14151
From: Christopher Piggott
Sent: Friday, March 16, 2018 1:09:38 PM
To: user@spark.apache.org
Subject: Custom metrics sink
Just for fun, i want to make a stupid program that makes different
That's in the plan. We should be sharing a bit more about the roadmap in future
releases shortly.
In the mean time this is in the official documentation on what is coming:
https://spark.apache.org/docs/latest/running-on-kubernetes.html#future-work
This support started as a fork of the Apache
For pyspark specifically, IMO it should be very high on the list to port back...
As for roadmap - should be sharing more soon.
From: lucas.g...@gmail.com <lucas.g...@gmail.com>
Sent: Friday, March 2, 2018 9:41:46 PM
To: user@spark.apache.org
Cc: Felix Cheung
Yes you were pointing to HDFS on a loopback address...
From: Jenna Hoole
Sent: Monday, February 26, 2018 1:11:35 PM
To: Yinan Li; user@spark.apache.org
Subject: Re: Spark on K8s - using files fetched by init-container?
Oh, duh. I
1. It seems like it's spending a lot of time in R (slicing the data, I guess?) and
not in Spark.
2. Could you write it to a CSV file locally and then read it from Spark?
From: ayan guha
Sent: Monday, October 8, 2018 11:21 PM
To: user
Subject: SparkR issue
Hi
We
Not officially. We have seen problems with JDK 10 as well. It would be great if
you or someone else would like to contribute to get it working.
From: kant kodali
Sent: Tuesday, September 25, 2018 2:31 PM
To: user @spark
Subject: can Spark 2.4 work on JDK 11?
Hi All,
It looks like the native R process was terminated by a buffer overflow. Do you
know how much data is involved?
From: Junior Alvarez
Sent: Wednesday, September 26, 2018 7:33 AM
To: user@spark.apache.org
Subject: spark.lapply
Hi!
I’m using spark.lapply() in
Not as far as I recall...
From: Serega Sheypak
Sent: Friday, January 18, 2019 3:21 PM
To: user
Subject: Spark on Yarn, is it possible to manually blacklist nodes before
running spark job?
Hi, is there any possibility to tell Scheduler to blacklist specific
You can call coalesce to combine partitions.
From: Shivam Sharma <28shivamsha...@gmail.com>
Sent: Saturday, January 19, 2019 7:43 AM
To: user@spark.apache.org
Subject: Persist Dataframe to HDFS considering HDFS Block Size.
Hi All,
I wanted to persist dataframe
From: Li Gao
Sent: Saturday, January 19, 2019 8:43 AM
To: Felix Cheung
Cc: Serega Sheypak; user
Subject: Re: Spark on Yarn, is it possible to manually blacklist nodes before
running spark job?
On YARN it is impossible, AFAIK. On Kubernetes you can use taints.
Do you mean you run the same code on yarn and standalone? Can you check if they
are running the same python versions?
From: Bryan Cutler
Sent: Thursday, January 10, 2019 5:29 PM
To: libinsong1...@gmail.com
Cc: zlist Spark
Subject: Re: spark2.4 arrow enabled
I don’t think we should remove any API even in a major release without
deprecating it first...
From: Mark Hamstra
Sent: Sunday, September 16, 2018 12:26 PM
To: Erik Erlandson
Cc: user@spark.apache.org; dev
Subject: Re: Should python-2 be supported in Spark 3.0?
About deployment/serving
SPIP
https://issues.apache.org/jira/browse/SPARK-26247
From: Riccardo Ferrari
Sent: Tuesday, January 22, 2019 8:07 AM
To: User
Subject: I have trained a ML model, now what?
Hi list!
I am writing here to hear about your experience on
If anyone wants to improve docs please create a PR.
lol
But seriously you might want to explore other projects that manage job
submission on top of spark instead of rolling your own with spark-submit.
From: Pat Ferrel
Sent: Tuesday, March 26, 2019 2:38 PM
Hmm thanks. Do you have a proposed solution?
From: Jhon Anderson Cardenas Diaz
Sent: Monday, March 18, 2019 1:24 PM
To: user
Subject: Spark - Hadoop custom filesystem service loading
Hi everyone,
On spark 2.2.0, if you wanted to create a custom file system
You should check with HDInsight support
From: Jay Singh
Sent: Wednesday, February 20, 2019 11:43:23 PM
To: User
Subject: Spark-hive integration on HDInsight
I am trying to integrate spark with hive on HDInsight spark cluster .
I copied hive-site.xml in
From: Thijs Haarhuis
Sent: Thursday, February 14, 2019 4:01 AM
To: Felix Cheung; user@spark.apache.org
Subject: Re: SparkR + binary type + how to get value
Hi Felix,
Sure..
I have the following code:
printSchema(results)
cat("\n\n\n")
firstRow <- first(results)
Please share your code
From: Thijs Haarhuis
Sent: Wednesday, February 13, 2019 6:09 AM
To: user@spark.apache.org
Subject: SparkR + binary type + how to get value
Hi all,
Does anybody have any experience in accessing the data from a column which has
a binary
And it might not work completely. Spark only officially supports JDK 8.
I'm not sure JDK 9+ support is complete.
From: Jungtaek Lim
Sent: Thursday, February 7, 2019 5:22 AM
To: Gabor Somogyi
Cc: Hande, Ranjit Dilip (Ranjit); user@spark.apache.org
From: Thijs Haarhuis
Sent: Tuesday, February 19, 2019 5:28 AM
To: Felix Cheung; user@spark.apache.org
Subject: Re: SparkR + binary type + how to get value
Hi Felix,
Thanks. I got it working now by using the unlist function.
I have another question, maybe you can help me with, since I did
Please comment in the JIRA/SPIP if you are interested! We can see the community
support for a proposal like this.
From: Pola Yao
Sent: Wednesday, January 23, 2019 8:01 AM
To: Riccardo Ferrari
Cc: Felix Cheung; User
Subject: Re: I have trained a ML model, now
And a plug for the Graph Processing track -
A discussion of comparison talk between the various Spark options (GraphX,
GraphFrames, CAPS), or the ongoing work with SPARK-25994 Property Graphs,
Cypher Queries, and Algorithms
Would be great!
From: Felix Cheung
Hi Spark community!
As you know, ApacheCon NA 2019 is coming this September and its CFP is now open!
This is an important milestone as we celebrate 20 years of ASF. We have tracks
like Big Data and Machine Learning among many others. Please submit your
talks/thoughts/challenges/learnings here:
This seems to be more a question about the spark-sql shell? May I suggest you
change the email title to get more attention.
From: ya
Sent: Wednesday, June 5, 2019 11:48:17 PM
To: user@spark.apache.org
Subject: sparksql in sparkR?
Dear list,
I am trying to use sparksql
We don’t usually reference a future release on the website.
> Spark website and state that Python 2 is deprecated in Spark 3.0
I suspect people will then ask when Spark 3.0 is coming out. Might need to
provide some clarity on that.
From: Reynold Xin
Sent:
From: shane knapp
Sent: Friday, May 31, 2019 7:38:10 PM
To: Denny Lee
Cc: Holden Karau; Bryan Cutler; Erik Erlandson; Felix Cheung; Mark Hamstra;
Matei Zaharia; Reynold Xin; Sean Owen; Wenchen Fen; Xiangrui Meng; dev; user
Subject: Re: Should python-2 be supported in Spark 3.0?
+1000
I don’t think you should get a hive-site.xml from the internet.
It should have connection information about a running Hive metastore - if you
don’t have a Hive metastore service, as you are running locally (from a laptop?),
then you don’t really need it. You can get Spark to work with its own.
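For reference, a minimal hypothetical hive-site.xml pointing Spark at a running metastore would look something like this (the host and port are placeholders, not values from the thread):

```xml
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```

If this file is absent, Spark falls back to its own embedded Derby-backed metastore.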
You could
df.filter(col("c") === "c1").write().partitionBy("c").save
It could hit some data skew problems but might work for you.
From: Burak Yavuz
Sent: Tuesday, May 7, 2019 9:35:10 AM
To: Shubham Chaurasia
Cc: dev; user@spark.apache.org
Subject: Re: Static
Not currently in Spark.
However, there are systems out there that can share DataFrame between languages
on top of Spark - it’s not calling the python UDF directly but you can pass the
DataFrame to python and then .map(UDF) that way.
From: Fiske, Danny
Sent:
That’s great!
From: ☼ R Nair
Sent: Saturday, August 24, 2019 10:57:31 AM
To: Dongjoon Hyun
Cc: d...@spark.apache.org; user@spark.apache.org
Subject: Re: JDK11 Support in Apache Spark
Finally!!! Congrats
On Sat, Aug 24, 2019, 11:11
I think you will get more answers if you ask without mentioning SparkR.
Your question is independent of SparkR.
Spark support for Hive 3.x (3.1.2) was added here
https://github.com/apache/spark/commit/1b404b9b9928144e9f527ac7b1caa15f932c2649
You should be able to connect Spark to Hive metastore.
Maybe it’s the reverse - the package is built to run on the latest R but is not
compatible with slightly older versions (3.5.2 was Dec 2018).
From: Jeff Zhang
Sent: Thursday, December 26, 2019 5:36:50 PM
To: Felix Cheung
Cc: user.spark
Subject: Re: Fail to use SparkR of 3.0
It looks like a change in the method signature in R base packages.
Which version of R are you running on?
From: Jeff Zhang
Sent: Thursday, December 26, 2019 12:46:12 AM
To: user.spark
Subject: Fail to use SparkR of 3.0 preview 2
I tried SparkR of spark 3.0
-- Forwarded message -
We are pleased to announce that ApacheCon @Home will be held online,
September 29 through October 1.
More event details are available at https://apachecon.com/acah2020 but
there’s a few things that I want to highlight for you, the members.
Yes, the CFP
Congrats
From: Jungtaek Lim
Sent: Thursday, June 18, 2020 8:18:54 PM
To: Hyukjin Kwon
Cc: Mridul Muralidharan ; Reynold Xin ;
dev ; user
Subject: Re: [ANNOUNCE] Apache Spark 3.0.0
Great, thanks all for your efforts on the huge step forward!
On Fri, Jun 19,
Congrats and thanks!
From: Hyukjin Kwon
Sent: Wednesday, March 3, 2021 4:09:23 PM
To: Dongjoon Hyun
Cc: Gabor Somogyi ; Jungtaek Lim
; angers zhu ; Wenchen Fan
; Kent Yao ; Takeshi Yamamuro
; dev ; user @spark
Subject: Re: [ANNOUNCE] Announcing Apache Spark