ashok34...@yahoo.com.INVALID wrote:
Is it possible to use a Spark Docker image built on GCP on AWS, without
rebuilding it from scratch on AWS?
I am using the spark image from bitnami for running on k8s.
And yes, it's deployed by helm.
--
https://kenpeng.pages.dev/
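For what it's worth, a Docker image is a platform-agnostic artifact: an image built on GCP and pushed to a registry can be pulled unchanged by a cluster on AWS, as long as the CPU architecture matches and the registry is reachable from the new cluster. As a sketch (value names follow the Bitnami chart's conventions; the registry, repository, and tag below are illustrative, not your actual ones), you'd just repoint the chart at the existing image:

```yaml
# values.yaml fragment for the Bitnami Spark Helm chart (illustrative values)
image:
  registry: gcr.io              # any registry the AWS cluster can reach
  repository: my-project/spark  # hypothetical repository name
  tag: 3.5.0                    # made-up tag for this example
```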
We use Spark with NFS as the data store, mainly using Dr. Jeremy Freeman’s Thunder framework. Works very well (and I see HUGE throughput on the storage system during loads). I haven’t seen (or heard from the devs/users) a need for HDFS or S3.
—Ken
On Aug 25, 2016, at 8:02 PM
Hi Deepak,
Yes, that’s about the size of it. The Spark job isn’t filling the disk by any stretch of the imagination; in fact, the only thing writing to the disk from Spark in some of these instances is the logging.
Thanks,
—Ken
On Jun 16, 2016, at 12:17 PM
attempted to run with 15 cores out of 16 and 25GB of RAM out of 128. He still lost nodes.
4. He’s currently running storage benchmarking tests, which consist mainly of shuffles.
Thanks!
Ken
On Jun 16, 2016, at 8:00 AM, Deepak Goel <deic...@gmail.com> wrote:
I am no expert, but some
anyone seen anything like this? Any ideas where to look next?
Thanks,
Ken
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
accordingly.
Thanks!
—Ken
On Apr 3, 2016, at 11:06 AM, Yong Zhang <java8...@hotmail.com> wrote:
In standalone mode, it applies to the Driver JVM process heap size.
You should consider giving it enough memory in standalone mode, because:
1) Any data you brin
both 256GB nodes and 128GB nodes available for use as the
driver)
Thanks,
Ken
cluster work as far as failed workers.
Thanks again,
—Ken
On Mar 26, 2016, at 4:08 PM, Sven Krasser <kras...@gmail.com> wrote:
My understanding is that the spark.executor.cores setting controls the number of worker threads in the executor JVM. Each worker
threads?
Thanks!
Ken
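A rough mental model of the thread/daemon counts being discussed above (this is an assumption about PySpark's behavior, not taken from its source): each executor exposes spark.executor.cores concurrent task slots, and PySpark forks roughly one python worker per active slot.

```python
# Rough model only (an assumption, not from Spark's source): each executor
# runs spark.executor.cores concurrent task slots, and PySpark forks about
# one python worker (a pyspark.daemon child) per active slot.
def expected_pyspark_workers(executors_per_node, cores_per_executor):
    return executors_per_node * cores_per_executor

# One 16-core executor suggests ~16 workers; a count like 48 is a low
# multiple of that, consistent with idle workers being kept around for reuse.
```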
On Mar 25, 2016, at 9:10 PM, Sven Krasser <kras...@gmail.com> wrote:
Hey Ken,
I also frequently see more pyspark daemons than configured concurrency, often it's a low multiple. (There was an issue pre-1.3.0 that caused this to be quite a bit
own, driving the load up. I’m hoping someone has seen something like this.
—Ken
On Mar 21, 2016, at 3:07 PM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
No further input on this? I discovered today that the pyspark.daemon threadcount was actually 48, which makes a little more sense (at least it’s a multiple of 16), and it seems to be happening at reduce and collect portions of the code.
—Ken
On Mar 17, 2016, at 10:51 AM, Carlile
—Ken
On Mar 17, 2016, at 10:50 AM, Ted Yu <yuzhih...@gmail.com> wrote:
I took a look at docs/configuration.md
Though I didn't find an answer to your first question, I think the following pertains to your second:
spark.python.worker.memory
30GB to play
with, assuming there is no overhead outside the JVM’s 90GB heap (ha ha.)
Thanks,
Ken Carlile
Sr. Unix Engineer
HHMI/Janelia Research Campus
571-209-4363
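To make the headroom arithmetic above concrete, here is a minimal sketch (the property names come from Spark's configuration docs; the numeric values are made up for this example) of how spark.python.worker.memory translates into memory used outside the JVM heap:

```python
# Illustrative values only; property names are from Spark's configuration
# docs, the numbers are invented for this sketch.
spark_conf = {
    "spark.executor.memory": "90g",        # executor JVM heap
    "spark.python.worker.memory": "2g",    # per-Python-worker limit before spilling
}

def python_worker_budget(conf, concurrent_workers):
    """Rough total RAM the pyspark workers may use outside the JVM heap."""
    per_worker_gb = int(conf["spark.python.worker.memory"].rstrip("g"))
    return per_worker_gb * concurrent_workers

# With 15 concurrent workers at 2g each, that's 30 GB outside the 90 GB heap.
```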
shuffling via a merge join?
I know that Flink supports this, but its JDBC support is pretty lacking in
general.
Thanks,
Ken
java version as 1.8, but I just got the
same error with invalid source release: 1.8 instead of 1.7.
My java -version and javac -version both report 1.8.0_45, and I have the
JAVA_HOME env var set. Anyone have any ideas?
Incidentally, building 2.0.0 from source worked fine…
Thanks,
Ken
today:
https://unscrupulousmodifier.wordpress.com/2015/07/20/running-spark-as-a-job-on-a-grid-engine-hpc-cluster-part-1
—Ken
> On Dec 21, 2015, at 4:00 PM, MegaLearn <j...@megalearningllc.com> wrote:
>
> How do you start the Spark daemon, directly?
> https://issues.apache.org/jira/
Dani, this appears to be addressed in SPARK-5567, scheduled for Spark 1.5.0.
Ken
On May 21, 2015, at 11:12 PM, user-digest-h...@spark.apache.org wrote:
From: Dani Qiu zongmin@gmail.com
Subject: LDA prediction on new document
Date: May 21, 2015 at 8:48:40 PM PDT
To: user
From: Williams, Ken <ken.willi...@windlogics.com>
Date: Thursday, March 19, 2015 at 10:59 AM
To: Spark list <user@spark.apache.org>
Subject: JAVA_HOME problem with upgrade to 1.3.0
[…]
Finally, I go and check the YARN
Log Length: 0
I’m not sure how to interpret that – is '{{JAVA_HOME}}' a literal (including
the brackets) that’s somehow making it into a script? Is this coming from the
worker nodes or the driver? Anything I can do to experiment or troubleshoot?
-Ken
JAVA_HOME=/usr/jdk64/jdk1.6.0_31
-Ken
CONFIDENTIALITY NOTICE: This e-mail message is for the sole use of the intended
recipient(s) and may contain confidential and privileged information. Any
unauthorized review, use, disclosure or distribution of any kind
I am using Spark SQL on a Hive table with the Parquet SerDe. Most queries are
executed through Spark's JDBC Thrift server. Is there a more efficient way to
access/query the data? For example, using saveAsParquetFile() and parquetFile()
to save/load Parquet data and run queries directly?
Thanks,
Ken
Thanks Akhil.
So the Spark worker node doesn't need access to the metastore to run Hive
queries? If so, which component does access the metastore?
For comparison, the Hive CLI accesses the metastore before submitting M/R jobs.
Thanks,
Ken
Does a Spark worker node need access to Hive's metastore if part of a job
contains Hive queries?
Thanks,
Ken
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Worker-node-accessing-Hive-metastore-tp17255.html
Sent from the Apache Spark User List
I am using Spark's Thrift server to connect to Hive and issue queries over
JDBC. Is there a way to cache a table in Spark via a JDBC call?
Thanks,
Ken
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/cache-table-with-JDBC-tp12675.html
Sent from the Apache Spark User List
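Spark SQL does accept CACHE TABLE as a statement, so one possibility (hedged: the table name below is hypothetical, and whether your Thrift server version honors it should be verified) is to issue it over the same JDBC connection before the real queries:

```python
# Sketch: SQL statements you could send through the Thrift server's JDBC
# connection. The table name "events" is hypothetical.
def cache_session_statements(table):
    return [
        f"CACHE TABLE {table}",            # materialize the table in Spark's memory
        f"SELECT count(*) FROM {table}",   # later queries read from the cache
        f"UNCACHE TABLE {table}",          # free the memory when finished
    ]
```

Each string would be executed in order via a normal JDBC Statement from the client.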
Is Spark SQL Thrift Server part of the 1.0.2 release? If not, which release is
the target?
Thanks,
Ken
What is the best way to run Hive queries in 1.0.2? In my case, Hive queries
will be invoked from a middle-tier webapp. I am thinking of using the Hive JDBC
driver.
Thanks,
Ken
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Wednesday, August 20, 2014 9:38 AM
To: Tam, Ken K
Cc: user
of
CPU time). After that, I did 'SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true
sbt/sbt assembly' and that took 25 minutes wallclock, 73 minutes CPU.
Is that typical? Or does that indicate some setup problem in my environment?
--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com
way to get more information about where in the
process it's failing?
Thanks.
--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com
the Hadoop command-line tools do, but
that's not so important.
-Ken
-Original Message-
From: Williams, Ken [mailto:ken.willi...@windlogics.com]
Sent: Monday, April 21, 2014 2:04 PM
To: Spark list
Subject: Problem connecting to HDFS in Spark shell
I'm trying to get my feet wet with Spark
-Original Message-
From: Marcelo Vanzin [mailto:van...@cloudera.com]
Hi Ken,
On Mon, Apr 21, 2014 at 1:39 PM, Williams, Ken
ken.willi...@windlogics.com wrote:
I haven't figured out how to let the hostname default to the host
mentioned in our /etc/hadoop/conf/hdfs-site.xml like
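As an aside (and an assumption about the setup, not a verified fix): the scheme and authority Hadoop clients fill in for bare paths normally come from the fs.defaultFS property, which on most installs lives in core-site.xml rather than hdfs-site.xml. A typical fragment looks like this (hostname illustrative):

```xml
<!-- /etc/hadoop/conf/core-site.xml (illustrative hostname) -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```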
14/04/10 08:00:42 INFO AppClient$ClientActor: Executor added:
app-20140410080041-0017/9 on worker-20140409145028-ken-
VirtualBox-39159 (ken-VirtualBox:39159) with 4 cores
14/04/10
08:00:42 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20140410080041-0017/9 on hostPort ken
Sorry, I forgot to mention this is spark-0.9.1 and shark-0.9.1.
Ken
On Thursday, April 10, 2014 9:02 AM, Ken Ellinwood kellinw...@yahoo.com wrote:
14/04/10 08:00:42 INFO AppClient$ClientActor: Executor added:
app-20140410080041-0017/9 on worker-20140409145028-ken-
VirtualBox-39159 (ken