Hi all:
I am trying to run a simple test of Spark SQL's ability to persist data
to Parquet files...
My code is:
val conf = new SparkConf()
  .setMaster("local[1]")
  .setAppName("test")
implicit val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark
Hello all:
I am trying to persist a SchemaRDD of nested case classes to a Parquet
file...
Creating the SchemaRDD object seems to work fine, but an exception is thrown
when I attempt to persist this object to a Parquet file...
My code:
case class Trivial(trivial: String = "trivial", l
Thanks. That might be a good note to add to the official Programming
Guide...
On Thu, Jun 26, 2014 at 5:05 PM, Michael Armbrust [via Apache Spark User
List] wrote:
> Nested parquet is not supported in 1.0, but is part of the upcoming 1.0.1
> release.
>
>
> On Thu, Jun 26, 2014 at 3:03 PM, [hidd
I am seeing this too: no items appear on the storage UI page. I'm using 1.0
with HDFS...
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Can-t-see-any-thing-one-the-storage-panel-of-application-UI-tp10296p11361.html
Sent from the Apache Spark User List mailing
Good idea, Andrew... Using this feature let me work out that my app
wasn't caching properly; the UI is working as designed for me in 1.0. It
might be a good idea to show "no cached blocks" instead of an empty page...
just a thought...
On Mon, Aug 4, 2014 at 1:17 PM, Andrew Or-2 [via Apache Spa
I'm sure this must be a fairly common use case for Spark, yet I have not
found a satisfactory discussion of it on the Spark website or forum:
I work at a company with a lot of previous-generation server hardware
sitting idle-- I want to add this hardware to my Spark cluster to increase
performance
Similarly, I am seeing tasks moved to the "completed" section which
apparently haven't finished all elements... (succeeded/total < 1)... is this
related?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/All-of-the-tasks-have-been-completed-but-the-Stage-is-st
I am having a similar problem:
I have a large dataset in HDFS and (for a few possible reasons, including a
filter operation and some of my computation nodes simply not being HDFS
DataNodes) have a large skew in my RDD blocks: the master node always has
the most, while the worker nodes have few... (
I've got a stack of Dell commodity servers: RAM ranging from 8 to 32 GB, with
single or dual quad-core processors per machine. I think I will have them
loaded with CentOS. Eventually, I may want to add GPUs to the nodes to handle
linear algebra operations...
My idea has been:
1) to find a way to configure Spar
Jörn, thanks for the post...
Unfortunately, I am stuck with the hardware I have and might not be
able to get budget allocated for a new stack of servers when I've
already got so many "ok" servers on hand... And even more
unfortunately, a large subset of these machines are... shall we say...
extrem
This is what I thought the simplest method would be, but I can't seem to
figure out how to configure it--
When you set:
SPARK_WORKER_INSTANCES, to set the number of worker processes per node
but when you set
SPARK_WORKER_MEMORY, to set how much total memory workers have to give
executors (e.g.
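One way to reason about these two settings on mixed hardware is to work out the per-worker memory for each machine before writing it into that machine's configuration. The sketch below is only an illustration: `WorkerSizing`, `WorkerSettings` and `sizeWorkers` are hypothetical names, not part of Spark, and it assumes (as I understand standalone mode) that each worker instance on a node gets its own SPARK_WORKER_MEMORY allowance, so the node commits instances times that amount in total.

```scala
// Hypothetical helper, not part of Spark: sizes standalone-mode worker
// settings for a single node of a heterogeneous cluster.
object WorkerSizing {
  // Values that would feed SPARK_WORKER_INSTANCES and SPARK_WORKER_MEMORY
  // in that node's spark-env.sh.
  final case class WorkerSettings(instances: Int, memoryPerWorkerMb: Int) {
    // Assumption: every worker instance gets its own memory allowance,
    // so the node commits instances * memoryPerWorkerMb overall.
    def totalCommittedMb: Int = instances * memoryPerWorkerMb
  }

  // Split a node's RAM evenly across worker instances after reserving
  // some memory for the OS and other daemons.
  def sizeWorkers(nodeRamMb: Int, reservedMb: Int, instances: Int): WorkerSettings = {
    require(instances > 0, "need at least one worker instance")
    require(nodeRamMb > reservedMb, "node must have more RAM than the reserved amount")
    WorkerSettings(instances, (nodeRamMb - reservedMb) / instances)
  }
}
```

For example, `WorkerSizing.sizeWorkers(32768, 2048, 2)` gives two workers of 15360 MB each on a 32 GB node, while `WorkerSizing.sizeWorkers(8192, 2048, 1)` gives a single 6144 MB worker on an 8 GB node. The 2 GB reserve is an arbitrary illustrative figure, not a recommendation.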
I wonder if anyone has any tips for using repartition?
It seems that when you call the repartition method, the entire RDD gets
split up, shuffled, and redistributed... This is an extremely heavy task if
you have a large HDFS dataset and all you want to do is make sure your RDD
is balance/ data ske
It would probably be much easier to use an existing library in conjunction
with MLlib or Spark. I have been using jblas as my matrix backend for machine
learning work (I don't use MLlib yet)... but jblas does not support sparse
matrices afaik. Consider checking Scala's Breeze. Interestingly, I have fou
I think that should be possible. Make sure Spark is installed on your local
machine and is the same version as on the cluster.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-shell-or-queries-over-the-network-not-from-master-tp13543p13590.html
14 matches