Hi Akhil,
I just set LD_LIBRARY_PATH in conf/spark-env.sh (instead of
SPARK_LIBRARY_PATH) to point at the native library path, and it works.
Thanks.
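(For reference, a minimal sketch of the relevant line in conf/spark-env.sh; the native library path shown is illustrative:)

# conf/spark-env.sh -- point the loader at the native libraries (path is hypothetical)
export LD_LIBRARY_PATH="/usr/lib/hadoop/lib/native:$LD_LIBRARY_PATH"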
On Tue, Sep 8, 2015 at 6:14 PM, Akhil Das
wrote:
> Looks like you are having different versions of snappy library. Here's a
> similar disc
Hi All,
I'm trying to build a distribution off of the latest in master, and I keep
getting errors on MQTT and the build fails. I'm running the build on an
m1.large, which has 7.5 GB of RAM, and no other major processes are running.
MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=5
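(For reference, the Spark build guide at the time recommended settings along these lines, together with the distribution script; the profiles shown are illustrative:)

export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
./make-distribution.sh --name custom-spark --tgz -Phadoop-2.4 -Pyarn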
Hi svelusamy,
Were you able to make it work? I am facing the exact same problem: getting
a connection timeout when trying to access S3.
Thank you.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-read-files-from-S3-from-Spark-local-when-there-is-a-http-p
Thanks, Igor; I've got it running again right now, and can attach the stack
trace when it finishes.
In the meantime, I've noticed something interesting: in the Spark UI, the
application jar that I submit is not being included on the classpath. It
has been successfully uploaded to the nodes -- in
1. In order to change log4j.properties at the name node, you can change
/home/hadoop/log4j.properties.
2. In order to change log4j.properties for the container logs, you need to
change it in the YARN containers jar, since they hard-code loading the
file directly from the project resources.
2.1. SSH to the
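(A hedged sketch of what updating the file inside a container jar could look like; the jar path here is hypothetical:)

# Replace the log4j.properties entry bundled in the YARN container jar (jar path is hypothetical)
jar uf /usr/lib/hadoop-yarn/hadoop-yarn-container.jar log4j.properties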
In Spark 1.4 and 1.5, you can do something like this:
df.write.partitionBy("key").parquet("/datasink/output-parquets")
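For example (a minimal hypothetical sketch; the data and column names are illustrative):

// Writes one Parquet directory per distinct "key" value,
// e.g. /datasink/output-parquets/key=1/ and /datasink/output-parquets/key=2/
val df = sqlContext.createDataFrame(Seq(
  (1, "a"), (1, "b"), (2, "c")
)).toDF("key", "value")
df.write.partitionBy("key").parquet("/datasink/output-parquets")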
BTW, I'm curious how you did it without partitionBy using
saveAsHadoopFile?
Cheng
On 9/8/15 2:34 PM, Adrien Mogenet wrote:
Hi there,
We've spent several hours to
Yeah, this is a typical Parquet interoperability issue due to
unfortunate historical reasons. Hive (actually parquet-hive) gives the
following schema for array:
message m0 {
  optional group f (LIST) {
    repeated group bag {
      optional int32 array_element;
    }
  }
}
while Spark SQL gives
me
Here's the mesos master log
>
> I0908 15:08:16.515960 301916160 master.cpp:1767] Received registration
> request for framework 'Spark shell' at
> scheduler-1ea1c85b-68bd-40b4-8c7c-ddccfd56f82b@192.168.3.3:57133
> I0908 15:08:16.520545 301916160 master.cpp:1834] Regist
Try adding a filter to remove or replace the null elements before (or within)
the map operation, for example:
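(A hypothetical sketch; joined and combine are illustrative names, with joined assumed to be the DStream of (key, (left, right)) pairs produced by the join:)

// Drop pairs whose joined values are null before mapping over them
val cleaned = joined
  .filter { case (_, (left, right)) => left != null && right != null }
  .map { case (key, (left, right)) => (key, combine(left, right)) } // combine is hypothetical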
Thanks
Best Regards
On Mon, Sep 7, 2015 at 3:34 PM, ZhengHanbin wrote:
> Hi,
>
> I am using Spark Streaming to join every RDD of a DStream to a standalone
> RDD to generate a new DStream as follows:
>
> *
It looks like you have different versions of the snappy library. Here's a
similar discussion, if you haven't seen it already:
http://stackoverflow.com/questions/22150417/hadoop-mapreduce-java-lang-unsatisfiedlinkerror-org-apache-hadoop-util-nativec
Thanks
Best Regards
On Mon, Sep 7, 2015 at 7:41 AM,
Hi All,
I'd like to apply a chain of Spark transformations (map/filter) to a given
JavaRDD. I'll have the set of Spark transformations as Functions, and
even though I can determine the classes of T and A at runtime, due to
type erasure, I cannot call JavaRDD's transformations, as they expect
Hi, community
I have an application which I am trying to migrate from MR to Spark.
It does some calculations over Hive data and outputs HFiles, which are
then bulk loaded into an HBase table; details as follows:
Rdd input = getSourceInputFromHive()
Rdd> mapSideResult =
input.glom().mapPartition
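(The snippet above is cut off and its type parameters were lost by the archive; for reference, a common shape for the HFile step is sketched below. This is a hedged sketch, not the poster's code; the column family, qualifier, and output path are hypothetical:)

import org.apache.hadoop.hbase.{HBaseConfiguration, KeyValue}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
import org.apache.hadoop.hbase.util.Bytes

// input: RDD[(String, String)] of (rowKey, value), e.g. read from Hive
val sorted = input.sortByKey() // HFileOutputFormat2 requires totally ordered row keys
val kvs = sorted.map { case (rowKey, value) =>
  val row = Bytes.toBytes(rowKey)
  (new ImmutableBytesWritable(row),
    new KeyValue(row, Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(value)))
}
kvs.saveAsNewAPIHadoopFile(
  "/tmp/hfiles", // staged HFiles, later handed to the HBase bulk loader
  classOf[ImmutableBytesWritable],
  classOf[KeyValue],
  classOf[HFileOutputFormat2],
  HBaseConfiguration.create())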
Haha, OK, it's one of those days: Array isn't valid. RTFM, and it says a
Catalyst array maps to a Scala Seq; that makes sense.
So it works! Two follow-up questions:
1 - Is this the best approach?
2 - What if I want my expression to return multiple rows? My binary
classification model gives me an arra
Sorry for the spam - I had some success;
case class ScoringDF(function: Row => Double) extends Expression {
  val dataType = DataTypes.DoubleType
  override type EvaluatedType = Double
  override def eval(input: Row): EvaluatedType = {
    function(input)
  }
  override def nullable: Boolean =
As a starting point, attach your stack trace...
PS: look for duplicates in your classpath; maybe you're including another jar
with the same class.
On 8 September 2015 at 06:38, Nicholas R. Peterson
wrote:
> I'm trying to run a Spark 1.4.1 job on my CDH5.4 cluster, through Yarn.
> Serialization is set to us
So basically I need something like:
df.withColumn("score", new Column(new Expression {
  ...
  def eval(input: Row = null): EvaluatedType = myModel.score(input)
  ...
}))
But I can't do this, so how can I make a UDF or something like it, that can
take in a Row and pass back a double value or some str
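One workaround is to pass the needed columns to a plain UDF instead of a whole Row (a hedged sketch; myModel and the feature column names are hypothetical):

import org.apache.spark.sql.functions.udf

// Score from individual feature columns rather than a full Row
val score = udf { (f1: Double, f2: Double) => myModel.score(f1, f2) }
val scored = df.withColumn("score", score(df("f1"), df("f2")))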
Not sure how that would work. Really, I want to tack an extra column onto
the DF with a UDF that can take a Row object.
On Tue, Sep 8, 2015 at 1:54 AM, Jörn Franke wrote:
> Can you use a map or list with different properties as one parameter?
> Alternatively a string where parameters are Comma
I0908 15:08:16.515960 301916160 master.cpp:1767] Received registration
request for framework 'Spark shell' at
scheduler-1ea1c85b-68bd-40b4-8c7c-ddccfd56f82b@192.168.3.3:57133
I0908 15:08:16.520545 301916160 master.cpp:1834] Registering framework
20150908-143320-16777343-5050-41965-0
OK, thanks Reynold. When I tested dynamic allocation with Spark 1.4, it
complained that it was not Tungsten compliant. Let's hope it works
with 1.5 then!
On Tue, Sep 8, 2015 at 5:49 AM Reynold Xin wrote:
>
> On Wed, Sep 2, 2015 at 12:03 AM, Anders Arpteg wrote:
>
>>
>> BTW, is it possible
Try using a custom partitioner for the keys so that they get evenly
distributed across tasks, for example:
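(A minimal hypothetical sketch; the class name and partition count are illustrative. For heavily skewed keys, salting the key before the shuffle is a common refinement:)

import org.apache.spark.Partitioner

// A deterministic partitioner that spreads keys across all partitions
class EvenPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = {
    val mod = key.hashCode % numPartitions
    if (mod < 0) mod + numPartitions else mod // keep the result non-negative
  }
}

val repartitioned = pairRdd.partitionBy(new EvenPartitioner(200)) // pairRdd is hypothetical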
Thanks
Best Regards
On Fri, Sep 4, 2015 at 7:19 PM, mark wrote:
> I am trying to tune a Spark job and have noticed some strange behavior -
> tasks in a stage vary in execution time, ranging from 2
Compiling from source with Scala 2.11 support fixed this issue. Thanks
again for the help!
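(For reference, the Scala 2.11 build steps documented in the Spark build guide at the time looked roughly like this; exact flags may vary by version:)

./dev/change-scala-version.sh 2.11
mvn -Dscala-2.11 -DskipTests clean package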
On Tue, Sep 8, 2015 at 7:33 AM, Gheorghe Postelnicu <
gheorghe.posteln...@gmail.com> wrote:
> Good point. It is a pre-compiled Spark version. Based on the text on the
> downloads page, the answer to your