Like this?
messages.foreachRDD(rdd => {
  if (rdd.count() > 0) {
    // Do whatever you want.
  }
})
Thanks
Best Regards
On Fri, Apr 24, 2015 at 11:20 PM, Sergio Jiménez Barrio
drarse.a...@gmail.com wrote:
Hi,
I need to compare the count of received messages to see whether it is 0 or not, but
messages.count() return a
It is solved. Thank u! Is more efficient
messages.foreachRDD(rdd => {
  if (!rdd.isEmpty) {
    // Do whatever you want.
  }
})
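A plain-Python sketch (no Spark required) of why the isEmpty-style check is more efficient than count(): count() must consume every element, while an emptiness check can stop after looking at one. RDD.isEmpty does the equivalent by inspecting just a single element of the first non-empty partition instead of triggering a full count job.

```python
# Sketch: emptiness check vs. full count.
# is_empty consumes at most one element; count must walk them all.

_SENTINEL = object()

def is_empty(iterable):
    """Return True if the iterable yields no elements, consuming at most one."""
    return next(iter(iterable), _SENTINEL) is _SENTINEL

def count(iterable):
    """Full count: must walk every element."""
    return sum(1 for _ in iterable)

print(is_empty([]))        # True
print(is_empty([1, 2, 3])) # False
print(count(range(5)))     # 5
```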
2015-04-25 19:21 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
Like this?
messages.foreachRDD(rdd => {
  if (rdd.count() > 0) {
    // Do whatever you want.
  }
})
Thanks
Best
I have no problem running the socket text stream sample in the same
environment.
Thanks
Yang
Sent from my iPhone
On Apr 25, 2015, at 1:30 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
Make sure you have >= 2 cores for your streaming application.
Thanks
Best Regards
On Sat,
Maybe this will give you a good start:
https://github.com/apache/spark/pull/2077
Thanks
Best Regards
On Sat, Apr 25, 2015 at 1:29 AM, Giovanni Paolo Gibilisco gibb...@gmail.com
wrote:
Hi,
I would like to know if it is possible to build the DAG before actually
executing the application. My
This code is in Python. Also, I tried with a forward slash at the end, with the same
result.
On 26 Apr 2015 01:36, Jeetendra Gangele gangele...@gmail.com wrote:
Also, if this code is in Scala, why no val for newsY? Is it defined above?
loc = "D:\\Project\\Spark\\code\\news\\jsonfeeds"
newsY =
Make sure you have >= 2 cores for your streaming application.
Thanks
Best Regards
On Sat, Apr 25, 2015 at 3:02 AM, Yang Lei genia...@gmail.com wrote:
I hit the same issue as if the directory has no files at all when running
the sample examples/src/main/python/streaming/hdfs_wordcount.py
Even after grouping only on name, the issue (ClassCastException) is still
there.
----- Original Message -----
From: ayan guha guha.a...@gmail.com
To: doovs...@sina.com
Cc: user user@spark.apache.org
Subject: Re: Spark SQL 1.3.1: java.lang.ClassCastException is thrown
Date: April 25, 2015, 22:33
Sorry if I am looking
Yes, you would need to add the MySQL driver jar to the Spark driver and
executor classpath.
Either using the deprecated SPARK_CLASSPATH environment variable (which the
latest docs still recommend anyway, although it's deprecated) like so
export SPARK_CLASSPATH=/usr/share/java/mysql-connector.jar
Giovanni,
The DAG can be walked by calling the dependencies() function on any RDD.
It returns a Seq containing the parent RDDs. If you start at the leaves
and walk through the parents until dependencies() returns an empty Seq, you
ultimately have your DAG.
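The walk described above can be sketched in plain Python, with a toy Node class standing in for an RDD and its deps attribute playing the role of RDD.dependencies() returning the parent RDDs (both names here are illustrative, not Spark API):

```python
# Sketch: walking a DAG from a leaf node to its roots via parent links.
# `Node` is a toy stand-in for an RDD; `deps` mimics RDD.dependencies().

class Node:
    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)  # parent nodes; empty at the roots

def walk(leaf):
    """Return node names in visit order, leaf first, roots last."""
    seen, order, stack = set(), [], [leaf]
    while stack:
        node = stack.pop()
        if node.name in seen:
            continue
        seen.add(node.name)
        order.append(node.name)
        stack.extend(node.deps)
    return order

a = Node("source")
b = Node("map", [a])
c = Node("filter", [b])
print(walk(c))  # ['filter', 'map', 'source']
```

The same traversal works on real RDDs once `node.deps` is replaced by `rdd.dependencies()`, keeping a visited set to handle shared parents.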
On Sat, Apr 25, 2015 at 1:28 PM, Akhil
It looks like your code is making 1 Row per item, which means that
columnSimilarities will compute similarities between users. If you
transpose the matrix (or construct it as the transpose), then
columnSimilarities should do what you want, and it will return meaningful
indices.
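A small pure-Python sketch of the point being made: cosine similarity is computed between *columns*, so the entity you want compared must sit in the columns. Here the rows are items and the columns are users (the transposed layout), so comparing columns compares users:

```python
# Sketch: cosine similarity between columns of a row-major matrix.
# With rows = items and columns = users, comparing columns compares users.

import math

def column_cosine(matrix, i, j):
    """Cosine similarity between columns i and j."""
    col_i = [row[i] for row in matrix]
    col_j = [row[j] for row in matrix]
    dot = sum(x * y for x, y in zip(col_i, col_j))
    norm = (math.sqrt(sum(x * x for x in col_i))
            * math.sqrt(sum(x * x for x in col_j)))
    return dot / norm if norm else 0.0

# rows = items, columns = users (the transposed layout)
ratings = [
    [5.0, 5.0],
    [3.0, 3.0],
]
print(column_cosine(ratings, 0, 1))  # 1.0 -- the two user columns are identical
```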
Joseph
On Fri,
Yes, the count() should be the first task, and the sampling + collecting
should be the second task. The first one is probably slow because the RDD
being sampled is not yet cached/materialized.
K-Means creates some RDDs internally while learning, and since they aren't
needed after learning, they
I have encountered the all-pairs similarity problem in my recommendation
system. Thanks to this databricks blog, it seems RowMatrix may come to help.
However, RowMatrix is a matrix type without meaningful row indices, so
I don't know how to retrieve the similarity result after invoking
Actually, Spark SQL provides a data source. Here is the relevant section from the documentation:
JDBC To Other Databases
Spark SQL also includes a data source that can read data from other
databases using JDBC. This functionality should be preferred over using
JdbcRDD
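For concreteness, a sketch of the option map the JDBC data source expects: at minimum a JDBC url and the table (or subquery) to read. The host, database, and table names below are hypothetical placeholders, not values from this thread.

```python
# Sketch (hypothetical connection details): the options handed to the
# Spark SQL jdbc data source. Passing a subquery as dbtable pushes the
# filtering down to the database instead of pulling the whole table.

jdbc_options = {
    "url": "jdbc:mysql://dbhost:3306/mydb",          # hypothetical host/db
    "dbtable": "(select * from employees) as emps",  # table name or subquery
    "driver": "com.mysql.jdbc.Driver",
}
# On the Spark 1.3 Python API this map would be passed along the lines of
# sqlContext.load(source="jdbc", **jdbc_options); the driver jar still has
# to be on the driver/executor classpath as discussed above.
print(sorted(jdbc_options))
```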
Yeah, same issue. I noticed this issue is not solved yet.
----- Original Message -----
From: Ted Yu yuzhih...@gmail.com
To: doovs...@sina.com
Cc: user user@spark.apache.org
Subject: Re: Spark SQL 1.3.1: java.lang.ClassCastException is thrown
Date: April 25, 2015, 22:04
Looks like this is related:
Sorry if I am looking at the wrong issue, but your query is wrong. You
should group by only on name.
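To make the suggested fix concrete: every distinct (name, salary) pair in the original query becomes its own group, so sum(salary) just hands each salary back. Grouping only on name aggregates all salaries per employee. Sketched as query strings:

```python
# Sketch: the corrected query groups only on name, so sum(salary)
# aggregates all salaries per employee instead of per (name, salary) pair.

broken = "select name, sum(salary) from Employees group by name, salary"
fixed  = "select name, sum(salary) from Employees group by name"
# With `broken`, each distinct salary is its own group and sum(salary)
# returns that salary unchanged.
print(fixed)
```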
On Sat, Apr 25, 2015 at 11:59 PM, doovs...@sina.com wrote:
Hi all,
When I query Postgresql based on Spark SQL like this:
dataFrame.registerTempTable("Employees")
val emps =
Hi
I am facing this weird issue.
I am on Windows, and I am trying to load all files within a folder. Here is
my code -
loc = "D:\\Project\\Spark\\code\\news\\jsonfeeds"
newsY = sc.textFile(loc)
print newsY.count()
Even this simple code fails. I have tried with giving exact file names,
loc = "D:\\Project\\Spark\\code\\news\\jsonfeeds\\"
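A side note on the path itself, since the doubled backslashes come up repeatedly in this thread: they are just Python string escaping, and a raw string spells the identical path. A glob suffix also makes "all files in the folder" explicit rather than relying on directory listing (pure-Python sketch; the glob form assumes sc.textFile's usual Hadoop-style pattern support):

```python
# Sketch: three ways to spell the same Windows path in Python.
# The doubled backslashes are only string escaping; a raw string
# produces the identical value.

escaped = "D:\\Project\\Spark\\code\\news\\jsonfeeds"
raw = r"D:\Project\Spark\code\news\jsonfeeds"
assert escaped == raw  # identical strings once parsed

# A glob makes "every file in the folder" explicit:
glob_style = escaped + "\\*"
print(glob_style)
```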
On 25 April 2015 at 20:49, Jeetendra Gangele gangele...@gmail.com wrote:
Hi Ayan can you try below line
loc = "D:\\Project\\Spark\\code\\news\\jsonfeeds"
On 25 April 2015 at 20:08, ayan guha guha.a...@gmail.com wrote:
Hi
I am facing this
Hi Ayan can you try below line
loc = "D:\\Project\\Spark\\code\\news\\jsonfeeds"
On 25 April 2015 at 20:08, ayan guha guha.a...@gmail.com wrote:
Hi
I am facing this weird issue.
I am on Windows, and I am trying to load all files within a folder. Here
is my code -
loc =
Hi,
I am running the k-means algorithm with initialization mode set to random and
various dataset sizes and numbers of clusters, and I have a question
regarding the takeSample job of the algorithm.
More specifically, I notice that in every application there are two sampling
jobs. The first one is consuming
Hi all,
When I query Postgresql based on Spark SQL like this:
dataFrame.registerTempTable("Employees")
val emps = sqlContext.sql("select name, sum(salary) from Employees group
by name, salary")
monitor {
emps.take(10)
.map(row => (row.getString(0),
Looks like this is related:
https://issues.apache.org/jira/browse/SPARK-5456
On Sat, Apr 25, 2015 at 6:59 AM, doovs...@sina.com wrote:
Hi all,
When I query Postgresql based on Spark SQL like this:
dataFrame.registerTempTable("Employees")
val emps = sqlContext.sql("select name,
Try an extra forward slash at the end; sometimes I have seen this kind of issue.
On 25 April 2015 at 20:50, Jeetendra Gangele gangele...@gmail.com wrote:
loc = "D:\\Project\\Spark\\code\\news\\jsonfeeds\\"
On 25 April 2015 at 20:49, Jeetendra Gangele gangele...@gmail.com wrote:
Hi Ayan can you try
Also, if this code is in Scala, why no val for newsY? Is it defined above?
loc = "D:\\Project\\Spark\\code\\news\\jsonfeeds"
newsY = sc.textFile(loc)
print newsY.count()
On 25 April 2015 at 20:08, ayan guha guha.a...@gmail.com wrote:
Hi
I am facing this weird issue.
I am on Windows, and I
If your use case is more about querying an RDBMS and then bringing the
results into Spark to do some analysis, then the Spark SQL JDBC data source API
http://www.sparkexpert.com/2015/03/28/loading-database-data-into-spark-using-data-sources-api/
is the best. If your use case is to bring the entire data to