Problem solved by creating only one RDD.
> On Jun 1, 2016, at 14:05, Cyril Scetbon wrote:
>
> It seems that to join a DStream with an RDD I can use:
>
> mgs.transform(rdd => rdd.join(rdd1))
>
> or
>
> mgs.foreachRDD(rdd => rdd.join(rdd1))
>
> But I can't see why rdd1.toDF("id","aid") really causes SPARK-5063
Hey guys,
Can someone help me solve the following issue? My code is:
(1)
var arr = new ArrayBuffer[String]()
sa_msgs.map(x => x._1)
       .foreachRDD { rdd => arr = new ArrayBuffer[String]()
       }

(2)
sa_msgs.map(x => x._1)
       .foreachRDD { rdd => arr ++= …
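If it helps, here is a minimal sketch of what I believe the intent is, with the reset and the append folded into a single foreachRDD (assuming sa_msgs is a DStream of pairs whose first element is a String; the rdd.collect() completion is my assumption, since the snippet above is cut off):

import scala.collection.mutable.ArrayBuffer

var arr = new ArrayBuffer[String]()

sa_msgs.map(_._1).foreachRDD { rdd =>
  // the foreachRDD body runs on the driver once per batch;
  // collect() pulls this batch's keys back into the local buffer
  arr = new ArrayBuffer[String]()
  arr ++= rdd.collect()
}

Doing both steps in one foreachRDD avoids depending on the order in which two separate foreachRDD calls are scheduled.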
It seems that to join a DStream with an RDD I can use:
mgs.transform(rdd => rdd.join(rdd1))
or
mgs.foreachRDD(rdd => rdd.join(rdd1))
But I can't see why rdd1.toDF("id","aid") really causes SPARK-5063.
> On Jun 1, 2016, at 12:00, Cyril Scetbon wrote:
>
… do it, or do I need to do everything using RDD joins only?
And if I need to use only RDD joins, how do I do it, given that I have an RDD (rdd1) and
a DStream (mgs)?
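For reference, a minimal sketch of the transform route (assuming mgs is a DStream of key/value pairs and rdd1 a pair RDD keyed the same way; transform's closure is evaluated on the driver for each batch, so a plain RDD-to-RDD join is legal there):

// join every micro-batch of the stream against the static reference RDD
val joined = mgs.transform(rdd => rdd.join(rdd1))

SPARK-5063 is triggered by referencing an RDD such as rdd1 inside a closure that is shipped to the executors; converting rdd1 once, outside any DStream closure, and reusing the result is one way to stay clear of it.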
Thanks
--
Cyril SCETBON
Is there a better way to query nested objects, and to join a DF containing
nested objects with another regular DataFrame (yes, that's my current case)?
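In case it's useful, a minimal sketch of one way to do it: project the nested fields up to flat columns first, then join on the flat key (the df, otherDf, and user.* names are hypothetical, not from the original post):

import org.apache.spark.sql.functions.col

// pull the nested struct fields up to top-level columns
val flat = df.select(col("user.id").as("id"), col("user.name").as("name"))

// then join against the regular DataFrame on the flat key
val enriched = flat.join(otherDf, Seq("id"))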
> On May 9, 2016, at 00:42, Cyril Scetbon wrote:
>
> Hi Ashish,
>
> The issue is not related …
… comment on it and maybe give me some advice.
Thank you.
> On May 8, 2016, at 22:12, Ashish Dubey wrote:
>
> Is there any reason you don't want to convert this? I don't think a join between
> an RDD and a DF is supported.
>
> On Sat, May 7, 2016 at 11:41 PM, Cyril Scetbon wrote:
Hi,
I have an RDD built during a Spark streaming job and I'd like to join it to a
DataFrame (E/S input) to enrich it.
It seems that I can't join the RDD and the DF without first converting the RDD
to a DF (tell me if I'm wrong). Here are the schemas of both DFs:
scala> df
res32: org.apache.spark
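If converting first is acceptable, a minimal sketch of that route (the rdd1 name, the column names, and the join key are assumptions borrowed from the other thread; sqlContext is the one the 1.x shell predefines):

import sqlContext.implicits._

// convert the RDD built by the streaming job into a DataFrame …
val rddDf = rdd1.toDF("id", "aid")

// … and join it with the E/S-backed DataFrame on the shared key
val enriched = df.join(rddDf, Seq("id"))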
The only way I've found to make it work for now is by using the current Spark
context and changing its configuration through spark-shell options, which is
really different from pyspark, where you can't instantiate a new one, initialize
it, etc.
> On Apr 4, 2016, at 18:16, Cyril Scetbon wrote:
> set("spark.hadoop.validateOutputSpecs", "false")
>
> And that will work
>
> Have you tried it?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
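For reference, a minimal sketch of where that setting goes when building the context yourself (the app name is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// skip Hadoop's check that the output directory does not already exist
val conf = new SparkConf()
  .setAppName("example")
  .set("spark.hadoop.validateOutputSpecs", "false")
val sc = new SparkContext(conf)

In spark-shell the equivalent is passing --conf spark.hadoop.validateOutputSpecs=false at launch.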
Nobody has any idea?
> On Mar 31, 2016, at 23:22, Cyril Scetbon wrote:
>
> Hi,
>
> I'm having issues creating a StreamingContext with Scala using spark-shell.
> It tries to access the localhost interface, and the Application Master is not
> running on that interface …
… it seems there are differences. Are there any
parameters set using Scala but not Python?
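For what it's worth, the spark-shell workaround mentioned above builds the StreamingContext on top of the shell's existing sc instead of creating a second SparkContext; a minimal sketch (the batch interval is an arbitrary choice):

import org.apache.spark.streaming.{Seconds, StreamingContext}

// reuse the SparkContext the shell already created
val ssc = new StreamingContext(sc, Seconds(5))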
Thanks
--
Cyril SCETBON
scala> val spec = StateSpec.function(mappingFunction)
spec: org.apache.spark.streaming.StateSpec[String,Int,Int,Option[String]] =
StateSpecImpl(<function3>)
This code comes from one of the Spark examples. Can anyone explain the
difference between the two to me?
Regards
> On Mar 27, 2016, at 23:57, Cyril Scetbon wrote:
Hi,
I'm testing some sample code and I get this error. Here is the code I use:
scala> import org.apache.spark.streaming._
import org.apache.spark.streaming._
scala> def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
     |   Option(key)
     | }
mappingFunction: (key: String, value: Option[Int], state: State[Int])Option[String]
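For completeness, a minimal end-to-end sketch around the same pieces (the running-count state update is an assumption; the sample above only returns the key):

import org.apache.spark.streaming._

// update a running count per key and emit the key itself
def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
  val sum = value.getOrElse(0) + state.getOption.getOrElse(0)
  state.update(sum)
  Option(key)
}

val spec = StateSpec.function(mappingFunction _)

// applied to a DStream[(String, Int)] built elsewhere:
// val stateful = pairStream.mapWithState(spec)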
> … sync, since they are both in the JVM, and a bit more work is needed to
> expose the same functionality from the JVM in Python (or re-implement the
> Scala code in Python where appropriate).
>
> On Thursday, March 24, 2016, Cyril Scetbon wrote:
Hi guys,
It seems that mapWithState is not supported in Python. Am I right? Is there a
reason three languages are supported and one lags behind the other two?
Thanks
I don't think we can print an integer value in a Spark streaming process, as
opposed to a Spark job. I think I can print the content of an RDD, but not debug
messages. Am I wrong?
Cyril Scetbon
> On Feb 17, 2016, at 12:51 AM, ayan guha wrote:
>
> Hi
>
> You can always …
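On the printing question: a println in the foreachRDD body itself runs on the driver and shows up in the console, while a println inside an RDD operation runs on the executors and only shows up in their logs; a minimal sketch (stream is a placeholder DStream):

var total = 0L

stream.foreachRDD { rdd =>
  // driver side: prints to the console running the streaming job
  total += rdd.count()
  println(s"running total so far: $total")

  // executor side: lands in the executors' stdout logs, not the console
  rdd.foreach(record => println(s"saw record: $record"))
}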
> … understanding.
>
> Direct stream generates one RDD per batch; however, the number of partitions in
> that RDD = the number of partitions in the Kafka topic.
>
> On Wed, Feb 17, 2016 at 12:18 PM, Cyril Scetbon wrote:
> Hi guys,
>
> I'm making …
Hi guys,
I'm running some tests with Spark and Kafka using a Python script. I use the
second method, which doesn't need any receiver (the Direct Approach). It should
adapt the number of RDDs to the number of partitions in the topic, and I'm
trying to verify that. What's the easiest way to verify it? I also …
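One minimal way to check it per batch (Scala here to match the other snippets; directStream is a placeholder for the DStream returned by the direct API):

directStream.foreachRDD { rdd =>
  // with the direct approach, one RDD partition per Kafka topic partition
  println(s"partitions in this batch: ${rdd.partitions.size}")
}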