Are you by any chance starting two StreamingContexts in the same JVM? That
could explain a lot of the weird mixing of data that you are seeing. Starting
multiple StreamingContexts simultaneously in the same JVM is not a supported
usage scenario.
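
On the question of using external data to tag the records, here is a
minimal sketch in plain Python (it is not Spark code, and make_tagger and
run_stream are hypothetical names used only for illustration). It assumes
the mixing comes from both pipelines reading the name from shared state at
processing time, which cannot be confirmed from the description alone. The
idea is to bind each name into its own per-record function up front:

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def make_tagger(record_name: str) -> Callable[[Record], Record]:
    """Return a function that tags records with a fixed name.

    The name is bound when the tagger is created, so it does not depend
    on any shared state that another pipeline in the same process uses.
    """
    def tag(record: Record) -> Record:
        tagged = dict(record)  # copy so the input record is not mutated
        tagged["Record_Name"] = record_name
        return tagged
    return tag


def run_stream(records: Iterable[Record],
               tagger: Callable[[Record], Record]) -> List[Record]:
    """Hypothetical stand-in for the per-record map step of one stream."""
    return [tagger(r) for r in records]


# Each pipeline gets its own tagger with its own name.
tag_1 = make_tagger("RECORD_NAME_1")
tag_2 = make_tagger("RECORD_NAME_2")

stream_1 = run_stream([{"value": "a"}, {"value": "b"}], tag_1)
stream_2 = run_stream([{"value": "c"}], tag_2)
```

With the name passed in as a value like this, the two pipelines share
nothing that can be overwritten by the other one.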

TD


On Thu, Apr 17, 2014 at 10:58 PM, gaganbm <gagan.mis...@gmail.com> wrote:

> It happens at a normal data rate, say 20 records per second.
>
> Apart from that, I am also getting some more strange behavior. Let me
> explain.
>
> I create two StreamingContexts (SSCs) and start them one after another. In
> each SSC I read streams from Kafka sources and do some manipulations, for
> example adding a "Record_Name" field to each incoming record. This
> Record_Name is different for the two SSCs, and I get it from another
> class that is unrelated to the streams.
>
> The expected behavior is that all records in SSC1 get the field
> RECORD_NAME_1 and all records in SSC2 get the field RECORD_NAME_2. As far
> as I can tell, the two SSCs have nothing to do with each other.
>
> However, strangely enough, I find many records in SSC1 tagged with
> RECORD_NAME_2 and vice versa. Is it some kind of serialization issue? That
> is, does the class that provides RECORD_NAME get serialized and
> reconstructed, with something going wrong in the process? I am unable to
> figure it out.
>
> So, apart from the skewed frequency and volume of records in the two
> streams, I am also seeing this intermingling of data between them.
>
> Can you help me with how to use external data to manipulate the RDD
> records?
>
> Thanks and regards
>
> Gagan B Mishra
>
>
> *Programmer*
> *560034, Bangalore*
> *India*
>
>
> On Tue, Apr 15, 2014 at 4:09 AM, Tathagata Das [via Apache Spark User
> List] <[hidden email]> wrote:
>
>> Does this happen at a low event rate for that topic as well, or only at
>> a high volume rate?
>>
>> TD
>>
>>
>> On Wed, Apr 9, 2014 at 11:24 PM, gaganbm <[hidden email]> wrote:
>>
>>> I am really at my wits' end here.
>>>
>>> I have different StreamingContexts, let's say 2, both listening to the
>>> same Kafka topics. I establish the KafkaStream by setting a different
>>> consumer group for each of them.
>>>
>>> Ideally, I should be seeing the Kafka events in both streams. But what I
>>> am getting is really unpredictable. One stream gets a lot of events and
>>> the other gets almost nothing, or far fewer by comparison. The frequency
>>> is also very skewed: I get a lot of events in one stream continuously,
>>> and after some time I get a few events in the other one.
>>>
>>> I don't know where I am going wrong. I can see consumer fetcher threads
>>> for both streams listening to the Kafka topics.
>>>
>>> I can give further details if needed. Any help will be great.
>>>
>>> Thanks
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Strange-behaviour-of-different-SSCs-with-same-Kafka-topic-tp4050.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>
>>
>>
>>
>
>