Yes, look at KafkaUtils.createRDD
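
For example, a minimal sketch against the Spark 1.3 direct Kafka API (the
broker list, topic name, and offset values below are placeholders):

import kafka.serializer.StringDecoder
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

val sc = new SparkContext(new SparkConf().setAppName("replay-failed-posts"))
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

// One range per message to replay: topic, partition,
// fromOffset (inclusive), untilOffset (exclusive).
val ranges = Array(OffsetRange("original-topic", 0, 42L, 43L))

// Reads exactly those ranges from Kafka as a batch (non-streaming) RDD.
val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
  sc, kafkaParams, ranges)

rdd.foreach { case (key, msg) => println(msg) /* retry the post here */ }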

On Wed, Jul 22, 2015 at 11:17 AM, Shushant Arora <shushantaror...@gmail.com>
wrote:

> Thanks!
>
> I am using Spark Streaming 1.3, and if some post fails for any reason, I
> will store the offset of that message in another Kafka topic. I want to
> read these offsets in another Spark job and then fetch the original Kafka
> topic's messages at those offsets.
> So is it possible in a Spark job to get Kafka messages at arbitrary
> offsets? Or is there a better alternative for handling failed post
> requests?
>
> On Wed, Jul 22, 2015 at 11:31 AM, Tathagata Das <t...@databricks.com>
> wrote:
>
>> Yes, you could unroll the iterator in batches of 100-200 records and then
>> post them in multiple rounds.
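>>
>> A rough sketch of that unrolling inside foreachPartition (postBatch is a
>> hypothetical helper that issues one HTTP POST per batch of records):
>>
>> stream.foreachRDD { rdd =>
>>   rdd.foreachPartition { iter =>
>>     // grouped() pulls at most 200 records at a time, so the whole
>>     // partition is never materialized in memory at once.
>>     iter.grouped(200).foreach(batch => postBatch(batch))
>>   }
>> }
>>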
>> If you are using the Kafka receiver-based approach (not Direct), then the
>> raw Kafka data is stored in executor memory. If you are using the Direct
>> Kafka approach, the data is read from Kafka directly at the time of
>> filtering.
>>
>> TD
>>
>> On Tue, Jul 21, 2015 at 9:34 PM, Shushant Arora <
>> shushantaror...@gmail.com> wrote:
>>
>>> I can post multiple items at a time.
>>>
>>> Data is being read from Kafka and filtered; after that it is posted. Does
>>> foreachPartition load the complete partition into memory, or does it use
>>> an iterator over batches under the hood? If the complete partition is not
>>> loaded, would using a custom batch size of 100-200 records per post
>>> request help, instead of posting the whole partition?
>>>
>>> On Wed, Jul 22, 2015 at 12:18 AM, Tathagata Das <t...@databricks.com>
>>> wrote:
>>>
>>>> If you can post multiple items at a time, then use foreachPartition to
>>>> post the whole partition in a single request.
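>>>>
>>>> For example (a sketch; postAll is a hypothetical helper that sends the
>>>> whole collection in one request):
>>>>
>>>> stream.foreachRDD { rdd =>
>>>>   rdd.foreachPartition { iter =>
>>>>     postAll(iter.toSeq) // one POST per partition
>>>>   }
>>>> }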
>>>>
>>>> On Tue, Jul 21, 2015 at 9:35 AM, Richard Marscher <
>>>> rmarsc...@localytics.com> wrote:
>>>>
>>>>> You can certainly create threads in a map transformation; we do this,
>>>>> for example, to perform concurrent DB lookups within one stage. I would
>>>>> recommend, however, switching from map to mapPartitions, as that lets
>>>>> you create a fixed-size thread pool shared across all items in a
>>>>> partition, rather than spawning a future per record in the RDD.
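>>>>>
>>>>> A rough sketch of that pattern (lookup is a hypothetical call to the
>>>>> external system; the pool size of 8 is arbitrary):
>>>>>
>>>>> import java.util.concurrent.Executors
>>>>> import scala.concurrent.{Await, ExecutionContext, Future}
>>>>> import scala.concurrent.duration._
>>>>>
>>>>> rdd.mapPartitions { iter =>
>>>>>   // One fixed-size pool per partition, shared by all of its records.
>>>>>   val pool = Executors.newFixedThreadPool(8)
>>>>>   implicit val ec = ExecutionContext.fromExecutorService(pool)
>>>>>   // Launch all lookups concurrently, then wait for each result.
>>>>>   val futures = iter.map(r => Future(lookup(r))).toList
>>>>>   val results = futures.map(f => Await.result(f, 30.seconds))
>>>>>   pool.shutdown()
>>>>>   results.iterator
>>>>> }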
>>>>>
>>>>> On Tue, Jul 21, 2015 at 4:11 AM, Shushant Arora <
>>>>> shushantaror...@gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> Can I create user threads in executors?
>>>>>> I have a streaming app where, after processing, I need to push events
>>>>>> to an external system. Each post request costs ~90-100 ms.
>>>>>>
>>>>>> To make the posts parallel, I cannot rely on the task thread alone,
>>>>>> because task parallelism is limited by the number of cores available in
>>>>>> the system. Can I use user threads in a Spark app? I tried creating 2
>>>>>> threads in a map task and it worked.
>>>>>>
>>>>>> Is there any upper limit on the number of user threads in a Spark
>>>>>> executor? Is it a good idea to create user threads in a Spark map task?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Richard Marscher*
>>>>> Software Engineer
>>>>> Localytics
>>>>> Localytics.com <http://localytics.com/> | Our Blog
>>>>> <http://localytics.com/blog> | Twitter <http://twitter.com/localytics>
>>>>>  | Facebook <http://facebook.com/localytics> | LinkedIn
>>>>> <http://www.linkedin.com/company/1148792?trk=tyah>
>>>>>
>>>>
>>>>
>>>
>>
>
