RE: Optimizing Streaming from Websphere MQ

2015-06-16 Thread Chaudhary, Umesh
Thanks, Akhil, for raising this point. I am also concerned about the MQ
bottleneck.
I currently run 5 receivers, all using an unreliable WebSphere MQ receiver
implementation.
Is there a proven way to convert this implementation into a reliable one?
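
For readers following the reliability question: the Spark custom receiver
guide describes a reliable receiver as one that acknowledges the source only
after the data has been stored, and notes that the multi-record store call
blocks until that has happened. The sketch below models only that ordering in
plain Java. TransactedSource, BlockingStore, InMemorySource and pumpOnce are
hypothetical names invented for illustration; they are not Spark or WebSphere
MQ APIs, and a real receiver would map them onto whatever transaction or
acknowledgement mechanism the MQ client offers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// All names here are hypothetical stand-ins; none of them are Spark or MQ APIs.
final class AckAfterStoreSketch {

    /** Stand-in for a transacted source: messages are only removed on commit(). */
    interface TransactedSource {
        List<String> receiveBatch(int maxRecords);
        void commit();
        void rollback();
    }

    /** Stand-in for a blocking multi-record store that returns once data is safe. */
    interface BlockingStore {
        void storeAll(List<String> records) throws Exception;
    }

    /** Reads one batch, stores it, and acknowledges only after the store succeeded. */
    static boolean pumpOnce(TransactedSource source, BlockingStore store, int maxRecords) {
        List<String> batch = source.receiveBatch(maxRecords);
        if (batch.isEmpty()) {
            return false; // nothing received, nothing to acknowledge
        }
        try {
            store.storeAll(batch); // must complete before the source is told anything
            source.commit();       // acknowledge: the source may now drop the messages
            return true;
        } catch (Exception e) {
            source.rollback();     // store failed: leave the messages with the source
            return false;
        }
    }

    /** In-memory source used only to exercise the sketch. */
    static final class InMemorySource implements TransactedSource {
        private final Deque<String> queue = new ArrayDeque<>();
        private List<String> inFlight = new ArrayList<>();
        int commits = 0;
        int rollbacks = 0;

        InMemorySource(List<String> messages) {
            queue.addAll(messages);
        }

        @Override public List<String> receiveBatch(int maxRecords) {
            inFlight = new ArrayList<>();
            while (inFlight.size() < maxRecords && !queue.isEmpty()) {
                inFlight.add(queue.pollFirst());
            }
            return new ArrayList<>(inFlight);
        }

        @Override public void commit() {
            inFlight = new ArrayList<>();
            commits++;
        }

        @Override public void rollback() {
            // Put uncommitted messages back at the head of the queue, in order.
            for (int i = inFlight.size() - 1; i >= 0; i--) {
                queue.addFirst(inFlight.get(i));
            }
            inFlight = new ArrayList<>();
            rollbacks++;
        }

        int pending() {
            return queue.size();
        }
    }

    public static void main(String[] args) {
        InMemorySource source = new InMemorySource(Arrays.asList("m1", "m2", "m3"));
        List<String> stored = new ArrayList<>();
        boolean committed = pumpOnce(source, stored::addAll, 2);
        System.out.println("stored=" + stored + " committed=" + committed
                + " pending=" + source.pending());
    }
}
```

The point of the ordering is that a failure between receiving and storing
leaves the messages with the source, so they can be delivered again instead
of being lost.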


Regards,
Umesh Chaudhary
From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Tuesday, June 16, 2015 12:44 PM
To: Chaudhary, Umesh
Cc: user@spark.apache.org
Subject: Re: Optimizing Streaming from Websphere MQ

Each receiver occupies one core. So if your network is not the bottleneck, you
can test the consumption speed of the receivers by simply calling
dstream.count().print() to see how many records arrive in each batch. (The same
figure is available in the Streaming tab of the driver UI.) If you spawn 10
receivers on 10 cores, then possibly no processing will happen other than
receiving.
On the other hand, the MQ itself can also be the bottleneck (you could possibly
configure it to achieve more parallelism).

Thanks
Best Regards

On Mon, Jun 15, 2015 at 2:40 PM, Chaudhary, Umesh 
umesh.chaudh...@searshc.com wrote:
Hi Akhil,
Thanks for your response.
I have 10 cores in total across my 3 machines, and I am running 5-10 receivers.
I tested the number of records processed per second while varying the number
of receivers.
With 10 receivers (i.e. one receiver per core), I do not see any performance
benefit.
Is this related to an MQ bottleneck or to the Reliable Receiver?

From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Saturday, June 13, 2015 1:10 AM
To: Chaudhary, Umesh
Cc: user@spark.apache.org
Subject: Re: Optimizing Streaming from Websphere MQ

How many cores are you allocating to your job, and how many receivers do you
have? It would be good if you could post your custom receiver code; it would
help people understand the problem and shed some light on it.

Thanks
Best Regards

On Fri, Jun 12, 2015 at 12:58 PM, Chaudhary, Umesh 
umesh.chaudh...@searshc.com wrote:
Hi,
I have created a custom receiver in Java that receives data from WebSphere MQ,
and I only write the received records to HDFS.

I have consulted several resources on optimizing the speed of a Spark Streaming
application. Here are a few:


• Spark Official:
http://spark.apache.org/docs/latest/streaming-programming-guide.html#performance-tuning

• Virdata: http://www.virdata.com/tuning-spark/

• TD’s Slides (a bit old but useful):
http://www.slideshare.net/spark-project/deep-divewithsparkstreaming-tathagatadassparkmeetup20130617

I found mainly three points applicable to my case:

• Setting the batch interval to 1 sec

• Setting “spark.streaming.blockInterval” to 200ms

• inputStream.repartition(3)
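
As a rough check on the three settings above: the Spark tuning guide states
that the number of tasks per receiver per batch is approximately the batch
interval divided by the block interval. The arithmetic below is a minimal
sketch of that rule using the figures from this message. BlocksPerBatch and
its methods are hypothetical helpers, not Spark APIs, and the result is an
upper bound that assumes every block interval actually produces a block.

```java
// Hypothetical helper, not a Spark API: arithmetic only.
final class BlocksPerBatch {

    /** Approximate blocks (and hence tasks) one receiver yields per batch. */
    static long blocksPerReceiver(long batchIntervalMs, long blockIntervalMs) {
        if (batchIntervalMs <= 0 || blockIntervalMs <= 0) {
            throw new IllegalArgumentException("intervals must be positive");
        }
        return batchIntervalMs / blockIntervalMs;
    }

    /** Approximate input partitions per batch once all receiver streams are combined. */
    static long inputPartitions(int receivers, long batchIntervalMs, long blockIntervalMs) {
        if (receivers < 0) {
            throw new IllegalArgumentException("receivers must not be negative");
        }
        return receivers * blocksPerReceiver(batchIntervalMs, blockIntervalMs);
    }

    public static void main(String[] args) {
        long batchMs = 1000;      // batch interval from the message: 1 sec
        long blockMs = 200;       // spark.streaming.blockInterval from the message
        int repartitionTo = 3;    // argument passed to inputStream.repartition
        long perReceiver = blocksPerReceiver(batchMs, blockMs);
        System.out.println("blocks per receiver per batch: " + perReceiver);
        System.out.println("partitions requested by repartition: " + repartitionTo);
    }
}
```

Under these assumptions a single receiver already yields more blocks per
batch than the 3 partitions requested, so repartition(3) would not be
expected to add parallelism. At 5-10 records/sec the blocks are also tiny,
which suggests looking at the receive path itself rather than at these
settings.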

But none of this improved the actual speed of my receiver, which tops out at
5-10 records/sec. This is far below my expectation.
Am I missing something?

Regards,
Umesh Chaudhary
This message, including any attachments, is the property of Sears Holdings 
Corporation and/or one of its subsidiaries. It is confidential and may contain 
proprietary or legally privileged information. If you are not the intended 
recipient, please delete it without reading the contents. Thank you.



Re: Optimizing Streaming from Websphere MQ

2015-06-16 Thread Akhil Das
Each receiver occupies one core. So if your network is not the bottleneck,
you can test the consumption speed of the receivers by simply calling
*dstream.count().print()* to see how many records arrive in each batch. (The
same figure is available in the Streaming tab of the driver UI.) If you spawn
10 receivers on 10 cores, then possibly no processing will happen other than
receiving.
On the other hand, the MQ itself can also be the bottleneck (you could
possibly configure it to achieve more parallelism).
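
The core accounting behind this can be sketched as follows, assuming the
one-core-per-receiver rule stated above and the 10-core cluster mentioned in
the thread. ReceiverCoreBudget and processingCores are hypothetical names
used only for illustration; they are not part of Spark.

```java
// Hypothetical helper, not part of Spark: models "one core per receiver".
final class ReceiverCoreBudget {

    /** Cores left for batch processing once every receiver has taken one core. */
    static int processingCores(int totalCores, int receivers) {
        if (totalCores < 0 || receivers < 0) {
            throw new IllegalArgumentException("counts must not be negative");
        }
        return Math.max(0, totalCores - receivers);
    }

    public static void main(String[] args) {
        int totalCores = 10; // total across the 3 machines, as quoted in the thread
        for (int receivers : new int[] {5, 10}) {
            System.out.printf("%d receivers on %d cores: %d cores left for processing%n",
                    receivers, totalCores, processingCores(totalCores, receivers));
        }
    }
}
```

With as many receivers as cores, nothing is left to run the batch jobs, which
is consistent with seeing no benefit from going to 10 receivers.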

Thanks
Best Regards




RE: Optimizing Streaming from Websphere MQ

2015-06-15 Thread Chaudhary, Umesh
Hi Akhil,
Thanks for your response.
I have 10 cores in total across my 3 machines, and I am running 5-10 receivers.
I tested the number of records processed per second while varying the number
of receivers.
With 10 receivers (i.e. one receiver per core), I do not see any performance
benefit.
Is this related to an MQ bottleneck or to the Reliable Receiver?



Re: Optimizing Streaming from Websphere MQ

2015-06-12 Thread Akhil Das
How many cores are you allocating to your job, and how many receivers do you
have? It would be good if you could post your custom receiver code; it would
help people understand the problem and shed some light on it.

Thanks
Best Regards
