Re: Spark Streaming non functional requirements

2021-04-27 Thread Mich Talebzadeh
Forgot to add the following under the non-functional requirements heading:



   - *Supportability and Maintainability*

Someone queried the other day how to shut down a streaming job
gracefully, that is, wait until the current queue, including any backlog,
is drained and all in-flight processing is complete.

I have come back with a suggested solution for implementing this feature
and raised it as a topic on the spark-developers list:

http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-gracefully-shutdown-Spark-Structured-Streaming-tp31171.html

Regardless, this feature needs to be a consideration as well.
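A minimal sketch of that drain-then-stop pattern, in plain Python with no Spark dependency (the flag-file name and the batch simulation are purely illustrative; in a real Structured Streaming job the same check would run between micro-batches, followed by query.stop()):

```python
import os
import tempfile

def run_micro_batches(stop_flag_path, process_batch, max_batches=100):
    """Process micro-batches until a stop-flag file appears.

    The flag is checked *between* batches, so the batch in flight is always
    allowed to finish: the queue drains before the job stops.
    """
    processed = 0
    for batch_id in range(max_batches):
        if os.path.exists(stop_flag_path):
            break  # graceful stop: no batch is interrupted mid-flight
        process_batch(batch_id)
        processed += 1
    return processed

# Simulate an operator requesting shutdown while batch 2 is running.
flag = os.path.join(tempfile.gettempdir(), "stop_streaming_job")
if os.path.exists(flag):
    os.remove(flag)

def process(batch_id):
    if batch_id == 2:
        open(flag, "w").close()  # shutdown requested mid-batch

n = run_micro_batches(flag, process)
os.remove(flag)
print(n)  # → 3: batch 2 still completed before the job stopped
```

In a real deployment the "flag" could equally be a row in a control table or a message on an admin topic; the point is that the signal is polled at a safe batch boundary rather than killing the driver outright.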

HTH


   view my LinkedIn profile




*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Tue, 27 Apr 2021 at 15:16, ashok34...@yahoo.com.INVALID
 wrote:

> Hello Mich
>
> Thank you for your great explanation.
>
> Best
>
> A.
>
> On Tuesday, 27 April 2021, 11:25:19 BST, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>
>
> Hi,
>
> Any design (in whatever framework) needs to consider both functional and
> non-functional requirements. Functional requirements are those related to
> the technical functionality of the system, which we cover daily in this
> forum. A non-functional requirement specifies criteria that can be used to
> judge the operation of a system, rather than specific behaviours. In my
> experience, non-functional requirements are equally important, and in some
> cases they are underestimated when systems go to production. Most
> importantly, they need to cover the following:
>
>
>- *Processing time meeting a service-level agreement (SLA). *
>
>   You can get some of this from the Spark UI. Are you comfortably
> satisfying the requirements? How about total delay, back pressure, etc.?
> Are you within your SLA? In streaming applications, an answer that arrives
> late cannot be considered correct: timeliness is part of the application.
> If we get the right answer too slowly, it becomes useless or wrong. We also
> need to be aware that latency trades off against throughput.
>
>- *Capacity, current and forecast. *
>
>   What is the current capacity? Have you accounted for extra demand, such
> as sudden surges or seasonal loads like year-end? Can your pipeline handle
> 1.6-2 times the current load?
>
>- *Scalability*
>
>   How does your application scale if you have to handle multiple topics,
> or new topics added at later stages? Scalability also includes adding
> nodes on-prem, or having the ability to add more resources such as Google
> Dataproc compute engines, etc.
>
>- *Supportability and Maintainability*
>
>   Have you updated the docs and procedures in Confluence or equivalent, or
> are they typically a year old :)? Is there any single point of failure due
> to skill set? Can ops support the design and maintain business as usual
> (BAU) themselves? How about training and hand-over?
>
>- *Disaster recovery and Fault tolerance*
>
>   What provisions are made for disaster recovery? Is there any single
> point of failure in the design (the end-to-end pipeline)? Are you using
> standalone Docker containers, or Google Kubernetes Engine (GKE) or
> equivalent?
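As a rough way to make the SLA and capacity points measurable, here is a plain-Python check over a record shaped like `query.lastProgress` in Structured Streaming (the `durationMs`/`triggerExecution`, `inputRowsPerSecond` and `processedRowsPerSecond` fields follow the progress JSON; the sample numbers and the 2x surge factor are invented assumptions):

```python
def check_progress(progress, sla_ms, surge_factor=2.0):
    """Judge one micro-batch progress record against an SLA and a surge target."""
    trigger_ms = progress["durationMs"]["triggerExecution"]
    in_rate = progress["inputRowsPerSecond"]
    out_rate = progress["processedRowsPerSecond"]
    headroom = out_rate / in_rate  # how many times the input rate we can absorb
    return {
        "within_sla": trigger_ms <= sla_ms,
        "headroom": headroom,
        "handles_surge": headroom >= surge_factor,
    }

# Sample record in the shape of StreamingQueryProgress (values are invented).
sample = {
    "durationMs": {"triggerExecution": 4200},
    "inputRowsPerSecond": 1500.0,
    "processedRowsPerSecond": 3300.0,
}

result = check_progress(sample, sla_ms=5000, surge_factor=2.0)
print(result["within_sla"], round(result["headroom"], 2), result["handles_surge"])
# → True 2.2 True
```

A check like this can run on every progress record and alert the moment the pipeline falls behind, instead of someone discovering it in the Spark UI after the fact.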
>
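On the scalability point about new topics: the Structured Streaming Kafka source accepts a `subscribePattern` option, so topics matching a naming convention are picked up without a code change. A minimal sketch of the option map (the broker addresses and the `orders_.*` convention are invented for illustration):

```python
# Source options for spark.readStream.format("kafka"); "subscribePattern"
# matches topics against a regex, unlike the fixed comma-separated list
# taken by the "subscribe" option.
kafka_options = {
    "kafka.bootstrap.servers": "broker1:9092,broker2:9092",  # invented hosts
    "subscribePattern": "orders_.*",
    "startingOffsets": "latest",
}

# In a real job (requires pyspark):
# df = spark.readStream.format("kafka").options(**kafka_options).load()
print(sorted(kafka_options))
```

Elasticity of the source is only half the story: the cluster itself (e.g. Dataproc autoscaling, or more executors) still has to absorb the extra partitions.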
> HTH
>
>
>
>
>
> On Mon, 26 Apr 2021 at 17:16, ashok34...@yahoo.com.INVALID
>  wrote:
>
> Hello,
>
> When we design a typical Spark streaming process, the focus is on the
> functional requirements.
>
> However, I have been asked to provide non-functional requirements as well.
> Likely things I can consider are fault tolerance and reliability (component
> failures). Is there a standard list of non-functional requirements for
> Spark streaming that one needs to consider and verify?
>
> Thanking you
>
>

