Hi Qiao, all,

It is interesting to introduce I/O and computation resources into ALTO, so that 
queries can draw on more than network resources and return more accurate 
results. Here is my review of the draft:

>>> Sections 4.2 (“Sharing raw site/cluster information would violate sites' 
>>> privacy constraints.”) and 6 (“How much privacy, including… will be 
>>> exposed?”) note that privacy is a problem when providing this information, 
>>> but the draft does not seem to address it.

Abstract
"In this document, we propose that it is feasible to use existing ALTO services 
to provides not only network information, but also information about other 
resources in science networks including computing and storage."

>>> to provides -> to provide

"ExaO provides simple APIs for users to submit and manage dataset transfer and 
analytic requests and to monitor the status of each request, along with 
fine-grained local and global network and site state information in real-time. "

>>> Too many “and”s; the sentence is hard to read and understand.
 
1.  Introduction
"Applications such as the Production ANd Distributed Analysis system (PanDA) in 
ATLAS and the Physics Experiment Data Export system (PhEDEX) in CMS have been 
developed to manage the data transfers among different cluster sites."
   
>>> have been -> has been

"Section 6 lists several key issues to address in order to realize the proposal 
of providng multi-resource information by ALTO topology services."

>>> providng -> providing

5.2.  Example: encode storage bandwidth into path vector
"Other than network resource, assume in this topology eh1 and eh3 are equipped 
with commodity hard drive disk (HDD) while eh2 and eh4 are equipped with SSD."

>>> HDD is expanded but SSD is not; please expand SSD (solid-state drive) too.
   
"In this example, if we see the end hosts as network elements, the storage I/O 
bandwidth of each host can also be encoded as an abstract element into the 
path-vector."

>>> It would be better to add a new figure that shows the storage I/O bandwidth 
>>> as an abstract element in the topology.
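
>>> As a rough sketch of what such a figure could convey, each host's storage 
>>> I/O can be treated as one more abstract element on the path. The element 
>>> names, the two flows, and all bandwidth values below are my own assumptions 
>>> for illustration; none of them come from the draft.

```python
# Hypothetical capacities in Mbps; these numbers are NOT from the draft.
ane_bandwidth = {
    "ane:link1": 10_000,    # shared network link (assumed)
    "ane:hdd_eh1": 1_000,   # storage I/O of eh1 (HDD), modelled as an element
    "ane:hdd_eh3": 1_000,   # storage I/O of eh3 (HDD)
    "ane:ssd_eh2": 4_000,   # storage I/O of eh2 (SSD)
    "ane:ssd_eh4": 4_000,   # storage I/O of eh4 (SSD)
}

# Path vector: for each (source, destination) flow, the abstract elements it
# traverses. Storage elements sit next to network elements in the same list.
path_vector = {
    ("eh1", "eh2"): ["ane:hdd_eh1", "ane:link1", "ane:ssd_eh2"],
    ("eh3", "eh4"): ["ane:hdd_eh3", "ane:link1", "ane:ssd_eh4"],
}


def bottleneck(src, dst):
    """Return the smallest capacity among the elements on the flow's path."""
    return min(ane_bandwidth[element] for element in path_vector[(src, dst)])
```

>>> With these assumed values, the HDD on the source host limits each flow, 
>>> not the network link, which is the kind of constraint the figure should 
>>> make visible.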

6.  Key Issues
"a large dataset transfer or analytic application always involve many network 
elements in multiple clusters/sites and the absolute number of involved network 
elements keep increasing as the scale of clusters increase."

>>> involve -> involves

"There still lacks of an analytics or experimental understanding on the 
scalability of path-vector and RSA services."

>>> lacks of -> lacks

7.1.  Architecture
"including replica selection, routing path computation and bandwidth 
allocation, and request parallelization decisions, such as which cluster each 
sub-request should be placed at the multi-resource orchestrator."

>>> missing a preposition before “the multi-resource orchestrator” -> in the 
>>> multi-resource…

7.2.1.  User API

>>> Consider adding another API that lets a user list all the requests that 
>>> the user has submitted, in case the requestID is lost.
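
>>> A minimal sketch of the idea, assuming requests are tracked per user. The 
>>> class and method names (RequestStore, submit, list_requests) and the ID 
>>> format are hypothetical and are not part of the ExaO User API in the 
>>> draft.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class RequestStore:
    """In-memory stand-in for the orchestrator's request bookkeeping."""

    _by_user: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def submit(self, user, description):
        """Record a request and return its (hypothetical) request ID."""
        request_id = f"req-{next(self._ids)}"
        self._by_user.setdefault(user, {})[request_id] = description
        return request_id

    def list_requests(self, user):
        """Return the IDs of all requests the user submitted, oldest first."""
        return list(self._by_user.get(user, {}))
```

>>> A user who has lost a requestID could call list_requests with their own 
>>> identity and then use the existing per-request APIs as before.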

7.4.  ALTO Server
"Each ALTO server must provide basic information services as specified in 
[RFC7285] such as network map, cost map, endpoint cost service (ECS), etc.  To 
support the fine-grained multi-resource allocation in ExaO, each ALTO server 
should also provide more fine-grained information about different resources in 
clusters through ALTO extension services such as the routing state abstraction 
[DRAFT-RSA], path vector [DRAFT-PV], network graph [DRAFT-NETGRAPH],multi-cost 
[DRAFT-MC] and cost-calendar [DRAFT-CC] services."
                                
>>> Providing basic information along with fine-grained information may lead 
>>> to overlapping or redundant information, for example between the cost map 
>>> and RSA. This may increase the burden of data transfer. Ways to reduce the 
>>> redundancy, or to send only part of the information, could be considered.

7.7.1.  Orchestration Algorithms
"The modular design of ExaO allows the adoption of different orchestration 
algorithms and methodologies, depending on the specific performance 
requirements."

>>> If a specific algorithm is used, what interface (inputs/outputs…) should 
>>> users specify?

7.7.3.  Example: A Max-Min Fairness Resource Allocation Algorithm

>>> For an example, I think there is too much introduction and description of 
>>> MFRA. Maybe it could be presented as the default algorithm in the system?

8.3.  Constraints of the MFRA Algorithm
"Simply denoting the replica selection as a set of binary constraint will 
significantly increases the computation complexity of the scheduling process."

>>> increases -> increase

