Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12119 )

Change subject: [blog] a blogpost about location awareness in Kudu
......................................................................


Patch Set 8:

(28 comments)

Thank you for the thorough review.

http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md
File _posts/2019-03-25-location-awareness.md:

http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@1
PS8, Line 1: ---
> Can you push this to the gh_pages branch in your github fork so a rendered
Done: 
https://github.com/alexeyserbin/kudu/blob/gh-pages/_posts/2019-03-25-location-awareness.md


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@15
PS8, Line 15: <!--TODO(aserbin) rename the file to reflect the date when 
published -->
> Should this be removed?
I'll remove it once I'm getting +1 or +2 on the latest patchset.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@19
PS8, Line 19: first cut
> initial implementation?
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@19
PS8, Line 19: starting 1.9.0
> ...starting *with the* 1.9.0...
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@20
PS8, Line 20: is built for the following use case:
> I am not sure this is a "use case" per se, but instead what the term "locat
I rephrased it using term 'requirement'.  Does it sound better now?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@26
PS8, Line 26: A rack failure might happen because of a failure of a hardware 
component shared
            : among servers in the rack: network switch, power supply, etc.
> A rack failure can occur when a hardware component shared
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@31
PS8, Line 31: network latency between datacenters is low.
> This is a good opportunity to explicitly mention that this is why we call t
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@37
PS8, Line 37: are
            : supposed to
> should
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@38
PS8, Line 38: physical or cloud-defined hierarchy of the
            : deployed cluster
> I am not sure I understand what this means in relation to location awarenes
I recall trying to introduce 'fault domain' term in this blog , I got feedback 
that we better not introduce it since 1) readers might be not aware of that 
concept, 2) we don't use that term in how design doc.  That's why you don't see 
'failure domain' or 'fault domain' here at all.  If you feel strong about the 
necessity to speak in 'proper' terms here, let me know and I'll introduce those 
terms with corresponding references.

As of now, I just added the examples you suggested.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@41
PS8, Line 41: However, we want to keep the hierarchy
            : there to make it possible to exploit it later
> However, we plan to leverage the hierarchical structure in future releases.
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@43
PS8, Line 43: compatibility with HDFS
> Perhaps this should be moved up and describe a bit more in detail as a desi
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@52
PS8, Line 52: etc
> What is the "etc"? What else does it use it for?
The text below contains information about what those locations are used for, so 
'etc' here is for:
* client finding the closest tablet server to its own location
* other usages of the location hierarchy which we haven't leveraged yet

I removed the 'etc'.  Does it look better to you now?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@55
PS8, Line 55: location string for the specified IP address/hostname.
> The script below specifically shows ip-address. How do I use hostname?
The script below is an example.  You are free to use hostname to perform any 
flavor of mapping 'hostname' --> 'location string' you want.  I added 
corresponding note into the example's comment.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@59
PS8, Line 59: tablet server restarts
> Is this dependent on `--follower_unavailable_considered_failed_sec`? Or wil
It's not dependent of '--follower_unavailable_considered_failed_sec'.  That's 
about registering tablet servers, not tracking tablet replicas.   Any restart 
of a tablet server will cause assignment of location to the tablet server.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@59
PS8, Line 59: Kudu tablet servers are location
            : agnostic, at least for now, so the assigned location is not 
reported back
            : to the tablet server.
> Maybe this paragraph would flow better if you moved this part to the bottom
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@64
PS8, Line 64: masters provide connected clients
> How do they do this?
I added a reference to the design doc.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@62
PS8, Line 62: to try to place replicas evenly across
            : locations and to keep tablets available in case all tablet 
servers in a single
            : location fail.
> This last part is somewhat duplicated from the Introduction section above.
I want to keep it here because it's a bit different context, and I think this 
is some useful sort of 'duplication' that allows to grasp the presented 
information better.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@75
PS8, Line 75: Essentially, that's about having
> This results in...
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@81
PS8, Line 81: The error handling and the input validation are minimalistic. 
Also, the
            : #   network topology choice, supportability and capacity planning 
aspects of
            : #   this script might be sub-optimal if applied as-is for 
real-world use cases.
> Is there anywhere else anyone can get a "good" production worthy example? I
I don't think we should aim for a 'good' production worthy example in this blog 
because too many error handling and other details will make it harder to read 
and understand.

This is an example, and the deficiencies of this script are outlined. Readers 
interested in extending this example and making it closer to 'good' (what's 
'good', BTW?) production-grade script can do that to their taste.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@104
PS8, Line 104:   echo "ERROR: '$ip_address' is not a valid IPv4 address"
> Should errors map to "/other"? How does Kudu handle this script exiting wit
I think errors should map to errors/failures.  That way it's simpler to track 
and fix them.  If the location mapping script returns an error, the tablet 
server's registration fails and the tablet server is not added into the 
master's registry, making it inaccessible to clients.  I added information on 
what happens if location assignment script fails into the text of the blog and 
into this example script.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@142
PS8, Line 142: The reasoning is simple: with
> I try to stay away from saying something is "simple". People have wide leve
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@151
PS8, Line 151: The location-aware placement policy for tablet replicas in Kudu
> This seems more appropriate for earlier sections. When reading the blog pos
Indeed.  I re-shuffled the paragraphs and explicitly mentioned that the policy 
is the only available one and it doesn't have any parameters to configure.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@162
PS8, Line 162: Automatic re-replication and placement policy
> Per my earlier comment, this is also more about "How it works".
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@177
PS8, Line 177: Reinstating location-aware policy in Kudu cluster
> I think this is "How to use it" and makes sense here.
Ack


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@208
PS8, Line 208: Examples
> Per my earlier comment, this is also more about "How it works".
Done


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@337
PS8, Line 337: roadmap
> What roadmap? Does Apache Kudu have a roadmap?
I think the same roadmap that people refer when asking questions like 'are you 
going to implement feature A any soon?'.

There is a request to implement some sort of 'tagging' as JIRA item at 
issues.apache.org.  I think I'll just add this here.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@342
PS8, Line 342: see [2]
> Any reason not to link inline instead of using reference style?
Nothing particular, but I was hoping to get it referenced as a link in [2] as 
well.  Probably, I confused .md format with something else.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@346
PS8, Line 346: [[1]] [Location awareness in Kudu, design 
document](https://s.apache.org/location-awareness-design)
> Can we check this design doc into https://github.com/apache/kudu/tree/maste
Yes, but it will take some time to convert the design doc into .md format and 
pass it through review cycles.  It should be done anyway, so I'll try to do so 
shortly.



--
To view, visit http://gerrit.cloudera.org:8080/12119
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I10b30a80d5661fb889a11285b8118cdea1a87cd2
Gerrit-Change-Number: 12119
Gerrit-PatchSet: 8
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Greg Solovyev <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Tue, 26 Mar 2019 18:06:02 +0000
Gerrit-HasComments: Yes

Reply via email to