> On June 16, 2014, 5:45 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/jobs.py, lines 181-182
> > <https://reviews.apache.org/r/22457/diff/3/?file=609592#file609592line181>
> >
> >     I don't think it's enough to json-serialize a thrift task. This is 
> > bound to set/dict serialization sorting problem we have battled with in 
> > updater.py. Specifically this block: 
> > https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/api/updater.py#L200-L223
> >     
> >     Without sorting, the diff will only work 99% of the time. This has been 
> > demonstrated in a few real-life job updates.
> 
> Mark Chu-Carroll wrote:
>     This isn't a new thing in this updated diff system - this is the current 
> behavior.
>     
>     If you use the new json-tree-diff, you won't have this problem; but for 
> people who rely on the current diff
>     behavior, I'd rather not change that.
>
> 
> Maxim Khutornenko wrote:
>     | If you use the new json-tree-diff, you won't have this problem; 
>     
>     Perhaps I am missing something but how would the new approach ensure 
> equivalence of these two structs?
>     { 'set1': { ['k1':'v1', 'k2':'v2']} }
>     { 'set2': { ['k2':'v2', 'k1':'v1']} }
>     
>     It's "normal" to see element re-ordering like this during thrift struct 
> serialization.
> 
> Maxim Khutornenko wrote:
>     Here is a real TaskConfig json fragment and its reordered twin:
>     
>     1: u'constraints': [{u'name': u'value', u'constraint': {u'values': [u'1', 
> u'2']}}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
>     2: u'constraints': [{u'constraint': {u'values': [u'2', u'1']}, u'name': 
> u'value'}, {u'name': u'limit', u'constraint': {u'limit': {u'limit': 10}}}]
> 
> Mark Chu-Carroll wrote:
>     I don't think that the reordering that you show in your first example 
> isn't valid json. Key-value pairs belong in a dictionary, not a list.  In a 
> dictionary, the comparison routine does the right thing: it compares key by 
> key, without regard to ordering.
>     
>     In the second example: I'd argue that that's actually a bug in our 
> serializer: it's using an ordered list for an unordered constraint. Diff 
> can't tell that the list structure isn't really supposed to be ordered; the 
> serializer should be canonicalizing it.
>
> 
> Maxim Khutornenko wrote:
>     Not sure how else you could represent a python set in JSON without 
> completely changing the underlying data structure (e.g. emitting fake keys 
> for every element and converting set into a dictionary with sorted keys). 
> Agree, the TJSONProtocol could be a bit smarter when dealing with python sets 
> but the reality is that python sets are inherently "unhashable" structs. I 
> have seen cases when completely repr-equivalent python sets were not 
> comparing as equal.
> 
> Mark Chu-Carroll wrote:
>     All the serializer would need to do is just canonicalize when it converts 
> to a list. It's not an issue of being unhashable. But it should at least 
> bother to be *uniform*. If you're converting a set to a list, chose a 
> convention for ordering elements. Put 'em in lexicographic order, problem 
> solved.
>     
>     As it stands, that problem is unsolvable for this diff tool: there's no 
> way to tell that a given field was supposed to be a set, and should be 
> compared without ordering. 
>     
>     I'm not sure what you'd like me to do here, short of giving up and 
> throwing away the new diff tool. I don't see any way to solve this, short of 
> creating something that uses thrift introspection.
>
> 
> Maxim Khutornenko wrote:
>     We solved this problem in updater diff before (see the link above) and I 
> am sure it can be solved here. Converting to sorted tuples does not produce a 
> pretty diff but if we are targeting value-comparison diff output rather than 
> serialized string compare the problem shifts into pretty-fying the sorted 
> tuple output rather than handling diff mechanics.

Not sure what you mean by "converting to sorted tuples". Is there any 
documentation of what the "sorted tuple" format looks like? Or do I just need 
to reverse-engineer? (I hate doing that, because if I make any wrong 
assumptions, things get ugly.)


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22457/#review45835
-----------------------------------------------------------


On June 18, 2014, 10:59 a.m., Mark Chu-Carroll wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22457/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 10:59 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Brian Wickman.
> 
> 
> Bugs: aurora-520
>     https://issues.apache.org/jira/browse/aurora-520
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add a new diff method, which uses field-by-field comparison of JSON trees for 
> comparing running job configurations to potentially updated configs.
> 
> - Allow exclusion of semantically irrelevant fields.
> - Provide a clearer list of the differences between configs.
> - Provide a scripting-friendly alternative JSON syntax for diffs.
> 
> The old diff behavior is still available under the "--use-shell-diff" option.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/client/cli/BUILD 
> ebe681a0d1735b7cc695dc3b7a14c4292d87ae32 
>   src/main/python/apache/aurora/client/cli/jobs.py 
> 4fa03a6c9919651551238b0dc211ed69a8dfe565 
>   src/main/python/apache/aurora/client/cli/json_tree_diff.py PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/BUILD 
> 3c88ed7cf9f654bbbd80d1d44aa1dd1c8655e378 
>   src/test/python/apache/aurora/client/cli/test_diff.py 
> 38629b63c082cf81cb891dace2a70d9e8f418e18 
>   src/test/python/apache/aurora/client/cli/test_json_diff.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22457/diff/
> 
> 
> Testing
> -------
> 
> New unit tests of the JSON tree diff code, plus a bunch of new "job diff" 
> tests of the new functionality.
> All tests pass.
> 
> 
> Thanks,
> 
> Mark Chu-Carroll
> 
>

Reply via email to