On 08/08/2018 10:14 AM, Richard Owlett wrote:
On 08/08/2018 09:50 AM, Paul Heinlein wrote:
[snip]

jq is a great tool. I don't know if it by itself could find duplicates, but you could use its sort_by() routine in conjunction with uniq to do so.

I had seen those in my reading, so I assumed what I wanted was doable.


My understanding of jq's logic is limited, however, so I can't offer an example without knowing the exact format of the JSON you're seeing.

I suspect the only thing that can be said is that it is syntactically correct.
It represents a tree structure, and the duplicates may be at any depth.
That was what prompted me to ask for tutorial suggestions. The samples given were too simple structurally.


I just realized I know an important detail.
All the objects, no matter how deeply nested, have the form:
{
 "guid": "yb3hKkKKqQSh",
 "title": "Debootstrap (Shallow Thoughts)",
 "index": 1,
 "dateAdded": 1487768687157000,
 "lastModified": 1487768687157000,
 "id": 34488,
 "iconuri": "http://shallowsky.com/favicon.ico",
 "type": "text/x-moz-place",
 "uri": "http://shallowsky.com/blog/linux/install/debootstrap.html"
},

So I'll use jq to pretty-print to a file, then edit that file to remove all occurrences of " ". Then, with my preferred tool, search for lines beginning ' "title": ' and emit the rest of the line. Analyze that for unique titles, then create a separate file of duplicates for purging. I've done that sort of thing before. It will just take time, and it will be an exercise to hone my Tcl skills ;/
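For what it's worth, the pretty-print/grep/analyze steps above can also be collapsed into a single pass over the JSON tree. Here is a minimal Python sketch (not jq or Tcl, just an illustration of the idea): it walks objects at any depth, collects every "title" value, and reports the ones that occur more than once. The file name in the usage comment is hypothetical; substitute your actual bookmarks export.

```python
import json
from collections import Counter

def walk_titles(node):
    """Recursively yield every "title" value in the JSON tree,
    regardless of nesting depth."""
    if isinstance(node, dict):
        if "title" in node:
            yield node["title"]
        for value in node.values():
            yield from walk_titles(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk_titles(item)

def duplicate_titles(tree):
    """Return a dict of titles that occur more than once,
    mapped to their occurrence counts."""
    counts = Counter(walk_titles(tree))
    return {title: n for title, n in counts.items() if n > 1}

# Usage (assumes the export is saved as "bookmarks.json" -- a
# hypothetical name, adjust to taste):
#   with open("bookmarks.json") as f:
#       tree = json.load(f)
#   for title, n in sorted(duplicate_titles(tree).items()):
#       print(n, title)
```

The same per-title logic would apply to "uri" or "guid" if those turn out to be the better duplicate keys.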


Another part of the problem is that I'm a bit more familiar with the implementation of JMESPath queries in AWS than in jq.

_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug