On 08/08/2018 10:14 AM, Richard Owlett wrote:
On 08/08/2018 09:50 AM, Paul Heinlein wrote:
[snip]
jq is a great tool. I don't know if it by itself could find
duplicates, but you could use its sort_by() routine in conjunction
with uniq to do so.
I had seen those in my reading so assumed what I wanted was doable.
My understanding of jq's logic is limited, however, so I can't offer
an example without knowing the exact format of the JSON you're seeing.
I suspect the only thing that can be said is that it is syntactically correct.
It represents a tree structure and the duplicates may be at any depth.
That was what caused me to ask for tutorial suggestions. The samples given
were structurally too simple.
I just realized I know an important detail.
All the objects, no matter how deeply nested, have the form:
{
"guid": "yb3hKkKKqQSh",
"title": "Debootstrap (Shallow Thoughts)",
"index": 1,
"dateAdded": 1487768687157000,
"lastModified": 1487768687157000,
"id": 34488,
"iconuri": "http://shallowsky.com/favicon.ico",
"type": "text/x-moz-place",
"uri": "http://shallowsky.com/blog/linux/install/debootstrap.html"
},
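Since the duplicates may sit at any depth, a recursive walk over the tree handles the nesting. Here is a minimal sketch in Python (rather than jq or Tcl), which collects the "title" of every object of the form above. The SAMPLE data and the "children" key are assumptions based on the usual layout of Firefox bookmark backups; substitute your actual file.

```python
import json

# Hypothetical sample mimicking a bookmark backup: containers nest
# further objects under a "children" key (an assumption here).
SAMPLE = """
{
  "guid": "root", "title": "menu", "type": "text/x-moz-place-container",
  "children": [
    {"guid": "yb3hKkKKqQSh", "title": "Debootstrap (Shallow Thoughts)",
     "type": "text/x-moz-place",
     "uri": "http://shallowsky.com/blog/linux/install/debootstrap.html"},
    {"guid": "abc", "title": "subfolder", "type": "text/x-moz-place-container",
     "children": [
       {"guid": "def", "title": "Debootstrap (Shallow Thoughts)",
        "type": "text/x-moz-place",
        "uri": "http://shallowsky.com/blog/linux/install/debootstrap.html"}
     ]}
  ]
}
"""

def places(node):
    """Yield every text/x-moz-place object, no matter how deeply nested."""
    if isinstance(node, dict):
        if node.get("type") == "text/x-moz-place":
            yield node
        for child in node.get("children", []):
            yield from places(child)
    elif isinstance(node, list):
        for item in node:
            yield from places(item)

tree = json.loads(SAMPLE)
titles = [p["title"] for p in places(tree)]
print(titles)
```

For a real backup you would read the file instead of the inline string, e.g. `tree = json.load(open("bookmarks.json"))` (filename hypothetical).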
So I'll use jq to pretty-print to a file, edit that file to remove all
occurrences of " ", then with my preferred tool search for lines beginning
'  "title": ' and emit the rest of each line. Analyze that for unique
titles, then create a separate file of duplicates for purging. I've done
that before. It will just take time, and it will be an exercise to hone my
Tcl skills ;/
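The analyze-for-duplicates step can be sketched without the manual editing. This Python fragment is a stand-in for whatever the search tool emits: the title list here is hypothetical, and in practice it would be the titles pulled out of the bookmark file. (If your jq is recent enough, its `group_by` builtin expresses roughly the same idea.)

```python
from collections import Counter

# Hypothetical title list, standing in for the output of the
# search-and-emit step described above.
titles = [
    "Debootstrap (Shallow Thoughts)",
    "Some Other Page",
    "Debootstrap (Shallow Thoughts)",
]

# Count each title; any title seen more than once is a duplicate
# candidate for the purge file.
counts = Counter(titles)
duplicates = sorted(t for t, n in counts.items() if n > 1)
print(duplicates)
```

Writing `duplicates` out one per line would give the separate file of duplicates for purging.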
Another part of the problem is that I'm a bit more familiar with the
implementation of JMESPath queries in AWS than with jq.
_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug