Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-26 Thread David Wright
On Sat 23 Mar 2024 at 11:55:04 (-0400), Greg Wooledge wrote:
> On Sat, Mar 23, 2024 at 09:54:05AM -0500, Albretch Mueller wrote:
> > a) using a chromium-derived browser, which can be used to dump the
> > HAR file log of the network back and forth, go, e. g.:
> >

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread Albretch Mueller
> Archive.org has a well-documented API at
> https://archive.org/developers/. There's even a command-line tool
> (assuming one doesn't want to use, say, the python library).
I had given a somewhat thorough reading to their API some time ago, but didn’t find anything that interesting and I was

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread Greg Wooledge
On Sat, Mar 23, 2024 at 02:05:06PM -0500, Albretch Mueller wrote:
> Actually, in order to deX-Y it in case anyone can offer any help, it
> is more like "I want an index of all the books which have ever been
> written/published" in order to read all of them ;-)
First of all, you will not achieve

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread Albretch Mueller
Greg Wooledge via lists.debian.org
>Furthermore, whatever method you are using to *create* this HAR file
>is questionable, since apparently you aren't even getting a properly
>formatted file in the end.
>So, putting these together, it looks like you are taking a file that
>was intended to

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread Darac Marjal
On 23/03/2024 16:34, Greg Wooledge wrote: On Sat, Mar 23, 2024 at 11:55:04AM -0400, Greg Wooledge wrote: On Sat, Mar 23, 2024 at 09:54:05AM -0500, Albretch Mueller wrote: 1) That HAR file is not properly formatted. Instead of "attribute":value pairs in the standard way, they have used front
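
One possible reading of the "slash + quote pairs" (the sample quoted elsewhere in this thread shows backslashes): a HAR file is itself JSON, and each captured response body is stored as a string under log.entries[].response.content.text, so a JSON body appears in the raw file with its quotes escaped. The sketch below assumes that layout; the one-entry HAR, its URL and the helper name response_bodies are made up for illustration and are not taken from the poster's file.

```python
import base64
import json


def response_bodies(har):
    """Yield (url, body_text) for every entry that recorded a response body."""
    for entry in har["log"]["entries"]:
        content = entry["response"].get("content", {})
        text = content.get("text")
        if text is None:
            continue  # no body was captured for this entry
        if content.get("encoding") == "base64":
            # Bodies flagged this way are stored base64-encoded.
            text = base64.b64decode(text).decode("utf-8", errors="replace")
        yield entry["request"]["url"], text


# Hypothetical one-entry HAR, built here only to show the quoting effect.
body = json.dumps({"fields": {"identifier": "bub_gb_O2EAMAAJ"}})
har_text = json.dumps({
    "log": {"entries": [{
        "request": {"url": "https://example.org/search"},
        "response": {"content": {"mimeType": "application/json", "text": body}},
    }]}
})

# In the raw file text the embedded body carries backslash-escaped quotes.
print(r'\"identifier\"' in har_text)

# Parsing the HAR first, then the body, removes the escaping cleanly.
for url, text in response_bodies(json.loads(har_text)):
    print(url, json.loads(text)["fields"]["identifier"])
```

Parsing in two passes like this avoids grepping escaped lines out of the file by hand.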

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread Greg Wooledge
On Sat, Mar 23, 2024 at 11:55:04AM -0400, Greg Wooledge wrote:
> On Sat, Mar 23, 2024 at 09:54:05AM -0500, Albretch Mueller wrote:
> > 1) That HAR file is not properly formatted. Instead of
> > "attribute":value pairs in the standard way, they have used front
> > slash + quote pairs (instead of

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread Greg Wooledge
On Sat, Mar 23, 2024 at 09:54:05AM -0500, Albretch Mueller wrote:
> a) using a chromium-derived browser, which can be used to dump the
> HAR file log of the network back and forth, go, e. g.:
> https://en.wikipedia.org/wiki/Anaxagoras
> b) click on the link that says: "Works by or about

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread Albretch Mueller
>On Sat, Mar 23, 2024 at 1:44 AM wrote:
>> On Sat, Mar 23, 2024 at 12:53:24AM -0500, Albretch Mueller wrote:
>> out of a HAR file containing lots of obfuscating js cr@p and all kinds of
>> nonsense I was able to extract line looking like:
>It's not "js cr@p", It is called JSON. And there's a

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread David Christensen
On 3/22/24 22:53, Albretch Mueller wrote: out of a HAR file containing lots of obfuscating js cr@p and all kinds of nonsense I was able to extract line looking like: var00='{\"index\":\"prod-h-006\",\"fields\":{\"identifier\":\"bub_gb_O2EAMAAJ\",\"title\":\"Die Wissenschaft vom subjectiven
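
A minimal sketch of undoing the escaping on a line like the one quoted above, assuming the text is a JSON document that was stored inside a JSON string. The sample is abridged: the title is cut off in the archive snippet and is left cut off here, the closing quote and braces are assumed, and decode_escaped_json is a hypothetical helper name.

```python
import json

# Abridged sample in the shape shown in the post. The title is truncated in
# the archive and stays truncated; the closing quote and braces are assumed.
raw = r'{\"index\":\"prod-h-006\",\"fields\":{\"identifier\":\"bub_gb_O2EAMAAJ\",\"title\":\"Die Wissenschaft vom subjectiven\"}}'


def decode_escaped_json(text):
    """Decode JSON text that was itself stored inside a JSON string."""
    # Wrapping the text in quotes makes it a JSON string literal, so the
    # inner loads() undoes one level of escaping and the outer loads()
    # parses the resulting object.
    return json.loads(json.loads('"' + text + '"'))


record = decode_escaped_json(raw)
print(record["index"])
print(record["fields"]["identifier"])
```

Letting the JSON parser remove the escapes is usually safer than substring replacement, since it also handles nested escapes such as an escaped backslash.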

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread mgr...@grant.org
[...]\":797368506},\"_score\":[50.629513]} | jq '.fields.identifier + "|" + .fields.title'
jq is an amazing tool; it's a full-fledged programming language. You just need to continue concatenating your desired output. You might even find you can do what you want all inside a jq scr
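
For comparison, a rough Python equivalent of the jq filter quoted above, assuming each input line is already valid, unescaped JSON with a "fields" object. The sample record is abridged and its title is truncated as in the archive; identifier_and_title is a hypothetical helper name.

```python
import json


def identifier_and_title(line):
    """Mirror the jq filter: .fields.identifier + "|" + .fields.title"""
    fields = json.loads(line)["fields"]
    return fields["identifier"] + "|" + fields["title"]


# Abridged sample record; the title is cut off in the archive snippet.
sample = (
    '{"fields": {"identifier": "bub_gb_O2EAMAAJ", '
    '"title": "Die Wissenschaft vom subjectiven"}, "_score": [50.629513]}'
)

print(identifier_and_title(sample))
```

Note that jq prints the result as a quoted JSON string unless it is run with -r (raw output), whereas the function here returns the bare text.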

Re: trying to parse lines from an awkwardly formatted HAR file ...

2024-03-23 Thread tomas
On Sat, Mar 23, 2024 at 12:53:24AM -0500, Albretch Mueller wrote:
> out of a HAR file containing lots of obfuscating js cr@p and all kinds of
> nonsense I was able to extract line looking like:
It's not "js cr@p", It is called JSON. And there's a spec for it.
[...]
> I have tried substring