Re: How to use extract string with contains a certain pattern?

2019-09-03 Thread miran
> xmltree could handle it, so ... bug report please.

It is a known issue: see
[https://github.com/nim-lang/Nim/issues/1034](https://github.com/nim-lang/Nim/issues/1034)
and
[https://github.com/nim-lang/Nim/issues/11713](https://github.com/nim-lang/Nim/issues/11713)


Re: How to use extract string with contains a certain pattern?

2019-09-03 Thread Araq
Well, the HTML is incorrect and should contain `&amp;`, but xmltree could handle it,
so ... bug report please.


Re: How to use extract string with contains a certain pattern?

2019-09-03 Thread anta40
I tried @filip's and @SolitudeSF's solutions.

Both give the same result, which is slightly incorrect: 



https://instagram.fcgk12-1.fna.fbcdn.net/vp/ff6ac1dc7b428f4177e1d34989c82765/5E13D4FF/t51.2885-15/e35/p1080x1080/67664270_675154659649319_4991162461801475991_n.jpg?_nc_ht=instagram.fcgk12-1.fna.fbcdn.net_nc_cat=1



When the URL is opened in a browser, it gives "URL signature mismatch".

The correct URL is: 



https://instagram.fcgk12-1.fna.fbcdn.net/vp/ff6ac1dc7b428f4177e1d34989c82765/5E13D4FF/t51.2885-15/e35/p1080x1080/67664270_675154659649319_4991162461801475991_n.jpg?_nc_ht=instagram.fcgk12-1.fna.fbcdn.net&_nc_cat=1



Notice that there's a **&** after fna.fbcdn.net. Maybe this is an xmltree bug?
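Until the parser issue is sorted out, one possible workaround is to skip the HTML parser for this one value and pull the attribute out of the raw page source with `strutils`. This is only a rough sketch: `ogImageUrl` is a made-up helper name, the sample markup is invented, and it assumes the `content` attribute comes after `property` inside the same tag, which may not hold for every page.

```nim
import strutils

# Hypothetical helper (not part of any library): return the content value of
# the first og:image meta tag found in the raw HTML, or "" if none is found.
proc ogImageUrl(html: string): string =
  const marker = "property=\"og:image\""
  const attrPrefix = "content=\""
  let tagPos = html.find(marker)
  if tagPos < 0: return ""
  # Assumption: content="..." follows property="og:image" in the same tag.
  let prefixPos = html.find(attrPrefix, tagPos)
  if prefixPos < 0: return ""
  let valueStart = prefixPos + attrPrefix.len
  let valueEnd = html.find('"', valueStart)
  if valueEnd < 0: return ""
  # A bare & is kept as-is; an escaped &amp; is decoded back to &.
  result = html[valueStart ..< valueEnd].replace("&amp;", "&")

when isMainModule:
  # Invented sample input, not the real Instagram page.
  let sample = """<meta property="og:image" content="https://example.com/a.jpg?x=1&y=2" />"""
  echo ogImageUrl(sample)
```

Since the string is never run through the parser, the `&` in the query string should survive untouched.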


Re: How to use extract string with contains a certain pattern?

2019-09-03 Thread SolitudeSF
this works


import httpclient, xmltree, htmlparser, strtabs
import nimquery

var client = newHttpClient()
var url = "https://www.instagram.com/p/B1oqkXKFlcD"
var htmlsrc = client.getContent(url)

let xml = parseHtml(htmlsrc)
let elements = xml.querySelectorAll("[property='og:image']")

for e in elements:
  echo e.attrs["content"]




Re: How to use extract string with contains a certain pattern?

2019-09-03 Thread filip
Try this 


import httpclient
import xmltree
import htmlparser
import strtabs

var client = newHttpClient()
var url = "https://www.instagram.com/p/B1oqkXKFlcD"
var htmlsrc = client.getContent(url)

let xml = parseHtml(htmlsrc)
# Print the content attribute of every <meta property="og:image"> element.
for meta in xml.findAll("meta"):
  if meta.attrs.hasKey("property") and meta.attrs["property"] == "og:image":
    echo "URL: ", meta.attrs["content"]




Re: How to use extract string with contains a certain pattern?

2019-09-03 Thread anta40
OK. Another attempt:


import httpclient
import re
import xmltree
import htmlparser
import streams
import nimquery
import strutils

var client = newHttpClient()
var url = "https://www.instagram.com/p/B1oqkXKFlcD"
var htmlsrc = client.getContent(url)

let xml = parseHtml(newStringStream(htmlsrc))
let elements = xml.querySelectorAll("meta")

for x in 0 .. elements.len-1:
  if contains(elements[x].text, "og:image"):
    echo elements[x]



My intention is to print only the line which contains **og:image**.

It crashes, unfortunately: 


fatal.nim(39)sysFatal
Error: unhandled exception: xmltree.nim(176, 10) `n.k in {xnText, xnComment, xnCData, xnEntity}`  [AssertionError]
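The assertion message suggests that `text` is only valid on text-like nodes (`xnText`, `xnComment`, `xnCData`, `xnEntity`), while `meta` tags are element nodes. A small sketch of the difference, using hand-built nodes instead of the downloaded page (the node contents here are invented for illustration):

```nim
import xmltree

# Build a <meta> element by hand; the attribute values are made up.
let meta = newElement("meta")
meta.attrs = {"property": "og:image",
              "content": "https://example.com/pic.jpg"}.toXmlAttributes

# An element node: calling meta.text here would trip the same assertion.
doAssert meta.kind == xnElement

# attr() reads an attribute and returns "" when it is missing.
echo meta.attr("property")
echo meta.attr("content")

# text is fine on an actual text node.
let t = newText("hello")
doAssert t.kind == xnText
echo t.text
```

So for the `og:image` case, checking `attr("property")` on each element is probably what was intended rather than `text`.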




Re: How to use extract string with contains a certain pattern?

2019-09-03 Thread SolitudeSF
Use parseHtml from htmlparser together with xmltree to get a tree structure that you can
then traverse yourself, or use something like
[https://github.com/GULPF/nimquery](https://github.com/GULPF/nimquery) to find
the element you need.
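A minimal sketch of the first option, using only the standard library and an inline HTML string so it runs without network access. The sample markup and the `found` variable are made up for illustration:

```nim
import htmlparser, xmltree

# Invented sample document standing in for the downloaded page.
let html = """<html><head>
<meta property="og:title" content="Example" />
<meta property="og:image" content="https://example.com/pic.jpg" />
</head><body></body></html>"""

let tree = parseHtml(html)

# Collect the content attribute of every og:image meta element.
var found: seq[string] = @[]
for meta in tree.findAll("meta"):
  # attr() returns "" for a missing attribute, so no separate hasKey check.
  if meta.attr("property") == "og:image":
    found.add meta.attr("content")

echo found
```

With nimquery the loop would be replaced by a CSS-style selector, as in the other replies in this thread.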