Raj Dodhiawala wrote:
> Question: how can I save the information scraped by Solvent to a DB or 
> even a tab/comma separated file?
>
> Basically, I guess I could parse the (structured) output for further 
> processing -- the .turtle file but that's kind of a pain for the 
> number of different page formats I might be scraping.
>
> Need some hints. Perhaps I can take the generated code and run it from 
> within my <javascript / ???> so each extracted "record" can be saved 
> to a DB. Am I even in the ball park of possibilities with Solvent?
>
>
The most obvious way that springs to mind, for me at least, is to save 
your scraped data to PiggyBank and then export it as RDF/XML, which 
you can run through Babel to get Exhibit JSON, N3 or RSS (I'm not 
sure which is best for your purposes). If you then make an Exhibit from 
the JSON, you can export it as tab-separated values.
Strange that Exhibit offers more export options than Babel ...
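
If building an Exhibit just to get at the TSV export feels roundabout, 
the Exhibit JSON that Babel produces could probably be flattened 
directly. Here is a minimal Python sketch, assuming the file has the 
usual top-level "items" list of flat objects. The function name and the 
";" separator for multi-valued properties are my own choices, not 
anything Babel or Exhibit defines.

```python
import csv
import io
import json


def exhibit_json_to_tsv(exhibit_json: str) -> str:
    """Flatten the "items" list of an Exhibit JSON document into TSV text."""
    items = json.loads(exhibit_json).get("items", [])

    # Collect every property name that appears, keeping first-seen order.
    columns = []
    for item in items:
        for key in item:
            if key not in columns:
                columns.append(key)

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for item in items:
        row = []
        for key in columns:
            value = item.get(key, "")
            # Multi-valued properties are lists; join them
            # (";" is an arbitrary choice for this sketch).
            if isinstance(value, list):
                value = ";".join(str(v) for v in value)
            row.append(value)
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = '{"items": [{"label": "Widget", "type": "Product", "price": 9.5}]}'
    print(exhibit_json_to_tsv(sample), end="")
```

Items that lack a property simply get an empty cell, so pages scraped 
in different formats can share one file.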

Another possibility might be to write a scraper that writes the data 
out, in the format you want, to a form textarea inside a new window. 
The form would POST to a server-side script you'd write, which accepts 
the data and saves it to a database or file.
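
The server-side half of that idea could be quite small. Below is a 
rough Python sketch using only the standard library, assuming the 
textarea is named "data" and holds one tab-separated record per line. 
The field name, the output file "scraped.tsv", the port and the helper 
names are all made up for illustration; nothing here comes from Solvent.

```python
import csv
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

OUTPUT_PATH = "scraped.tsv"  # hypothetical output file


def append_posted_records(body: str, path: str) -> int:
    """Append the tab-separated lines from the form's "data" field to path."""
    fields = parse_qs(body)
    text = fields.get("data", [""])[0]  # "data" is an assumed field name
    # One record per non-empty line; fields within a record split on tabs.
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    with open(path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(rows)
    return len(rows)


class SaveHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the URL-encoded form body sent by the scraper's form.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        count = append_posted_records(body, OUTPUT_PATH)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(f"saved {count} records\n".encode("utf-8"))


if __name__ == "__main__":
    HTTPServer(("localhost", 8000), SaveHandler).serve_forever()
```

Swapping the file append for database inserts would only change 
append_posted_records; the handler stays the same.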

There may be other, easier, saner ways - I don't know all the ins and 
outs of Solvent etc., and I've just woken up ;)

Keith

_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
