curl can read the request body from a local file or from stdin, so if it
is only a single file from a web server, you could do something like


curl http://someserver/file.html/ | curl "http://localhost:8983/solr/update/extract?extractOnly=true" -F na...@-


But this way there is no crawling, no link following, etc.
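
The same idea can be wrapped in a small shell function. This is only a sketch under assumptions: the names extract_remote, SOLR_EXTRACT_URL and CURL are made up for illustration, and it posts the body with --data-binary @- (the form from the working local-file command in the question, reading stdin instead of a file) rather than with -F.

```shell
# Sketch only: extract_remote, SOLR_EXTRACT_URL and CURL are hypothetical
# names chosen for this example, not part of Solr or curl.
SOLR_EXTRACT_URL="${SOLR_EXTRACT_URL:-http://localhost:8983/solr/update/extract?extractOnly=true}"
CURL="${CURL:-curl}"   # overridable, e.g. to dry-run the pipeline

extract_remote() {
    url="$1"                        # remote document to fetch
    content_type="${2:-text/html}"  # assumed default; pass another type if needed

    # First curl: download the document to stdout.
    #   -f  fail on HTTP errors instead of passing an error page along
    #   -sS no progress meter, but still print error messages
    # Second curl: read the request body from stdin (@-) and post it to Solr.
    # Note: without pipefail, a failed download still results in an
    # (empty) POST; check the response if that matters.
    "$CURL" -fsS "$url" |
        "$CURL" -sS "$SOLR_EXTRACT_URL" --data-binary @- \
            -H "Content-type:$content_type"
}
```

Usage would look like: extract_remote http://someserver/file.html text/html. It still handles one URL per call, so a list of files needs a loop around it.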


--
Kind regards

Markus Rietzler - <rietzler_software/>
Rechenzentrum der Finanzverwaltung NRW
0211/4572-2130
 

> -----Original Message-----
> From: Insight 49, LLC [mailto:insigh...@gmail.com] 
> Sent: Tuesday, 27 October 2009 16:14
> To: solr-user@lucene.apache.org
> Subject: Solr Cell on web-based files?
> 
> Hi,
> 
> If I use the ExtractingRequestHandler 
> <http://wiki.apache.org/solr/ExtractingRequestHandler> on a 
> local file 
> (as shown in 
> http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput), 
> all works well, but how do I do this for files located on a server?
> 
> e.g. (works)
> curl http://localhost:8983/solr/update/extract?extractOnly=true 
> --data-binary @mylocalfile.htm -H "Content-type:text/html"
> 
> e.g. (doesn't work)
> curl http://localhost:8983/solr/update/extract?extractOnly=true 
> --data-binary @http://myweb.com/mylocalfile.htm -H 
> "Content-type:text/html"
> 
> Thanks,
> 
> Dan
> 
> 
