curl can read from a local file or from stdin, so if you only need a single file from a web server, you could do something like:
curl http://someserver/file.html/ | curl "http://localhost:8983/solr/update/extract?extractOnly=true" -F na...@-

but this way there is no crawling, no link following, etc.

--
Best regards

Markus Rietzler - <rietzler_software/>
Rechenzentrum der Finanzverwaltung NRW
0211/4572-2130

> -----Original Message-----
> From: Insight 49, LLC [mailto:insigh...@gmail.com]
> Sent: Tuesday, October 27, 2009 16:14
> To: solr-user@lucene.apache.org
> Subject: Solr Cell on web-based files?
>
> Hi,
>
> If I use the ExtractingRequestHandler
> <http://wiki.apache.org/solr/ExtractingRequestHandler> on a
> local file
> (as shown in
> http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput),
> all works well, but how do I do this for files located on a server?
>
> e.g. (works)
> curl http://localhost:8983/solr/update/extract?extractOnly=true
> --data-binary @mylocalfile.htm -H "Content-type:text/html"
>
> e.g. (doesn't work)
> curl http://localhost:8983/solr/update/extract?extractOnly=true
> --data-binary @http://myweb.com/mylocalfile.htm -H
> "Content-type:text/html"
>
> Thanks,
>
> Dan
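
P.S. If the pipe is inconvenient, the same idea can be done in two steps: download the remote file to a temporary file, then send it exactly like the local-file example from the original question. The following is only a rough sketch. The function name fetch_and_extract is made up for illustration, and the default Solr URL and content type are simply taken from the examples in this thread.

```shell
#!/bin/sh
# Sketch only: fetch one remote file, then post it to Solr's extract handler.
# fetch_and_extract is a hypothetical helper name, not part of curl or Solr.
fetch_and_extract() {
    src_url=$1
    # Defaults are assumptions copied from the examples in this thread.
    solr_url=${2:-http://localhost:8983/solr/update/extract?extractOnly=true}
    content_type=${3:-text/html}

    tmp=$(mktemp) || return 1

    # -f: fail on HTTP errors, -s: no progress meter,
    # -S: still print errors, -o: write the body to the temp file
    if curl -fsS -o "$tmp" "$src_url"; then
        # Same request as the working local-file example
        curl -sS "$solr_url" --data-binary "@$tmp" -H "Content-type:$content_type"
        status=$?
    else
        # Download failed: keep curl's exit status and skip the upload
        status=$?
    fi

    rm -f "$tmp"
    return $status
}
```

It would be called like this:

fetch_and_extract http://myweb.com/mylocalfile.htm

As with the pipe, this handles one file per call, so there is still no crawling or link following.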