Re: Solr: extracting/indexing HTML via cURL

2012-05-02 Thread Lance Norskog
To: solr-user@lucene.apache.org Subject: Solr: extracting/indexing HTML via cURL Hello, Over the weekend I experimented with extracting HTML content via cURL and was just wondering why the extraction/indexing process does not include the HTML tags. It seems as though the HTML tags are either being

Solr: extracting/indexing HTML via cURL

2012-04-30 Thread okayndc
Hello, Over the weekend I experimented with extracting HTML content via cURL and was just wondering why the extraction/indexing process does not include the HTML tags. It seems as though the HTML tags are either being ignored or stripped somewhere in the pipeline. If this is the case, is it possible to

Re: Solr: extracting/indexing HTML via cURL

2012-04-30 Thread Jack Krupansky
To: solr-user@lucene.apache.org Subject: Solr: extracting/indexing HTML via cURL Hello, Over the weekend I experimented with extracting HTML content via cURL and was just wondering why the extraction/indexing process does not include the HTML tags. It seems as though the HTML tags are either being

Re: Solr: extracting/indexing HTML via cURL

2012-04-30 Thread okayndc
Subject: Solr: extracting/indexing HTML via cURL Hello, Over the weekend I experimented with extracting HTML content via cURL and was just wondering why the extraction/indexing process does not include the HTML tags. It seems as though the HTML tags are either being ignored or stripped