On 03/20/2014 04:52 PM, Laura McCord wrote:
Hi,
This might be a shot in the dark, but I was wondering if anyone has any
experience with crawling a website that is "CASified", where the crawler
enters credentials and then proceeds to crawl and obtain the content? If so,
did you use any specific technologies to perform the task?
Thanks,
Laura
It kind of depends on what you're after here. Are you looking at letting
Google through, or your own crawler?
If it's your own, does it even need to be a web crawler? My experience
with search is around Apache Solr. In that case, I'd just get the data
directly out of the database and put it in Solr. Generally you get
better search results if you don't have to mess with those pesky things
we call web pages.
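
A minimal sketch of the direct-to-Solr approach described above, in Python. Everything specific here is an assumption for illustration: the `pages` table and its `id`, `title`, and `body` columns are hypothetical, SQLite stands in for whatever database actually holds the content, and the Solr core is assumed to accept a JSON array of documents on its update handler with matching fields in its schema.

```python
import json
import sqlite3
import urllib.request


def rows_to_docs(conn):
    """Turn each row of the (hypothetical) pages table into a Solr document."""
    cursor = conn.execute("SELECT id, title, body FROM pages ORDER BY id")
    return [{"id": str(row[0]), "title": row[1], "body": row[2]} for row in cursor]


def build_update_request(solr_update_url, docs):
    """Build a POST request carrying the documents as a JSON array."""
    payload = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        solr_update_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_database(conn, solr_update_url):
    """Read every row and send it to Solr in a single update request."""
    request = build_update_request(solr_update_url, rows_to_docs(conn))
    with urllib.request.urlopen(request) as response:
        return response.status
```

With a local Solr instance, this might be called as `index_database(sqlite3.connect("site.db"), "http://localhost:8983/solr/pages/update?commit=true")`, where the file name, host, and core name are again placeholders. Because the data never passes through the CAS-protected pages, no login handling is needed on this path.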