On 03/20/2014 04:52 PM, Laura McCord wrote:
Hi,

This might be a shot in the dark, but I was wondering if anyone has any 
experience with crawling a website that is "Casified", where the crawler enters 
your credentials and then proceeds to crawl and obtain the content? If so, did 
you use any specific technologies to perform the task?

Thanks,
  Laura




It kind of depends on what you're after here. Are you looking at letting Google through, or your own crawler?

If it's your own, does it even need to be a web crawler? My experience with search is around Apache Solr. In that case, I'd just get the data directly out of the database and put it in Solr. Generally you get better search results if you don't have to mess with those pesky things we call web pages.
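
To make that concrete, here is a rough sketch of skipping the crawler and feeding database rows straight to Solr. Everything specific in it is an assumption for illustration only: the SQLite database, the `articles` table with `id`, `title` and `body` columns, the core name `pages`, and the localhost URL. It posts the rows as a JSON array to Solr's update handler, so your schema would need matching fields.

```python
import json
import sqlite3
import urllib.request

# Hypothetical values: adjust to your own Solr core and schema.
SOLR_UPDATE_URL = "http://localhost:8983/solr/pages/update?commit=true"


def rows_to_docs(conn):
    """Turn each row of the (hypothetical) `articles` table into a Solr document."""
    cursor = conn.execute("SELECT id, title, body FROM articles ORDER BY id")
    return [
        {"id": str(row[0]), "title": row[1], "body": row[2]}
        for row in cursor
    ]


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST carrying the documents as a JSON array."""
    payload = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_database(conn, url=SOLR_UPDATE_URL):
    """Read every row and send it to Solr in a single update request."""
    request = build_update_request(rows_to_docs(conn), url)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `index_database(sqlite3.connect("site.db"))` against a running Solr would push the rows and commit them (`site.db` is a made-up file name). Since this reads the database directly, CAS never enters the picture. For a large table you would probably want to send the rows in batches rather than in one request.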

