Hi All!

I hope this email isn't an intrusion. I'm looking for a consultant (or 
consultants plural) to work on a Nutch project. I've sent this out to a few 
of you but I wanted to make sure I got it out to everybody.
 
The client is interested in a "proof of concept." What this boils down to is 
a prototype system that crawls terrorist sites with Nutch, provides a web 
interface for searching, and adds some additional features like saved 
searches, active hot spots and export to blog. I've included a rough outline 
of functionality below:

   - *Spider* – Visit a starting set of seed sites, provided by the 
   analyst, and follow links to other sites … index and store what it finds. 
   Most sites will be in Arabic. 
      - *Geographical Tagging* – When geo.position, 
      DC.coverage.spatial, and/or ICBM information is available for a 
      web page, the location is calculated and stored in the index. 
      - *Link-graph* – Each URL is ranked based on a number of 
      parameters like inbound links, etc. 
      - *Query capability* – Query the set of indexed web pages based 
      on keyword, geo-location, and Boolean triggers. 
      - *Saved query capability* – A query can be saved for easy 
      future viewing. 
      - *Hot spots* – A summary report showing which pages are 
      actively being linked to. 
      - *Hot Memes* – A summary report showing phraseology that is 
      currently active. 
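
To make the geographical tagging item a little more concrete, here is a minimal sketch of how a location might be read from a page's meta tags. It is written in Python only for brevity (the deliverable itself is Java), and it is an illustration under stated assumptions, not a design: the names `GeoMetaParser`, `parse_lat_lon` and `extract_location` are hypothetical, the order in which the tags are tried is my own guess, and it assumes the common "lat;lon" form for geo.position and "lat, lon" form for ICBM. A DC.coverage.spatial value that is a place name rather than a coordinate pair is simply skipped here.

```python
from html.parser import HTMLParser
import re

# Meta tag names from the outline, lowercased. The order in which they are
# tried is an assumption for this sketch, not part of the requirements.
GEO_META_NAMES = ("geo.position", "icbm", "dc.coverage.spatial")


class GeoMetaParser(HTMLParser):
    """Collects the content of geo-related <meta> tags from one page."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        # Keep only the first occurrence of each geo tag.
        if name in GEO_META_NAMES and name not in self.found:
            self.found[name] = attr.get("content", "")


def parse_lat_lon(value):
    """Parse 'lat;lon' or 'lat, lon' into (lat, lon); return None otherwise."""
    parts = re.split(r"[;,]", value)
    if len(parts) != 2:
        return None
    try:
        lat, lon = float(parts[0]), float(parts[1])
    except ValueError:
        # Not a coordinate pair (for example, a place name).
        return None
    if -90 <= lat <= 90 and -180 <= lon <= 180:
        return (lat, lon)
    return None


def extract_location(html):
    """Return the first usable (lat, lon) found in the page, or None."""
    parser = GeoMetaParser()
    parser.feed(html)
    for name in GEO_META_NAMES:
        if name in parser.found:
            location = parse_lat_lon(parser.found[name])
            if location is not None:
                return location
    return None
```

In the real system the resulting pair would be stored as extra fields in the index alongside the page, so that queries can filter on geo-location.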
   
   *Deliverable Workflow for Analysts:*
   - Log in to the Spider tool 
      - View daily summary reports, export a few hot sites to the blog 
      - Based on memes appearing in other media (television, radio, 
      print), do some queries to see how those memes are spreading 
      online, export relevant sites to the blog 
      - Create a saved search in the Spider tool to watch progress of 
      memes 
      - Log in to Blog tool 
      - Review the list of sites that were just exported from the 
      Spider tool 
         - Make qualitative analysis, storing it as part of the 
         blog entry 
         - Evaluate site / page via I4 and other quantitative 
         methodologies 
         - Categorize site via datablogging fields (e.g. area = {Middle 
         East, Indonesia, etc.})
      - Publish the blog entries for other analysts to see 
      - Analysts subscribe to each other's blogs via RSS and read 
      daily analysis in one location without having to visit 
      individual blog sites 
      - Collaboration happens via comments on blogs, 
      posts-about-posts, etc. This collaboration is on 
      cultural/contextual elements and is documented, archived, and 
      searchable.
   
The deliverable is an installable web application (Tomcat, Java, MySQL) 
along with installation, configuration and startup support. I've tried to 
pare down the requirements to what I know Nutch can do well out of the box 
and that we can wrap fairly quickly. This is a six-week proof of concept, so 
I'll need a working beta within four weeks.

Don't worry about the blog collaboration tool at all... that's a piece of 
software that we have currently and can export to via the MetaWeblog API very 
quickly and easily. We need help on the Nutch side but I wanted you to see 
the workflow from Nutch to the blogging collaboration system.

A few questions for you:

1) If you're interested, when can you start? The client is a 
hurry-up-and-wait client, but they may be willing to jump quickly in the 
coming days.
2) What are your hourly consulting/coding rates?
3) How many hours over the course of six weeks do you guess this project 
would take? I'm guessing one to two people full time.
4) Is this sort of deliverable something that you feel you can pull off on 
your own in four weeks (to beta) or would you recommend I bring in somebody 
else with Nutch/web experience? Do you have anybody in mind that you work 
well with?

As a software developer myself, I know that these are heavily loaded 
questions given the lack of exact design requirements. I'm looking for 
somebody who feels this is within their capability and is willing to work 
hard to make it happen. The client is willing to trust us with many of the 
details and if we do a good job this should lead to a much more robust and 
dynamic application. And, of course, building the prototype gives you a good 
leg up on getting the contract once they move forward. So, please answer to 
the best of your ability... this isn't a commitment at this point... just a 
ballpark to get me moving forward with the client.

I'm going to speak with the client again later this afternoon and would 
like to have a sense of what's possible. I apologize for the urgency... the 
client awoke from something of a slumber yesterday.

Best,
Joe Reger
