I'm a somewhat grizzled software guy. My background is mostly making sense of big, messy piles of code. (If confusing, I clarify; if clear ...)
I've spent a lot of time on internationalization and performance tuning. Over the last year I've had a sort of crash course in NLP. Basis Technology, where I work, has always had a certain amount of NLP going on, but it has become a more and more important part of what we do. In spite of my status as a very, very rusty mathematician, I do my best to keep up. If there's one NLP thing I know something about now, it is named entity extraction with averaged perceptrons and passive-aggressive training. This has the advantage of being mathematically trivial unless you want to prove that it works, which is about as useful as proving that bumblebees can (or can't) fly. At Apache my center of gravity is probably CXF (web services), which I wandered into while contributing code to automatically generate JavaScript clients for web services. Ironically, Basis owns a lot of code which is/was built by people who believe just the opposite of the Mahout motto -- that cloud distribution can overcome the inherent performance disadvantage of Java, leaving you with all the other advantages. We shall see.
