Good point Pei. 

We would need to do a spike (short sprint) in the future to see if Mahout would 
be a good fit. 
I'm asking because I'm planning how I will use cTakes, and I'd like to know 
how others are planning as well.


Cheers, 
--Andy


On Apr 28, 2013, at 5:39 PM, "Chen, Pei" <[email protected]> wrote:

> Has anyone tried Mahout recently?
> Last time I tried, it was still closely tied to the Hadoop file system. 
> 
> Sent from my iPhone
> 
> On Apr 28, 2013, at 7:44 PM, "Andy McMurry" <[email protected]> wrote:
> 
>> I encourage committers to check out Apache Mahout 
>> https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms
>> 
>> Why Apache Mahout? 
>> 1. provides ML classifiers and functions not available through UIMA
>> 2. parallel by design, transparently invokes Hadoop  
>> 3. Java and Apache license (every other known toolkit is GPL!) 
>> 4. likely to become standard ML package for Apache 
>> 
>> Why would we use mahout in cTakes? 
>> cTakes models are "provided", for example PoS tagging. 
>> Retraining these models on your own compute cluster would be difficult  (in 
>> my opinion). 
>> LibSVM is nice, but it is only one classification method. 
>> 
>> When? 
>> No rush; however, I suggest we don't invest time in porting SINGLE-CPU 
>> classifier functions that we will have to parallelize later. 
>> 
>> Summary: 
>> UIMA + mahout = pipelines + classification 
>> 
>> 
>> 
>> 
>> On Apr 28, 2013, at 4:26 PM, "Savova, Guergana" 
>> <[email protected]> wrote:
>> 
>>> +1 
>>> --guergana
>>> 
>>> -----Original Message-----
>>> From: Kaggal, Vinod C. [mailto:[email protected]] 
>>> Sent: Saturday, April 27, 2013 11:21 PM
>>> To: <[email protected]>
>>> Cc: <[email protected]>
>>> Subject: Re: roadmap for Apache cTakes "big data" processing
>>> 
>>> +1
>>> 
>>> 
>>> On Apr 27, 2013, at 9:05 PM, "Chen, Pei" <[email protected]> 
>>> wrote:
>>> 
>>>> +1 for UIMA-AS
>>>> 
>>>> 
>>>> On Apr 27, 2013, at 9:25 PM, "Andy McMurry" <[email protected]> wrote:
>>>> 
>>>>> I'm writing to gauge community interest and intent for parallel 
>>>>> processing with cTakes. 
>>>>> 
>>>>> Apache UIMA is planning "Async Scaleout" as a replacement for CPM. 
>>>>> http://uima.apache.org/doc-uimaas-what.html
>>>>> 
>>>>> Apache Mahout is likely to become the de facto Apache package for machine 
>>>>> learning. 
>>>>> http://mahout.apache.org/
>>>>> 
>>>>> I believe cTakes will embrace both of these in due time.  
>>>>> Do you agree or do you have a different view?
>> 
