Re: [CODE4LIB] text mining software

2013-09-02 Thread Aaron Coburn
Alan, if you are looking for data mining software that runs well in Hadoop, I would definitely recommend looking into Apache Mahout [1]. This software is specifically focused on categorization and clustering, and these algorithms tend to work well in the distributed architecture of a

[CODE4LIB] text mining software

2013-08-27 Thread Eric Lease Morgan
What sorts of text mining software do y'all support / use in your libraries? We here in the Hesburgh Libraries at the University of Notre Dame have all but opened a place called the Center For Digital Scholarship. We are / will be providing a number of different services to a number of

Re: [CODE4LIB] text mining software

2013-08-27 Thread Pottinger, Hardy J.
Hi, Eric, I don't have any experience in this field, but I went looking a while ago when the topic came up, and these two links are in my notes for further exploration, if the topic ever comes around again: http://wordseer.berkeley.edu/ http://mininghumanities.com/ May they serve you well.

Re: [CODE4LIB] text mining software

2013-08-27 Thread David Lowe

Re: [CODE4LIB] text mining software

2013-08-27 Thread Julia Bauder
On Behalf Of Pottinger, Hardy J. Sent: Tuesday, August 27, 2013 11:51 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] text mining software Hi, Eric, I don't have any experience in this field, but I went looking a while ago when the topic came up, and these two links are in my notes

Re: [CODE4LIB] text mining software

2013-08-27 Thread Riley, Jenn
This is still command-line, but Mallet is heavily used in the DH community: http://mallet.cs.umass.edu/. I think MONK (http://monkproject.org/) has a UI, but I'm not overly familiar with its features. Jenn Jenn Riley Head, Carolina Digital Library and Archives

Re: [CODE4LIB] text mining software

2013-08-27 Thread Alan Darnell
Do any of these work in Hadoop using MapReduce as a programming model? It seems like Hadoop would be a natural use case for text mining and analysis. Alan On Aug 27, 2013, at 7:44 PM, Riley, Jenn jlri...@email.unc.edu wrote: This is still command-line, but Mallet is heavily used in the DH

Re: [CODE4LIB] text mining software

2013-08-27 Thread stuart yeates
There have been some great software recommendations in this thread that I really don't want to quibble with. What I'd like to quibble with is the software-first approach. We've all tried the software-first approach; how many of us were happy with it? There is a standard in this area and that

Re: [CODE4LIB] text mining software

2013-08-27 Thread danielle plumer
I worked a lot with GATE in a previous position (not in a library, but in a research position at the Univ. of Texas at Austin). It's handy in that there is both a UI version (GATE Developer) and a set of APIs (GATE Embedded), which were the only versions I worked with. Also nice is the fact that