If you push the POI version to 3.6 in your Maven configuration, do
you still get the error?
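
For reference, here is a minimal sketch of what pinning POI could look like,
assuming the version is controlled through a dependencyManagement section in
your parent pom.xml. That placement is an assumption; your build may declare
POI somewhere else. The coordinates are the ones POI 3.6 is published under
on Maven Central, and poi-scratchpad is listed because it carries the Word
(HWPF) and PowerPoint (HSLF) code.

```xml
<!-- Sketch only: assumes POI is managed in the parent pom's
     dependencyManagement section; adjust to wherever your build declares it. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
      <version>3.6</version>
    </dependency>
    <dependency>
      <!-- HWPF (Word) and HSLF (PowerPoint) live in the scratchpad artifact -->
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-scratchpad</artifactId>
      <version>3.6</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Running 'mvn dependency:tree' afterwards should show which POI version
actually ends up on the classpath.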

Mark


On Fri, Oct 8, 2010 at 9:47 AM, Keith Gilbertson
<keith.gilbert...@library.gatech.edu> wrote:
> Mark - Thank you.  It's in our maven repository.  Graham had mentioned there
> would be some work to get this going, but I didn't know what it involved.
> Everything built and installed with some minor code changes, which was very
> nifty.  I still got an error in the word filter.
> Hardy Pottinger had sent me a link to this notice:
> http://code.google.com/p/text-mining/issues/detail?id=5
> I didn't know what "rejar" meant, but I found this to work:
> 1.  Get source for this version of text-mining utils with 'svn checkout
> http://text-mining.googlecode.com/svn/trunk/ text-mining-read-only' command
> 2.  From this tree, delete lib/poi-3.0.1-FINAL-20070705.jar and replace with
> poi-3.6.jar
> 3.  Rebuild with 'ant' command
> 4.  Copy build/bin/tm-extractors-1.0.jar to
> lib/dspace-tm-extractors-1.0.0.jar in my DSpace deployment directory
> Then filter-media works fine with the new PowerPoint filter and the
> WordFilter.
> So, could we rebuild the dspace-tm-extractors-1.0.0.jar against poi-3.6 and
> put that in our maven repository? I suppose now would also be a good
> opportunity for me to learn about the unit testing framework and use it to
> make sure filtering still works as well as it did before the change!
> Ryan Ackley, the developer of these tm-extractors, also worked on the POI
> project for a while.   Presumably he's very busy, but I'll contact him and
> ask if POI now has the full capability of the tm-extractors and hope for an
> answer - because maybe we don't even need the tm-extractors library if the
> POI extractors were rewritten by Ryan.
> It looks like the current WordFilter doesn't handle the new Microsoft Word
> XML formats - so that may be another small project for someone to take on
> soon.
>
> --keith
>
> On Oct 7, 2010, at 3:35 AM, Mark Diggory wrote:
>
> As it's not in the Maven central repository, we would need to release
> it ourselves under org.dspace.dependencies or see if someone else can
> push out a new version of tm-extractors for Maven central.
>
> To release into our repository, we just need to author a pom.xml file
> for the tm-extractors and package the jar... I set this up, but had
> some issues with sonatype failing to let me see the staged release on
> their side. I did release to the central repository.  Still waiting to
> see it show up here:
>
> http://repo2.maven.org/maven2/org/dspace/dependencies/dspace-tm-extractors
>
> Once available, give it a try and see if it fixes your issues.
>
> Mark
>
> On Wed, Oct 6, 2010 at 11:11 AM, Keith Gilbertson
> <keith.gilbert...@library.gatech.edu> wrote:
>
> Thanks Graham and Tim.  I hadn't seen that.
>
> On Oct 6, 2010, at 11:52 AM, Graham Triggs wrote:
>
> That version of tm-extractors is quite old.
>
> There is a newer version on the Google site
> - http://code.google.com/p/text-mining/ - but it will take a bit of work
> wrapping things up for general use.
>
> It has dependencies on newer versions of POI than 0.4, and some distinct
> improvements to its robustness.
>
> G
>
> On 6 October 2010 16:39, Tim Donohue <tdono...@duraspace.org> wrote:
>
> Ugh -- sounds like you've entered dependency hell.
>
> Though, I think the one shred of good news here is that it seems to only
> have a dependency conflict in one place in our codebase.
>
> It looks like (at a glance) if our WordFilter can be re-written to no
> longer need the org.textmining project, you *might* be OK (i.e.
> hopefully it wouldn't snowball on you). But, that would require finding
> a Word document text extractor that is as good as (or better than) that
> 'org.textmining' one, and then hoping it doesn't cause another
> dependency conflict.  Not sure of any alternative Word text extractors,
> off the top of my head, but maybe others know of one?
>
> - Tim
>
>
> ------------------------------------------------------------------------------
> Beautiful is writing same markup. Internet Explorer 9 supports
> standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
> Spend less time writing and  rewriting code and more time creating great
> experiences on the web. Be a part of the beta today.
> http://p.sf.net/sfu/beautyoftheweb
> _______________________________________________
> DSpace-tech mailing list
> DSpace-tech@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>
> --
> Mark R. Diggory
> Head of U.S. Operations - @mire
>
> http://www.atmire.com - Institutional Repository Solutions
> http://www.togather.eu - Before getting together, get t...@ther
>
>



