Excellent, Kirby, thanks for this. The obvious question, I guess: where does this leave us with regard to the urlfilter-automation libraries?
For the record as well, can you please provide the Jira you filed? It would be good to know where I can begin with this one.

Thanks

On Thu, Nov 10, 2011 at 10:18 PM, Kirby Bohling <kirby.bohl...@gmail.com> wrote:

> On Thu, Nov 10, 2011 at 6:14 PM, Lewis John Mcgibbney
> <lewis.mcgibb...@gmail.com> wrote:
> > OK so the required dependencies can be seen below
> >
> > - FeedParser <dependency org="net.java.dev.rome" name="rome" rev="1.0.0"
> >   conf="*->master"/>
> > - URLAutomationFilter - <dependency org="dk.brics" name="automaton"
> >   rev="???"/>
> > - SWFParser <dependency org="com.google.gwt" name="gwt-incubator"
> >   rev="2.0.1"/>
> > - HTMLParser <dependency org="net.sourceforge.nekohtml" name="nekohtml"
> >   rev="1.9.15"/>
> >
> > A really nasty hack is possible, which would replace the usual
> > ${nutch.root} with <include file="../../../ivy/ivy-configurations.xml"/>;
> > however, this is not how I want to progress.
> >
> > I'm also not sure where to find the dk.brics dependency.
>
> The Automaton library, to the best of my knowledge, is not available via
> Maven's central repo.
>
> http://www.brics.dk/automaton/ is the site where you can find it.
>
> This is the location of the actual jar:
> http://www.brics.dk/automaton/automaton.jar
>
> In order to get the source you have to submit an e-mail address, but
> it is all available under the newer BSD/MIT license.
>
> I believe all of the functionality actually used by Nutch is in a
> faster form buried inside the Lucene Util library 4.0 (unreleased last
> I knew). I believe I filed a JIRA issue about my backport of the
> Lucene improvements to the library at Julian's request. I have
> submitted the code to the author, but I'm not sure if he has
> integrated it. He was short on time when I submitted all of it.
>
> It is a nice library, but it isn't very friendly to third-party users (no
> bug tracker, no public source repo).
>
> Kirby
>
> >
> > Any thoughts? Jira issue?
> >
> > Thanks
> >
> > On Thu, Nov 10, 2011 at 12:39 AM, Andrzej Bialecki <a...@getopt.org> wrote:
> >>
> >> On 10/11/2011 04:39, Lewis John Mcgibbney wrote:
> >>>
> >>> Gets even more strange, both SWFParser and AutomationURLFilter import
> >>> additional dependencies, however they are not included within their
> >>> plugin/ivy/ivy.xml files!
> >>>
> >>> Am I missing something here?
> >>
> >> Most likely these problems come from the initial porting of a pure ant
> >> build to an ant+ivy build. We should determine what deps are really needed
> >> by these plugins, and sanitize the ivy.xml files so that they make sense -
> >> if the existing files can't be untangled we can ditch them and come up with
> >> new, clean ones.
> >>
> >> --
> >> Best regards,
> >> Andrzej Bialecki <><
> >> ___. ___ ___ ___ _ _ __________________________________
> >> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> >> ___|||__|| \| || | Embedded Unix, System Integration
> >> http://www.sigram.com Contact: info at sigram dot com
> >>
> >
> > --
> > Lewis
> >

--
*Lewis*
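
As a rough illustration of the point above that the Automaton jar is not in Maven's central repo but can be downloaded from http://www.brics.dk/automaton/automaton.jar, an Ivy URL resolver could in principle be pointed at that location. This is only a sketch under stated assumptions, not Nutch's actual ivysettings.xml: the resolver names (default-chain, central, brics) are made up for the example, and because the jar URL carries no version, the rev on the dk.brics dependency is still left open, exactly as in the thread.

```xml
<ivysettings>
  <!-- Hypothetical resolver names; adjust to whatever the real settings file uses. -->
  <settings defaultResolver="default-chain"/>
  <resolvers>
    <chain name="default-chain" returnFirst="true">
      <!-- Maven central for everything that is published there. -->
      <ibiblio name="central" m2compatible="true"/>
      <!-- Fallback for dk.brics automaton: the pattern expands to
           http://www.brics.dk/automaton/automaton.jar for
           name="automaton" with the default jar artifact. -->
      <url name="brics">
        <artifact pattern="http://www.brics.dk/automaton/[artifact].[ext]"/>
      </url>
    </chain>
  </resolvers>
</ivysettings>
```

Since the pattern contains no [revision] token, any fixed rev should end up fetching the same file, so a build relying on this would not be reproducible if the upstream jar changes. Checking a copy of the jar into a local repository would be the more conservative choice.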