Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-12 Thread Lewis John Mcgibbney
Just for reference, I've sorted all the problems I was having with my
build environment and have updated the tutorial on our wiki. I've also
commented on your issue, Kirby. Thanks for the pointer.

Lewis

-- 
*Lewis*


Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-11 Thread Lewis John Mcgibbney
Excellent Kirby, thanks for this.

The obvious question, I guess: where does this leave us with regard to
the urlfilter-automation libraries?

For the record as well, can you please provide the Jira you filed? It would
be good to know where I can begin with this one.

Thanks

-- 
*Lewis*


Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-11 Thread Kirby Bohling
Lewis,

https://issues.apache.org/jira/browse/NUTCH-1068

That is the issue I filed about the patch (it isn't directly related
to this, but it is related to some potential fixes).

http://www.mail-archive.com/dev%40nutch.apache.org/msg03419.html

That's the e-mail thread where I originally mentioned the
modifications to automaton, and the patch with the backport of the
Lucene fixes.

Kirby


Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-10 Thread Andrzej Bialecki

On 10/11/2011 04:39, Lewis John Mcgibbney wrote:

> Gets even more strange, both SWFParser and AutomationURLFilter import
> additional dependencies, however they are not included within their
> plugin/ivy/ivy.xml files!
>
> Am I missing something here?


Most likely these problems come from the initial porting of a pure ant 
build to an ant+ivy build. We should determine what deps are really 
needed by these plugins, and sanitize the ivy.xml files so that they 
make sense - if the existing files can't be untangled we can ditch them 
and come up with new, clean ones.


--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-10 Thread Lewis John Mcgibbney
OK so the required dependencies can be seen below

- FeedParser: <dependency org="net.java.dev.rome" name="rome"
  rev="1.0.0" conf="*->master"/>
- URLAutomationFilter: <dependency org="dk.brics" name="automaton"
  rev="???"/>
- SWFParser: <dependency org="com.google.gwt" name="gwt-incubator"
  rev="2.0.1"/>
- HTMLParser: <dependency org="net.sourceforge.nekohtml" name="nekohtml"
  rev="1.9.15"/>

There is a really nasty hack available: replacing the usual ${nutch.root}
reference with <include file="../../../ivy/ivy-configurations.xml"/>.
However, this is not how I want to progress.
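
For orientation, a plugin-level ivy.xml carrying one of the declarations
above might look roughly like the sketch below. The module name
"example-plugin" and the overall layout are assumptions for illustration,
not copied from the Nutch source tree; only the dependency line and the
${nutch.root} include path are taken from this thread.

```xml
<!-- Hypothetical plugin ivy.xml; "example-plugin" is a placeholder name. -->
<ivy-module version="2.0">
  <info organisation="org.apache.nutch" module="example-plugin"/>

  <configurations>
    <!-- Shared configurations pulled in through the nutch.root variable,
         as implied by the thread. Assumed to define a "master" conf. -->
    <include file="${nutch.root}/ivy/ivy-configurations.xml"/>
  </configurations>

  <dependencies>
    <!-- Plugin-specific dependency quoted in the thread (FeedParser). -->
    <dependency org="net.java.dev.rome" name="rome" rev="1.0.0"
                conf="*->master"/>
  </dependencies>
</ivy-module>
```

Any tool that parses a file like this has to be able to resolve
${nutch.root} first, which is the difficulty reported with IvyDE
elsewhere in the thread.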

I'm also not sure where to find the dk.brics dependency.

Any thoughts? Jira issue?

Thanks


-- 
*Lewis*


Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-10 Thread Kirby Bohling
On Thu, Nov 10, 2011 at 6:14 PM, Lewis John Mcgibbney
lewis.mcgibb...@gmail.com wrote:
> OK so the required dependencies can be seen below
>
> - FeedParser: <dependency org="net.java.dev.rome" name="rome"
>   rev="1.0.0" conf="*->master"/>
> - URLAutomationFilter: <dependency org="dk.brics" name="automaton"
>   rev="???"/>
> - SWFParser: <dependency org="com.google.gwt" name="gwt-incubator"
>   rev="2.0.1"/>
> - HTMLParser: <dependency org="net.sourceforge.nekohtml" name="nekohtml"
>   rev="1.9.15"/>
>
> There is a really nasty hack available: replacing the usual ${nutch.root}
> reference with <include file="../../../ivy/ivy-configurations.xml"/>.
> However, this is not how I want to progress.
>
> I'm also not sure where to find the dk.brics dependency.

The Automaton library to the best of my knowledge is not available via
Maven's central repo.

http://www.brics.dk/automaton/ is the site where you can find it.

That's the location of the actual jar.
http://www.brics.dk/automaton/automaton.jar
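
Since the jar is published at a plain URL rather than in a Maven
repository, one way to let Ivy fetch it would be a url resolver in
ivysettings.xml. The following is a minimal sketch under assumptions: the
resolver names are invented, the chain layout is illustrative, and it has
not been tested against the Nutch build. Because the download URL carries
no version number, the pattern omits [revision], so whatever rev the
dependency declares should map to the same file.

```xml
<!-- Hypothetical ivysettings.xml; resolver names are placeholders. -->
<ivysettings>
  <settings defaultResolver="default-chain"/>
  <resolvers>
    <chain name="default-chain" returnFirst="true">
      <!-- Maven central for everything that is published there. -->
      <ibiblio name="central" m2compatible="true"/>
      <!-- Fallback for the automaton jar, which the thread says is not
           available from central. [artifact] expands to "automaton"
           and [ext] to "jar" for a standard jar dependency. -->
      <url name="brics-site">
        <artifact pattern="http://www.brics.dk/automaton/[artifact].[ext]"/>
      </url>
    </chain>
  </resolvers>
</ivysettings>
```

With returnFirst set, the chain stops at the first resolver that finds
the module, so central is still consulted before the fallback URL.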

In order to get the source you have to submit an e-mail address, but
it is all available under the newer BSD/MIT license.

I believe all of the functionality actually used by Nutch is in a
faster form buried inside the Lucene Util library 4.0 (unreleased last
I knew).  I believe I filed a JIRA issue about my backport of the
Lucene improvements to the library at Julian's request.  I have
submitted the code to the author, but I'm not sure if he has
integrated it.  He was short on time when I submitted all of it.

It is a nice library, but it isn't very 3rd party user friendly (no
bug tracker, no public source repo).

Kirby



Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-09 Thread Lewis John Mcgibbney
Say we were to replace ${nutch.root} with ${basedir} in every instance of a
plugin directory that has additional dependencies over and above what we
specify in NUTCH_HOME/ivy/ivy.xml. This then breaks the build.

Where is the nutch.root variable actually specified? I don't know where to
find it.
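
The thread does not establish where nutch.root is set. Purely as a hedged
illustration, an Ant build file could define such a property relative to
each plugin's own basedir, in which case ${basedir} and ${nutch.root}
would name different directories and could not be swapped one for one.
The project name, target name, and the three-levels-up layout below are
assumptions, not taken from the Nutch build files.

```xml
<!-- Hypothetical Ant build file for a single plugin. -->
<project name="example-plugin" default="show-root" basedir=".">
  <!-- Assumption: the plugin directory sits three levels below the
       checkout root, so the root is reached by walking up from basedir. -->
  <property name="nutch.root" location="${basedir}/../../.."/>

  <target name="show-root">
    <!-- Prints both paths so the difference is visible. -->
    <echo message="basedir    = ${basedir}"/>
    <echo message="nutch.root = ${nutch.root}"/>
  </target>
</project>
```

Running the show-root target from a plugin directory prints the two
paths side by side, which makes it easy to check which one an ivy.xml
include path actually needs.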

On Wed, Nov 9, 2011 at 5:41 PM, Lewis John Mcgibbney 
lewis.mcgibb...@gmail.com wrote:

> Hi,
> I've been looking closely at getting a well-configured Nutch Eclipse
> environment. I'm having trouble with the following plugins:
>
> - AutomationURLFilter
> - SWFParser
> - HTMLParser
> - FeedParser
>
> This ties back to the fact that each of these plugins has irregular
> dependencies over and above what we have specified in
> NUTCH_HOME/ivy/ivy.xml.
>
> To date, I can see no way of resolving these dependencies within Eclipse
> without specifying the dependencies in NUTCH_HOME/ivy/ivy.xml, however it
> would be great if anyone has a workaround which we could utilise!
>
> For completeness, I've also tried resolving the dependencies with the IvyDE
> plugin by manually adding the above 4 ivy.xml files, however IvyDE
> struggles to parse the ivy files, due to the presence of the ${nutch.root}
> variable. Is there scope to change this? If so, to what?
>
> Thank you
>
> --
> *Lewis*




-- 
*Lewis*


Re: Persistent problems with Ivy dependencies in Eclipse

2011-11-09 Thread Lewis John Mcgibbney
Gets even more strange, both SWFParser and AutomationURLFilter import
additional dependencies, however they are not included within their
plugin/ivy/ivy.xml files!

Am I missing something here?


-- 
*Lewis*