Re: how to upgrade a java application with nutch?

2009-10-02 Thread Jaime Martín
Thank you for your responses. My scenario is this:
I have a web application with some pages. I want one of the web menu options
to redirect to a search page. There you would only be able to search
information that has been previously crawled.
Everything is Java, so I think Nutch fits really well, and on top of that it
is open source.
So in my downloaded Nutch distro I would configure the URLs I want to crawl,
change parts of the GUI, and change some features of the Nutch business
logic to customise it for the specific project's purposes.
That's why I thought of customising Nutch in a project, then creating a
library from it and using it from the main application.
So for that, Nutch alone isn't suitable and you need to use it together with
Solr? (Anyway, I'll have a look at Bixo and Droids.)
Thank you!







RE: how to upgrade a java application with nutch?

2009-10-02 Thread Fuad Efendi
Nutch alone is OK, but embedded Nutch is not OK... extremely hard!

You need to embed a Nutch client into your web application. Nutch itself
should run separately.

I believe Nutch supports OpenSearch or something similar (an XML protocol,
REST-like). Your web application should use it to interact with Nutch, but
you have to develop the client part. Sorry, I don't know the current status
of Nutch features...
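
A minimal sketch of what such a client part could look like, assuming the Nutch web application answers a search request with an RSS 2.0 document whose `<item>` elements carry a `<title>` and a `<link>`. The class name `OpenSearchResults` and the response shape are assumptions for illustration, not something confirmed in this thread; fetching the XML over HTTP is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical client-side helper; not part of Nutch. */
class OpenSearchResults {

    /** One search hit, taken from a single RSS <item>. */
    static final class Hit {
        final String title;
        final String link;

        Hit(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    /** Parses an RSS 2.0 document and returns one Hit per <item> element. */
    static List<Hit> parse(String rssXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Limit what the parser may do with untrusted input.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));

            NodeList items = doc.getElementsByTagName("item");
            List<Hit> hits = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                hits.add(new Hit(childText(item, "title"), childText(item, "link")));
            }
            return hits;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parsable RSS document", e);
        }
    }

    /** Returns the trimmed text of the first matching descendant, or "" if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }
}
```

The web application would then render the returned hits in its own page layout, which keeps Nutch running as a separate service.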

That is why I posted about SOLR: SolrJ is an out-of-the-box client library
for Java, and to me it seems an extremely easy solution for small search
indexes (a few domains). Nutch has a plugin for SOLR (and a command-line
option).
SOLR outputs everything as XML, JSON, etc. (instead of pure HTML which is
hard to embed in another HTML)
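
As a rough illustration of the non-HTML output, the sketch below builds a request URL for Solr's standard select handler using the `q`, `rows` and `wt` parameters. It uses only the JDK rather than SolrJ so that it stays self-contained; the class name `SolrSelectUrl` and the base URL passed in are assumptions, and the real path depends on how the Solr instance is deployed.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper; a plain-JDK stand-in for what SolrJ does for you. */
class SolrSelectUrl {

    /**
     * Builds a select URL that asks Solr for JSON output.
     *
     * @param solrBaseUrl assumed base URL of the Solr instance
     * @param userQuery   raw query text typed by the user
     * @param rows        maximum number of results to return
     */
    static String build(String solrBaseUrl, String userQuery, int rows) {
        // Avoid a double slash if the caller passed a trailing "/".
        String base = solrBaseUrl.endsWith("/")
                ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
                : solrBaseUrl;
        try {
            String q = URLEncoder.encode(userQuery, "UTF-8");
            return base + "/select?q=" + q + "&rows=" + rows + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this cannot happen.
            throw new IllegalStateException(e);
        }
    }
}
```

In a real application SolrJ would build the request and parse the response; the point here is only that the search page talks to Solr over HTTP and formats the results itself.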
-Fuad






how to upgrade a java application with nutch?

2009-10-01 Thread Jaime Martín
Hi!
I have a Java application that I would like to upgrade with Nutch. What jars
should I add to my application's lib directory to make it possible to use
Nutch features from some of my app pages and business logic classes?
I've tried the nutch-1.0.jar generated by the war target, without success.
I wonder which Nutch build.xml target I should execute for this, and which
of the generated jars need to be included in my app. Apart from
nutch-1.0.jar, are all the nutch-1.0\lib jars compulsory, or just a few of
them?
Thanks in advance!


Re: how to upgrade a java application with nutch?

2009-10-01 Thread Paul Tomblin


Maybe I'm doing it wrong, but I used the nutch-1.0.job file instead of the
jar.

-- 
http://www.linkedin.com/in/paultomblin


Re: how to upgrade a java application with nutch?

2009-10-01 Thread Andrzej Bialecki




Nutch is not designed for embedding in other applications, so you may
face numerous problems. I did such an integration once, and it was far
from obvious. A lot also depends on whether you want to run it on a
distributed cluster or in a single JVM (local mode).


Take a look at build/nutch*.job, it's a jar file that contains all 
dependencies needed to run Nutch except for Hadoop libraries (which are 
also required).
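
Since the .job file is a jar (that is, a zip archive), one way to see which dependencies it bundles is to list the jar entries inside it. The sketch below does that with the JDK's zip API; the class name `JobFileInspector` is hypothetical, and the idea that dependency jars sit as nested `.jar` entries inside the archive is an assumption to verify against your own build output.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for inspecting a Nutch .job archive; not part of Nutch. */
class JobFileInspector {

    /** Returns the sorted names of all entries ending in ".jar". */
    static List<String> bundledJars(String jobFilePath) {
        List<String> jars = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jobFilePath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".jar")) {
                    jars.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(jars);
        return jars;
    }

    public static void main(String[] args) {
        // Usage: java JobFileInspector <path to the .job file>
        for (String jar : bundledJars(args[0])) {
            System.out.println(jar);
        }
    }
}
```

Comparing that list with the Hadoop libraries on the classpath shows what still has to be supplied separately.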


--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: how to upgrade a java application with nutch?

2009-10-01 Thread Jaime Martín
Thank you for the info. That's really a problem. I have a Java project, and
for some of its new features I would like to use Nutch. As I need to
customise Nutch, my idea was as follows:
- 1st: change what is needed for my requirements in my downloaded Nutch and
generate a Nutch library
- 2nd: add that library to the other project and invoke the library's
features when needed

Is that not advisable? What is the best way, then, to generate a Nutch library
to be used in other Java projects? Or is that not possible without going
crazy over configuration issues?







Re: how to upgrade a java application with nutch?

2009-10-01 Thread Ken Krugler

Hi Jaime,

Depending on what exactly you're trying to do, there are some other  
projects that offer crawler functionality which could be easier to  
embed.


The two I know about are:

 - Droids (http://incubator.apache.org/droids/), though I haven't  
really used it.
 - Bixo (http://bixo.101tec.com/), which is a project I'm actively  
working on.


-- Ken





--
Ken Krugler
TransPac Software, Inc.
http://www.transpac.com
+1 530-210-6378



RE: how to upgrade a java application with nutch?

2009-10-01 Thread Fuad Efendi
Hi Jaime,

You don't have to embed; try a (simplified) Nutch + SOLR setup (Nutch has a
plugin for SOLR), and use the SolrJ client for SOLR from your application.
This is very easy.
-Fuad


http://www.linkedin.com/in/liferay
