RE: how to "upgrade" a java application with nutch?

2009-10-02 Thread Fuad Efendi
Nutch on its own is OK, but "embedded Nutch" is not; that is extremely hard.

You need to embed a "Nutch client" in your web application. Nutch itself
should run separately.

I believe Nutch supports OpenSearch or something similar (a REST-like XML
protocol). Your web application should use it to interact with Nutch, but
you would have to develop the client part yourself. Sorry, I don't know the
current status of Nutch's features...
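As a rough illustration of such a client part, here is a minimal sketch that
parses an OpenSearch-style RSS 2.0 response into title/link pairs using only
the JDK. It assumes the search server returns ordinary RSS <item> elements
with <title> and <link> children; the class name OpenSearchResults and the
Hit type are hypothetical, and fetching the XML from the Nutch web
application (whose URL and parameters are not given in this thread) is left
out.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper: turns an RSS 2.0 search response into a list of hits. */
final class OpenSearchResults {

    /** One search hit, taken from a single RSS <item>. */
    static final class Hit {
        final String title;
        final String link;

        Hit(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    /** Parses the RSS document and returns one Hit per <item> element. */
    static List<Hit> parse(String rssXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so untrusted responses cannot use external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(rssXml)));

            NodeList items = doc.getElementsByTagName("item");
            List<Hit> hits = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                hits.add(new Hit(childText(item, "title"), childText(item, "link")));
            }
            return hits;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse search response", e);
        }
    }

    /** Returns the trimmed text of the first child with the given tag, or "" if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }
}
```

The web application would then render the returned hits in its own page
templates, so no Nutch classes need to be on its classpath.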

That is why I posted about SOLR: SolrJ is an out-of-the-box client library
for Java, and to me it seems an extremely easy solution for small search
indexes (a few domains). Nutch has a plugin for SOLR (and a command-line
option).

SOLR outputs everything as XML, JSON, etc. (instead of pure HTML, which is
hard to embed in another HTML page).
-Fuad
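
To make the "Nutch runs separately, the application is only a client" idea
concrete, here is a minimal sketch of a search client. It does not use
SolrJ, which is what the message above recommends; it uses the JDK's own
HTTP client against Solr's HTTP query interface so that the example has no
extra dependencies. The class name SolrSearchClient, the base URL, and the
"/select" path with the q, rows and wt parameters are assumptions about a
default Solr setup; newer Solr versions also put a core or collection name
in the path.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical thin client for a separately running Solr server. */
final class SolrSearchClient {
    private final String baseUrl;
    private final HttpClient http = HttpClient.newHttpClient();

    /** @param baseUrl assumed Solr base URL, e.g. "http://localhost:8983/solr" */
    SolrSearchClient(String baseUrl) {
        // Drop a trailing slash so the path can be appended uniformly.
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
    }

    /** Builds the query URI; the user's query string is URL-encoded. */
    URI selectUri(String query, int rows) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/select?q=" + q + "&rows=" + rows + "&wt=json");
    }

    /** Sends the query and returns the raw JSON response body. */
    String search(String query, int rows) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(selectUri(query, rows)).GET().build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Search server returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

A page in the web application could call
new SolrSearchClient("http://localhost:8983/solr").search(userQuery, 10)
and render the returned JSON itself, which is the advantage over embedding
ready-made HTML.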



Re: how to "upgrade" a java application with nutch?

2009-10-02 Thread Jaime Martín
Thank you for your responses. My scenario is this:
I have a web application with some pages. From one of the web menu options
I want to redirect to a search page, where you would only be able to search
information that has previously been crawled.
Everything is Java, so I think Nutch fits really well, and on top of that it
is open source.
So in my downloaded Nutch distribution I would configure the URLs I want to
crawl, change parts of the GUI, and change some features of the Nutch
business logic to customise it for the specific project's purposes.
That's why I thought of customising Nutch in a project, then creating a
library from it and using it from the main application.
So, for that, Nutch "alone" isn't suitable and you need to use it together
with Solr? (Anyway, I'll have a look at Bixo and Droids.)
Thank you!



RE: how to "upgrade" a java application with nutch?

2009-10-01 Thread Fuad Efendi
Hi Jaime,

You don't have to embed; try (simplified) Nutch + SOLR (Nutch has a plugin
for SOLR), and use the SolrJ client for SOLR from your application. This is
very easy.
-Fuad


http://www.linkedin.com/in/liferay


Re: how to "upgrade" a java application with nutch?

2009-10-01 Thread Ken Krugler

Hi Jaime,

Depending on what exactly you're trying to do, there are some other  
projects that offer crawler functionality which could be easier to  
embed.


The two I know about are:

 - Droids (http://incubator.apache.org/droids/), though I haven't  
really used it.
 - Bixo (http://bixo.101tec.com/), which is a project I'm actively  
working on.


-- Ken

--
Ken Krugler
TransPac Software, Inc.

+1 530-210-6378



Re: how to "upgrade" a java application with nutch?

2009-10-01 Thread Jaime Martín
Thank you for the info. That's really a problem. I have a Java project, and
for some of its new features I would like to use Nutch. As I need to
customise Nutch, my idea was the following:
- 1st: change what is needed for my requirements in my downloaded Nutch and
generate a "Nutch library"
- 2nd: add that library to the other project and invoke the library's
features when needed

Is that not advisable? What is the best way, then, to generate a Nutch
library to be used in other Java projects? Or is that not possible without
going crazy due to configuration issues?




Re: how to "upgrade" a java application with nutch?

2009-10-01 Thread Andrzej Bialecki

Jaime Martín wrote:

> Hi!
> I've a Java application that I would like to "upgrade" with Nutch. What
> jars should I add to my application's lib directory to make it possible to
> use Nutch features from some of my app pages and business logic classes?
> I've tried nutch-1.0.jar, generated by the "war" target, without success.
> I wonder which Nutch build.xml target I should run for this, and which of
> the generated jars should be included in my app. Apart from nutch-1.0.jar,
> are all the nutch-1.0\lib jars compulsory, or just a few of them?
> Thanks in advance!



Nutch is not designed for embedding in other applications, so you may
face numerous problems. I did such an integration once, and it was far
from obvious. A lot also depends on whether you want to run it on a
distributed cluster or in a single JVM (local mode).


Take a look at build/nutch*.job; it's a jar file that contains all
dependencies needed to run Nutch except for the Hadoop libraries (which
are also required).


--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
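
Because the .job file is an ordinary jar (zip) archive, one way to see which
dependencies it bundles is to list its entries. The following is a minimal
sketch using only the JDK; the class name JobFileInspector is hypothetical,
and it makes no assumption about where inside the archive the jars live.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper: lists the jar files bundled inside a .job archive. */
final class JobFileInspector {

    /** Returns the sorted names of all entries ending in ".jar". */
    static List<String> bundledJars(Path jobFile) throws IOException {
        // A .job file is read like any other zip archive.
        try (ZipFile zip = new ZipFile(jobFile.toFile())) {
            return zip.stream()
                    .map(ZipEntry::getName)
                    .filter(name -> name.endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Running bundledJars on the archive produced under build/ shows what would
have to be on the classpath, in addition to the Hadoop libraries mentioned
above.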



Re: how to "upgrade" a java application with nutch?

2009-10-01 Thread Paul Tomblin
2009/10/1 Jaime Martín 

> Hi!
> I've a Java application that I would like to "upgrade" with Nutch. What
> jars should I add to my application's lib directory to make it possible to
> use Nutch features from some of my app pages and business logic classes?
> I've tried nutch-1.0.jar, generated by the "war" target, without success.
> I wonder which Nutch build.xml target I should run for this, and which of
> the generated jars should be included in my app. Apart from nutch-1.0.jar,
> are all the nutch-1.0\lib jars compulsory, or just a few of them?
>

Maybe I'm doing it wrong, but I used the nutch-1.0.job file instead of the
jar.

-- 
http://www.linkedin.com/in/paultomblin