Re: how to upgrade a java application with nutch?
Thank you for your responses. My scenario is this: I have a web application with some pages, and I want one of its menu options to redirect to a search page. There you would only be able to search information that has been crawled previously. Everything is Java, so I think Nutch fits really well, and on top of that it is open source.

So in my downloaded Nutch distribution I would configure the URLs I want to crawl, change parts of the GUI, and change some features of the Nutch business logic to customise it for this specific project. That is why I thought of customising Nutch in its own project, building a library from it, and then using that library from the main application.

So for that, Nutch alone isn't suitable and you need to use it together with Solr? (Anyway, I'll have a look at Bixo and Droids.)

Thank you!
RE: how to upgrade a java application with nutch?
Nutch alone is OK, but embedded Nutch is not OK... extremely hard! You need to embed a Nutch client into your web application; Nutch itself should run separately. I believe Nutch supports OpenSearch or something similar (an XML, REST-like protocol). Your web application should use it to interact with Nutch, but you have to develop the client part yourself. Sorry, I don't know the current status of Nutch features...

That is why I posted about SOLR: SolrJ is an out-of-the-box client library for Java, and to me it seems an extremely easy solution for small search indexes (a few domains). Nutch has a plugin for SOLR (and a command-line option). SOLR outputs everything as XML, JSON, etc. (instead of pure HTML, which is hard to embed in another HTML page).

-Fuad
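As a rough illustration of the "develop the client part" idea above, here is a minimal Java sketch of a client that parses an XML search response returned by a separately running search server. It assumes an RSS-style response with `<item>`, `<title>` and `<link>` elements; those element names, the class `SearchResultParser` and the sample document are assumptions made for illustration, not a confirmed description of what Nutch returns.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical client-side parser; the XML layout is an assumption.
final class SearchResultParser {

    // One search hit: a title and the URL it points to.
    static final class Hit {
        final String title;
        final String link;

        Hit(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    // Parses an RSS-style document and returns one Hit per <item> element.
    static List<Hit> parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Refuse DOCTYPE declarations, since the response comes from another process.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList items = doc.getElementsByTagName("item");
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            hits.add(new Hit(childText(item, "title"), childText(item, "link")));
        }
        return hits;
    }

    // Returns the trimmed text of the first child element with this tag, or "".
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample response, for illustration only.
        String sample = "<rss><channel>"
                + "<item><title>First page</title><link>http://example.com/a</link></item>"
                + "<item><title>Second page</title><link>http://example.com/b</link></item>"
                + "</channel></rss>";
        for (Hit hit : parse(sample)) {
            System.out.println(hit.title + " -> " + hit.link);
        }
    }
}
```

The web application would fetch the XML over HTTP from the search server and render the hits in its own pages, so the crawler and index stay outside the application's JVM.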
how to upgrade a java application with nutch?
Hi! I have a Java application that I would like to upgrade with Nutch. Which jars should I add to my application's lib directory so that I can use Nutch features from some of my app pages and business logic classes? I've tried the nutch-1.0.jar generated by the war target, without success. I wonder which Nutch build.xml target I should run for this, and which of the generated jars need to be included in my app. Apart from nutch-1.0.jar, are all the nutch-1.0\lib jars compulsory, or just a few of them?

Thanks in advance!
Re: how to upgrade a java application with nutch?
Maybe I'm doing it wrong, but I used the nutch-1.0.job file instead of the jar.

-- http://www.linkedin.com/in/paultomblin
Re: how to upgrade a java application with nutch?
Nutch is not designed for embedding in other applications, so you may face numerous problems. I did such an integration once, and it was far from obvious. A lot also depends on whether you want to run it on a distributed cluster or in a single JVM (local mode).

Take a look at build/nutch*.job: it's a jar file that contains all the dependencies needed to run Nutch, except for the Hadoop libraries (which are also required).

-- Best regards, Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
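Since the .job file is described above as a jar file, one way to see which dependency jars it bundles is to open it as a zip archive. The following is a minimal sketch using only the Java standard library; the class name `JobFileInspector` is hypothetical, and the path to the .job file is passed in by the caller because the exact file name depends on the build.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: lists the .jar entries found inside a jar/zip archive.
final class JobFileInspector {

    // Returns the sorted names of all entries ending in ".jar".
    static List<String> listBundledJars(Path archive) throws IOException {
        List<String> jars = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".jar")) {
                    jars.add(entry.getName());
                }
            }
        }
        Collections.sort(jars);
        return jars;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JobFileInspector <path-to-job-file>");
            return;
        }
        for (String name : listBundledJars(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

Comparing that list with the jars already on the application's classpath is a quick way to spot version conflicts before attempting any embedding. The Hadoop libraries would still have to be supplied separately, as noted above.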
Re: how to upgrade a java application with nutch?
Thank you for the info. That's really a problem. I have a Java project, and for some of its new features I would like to use Nutch. As I need to customise Nutch, my idea was the following:

- 1st: change what is needed for my requirements in my downloaded Nutch and generate a Nutch library
- 2nd: add that library to the other project and invoke the library's features when needed

Is that not advisable? What is the best way, then, to generate a Nutch library to be used in other Java projects? Or is that not possible without going crazy over configuration issues?
Re: how to upgrade a java application with nutch?
Hi Jaime,

Depending on what exactly you're trying to do, there are some other projects that offer crawler functionality and could be easier to embed. The two I know about are:

- Droids (http://incubator.apache.org/droids/), though I haven't really used it.
- Bixo (http://bixo.101tec.com/), which is a project I'm actively working on.

-- Ken

-- Ken Krugler TransPac Software, Inc. http://www.transpac.com +1 530-210-6378
RE: how to upgrade a java application with nutch?
Hi Jaime,

You don't have to embed; try (simplified) Nutch + SOLR (Nutch has a plugin for SOLR), and use the SolrJ client for SOLR from your application. This is very easy.

-Fuad
http://www.linkedin.com/in/liferay
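To give an idea of what querying Solr from the application could look like, here is a minimal Java sketch. It deliberately uses plain HTTP against Solr's select handler instead of SolrJ, so that it needs no extra jars; SolrJ would normally wrap these calls. The base URL, the class name `SolrSearchSketch` and the exact handler path are assumptions (the path differs between Solr versions and core setups), while `q`, `start`, `rows` and `wt` are standard Solr query parameters.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for querying a Solr index filled by a Nutch crawl.
final class SolrSearchSketch {

    // Builds the select URL; baseUrl is an assumption such as "http://localhost:8983/solr".
    static String buildSelectUrl(String baseUrl, String userQuery, int start, int rows) {
        String q = URLEncoder.encode(userQuery, StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + q + "&start=" + start + "&rows=" + rows + "&wt=json";
    }

    // Sends the query and returns the raw JSON response body.
    static String search(String baseUrl, String userQuery) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(buildSelectUrl(baseUrl, userQuery, 0, 10)))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Only prints the URL; calling search() requires a running Solr server.
        System.out.println(buildSelectUrl("http://localhost:8983/solr", "apache nutch", 0, 10));
    }
}
```

With this split, Nutch runs on its own schedule and pushes documents into Solr, and the web application only ever talks to Solr.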